
Coreference Based Event-Argument Relation Extraction on Biomedical Text

Katsumasa Yoshikawa 1), Sebastian Riedel 2), Tsutomu Hirao 3), Masayuki Asahara 1), Yuji Matsumoto 1). 1) Nara Institute of Science and Technology, Japan; 2) University of Massachusetts, Amherst, USA; 3) NTT Communication Science Lab., Japan


Presentation Transcript


  1. Coreference Based Event-Argument Relation Extraction on Biomedical Text • Katsumasa Yoshikawa 1), Sebastian Riedel 2), Tsutomu Hirao 3), Masayuki Asahara 1), Yuji Matsumoto 1) • 1) Nara Institute of Science and Technology, Japan • 2) University of Massachusetts, Amherst, USA • 3) NTT Communication Science Lab., Japan • SMBM 2010, 25th-26th October 2010, Hinxton, Cambridge, UK

  2. Outline • Research summary • Related work on event extraction • Proposed coreference-based approach • Experimental setup and highlighted data • Conclusion and future work

  3. Summary of Our Research • Coreference-based approach to biomedical event extraction with Markov Logic • Why coreference? Extraction of valuable event-argument relations in discourse structure; identification of arguments crossing sentence boundaries • Why Markov Logic? Implementation of Salience in Discourse and Transitivity in a very direct fashion

  4. Event-Argument Relation with Coreference Information • Arguments are often related to other mentions through coreference relations • S1: "We analyzed the effect on the binding and the activity of transcription factors at a regulatory element." • S2: "TPA induction increases the binding of AP-1 factors to this element." • S3: "TPA induction inhibits the binding of the transcription factor NF-E2 to this transcriptional control element." • [Figure: Theme and Cause arcs drawn from events to their arguments]

  5. Event-Argument Relation with Coreference Information • "this element" in S2 is coreferent to "a regulatory element" in S1 • S1: "We analyzed the effect on the binding and the activity of transcription factors at a regulatory element." • S2: "TPA induction increases the binding of AP-1 factors to this element." • S3: "TPA induction inhibits the binding of the transcription factor NF-E2 to this transcriptional control element." • [Figure: Theme and Cause arcs, plus a Corefer arc linking "this element" (S2) to "a regulatory element" (S1)]

  6. Event-Argument Relation with Coreference Information • The true argument (Theme) of binding is "a regulatory element", and "this element" is just an anaphor of it • Transitivity enables us to extract it: (A) Theme & (B) Corefer => (C) Theme • S1: "We analyzed the effect on the binding and the activity of transcription factors at a regulatory element." • S2: "TPA induction increases the binding of AP-1 factors to this element." • S3: "TPA induction inhibits the binding of the transcription factor NF-E2 to this transcriptional control element." • [Figure: arcs labeled (A) Theme, (B) Corefer, and (C) Theme, alongside the other Theme and Cause arcs]

  7. Event-Argument Relation with Coreference Information • Arguments mentioned over and over again have higher salience in discourse and should be extracted at any cost • Our approach can aggressively extract such arguments, which are valuable in discourse structure • S1: "We analyzed the effect on the binding and the activity of transcription factors at a regulatory element." • S2: "TPA induction increases the binding of AP-1 factors to this element." • S3: "TPA induction inhibits the binding of the transcription factor NF-E2 to this transcriptional control element." • [Figure: Theme, Cause, and Corefer arcs spanning S1 to S3]

  8. Outline • Research summary • Related work on event extraction • Proposed coreference-based approach • Experimental setup and highlighted data • Conclusion and future work

  9. Biomedical Event Extraction (BioNLP'09 Task 1) • Extracting events, arguments, and their relations in a document • Example: "TPA induction increases the binding of AP-1 factors to this element." • [Figure: three tokens marked as events and five as arguments, linked by Theme and Cause arcs] • Main targets: event-argument relations (E-As)

  10. Previous Work [in BioNLP’09] • Pairwise pipeline with SVM classifiers [Bjorne et al., 2009]: coupling with proteins and labeling the roles; identification of events • [Figure: arg1, event, arg2 with pairwise labels such as No and Theme] • Collective approach with Markov Logic [Riedel et al., 2009] [Poon et al., 2010]: jointly identify the most probable E-A assignments in a sentence • [Figure: arg1, event1, arg2, event2, arg3 linked by Theme and Cause arcs]

  11. Outline • Research summary • Related work on event extraction • Proposed coreference-based approach • Experimental setup and highlighted data • Conclusion and future work

  12. Markov Logic [Richardson and Domingos, 2006] • A Statistical Relational Learning framework • An expressive template language for Markov Networks • Supports not only hard but also soft constraints • A Markov Logic Network (MLN) is a set of pairs (φ, w), where φ is a formula in first-order logic and w is a real-valued weight • Higher weight → stronger constraint
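
As a rough illustration of the (φ, w) definition above, the following Python sketch scores a possible world by summing the weights of the ground formulas it satisfies and normalizes over all worlds, which is the standard log-linear form of Markov Logic. The three formulas, their weights, and the atom names are invented for illustration and are not taken from the slides.

```python
import itertools
import math

# Hypothetical ground atoms: an event candidate at token 3, a protein at token 6.
ATOMS = ["event(3)", "role(3,6,Theme)"]

# Hypothetical weighted ground formulas: (weight, function world -> bool).
FORMULAS = [
    # Soft constraint: token 3 looks like an event trigger.
    (1.5, lambda w: w["event(3)"]),
    # Soft constraint: token 6 looks like a Theme of token 3.
    (0.5, lambda w: w["role(3,6,Theme)"]),
    # Near-hard constraint (large weight): a role implies its head is an event.
    (10.0, lambda w: (not w["role(3,6,Theme)"]) or w["event(3)"]),
]


def score(world):
    """Sum of the weights of the ground formulas that hold in `world`."""
    return sum(weight for weight, holds in FORMULAS if holds(world))


def all_worlds():
    """Enumerate every truth assignment to ATOMS."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def probability(world):
    """P(world) = exp(score(world)) / Z, with Z summed over all worlds."""
    z = sum(math.exp(score(w)) for w in all_worlds())
    return math.exp(score(world)) / z


def map_world():
    """Most probable world by brute force (only feasible for toy examples)."""
    return max(all_worlds(), key=score)
```

Raising the weight of the third formula makes worlds that violate it less and less probable, which is the sense in which a higher weight means a stronger constraint.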

  13. Coreference Based Event Extraction with Markov Logic • Hidden predicate (Query) • Observed predicate (Given) • Features are described by combinations of these predicates

  14. Example of Markov Logic Networks • Feature definition by weighted first-order logic formulas, which are then grounded (※ all features are binary) • Observed ground atoms: protein(6), pos(3,Verb), dep(3,6,obj) • Hidden ground atoms: event(3), role(3,6,Theme), eventType(3,regulation) • Grounded weights: wa(Verb), wb(regulation,Theme), wc(obj,Theme)
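
The grounding step can be sketched in Python as follows. The ground atoms and weight names come from the slide, but the three feature templates that connect them are only one plausible reading (the formulas themselves did not survive in the transcript), and the numeric weight values are invented.

```python
# Ground atoms copied from the slide.
observed = {("protein", 6), ("pos", 3, "Verb"), ("dep", 3, 6, "obj")}
hidden = {("event", 3), ("eventType", 3, "regulation"), ("role", 3, 6, "Theme")}

# The slide only names the weights; these values are made up.
weights = {
    ("wa", "Verb"): 0.8,
    ("wb", "regulation", "Theme"): 0.5,
    ("wc", "obj", "Theme"): 1.2,
}


def fired_features(observed, hidden):
    """Return the weight keys of the binary features that fire.

    All three templates below are assumptions made for this sketch.
    """
    fired = []
    roles = [h for h in hidden if h[0] == "role"]
    for atom in observed:
        # Assumed template a: pos(i, p) and event(i), parameterized by p.
        if atom[0] == "pos" and ("event", atom[1]) in hidden:
            fired.append(("wa", atom[2]))
        # Assumed template c: dep(i, j, d) and role(i, j, r), parameterized by (d, r).
        if atom[0] == "dep":
            _, i, j, d = atom
            fired.extend(("wc", d, r) for _, ri, rj, r in roles if (ri, rj) == (i, j))
    for atom in hidden:
        # Assumed template b: eventType(i, t) and role(i, j, r), parameterized by (t, r).
        if atom[0] == "eventType":
            _, i, t = atom
            fired.extend(("wb", t, r) for _, ri, _rj, r in roles if ri == i)
    return fired


def world_score(observed, hidden, weights):
    """Total weight of the fired binary features for this assignment."""
    return sum(weights[key] for key in fired_features(observed, hidden))
```

Each fired key corresponds to one grounded binary feature, so the score of a candidate assignment is just the sum of the weights that were looked up.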

  15. Basic Ideas of Proposed Method • Effective employment of coreference information based on discourse structure • Salience in Discourse: aggressive extraction of valuable E-As • Consider event-argument relations crossing sentence boundaries • Transitivity involving coreference relations

  16. How to Use Coreference with Markov Logic? • Salience in Discourse • Transitivity • Feature Copy • S1 (tokens 1-9): "The IRF-2 promoter region contains a CpG island ." • S2 (tokens 10-17): "The region is inducible by both interferons ." • [Figure: Theme, Cause, and Corefer arcs over S1 and S2]

  17. Coreference Based Approach ① (Salience in Discourse) • Tokens coreferent to something have higher salience in discourse and are more likely to be arguments of events • S1 (tokens 1-9): "The IRF-2 promoter region contains a CpG island ." • S2 (tokens 10-17): "The region is inducible by both interferons ." • [Figure: Corefer and Theme arcs over S1 and S2] • Formula (SiD): if "The region" is coreferent to "The IRF-2...", then there is at least one event related to "The region"
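
The (SiD) rule can be read as a soft constraint that rewards worlds in which every anaphor fills a role of at least one event. Below is a minimal Python sketch of that reading. The token indices (11 for "region" in S2, 4 for "region" in S1, 13 for "inducible") assume that each mention is represented by its head token, and the function names and default weight are hypothetical.

```python
def sid_violations(corefer, roles):
    """Anaphor tokens that have a coreference link but fill no role of any event.

    corefer: set of (anaphor_token, antecedent_token) pairs.
    roles:   set of (event_token, argument_token, role_label) triples.
    """
    anaphors = {anaphor for anaphor, _antecedent in corefer}
    arguments = {arg for _event, arg, _label in roles}
    return sorted(anaphors - arguments)


def sid_score(corefer, roles, weight=1.0):
    """Soft-constraint contribution: `weight` for each anaphor that is an argument."""
    anaphors = {anaphor for anaphor, _antecedent in corefer}
    satisfied = len(anaphors) - len(sid_violations(corefer, roles))
    return weight * satisfied
```

With `corefer = {(11, 4)}`, a world containing the role `(13, 11, "Theme")` satisfies the rule, while a world with no roles leaves token 11 as a violation.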

  18. Coreference Based Approach ② (Transitivity) • Transition rules involving coreference relations allow us to extract cross-sentential event-argument relations in a "sentence by sentence" manner • S1 (tokens 1-9): "The IRF-2 promoter region contains a CpG island ." • S2 (tokens 10-17): "The region is inducible by both interferons ." • [Figure: arcs labeled (A) Theme, (B) Corefer, and (C) Theme] • Formula (T): (A) Theme & (B) Corefer => (C) Theme
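
Read procedurally, rule (T) says that if an event takes an anaphor as an argument and the anaphor has an antecedent, the event also takes the antecedent with the same role. The Python sketch below applies that rule until nothing new is derived. In the slides the rule is a weighted formula inside the Markov Logic model rather than a post-processing step, so this is only an approximation of its effect, and the token indices again assume head-token mentions.

```python
def apply_transitivity(roles, corefer):
    """Close a set of roles under: role(e, a, r) & corefer(a, b) => role(e, b, r).

    roles:   set of (event_token, argument_token, role_label) triples.
    corefer: set of (anaphor_token, antecedent_token) pairs.
    """
    derived = set(roles)
    changed = True
    while changed:  # repeat so that chains of coreference links are followed
        changed = False
        for event, arg, label in list(derived):
            for anaphor, antecedent in corefer:
                new_role = (event, antecedent, label)
                if arg == anaphor and new_role not in derived:
                    derived.add(new_role)
                    changed = True
    return derived
```

For the slide's example, `apply_transitivity({(13, 11, "Theme")}, {(11, 4)})` adds the cross-sentence relation `(13, 4, "Theme")` while keeping the original intra-sentence one.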

  19. Coreference Based Approach ③ (Feature Copy) • If a token is coreferent to something, then we exploit the features of its antecedent to identify intra-sentential E-A relations • S1 (tokens 1-9): "The IRF-2 promoter region contains a CpG island ." • S2 (tokens 10-17): "The region is inducible by both interferons ." • [Figure: Corefer, Copy, and Theme arcs over S1 and S2; rule labeled (FC)]
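
One way to picture Feature Copy is as feature augmentation: each anaphor keeps its own features and additionally receives marked copies of its antecedent's features. The Python sketch below assumes string-valued features, a hypothetical `ante:` prefix for the copies, and head-token indices for mentions; none of these details are given in the slides.

```python
def copy_features(features, corefer):
    """Give each anaphor prefixed copies of its antecedent's features.

    features: dict mapping token index -> set of feature strings.
    corefer:  iterable of (anaphor_token, antecedent_token) pairs.
    Returns a new dict; the input is left unchanged.
    """
    augmented = {token: set(feats) for token, feats in features.items()}
    for anaphor, antecedent in corefer:
        for feat in features.get(antecedent, ()):
            augmented.setdefault(anaphor, set()).add("ante:" + feat)
    return augmented
```

In the running example, "The region" in S2 would inherit information such as the "IRF-2" modifier of its antecedent, which is exactly what its own sentence lacks.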

  20. Outline • Research summary • Related work on event extraction • Proposed coreference-based approach • Experimental setup and highlighted data • Conclusion and future work

  21. Experimental Setup • Data: GENIA Event Corpus ver. 0.9 [Kim et al., 2008] • Preprocessing: POS tagging, NE tagging, parsing • Coreference resolver: pairwise model [Soon et al., 2001]; learning and inference: SVM • Event extraction, model 1: joint Markov Logic model [Riedel et al., 2009]; learning: one-best MIRA; inference: ILP solver with CPI [Riedel, 2008]; provided by Markov thebeast • Event extraction, model 2: SVM pipeline [Bjorne et al., 2009]; learning and inference: multi-class SVM

  22. Experimental Result (Summary) • Results of Event Extraction (F1) • p < 0.01 (McNemar's test, 2-tailed) • We obtained statistically significant improvements with both models, SVM and MLN
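
For readers unfamiliar with McNemar's test, the Python sketch below computes a two-tailed p-value from the two discordant counts: `b` items that only the baseline gets right and `c` items that only the coreference-based system gets right. It uses the exact binomial form; the slides do not say whether the exact or the chi-square variant was used, and the counts in the usage note are invented.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-tailed exact McNemar p-value from the discordant counts b and c.

    Under the null hypothesis each discordant item is equally likely to
    favor either system, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant items, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

For instance, `mcnemar_exact(0, 10)` is below 0.01, whereas `mcnemar_exact(5, 5)` is 1.0, since evenly split disagreements give no evidence that one system is better.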

  23. Three Types of E-A Relations • Evaluation for the three types of E-A relations: (1) Cross, (2) W-ANT, (3) Normal • S1 (tokens 1-9): "The IRF-2 promoter region contains a CpG island ." • S2 (tokens 10-17): "The region is inducible by both interferons ." • [Figure: the three relation types and a Corefer arc drawn over S1 and S2]

  24. Experimental Result (E-A Relation) • Results of E-A Relation Extraction (F1) • Both Transitivity and Salience in Discourse work well • MLN with gold coreference annotations outperforms the SVM pipeline on both Cross and W-ANT
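
Since the results are reported as F1 over E-A relations, the following Python sketch shows how such a score can be computed when gold and predicted relations are represented as sets of (event, argument, role) triples. The triple representation and the exact-match criterion are assumptions; the matching criteria actually used in the evaluation are not described in the slides.

```python
def f1_score(gold, predicted):
    """F1 over exact-match relation triples (harmonic mean of precision and recall)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0  # also covers empty gold or empty predictions
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

A system that predicts one of two gold relations plus one spurious relation has precision 0.5 and recall 0.5, and therefore an F1 of 0.5.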

  25. Outline • Research summary • Related work on event extraction • Proposed coreference-based approach • Experimental setup and highlighted data • Conclusion and future work

  26. Summary • We proposed a new method for biomedical event extraction with coreference information • Our systems successfully extract cross-sentential E-As through transitivity involving coreference relations • The concept of salience in discourse can also help E-A extraction • We obtained further improvements with gold coreference annotations, especially for MLN

  27. Future Work • Put more effort into coreference resolution • From pairwise model to clustering approach • Full joint approach to event extraction and coreference resolution • Fighting against computational costs • Narrative Event Chains [Chambers et al., 2008]
