
SEMANTIC ROLE LABELING BY TAGGING SYNTACTIC CHUNKS



  1. SEMANTIC ROLE LABELING BY TAGGING SYNTACTIC CHUNKS Kadri Hacioglu (1), Sameer Pradhan (1), Wayne Ward (1), James H. Martin (1), Daniel Jurafsky (2). (1) The Center for Spoken Language Research, University of Colorado at Boulder. (2) Stanford NLP Group, Stanford University

  2. OUTLINE • Semantic Role Labeling (SRL) • Nature of Shared Task Data • Our Strategy • System Description & Features • Experiments • Concluding Remarks

  3. SEMANTIC ROLE LABELING • Based on predicate-argument structure • First explored by (Gildea & Jurafsky, 2000) • Example (predicate: pursue): [A0 We] are prepared to [PRED pursue] [AM-MNR aggressively] [A1 completion of this transaction], he says • PropBank-style labels vs. thematic roles: A0 = Agent (we), A1 = Theme (completion of this transaction), AM-MNR = Manner (aggressively)
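
As an illustration of what the labeler must produce, here is a minimal sketch (my own, not from the slides) that encodes the example's predicate-argument structure as plain Python data; the dictionary layout is just one possible representation.

```python
# Hypothetical, minimal representation of the PropBank-style analysis above.
example = {
    "sentence": "We are prepared to pursue aggressively completion of "
                "this transaction, he says",
    "predicate": "pursue",
    "arguments": {
        "A0": "We",                              # Agent
        "AM-MNR": "aggressively",                # Manner
        "A1": "completion of this transaction",  # Theme
    },
}

for label, span in example["arguments"].items():
    print(f"{label}: {span}")
```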

  4. EXAMPLE OF SHARED TASK DATA (columns: word, POS tag, BP tag (IOB2), clause tag, predicate, semantic label)
     Sales     NNS  B-NP  (S*  -        (A1*A1)
     declined  VBD  B-VP  *    decline  (V*V)
     10        CD   B-NP  *    -        (A2*
     %         NN   I-NP  *    -        *A2)
     to        TO   B-PP  *    -        *
     $         $    B-NP  *    -        (A4*
     251.2     CD   I-NP  *    -        *
     million   CD   I-NP  *    -        *A4)
     from      IN   B-PP  *    -        *
     $         $    B-NP  *    -        (A3*
     287.7     CD   I-NP  *    -        *
     million   CD   I-NP  *    -        *A3)
     .         .    O     *S)  -        *
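
To make the column format concrete, here is a rough sketch, in Python, of how the bracketed semantic-label column can be expanded back into argument spans. This is my own illustration of the format, not the authors' code; the row tuple layout simply mirrors the columns shown on the slide.

```python
def read_argument_spans(rows):
    """rows: (word, pos, bp, clause, pred, sem) tuples for one sentence."""
    spans, current_label, current_words = [], None, []
    for word, _pos, _bp, _clause, _pred, sem in rows:
        if "(" in sem:                     # e.g. "(A1*" or "(A1*A1)" opens a span
            current_label = sem[sem.index("(") + 1:sem.index("*")]
            current_words = []
        if current_label is not None:
            current_words.append(word)
        if ")" in sem:                     # e.g. "*A1)" or "(A1*A1)" closes a span
            spans.append((current_label, " ".join(current_words)))
            current_label, current_words = None, []
    return spans

rows = [
    ("Sales", "NNS", "B-NP", "(S*", "-", "(A1*A1)"),
    ("declined", "VBD", "B-VP", "*", "decline", "(V*V)"),
    ("10", "CD", "B-NP", "*", "-", "(A2*"),
    ("%", "NN", "I-NP", "*", "-", "*A2)"),
]
print(read_argument_spans(rows))
# [('A1', 'Sales'), ('V', 'declined'), ('A2', '10 %')]
```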

  5. OUTLINE OF OUR STRATEGY • Change the shared task representation • make sure it is reversible • Engineer additional features • use intuition, experience and data analysis • Optimize system settings • context size • SVM parameters: degree of the polynomial kernel, C

  6. CHANGE IN REPRESENTATION • Restructure the available information • words are collapsed into their respective BPs • only headwords are retained (the rightmost words) • exceptions: VPs containing the predicate and Outside (O) chunks are kept word by word • Modify the semantic role labels • IOB2 scheme instead of the bracketing scheme

  7. NEW REPRESENTATION (columns: BP, headword, POS tag, BP tag (IOB2), clause tag, predicate, semantic label (IOB2))
     NP  Sales     NNS   B-NP  (S*  -        B-A1
     VP  declined  VBD   B-VP  *    decline  B-V
     NP  %         NN    I-NP  *    -        B-A2
     PP  to        TO    B-PP  *    -        O
     NP  million   CD    I-NP  *    -        B-A4
     PP  from      IN    B-PP  *    -        O
     NP  million   CD    I-NP  *    -        B-A3
     O   .         .     O     *S)  -        O
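
The transformation described on slide 6 and shown above can be sketched as follows. This is a simplified reconstruction of the idea, not the authors' implementation; the four-field row layout is assumed for illustration.

```python
def collapse_to_phrases(rows):
    """rows: (word, pos, bp_tag, predicate) tuples, one per word."""
    phrases, chunk = [], []

    def flush():
        if not chunk:
            return
        is_o_chunk = chunk[0][2] == "O"
        is_pred_vp = chunk[0][2].endswith("VP") and any(r[3] != "-" for r in chunk)
        if is_o_chunk or is_pred_vp:
            phrases.extend(chunk)          # these chunks stay word by word
        else:
            phrases.append(chunk[-1])      # rightmost word = headword
        chunk.clear()

    for row in rows:
        if row[2].startswith("B-") or row[2] == "O":
            flush()                        # a new chunk starts here
        chunk.append(row)
    flush()
    return phrases

rows = [("Sales", "NNS", "B-NP", "-"), ("declined", "VBD", "B-VP", "decline"),
        ("10", "CD", "B-NP", "-"), ("%", "NN", "I-NP", "-")]
print([r[0] for r in collapse_to_phrases(rows)])   # ['Sales', 'declined', '%']
```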

  8. DIFFERENCES BETWEEN REPRESENTATIONS

  9. SYSTEM DESCRIPTION • Phrase-by-phrase • Left-to-right • Binary feature encoding • Discriminative • Deterministic • SVM based (YamCha toolkit, developed by Taku Kudo) • Simple post-processing (for consistent bracketing)
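
The phrase-by-phrase, left-to-right, deterministic decoding can be pictured as the loop below. `extract_features` and `classify` are placeholders for the feature extractor and the trained SVM (the actual system used the YamCha toolkit), so this is only a schematic sketch.

```python
def tag_sentence(phrases, extract_features, classify, window=2):
    """Assign one semantic role tag per base phrase, left to right."""
    tags = []
    for i in range(len(phrases)):
        context = phrases[max(0, i - window): i + window + 1]   # sliding window
        left_tags = tags[max(0, i - window): i]                  # already-decided tags
        tags.append(classify(extract_features(context, left_tags, i)))
    return tags
```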

  10. BASE FEATURES • Words • Predicate lemmas • Part of speech tags • Base phrase IOB2 tags • Clause bracketing tags • Named Entities

  11. ADDITIONAL FEATURES
      Token level: • Token position • Path • Clause bracket patterns • Clause position • Headword suffixes • Distance • Length
      Sentence level: • Predicate POS tag • Predicate frequency • Predicate context (POS, BP) • Predicate argument frames • Number of predicates
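
A few of the features listed above could be computed roughly as below. The phrase tuple layout (headword, POS, chunk type, word count) is assumed for illustration and is not the authors' exact encoding.

```python
def extra_features(phrases, i, pred_idx):
    """phrases: (headword, pos, chunk_type, n_words) per base phrase."""
    return {
        # token-level features
        "token_position": "before" if i < pred_idx else "after" if i > pred_idx else "on",
        "distance": abs(i - pred_idx),   # distance to the predicate, in phrases
        "length": phrases[i][3],         # number of words in the base phrase
        # sentence-level features
        "predicate_pos": phrases[pred_idx][1],
        "predicate_context": [phrases[j][2] for j in (pred_idx - 1, pred_idx + 1)
                              if 0 <= j < len(phrases)],
    }
```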

  12. EXPERIMENTAL SET-UP • Corpus: flattened PropBank (2004 release) • Training set: Sections 15-18 • Dev set: Section 20 • Test set: Section 21 • SVMs: 78 one-vs-all (OVA) classifiers, polynomial kernel, d=2, C=0.01 • Context: sliding window of +/-2 tokens
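
In scikit-learn terms, the SVM settings on this slide correspond to roughly the configuration below. The authors actually used YamCha, so this is only an assumed stand-in, and X_train / y_train are hypothetical binarized feature vectors and role tags.

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# One-vs-all SVMs, degree-2 polynomial kernel, C = 0.01 (as on the slide).
clf = OneVsRestClassifier(SVC(kernel="poly", degree=2, C=0.01))
# clf.fit(X_train, y_train)       # hypothetical training data
# y_pred = clf.predict(X_dev)
```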

  13. RESULTS
      Base features, W-by-W vs. P-by-P approaches, dev set:
      Method   Precision  Recall   F1
      W-by-W   68.34%     45.16%   54.39
      P-by-P   69.04%     54.68%   61.02
      All features, P-by-P approach:
      Data      Precision  Recall   F1
      Dev set   74.17%     69.42%   71.72
      Test set  72.43%     66.77%   69.49
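
As a quick sanity check on the table, F1 is the harmonic mean of precision and recall:

```python
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f1(74.17, 69.42), 2))   # 71.72, matching the dev-set row above
```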

  14. CONCLUSIONS • We performed SRL by tagging base phrase chunks • the original representation was changed • additional features were engineered • SVMs were used • Improved performance with the new representation and additional features • Compared to the W-by-W approach, our method • classifies larger units • uses wider context • runs faster • performs better

  15. THANK YOU! Boring! So so… Cool! Wow!… That’s OK!… Awesome! Not too bad! Yawning..

  16. CLAUSE FEATURES (features: clause (CL) markers, CL pattern to sentence begin, CL pattern to predicate, CL pattern to sentence end; predicate: said)
      (columns: word, POS, BP tag, clause marker, predicate, clause position, CL patterns)
      One          CD    B-NP  (S*  -    OUT  (S*(S**S)  -
      troubling    VBG   I-NP  *    -    OUT  (S**S)     (S*
      aspect       NN    I-NP  *    -    OUT  (S**S)     (S*
      of           IN    B-PP  *    -    OUT  (S**S)     (S*
      DEC          NNP   B-NP  *    -    OUT  (S**S)     (S*
      's           POS   B-NP  *    -    OUT  (S**S)     (S*
      results      NNS   I-NP  *    -    OUT  (S**S)     (S*
      ,            ,     O     *    -    OUT  (S**S)     (S*
      analysts     NNS   B-NP  (S*  -    IN   (S**S)     (S*
      said         VBD   B-VP  *S)  say  IN   -          -
      ,            ,     O     *    -    OUT  *S)        *S)
      was          VBD   B-VP  *    -    OUT  *S)        *S)
      its          PRP$  B-NP  *    -    OUT  *S)        *S)
      performance  NN    I-NP  *    -    OUT  *S)        *S)
      in           IN    B-PP  *    -    OUT  *S)        *S)
      Europe       NNP   B-NP  *    -    OUT  *S)        *S)
      .            .     O     *S)  -    OUT  *S)*S)     -

  17. SUFFIXES • The confusion: B-AM-MNR is mis-tagged as B-AM-TMP • single-word cases: fetchingly, tacitly, provocatively • suffixes of length 2-4 of the headwords were tried as features
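
A small sketch of the suffix feature described above (my own illustration): the last 2 to 4 characters of each headword are added as features, which helps separate manner adverbs like the ones listed from temporal expressions.

```python
def suffix_features(headword, lengths=(2, 3, 4)):
    # Suffixes of length 2-4, only when the word is long enough.
    return {f"suf{n}": headword[-n:] for n in lengths if len(headword) >= n}

print(suffix_features("tacitly"))   # {'suf2': 'ly', 'suf3': 'tly', 'suf4': 'itly'}
```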
