
The SALSA experience: semantic role annotation

The SALSA experience: semantic role annotation. Katrin Erk, University of Texas at Austin. Semantic role annotation in SALSA. SALSA: The Saarbrücken Lexical Semantics Annotation and Analysis project. Manual annotation of the German TIGER corpus with lexical semantic information.


Presentation Transcript


  1. The SALSA experience: semantic role annotation
     Katrin Erk, University of Texas at Austin

  2. Semantic role annotation in SALSA
     • SALSA: The Saarbrücken Lexical Semantics Annotation and Analysis project
     • Manual annotation of the German TIGER corpus with lexical semantic information
     • Basis: The Berkeley FrameNet database
     • Verbs annotated with their Frame (~ sense), plus semantic roles
     • TIGER corpus:
       • 1.5 million words / 80 K sentences of German newspaper text (Frankfurter Rundschau)
       • Stuttgart/Potsdam/Saarbrücken
       • Phrase types and grammatical functions

  3. Annotation Scheme
     • Semantics:
       • Independent frames
       • Trees of depth one
       • One edge points to target, others to frame elements
       • Sem. roles point to syn. constituents
     • TIGER Syntax:
       • Node labels: constituents
       • Edge labels: gramm. functions
       • Crossing edges
       • POS
     • (They didn't want to pay the move back because the employee had quit.)
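The scheme on this slide (each frame is an independent tree of depth one, with one edge to the target and the others to frame elements) can be pictured as a small data structure. This is only a sketch: the class `Frame`, the frame and role names, and the node IDs are hypothetical illustrations, not taken from the SALSA release.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Frame:
    """One semantic frame: a tree of depth one over syntactic node IDs."""
    name: str    # frame name (hypothetical in this example)
    target: str  # ID of the syntactic node that holds the target word
    roles: Dict[str, List[str]] = field(default_factory=dict)  # role -> constituent IDs

    def edges(self) -> List[Tuple[str, str]]:
        """Return (label, node_id) pairs: one target edge, then one per role filler."""
        out = [("target", self.target)]
        for role, node_ids in self.roles.items():
            out.extend((role, node_id) for node_id in node_ids)
        return out


# Hypothetical annotation of the verb "quit" in the example sentence;
# the IDs stand in for nodes of the syntax tree.
quit_frame = Frame(name="Quitting", target="s1_12", roles={"Employee": ["s1_501"]})
print(quit_frame.edges())
```

Because every edge ends in a syntactic node ID, several frames over the same sentence stay independent of each other and never need to reference one another.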

  4. Experiences with the semantic role annotation in Salsa
     • Frame (~ sense) assignment more difficult than role assignment
     • Multiple tags possible, at frame level and at role level
     • Limited compositionality phenomena, each with separate annotation format in Salsa:
       • Light verbs, metaphor, idioms
       • Distinction often difficult: metaphor vs idiom, bleaching
       • If I did this again: one format, multiple tags possible
     • Annotation beyond the sentence boundary
       • Message role in Communication frames
     • Annotation below the word boundary: German noun compounds
       • Mietrechtsdiskussion: discussion of tenant law

  5. Encoding sem. role annotation: TIGER XML as a great basis
     • TIGER XML:
       • Each constituent is an XML element with a globally unique ID
       • Syn. edges explicitly encoded: an <edge> element links two nodes, referring to their IDs
       • Models discontinuous constituents
     • Salsa/Tiger XML:
       • Sem. annotation by adding a modular <sem> block to the XML structure of a sentence
       • Semantics points to syn. constituents using their IDs
       • Annotation beyond sentence boundary possible: globally unique syn. IDs
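To make the ID-based linking concrete, here is a minimal sketch in Python. The XML fragment is a simplified, hypothetical stand-in and not the actual Salsa/Tiger XML schema: the element and attribute names inside `<sem>`, the German toy sentence, and the helper names `yield_of` and `read_frames` are all assumptions for illustration. Only the general idea comes from the slide: constituents carry unique IDs, `<edge>` elements refer to those IDs, and a `<sem>` block points back into the syntax.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment; real corpus files carry more structure.
SENTENCE = """
<s id="s1">
  <graph root="s1_500">
    <terminals>
      <t id="s1_1" word="Der" pos="ART"/>
      <t id="s1_2" word="Angestellte" pos="NN"/>
      <t id="s1_3" word="kündigte" pos="VVFIN"/>
    </terminals>
    <nonterminals>
      <nt id="s1_501" cat="NP">
        <edge label="NK" idref="s1_1"/>
        <edge label="NK" idref="s1_2"/>
      </nt>
      <nt id="s1_500" cat="S">
        <edge label="SB" idref="s1_501"/>
        <edge label="HD" idref="s1_3"/>
      </nt>
    </nonterminals>
  </graph>
  <sem>
    <frame name="Quitting">
      <target idref="s1_3"/>
      <fe name="Employee" idref="s1_501"/>
    </frame>
  </sem>
</s>
"""


def yield_of(node_id, nodes):
    """Collect the words under a node by following <edge> idrefs down to terminals."""
    node = nodes[node_id]
    if node.tag == "t":
        return [node.get("word")]
    words = []
    for edge in node.findall("edge"):
        words.extend(yield_of(edge.get("idref"), nodes))
    return words


def read_frames(xml_text):
    """Return (frame name, target words, {role: words}) for each frame in a sentence."""
    root = ET.fromstring(xml_text.strip())
    # Index every syntactic node by its ID so semantics can point at it.
    nodes = {n.get("id"): n for n in root.iter() if n.tag in ("t", "nt")}
    frames = []
    for frame in root.iter("frame"):
        target = yield_of(frame.find("target").get("idref"), nodes)
        roles = {fe.get("name"): yield_of(fe.get("idref"), nodes)
                 for fe in frame.findall("fe")}
        frames.append((frame.get("name"), target, roles))
    return frames


print(read_frames(SENTENCE))
```

Since the semantic layer only stores IDs, it can be added or removed without touching the syntax, and an ID from a different sentence would resolve the same way if the node index spanned the whole corpus.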

  6. Extracting a lexicon: need for a deeper, richer syntax
     • Extracting syntax/semantics mapping:
       • Needs to identify gramm. functions filled by sem. roles
     • Problems:
       • Constituent structure rather than dependencies: subjects hard to retrieve
       • TIGER does not mark voice
       • Shallow format for PPs: determining heads is hard
       • Coordination is a pain
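The simplest version of the mapping step can be sketched as a lookup of the edge label that attaches a role filler to its parent. The edge triples, node IDs, role names, and the functions `grammatical_function` and `role_mapping` below are hypothetical; `SB`, `OA`, and `HD` are TIGER edge labels for subject, accusative object, and head.

```python
# (parent ID, edge label, child ID) triples for one hypothetical clause.
EDGES = [
    ("s1_500", "SB", "s1_501"),  # subject NP
    ("s1_500", "HD", "s1_3"),    # finite verb, the frame target
    ("s1_500", "OA", "s1_502"),  # accusative object NP
]


def grammatical_function(node_id, edges):
    """Return the label of the edge attaching node_id to its parent, or None."""
    for _parent, label, child in edges:
        if child == node_id:
            return label
    return None


def role_mapping(roles, edges):
    """Map each semantic role to the grammatical function of its filler."""
    return {role: grammatical_function(node_id, edges)
            for role, node_id in roles.items()}


# Hypothetical role fillers, given as syntactic node IDs.
roles = {"Employee": "s1_501", "Position": "s1_502"}
print(role_mapping(roles, EDGES))
```

This direct lookup only reads off the label of the filler's own incoming edge. The problems listed on the slide (unmarked voice, shallow PPs, coordination) are exactly the cases where that label alone would not be enough.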
