
Ulrich Heid, IMS-CL, Universität Stuttgart



Presentation Transcript


  1. Ulrich Heid, IMS-CL, Universität Stuttgart
     Comments on Emanuele Pianta: Exploiting Parallel Texts to leverage the manual annotation bottleneck: the MultiSemCor case

  2. The methodology: transfer of annotations
     • It does around 75% of the annotation work
     • It produces
       • an annotated TL corpus (pos, lemma, sense)
       • an annotated parallel corpus
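     A minimal sketch of what such a transfer step could look like, assuming 1:1 word alignment links and a lookup table standing in for a parallel WordNet. The names `Token`, `transfer_annotations` and `target_lemmas`, the synset identifiers and the example sentence are illustrative assumptions, not part of the MultiSemCor system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    form: str
    pos: Optional[str] = None
    lemma: Optional[str] = None
    sense: Optional[str] = None  # synset identifier (made-up IDs below)


def transfer_annotations(source, target, alignment, target_lemmas):
    """Copy pos and sense from source tokens to aligned target tokens.

    alignment: list of (source_index, target_index) pairs, 1:1 links only.
    target_lemmas: hypothetical lookup (synset id, target form) -> target lemma,
    standing in for a parallel WordNet.
    Returns the fraction of target tokens that received a sense.
    """
    for s_idx, t_idx in alignment:
        src, tgt = source[s_idx], target[t_idx]
        if src.sense is None:
            continue  # nothing to transfer from an unannotated source token
        lemma = target_lemmas.get((src.sense, tgt.form.lower()))
        if lemma is None:
            continue  # no target-language entry for this synset: left for manual annotation
        # Copying the pos assumes a pos-preserving translation.
        tgt.pos, tgt.lemma, tgt.sense = src.pos, lemma, src.sense
    annotated = sum(1 for t in target if t.sense is not None)
    return annotated / len(target) if target else 0.0


# Invented example: "The dogs sleep" / "I cani dormono"
source = [
    Token("The"),
    Token("dogs", pos="NOUN", lemma="dog", sense="dog.n.01"),
    Token("sleep", pos="VERB", lemma="sleep", sense="sleep.v.01"),
]
target = [Token("I"), Token("cani"), Token("dormono")]
alignment = [(0, 0), (1, 1), (2, 2)]
target_lemmas = {("dog.n.01", "cani"): "cane"}  # the verb is deliberately missing

coverage = transfer_annotations(source, target, alignment, target_lemmas)
```

     In this toy run only "cani" receives an annotation; the verb stays unannotated because the lookup has no entry for it, which is the kind of residue that would be left to manual annotation.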

  3. Transfer of annotations: required infrastructure
     • „Controlled“ translation: sentence-wise, pos-preserving where possible
     • Multiword recognition
     • Parallel WordNets: Princeton ↔ Target Language
     Problems could arise:
     • with „free“ translations (cf. Translation Memories)
     • with more „deviant“ WordNets, e.g. GermaNet

  4. Analysing the transfer result
     Systematic cases of non-alignment:
     • lack of „cross-linguistic synonymy“
     • translation not 1:1
       • not pos-preserving: coexist - coesistenza
       • 1:2: successfully - con successo
     • Do we get the same problems as those discussed as „divergences“/„mismatches“ in MT?
     • Would a marking of chunks in SL/TL help?
     • Would a morphology system help?
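     The non-alignment cases listed above could be detected mechanically from the alignment links. The sketch below is one possible way to do so, assuming pos tags are available on both sides; the function name `classify_links`, the label strings and the pos tags are assumptions made for illustration, and the pair peace - pace and the unaligned "the" are invented filler around the two examples from the slide.

```python
from collections import defaultdict


def classify_links(source_pos, target_pos, alignment):
    """Label every source token by the kind of alignment it takes part in.

    source_pos / target_pos: lists of pos tags, one per token.
    alignment: list of (source_index, target_index) pairs.
    Returns a dict source_index -> label.
    """
    targets = defaultdict(list)
    for s_idx, t_idx in alignment:
        targets[s_idx].append(t_idx)

    labels = {}
    for s_idx, pos in enumerate(source_pos):
        linked = targets.get(s_idx, [])
        if not linked:
            labels[s_idx] = "unaligned"
        elif len(linked) > 1:
            labels[s_idx] = "1:n"           # e.g. successfully - con successo
        elif target_pos[linked[0]] != pos:
            labels[s_idx] = "pos-changing"  # e.g. coexist - coesistenza
        else:
            labels[s_idx] = "1:1"
    return labels


# Token lists, not a sentence: coexist, successfully, peace, the
source_pos = ["VERB", "ADV", "NOUN", "DET"]
# coesistenza, con, successo, pace
target_pos = ["NOUN", "ADP", "NOUN", "NOUN"]
alignment = [(0, 0), (1, 1), (1, 2), (2, 3)]

labels = classify_links(source_pos, target_pos, alignment)
```

     Only tokens labelled "1:1" would be safe candidates for direct synset transfer under the pos-preserving assumption; the other labels could be reported to the annotator.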

  5. Towards relaxing the conditions on the infrastructure
     • To get the system to work under suboptimal conditions
     • Would the integration of morphological relations across pos be useful? (Yes for alignment, no for WN synset transfer)
     • Could the system be made „aware“ of transfer problems (and signal these to the user)?
     • Test with e.g. Acquis Communautaire?
     • Test with Germanic languages?
