
Lending a Hand: Sign Language Machine Translation



Lending a Hand: Sign Language Machine Translation. Sara Morrissey, NCLT Seminar Series, 21st June 2006. Overview: Introduction (What, why, how…?); Out with the old… (SL Corpora, The System); …in with the new (*new and improved*); Lost in Translation (Evaluation issues); Conclusion.


Lending a Hand:

Sign Language Machine Translation

Sara Morrissey

NCLT Seminar Series

21st June 2006


Overview

  • Introduction

    • What, why, how…?

  • Out with the old…

    • SL Corpora

    • The System

  • …in with the new

    • *new and improved*

  • Lost in Translation

    • Evaluation issues

  • Conclusion


Introduction

  • WHAT ?

  • Sign Language

  • Visually articulated language

  • Linguistic phenomena prevalent in SLs

    • Classifiers

    • Non-manual features (NMFs)

    • Discourse mapping and use of signing space


Introduction (2)

  • WHY ?

  • a) Improve communication

    b) Stretch the application of EBMT

  • HOW?

  • Our approach

    • Annotated SL corpora

    • Example-based MT employing Marker Hypothesis (Green, 1979)


Introduction (3)

  • Other approaches

    • Transfer – Grieve-Smith, 1999; Marshall & Sáfár, 2002; Sáfár & Marshall, 2002; Van Zijl & Barker, 2003

    • Interlingua – Veale et al., 1998; Zhao et al., 2000

    • Multi-path – Huenerfauth, 2004, 2005

    • Statistical – Bauer et al., 1999; Bungeroth & Ney, 2004, 2005, 2006


Out with the old…Corpora

  • Difficult to find

  • ECHO project

  • Nederlandse Gebarentaal (NGT) corpora

    • 40 minutes of video data

    • 5 Aesop’s fables by two signers and SL poetry

    • Combined corpus of 561 sentences


Out with the old… Annotation

  • Why annotate?

    • No formal written form for SLs

    • Linguistic description including NMFs

    • Can include translation making corpus bi/trilingual

    • Timing information for chunking and aligning is present

  • ELAN annotation toolkit

    • Graphical user interface displaying videos and annotations simultaneously (Fig. 1)

    • Time-aligned and non-time-aligned annotations including NMF description, repetition notation and notes on indexing and role.



Out with the old…The System

  • Segmentation using the ‘Marker Hypothesis’ (MH) (Green, 1979)

    • Analogous to the system of Way & Gough (2003) and Gough & Way (2004a/b)

    • Segments spoken language sentences according to a set of closed class words

    • Chunks start with closed class words and usually encapsulate a concept or an attribute of a concept, forming concept chunks, e.g. <CONJ> or with tiny curls
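
The segmentation described above can be sketched roughly as follows. This is a minimal illustration, not the system's actual code: the `MARKERS` lexicon and the function name `marker_chunks` are hypothetical, and the rule that a new chunk opens only once the current one holds a non-marker word is an assumption, chosen so that the slide's example stays a single <CONJ> chunk.

```python
# Hypothetical, heavily abbreviated marker lexicon; a real system would use
# much fuller closed-class word lists.
MARKERS = {
    "or": "CONJ", "and": "CONJ", "but": "CONJ",
    "with": "PREP", "in": "PREP", "to": "PREP", "on": "PREP",
    "the": "DET", "a": "DET",
    "i": "PRON", "you": "PRON", "he": "PRON",
}


def marker_chunks(sentence):
    """Split a sentence into (tag, words) chunks that each begin at a marker word."""
    chunks = []
    tag, words, has_content = None, [], False
    for word in sentence.lower().split():
        marker_tag = MARKERS.get(word)
        if marker_tag and has_content:
            # A marker word after at least one content word closes the
            # current chunk and opens a new one tagged by this marker.
            chunks.append((tag, words))
            tag, words, has_content = marker_tag, [], False
        elif marker_tag and not words:
            # A marker at the very start of a chunk supplies the chunk's tag.
            tag = marker_tag
        words.append(word)
        if marker_tag is None:
            has_content = True
    if words:
        chunks.append((tag, words))
    return chunks


print(marker_chunks("or with tiny curls"))
print(marker_chunks("the mouse ran to the lion"))
```

Under these assumptions, "or with tiny curls" stays one chunk tagged CONJ, because "with" follows "or" before any content word has been seen.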


Out with the old…The System (2)

  • MH not suitable for use with SL side of corpus due to sparseness of closed class item markers

    • NGT gloss tier segmented based on time spans of its annotations, remaining annotations with same time span grouped with gloss tier segments forming concept chunks similar to English marker chunks

    • Despite different methods, they are successful in forming potentially alignable concept chunks
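
A rough sketch of the time-span grouping on the NGT side, under stated assumptions: annotations are represented as hypothetical `(tier, start_ms, end_ms, value)` tuples rather than in ELAN's own file format, the timings and the `MOUSE` gloss are invented for illustration, and the function name `group_by_time_span` is not from the original system.

```python
def group_by_time_span(annotations, gloss_tiers=("Gloss RH", "Gloss LH")):
    """Group annotations into chunks keyed by the time spans of the gloss tiers."""
    # Time spans that carry a gloss define the chunk boundaries.
    spans = sorted({(start, end) for tier, start, end, _ in annotations
                    if tier in gloss_tiers})
    chunks = {span: [] for span in spans}
    for tier, start, end, value in annotations:
        # Annotations on any tier join the chunk with the identical time span;
        # in this sketch, annotations with no matching gloss span are left out.
        if (start, end) in chunks:
            chunks[(start, end)].append((tier, value))
    return [chunks[span] for span in spans]


# Invented timings; tier names and values follow the slide's NGT chunk.
annotations = [
    ("Gloss RH", 0, 800, "TINY CURLS"),
    ("Gloss LH", 0, 800, "TINY CURLS"),
    ("Repetition RH", 0, 800, "u"),
    ("Eye Gaze", 0, 800, "l,d"),
    ("Gloss RH", 800, 1500, "MOUSE"),
]
for chunk in group_by_time_span(annotations):
    print(chunk)
```

Each resulting chunk bundles the glosses with the non-manual and repetition annotations that share their time span, which is what makes it comparable to an English marker chunk.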


Out with the old…The System (3)

English chunk

<CONJ> or with tiny curls

NGT chunk

<CHUNK>

(Gloss RH) TINY CURLS

(Gloss LH) TINY CURLS

(Repetition RH) u

(Repetition LH) u

(Eye Gaze) l,d


Out with the old…The System (4)

  • Searches for exact sentence match in aligned bilingual corpus

  • Uses MH to segment the input and searches for matching or closely matching chunks in the English side of the aligned corpus

  • Looks for individual words in the bilingual lexicon
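
One way to read the three steps above is as a back-off cascade, sketched below. That ordering is an assumption, as are all names (`translate`, `toy_segment`) and the toy data, whose glosses are invented. Close matching of chunks is left out; only exact chunk matches are handled.

```python
def translate(sentence, sentence_pairs, chunk_pairs, lexicon, segment):
    """Back off from whole sentences to chunks to single words.

    `segment` is any callable that splits a sentence into chunk strings,
    for example a Marker Hypothesis segmenter.
    """
    key = sentence.lower().strip()
    # 1. Exact sentence match in the aligned bilingual corpus.
    if key in sentence_pairs:
        return sentence_pairs[key]
    output = []
    for chunk in segment(key):
        # 2. Chunk-level match (exact only in this sketch).
        if chunk in chunk_pairs:
            output.append(chunk_pairs[chunk])
            continue
        # 3. Word-by-word lookup; words missing from the lexicon are skipped.
        output.extend(lexicon[w] for w in chunk.split() if w in lexicon)
    return " ".join(output)


def toy_segment(sentence, markers=("or", "the")):
    """Stand-in segmenter: start a new chunk at each listed marker word."""
    chunks = []
    for word in sentence.split():
        if word in markers or not chunks:
            chunks.append([word])
        else:
            chunks[-1].append(word)
    return [" ".join(c) for c in chunks]


# Invented toy resources, for illustration only.
sentence_pairs = {"the mouse promised to help": "MOUSE PROMISE HELP"}
chunk_pairs = {"or with tiny curls": "TINY CURLS"}
lexicon = {"mouse": "MOUSE"}

print(translate("The mouse promised to help", sentence_pairs, chunk_pairs, lexicon, toy_segment))
print(translate("the mouse or with tiny curls", sentence_pairs, chunk_pairs, lexicon, toy_segment))
```

The first call is resolved at the sentence level; the second falls through to a chunk match for "or with tiny curls" and a lexicon lookup for "mouse".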


Out with the old…Experiments

  • English and Dutch to NGT (Morrissey & Way, 2005)

    • 100 sentences

    • Annotations subjective so evaluation difficult, but promising results

  • NGT to English and Dutch

    • Traditional MT evaluation metrics can be applied (SER, WER, PER, BLEU)

    • Sparse output and low scores due to lack of closed class lexical items in NGT

    • Common marker word insertion


Out with the old…Experiments (2)

  • SER 96% WER 119%

  • PER 78% BLEU 0

  • Example output and reference translation:

    • mouse promised help

    • “you see” said the mouse, “I promised to help you”
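
To make the word-level metrics concrete, here is a minimal sketch of WER and PER applied to the example pair above. The tokenisation (lowercased, punctuation stripped) is an assumption, PER is given in one common formulation, and the corpus-level scores reported on the slide are not reproduced by this single pair.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: edit distance normalised by reference length."""
    return edit_distance(ref, hyp) / len(ref)


def per(ref, hyp):
    """Position-independent error rate: like WER, but ignoring word order."""
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


ref = "you see said the mouse i promised to help you".split()
hyp = "mouse promised help".split()
print(wer(ref, hyp), per(ref, hyp))

# WER is normalised by the reference length, so an output much longer than
# the reference can score above 100%.
print(wer(["help"], ["the", "mouse", "will", "help"]))
```

For this pair all three output words appear in the reference in order, so every error is a missing word: 7 of 10 reference words, giving 0.7 for both metrics. The last line shows how a WER above 100%, as in the reported 119%, can arise when the output contains more words than the reference.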


…in with the new

  • New Corpus

    • ~1400 sentences (SunDial and ATIS corpora)

    • Flight information queries

  • ISL signed video version

  • Homespun annotation

    • With view to end product

  • New system

    • OpenLab


Lost in Translation: Evaluation issues

  • Mainstream evaluation techniques

    • Exact text matching

    • No recognition of synonyms, syntactic structure, semantics

    • SLs no gold standard

  • Other possible evaluation metrics

    • Number of content words/number of words in ref translation

    • Evaluation of syntactic or semantic relations


Conclusions

  • Basic system

  • Corpus problems - Larger corpus such as ISL one in creation, more scope for matches, annotations subjective

  • EBMT caters for some SL linguistic phenomena

  • Evaluation metrics unsuitable for oral <-> non-oral translation


Future Work

  • Adding in NMF information

  • Manual analysis

  • Language model to improve output

  • Suitable evaluation metrics

  • Review other writing systems for SLs

  • Avatar…


Thank You

Questions?

