European Language Resources Association (ELRA). HLT Evaluations. Khalid CHOUKRI ELRA/ELDA 55 Rue Brillat-Savarin, F-75013 Paris, France Tel. +33 1 43 13 33 33 -- Fax. +33 1 43 13 33 30 Email: [email protected] http://www.elda.org/ or http://www.elra.info/.
The mission of the Association is to promote language resources (henceforth LRs) and evaluation for the Human Language Technology (HLT) sector in all their forms and all their uses.
ELRA: An efficient infrastructure to serve the HLT Community
Strategies for the next Decade … New ELRA status:
Meeting points with technology development
which have been
Long term / high risk
Large return on investment
What about good technology? …
Software industry
Technology performance & Applications
MT users (humans or software, e.g. CLIR) want to improve productivity using the most suitable MT system (e.g. multilinguality)
HLT Evaluations … For whom
A.1) Face Detection
A.2) Visual Person Tracking
A.3) Visual Speaker Identification
A.4) Head Pose Estimation
A.5) Hand Tracking
B) Sound and Speech technologies
B.1) Close-Talking Automatic Speech Recognition
B.2) Far-Field Automatic Speech Recognition
B.3) Acoustic Person Tracking
B.4) Acoustic Speaker Identification
B.5) Speech Activity Detection
B.6) Acoustic Scene Analysis
C) Contents Processing technologies
C.1) Automatic Summarisation … Question Answering
Some of the technologies being evaluated within CHIL … http://chil.server.de/
more at the CHIL/CLEAR workshops
ARCADE II: evaluation of bilingual corpora alignment systems.
CESART: evaluation of terminology extraction systems.
CESTA: evaluation of machine translation systems (Ar, Eng => Fr).
EASY: evaluation of parsers.
ESTER: evaluation of broadcast news automatic transcribing systems.
EQUER: evaluation of question answering systems.
EVASY: evaluation of speech synthesis systems.
MEDIA: evaluation of in- and out-of-context dialog systems.
Evaluation Projects … The French scene
Some projects in NL, Italy, ...
European Parliament Plenary Sessions (EPPS): English (En) and Spanish (Es),
Broadcast News (Voice of America, VoA): Mandarin Chinese (Zh) and English (En)
Back to Evaluation Tasks within TC-STAR (http://www.tc-star.org/)
Input = Text,
Evaluation of ASR (Rover) + SLT (Rover) + TTS (UPC) system
Same segments as for SLT human evaluation
Adequacy: comprehension test
Fluency: judgement test with several questions related to fluency and also usability of the system
End-to-End
1: Not at all ... 5: Yes, absolutely
[Fluent Speech] Is the speech in good Spanish?
1: No, it is very bad ... 5: Yes, it is perfect
[Effort] Rate the listening effort
1: Very high ... 5: Low, as natural speech
[Overall Quality] Rate the overall quality of this audio sample
1: Very bad, unusable ... 5: It is very useful
Fluency questionnaire
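The slides do not say how the questionnaire answers are combined. As a rough sketch only, one common way to summarise 1-to-5 judgements is a mean score per question. The function name `mean_scores`, the data layout and the sample ratings below are all hypothetical and are not taken from TC-STAR. Only the three questions whose labels appear in the questionnaire are covered.

```python
from statistics import mean

# Question labels as they appear in the questionnaire above.
QUESTIONS = ("Fluent Speech", "Effort", "Overall Quality")


def mean_scores(judgements):
    """Return the mean 1-5 rating per question (None if nobody answered).

    `judgements` is a list of dicts, one per listener judgement, mapping a
    question label to an integer rating. This layout is an assumption made
    for illustration, not the format used in the actual evaluation.
    """
    scores = {}
    for question in QUESTIONS:
        values = [j[question] for j in judgements if question in j]
        for value in values:
            if not 1 <= value <= 5:
                raise ValueError(f"rating out of range for {question!r}: {value}")
        scores[question] = mean(values) if values else None
    return scores


# Made-up ratings from two listeners for one audio sample.
judgements = [
    {"Fluent Speech": 4, "Effort": 3, "Overall Quality": 4},
    {"Fluent Speech": 2, "Effort": 3, "Overall Quality": 3},
]

for question, score in mean_scores(judgements).items():
    print(f"{question}: {score}")
```

On every scale the higher value is the better outcome (5 on [Effort] means low listening effort), so the per-question means can be read in the same direction without inverting any scale.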