
Data Collection in Nespole!

Recent Advances in Speech Translation Systems. Data Collection in Nespole!: Goals, procedures and tools. Susanne Burger (Carnegie Mellon University), Erica Costantini (University of Trieste).


Presentation Transcript


  1. Recent Advances in Speech Translation Systems. Data Collection in Nespole!: Goals, procedures and tools. Susanne Burger (Carnegie Mellon University), Erica Costantini (University of Trieste)

  2. New Idea: Learning by Data. Why data collection?
     • Speech Material:
        - Domain, concept, vocabulary
        - Style (human-machine conversation)
        - Quality (robustness)
        - ...
     • Information about Users:
        - Acceptance
        - Usage
        - Behavior
        - Wish-list
        - Problem solving
        - ...
     • System Information (Dry Run):
        - Stability
        - Speed
        - Bugs
        - ...
     J.T. Hackos, J.C. Redish, User and Task Analysis for Interface Design, J. Wiley & Sons, 1998.

  3. Learning by Data
     • 1: Mass data from scratch: artificial scenario / environment / set-up, Wizard of Oz, cooperative user / actor
     • 2: Data collection through usage of a beta system with increasing realism (user study)
     • Diagram labels: Data Corpus; Data; Beta System; Analysis, Development, Training, Testing, Evaluation

  4. Data Collection: Planning
     • Who are the “Data Customers”? Nespole!: ASR, MT, Synthesis, Interface Development, ...
     • Customer needs? Nespole!: Audio / Video, Transcription (levels of transcription), Segmentation
     • Data usage? Nespole!: Analysis, Development, Training, Testing, Evaluation
     • Type of collection? Nespole!: Mass data collection, Specific features, User study
     • Time and budget

  5. Nespole! Data Collection. IDEA: NEgotiation through SPOken Language in E-commerce
     • Mass-Data Collection (Showcase 1): travel scenario / H323 set-up, monolingual, cooperative users
     • Multimodal Experiment: travel + multimodality, MT, unseen users
     • Diagram labels: Data Corpus; Beta System; Nespole Showcase1 System; Analysis, Development, Training, Testing, Evaluation

  6. Data Collection Procedure. Example: Mass-Data Collection (Showcase 1), monolingual data collection for system development
     • “Assembling Line”: Scen./Topic, Recording, Data, Participants, Environment, Equipment

  7. Scen./Topic Recording Data Participants Environment Equipment

  8. Scenarios
     • Scenario: a “story” about users, their work, their environment, how they do tasks, the tasks they need to do, and all combinations of these elements (*).
     • Scenario in Nespole!: detailed description of:
        - the customers’ features (age, marital status, ...);
        - the destination of the travel;
        - the objectives and preferences for the holiday (accommodation, sport activities, cultural events, ...)
     (*) J. M. Carroll, Ed., Scenario-Based Design: Envisioning Work and Technology in System Development, New York, J. Wiley & Sons, 1995.

  9. Scenarios in Nespole!

  10. Scenario example. Situation (winter holidays in Val di Fiemme):
     • choose your vacation starting date after December 10th
     • you want to stay there for (a weekend, 1 week, 2 weeks)
     • you have 2 children (choose 2 ages between 2 and 11) and wife/husband
     • you want to travel by car and park it at the hotel
     • you already know the road to Val di Fiemme
     • you want accommodation in ** or *** hotels in Val di Fiemme with bed & breakfast
     • choose two hotels among: Latemar in Molina, Bellavista in Cavalese, Excelsior in Cavalese, Lagorai in Cavalese, Belvedere in Panchia, Bellaria in Predazzo, Cimon in Predazzo, Erica in Tesero, Lucia in Tesero, Montanara in Ziano, Zanon in Ziano
     • you want to practice a winter sport (choose your favorite winter sport among the following: downhill skiing, cross-country skiing/snowshoeing, ice skating, snowboarding)

  11. Scenario example. Things to ask for:
     • prices and how far in advance to book
     • types of ski-lifts nearby and their distance from the hotel
     • existence of cross-country trails and ice skating areas
     • details about favorite winter sport (exact location, prices, possibility of renting equipment)
     • type of parking facilities for the car
     • possibility of eating in the hotel and prices of dinner and late supper
     • daycare and activities for children in the hotel
     • special prices for children

  12. Scenario definition in Nespole! Example: Showcase 1
     • analysis of 5000 e-mail messages (in four languages);
     • clustering of the e-mails on the basis of the request type;
     • selection of e-mails concerning requests which could be discussed by phone;
     • construction of 21 scenarios;
     • selection of 5 scenarios* among the 21 (done by the APT tourist board office manager)
     * http://www.is.cs.cmu.edu/nespole/datacoll.html

  13. Scen./Topic Recording Data Participants Environment Equipment

  14. Participants
     • CUSTOMERS:
     • AGENTS: Italian professional agents working at the Trentino tourist office (APT)

  15. Scen./Topic Recording Data Participants Environment Equipment

  16. Environment
     • APT (agent’s site, Italy) records the English client via the H323 connection and the Italian agent via headset
     • CMU (client’s site, USA) records the Italian agent via the H323 connection and the English client via headset
     • Diagram: at each site, the local speaker and the remote speaker (via H323) are recorded to stereo .wav files
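Since each site stores its recordings as stereo .wav files, a downstream consumer such as ASR training may want each channel as a separate mono file. The following is only a minimal sketch using Python's standard `wave` module; it assumes uncompressed PCM audio and that each speaker occupies one channel, which the slide does not state explicitly. The function name `split_stereo` is hypothetical and not part of any Nespole! tool.

```python
import io
import wave
from typing import Tuple


def split_stereo(stereo_bytes: bytes) -> Tuple[bytes, bytes]:
    """Split stereo PCM WAV data into two mono WAV byte strings (left, right)."""
    with wave.open(io.BytesIO(stereo_bytes), "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a stereo recording")
        width = src.getsampwidth()          # bytes per sample
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Stereo PCM frames are interleaved: left sample, right sample, left, ...
    step = 2 * width
    left = b"".join(frames[i:i + width] for i in range(0, len(frames), step))
    right = b"".join(frames[i + width:i + step] for i in range(0, len(frames), step))

    def to_wav(mono: bytes) -> bytes:
        # Wrap raw mono samples in a WAV container with the original format.
        buf = io.BytesIO()
        with wave.open(buf, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(width)
            dst.setframerate(rate)
            dst.writeframes(mono)
        return buf.getvalue()

    return to_wav(left), to_wav(right)
```

Which channel carries the local speaker and which the H323 signal would have to be taken from the recording protocol; the sketch makes no claim about that.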

  17. Scen./Topic Recording Data Participants Environment Equipment

  18. Equipment
     • Hardware: PC, Pentium 200 and up
     • Software: Windows NT or Win 98, Total Recorder, NetMeeting 3.01
     • Microphone: headset or close microphone
     • Environment: quiet office

  19. Scen./Topic Recording Data Participants Environment Equipment

  20. Recording Procedure (customer’s site)

  21. Recording: LTI Data Collection Database. An Oracle database, accessible online, containing detailed information and descriptions about the meetings recorded, demographics of the speakers, transcriptions and audio files (currently two separate interfaces to enter data into and retrieve data from the database)

  22. Scen./Topic Recording Data Participants Environment Equipment

  23. 2 stereo wav files; Spr protocol; Rpr protocol; video tapes (200 collected dialogues)

  24. File naming conventions (example from Nespole!). Why? Confusion with parallel recordings; different types of files concerning the same recording; different languages, types of scenario, locations; stereo vs mono files, etc.

  25. Log data: recording protocol

  26. Log data: speaker protocol

  27. Audio Data; Transcription process (Transcription Conventions, Transcription Tool); TRL Files, MAR Files, Voc Lists. Transcript sample:
     ...
     m054_1_0575_QXE_00: if it was , I don't know , in the beginning of the century , I would think so , but .
     m054_5_0576_MTY_00: yeah , I mean , +/we d=/+ we don't know a lot <B> about anything .
     m054_4_0577_ZMW_00: but +/even/+ I think even if they would have known a little bit more . <B> think about all these chicken farms or things like all this <B> +/k=/+ kind of really <B> terrible <B> behavior against animals , anyway . <B> so , +/I/+ +/I don't think/+ <B> <hes> +/I th=/+ I think as soon as some financial or land things or things like this <B> came into the game , <B> they don't think anymore <Laugh> about <B> animal behavior . this is +/ku=/+ just <B> <Noise> secondary% .
     m054_3_0578_AAH_00: <hm>
     m054_5_0579_MTY_00: right . <B>
     m054_4_0580_ZMW_00: so , <B> this...

  28. Audio Data; Transcription process (Transcription Conventions, Transcription Tool); TRL Files, MAR Files, Voc Lists (roadmap slide repeated, same transcript sample as slide 27)

  29. Transcription (trl) Conventions
     • Verbmobil II:
        - we are familiar with VMB and we have appropriate tools
        - BAS Partitur format
        - finite/closed system (parsing, filtering, converting)
        - line oriented, no formats (one line/turn)
        - turn oriented (turn-IDs contain full identification)
        - time stamps and trl are in different files linked by turn-ID
        - (http://www.is.cs.cmu.edu/trl_conventions/)
     S. Burger, L. Besacier, P. Coletti, F. Metze and C. Morel, “The NESPOLE! VoIP Dialogue Database”, in Proc. of Eurospeech 2001, Aalborg, Denmark.
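Because turn IDs carry the full identification and also link the transcription to the time stamps kept in a separate file, a small parser can illustrate the idea. This is a sketch only: the pattern is inferred from IDs such as `m054_1_0575_QXE_00` in the transcript sample, and the field names (`dialogue`, `channel`, `turn`, `speaker`, `suffix`) are assumptions, not the documented meaning of each field. The marker data passed to `link_turns` is likewise a hypothetical in-memory stand-in for the real marker file.

```python
import re
from typing import Dict, NamedTuple, Tuple


class TurnId(NamedTuple):
    dialogue: str   # e.g. "m054" (assumed: recording/dialogue code)
    channel: int    # e.g. 1     (assumed: speaker or channel number)
    turn: int       # e.g. 575   (running turn counter)
    speaker: str    # e.g. "QXE" (assumed: speaker code)
    suffix: str     # e.g. "00"  (meaning not stated in the slides)


TURN_ID = re.compile(r"^([a-z]\d{3})_(\d)_(\d{4})_([A-Z]+)_(\d{2})$")


def parse_turn_id(text: str) -> TurnId:
    """Split a turn ID into its underscore-separated fields."""
    match = TURN_ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a turn ID: {text!r}")
    dialogue, channel, turn, speaker, suffix = match.groups()
    return TurnId(dialogue, int(channel), int(turn), speaker, suffix)


def link_turns(
    transcripts: Dict[str, str],
    markers: Dict[str, Tuple[float, float]],
) -> Dict[str, Tuple[Tuple[float, float], str]]:
    """Pair each transcribed turn with its (start, end) times via the shared turn ID."""
    return {
        turn_id: (markers[turn_id], text)
        for turn_id, text in transcripts.items()
        if turn_id in markers
    }
```

Keeping text and time stamps in separate files means either can be corrected without touching the other, as long as the turn IDs stay stable.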

  30. Content
     - words. Orthography:
        - orthographic rules as long as they are non-ambiguous
        - no capitalization in case of initial sentence position
        - vocabulary lists to keep vocabulary spelled the same
        • word tags • non-grammatical phrases • broken words • interrupted words • acoustically hard to understand • pauses and breathing • filled pauses • acoustically not understandable • human noise
     - elements
     - rules
        • capitalization • punctuation • white space • turn-end • syntax

  31. Transcription symbols:
     <*tENG>                      Foreign Language Turn (JAP, GER, ..)
     ;..                          Global Comment
     ..'..                        Apostrophe (reduced word)
     ..-.. (--)                   Hyphen (compound word)
     $..                          Spelled Letter
     ~..                          Name
     #..                          Number
     *..                          Neologism/Mispronunciation
     <*XXX..                      Foreign Word (FRA, ITA, ..)
     ...<L>.. / ..<Z>..           Lengthening
     ..%                          Poorly Intelligible
     ..=                          Articulated Break-off
     .._                          Interruption of a Word, Left Fragment
     _..                          Interruption of a Word, Right Fragment
     <T_>..                       Technical Interruption of a Word, Beginning
     ..<_T>                       Technical Interruption of a Word, End
     <*T>                         Technical Interruption of a Turn
     <*T>t                        Technical Break-off of a Turn
     <!n ..>                      Comment on Pronunciation
     . / ? / ,                    Punctuation
     +/..                         Beginning of a Repetition/Correction
     ../+                         End of a Repetition/Correction
     -/..                         Beginning of a False Start
     ../-                         End of a False Start
     <B> / <A>                    Respiration
     <uh> / <"ah>                 Filled Pause (Hesitation)
     <uhm> / <"ahm>               Filled Pause (Hesitation)
     <hm>                         Filled Pause (Hesitation)
     <hes> / <h"as>               Filled Pause (Hesitation)
     <%>                          Unidentifiable Sound Production
     <Smack> / <Schmatzen>        Nonverbal Articulatory Sound (sound: smacking)
     <Swallow> / <Schlucken>      Nonverbal Articulatory Sound (sound: swallowing)
     <Throat> / <R"auspern>       Nonverbal Articulatory Sound (sound: clearing one's throat)
     <Cough> / <Husten>           Nonverbal Articulatory Sound (sound: cough)
     <Laugh> / <Lachen>           Nonverbal Articulatory Sound (sound: laughing)
     <Noise> / <Ger"ausch>        Nonverbal Articulatory Sound (other sounds)
     <#Click> / <#Klicken>        Technical Noise
     <#Ring> / <#Klingeln>        Technical Noise
     <#Knock> / <#Klopfen>        Technical Noise
     <#Mtouch> / <#Mikrobe>       Technical Noise
     <#Mwind> / <#Mikrowind>      Technical Noise
     <#Rustle> / <#Rascheln>      Technical Noise
     <#Squeak> / <#Quietschen>    Technical Noise
     <#>                          Technical Noise
     <P>                          Pause during Speech
     @n..                         Active Interference by a Speaker
     ..n@                         Passively Interfered Speaker
     <@n..                        Active Interference by Acoustic Events
     ..n@>                        Passive Interference of Acoustic Events
     <:<..> ..                    Beginning of Noise Interference
     ..:>                         End of Noise Interference
     <;..>                        Local Comment
     !KEY!..                      Code Word
     <PP>                         Scenario Caused Pause
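To illustrate how a "data customer" such as ASR or MT might consume text annotated with these symbols, the sketch below strips a subset of the marks from a turn to obtain a plain word sequence. It is an assumption-laden illustration, not the project's filter: it handles only repetitions/corrections (`+/../+`), false starts (`-/../-`), angle-bracket tags, break-offs (`..=`) and the intelligibility mark (`..%`), and the choice to discard repeated and broken-off material rather than keep it is ours. The function name `clean_turn` is hypothetical.

```python
import re


def clean_turn(text: str) -> str:
    """Return the turn text with a subset of the annotation marks removed."""
    text = re.sub(r"\+/.*?/\+", " ", text)      # drop repetitions/corrections
    text = re.sub(r"-/.*?/-", " ", text)        # drop false starts
    text = re.sub(r"<\*T>(?:t\b)?", " ", text)  # technical interruption/break-off of a turn
    text = re.sub(r"<[^<>]*>", " ", text)       # tags such as <B>, <Laugh>, <#Click>
    words = []
    for word in text.split():
        if word.endswith("="):                  # articulated break-off, e.g. "t="
            continue
        word = word.rstrip("%")                 # poorly intelligible word: keep the word itself
        if word:
            words.append(word)
    return " ".join(words)
```

For example, the turn `yeah , I mean , +/we d=/+ we don't know a lot <B> about anything .` reduces to `yeah , I mean , we don't know a lot about anything .`; interference marks, comments and word-fragment marks are left untouched by this sketch.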

  32. Audio Data; Transcription process (Transcription Conventions, Transcription Tool); TRL Files, MAR Files, Voc Lists (roadmap slide repeated, same transcript sample as slide 27)

  33. Transcription Tools: Why another tool?
     • Other requirements than before:
        - Windows instead of Linux
        - Meetings: multiparty transcription
        - Transcribers from different backgrounds
     • At that time (over three years ago) there wasn’t a sufficient transcription tool.
     • We did a study of what the basic requirements would be.
     • We asked transcribers what they would find convenient.
     • We programmed a beta tool according to that.
     • We are still using this tool (and so do several other places in the meantime).
     • We call it TransEdit.

  34. TransEdit: transcription tool just for transcribers
     • MFC program
     • Windows text editor
     • click-able buttons for transcription elements
     • automatic turn naming and counting
     • label editor
     • parallel display of multi audio signals
     • easy turn segmentation
     • lots of listen functions
     • easy handling, no research functions
     • “home work” but available for universities
     • (write to: sburger@cs.cmu.edu)

  35. Audio Data; Transcription process (Transcription Conventions, Transcription Tool); TRL Files, MAR Files, Voc Lists (roadmap slide repeated, same transcript sample as slide 27)

  36. TRL file example:
     ; CDR: 00.00
     ; TRV: 00.00
     ; File: e025at
     ; Last changes made on 09/29/2000
     ; Transcriber: VLM
     ; Comments:
     ;
     e025_1_0000_ITL_00: hello ? <P> can you hear me now ?
     e025_2_0001_XYZABC_00: hello .
     e025_1_0002_ITL_00: hello% . yeah% .
     e025_2_0003_XYZABC_00: <uh> yes , I can .
     e025_1_0004_ITL_00: yes , okay . <P> so ?
     e025_2_0005_XYZABC_00: -/hi I would like/- <P> yes ?
     e025_1_0006_ITL_00: yes , can you hear me now ?
     e025_2_0007_XYZABC_00: <uh> yes , I can .
     e025_1_0008_ITL_00: okay . <B> wonderful . <Laugh> <B> <P> <Smack> <B> so , can I help you ? <B>
     e025_2_0009_XYZABC_00: -/all right I would like/- <uh> yes , madam . I would like to schedule a winter vacation <P> in the north of Italy .
     e025_1_0010_ITL_00: <hm> <B>
     e025_1_0011_ITL_00: yes . <B> would you like t= <*T>t
     e025_1_0012_ITL_00: yes . would you like to come here% in summer or during winter ?
     e025_2_0013_XYZABC_00: <uh> in winter please .
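A file like this one can be read with very little code, which is one practical benefit of a line-oriented format. The sketch below assumes the layout described in the conventions (one turn per line, written as `turn-ID: text`) and assumes that header comments sit on their own lines starting with `;`; the original line breaks are not preserved in this transcript, so that layout is an assumption. The name `parse_trl` is hypothetical.

```python
from typing import Dict, List, Tuple


def parse_trl(text: str) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Parse TRL-style text into (header fields, list of (turn ID, utterance))."""
    header: Dict[str, str] = {}
    turns: List[Tuple[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(";"):
            # Global comment line; keep only those shaped like "key: value".
            key, sep, value = line[1:].partition(":")
            if sep and key.strip():
                header[key.strip()] = value.strip()
            continue
        # Turn line: everything before the first colon is the turn ID.
        turn_id, sep, utterance = line.partition(":")
        if sep:
            turns.append((turn_id.strip(), utterance.strip()))
    return header, turns
```

Comment lines without a colon, such as the "Last changes made on" line, are skipped by this sketch rather than guessed at.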

  37. Data transcription process
     • first pass transcription (but not rough ..)
     • close check and correction by another transcriber
     • marker file and trl file cross-check
     • spell-checking
     • automatic convention check
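The slide does not say what the automatic convention check verifies. As an illustration only, the sketch below shows two checks that a closed, parseable convention makes possible: paired marks must balance, and angle-bracket tags must come from a known inventory. The tag set is a subset copied from the transcription symbols; the rule that any tag starting with `<#` counts as technical noise, and the names `KNOWN_TAGS` and `check_turn`, are assumptions.

```python
import re
from typing import List

# Subset of the tag inventory; real checks would cover the full convention.
KNOWN_TAGS = {
    "<B>", "<A>", "<P>", "<PP>", "<uh>", "<uhm>", "<hm>", "<hes>", "<%>",
    "<Smack>", "<Swallow>", "<Throat>", "<Cough>", "<Laugh>", "<Noise>",
    "<#>", "<*T>",
}

PAIRED_MARKS = (
    ("+/", "/+", "repetition/correction"),
    ("-/", "/-", "false start"),
)


def check_turn(text: str) -> List[str]:
    """Return a list of convention problems found in one turn (empty if none)."""
    problems = []
    for opener, closer, name in PAIRED_MARKS:
        # Every opening mark needs a closing mark somewhere in the turn.
        if text.count(opener) != text.count(closer):
            problems.append(f"unbalanced {name} marks")
    for tag in re.findall(r"<[^<>\s]*>", text):
        if tag not in KNOWN_TAGS and not tag.startswith("<#"):
            problems.append(f"unknown tag {tag}")
    return problems
```

Running such a check after the manual passes catches mechanical slips (a missing `/+`, a misspelled tag) that a human cross-check can overlook.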

  38. Audio Data; Transcription process (Transcription Conventions, Transcription Tool); TRL Files, MAR Files, Voc Lists (roadmap slide repeated, same transcript sample as slide 27)

  39. Following mass-data collection: Showcase 2a and 2b

  40. Medical scenarios development: Doctors; Analysis of medical databases; Definition of some scripts; Pre-tests; Scenarios; Data collection
