
German Research Center for Artificial Intelligence, DFKI GmbH Stuhlsatzenhausweg 3

Ninth Conference of the European Chapter of the Association for Computational Linguistics EACL'99 Bergen, June 10, 1999. Deep Processing of Shallow Structures The Robust Integration of Speech, Language and Translation Technology for Intelligent Interface Agents. Wolfgang Wahlster.


Presentation Transcript


  1. Ninth Conference of the European Chapter of the Association for Computational Linguistics, EACL'99, Bergen, June 10, 1999. Deep Processing of Shallow Structures: The Robust Integration of Speech, Language and Translation Technology for Intelligent Interface Agents. Wolfgang Wahlster, German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany. Phone: (+49 681) 302-5252/4162; fax: (+49 681) 302-5341; e-mail: wahlster@dfki.de; WWW: http://www.dfki.de/~wahlster

  2. Outline
     1. Speech-to-Speech Translation: Challenges for Language Technology
     2. A Multi-Blackboard Architecture for the Integration of Deep and Shallow Processing
     3. Integrating the Results of Multiple Deep and Shallow Parsers
     4. Packed Chart Structures for Partial Semantic Representations
     5. Robust Semantic Processing: Merging and Completing Discourse Representations
     6. Combining the Results of Deep and Shallow Translation Threads
     7. The Impact of Verbmobil on German Language Industry
     8. SmartKom: Integrating Verbmobil Technology Into an Intelligent Interface Agent
     9. Conclusion

  3. Challenges for Language Engineering (figure: four dimensions of increasing complexity; Verbmobil sits at the most complex level of each)
     Input Conditions: Close-Speaking Microphone/Headset, Push-to-talk; Telephone, Pause-based Segmentation; Open Microphone, GSM Quality
     Naturalness: Isolated Words; Read Continuous Speech; Spontaneous Speech
     Adaptability: Speaker Dependent; Speaker Independent; Speaker Adaptive
     Dialog Capabilities: Monolog Dictation; Information-seeking Dialog; Multiparty Negotiation

  4. Context-Sensitive Speech-to-Speech Translation
     Verbmobil Server
     Wann fährt der nächste Zug nach Hamburg ab? → When does the next train to Hamburg depart?
     Wo befindet sich das nächste Hotel? → Where is the nearest hotel?
     Final Verbmobil Demos:
     • World Expo-2000 (Hannover)
     • CeBIT-2000 (Hannover)
     • COLING-2000 (Saarbrücken)

  5. Dialog Translation 1
     Wenn ich den Zug um 14 Uhr bekomme, bin ich um 4 in Frankfurt. → If I get the train at 2 o'clock I am in Frankfurt at 4 o'clock.
     Am Flughafen könnten wir uns treffen. → We could meet at the airport.

  6. Dialog Translation 2
     Abends könnten wir Essen gehen. → We could go out for dinner in the evening.
     Wann denn am Abend? → What time in the evening?

  7. Dialog Translation 3
     Ich könnte für 8 Uhr einen Tisch reservieren. → I could reserve a table for 8 o'clock.

  8. Verbmobil II: Three Domains of Discourse
     Scenario 1, Appointment Scheduling: When? Focus on temporal expressions. Vocabulary Size: 2500/6000
     Scenario 2, Travel Planning & Hotel Reservation: When? Where? How? Focus on temporal and spatial expressions. Vocabulary Size: 7000/10000
     Scenario 3, PC-Maintenance Hotline: What? When? Where? How? Integration of special sublanguage lexica. Vocabulary Size: 15000/30000

  9. Verbmobil Partners (Phase 2): TU Braunschweig; DaimlerChrysler; Rheinische Friedrich-Wilhelms-Universität Bonn; Ludwig-Maximilians-Universität München; Universität Bielefeld; Universität des Saarlandes; Technische Universität München; Universität Hamburg; Friedrich-Alexander-Universität Erlangen-Nürnberg; Ruhr-Universität Bochum; Eberhard-Karls-Universität Tübingen; Universität Stuttgart; Universität Karlsruhe. © W. Wahlster, DFKI

  10. The Control Panel of Verbmobil

  11. The Control Panel of Verbmobil

  12. The Control Panel of Verbmobil

  13. The Control Panel of Verbmobil

  14. The Control Panel of Verbmobil

  15. The Control Panel of Verbmobil

  16. The Control Panel of Verbmobil

  17. The Control Panel of Verbmobil

  18. The Control Panel of Verbmobil

  19. The Control Panel of Verbmobil

  20. The Control Panel of Verbmobil

  21. From a Multi-Agent Architecture to a Multi-Blackboard Architecture
      Verbmobil I: Multi-Agent Architecture (modules M1 to M6 connected directly)
      • Each module must know which module produces what data
      • Direct communication between modules
      • Each module has only one instance
      • Heavy data traffic for moving copies around
      • Multiparty and telecooperation applications are impossible
      • Software: ICE and ICE Master
      • Basic Platform: PVM
      Verbmobil II: Multi-Blackboard Architecture (modules M1 to M6 connected through blackboards BB 1 to BB 3)
      • All modules can register for each blackboard dynamically
      • No direct communication between modules
      • Each module can have several instances
      • No copies of representation structures (word lattice, VIT chart)
      • Multiparty and telecooperation applications are possible
      • Software: PCA and Module Manager
      • Basic Platform: PVM
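
The contrast on this slide can be illustrated with a minimal Python sketch. It is not Verbmobil's PCA or Module Manager code: the `Blackboard` class, its `register` and `publish` methods, and the module instance names are hypothetical, chosen only to show dynamic registration, several instances per module, no direct module-to-module calls, and one shared (uncopied) word lattice.

```python
from typing import Any, Callable, List


class Blackboard:
    """Hypothetical shared data pool: modules subscribe instead of calling each other."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entries: List[Any] = []
        self._subscribers: List[Callable[[Any], None]] = []

    def register(self, callback: Callable[[Any], None]) -> None:
        # Any module instance may register at run time.
        self._subscribers.append(callback)

    def publish(self, item: Any) -> None:
        # The item is stored once; subscribers receive a reference, not a copy.
        self.entries.append(item)
        for callback in list(self._subscribers):
            callback(item)


word_lattice = Blackboard("word_lattice")
partial_vits = Blackboard("partial_vits")


def make_parser(instance_name: str) -> Callable[[Any], None]:
    """Build a toy parser instance that reads one blackboard and writes to another."""
    def on_lattice(lattice: Any) -> None:
        partial_vits.publish((instance_name, lattice))
    return on_lattice


# Several instances, including two of the same (assumed) module type.
for instance_name in ("chunk_parser", "statistical_parser",
                      "hpsg_parser_1", "hpsg_parser_2"):
    word_lattice.register(make_parser(instance_name))

lattice = ["wir", "treffen", "uns", "in", "Kaiserslautern"]
word_lattice.publish(lattice)
```

The recognizer that publishes the lattice never needs to know which parsers consume it, which is the property the Verbmobil II column emphasizes.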

  22. A Multi-Blackboard Architecture for the Combination of Results from Deep and Shallow Processing Modules
      Diagram labels: Command Recognizer; Channel/Speaker Adaptation; Audio Data; Spontaneous Speech Recognizer; Prosodic Analysis; Statistical Parser; Chunk Parser; Word Hypothesis Graph with Prosodic Labels; Dialog Act Recognition; HPSG Parser; Semantic Construction; Semantic Transfer; VITs, Underspecified Discourse Representations; Robust Dialog Semantics; Generation

  23. Integrating Shallow and Deep Analysis Components in a Multi-Blackboard Architecture
      Augmented Word Lattice → Statistical Parser, Chunk Parser, HPSG Parser (each producing partial VITs) → Chart with a combination of partial VITs → Robust Dialog Semantics: combination and knowledge-based reconstruction of complete VITs → Complete and Spanning VITs

  24. Extracting Statistical Properties from Large Corpora
      Corpora: Transcribed Speech Data; Segmented Speech with Prosodic Labels; Treebanks & Predicate-Argument Structures; Annotated Dialogs with Dialog Acts; Aligned Bilingual Corpora
      Machine Learning for the Integration of Statistical Properties into Symbolic Models for Speech Recognition, Parsing, Dialog Processing, Translation
      Models: Neural Nets, Multilayered Perceptrons; Hidden Markov Models; Probabilistic Automata; Probabilistic Grammars; Probabilistic Transfer Rules

  25. VHG: A Packed Chart Representation of Partial Semantic Representations
      • Incremental chart construction and anytime processing
      • Rule-based combination and transformation of partial UDRS coded as VITs
      • Selection of a spanning analysis using a bigram model for VITs (trained on a treebank of 24 k VITs)
      Input via Semantic Construction from:
      • Chart Parser using cascaded finite-state transducers (Abney, Hinrichs)
      • Statistical LR parser trained on treebank (Block, Ruland)
      • Very fast HPSG parser (see two papers at ACL99, Kiefer, Krieger et al.)

  26. Robust Dialog Semantics: Deep Processing of Shallow Structures
      Goals of robust semantic processing (Pinkal, Worm, Rupp):
      • Combination of unrelated analysis fragments
      • Completion of incomplete analysis results
      • Skipping of irrelevant fragments
      Method: Transformation rules on the VIT Hypothesis Graph: Conditions on VIT structures → Operations on VIT structures
      The rules are based on various knowledge sources:
      • lattice of semantic types
      • domain ontology
      • sortal restrictions
      • semantic constraints
      Results: 20% of the analyses are improved, 0.6% get worse

  27. Semantic Correction of Recognition Errors
      German: Wir treffen uns Kaiserslautern. (We are meeting Kaiserslautern.)
      English: We are meeting in Kaiserslautern.

  28. Robust Dialog Semantics: Combining and Completing Partial Representations
      Utterance: Let us meet (in) the late afternoon to catch the train to Frankfurt
      Analysis fragments: Let us; meet; the late afternoon; to catch; the train to Frankfurt
      The preposition 'in' is missing in all paths through the word hypothesis graph.
      A temporal NP is transformed into a temporal modifier using an underspecified temporal relation:
      [temporal_np(V1)] → [typeraise_to_mod(V1, V2)] & V2
      The modifier is applied to a proposition:
      [type(V1, prop), type(V2, mod)] → [apply(V2, V1, V3)] & V3

  29. The Understanding of Spontaneous Speech Repairs
      Original Utterance: I need a car next Tuesday oops Monday
      Reparandum: Tuesday; Hesitation (Editing Phase): oops; Reparans (Repair Phase): Monday
      Recognition of Substitutions → Transformation of the Word Hypothesis Graph → I need a car next Monday
      Verbmobil Technology: understands speech repairs and extracts the intended meaning.
      Dictation systems like ViaVoice, VoiceXpress, FreeSpeech and NaturallySpeaking cannot deal with spontaneous speech and transcribe the corrupted utterances.
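
The substitution repair on this slide can be illustrated with a toy Python sketch. It works on a flat token list with a tiny hand-written category lexicon, whereas Verbmobil transforms the word hypothesis graph; the names `EDITING_TERMS`, `CATEGORY` and `repair` are hypothetical, and the heuristic (replace the nearest earlier word of the same category as the first categorized word after the editing term) is an assumption made for illustration only.

```python
from typing import List

# Assumed editing terms (hesitations) and a toy category lexicon.
EDITING_TERMS = {"oops", "äh", "uh"}
CATEGORY = {
    "Monday": "weekday", "Tuesday": "weekday",
    "Mannheim": "city", "Saarbrücken": "city",
}


def repair(tokens: List[str]) -> List[str]:
    """Drop the editing term and the reparandum, keep the reparans."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        if tokens[i] in EDITING_TERMS:
            # Find the first categorized word of the reparans.
            j = i + 1
            while j < len(tokens) and tokens[j] not in CATEGORY:
                j += 1
            if j < len(tokens):
                offset = j - (i + 1)  # words repeated before it, e.g. "in"
                category = CATEGORY[tokens[j]]
                # Search backwards for the reparandum of the same category.
                for k in range(len(out) - 1, -1, -1):
                    if CATEGORY.get(out[k]) == category:
                        del out[max(0, k - offset):]
                        break
            i += 1  # skip the editing term itself
            continue
        out.append(tokens[i])
        i += 1
    return out


english = repair("I need a car next Tuesday oops Monday".split())
german = repair("Wir treffen uns in Mannheim äh in Saarbrücken".split())
```

A flat heuristic like this over-deletes when the replacement word appears far from the editing term, which is one reason to operate on the hypothesis graph with prosodic and syntactic evidence instead.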

  30. Automatic Understanding and Correction of Speech Repairs in Spontaneous Telephone Dialogs
      German: Wir treffen uns in Mannheim, äh, in Saarbrücken. (We are meeting in Mannheim, oops, in Saarbruecken.)
      English: We are meeting in Saarbruecken.

  31. Integrating a Deep HPSG-based Analysis with Probabilistic Dialog Act Recognition for Semantic Transfer HPSG Analysis Probabilistic Analysis of Dialog Acts (HMM) Robust Dialog Semantics Dialog Act Type VIT Dialog Act Type Recognition of Dialog Plans (Plan Operators) Semantic Transfer Dialog Phase

  32. The Dialog Act Hierarchy used for Planning, Prediction, Translation and Generation GREETING_BEGIN GREETING_END GREETING INTRODUCE POLITENESS_FORMULA THANK DELIBERATE BACKCHANNEL CONTROL_DIALOG INIT DEFER CLOSE MANAGE_TASK Dialog Act REQUEST_SUGGEST REQUEST_CLARIFY REQUEST_COMMENT REQUEST_COMMIT REQUEST SUGGEST INFORM FEEDBACK COMMIT DEVIATE_SCENARIO REFER_TO_SETTING DIGRESS EXCLUDE CLARIFY GIVE_REASON CLARIFY_ANSWER PROMOTE_TASK REJECT FEEDBACK_NEGATIVE EXPLAINED_REJECT ACCEPT CONFIRM FEEDBACK_POSITIVE

  33. Combining Statistical and Symbolic Processing for Dialog Processing Dialog-Act based Translation Dialog Module Context Evaluation Statistical Prediction Dialog Act Predictions Context Evaluation Main Propositional Content Focus Plan Recognition Dialog Phase Transfer by Rules Dialog Act Dialog-Act based Translation Dialog Memory Dialog Act Generation of Minutes

  34. Learning of Probabilistic Plan Operators from Annotated Corpora
      (OPERATOR-s-10523-6
        goal [IN-TURN confirm-s-10523 ?SLASH-3314 ?SLASH-3316]
        subgoals (sequence [IN-TURN confirm-s-10521 ?SLASH-3314 ?SLASH-3315]
                           [IN-TURN confirm-s-10522 ?SLASH-3315 ?SLASH-3316])
        PROB 0.72)
      (OPERATOR-s-10521-8
        goal [IN-TURN confirm-s-10521 ?SLASH-3321 ?SLASH-3322]
        subgoals (sequence [DOMAIN-DEPENDENT accept ?SLASH-3321 ?SLASH-3322])
        PROB 0.95)
      (OPERATOR-s10522-10
        goal [IN-TURN confirm-s-10522 ?SLASH-3325 ?SLASH-3326]
        subgoals (sequence [DOMAIN-DEPENDENT confirm ?SLASH-3325 ?SLASH-3326])
        PROB 0.83)

  35. Automatic Generation of Multilingual Protocols of Telephone Conversations Dialog Translation by Verbmobil Multilingual Generation of Protocols HTML-Document In English Transferred by Internet or Fax HTML-Document In English Transferred by Internet or Fax German Dialog Partner American Dialog Partner

  36. Automatic Generation of Minutes
      A and B greet each other.
      A: (INIT_DATE, SUGGEST_SUPPORT_DATE, REQUEST_COMMENT_DATE) I would like to make a date. How about the seventeenth? Is that ok with you?
      B: (REJECT_DATE, ACCEPT_DATE) The seventeenth does not suit me. I'm free for one hour at three o'clock.
      A: (SUGGEST_SUPPORT_DATE) How about the sixteenth in the afternoon?
      B: (CLARIFY_QUERY, ACCEPT_DATE, CONFIRM) The sixteenth at two o'clock? That suits me. Ok.
      A and B say goodbye.
      Minutes generated automatically on 23 May 1999 08:35:18 h

  37. The Control Panel of Verbmobil

  38. Integrating Deep and Shallow Processing: Combining Results from Concurrent Translation Threads
      Input: Segment 1: Wenn wir den Termin vorziehen, / Segment 2: das würde mir gut passen.
      Input: Segment 1: If you prefer another hotel, / Segment 2: please let me know.
      Translation threads: Statistical Translation; Case-Based Translation; Dialog-Act Based Translation; Semantic Transfer
      Alternative Translations with Confidence Values → Selection Module
      Output: Segment 1 Translated by Semantic Transfer; Segment 2 Translated by Case-Based Translation

  39. A Context-Free Approach to the Selection of the Best Translation Result
      Input:
      SEQ := set of all translation sequences for a turn
      Seq ∈ SEQ := sequence of translation segments s1, s2, ..., sn
      Each translation thread provides for every segment an online confidence value confidence(thread, segment)
      Task: Compute normalized confidence values for translated Seq
      \[ \mathrm{CONF}(\mathrm{Seq}) = \sum_{\mathrm{segment} \in \mathrm{Seq}} \mathrm{Length}(\mathrm{segment}) \cdot \bigl( \alpha(\mathrm{thread}) + \beta(\mathrm{thread}) \cdot \mathrm{confidence}(\mathrm{thread}, \mathrm{segment}) \bigr) \]
      Output:
      \[ \mathrm{Best}(\mathrm{SEQ}) = \{ \mathrm{Seq} \in \mathrm{SEQ} \mid \mathrm{Seq} \text{ is a maximal element of } \mathrm{SEQ} \text{ with respect to } \mathrm{CONF} \} \]
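
The selection rule on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the Verbmobil selection module: the thread abbreviations follow the later slide on linear inequations, while the `ALPHA` and `BETA` values, the segment lengths and the confidence values are invented for the example.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

THREADS = ("STAT", "CASE", "DIAL", "SEMT")

# Hypothetical normalizing factors; the slides learn these offline from a corpus.
ALPHA: Dict[str, float] = {"STAT": 0.0, "CASE": 0.1, "DIAL": 0.0, "SEMT": 0.2}
BETA: Dict[str, float] = {"STAT": 1.0, "CASE": 0.8, "DIAL": 0.5, "SEMT": 1.0}


def conf(seq: Sequence[str], lengths: List[int],
         confidence: List[Dict[str, float]]) -> float:
    """CONF(Seq): length-weighted, normalized confidence summed over segments."""
    return sum(
        lengths[i] * (ALPHA[thread] + BETA[thread] * confidence[i][thread])
        for i, thread in enumerate(seq)
    )


def best(lengths: List[int],
         confidence: List[Dict[str, float]]) -> Tuple[str, ...]:
    """Enumerate every thread assignment (4**n sequences) and keep the maximum."""
    candidates = product(THREADS, repeat=len(lengths))
    return max(candidates, key=lambda seq: conf(seq, lengths, confidence))


# Invented example turn with two segments.
lengths = [5, 4]
confidence = [
    {"STAT": 0.6, "CASE": 0.5, "DIAL": 0.9, "SEMT": 0.7},
    {"STAT": 0.4, "CASE": 0.9, "DIAL": 0.6, "SEMT": 0.3},
]
winner = best(lengths, confidence)
```

Because the score is a plain sum over segments, the maximum could also be found segment by segment; the exhaustive enumeration is kept only to mirror the definition of Best(SEQ).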

  40. Learning the Normalizing Factors Alpha and Beta from an Annotated Corpus
      Turn := segment_1, segment_2, ..., segment_n
      For each turn in a training corpus, all segments translated by one of the four translation threads are manually annotated with a score for translation quality.
      For the sequence of n segments resulting in the best overall translation score, at most 4^n linear inequations are generated, so that the selected sequence is better than all alternative translation sequences.
      From the set of inequations for spanning analyses (≤ 4^n), the values of alpha and beta can be determined offline by solving the constraint system.

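  41. Example of a Linear Inequation Used for Offline Learning
      Turn := Segment_1 Segment_2 Segment_3
      Statistical Translation = STAT; Case-based Translation = CASE; Dialog-Act Based Translation = DIAL; Semantic Transfer = SEMT
      quality(CASE, Segment_1), quality(SEMT, Segment_2), quality(STAT, Segment_3) is optimal, so for the alternative sequence translated entirely by DIAL:
      \[
      \begin{aligned}
      &\mathrm{Length}(\mathrm{Segment}_1) \cdot \bigl(\alpha(\mathrm{CASE}) + \beta(\mathrm{CASE}) \cdot \mathrm{confidence}(\mathrm{CASE}, \mathrm{Segment}_1)\bigr) \\
      {}+{} &\mathrm{Length}(\mathrm{Segment}_2) \cdot \bigl(\alpha(\mathrm{SEMT}) + \beta(\mathrm{SEMT}) \cdot \mathrm{confidence}(\mathrm{SEMT}, \mathrm{Segment}_2)\bigr) \\
      {}+{} &\mathrm{Length}(\mathrm{Segment}_3) \cdot \bigl(\alpha(\mathrm{STAT}) + \beta(\mathrm{STAT}) \cdot \mathrm{confidence}(\mathrm{STAT}, \mathrm{Segment}_3)\bigr) \\
      {}>{} &\mathrm{Length}(\mathrm{Segment}_1) \cdot \bigl(\alpha(\mathrm{DIAL}) + \beta(\mathrm{DIAL}) \cdot \mathrm{confidence}(\mathrm{DIAL}, \mathrm{Segment}_1)\bigr) \\
      {}+{} &\mathrm{Length}(\mathrm{Segment}_2) \cdot \bigl(\alpha(\mathrm{DIAL}) + \beta(\mathrm{DIAL}) \cdot \mathrm{confidence}(\mathrm{DIAL}, \mathrm{Segment}_2)\bigr) \\
      {}+{} &\mathrm{Length}(\mathrm{Segment}_3) \cdot \bigl(\alpha(\mathrm{DIAL}) + \beta(\mathrm{DIAL}) \cdot \mathrm{confidence}(\mathrm{DIAL}, \mathrm{Segment}_3)\bigr)
      \end{aligned}
      \]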

  42. The Context-Sensitive Selection of the Best Translation
      Using probabilities of dialog acts in the normalization process:
      \[ \mathrm{CONF}(\mathrm{Seq}) = \sum_{\mathrm{segment} \in \mathrm{Seq}} \mathrm{Length}(\mathrm{segment}) \cdot \bigl( \alpha(\mathrm{thread}) + \mathrm{dialog\text{-}act}(\mathrm{thread}, \mathrm{segment}) + \beta(\mathrm{thread}) \cdot \mathrm{confidence}(\mathrm{thread}, \mathrm{segment}) \bigr) \]
      e.g. Greet(Statistical_Translation, Segment) > Greet(Semantic_Transfer, Segment)
      Suggest(Semantic_Transfer, Segment) > Suggest(Case_based_Translation, Segment)
      Exploiting meta-knowledge: If the semantic transfer generates ≥ x disambiguation tasks, then increase the alpha and beta values for semantic transfer.
      e.g. einen Termin vorziehen → prefer/give priority to/bring forward <a date>
      Observation: Even on the meta-control level (selection module) a hybrid approach is advantageous.
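
A small Python sketch of the dialog-act term on this slide, under loudly stated assumptions: the bonus table `DIALOG_ACT_BONUS` and all numeric values are invented, alpha is set to 0 and beta to 1 for every thread, and all threads report the same confidence so that only the dialog-act term decides. It is meant to show the shape of the computation, not the values or code used in Verbmobil.

```python
from typing import Dict, List, Tuple

# Hypothetical neutral normalizing factors.
ALPHA: Dict[str, float] = {"STAT": 0.0, "CASE": 0.0, "DIAL": 0.0, "SEMT": 0.0}
BETA: Dict[str, float] = {"STAT": 1.0, "CASE": 1.0, "DIAL": 1.0, "SEMT": 1.0}

# Assumed dialog-act terms that respect the orderings given on the slide:
# Greet(STAT) > Greet(SEMT) and Suggest(SEMT) > Suggest(CASE).
DIALOG_ACT_BONUS: Dict[Tuple[str, str], float] = {
    ("GREET", "STAT"): 0.3, ("GREET", "SEMT"): 0.1,
    ("SUGGEST", "SEMT"): 0.3, ("SUGGEST", "CASE"): 0.1,
}

# One segment: (length, dialog act, confidence per thread).
Segment = Tuple[int, str, Dict[str, float]]


def segment_score(thread: str, length: int, confidence: float,
                  dialog_act: str) -> float:
    """One summand of the context-sensitive CONF(Seq)."""
    bonus = DIALOG_ACT_BONUS.get((dialog_act, thread), 0.0)
    return length * (ALPHA[thread] + bonus + BETA[thread] * confidence)


def select(segments: List[Segment]) -> List[str]:
    """Pick the best thread per segment; valid because CONF is a sum over segments."""
    chosen = []
    for length, dialog_act, confidences in segments:
        chosen.append(max(
            confidences,
            key=lambda t: segment_score(t, length, confidences[t], dialog_act),
        ))
    return chosen


equal = {"STAT": 0.5, "CASE": 0.5, "DIAL": 0.5, "SEMT": 0.5}
selection = select([(2, "GREET", equal), (6, "SUGGEST", equal)])
```

With equal raw confidences, the greeting goes to statistical translation and the suggestion to semantic transfer, matching the two orderings quoted on the slide.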

  43. Verbmobil: Long-Term, Large-Scale Funding and Its Impact
      • Funding by the German Ministry for Education and Research (BMBF): Phase I (1993-1996) $33 M; Phase II (1997-2000) $28 M
      • 60% industrial funding according to shared cost model: $17 M
      • Additional R&D investments of industrial partners: $11 M
      Total: $89 M
      • > 400 publications (> 250 refereed)
      • Many patents
      • > 10 commercial spin-off products
      • Many new spin-off companies
      • > 100 new jobs in German language industry
      • > 50 academics transferred to industry
      Philips, DaimlerChrysler and Siemens are leaders in Spoken Dialog Applications

  44. Spoken Dialogs about Schedules
      Fielded applications:
      • Train schedules (German Railway System, DB)
      • TABA (Philips) +49 241 60 40 20
      • OSCAR (DaimlerChrysler) +49 1805 99 66 22
      • Flight Schedules (Lufthansa)
      • ALF (Philips) +49 1803 00 00 74
      Technical Challenges: phone-based dialogs, many proper names, clarification subdialogs

  45. Linguatronic: Spoken Dialogs with Mercedes-Benz
      Example utterances: Please call Doris Wahlster. Open the left window in the back. I want to hear the weather channel. When will I reach the next gas station? Where is the next parking lot?
      Hardware: Microphone; Push-to-talk Switch
      • Speech control of: cellular phone, radio, windows / AC, route guidance system
      • Option for S-, C-, and E-Class of Mercedes and BMW
      • Speaker-independent, garbage models for non-speech (blinker, AC, wheels)

  46. The Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998) Input Processing Media Interaction Management Media Analysis Analysis Language Media Fusion Graphics Discourse Modeling Gesture Biometrics Information Applications People Intention Recognition Application Interface Media Design Design Language User(s) User Modeling Graphics Gesture Animated Presentation Agent Presentation Design Output Rendering User Model Task Model Domain Model Media Models Discourse Model Representation and Inference © W. Wahlster, DFKI

  47. SmartKom: A Transportable and Transmutable Interface Agent
      Kernel of SmartKom Interface Agent: Media Analysis; Interaction Management; Application Management; Media Design
      SmartKom-Mobile: A Handheld Communication Assistant
      SmartKom-Public: A Multimodal Communication Booth
      SmartKom-Home/Office: A Versatile Agent-based Interface
      © W. Wahlster, DFKI

  48. SmartKom-Public: A Multimodal Communication Booth
      Components: Loudspeaker; Smartcard/Credit Card for authentication and billing; Room microphone; Face-tracking camera; Docking station for PDA/Notebook/Camcorder; high speed and broad bandwidth Internet connectivity; Virtual touchscreen protected against vandalism; High-resolution scanner; Multipoint video conferencing
      © W. Wahlster, DFKI

  49. SmartKom-Mobile: A Handheld Communication Assistant
      Components: GSM for Telephone, Fax, Internet Connectivity; GPS; Camera; Wearable Compute Server; Stylus-Activated Sketch Pad; Microphone; Biosensor for Authentication & Emotional Feedback; Loudspeaker; Docking Station for Car PC
      © W. Wahlster, DFKI

  50. SmartKom-Home/Office: A Versatile Agent-based Interface
      Components: SpeechMike; Natural Gesture Recognition; Virtual Touchscreen
      © W. Wahlster, DFKI
