Markku Turunen Tampere Unit for Human-Computer Interaction University of Tampere

Speech Application Architectures. Markku Turunen, Tampere Unit for Human-Computer Interaction, University of Tampere. MUMIN PhD course, Tampere, 18.-22.11.2002. Outline: background, architecture types, example architectures, topics for research, Jaspis architecture.

Presentation Transcript


  1. Speech Application Architectures
     Markku Turunen, Tampere Unit for Human-Computer Interaction, University of Tampere
     MUMIN PhD course, Tampere, 18.-22.11.2002

  2. Outline: Topics
     • Background
     • Architecture types
     • Example architectures
     • Topics for research
     • Jaspis architecture

  3. Software architectures 1: Definitions
     • “software architecture defines the system in terms of components and interactions between them. Connectors are used to mediate interaction between the components” [Garlan & Shaw, 1994]
     • several views can be used to describe different aspects of software architectures: design view, run-time view, module view, logical view, control view, class view, …
     • human-computer interaction viewpoint: support for interaction methods and techniques

  4. Software architectures 2: Software development tools
     • support and tools for the construction of practical applications
     • core architecture: basic infrastructure (hub/facilitator, communication libraries, blackboard)
     • complete architecture: technology components (ASR, TTS), dialogue manager, database, …
     • toolkit: dialogue editor, ASR grammar builder, corpus collection tool, annotation editor, …

  5. Speech system components
     [Figure labels: user, telephone interface, speech recognition, natural language processing, dialogue management, database, natural language generation, speech synthesis]

  6. Architecture types 1: Pipelines and dialogue management architectures
     • pipeline (batch-sequence) architectures
        • data flow
        • one-way interfaces
        • fixed processing order
        [Figure labels: ASR, NLU, DM, NLG, TTS]
     • dialogue manager architectures
        • function calls
        • dialogue manager as controller
        • relaxed processing order
        [Figure labels: TTS, ASR, NLU, DM, DB, NLG, UM]
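
The contrast between the two styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the component functions are hypothetical stand-ins that tag their input, and the confidence threshold of 0.5 is invented for the example. The pipeline always runs the same stages in the same order, while the dialogue manager calls components as functions and can shorten the sequence.

```python
from typing import Callable, List


# Hypothetical stand-ins for real components; each one only tags its input.
def asr(audio: str) -> str:
    return f"words({audio})"


def nlu(words: str) -> str:
    return f"meaning({words})"


def decide(meaning: str) -> str:
    return f"act({meaning})"


def nlg(act: str) -> str:
    return f"text({act})"


def tts(text: str) -> str:
    return f"audio({text})"


def run_pipeline(audio: str) -> str:
    """Pipeline: data flows one way through a fixed sequence of stages."""
    data = audio
    for stage in (asr, nlu, decide, nlg, tts):
        data = stage(data)
    return data


class DialogueManager:
    """Controller: calls the other components and may change their order."""

    def __init__(self, confidence: Callable[[str], float]) -> None:
        self.confidence = confidence  # hypothetical recognition-confidence estimate
        self.trace: List[str] = []    # which components were called, in order

    def handle(self, audio: str) -> str:
        words = asr(audio)
        self.trace.append("asr")
        if self.confidence(words) < 0.5:
            # Relaxed order: skip understanding and ask the user to repeat.
            self.trace += ["nlg", "tts"]
            return tts(nlg("act(ask_repeat)"))
        self.trace += ["nlu", "decide", "nlg", "tts"]
        return tts(nlg(decide(nlu(words))))
```

With a high confidence estimate the controller produces the same result as the pipeline; with a low one it skips straight to a re-prompt, which a fixed pipeline cannot do.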

  7. Architecture types 2: Client-server and blackboard architectures
     • client-server architectures
        • two-way messages
        • hub as coordinator (star topology)
        • free processing order
        [Figure labels: TTS, DM, NLU, HUB, ASR, NLG, DB, UM]
     • blackboard (DB) architectures
        • data events / db operations
        • shared information
        • free processing order
        [Figure labels: TTS, DM, NLU, IS, ASR, NLG, UM]
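
A blackboard with data events can be sketched as a shared store that notifies subscribers when a key is written. The class and key names below are assumptions made for illustration, not taken from any of the systems discussed here. Components never call each other directly; the processing order emerges from who reacts to which data.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List


class Blackboard:
    """Shared information store; every write raises a data event."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, key: str, callback: Callable[[Any], None]) -> None:
        self.subscribers[key].append(callback)

    def write(self, key: str, value: Any) -> None:
        self.data[key] = value
        # Notify every component that registered interest in this key.
        for callback in list(self.subscribers[key]):
            callback(value)


def demo() -> Dict[str, Any]:
    board = Blackboard()
    # Hypothetical "NLU" component: reacts to recognized words.
    board.subscribe("words", lambda w: board.write("meaning", w.upper()))
    # Hypothetical "NLG" component: reacts to a new meaning.
    board.subscribe("meaning", lambda m: board.write("reply", f"You said {m}"))
    # The "ASR" component writes its result; the rest follows from the events.
    board.write("words", "hello")
    return board.data
```

In a hub (client-server) variant the same components would instead exchange two-way messages through a central coordinator, and the routing decisions would live in the hub rather than in the subscriptions.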

  8. Architecture types 3: Agent architectures
     • independent agents
        • independent agents
        • facilitator
        • collaborative processing
        [Figure labels: TTS, DM, NLU, Facilitator, ASR, NLG, DB, UM]
     • compact agents
        • compact agents
        • shared knowledge
        • distributed processing
        [Figure labels: Facilitator, IS, ASR, NLU, NLG, TTS, and several units labelled IA, UA, DA, DE, PA, PE]

  9. Example architectures 1: GALAXY-II
     • MIT / MITRE
     • DARPA Communicator reference architecture
     • freely available
     • HUB and servers
     • frames (messages)
     • hub scripts route messages
     [Seneff et al., 1998]
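
The idea of a hub that routes frames according to a script can be illustrated with a small rule interpreter. This is a sketch only: the `Rule` format, the server names and the frame keys are hypothetical and do not reproduce the actual GALAXY-II hub script language. A frame is modelled as a dictionary, and the hub repeatedly fires the first rule whose input key is present and whose output key is still missing.

```python
from typing import Callable, Dict, List, NamedTuple

Frame = Dict[str, str]


class Rule(NamedTuple):
    needs: str     # key that must already be in the frame
    produces: str  # key that the called server adds to the frame
    server: str    # name of the server to call


def run_hub(frame: Frame, rules: List[Rule],
            servers: Dict[str, Callable[[Frame], str]]) -> Frame:
    """Fire rules until none applies; all routing logic lives in the rule list."""
    frame = dict(frame)  # work on a copy of the incoming frame
    while True:
        for rule in rules:
            if rule.needs in frame and rule.produces not in frame:
                frame[rule.produces] = servers[rule.server](frame)
                break  # restart from the first rule
        else:
            return frame  # no rule fired: processing is finished


# Hypothetical servers; each reads the frame and returns one new value.
servers = {
    "recognizer": lambda f: f["audio"].lower(),
    "parser": lambda f: "intent:" + f["words"],
}
rules = [
    Rule(needs="audio", produces="words", server="recognizer"),
    Rule(needs="words", produces="parse", server="parser"),
]
result = run_hub({"audio": "HELLO"}, rules, servers)
```

Because the servers never address each other, changing the processing order means editing the rule list only.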

  10. Example architectures 2: Open Agent Architecture
     • general agent architecture
     • Facilitator as coordinator
     • requesters (tasks)
     • services (solutions)
     • Interagent Communication Language (ICL)
     • freely available
     • used in speech applications
     [Martin et al., 1999]
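
The requester/service split can be sketched as a facilitator that keeps a registry of capabilities and delegates each task to whoever declared that capability. The class below is a hypothetical illustration in Python; it does not use ICL or the real OAA libraries, and the capability names are invented.

```python
from typing import Any, Callable, Dict, List


class Facilitator:
    """Matches tasks from requesters to the services that can solve them."""

    def __init__(self) -> None:
        self.providers: Dict[str, List[Callable[..., Any]]] = {}

    def register(self, capability: str, handler: Callable[..., Any]) -> None:
        # A service agent declares what it is able to do.
        self.providers.setdefault(capability, []).append(handler)

    def solve(self, capability: str, *args: Any) -> List[Any]:
        # The requester names only the task; the facilitator finds the services
        # and returns every solution it collected.
        return [handler(*args) for handler in self.providers.get(capability, [])]


facilitator = Facilitator()
# Two hypothetical services offering the same capability.
facilitator.register("recognize", lambda audio: f"engine-a:{audio}")
facilitator.register("recognize", lambda audio: f"engine-b:{audio}")
facilitator.register("speak", lambda text: f"spoken:{text}")

solutions = facilitator.solve("recognize", "utt1")
```

The requester does not know how many services exist or where they run, which is what makes it possible to add or replace agents without touching the others.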

  11. Example architectures 3: WITAS
     • dialogue manager agent reacts to events sent by other agents
     • dialogue manager acts as blackboard
     • multimodal inputs are coordinated by DM
     • based on OAA
     [Lemon et al., 2001]

  12. Example architectures 4: MITRE architecture
     • dialogue manager as controller
     • default processing order
     • dialogue manager monitors other components
     • dialogue manager is a kind of blackboard
     • based on OAA
     [Luperfoy et al., 1998]

  13. Example architectures 5: TRIPS
     • agents, managers and shared databases
     • loosely coupled components
     • no dialogue manager
     • KQML messages
     • facilitator does not contain control logic
     [Allen et al., 2001]

  14. Current vs. new application areas

  15. Current vs. new application areas

  16. Adaptive systems: Need for adaptive applications
     • different users: speech-based communication can differ greatly between individual users and situations
     • speech is language and culture dependent
     • differences in preferences and needs between user groups can be large
     • different approaches: people from different backgrounds have different solutions for the same problems
     • we need interaction methods and architectures that adapt to different users and situations and support multiple approaches

  17. Future example

  18. Topics for research: Topics for speech systems
     • adaptivity: how to support adaptive methods? how to make systems adaptive?
     • reusability: components, interaction methods, …
     • distributed systems: communication protocols, resource sharing, ubiquitous applications
     • distributed interaction management: a centralized dialogue manager is not suitable for many tasks
     • shared knowledge: dialogue, user etc.
     • development and evaluation tools: WOZ, corpora, …

  19. Jaspis architecture: Speech application development framework
     • implementation of core architecture with extensions
     • designed especially for multilingual and distributed applications
     • overall focus on system level adaptivity
     • current focus on ubiquitous and multimodal applications
     • Java and XML, freely available
     • used in several projects and applications

  20. Jaspis architecture overview
     [Figure labels: NLG, NLU, DB, UM]

  21. Jaspis components: Agents, evaluators and managers
     • agents handle various interaction situations, such as speech input interpretations, dialogue decisions and speech output presentations
     • evaluators measure how well agents can handle the current interaction situation
     • managers coordinate agents and evaluators, in particular by trying to choose the best possible agent to handle each interaction situation
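
The division of labour between agents, evaluators and managers can be sketched as follows. This is not the Jaspis API (Jaspis itself is written in Java); all names are hypothetical, and combining evaluator scores by multiplication is an assumption made for this example. Each evaluator returns a score between 0.0 and 1.0 for an agent in a given situation, and the manager picks the agent with the highest combined score.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Situation = Dict[str, Any]


@dataclass
class Agent:
    name: str
    languages: Tuple[str, ...]
    act: Callable[[Situation], str]


# An evaluator scores how well an agent suits a situation (0.0 to 1.0).
Evaluator = Callable[[Agent, Situation], float]


def language_evaluator(agent: Agent, situation: Situation) -> float:
    return 1.0 if situation.get("lang") in agent.languages else 0.0


def error_evaluator(agent: Agent, situation: Situation) -> float:
    many_errors = situation.get("errors", 0) >= 2
    if agent.name.startswith("help"):
        return 1.0 if many_errors else 0.1
    return 0.2 if many_errors else 1.0


class Manager:
    """Chooses the best-scoring agent; scores are multiplied (an assumption)."""

    def __init__(self, agents: List[Agent], evaluators: List[Evaluator]) -> None:
        self.agents = agents
        self.evaluators = evaluators

    def select(self, situation: Situation) -> Agent:
        def score(agent: Agent) -> float:
            total = 1.0
            for evaluate in self.evaluators:
                total *= evaluate(agent, situation)
            return total
        return max(self.agents, key=score)

    def handle(self, situation: Situation) -> str:
        return self.select(situation).act(situation)


manager = Manager(
    agents=[
        Agent("prompt_en", ("en",), lambda s: "Where to?"),
        Agent("prompt_fi", ("fi",), lambda s: "Minne?"),
        Agent("help_en", ("en",), lambda s: "You can say a city name."),
    ],
    evaluators=[language_evaluator, error_evaluator],
)
```

With multiplication, a single evaluator returning 0.0 vetoes an agent, so the Finnish prompt is never chosen for an English-speaking user regardless of the other scores.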

  22. Jaspis interaction management

  23. Information management in Jaspis
     • the information storage method is not fixed (XML, DB)
     • the information access protocol is defined (DTD)
     • Information Managers are used to access the Information Storage; they can be implemented in any language and can use TCP/IP, XML-RPC or method calls
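
The point that the storage method is open while the access interface is fixed can be sketched with one abstract interface and two interchangeable backends. Everything here is hypothetical: the class names, the `user.language` key and the XML layout are assumptions for illustration and are not the Jaspis DTD or API.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from typing import Dict, Optional


class InformationStorage(ABC):
    """Fixed access interface; the storage method behind it is open."""

    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...

    @abstractmethod
    def put(self, key: str, value: str) -> None: ...


class DictStorage(InformationStorage):
    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._items.get(key)

    def put(self, key: str, value: str) -> None:
        self._items[key] = value


class XmlStorage(InformationStorage):
    """Keeps entries as <item key="...">value</item> under one root element."""

    def __init__(self) -> None:
        self._root = ET.Element("information")

    def _find(self, key: str) -> Optional[ET.Element]:
        for item in self._root.findall("item"):
            if item.get("key") == key:
                return item
        return None

    def get(self, key: str) -> Optional[str]:
        item = self._find(key)
        return None if item is None else item.text

    def put(self, key: str, value: str) -> None:
        item = self._find(key)
        if item is None:
            item = ET.SubElement(self._root, "item", {"key": key})
        item.text = value

    def to_xml(self) -> str:
        return ET.tostring(self._root, encoding="unicode")


class InformationManager:
    """Gives other components access to shared information."""

    def __init__(self, storage: InformationStorage) -> None:
        self.storage = storage

    def set_user_language(self, lang: str) -> None:
        self.storage.put("user.language", lang)

    def user_language(self) -> Optional[str]:
        return self.storage.get("user.language")
```

The manager code is identical for both backends; a remote variant (for example over XML-RPC) would implement the same two methods.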

  24. Presentation management in Jaspis
     • presentation agents convert conceptual messages to speech outputs
     • for every output the most suitable agent is selected by presentation evaluators
     • multiple presentation management modules for different phases

  25. Dialogue management in Jaspis
     • different dialogue agents for different dialogue tasks
     • alternative dialogue agents for the same dialogue tasks
     • dialogue evaluators select dialogue agents
     • no single controller (the dialogue manager)
     • multiple dialogue management modules

  26. Communication (I/O) management in Jaspis
     • I/O agents and evaluators handle, combine and coordinate different input streams
     • devices – clients – servers – engines
     • run-time interpretation and multimodal fusion
     • separate module for selection of input modalities

  27. Jaspis extensions: Beyond core infrastructure
     • XML-based linguistic information (Annotation Graphs) and log formats (corpus collection, usability tests)
     • visualization components (blackboard, interaction)
     • speech technology interfaces for common telephony cards, synthesizers and recognizers
     • reusable components: error handling, general tasks
     • SMS interface, graphical components
     • Wizard of Oz tools

  28. Jaspis: Future improvements
     • concurrent dialogues and multiple users
     • event-based interaction management

  29. Tampere Unit for Computer Human Interaction
      Department of Computer and Information Sciences
      http://www.cs.uta.fi/hci/spi/
      spi@cs.uta.fi
      mturunen@cs.uta.fi
