Multimodal applications for mobile devices in Java

Michael Pucher (FTW Vienna), Georg Niklfeld (FTW Vienna), Robert Finan (Mobilkom Austria AG), Wolfgang Eckhart (Sonorys Vienna AG).


Presentation Transcript


  1. Multimodal applications for mobile devices in Java. Michael Pucher (FTW Vienna), Georg Niklfeld (FTW Vienna), Robert Finan (Mobilkom Austria AG), Wolfgang Eckhart (Sonorys Vienna AG)

  2. Contents • Multimodality • History and types of multimodality • The importance of multimodality for mobile devices • Applications • Architectures and algorithms • Logical design of multimodal applications • Server and client side speech processing • Java class architecture • Multimodal integration algorithms in Java • Parsing and integration • Servlet/Midlet architecture • VoiceXML

  3. History and types of multimodality • Multimodality research since the 1980s • Early versus late fusion • Types of multimodality • First order multimodality allows sequential multimodal input • Second order multimodality allows uncoordinated, simultaneous multimodal input • Third order multimodality allows coordinated, simultaneous multimodal input

  4. The importance of multimodality for mobile devices • Multimodal communication is perceived as natural • Disadvantages of unimodal interfaces for mobile devices • Small displays • No comfortable alphanumeric keyboards • Visual access to the display is not always possible • Disadvantages cannot be overcome by increasing processor and memory capabilities

  5. Applications • List selection (e.g. addresses) • Map navigation (Location Based Services, GPS) • Voice mail • Car environments • Advanced call management • Specialized applications for mobile working environments

  6. Logical design of multimodal applications

  7. Visual Browser

  8. Voice Browser

  9. Final architecture

  10. Server and client side speech processing • Server based ASR and TTS • Embedded ASR and TTS • Distributed Speech Recognition • ETSI standard • Feature extraction • Compression and error detection (4800 bit/s)
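To put the 4800 bit/s figure in perspective, the sketch below compares it with sending uncompressed audio. The raw format (8 kHz, 16-bit, mono PCM) is an assumption chosen for illustration only; the slide does not say which audio format the server based variant would transmit.

```java
// Back-of-the-envelope comparison of DSR feature streaming with raw audio.
final class DsrBandwidth {
    static final int DSR_BITS_PER_SECOND = 4800;   // figure given on the slide

    // Assumed raw format for comparison (not from the slides): 8 kHz, 16-bit, mono PCM.
    static final int PCM_SAMPLE_RATE_HZ = 8000;
    static final int PCM_BITS_PER_SAMPLE = 16;

    /** Bit rate of the assumed uncompressed PCM stream. */
    static int pcmBitsPerSecond() {
        return PCM_SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE;
    }

    /** Payload size in bytes for an utterance of the given length at the DSR rate. */
    static int dsrBytesForSeconds(int seconds) {
        return DSR_BITS_PER_SECOND * seconds / 8;
    }

    public static void main(String[] args) {
        System.out.println("DSR bytes for a 3 s utterance: " + dsrBytesForSeconds(3));
        System.out.println("Raw PCM to DSR ratio: "
                + (double) pcmBitsPerSecond() / DSR_BITS_PER_SECOND);
    }
}
```

Under these assumptions the feature stream is more than an order of magnitude smaller than the raw audio, which is the motivation for extracting features on the handset.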

  11. Java class architecture

  12. MMAction

  13. MMReaction

  14. MMRule

  15. Multimodal integration algorithms in Java. Handling parsing and integration in MMIntegrator:

          public MMReaction[] getReactions(String id) {
              Transaction trans = this.getTransaction(id);
              trans.removeOldObjects();

              // Parsing: offer every pending action to every rule.
              ListIterator actions = trans.getAllObjects();
              while (actions.hasNext()) {
                  MMAction mma = (MMAction) actions.next();
                  ListIterator rules = ruleList.listIterator();
                  while (rules.hasNext()) {
                      ((MMRule) rules.next()).addMMAction(mma);
                  }
              }
              ...

              // Integration: let each rule combine the actions it has collected.
              ListIterator rulesI = ruleList.listIterator();
              while (rulesI.hasNext()) {
                  ((MMRule) rulesI.next()).integrateActions();
              }

              // Return the reactions of the first rule that produced any.
              ListIterator rulesR = ruleList.listIterator();
              while (rulesR.hasNext()) {
                  MMReaction[] mmreac = ((MMRule) rulesR.next()).getMMReaction();
                  if (mmreac != null) return mmreac;
              }
              return null;
          }
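The three passes in the slide's getReactions (parse, integrate, pick the first result) can be sketched in a self-contained form as follows. This is a minimal illustration, not the authors' implementation: Action, Reaction, Rule, CountRule and Integrator are hypothetical, simplified stand-ins for MMAction, MMReaction, MMRule and MMIntegrator, and the transaction handling of the original is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-ins for the slide's MMAction and MMReaction.
final class Action {
    final String kind;
    Action(String kind) { this.kind = kind; }
}

final class Reaction {
    final String name;
    Reaction(String name) { this.name = name; }
}

// Hypothetical stand-in for MMRule.
interface Rule {
    void addAction(Action a);        // parsing: offer one input to the rule
    void integrateActions();         // integration: decide whether the rule fires
    List<Reaction> getReactions();   // empty when the rule did not fire
}

/** Toy rule for the demo: fires after seeing `needed` actions of one kind. */
final class CountRule implements Rule {
    private final String kind;
    private final int needed;
    private int seen;
    private boolean fired;

    CountRule(String kind, int needed) { this.kind = kind; this.needed = needed; }

    public void addAction(Action a) { if (a.kind.equals(kind)) seen++; }
    public void integrateActions() { fired = seen >= needed; }
    public List<Reaction> getReactions() {
        return fired
                ? Collections.singletonList(new Reaction(kind + " x" + needed))
                : Collections.<Reaction>emptyList();
    }
}

final class Integrator {
    private final List<Rule> rules = new ArrayList<>();

    void addRule(Rule rule) { rules.add(rule); }

    /** Parse, integrate, then return the reactions of the first rule that fired. */
    List<Reaction> getReactions(List<Action> pending) {
        for (Action a : pending) {
            for (Rule r : rules) r.addAction(a);
        }
        for (Rule r : rules) r.integrateActions();
        for (Rule r : rules) {
            List<Reaction> result = r.getReactions();
            if (!result.isEmpty()) return result;
        }
        return Collections.emptyList();
    }
}

final class IntegratorDemo {
    public static void main(String[] args) {
        Integrator integrator = new Integrator();
        integrator.addRule(new CountRule("click", 2));
        List<Action> inputs = Arrays.asList(
                new Action("click"), new Action("speech"), new Action("click"));
        for (Reaction r : integrator.getReactions(inputs)) {
            System.out.println(r.name);
        }
    }
}
```

The sketch returns an empty list where the slide returns null, which spares callers a null check; that is a design choice of the sketch only.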

  16. Handling parsing and integration in Route (MMRule):

          public void addMMAction(MMAction mmo) {
              // Fill the first free slot that matches the type of the action.
              if (mmo instanceof PointClick && actArray[0] == null) {
                  this.intActionSize++;
                  actArray[0] = mmo;
              } else if (mmo instanceof PointClick && actArray[1] == null) {
                  this.intActionSize++;
                  actArray[1] = mmo;
              } else if (mmo instanceof RouteShow && actArray[2] == null) {
                  this.intActionSize++;
                  actArray[2] = mmo;
              }
          }

          public void integrateActions() {
              // All three slots are filled: pass both clicks to the two reactions.
              if (this.intActionSize == 3) {
                  ShowRoute show = (ShowRoute) this.reacArray[0];
                  show.pc0 = (PointClick) this.actArray[0];
                  show.pc1 = (PointClick) this.actArray[1];
                  SayRoute say = (SayRoute) this.reacArray[1];
                  say.pc0 = (PointClick) this.actArray[0];
                  say.pc1 = (PointClick) this.actArray[1];
              }
          }
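The Route rule above is essentially slot filling: two click slots and one command slot, with integration possible only once all three are filled. A minimal sketch of the same idea with named, typed fields instead of an array follows. The class RouteRule, its string result, and the x/y fields of PointClick are assumptions made for illustration; the original classes are only partly visible on the slide.

```java
// Hypothetical, simplified input types (not the original classes).
abstract class MMAction { }

final class PointClick extends MMAction {
    final int x, y;
    PointClick(int x, int y) { this.x = x; this.y = y; }
}

final class RouteShow extends MMAction { }   // stands for the spoken route command

/** Slot-filling rule: two clicks plus one spoken command make a route request. */
final class RouteRule {
    private PointClick from;
    private PointClick to;
    private RouteShow command;

    /** Parsing step: put the action into the first free slot that fits, if any. */
    void addMMAction(MMAction action) {
        if (action instanceof PointClick) {
            if (from == null) {
                from = (PointClick) action;
            } else if (to == null) {
                to = (PointClick) action;
            }
        } else if (action instanceof RouteShow && command == null) {
            command = (RouteShow) action;
        }
    }

    boolean isComplete() {
        return from != null && to != null && command != null;
    }

    /** Integration step: describe the route once every slot is filled, else null. */
    String integrate() {
        if (!isComplete()) return null;
        return "route (" + from.x + "," + from.y + ") -> (" + to.x + "," + to.y + ")";
    }
}

final class RouteRuleDemo {
    public static void main(String[] args) {
        RouteRule rule = new RouteRule();
        rule.addMMAction(new PointClick(1, 2));
        rule.addMMAction(new RouteShow());
        rule.addMMAction(new PointClick(3, 4));
        System.out.println(rule.integrate());
    }
}
```

As in the slide's version, actions that find no free slot are silently ignored, so a third click does not overwrite the first two.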

  17. Optimizing parsing and using probabilistic information

          public MMReaction[] getReactions(String id) {
              ...
              while (actions.hasNext()) {
                  MMAction mma = (MMAction) actions.next();
                  // Here the actions are offered to the rules in partialRuleList.
                  ListIterator rules = partialRuleList.listIterator();
                  while (rules.hasNext()) {
                      ((MMRule) rules.next()).addMMAction(mma);
                  }
              }
              ...
          }

      1. Add a probability to each MMAction, based on empirical investigations (usability studies).
      2. Calculate the probability after integration, using either a specific rule for each MMRule or a global rule based on the timestamp variable of MMObject. For example, the SpeechCommand is more likely to occur between the PointClick commands than before them.

          public void integrateActions() {
              ...
              ((ShowRoute) this.reacArray[0]).calcProb();
              ((SayRoute) this.reacArray[1]).calcProb();
              ...
          }
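The slides do not show what calcProb computes. One possible reading of the two points above is sketched here: the per-action probabilities are multiplied and then weighted by an ordering factor derived from the timestamps. The class RouteScore, the weights 0.8 and 0.2, and the assumption that the action probabilities are independent are all hypothetical and would have to come from usability studies in practice.

```java
/** Hypothetical scoring helper; all numeric weights are illustrative only. */
final class RouteScore {
    static final double BETWEEN_CLICKS = 0.8;   // assumed weight, not from the talk
    static final double OTHER_ORDER = 0.2;      // assumed weight, not from the talk

    /**
     * Combines the per-action probabilities with an ordering factor derived
     * from the timestamps (in milliseconds) of the two clicks and the speech
     * command. Treats the three probabilities as independent (an assumption).
     */
    static double calcProb(double pClick0, double pClick1, double pSpeech,
                           long tClick0, long tClick1, long tSpeech) {
        long first = Math.min(tClick0, tClick1);
        long last = Math.max(tClick0, tClick1);
        boolean between = tSpeech >= first && tSpeech <= last;
        double ordering = between ? BETWEEN_CLICKS : OTHER_ORDER;
        return pClick0 * pClick1 * pSpeech * ordering;
    }

    public static void main(String[] args) {
        // Speech command falls between the two clicks.
        System.out.println(calcProb(0.9, 0.9, 0.7, 1000, 3000, 2000));
        // Speech command precedes both clicks.
        System.out.println(calcProb(0.9, 0.9, 0.7, 1000, 3000, 500));
    }
}
```

With scores like these, an integrator could rank the reactions of several completed rules instead of returning the first one that fired.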

  18. Servlet/Midlet architecture. Act method of SayRoute and ShowRoute

      For SayRoute, the act method is executed in the context of a Servlet:

          public void act(Object obj) throws Exception {
              HttpServletResponse response = (HttpServletResponse) obj;
              response.setContentType(res.getString("contenttype"));
              PrintWriter out = response.getWriter();
              out.println(res.getString("xmlversion"));
              out.println(res.getString("vxmlversion"));
              ...
              // Print VoiceXML page here
              ...
          }

      For ShowRoute, the act method is executed in the context of an Applet/Midlet. The Applet/Midlet implements MapInterface:

          public void act(Object obj) throws Exception {
              ((MapInterface) obj).drawRoute(pc0.getPoint(), pc1.getPoint());
          }
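The point of the shared act(Object) signature is that the same integration result can drive a voice response on the server and a drawing call on the client. The sketch below shows that dispatch without the Servlet or MIDP APIs: a PrintWriter stands in for HttpServletResponse, and MapInterface is reduced to a single method taking coordinate arrays. SayRouteSketch, ShowRouteSketch, and the printed VoiceXML text are hypothetical.

```java
import java.io.PrintWriter;

// Common contract, mirroring the act(Object) signature shown on the slide.
interface MMReaction {
    void act(Object context) throws Exception;
}

/** Simplified stand-in for the map client; the slide calls this MapInterface. */
interface MapInterface {
    void drawRoute(int[] from, int[] to);
}

/** Server side: writes a small VoiceXML page to the given writer. */
final class SayRouteSketch implements MMReaction {
    public void act(Object context) {
        // The original casts to HttpServletResponse; a PrintWriter is used here
        // so that the sketch has no dependency on the Servlet API.
        PrintWriter out = (PrintWriter) context;
        out.println("<?xml version=\"1.0\"?>");
        out.println("<vxml version=\"1.0\"><form><block>Route shown.</block></form></vxml>");
        out.flush();
    }
}

/** Client side: asks the map component to draw the route between two points. */
final class ShowRouteSketch implements MMReaction {
    private final int[] from;
    private final int[] to;

    ShowRouteSketch(int[] from, int[] to) {
        this.from = from;
        this.to = to;
    }

    public void act(Object context) {
        ((MapInterface) context).drawRoute(from, to);
    }
}

final class ActDemo {
    public static void main(String[] args) throws Exception {
        MMReaction say = new SayRouteSketch();
        say.act(new PrintWriter(System.out));

        MMReaction show = new ShowRouteSketch(new int[] {1, 2}, new int[] {3, 4});
        MapInterface console = (f, t) ->
                System.out.println("draw " + f[0] + "," + f[1] + " -> " + t[0] + "," + t[1]);
        show.act(console);
    }
}
```

Passing the context as Object keeps the reaction classes independent of any one runtime, at the cost of an unchecked cast in each act method.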

  19. Servlet/Midlet architecture

  20. VoiceXML

      Dialogs:

          <?xml version="1.0" encoding="Cp1252"?>
          <!DOCTYPE vxml PUBLIC '-//Nuance/DTD VoiceXML 1.0//EN'
              'http://voicexml.nuance.com/dtd/nuancevoicexml-1-2.dtd'>
          <vxml>
            <form id="form1">
              <field name="pagename">
                <grammar src="http://mars.ftw.tuwien.ac.at/callmanag/gram/Main.grammar#Main"
                         type="text/gsl" />
                <!-- English: "You can leave a message, listen to a note, or access the calendar." -->
                <prompt bargein="true">Sie können eine Nachricht hinterlassen eine Notiz abhören oder auf den Kalender zugreifen</prompt>
                <filled mode="any">
                  <submit method="get" enctype="application/x-www-form-urlencoded"
                          next="http://mars.ftw.tuwien.ac.at/callmanag/servlet/at.ftw.voicexml.GetVoiceXMLPageServlet"
                          namelist="pagename" />
                </filled>
                <catch event="noinput">
                  <reprompt />
                </catch>
              </field>
            </form>
          </vxml>

      Grammars (GSL). The rule below matches German phrases such as "eine neue Nachricht hinterlassen bitte" ("leave a new message, please") and returns the name of the page to load:

          [
            ( (?eine ?neue nachricht) ?[hinterlassen aufnehmen aufzeichnen hinterlegen] ?bitte )
              { return("storemessage.vxml") }
          ]
