
eNTERFACE ’08 Project 2 “Multimodal High Level Data Integration” Final Report August 29th, 2008



Presentation Transcript


  1. eNTERFACE ’08 Project 2 “Multimodal High Level Data Integration” Final Report, August 29th, 2008

  2. Application challenges: 2 users in their home/office environment • unrestricted natural language • free human behavior

  3. Components integrated • Audio path: Audio Stream → Sound Waves → Speech Recognizer → Recognized String → Syntactic Analyzer → Syntactic Triple → Semantic Analyzer → Linguistic meanings → Fusion Mechanism • Video path: Video Stream → Sequence of Images → Video Analyzer → Movements Coordinates → Human Behavior Analyzer → Movements Meanings → Fusion Mechanism • Fusion Mechanism (with Knowledge Base) → Advise People

  4. Audio path: Audio Stream → Sound Waves → Sphinx-4 → Recognized String → C&C Parser → Syntax Analysis → C&C Boxer → Linguistic meanings → Fusion Mechanism • Video path: Video Stream → Sequence of Images → OpenCV → Movements Coordinates → Human Behavior Analyzer → Movements Meanings → Fusion Mechanism • Fusion Mechanism (with Protégé, Jena, Semantic Validation) → Advise People

  5. Example scenario: [Ronald] I want to call Nick. Nick mentioned that he attended a wine tasting course. [Beto] It sounds interesting, I like wine. [Ronald] Actually I plan to join the next class. He also mentioned a book about French wines, but I cannot recall the name of the author. [Beto] Why don't you send a mail to Nick? [Ronald] Maybe I can find a book about it in the library. [Beto] Yes, you are right. [Beto] Did you find it? [Ronald] Yes, I did.

  6. Hints for plan recognition by speech • Alerts: want, need, wish, require, going to, plan, look for, wonder, can, may, must, do you know, do we have, etc. • Stop-alerts: negation (I am not going to…), past tense (Yesterday I was going to…)
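The alert and stop-alert cues on this slide can be illustrated with a small keyword matcher. This is only a minimal sketch: the cue words come from the slide, but the function name `detect_plan_hint`, the regular-expression matching, and the short past-tense list are assumptions made for illustration. The project itself analyzed speech with a syntactic and semantic parser rather than plain keyword search.

```python
import re

# Alert cues as listed on the slide (the slide ends the list with "etc.").
ALERTS = ["want", "need", "wish", "require", "going to", "plan", "look for",
          "wonder", "can", "may", "must", "do you know", "do we have"]

# Stop-alert patterns: assumed, deliberately crude approximations of
# "negation" and "past tense" for this sketch.
NEGATION_RE = re.compile(r"\b(?:not|never|cannot)\b|n't\b")
PAST_RE = re.compile(r"\b(?:was|were) going to\b"
                     r"|\b(?:wanted|needed|wished|planned)\b")


def _contains(phrase, text):
    # Whole-word match, so "can" does not fire on "cannot" or "may" on "maybe".
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def detect_plan_hint(utterance):
    """Return the alert cues found in the utterance, or None when there is
    no cue or when a stop-alert (negation, past tense) cancels it."""
    text = utterance.lower()
    alerts = [cue for cue in ALERTS if _contains(cue, text)]
    if not alerts:
        return None
    if NEGATION_RE.search(text) or PAST_RE.search(text):
        return None
    return alerts
```

Applied to the example scenario, "I want to call Nick." yields the cue `want`, while "I am not going to call Nick." is suppressed by the negation stop-alert. Inflected forms such as "wants" are not matched by this sketch.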

  7. Maybe I can find a book about it in the library • Ronald is moving towards the book shelves

  8. Decision making • If (Ronald) [wants to send] {email to Nick} & (Ronald [is moving to] {the computer} | He [is close to] {the computer}) then open the mail client with the “to” field filled with nick@uclouvain.be • If (Ronald) [can] find {book} [about] {it} [in] {the library} & (Ronald [is moving to] {the library}) then There is a book about French wines on the first shelf. • If (Ronald) [can] find {book} [about] {it} [in] {the library} & (Ronald [is moving to] {the computer}) then open a web search website and put the keyword in the search field.
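The three rules above can be written down as data plus a lookup, which makes the "speech intent & observed movement → advice" structure explicit. This is a hedged sketch, not the project's fusion mechanism: the `Triple` and `Rule` classes and the `decide` function are hypothetical names, and the longer speech pattern "[can] find {book} [about] {it} [in] {the library}" is flattened into a single subject/predicate/object triple for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Rule:
    speech: Triple     # what the person said (linguistic meaning)
    movements: tuple   # any one of these observed movements is enough
    advice: str        # what the system does or says


# Speech triples reused by several rules (flattened, see the note above).
SEND_MAIL = Triple("Ronald", "wants to send", "email to Nick")
FIND_BOOK = Triple("Ronald", "can find", "book about it in the library")

RULES = [
    Rule(SEND_MAIL,
         (Triple("Ronald", "is moving to", "the computer"),
          Triple("Ronald", "is close to", "the computer")),
         "Open the mail client with the 'to' field filled with nick@uclouvain.be"),
    Rule(FIND_BOOK,
         (Triple("Ronald", "is moving to", "the library"),),
         "There is a book about French wines on the first shelf."),
    Rule(FIND_BOOK,
         (Triple("Ronald", "is moving to", "the computer"),),
         "Open a web search website and put the keyword in the search field."),
]


def decide(speech, movement, rules=RULES):
    """Return the advice of the first rule whose speech triple matches and
    whose movement alternatives include the observed movement."""
    for rule in rules:
        if rule.speech == speech and movement in rule.movements:
            return rule.advice
    return None
```

For instance, `decide(FIND_BOOK, Triple("Ronald", "is moving to", "the library"))` returns the book-shelf advice, while the same speech combined with a move towards the computer returns the web-search advice. The same utterance therefore leads to different advice depending on the video modality.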

  9. Achievements • spatial relationships (based on the fixed “anchor” objects in the room) • semantic fusion of events not coinciding in time • good results in speaker identification: synchronisation between image and speech identification • an open framework to manage fusion between two (our case) or more modalities was created during the project and will be enhanced further • each component can run on a separate machine thanks to the distribution mechanism, which exchanges data over a TCP/IP network.
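The slides do not say how events that do not coincide in time are fused. One common approach, shown here purely as an assumption, is to pair each speech event with the nearest video event of the same person inside a tolerance window. The `Event` class, the `fuse` function, and the 10-second default window are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    modality: str   # "speech" or "video"
    actor: str      # who spoke or moved
    meaning: str    # linguistic meaning or movement meaning
    t: float        # timestamp in seconds since session start


def fuse(events, window=10.0):
    """Pair each speech event with the closest-in-time video event of the
    same actor, provided the two are at most `window` seconds apart."""
    speech = [e for e in events if e.modality == "speech"]
    video = [e for e in events if e.modality == "video"]
    pairs = []
    for s in speech:
        candidates = [v for v in video
                      if v.actor == s.actor and abs(v.t - s.t) <= window]
        if candidates:
            nearest = min(candidates, key=lambda v: abs(v.t - s.t))
            pairs.append((s, nearest))
    return pairs


events = [
    Event("speech", "Ronald", "can find book about it in the library", 12.0),
    Event("video", "Ronald", "is moving towards the book shelves", 15.5),
    Event("video", "Beto", "is close to the computer", 13.0),
]
```

With the default window, `fuse(events)` pairs Ronald's utterance with his movement 3.5 seconds later and ignores Beto's movement. Shrinking the window to 2 seconds produces no pair, which shows how the tolerance controls the fusion of non-simultaneous events.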

  10. Future work • implement effective learning • efficient decision making even from information fragments • spatial relationships relative to moving people • 3D video analysis • detection of the orientation of the people in the scene • eye gaze tracking • recognition of various types of gestures • dealing with natural language redundancy (repeating the same idea in different words)

  11. Further development of results • integration on the OpenInterface platform (openinterface.org) • create an open-source community around the project to gain ideas and contributions from outside and to have new modalities to fuse • create a website, a forum, a mailing list
