
L2F - Spoken Language Systems Lab


zander

Presentation Transcript


  1. L2F - Spoken Language Systems Lab

  2. Genesis • Created in January 2001 as a result of a major restructuring of several groups within and outside INESC-ID Lisbon • Goal: bring together several research groups to contribute to the computational processing of spoken language for European Portuguese; united by the problem we want to solve, not by the technology we share • People: about 10 PhD researchers, 10 PhD students, 3 MSc students, and 12 undergraduate students • Formal cooperation with CLUL (Center of Linguistics of the Univ. of Lisbon)

  3. Mission • Creating technology to bridge the gap between natural spoken language and the underlying semantic information

  4. Lines of Activity • Priority: semantic processing of multimedia contents; spoken dialogue systems platforms • Emerging: computer-enhanced human-to-human communication; automatic transcription of meetings; speech-to-speech translation • Continuing: processing other varieties of Portuguese; e-inclusion; e-learning

  5. Core technologies • Speech Coding • Speech Synthesis (DIXI+) • Speech Recognition (AUDIMUS) • Language / Accent Identification • Natural Language Processing • Dialogue Management

  6. DIXI+ • Continuation of the DIXI project (1991) • Synthesis by concatenation, instead of by rule • More elaborate prosodic models • Developed within the Festival framework • Focused on alternative and augmentative communication applications • Currently under development

  7. AUDIMUS • Continuous speech recognition system for European Portuguese • Hybrid system combining multilayer perceptrons and hidden Markov models (MLP/HMM) • Vocabularies of 5K, 64K, ... words, depending on the task • Stochastic language model of the N-gram type • Speaker-independent or speaker-adapted, depending on the task • First application: radiology reports
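To make the "stochastic language model of the N-gram type" concrete, here is a minimal sketch of a bigram model with unsmoothed maximum-likelihood estimates. It is only an illustration: the slides do not say which N, which smoothing, or which training data AUDIMUS uses, and the function names and the toy corpus are assumptions.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return unigrams, bigrams


def bigram_prob(unigrams, bigrams, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen histories."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


# Toy corpus (hypothetical, not from the slides).
corpus = ["play cd one", "play cd two", "turn on"]
uni, bi = train_bigram(corpus)
```

A real recognizer would add smoothing or back-off so that unseen word pairs do not receive zero probability.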

  8. AUDIMUS results on BN speech recognition • [Chart: Word Error Rate (WER, %)]
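For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between the recognizer output and a reference transcript, divided by the number of reference words. The sketch below shows that standard computation; the function name is an assumption and it is not the lab's own scoring tool.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent using word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.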

  9. Semantic processing of multimedia contents • ALERT: Selective Dissemination of Multimedia Information • IPSOM: Indexing, Integration and Sound Retrieval in Multimedia Documents • Improved access to spoken books for the visually impaired (indexing words, sentences, topics) • Development of multimedia interfaces for accessing and retrieving spoken books (didactic applications, etc.)

  10. Media watch (diagram labels) • Multimedia document • Image / video processing: video-based segmentation (if video contained) • Speech processing: audio-based segmentation and transcription (if audio contained) • Keywords (if text contained) • Automatic topic detection • Match topics found against user profiles • Alert specific users • Multimedia document database • Label database
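The step "match topics found against user profiles" followed by "alert specific users" can be sketched as a set intersection. This is a simplified illustration, assuming topics are plain labels and each profile is a list of labels; the function name and the sample data are hypothetical, and the slides do not describe the actual matching method.

```python
def users_to_alert(detected_topics, user_profiles):
    """Return, in sorted order, the users whose interests overlap the detected topics."""
    detected = set(detected_topics)
    return sorted(
        user
        for user, interests in user_profiles.items()
        if detected & set(interests)  # non-empty intersection means a match
    )


# Hypothetical profiles and topics.
profiles = {"ana": ["sports", "economy"], "rui": ["politics"]}
matched = users_to_alert(["economy", "weather"], profiles)
```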

  11. Spoken dialogue systems • Goal: to develop spoken dialogue systems and intelligent multimodal interfaces: • a phone-based information system; • an "intelligent" demo room controllable by voice; • a story teller: a fully embodied conversational agent for reading stories to children.

  12. 118 - Telephone number synthesizer • "The requested number is xx-xxx-xx-xx, repeat, xx-xxx-xx-xx"

  13. Speech-based interface for a dialogue system • Diagram labels: Telephone, Internet, Database, Updater, Speech, AUDIMUS, SQL, Dialogue, Database, Text, DIXI+, Speech • Telephone speech • Speech recognition (AUDIMUS) of natural language queries • Query understanding and information retrieval from the database • Generation of a natural language reply • Text-to-speech synthesis (DIXI+) adapted to a limited domain

  14. Speech-based control system for a Hi-Fi • The user spoke: "Hi-Fi, turn on and play CD one" • Speech was recognized: "Hi-Fi - turn on and play CD 1" • The computer interprets the command (TURN ON, PLAY CD ONE) and sends the IR command
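As a rough illustration of the "interprets the command and sends the IR command" step, the sketch below maps a recognized utterance to a list of command codes. Everything specific here is an assumption: the `IR_COMMANDS` table, the code names, and the rule of splitting on "and" are hypothetical and are not taken from the actual system.

```python
# Hypothetical phrase-to-IR-code table; the real system's codes are not given.
IR_COMMANDS = {
    "turn on": "POWER_ON",
    "turn off": "POWER_OFF",
    "play cd one": "PLAY_CD_1",
}

DEVICE_NAME = "hi-fi"


def interpret(utterance):
    """Translate a recognized utterance into a list of IR command codes."""
    text = utterance.lower().strip()
    if text.startswith(DEVICE_NAME):
        text = text[len(DEVICE_NAME):]  # drop the device address word
    phrases = [part.strip() for part in text.split(" and ")]
    # Unknown phrases are ignored rather than guessed at.
    return [IR_COMMANDS[p] for p in phrases if p in IR_COMMANDS]


codes = interpret("Hi-Fi turn on and play CD one")
```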

  15. Processing other varieties of Portuguese • Research Topics: • Multi-accent corpora • Multi-accent robust speaker independent ASR • Language and accent ID • Computer Aided Language Learning (CALL)/ e-Learning

  16. E-inclusion: Eugénio - the word genius • Word prediction tool for people with motor impairments • Cooperation with cerebral palsy centers • Public domain tool • New version released in 2003
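A word prediction tool of this kind can be illustrated with a very simple frequency-based completer: given the letters typed so far, suggest the most frequent known words that start with them. This is a minimal sketch under that assumption; the slides do not describe how Eugénio actually ranks its suggestions, and the function names and sample text are hypothetical.

```python
from collections import Counter


def build_lexicon(text):
    """Count word frequencies in a lowercase, whitespace-tokenized text."""
    return Counter(text.lower().split())


def predict(lexicon, prefix, k=3):
    """Return up to k words starting with prefix, most frequent first."""
    prefix = prefix.lower()
    candidates = [(w, c) for w, c in lexicon.items() if w.startswith(prefix)]
    # Sort by descending frequency, then alphabetically to break ties.
    candidates.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in candidates[:k]]


lexicon = build_lexicon("the cat the car the cat can")
suggestions = predict(lexicon, "ca", k=2)
```

Fewer keystrokes per word is the point of such a tool, so a practical version would also use the preceding words as context.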

  17. Synergies with other INESC-ID Groups • Agents • Multimodal HCI • Info. Retrieval • Speech Mining • Spoken Language Systems • Computer Graphics • Source separation • Electronics • Signal Processing

  18. More information • www.l2f.inesc-id.pt • info@l2f.inesc-id.pt
