
CS 6998 Computational Approach to Emotional Speech Instructor: Prof. Julia Hirschberg

Julia’s Little Helper : A Real-time Demo of Cantonese/Mandarin Emotional Speech Detection. Suzanne Yuen Mechanical Engineering. William Y. Wang Computer Science. CS 6998 Computational Approach to Emotional Speech Instructor: Prof. Julia Hirschberg Columbia University 12/21/2009.


Presentation Transcript


  1. Julia’s Little Helper: A Real-time Demo of Cantonese/Mandarin Emotional Speech Detection
     Suzanne Yuen (Mechanical Engineering), William Y. Wang (Computer Science)
     CS 6998 Computational Approach to Emotional Speech, Instructor: Prof. Julia Hirschberg
     Columbia University, 12/21/2009

  2. Review
     • Target Languages: Cantonese (9 tones), Mandarin (4 tones)
     • Target Emotions: Anger and Gladness
     • Lexical Features: ASR using an HMM acoustic model trained on Mandarin Broadcast News [1] and a simple hand-written decoding dictionary
     • Prosodic Features: Energy and Tonal Features
     • Real-time drawing of the pitch contour, waveform, and energy
     • A text-to-speech agent that greets the user and explains how to use the demo
     [1] Yang Shao, Lan Wang, "E-Seminar: an Audio-guide e-Learning System," IEEE International Workshop on Education Technology and Training (ETT), 2008.

  3. Lexical Scoring: 1-3 pts • Energy: 1 pt • Tone: 1 pt
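One way to read this slide is that the three components add up to an overall rating of at most five points. The Python sketch below assumes exactly that. The function name `jlh_points`, the 0-or-1 treatment of the energy and tone points, and the plain sum are all assumptions, since the slide lists only the point ranges.

```python
def jlh_points(lexical_pts, energy_hit, tone_hit):
    """Combine the three components from the slide into one total.

    lexical_pts: integer from 1 to 3 (lexical scoring).
    energy_hit, tone_hit: booleans, each worth 1 point when True (assumed).
    """
    if lexical_pts not in (1, 2, 3):
        raise ValueError("lexical points must be 1, 2, or 3")
    # Assumed combination rule: a simple sum of the three components.
    return lexical_pts + int(bool(energy_hit)) + int(bool(tone_hit))


print(jlh_points(3, True, True))    # 5
print(jlh_points(1, False, False))  # 1
```

Under these assumptions the total always falls between 1 and 5.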

  4. Dictionary of Affect in Language, by Dr. Cynthia Whissell
     • Total words: 8,742
     • Source: developed from a variety of sources, for example college student essays, interviews, and teenagers' descriptions of their own emotional states. This gives it broad coverage and helps avoid biased data.

  5. Sentence Lexical Scoring
     • Example: “I won best paper award!”
     • The sentence score is the mean of the five per-word scores: Score = (2.375 + 2.5556 + 2.5455 + 1.2857 + 2.8333) / 5 = 2.319
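The score on this slide is the arithmetic mean of five per-word values. A minimal Python sketch of that calculation follows. The function name `sentence_lexical_score` is hypothetical, and the slide does not say which word each value belongs to or which dictionary dimension is used, so the values are passed as a plain list in the order shown.

```python
def sentence_lexical_score(word_scores):
    """Return the mean of the per-word affect scores for one sentence."""
    if not word_scores:
        raise ValueError("need at least one scored word")
    return sum(word_scores) / len(word_scores)


# Per-word values copied from the slide for "I won best paper award!".
slide_scores = [2.375, 2.5556, 2.5455, 1.2857, 2.8333]
score = sentence_lexical_score(slide_scores)
print(round(score, 3))  # 2.319
```

How to handle a recognized word that is missing from the dictionary is not covered on the slide, so this sketch simply expects the caller to pass only scored words.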

  6. Machine Translation
     • Multilingual Challenges: English-Chinese

  7. Encoding and Mapping
     • Tasks:
        1. Mandarin → Pinyin (phone set used by the acoustic model)
        2. Mandarin → Cantonese. Note that not every Mandarin word has an exact, direct counterpart in Cantonese, and vice versa.
        3. Cantonese → Pinyin
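The mapping tasks on this slide can be sketched as table lookups that report words with no entry instead of guessing, which reflects the note that some words have no direct counterpart. This is only an illustration in Python: the tables are three-entry toys, the tone-numbered Pinyin spelling is an assumed notation, and `map_tokens` is a hypothetical helper, not the project's actual code.

```python
# Toy lookup tables (assumed for illustration; the real tables are not shown).
MANDARIN_TO_PINYIN = {"他": "ta1", "是": "shi4", "不": "bu4"}
MANDARIN_TO_CANTONESE = {"他": "佢", "是": "係", "不": "唔"}


def map_tokens(tokens, table):
    """Map each token through `table`; collect tokens that have no entry."""
    mapped, unmapped = [], []
    for tok in tokens:
        if tok in table:
            mapped.append(table[tok])
        else:
            unmapped.append(tok)
    return mapped, unmapped


pinyin, missing = map_tokens(["他", "不", "是", "谁"], MANDARIN_TO_PINYIN)
print(pinyin)   # ['ta1', 'bu4', 'shi4']
print(missing)  # ['谁']

cantonese, _ = map_tokens(["他", "不", "是"], MANDARIN_TO_CANTONESE)
print(cantonese)  # ['佢', '唔', '係']
```

Returning the unmapped tokens separately keeps the gaps visible, so they can be handled by hand or by a fallback rule.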

  8. Text-to-speech Engine
     1. Implement the text-to-speech engine.
     2. “Play with” a text-to-speech engine.
     3. Engine: TruVoice
        • Lernout & Hauspie Speech Products, or L&H
        • Went bankrupt in 2001
        • Technology now owned by Nuance

  9. L&H TTS Functionality
     • Developed in 1997
     • Advanced text pre-processing and no vocabulary restrictions
     • User-definable pronunciation dictionary
     • Accurately pronounces surnames and place names
     • Flexible pitch, volume and speech rate
     • Intonation support for punctuation

  10. Test Overview
     • Participants
        • Gender: 6 male, 6 female
        • Native language: 6 Mandarin, 6 Cantonese
     • Two parts
        • JLH module and self-rating (24 lines total)
        • Perception test: rating lines from others (72)

  11. Sentences
     • Three types: questions, exclamations, statements
     • Randomized order of sentences for each participant
     • Examples:

  12. Analysis
     • Plan to examine the differences and effects of the following:
        • Ratings: JLH star rating, self-rating, and 3 perception ratings
        • Language: Cantonese, Mandarin
        • Gender: female, male
        • Sentence structure: exclamation, question, and statement
     • Interesting points:
        • Huge range of Chinese accents
        • The tone of a word may change depending on the preceding word (comparable to English "a mug" vs. "an umbrella")
        • Variations in colloquial speech, addressed by using Chinese script

  13. Future Work
     • Improve the prosodic analysis by exploring more features.
     • Improve the lexical scoring: use a POS tagger or other NLP tools to weight the different constituents of the recognized sentence.
     • Use finer-grained emotion types and investigate the differences among them.
     • Study translational divergence in English-Chinese MT.

  14. Demo
