CS 6998 Computational Approach to Emotional Speech Instructor: Prof. Julia Hirschberg


Julia’s Little Helper: A Real-time Demo of Cantonese/Mandarin Emotional Speech Detection. Suzanne Yuen, Mechanical Engineering. William Y. Wang, Computer Science. CS 6998 Computational Approach to Emotional Speech. Instructor: Prof. Julia Hirschberg. Columbia University, 12/21/2009.

Presentation Transcript
slide1

Julia’s Little Helper:

A Real-time Demo of Cantonese/Mandarin

Emotional Speech Detection

Suzanne Yuen

Mechanical Engineering

William Y. Wang

Computer Science

CS 6998

Computational Approach to Emotional Speech

Instructor: Prof. Julia Hirschberg

Columbia University 12/21/2009

slide2

Review

  • Target Languages: Cantonese (9 tones), Mandarin (4 tones)
  • Target Emotions: Anger and Gladness
  • Lexical Features: ASR using an HMM acoustic model trained on Mandarin Broadcast News [1] and a simple hand-written decoding dictionary.
  • Prosodic Features: Energy and Tonal Features
  • Real-time drawing of pitch contour, waveform and energy.
  • A text-to-speech agent to greet and teach user how to use this demo.
  • [1] Yang Shao, Lan Wang, E-Seminar: an Audio-guide e-Learning System, IEEE International Workshop on Education Technology and Training (ETT) 2008.
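
The energy and pitch-contour displays listed above depend on frame-level measurements. The following is a minimal sketch of two common ones, short-time RMS energy and an autocorrelation-based pitch estimate. The slides do not say which algorithms the demo used, so the function names, frame sizes, and the 75-500 Hz search range here are all assumptions for illustration.

```python
import math

def frame_rms(samples, frame_len, hop):
    """Root-mean-square energy of each full frame of a mono signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies

def autocorr_pitch(frame, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate F0 in Hz by picking the strongest autocorrelation lag.

    The fmin/fmax search range is an assumed default, not a value
    taken from the slides. Returns 0.0 if no positive peak is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0

# Synthetic 200 Hz tone sampled at 8 kHz, standing in for recorded speech.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(800)]

energy = frame_rms(tone, frame_len=400, hop=200)  # one value per frame
f0 = autocorr_pitch(tone[:400], sr)               # about 200 Hz for this tone
```

Plotting `energy` and per-frame `f0` values against time gives the kind of energy curve and pitch contour the demo draws; real speech would also need voicing detection, which this sketch omits.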
slide3

Lexical Scoring 1-3 pts

Energy 1 pt

Tone 1 pt
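
One way to read this point scheme is as a sum that yields a rating out of five. The sketch below assumes exactly that; the slide does not state how the three components are combined, and `star_rating` and its parameters are hypothetical names.

```python
def star_rating(lexical_points, energy_ok, tone_ok):
    """Combine the three components into a 1-5 rating.

    Assumption: the points are simply added. The lexical component
    contributes 1-3 points, energy and tone contribute 0 or 1 each.
    """
    if not 1 <= lexical_points <= 3:
        raise ValueError("lexical_points must be between 1 and 3")
    return lexical_points + int(bool(energy_ok)) + int(bool(tone_ok))

best = star_rating(3, energy_ok=True, tone_ok=True)     # 5
worst = star_rating(1, energy_ok=False, tone_ok=False)  # 1
```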

slide4

Dictionary of Affects in Language

by Dr. Cynthia Whissell

Total words: 8,742

Source: Developed from a variety of sources, for example college student essays, interviews, and teenagers' descriptions of their own emotional states. This gives it broad coverage and reduces the risk of biased data.

slide5

Sentence Lexical Scoring

“I won best paper award!”

Score = (2.375 + 2.5556 + 2.5455 + 1.2857 + 2.8333) / 5 = 2.319
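
The sentence score above is the arithmetic mean of the per-word scores. A minimal sketch of that computation follows. Pairing each score with a word in slide order is an assumption, and `sentence_score` and the skip-unknown-words behavior are illustrative choices rather than the project's documented method.

```python
def sentence_score(words, lexicon):
    """Mean lexicon score over the words found in the lexicon.

    Words missing from the lexicon are skipped (an assumed policy).
    Returns None when no word is covered.
    """
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)

# Scores copied from the slide; the word-to-score pairing is assumed.
lexicon = {
    "i": 2.375,
    "won": 2.5556,
    "best": 2.5455,
    "paper": 1.2857,
    "award": 2.8333,
}

score = sentence_score(["i", "won", "best", "paper", "award"], lexicon)
print(round(score, 3))  # 2.319
```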

slide6

Machine Translation

Multilingual Challenges: English → Chinese

slide7

Encoding and Mapping

  • Tasks:
    1. Mandarin → Pinyin (phone set used by the acoustic model)
    2. Mandarin → Cantonese
       Note that not every Mandarin word has an exact, direct counterpart in Cantonese, and vice versa.
    3. Cantonese → Pinyin
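
The first two mapping tasks can be pictured as table lookups that must tolerate missing entries. The sketch below uses tiny hand-made tables purely for illustration. The entries, the function names, and the convention of writing the neutral tone as "5" are assumptions, not the project's actual dictionary. The third task would need a further table of the same shape.

```python
# Toy lookup tables; illustrative entries only, not the project's data.
MANDARIN_TO_PINYIN = {
    "你好": "ni3 hao3",
    "谢谢": "xie4 xie5",  # neutral tone written as 5 (assumed convention)
}
MANDARIN_TO_CANTONESE = {
    "谢谢": "多謝",
    "什么": "乜嘢",
}

def to_pinyin(word):
    """Return the Pinyin string for a Mandarin word, or None if unlisted."""
    return MANDARIN_TO_PINYIN.get(word)

def to_cantonese(word):
    """Return a Cantonese counterpart, or None when no direct mapping exists."""
    return MANDARIN_TO_CANTONESE.get(word)

for word in ["你好", "谢谢", "什么"]:
    print(word, to_pinyin(word), to_cantonese(word))
```

Returning `None` instead of raising makes the gaps between the two vocabularies explicit, so a caller can decide whether to skip the word or fall back to another resource.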
slide8

Text-to-speech Engine

  1. Implement the text-to-speech engine.
  2. “Play with” a text-to-speech engine.
  3. Engine: TruVoice
      • Lernout & Hauspie Speech Products, or L&H
      • Went bankrupt in 2001
      • Technology now owned by Nuance
slide9

L&H TTS Functionality

  • Developed in 1997
  • Advanced text pre-processing and no vocabulary restrictions
  • User-definable pronunciation dictionary
  • Accurately pronounces surnames and place names
  • Flexible pitch, volume and speech rate
  • Intonation support for punctuation
slide10

Test Overview

  • Participants –
    • Gender – 6 male, 6 female
    • Native language – 6 Mandarin, 6 Cantonese
  • Two Parts
    • JLH module and self-rating (24 lines total)
    • Perception test – Rating lines from others (72)
slide11

Sentences

  • Three types – questions, exclamations, statements
  • Randomized order of sentences for each participant
  • Examples:
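
The per-participant randomization mentioned on this slide could be done as sketched below. Seeding from a participant ID so that each order can be reproduced later is an assumed design choice, and `presentation_order`, the seed value, and the placeholder sentence labels are hypothetical.

```python
import random

def presentation_order(sentences, participant_id, seed=2009):
    """Return a shuffled copy of the sentences for one participant.

    The generator is seeded from the (hypothetical) participant ID,
    so the same participant always gets the same order.
    """
    rng = random.Random(f"{seed}-{participant_id}")
    order = list(sentences)  # copy, leaving the master list untouched
    rng.shuffle(order)
    return order

# Placeholder labels standing in for the 24 lines each participant read.
sentences = [f"sentence {i}" for i in range(1, 25)]
order_p1 = presentation_order(sentences, participant_id=1)
order_p1_again = presentation_order(sentences, participant_id=1)
```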
slide12

Analysis

  • Plan to examine differences in, and effects of, the following:
    • Ratings - JLH star rating, self rating, & 3 perception ratings
    • Language – Cantonese, Mandarin
    • Gender – female, male
    • Sentence structure – exclamation, question, & statement
  • Interesting points –
    • Huge range of Chinese accents
    • Tones of words may change depending on neighboring words (tone sandhi; compare English "a mug" vs. "an umbrella")
    • Variations in colloquial speech, addressed by using Chinese script
slide13

Future Work

  • Improve the prosodic analysis. More features should be explored.
  • Improve the lexical scoring. Use a POS tagger or other NLP tools to weight different constituents of the recognized sentence.
  • Use finer-grained emotion types and investigate the differences among them.
  • Study translational divergence in English-Chinese MT.