CS 6998 Computational Approach to Emotional Speech Instructor: Prof. Julia Hirschberg


Julia’s Little Helper: A Real-time Demo of Cantonese/Mandarin Emotional Speech Detection. Suzanne Yuen, Mechanical Engineering. William Y. Wang, Computer Science. CS 6998 Computational Approach to Emotional Speech. Instructor: Prof. Julia Hirschberg. Columbia University, 12/21/2009.

Presentation Transcript
slide1

Julia’s Little Helper:

A Real-time Demo of Cantonese/Mandarin

Emotional Speech Detection

Suzanne Yuen

Mechanical Engineering

William Y. Wang

Computer Science

CS 6998

Computational Approach to Emotional Speech

Instructor: Prof. Julia Hirschberg

Columbia University 12/21/2009

slide2

Review

  • Target Languages: Cantonese (9 tones), Mandarin (4 tones)
  • Target Emotions: Anger and Gladness
  • Lexical Features: ASR using an HMM acoustic model trained on Mandarin Broadcast News [1] and a simple hand-written decoding dictionary.
  • Prosodic Features: Energy and Tonal Features
  • Real-time drawing of pitch contour, waveform and energy.
  • A text-to-speech agent to greet and teach user how to use this demo.
  • [1] Yang Shao and Lan Wang, "E-Seminar: an Audio-guide e-Learning System," IEEE International Workshop on Education Technology and Training (ETT), 2008.
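
As a rough illustration of the energy feature listed above, here is a minimal sketch of short-time RMS energy over a mono signal. The slides do not say how energy was computed; the function name `short_time_energy`, the frame length, and the hop size are all assumptions made for this example.

```python
import math

def short_time_energy(samples, frame_len=400, hop=160):
    """Return the RMS energy of each frame of a mono signal.

    frame_len and hop are in samples; 400/160 would be 25 ms / 10 ms
    at a 16 kHz sampling rate (assumed values, not from the slides).
    """
    energies = []
    last_start = max(len(samples) - frame_len + 1, 1)
    for start in range(0, last_start, hop):
        frame = samples[start:start + frame_len]
        if not frame:
            break  # empty input: nothing to measure
        energies.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return energies

# Usage: a louder sine wave yields higher per-frame energy than a quiet one.
loud = [0.8 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(1600)]
quiet = [0.1 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(1600)]
loud_energy = short_time_energy(loud)
quiet_energy = short_time_energy(quiet)
```

A per-frame series like this is also what a real-time energy plot would draw, one value per hop.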
slide3

Lexical Scoring 1-3 pts

Energy 1 pt

Tone 1 pt
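
One way to read the point budget above is as a five-point total: up to 3 points from the lexical score plus 1 point each for energy and tone. The sketch below assumes exactly that. The slide gives only the point ranges, so the clamping of the lexical score and the yes/no treatment of energy and tone are assumptions, and `star_rating` is a hypothetical name.

```python
def star_rating(lexical_score, energy_hit, tone_hit):
    """Combine the three components into a score between 1 and 5.

    lexical_score: sentence-level lexical score, clamped to [1, 3] (assumed)
    energy_hit:    True if the energy cue matched the target emotion (assumed)
    tone_hit:      True if the tonal cue matched the target emotion (assumed)
    """
    lexical_pts = min(3.0, max(1.0, lexical_score))
    return lexical_pts + (1 if energy_hit else 0) + (1 if tone_hit else 0)

# Usage: a lexical score of 2.319 with a matching energy cue only.
rating = star_rating(2.319, energy_hit=True, tone_hit=False)
```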

slide4

Dictionary of Affects in Language

by Dr. Cynthia Whissell

Total words: 8,742

Source: Compiled from a variety of sources, including college student essays, interviews, and teenagers' descriptions of their own emotional states, which gives it broad coverage and helps avoid biased data.

slide5

Sentence Lexical Scoring

“I won best paper award!”

Score = (2.375 + 2.5556 + 2.5455 + 1.2857 + 2.8333) / 5 = 2.319
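
The averaging above can be sketched in a few lines. This is a minimal example, not the demo's actual code: the pairing of each score with a word (taken in sentence order), the tokenizer, and the choice to skip out-of-dictionary words are assumptions, and `sentence_score` is a hypothetical name.

```python
import re

# Per-word scores from the slide, paired with words in sentence order (assumed).
AFFECT_SCORES = {
    "i": 2.375,
    "won": 2.5556,
    "best": 2.5455,
    "paper": 1.2857,
    "award": 2.8333,
}

def sentence_score(sentence, affect_scores):
    """Average the affect scores of the words found in the dictionary.

    Words missing from the dictionary are skipped (an assumption);
    returns None if no word is covered.
    """
    words = re.findall(r"[a-z']+", sentence.lower())
    known = [affect_scores[w] for w in words if w in affect_scores]
    if not known:
        return None
    return sum(known) / len(known)

score = sentence_score("I won best paper award!", AFFECT_SCORES)
print(round(score, 3))  # 2.319
```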

slide6

Machine Translation

Multilingual Challenges: English-Chinese

slide7

Encoding and Mapping

  • Tasks:
  • 1. Mandarin → Pinyin (phone set used by the acoustic model)
  • 2. Mandarin → Cantonese
  • Note that not all Mandarin words have exact, direct counterparts in Cantonese, and vice versa.
  • 3. Cantonese → Pinyin
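
The mapping tasks above amount to table lookups with a fallback for entries that have no direct counterpart. A minimal sketch follows, using tiny hand-made tables. The table contents, the tone-number Pinyin notation, and the function names are illustrative assumptions and not the demo's actual dictionaries.

```python
# Tiny illustrative tables (assumed; the real dictionaries are not shown in the slides).
MANDARIN_TO_PINYIN = {"你": "ni3", "好": "hao3", "我": "wo3"}
MANDARIN_TO_CANTONESE = {"他": "佢", "不": "唔", "什么": "乜嘢"}

def to_pinyin(text):
    """Map each character to Pinyin; unmapped characters become None."""
    return [MANDARIN_TO_PINYIN.get(ch) for ch in text]

def to_cantonese(words):
    """Map Mandarin words to Cantonese words.

    Words without a direct mapping are passed through unchanged and
    also reported in a separate list so they can be handled by hand.
    """
    mapped, missing = [], []
    for word in words:
        if word in MANDARIN_TO_CANTONESE:
            mapped.append(MANDARIN_TO_CANTONESE[word])
        else:
            mapped.append(word)
            missing.append(word)
    return mapped, missing

pinyin = to_pinyin("你好")
cantonese, unmapped = to_cantonese(["他", "什么", "电脑"])
```

Reporting the unmapped words separately reflects the note in the list: a one-to-one table cannot cover every Mandarin-Cantonese pair.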
slide8

Text-to-speech Engine

  • 1. Implement the text-to-speech engine.
  • 2. “Play with” a text-to-speech engine.
  • 3. Engine: TruVoice
      • Lernout & Hauspie Speech Products, or L&H
      • Went bankrupt in 2001
      • Technology now owned by Nuance
slide9

L&H TTS Functionality

  • Developed in 1997
  • Advanced text pre-processing and no vocabulary restrictions
  • User-definable pronunciation dictionary
  • Accurately pronounces surnames and place names
  • Flexible pitch, volume and speech rate
  • Intonation support for punctuation
slide10

Test Overview

  • Participants –
    • Gender – 6 male, 6 female
    • Native language – 6 Mandarin, 6 Cantonese
  • Two Parts
    • JLH module and self-rating (24 lines total)
    • Perception test – Rating lines from others (72)
slide11

Sentences

  • Three types – questions, exclamations, statements
  • Randomized order of sentences for each participant
  • Examples:
slide12

Analysis

  • Plan to examine the differences and effects of the following:
    • Ratings - JLH star rating, self rating, & 3 perception ratings
    • Language – Cantonese, Mandarin
    • Gender – female, male
    • Sentence structure – exclamation, question, & statement
  • Interesting points –
    • Huge range of Chinese accents
    • Tones of words may change depending on adjacent words (analogous to English "a mug" vs. "an umbrella")
    • Variations in colloquial speech, addressed by using Chinese script
slide13

Future Work

  • Improve the prosodic analysis. More features should be explored.
  • Improve the lexical scoring. Use a POS tagger or other NLP tools to weight different constituents of the recognized sentence.
  • Use finer-grained emotion types and investigate the differences among them.
  • Study translational divergence in English-Chinese MT.