
Unsupervised Learning for Speech Motion Editing

This paper presents a new statistical representation of facial motion and a decomposition technique that allows for intuitive motion editing. Using Independent Component Analysis (ICA), the semantics of speech-related components are extracted and editing operations are performed in an intuitive manner. The approach is demonstrated on a dataset of speech motion in different emotional states.



Presentation Transcript


  1. Unsupervised Learning for Speech Motion Editing. Yong Cao(1,2), Petros Faloutsos(1), Frederic Pighin(2). (1) University of California, Los Angeles; (2) Institute for Creative Technologies, University of Southern California. Eurographics/SIGGRAPH Symposium on Computer Animation (2003)

  2. Problem • Motion capture is convenient but lacks flexibility • Problem: how do we extract the semantics of the data for intuitive motion editing?

  3. Related Work 1. Face motion synthesis • Physics-based face models: Lee, Terzopoulos, Waters (SIGGRAPH 1995); Kähler, Haber, Seidel (Graphics Interface 2001) • Speech motion synthesis: Bregler, Covell, Slaney (SIGGRAPH 1997); Brand (SIGGRAPH 1999); Ezzat, Pentland, Poggio (SIGGRAPH 2002) 2. Separation of style and content: Brand, Hertzmann (SIGGRAPH 2000); Chuang, Deshpande, Bregler (Pacific Graphics 2002)

  4. Our Contribution • New statistical representation of facial motion • Decomposition into style and content • Intuitive editing operations

  5. Our Contribution [Video: an original neutral motion and the edited sad motion]

  6. Roadmap • Independent Component Analysis (ICA) • Facial motion decomposition • Semantics of components • Motion editing

  7. Independent Component Analysis (ICA) • Statistical technique • Linear transformation • Components are maximally independent

  8. Steps of ICA • Preprocessing (PCA) • Centering • Whitening • ICA decomposition. Decomposition: u = W x (independent components from data); Reconstruction: x = A u, where A is the mixing matrix (the inverse of W)
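The preprocessing steps above (centering, then whitening so the data has identity covariance) can be sketched in numpy. This is a minimal illustration under our own conventions, not the paper's code: the array layout (frames as rows, marker coordinates as columns) and function name are assumptions.

```python
import numpy as np

def center_and_whiten(X):
    """Preprocess motion data for ICA.

    X: (n_frames, n_features) array of stacked motion frames.
    Centering removes the mean pose; whitening rotates and scales
    the data so its covariance becomes the identity matrix,
    which is the standard preprocessing before the ICA rotation.
    Returns the whitened data and the whitening matrix.
    """
    Xc = X - X.mean(axis=0)                 # centering
    cov = np.cov(Xc, rowvar=False)          # feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # PCA via eigendecomposition
    W = eigvecs / np.sqrt(eigvals)          # scale each eigvec column by 1/sqrt(lambda)
    return Xc @ W, W
```

After this step an ICA algorithm (e.g. FastICA) only has to find a rotation of the whitened data that maximizes statistical independence.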

  9. Roadmap • Independent Component Analysis (ICA) • Facial motion decomposition • Semantics of components • Motion editing

  10. Speech Motion Dataset • Speech motion of 113 sentences in 5 emotional moods: • Frustrated: 18 sentences • Happy: 18 sentences • Neutral: 17 sentences • Sad: 30 sentences • Angry: 30 sentences • Each motion: 109 motion capture markers, 2–4 seconds long

  11. Components in ICA Space [Figure: a facial motion is decomposed into independent components and reconstructed from them]

  12. Roadmap • Independent Component Analysis (ICA) • Facial motion decomposition • Semantics of components • Motion editing

  13. Interpretation of Independent Components • Goal: find the semantics of each component, classifying it as style (emotion) or content (speech) • Methodology: qualitative and quantitative analysis

  14. Qualitatively • Changing a single component alters either style (emotion) or content (speech)

  15. Quantitatively • Style: emotion [Figure: the same sentence spoken with different emotions, happy vs. frustrated]

  16. Content: Speech • Motion markers are grouped by facial region: • Mouth motion • Eyebrow motion • Eyelid motion

  17. Content: Speech-Related Motion • Step 1: reconstruct the facial motion from each independent component alone, with all other components set to zero

  18. Content: Speech-Related Motion • Step 2: compare the reconstructions region by region (mouth, eyebrows, eyelids)
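The two steps above can be sketched with numpy: reconstruct the motion from one component at a time, then measure how much motion each facial region receives. The function names, the array layout, and the region-to-marker groupings below are illustrative placeholders, not the paper's actual indices.

```python
import numpy as np

def single_component_reconstruction(U, A, k):
    """Reconstruct facial motion using only independent component k.

    U: (n_frames, n_components) component activations.
    A: (n_features, n_components) mixing matrix.
    All other components are implicitly zeroed out.
    """
    return np.outer(U[:, k], A[:, k])

def region_activity(X_rec, regions):
    """Motion energy of a reconstruction per facial region.

    regions: dict mapping a region name to a list of feature
    (marker-coordinate) indices, e.g. mouth / eyebrow / eyelid
    groups. Returns the variance of the reconstructed motion
    restricted to each region.
    """
    return {name: float(np.var(X_rec[:, idx])) for name, idx in regions.items()}
```

A component whose energy concentrates in the mouth region would be labeled speech content; one spread across the face with little mouth dominance is a style candidate.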

  19. Roadmap • Independent Component Analysis (ICA) • Facial motion decomposition • Semantics of components • Motion editing

  20. Motion Editing with ICA • Edit the motion in intuitive ways: • Translate • Copy and Replace • Copy and Add
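Because editing happens on the component activations rather than on raw marker positions, the three operations above reduce to simple array manipulations. A minimal sketch, assuming activations are stored as an (n_frames, n_components) array; the exact signatures are our own, not the paper's:

```python
import numpy as np

def translate(U, k, offset):
    """Shift component k by a constant offset, e.g. to push a
    neutral motion toward a sad style. Returns an edited copy."""
    V = U.copy()
    V[:, k] += offset
    return V

def copy_and_replace(U_target, U_source, k):
    """Replace component k of the target motion with the same
    component from a source motion of equal length."""
    V = U_target.copy()
    V[:, k] = U_source[:, k]
    return V

def copy_and_add(U_target, U_source, k):
    """Add component k of a source motion onto the target motion."""
    V = U_target.copy()
    V[:, k] += U_source[:, k]
    return V
```

After editing, the motion is rebuilt through the mixing matrix (x = A u), so a change to one semantically meaningful component leaves the others untouched.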

  21. Results • Changing the emotional state by translating style components

  22. Conclusion • New statistical representation of facial motion • Decomposition into content and style • Intuitive editing operations

  23. The End Thanks to Wen Tien for his help on this paper, Christos Faloutsos for useful discussions, and Brian Carpenter for his excellent performance. Thanks to the USC School of Cinema-Television and House of Moves for motion capture.
