
Creating a Speech Enabled Avatar from a Single Photograph




Presentation Transcript


  1. Creating a Speech Enabled Avatar from a Single Photograph • Dmitri Bitouk, Shree K. Nayar • Columbia University

  2. Speech Enabled Avatar • Input photograph

  3. Speech Enabled Avatar • Input photograph • Avatar

  4. Speech Enabled Avatar • Input photograph • Avatar • Applications: • mobile messaging and video conferencing • news reporting and information kiosks • novel user interfaces

  5. Facial Motion Synthesis Challenges • Mapping phonemes to static mouth shapes produces unrealistic, jerky animations • Co-articulation: facial articulations can be dominated by the preceding as well as the upcoming phonemes • Asynchrony: facial motion may precede the corresponding sound
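
The first two challenges on this slide can be illustrated with a small sketch. It is not the method from the talk (which uses Hidden Markov Models); it only contrasts a static phoneme-to-mouth-shape lookup with a plain moving average that lets neighbouring frames influence each other. The mouth-opening values and the names `STATIC_OPENING`, `static_track`, `max_jump`, and `smooth` are hypothetical.

```python
# Hypothetical mouth-opening targets per phoneme (0 = closed, 1 = fully open).
# These numbers are illustrative only and do not come from the presentation.
STATIC_OPENING = {"B": 0.0, "AA": 1.0, "K": 0.3}


def static_track(frame_labels):
    """Map each frame's phoneme label to a fixed mouth opening."""
    return [STATIC_OPENING[p] for p in frame_labels]


def max_jump(track):
    """Largest frame-to-frame change; large values read as jerky motion."""
    return max((abs(b - a) for a, b in zip(track, track[1:])), default=0.0)


def smooth(track, radius=1):
    """Moving average over preceding and upcoming frames.

    A crude stand-in for co-articulation: each frame is pulled toward
    its neighbours on both sides.
    """
    out = []
    for i in range(len(track)):
        lo = max(0, i - radius)
        hi = min(len(track), i + radius + 1)
        window = track[lo:hi]
        out.append(sum(window) / len(window))
    return out


frames = ["B", "B", "AA", "AA", "K", "K"]
raw = static_track(frames)
print(max_jump(raw), max_jump(smooth(raw)))
```

With the static lookup the mouth snaps between targets at every phoneme boundary; the averaged track changes more gradually because each frame also depends on its neighbours.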

  6. Related Work • Avatars from video sequences: Bregler et al. 1997, Ezzat et al. 2002, etc. • 2D avatars from photographs: Blanz et al. 2003, CrazyTalk™, MotionPortrait™

  7. Generic Facial Motion Model • Prototype Surface • Deformed Surface • Facial motion parameters (Bitouk 2006)

  8. Generic Facial Motion Model

  9. Facial Motion Transfer • Prototype Face • Novel Faces (Bitouk 2006)

  10. Facial Motion Transfer • Prototype Face • Novel Faces (Bitouk 2006)

  11. Hidden Markov Models • States: s1, s2 • Phonemes: /B/, /K/, /AA/, /IY/, etc. • With lexical stress: /B/, /K/, /AA0/, /AA1/, /IY0/, /IY1/, etc. • Triphones • Facial motion parameters
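
As a rough illustration of the triphone units named on this slide, the sketch below expands a phoneme sequence into context-dependent labels. The `left-centre+right` label format and the `sil` padding symbol are assumptions borrowed from common speech-recognition practice, not details given in the presentation, and `to_triphones` is a hypothetical helper name.

```python
def to_triphones(phonemes):
    """Expand a phoneme sequence into left-centre+right triphone labels."""
    # Pad with an assumed silence symbol so the first and last phonemes
    # also receive a left and right context.
    padded = ["sil"] + list(phonemes) + ["sil"]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


print(to_triphones(["B", "AA1", "K"]))
# ['sil-B+AA1', 'B-AA1+K', 'AA1-K+sil']
```

Because each label carries its neighbours, the same phoneme in different contexts becomes a different modelling unit, which is one way to account for co-articulation.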

  12. Training Hidden Markov Models • Training set consists of motion capture data • Baum-Welch embedded re-estimation • Cluster triphone states to predict triphones not seen in the training set

  13. Facial Motion Synthesis from Text • Text → Text-to-Speech Engine → Speech and time-labeled phonemes • Time-labeled phonemes → Hidden Markov Models → Facial Motion Parameters
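
To make "time-labeled phonemes" concrete, here is a minimal sketch that turns a list of `(phoneme, start, end)` segments into one label per video frame, the kind of alignment a later stage could use to drive facial motion parameters. The segment format, the frame rate, and the function name `phonemes_to_frames` are assumptions for illustration; the presentation does not specify them.

```python
def phonemes_to_frames(segments, fps=30.0):
    """Return one phoneme label per video frame.

    segments: list of (phoneme, start_seconds, end_seconds), sorted by time
              and contiguous (an assumed format).
    """
    if not segments:
        return []
    end_time = segments[-1][2]
    n_frames = int(round(end_time * fps))
    labels = []
    seg = 0
    for frame in range(n_frames):
        t = (frame + 0.5) / fps  # sample at the frame midpoint
        # Advance to the segment that contains time t.
        while seg < len(segments) - 1 and t >= segments[seg][2]:
            seg += 1
        labels.append(segments[seg][0])
    return labels


timed = [("B", 0.0, 0.1), ("AA1", 0.1, 0.3)]
print(phonemes_to_frames(timed, fps=10.0))
# ['B', 'AA1', 'AA1']
```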

  14. Fitting the Prototype Model to an Image • 2D Prototype Face • Photograph

  15. Fitting the Prototype Model to an Image • 2D Prototype Face • Photograph

  16. Facial Motion Synthesis

  17. Eye Motion Synthesis

  18. Eyeball Texture Synthesis • Eye Image • Synthesized Eyeball Texture

  19. Eye Motion Synthesis • Eye Motion Geometry

  20. Eye Motion and Blinking

  21. Visual Text-to-Speech Synthesis

  22. Visual Text-to-Speech Synthesis

  23. Facial Motion Synthesis from Speech • Speech → Speech Recognition → Time-labeled phonemes • Time-labeled phonemes → Hidden Markov Models → Facial Motion Parameters

  24. Facial Motion Synthesis from Speech

  25. 3D Avatars • Captured Stereo Image: Mirror View, Direct View (Gluckman & Nayar, 2001)

  26. 3D Avatars • Rectified Images: Mirror View, Direct View • 3D Model

  27. 3D Avatars • Point cloud engraved inside a glass cube • Digital projector (Nayar & Anand, 2007)

  28. 3D Avatars

  29. Limitations and Future Work • Automatic facial feature detection • Synthesis of rigid head motion • Expressive speech • A web demo of our system will be available in early April: www.cs.columbia.edu/CAVE/

  30. The End
