
Synthesis & evaluation of prosodically exaggerated utterances: A preliminary study

Synthesis & evaluation of prosodically exaggerated utterances: A preliminary study. Kyuchul Yoon Division of English Kyungnam University Spring 2008 Joint Conference of KSPS & KASS. Contents. Synthesis & evaluation of human utterances with exaggerated prosody


Presentation Transcript


  1. Synthesis & evaluation of prosodically exaggerated utterances: A preliminary study. Kyuchul Yoon, Division of English, Kyungnam University. Spring 2008 Joint Conference of KSPS & KASS

  2. Contents
     • Synthesis & evaluation of human utterances with exaggerated prosody
     • Synthesis of exaggerated prosody
       • Useful for native utterances
       • The definition of prosody “exaggeration”
       • The algorithm
     • Evaluation of exaggerated prosody
       • Useful for evaluating learner utterances
       • The algorithm & an experiment

  3. Teaching & evaluating prosody
     • Teaching language prosody
       • The need for “exaggeration” of native utterances
       • How to define “exaggeration”
     • Evaluating language prosody
       • Given the native version of an utterance, evaluate learner utterances with atypical prosody
       • How to measure the differences between the native and learner utterances

  4. Exaggerating native prosody
     • Exaggeration of the F0 contour
       • One way would be to make the pitch peaks/valleys higher/lower
     • Exaggeration of the intensity contour
       • One way would be to manipulate the intensity contour at the pitch peaks/valleys
     • Exaggeration of the segmental durations
       • One way would be to manipulate the durations of the segments at the pitch peaks/valleys

  5. Exaggerating native prosody: F0 [Figure: the fundamental frequency (F0) contour of the utterance “Marianna!”]

  6. Exaggerating native prosody: Intensity [Figure: the intensity contour of the utterance “Marianna!”]

  7. Exaggerating native prosody: Duration [Figure: the segmental durations of the utterance “Marianna!” before and after the exaggeration]

  8. Algorithm: prosody exaggeration
     • Definition of prosody exaggeration
       • F0 contour
         • Make pitch peaks/valleys higher/lower in Hz values
       • Intensity contour
         • Make pitch peaks higher in dB values
       • Segmental durations
         • Make the segments at pitch peaks longer, as a multiple of the original duration

  9. Algorithm: prosody exaggeration F0
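
The F0 rule on slide 8 (peaks higher, valleys lower, in Hz) can be illustrated with a minimal sketch. The figure for this slide did not survive, so the approach here is an assumption rather than the author's script: each voiced F0 value is moved away from the utterance mean by a constant factor. The function name `exaggerate_f0`, the `factor` parameter, and the use of `None` for unvoiced frames are all hypothetical.

```python
from statistics import mean


def exaggerate_f0(f0_hz, factor=1.5):
    """Scale each voiced F0 value's distance from the mean by `factor`.

    Values above the mean (peaks) move up and values below it (valleys)
    move down. Unvoiced frames are marked None and are left untouched.
    This is an assumed definition of "exaggeration", not the original one.
    """
    voiced = [v for v in f0_hz if v is not None]
    if not voiced:
        return list(f0_hz)
    centre = mean(voiced)
    return [None if v is None else centre + factor * (v - centre)
            for v in f0_hz]


# Hypothetical F0 samples in Hz; None marks an unvoiced frame.
contour = [180.0, 220.0, None, 260.0, 200.0, 140.0]
print(exaggerate_f0(contour, factor=1.5))
# -> [170.0, 230.0, None, 290.0, 200.0, 110.0]
```

A factor of 1.0 leaves the contour unchanged, and a factor below 1.0 flattens it, which may be useful when building stimuli on both sides of the original.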

  10. Algorithm: prosody exaggeration Intensity
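
Slide 8 defines intensity exaggeration as making pitch peaks higher in dB. A rough sketch of that idea follows, assuming that a pitch peak is a frame whose F0 exceeds both neighbours, that every frame is voiced, and that the F0 and intensity contours share the same frame grid. The names `pitch_peak_indices`, `boost_intensity_at_peaks`, `boost_db`, and `half_width` are illustrative, not taken from the original script.

```python
def pitch_peak_indices(f0_hz):
    """Indices of frames whose F0 is strictly higher than both neighbours."""
    return [i for i in range(1, len(f0_hz) - 1)
            if f0_hz[i - 1] < f0_hz[i] > f0_hz[i + 1]]


def boost_intensity_at_peaks(intensity_db, f0_hz, boost_db=6.0, half_width=1):
    """Raise intensity by `boost_db` in a small window around each pitch peak.

    Each boosted frame is computed from the original value, so frames that
    fall inside two overlapping windows are raised only once.
    """
    if len(intensity_db) != len(f0_hz):
        raise ValueError("contours must have the same number of frames")
    out = list(intensity_db)
    for peak in pitch_peak_indices(f0_hz):
        lo = max(0, peak - half_width)
        hi = min(len(out), peak + half_width + 1)
        for i in range(lo, hi):
            out[i] = intensity_db[i] + boost_db
    return out


# Hypothetical frame-aligned contours.
f0 = [150.0, 180.0, 220.0, 190.0, 160.0, 170.0, 210.0, 180.0]
intensity = [60.0, 62.0, 65.0, 63.0, 60.0, 61.0, 64.0, 62.0]
print(boost_intensity_at_peaks(intensity, f0))
```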

  11. Algorithm: prosody exaggeration Durations
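
For durations, slide 8 says the segments at pitch peaks are made longer. The sketch below assumes a segmentation given as contiguous `(label, start, end)` tuples in seconds and a list of pitch-peak times; any segment containing a peak is stretched by a factor and later boundaries are shifted accordingly. The function name, the tuple layout, and the example labels are hypothetical.

```python
def lengthen_peak_segments(segments, peak_times, factor=1.5):
    """Stretch every segment that contains a pitch peak by `factor`.

    `segments` is a list of contiguous (label, start_s, end_s) tuples.
    Returns a new list whose boundaries are recomputed from left to right.
    """
    out = []
    cursor = segments[0][1] if segments else 0.0
    for label, start, end in segments:
        duration = end - start
        if any(start <= t < end for t in peak_times):
            duration *= factor
        out.append((label, cursor, cursor + duration))
        cursor += duration
    return out


# Hypothetical syllable-sized segments with one pitch peak at 0.6 s.
segments = [("ma", 0.0, 0.25), ("ri", 0.25, 0.5),
            ("an", 0.5, 0.75), ("na", 0.75, 1.0)]
print(lengthen_peak_segments(segments, peak_times=[0.6], factor=2.0))
```

In an actual resynthesis the new boundaries would drive a time-stretching step such as the PSOLA technique cited in [2]; that step is outside this sketch.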

  12. How Praat script works

  13. How Praat script works [Figure panels: F0, Intensity, Durations]

  14. How Praat script works [Figure labels, in order: Original, F0, Durations, F0, Durations, Intensity]

  15. Evaluating learner prosody
     • Assumes the existence of the native version
     • Evaluates the learner versions
     • Evaluation of the F0 & intensity contours
       • Is preceded by duration manipulation:
         • The durations of the matching segments of the two utterances are made identical [3]
       • Is preceded by F0/intensity normalization & F0 smoothing
         • The mean difference between the two utterances is added to or subtracted from the learner utterance
       • Is followed by pitch/intensity point-to-point comparison
     • Evaluation of segmental durations
       • Done without any duration manipulation; segment-to-segment comparison
     • Evaluation measure: Euclidean distance metric
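
The normalization and point-to-point steps on this slide can be sketched as follows. The code assumes the two contours are already time-aligned and sampled at the same points (that is, duration manipulation has been applied) and contain only voiced frames. F0 smoothing is omitted because the slide does not say which method is used. The function names are hypothetical.

```python
from statistics import mean


def normalize_to_native(native, learner):
    """Shift the learner contour so that its mean matches the native mean.

    The mean difference is added when the learner is lower overall and
    effectively subtracted when the learner is higher overall.
    """
    shift = mean(native) - mean(learner)
    return [value + shift for value in learner]


def pointwise_differences(native, learner):
    """Learner minus native at each aligned sample point."""
    if len(native) != len(learner):
        raise ValueError("contours must be sampled at the same points")
    return [l - n for n, l in zip(native, learner)]


# Hypothetical aligned F0 contours in Hz.
native_f0 = [200.0, 240.0, 220.0, 180.0]
learner_f0 = [110.0, 130.0, 140.0, 100.0]

shifted = normalize_to_native(native_f0, learner_f0)
print(shifted)                                    # [200.0, 220.0, 230.0, 190.0]
print(pointwise_differences(native_f0, shifted))  # [0.0, -20.0, 10.0, 10.0]
```

The same two functions apply unchanged to intensity contours in dB.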

  16. Algorithm: prosody evaluation. Before & after duration manipulation [Figure panels: native, learner before, learner after]
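
Duration manipulation makes the durations of matching segments identical. One way to picture it, as a sketch only, is to compute a stretch factor per learner segment that maps its duration onto the native one. The function name and the `(label, start, end)` tuple layout are assumptions, and the actual resynthesis described in [3] is not reproduced here.

```python
def duration_scale_factors(native_segments, learner_segments):
    """Per-segment factors that map learner durations onto native durations.

    Both arguments are lists of (label, start_s, end_s) tuples with the same
    labels in the same order. A factor above 1 means the learner segment
    must be lengthened; below 1 means it must be shortened.
    """
    native_labels = [seg[0] for seg in native_segments]
    learner_labels = [seg[0] for seg in learner_segments]
    if native_labels != learner_labels:
        raise ValueError("segment labels must match one-to-one")
    factors = []
    for (_, n_start, n_end), (_, l_start, l_end) in zip(native_segments,
                                                        learner_segments):
        learner_duration = l_end - l_start
        if learner_duration <= 0:
            raise ValueError("learner segment durations must be positive")
        factors.append((n_end - n_start) / learner_duration)
    return factors


# Hypothetical two-segment utterances.
native = [("m", 0.0, 0.25), ("a", 0.25, 0.75)]
learner = [("m", 0.0, 0.5), ("a", 0.5, 0.75)]
print(duration_scale_factors(native, learner))  # [0.5, 2.0]
```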

  17. Algorithm: prosody evaluation. F0 point-to-point comparison between native and learner [Figure panels: native, learner after]

  18. Algorithm: prosody evaluation. Intensity point-to-point comparison between native and learner [Figure panels: native, learner after]

  19. Algorithm: prosody evaluation. Duration segment-to-segment comparison between native and learner [Figure panels: native, learner before]. Evaluation measure: the Euclidean distance between $P = (p_1, p_2, p_3, \ldots, p_n)$ and $Q = (q_1, q_2, q_3, \ldots, q_n)$ in Euclidean n-space, $d(P, Q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$
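
As a small worked example of the Euclidean distance measure, the sketch below compares two vectors of segment durations. The duration values are invented for illustration; only the distance formula itself is standard.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length sequences of numbers."""
    if len(p) != len(q):
        raise ValueError("P and Q must have the same number of elements")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical segment durations in seconds, one value per matching segment.
native_durations = [0.10, 0.20, 0.15]
learner_durations = [0.13, 0.24, 0.15]

print(euclidean_distance(native_durations, learner_durations))  # about 0.05
```

A distance of zero means the learner durations match the native ones exactly; larger values mean larger overall deviation. The same function applies to the point-to-point F0 and intensity comparisons once the contours are aligned.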

  20. A pilot experiment [Figure panels: native, learner after] Euclidean distance should be minimum

  21. A pilot experiment [Figure panels: native, learner after]
     • F0: -100 Hz to +100 Hz with a 10 Hz interval → 21 stimuli
     • Intensity: -25 dB to +25 dB with a 5 dB interval → 11 stimuli
     • Duration: 0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00 times the original → 8 stimuli
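
The stimulus counts on this slide follow directly from the stated ranges and step sizes. The short sketch below enumerates the three parameter grids; the variable names are hypothetical, and how each value is applied to the utterance is not specified in the transcript, so that part is left out.

```python
# F0 manipulation values in Hz: -100 to +100 inclusive, in steps of 10.
f0_offsets_hz = list(range(-100, 101, 10))

# Intensity manipulation values in dB: -25 to +25 inclusive, in steps of 5.
intensity_offsets_db = list(range(-25, 26, 5))

# Duration factors, as multiples of the original duration.
duration_factors = [0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00]

print(len(f0_offsets_hz), len(intensity_offsets_db), len(duration_factors))
# -> 21 11 8
```

Each grid contains the unmodified condition (0 Hz, 0 dB, and a factor of 1.00), which is where slide 20 expects the Euclidean distance to be at its minimum.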

  22. Results & Conclusion

  23. Results & Conclusion

  24. Results & Conclusion

  25. Results & Conclusion
     • Prosody exaggeration
       • Can be a tool for teaching language prosody
       • Can be used to test measures for evaluating prosody
     • Limitations of the current prosody evaluation
       • Native utterances should exist to yield measures
         • TTS systems with advanced prosody models could be helpful
       • “Weights” of the three separate measures (F0/intensity/duration) need to be determined
         • Experiments with human evaluators could provide the weights
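
To make the open question about weights concrete, here is one possible way the three measures could be combined once weights are known. This is purely an illustration: the weighted sum, the function name, and the equal default weights are assumptions, and the slide states that the real weights have yet to be determined. Because the three distances are in different units (Hz, dB, seconds), any real weights would also have to absorb that difference in scale.

```python
def combined_prosody_score(d_f0, d_intensity, d_duration,
                           weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three distance measures (lower is closer to native).

    `weights` is a placeholder; the source says the weights are unknown and
    could be estimated from experiments with human evaluators.
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    w_f0, w_intensity, w_duration = weights
    return w_f0 * d_f0 + w_intensity * d_intensity + w_duration * d_duration


# Hypothetical distances and weights.
print(combined_prosody_score(10.0, 4.0, 2.0, weights=(0.5, 0.25, 0.25)))  # 6.5
```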

  26. References
     [1] Boersma, P. 2001. Praat, a system for doing phonetics by computer. Glot International 5(9/10). pp. 341-345.
     [2] Moulines, E. & F. Charpentier. 1990. Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Communication 9. pp. 453-467.
     [3] Yoon, K. 2007. Imposing native speakers' prosody on non-native speakers' utterances: The technique of cloning prosody. Journal of the Modern British & American Language & Literature 25(4). pp. 197-215.
