Parameterisation of Glottal Waveforms for Characterisation of Laryngeal Voice-Quality

ISCA Tutorial and Research Workshop on “Voice Quality: Functions, Analysis and Synthesis” (VOQUAL-2003), Geneva, Switzerland, 27-29 August 2003. Parham Mokhtari, Hartmut Pfitzinger & Carlos Toshinori Ishi.

Presentation Transcript


  1. ISCA Tutorial and Research Workshop on “Voice Quality: Functions, Analysis and Synthesis” (VOQUAL-2003), Geneva, Switzerland, 27-29 August 2003. Parameterisation of Glottal Waveforms for Characterisation of Laryngeal Voice-Quality. Parham Mokhtari, Hartmut Pfitzinger & Carlos Toshinori Ishi. JST/CREST-ESP Project, HIS Labs at ATR, Kyoto, Japan.

  2. Physiologically-motivated Acoustic Analysis of Speech. Diagram labels: robust mapping from Acoustic domain (Acoustic Speech Waveform; Reliable Centres; Formants & F0) to Articulatory domain (Vocal-Tract Area-Function; Glottal Waveform; Articulation Setting; Phonation Quality).

  3. Summary of Laver’s (1980) classification of laryngeal voice qualities. Figure labels: Overall Muscular Tension Settings (Lax Voice, Tense Voice).

  4. Example snapshot of acoustic measurements: a single cycle of the glottal-flow waveform is retained for further analysis.

  5. Number and Phonetic Distribution of Reliable Acoustic Measurements. Total = 77 single-cycle glottal waveforms (all automatically processed, but hand-selected).

  6. Prototype Glottal-flow Waveforms measured from Laver’s (1980) recording. Axes: Standardised Time; Standardised Volume-Velocity.
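
The slide does not say how the time and volume-velocity axes were standardised. As a minimal sketch, assuming that each single cycle is resampled to a fixed number of points and its amplitude is rescaled to the range [0, 1], the step could look like the following. The function name `standardise_cycle` and the choice of min-max scaling are illustrative assumptions, not the authors' documented procedure.

```python
def standardise_cycle(samples, n_points=100):
    """Resample one glottal cycle to n_points and scale its amplitude to [0, 1].

    Assumption: linear interpolation in time and min-max scaling in amplitude.
    The original slides do not specify the standardisation that was used.
    """
    if len(samples) < 2 or n_points < 2:
        raise ValueError("need at least two input samples and two output points")

    last = len(samples) - 1
    resampled = []
    for k in range(n_points):
        # Position of output point k on the input sample grid.
        pos = k * last / (n_points - 1)
        i = min(int(pos), last - 1)
        frac = pos - i
        resampled.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)

    lo, hi = min(resampled), max(resampled)
    if hi == lo:
        raise ValueError("flat waveform cannot be amplitude-standardised")
    return [(v - lo) / (hi - lo) for v in resampled]


if __name__ == "__main__":
    # Hypothetical toy cycle, not data from the study.
    cycle = [0.0, 2.0, 4.0, 2.0, 0.0]
    print(standardise_cycle(cycle, n_points=9))
```

Resampling every cycle to the same length gives equal-length vectors, which is what a later principal component analysis needs.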

  7. Parametric Models of the Glottal-flow Waveform, three well-known examples: Rosenberg (1971); Fant, Liljencrants & Lin (1985); Klatt & Klatt (1990).

  8. Principal Components of Glottal-flow Waveforms across the 13 voice qualities (first 4 principal components explain 92.1% of total variance). Variance explained: PC1 57.6%; PC2 23.2%; PC3 7.4%; PC4 3.9%. Panel annotations: speed of opening-phase & energy of pulse peak; fundamental period; pulse skew & closing speed; single-peak versus diplophony. Axes: Standardised Time; Standardised Volume-Velocity.
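
A principal component analysis of equal-length waveforms can be sketched as follows. This is a generic PCA via singular value decomposition using NumPy, not the authors' code. The synthetic data (77 rows of 50 samples built from two hidden shapes plus noise) are purely hypothetical and only mimic the size of the data set on the slide.

```python
import numpy as np


def pca(waveforms, n_components):
    """PCA via SVD of the mean-centred data matrix (rows = waveforms)."""
    X = np.asarray(waveforms, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are orthonormal basis waveforms, ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s ** 2 / (X.shape[0] - 1)
    ratio = variances / variances.sum()
    components = Vt[:n_components]
    scores = Xc @ components.T  # coordinates of each waveform in PC space
    return mean, components, scores, ratio[:n_components]


if __name__ == "__main__":
    # Hypothetical stand-in data: 77 waveforms of 50 samples each.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 50)
    data = (
        np.outer(3.0 * rng.normal(size=77), np.sin(np.pi * t))
        + np.outer(rng.normal(size=77), np.sin(2.0 * np.pi * t))
        + 0.01 * rng.normal(size=(77, 50))
    )
    mean, components, scores, ratio = pca(data, n_components=4)
    print("explained variance ratio:", np.round(ratio, 4))
    print("cumulative:", round(float(ratio.sum()), 4))
```

The `ratio` values correspond to the per-component percentages quoted on the slide, and their sum over the first four components corresponds to the cumulative figure (57.6 + 23.2 + 7.4 + 3.9 = 92.1%).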

  9. Distribution of the 77 glottal waveforms in the I-II and III-IV principal component planes. Plot labels: lax voice, breathy voice, modal, harsh voice, creaky, tense voice & whispery voice, tense voice, falsetto, harsh whispery voice.

  10. Confusion Matrix for classification of the 77 glottal waveforms by a Decision Tree using Principal Components I, II, III and IV: 64% correct. Matrix axes: Original Voice Quality; Voice Quality classified as.
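
The "percent correct" figure of a confusion matrix is the sum of its diagonal divided by the total count. A minimal sketch of that bookkeeping is shown below. The helper names and the four toy labels are hypothetical and are not taken from the study's data or from its decision tree.

```python
from collections import Counter


def confusion_matrix(original, predicted):
    """Return (labels, matrix) with rows = original class, columns = predicted class."""
    if len(original) != len(predicted):
        raise ValueError("label sequences must have the same length")
    labels = sorted(set(original) | set(predicted))
    counts = Counter(zip(original, predicted))
    matrix = [[counts[(o, p)] for p in labels] for o in labels]
    return labels, matrix


def percent_correct(matrix):
    """Diagonal sum over total count, as a percentage."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("empty confusion matrix")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    original = ["modal", "modal", "breathy", "harsh"]
    predicted = ["modal", "breathy", "breathy", "harsh"]
    labels, matrix = confusion_matrix(original, predicted)
    print(labels)
    for row in matrix:
        print(row)
    print(percent_correct(matrix))
```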

  11. Conclusions
      • Holistic approach to modelling the glottal-flow waveform
      • Underlying basis-functions found by empirical analyses
      • Top 4 principal components explain over 90% of variance
      • Top 4 PCs can distinguish among 13 voice qualities
      Future Work
      • Even more robust methods needed for fully automatic analysis
      • Extension to spontaneous, conversational, expressive speech!
      • Spectral and perceptual correlates of principal components…?
      • Our approach can adapt to a wide variety of phonation-types
      • Voice-quality control in speech synthesis!

  12. End of Presentation – Thank You –

  13. Figure labels: original; more breathy; more harsh; lower AQ values; higher AQ values.

  14. Definition of the Glottal AQ (Amplitude Quotient): AQ = f_ac / d_peak = T2 (figures taken from Alku et al., JASA, August 2002). Figure labels: Estimated glottal-flow waveforms… 1 – Breathy phonation; 2 – Pressed phonation; ~ “effective decay time” (Fant et al., 1994); Stylised, triangular glottal-flow waveform; glottal-flow waveform; glottal-flow derivative.
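
The amplitude quotient can be illustrated numerically. The sketch below reads the slide's formula as AQ = f_ac / d_peak, taking f_ac as the peak-to-peak amplitude of the glottal flow and d_peak as the magnitude of the negative peak of its derivative. That reading, the function names, and the first-difference derivative are assumptions made for illustration. For a stylised triangular pulse the result should equal the closing-phase duration T2.

```python
def amplitude_quotient(flow, fs):
    """AQ in seconds: peak-to-peak flow amplitude over the magnitude of the
    negative peak of the flow derivative (first-difference approximation)."""
    if len(flow) < 2:
        raise ValueError("need at least two samples")
    f_ac = max(flow) - min(flow)
    derivative = [(b - a) * fs for a, b in zip(flow, flow[1:])]
    d_peak = -min(derivative)
    if d_peak <= 0:
        raise ValueError("waveform has no closing (falling) phase")
    return f_ac / d_peak


def triangular_pulse(amplitude, n_open, n_close, n_closed):
    """Stylised pulse: linear rise over n_open samples, linear fall over
    n_close samples, then n_closed samples of zero flow."""
    rising = [amplitude * n / n_open for n in range(n_open)]
    falling = [amplitude * (n_close - n) / n_close for n in range(n_close + 1)]
    return rising + falling + [0.0] * n_closed


if __name__ == "__main__":
    fs = 10000  # Hz, hypothetical sampling rate
    # Opening phase 4 ms, closing phase T2 = 2 ms, closed phase 2 ms.
    pulse = triangular_pulse(1.0, n_open=40, n_close=20, n_closed=20)
    print(amplitude_quotient(pulse, fs))  # close to T2 = 0.002 s
```

Because AQ has units of time, an abrupt closure (large d_peak) gives a low AQ and a gradual closure gives a high AQ, independent of the overall flow amplitude.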

  15. Contrasting phonation-types – glottal AQ automatically measured at reliable centres in continuous speech. Panels: Glottal waveform (72 msec); Glottal waveform derivative; Spectrum of glottal waveform [0, 8] kHz.
      Pressed/Harsh phonation
      • abrupt glottal closure
      • negative-peak of derivative
      • higher harmonics prominent
      • low AQ
      Breathy/Nasal phonation
      • quasi-sinusoidal glottal wave
      • smooth derivative-waveform
      • fundamental most prominent
      • high AQ
