
The Effect of Glottal Opening on the Acoustic Response of the Vocal Tract

The Effect of Glottal Opening on the Acoustic Response of the Vocal Tract. Anna Barney, Antonio De Stefano (ISVR, University of Southampton, UK) & Nathalie Henrich (LAM, Université Paris VI, France).


Presentation Transcript


  1. The Effect of Glottal Opening on the Acoustic Response of the Vocal Tract Anna Barney, Antonio De Stefano ISVR, University of Southampton, UK & Nathalie Henrich LAM, Université Paris VI, France

  2. Introduction We are interested in the interaction between the voice source and the vocal tract. We hope that an improved understanding of source-tract interaction will enhance naturalness in synthesised speech

  3. Structure of this talk • Types of source-tract interaction • Effect of source-tract interaction on formant frequencies: theory • Mechanical model • Measurement of the effect of source-tract interaction: static • Measurement of the effect of source-tract interaction: dynamic • Conclusions & Future work

  4. Assumptions of Source-Filter Theory • Source and vocal-tract filter do not interact • Non-linear effects are normally lumped into the source model • Formants are the resonances of the vocal-tract, calculated when the glottal impedance is infinite

  5. Source Tract Interaction (STI) Childers & Wong (1994) define 3 principal types of STI: • Loading of the source by the vocal tract impedance • Dissipation of vocal tract energy by glottal opening (mainly at F1) • Carry over of energy from one glottal period to the next (for low glottal damping) (D.G. Childers and C.-F. Wong, 'Measuring and Modeling Vocal Source-Tract Interaction', IEEE Transactions on Biomedical Engineering, Vol. 41. No. 7. pp. 663-671 (1994) )

  6. Source Tract Interaction (STI) Flanagan (Speech analysis synthesis and perception, 1965) considered the effect of finite glottal impedance on a transmission line model of the vocal tract. [Figure: transmission line circuit showing the subglottal vocal tract, the glottis and the supraglottal vocal tract, with elements labelled Zg, Za, Zb, Zo and Zl]

  7. Source Tract Interaction (STI) Flanagan stated that a finite glottal impedance would raise F1 and increase formant damping. He predicted an increase in F1 of 1.4% for a glottal area of 5 mm²

  8. Source Tract Interaction (STI) • Ananthapadmanabha, T.V. & Fant G. (1982) (Calculation of the true glottal volume velocity and its components. Speech Commun. 1 (1982) 167-184). • Found the theoretical effect of glottal inertance to be small

  9. Source Tract Interaction (STI) • P. Badin and G. Fant (Notes on Vocal tract computation. STL-QPSR 2-3/1984 (1984) 53-108) • Modelled the sub-glottal system as a short circuit • Used a glottal area of 0.027 mm² • Glottis modelled by inductance only • F1 increased by 0.2%

  10. Measurements on Real Speech • It is known that formant estimates vary depending on where in a pitch period the estimation window is placed. • F1 values estimated using group delay characteristics and a minimum phase assumption are generally a little higher during the open phase than during the closed phase. (B. Yegnanarayana, R. Veldhuis, IEEE Trans. Speech & Audio Processing, 6(4), 1998) • Closed-phase formant analysis is used to get estimates of the vocal tract formants that are reliably decoupled from any sub-glottal formants. (L. C. Wood, D. J. P. Pearce, IEE Proceedings 136, pt. 1, no. 2, 1989)

  11. Source Tract Interaction (STI) Shifts in F1 may be small but they may correlate with: • changes in glottal OQ and/or • changes in glottal amplitude And may be of interest when considering voice quality & naturalness of synthesis Also – glottal areas considered in the literature are always at the small end of the range found in normal voicing.

  12. Flanagan’s model We implemented Flanagan’s transmission line model with a uniform duct of length 17.5 cm and area 2.89 cm² to explore the change in F1 as glottal width increased
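As a rough illustration of why a finite glottal impedance raises F1, the sketch below treats the duct from this slide as a lossless uniform tube with an ideal open end at the lips. The glottal end is terminated by an inertance only (no resistance), with the sub-glottal system taken as a short circuit. This is a deliberate simplification and not the Flanagan model implemented by the authors. The speed of sound (350 m/s), the glottal depth (3 mm), the glottal areas and all function names are assumptions made for the example. Under these assumptions the resonance condition is tan(kL) = -k·d·A/Ag, where L and A are the tract length and area, Ag is the glottal area and d is the glottal depth.

```python
import math

SPEED_OF_SOUND = 350.0  # m/s, assumed value (not given in the slides)


def closed_glottis_f1(length_m, c=SPEED_OF_SOUND):
    """Quarter-wavelength resonance of a closed-open tube: F1 = c / (4 L)."""
    return c / (4.0 * length_m)


def f1_with_glottal_inertance(length_m, tract_area_m2, glottal_area_m2,
                              glottal_depth_m, c=SPEED_OF_SOUND):
    """First resonance of a lossless tube whose glottal end is shunted by
    an inertance (hypothetical toy model, sub-glottal short circuit).

    Solves k*beta*cos(kL) + sin(kL) = 0 for kL between pi/2 and pi,
    which is equivalent to tan(kL) = -k*beta with
    beta = glottal_depth * tract_area / glottal_area.
    """
    beta = glottal_depth_m * tract_area_m2 / glottal_area_m2

    def residual(k):
        return k * beta * math.cos(k * length_m) + math.sin(k * length_m)

    lo = math.pi / (2.0 * length_m)  # closed-glottis limit, residual > 0
    hi = math.pi / length_m          # residual < 0 here
    for _ in range(60):              # plain bisection
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    k_root = 0.5 * (lo + hi)
    return c * k_root / (2.0 * math.pi)


if __name__ == "__main__":
    length, area = 0.175, 2.89e-4           # 17.5 cm and 2.89 cm^2 from the slide
    print("closed glottis:", closed_glottis_f1(length), "Hz")
    for glottal_area_mm2 in (1.0, 5.0):     # assumed example areas
        f1 = f1_with_glottal_inertance(length, area,
                                       glottal_area_mm2 * 1e-6, 3e-3)
        print(glottal_area_mm2, "mm^2:", round(f1, 1), "Hz")
```

With the assumed 350 m/s, the closed-glottis value for a 17.5 cm tube is 500 Hz. Because glottal resistance and wall losses are ignored, the shifted values only show the direction of the effect and should not be compared with the measurements.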

  13. The formant shift – theory [Figure: log amplitude against frequency (Hz)]

  14. Theoretical modelling of the formant shift – static glottis To match our experimental measurements we elaborated on Flanagan’s model. We used 4 T-sections for the supra-glottal vocal tract, with other parameters chosen to match those of our mechanical model. We chose the boundary condition at the lips to match the boundary condition for our measurements.

  15. Theoretical modelling of the formant shift –glottal impedance model Flanagan (1965) & others for finite glottal impedance:

  16. Theoretical modelling of the formant shift –glottal impedance model Laine & Karjalainen (1986): where

  17. Theoretical modelling of the formant shift –glottal impedance model Rösler & Strube (1989) Where

  18. Theoretical modelling of the formant shift –glottal impedance model • How should we model the sub-glottal impedance? • Speech models often assume that the lower end of the trachea is a fully absorbing boundary (r=0) so that there are no sub-glottal resonances.

  19. Theoretical modelling of the formant shift –glottal impedance model • We wanted to compare our theoretical model with measurements. We tried all three glottal impedance models and a range of sub-glottal impedance models to find the best fit to the data.

  20. The Mechanical Model We made our measurements of F1 shift using a mechanical model of the larynx and vocal tract

  21. The mechanical model

  22. Shutter Driver System The shutter region

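  23. Schematic Diagram of the Model [Figure: schematic of the duct showing the flow direction and pressure tappings pt1, pt2 and pt3, with labelled dimensions 55, 50, 130, 17, 15, 115 and 175. All dimensions in mm, not to scale]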

  24. Instrumentation • Rotameter: inlet volume flow rate • Manometer: mean pressure upstream • Entran EPE-54 miniature pressure transducers (diameter 2.36 mm, range 0 to 14 kPa): time-varying pressure at the duct wall for up to 4 locations • Shutter driving signal: shutter position • All time histories are captured by a simultaneous-sampling ADC connected to a PC, with a sampling frequency of 8928 Hz.

  25. Experimental measurements – static case • Glottal widths of 0, 1, 2 and 3 mm • Excitation provided by a speaker at the duct outlet: discrete tones at frequencies between 300 Hz and 2 kHz • The speaker modified the duct boundary condition at the “lips” so that it was closer to a closed-end condition. The impedance here was held constant throughout the measurements

  26. Experimental measurements – static case • 2 pressure transducers between “glottis” and “lips” • Pressure transducer separation 80 mm • Standing wave component pressure amplitudes extracted as specified by K. R. Holland & P. O. A. L. Davies (The measurement of sound power flux in flow ducts. Journal of Sound and Vibration 230 (2000) 915-932) • Transfer function from “glottis” to “lips” obtained.
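The idea behind extracting standing wave components from two wall pressures can be sketched as follows. This is the basic plane-wave, no-flow, lossless two-sensor decomposition, not the full Holland & Davies procedure, which also accounts for mean flow. The speed of sound (343 m/s), the sensor positions and the function name are assumptions for the example. At each excitation frequency the complex pressure is written as p(x) = A·exp(-jkx) + B·exp(+jkx), and the two measured pressures give two equations for the forward and backward amplitudes A and B.

```python
import numpy as np


def decompose_standing_wave(p1, p2, x1_m, x2_m, freq_hz, c=343.0):
    """Recover forward (A) and backward (B) plane-wave amplitudes from two
    complex pressure amplitudes measured at positions x1 and x2.

    Assumes p(x) = A*exp(-j*k*x) + B*exp(+j*k*x) with k = 2*pi*f/c,
    no mean flow and no damping (simplifying assumptions).
    """
    k = 2.0 * np.pi * freq_hz / c
    system = np.array([
        [np.exp(-1j * k * x1_m), np.exp(1j * k * x1_m)],
        [np.exp(-1j * k * x2_m), np.exp(1j * k * x2_m)],
    ])
    forward, backward = np.linalg.solve(system,
                                        np.array([p1, p2], dtype=complex))
    return forward, backward


if __name__ == "__main__":
    # Synthetic check: build pressures from known amplitudes, then recover them.
    freq, x1, x2 = 800.0, 0.03, 0.11          # 80 mm separation as on the slide
    k = 2.0 * np.pi * freq / 343.0
    a_true, b_true = 1.0 + 0.5j, 0.3 - 0.2j   # arbitrary test amplitudes
    p_at = lambda x: a_true * np.exp(-1j * k * x) + b_true * np.exp(1j * k * x)
    print(decompose_standing_wave(p_at(x1), p_at(x2), x1, x2, freq))
```

The 2×2 system becomes singular when the sensor separation equals a multiple of half a wavelength. For an 80 mm separation and the assumed 343 m/s, the first such frequency is about 2144 Hz, which lies above the 2 kHz upper limit of the excitation.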

  27. Transfer function from glottis to lips – measured & theoretical – static [Figure: vertical axis in dB]

  28. Transfer function from glottis to lips – measured & theoretical – static [Figure: vertical axis in dB]

  29. Transfer function from glottis to lips – measured & theoretical – static [Figure: vertical axis in dB]

  30. Transfer function from glottis to lips – measured & theoretical – static [Figure: vertical axis in dB]

  31. [Figure: four panels labelled 0 mm, 1 mm, 2 mm and 3 mm; vertical axis in dB]

  32. Static case – Summary • F1 & F2 increased with increasing glottal width • Predicted values of F1 (799 Hz, 854 Hz, 882 Hz, 896 Hz) match well to measurements • Increase in F1 between closed glottis and 1 mm wide glottis is ~6% • Increase in F1 between closed glottis and 3 mm wide glottis is ~13% • Increase in F1 larger than found by previous researchers, perhaps due to using greater glottal widths
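For reference, the relative shifts implied by the four predicted F1 values can be worked out directly. The sketch below assumes that the predicted values correspond, in order, to the glottal widths 0, 1, 2 and 3 mm used in the static measurements. That pairing and the function name are assumptions, since the slide does not state them explicitly.

```python
# Predicted F1 values from the slide, keyed by glottal width in mm
# (the width-to-value pairing is an assumption).
predicted_f1_hz = {0: 799.0, 1: 854.0, 2: 882.0, 3: 896.0}


def relative_shift_percent(f_open_hz, f_closed_hz):
    """Percentage increase of f_open_hz over the closed-glottis value."""
    return 100.0 * (f_open_hz - f_closed_hz) / f_closed_hz


shifts = {width: relative_shift_percent(f1, predicted_f1_hz[0])
          for width, f1 in predicted_f1_hz.items()}

for width, shift in shifts.items():
    print(f"{width} mm: {shift:.1f} %")
# prints 0.0 %, 6.9 %, 10.4 % and 12.1 % for 0, 1, 2 and 3 mm
```

These predicted shifts are close to, but not identical with, the ~6% and ~13% quoted on the slide, which may refer to the measured rather than the predicted values.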

  33. Dynamic Experimental measurements • How do our measurements for the static case transfer to a model excited by a vibrating larynx? • What is the dependence of F1 on the open quotient? • What is the dependence of F1 on the glottal amplitude?

  34. Experimental measurements – dynamic • Moving shutters: 10–40 Hz square-wave excitation • OQ: 20, 40, 60, 80% • Glottal width: 0.25 mm to 4 mm

  35. Peak glottal width versus OQ for all f0 [Figure: glottal amplitude against open quotient, with axis ticks at 20, 40, 60 and 80]

  36. Pressure time history at p1 in the duct [Figure: pressure (Pa) against time (s), with shutter opening and closure marked]

  37. Experimental measurements – dynamic • F1 frequency found from AR spectral estimation. The AR analysis uses the whole glottal cycle to ensure that STI effects are included in the analysis • AR analysis uses the Yule-Walker algorithm with an order of ceil((Fs/1000)+2) = 11

  38. Experimental measurements – dynamic • F1 peak defined as the maximum value of the spectrum between 200 Hz and 1 kHz • A data set was rejected if no peak was visible in this range, hence the small data set for OQ = 80%
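The F1 estimation described on the two slides above can be sketched as follows. This is a minimal NumPy implementation of Yule-Walker AR estimation followed by a peak search between 200 Hz and 1 kHz. It is not the authors' code: the function names, the 1 Hz search grid, the rule of treating a maximum at the edge of the band as "no peak visible", and the synthetic test signal are all assumptions made for illustration.

```python
import math
import numpy as np


def ar_order(fs_hz):
    """Model order rule from the slide: ceil(Fs/1000 + 2)."""
    return math.ceil(fs_hz / 1000.0 + 2)


def yule_walker(x, order):
    """Solve the Yule-Walker equations using biased autocorrelation estimates.

    Returns coefficients a[1..p] of x[n] + sum(a_k * x[n-k]) = e[n]
    and the driving-noise variance.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    toeplitz = np.array([[r[abs(i - j)] for j in range(order)]
                         for i in range(order)])
    a = np.linalg.solve(toeplitz, -r[1:])
    noise_var = r[0] + np.dot(a, r[1:])
    return a, noise_var


def f1_peak(x, fs_hz, f_lo=200.0, f_hi=1000.0, step_hz=1.0):
    """Frequency of the AR spectrum maximum in [f_lo, f_hi], or None if the
    maximum sits on the band edge (assumed reading of 'no peak visible')."""
    a, noise_var = yule_walker(x, ar_order(fs_hz))
    freqs = np.arange(f_lo, f_hi + step_hz, step_hz)
    lags = np.arange(1, len(a) + 1)
    denom = 1.0 + np.exp(-2j * np.pi * np.outer(freqs, lags) / fs_hz) @ a
    psd = noise_var / np.abs(denom) ** 2
    i = int(np.argmax(psd))
    if i == 0 or i == len(freqs) - 1:
        return None
    return float(freqs[i])


def synthetic_resonance(f0_hz, fs_hz, n, radius=0.98, seed=0):
    """White noise through a two-pole resonator at f0_hz (test signal only)."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    theta = 2.0 * np.pi * f0_hz / fs_hz
    c1, c2 = 2.0 * radius * np.cos(theta), -radius ** 2
    x = np.zeros(n)
    for i in range(n):
        x[i] = e[i]
        if i >= 1:
            x[i] += c1 * x[i - 1]
        if i >= 2:
            x[i] += c2 * x[i - 2]
    return x


if __name__ == "__main__":
    fs = 8928.0                       # sampling frequency from slide 24
    signal = synthetic_resonance(800.0, fs, int(fs))
    print("order:", ar_order(fs), "F1 estimate:", f1_peak(signal, fs), "Hz")
```

For Fs = 8928 Hz the order rule gives ceil(10.928) = 11, in agreement with the slide.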

  39. AR analysis

  40. Frequency of F1 for changing glottal width and OQ [Figure: F1 (Hz) against glottal width (mm)]

  41. Summary – dynamic measurements • F1 increases with increasing glottal width for fixed OQ • F1 increases with increasing OQ for fixed glottal width – at least at small glottal widths • Observed values of F1 much higher than normally predicted for open-closed tube of the same length or expected for real speech.

  42. Theoretical model – dynamic • Simulink model • Model adapted from one created by Nicolas Montgermont and Benoit Fabre, LAM for investigating the flute

  43. Simulink model of dynamic case [Figure: block diagram with glottal excitation, switchable glottal impedance and duct model]

  44. Pressure time history at P1 – simulated [Figure: closed and open phases marked]

  45. F1 values for dynamic simulation

  46. Simulation - summary • The simulation does show a change in the formant frequency as OQ changes • The increase in F1 is much smaller than observed in the dynamic model experiments • The dynamic model has much greater damping, especially during closure, than the simulation

  47. Future work • To make a theoretical model of the formant shift in the dynamic case that matches the measurements more closely • To make similar measurements in real speakers
