Landmark-Based Speech Recognition: Spectrogram Reading, Support Vector Machines, Dynamic Bayesian Networks, and Phonology. Mark Hasegawa-Johnson jhasegaw@uiuc.edu University of Illinois at Urbana-Champaign, USA. Lecture 2: Acoustics of Vowel and Glide Production.

Presentation Transcript

Landmark-Based Speech Recognition: Spectrogram Reading, Support Vector Machines, Dynamic Bayesian Networks, and Phonology

Mark Hasegawa-Johnson

jhasegaw@uiuc.edu

University of Illinois at Urbana-Champaign, USA

Lecture 2: Acoustics of Vowel and Glide Production
  • One-Dimensional Linear Acoustics
    • The Acoustic Wave Equation
    • Transmission Lines
    • Standing Wave Patterns
  • One-Tube Models
    • Schwa
    • Front cavity resonance of fricatives
  • Two-Tube Models
    • The vowel /a/
    • Helmholtz Resonator
    • The vowels /u,i,e/
  • Perturbation Theory
    • The vowels /u/, /o/ revisited
    • Glides
Standing Wave Patterns: Quarter-Wave Resonators

Tube Closed at the Left End, Open at the Right End

Standing Wave Patterns: Half-Wave Resonators

Tube Closed at Both Ends

Tube Open at Both Ends
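The half-wave pattern can be sketched numerically. This is a minimal illustration, not taken from the slides: it assumes a speed of sound of about 35400 cm/s (roughly the value implied by the lengths and frequencies quoted elsewhere in the lecture), a hypothetical tube length of 8.85 cm, and it follows the deck's convention of counting 0 Hz as the lowest half-wave resonance. The function name is illustrative.

```python
# Assumption: c = 35400 cm/s; the slides do not state the value directly.
SPEED_OF_SOUND_CM_S = 35400.0


def half_wave_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Return the first `count` resonances n*c/(2L), starting at n = 0 (0 Hz)."""
    return [n * c / (2.0 * length_cm) for n in range(count)]


# Hypothetical 8.85 cm tube with the same boundary condition at both ends.
resonances = half_wave_resonances(8.85)
```

With these assumed values the nonzero resonances fall near 2000 Hz and 4000 Hz, which matches the pharynx resonances quoted later for /i/.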

Schwa and Invv (the vowels in “a tug”)

F1 = c/(4L) = 500 Hz

F2 = 3c/(4L) = 1500 Hz

F3 = 5c/(4L) = 2500 Hz
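The three formants above follow the quarter-wave pattern (2n-1)c/(4L). The sketch below reproduces them under two assumptions that are not on the slide: a speed of sound of 35400 cm/s and a vocal tract length of 17.7 cm. The function name is illustrative.

```python
def quarter_wave_resonances(length_cm, count=3, c=35400.0):
    """Resonances (2n-1)*c/(4L) of a tube closed at one end, open at the other.

    Assumption: c = 35400 cm/s (not stated on the slide).
    """
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


# Assumed vocal tract length of 17.7 cm, chosen so that F1 lands at 500 Hz.
formants = quarter_wave_resonances(17.7)
```

A shorter assumed tract scales every formant up by the same factor, since each one is inversely proportional to L.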

Front Cavity Resonances of a Fricative

/s/: Front cavity resonance = 4500 Hz = c/(4L) if the front cavity length is L = 1.9 cm

/sh/: Front cavity resonance = 2200 Hz = c/(4L) if the front cavity length is L = 4.0 cm
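The same quarter-wave relation can be inverted to estimate a front cavity length from a measured resonance. This is a rough sketch that assumes c = 35400 cm/s, which the slide does not state, so the results agree with the quoted lengths only to within rounding.

```python
def front_cavity_length_cm(resonance_hz, c=35400.0):
    """Invert F = c/(4L) to get the cavity length L in cm.

    Assumption: c = 35400 cm/s (not stated on the slide).
    """
    return c / (4.0 * resonance_hz)


length_s = front_cavity_length_cm(4500.0)   # /s/
length_sh = front_cavity_length_cm(2200.0)  # /sh/
```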

Conservation of Mass at the Juncture of Two Tubes

[Figure: tube 1 with area A1 and velocity U1(x,t), joined to tube 2 with area A2 = A1/2 and velocity U2(x,t) = 2U1(x,t)]

Total liters/second transmitted = (velocity) × (tube area)
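As a small worked check of the flow rule above: if the product of velocity and area must match on both sides of the juncture, halving the area doubles the velocity. The function name and the numeric values are illustrative only.

```python
def velocity_after_junction(velocity_1, area_1, area_2):
    """Velocity in tube 2 such that velocity * area is equal in both tubes."""
    return velocity_1 * area_1 / area_2


# Illustrative numbers: tube 2 has half the area of tube 1.
v1, a1, a2 = 1.0, 4.0, 2.0
v2 = velocity_after_junction(v1, a1, a2)
```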

Two-Tube Model: Two Different Sets of Waves

  • Tube 1: Incident Wave P1+, Reflected Wave P1-
  • Tube 2: Incident Wave P2-, Reflected Wave P2+

Approximate Solution of the Two-Tube Model, A1>>A2

[Figure: back cavity of length LBACK joined to front cavity of length LFRONT]

Approximate solution: Assume that the two tubes are completely decoupled, so that the formants include
  • F(BACK CAVITY) = c/(4 LBACK)
  • F(FRONT CAVITY) = c/(4 LFRONT)

The Vowels /AA/, /AH/

LBACK = 8.8 cm → F2 = c/(4 LBACK) = 1000 Hz

LFRONT = 12.6 cm → F1 = c/(4 LFRONT) = 700 Hz
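The decoupled approximation for these two cavities can be sketched as follows. It treats each cavity as an independent quarter-wave resonator and sorts the results; the speed of sound (35400 cm/s) is an assumption, and the function name is illustrative.

```python
def decoupled_two_tube_formants(back_cm, front_cm, c=35400.0):
    """Lowest quarter-wave resonance of each cavity, sorted ascending.

    Assumption: c = 35400 cm/s, and the cavities do not interact at all.
    """
    resonances = [c / (4.0 * back_cm), c / (4.0 * front_cm)]
    return sorted(resonances)


# Cavity lengths quoted on the slide.
f1, f2 = decoupled_two_tube_formants(back_cm=8.8, front_cm=12.6)
```

Because the longer cavity always gives the lower resonance, F1 here comes from the 12.6 cm front cavity and F2 from the 8.8 cm back cavity.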

Acoustic Impedance

[Figure, two panels: acoustic impedance Z(x,jΩ) at position x along a tube, with the origin at x = 0]

Helmholtz Resonator

-Z1(x,jΩ) = Z2(x,jΩ)

[Figure, two panels: the impedances Z1 and Z2 at position x, with the origin at x = 0]

The Vowel /i/

Back Cavity = Pharynx
  • Resonances: 0 Hz, 2000 Hz, 4000 Hz
  • Back cavity volume = 70 cm³

Front Cavity = Palatal Constriction
  • Resonances: 0 Hz, 2500 Hz, 5000 Hz
  • Front cavity length/area = 7 cm⁻¹

→ 1/(2π√(MC)) = 250 Hz

The Helmholtz resonance replaces all 0 Hz partial-tube resonances.

Formants marked in the figure: 250 Hz, 2000 Hz, 2500 Hz
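The 250 Hz figure can be checked with a short calculation. The sketch assumes the usual low-frequency definitions, which the slide does not spell out: acoustic mass M = ρ(L/A) for the constriction and acoustic compliance C = V/(ρc²) for the back cavity, so that the density ρ cancels and 1/(2π√(MC)) = c/(2π√(V·L/A)). The speed of sound (35400 cm/s) is also an assumption.

```python
import math


def helmholtz_frequency_hz(volume_cm3, length_over_area_per_cm, c=35400.0):
    """Helmholtz resonance 1/(2*pi*sqrt(M*C)).

    Assumptions: M = rho*(L/A), C = V/(rho*c^2), c = 35400 cm/s.
    The air density rho cancels in the product M*C.
    """
    return c / (2.0 * math.pi * math.sqrt(volume_cm3 * length_over_area_per_cm))


# Values quoted on the slide for /i/: V = 70 cm^3, L/A = 7 cm^-1.
f_helmholtz_i = helmholtz_frequency_hz(70.0, 7.0)
```

Under these assumptions the result is within a few hertz of the 250 Hz quoted on the slide.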

The Vowel /u/: A Two-Tube Model

Back Cavity = Mouth + Pharynx
  • Resonances: 0 Hz, 1000 Hz, 2000 Hz
  • Back cavity volume = 200 cm³

Front Cavity = Lips
  • Resonances: 0 Hz, 18000 Hz, …
  • Front cavity length/area = 2 cm⁻¹

→ 1/(2π√(MC)) = 250 Hz

The Helmholtz resonance replaces all 0 Hz partial-tube resonances.

Formants marked in the figure: 250 Hz, 1000 Hz, 2000 Hz
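The replacement rule can be sketched as a small merge step: pool the partial-tube resonances of both cavities, drop every 0 Hz entry, add the single Helmholtz resonance, and keep the lowest few values as formants. The function name is hypothetical, and the input frequencies are the ones quoted on the /u/ and /i/ slides.

```python
def merge_with_helmholtz(back_resonances, front_resonances, helmholtz_hz, count=3):
    """Replace all 0 Hz partial-tube resonances with one Helmholtz resonance."""
    nonzero = [f for f in back_resonances + front_resonances if f > 0]
    return sorted([helmholtz_hz] + nonzero)[:count]


# /u/: back cavity = mouth + pharynx, front cavity = lips.
formants_u = merge_with_helmholtz([0, 1000, 2000], [0, 18000], 250)

# /i/: back cavity = pharynx, front cavity = palatal constriction.
formants_i = merge_with_helmholtz([0, 2000, 4000], [0, 2500, 5000], 250)
```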

The Vowel /u/: A Four-Tube Model

[Figure: four tube sections, from glottis to lips: Pharynx, Velar Tongue Body Constriction, Mouth, Lips]

Two Helmholtz Resonators = Two Low-Frequency Formants!
  • F1 = 250 Hz
  • F2 = 500 Hz
  • F3 = Pharynx resonance, c/(2L) = 2000 Hz

Perturbation Theory (Chiba and Kajiyama, The Vowel, 1940)

A(x) is constant everywhere, except for one small perturbation.

Method:

1. Compute formants of the “unperturbed” vocal tract.

2. Perturb the formant frequencies to match the area perturbation.
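One common way to carry out step 2 for a uniform tube is a sensitivity function. The sketch below is an assumption-laden illustration rather than the slide's own derivation: it takes a tube closed at the glottis (x = 0) and open at the lips (x = L), and uses the textbook form in which the sign of the formant shift for a small constriction at x follows the difference between potential and kinetic energy density of the standing wave, cos²(kx) - sin²(kx) = cos(2kx). The tube length of 17.7 cm and the function name are also assumptions.

```python
import math


def constriction_sensitivity(n, x_cm, length_cm=17.7):
    """Relative effect on formant n of a small constriction at position x.

    Assumptions: uniform tube, closed at x = 0 (glottis), open at x = L (lips);
    sensitivity = cos^2(k x) - sin^2(k x) = cos(2 k x), k = (2n-1)*pi/(2L).
    Positive values mean the constriction raises the formant,
    negative values mean it lowers the formant.
    """
    k = (2 * n - 1) * math.pi / (2.0 * length_cm)
    return math.cos(2.0 * k * x_cm)


# Sensitivity of F1..F3 to a constriction at the lips (x = L).
lip_sensitivity = [constriction_sensitivity(n, 17.7) for n in (1, 2, 3)]
```

Under these assumptions a constriction at the lips lowers every formant, while a constriction at the glottal end raises every formant.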

Formant Frequencies of Vowels

From Peterson & Barney, 1952

Summary
  • Acoustic wave equation easiest to solve in frequency domain, for example:
    • Solve two boundary condition equations for P+ and P-, or
    • Solve the two-tube model (four equations in four unknowns)
  • Quarter-Wave Resonator: Open at one end, Closed at the other
    • Schwa or Invv (“a tug”)
    • Front cavity resonance of a fricative or stop
  • Half-Wave Resonator: Closed at the glottis, Nearly closed at the lips
    • /uw/
  • Two-Tube Models
    • Exact solution: use reflection coefficient
    • Approximate solution: decouple the tubes, solve separately
  • Helmholtz Resonator
    • When the two-tube model seems to have resonances at 0Hz, use, instead, the Helmholtz Resonance frequency, computed with low-frequency approximations of acoustic impedance
    • /iy/: F1 is a Helmholtz Resonance
    • /uw/ and /ow/: Both F1 and F2 are Helmholtz Resonances
  • Perturbation Theory
    • Perturbed area → Perturbed formants
    • Sensitivity function explains most vowels and glides in one simple chart