Factor Analysis of MRI-Derived Tongue Shapes

Presentation Transcript

  1. Factor Analysis of MRI-Derived Tongue Shapes Mark Hasegawa-Johnson ECE Department and Beckman Institute University of Illinois at Urbana-Champaign

  2. Background The vowel sounds of English are classified in two dimensions: “high/low” and “front/back.” Vowel chart: high: i (front), u (back); between high and low: e (front), o (back); low: ae (front), a (back).

  3. Background The tongue is composed of about 9 muscles (4 intrinsic, 5 extrinsic). Figure labels: Superior Longitudinalis, Palatoglossus, Styloglossus, Verticalis, Superior Pharyngeal Constrictor, Transversus, Genioglossus, Inferior Longitudinalis, Hyoglossus.

  4. Theories of Motor Control Theory 1: Direct Control. Theory 2: Hierarchical Control.

  5. Factor Analysis of X-Ray Images (Harshman, Ladefoged, & Goldstein, 1977)

  6. Factor Analysis of X-Ray Images (Harshman, Ladefoged, & Goldstein, 1977)

  7. Factor Analysis of X-Ray Images (Harshman, Ladefoged, & Goldstein, 1977)

  8. Factor Analysis of X-Ray Images (Harshman, Ladefoged, & Goldstein, 1977) Finding: Two factors account for 92% of variance.

  9. Factor loadings seem to represent distinctive features: v1 = [a front], v2 = [b high].

  10. Can Three-Dimensional Tongue Shape be Explained Using Shape Factors? Hypothesis 1: 3D tongue shape during speech = weighted sum of 2-3 factors. Hypothesis 2: Shape of the factors t1(i), t2(i) is speaker-dependent. (??)
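Hypothesis 1 can be read as a linear model: each tongue shape is a weighted sum of a small number of fixed factor shapes. The sketch below is only an illustration of that reading, not the authors' code. The function name `synthesize_shape`, the toy numbers, and the explicit mean-shape term are assumptions (the mean term mirrors the mean vector subtracted on slide 22).

```python
import numpy as np

def synthesize_shape(mean_shape, factors, weights):
    """Return mean_shape + sum_k weights[k] * factors[k].

    factors has shape (K, P): K factor shapes t_k(i) sampled at P points.
    weights has shape (K,): one weight per factor for a given vowel.
    """
    factors = np.asarray(factors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.asarray(mean_shape, dtype=float) + weights @ factors

# Toy data (hypothetical): two factors sampled at five points.
mean_shape = np.zeros(5)
factors = np.array([[1.0, 0.5, 0.0, -0.5, -1.0],   # t1(i)
                    [0.0, 1.0, 2.0, 1.0, 0.0]])    # t2(i)
weights = np.array([2.0, -1.0])                    # per-vowel weights

shape = synthesize_shape(mean_shape, factors, weights)
```

Under this reading, Hypothesis 2 asks whether the rows of `factors` must be estimated separately for each speaker.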

  11. Why is 3D Different from 2D? Linear Source-Filter Theory: • Vowel quality is determined by areas • Area is correlated with midsagittal width

  12. Do Shape Factors Exist in 3D? • If inter-speaker shape similarity is governed by a desire for acoustic similarity, and • if acoustic similarity depends on cross-sectional area, not cross-sectional shape, • then variation in 3D shape may not have a shape-factor basis.

  13. Factor Analysis of MRI-Derived Tongue Shapes: Methodology 1. Recruit Subjects 2. Collect MRI Images 3. Segment the Images 4. Interpolate ROI to Create 3D Tongue Shapes for Each Vowel 5. Speaker-Dependent Factor Analysis 6. Speaker-Independent Factor Analysis

  14. Subject Recruitment: • Ten subjects recruited; five successfully imaged (3 male, 2 female). • Subjects were college undergrads and grads with no metal fillings and no claustrophobia. • Subjects were trained to sustain vowel sounds with little variation. • Human subjects approval: both UCLA and Cedars-Sinai Medical Center.

  15. MRI Image Collection • GE Signa 1.5T • T1-weighted • 3 mm slices • 24 cm FOV • 256 x 256 pixels • Coronal, Axial • 11-18 sounds per subject • Breath-hold in vowel position for 25 seconds

  16. Image Viewing and Segmentation: the CTMRedit GUI and toolbox • Display series of CT or MR image slices • Segment ROI manually or automatically • Interpolate and reconstruct ROI in 3D space

  17. Calibration: Segmentation of Phantom (J. Cha) • Test tubes of 3 sizes • Absolute error of the radius estimated from manual segmentation: typical case 0.1 mm, worst case 0.4 mm

  18. Calibration: Articulatory Speech Synthesis (J. Cha) • /a,i,u/ synthesized using Maeda articulatory synthesizer • F1-F4 errors: • worst case: +/- 30% • mean error: +2.8% • std dev: 19.5%

  19. Reconstruction of ROI • Interpolate between image slices to create 3D object.
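As a rough illustration of the reconstruction step, the sketch below linearly interpolates a stack of 2D slices along the slice axis. It assumes each slice is stored as a 2D array of some per-pixel quantity, which the slides do not specify. The function name `interpolate_slices` and the toy data are hypothetical, and the CTMRedit toolbox may work quite differently.

```python
import numpy as np

def interpolate_slices(slices, z, z_new):
    """Linearly interpolate a stack of 2D slices along the slice axis.

    slices: array of shape (N, H, W), one 2D array per measured slice.
    z:      increasing slice positions of length N (e.g. in mm).
    z_new:  positions at which to estimate new slices, within [z[0], z[-1]].
    """
    slices = np.asarray(slices, dtype=float)
    z = np.asarray(z, dtype=float)
    z_new = np.asarray(z_new, dtype=float)
    # Index of the measured slice at or below each requested position.
    idx = np.clip(np.searchsorted(z, z_new, side="right") - 1, 0, len(z) - 2)
    # Fractional distance from slice idx toward slice idx + 1.
    t = (z_new - z[idx]) / (z[idx + 1] - z[idx])
    t = t[:, None, None]
    return (1.0 - t) * slices[idx] + t * slices[idx + 1]

# Toy data (hypothetical): two 2 x 2 slices spaced 3 mm apart.
stack = np.stack([np.zeros((2, 2)), np.ones((2, 2))])
volume = interpolate_slices(stack, z=[0.0, 3.0], z_new=[0.0, 1.5, 3.0])
```

Linear interpolation is used here only because it is the simplest of the schemes named later in the talk.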

  20. Tongue Shape During /ae/

  21. Speaker Normalization: VT Length, Inter-Molar Width (S. Pizza)

  22. Speaker-Dependent Factor Analysis • 12 tongue shapes from one speaker: • Each tongue shape modeled as a 25 point x 40 point rubber sheet. • Principal Components Analysis: • 11 Non-Zero Factors (12 vowels - 1 mean vector = 11 degrees of freedom). • 2 Factors: 78% of variance • 3 Factors: 88% of variance
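The degrees-of-freedom count on this slide can be checked numerically. The sketch below runs principal components analysis through an SVD on placeholder random data with the same dimensions (12 shapes, 25 x 40 grid). It assumes one scalar value per grid point, which the slide does not state, and the random data will not reproduce the 78% and 88% figures.

```python
import numpy as np

rng = np.random.default_rng(0)
n_shapes, n_rows, n_cols = 12, 25, 40

# Placeholder data (assumption): one flattened 25 x 40 grid per vowel.
shapes = rng.normal(size=(n_shapes, n_rows * n_cols))

# Subtracting the mean shape removes one degree of freedom.
centered = shapes - shapes.mean(axis=0)

# Rows of vt are the principal components (the shape factors).
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Fraction of total variance explained by the first k factors.
variance = s ** 2
explained = np.cumsum(variance) / variance.sum()

# Count singular values that are not numerically zero.
n_nonzero = int(np.sum(s > 1e-10 * s[0]))
```

With real data, `explained[1]` and `explained[2]` would be the quantities the slide reports for 2 and 3 factors.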

  23. “Excuses:” Why Didn’t it Work? • Tongue Length changes from /ao/ to /iy/. • Human Transcriber Error? • Interpolation to Form 3D Image Causes Error • Spline & Sinc interpolation: very large errors • Linear interpolation: smaller errors, but still too large.

  24. New Approaches: Avoid Interpolation General Method: Avoid interpolation by modeling the measured data directly. • J. Huang: Control factor shape using an a priori probability distribution. • Y. Zheng: Limit factor to the set of polynomial surfaces.

  25. Polynomial Smoothing (Y. Zheng) • Polynomial Surface Modeling • Tongue shape = polynomial surface • 4D surface model enforces smoothness constraints. • Hybrid Polynomial/Factor model • Midsagittal tongue shape is as predicted by Harshman et al. • 3D shape = (midsagittal shape) × (polynomial)
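One way to picture polynomial surface modeling is a least-squares fit of a low-degree polynomial z = f(x, y) directly to scattered measured points, with no interpolation onto a grid first. The sketch below is a generic illustration under that assumption, not Y. Zheng's method. The function names, the degree, and the synthetic data are all hypothetical.

```python
import numpy as np

def poly_design(x, y, degree):
    """Design matrix with columns x**i * y**j for all i + j <= degree."""
    cols = [x ** i * y ** j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.stack(cols, axis=1)

def fit_poly_surface(x, y, z, degree=2):
    """Least-squares coefficients of a polynomial surface z = f(x, y)."""
    coeffs, *_ = np.linalg.lstsq(poly_design(x, y, degree), z, rcond=None)
    return coeffs

# Synthetic scattered samples (hypothetical) from a known quadratic surface.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = rng.uniform(-1.0, 1.0, 200)
z = 1.0 + 2.0 * x - 0.5 * y ** 2

coeffs = fit_poly_surface(x, y, z, degree=2)
z_hat = poly_design(x, y, 2) @ coeffs
```

Because the surface is restricted to a few polynomial terms, smoothness comes from the model itself rather than from a separate interpolation step.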

  26. Conclusions • X-ray analysis suggests hierarchical motor control, but... • “Hierarchical control” might reflect structure of the acoustic space. • MRI analysis does not find hierarchical control (yet), but... • Negative finding might be result of methodological weakness.

  27. Speaker-Dependent Factor Analysis