1
Factor Analysis of MRI-Derived Tongue Shapes
Mark Hasegawa-Johnson
ECE Department and Beckman Institute
University of Illinois at Urbana-Champaign
2
Background
The vowel sounds of English are classified in two dimensions: “high/low” and “front/back.”
[Vowel chart: i, u, a, ae, e, o placed on a Front-Back axis and a High-Low axis]
3
Background
The tongue is composed of about 9 muscles (4 intrinsic, 5 extrinsic).
[Figure labels: Styloglossus, Superior Pharyngeal Constrictor, Genioglossus, Hyoglossus, Transversus, Verticalis, Superior Longitudinalis, Inferior Longitudinalis, Palatoglossus]
4
Theories of Motor Control
- Theory 1: Direct Control
- Theory 2: Hierarchical Control
5
Factor Analysis of X-Ray Images
Harshman, Ladefoged, & Goldstein, 1977
8
Finding: Two factors account for 92% of variance.
9
Factor loadings seem to represent distinctive features: v1 = [front], v2 = [high]
10
Can Three-Dimensional Tongue Shape be Explained Using Shape Factors?
- Hypothesis 1: 3D tongue shape during speech = weighted sum of 2-3 factors.
- Hypothesis 2: Shape of the factors t1(i), t2(i) is speaker-dependent. (??)
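Hypothesis 1 can be read as a linear model: each shape is a mean shape plus a weighted sum of a few fixed factors. The sketch below illustrates that reading only. The point count, the random factors, the weights, and the helper name `synthesize` are all hypothetical, and the factors are made orthonormal purely for convenience.

```python
import numpy as np

rng = np.random.default_rng(1)

n_points = 50                          # hypothetical number of samples on the tongue surface
mean_shape = rng.normal(size=n_points) # hypothetical mean shape (flattened)

# Two hypothetical orthonormal factors, standing in for t1(i) and t2(i).
q, _ = np.linalg.qr(rng.normal(size=(n_points, 2)))
factors = q.T                          # shape (2, n_points), one factor per row


def synthesize(weights):
    """Return mean shape plus the weighted sum of the factors."""
    return mean_shape + np.asarray(weights, dtype=float) @ factors


weights = np.array([0.8, -0.3])        # hypothetical factor weights for one vowel
shape = synthesize(weights)

# With orthonormal factors, projecting the centered shape recovers the weights.
recovered = factors @ (shape - mean_shape)
print(recovered)
```

Under this model a vowel is fully described by its two or three weights, which is what makes the hypothesis testable with factor analysis.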
11
Why is 3D Different from 2D?
Linear Source-Filter Theory:
- Vowel quality is determined by areas
- Area is correlated with midsagittal width
12
Do Shape Factors Exist in 3D?
- If inter-speaker shape similarity is governed by desire for acoustic similarity, and...
- If acoustic similarity depends on cross-sectional area, not cross-sectional shape...
- Then variation in 3D shape may not have a shape factor basis
13
Factor Analysis of MRI-Derived Tongue Shapes: Methodology
1. Recruit Subjects
2. Collect MRI Images
3. Segment the Images
4. Interpolate ROI to Create 3D Tongue Shapes for Each Vowel
5. Speaker-Dependent Factor Analysis
6. Speaker-Independent Factor Analysis
14
Subject Recruitment:
- Ten subjects recruited; five successfully imaged (3 male, 2 female).
- Subjects were college undergrads and grads with no metal fillings and no claustrophobia.
- Subjects were trained to sustain vowel sounds with little variation.
- Human subjects approval: both UCLA and Cedars-Sinai Medical Center.
15
MRI Image Collection
- GE Signa 1.5T
- T1-weighted
- 3 mm slices
- 24 cm FOV
- 256 x 256 pixels
- Coronal, Axial
- 11-18 sounds per subject
- Breath-hold in vowel position for 25 seconds
16
Image Viewing and Segmentation: the CTMRedit GUI and toolbox
- Display series of CT or MR image slices
- Segment ROI manually or automatically
- Interpolate and reconstruct ROI in 3D space
17
Calibration: Segmentation of Phantom (J. Cha)
- Test tubes of 3 sizes
- Radius estimated from manual segmentation has an absolute error of:
  - typical case: 0.1 mm
  - worst case: 0.4 mm
18
Calibration: Articulatory Speech Synthesis (J. Cha)
- /a,i,u/ synthesized using Maeda articulatory synthesizer
- F1-F4 errors:
  - worst case: +/- 30%
  - mean error: +2.8%
  - std dev: 19.5%
19
Reconstruction of ROI
- Interpolate between image slices to create 3D object.
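A minimal sketch of slice-to-slice interpolation, assuming each segmented slice has already been reduced to contour heights at the same fixed set of lateral positions. That representation, the function name `interpolate_slices`, and the numbers are hypothetical; only the 3 mm slice spacing comes from the slides. It uses linear interpolation, one of the variants the deck mentions later.

```python
import numpy as np


def interpolate_slices(z_slices, contours, z_new):
    """Linearly interpolate contour heights between image slices.

    z_slices: (S,) increasing slice positions in mm
    contours: (S, P) contour height at P fixed lateral positions per slice
    z_new:    (N,) positions at which to reconstruct the surface
    returns:  (N, P) interpolated surface
    """
    z_slices = np.asarray(z_slices, dtype=float)
    contours = np.asarray(contours, dtype=float)
    z_new = np.asarray(z_new, dtype=float)
    surface = np.empty((z_new.size, contours.shape[1]))
    # Interpolate each lateral position independently along the slice axis.
    for p in range(contours.shape[1]):
        surface[:, p] = np.interp(z_new, z_slices, contours[:, p])
    return surface


# Three hypothetical slices, 3 mm apart, two lateral positions each.
z_slices = np.array([0.0, 3.0, 6.0])
contours = np.array([[10.0, 12.0],
                     [13.0, 15.0],
                     [13.0, 9.0]])
surface = interpolate_slices(z_slices, contours, np.array([0.0, 1.5, 3.0, 4.5, 6.0]))
print(surface)
```

Any segmentation error in one slice is spread into the neighbouring interpolated rows, which is one way interpolation can add error to the reconstructed shape.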
20
Tongue Shape During /ae/
21
Speaker Normalization: VT Length, Inter-Molar Width (S. Pizza)
22
Speaker-Dependent Factor Analysis
- 12 tongue shapes from one speaker:
  - Each tongue shape modeled as a 25 point x 40 point rubber sheet.
- Principal Components Analysis:
  - 11 non-zero factors (12 vowels - 1 mean vector = 11 degrees of freedom).
  - 2 factors: 78% of variance
  - 3 factors: 88% of variance
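The degrees-of-freedom count can be checked with a small principal components sketch. The data here are random numbers standing in for the 12 measured shapes, so the variance fractions it prints are not the study's results; only the 12 x (25 x 40) layout is taken from the slide, and the SVD route to PCA is one common choice rather than necessarily the method used.

```python
import numpy as np

rng = np.random.default_rng(0)

n_shapes, n_rows, n_cols = 12, 25, 40   # 12 vowels, 25 x 40 grid per shape

# Synthetic stand-in for measured data: one flattened tongue shape per row.
shapes = rng.normal(size=(n_shapes, n_rows * n_cols))

# Subtracting the mean shape removes one degree of freedom.
mean_shape = shapes.mean(axis=0)
centered = shapes - mean_shape

# SVD of the centered data; rows of vt are the factors (principal components).
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Count factors whose singular value is not numerically zero.
tol = s.max() * 1e-10
n_nonzero = int(np.sum(s > tol))

# Cumulative fraction of variance explained by the first k factors.
explained = np.cumsum(s**2) / np.sum(s**2)

print("non-zero factors:", n_nonzero)
print("variance explained by 2 and 3 factors:", explained[1], explained[2])
```

With 12 shapes, centering always leaves at most 11 non-zero factors regardless of how many grid points each shape has.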
23
“Excuses:” Why Didn’t it Work?
- Tongue length changes from /ao/ to /iy/.
- Human transcriber error?
- Interpolation to form 3D image causes error:
  - Spline & sinc interpolation: very large errors
  - Linear interpolation: smaller errors, but still too large.
24
New Approaches: Avoid Interpolation
General Method: Avoid interpolation by modeling the measured data directly.
- J. Huang: Control factor shape using an a priori probability distribution.
- Y. Zheng: Limit factor to the set of polynomial surfaces.
26
Polynomial Smoothing (Y. Zheng)
- Polynomial Surface Modeling
  - Tongue shape = polynomial surface
  - 4D surface model enforces smoothness constraints.
- Hybrid Polynomial/Factor model
  - Midsagittal tongue shape is as predicted by Harshman et al.
  - 3D shape = (midsagittal shape) x (polynomial)
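As a rough illustration of fitting a polynomial surface directly to scattered measurements, with no interpolation step, the sketch below uses ordinary least squares on a bivariate polynomial. The degree, the coordinates, the synthetic data, and the function names are hypothetical; the slides do not specify the actual surface model.

```python
import numpy as np


def design_matrix(x, y, degree):
    """Columns are the monomials x**i * y**j with i + j <= degree."""
    cols = [x**i * y**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)


def fit_polynomial_surface(x, y, z, degree=2):
    """Least-squares coefficients of a polynomial surface z = f(x, y)."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(x, y, degree), z, rcond=None)
    return coeffs


def eval_polynomial_surface(coeffs, x, y, degree=2):
    """Evaluate the fitted surface at the given points."""
    return design_matrix(x, y, degree) @ coeffs


# Hypothetical scattered measurements of a smooth surface plus small noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = rng.uniform(-1, 1, 200)
z_true = 1.0 + 0.5 * x - 0.3 * y**2 + 0.2 * x * y
z = z_true + rng.normal(scale=0.01, size=x.size)

coeffs = fit_polynomial_surface(x, y, z, degree=2)
z_fit = eval_polynomial_surface(coeffs, x, y, degree=2)
rms = np.sqrt(np.mean((z_fit - z_true) ** 2))
print("coefficients:", coeffs)
print("RMS error against the noise-free surface:", rms)
```

Because the surface has only a handful of coefficients, smoothness comes from the model itself rather than from an interpolation scheme applied afterwards.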
27
Conclusions
- X-ray analysis suggests hierarchical motor control, but...
- “Hierarchical control” might reflect structure of the acoustic space.
- MRI analysis does not find hierarchical control (yet), but...
- Negative finding might be result of methodological weakness.
28
Speaker-Dependent Factor Analysis