Published by Eustacia Wilkinson. Modified over 8 years ago.
1
GEPPETO¹: A modeling approach to study the production of speech gestures
Pascal Perrier (ICP – Grenoble) with Stéphanie Buchaillard (PhD), Matthieu Chabanas (ICP), Ma Liang (PhD), Yohan Payan (TIMC – Grenoble)
¹ GEstures shaped by the Physics and by a PErceptually oriented Targets Optimization
2
Outline
–Introduction
–Current hypotheses implemented in GEPPETO
–Some results obtained with a 2D biomechanical tongue model
–New issues raised by the use of a 3D biomechanical tongue model
3
Basic issues in Speech Production Research
Phonology/Phonetics Interface
–Link between discrete representations and continuous physical signals
–Nature of physical correlates of speech units
4
Basic issues in Speech Production Research
Control and Production of Speech Gestures
–Control variables
–Central representations of physical characteristics of the speech production apparatus
–Interaction Perception-Action
5
Basic issues in Speech Production Research
From Gestures to Speech Sounds
–Nature of acoustic sources
–Relations between motor commands and acoustics
–Interaction between airflow and articulatory gestures
6
What is GEPPETO?
An evolving modeling framework to quantitatively test hypotheses about the control and the production of speech gestures. It includes:
–Hypotheses about the physical correlates of phonological units
–Models of motor control
–Physical models of the speech production apparatus
7
Current Hypotheses
Phonology/Phonetics Interface
–The smallest phonological unit is the phoneme
–Phonemes are associated with target regions in the auditory domain
–Larger phonological units are associated with speech sequences for which specific constraints exist for target optimization or for the sequencing of motor commands
8
Current Hypotheses
Control of speech gestures
–Control variables: λ commands (EP Hypothesis, Feldman, 1966)
–No online use of feedback going through the cortex
–Short-delay orosensory and proprioceptive feedback is taken into account
–Existence in the brain of internal representations of the speech apparatus (internal models)
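The Equilibrium Point (EP) hypothesis cited above can be illustrated with a minimal sketch. In Feldman's model a muscle is controlled through a threshold length (usually written λ): the muscle becomes active only when its length exceeds that threshold, so lowering the threshold raises the force at a given length. The exponential force law used here is one common formulation in the EP literature; the function names and every numerical value (`rho`, `c`, `mu`, the lengths) are illustrative assumptions and are not taken from the slides.

```python
import math


def muscle_activation(length, lam, velocity=0.0, mu=0.0):
    """Positive part of (length - lam + mu * velocity).

    `lam` is the centrally specified threshold length; `mu` weights the
    velocity-dependent (damping) part of the reflex. Values are hypothetical.
    """
    return max(0.0, length - lam + mu * velocity)


def muscle_force(length, lam, rho=1.0, c=0.1, velocity=0.0, mu=0.0):
    """Assumed exponential force law: F = rho * (exp(c * A) - 1)."""
    activation = muscle_activation(length, lam, velocity, mu)
    return rho * (math.exp(c * activation) - 1.0)


# Muscle shorter than its threshold: no active force.
force_below = muscle_force(length=40.0, lam=45.0)
# Muscle longer than its threshold: force grows with the stretch beyond lam.
force_above = muscle_force(length=50.0, lam=45.0)
# Same length, lower threshold: the central command alone raises the force.
force_lower_threshold = muscle_force(length=50.0, lam=40.0)

print(force_below, force_above, force_lower_threshold)
```

The point of the sketch is that movement is specified by shifting thresholds rather than by prescribing forces or trajectories; the resulting kinematics emerge from the interaction with the mechanics.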
9
Current Hypotheses
Control of speech gestures
–Internal representations do not account for the whole physical complexity of the speech production apparatus
–Kinematic characteristics are not directly controlled. They result from the interaction between motor control setups and the physical phenomena of speech production
Which characteristics of speech signals are specifically controlled?
10
Application to the generation of speech gestures with a 2D biomechanical tongue model
–Implementation of the model of control
–Inversion from desired perceptual objectives to motor commands
–Generation of gestures
11
2D Biomechanical Model
–Finite element structure
–Linear elasticity (small deformations)
–Gravity not taken into account
12
2D Biomechanical Model
Muscles shown: Posterior Genioglossus, Anterior Genioglossus, Hyoglossus
13
2D Biomechanical Model
Muscles shown: Verticalis, Styloglossus, Inferior Longitudinalis
14
Learning a static internal model: From commands to formants
Step 1:
–Uniform sampling of the command space
–Generation of the corresponding tongue shapes (9000 simulations)
15
Learning a static internal model: From commands to formants
Step 2: Computation of the area function
16
Learning a static internal model: From commands to formants
Step 3: Formant computation for 2 lip apertures (red dots: spread lips; blue dots: rounded lips)
17
Step 4: Learning and generalizing with radial basis functions (figure labels: 1st layer, 2nd layer)
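Steps 1 to 4 can be sketched as follows, under strong simplifying assumptions. The function `toy_forward` is a hypothetical smooth stand-in for the real chain (biomechanical tongue model, area function, formant computation), and the two-dimensional command space, the number of centers and the kernel width are arbitrary choices made only for illustration. The first layer is a set of Gaussian radial basis functions; the second layer is a linear combination fitted by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)


def toy_forward(commands):
    """Hypothetical stand-in for steps 1-3 (tongue model + acoustics)."""
    x, y = commands[:, 0], commands[:, 1]
    return np.column_stack([np.sin(x) + 0.5 * y, np.cos(y) * x])


def rbf_design(inputs, centers, width):
    """First layer: Gaussian activations of every input at every center."""
    sq_dist = ((inputs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))


# Step 1: uniform sampling of the (toy, 2-D) command space.
train_cmd = rng.uniform(-1.0, 1.0, size=(400, 2))
# Steps 2-3 are replaced here by the toy forward map.
train_out = toy_forward(train_cmd)

# First layer: centers on a regular grid, one shared width (assumed values).
grid = np.linspace(-1.0, 1.0, 6)
centers = np.array([(a, b) for a in grid for b in grid])
width = 0.5

# Second layer: linear output weights fitted by least squares.
design = rbf_design(train_cmd, centers, width)
weights, *_ = np.linalg.lstsq(design, train_out, rcond=None)


def predict(commands):
    """Static internal model: commands -> predicted outputs."""
    return rbf_design(commands, centers, width) @ weights


# Generalization check on commands that were not used for fitting.
test_cmd = rng.uniform(-1.0, 1.0, size=(100, 2))
mean_abs_err = np.abs(predict(test_cmd) - toy_forward(test_cmd)).mean()
print(mean_abs_err)
```

Once fitted, such a model is cheap to evaluate, which is what makes it usable inside an iterative inversion from desired formants to commands.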
18
Inversion: From target regions to commands
Target regions
–Dispersion ellipses in the (F1, F2, F3) space
–Currently defined by Fc1, Fc2, Fc3 and F1, F2, F3
Figure: target regions for some non-rounded French phonemes
19
Inversion: From target regions to commands
Target regions
–Dispersion ellipses in the (F1, F2, F3) space
–Currently defined by Fc1, Fc2, Fc3 and F1, F2, F3
Figure: target regions for some non-rounded French phonemes
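A target region of this kind can be tested for membership with a few lines of code. The sketch below assumes an axis-aligned ellipsoid in the (F1, F2, F3) space, described by a center and one half-axis per formant; whether the regions in the slides are axis-aligned is not stated, and the numerical values are made up for illustration and do not describe any actual French phoneme.

```python
def inside_target_region(formants, center, half_axes):
    """True if the formant triple lies inside the axis-aligned ellipsoid.

    Membership criterion: sum(((F_k - Fc_k) / a_k) ** 2) <= 1.
    """
    normalized = sum(
        ((f - c) / a) ** 2 for f, c, a in zip(formants, center, half_axes)
    )
    return normalized <= 1.0


# Hypothetical region, in Hz (illustrative values only).
center = (300.0, 2200.0, 3000.0)
half_axes = (60.0, 250.0, 300.0)

print(inside_target_region((320.0, 2100.0, 3100.0), center, half_axes))  # True
print(inside_target_region((450.0, 2100.0, 3100.0), center, half_axes))  # False
```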
20
Inversion: From target regions to commands
Cost for a sequence made of N phonemes: speaker-oriented term + listener-oriented term (equation not preserved in this transcript)
Optimization: cost minimization (gradient descent technique)
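The cost formula itself did not survive in this transcript, so the following is only a sketch of what a two-term cost minimized by gradient descent can look like, not the cost used in GEPPETO. It assumes a speaker-oriented term that penalizes the distance travelled in the command space between successive phonemes, and a listener-oriented term that penalizes the normalized distance between produced formants and target centers. The linear commands-to-formants map (`A`, `b`), the targets, the spreads and the step size are all hypothetical.

```python
import numpy as np

# Hypothetical linear stand-in for the static internal model (commands -> Hz).
A = np.array([[300.0, -100.0], [-200.0, 600.0]])
b = np.array([500.0, 1500.0])

# Hypothetical targets for a sequence of N = 3 phonemes: centers and spreads.
targets = np.array([[400.0, 2000.0], [700.0, 1200.0], [300.0, 2200.0]])
sigmas = np.array([[50.0, 150.0]] * 3)


def cost(cmds, weight=1.0):
    """Speaker-oriented term + weight * listener-oriented term."""
    speaker = np.sum(np.diff(cmds, axis=0) ** 2)   # command-space path
    z = (cmds @ A.T + b - targets) / sigmas        # normalized formant error
    listener = np.sum(z ** 2)
    return speaker + weight * listener


def gradient(cmds, weight=1.0):
    """Analytic gradient of `cost` with respect to every command."""
    grad = np.zeros_like(cmds)
    steps = np.diff(cmds, axis=0)
    grad[:-1] -= 2.0 * steps
    grad[1:] += 2.0 * steps
    z = (cmds @ A.T + b - targets) / sigmas
    grad += weight * 2.0 * (z / sigmas) @ A
    return grad


cmds = np.zeros((3, 2))          # one command vector per phoneme
initial_cost = cost(cmds)
learning_rate = 0.005            # assumed; small enough for this toy problem
for _ in range(2000):
    cmds = cmds - learning_rate * gradient(cmds)
final_cost = cost(cmds)

print(initial_cost, final_cost)
print(cmds @ A.T + b)            # formants reached for each phoneme
```

Because the two terms pull in different directions, the optimum trades accuracy on each target against economy of command changes across the sequence, which is one way planning can produce context-dependent targets.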
21
Inversion From target regions to commands Example 1 Sequence [ œ-e-k-i ]
22
Example 2 Sequence [ œ-e-k-a ] Inversion From target regions to commands
23
Production of tongue movements from inferred commands
Serial command patterns: no difference between vowels and consonants
Sequence shown: [oe] [e] [k] [a]
24
Execution of tongue movements from inferred commands
Öhman’s model: Vowel-to-Vowel basis. Consonants are seen as perturbations of the V-V transition
Sequence shown: [oe] [e] [k] [a]
25
Execution of tongue movements from inferred commands
Figure label: observed flesh point
26
Production of tongue movements from inferred commands Serial command patterns [a] [i]
27
Production of tongue movements from inferred commands Öhman’s command patterns [a] [i]
28
Example: the articulatory loops
Interaction control / physics. Influence on the shapes of the articulatory paths
[aka], [ika] (R. Houde, 1969)
29
Fluid-Wall Interaction
Diagram labels: Forces; Mechanics of the tissues (finite element model); Flow model; Imposed pressure difference; Deformation
30
Example: the articulatory loops
Interaction control / physics. Influence on the shapes of the articulatory paths
[aka]
31
Example: the articulatory loops
Interaction control / physics. Influence on the shapes of the articulatory paths
[aka]: no aerodynamics vs. with aerodynamics
32
Example: the articulatory loops
Interaction control / physics. Influence on the shapes of the articulatory paths
[ika]: X-Y displacement plot (X and Y in mm); dotted line: PS = 1600 Pa; dashed line: no aerodynamics
33
Example: the articulatory loops
Interaction control / physics. Influence on the shapes of the articulatory paths
[ika]: no aerodynamics vs. with aerodynamics
34
A 3D biomechanical tongue model: For a better account of physics
–Visible Human Project® data (Wilhelms-Tricarico, 2003)
–Finite element mesh made of hexahedra
–Adaptation of the mesh to a specific speaker (PB)
Figure credits: Wilhelms-Tricarico R., 1995; Gerard et al., ICP Grenoble
35
Inner muscle structure of the tongue
Genioglossus (medium), Genioglossus (anterior), Genioglossus (posterior), Styloglossus, Geniohyoid, Hyoglossus, Verticalis, Transversus, Inferior longitudinalis, Superior longitudinalis, Mylohyoid
36
Vocal tract structure HYOID BONE MANDIBLE PALATE OTHER MUSCLES TONGUE’S BODY
37
Elastic properties of tongue muscles
Hyperelastic material (2nd order Yeoh model) with large deformation hypothesis
Figure: force vs. displacement curves (linear and nonlinear) for an indenter applied to the tongue
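The difference between a linear law and a 2nd order Yeoh law can be sketched for the simplest loading case. The Yeoh strain energy is W = C10 (I1 − 3) + C20 (I1 − 3)², where I1 is the first invariant of the deformation. The code below assumes an incompressible material under uniaxial stretch, for which I1 = s² + 2/s and the nominal stress is 2 (s − s⁻²) ∂W/∂I1, with s the stretch ratio. The coefficients are round illustrative numbers, not the values used for the tongue model, and the uniaxial test is a simplification of the indentation experiment named on the slide.

```python
def yeoh_nominal_stress(stretch, c10, c20):
    """Uniaxial nominal stress of an incompressible 2nd order Yeoh material."""
    i1 = stretch ** 2 + 2.0 / stretch            # first invariant
    dw_di1 = c10 + 2.0 * c20 * (i1 - 3.0)        # derivative of W w.r.t. I1
    return 2.0 * (stretch - stretch ** -2) * dw_di1


def linear_nominal_stress(stretch, c10):
    """Small-strain linear law with the matching modulus E = 6 * C10."""
    return 6.0 * c10 * (stretch - 1.0)


# Illustrative coefficients in Pa (assumed values).
C10, C20 = 1000.0, 500.0

for stretch in (1.001, 1.1, 1.5):
    print(
        stretch,
        yeoh_nominal_stress(stretch, C10, C20),
        linear_nominal_stress(stretch, C10),
    )
```

The two laws agree for very small strains and diverge as the deformation grows, which is why the choice of elasticity model matters once large tongue deformations are simulated.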
38
Effect of gravity [1s]
39
[300ms] Dealing with gravity with the EP hypothesis
40
Activation of GGp and MH Increase of reflex activity [300ms]
41
Dealing with gravity with the EP hypothesis
GGp activation
42
[300ms] Dealing with gravity with the EP hypothesis Example of a good choice of control parameters
43
Conclusions
A model of control based on perceptual objectives, specified in terms of formant target regions associated with motor commands, and on an optimization process using a static model of the motor-perception relations can generate realistic speech movements if it is applied to a realistic physical model of speech production.
44
Conclusions
It supports our hypothesis that there is no need to assume the existence of a central optimization process applied to the articulatory trajectories as a whole (e.g. minimum jerk, minimum torque…)
45
Conclusions
–It gives an interesting account of coarticulation phenomena by separating the effects of planning from those of physics
–It makes it possible to test hypotheses about the phonological units (see serial model versus Öhman’s model)
46
Conclusions
–However, a systematic comparison with data is required (currently in progress for French, German, Chinese, Japanese)
–No account of time control, or of hypo/hyperspeech
–No account of gravity
47
Conclusions
Need to work on more complex internal representations that would integrate some aspects of articulatory dynamics.
50
Influence of elasticity modeling
Activation of the Hyoglossus (2 N)
Panels: hyperelastic; linear, small deformations; linear, large deformations
51
EP Hypothesis (Feldman, 1966) Perrier, Ostry, Laboissière, 1996
52
EP Hypothesis (Feldman, 1966) Perrier, Ostry, Laboissière, 1996
53
Static Internal Models
Diagram labels: Central Nervous System; Inverse Model (input: desired formants, d); Direct Model (output: y i (t)); Peripheral motor system; Formants