1 Comparing Computational Algorithms for Modeling Phoneme Learning
Ilana Heintz, Fangfang Li, and Jeff Holliday
The Ohio State University
MCWOP 2008, University of Minnesota
2 Research Questions
How do children learn to discriminate between similar phonemic categories?
How does adult feedback affect that process?
How are adults able to understand children?
In what ways, exactly, does child speech differ from adult speech?
3 Narrowing it Down
How do children learn the difference between similar consonants, for instance /s/ vs. /S/ vs. /c}/?
What are the differences in the productions of each of these consonants?
How do the consonants differ across languages?
How do children’s productions differ from adult speech?
4 Modeled data
            Dental/Alveolar   Post-alveolar   Alveo-palatal
English     [s]               [S]
Japanese    [s]                               [c}]
Mandarin    [s]               [S]             [c}]
Stimuli elicited from 160 children and 37 adults
3 word tokens per CV type; 1390 stimuli in total
Children aged 2-5 from America, Japan, and Songyuan, China (Mandarin-speaking)
The stimuli were later used in perception tests with adults; here we study only the production data
5 Hand-measured acoustic analyses [Figure; axes: frequency (Hz), sound pressure level (dB/Hz)]
6 Hand-measured acoustic analyses (as reported in Li 2008)
7 Hand-measured acoustic analyses: English-speaking children
8 These are great results… so why use computational methods?
Automatically derive many features per stimulus
Derive time-varying features across the stimulus
Look at more interactions between features
Build a model that can be used to talk about acquisition and feedback
9 Self-organizing maps: a result [Figure: map with regions labeled "E, J, M /s/" and "English /S/, Japanese /c}/, Mandarin /S/"]
10 Setting up the Map
Determine dimensionality of data:
– 4 variables
– Independent or correlated
– Data represented by a four-dimensional numeric vector: [100.23, , , 4.3]
Neurons have the same dimensionality as the data
Determine number of neurons: 15 x 15
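The setup above can be sketched in a few lines of Python. This is a generic online SOM written for illustration, not the SOM Toolbox code cited in the references. The function names, the linearly decaying learning rate and neighbourhood width, and the assumption that each of the 4 variables has been scaled to [0, 1] are all choices made for this sketch.

```python
import math
import random

def init_map(rows, cols, dim, rng):
    """One random weight vector per neuron, same dimensionality as the data."""
    return [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]

def best_matching_unit(weights, x):
    """Grid coordinates of the neuron whose weights are closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train(weights, data, epochs, rng, lr0=0.5, sigma0=None):
    """Online SOM training; lr0 and sigma0 are assumed starting values."""
    rows, cols = len(weights), len(weights[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total
            lr = lr0 * (1.0 - frac)                   # learning rate decays
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood centred on the BMU
                    g = math.exp(-((r - br) ** 2 + (c - bc) ** 2)
                                 / (2 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(len(w)):
                        w[i] += lr * g * (x[i] - w[i])
            step += 1
    return weights
```

With a hypothetical list `frames` of four-dimensional vectors, a 15 x 15 map would be built as `train(init_map(15, 15, 4, rng), frames, epochs=10, rng=rng)` with `rng = random.Random(0)`.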
27 Self-organizing maps: distance matrix [Figure: all adult speakers; regions labeled "E, J, M /s/" and "English /S/, Japanese /c}/, Mandarin /S/"]
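A distance matrix of this kind can be approximated as follows. This is a simplified sketch: each neuron gets the mean Euclidean distance to its up, down, left, and right neighbours on a rectangular grid. The SOM Toolbox's own distance matrix may use a different lattice and layout, so the function name `u_matrix` and the 4-neighbour rule are assumptions.

```python
import math

def u_matrix(weights):
    """Mean Euclidean distance from each neuron to its grid neighbours."""
    rows, cols = len(weights), len(weights[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            # A 1 x 1 map has no neighbours; report 0.0 in that case.
            out[r][c] = sum(dists) / len(dists) if dists else 0.0
    return out
```

High values mark boundaries between clusters of similar neurons; low values mark the interior of a cluster.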
28 Self-organizing maps: best-matching units, labeled [Figure: all adult speakers; regions labeled "E, J, M /s/" and "Mandarin /S/, Japanese /c}/"]
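One way to produce such labels is to find the best-matching unit (BMU) for every labeled input vector and give each unit the category that hits it most often. The sketch below assumes majority voting; the slides do not say which labeling rule was used, and `label_map` is a name chosen for this example.

```python
import math
from collections import Counter, defaultdict

def best_matching_unit(weights, x):
    """Grid coordinates of the neuron whose weight vector is closest to x."""
    units = ((r, c) for r in range(len(weights))
             for c in range(len(weights[0])))
    return min(units, key=lambda rc: math.dist(weights[rc[0]][rc[1]], x))

def label_map(weights, vectors, labels):
    """Label each hit unit with the most frequent category among its hits."""
    hits = defaultdict(Counter)
    for x, lab in zip(vectors, labels):
        hits[best_matching_unit(weights, x)][lab] += 1
    return {unit: counts.most_common(1)[0][0] for unit, counts in hits.items()}
```

Units that are never a BMU stay unlabeled, which is why a labeled map typically shows gaps between category regions.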
29 English adults only [Figure: English-speaking adults; regions labeled /s/, /S/]
30 Japanese adults only [Figure: Japanese-speaking adults; regions labeled /s/, /c}/]
31 Mandarin adults only [Figure: Mandarin-speaking adults; regions labeled /s/, /S/, /c}/]
32 English-speaking children [Figure: all English-speaking children; regions labeled /s/, /S/]
33 Child-produced data on adult-trained map [Figure: English-child data shown on English-adult map; regions labeled /s/, /S/]
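Showing child data on an adult-trained map means looking up a best-matching unit for each child vector without updating the map. The sketch below does that and adds one possible summary, the fraction of child tokens that land on a unit carrying the intended adult label. That agreement score and all function names are assumptions made for illustration; the slides present the result only as a figure.

```python
import math

def best_matching_unit(weights, x):
    """Grid coordinates of the neuron whose weight vector is closest to x."""
    units = ((r, c) for r in range(len(weights))
             for c in range(len(weights[0])))
    return min(units, key=lambda rc: math.dist(weights[rc[0]][rc[1]], x))

def project(weights, vectors):
    """Map each vector to (BMU, distance to BMU); the map is not changed."""
    out = []
    for x in vectors:
        r, c = best_matching_unit(weights, x)
        out.append(((r, c), math.dist(weights[r][c], x)))
    return out

def agreement(projected, intended, unit_labels):
    """Fraction of tokens whose BMU carries the intended category label."""
    same = sum(1 for (unit, _), lab in zip(projected, intended)
               if unit_labels.get(unit) == lab)
    return same / len(intended)
```

The distance to the BMU is kept because it gives a rough indication of how far a child production lies from the nearest adult prototype.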
34 English-speaking 2-year-olds [Figure: regions labeled /s/, /S/]
35 English-speaking 3-year-olds [Figure: regions labeled /s/, /S/]
36 English-speaking 4-year-olds [Figure: regions labeled /s/, /S/]
37 English-speaking 5-year-olds [Figure: regions labeled /s/, /S/]
38 Conclusions
Self-organizing maps partially replicated the results of the hand-measured acoustic analysis
Summing over four frequency regions of the excitation pattern mirrored the centroid results
Fewer than 1390 stimuli, split into 10 ms frames, were enough to train 15 x 15 maps
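The feature steps mentioned here can be sketched as below: cutting a signal into 10 ms frames, summing an excitation pattern over four contiguous frequency regions, and computing a centroid for comparison. The band boundaries (four equal groups of channels) and the function names are assumptions; the slides do not give the actual region edges, and the excitation patterns themselves came from the loudness model cited in the references.

```python
def frames(samples, sample_rate, frame_ms=10):
    """Split a signal into non-overlapping frames of frame_ms milliseconds."""
    size = int(sample_rate * frame_ms / 1000)
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]

def band_sums(pattern, n_bands=4):
    """Sum a per-channel excitation pattern over n_bands contiguous regions.

    Assumes equal-sized groups of channels; real band edges may differ.
    """
    n = len(pattern)
    edges = [round(i * n / n_bands) for i in range(n_bands + 1)]
    return [sum(pattern[edges[i]:edges[i + 1]]) for i in range(n_bands)]

def centroid(freqs, pattern):
    """Amplitude-weighted mean frequency of a spectrum or excitation pattern."""
    return sum(f * p for f, p in zip(freqs, pattern)) / sum(pattern)
```

A frame with most of its energy in the upper bands has a high centroid, which is the sense in which the four band sums can mirror the centroid measure.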
39 More to do…
Find better features for Mandarin and Japanese
Incorporate dynamic features into the map
Study the children’s productions more closely
Incorporate the notion of feedback by connecting the child and adult maps with Hebbian updates
40 References
Cabrera, D., Ferguson, S. and Schubert, E. “PsySound3: Software for acoustical and psychoacoustical analysis of sound recordings.” Proceedings of the 13th International Conference on Auditory Display. Montreal, Canada.
Glasberg, B.R. and Moore, B.C.J. “A model of loudness applicable to time-varying sounds.” Journal of the Audio Engineering Society. 50:5.
Kohonen, T. “Self-Organizing Map”, 2nd ed., Springer-Verlag, Berlin.
Li, Fangfang. “Universal Development in Context: the Case of Child Acquisition of Sounds Across Languages.” Lecture, University of Lethbridge.
Vesanto, J., Himberg, J., Alhoniemi, E. and Parhankangas, J. “Self-organizing map in Matlab: the SOM Toolbox.” In Proceedings of the Matlab DSP Conference 1999.
41 English variables
42 Japanese variables
43 Mandarin variables