1
Tonal Speech without Pitch Jerry Zhu zhuxj@cs.cmu.edu 2003/7/3
2
What’s in your mouth Tony Robinson, http://mi.eng.cam.ac.uk/~ajr/SA95/node15.html
3
MFCC Tony Robinson, http://mi.eng.cam.ac.uk/~ajr/SA95/node15.html * Focus on vocal tract shape (e.g. different vowels) * No pitch
4
Tonal languages Tone: variation in pitch. e.g. Mandarin, Thai http://kca.org/education/ImageView.asp?ImageID=179
5
MFCC disastrous for tones? MFCC should have no pitch info. Bad for Mandarin speech recognition? Not really. Why?
6
Hypothesis 1 Language context helps a lot? –e.g. singing overrides pitch –people *do* understand the lyrics (sort of)
7
Hypothesis 2 MFCC retains some pitch? –by imperfection –residual pitch info used by speech recognizers Test: convert MFCC to speech, listen for tones. (TBD)
8
Hypothesis 3 Do we really need pitch to perceive tones? Test: whispered speech. Can native speakers perceive tones in whispered speech? Tony Robinson, http://mi.eng.cam.ac.uk/~ajr/SA95/node15.html
9
Minimum pairs A minimum pair: two 2-char words that differ in only 1 tone. Why not use –one-char words? To prevent over-articulating. –multi-char words? Hard to find minimum pairs.
10
Whisperer file. Listener file. The listener listens for the ORDER within each minimum pair.
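The whisperer-file / listener-file idea can be sketched in code. This is only an illustration under stated assumptions: the slides do not say how the files were produced, so the per-pair random shuffle, the function names `make_trial_sheets` and `score`, and the placeholder word labels are all hypothetical.

```python
import random

def make_trial_sheets(pairs, seed=0):
    """Build a whisperer sheet and a listener sheet (hypothetical helper).

    Assumption: the whisperer reads each pair in a randomly chosen order,
    while the listener sees the pair in a fixed reference order and must
    decide which order was actually whispered.
    """
    rng = random.Random(seed)
    whisperer_sheet = []
    listener_sheet = []
    for first, second in pairs:
        spoken = [first, second]
        rng.shuffle(spoken)                  # random order for the whisperer
        whisperer_sheet.append(tuple(spoken))
        listener_sheet.append((first, second))  # fixed reference order
    return whisperer_sheet, listener_sheet

def score(whisperer_sheet, listener_answers):
    """Count how many pairs the listener ordered as they were whispered."""
    correct = sum(
        1 for spoken, heard in zip(whisperer_sheet, listener_answers)
        if spoken == heard
    )
    return correct, len(whisperer_sheet)

# Placeholder labels stand in for real two-character words.
pairs = [("pair%d_tone_a" % i, "pair%d_tone_b" % i) for i in range(5)]
whisperer, listener = make_trial_sheets(pairs, seed=1)
print(score(whisperer, whisperer))  # a listener who hears every order correctly
```

Because each pair has only two possible orders, a listener with no tonal information would be expected to match the whispered order about half the time.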
11
Experiment setup Each whisperer/listener group works on about 100 different minimum pairs. In a quiet room, 1 meter apart. Each pair whispered once. Native speakers. (Liu J., Yu H., Zhang Y., Zhu X.)
12
What to expect If there is no tonal info in whisper, listeners would guess the order with 50% accuracy.
13
Result
14
Result significant? Flip a coin 3 times, 2 heads 1 tail. A biased coin? Chi-square test: accuracy is significantly better than random at p < 0.0001 (that’s *really* significant).
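A minimal sketch of a chi-square goodness-of-fit test against a 50% guessing baseline, using only the Python standard library. The function name `chi_square_vs_chance` is hypothetical, and the 80-out-of-100 count below is made up for illustration; it is not the experiment's data. With one degree of freedom the p-value has the closed form erfc(sqrt(chi2 / 2)).

```python
import math

def chi_square_vs_chance(correct, total):
    """Chi-square test of observed correct/wrong counts against 50/50 guessing.

    Returns (chi2, p). Valid as an approximation only when the expected
    count in each cell (total / 2) is reasonably large; the 3-flip coin
    example is far too small and is shown purely for intuition.
    """
    expected = total / 2.0
    wrong = total - correct
    chi2 = ((correct - expected) ** 2) / expected \
         + ((wrong - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# The coin from the slide: 3 flips, 2 heads. No evidence of bias.
print(chi_square_vs_chance(2, 3))

# Hypothetical listener result: 80 correct out of 100 pairs.
print(chi_square_vs_chance(80, 100))
```

The coin case gives a chi-square statistic of about 0.33, nowhere near significance, while the hypothetical 80/100 case gives 36, far past the p < 0.0001 threshold. For small samples an exact binomial test would be the safer choice.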
15
Accuracy breakdown (correct/total)
16
Accuracy breakdown (accuracy %, significant at p < 0.002)
17
Summary People do perceive tonal differences without pitch. How? –Strength (power)? –Duration? –Subtle vocal tract shape difference?
18
While we are whispering... Tonal difference (we’ve seen that) Voiced / unvoiced consonant? time vs. dime –voice onset time http://www.indiana.edu/~hlw/PhonUnits/consonants2.html
19
Voiced/unvoiced consonant: [p,b], [t,d], [k,g]. Mandarin speakers: 94% accuracy. Aspiration.
20
Other languages? Thai –Is tonal too; 5 tones. –Has [pʰ], [p], [b]; would be interesting!