Published by Trevor Reed. Modified over 9 years ago.
1
The Perception of Speech
2
Speech: Speech is for rapid communication. It is composed of units of sound called phonemes –examples of phonemes: /b/ in bat, /p/ in pat
4
Seeing Sound with Spectrograms: A spectrogram is a 3D plot of sound, with axes of time, frequency and intensity. Intensity is often coded by colour
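As a rough illustration of the three axes (time, frequency, intensity), the sketch below builds a tiny spectrogram from a synthetic tone using only the Python standard library. The sample rate, frame length, hop size and the 1000 Hz test tone are all assumptions chosen for the example, not values from the slides, and the direct DFT is far slower than the FFT a real tool would use.

```python
import cmath
import math

SAMPLE_RATE = 8000   # assumed sample rate in Hz (illustrative only)
FRAME_LEN = 64       # assumed samples per analysis frame
HOP = 32             # assumed step between successive frames


def spectrogram(samples, frame_len=FRAME_LEN, hop=HOP):
    """Return a list of frames (time axis); each frame is a list of
    magnitudes (intensity) indexed by frequency bin (frequency axis)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Hann window to reduce spectral leakage at the frame edges.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len))
            for n, s in enumerate(frame)
        ]
        magnitudes = []
        # Direct DFT over the non-negative frequency bins.
        for k in range(frame_len // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Synthetic 1000 Hz tone, 400 samples long (an assumed test signal).
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(400)]
spec = spectrogram(tone)

# The strongest frequency bin in the first frame, converted to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
```

Each entry `spec[t][k]` is the intensity at time frame `t` and frequency bin `k`. A plotting tool would map that value to colour.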
5
Acoustic Properties of Speech: Speech can be characterized by a spectrogram
6
Acoustic Properties of Speech: The spectrogram reveals differences between phonemes. The differences are in the formants and the formant transitions
7
Perceiving Speech: So perceiving (interpreting) speech sounds is simply a matter of matching the spectrotemporal properties (the shape of the spectrogram) of the incoming sound waves to the appropriate phoneme, right?
8
Perceiving Speech: If that were so, specific phonemes would have to correspond to specific spectrograms, a property called acoustic-phonetic invariance
9
Perceiving Speech: Acoustic-phonetic invariance says that each phoneme should match one and only one pattern in the spectrogram –This is not the case! For example, /d/ followed by different vowels produces different spectrogram patterns
10
Perceiving Speech: Clearly, perception and understanding of speech sounds is more elaborate than simply interpreting an internal spectrogram
11
Perceiving Speech: The phrase “Peter buttered the burnt toast” has five /t/ phonemes, but there are not five identical sweeps in the spectrogram
12
Perceiving Speech: The Segmentation Problem: Segmentation is the perception of boundaries (apparent silences) between words. These silences are often illusory
13
Perceiving Speech: The phrase “I owe you a Yo-Yo” has no silence in it!
14
Spoken Input The Segmentation Problem: –The stream of acoustic input is not physically segmented into discrete phonemes, words, phrases, etc. –Silent gaps don’t always indicate (aren’t perceived as) interruptions in speech
15
Spoken Input The Segmentation Problem: –Conversely, a continuous speech stream is sometimes perceived as having gaps
16
Perceiving Speech So how do you perceive speech? Some of the “strategies”: 1. reduce the data 2. use context clues 3. use vision
17
Categorical Perception: Categorical perception is a phenomenon in which the brain assigns a stimulus to one category or another but never to an intermediate category
18
Categorical Perception: For example, /ba/ and /pa/ differ mainly in voice onset time, the delay between releasing the stopped flow of air and the start of vocal-fold vibration –for /ba/ the voice onset time is about 10 milliseconds –for /pa/ it is about 50 ms
20
Categorical Perception: Voice onset time can range from zero to >50 ms. For example, you could synthesize a sound with a voice onset time of 30 ms, but English speakers will hear either /ba/ or /pa/ and never something in between
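The all-or-nothing labelling described above can be sketched as a steep identification function over voice onset time. The boundary (25 ms) and the slope below are hypothetical values picked for illustration. The slides give only the approximate /ba/ (10 ms) and /pa/ (50 ms) onset times, not where the boundary lies.

```python
import math

BOUNDARY_MS = 25.0  # hypothetical category boundary, not from the slides
SLOPE = 0.5         # hypothetical steepness of the identification curve


def prob_pa(vot_ms):
    """Probability of labelling a sound /pa/ given its voice onset time."""
    return 1.0 / (1.0 + math.exp(-SLOPE * (vot_ms - BOUNDARY_MS)))


def perceived(vot_ms):
    """Every stimulus gets one of two labels; there is no 'in between'."""
    return "/pa/" if prob_pa(vot_ms) >= 0.5 else "/ba/"


# A continuum of synthesized voice onset times, in milliseconds.
labels = [perceived(v) for v in (0, 10, 20, 30, 40, 50)]
```

The input varies continuously, but the output takes only two values, which is the sense in which the perception is categorical.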
21
Categorical Perception is Part of Learning a Language Babies can discriminate /ba/ from /pa/ and can discriminate these from phonemes with intermediate voice onset times! By 10 to 12 months, babies (learning English) stop discriminating intermediate voice onset times
22
Categorical Perception is Part of Learning a Language Once category boundaries are learned it is very difficult to unlearn them –non-native speakers of a language often cannot hear certain phonemes the way native speakers do –as a consequence they often retain at least a slight accent
23
Another example: Categorical Perception
24
Perception (of all types) Makes Use of Context The stream of information contained in speech is usually ambiguous and incomplete Your brain makes a “best guess” based on the circumstances
26
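Perception (of all types) Makes Use of Context Consider the following example: “The [cough]eel fell off the shoe” versus “The [cough]eel fell off the car”, where a cough replaces the first sound of the word ending in “eel”. Listeners report hearing the “appropriate” phoneme during the cough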
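The "best guess" idea in the example above can be sketched as choosing the candidate word that fits the sentence context best. The candidate words ("wheel", "heel", "peel") and the association scores are invented for illustration; they are assumptions, not data from the slides or from any experiment.

```python
# Hypothetical scores for how well each "_eel" candidate fits a context word.
ASSOCIATION = {
    "car": {"wheel": 0.90, "heel": 0.05, "peel": 0.05},
    "shoe": {"wheel": 0.05, "heel": 0.90, "peel": 0.05},
}


def restore(context_word):
    """Return the candidate with the highest score for the given context."""
    scores = ASSOCIATION[context_word]
    return max(scores, key=scores.get)


guess_car = restore("car")    # the masked word when the sentence ends in "car"
guess_shoe = restore("shoe")  # the masked word when the sentence ends in "shoe"
```

The acoustic input is identical in both cases. Only the final word differs, and it alone decides which phoneme is "heard" in this toy model.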
28
Much of Speech Perception isn’t Auditory! Why rely on only one sensory system when there is information in two? The brain seamlessly integrates any information it is given; this is called cross-modal integration
29
Cross-modal Integration: Speech perception involves the synthesis of vision and hearing. The McGurk effect demonstrates the critical role of vision in speech perception
30
Cross-modal Integration: The McGurk Effect
31
Cross-modal Integration: The McGurk effect suggests that visual and auditory information are combined to enhance speech perception under normal circumstances. When visual and auditory information are incongruous, the resulting perception is unpredictable and often wrong
32
Next Time: Taste, Smell, Touch, Balance