Chapter 10 Perception of Speech


Chapter 10 Perception of Speech Perry C. Hanavan, Au.D.

Question How do we perceive speech? Individual sounds (phonemes)? Syllables? Words? Sentences? Listening to Mozart?

Speech Perception How do we perceive speech? Individual sounds (phonemes)? Syllables? Words? Sentences? How do we derive meaning from the ocean of sounds we hear? Speech is variable. Speakers vary in their speech. Variant or invariant cues?

Question What is Pattern Playback? A music group from India. A talking machine built by Dr. Franklin S. Cooper and colleagues at Haskins Laboratories. A brain-based device for speech perception.

Pattern Playback

Question What is an invariant speech cue? Coarticulated phonemes. A phoneme produced in isolation. The transition from one phoneme to the next.

Excitation, Sensation, Cognition Excitation: The pattern of neural responses elicited by a given stimulus. Sensation: An internal representation of the stimulus. Cognition: The interpretation of a sensation on the basis of stored knowledge.

Discrimination: The ability to distinguish between two levels of a stimulus parameter (e.g., different wavelengths of light). Measured by the just-noticeable-difference (JND) threshold. Uses sensory representation. Modality dependent.

Recognition: The ability to categorize a stimulus as belonging to a particular class (e.g., colour or object type). Uses cognitive representation: needs to refer to stored knowledge. Representation dependent.

The relationship between discrimination and recognition Recognition relies on discrimination… but does recognition also influence discrimination? Discriminability seems to be affected by category structures – this is categorical perception.

Categorical perception: Discrimination is more sensitive across category boundaries than within categories.

First example of categorical perception: the phoneme boundary effect. Phonemes are the sounds that make up language: e.g., /b/ and /p/. The phonemes /b/ and /p/ differ in the delay between the release of the stop and the onset of voicing.

Alvin Liberman (1917–2000) Liberman and colleagues (1957) showed a phoneme boundary effect: a smaller change in delay was needed to distinguish /b/ from /p/ than to distinguish two sounds within the same category.
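The boundary effect can be illustrated with a toy model. The sketch below assumes that listeners label each stimulus with a logistic function of the voicing delay, and that two stimuli are easy to tell apart only to the extent that they receive different labels. The boundary (25 ms), the slope (3 ms), and the discrimination index are illustrative assumptions, not values or formulas from the slides.

```python
import math

def prob_p(delay_ms, boundary_ms=25.0, slope_ms=3.0):
    """Probability of labelling a stimulus as /p/.

    Logistic labelling curve; boundary_ms and slope_ms are made-up
    illustrative parameters, not measured values.
    """
    return 1.0 / (1.0 + math.exp(-(delay_ms - boundary_ms) / slope_ms))

def predicted_discrimination(delay_a, delay_b):
    """Toy index: chance (0.5) plus half the difference in labelling."""
    return 0.5 + 0.5 * abs(prob_p(delay_a) - prob_p(delay_b))

# Three pairs with the same 10 ms physical difference.
within_b = predicted_discrimination(0, 10)   # both labelled /b/
across = predicted_discrimination(20, 30)    # straddles the boundary
within_p = predicted_discrimination(40, 50)  # both labelled /p/

print(f"within /b/: {within_b:.2f}")
print(f"across:     {across:.2f}")
print(f"within /p/: {within_p:.2f}")
```

Under these assumptions the pair that straddles the boundary is predicted to be far easier to discriminate than either within-category pair, even though all three pairs differ by the same physical amount.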

The phoneme boundary effect Motor theory of speech perception: The phoneme boundary effect is caused by activation of the motor program required to produce a phoneme.

Category boundary effects in the colour domain Question: Is the way we sense colour affected by the words for colours in our language? Benjamin Lee Whorf (1897-1941)

The question about colour perception can be operationalized: Color can be objectively measured in terms of its wavelength. [Figure: wavelength axis marked at 400 nm, 550 nm, and 700 nm]

The question about colour perception can be operationalized: The number of basic color terms in a language can be measured. Basic color terms are: Single words. Not subsumed by another term. Not restricted to a particular class of objects.

Early research on color naming Languages vary in the number of words they have for colour categories. Dani (New Guinea): two basic colour terms, mili (dark) and mola (light). English: eleven basic color terms: white, black, grey, red, green, blue, yellow, orange, purple, pink, brown.

Kay and Kempton (1984) Compared English and Tarahumara speakers. Tarahumara does not make a lexical distinction between blue and green. Kay and Kempton theorized that the perceptual distance between blue and green would be exaggerated in English speakers.

Kay and Kempton (1984) [Figure: three types of chip triplet. 3 green: G G G. 2 green, 1 blue: G G B. 3 blue: B B B.]

Kay and Kempton (1984) Tarahumara speakers were equally likely to choose either extreme for all three types of triplet.

Kay and Kempton (1984) English speakers were likewise equally likely to choose either extreme when all chips came from the same category. When there was an odd one out, they were more likely to choose that one.
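The triplet results can be pictured with a small hypothetical model. It assumes the chips are equally spaced in hue, and that a listener's perceived distance between two chips is multiplied by a stretch factor whenever the chips carry different colour names. The function name, the hue numbers, and the factor of 2.0 are all invented for illustration; they are not taken from Kay and Kempton.

```python
def odd_one_out(hues, names, boundary_stretch=2.0):
    """Pick which extreme chip (index 0 or 2) looks most different
    from the middle chip (index 1).

    hues: three hue values for the chips, in order.
    names: the colour name the speaker's language gives each chip.
    boundary_stretch: hypothetical factor by which a naming boundary
        exaggerates perceived distance (1.0 means no effect).
    Returns None when both extremes are equally distant, i.e. the
    speaker would be equally likely to choose either one.
    """
    def perceived_distance(i, j):
        d = abs(hues[i] - hues[j])
        if names[i] != names[j]:
            d *= boundary_stretch
        return d

    left = perceived_distance(0, 1)
    right = perceived_distance(1, 2)
    if left == right:
        return None
    return 0 if left > right else 2

# Equally spaced chips; only the naming differs between calls.
print(odd_one_out([1, 2, 3], ["G", "G", "G"]))  # no preference
print(odd_one_out([1, 2, 3], ["G", "G", "B"]))  # picks the blue chip
print(odd_one_out([1, 2, 3], ["G", "G", "B"], boundary_stretch=1.0))
```

Setting `boundary_stretch=1.0` stands in for a speaker whose language has no naming boundary between the chips: every triplet then gives no preference, which matches the pattern of equal choices described above.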

Perception of Vowels The vowel /a/ has the greatest intensity; the unvoiced /θ/ is the weakest consonant. Front vowels are perceived on the basis of the F1 frequency and the average of F2 and F3, whereas back vowels are perceived on the basis of the average of F1 and F2, as well as F3. So is it the absolute frequency values of the formants? Or the ratio of F2 to F1? Perhaps it is the invariant cues (frequency changes that occur with coarticulation). [Figure labels: F1/F2, F3; F1, F2/F3]
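The two candidate cues on this slide are simple arithmetic on the formant frequencies, sketched below. The function names are hypothetical and the formant values are illustrative round numbers in the range typical of an adult male voice, not data from the slides.

```python
def vowel_cues(f1, f2, f3, front):
    """Return the pair of cues described on the slide.

    Front vowels: (F1, average of F2 and F3).
    Back vowels:  (average of F1 and F2, F3).
    All frequencies are in Hz.
    """
    if front:
        return (f1, (f2 + f3) / 2)
    return ((f1 + f2) / 2, f3)

def f2_f1_ratio(f1, f2):
    """The alternative cue raised on the slide: the ratio of F2 to F1."""
    return f2 / f1

# Illustrative formant values (Hz), assumed for this example only.
front_vowel = {"f1": 270, "f2": 2290, "f3": 3010}  # an /i/-like vowel
back_vowel = {"f1": 300, "f2": 870, "f3": 2240}    # a /u/-like vowel

print(vowel_cues(**front_vowel, front=True))
print(vowel_cues(**back_vowel, front=False))
print(round(f2_f1_ratio(front_vowel["f1"], front_vowel["f2"]), 2))
print(round(f2_f1_ratio(back_vowel["f1"], back_vowel["f2"]), 2))
```

With these numbers the two vowels have similar F1 values but very different F2/F1 ratios, which is one reason a ratio is a plausible candidate cue.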

Invariant and Variant Cues The onset formant transitions that perceptually define the consonant [d] differ depending on the identity of the following vowel. (Formants are highlighted by red dotted lines; transitions are the bending beginnings of the formant trajectories.) [Figure panels: /di/, /da/, /du/]

Perception of Diphthongs Perceived on basis of formant transitions Salient feature: rapidity of transition

Consonant Perceptions Perception different for consonants than vowels Greater variety of consonant types than vowels Greater complexity for consonants

Question Which is TRUE regarding the following statements about categorical perception? Experience of percept invariances in sensory phenomena that can be varied along a continuum. Can be inborn or can be induced by learning. Related to how neural networks in our brains detect the features that allow us to sort the things in the world into separate categories. All of the above are true. All of the above are false.

Categorical Perception Experience of percept invariances in sensory phenomena that can be varied along a continuum. Can be inborn or can be induced by learning. Related to how neural networks in our brains detect the features that allow us to sort the things in the world into separate categories. An area in the left prefrontal cortex has been localized as the place in the brain responsible for phonetic categorical perception.

Categorical Perception

CI Speech Coding Strategies ACE™: Unique to Cochlear’s Nucleus® 24 CI system. ACE optimizes detailed pitch and timing information of sound. SPEAK: (spectral peak) Increases the richness of important pitch information by stimulating electrodes across the entire electrode array. MPEAK: (multipeak). CIS: (Continuous-Interleaved Sampling) This high-rate strategy uses a fixed set of electrodes. Emphasizes the detailed timing information of speech.

ACE Strategy Sound enters the speech processor through the microphone and is divided into a maximum of 22 frequency bands. Up to 20 narrow-band filters divide sound into corresponding frequency (pitch) ranges. Each frequency band stimulates a specific electrode along the electrode array. The electrode stimulated depends on the pitch of the sound. For example, in the word "show," the high pitch sound (sh) causes stimulation of electrodes placed near the entrance of the cochlea, where hearing nerve fibers respond to high pitch sounds. The low pitch sound (ow) stimulates electrodes further into the cochlea, where hearing nerve fibers respond to low pitch sounds. ACE varies the rate of stimulation of the electrodes, with a total maximum stimulation rate of 14,400 pulses per second.

SPEAK Sound enters the speech processor through the microphone and is divided into 20 frequency bands. SPEAK selects the six to ten frequency bands containing maximum speech information. Each frequency band stimulates a specific electrode along the electrode array. The electrode stimulated depends on the pitch of the sound. For example, in the word "show" the high pitch sound (sh) causes stimulation of electrodes placed near the entrance of the cochlea, where the hearing nerve fibers respond to high pitch sounds. The low pitch sound (ow) stimulates electrodes further into the cochlea, where the hearing nerve fibers respond to low pitch sounds. SPEAK's dynamic stimulation along 20 electrodes allows you to perceive the detailed pitch information of natural sound.
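The band-selection step can be sketched as picking the n largest values out of m band energies. This is a minimal illustration that treats "maximum speech information" as simply the highest band energy, which is an assumption; the function name and the energy values are hypothetical and do not describe the actual processor firmware.

```python
def select_bands(band_energies, n_maxima):
    """Return the indices of the n_maxima bands with the highest energy.

    band_energies: one energy value per frequency band, ordered from
        low to high frequency.
    The result is sorted by band index so it maps directly onto
    positions along the electrode array.
    """
    if not 0 <= n_maxima <= len(band_energies):
        raise ValueError("n_maxima must be between 0 and the number of bands")
    ranked = sorted(
        range(len(band_energies)),
        key=lambda i: band_energies[i],
        reverse=True,
    )
    return sorted(ranked[:n_maxima])

# One analysis frame with 20 made-up band energies, keeping 6 maxima.
frame = [0.1, 0.4, 0.9, 0.7, 0.2, 0.1, 0.05, 0.3, 0.8, 0.6,
         0.2, 0.1, 0.05, 0.02, 0.5, 0.45, 0.1, 0.05, 0.02, 0.01]
print(select_bands(frame, 6))
```

Because the selection is repeated for every frame, the set of stimulated electrodes changes as the spectrum of the incoming sound changes.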

CIS Sound enters the speech processor through the microphone. The sound is divided into 4, 6, 8 or 12 bands depending upon the number of channels used. Each band stimulates one specific electrode along the electrode array, sequentially. The same sites along the electrode array are stimulated for every sound at a fast rate to deliver the rapid timing cues of speech.
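The sequential, fixed-site stimulation described above can be sketched as a round-robin pulse schedule in which no two channels fire at the same instant. The function name, the per-channel rate, and the even spacing of time slots are assumptions made for illustration; pulse amplitudes, which in a real processor would depend on the sound in each band, are left out.

```python
def cis_schedule(n_channels, pulses_per_channel_per_s, n_cycles):
    """Return a list of (time_in_seconds, channel) pulse events.

    Every channel fires once per cycle, always in the same order,
    and each channel gets its own time slot within the cycle so
    that pulses never overlap (they are interleaved).
    """
    cycle = 1.0 / pulses_per_channel_per_s
    slot = cycle / n_channels
    return [
        (c * cycle + ch * slot, ch)
        for c in range(n_cycles)
        for ch in range(n_channels)
    ]

# Four channels at a hypothetical 1000 pulses per second per channel.
schedule = cis_schedule(4, 1000, 2)
for t, ch in schedule:
    print(f"{t * 1e6:8.1f} us  channel {ch}")
```

Unlike a strategy that selects bands frame by frame, the set of channels in this schedule never changes; only the timing grid is shown here.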