
Transitions + Perception March 25, 2010

Tidbits Mystery spectrogram #3 is now up and ready for review! Final project ideas.

Glides Glides are vowel-like sonorants which are produced with slightly more constriction than a vowel at the same place of articulation. Each glide corresponds to a different high vowel, and each glide's acoustics will be similar to those of the vowel it corresponds to.

Vowel  Glide  Place
[i]    [j]    palatal (front, unrounded)
[u]    [w]    labio-velar (back, rounded)
[y]    [ɥ]    labial-palatal (front, rounded)
[ɯ]    [ɰ]    velar (back, unrounded)

Glide Acoustics Glides look like high vowels, but:
- they are shorter than vowels
- they tend to lack "steady states" and exhibit rapid transitions into (or from) vowels (hence: "glides")
- they are lower in intensity, especially in the higher formants

[j] vs. [i]

[w] vs. [u]

Vowel-Glide-Vowel [iji] [uwu]

More Glides [wi:] [ju:]

Transitions When stops are released, they go through a transition phase in between the stop and the vowel. From stop to vowel:
1. Stop closure
2. Release burst
3. (Glide-like) transition
4. "Steady-state" vowel
Vowel-to-stop works the same way, in reverse, except: the release burst (if any) comes after the stop closure.

Stop Components [Spectrogram of Armenian [bag], annotated with: closure voicing, stop release burst, vowel formant transitions, another closure.]

Confusions When the spectrogram was first invented… phoneticians figured out quite quickly how to identify vowels from their spectral characteristics… but they had a much harder time learning how to identify stops by their place of articulation. Eventually they realized: the formant transitions between vowels and stops provided a reliable cue to place of articulation. Why?

Formant Transitions Answer: the resonant frequencies of the vocal tract change as stop gestures enter or exit the closure phase. Simplest case: formant frequencies usually decrease near bilabial stops.

Stops vs. Glides Note: formant transitions are more rapid for stops than they are for glides. “baby” “wave”

Formant Transitions: alveolars For other places of articulation, the formant transition that appears is more complex. From front vowels into alveolars, F2 tends to slope downward. From back vowels into alveolars, F2 tends to slope upward. In Perturbation Theory terms: alveolars constrict somewhat closer to an F2 node (the palate) than to an F2 anti-node (the lips).

[hid] [hæd]

Formant Locus Whether in a front vowel or back vowel context, the formant transitions for alveolars tend to point to the same frequency value. This (apparent) frequency value is known as the locus of the formant transition. In the '50s, researchers theorized that the locus frequency could be used by listeners to reliably identify place of articulation. However, velars posed a problem…
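The locus idea can be made concrete with a small numerical sketch. One standard way to estimate a locus is a "locus equation": fit a straight line to F2-at-release against F2-at-vowel-midpoint across several CV tokens, and take the fixed point of that line (where onset F2 equals vowel F2) as the locus. All F2 values below are invented for illustration, not measurements from the original Haskins work:

```python
# Locus-equation sketch: fit F2 onset against F2 at vowel midpoint
# for a set of hypothetical alveolar CV tokens. (Values are invented
# for illustration; real alveolar loci are typically found in the
# neighborhood of 1700-1800 Hz.)

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# F2 at vowel midpoint (Hz), for vowels ranging from front to back
f2_mid   = [2300, 2100, 1800, 1500, 1200, 1000]
# F2 at stop release (Hz): transitions "point at" a shared locus
f2_onset = [2050, 1950, 1800, 1650, 1500, 1400]

slope, intercept = fit_line(f2_mid, f2_onset)
# The locus is the fixed point of the fit, where onset F2 == vowel F2:
locus = intercept / (1 - slope)
print(f"slope={slope:.2f}, intercept={intercept:.0f} Hz, locus={locus:.0f} Hz")
```

With these toy numbers the fit comes out at slope 0.5, intercept 900 Hz, so the transitions all point back to a locus of 1800 Hz regardless of vowel context.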

Velar Transitions Velar formant transitions do not always have a reliable locus frequency for F2. Velars exhibit a lot of coarticulation with neighboring vowels:
- Fronter (more palatal) next to front vowels: locus is high
- Backer (more velar) next to back vowels: locus is lower (< 1500 Hz)
F2 and F3 often come together in velar transitions: the "velar pinch".

The Velar Pinch [bag] [bak]

“Velar” Co-articulations

Testing the Theory The earliest experiments on place perception were conducted in the 1950s, using a speech synthesizer known as the Pattern Playback.

Pattern Playback Picture

Haskins Formant Transitions Testing the perception of two-formant stimuli, with varying F2 transitions, led to a phenomenon known as categorical perception.

Categorical Perception Categorical perception = continuous physical distinctions are perceived in discrete categories. In the in-class experiment from last time:
- There were 11 different syllable stimuli
- They differed only in the locus of their F2 transition

Example stimuli from the in-class experiment: Stimulus #1, Stimulus #6, Stimulus #11.

Identification In Categorical Perception: All stimuli within a category boundary should be labeled the same.

Discrimination Original task: ABX discrimination.
- Stimuli across category boundaries should be 100% discriminable.
- Stimuli within category boundaries should not be discriminable at all.
In practice, categorical perception means the discrimination function can be determined from the identification function.

Identification → Discrimination Let's consider a case where the two sounds in a discrimination pair are the same. Example: the pair is stimulus 3 followed by stimulus 3. Identification data--stimulus 3 is identified as [b] 95% of the time and [d] 5% of the time. The discrimination pair will be perceived as:
[b] - [b]: .95 * .95 = .9025
[d] - [d]: .05 * .05 = .0025
Probability of a "same" response is predicted to be: (.9025 + .0025) = .905 = 90.5%

Identification → Discrimination Now let's consider a case where the two sounds in a discrimination pair are different. Example: the pair is stimulus 9 followed by stimulus 11. Identification data: stimulus 9 is [d] 80% of the time and [g] 20% of the time; stimulus 11 is [d] 5% of the time and [g] 95% of the time. The discrimination pair will be perceived as:
[d] - [d]: .80 * .05 = .04
[g] - [g]: .20 * .95 = .19
Probability of a "same" response is predicted to be: (.04 + .19) = .23 = 23%
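The two worked examples above follow one rule: if listeners label each member of a pair independently, the chance of a "same" response is the sum, over all labels, of the product of the two stimuli's identification probabilities for that label. A minimal sketch (function name and the dictionaries are mine, the probabilities are the ones from the slides):

```python
# Predicting "same" responses in discrimination from identification data:
# sum over labels of p(label | stimulus A) * p(label | stimulus B).

def p_same(ident_a, ident_b):
    """Each argument maps a phoneme label to its identification probability."""
    return sum(p * ident_b.get(label, 0.0) for label, p in ident_a.items())

stim3  = {"b": 0.95, "d": 0.05}
stim9  = {"d": 0.80, "g": 0.20}
stim11 = {"d": 0.05, "g": 0.95}

print(round(p_same(stim3, stim3), 3))   # within-category pair -> 0.905
print(round(p_same(stim9, stim11), 3))  # cross-category pair  -> 0.23
```

Comparing these predictions against observed ABX accuracy is exactly the solid-line/dashed-line comparison in the discrimination graph below.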

Discrimination In this discrimination graph: the solid line is the observed data; the dashed line is the data predicted on the basis of the identification scores. Note: the actual listeners did a little bit better than the predictions.

Categorical, Continued Categorical perception was also found for VOT distinctions, and for stop/glide/vowel distinctions:
- 10 ms transitions: [b] percept
- 60 ms transitions: [w] percept
- 200 ms transitions: [u] percept
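The duration-to-percept mapping above is itself categorical: everything on one side of a boundary gets the same label. A toy sketch (the 30 ms and 120 ms cutoffs are illustrative values I've placed between the slide's 10/60/200 ms points, not measured category boundaries):

```python
# Toy categorical classifier for the stop/glide/vowel continuum:
# a continuous transition duration maps onto a discrete percept.
# Cutoffs are illustrative, not experimentally measured boundaries.

def percept(transition_ms):
    if transition_ms <= 30:
        return "b"   # rapid transition -> stop percept
    elif transition_ms <= 120:
        return "w"   # slower transition -> glide percept
    else:
        return "u"   # very long "transition" -> steady vowel percept

for ms in (10, 60, 200):
    print(ms, "->", percept(ms))   # reproduces the slide's three cases
```

The point of the sketch is that a smooth physical continuum (duration in ms) comes out as only three discrete responses, with abrupt flips at the boundaries.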

Interpretation Main idea: in categorical perception, the mind translates an acoustic stimulus into a phonemic label (category). The acoustic details of the stimulus are discarded in favor of an abstract representation. A continuous acoustic signal is thus transformed into a series of discrete linguistic units.

The Next Level Interestingly, categorical perception is not found for non-speech stimuli. Miyawaki et al. tested perception of an F3 continuum between /r/ and /l/.

The Next Level They also tested perception of the F3 transitions in isolation. Listeners did not perceive these transitions categorically.

The Implications Interpretation: we do not perceive speech in the same way we perceive other sounds. "Speech is special"… and the perception of speech is modular. A module is a special processor in our minds/brains devoted to interpreting a particular kind of environmental stimulus.

Module Characteristics You can think of a module as a "mental reflex". A module of the mind is defined as having the following characteristics:
1. Domain-specific
2. Automatic
3. Fast
4. Hard-wired in the brain
5. Limited top-down access (you can't "unperceive")
Example: the sense of vision operates modularly.

A Modular Mind Model
- central processes: judgment, imagination, memory, attention
- modules: vision, hearing, touch, speech
- transducers: eyes, ears, skin, etc.
- external, physical reality

Remember this stuff? Speech is a "special" kind of sound because it exhibits spectral change over time. → It's processed by the speech module, not by the auditory module.

SWS Findings The uninitiated hear sinewave speech either as speech or as "whistles", "chirps", etc. Claim: once you hear it as speech, you can't go back. The speech module takes precedence (limited top-down access). Analogy: it's impossible to not perceive real speech as speech; we can't hear the individual formants as whistles, chirps, etc. Motor theory says: we don't perceive the "sounds", we perceive the gestures which shape the spectrum.

More Evidence for Modularity It has also been observed that speech is perceived multi-modally, i.e., we can perceive it through vision as well as hearing (or some combination of the two). → We're perceiving "gestures"… and the gestures are abstract. Interesting evidence: the McGurk effect.