Phonetic Context Effects

Presentation transcript:

Phonetic Context Effects

Major Theories of Speech Perception. Motor Theory: A specialized module (in the later version of the theory) represents speech sounds in terms of intended gestures, through a model of, or knowledge of, vocal tracts. Direct Realism: The perceptual system recovers (phonetically relevant) gestures by picking up the specifying information in the speech signal. General Approaches: Speech is processed in the same way as other sounds; representation is a function of the auditory system and experience with language. Explanatory level = gesture (Motor Theory, Direct Realism). Explanatory level = sound (General Approaches).

Fluent Speech Production: Adjacent speech sounds become more similar. The vocal tract is subject to physical constraints: mass, inertia. Coarticulation = assimilation. Radical context dependency. Also a result of the motor plan.

An Example: Place of Articulation in Stops. Say /da/ (anterior). Say /ga/ (posterior).

An Example: Place of Articulation in Stops. Say /al/ (anterior). Say /ar/ (posterior).

An Example: Place of Articulation in Stops. Say /al da/. Say /ar da/. Say /al ga/. Say /ar ga/. Place of articulation changes = coarticulation.

An Example: Place of Articulation in Stops. Say /ar da/. Say /al ga/. Coarticulation has acoustical consequences.

[Figure: /al da/, /ar da/, /al ga/, /ar ga/; axes: frequency (f) by time (t)] How does the listener deal with this?

Speech Perception [Figure: /ar/ and /al/ contexts; /ga/ and /da/ targets]

Identifying in Context [Figure: percent “g” responses along the [ga]-[da] series, for /al/ and /ar/ contexts]

Direction of Effect. Production: after /al/, more /da/-like; after /ar/, more /ga/-like. Perception: after /al/, more /ga/-like; after /ar/, more /da/-like.

Perceptual Compensation for Coarticulation

What happens when there is no coarticulation? AT&T Natural Voices Text-To-Speech Engine: “ALL DA”, “ARE GA”.

Further Findings: The effect is also found in 4½-month-old infants (Fowler et al., 1990) and in native Japanese listeners who do not discriminate /al/ from /ar/ (Mann, 1986).

Theoretical Interpretations: Motor Theory. “There may exist a universally shared level where representation of speech sounds more closely corresponds to articulatory gestures that give rise to the speech signal.” (Mann, 1986) “Presumably human listeners possess implicit knowledge of coarticulation.” (Repp, 1982)

Major Theories of Speech Perception. Motor Theory: “Knowledge” of coarticulation allows the perceptual system to compensate for its predicted effects on the speech signal. Direct Realism: Coarticulation is information for the gestures involved. The signal is parsed along gestural lines, and coarticulation is assigned to its gesture. General Approaches: Those other guys are wrong.

Theoretical Interpretations. Common thread: detailed correspondence between speech production and perception; a special module for speech perception. Two predictions: compensation should be talker-specific and speech-specific.

Testing Hypothesis #1 (Talker-specific): Listeners should only compensate for speech coming from a single speaker.

Testing Hypothesis #1 (Talker-specific): Male /al/, male /ar/; male /da/-/ga/ series; female /al/, female /ar/.

Testing Hypothesis #2 (Speech-specific): Compensation should only occur for speech sounds. Contexts: /al/ (speech, tones); /ar/ (speech, tones).

Testing Hypothesis #2

Does this rule out motor theory? It may be that the special speech module is broadly tuned: if a sound acts like speech, it went through the speech module; if not, not. [Figure: /al/ and /ar/ speech precursors]

Training the Quail [Figure: CV series from /da/ to /ga/ varying in F3 onset frequency, labeled 1267; /al/ and /ar/ contexts; two sets marked “Withheld from training”]

Context-Dependent Speech Perception by an Avian Species [Figure: normalized response (pecks or “GA” responses)]

Conclusions: (1) Links to speech production are not necessary: the effect is neither speech-specific nor species-specific. (2) Learning is not necessary: the quail had no experience with covariation. (3) General auditory processes play a substantive role in maintaining perceptual compensation for coarticulation.

Major Theories of Speech Perception. Motor Theory: “Knowledge” of coarticulation allows the perceptual system to compensate for its predicted effects on the speech signal. Direct Realism: Coarticulation is information for the gestures involved. The signal is parsed along gestural lines, and coarticulation is assigned to its gesture. General Approaches: General Auditory Processes (GAP).

Effects of Context: A Familiar Example. How well does this analogy hold up for context effects in speech?

Effects of Context: A Familiar Example

[Figure: /al da/, /ar da/, /al ga/, /ar ga/; axes: frequency (f) by time (t)]

Hypothesis: Spectral Contrast (the case of [ar]). Production: in /ar da/, F3 is assimilated toward a lower frequency. Perception: F3 is perceived as a higher frequency. [Figures: frequency by time for /ar da/; percent /ga/ responses by F3 step]
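A toy sketch of the spectral contrast idea may help make the direction of the effect concrete. It is not a model from the slides: the function names, the contrast gain, the category boundary, the slope, and every frequency value below are hypothetical, chosen only so that a lower-frequency context pushes the perceived F3 of the following stop upward (toward /da/) and a higher-frequency context pushes it downward (toward /ga/).

```python
import math


def perceived_f3(target_f3_hz, context_f3_hz, gain=0.2):
    """Shift the target F3 away from the context F3 (contrastive shift).

    `gain` is a hypothetical contrast strength, not a measured value.
    """
    return target_f3_hz + gain * (target_f3_hz - context_f3_hz)


def prob_ga(f3_hz, boundary_hz=2400.0, slope_hz=150.0):
    """Logistic identification function: lower F3 onset gives more /ga/ responses.

    The boundary and slope are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp((f3_hz - boundary_hz) / slope_hz))


# An ambiguous target placed at the (hypothetical) category boundary.
ambiguous_target = 2400.0

# Low-F3 context (like /ar/): target sounds higher, so fewer /ga/ responses.
p_ga_after_low = prob_ga(perceived_f3(ambiguous_target, 1800.0))

# High-F3 context (like /al/): target sounds lower, so more /ga/ responses.
p_ga_after_high = prob_ga(perceived_f3(ambiguous_target, 2800.0))

print(f"P(/ga/) after low-F3 context:  {p_ga_after_low:.2f}")
print(f"P(/ga/) after high-F3 context: {p_ga_after_high:.2f}")
```

The only property the sketch is meant to show is the ordering of the two probabilities, which matches the direction of effect described in the slides; the magnitudes carry no empirical weight.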

Evidence for General Approach

The Empire Strikes Back

Fowler et al. (2000). Video: visual cue, face “AL” or “AR”. Audio: ambiguous precursor, then a test syllable from a /ga/-/da/ series. Precursor conditions differed only in visual information.

Results of Fowler et al. (2000): More /ga/ responses when the video cued /al/.

Experiment 1: Results. No context effect on test syllable: F(1,8) = 3.2, p = .111. [Figure: %ga responses by condition for 9 participants]
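As a sanity check on the reported statistic, the p-value can be recomputed from F(1,8) = 3.2 alone. The sketch below relies on the standard identity that an F statistic with 1 numerator degree of freedom equals the square of a t statistic with the same denominator degrees of freedom, and it integrates the t density numerically using only the standard library. The function names are illustrative, not taken from the study.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p_from_t(t, df, n=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (n must be even)."""
    upper = abs(t)
    h = upper / n
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total  # P(-|t| < T < |t|)
    return 1 - central_mass


def p_from_f_with_one_numerator_df(f_value, df_denominator):
    """p-value for F(1, df): identical to the two-tailed t test with t = sqrt(F)."""
    return two_tailed_p_from_t(math.sqrt(f_value), df_denominator)


p_experiment_1 = p_from_f_with_one_numerator_df(3.2, 8)
print(f"F(1,8) = 3.2 gives p = {p_experiment_1:.3f}")
```

The recomputed value agrees with the p = .111 on the slide to three decimal places.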

A closer look… Two videos: /alda/ and /arda/. Video information during test syllable presentation should be the same in both conditions.

/alda/ video: …more consistent with /ga/? /arda/ video: …more consistent with /da/?

Results

Comparisons

Conclusions: (1) No evidence of a visually mediated phonetic context effect. (2) No evidence that gestural information is required. (3) Spectral contrast is the best current account. But what about backwards effects???

The Stimulus Paradigm: Target speech stimulus (/da-ga/), noise burst (/t/ or /k/), sine-wave tone context (high or low frequency). Words: Got, Dot, Gawk, Dock. [Figure: frequency (Hz) by time (ms); low and high tone conditions]

Speaker Normalization: Ladefoged & Broadbent (1957). Carrier sentence: “Please say what this word is…” (original and two F1-shifted versions), followed by a target: “bit”, “bet”, “bat”, or “but”. Target acoustics were constant. Target perception shifted with changes in “speaker”. Spectral characteristics of the preceding sentence predicted perception. ‘Talker/Speaker Normalization’. Sensitivity to accent, etc.

Experiment Model. Speech token (589 ms): natural speech with F2 and F3 onsets edited to create a 9-step series (steps 1 to 9) varying perceptually from /ga/ to /da/.

No Effect of Adjacent Context with Intermediate Spectral Characteristics. Timeline: silent interval (50 ms), standard tone (70 ms, 2300 Hz), speech token (589 ms). Pilot test: no context effect on speech perception (t(9) = 1.35, p = .21).

Acoustic Histories. The critical context stimulus for these experiments is not a single sound, but a distribution of sounds: […]-ms tones, sampled from a distribution, with a 30-ms silent interval between tones. Timeline: acoustic history (2100 ms), silent interval (50 ms), standard tone (70 ms), speech token (589 ms).

Acoustic History Distributions [Figure: frequency of presentation by tone frequency (Hz)]. High mean = 2800 Hz; low mean = 1800 Hz.

Example Stimuli [Figure: frequency (Hz) by time (ms); panels A and B; 2800 Hz mean and 1800 Hz mean]

Characteristics of the Context. Context is not local: the standard tone immediately precedes each stimulus, independent of condition; on its own, it has no context effect on /ga/-/da/ stimuli. Context is defined by distribution characteristics: sampling of the distribution varies on each trial, so precise acoustic characteristics vary with trial. Context unfolds over a broad time course. Timeline: acoustic history (2100 ms), silent interval (50 ms), standard tone (70 ms), speech token (589 ms).
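The trial structure described above can be sketched as a simple event schedule. This is a minimal illustration under stated assumptions, not the authors' stimulus-generation code: the slide text does not preserve the history-tone duration, so `tone_ms` is left as an explicit argument (the usage passes 70 ms purely as a placeholder); the distribution shape is assumed to be uniform with a hypothetical half-width of 500 Hz; and the number of tones is assumed to be the 2100-ms history divided by one tone plus one 30-ms gap. The 2300 Hz standard tone, the 50-ms silent interval, and the 589-ms speech token come from the slides.

```python
import random


def build_trial(mean_hz, tone_ms, gap_ms=30, history_ms=2100,
                half_width_hz=500, seed=None):
    """Return (onset_ms, duration_ms, label, freq_hz) events for one trial.

    half_width_hz and the uniform sampling are assumptions for illustration.
    """
    rng = random.Random(seed)
    events = []

    # Acoustic history: tones resampled on every trial from the distribution.
    n_tones = history_ms // (tone_ms + gap_ms)
    for i in range(n_tones):
        freq = rng.uniform(mean_hz - half_width_hz, mean_hz + half_width_hz)
        events.append((i * (tone_ms + gap_ms), tone_ms, "history_tone", freq))

    # Fixed tail, identical across conditions.
    onset = history_ms
    events.append((onset, 50, "silence", None))
    onset += 50
    events.append((onset, 70, "standard_tone", 2300.0))
    onset += 70
    events.append((onset, 589, "speech_token", None))
    return events


# High-mean condition; tone_ms=70 is a placeholder assumption.
high_trial = build_trial(mean_hz=2800, tone_ms=70, seed=1)
for event in high_trial[-3:]:
    print(event)
```

Because only the history tones are resampled, two trials in the same condition differ in their precise acoustics while sharing the same distribution mean and the same standard tone immediately before the speech token.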

Results: Contrastive effect (p < .0001).

Notched Noise Histories [Figure: frequency (Hz) by time (ms); panels A and B]. 100 Hz bandwidth for each notch.

Results: Tones and Notched Noise (p < 0.04, p < 0.01; N = 10).

Joint Effects? Tone-only timeline: acoustic history (2100 ms), silent interval (50 ms), standard tone (70 ms), speech token (589 ms). Speech-context timeline: acoustic history (2100 ms), silent interval (50 ms), /al/ or /ar/ (300 ms), speech token (589 ms). Conflicting: e.g., high mean + /ar/. Cooperating: e.g., high mean + /al/.

Interaction of Speech/N.S. (p = .007, p < .0001, p = .009): Significantly greater than speech alone. Same magnitude as speech only, opposite direction. Follows N.S. spectra.