Levels of Representation in Adult Speech Perception

The Big Questions
- What levels of acoustic/phonetic/phonological representation can we distinguish in the brain?
- How are these representations created or modified during development?
- What is the flow of information (in space and time) in the mapping from acoustics to the lexicon in the brain?
- How does knowledge of native-language categories and phonotactics constrain perception?
- How are phonological representations encoded?

/kæt/

A Category

Another Category: 3, III

Types of Category
Phonetic categories:
- islands of acoustic consistency
- graded internal structure matters
- not good for computation
Phonological categories:
- differences among category members are irrelevant
- good for computation
- may correspond to a complex acoustic distribution

/kæt/: gradient category representations vs. discrete category representations

Sensory Maps
Internal representations of the outside world. Cellular neuroscience has discovered a great deal in this area.

Vowel Space
Notions of sensory maps may be applicable to some aspects of human phonetic representations…
…but there has been little success in that regard, and we should not expect this approach to yield much.

/kæt/: gradient category representations vs. discrete category representations

Phonological Categories are Different
Decisions about how to categorize a sound may be fuzzy, etc., but phonological processes are blind to this.
We do not find gradient application of phonological transformations:
- no partial epenthesis
- no gradient stress
- etc.
There are also developmental dissociations of the two category types.

Some Established Results
- The search for phonetic 'maps' in the brain has been consistently uninformative.
- The electrophysiology of speech perception has been dominated by studies of the Mismatch Negativity (MMN), a response elicited in auditory cortex 150-200 ms after the onset of an oddball sound.
- MMN amplitude tracks the perceptual distance between the standard and deviant sounds, i.e. it is a measure of similarity along many dimensions.
- There are established effects and non-effects of linguistic category structure on the MMN:
  - non-effects in comparisons of within- vs. across-category contrasts
  - real effects in comparisons of native vs. non-native contrasts

Electroencephalography (EEG)

Event-Related Potentials (ERPs)
(Figure: ERPs time-locked to successive stimuli s1, s2, s3 in "John is laughing.")

Magnetoencephalography
(Figure: pickup coil & SQUID assembly; 160-SQUID whole-head array)

Brain Magnetic Fields (MEG)
SQUID detectors measure brain magnetic fields that are roughly a billion times weaker than the earth's steady magnetic field.

Evoked Responses

M100
- Elicited by any well-defined onset
- Varies with tone frequency
- Varies with F1 of vowels
- May vary non-linearly with VOT variation
- Functional value of the time-code unclear
- No evidence of higher-level representations
(Poeppel & Roberts 1996; Poeppel, Phillips et al. 1997; Phillips et al. 1995; Sharma & Dorman 1999)

Mismatch Response
X X X X X Y X X X X Y X X X X X X Y X X X Y X X X...
- Latency: 150-250 msec
- Localization: supratemporal auditory cortex
- Many-to-one ratio between standards and deviants
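
To make the many-to-one logic concrete, here is a minimal sketch of generating such a standard/deviant sequence (Python; the deviant probability and the minimum-gap constraint are illustrative assumptions, since the slides do not specify the sequencing rules):

```python
import random

def oddball_sequence(n_trials=400, p_deviant=0.15, min_gap=2, seed=1):
    """Generate a standard/deviant ('X'/'Y') oddball sequence.

    Enforces at least `min_gap` standards after each deviant (an assumed
    constraint that keeps deviants perceptually rare).
    """
    rng = random.Random(seed)
    seq, since_deviant = [], min_gap
    for _ in range(n_trials):
        if since_deviant >= min_gap and rng.random() < p_deviant:
            seq.append('Y')
            since_deviant = 0
        else:
            seq.append('X')
            since_deviant += 1
    return seq

seq = oddball_sequence()
print(' '.join(seq[:25]))                   # e.g. X X X Y X X X X Y ...
print(round(seq.count('Y') / len(seq), 3))  # realized deviant proportion
```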

Localization of Mismatch Response (Phillips, Pellathy, Marantz et al., 2000)

Basic MMN elicitation © Risto Näätänen

MMN Amplitude Variation (Sams et al. 1985)

How do MMN latency and amplitude vary with the size of the frequency difference?
1000 Hz tone standard (Tiitinen et al. 1994).

Different Dimensions of Sounds
- length
- amplitude
- pitch
- …you name it…
The amplitude of the mismatch response can be used as a measure of perceptual distance.

Impetus for Language Studies
If MMN amplitude is a measure of perceptual distance, then perhaps it can be informative in domains where acoustic and perceptual distance diverge…

Place of Articulation
Acoustic variation: F2 & F3 transitions
(Figure: a [bæ]-[dæ] continuum, with within-category and between-category pairs marked)

Categories in Infancy
High Amplitude Sucking, 2-month-olds (Eimas et al. 1971):
- 20 vs. 40 ms VOT: discriminated
- 40 vs. 60 ms VOT: not discriminated
Infants show the contrast, but this does not entail phonological knowledge.

Place of Articulation
No effect of the category boundary on MMN amplitude (Sharma et al. 1993).
Similar findings in Sams et al. (1991) and Maiste et al. (1995).

but…

Näätänen et al. (1997)
(Figure: vowel stimuli e, e/ö, ö, õ, o)

Phonetic Category Effects
- Measures of uneven discrimination profiles: findings are mixed (…and techniques vary).
- These rely on the assumption that the effects of contrasts at multiple levels are additive, plus the requirement that the additivity effect be strong enough to yield a statistical interaction.
Logic of our studies:
- Eliminate the contribution of lower levels by isolating the many-to-one ratio at an abstract level of representation.
- Do this by introducing non-orthogonal variation among the standards.

Auditory Cortex Accesses Phonological Categories: An MEG Mismatch Study Colin Phillips, Tom Pellathy, Alec Marantz, Elron Yellin, et al. [Journal of Cognitive Neuroscience, 2000]

More Abstract Categories
At the level of phonological categories, within-category differences are irrelevant.
Aims:
- use the mismatch field (MMF) to measure categorization rather than discrimination
- focus on the failure to make category-internal distinctions

Voice Onset Time (VOT)
(Figure: example stimulus with a 60 msec VOT)

Design
Fixed Design - Discrimination: 20 ms / 40 ms / 60 ms
Grouped Design - Categorization: /d/-side tokens 0, 8, 16, 24 ms; /t/-side tokens 40, 48, 56, 64 ms
(Non-orthogonal within-category variation excludes grouping via acoustic streaming.)
Grouped Design - Acoustic Control: 20, 28, 36, 44 ms vs. 60, 68, 76, 84 ms
(The three stimulus sets are restated in the sketch below.)
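
A compact restatement of the three stimulus sets (a sketch; the grouping labels follow the slide, and counterbalancing of standard vs. deviant roles is not specified there):

```python
# VOT values (ms) from the design slides. In the grouped designs no single
# token repeats often enough to act as an acoustic standard, so any
# many-to-one standard/deviant ratio exists only at the category level.
fixed_discrimination = [20, 40, 60]     # single tokens: 20/40 crosses the
                                        # ~30 ms /d/-/t/ boundary, 40/60 does not
grouped_categorization = {
    "/d/-side tokens": [0, 8, 16, 24],  # non-orthogonal within-category variation
    "/t/-side tokens": [40, 48, 56, 64],
}
grouped_acoustic_control = {
    "set A": [20, 28, 36, 44],          # same 8 ms spacing, but set A straddles
    "set B": [60, 68, 76, 84],          # the boundary: no consistent category split
}
```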

/dæ/ standard vs. /dæ/ deviant

Discrimination vs. Categorization: Vowels
Daniel Garcia-Pedrosa, Colin Phillips, Henny Yeung

Some Concerns
Are the category effects an artifact?
- It is very hard to discriminate different members of the same category on a VOT scale.
- Perhaps subjects are forming ad hoc groupings of sounds during the experiment, rather than using their phonological representations?
- Does the ~30 ms VOT boundary simply reflect a fundamental neurophysiological timing constraint?

Vowels
Vowels show categorical perception effects in identification tasks…
…but vowels show much better discriminability of within-category pairs.

Vowels & Tones
- Synthetic /u/-/o/ continuum: F1 varied, all else constant.
- Amplitude envelope of F1 extracted for the creation of tone controls.
- Pure-tone continuum at the F1 center frequency, matched to the amplitude envelope of the vowel.
(Figure: vowel with F1 = 310 Hz; pure tone at 310 Hz)

Design
Tones and vowels: the first formant (F1) varies along the same 290-470 Hz continuum; F0, F2, voicing onset, etc. all remain constant.
One category side: 300, 320, 340, 360 Hz; the other: 400, 420, 440, 460 Hz (a synthesis sketch follows below).
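
One way such tone controls could be generated, as a minimal sketch (assumptions: numpy, a 44.1 kHz sampling rate, and a simple interpolated envelope; the slides do not give the actual synthesis parameters):

```python
import numpy as np

def pure_tone(freq_hz, dur_s=0.3, fs=44100, envelope=None):
    """Synthesize a pure tone at freq_hz; optionally impose an amplitude
    envelope (per the slides, the envelope extracted from the vowel's F1)."""
    t = np.arange(int(dur_s * fs)) / fs
    tone = np.sin(2 * np.pi * freq_hz * t)
    if envelope is not None:
        # stretch the stored envelope samples to the tone's duration
        tone *= np.interp(t, np.linspace(0, dur_s, len(envelope)), envelope)
    return tone

group_a_hz = [300, 320, 340, 360]   # one side of the 290-470 Hz F1 continuum
group_b_hz = [400, 420, 440, 460]   # the other side
tones = {f: pure_tone(f) for f in group_a_hz + group_b_hz}
```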

Results: Vowels
(Figures)

Results: Tones
(Figures)

Preliminary Conclusions
- Clear MMN in the standard 150-250 ms latency range in the vowel condition, but not in the tone condition.
- Both vowels and tones yield larger N100 responses.
- Open questions for the tones: a categorization effect? A response to the rarity of individual deviant tones, without categorization? A response to the larger frequency changes incurred when moving from the standard to the deviant category?

Phonological Features
Colin Phillips, Tom Pellathy, Henny Yeung, Alec Marantz

Sound Groupings

Phonological Features
Phonological natural classes exist because…
- phonemes are composed of features, the smallest building blocks of language;
- phonemes that share a feature form a natural class.
Effects of feature-based organization are observed in:
- language development
- language disorders
- historical change
- synchronic processes
(Roman Jakobson, 1896-1982)

Sound Groupings in the Brain
pæ, tæ, tæ, kæ, dæ, pæ, kæ, tæ, pæ, kæ, bæ, tæ...
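
The point of the sequence is that rarity exists only at the feature level; a small sketch (Python; the voiced stops are the rare class in this particular sequence):

```python
VOICED = set('bdg')   # [+voice] stops; [-voice] stops: p, t, k

sequence = ['pæ', 'tæ', 'tæ', 'kæ', 'dæ', 'pæ',
            'kæ', 'tæ', 'pæ', 'kæ', 'bæ', 'tæ']

# No individual syllable is frequent enough to serve as a classic oddball
# standard; collapsing tokens by the [voice] feature is what restores a
# many-to-one standard/deviant ratio.
labels = ['deviant' if syl[0] in VOICED else 'standard' for syl in sequence]
print(list(zip(sequence, labels)))
print(labels.count('deviant') / len(labels))   # rare only at the feature level
```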

Feature Mismatch: Stimuli

Feature Mismatch Design

Feature Mismatch

Control Experiment - 'Acoustic Condition'
Identical acoustic variability, but no phonological many-to-one ratio.

Phoneme Variation: Features I
Alternative account of the findings: no feature-based grouping; independent MMFs elicited by the 3 low-frequency phonemes.

/bæ/ 29%   /dæ/ 29%   /gæ/ 29%   (standards, 87.5% in total)
/pæ/ 4%    /tæ/ 4%    /kæ/ 4%    (deviants, 12.5% in total)
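
The per-phoneme percentages follow from splitting each class total evenly across its three members; a quick check (Python):

```python
per_standard = 0.875 / 3   # ≈ 0.292, the "29%" for /bæ/, /dæ/, /gæ/
per_deviant  = 0.125 / 3   # ≈ 0.042, the "4%" for /pæ/, /tæ/, /kæ/
print(round(per_standard, 3), round(per_deviant, 3))   # 0.292 0.042
```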

Phoneme Variation: Features II
Follow-up study distinguishes phoneme-level frequency from feature-level status.

/bæ/ 37.5%   /gæ/ 37.5%   /dæ/ 12.5%   /tæ/ 12.5%

- Phoneme-based classification: /dæ/ and /tæ/ are equally rare.
- Feature-based grouping: the [+voice] set /bæ, gæ, dæ/ together makes up 87.5% of trials, leaving /tæ/ as the sole deviant.

Phoneme Variation: Features II - Design
- N = 10
- Multiple exemplars, individually selected boundaries
- 2 versions recorded for all participants, reversing the [±voice] value
- Acoustic control, with all VOT values in the [-voice] range
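
A sketch of why the two classifications diverge under these proportions (Python; syllable labels as on the slides):

```python
probs = {'bæ': 0.375, 'gæ': 0.375, 'dæ': 0.125, 'tæ': 0.125}

# Phoneme-based classification: /dæ/ and /tæ/ are equally rare (12.5% each),
# so both should behave as deviants.
rare = sorted(s for s, p in probs.items() if p == 0.125)

# Feature-based grouping: the [+voice] set {bæ, gæ, dæ} sums to 87.5%,
# leaving /tæ/ alone as the oddball, so the two accounts make different
# predictions about /dæ/.
voiced_total = sum(p for s, p in probs.items() if s[0] in 'bdg')
print(rare, voiced_total)   # ['dæ', 'tæ'] 0.875
```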

Phoneme Variation: Features II
(Figure: responses at left-anterior channels)

Distinguishing Lexical and Surface Category Contrasts
Nina Kazanina (Univ. of Ottawa), Colin Phillips, Bill Idsardi

Allophonic Variation
All studies shown so far fail to distinguish surface from lexical-level ('underlying') category representations.
Phonological category ≠ acoustic distribution.

Russian vs. Korean
Three series of stops in Korean:
- plain (lenis): pa ta ka
- glottalized (tense, long): p'a t'a k'a
- aspirated: pʰa tʰa kʰa
Intervocalic Plain Stop Voicing: /papo/ → [pabo] 'fool'; /ku papo/ → [ku babo] 'the fool'
Plain stops show a bimodal distribution of +VOT and -VOT tokens:
- word-initially: always a positive VOT
- word-medially, intervocalically: a voicing lead (negative VOT)

(Figures: identification/rating and discrimination results)

MEG Stimuli
Russian (basic Russian [ta]-token: 0 ms voicing lead, +13 ms vowel lag):
- DA (voicing leads): -40, -34, -28, -24 ms
- TA (voicing leads & lags): -08, -04, +02, +08 ms (relative); -08, -04, +15, +21 ms (absolute)
Korean (basic Korean [ta]-token: 0 ms voicing lead, +29 ms vowel lag):
- DA (voicing leads): -40, -36, -30, -24 ms
- TA (voicing lags): 00, +07, +11, +15 ms (relative); +29, +36, +40, +44 ms (absolute)
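
The absolute values appear to be the relative values shifted by the base token's vowel lag for lag (non-negative) tokens, with voicing leads unchanged; a sketch of that reading (Python):

```python
def absolute_vot(relative_ms, vowel_lag_ms):
    """Map continuum values stated relative to the base token onto absolute
    VOTs: voicing leads (negative values) are unaffected, while lag values
    are shifted by the base token's vowel lag."""
    return [v if v < 0 else v + vowel_lag_ms for v in relative_ms]

print(absolute_vot([-8, -4, 2, 8], 13))    # Russian TA: [-8, -4, 15, 21]
print(absolute_vot([0, 7, 11, 15], 29))    # Korean TA: [29, 36, 40, 44]
```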

(Figure legend: black = p < .05; white = n.s.)

Russian vs. Korean
MEG responses indicate that Russian speakers immediately map sounds from the [d]-[t] continuum onto categories; Korean speakers do not…
…despite the fact that these sounds show a bimodal distribution in their language.
Perceptual space reflects the functional status of sounds in encoding word meanings.

Basic Understanding
Adults are prisoners of their native-language sound system.
How strong is this?
- Structure-adding models predict residual sensitivity to non-native sounds.
- There is a great deal of motivation in L2 research to find ways to free perception from the constraints of the L1.

Phonology - Syllables
Japanese versus French: pairs like "egma" and "eguma".
The contrast is possible in French but not in Japanese, whose phonotactics disallow the [gm] cluster.

Behavioral Results
Japanese listeners have difficulty hearing the difference (Dupoux et al. 1999).

EXECTIVE SUITE
(A sign missing its "U": readers tend to perceive "EXECUTIVE" anyway, an orthographic analogue of perceptual epenthesis.)

ERP Results
Sequences: egma, egma, egma, egma, eguma
French listeners show 3 mismatch responses: early, middle, and late.
Japanese listeners show only the late response.
(Dehaene-Lambertz et al. 2000)

ERP Results - 2: early response (Dehaene-Lambertz et al. 2000)

ERP Results - 3: middle response (Dehaene-Lambertz et al. 2000)

ERP Results - 4: late response (Dehaene-Lambertz et al. 2000)

Implications
- The cross-language contrast in the MMN mirrors the behavioral contrast.
- The relative timing of the responses that are shared and not shared across French & Japanese is surprising from a bottom-up view of analysis; it suggests a dual route.
- Is this effect specific to comparison in an XXXXY task?
- Is the result robust, and does it generalize to other phonotactic generalizations?

What drives Perceptual Epenthesis? Illegal syllables? Illegal sequences of consonants? (Kabak & Idsardi, 2004)

What drives Perceptual Epenthesis?
Korean syllables:
- Only [p, t, k, m, n, ŋ, l] may appear in coda position.
- Other consonants neutralize in the coda: [c, c', cʰ] → [t].
- Voiced stops occur only in intervocalic (V_V) environments, as allophones of the voiceless stops.
Korean contact restrictions:
- *C + N
- Repair 1: nasalize C: /patʰ/ + /ma/ → [panma]
- Repair 2: denasalize N: /tal/ + /nala/ → [tallala]
- Restrictions apply within the Intonational Phrase (IntPh).
(Kabak & Idsardi, 2004)
A toy rewrite of the two repairs is sketched below.
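
A toy string-rewriting sketch of the coda neutralization and the two contact repairs (Python; the segment inventories and rule ordering are simplified illustrations, not a full model of Kabak & Idsardi's analysis):

```python
def join_morphemes(stem, suffix):
    """Concatenate two morphemes, applying coda neutralization and the two
    contact repairs from the slide. Segments are list elements so that an
    aspirated stop like 'tʰ' counts as a single unit."""
    coda, onset = stem[-1], suffix[0]
    coda = {'tʰ': 't', 'c': 't', "c'": 't', 'cʰ': 't'}.get(coda, coda)  # coda neutralization
    if coda in {'p', 't', 'k'} and onset in {'m', 'n'}:
        coda = {'p': 'm', 't': 'n', 'k': 'ŋ'}[coda]   # Repair 1: nasalize C before N
    elif coda == 'l' and onset == 'n':
        suffix = ['l'] + suffix[1:]                   # Repair 2: denasalize N after /l/
    return ''.join(stem[:-1] + [coda] + suffix)

print(join_morphemes(['p', 'a', 'tʰ'], ['m', 'a']))           # panma
print(join_morphemes(['t', 'a', 'l'], ['n', 'a', 'l', 'a']))  # tallala
```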

What drives Perceptual Epenthesis? (Kabak & Idsardi, 2004)
(Results figures)