Levels of Representation in Adult Speech Perception
The Big Questions
What levels of acoustic/phonetic/phonological representation can we distinguish in the brain?
How are these representations created or modified during development?
What is the flow of information (in space and time) in the mapping from acoustics to the lexicon in the brain?
How does knowledge of native-language categories and phonotactics constrain perception?
How are phonological representations encoded?
/kæt/
A Category
Another Category: 3, III
Types of Category
Phonetic categories: islands of acoustic consistency; graded internal structure matters; not good for computation.
Phonological categories: differences among category members are irrelevant; good for computation; may correspond to a complex acoustic distribution.
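To make the contrast concrete, here is a minimal Python sketch (my own illustration, not from the slides): a phonetic category assigns graded goodness to tokens along an acoustic dimension, while a phonological category keeps only a discrete label. The prototype, width, and ~30 ms boundary values are placeholders.

```python
import math

def phonetic_goodness(vot_ms, prototype_ms=60.0, width_ms=20.0):
    """Graded phonetic category: tokens nearer the prototype are 'better'
    exemplars; the internal structure of the category matters."""
    return math.exp(-((vot_ms - prototype_ms) / width_ms) ** 2)

def phonological_label(vot_ms, boundary_ms=30.0):
    """Discrete phonological category: only the label survives;
    differences among members of the same category are discarded."""
    return "/t/" if vot_ms > boundary_ms else "/d/"

for vot in (10, 25, 40, 60, 80):
    print(vot, round(phonetic_goodness(vot), 2), phonological_label(vot))
```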
/kæt/ Gradient Category Representations
Discrete Category Representations
Sensory Maps Internal representations of the outside world. Cellular neuroscience has discovered a great deal in this area.
Vowel Space Notions of sensory maps may be applicable to some aspects of human phonetic representations… …but there’s been little success in that regard, and we shouldn’t expect this to yield much.
/kæt/ Gradient Category Representations
Discrete Category Representations
Phonological Categories are Different
Decisions about how to categorize a sound may be fuzzy, but phonological processes are blind to this: we don't find gradient application of phonological transformations (no partial epenthesis, no gradient stress, etc.). There are also developmental dissociations between the two category types.
Some Established Results
Search for phonetic ‘maps’ in the brain: consistently uninformative.
Electrophysiology of speech perception has been dominated by studies of the Mismatch Negativity (MMN), a response elicited in auditory cortex roughly 150–250 ms after the onset of an oddball sound.
MMN amplitude tracks the perceptual distance between the standard and deviant sounds, i.e. it serves as a measure of similarity along many dimensions.
There are established effects and non-effects of linguistic category structure on the MMN: non-effects in comparisons of within- vs. across-category contrasts; real effects in comparisons of native vs. non-native contrasts.
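As a rough sketch of how an MMN is typically estimated (an assumption about standard practice, not the authors' pipeline): average the epochs time-locked to standards and to deviants, then take the difference wave. The simulated amplitudes, latency, and sampling rate below are arbitrary.

```python
import numpy as np

def mismatch_wave(standard_epochs, deviant_epochs):
    """Toy MMN estimate: average the deviant epochs, average the standard
    epochs, and take the difference wave (deviant - standard).
    Epoch arrays are (n_trials, n_timepoints), time-locked to sound onset."""
    return deviant_epochs.mean(axis=0) - standard_epochs.mean(axis=0)

# Simulated example: 1000 standards, 150 deviants, 600 ms epochs at 500 Hz.
rng = np.random.default_rng(0)
t = np.arange(0, 0.6, 1 / 500)
noise = lambda n: rng.normal(0, 2e-6, (n, t.size))
mmn_shape = -3e-6 * np.exp(-((t - 0.18) / 0.04) ** 2)   # negativity near ~180 ms
standards = noise(1000)
deviants = noise(150) + mmn_shape
diff = mismatch_wave(standards, deviants)
print(f"peak difference: {diff.min()*1e6:.1f} µV at {t[diff.argmin()]*1000:.0f} ms")
```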
Electroencephalography (EEG)
Event-Related Potentials (ERPs)
John is laughing. s1 s2 s3
Magnetoencephalography
Pickup coil & SQUID assembly; 160-SQUID whole-head array
Brain Magnetic Fields (MEG)
SQUID detectors measure brain magnetic fields that are roughly a billion times weaker than the earth’s steady magnetic field.
Evoked Responses
M100
Elicited by any well-defined onset; varies with tone frequency; varies with F1 of vowels; may vary non-linearly with VOT variation.
Functional value of the time-code unclear; no evidence of higher-level representations.
(Poeppel & Roberts 1996; Poeppel, Phillips et al. 1997; Phillips et al. 1995; Sharma & Dorman 1999)
Mismatch Response
Oddball sequence: X X X X X Y X X X X Y X X X X X X Y X X X Y X X X...
Latency: roughly 150–250 msec
Localization: supratemporal auditory cortex
Many-to-one ratio between standards and deviants
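A small sketch of how such an oddball sequence might be generated (illustrative only; the deviant probability and spacing constraint are assumptions, not the published parameters):

```python
import random

def oddball_sequence(n_trials, p_deviant=0.125, min_standards_between=2, seed=1):
    """Generate a standard/deviant oddball sequence with a many-to-one
    ratio and at least `min_standards_between` standards before each deviant."""
    rng = random.Random(seed)
    seq, since_last = [], min_standards_between
    for _ in range(n_trials):
        if since_last >= min_standards_between and rng.random() < p_deviant:
            seq.append("Y")          # deviant
            since_last = 0
        else:
            seq.append("X")          # standard
            since_last += 1
    return seq

seq = oddball_sequence(40)
print(" ".join(seq), f"  deviants: {seq.count('Y')}/{len(seq)}")
```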
Localization of Mismatch Response
(Phillips, Pellathy, Marantz et al., 2000)
Basic MMN elicitation © Risto Näätänen
MMN Amplitude Variation
Sams et al. 1985
How do MMN latency and amplitude vary with the frequency difference between standard and deviant? 1000 Hz tone standard. Tiitinen et al. 1994
Different Dimensions of Sounds
Length, amplitude, pitch… you name it. The amplitude of the mismatch response can be used as a measure of perceptual distance.
Impetus for Language Studies
If MMN amplitude is a measure of perceptual distance, then perhaps it can be informative in domains where acoustic and perceptual distance diverge…
Place of Articulation
[bæ] vs. [dæ]
Acoustic variation: F2 & F3 transitions
Pairs along the continuum can fall within a category or straddle the category boundary (within-category vs. between-category comparisons).
Categories in Infancy
High Amplitude Sucking - 2-month-olds (Eimas et al. 1971)
20 vs. 40 ms VOT: discriminated
40 vs. 60 ms VOT: not discriminated
Infants show the contrast, but this doesn't entail phonological knowledge
Place of Articulation
No effect of category boundary on MMN amplitude (Sharma et al. 1993)
Similar findings in Sams et al. (1991), Maiste et al. (1995)
but…
Näätänen et al. (1997): vowel continuum e, e/ö, ö, õ, o
Phonetic Category Effects
Measures of uneven discrimination profiles
Findings are mixed (…and techniques vary)
Relies on the assumption that effects of contrasts at multiple levels are additive, plus the requirement that the additivity effect be strong enough to yield a statistical interaction
Logic of our studies: eliminate the contribution of lower levels by isolating the many-to-one ratio at an abstract level of representation; do this by introducing non-orthogonal variation among the standards
Auditory Cortex Accesses Phonological Categories: An MEG Mismatch Study
Colin Phillips, Tom Pellathy, Alec Marantz, Elron Yellin, et al. [Journal of Cognitive Neuroscience, 2000]
More Abstract Categories
At the level of phonological categories, within-category differences are irrelevant.
Aims: use the mismatch field (MMF) to measure categorization rather than discrimination; focus on the failure to make category-internal distinctions.
Voice Onset Time (VOT) 60 msec
Design
Fixed Design - Discrimination: 20ms 40ms 60ms
Grouped Design - Categorization: 0ms 8ms 16ms 24ms | 40ms 48ms 56ms 64ms
Non-orthogonal within-category variation: excludes grouping via acoustic streaming.
Grouped Design - Acoustic Control: 20ms 28ms 36ms 44ms | 60ms 68ms 76ms 84ms
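A sketch of the grouped (categorization) design in code, using the VOT values from the slide. The 12.5% deviant rate and the assignment of the 0–24 ms group as standards are assumptions for illustration; the actual assignment may have been counterbalanced.

```python
import random

# VOT values (ms) from the grouped categorization design on this slide.
# Assumed assignment: /d/-category tokens as standards, /t/-category as deviants.
STANDARD_VOTS = [0, 8, 16, 24]
DEVIANT_VOTS = [40, 48, 56, 64]

def grouped_trials(n_trials, p_deviant=0.125, seed=2):
    """Each standard is a *different* token within the standard category,
    so the only many-to-one ratio is at the category level."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        if rng.random() < p_deviant:
            trials.append(("deviant", rng.choice(DEVIANT_VOTS)))
        else:
            trials.append(("standard", rng.choice(STANDARD_VOTS)))
    return trials

for label, vot in grouped_trials(12):
    print(f"{label:8s} VOT = {vot} ms")
```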
/dæ/ standard vs. /dæ/ deviant
Discrimination vs. Categorization: Vowels
Daniel Garcia-Pedrosa, Colin Phillips, Henny Yeung
Some Concerns
Are the category effects an artifact?
It is very hard to discriminate different members of the same category on a VOT scale.
Perhaps subjects are forming ad hoc groupings of sounds during the experiment, rather than using their phonological representations?
Does the ~30ms VOT boundary simply reflect a fundamental neurophysiological timing constraint?
Vowels
Vowels show categorical perception effects in identification tasks
…but vowels show much better discriminability of within-category pairs than stop consonants do
Vowels & Tones
Synthetic /u/-/o/ continuum: F1 varied, all else constant
Amplitude envelope of F1 extracted for creation of tone controls
Pure-tone continuum at the F1 center frequency, matched to the amplitude envelope of the vowel
Example pair: vowel with F1 = 310Hz; pure tone at 310Hz
Design
Tones and vowels: the first formant (F1) varies along the same Hz continuum; F0, F2, voicing onset, etc. all remain constant
300Hz 320Hz 340Hz 360Hz | 400Hz 420Hz 440Hz 460Hz
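The tone controls can be thought of as a pure tone at the vowel's F1 center frequency, scaled by the vowel's F1 amplitude envelope. The sketch below illustrates only that matching step; the toy envelope, sampling rate, and ramp durations are assumptions, while the real stimuli come from a synthetic continuum.

```python
import numpy as np

def tone_matched_to_vowel(f1_hz, envelope, sr=22050):
    """Sketch of the tone-control logic: a pure tone at the vowel's F1
    center frequency, scaled by the vowel's (normalized) amplitude envelope."""
    t = np.arange(envelope.size) / sr
    return np.sin(2 * np.pi * f1_hz * t) * (envelope / envelope.max())

# Toy envelope: 300 ms with 20 ms linear onset/offset ramps.
sr = 22050
n = int(0.3 * sr)
ramp = int(0.02 * sr)
env = np.ones(n)
env[:ramp] = np.linspace(0, 1, ramp)
env[-ramp:] = np.linspace(1, 0, ramp)
tone_310 = tone_matched_to_vowel(310, env, sr)
print(tone_310.shape, tone_310.max().round(2))
```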
Results: Vowels
Results: Tones
Preliminary conclusions
Clear MMN in the standard latency range in the vowel condition, but not in the tone condition
Both vowels and tones yield larger N100 responses
Open questions: a categorization effect for tones? a response to the rarity of individual deviant tones, without categorization? a response to the larger frequency changes when moving from the standard to the deviant category?
Phonological Features
Colin Phillips, Tom Pellathy, Henny Yeung, Alec Marantz
Sound Groupings
Phonological Features
Phonological natural classes exist because...
Phonemes are composed of features - the smallest building blocks of language
Phonemes that share a feature form a natural class
Effects of feature-based organization are observed in language development, language disorders, historical change, and synchronic processes
(Roman Jakobson)
Sound Groupings in the Brain
pæ, tæ, tæ, kæ, dæ, pæ, kæ, tæ, pæ, kæ, bæ, tæ...
Feature Mismatch: Stimuli
Feature Mismatch Design
Feature Mismatch
Control Experiment - ‘Acoustic Condition’
Identical acoustic variability, but no phonological many-to-one ratio
Phoneme Variation: Features I
Alternative account of the findings: no feature-based grouping; instead, independent MMFs elicited by the 3 low-frequency phonemes
/bæ/ 29%  /dæ/ 29%  /gæ/ 29%  (together 87.5%)
/pæ/ 4%  /tæ/ 4%  /kæ/ 4%  (together 12.5%)
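The arithmetic behind this can be checked directly: no single syllable is frequent, but at the [voice] feature level the standard/deviant ratio is still roughly 87.5% to 12.5%. A quick sketch, using the (rounded) percentages from the slide:

```python
# Trial frequencies from the slide: three voiced syllables are frequent,
# three voiceless syllables are rare (percentages rounded on the slide).
freqs = {"bæ": 0.29, "dæ": 0.29, "gæ": 0.29, "pæ": 0.04, "tæ": 0.04, "kæ": 0.04}

p_voiced = sum(p for s, p in freqs.items() if s[0] in "bdg")
p_voiceless = sum(p for s, p in freqs.items() if s[0] in "ptk")

# No single phoneme is frequent enough to act as a classic standard on its own,
# but at the [voice] feature level the rounded totals (~87% vs ~12%) reflect
# the intended 87.5% / 12.5% standard-to-deviant split.
print(f"voiced (standard) ≈ {p_voiced:.2f}, voiceless (deviant) ≈ {p_voiceless:.2f}")
```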
Phoneme Variation: Features II
Follow-up study distinguishes phoneme-level frequency from feature-level status
/bæ/ 37.5%  /gæ/ 37.5%  /dæ/ 12.5%  /tæ/ 12.5%
Two readings of the same design: phoneme-based classification vs. feature-based grouping (see the prediction sketch below, after the design details).
Phoneme Variation: Features II
Design
N = 10
Multiple exemplars, individually selected boundaries
2 versions recorded for all participants, reversing the [±voice] value
Acoustic control, with all VOT values in the [-voice] range
/bæ/ 37.5%  /gæ/ 37.5%  /dæ/ 12.5%  /tæ/ 12.5%  (feature-based grouping)
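A small sketch of the two competing predictions for this design (my own illustration, for one of the two counterbalanced versions): under phoneme-based classification both rare syllables /dæ/ and /tæ/ count as deviants, whereas under feature-based grouping only the lone [-voice] syllable /tæ/ does.

```python
freqs = {"bæ": 0.375, "gæ": 0.375, "dæ": 0.125, "tæ": 0.125}
voice = {"bæ": "+voice", "gæ": "+voice", "dæ": "+voice", "tæ": "-voice"}

# Phoneme-based classification: /dæ/ and /tæ/ are equally rare,
# so both should behave like deviants.
phoneme_deviants = [s for s, p in freqs.items() if p <= 0.125]

# Feature-based grouping: the [+voice] syllables jointly make up 87.5% of
# trials, so only the lone [-voice] syllable /tæ/ is a deviant.
feature_probs = {}
for s, p in freqs.items():
    feature_probs[voice[s]] = feature_probs.get(voice[s], 0) + p
feature_deviants = [s for s in freqs if feature_probs[voice[s]] <= 0.125]

print("phoneme-based deviants:", phoneme_deviants)   # ['dæ', 'tæ']
print("feature-based deviants:", feature_deviants)   # ['tæ']
```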
Phoneme Variation: Features II
Left-anterior channels
Distinguishing Lexical and Surface Category Contrasts
Nina Kazanina (Univ. of Ottawa), Colin Phillips, Bill Idsardi
Allophonic Variation
All studies shown so far fail to distinguish surface from lexical-level (‘underlying’) category representations
Phonological category ≠ acoustic distribution
Russian vs. Korean
Three series of stops in Korean:
plain (lenis): pa ta ka
glottalized (tense, long): p’a t’a k’a
aspirated: pha tha kha
Intervocalic Plain Stop Voicing: /papo/ --> [pabo] ‘fool’; /ku papo/ --> [kubabo] ‘the fool’
Plain stops: bimodal distribution of +VOT and –VOT tokens
Word-initially: always a positive VOT
Word-medially, intervocalically: a voicing lead (negative VOT)
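For concreteness, the intervocalic voicing pattern can be written as a toy rule (an illustration of the allophonic distribution, not an analysis from the slides; the segment inventory is simplified):

```python
# Illustrative rule: Korean plain (lenis) stops surface as voiced
# between vowels, and as voiceless elsewhere.
PLAIN_TO_VOICED = {"p": "b", "t": "d", "k": "g"}
VOWELS = set("aeiou")

def intervocalic_voicing(phonemes):
    """Apply intervocalic plain-stop voicing to a sequence of segments."""
    out = list(phonemes)
    for i, seg in enumerate(out):
        between_vowels = (0 < i < len(out) - 1
                          and out[i - 1] in VOWELS and out[i + 1] in VOWELS)
        if seg in PLAIN_TO_VOICED and between_vowels:
            out[i] = PLAIN_TO_VOICED[seg]
    return "".join(out)

print(intervocalic_voicing("papo"))    # 'pabo'   word-internally
print(intervocalic_voicing("kupapo"))  # 'kubabo' across the phrase /ku papo/
```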
Identification/ Rating Discrimination
MEG Stimuli
Russian (basic Russian [ta]-token: 00ms voicing lead, +13ms vowel lag):
DA (voicing leads): -40ms -34ms -28ms -24ms
TA (voicing leads & lags): -08ms -04ms +02ms +08ms (relative) = -08ms -04ms +15ms +21ms (absolute)
Korean (basic Korean [ta]-token: 00ms voicing lead, +29ms vowel lag):
DA (voicing leads): -40ms -36ms -30ms -24ms
TA (voicing lags): 00ms +07ms +11ms +15ms (relative) = +29ms +36ms +40ms +44ms (absolute)
Black: p < .05 White: n.s.
Russian vs. Korean
MEG responses indicate that Russian speakers immediately map sounds from the [d]-[t] continuum onto categories; Korean speakers do not…
…despite the fact that the sounds show a bimodal distribution in their language.
Perceptual space reflects the functional status of sounds in encoding word meanings.
Basic understanding
Adults are prisoners of their native-language sound system.
How strong is this?
Structure-adding models predict residual sensitivity to non-native sounds.
There is a great deal of motivation in L2 research to find ways to free perception from the constraints of L1.
Phonology - Syllables: Japanese versus French
Pairs like “egma” and “eguma”: the difference is possible in French, but not in Japanese.
Behavioral Results: Japanese listeners have difficulty hearing the difference
Dupoux et al. 1999
ERP Results
Sequences: egma, egma, egma, egma, eguma
French listeners show 3 mismatch responses: early, middle, late
Japanese listeners show only the late response
Dehaene-Lambertz et al. 2000
ERP Results - 2 Early response Dehaene-Lambertz et al. 2000
ERP Results - 3 Middle response Dehaene-Lambertz et al. 2000
ERP Results - 4 Late response Dehaene-Lambertz et al. 2000
Implications
The cross-language contrast in the MMN mirrors the behavioral contrast.
The relative timing of the responses that are the same vs. different across French & Japanese is surprising from a bottom-up view of analysis - it suggests a dual route.
Is this effect specific to comparison in an XXXXY task?
Is the result robust; does it generalize to other phonotactic generalizations?
What drives Perceptual Epenthesis?
Illegal syllables? Illegal sequences of consonants? (Kabak & Idsardi, 2004)
What drives Perceptual Epenthesis?
Korean syllables
Only [p, t, k, m, n, ŋ, l] in coda
Other consonants neutralize in coda position: [c, c’, ch] --> [t] in coda
Voiced stops only in CVC environments (allophones of voiceless stops)
Korean contact restrictions: *C + N
Repair 1: nasalize C: [path] + [ma] --> [panma]
Repair 2: denasalize N: [tal] + [nala] --> [tallala]
Restrictions apply within IntPh
(Kabak & Idsardi, 2004)
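A toy sketch of the contact repairs listed above (my own illustration; segments are simplified ASCII/IPA stand-ins, and only the two repairs plus coda neutralization are modeled):

```python
# Toy model of the *C+N repairs described on the slide, applied at a
# morpheme/word boundary after coda neutralization.
CODA_NEUTRAL = {"th": "t", "ph": "p", "kh": "k", "c": "t", "c'": "t", "ch": "t"}
NASALIZE = {"p": "m", "t": "n", "k": "ŋ"}   # Repair 1: obstruent -> nasal before a nasal
NASALS = {"m", "n", "ŋ"}

def join(stem, suffix):
    """Concatenate two segment lists and repair an illegal C+N contact."""
    segs = list(stem)
    final, first = segs[-1], suffix[0]
    final = CODA_NEUTRAL.get(final, final)          # coda neutralization
    if first in NASALS and final in NASALIZE:
        final = NASALIZE[final]                     # Repair 1: nasalize C
    elif final == "l" and first == "n":
        suffix = ["l"] + list(suffix[1:])           # Repair 2: denasalize N
    return "".join(segs[:-1] + [final] + list(suffix))

print(join(["p", "a", "th"], ["m", "a"]))            # panma
print(join(["t", "a", "l"], ["n", "a", "l", "a"]))   # tallala
```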
What drives Perceptual Epenthesis?
(Kabak & Idsardi, 2004)