Statistical Frequency in Word Segmentation

Words don’t come with nice clean boundaries between them Where are the word boundaries?

Question: How do children work out where the word boundaries are? There are several potential clues:
-Pauses (although this is dubious)
-Intonation (this too is dubious)
-Statistical regularities

Statistical Regularities Words very rarely begin with [dw], Words never begin with [bn], Words never begin with [lb], Etc. So if the child hears these sequences, the child hypothesizes the sequence occurred in the middle or at the end of the word.
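The hypothesis on this slide can be sketched as a simple lookup. This is only an illustration, not a model from the research discussed here: the set name `ILLEGAL_ONSETS` and the function `boundary_possible_before` are hypothetical, and the set holds just the two sequences the slide says never begin a word.

```python
# Hypothetical sketch: sequences that, per the slide, never begin a word.
ILLEGAL_ONSETS = {"bn", "lb"}


def boundary_possible_before(cluster: str) -> bool:
    """Return False when placing a word boundary directly before `cluster`
    would create a word-initial sequence the learner never hears."""
    return cluster not in ILLEGAL_ONSETS


# [bn] cannot start a word, so the learner places it word-medially or word-finally.
print(boundary_possible_before("bn"))  # False
# [st] is not in the illegal set, so a boundary before it stays possible.
print(boundary_possible_before("st"))  # True
```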

Statistical Regularities Voiceless stops that begin words are almost always aspirated, Voiced segments that end words are often de-voiced, Various other phonological processes may occur, e.g., word-final frication, etc. So these are phonological clues that may help segment the speech stream.

Problem For children to make use of these cues, they must be able to track how often such sequences occur in speech; otherwise the cues are useless. If the child cannot track the frequency of [bn] at the beginning of words, this strategy is of no use.

Statistical Tracking Very recent work suggests that children do in fact have the capacity to track statistical frequencies of certain elements in their environment. Major researchers: Jenny Saffran (Wisconsin), Rebecca Gomez (Arizona), Elisa Newport (Rochester), Richard Aslin (Rochester), LouAnn Gerken (Arizona), Gary Marcus (NYU), etc.

The Experiment - Overview Create a synthesized string of syllables that occur with particular frequencies (can’t use English…). Expose the children to this string of syllables for ~20 minutes. Test children to see if they have a preference for the highly frequent syllable sets or the rare syllable sets. If children show a preference (no matter what direction that preference is in), then children are sensitive to frequencies of syllables in the input.

Sample Stimulus Their language consisted of: Four consonants (p,t,b,d) Three vowels (a, i, u) Which when combined created 12 syllables (pa, ti, bu, da, etc.). These then created six words: babupu, bupada, dutaba, patubi, pidabu, and tutibu

Words: babupu, bupada, dutaba, patubi, pidabu, tutibu
Syllables: bi, bu, pa, pi, ba, pu, ta, ti, tu, da, di, du
Within-word syllable pairs: bupu, bupa, pada, duta, babu, taba, patu, tubi, pida, dabu, tuti, tibu

Transitional Probabilities The chances of a word containing bu are much greater than the chances of a word containing di. Transitional probabilities quantify this. The Transitional Probability of xy is: Frequency of xy / Frequency of x

Transitional Probabilities So for the word babupu, the transitional probabilities are calculated as follows:
Frequency of babu / Frequency of ba → 1/2 = 0.5
Frequency of bupu / Frequency of bu → 1/4 = 0.25
Overall transitional probability of the word babupu = (0.5 + 0.25) / 2 = 0.375
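The calculation for babupu can be reproduced with a short sketch. It assumes, as the slide's numbers imply, that frequencies are counted over the six-word list (each word counted once) rather than over the running speech stream. The helper names `syllables`, `tp` and `word_tp` are illustrative, not taken from the study.

```python
from collections import Counter

# The six words of the artificial language.
WORDS = ["babupu", "bupada", "dutaba", "patubi", "pidabu", "tutibu"]


def syllables(word):
    """Split a word into its two-letter (consonant + vowel) syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


syllable_counts = Counter()
pair_counts = Counter()
for word in WORDS:
    parts = syllables(word)
    syllable_counts.update(parts)                # frequency of x
    pair_counts.update(zip(parts, parts[1:]))    # frequency of xy


def tp(x, y):
    """Transitional probability of xy: frequency of xy / frequency of x."""
    return pair_counts[(x, y)] / syllable_counts[x]


def word_tp(word):
    """Mean transitional probability over the syllable pairs of a word."""
    parts = syllables(word)
    values = [tp(x, y) for x, y in zip(parts, parts[1:])]
    return sum(values) / len(values)


print(tp("ba", "bu"))     # 0.5
print(tp("bu", "pu"))     # 0.25
print(word_tp("babupu"))  # 0.375
```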

What’s the point? Transitional probability was manipulated so that: The transitional probability was high within a word, but low across a word boundary. This is what a word IS in real life.

ba bu pu | bu pa da | du ta ba
Within a word (ba→bu, bu→pu, bu→pa, pa→da, du→ta, ta→ba): High Transitional Probability
Across a word boundary (pu→bu, da→du): Low Transitional Probability

300 tokens of each of the six words were randomly concatenated. All word boundaries were removed. This left 4536 continuous syllables, which were read by a speech synthesizer. The synthesizer produced a monotone stream of syllables at a rate of 216 syllables per minute.
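A small simulation can show that a randomly concatenated stream has higher transitional probabilities inside words than across word boundaries. This is a sketch under stated assumptions, not the original stimulus procedure: the number of tokens per word (50) and the random seed are arbitrary choices, the names `build_stream` and `mean_tp_by_position` are hypothetical, and probabilities are estimated from the simulated stream itself.

```python
import random
from collections import Counter

WORDS = ["babupu", "bupada", "dutaba", "patubi", "pidabu", "tutibu"]


def syllables(word):
    """Split a word into its two-letter syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def build_stream(tokens_per_word, seed):
    """Shuffle word tokens and flatten them into one syllable stream.

    Also returns, for each transition between adjacent syllables,
    whether that transition crosses a word boundary.
    """
    rng = random.Random(seed)
    tokens = WORDS * tokens_per_word
    rng.shuffle(tokens)
    stream, crosses_boundary = [], []
    for word in tokens:
        stream.extend(syllables(word))
        # Two within-word transitions, then one across the boundary.
        crosses_boundary.extend([False, False, True])
    # The last syllable has no successor, so drop its flag.
    return stream, crosses_boundary[:-1]


def mean_tp_by_position(stream, crosses_boundary):
    """Mean transitional probability within words and across boundaries."""
    first_counts = Counter(stream[:-1])
    pair_counts = Counter(zip(stream, stream[1:]))
    within, across = [], []
    for (x, y), boundary in zip(zip(stream, stream[1:]), crosses_boundary):
        value = pair_counts[(x, y)] / first_counts[x]
        (across if boundary else within).append(value)
    return sum(within) / len(within), sum(across) / len(across)


# Assumed parameters: 50 tokens per word, seed 0.
stream, crosses_boundary = build_stream(tokens_per_word=50, seed=0)
mean_within, mean_across = mean_tp_by_position(stream, crosses_boundary)
print(mean_within > mean_across)
```

A learner who places boundaries where the transitional probability dips would therefore tend to recover the six words.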

Procedures Subjects were 24 undergraduate students. Subjects were told to listen to a ‘nonsense’ language. Their task was to figure out where words begin and end. After 3 blocks of 7 minutes of exposure to the language, subjects were tested.

Test Procedure Subjects heard two tri-syllabic strings, one a real word from the language (e.g., bu-pa-da) and one not a real word. Which sounds more like a word from this nonsense language? 36 trials in the test.

Results The mean number correct across all subjects was 27.2 out of 36, where chance is 18. A t-test shows this to be significantly different from chance. Conclusion: adults are able to recognize what is a word and what is not a word based purely on statistical frequency.
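The chance level follows from the test design: with 36 two-alternative trials, guessing yields 18 correct on average. The sketch below works through that arithmetic. The variable names are illustrative, and only the figures stated on the slides (36 trials, two choices per trial, a mean of 27.2) are used.

```python
n_trials = 36        # trials in the test
p_guess = 1 / 2      # two strings per trial, so a guess is right half the time
mean_correct = 27.2  # reported mean number correct

chance = n_trials * p_guess                 # expected score from guessing
proportion_correct = mean_correct / n_trials

print(chance)                        # 18.0
print(round(proportion_correct, 3))  # 0.756
```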

Additional finding: The three words with the most common syllables in them were easiest to recognize. The three words with the least common syllables in them were hardest to recognize.

But can kids do this too? The answer appears to be yes. Saffran et al. (1996) used essentially the same kind of stimuli with 8-month-old children. They used four words instead of six. Children were exposed for only 2 minutes (not 21 minutes).

Child Methodology Head-turning procedure [Figure: testing setup with speakers and a light]

Results Children looked significantly longer at the speaker from which novel words were being produced. Why is this? Why wouldn’t they look longer at the speaker from which familiar words are being produced?

Bottom Line Children have the ability to track transitional probabilities of sounds on the basis of very little exposure. This is therefore how words are parsed.

Tool against Nativism…? This has recently been the most prominent weapon against the idea that children use innate knowledge to acquire language. If children are using such sophisticated skills to segment words, why can’t they use similar (non-linguistic) skills to learn syntax?

But it isn’t so simple Marcus et al. (1999) trained children on sentences of the following sort:
la – ta – la
ga – na – ga
da – ba – da
Structure: x – y – x

And tested them on:
wo – fe – wo
gi – tu – gi
po – zi – po
Namely, words with:
-new syllables, but
-the same structure (x-y-x)
And…
wo – fe – fe
gi – tu – tu
po – zi – zi
Namely, words with:
-new syllables, and
-new structure (x-y-y)

Results Children appear to recognize the difference between these sets of stimuli → Children are therefore tracking structure and not just simple statistics.
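The structure that separates the two test sets can be made explicit by relabeling each syllable according to the order in which it first appears. This is a minimal sketch of the x-y-x versus x-y-y distinction only. The function name `structure` is hypothetical, and the sketch is not a model of how children do this.

```python
def structure(sentence):
    """Relabel syllables by order of first appearance.

    The first new syllable becomes "x", the second "y", the third "z",
    so sentences with different syllables but the same repetition
    pattern get the same label. Handles up to three distinct syllables.
    """
    labels = {}
    pattern = []
    for syllable in sentence:
        if syllable not in labels:
            labels[syllable] = "xyz"[len(labels)]
        pattern.append(labels[syllable])
    return "-".join(pattern)


print(structure(["la", "ta", "la"]))  # x-y-x (training item)
print(structure(["wo", "fe", "wo"]))  # x-y-x (new syllables, same structure)
print(structure(["wo", "fe", "fe"]))  # x-y-y (new syllables, new structure)
```

None of the test syllables occur in training, so syllable-to-syllable statistics from the training items cannot tell the two test sets apart. Only the repetition pattern does.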

Questions to ask yourself:
Why would statistical tracking be useful to linguists? → As a tool to explain language acquisition.
Does statistical tracking explain how children acquire language? → No, only certain aspects of it.
What aspects of language can we track? → So far, it appears only phonologically related things can be tracked like this (not meaning-related things).

Most Important Questions
Is this useful for ALL languages on Earth? → It appears that statistical tracking is only useful for auditory stimuli, not visual… ASL?
Are humans the only creatures that can do this? (I hope so, otherwise other animals should have language too…) → No. Vervet and tamarin monkeys have been shown to have essentially the same abilities that humans do.

So what do we really know? Kids have spectacular abilities to track statistics. But so do adults (so why can’t adults learn languages as well as kids?) But so do monkeys (so why can’t monkeys learn language as well as humans?) This ability appears to be limited to statistics in auditory perception.