Psych 156A/ Ling 150: Psychology of Language Learning Lecture 9 Words in Fluent Speech II.



Announcements
Homework 3 due today
Homework 2 returned (Avg: 21.6 out of 27)
Quiz 3 returned (Avg: 8.6 out of 10)
Comments about how to do well in this class

Computational Problem
Divide fluent speech into words.
húwzəfréjdəvðəbɪ́gbǽdwə́lf
húwz əfréjd əv ðə bɪ́g bǽd wə́lf
who’s afraid of the big bad wolf

Saffran, Aslin, & Newport (1996)
Experimental evidence suggests that 8-month-old infants can track statistical information such as the transitional probability between syllables. This can help them solve the task of word segmentation. The evidence comes from testing infants in an artificial language paradigm with a very short exposure time.
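As a rough illustration of what "tracking transitional probability" could mean computationally, here is a minimal sketch. It assumes the usual definition, TP(A → B) = frequency of the pair AB divided by the frequency of A, and it uses a made-up toy stream built from two hypothetical words, "pabiku" and "tibudo". The function name is an illustrative choice, not something from the study.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Return {(a, b): TP(a -> b)}, with TP = count(a followed by b) / count(a).

    count(a) only counts occurrences of a that have a following syllable,
    so the last syllable of the stream is left out of the denominator.
    """
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# Toy stream (hypothetical words): pabiku tibudo pabiku pabiku tibudo
stream = "pa bi ku ti bu do pa bi ku pa bi ku ti bu do".split()
tps = transitional_probabilities(stream)

print(tps[("pa", "bi")])  # inside a word
print(tps[("ku", "ti")])  # across a word boundary
print(tps[("ku", "pa")])  # across a word boundary
```

In this toy stream the within-word pair gets a higher transitional probability than either pair that spans a word boundary, which is the kind of contrast a learner could exploit.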

Computational Modeling Data (Digital Children)

How good is transitional probability on real data?
Gambell & Yang (2006): Computational model goal
Real data, psychologically plausible learning algorithm.
Realistic data is important because the experimental study of Saffran, Aslin, & Newport (1996) used artificial language data.
A psychologically plausible learning algorithm is important because we want to make sure that whatever strategy the model uses is something a child could use, too. (Transitional probability probably qualifies, since Saffran, Aslin, & Newport (1996) showed that infants can track this kind of information in the artificial language.)

How do we measure word segmentation performance?
Perfect word segmentation:
identify all the words in the speech stream (recall)
only identify syllable groups that are actually words (precision)
Example: ðəbɪ́gbǽdwə́lf segmented as “the big bad wolf”
Recall calculation: Should have identified 4 words: the, big, bad, wolf. Identified 4 real words: the, big, bad, wolf. Recall Score: 4/4 = 1.0
Precision calculation: Identified 4 words: the, big, bad, wolf. Identified 4 real words: the, big, bad, wolf. Precision Score: 4/4 = 1.0

How do we measure word segmentation performance?
Example with an error: ðəbɪ́gbǽdwə́lf segmented as “thebig bad wolf”
Recall calculation: Should have identified 4 words: the, big, bad, wolf. Identified 2 real words: bad, wolf. Recall Score: 2/4 = 0.5
Precision calculation: Identified 3 words: thebig, bad, wolf. Identified 2 real words: bad, wolf. Precision Score: 2/3 = 0.666…

How do we measure word segmentation performance?
We want good scores on both of these measures: recall (identify all the words in the speech stream) and precision (only identify syllable groups that are actually words).
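The two scores can be sketched in a few lines of code. This is only one way to set it up: it assumes a posited word counts as correct when it covers exactly the same stretch of the utterance as a true word, and the helper names are illustrative.

```python
def spans(words):
    """Map a segmentation to the set of (start, end) character spans it covers."""
    result, start = set(), 0
    for word in words:
        result.add((start, start + len(word)))
        start += len(word)
    return result

def precision_recall(posited, gold):
    """Precision = correct / posited words; recall = correct / true words."""
    correct = len(spans(posited) & spans(gold))
    return correct / len(posited), correct / len(gold)

gold = ["the", "big", "bad", "wolf"]

perfect = precision_recall(["the", "big", "bad", "wolf"], gold)
with_error = precision_recall(["thebig", "bad", "wolf"], gold)

print(perfect)     # precision and recall for the perfect segmentation
print(with_error)  # precision and recall for "thebig bad wolf"
```

For "thebig bad wolf" this gives precision 2/3 and recall 2/4, matching the hand calculation: only "bad" and "wolf" line up with true words.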

Where does the realistic data come from?
CHILDES: Child Language Data Exchange System
Large collection of child-directed speech data transcribed by researchers. Used to see what children’s input is actually like.

Where does the realistic data come from?
Gambell & Yang (2006) looked at Brown corpus files in CHILDES (226,178 words made up of 263,660 syllables) and converted the transcriptions to pronunciations using the CMU Pronouncing Dictionary.

Where does the realistic data come from?
Converting transcriptions to pronunciations:
the big bad wolf
ðə bɪ́g bǽd wə́lf
DH AH0. B IH1 G. B AE1 D. W UH1 L F.

Segmenting Realistic Data
Gambell and Yang (2006) tested whether a model learning from transitional probabilities between syllables could correctly segment words from realistic data.
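In CMU Pronouncing Dictionary notation, each vowel carries a stress digit: 0 for no stress, 1 for primary stress, 2 for secondary stress. The sketch below reads the pronunciation string from the slide and reports which syllables carry primary stress. It assumes that "." marks a syllable boundary, as on the slide, and the function names are illustrative.

```python
def parse_pronunciation(line):
    """Split a string like 'DH AH0. B IH1 G.' into syllables, treating '.' as the delimiter."""
    return [syllable.strip() for syllable in line.split(".") if syllable.strip()]

def has_primary_stress(syllable):
    """True if any phone in the syllable ends in the primary-stress digit 1."""
    return any(phone.endswith("1") for phone in syllable.split())

line = "DH AH0. B IH1 G. B AE1 D. W UH1 L F."
syllables = parse_pronunciation(line)
stress = [has_primary_stress(s) for s in syllables]

for syllable, stressed in zip(syllables, stress):
    print(syllable, "stressed" if stressed else "unstressed")
```

Here "DH AH0" (the) comes out unstressed, while the syllables for big, bad, and wolf each carry primary stress.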

Modeling Results for Transitional Probability
A learner relying only on transitional probability does not reliably segment words in child-directed English. About 60% of the words posited by the transitional probability learner are not actually words (41.6% precision), and almost 80% of the actual words are not extracted (23.3% recall).
Precision: 41.6%
Recall: 23.3%

Why such poor performance?
“We were surprised by the low level of performance. Upon close examination of the learning data, however, it is not difficult to understand the reason… a sequence of monosyllabic words requires a word boundary after each syllable; a [transitional probability] learner, on the other hand, will only place a word boundary between two sequences of syllables for which the [transitional probabilities] within [those sequences] are higher than [those surrounding the sequences]...” - Gambell & Yang (2006)
Example: ðə bɪ́g bǽd wə́lf
TrProb1 (ðə, bɪ́g) = 0.6, TrProb2 (bɪ́g, bǽd) = 0.3, TrProb3 (bǽd, wə́lf) = 0.7
0.6 > 0.3 and 0.3 < 0.7, so the learner posits one word boundary, at the minimum TrProb (between bɪ́g and bǽd), but nowhere else.
Resulting segmentation: ðəbɪ́g bǽdwə́lf
Precision for this sequence: 0 words correct out of 2 posited
Recall: 0 words correct out of 4 that should have been posited
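The failure on this example can be reproduced with a small sketch of a local-minimum learner. It assumes the example values 0.6, 0.3, 0.7 for the three transitional probabilities and uses ordinary spellings for the syllables. How to treat the first and last position is not stated on the slides, so this sketch assumes a missing neighbor counts as infinitely high.

```python
def segment_at_local_minima(syllables, tps):
    """Place a word boundary wherever a TP is lower than both neighboring TPs.

    tps[i] is the transitional probability between syllables[i] and syllables[i + 1].
    Assumption: a missing neighbor (at either edge) is treated as infinitely high.
    """
    words, current = [], [syllables[0]]
    for i, tp in enumerate(tps):
        left = tps[i - 1] if i > 0 else float("inf")
        right = tps[i + 1] if i + 1 < len(tps) else float("inf")
        if tp < left and tp < right:
            words.append("".join(current))  # close the current word at the local minimum
            current = []
        current.append(syllables[i + 1])
    words.append("".join(current))
    return words

gold = ["the", "big", "bad", "wolf"]
posited = segment_at_local_minima(gold, [0.6, 0.3, 0.7])
correct = sum(word in gold for word in posited)

print(posited)
print("precision:", correct / len(posited), "recall:", correct / len(gold))
```

The only local minimum is the 0.3 between "big" and "bad", so the learner posits "thebig" and "badwolf" and gets no word right. In a run of monosyllabic words, two neighboring boundaries can never both be local minima, which is why this strategy struggles.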

Why such poor performance? “More specifically, a monosyllabic word is followed by another monosyllabic word 85% of the time. As long as this is the case, [a transitional probability learner] cannot work.” - Gambell & Yang (2006)

Additional Learning Bias
Gambell & Yang (2006) idea: Children are sensitive to properties of their native language, like stress patterns, very early on. Maybe they can use those sensitivities to help them solve the word segmentation problem.
Unique Stress Constraint (USC): A word can bear at most one primary stress.
ðə bɪ́g bǽd wə́lf
ðə has no stress; bɪ́g, bǽd, and wə́lf each have stress. Learner gains knowledge: bɪ́g, bǽd, and wə́lf must be separate words.
húw zə fréjd əv ðə bɪ́g bǽd wə́lf
The learner gets the boundaries between bɪ́g, bǽd, and wə́lf because stressed (strong) syllables are next to each other.
The USC can be used in tandem with transitional probabilities when there are weak (unstressed) syllables between stressed syllables. Where a weak syllable sits between two stressed syllables, the USC only says that there is a word boundary at one of the two positions (??) around it.
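What the Unique Stress Constraint tells the learner can be sketched as follows. The code takes only a stress pattern and returns the boundaries that are certain (two stressed syllables in a row) and the stretches where at least one boundary must fall somewhere (weak syllables between two stressed ones). The function name and the output format are illustrative assumptions, not part of Gambell & Yang's model description.

```python
def usc_boundaries(stressed):
    """Apply the Unique Stress Constraint to a list of per-syllable stress flags.

    Position i means "between syllable i and syllable i + 1".
    Returns (certain, ambiguous):
      certain   - positions that must be word boundaries (adjacent stressed syllables)
      ambiguous - lists of candidate positions, at least one of which is a boundary
    """
    certain, ambiguous = [], []
    strong = [i for i, flag in enumerate(stressed) if flag]
    for a, b in zip(strong, strong[1:]):
        if b == a + 1:
            certain.append(a)
        else:
            ambiguous.append(list(range(a, b)))
    return certain, ambiguous

# Stress pattern for the eight syllables of "who's afraid of the big bad wolf":
# strong weak strong weak weak strong strong strong
pattern = [True, False, True, False, False, True, True, True]
certain, ambiguous = usc_boundaries(pattern)

print(certain)    # boundaries between big|bad and bad|wolf
print(ambiguous)  # candidate positions around the weak syllables
```

The ambiguous stretches are where a second cue, such as transitional probability, would be needed to pick among the candidate positions.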

USC + Transitional Probabilities
A learner who relies on transitional probability but also has knowledge of the Unique Stress Constraint does a much better job at segmenting words in child-directed English. Only about 25% of the words posited by this learner are not actually words (73.5% precision), and about 30% of the actual words are not extracted (71.2% recall).
Precision: 73.5%
Recall: 71.2%

Another Strategy: Algebraic Learning (Gambell & Yang (2003))
Subtraction process of figuring out unknown words.
“Look, honey - it’s a big goblin!”
bɪ́ggáblɪn
bɪ́g = big (familiar word)
bɪ́ggáblɪn - bɪ́g = gáblɪn (new word)

Evidence of Algebraic Learning in Children
“Behave yourself!” “I was have!” (be-have = be + have)
“Was there an adult there?” “No, there were two dults.” (a-dult = a + dult)
“Did she have the hiccups?” “Yeah, she was hiccing-up.” (hicc-up = hicc + up)

Using Algebraic Learning + USC
“Many can come…”
ma (StrongSyl) ny (WeakSyl1) can (WeakSyl2) come (StrongSyl)
Familiar word: “many” (ma ny)
Familiar word: “come”
The remaining syllable, can, must be a word: add it to memory.
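The subtraction idea can be sketched with a small lexicon of familiar words. This is a simplified illustration: words are stored as tuples of syllables, familiar words are matched left to right (longest first), and whatever is left between them is posited as a new word. It does not check the Unique Stress Constraint on the leftover material, and the function name is hypothetical.

```python
def subtract_known_words(syllables, lexicon):
    """Segment an utterance by peeling off familiar words; leftovers become new words.

    lexicon is a set of words, each a tuple of syllables. It is updated in place.
    Returns (words, new_words).
    """
    words, leftover, i = [], [], 0
    while i < len(syllables):
        # Try the longest familiar words first.
        match = next((w for w in sorted(lexicon, key=len, reverse=True)
                      if tuple(syllables[i:i + len(w)]) == w), None)
        if match:
            if leftover:  # unknown material before a familiar word is a new word
                words.append(tuple(leftover))
                leftover = []
            words.append(match)
            i += len(match)
        else:
            leftover.append(syllables[i])
            i += 1
    if leftover:
        words.append(tuple(leftover))
    new_words = [w for w in words if w not in lexicon]
    lexicon.update(new_words)  # add newly posited words to memory
    return words, new_words

lexicon = {("ma", "ny"), ("come",)}
words, new_words = subtract_known_words(["ma", "ny", "can", "come"], lexicon)

print(words)      # many, can, come
print(new_words)  # the newly learned word
```

After the call, "can" is in the lexicon, so it counts as a familiar word for later utterances. The same subtraction would pull "goblin" out of "big goblin" once "big" is known.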

Algebraic Learning + USC
A learner who relies on algebraic learning and also has knowledge of the Unique Stress Constraint does a really great job at segmenting words in child-directed English. Only about 5% of the words posited by the algebraic learner are not actually words (95.9% precision), and about 7% of the actual words are not extracted (93.4% recall).
Precision: 95.9%
Recall: 93.4%

Gambell & Yang (2006) Summary
Learning from transitional probabilities alone doesn’t work so well on realistic data.
Models of children who have additional knowledge about the stress patterns of words in their language have a much better chance of succeeding at word segmentation if they learn via transitional probabilities.
However, models of children who use algebraic learning and also have additional knowledge about language-specific stress patterns perform even better at word segmentation.

Questions?