
1 The origin of syntactic bootstrapping: A computational model Michael Connor, Cynthia Fisher, & Dan Roth University of Illinois at Urbana-Champaign

2 How do children begin to understand sentences? Topid rivvo den marplox.

3 Two problems of ambiguity. "The language": Topid rivvo den marplox. "The world": eating(sheep, food), feed(Johanna, sheep), help(Daddy, Johanna), scared(Johanna, sheep), are-nice(sheep), want(sheep, food), getting-away(brother).

4 "the sentence" "the world" Typical view: knowledge of meaning drives knowledge of syntax Topid rivvo den marplox. feed (Johanna, sheep)

5 "the language" "the world" Typical view: knowledge of meaning drives knowledge of grammar Topid rivvo den marplox. feed (Johanna, sheep) With or without assumed innate constraints on syntax- semantics linking (e.g. Pinker, 1984, 1989; Tomasello, 2003) Sentence-meaning pairs

6 The problem of sentence and verb meanings. Predicate terms don't label events, but adopt different perspectives on them (e.g., Clark, 1990; Fillmore, 1977; Gillette, Gleitman, Gleitman, & Lederer, 1999; Rispoli, 1989; Slobin, 1985). "I'll put this here." versus "This goes here."

7 "the sentence" "the world" If we take seriously the ambiguity of scenes... Topid rivvo den marplox. eating (sheep, food) feed (Johanna, sheep) help (Daddy, Johanna) scared (Johanna, sheep) are-nice (sheep) want (sheep, food) getting away (brother)

8 Syntactic bootstrapping (Gleitman, 1990; Gleitman et al., 2005; Landau & Gleitman, 1985; Naigles, 1990): "I'll put this here." versus "This goes here." Sentence structure serves as evidence about verbs' semantic predicate-argument structure.

9 How could this be? How could aspects of syntactic structure guide early sentence interpretation... before the child has learned much about the syntax and morphology of this language?

10 Three main claims: (1) Structure-mapping: Syntactic bootstrapping begins with a bias toward one-to-one mapping between nouns in sentences and semantic arguments of predicate terms. (Fisher et al., 1994; Fisher et al., 2006; Gillette et al., 1999; Yuan, Fisher & Snedeker, in press)

11 Three main claims: (1) Structure-mapping: Syntactic bootstrapping begins with a bias toward one-to-one mapping between nouns in sentences and semantic arguments of predicate terms. (Fisher et al., 1994; Fisher et al., 2006; Gillette et al., 1999; Yuan, Fisher & Snedeker, in press) (2) Early abstraction: Children are biased toward abstract representations of language experience. (Gertner, Fisher & Eisengart, 2006; Thothathiri & Snedeker, 2008) (3) Independent encoding of sentence structure: Children gather distributional facts about verbs from listening experience. (Arunachalam & Waxman, 2010; Scott & Fisher, 2009; Yuan & Fisher, 2009)

12 A fourth prediction: (4) 'Real' syntactic bootstrapping: Children can learn which words are verbs by tracking their syntactic argument-taking behavior in sentences.

13 A computational model of syntactic bootstrapping. Computational experiments simulate a learner whose knowledge of sentence structure is under our control. We equip the model with an unlearned bias to map nouns onto abstract semantic roles, and ask: are partial representations based on sets of nouns useful for learning more about the native language? First case study: learning English word order. Could a learner identify verbs as verbs by noting their tendency to occur with particular numbers of arguments?

14 Computational Model of Semantic Role Labeling (SRL). A semantic analysis of sentences at the level of who does what to whom. For each verb in the sentence, the SRL system tries to identify all constituents that fill a semantic role, and to assign them roles (agent, patient, goal, etc.).
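
To make the target representation concrete, here is a minimal sketch of the kind of output an SRL system produces for one sentence: for each verb, a set of constituents labeled with abstract roles. The data structure and example values are illustrative, not the authors' implementation; the A0/A1 labels follow PropBank.

```python
# Illustrative SRL output for "I like dark bread."
srl_analysis = {
    "sentence": ["I", "like", "dark", "bread", "."],
    "predicates": [
        {
            "verb": "like",
            "arguments": [
                {"constituent": ["I"], "role": "A0"},              # agent-like
                {"constituent": ["dark", "bread"], "role": "A1"},  # patient-like
            ],
        }
    ],
}

for pred in srl_analysis["predicates"]:
    for arg in pred["arguments"]:
        print(" ".join(arg["constituent"]), "->", arg["role"], "of", pred["verb"])
```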

15 Semantic Role Labeling (SRL): The basic idea. (Pipeline diagram for "I like dark bread.": (1) Parsing; (2) Argument Identification; (3) Predicate Identification; (4) Feature Extraction (e.g., arg = I, NP under S, verb = like; arg = bread, NP under VP, verb = like); (5) Argument Classification into roles A0, A1, ..., trained with external labeled feedback: I is an agent (A0), bread is a patient (A1).)

16 The nature of the semantic roles. A key assumption of the SRL is that the semantic roles are abstract: this is a property of the PropBank annotation scheme, in which verb-specific roles are grouped into macro-roles (Palmer, Gildea, & Kingsbury, 2005; cf. Dowty, 1991). Give: Arg0 = Giver, Arg1 = Thing given, Arg2 = Recipient. Like: Arg0 = Liker, Arg1 = Object of affection. Have: Arg0 = Owner, Arg1 = Possession.

17 The nature of the semantic roles. A key assumption of the SRL is that the semantic roles are abstract: this is a result of the PropBank annotation scheme, in which verb-specific roles are grouped into macro-roles (e.g., Dowty, 1991). Like: Arg0 = Liker, Arg1 = Object of affection. Give: Arg0 = Giver, Arg1 = Thing given, Arg2 = Recipient. Have: Arg0 = Owner, Arg1 = Possession. The Arg0 roles (Liker, Giver, Owner) are grouped together as Agent.

18 The nature of the semantic roles. A key assumption of the SRL is that the semantic roles are abstract: this is a result of the PropBank annotation scheme, in which verb-specific roles are grouped into macro-roles (e.g., Dowty, 1991). Give: Arg0 = Giver, Arg1 = Thing given, Arg2 = Recipient. Have: Arg0 = Owner, Arg1 = Possession. Like: Arg0 = Liker, Arg1 = Object of affection. The Arg1 roles (Thing given, Possession, Object of affection) are grouped together as Patient/Theme. So, like children, an SRL learns to identify agents and patients in sentences, not givers and things-given.
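
A small sketch of the abstraction these slides describe: verb-specific PropBank roles collapse onto shared macro-roles, so the learner's classifier only ever sees labels like A0 (agent-like) and A1 (patient-like). The role lists come from the slide; the code organization is illustrative.

```python
# Verb-specific role labels (from the slide), keyed by PropBank argument index.
verb_roles = {
    "give": {0: "Giver", 1: "Thing given", 2: "Recipient"},
    "like": {0: "Liker", 1: "Object of affection"},
    "have": {0: "Owner", 1: "Possession"},
}

# PropBank groups the verb-specific roles into abstract macro-roles.
MACRO_ROLE = {0: "A0 (agent-like)", 1: "A1 (patient/theme-like)"}

for verb, roles in verb_roles.items():
    for i, specific in roles.items():
        macro = MACRO_ROLE.get(i, "A" + str(i))
        # The learner is trained on the abstract label, not the verb-specific one.
        print(f"{verb}: Arg{i} = {specific:<19} -> learner sees {macro}")
```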

19 This SRL knows the grammar... and reads minds. (Same pipeline diagram for "I like dark bread.": the model is given a full parse for argument and predicate identification, plus external labeled feedback telling it that I is an agent (A0) and bread is a patient (A1).)

20 This SRL knows the grammar... and reads minds. But what are they talking about? Topid rivvo den marplox. (Same pipeline diagram, with the parse, the verb, and the role feedback all unknown.)

21 Baby SRL. (Pipeline diagram for "I like dark bread.": (1) an HMM trained on child-directed speech tags each word with a hidden state (58 = I, 78 = like, 50 = dark, 48 = bread; state 58 also contains you, ya, Kent; state 78: got, did, had; state 50: many, nice, good; state 48: more, juice, milk); (2) Argument Identification via seed nouns marks the states of I and bread as nouns; (3) Predicate Identification uses per-state argument-count statistics (state 78 occurs with one noun 13% and with two nouns 54% of the time; state 50: 41% and 31%), so like is chosen as the verb; (4) Feature Extraction: arg = I, 1st of two nouns, before verb; arg = bread, 2nd of two nouns, after verb; (5) Argument Classification into A0, A1, ..., trained with external labeled feedback: I is an agent (A0), bread is a patient (A1).)

22 Baby SRL. (Same pipeline diagram as the previous slide, but the external labeled feedback is replaced by internal animacy feedback: I is animate, so A0; bread's animacy is unknown; since the agent is I, bread is not A0.)

23 Unsupervised Part-of-Speech Clustering using a Hidden Markov Model (HMM). Yields a context-sensitive clustering of words based on sequential dependencies (cf. Bannard et al., 2004; Chang et al., 2006; Mintz, 2003; Solan et al., 2005). An a priori split between content and function words is based on phonological differences (e.g., Shi et al., 1998; Connor et al., 2010): 80 hidden states, 30 pre-allocated to function words. Trained on ~1 million words of unlabeled text from CHILDES. (Diagram: example states for "I like dark bread": state 58 contains I, you, ya, Kent; state 78: like, got, did, had; state 50: many, nice, good; state 48: more, juice, milk.)
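
A minimal sketch of how the a priori content/function split might be wired into such an HMM before training: 80 hidden states, 30 reserved for function words, with emissions masked so that reserved states can emit only function-word tokens. The vocabulary, the function-word list (standing in for the phonological test), and the parameter initialization are placeholders, and the EM (Baum-Welch) updates that would follow on the unlabeled corpus are omitted.

```python
import numpy as np

N_STATES, N_FUNCTION_STATES = 80, 30   # from the slide: 30 of 80 states pre-allocated

# Hypothetical vocabulary; in the model this comes from ~1M words of CHILDES text.
vocab = ["i", "like", "dark", "bread", "you", "the", "a", "do", "more", "juice"]
FUNCTION_WORDS = {"the", "a", "do"}    # placeholder for the phonology-based split

def init_emission_matrix(rng=np.random.default_rng(0)):
    """Random emissions, masked so function-word states emit only function words
    and content-word states emit only content words, then renormalized per state."""
    B = rng.random((N_STATES, len(vocab)))
    for j, w in enumerate(vocab):
        word_is_function = w in FUNCTION_WORDS
        for s in range(N_STATES):
            state_is_function = s < N_FUNCTION_STATES
            if state_is_function != word_is_function:
                B[s, j] = 0.0
    return B / B.sum(axis=1, keepdims=True)

B = init_emission_matrix()
A = np.full((N_STATES, N_STATES), 1.0 / N_STATES)   # uniform initial transitions
pi = np.full(N_STATES, 1.0 / N_STATES)               # uniform initial state priors
# Standard Baum-Welch training on the unlabeled corpus would start from (pi, A, B).
```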

24 Linking rules. But these clusters are unlabeled. How do we link unlabeled clusters with particular grammatical categories, and thus with particular roles in sentence interpretation (SRL)? We need nouns, which are candidate arguments, and verbs, which take arguments.

25 Linking rules: Nouns first. Syntactic bootstrapping is grounded in noun learning (e.g., Gillette et al., 1999): children learn the meanings of some nouns without syntactic knowledge. Each noun is assumed to be a candidate argument, a simple form of semantic bootstrapping (Pinker, 1984). (Diagram: a small set of seed nouns, 10 to 75, sampled from the M-CDI, feeds Argument Identification; in "I like dark bread", the states of I and bread are marked as nouns.)
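
A sketch of one plausible reading of this nouns-first linking rule as the diagram shows it: an HMM state counts as a noun state if it contains a known seed noun, and every word the HMM assigns to a noun state is then a candidate argument. The seed list, state IDs, and state contents below are invented for illustration.

```python
# Hypothetical HMM output: each word in a sentence tagged with its hidden state.
tagged_sentence = [("i", 58), ("like", 78), ("dark", 50), ("bread", 48)]

# Seed nouns the learner is assumed to know (10-75 sampled from the M-CDI in the model).
seed_nouns = {"i", "bread", "juice", "mommy", "daddy"}

# Hypothetical contents of each HMM state, accumulated over the training corpus.
state_members = {58: {"i", "you", "ya", "kent"},
                 78: {"like", "got", "did", "had"},
                 50: {"many", "nice", "good"},
                 48: {"more", "juice", "milk", "bread"}}

# A state is treated as a noun state if it contains at least one seed noun.
noun_states = {s for s, words in state_members.items() if words & seed_nouns}

candidate_arguments = [w for w, s in tagged_sentence if s in noun_states]
print(candidate_arguments)   # ['i', 'bread']
```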

26 Linking rules: What about verbs? Verbs take noun arguments. Via syntactic bootstrapping, children identify a word as a verb by tracking its argument-taking behavior in sentences. (Diagram: Predicate Identification follows HMM tagging and Argument Identification in the pipeline.)

27 Linking rules: What about verbs? Collect statistics over the HMM training corpus: how often does each state occur with a particular number of arguments? For each SRL input sentence, choose as the predicate the word whose state is most likely to appear with the number of arguments found in the current sentence. (Diagram: state 78 occurs with one noun 13% and with two nouns 54% of the time; state 50: 41% and 31%; so in "I like dark bread", with two identified nouns, like is chosen as the verb.)
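
A minimal sketch of this predicate-identification step: during training, count how often each HMM state occurs in sentences with a given number of identified arguments; at test time, pick as the verb the non-argument word whose state most strongly predicts the argument count observed in the current sentence. The counts and state IDs below are illustrative, echoing the numbers on the slide.

```python
from collections import defaultdict

# state -> {number_of_arguments_in_sentence: count}, accumulated over the corpus.
arg_count_stats = defaultdict(lambda: defaultdict(int))

def update_stats(tagged_sentence, argument_words):
    """Record, for each word's state, how many arguments its sentence contained."""
    n_args = len(argument_words)
    for _, state in tagged_sentence:
        arg_count_stats[state][n_args] += 1

def choose_predicate(tagged_sentence, argument_words):
    """Return the non-argument word whose state is most likely to occur
    with the number of arguments found in this sentence."""
    n_args = len(argument_words)
    best_word, best_prob = None, -1.0
    for word, state in tagged_sentence:
        if word in argument_words:
            continue
        counts = arg_count_stats[state]
        total = sum(counts.values())
        prob = counts[n_args] / total if total else 0.0
        if prob > best_prob:
            best_word, best_prob = word, prob
    return best_word

# Illustrative numbers echoing the slide: state 78 occurs with two nouns 54% of the
# time, state 50 only 31%, so "like" is chosen as the verb of "I like dark bread."
arg_count_stats[78].update({0: 33, 1: 13, 2: 54})
arg_count_stats[50].update({0: 28, 1: 41, 2: 31})
print(choose_predicate([("i", 58), ("like", 78), ("dark", 50), ("bread", 48)],
                       {"i", "bread"}))   # like
```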

28 Baby SRL. This is a partial sentence representation. (Diagram: the full pipeline for "I like dark bread", ending in features such as arg = I, 1st of two nouns, before verb; arg = bread, 2nd of two nouns, after verb, which feed Argument Classification.)

29 Baby SRL. This is a partial sentence representation. (Same diagram, now with external labeled feedback for Argument Classification: I is an agent (A0), bread is a patient (A1).)

30 Semantic-role feedback??? Theories of language acquisition assume that learning to understand sentences is a partially supervised task: use existing knowledge of words and syntax to assign a meaning to a sentence; the appropriateness of the meaning in context then provides the feedback. [Johanna rivvo den sheep.]

31 Baby SRL with animacy-based feedback. Bootstrapped animacy feedback (from a list of concrete nouns): (1) animates are agents, inanimates are non-agents; (2) each verb has at most one agent; (3) if there are two known animate nouns, use the current SRL classifier to choose the best agent (and use this decision as feedback to train the SRL). Note: animacy feedback trains a simplified classifier, with only an agent/non-agent role distinction. (Diagram: internal animacy feedback for "I like dark bread": I is animate, so A0; bread is unknown; since the agent is I, bread is not A0.)
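
A sketch of the three animacy-feedback rules as stated on this slide. The animate-noun list and the stand-in scoring function are placeholders; in the model, rule 3 consults the current SRL classifier itself.

```python
ANIMATE_NOUNS = {"i", "you", "mommy", "daddy", "adam", "johanna"}   # placeholder list

def animacy_feedback(arguments, classifier_agent_score=None):
    """Return {argument: 'agent' | 'non-agent'} feedback for one sentence:
    animates are agents, inanimates are non-agents, at most one agent per verb."""
    animate = [a for a in arguments if a in ANIMATE_NOUNS]
    feedback = {a: "non-agent" for a in arguments if a not in ANIMATE_NOUNS}

    if len(animate) <= 1:
        for a in animate:
            feedback[a] = "agent"
    else:
        # Rule 3: with two known animates, the current SRL classifier picks the
        # best agent; here a stand-in scoring function plays that role.
        score = classifier_agent_score or (lambda a: 0.0)
        best = max(animate, key=score)
        for a in animate:
            feedback[a] = "agent" if a == best else "non-agent"
    return feedback

# "Adam krads Daddy": two animates, so the classifier's preference decides.
print(animacy_feedback(["adam", "daddy"],
                       classifier_agent_score=lambda a: 1.0 if a == "adam" else 0.5))
```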

32 Baby SRL. (Full pipeline diagram for "I like dark bread" with internal animacy feedback: the classifier now distinguishes only A0 from not-A0; I is animate, so A0; bread is unknown; since the agent is I, bread is not A0.)

33 Training the Baby SRL. Three corpora of child-directed speech: parental utterances annotated using the PropBank annotation scheme, from speech to Adam, Eve, and Sarah (Brown, 1973). Adam 01-23 (2;3 - 3;2), trained on 01-20: 3951 propositions, 8107 arguments. Eve 01-20 (1;6 - 2;3), trained on 01-18: 4029 propositions, 8499 arguments. Sarah 01-90 (2;3 - 4;1), trained on 01-83: 8570 propositions, 15599 arguments.

34 Testing the Baby SRL. Constructed test sentences like those used in experiments with children: an unknown verb and two animate nouns force the SRL to rely on syntactic knowledge, e.g. "Adam krads Daddy!" (Test items combine the novel verb krad with animate nouns such as Adam, Mommy, Daddy, Ursula.)
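
A sketch of how such test items might be constructed: a novel verb combined with two animate nouns, so lexical knowledge cannot help, paired with the role assignment a word-order learner should produce. The names and novel verb follow the slide; the pairing logic and expected-label convention are illustrative.

```python
from itertools import permutations

NOVEL_VERB = "krads"
ANIMATE_NOUNS = ["Adam", "Mommy", "Daddy", "Ursula"]

def make_test_items():
    """Yield (sentence, expected_roles) pairs: with an unknown verb and two animate
    nouns, only word order can tell the model who the agent is."""
    for agent, patient in permutations(ANIMATE_NOUNS, 2):
        sentence = [agent, NOVEL_VERB, patient]
        expected = {agent: "A0", patient: "A1"}   # pre-verbal noun = agent
        yield sentence, expected

for sentence, expected in list(make_test_items())[:3]:
    print(" ".join(sentence) + "!", expected)
```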

35 Testing the Baby SRL. Compare systems trained on different syntactic features derived from the partial sentence representation. Lexical baseline: verb and argument word features only (arg = I, verb = like). Noun Pattern: lexical features plus features such as '1st of 2 nouns', '2nd of 2 nouns' (arg = I, 1st of two, verb = like). Verb Position: lexical features plus 'before verb', 'after verb' (arg = I, before verb, verb = like). Combined: all feature types (arg = I, 1st of two, before verb, verb = like).
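
A sketch of the four feature sets compared on this slide, extracted from a partial sentence representation consisting of an ordered list of candidate argument nouns plus the identified verb and their positions. The feature names mirror the slide; encoding them as strings is an illustrative choice, not the authors' implementation.

```python
def argument_features(nouns, verb, noun_positions, verb_position, feature_set):
    """Feature lists (one per candidate argument noun) under one of the four
    feature sets: 'lexical', 'noun_pattern', 'verb_position', or 'combined'."""
    featurized = []
    for i, noun in enumerate(nouns):
        feats = [f"arg={noun}", f"verb={verb}"]               # Lexical baseline
        if feature_set in ("noun_pattern", "combined"):       # e.g. '1st of 2 nouns'
            feats.append(f"noun_{i + 1}_of_{len(nouns)}")
        if feature_set in ("verb_position", "combined"):      # 'before/after verb'
            feats.append("before_verb" if noun_positions[i] < verb_position
                         else "after_verb")
        featurized.append(feats)
    return featurized

# Partial representation of "I like dark bread.": nouns I (position 0) and
# bread (position 3), verb like (position 1); "dark" is not an identified argument.
for feats in argument_features(["I", "bread"], "like", [0, 3], 1, "combined"):
    print(feats)
# ['arg=I', 'verb=like', 'noun_1_of_2', 'before_verb']
# ['arg=bread', 'verb=like', 'noun_2_of_2', 'after_verb']
```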

36 Question 1: Are partial sentence representations useful, in principle, for learning new syntactic knowledge? Learning English word order Start with "gold standard" part-of-speech tagging and "gold standard" semantic role feedback

37 Are partial sentence representations useful as a starting point for learning? We assume that representations as simple as an ordered set of nouns can yield useful information about verb argument structure and sentence interpretation. Is this true in ordinary samples of language use, despite all the noise this simple representation will add to the data?

38 Result 1: Partial sentence representations are useful for further syntax learning. (Chart: performance on novel-verb test sentences of the form "A krads B".)

39 Question 2: Can partial sentence representations be built via unsupervised clustering and the proposed linking rules? (Diagram: HMM tagging, Argument Identification via seed nouns, and Predicate Identification from per-state argument-count statistics, as above.)

40 Result 2: Partial sentence representations can be built via unsupervised clustering. (Chart: performance as a function of the number of known seed nouns.)

41 Question 3: Now for the Baby SRL Are partial sentence representations built via unsupervised clustering useful for learning new syntactic knowledge? With animacy-based feedback? Learning English word order Starting from scratch...

42 Result 3: Partial sentence representations based on unsupervised clustering are useful

43 SRL Summary. Result 1: Partial sentence representations are useful for further syntax learning (with gold-standard arguments and feedback). Result 2: Partial sentence representations can be built via unsupervised clustering and the proposed linking rules: semantic bootstrapping for nouns, 'real' syntactic bootstrapping for verbs. Result 3: Partial sentence representations based on unsupervised clustering are useful for further syntax learning: robust learning, based on a noisy unsupervised 'parse' of sentences and noisy animacy-based feedback (learning, e.g., that verbs precede objects in English).

44 Three main claims: (1) Structure-mapping: Syntactic bootstrapping begins with an unlearned bias toward one-to-one mapping between nouns in sentences and semantic arguments of predicate terms. (2) Early abstraction: Children are biased toward usefully abstract representations of language experience. (3) Independent encoding of sentence structure: Children gather distributional facts about new verbs from listening experience.

45 A fourth prediction: (4) 'Real' syntactic bootstrapping: Children learn which words are the verbs by tracking their syntactic argument-taking behavior in sentences. Hey, she pushed her. Will you push me on the swing? John pushed the cat off the sofa... Verb = push [noun1, noun2], a 2-participant relation.

46 Acknowledgements: Yael Gertner, Jesse Snedeker, Lila Gleitman, Soondo Baek, Kate Messenger, Sylvia Yuan, Rose Scott; NSF, NICHD.

