
1 Embodied Models of Language Learning and Use
Nancy Chang, UC Berkeley / International Computer Science Institute

2 Turing’s take on the problem
“Of all the above fields the learning of languages would be the most impressive, since it is the most human of these activities. This field seems however to depend rather too much on sense organs and locomotion to be feasible.” Alan M. Turing Intelligent Machinery (1948)

3 …it may be more feasible than Turing thought!
Five decades later…
Sense organs and locomotion:
- Perceptual systems (especially vision)
- Motor and premotor cortex
- Mirror neurons: a possible representational substrate
- Methodologies: fMRI, EEG, MEG
Language:
- Chomskyan revolution …and counter-revolution(s)
- Progress on cognitively and developmentally plausible theories of language
- Suggestive evidence of an embodied basis of language
(Maybe language depends enough on sense organs and locomotion to be feasible!)

4 From single words to complex utterances
FATHER: Nomi are you climbing up the books?
NAOMI: up.
NAOMI: climbing.
NAOMI: books. (1;11.3)
MOTHER: what are you doing?
NAOMI: I climbing up.
MOTHER: you're climbing up? (2;0.18)
FATHER: what's the boy doing to the dog?
NAOMI: squeezing his neck.
NAOMI: and the dog climbed up the tree.
NAOMI: now they're both safe.
NAOMI: but he can climb trees. (4;9.3)
Sachs corpus (CHILDES)

5 How do they make the leap?
0-9 months: smiles; responds differently to intonation; responds to name and "no"
9-18 months: first words; recognizes intentions; responds, requests, calls, greets, protests
18-24 months: two-word patterns:
  agent-object: "Daddy cookie", "Girl ball"
  agent-action: "Daddy eat", "Mommy throw"
  action-object: "Eat cookie", "Throw hat"
  entity-attribute
  entity-locative: "Doggie bed"

6 Theory of Language Structure
Theory of Language Acquisition Theory of Language Use

7 The logical problem of language acquisition
Gold's Theorem (identification in the limit): no superfinite class of languages is identifiable from positive data only.
The logical problem of language acquisition:
Natural languages are not finite sets.
Children receive (mostly) positive data.
But children acquire language abilities quickly and reliably.
One (not so) logical conclusion: there must be strong innate biases restricting the search space (Universal Grammar + parameter setting).
But kids aren't born as blank slates! And they do not learn language in a vacuum!

8 Model is generalizing! Just like real babies!
Note: the class of probabilistic context-free languages is learnable in the limit! That is, from hearing a finite number of sentences, Baby can correctly converge on a grammar that predicts an infinite number of sentences. The model is generalizing, just like real babies!

9 Theory of Language Structure = autonomous syntax
Theory of Language Acquisition Theory of Language Use

10 What is knowledge of language?
Basic sound patterns (Phonology) How to make words (Morphology) How to put words together (Syntax) What words (etc.) mean (Semantics) How to do things with words (Pragmatics) Rules of conversation (Pragmatics)

11 Many mysteries
What is the nature of linguistic representation?
What are the learning biases? What do the input data provide, and what prior knowledge do learners bring?
How does language acquisition interact with other linguistic and cognitive processes?

12 Grammar learning is driven by meaningful language use in context.
All aspects of the problem should reflect this assumption:
Target of learning: a construction (form-meaning pair)
Prior knowledge: rich conceptual structure, pragmatic inference
Training data: pairs of utterances / situational context
Performance measure: success in communication (comprehension)

13 Theory of Language Structure = constructions (form-meaning pairs)
Theory of Language Structure = constructions (form-meaning pairs), not autonomous syntax
Theory of Language Acquisition
Theory of Language Use

14 Theory of Language Structure = constructions (form-meaning pairs)
Theory of Language Acquisition Theory of Language Use

15 Theory of Language Structure
Theory of Language Acquisition Theory of Language Use

16 The course of development
Approximate timeline:
0-6 mos: cooing, then reduplicated babbling
12 mos: first word
2 yrs: two-word combinations
2 yr 6 mos: multi-word utterances
3-5 yrs: questions, complex sentence structures, conversational principles

17 Incremental development
throw:
throw 1;8.0; throw off 1;8.0; I throwded 1;10.28; I throw it. 1;11.3; throwing in. 1;11.3; throw it. 1;11.3; throw frisbee. 1;11.3; can I throw it? 2;0.2; I throwed Georgie. 2;0.2; you throw that? 2;0.5; gonna throw that? 2;0.18; throw it in the garbage. 2;1.17; throw in there. 2;1.17; throw it in that. 2;5.0; throwed it in the diaper pail. 2;11.12
fall:
fell down. 1;6.16; fall down. 1;8.0; I fall down. 1;10.17; fell out. 1;10.18; I fell it. 1;10.28; fell in basket. 1;10.28; fall down boom. 1;11.11; almost fall down. 1;11.11; toast fall down. 1;11.20; did Daddy fall down? 1;11.20; Kangaroo fall down 1;11.21; Georgie fell off 2;0.4; you fall down. 2;0.5; Georgie fall under there? 2;0.5; He fall down 2;0.18; Nomi fell down? 2;0.18; I falled down. 2;3.0

18 Children in one-word stage know a lot!
Children in the one-word stage already know a lot about people, objects, locations, actions, images, embodied knowledge, statistical correlations… i.e., experience. Call this the opulence of the substrate.

19 Correlating forms and meanings
FORM (sound) ↔ MEANING (stuff): lexical constructions
"you" ↔ you (Human)
"throw" ↔ Throw (thrower, throwee)
"ball" ↔ Ball (Object)
"block" ↔ Block (Object)

20 Phonology: Non-native contrasts
Werker and Tees (1984)
Thompson: glottalized velar vs. uvular, /k'i/-/q'i/
Hindi: retroflex vs. dental, /ʈa/-/t̪a/
The Thompson language is an Interior Salish language spoken in south central British Columbia.

21 Finding words: Statistical learning
Saffran, Aslin and Newport (1996)
Nonsense words /bidaku/, /padoti/, /golabu/, concatenated into a continuous stream: /bidakupadotigolabubidaku…/
After 2 minutes of this continuous speech stream, 8-month-old infants detect the words (vs. non-words and part-words). Compare the "pretty baby" problem: a part-word like "ty-ba" spans a word boundary.
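The statistic infants appear to track here is the forward transitional probability P(next syllable | current syllable): near 1.0 inside a word, roughly 1/3 across word boundaries in this stimulus set. A minimal sketch of that computation in Python, using a synthetic stream built from the study's three nonsense words (the code and all names are illustrative, not the authors' materials):

```python
import random
from collections import Counter

WORDS = ["bi da ku", "pa do ti", "go la bu"]  # the three nonsense words

def make_stream(n_words=300, seed=0):
    """Random concatenation of the words, as a flat syllable list."""
    rng = random.Random(seed)
    return " ".join(rng.choice(WORDS) for _ in range(n_words)).split()

def transition_probs(syllables):
    """Forward transitional probability P(b | a) for adjacent syllables."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}

tp = transition_probs(make_stream())
print(round(tp[("bi", "da")], 2))  # 1.0: word-internal transition
print(round(tp[("ku", "pa")], 2))  # ~0.33: transition across a word boundary
```

Segmenting at local dips in transitional probability then recovers the three words without any pauses or stress cues in the signal.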

22 Language Acquisition Opulence of the substrate
Prelinguistic children already have rich sensorimotor representations and sophisticated social knowledge intention inference, reference resolution language-specific event conceptualizations (Bloom 2000, Tomasello 1995, Bowerman & Choi, Slobin, et al.) Children are sensitive to statistical information Phonological transitional probabilities Most frequent items in adult input learned earliest (Saffran et al. 1998, Tomasello 2000)

23 Words learned by most 2-year olds in a play school (Bloom 1993)
[Chart of word categories: food, toys, misc, people, sound, emotion, action, prepositions, demonstratives, social]

24 Early syntax
agent + action: 'Daddy sit'
action + object: 'drive car'
agent + object: 'Mommy sock'
action + location: 'sit chair'
entity + location: 'toy floor'
possessor + owned: 'my teddy'
entity + attribute: 'crayon big'
demonstrative + entity: 'this phone'

25 Word order: agent and patient
Hirsh-Pasek and Golinkoff (1996)
Children aged 1;4-1;7, mostly still in the one-word stage.
Test question: Where is CM [Cookie Monster] tickling BB [Big Bird]?

26 Language Acquisition Basic Scenes Verb Island Hypothesis
Simple clause constructions are associated directly with scenes basic to human experience (Goldberg 1995, Slobin 1985).
Verb Island Hypothesis: children learn their earliest constructions (arguments, syntactic marking) on a verb-specific basis (Tomasello 1992). Young children's early verbs and relational terms are individual islands of organization in an otherwise unorganized grammatical system.
throw frisbee, throw ball → throw OBJECT
get ball, get bottle → get OBJECT
Later: a SELF-MOTION construction, a more general kind of progressive construction, etc.

27 Children generalize from experience
Specific cases are learned before general:
push3 force=high, push12 force=low → push34 force=?
throw frisbee, throw ball → throw OBJECT
drop ball, drop bottle → drop OBJECT
Earliest constructions are lexically specific (item-based). (Verb Island Hypothesis, Tomasello 1992)
throw OBJECT, push OBJECT → ACTION OBJECT

28 Development Of Throw Contextually grounded
Naomi's uses (nearly all her uses of throw during the period shown):
throw off 1;10.28
I throwded it. (= I fell) I throwded. (= I fell) 1;11.3
I throw it. I throw it ice. (= I throw the ice) throwing in. throwing. 1;2.9
Parental utterances (more complex):
don't throw the bear. 1;10.11
don't throw them on the ground. 1;11.3
Nomi don't throw the books down. what do you throw it into? what did you throw it into? 1;11.9
they're throwing this in here. throwing the thing.
Contextually grounded. Naomi's phrases grow in complexity, and she infers a lot from context; parents' utterances stay about the same over time. [Much more to the input than just this.] (Independent development of different verb usages.)
Looking ahead: the hypothesis is that children are learning CONSTRUCTIONS that help them understand and produce language.

29 Development Of Throw (cont’d)
2;0.3 don't throw it Nomi. / Nomi stop throwing. / well you really shouldn't throw things Nomi you know. / remember how we told you you shouldn't throw things. / can I throw it? / I throwed Georgie. / could I throw that?
2;0.5 throw it? / you throw that?
2;0.18 gonna throw that?
2;1.17 throw it in the garbage. / throw in there.
2;5.0 throw it in that.
2;11.12 I throwed it in the diaper pail.

30 Session 4 outline
Language acquisition: the problem
Child language acquisition
Usage-based construction learning model
Recapitulation: embodied cognitive models

31 How do children make the transition from single words to complex combinations?
Multi-unit expressions with relational structure:
Concrete word combinations: fall down, eat cookie, Mommy sock
Item-specific constructions (limited-scope formulae): X throw Y, the X, X's Y
Argument structure constructions (syntax)
Grammatical markers: tense-aspect, agreement, case
Early word combinations and other "relational" constructions are structurally complex constructions.

32 Language learning is structure learning
"You're throwing the ball!"
Intonation, stress; phonemes, syllables; morphological structure; word segmentation and order; syntactic structure; sensorimotor structure; event structure; pragmatic structure (attention, intention, perspective); statistical regularities.
Language is rife with structure: even simple utterances involve structure at a variety of interdependent levels, from intonational to phonological to syllabic and beyond. Of course, there's a lot more structure in the environment than that; much of the first year of life is devoted to mastering patterns of experience, linguistic and otherwise.
Event structure: causal, temporal, force-dynamic. (Some crosslinguistic variation in timing, and perhaps packaging.)

33 Making sense: structure begets structure!
Structure is cumulative:
Object recognition → scene understanding
Word segmentation → word learning
Learners exploit existing structure to make sense of their environment: achieve goals, infer intentions.
Language learners exploit existing structure to make sense of their environment: achieve communicative goals, infer communicative intentions.
True of all structure (sound, meaning, social, etc.): a march toward complexity. Children are active learners, employing function and understanding.

34 Exploiting existing structure
"You're throwing the ball!"
This claim is increasingly uncontroversial for word learning. Word learning is also, famously, a mapping problem (word-to-world): a mapping across the domains of form and meaning. There is widespread consensus that children are sensitive to a lot of rich structure they can use to identify potential mappings (Bloom etc.).

35 Comprehension is partial. (not just for dogs)

36 What we say to kids…
what do you throw it into? / they're throwing this in here. / do you throw the frisbee? / they're throwing a ball. / don't throw it Nomi. / well you really shouldn't throw things Nomi you know. / remember how we told you you shouldn't throw things.
What they hear…
blah blah YOU THROW blah? / blah THROW blah blah HERE. / blah YOU THROW blah blah? / blah THROW blah blah BALL. / DON'T THROW blah NOMI. / blah YOU blah blah THROW blah NOMI blah blah. / blah blah blah blah YOU shouldn't THROW blah.
But children also have rich situational context and cues they can use to fill in the gaps.

37 Understanding drives learning
Utterance + Situation, with conceptual knowledge and linguistic knowledge → Understanding → (Partial) Interpretation → Learning
Basic idea: exploit all known structure, linguistic and otherwise, to communicate, make sense of new data, and build complex structures from known ones.

38 Potential inputs to learning
Genetic language-specific biases
Domain-general structures and processes
Embodied representations …grounded in action, perception, conceptualization, and other aspects of physical, mental and social experience (Talmy 1988, 2000; Glenberg and Robertson 1999; MacWhinney 2005; Barsalou 1999; Choi and Bowerman 1991; Slobin 1985, 1997)
Social routines: intention inference, reference resolution
Statistical information: transition probabilities, frequency effects
Usage-based approaches to language learning (Tomasello 2003, Clark 2003, Bybee 1985, Slobin 1985, Goldberg 2005)
…the opulence of the substrate!

39 Methodology: computational modeling
Grammar learning is driven by meaningful language use in context.
Meaningful, structured representations:
Target representation: construction-based grammar
Input data: utterance+context pairs, conceptual/linguistic knowledge
Construction analyzer (comprehension)
Usage-based learning framework: optimization toward the "simplest" grammar given the data
Goal: improved comprehension
Take meaning seriously: the structural constraints of word combinations are meaningful, all meaningful input is available, and communication imposes functional constraints.
Learning model: usage-driven optimization (MDL), with learning operations (structural mapping) and evaluation criteria (simplicity).

40 Models of language learning
Several previous models of word learning are grounded (form + meaning):
Regier 1996: <bitmaps, word> → spatial relations
Roy and Pentland 1998: <image, sound> → object shapes/attributes
Bailey 1997: <feature structure, word> → actions
Siskind 2000: <video, sound> → actions
Oates et al. 1999: <sensors, word class> → actions
Not so for grammar learning:
Stolcke 1994: probabilistic attribute grammars from sentences
Siskind 1996: verb argument structure from predicates
Thompson 1998: syntax-semantics mapping from database queries
Word learning is like category learning, but with cross-domain character, addressed by many methods (model merging, clustering, any generalization). Notice this is UNARY: roles are handled implicitly, if at all. [Oates et al. 1999: Using Syntax to Learn Semantics; learns senses using bigram clustering]

41 Representation: constructions
The basic linguistic unit is a <form, meaning> pair (Kay and Fillmore 1999, Lakoff 1987, Langacker 1987, Goldberg 1995, Croft 2001, Goldberg and Jackendoff 2004).
Examples: ball, toward, Big Bird, throw-it

42 Relational constructions
throw ball
construction THROW-BALL
  constituents
    t : THROW
    o : BALL
  form
    tf before of
  meaning
    tm.throwee ↔ om
Embodied Construction Grammar (Bergen & Chang, 2005)

43 Usage: Construction analyzer
Utterance + Situation, with conceptual knowledge (embodied schemas) and linguistic knowledge (constructions) → Understanding → (Partial) Interpretation (semantic specification)
The analyzer is a partial parser: unification-based, with reference resolution (Bryant 2004).

44 Usage: best-fit constructional analysis
Utterance + Discourse & Situational Context + Constructions → Analyzer (probabilistic, incremental, competition-based) → Semantic Specification (image schemas, frames, action schemas) → Simulation

45 Competition-based analyzer finds the best analysis
An analysis is made up of:
a constructional tree
a set of resolutions
a semantic specification
The best fit has the highest combined score.

46 An analysis using THROW-TRANSITIVE

47 Usage: Partial understanding
"You're throwing the ball!"
ANALYZED MEANING: Participants: ball, Ego; Throw-Action: thrower = ?, throwee = ?
PERCEIVED MEANING: Participants: my_ball, Ego; Throw-Action: thrower = Ego, throwee = my_ball
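The mismatch between analyzed and perceived meaning is what drives learning in the following slides: any binding present in the situation but missing from the partial analysis is a candidate for a new construction. A tiny sketch of that gap computation, assuming a hypothetical dict-based encoding of role bindings (not the model's actual data structures):

```python
def meaning_gap(analyzed, perceived):
    """Bindings present in the situation but not recovered by the grammar.

    Both arguments map role names to fillers; None marks a role the
    partial analysis left unfilled. (Hypothetical data format.)
    """
    return {role: filler for role, filler in perceived.items()
            if analyzed.get(role) is None}

analyzed  = {"thrower": None, "throwee": None}        # from "You're throwing the ball!"
perceived = {"thrower": "Ego", "throwee": "my_ball"}  # from the situation

print(meaning_gap(analyzed, perceived))
# {'thrower': 'Ego', 'throwee': 'my_ball'}: each unanalyzed binding is a
# candidate for a new relational mapping.
```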

48 Construction learning model: search
The model allows incorporation of all kinds of available information. It defines a CLASS of learning problems: it can use all or none of the sources of constraint above, though we will focus on the latter set.

49 Proposing new constructions
Relational mapping (context-dependent)
Reorganization (context-independent):
Merging (generalization)
Splitting (decomposition)
Joining (composition)

50 Initial Single-Word Stage
FORM (sound) ↔ MEANING (stuff): lexical constructions
"you" ↔ schema Addressee, subcase of Human
"throw" ↔ schema Throw, roles: thrower, throwee
"ball" ↔ schema Ball, subcase of Object
"block" ↔ schema Block, subcase of Object

51 New Data: “You Throw The Ball”
FORM: "you" before "throw" before "the" before "ball"
MEANING: Addressee (subcase of Human); Throw (roles: thrower, throwee); Ball (subcase of Object); Block (subcase of Object)
SITUATION: Self, Addressee; a Throw event with thrower = Addressee and throwee = Ball
Role-filler links from the situation supply the bindings that the single-word grammar cannot yet analyze.

52 New Construction Hypothesized
construction THROW-BALL
  constructional
    constituents
      t : THROW
      b : BALL
  form
    tf before bf
  meaning
    tm.throwee ↔ bm

53 Context-driven relational mapping: partial analysis

54 Context-driven relational mapping: form and meaning correlation

55 Meaning Relations: pseudo-isomorphism
strictly isomorphic: Bm fills a role of Am
shared role-filler: Am and Bm have a role filled by the same X
sibling role-fillers: Am and Bm fill roles of the same Y

56 Relational mapping strategies
strictly isomorphic: Bm is a role-filler of Am (or vice versa): Am.r1 ↔ Bm
Example: throw ball → throw.throwee ↔ ball
[Diagram: form relation between Af and Bf; role-filler link from Am to Bm]

57 Relational mapping strategies
shared role-filler: Am and Bm each have a role filled by the same entity X: Am.r1 ↔ X ↔ Bm.r2
Example: put ball down → put.mover ↔ ball, down.tr ↔ ball
[Diagram: form relation between Af and Bf; both Am and Bm linked to X]

58 Relational mapping strategies
sibling role-fillers: Am and Bm fill roles of the same schema Y: Y.r1 ↔ Am, Y.r2 ↔ Bm
Example: Nomi ball → possession.possessor ↔ Nomi, possession.possessed ↔ ball
[Diagram: form relation between Af and Bf; Y's roles filled by Am and Bm]
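The three strategies can be read as a small decision procedure over meaning representations. A sketch in Python, assuming a simple dict encoding of schema instances and their roles (the representation is an illustration, not the model's actual one):

```python
def relational_mapping(am, bm, context):
    """Classify the pseudo-isomorphism relating two meaning elements.

    am, bm: meaning dicts {"id": ..., "roles": {role: filler_id}}.
    context: other schema instances in the scene (for sibling role-fillers).
    """
    # Strictly isomorphic: one element fills a role of the other.
    for x, y in ((am, bm), (bm, am)):
        for role, filler in x["roles"].items():
            if filler == y["id"]:
                return ("strictly-isomorphic", f'{x["id"]}.{role} <-> {y["id"]}')
    # Shared role-filler: both have some role filled by the same entity X.
    shared = set(am["roles"].values()) & set(bm["roles"].values())
    if shared:
        return ("shared-role-filler", shared.pop())
    # Sibling role-fillers: some scene schema Y has roles filled by both.
    for y in context:
        fillers = set(y["roles"].values())
        if am["id"] in fillers and bm["id"] in fillers:
            return ("sibling-role-fillers", y["id"])
    return (None, None)

throw = {"id": "throw1", "roles": {"thrower": "naomi", "throwee": "ball1"}}
ball  = {"id": "ball1", "roles": {}}
print(relational_mapping(throw, ball, []))
# ('strictly-isomorphic', 'throw1.throwee <-> ball1')
```

When one of these relations co-occurs with a form relation (e.g., word order), the pair becomes a candidate relational construction.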

59 Overview of learning processes
Relational mapping: throw the ball → THROW < BALL
Merging: throw the block, throwing the ball → THROW < OBJECT
Joining: ball off, you throw the ball off → THROW < BALL < OFF

60 Merging similar constructions
Merge constructions involving correlated relational mappings over one or more pairs of similar constituents:
construction THROW-BALL
  constituents
    t : THROW
    o : BALL
  form
    tf before of
  meaning
    tm.throwee ↔ om
construction THROW-BLOCK
  constituents
    t : THROW
    o : BLOCK
  form
    tf before of
  meaning
    tm.throwee ↔ om
yields:
construction THROW-OBJECT
  constituents
    t : THROW
    o : OBJECT
  form
    tf before of
  meaning
    tm.throwee ↔ om
construction THROW-BALL
  subcase of THROW-OBJECT
  o : BALL
construction THROW-BLOCK
  subcase of THROW-OBJECT
  o : BLOCK
It is called "merge" because the default algorithm is model merging, which has been used for several frameworks, most relevantly verbs. (Also used for the one-word domain, featurized.)
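The core of the merge step is finding, for each pair of differing constituents, their nearest common supertype in the ontology. A minimal sketch under an assumed dict/taxonomy representation; the real model also rewrites form and meaning constraints and keeps the originals as subcases:

```python
def common_parent(a, b, taxonomy):
    """Nearest common supertype in a parent-pointer taxonomy.
    Assumes a common ancestor exists."""
    ancestors = set()
    while a is not None:
        ancestors.add(a)
        a = taxonomy.get(a)
    while b not in ancestors:
        b = taxonomy.get(b)
    return b

def merge(cxn_a, cxn_b, taxonomy):
    """Merge two constructions that differ only in constituent types."""
    merged = {}
    for slot in cxn_a["constituents"]:
        ta = cxn_a["constituents"][slot]
        tb = cxn_b["constituents"][slot]
        merged[slot] = ta if ta == tb else common_parent(ta, tb, taxonomy)
    return {"name": "-".join(merged.values()), "constituents": merged}

taxonomy = {"BALL": "OBJECT", "BLOCK": "OBJECT", "OBJECT": None, "THROW": None}
throw_ball  = {"name": "THROW-BALL",  "constituents": {"t": "THROW", "o": "BALL"}}
throw_block = {"name": "THROW-BLOCK", "constituents": {"t": "THROW", "o": "BLOCK"}}
print(merge(throw_ball, throw_block, taxonomy))
# {'name': 'THROW-OBJECT', 'constituents': {'t': 'THROW', 'o': 'OBJECT'}}
```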

61 More complex generalization operations could drive the formation of new *constructional* categories.
Top: a common parent, Toy, is found for Ball and Block. Bottom: a new "ToyX" category is formed, with Ball-Cn and Block-Cn as its only subcases.

62 Overview of learning processes
Relational mapping: throw the ball → THROW < BALL
Merging: throw the block, throwing the ball → THROW < OBJECT
Joining: ball off, you throw the ball off → THROW < BALL < OFF

63 Joining co-occurring constructions
Compose frequently co-occurring constructions with compatible constraints (e.g., common arguments):
construction THROW-BALL
  constituents
    t : THROW
    o : BALL
  form
    tf before of
  meaning
    tm.throwee ↔ om
construction BALL-OFF
  constituents
    b : BALL
    o : OFF
  form
    bf before of
  meaning
    evokes Motion as m
    m.mover ↔ bm
    m.path ↔ om
Co-occurrence in form (throw before ball, ball before off) plus compatible meaning (THROW.throwee = Ball; Motion with mover = Ball, path = Off) licenses the joined THROW-BALL-OFF construction.

64 Joined construction
construction THROW-BALL-OFF
  constructional
    constituents
      t : THROW
      b : BALL
      o : OFF
  form
    tf before bf
    bf before of
  meaning
    evokes MOTION as m
    tm.throwee ↔ bm
    m.mover ↔ bm
    m.path ↔ om

65 Construction learning model: evaluation
Learning operations are guided by a minimum description length heuristic (MDL: Rissanen 1978).

66 Learning: usage-based optimization
Grammar learning = search for (sets of) constructions; incremental improvement toward the best grammar given the data.
Search strategy: usage-driven learning operations
Evaluation criteria: simplicity-based, information-theoretic
Minimum description length: most compact encoding of the grammar and the data; a trade-off between storage and processing
Domain-general learning principles applied to linguistic structures and processes

67 Minimum description length
(Rissanen 1978, Goldsmith 2001, Stolcke 1994, Wolff 1982)
Seek the most compact encoding of the data in terms of:
a compact representation of the model (i.e., the grammar)
a compact representation of the data (i.e., the utterances)
Approximates Bayesian learning (Bailey 1997, Stolcke 1994).
Exploit the trade-off between preferences for:
smaller grammars (fewer constructions; fewer constituents/constraints; shorter slot chains, i.e. more local concepts): pressure to compress/generalize
simpler analyses of data (more likely constructions; shallower analyses): pressure to retain specific constructions

68 MDL: details
Choose grammar G to minimize length(G|D):
length(G|D) = m • length(G) + n • length(D|G)
Bayesian approximation: minimizing length(G|D) ≈ maximizing the posterior probability P(G|D).
Length of grammar, length(G) ≈ -log prior P(G): favor fewer/smaller constructions and roles; favor shorter slot chains (more familiar concepts).
Length of data given grammar, length(D|G) ≈ -log likelihood P(D|G): favor simpler analyses using more frequent constructions.
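To make the trade-off concrete, here is a schematic MDL scorer: the grammar term charges bits per construction and constraint (the prior side), and the data term charges each analysis the negative log probability of the constructions it uses (the likelihood side). The weights, units, and encoding are assumptions for illustration, not the model's actual encoding scheme:

```python
import math

def description_length(grammar, analyses, m=1.0, n=1.0):
    """Schematic MDL score to minimize: m*length(G) + n*length(D|G)."""
    # length(G): one bit per construction plus one per constraint.
    grammar_bits = sum(1 + len(cxn["constraints"]) for cxn in grammar)
    # length(D|G): code each used construction in -log2 P bits,
    # with P estimated from usage counts.
    total = sum(cxn["count"] for cxn in grammar)
    prob = {cxn["name"]: cxn["count"] / total for cxn in grammar}
    data_bits = sum(-math.log2(prob[name])
                    for analysis in analyses for name in analysis)
    return m * grammar_bits + n * data_bits

grammar = [
    {"name": "THROW-BALL",   "constraints": ["tf before of", "tm.throwee<->om"], "count": 8},
    {"name": "THROW-OBJECT", "constraints": ["tf before of", "tm.throwee<->om"], "count": 2},
]
analyses = [["THROW-BALL"], ["THROW-BALL"], ["THROW-OBJECT"]]
print(round(description_length(grammar, analyses), 2))  # ~8.97 bits
```

Dropping THROW-BALL would shrink the grammar term but make every throw-ball utterance costlier to encode via the more general construction; the optimizer keeps whichever configuration is shorter overall.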

69 Flashback to verb learning: Learning 2 senses of PUSH
Model merging based on Bayesian MDL

70 Experiment: learning verb islands
Question: can the proposed construction learning model acquire English item-based motion constructions? (Tomasello 1992)
Example input pair:
Form: text: "throw the ball"; intonation: falling
Scene: Throw: thrower: Naomi, throwee: Ball; participants: Mother, Naomi, Ball
Discourse: speaker: Mother; addressee: Naomi; speech act: imperative; activity: play; joint attention: Ball
Given: initial lexicon and ontology
Data: child-directed language annotated with contextual information

71 Experiment: learning verb islands
Subset of the CHILDES database of parent-child interactions (MacWhinney 1991; Slobin), coded by developmental psychologists for:
form: particles, deictics, pronouns, locative phrases, etc.
meaning: temporality, person, pragmatic function, type of motion (self-movement vs. caused movement; animate being vs. inanimate object, etc.)
Crosslinguistic: English, French, Italian, Spanish
English motion utterances: 829 parent, 690 child
English all utterances: 3160 adult, 5408 child
Age span: 1;2 to 2;6
The psychologists have chosen meaningful variables; this may be sufficient, but in addition we may translate into simulation primitives, e.g. self-movement of an animate being (walk): force.energy-source = spg.trajector

72 Annotated Childes Data
765 annotated parent utterances (originally annotated by psychologists), covering the following scenes:
CausedMotion: "Put Goldie through the chimney"
SelfMotion: "did you go to the doctor today?"
JointMotion: "bring the other pieces Nomi"
Transfer: "give me the toy"
SerialAction: "come see the doggie"

73 An Annotation (Bindings)
Utterance: Put Goldie through the chimney
SceneType: CausedMotion
Causer: addressee
Action: put
Direction: through
Mover: Goldie (toy)
Landmark: chimney

74 Learning throw-constructions
INPUT UTTERANCE SEQUENCE → LEARNED CXNS
1. Don't throw the bear. → throw-bear
2. you throw it → you-throw
3. throw-ing the thing. → throw-thing
4. Don't throw them on the ground. → throw-them
5. throwing the frisbee. → throw-frisbee; MERGE → throw-OBJ
6. Do you throw the frisbee? → COMPOSE: you-throw-frisbee
7. She's throwing the frisbee. → COMPOSE: she-throw-frisbee

75 Example learned throw-constructions
Throw bear / You throw / Throw thing / Throw them / Throw frisbee / Throw ball
You throw frisbee / She throw frisbee / <Human> throw frisbee
Throw block / Throw <Toy> / Throw <Phys-Object>
<Human> throw <Phys-Object>

76 Early talk about throwing
Transcript data, Naomi 1;11.9:
Par: they're throwing this in here.
Par: throwing the thing.
Child: throwing in.
Child: throwing.
Par: throwing the frisbee.
…
Par: do you throw the frisbee? do you throw it?
Child: throw it.
Child: I throw it.
…
Child: throw frisbee.
Par: she's throwing the frisbee.
Child: throwing ball.
Sample input prior to 1;11.9: don't throw the bear. / don't throw them on the ground. / Nomi don't throw the books down. / what do you throw it into?
Sample tokens prior to 1;11.9: throw / throw off / I throw it. / I throw it ice. (= I throw the ice)
Over the whole corpus, Naomi's phrases grow in complexity, but the parents' utterances stay about the same over time. [Much more to the input than just this.] She still infers a lot from context. (Independent development of different verb usages.)
Sachs corpus (CHILDES)

77 A quantitative measure: coverage
Goal: incrementally improving comprehension.
At each stage in testing, use the current grammar to analyze the test set.
Coverage = % of role bindings analyzed.
Example:
Grammar: throw-ball, throw-block, you-throw
Test sentence: throw the ball.
Gold bindings: scene=Throw, thrower=Nomi, throwee=ball
Parsed bindings: scene=Throw, throwee=ball
Score for this sentence: 2/3 = 66.7%
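The coverage measure itself is a one-liner over role-binding dicts. A sketch that reproduces the slide's 2/3 example (the data format is assumed for illustration):

```python
def coverage(gold_bindings, parsed_bindings):
    """Percentage of gold role bindings recovered by the current grammar."""
    hits = sum(1 for role, filler in gold_bindings.items()
               if parsed_bindings.get(role) == filler)
    return 100.0 * hits / len(gold_bindings)

gold   = {"scene": "Throw", "thrower": "Nomi", "throwee": "ball"}
parsed = {"scene": "Throw", "throwee": "ball"}
print(round(coverage(gold, parsed), 1))  # 66.7, matching the slide's 2/3
```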

78 Learning to comprehend

79 Principles of interaction
Early in learning: no conflict; conceptual knowledge dominates; more lexically specific constructions (no cost):
throw / throw off / throwing in / you throw it
want / want cookie / want cereal / I want it
Later in learning: pressure to categorize; more constructions = more potential for confusion during analysis; a mixture of lexically specific and more general constructions:
throw OBJ / throw DIR / throw it DIR / ACTOR throw OBJ
want OBJ / I want OBJ / ACTOR want OBJ

80 Verb island constructions learned
Basic processes produce constructions similar to those in child production data.
The system can generalize beyond encountered data, given pressure to merge constructions.
Differences in verb learning lend support to the verb island hypothesis.
Future directions:
full English corpus: non-motion scenes, argument structure constructions
crosslinguistic data: Russian (case marking), Mandarin Chinese (omission, directional particles, aspect markers)
morphological constructions
contextual constructions; multi-utterance discourse (Mok)

81 Summary
The model satisfies convergent constraints from diverse disciplines:
crosslinguistic developmental evidence
cognitive and constructional approaches to grammar
precise grammatical representations and a data-driven learning framework for understanding and acquisition
The model addresses special challenges of language learning:
it exploits structural parallels in form/meaning to learn relational mappings
learning is usage-based and error-driven (based on partial comprehension)
minimal specifically linguistic biases are assumed
learning exploits the child's rich experiential advantage
the earliest, item-based constructions are learnable from utterance-context pairs

82 Key model components
Embodied representations: experientially motivated representations incorporating meaning and context
Construction formalism: multiword constructions = relational form-meaning correspondences
Usage 1: learning tightly integrated with comprehension; new constructions bridge the gap between linguistically analyzed meaning and contextually available meaning
Usage 2: statistical learning framework; incremental, specific-to-general learning; minimum description length heuristic for choosing the best grammar

83 Embodied Construction Grammar
Theory of Language Structure: Embodied Construction Grammar
Theory of Language Acquisition: usage-based optimization
Theory of Language Use: simulation semantics

84 Usage-based learning: comprehension and production
[Diagram: a comprehension path (utterance → analyze & resolve → simulation) and a production path (communicative intent → generate → utterance response), both drawing on the constructicon, world knowledge, and discourse & situational context; reinforcement (usage) and reinforcement (correction) feed "hypothesize constructions & reorganize"]
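Putting the loop together: comprehend, compare the analysis against the context, and hypothesize constructions for whatever was missed. A toy, self-contained rendering of that cycle; it omits reorganization and reinforcement, and every function here is a simplified stand-in, not the model's code:

```python
def analyze(grammar, utterance):
    """Toy analyzer: collect bindings from every construction whose form matches."""
    bindings = {}
    for cxn in grammar:
        if all(w in utterance.split() for w in cxn["form"]):
            bindings.update(cxn["bindings"])
    return bindings

def hypothesize(utterance, gap):
    """Toy relational-mapping step: one new construction for the missed bindings."""
    return [{"form": utterance.split(), "bindings": dict(gap)}]

def learning_loop(grammar, corpus):
    """One pass over (utterance, context) pairs: understand, then learn from the gap."""
    for utterance, context in corpus:
        analysis = analyze(grammar, utterance)
        gap = {r: f for r, f in context.items() if analysis.get(r) != f}
        if gap:
            grammar += hypothesize(utterance, gap)
    return grammar

grammar = [{"form": ["throw"], "bindings": {"scene": "Throw"}}]
corpus = [("throw ball", {"scene": "Throw", "throwee": "ball"})]
grammar = learning_loop(grammar, corpus)
print(len(grammar))  # 2: a throw-ball construction was hypothesized
```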

85

86 A Best-Fit Approach for Productive Analysis of Omitted Arguments
Eva Mok & John Bryant University of California, Berkeley International Computer Science Institute

87 Simplifying grammar by exploiting the language understanding process
Omission of arguments in Mandarin Chinese
Construction grammar framework
Model of language understanding
Our best-fit approach

88 Productive Argument Omission (in Mandarin)
1. ma1+ma gei3 ni3 zhei4+ge / mother give 2PS this+CLS / "Mother (I) give you this (a toy)."
2. ni3 gei3 yi2 / 2PS give auntie / "You give auntie [the peach]."
3. ao ni3 gei3 ya / EMP 2PS give / "Oh (go on)! You give [auntie] [that]."
4. gei3 / give / "[I] give [you] [some peach]."
CHILDES Beijing Corpus (Tardiff, 1993; Tardiff, 1996)

89 Arguments are omitted with different probabilities
All arguments omitted: 30.6%
No arguments omitted: 6.1%

90 Construction grammar approach
Kay & Fillmore 1999; Goldberg 1995
Grammaticality: form and function
Basic unit of analysis: the construction, i.e. a pairing of form and meaning constraints
Not purely lexically compositional
Implies early use of semantics in processing
Embodied Construction Grammar (ECG) (Bergen & Chang, 2005)

91 Proliferation of constructions
Subj Verb Obj1 Obj2 → Giver Transfer Recipient Theme
Verb Obj1 Obj2 → Transfer Recipient Theme
Subj Verb Obj2 → Giver Transfer Theme
Subj Verb Obj1 → Giver Transfer Recipient

92 If the analysis process is smart, then...
Subj Verb Obj1 Obj2 → Giver Transfer Recipient Theme
The grammar needs only state one construction.
Omission of constituents is flexibly allowed.
The analysis process figures out what was omitted.

93 Best-fit analysis takes burden off grammar representation
Utterance + Discourse & Situational Context + Constructions → Analyzer (incremental, competition-based, psycholinguistically plausible) → Semantic Specification (image schemas, frames, action schemas) → Simulation

94 Competition-based analyzer finds the best analysis
An analysis is made up of:
a constructional tree
a set of resolutions
a semantic specification
The best fit has the highest combined score.

95 Combined score that determines best-fit
Syntactic fit: constituency relations, combined with preferences on non-local elements; conditioned on syntactic context.
Antecedent fit: ability to find referents in the context; conditioned on syntactic information and feature agreement.
Semantic fit: semantic bindings for frame roles; the frame roles' fillers are scored.

96 Analyzing ni3 gei3 yi2 (You give auntie)
Two of the competing analyses:
Analysis 1: ni3 = Giver, gei3 = Transfer, yi2 = Recipient, Theme omitted
Analysis 2: ni3 = Giver, gei3 = Transfer, Recipient omitted, yi2 = Theme
Syntactic fit, with P(Theme omitted | ditransitive cxn) = 0.65 and P(Recipient omitted | ditransitive cxn) = 0.42:
Analysis 1: (1-0.78) * (1-0.42) * 0.65 = 0.08
Analysis 2: (1-0.78) * (1-0.65) * 0.42 = 0.03
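The scores multiply, for each argument role, either its omission probability (if the analysis treats the role as omitted) or one minus it (if the role is overtly expressed). A short sketch reproducing the slide's two numbers; the probabilities are taken from these slides (0.78 is the subject/Giver omission rate annotated later), and the function name is illustrative:

```python
# P(omitted | ditransitive cxn), per the slides.
P_OMIT = {"Giver": 0.78, "Recipient": 0.42, "Theme": 0.65}

def syntactic_fit(omitted_roles):
    """Score an analysis by which argument roles it treats as omitted."""
    score = 1.0
    for role, p in P_OMIT.items():
        score *= p if role in omitted_roles else (1 - p)
    return score

# Analysis 1: yi2 is the Recipient, Theme omitted.
print(round(syntactic_fit({"Theme"}), 2))      # 0.08
# Analysis 2: yi2 is the Theme, Recipient omitted.
print(round(syntactic_fit({"Recipient"}), 2))  # 0.03
```

The syntactic fit is then combined with the antecedent and semantic fit scores on the following slides to pick the winning analysis.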

97 Frame and lexical information restrict type of reference
Transfer frame roles: Giver, Recipient, Theme, Manner, Means, Place, Purpose, Reason, Time
Lexical unit gei3: Giver (DNI), Recipient (DNI), Theme (DNI)

98 Can the omitted argument be recovered from context?
Antecedent fit: can the omitted argument be recovered from context?
Analysis 1: ni3 gei3 yi2 [Theme omitted]
Analysis 2: ni3 gei3 [Recipient omitted] yi2
Discourse & situational context: child, mother, peach, auntie, table. Is a referent available?

99 How good of a theme is a peach? How about an aunt?
Semantic fit, for the same two analyses of ni3 gei3 yi2:
The Transfer frame expects a Giver (usually animate), a Recipient (usually animate), and a Theme (usually inanimate).
So a peach scores well as an omitted Theme, and an aunt scores well as a Recipient; an aunt makes a poor Theme.

100 Each construction is annotated with probabilities of omission
The argument omission patterns shown earlier can be covered with just ONE construction:
Subj Verb Obj1 Obj2 → Giver Transfer Recipient Theme
P(omitted | cxn) annotations: 0.78, 0.65, 0.42 (the values used on the earlier slide: Giver 0.78, Theme 0.65, Recipient 0.42)
A language-specific default probability can be set.

101 Leverage process to simplify representation
The processing model is complementary to the theory of grammar. By using a competition-based analysis process, we can:
find the best-fit analysis with respect to constituency structure, context, and semantics
eliminate the need to enumerate allowable patterns of argument omission in the grammar
This is currently being applied in models of language understanding and grammar learning.

102 Simulation hypothesis
We understand by mentally simulating.
Simulation exploits some of the same neural structures activated during performance, perception, imagining, memory…
Linguistic structure parameterizes the simulation: language gives us enough information to simulate.

103 Language understanding as simulative inference
"Harry walked to the cafe."
Utterance + linguistic knowledge → Analysis Process (drawing on general knowledge and belief state) → Simulation Specification (schema: walk; trajector: Harry; goal: cafe) → Simulation

104 Usage-based learning: comprehension and production
[Diagram, repeated from slide 84: a comprehension path (utterance → analyze & resolve → simulation) and a production path (communicative intent → generate → utterance response), both drawing on the constructicon, world knowledge, and discourse & situational context; reinforcement (usage) and reinforcement (correction) feed "hypothesize constructions & reorganize"]

105 Recapitulation

106 Theory of Language Structure
Theory of Language Acquisition Theory of Language Use

107 Motivating assumptions
Structure and process are linked: embodied language use constrains structure!
Language and the rest of cognition are linked: all evidence is fair game.
We need computational formalisms that capture embodiment: embodied meaning representations, embodied grammatical theory.

108 Embodiment and Simulation: Basic NTL Hypotheses
Embodiment Hypothesis: Basic concepts and words derive their meaning from embodied experience. Abstract and theoretical concepts derive their meaning from metaphorical maps to more basic embodied concepts. Structured connectionist models provide a suitable formalism for capturing these processes.
Simulation Hypothesis: Language exploits many of the same structures used for action, perception, imagination, memory and other neurally grounded processes. Linguistic structures set parameters for simulations that draw on these embodied structures.

109 The ICSI/Berkeley Neural Theory of Language Project
The general tactic is to view complex cognitive phenomena as having more than one level of analysis, here five levels: the cognitive/linguistic level that cognitive scientists study, a computational level with relatively standard CS representations, structured neural networks, and so on. Importantly, this reduction or abstraction is constrained: structures or representations used at one level should have equivalent translations or implementations at the more concrete or biologically inspired levels (e.g., SHRUTI binding via temporal synchrony).

110

111 Complex phenomena within reach
Radial categories / prototype effects (Rosch 1973, 1978; Lakoff 1985): mother: birth / adoptive / surrogate / genetic, …
Profiling (Langacker 1989, 1991; cf. Fillmore XX): hypotenuse, buy/sell (Commercial Event frame)
Metaphor and metonymy (Lakoff & Johnson 1980): ANGER IS HEAT, MORE IS UP; The ham sandwich wants his check. / All hands on deck.
Mental spaces (Fauconnier 1994): The girl with blue eyes in the painting really has green eyes.
Conceptual blending (Fauconnier & Turner 2002, inter alia): workaholic, information highway, fake guns

112 Embodiment in language
Perceptual and motor systems play a central role in language production and comprehension.
Theoretical proposals:
Linguistics: Lakoff, Langacker, Talmy
Neuroscience: Damasio, Edelman
Cognitive psychology: Barsalou, Gibbs, Glenberg, MacWhinney
Computer science: Steels, Brooks, Siskind, Feldman, Roy

