
Embodied Models of Language Learning and Use
Nancy Chang
UC Berkeley / International Computer Science Institute

Turing’s take on the problem “Of all the above fields the learning of languages would be the most impressive, since it is the most human of these activities. This field seems however to depend rather too much on sense organs and locomotion to be feasible.” Alan M. Turing Intelligent Machinery (1948)

Five decades later…
Sense organs and locomotion: perceptual systems (especially vision); motor and premotor cortex; mirror neurons as a possible representational substrate; methodologies: fMRI, EEG, MEG.
Language: the Chomskyan revolution …and counter-revolution(s); progress on cognitively and developmentally plausible theories of language; suggestive evidence for an embodied basis of language.
…it may be more feasible than Turing thought! (Maybe language depends enough on sense organs and locomotion to be feasible!)

From single words to complex utterances
FATHER: Nomi are you climbing up the books?
NAOMI: up. NAOMI: climbing. NAOMI: books. (1;11.3)
MOTHER: what are you doing?
NAOMI: I climbing up.
MOTHER: you're climbing up? (2;0.18)
FATHER: what's the boy doing to the dog?
NAOMI: squeezing his neck. NAOMI: and the dog climbed up the tree. NAOMI: now they're both safe. NAOMI: but he can climb trees. (4;9.3)
Sachs corpus (CHILDES)

How do they make the leap?
0-9 months: smiles; responds differently to intonation; responds to name and "no"
9-18 months: first words; recognizes intentions; responds, requests, calls, greets, protests
18-24 months: two-word combinations:
agent-object: Daddy cookie, Girl ball
agent-action: Daddy eat, Mommy throw
action-object: Eat cookie, Throw hat
entity-attribute
entity-locative: Doggie bed

Theory of Language Structure Theory of Language Acquisition Theory of Language Use

The logical problem of language acquisition
Gold's Theorem (identification in the limit): no superfinite class of languages is identifiable from positive data only.
Natural languages are not finite sets. Children receive (mostly) positive data. But children acquire language abilities quickly and reliably.
One (not so) logical conclusion: THEREFORE there must be strong innate biases restricting the search space (Universal Grammar + parameter setting).
But kids aren't born as blank slates! And they do not learn language in a vacuum!

Note: the class of probabilistic context-free languages IS learnable in the limit! I.e., from hearing a finite number of sentences, Baby can correctly converge on a grammar that predicts an infinite number of sentences. The model is generalizing, just like real babies!

Theory of Language Structure = autonomous syntax Theory of Language Acquisition Theory of Language Use

What is knowledge of language? Basic sound patterns (Phonology) How to make words (Morphology) How to put words together (Syntax) What words (etc.) mean (Semantics) How to do things with words (Pragmatics) Rules of conversation (Pragmatics)

Many mysteries
What is the nature of linguistic representation?
What learning biases, input data and prior knowledge are involved?
How does language acquisition interact with other linguistic and cognitive processes?

Grammar learning is driven by meaningful language use in context. All aspects of the problem should reflect this assumption:
Target of learning: a construction (form-meaning pair)
Prior knowledge: rich conceptual structure, pragmatic inference
Training data: pairs of utterances and situational context
Performance measure: success in communication (comprehension)

Theory of Language Structure = constructions (form-meaning pairs), not autonomous syntax
Theory of Language Acquisition
Theory of Language Use

Theory of Language Structure Theory of Language Acquisition Theory of Language Use

The course of development (timeline, approximate):
0-6 mos: cooing
6-12 mos: reduplicated babbling
~12 mos: first word
~2 yrs: two-word combinations
2-3 yrs: multi-word utterances
3-5 yrs: questions, complex sentence structures, conversational principles

Incremental development
throw: throw 1;8.0 | throw off 1;8.0 | I throwded 1;10.28 | I throw it. 1;11.3 | throwing in. 1;11.3 | throw it. 1;11.3 | throw frisbee. 1;11.3 | can I throw it? 2;0.2 | I throwed Georgie. 2;0.2 | you throw that? 2;0.5 | gonna throw that? 2;0.18 | throw it in the garbage. 2;1.17 | throw in there. 2;1.17 | throw it in that. 2;5.0 | throwed it in the diaper pail. 2;11.12
fall: fell down. 1;6.16 | fall down. 1;8.0 | I fall down. 1;10.17 | fell out. 1;10.18 | I fell it. 1;10.28 | fell in basket. 1;10.28 | fall down boom. 1;11.11 | almost fall down. 1;11.11 | toast fall down. 1;11.20 | did Daddy fall down? 1;11.20 | Kangaroo fall down 1;11.21 | Georgie fell off 2;0.4 | you fall down. 2;0.5 | Georgie fall under there? 2;0.5 | He fall down 2;0.18 | Nomi fell down? 2;0.18 | I falled down. 2;3.0

Children in the one-word stage know a lot: images, people, actions, objects, locations, embodied knowledge, statistical correlations … i.e., experience. Or: the opulence of the substrate.

Correlating forms and meanings: lexical constructions pair FORM (sound) with MEANING (stuff):
"you" ↔ you (a Human)
"throw" ↔ Throw (roles: thrower, throwee)
"ball" ↔ ball (an Object)
"block" ↔ block (an Object)

Phonology: non-native contrasts
Werker and Tees (1984)
Thompson: velar vs. uvular, /k'i/-/q'i/
Hindi: retroflex vs. dental, /ʈa/-/t̪a/
The Thompson language is an Interior Salish (Native Indian) language spoken in south central British Columbia.

Finding words: statistical learning
Saffran, Aslin and Newport (1996)
Words /bidaku/, /padoti/, /golabu/ → continuous stream /bidakupadotigolabubidaku…/
After 2 minutes of this continuous speech stream, 8-month-old infants detect the words (vs. non-words and part-words). (cf. pretty baby)
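The computation behind this result is simple enough to sketch. Below is a minimal Python illustration (our own toy code, not from the original study) of segmenting a syllable stream by transitional probability, TP(b|a) = count(ab)/count(a): within-word TPs stay high, so boundaries are posited where TP dips. The 0.5 threshold and the randomized stream construction are assumptions for the demo.

    import random
    from collections import Counter

    def transitional_probs(syllables):
        """TP(b | a) = count(ab) / count(a) for adjacent syllables."""
        pair_counts = Counter(zip(syllables, syllables[1:]))
        first_counts = Counter(syllables[:-1])
        return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

    def segment(syllables, tps, threshold=0.5):
        """Posit a word boundary wherever the TP falls below threshold."""
        words, current = [], [syllables[0]]
        for a, b in zip(syllables, syllables[1:]):
            if tps[(a, b)] < threshold:
                words.append("".join(current))
                current = []
            current.append(b)
        words.append("".join(current))
        return words

    # Continuous stream built from the three "words" above, in random
    # order, as in the familiarization phase of the experiment.
    WORDS = [["bi", "da", "ku"], ["pa", "do", "ti"], ["go", "la", "bu"]]
    random.seed(0)
    stream = [syl for _ in range(100) for syl in random.choice(WORDS)]

    tps = transitional_probs(stream)
    print(sorted(set(segment(stream, tps))))  # ['bidaku', 'golabu', 'padoti']

With ~100 word tokens, within-word TPs are 1.0 while across-boundary TPs hover around 1/3, so the threshold separates them cleanly.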

Language acquisition: opulence of the substrate
Prelinguistic children already have rich sensorimotor representations and sophisticated social knowledge: intention inference, reference resolution, language-specific event conceptualizations (Bloom 2000; Tomasello 1995; Bowerman & Choi; Slobin et al.)
Children are sensitive to statistical information: phonological transitional probabilities; the most frequent items in adult input are learned earliest (Saffran et al. 1998; Tomasello 2000)

Words learned by most 2-year-olds in a play school (Bloom 1993): categories include food, toys, people, sounds, emotions, actions, prepositions, demonstratives, social terms, misc.

Early syntax
agent + action: 'Daddy sit'
action + object: 'drive car'
agent + object: 'Mommy sock'
action + location: 'sit chair'
entity + location: 'toy floor'
possessor + owned: 'my teddy'
entity + attribute: 'crayon big'
demonstrative + entity: 'this phone'

Word order: agent and patient
Hirsh-Pasek and Golinkoff (1996)
1;4-1;7, mostly still in the one-word stage
"Where is CM [Cookie Monster] tickling BB [Big Bird]?"

Language acquisition: basic scenes and verb islands
Basic scenes: simple clause constructions are associated directly with scenes basic to human experience (Goldberg 1995, Slobin 1985).
Verb Island Hypothesis: children learn their earliest constructions (arguments, syntactic marking) on a verb-specific basis (Tomasello 1992): "Young children's early verbs and relational terms are individual islands of organization in an otherwise unorganized grammatical system."
throw frisbee, throw ball → throw OBJECT; get ball, get bottle → get OBJECT
Later: a SELF-MOTION construction, a more general kind of progressive construction, etc.

Children generalize from experience
push3 (force=high), push12 (force=low) → push34 (force=?)
Specific cases are learned before general ones: throw frisbee, throw ball → throw OBJECT; drop ball, drop bottle → drop OBJECT; throw OBJECT, push OBJECT → ACTION OBJECT
Earliest constructions are lexically specific (item-based). (Verb Island Hypothesis, Tomasello 1992)

Development of throw
Naomi: 1;10.28: throw off. I throwded it. (= I fell) I throwded. (= I fell) | 1;11.3: I throw it. I throw it ice. (= I throw the ice) throwing in. throwing.
Parent: 1;2.9: don't throw the bear. | 1;10.11: don't throw them on the ground. | 1;11.3: Nomi don't throw the books down. what do you throw it into? what did you throw it into? | 1;11.9: they're throwing this in here. throwing the thing.
Contextually grounded; parental utterances more complex. These are (nearly) all Naomi's uses of throw during the period shown. Naomi's phrases grow in complexity (and she infers a lot from context), but the parent's input stays about the same over time. [Much more to the input than just this.] (Independent development of different verb usages.) Looking ahead, the hypothesis is that children are learning CONSTRUCTIONS that help them understand/produce language.

Development of throw (cont'd)
Parent: 2;0.3: don't throw it Nomi. Nomi stop throwing. well you really shouldn't throw things Nomi you know. remember how we told you you shouldn't throw things.
Naomi: 2;0.3: can I throw it? I throwed Georgie. could I throw that? | 2;0.5: throw it? you throw that? | 2;0.18: gonna throw that? | 2;1.17: throw it in the garbage. throw in there. | 2;5.0: throw it in that. | 2;11.12: I throwed it in the diaper pail.

Session 4 outline Language acquisition: the problem Child language acquisition Usage-based construction learning model Recapitulation: Embodied cognitive models

How do children make the transition from single words to complex combinations, i.e., multi-unit expressions with relational structure?
Concrete word combinations: fall down, eat cookie, Mommy sock
Item-specific constructions (limited-scope formulae): X throw Y, the X, X's Y
Argument structure constructions (syntax)
Grammatical markers: tense-aspect, agreement, case
(Early word combinations and other "relational" constructions are structurally complex constructions.)

Language learning is structure learning
"You're throwing the ball!"
Form: intonation, stress; phonemes, syllables; morphological structure; word segmentation, order; syntactic structure; statistical regularities
Meaning: sensorimotor structure; event structure; pragmatic structure (attention, intention, perspective)
Language is rife with structure: even simple utterances involve structure at a variety of interdependent levels, from intonational to phonological to syllabic structure, etc. Of course, there's a lot more structure in the environment than that; much of the first year of life is devoted to mastering patterns of experience, linguistic and otherwise. Event structure includes causal, temporal and force-dynamic structure (with some crosslinguistic variation in timing, and perhaps packaging).

Making sense: structure begets structure!
Structure is cumulative: object recognition → scene understanding; word segmentation → word learning.
Learners exploit existing structure to make sense of their environment: achieve goals, infer intentions.
Language learners exploit existing structure to make sense of their environment: achieve communicative goals, infer communicative intentions.
True of all structure (sound, meaning, social, etc.): a march toward complexity. Children are active learners, employing function and understanding.

Exploiting existing structure
"You're throwing the ball!"
This claim is increasingly uncontroversial for word learning. Word learning is also, famously, a mapping problem (word-to-world): a mapping across domains, those of form and meaning. There is widespread consensus that children are sensitive to a lot of rich structure they can use to identify potential mappings (Bloom, etc.).

Comprehension is partial. (not just for dogs)

What we say to kids…
what do you throw it into? they're throwing this in here. do you throw the frisbee? they're throwing a ball. don't throw it Nomi. well you really shouldn't throw things Nomi you know. remember how we told you you shouldn't throw things.
What they hear…
blah blah YOU THROW blah? blah THROW blah blah HERE. blah YOU THROW blah blah? blah THROW blah blah BALL. DON'T THROW blah NOMI. blah YOU blah blah THROW blah NOMI blah blah. blah blah blah blah YOU shouldn't THROW blah.
But children also have rich situational context/cues they can use to fill in the gaps.

Understanding drives learning
Utterance + Situation → Understanding (drawing on conceptual knowledge and linguistic knowledge) → (Partial) Interpretation → Learning
Basic idea: exploit all known structure, linguistic and otherwise, to communicate, to make sense of new data, and to build complex structures from known ones.

Potential inputs to learning
Genetic language-specific biases
Domain-general structures and processes
Embodied representations …grounded in action, perception, conceptualization, and other aspects of physical, mental and social experience (Talmy 1988, 2000; Glenberg and Robertson 1999; MacWhinney 2005; Barsalou 1999; Choi and Bowerman 1991; Slobin 1985, 1997)
Social routines: intention inference, reference resolution
Statistical information: transition probabilities, frequency effects
Usage-based approaches to language learning (Tomasello 2003; Clark 2003; Bybee 1985; Slobin 1985; Goldberg 2005)
…the opulence of the substrate!

Methodology: computational modeling
Grammar learning is driven by meaningful language use in context.
Meaningful, structured representations: target representation: construction-based grammar; input data: utterance+context pairs, conceptual/linguistic knowledge; construction analyzer (comprehension)
Usage-based learning framework: optimization toward the "simplest" grammar given the data; goal: improved comprehension
Take meaning seriously: structural constraints of word combinations are meaningful; all meaningful input is available; functional constraints of communication apply.
Learning model: usage-driven optimization (MDL); learning operations (structural mapping); evaluation criteria (simplicity)

Models of language learning
Several previous models of word learning are grounded (form + meaning):
Regier 1996: <bitmaps, word> → spatial relations
Roy and Pentland 1998: <image, sound> → object shapes/attributes
Bailey 1997: <feature structure, word> → actions
Siskind 2000: <video, sound> → actions
Oates et al. 1999: <sensors, word class> → actions
Not so for grammar learning:
Stolcke 1994: probabilistic attribute grammars from sentences
Siskind 1996: verb argument structure from predicates
Thompson 1998: syntax-semantics mapping from database queries
Word learning is like category learning, but with cross-domain character, addressed by many methods (model merging, clustering, any generalization); notice this is UNARY: roles are handled implicitly, if at all. [Oates et al. 1999: Using Syntax to Learn Semantics: learn senses using bigram clustering]

Representation: constructions
The basic linguistic unit is a <form, meaning> pair (Kay and Fillmore 1999; Lakoff 1987; Langacker 1987; Goldberg 1995; Croft 2001; Goldberg and Jackendoff 2004)
Examples: ball, toward, Big Bird, throw-it

Relational constructions: throw ball
construction THROW-BALL
  constituents
    t : THROW
    o : BALL
  form
    tf before of
  meaning
    tm.throwee ↔ om
Embodied Construction Grammar (Bergen & Chang 2005)

Usage: construction analyzer
Utterance + Situation → construction analyzer → (Partial) Interpretation (semantic specification)
Draws on linguistic knowledge (constructions) and conceptual knowledge (embodied schemas).
Partial parser; unification-based; reference resolution (Bryant 2004)

Usage: best-fit constructional analysis
Utterance + discourse & situational context + constructions → Analyzer (probabilistic, incremental, competition-based) → Semantic Specification (image schemas, frames, action schemas) → Simulation

Competition-based analyzer finds the best analysis
An analysis is made up of: a constructional tree; a set of resolutions; a semantic specification.
The best fit has the highest combined score.

An analysis using THROW-TRANSITIVE

Usage: partial understanding
"You're throwing the ball!"
ANALYZED MEANING: participants: ball, Ego; Throw-Action (thrower = ?, throwee = ?)
PERCEIVED MEANING: participants: my_ball, Ego; Throw-Action (thrower = Ego, throwee = my_ball)

Construction learning model: search
The model allows incorporation of all kinds of available information, and so defines a CLASS of learning problems: it allows all or none of the sources of constraint above, though we will focus on the latter set.

Proposing new constructions
Relational mapping (context-dependent)
Reorganization (context-independent): merging (generalization), splitting (decomposition), joining (composition)

Initial single-word stage: lexical constructions pair FORM (sound) with MEANING (stuff):
"you" ↔ schema Addressee, subcase of Human
"throw" ↔ schema Throw, roles: thrower, throwee
"ball" ↔ schema Ball, subcase of Object
"block" ↔ schema Block, subcase of Object
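For concreteness, such an initial lexicon might be rendered as data roughly like this (a toy sketch; the field names and the Action supertype are our assumptions, not the model's actual format):

    # Toy single-word lexicon: each lexical construction pairs a
    # phonological form with a meaning schema from the ontology.
    LEXICON = {
        "you":   {"schema": "Addressee", "subcase_of": "Human",  "roles": ()},
        "throw": {"schema": "Throw",     "subcase_of": "Action",  # supertype assumed
                  "roles": ("thrower", "throwee")},
        "ball":  {"schema": "Ball",      "subcase_of": "Object", "roles": ()},
        "block": {"schema": "Block",     "subcase_of": "Object", "roles": ()},
    }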

New data: "You throw the ball"
FORM: "you", "throw", "the", "ball" in order ("throw" before "ball")
MEANING (lexicon): Addressee (subcase of Human); Throw (roles: thrower, throwee); Ball (subcase of Object); Block (subcase of Object)
SITUATION: Self and Addressee present; a Throw event with thrower = Addressee and throwee = Ball (role-filler bindings)

New construction hypothesized:
construction THROW-BALL
  constructional constituents
    t : THROW
    b : BALL
  form
    tf before bf
  meaning
    tm.throwee ↔ bm

Context-driven relational mapping: partial analysis

Context-driven relational mapping: form and meaning correlation

Meaning relations: pseudo-isomorphism
strictly isomorphic: Bm fills a role of Am
shared role-filler: Am and Bm have a role filled by the same entity X
sibling role-fillers: Am and Bm fill roles of the same schema Y

Relational mapping strategies
strictly isomorphic: Bm is a role-filler of Am (or vice versa): Am.r1 ↔ Bm
Example: throw ball: throw.throwee ↔ ball

Relational mapping strategies
shared role-filler: Am and Bm each have a role filled by the same entity X: Am.r1 ↔ X ↔ Bm.r2
Example: put ball down: put.mover ↔ ball, down.tr ↔ ball

Relational mapping strategies
sibling role-fillers: Am and Bm fill roles of the same schema Y: Y.r1 ↔ Am, Y.r2 ↔ Bm
Example: Nomi ball: possession.possessor ↔ Nomi, possession.possessed ↔ ball
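These three tests are easy to state procedurally. The sketch below (illustrative Python; representing each meaning as a (name, roles) pair is our simplification, not the model's actual data structure) classifies a candidate pair of meanings, using the examples from the slides above:

    def relation_type(A, B, scenes=()):
        """Classify the pseudo-isomorphism relating meanings A and B.
        A meaning is a (name, roles) pair; roles maps role -> filler name."""
        (a_name, a_roles), (b_name, b_roles) = A, B
        # strictly isomorphic: one meaning fills a role of the other
        if b_name in a_roles.values() or a_name in b_roles.values():
            return "strictly isomorphic"
        # shared role-filler: some entity fills a role in both meanings
        if set(a_roles.values()) & set(b_roles.values()):
            return "shared role-filler"
        # sibling role-fillers: both meanings fill roles of a scene schema Y
        for _, y_roles in scenes:
            fillers = set(y_roles.values())
            if a_name in fillers and b_name in fillers:
                return "sibling role-fillers"
        return None

    throw = ("throw", {"thrower": "Naomi", "throwee": "ball"})
    ball = ("ball", {})
    put = ("put", {"putter": "Naomi", "mover": "ball"})
    down = ("down", {"tr": "ball"})
    nomi = ("Nomi", {})
    possession = ("possession", {"possessor": "Nomi", "possessed": "ball"})

    print(relation_type(throw, ball))                      # strictly isomorphic
    print(relation_type(put, down))                        # shared role-filler
    print(relation_type(nomi, ball, scenes=[possession]))  # sibling role-fillers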

Overview of learning processes
Relational mapping: throw the ball → THROW < BALL
Merging: throw the block, throwing the ball → THROW < OBJECT
Joining: ball off, you throw the ball off → THROW < BALL < OFF

Merging similar constructions
Merge constructions involving correlated relational mappings over one or more pairs of similar constituents.
construction THROW-BALL
  constituents t : THROW, o : BALL
  form tf before of
  meaning tm.throwee ↔ om
construction THROW-BLOCK
  constituents t : THROW, o : BLOCK
  form tf before of
  meaning tm.throwee ↔ om
merge ⇒
construction THROW-OBJECT
  constituents t : THROW, o : OBJECT
  form tf before of
  meaning tm.throwee ↔ om
construction THROW-BALL subcase of THROW-OBJECT, with o : BALL
construction THROW-BLOCK subcase of THROW-OBJECT, with o : BLOCK
("Merge" because the default algorithm is model merging; it has been used in several frameworks, most relevantly for verbs, and also for the one-word domain, featurized.)

More complex generalization operations could drive the formation of new *constructional* categories.
Top: common parent Toy found for Ball and Block.
Bottom: new "ToyX" category formed with Ball-Cn and Block-Cn as its only subcases.
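The merge step itself can be sketched compactly (illustrative Python under assumed names; the real model merges full feature structures, not bare category lists). The differing constituents Ball and Block generalize to their nearest common ontology ancestor, the parent Toy mentioned above:

    # Tiny ontology: child category -> parent category.
    ONTOLOGY = {"Ball": "Toy", "Block": "Toy", "Toy": "Object", "Object": None}

    def ancestors(cat):
        """Yield cat and all of its ancestors, most specific first."""
        while cat is not None:
            yield cat
            cat = ONTOLOGY.get(cat)

    def common_supertype(c1, c2):
        """Nearest common ancestor of two categories."""
        line1 = list(ancestors(c1))
        return next(c for c in ancestors(c2) if c in line1)

    def merge(cxn1, cxn2):
        """Merge two constructions, given as lists of constituent
        categories, generalizing each differing pair of similar
        constituents to its common supertype."""
        return [a if a == b else common_supertype(a, b)
                for a, b in zip(cxn1, cxn2)]

    print(merge(["Throw", "Ball"], ["Throw", "Block"]))  # ['Throw', 'Toy']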

Overview of learning processes
Relational mapping: throw the ball → THROW < BALL
Merging: throw the block, throwing the ball → THROW < OBJECT
Joining: ball off, you throw the ball off → THROW < BALL < OFF

Joining co-occurring constructions
Compose frequently co-occurring constructions with compatible constraints (e.g., common arguments).
construction THROW-BALL
  constituents t : THROW, o : BALL
  form tf before of
  meaning tm.throwee ↔ om
construction BALL-OFF
  constituents b : BALL, o : OFF
  form bf before of
  meaning evokes Motion as m; m.mover ↔ bm; m.path ↔ om
join ⇒ THROW-BALL-OFF: throw before ball, ball before off; Throw.throwee = Ball; Motion m with m.mover = Ball, m.path = Off

Joined construction
construction THROW-BALL-OFF
  constructional constituents
    t : THROW
    b : BALL
    o : OFF
  form
    tf before bf
    bf before of
  meaning
    evokes MOTION as m
    tm.throwee ↔ bm
    m.mover ↔ bm
    m.path ↔ om

Construction learning model: evaluation
Learning operations are guided by a minimum description length heuristic (MDL; Rissanen 1978).

Learning: usage-based optimization
Grammar learning = search for (sets of) constructions: incremental improvement toward the best grammar given the data.
Search strategy: usage-driven learning operations.
Evaluation criteria: simplicity-based, information-theoretic. Minimum description length: most compact encoding of the grammar and data; trade-off between storage and processing.
Domain-general learning principles applied to linguistic structures and processes.

Minimum description length (Rissanen 1978; Goldsmith 2001; Stolcke 1994; Wolff 1982)
Seek the most compact encoding of the data in terms of: a compact representation of the model (i.e., the grammar); a compact representation of the data (i.e., the utterances). Approximates Bayesian learning (Bailey 1997, Stolcke 1994).
Exploit the tradeoff between preferences for:
smaller grammars: fewer constructions; fewer constituents/constraints; shorter slot chains (more local concepts) → pressure to compress/generalize
simpler analyses of data: more likely constructions; shallower analyses → pressure to retain specific constructions

MDL: details
Choose grammar G to minimize length(G|D):
  length(G|D) = m · length(G) + n · length(D|G)
Bayesian approximation: length(G|D) ≈ posterior probability P(G|D)
Length of grammar: length(G) ≈ prior P(G): favor fewer/smaller constructions/roles; favor shorter slot chains (more familiar concepts)
Length of data given grammar: length(D|G) ≈ likelihood P(D|G): favor simpler analyses using more frequent constructions
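The tradeoff can be made concrete with a toy scoring function (our own illustration; the model's actual cost terms are richer than this). Under this scheme, the merged THROW-OBJECT grammar beats the pair of item-specific constructions on the same six usages:

    import math
    from collections import Counter

    def grammar_length(grammar):
        """Prior term: one unit per construction, constituent and
        constraint, so smaller grammars score better."""
        return sum(1 + c["constituents"] + c["constraints"] for c in grammar)

    def data_length(usages):
        """Likelihood term: -log2 P(cxn) per use, so analyses that
        reuse frequent constructions score better."""
        counts = Counter(usages)
        total = sum(counts.values())
        return sum(-math.log2(counts[u] / total) for u in usages)

    def description_length(grammar, usages, m=1.0, n=1.0):
        # length(G|D) = m * length(G) + n * length(D|G)
        return m * grammar_length(grammar) + n * data_length(usages)

    specific = [{"name": "THROW-BALL", "constituents": 2, "constraints": 2},
                {"name": "THROW-BLOCK", "constituents": 2, "constraints": 2}]
    general = [{"name": "THROW-OBJECT", "constituents": 2, "constraints": 2}]

    print(description_length(specific,
                             ["THROW-BALL"] * 3 + ["THROW-BLOCK"] * 3))  # 16.0
    print(description_length(general, ["THROW-OBJECT"] * 6))             # 5.0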

Flashback to verb learning: Learning 2 senses of PUSH Model merging based on Bayesian MDL

Experiment: learning verb islands
Question: can the proposed construction learning model acquire English item-based motion constructions? (Tomasello 1992)
Given: initial lexicon and ontology. Data: child-directed language annotated with contextual information. Example input pair:
  Form: text: "throw the ball"; intonation: falling
  Scene: Throw (thrower: Naomi, throwee: Ball); participants: Mother, Naomi, Ball
  Discourse: speaker: Mother; addressee: Naomi; speech act: imperative; activity: play; joint attention: Ball

Experiment: learning verb islands
Subset of the CHILDES database of parent-child interactions (MacWhinney 1991; Slobin), coded by developmental psychologists for:
form: particles, deictics, pronouns, locative phrases, etc.
meaning: temporality, person, pragmatic function, type of motion (self-movement vs. caused movement; animate being vs. inanimate object, etc.)
crosslinguistic (English, French, Italian, Spanish)
English motion utterances: 829 parent, 690 child; all English utterances: 3160 adult, 5408 child; age span 1;2 to 2;6
(The psychologists have chosen meaningful variables; this may be sufficient, but in addition we may translate into simulation primitives, e.g., self-movement of an animate being (walk): force.energy-source = spg.trajector.)

Annotated CHILDES data
765 annotated parent utterances, annotated for the following scene types (originally annotated by psychologists):
CausedMotion: "Put Goldie through the chimney"
SelfMotion: "did you go to the doctor today?"
JointMotion: "bring the other pieces Nomi"
Transfer: "give me the toy"
SerialAction: "come see the doggie"

An annotation (bindings)
Utterance: Put Goldie through the chimney
SceneType: CausedMotion
Causer: addressee; Action: put; Direction: through; Mover: Goldie (toy); Landmark: chimney

Learning throw-constructions
Input utterance sequence → learned constructions:
1. Don't throw the bear. → throw-bear
2. you throw it → you-throw
3. throw-ing the thing. → throw-thing
4. Don't throw them on the ground. → throw-them
5. throwing the frisbee. → throw-frisbee; MERGE → throw-OBJ
6. Do you throw the frisbee? → COMPOSE you-throw-frisbee
7. She's throwing the frisbee. → COMPOSE she-throw-frisbee

Example learned throw-constructions Throw bear You throw Throw thing Throw them Throw frisbee Throw ball You throw frisbee She throw frisbee <Human> throw frisbee Throw block Throw <Toy> Throw <Phys-Object> <Human> throw <Phys-Object>

Early talk about throwing (transcript data, Naomi 1;11.9; Sachs corpus, CHILDES)
Par: they're throwing this in here. Par: throwing the thing. Child: throwing in. Child: throwing. Par: throwing the frisbee. …
Par: do you throw the frisbee? do you throw it? Child: throw it. Child: I throw it. …
Child: throw frisbee. Par: she's throwing the frisbee. Child: throwing ball.
Sample input prior to 1;11.9: don't throw the bear. don't throw them on the ground. Nomi don't throw the books down. what do you throw it into?
Sample tokens prior to 1;11.9: throw | throw off | I throw it. | I throw it ice. (= I throw the ice)
Over the whole corpus, Naomi's phrases grow in complexity (though she still infers a lot from context), but the parent's stay about the same over time. [Much more to the input than just this.] (Independent development of different verb usages.)

A quantitative measure: coverage
Goal: incrementally improving comprehension. At each stage in testing, use the current grammar to analyze the test set.
Coverage = % of role bindings analyzed.
Example: grammar: throw-ball, throw-block, you-throw. Test sentence: "throw the ball." Gold bindings: scene=Throw, thrower=Nomi, throwee=ball. Parsed bindings: scene=Throw, throwee=ball. Score on this sentence: 2/3 = 66.7%.
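The scoring is straightforward to express (a minimal sketch; the binding names follow the example above):

    def coverage(gold_bindings, parsed_bindings):
        """Fraction of gold role bindings recovered by the analysis."""
        hits = sum(1 for role, filler in gold_bindings.items()
                   if parsed_bindings.get(role) == filler)
        return hits / len(gold_bindings)

    gold = {"scene": "Throw", "thrower": "Nomi", "throwee": "ball"}
    parsed = {"scene": "Throw", "throwee": "ball"}
    print(f"{coverage(gold, parsed):.1%}")  # 66.7%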

Learning to comprehend

Principles of interaction
Early in learning: no conflict; conceptual knowledge dominates; more lexically specific constructions (no cost): throw, throw off, throwing in, you throw it; want, want cookie, want cereal, I want it.
Later in learning: pressure to categorize; more constructions = more potential for confusion during analysis; a mixture of lexically specific and more general constructions: throw OBJ, throw DIR, throw it DIR, ACTOR throw OBJ; want OBJ, I want OBJ, ACTOR want OBJ.

Verb island constructions learned
Basic processes produce constructions similar to those in child production data.
The system can generalize beyond encountered data with pressure to merge constructions.
Differences in verb learning lend support to the verb island hypothesis.
Future directions: full English corpus (non-motion scenes, argument structure constructions); crosslinguistic data: Russian (case marking), Mandarin Chinese (omission, directional particles, aspect markers); morphological constructions; contextual constructions; multi-utterance discourse (Mok).

Summary
The model satisfies convergent constraints from diverse disciplines: crosslinguistic developmental evidence; cognitive and constructional approaches to grammar; precise grammatical representations and a data-driven learning framework for understanding and acquisition.
The model addresses special challenges of language learning: it exploits structural parallels in form/meaning to learn relational mappings; learning is usage-based/error-driven (based on partial comprehension); minimal specifically linguistic biases are assumed; learning exploits the child's rich experiential advantage; the earliest, item-based constructions are learnable from utterance-context pairs.

Key model components
Embodied representations: experientially motivated representations incorporating meaning/context
Construction formalism: multiword constructions = relational form-meaning correspondences
Usage 1: learning tightly integrated with comprehension; new constructions bridge the gap between linguistically analyzed meaning and contextually available meaning
Usage 2: statistical learning framework; incremental, specific-to-general learning; minimum description length heuristic for choosing the best grammar

Embodied Construction Grammar Theory of Language Structure Theory of Language Acquisition Theory of Language Use Usage-based optimization Simulation Semantics

Usage-based learning: comprehension and production
Resources: constructicon; world knowledge; discourse & situational context.
Comprehension: analyze & resolve utterance → simulation; reinforcement (usage) and reinforcement (correction).
Production: communicative intent → generate → utterance response; reinforcement (usage) and reinforcement (correction).
Learning: hypothesize constructions & reorganize.

A Best-Fit Approach for Productive Analysis of Omitted Arguments Eva Mok & John Bryant University of California, Berkeley International Computer Science Institute

Simplifying grammar by exploiting the language understanding process
Omission of arguments in Mandarin Chinese
Construction grammar framework
Model of language understanding
Our best-fit approach

Productive argument omission (in Mandarin)
1. ma1+ma gei3 ni3 zhei4+ge / mother give 2PS this+CLS / "Mother (I) give you this (a toy)."
2. ni3 gei3 yi2 / 2PS give auntie / "You give auntie [the peach]."
3. ao ni3 gei3 ya / EMP 2PS give / "Oh (go on)! You give [auntie] [that]."
4. gei3 / give / "[I] give [you] [some peach]."
CHILDES Beijing corpus (Tardif 1993; Tardif 1996)

Arguments are omitted with different probabilities All arguments omitted: 30.6% No arguments omitted: 6.1%

Construction grammar approach (Kay & Fillmore 1999; Goldberg 1995)
Grammaticality: form and function
Basic unit of analysis: the construction, i.e., a pairing of form and meaning constraints
Not purely lexically compositional
Implies early use of semantics in processing
Embodied Construction Grammar (ECG) (Bergen & Chang 2005)

Proliferation of constructions
Subj Verb Obj1 Obj2 ↓ Giver Transfer Recipient Theme
Verb Obj1 Obj2 ↓ Transfer Recipient Theme
Subj Verb Obj2 ↓ Giver Transfer Theme
Subj Verb Obj1 ↓ Giver Transfer Recipient
…

If the analysis process is smart, then…
Subj Verb Obj1 Obj2 ↓ Giver Transfer Recipient Theme
The grammar needs only state one construction; omission of constituents is flexibly allowed; the analysis process figures out what was omitted.

Best-fit analysis takes the burden off grammar representation
Utterance + discourse & situational context + constructions → Analyzer (incremental, competition-based, psycholinguistically plausible) → Semantic Specification (image schemas, frames, action schemas) → Simulation

Competition-based analyzer finds the best analysis
An analysis is made up of: a constructional tree; a set of resolutions; a semantic specification.
The best fit has the highest combined score.

Combined score that determines best fit
Syntactic fit: constituency relations; combined with preferences on non-local elements; conditioned on syntactic context
Antecedent fit: ability to find referents in the context; conditioned on syntactic information, feature agreement
Semantic fit: semantic bindings for frame roles; frame roles' fillers are scored

Analyzing ni3 gei3 yi2 ("You give auntie")
Two of the competing analyses:
  ni3 gei3 yi2 [omitted] ↓ Giver Transfer Recipient Theme
  ni3 gei3 [omitted] yi2 ↓ Giver Transfer Recipient Theme
Syntactic fit: P(Theme omitted | ditransitive cxn) = 0.65; P(Recipient omitted | ditransitive cxn) = 0.42
  Theme omitted: (1-0.78) * (1-0.42) * 0.65 = 0.08
  Recipient omitted: (1-0.78) * (1-0.65) * 0.42 = 0.03
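The arithmetic behind the two scores can be reproduced directly (a sketch; the role names and the independence assumption across roles are ours, while the probabilities come from the slide):

    # Per-role omission probabilities for the Mandarin ditransitive cxn.
    OMISSION_P = {"giver": 0.78, "recipient": 0.42, "theme": 0.65}

    def omission_pattern_score(omitted):
        """Probability of one pattern of omitted vs. expressed
        arguments, treating the roles as independent."""
        score = 1.0
        for role, p in OMISSION_P.items():
            score *= p if role in omitted else (1 - p)
        return score

    # "ni3 gei3 yi2" with yi2 as Recipient: only the Theme is omitted.
    print(round(omission_pattern_score({"theme"}), 2))      # 0.08
    # Alternative analysis, yi2 as Theme: the Recipient is omitted.
    print(round(omission_pattern_score({"recipient"}), 2))  # 0.03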

Frame and lexical information restrict the type of reference
Transfer frame: Giver, Recipient, Theme, Manner, Means, Place, Purpose, Reason, Time
Lexical unit gei3: Giver (DNI), Recipient (DNI), Theme (DNI)

Can the omitted argument be recovered from context?
Antecedent fit:
  ni3 gei3 yi2 [omitted] ↓ Giver Transfer Recipient Theme
  ni3 gei3 [omitted] yi2 ↓ Giver Transfer Recipient Theme
Discourse & situational context: child, mother, peach, auntie, table?

How good a theme is a peach? How about an aunt?
Semantic fit:
  ni3 gei3 yi2 [omitted] ↓ Giver Transfer Recipient Theme
  ni3 gei3 [omitted] yi2 ↓ Giver Transfer Recipient Theme
The Transfer frame: Giver (usually animate); Recipient (usually animate); Theme (usually inanimate)

Each construction is annotated with probabilities of omission
The argument omission patterns shown earlier can be covered with just ONE construction:
  Subj Verb Obj1 Obj2 ↓ Giver Transfer Recipient Theme
  P(omitted | cxn): Giver 0.78, Recipient 0.42, Theme 0.65
A language-specific default probability can be set.

Leverage process to simplify representation
The processing model is complementary to the theory of grammar. By using a competition-based analysis process, we can: find the best-fit analysis with respect to constituency structure, context, and semantics; eliminate the need to enumerate allowable patterns of argument omission in the grammar.
This is currently being applied in models of language understanding and grammar learning.

Simulation hypothesis We understand by mentally simulating Simulation exploits some of the same neural structures activated during performance, perception, imagining, memory… Linguistic structure parameterizes the simulation. Language gives us enough information to simulate

Language understanding as simulative inference
"Harry walked to the cafe."
Utterance → Analysis Process (linguistic knowledge) → Simulation Specification (schema: walk; trajector: Harry; goal: cafe) → Simulation (drawing on general knowledge and belief state)

Usage-based learning: comprehension and production
Resources: constructicon; world knowledge; discourse & situational context.
Comprehension: analyze & resolve utterance → simulation; reinforcement (usage) and reinforcement (correction).
Production: communicative intent → generate → utterance response; reinforcement (usage) and reinforcement (correction).
Learning: hypothesize constructions & reorganize.

Recapitulation

Theory of Language Structure Theory of Language Acquisition Theory of Language Use

Motivating assumptions
Structure and process are linked: embodied language use constrains structure!
Language and the rest of cognition are linked: all evidence is fair game.
We need computational formalisms that capture embodiment: embodied meaning representations; embodied grammatical theory.

Embodiment and Simulation: Basic NTL Hypotheses Embodiment Hypothesis Basic concepts and words derive their meaning from embodied experience. Abstract and theoretical concepts derive their meaning from metaphorical maps to more basic embodied concepts. Structured connectionist models provide a suitable formalism for capturing these processes. Simulation Hypothesis Language exploits many of the same structures used for action, perception, imagination, memory and other neurally grounded processes. Linguistic structures set parameters for simulations that draw on these embodied structures.

The ICSI/Berkeley Neural Theory of Language Project
The general tactic is to view complex cognitive phenomena as having more than one level of analysis, here five levels (the cognitive/linguistic level that cognitive scientists study; a computational level with relatively standard CS representations; structured neural networks; etc.). Importantly, this reduction or abstraction is constrained in that structures or representations used at one level should have equivalent translations or implementations at the more concrete or biologically inspired levels (e.g., SHRUTI binding via temporal synchrony).

Complex phenomena within reach
Radial categories / prototype effects (Rosch 1973, 1978; Lakoff 1985): mother: birth / adoptive / surrogate / genetic, …
Profiling (Langacker 1989, 1991; cf. Fillmore XX): hypotenuse, buy/sell (Commercial Event frame)
Metaphor and metonymy (Lakoff & Johnson 1980): ANGER IS HEAT, MORE IS UP; The ham sandwich wants his check. / All hands on deck.
Mental spaces (Fauconnier 1994): The girl with blue eyes in the painting really has green eyes.
Conceptual blending (Fauconnier & Turner 2002, inter alia): workaholic, information highway, fake guns

Embodiment in language
Perceptual and motor systems play a central role in language production and comprehension.
Theoretical proposals: linguistics: Lakoff, Langacker, Talmy; neuroscience: Damasio, Edelman; cognitive psychology: Barsalou, Gibbs, Glenberg, MacWhinney; computer science: Steels, Brooks, Siskind, Feldman, Roy.