Computational models of cognitive development: the grammar analogy
Josh Tenenbaum, MIT


Top 10 reasons to be Bayesian

1. Bayesian inference over hierarchies of structured representations provides integrated answers to these key questions of cognition:
   - What is the content and form of human knowledge, at multiple levels of abstraction?
   - How is abstract knowledge used to guide learning and inference of more specific representations?
   - How are abstract systems of knowledge themselves learned?

The plan

This morning… … this afternoon: higher-level cognition
- Causal theories
- Theories for object categories and properties

[Diagram: two parallel generative hierarchies. Language: linguistic grammar → phrase structures → utterances, annotated with P(grammar), P(phrase structures | grammar), P(utterances | phrase structures). Vision: scene grammar → object layout (scenes) → image features, annotated with P(grammar), P(objects | grammar), P(features | objects).]
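The P(· | ·) annotations in the diagram describe how the joint probability of each hierarchy factorizes level by level. As a minimal sketch of that shared structure (not from the slides; the function names and toy values are hypothetical placeholders):

```python
# Minimal sketch of the three-level generative hierarchy shared by the
# language and vision columns: an abstract grammar generates intermediate
# structures, which generate observed data. All probability functions here
# are hypothetical placeholders.

def joint_log_prob(grammar, structure, data,
                   log_p_grammar, log_p_structure, log_p_data):
    """log P(grammar, structure, data) = log P(grammar)
       + log P(structure | grammar) + log P(data | structure)."""
    return (log_p_grammar(grammar)
            + log_p_structure(structure, grammar)
            + log_p_data(data, structure))

# Toy usage with constant placeholder distributions:
print(joint_log_prob("G", "S", "D",
                     lambda g: -1.0,
                     lambda s, g: -2.0,
                     lambda d, s: -3.0))  # -6.0
```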

(Fragments of) theories

Physics
- The natural state of matter is at rest. Every motion has a causal agent.
- F = ma.

Medicine
- People have different amounts of four humors in the body, and illnesses are caused by them being out of balance.
- Diseases result from pathogens transmitted through the environment, interacting with genetically controlled cellular mechanisms.

Psychology
- Rational agents choose efficient plans of action in order to satisfy their desires given their beliefs.
- Action results from neuronal activity in the prefrontal cortex, which activates units in the motor cortex that control motor neurons projecting to the body’s muscles.

The structure of intuitive theories

“A theory consists of three interrelated components: a set of phenomena that are in its domain, the causal laws and other explanatory mechanisms in terms of which the phenomena are accounted for, and the concepts in terms of which the phenomena and explanatory apparatus are expressed.”

Carey (1985), “Constraints on semantic development”

The function of intuitive theories

- “Set the frame” of cognition by defining different domains of thought.
- Describe what things are found in that domain, as well as what behaviors and relations are possible.
- Provide causal explanations appropriate to that domain.
- Guide inference and learning:
  - Highlight variables to attend to.
  - Generate hypotheses and explanations to consider.
  - Support generalization from very limited data.
  - Support prediction and action planning.

Theories in cognitive development

The big question: what develops from childhood to adulthood?
- One extreme: basically everything. Totally new ways of thinking, e.g., logical thinking.
- The other extreme: basically nothing. Just new facts (specific beliefs), e.g., that trees can die.
- Intermediate view: something important. New theories, new ways of organizing beliefs.

Hierarchical Bayesian models for learning abstract knowledge

Three parallel hierarchies, each running from abstract knowledge down to observed data:
- Causal theory → causal network structures → observed events
- Structural form → structure over objects → observed properties of objects
- Linguistic grammar → phrase structures → observed utterances in the language

Hierarchical Bayesian framework for causal induction (Griffiths, Tenenbaum, Kemp et al.)

Causal theory → causal network structures → observed events

[Diagram: the space of candidate causal networks over variables A, B, and E, differing in which causal links are present.]

Hierarchical Bayesian framework for property induction (Kemp & Tenenbaum)

Structural form → structure over objects → observed properties of objects

[Diagram: a tree with species (mouse, squirrel, chimp, gorilla) at the leaf nodes, observed features F1–F4, and a novel query property, “has T9 hormones,” to be generalized over the tree.]

Questions

- How can the abstract knowledge be learned?
- How do we represent abstract knowledge?

A tradeoff in representational expressiveness versus learnability...

A theory as a probabilistic generative model (blocks-and-detector domain):

Types: Block, Detector, Trial

Predicates:
  Contact(block, detector, trial)   (non-random)
  Activates(block, detector)        (random)
  Active(detector, trial)           (random)

Dependency statements:
  Activates(b, d) ~ TabularCPD[[q, (1 - q)]];
  Active(d, t) ~ NoisyORAggCPD[1, 0]
      ({Contact(b, d, t) for block b : Activates(b, d)});

Relational skeleton (unknown). Observed events:
  Trial 1: Contact(A, detector, t1), Contact(B, detector, t1), Active(detector, t1)
  Trial 2: Contact(A, detector, t2), ~Contact(B, detector, t2), Active(detector, t2)

Queries: Activates(A, detector)? Activates(B, detector)?

[Diagram: causal theory → causal structure → observed events, with candidate networks over A, B, and E.]
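A rough Python paraphrase of this generative theory may help; it is a sketch, not the original BLOG code. The value of q, the function names, and the reading of NoisyORAggCPD[1, 0] as a deterministic (noise- and leak-free) OR are assumptions.

```python
import random

# Python paraphrase of the blocks-and-detector theory above (a sketch, not
# the original BLOG code). q is the prior probability that a given block
# can activate the detector; aggregation is a deterministic OR, assuming
# NoisyORAggCPD[1, 0] means weight 1 with no leak.

q = 0.5  # hypothetical prior on Activates(block, detector)

def sample_activates(blocks):
    """Latent theory-level facts: which blocks can activate the detector."""
    return {b: random.random() < q for b in blocks}

def active(detector_contacts, activates):
    """Active(detector, trial): OR over contacting blocks that activate it."""
    return any(activates[b] for b in detector_contacts)

# Trial 1: both A and B on the detector; trial 2: only A.
activates = sample_activates(["A", "B"])
print(active({"A", "B"}, activates), active({"A"}, activates))
```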

Questions

- How can the abstract knowledge be learned?
- How do we represent abstract knowledge?

A tradeoff in representational expressiveness versus learnability... consider a representation for theories which is less expressive but more learnable.

Learning causal networks

Observed events → causal structure

[Diagram: a patients × conditions matrix recording has(patient, condition), the observed events, alongside the causal network over conditions to be inferred.]

Different causal networks, same domain theory; different domain theories.

[Diagram: several causal networks grouped by the domain theory that generates them.]

The grammar analogy

Different semantic networks, same linguistic grammar; different grammars.

[Diagram: networks linking nouns (people, cats, bees, trees) to verbs (talk, sleep, fly, grow, imagine) in different patterns, all generated by one grammar, contrasted with networks generated by a different grammar.]

The grammar analogy

Natural language grammar (linguistic grammar → syntactic structures → observed sentences in the language):
  Abstract classes: N, V, …
  Production rules: S → [N V], … (“nouns precede verbs”)
  N → {people, cats, bees, trees, …}
  V → {talk, sleep, fly, grow, …}

Causal theory (causal theory → causal network structures → observed data in the domain):
  Abstract classes: D, S, …
  Causal laws: D → S, … (“diseases cause symptoms”)
  D → {flu, bronchitis, lung cancer, …}
  S → {fever, cough, chest pain, …}

Theories as graph grammars

Theory 1: Classes = { B, D, S }; Laws = { B → D, D → S }  (→ : possible causal link)
Theory 2: Classes = { B, D, S }; Laws = { S → D }
Theory 3: Classes = { C }; Laws = { C → C }
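A minimal sketch of this representation, with hypothetical variable names: a theory is just a class assignment plus a set of laws over class pairs, and it licenses or forbids individual causal links.

```python
# Sketch of "theories as graph grammars": a theory assigns each variable to
# an abstract class and licenses causal links only between certain class
# pairs. The example variables are illustrative.

theory = {
    "classes": {"smoking": "B", "flu": "D", "cough": "S"},
    "laws": {("B", "D"), ("D", "S")},  # B -> D and D -> S are possible links
}

def link_allowed(cause, effect, theory):
    """A causal link is possible only if the theory has a law for the
    corresponding pair of abstract classes."""
    pair = (theory["classes"][cause], theory["classes"][effect])
    return pair in theory["laws"]

print(link_allowed("smoking", "flu", theory))    # True:  B -> D is licensed
print(link_allowed("cough", "smoking", theory))  # False: no S -> B law
```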

History of the grammar analogy

“The grammar of a language can be viewed as a theory of the structure of this language. Any scientific theory is based on a certain finite set of observations and, by establishing general laws stated in terms of certain hypothetical constructs, it attempts to account for these observations, to show how they are interrelated, and to predict an indefinite number of new phenomena…. Similarly, a grammar is based on a finite number of observed sentences… and it ‘projects’ this set to an infinite set of grammatical sentences by establishing general ‘laws’… [framed in terms of] phonemes, words, phrases, and so on.…”

Chomsky (1956), “Three models for the description of language”

Learning causal networks

Causal theory → causal structure → observed events

Classes = { B, D, S }
Laws = { B → D, D → S }
  B: working in factory, smoking, stress, high fat diet, …
  D: flu, bronchitis, lung cancer, heart disease, …
  S: headache, fever, coughing, chest pain, …

[Diagram: has(patient, condition) data over patients and conditions, with the causal network over those variables to be inferred.]

Using a causal theory

Given current causal network beliefs and some new observed data: a correlation between “working in factory” and “chest pain”.

The theory constrains the possible hypotheses: under Laws = { B → D, D → S }, the new correlation must be mediated by some disease variable (working in factory → a disease → chest pain). It rules out others, such as a direct link from “working in factory” to “chest pain”. This allows strong inferences about causal structure from very limited data, and is very different from conventional Bayes net structure learning.
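A sketch of how such a theory can act as a prior over network structures, assuming the classes and laws from the slides above; the per-edge penalty and all names are illustrative.

```python
# Sketch of a theory-based prior over causal network structures: any graph
# containing an edge the theory does not license gets zero prior probability,
# so a factory/chest-pain correlation can only be explained via a licensed
# path through a disease variable. Illustrative only.

def structure_log_prior(edges, classes, laws, log_p_edge=-1.0):
    """edges: set of (cause, effect); classes: var -> class; laws: set of
    (class, class) pairs. Returns -inf for graphs the theory rules out."""
    for cause, effect in edges:
        if (classes[cause], classes[effect]) not in laws:
            return float("-inf")  # ruled out by the theory
    return log_p_edge * len(edges)  # simple penalty per licensed edge

classes = {"factory": "B", "lung_disease": "D", "chest_pain": "S"}
laws = {("B", "D"), ("D", "S")}
ok = {("factory", "lung_disease"), ("lung_disease", "chest_pain")}
bad = {("factory", "chest_pain")}  # direct B -> S link is not licensed
print(structure_log_prior(ok, classes, laws))   # finite log prior
print(structure_log_prior(bad, classes, laws))  # -inf: ruled out
```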

Learning causal theories

Causal theory → causal structure → observed events

Classes = { B, D, S }
Laws = { B → D, D → S }
  B: working in factory, smoking, stress, high fat diet, …
  D: flu, bronchitis, lung cancer, heart disease, …
  S: headache, fever, coughing, chest pain, …

How do we define the theory as a probabilistic generative model?

Learning a relational theory (c.f. Charles Kemp, Monday)

[Diagram: a “dominates” relation over people, with the people sorted into classes Profs, Grads, and Ugrads, revealing block structure in the relation.]

The Infinite Relational Model (IRM)

Goal: find the class assignments z and the class-level laws η that maximize P(z, η | data) ∝ P(data | z, η) P(z) P(η).
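As a sketch of this objective, with toy data, and with η integrated out analytically under a beta prior (a standard variant of the IRM score); the CRP concentration and relation values are illustrative.

```python
from math import lgamma, log
from collections import defaultdict

# Sketch of IRM scoring: log P(z) under a Chinese restaurant process prior,
# plus log P(R | z) with the class-pair link probabilities eta integrated
# out block by block under a Beta(a, b) prior. Toy data below.

def betaln(x, y):
    """Log of the Beta function."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def crp_log_prior(z, alpha=1.0):
    """log P(z) under a Chinese restaurant process over class labels."""
    counts, logp = {}, 0.0
    for i, c in enumerate(z):
        logp += log((counts[c] if c in counts else alpha) / (i + alpha))
        counts[c] = counts.get(c, 0) + 1
    return logp

def relation_log_marginal(R, z, a=1.0, b=1.0):
    """log P(R | z): each class-pair block of the relation is a
    beta-Bernoulli unit, with eta marginalized analytically."""
    ones, total = defaultdict(int), defaultdict(int)
    for i in range(len(z)):
        for j in range(len(z)):
            key = (z[i], z[j])
            total[key] += 1
            ones[key] += R[i][j]
    return sum(betaln(ones[k] + a, total[k] - ones[k] + b) - betaln(a, b)
               for k in total)

# Toy binary relation over 4 entities; z groups them into two classes.
R = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
z = [0, 0, 1, 1]
print(crp_log_prior(z) + relation_log_marginal(R, z))
```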

Causal theory → causal structure → observed events

[Diagram: the IRM infers classes z (grouping variables into ‘B’, ‘D’, ‘S’) and laws η (a matrix over class pairs), which together define a prior on the causal graphical model over the has(patient, condition) data.]

Learning with a uniform prior on network structures:

[Figure: the true network over 12 attributes; a sample of 75 patient observations (patients × attributes data matrix); and the network recovered under a uniform structure prior.]

Learning with a block-structured prior on network structures (Mansinghka et al., UAI 2006):

[Figure: the same true network and 75-observation sample, now learned jointly with class assignments z and class-pair parameters η under a block-structured prior.]

The “blessing of abstraction” (Mansinghka, Kemp, Tenenbaum, Griffiths, UAI 2006)

[Figure: learning curves as the number of samples grows, comparing recovery of the specific network N (edges) with recovery of the abstract theory (classes Z and class-pair edge probabilities). The abstract theory is recovered from less data than the specific network.]

The flexibility of a nonparametric prior

[Figure: the same learning setup where the true network has no class structure (a single class, edge probability 0.1); the nonparametric prior adapts rather than forcing spurious abstract classes.]
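The flexibility comes from nonparametric priors such as the Chinese restaurant process, in which the number of abstract classes is not fixed in advance but grows with the data. A minimal sampler sketch (the concentration alpha and sample size are illustrative):

```python
import random

# Sketch of the Chinese restaurant process: each new item joins an existing
# class with probability proportional to that class's size, or starts a new
# class with probability proportional to alpha. The number of classes is
# open-ended and adapts to the amount of data.

def sample_crp(n, alpha=1.0):
    """Draw a partition of n items from the Chinese restaurant process."""
    assignments, counts = [], []
    for i in range(n):
        r = random.uniform(0, i + alpha)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:        # join existing class k, prob c / (i + alpha)
                assignments.append(k)
                counts[k] += 1
                break
        else:                  # start a new class, prob alpha / (i + alpha)
            assignments.append(len(counts))
            counts.append(1)
    return assignments

print(sample_crp(10))  # e.g. [0, 0, 1, 0, 2, ...]: class count adapts to n
```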

Insights

- Abstract theories, which have traditionally either been neglected or presumed to be innate in cognitive development, could in fact be learned from data by rational inferential means, together with specific causal relations.
- Abstract theories can in some cases be learned more quickly and more easily than specific concrete causal relations (the “blessing of abstraction”).
- Key to progress on learning abstract knowledge: finding sweet spots in the expressiveness versus learnability tradeoff.

Hierarchical Bayesian models for learning abstract knowledge (recap)

- Causal theory → causal network structures → observed events
- Structural form → structure over objects → observed properties of objects
- Linguistic grammar → phrase structures → observed utterances in the language

A framework for inductive learning

1. How does knowledge guide learning from sparsely observed data?
   Bayesian inference, with priors based on background knowledge.
2. What form does knowledge take, across different domains and tasks?
   Probabilities defined over structured representations: graphs, grammars, rules, logic, relational schemas, theories.
3. How is more abstract knowledge itself learned?
   Hierarchical Bayesian models, with inference at multiple levels of abstraction.
4. How can abstract knowledge constrain learning yet maintain flexibility, balancing assimilation and accommodation?
   Nonparametric models, growing in complexity as the data require.