Bayesian models of human inference Josh Tenenbaum MIT

The Bayesian revolution in AI
Principled and effective solutions for inductive inference from ambiguous data:
– Vision
– Robotics
– Machine learning
– Expert systems / reasoning
– Natural language processing
Standard view in AI: no necessary connection to how the human brain solves these problems.
– Heuristics & Biases program in the background ("We know people aren't Bayesian, but…").

Bayesian models of cognition
– Visual perception [Weiss, Simoncelli, Adelson, Richards, Freeman, Feldman, Kersten, Knill, Maloney, Olshausen, Jacobs, Pouget, …]
– Language acquisition and processing [Brent, de Marcken, Niyogi, Klein, Manning, Jurafsky, Keller, Levy, Hale, Johnson, Griffiths, Perfors, Tenenbaum, …]
– Motor learning and motor control [Ghahramani, Jordan, Wolpert, Kording, Kawato, Doya, Todorov, Shadmehr, …]
– Associative learning [Dayan, Daw, Kakade, Courville, Touretzky, Kruschke, …]
– Memory [Anderson, Schooler, Shiffrin, Steyvers, Griffiths, McClelland, …]
– Attention [Mozer, Huber, Torralba, Oliva, Geisler, Yu, Itti, Baldi, …]
– Categorization and concept learning [Anderson, Nosofsky, Rehder, Navarro, Griffiths, Feldman, Tenenbaum, Rosseel, Goodman, Kemp, Mansinghka, …]
– Reasoning [Chater, Oaksford, Sloman, McKenzie, Heit, Tenenbaum, Kemp, …]
– Causal inference [Waldmann, Sloman, Steyvers, Griffiths, Tenenbaum, Yuille, …]
– Decision making and theory of mind [Lee, Stankiewicz, Rao, Baker, Goodman, Tenenbaum, …]

How to meet up with mainstream JDM research (i.e., heuristics & biases)?
1. How to reconcile the apparently contradictory messages of H&B and Bayesian models? Are people Bayesian or aren't they? When are they, when aren't they, and why?
2. How to integrate the H&B and Bayesian research approaches?

When are people Bayesian, and why?
Low-level hypothesis (Shiffrin, Maloney, etc.)
– People are Bayesian in low-level input or output processes that have a long evolutionary history shared with other species, e.g. vision, motor control, memory retrieval.

When are people Bayesian, and why?
Low-level hypothesis (Shiffrin, Maloney, etc.)
Information format hypothesis (Gigerenzer)
– Higher-level cognition can be Bayesian when information is presented in formats that we have evolved to process, and that support simple heuristic algorithms; e.g., base-rate neglect disappears with "natural frequencies" (see the sketch below).
[Figure: the same problem stated with explicit probabilities vs. natural frequencies]
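A minimal sketch of the format point, under assumed numbers: the classic diagnosis problem (the 1% base rate, 80% hit rate, and 9.6% false-alarm rate are illustrative values, not from the slide) yields the same posterior whether computed from explicit probabilities or by counting natural frequencies; Gigerenzer's claim is only that people find the second format far easier.

```python
# Illustrative numbers (not from the slide): 1% base rate,
# 80% hit rate, 9.6% false-alarm rate.

def posterior_from_probabilities(prior, hit, fa):
    """Bayes' rule stated over explicit probabilities."""
    p_positive = prior * hit + (1 - prior) * fa
    return prior * hit / p_positive

def posterior_from_frequencies(n=1000, prior=0.01, hit=0.8, fa=0.096):
    """The same inference as natural frequencies: 'of 1000 people,
    10 have the disease and 8 of them test positive; of the 990
    without it, about 95 also test positive.'"""
    sick_pos = n * prior * hit           # 8 true positives
    healthy_pos = n * (1 - prior) * fa   # ~95 false positives
    return sick_pos / (sick_pos + healthy_pos)

print(posterior_from_probabilities(0.01, 0.8, 0.096))  # ~0.078
print(posterior_from_frequencies())                    # same answer
```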

When are people Bayesian, and why?
Low-level hypothesis (Shiffrin, Maloney, etc.)
Information format hypothesis (Gigerenzer)
Core capacities hypothesis
– Bayes can illuminate core human cognitive capacities for inductive inference – learning words and concepts, projecting properties of objects, causal inference, or action understanding: problems we solve effortlessly, unconsciously, and successfully in natural contexts, which any five-year-old solves better than any animal or computer.

When are people Bayesian, and why?
Low-level hypothesis (Shiffrin, Maloney, etc.)
Information format hypothesis (Gigerenzer)
Core capacities hypothesis
– Causal induction (Sobel, Griffiths, Tenenbaum, & Gopnik)
[Figure: blicket-detector causal graphs and trial sequences with objects A and B]

When are people Bayesian, and why?
Low-level hypothesis (Shiffrin, Maloney, etc.)
Information format hypothesis (Gigerenzer)
Core capacities hypothesis
– Word learning (Tenenbaum & Xu); see the sketch below.
[Figure: hypothesis space of candidate word meanings and example data]
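The core of the Tenenbaum & Xu analysis is the "size principle": candidate word meanings are nested extensions, and the likelihood of n independently sampled examples is (1/|h|)^n, so smaller hypotheses win rapidly as consistent examples accumulate. A minimal sketch, with made-up extension sizes and priors (the actual stimuli and numbers are not in this transcript):

```python
# Hypotheses about what a novel word means, as (extension size, prior).
# All values below are illustrative assumptions.
hypotheses = {
    "dalmatians": (10, 0.1),
    "dogs":       (100, 0.3),
    "animals":    (1000, 0.6),
}

def posterior(n_examples):
    """P(h | d) for n examples, each consistent with every hypothesis
    (e.g., n Dalmatians). Likelihood is (1/|h|)^n: smaller hypotheses
    predict the observed examples more strongly."""
    scores = {h: prior * (1.0 / size) ** n_examples
              for h, (size, prior) in hypotheses.items()}
    z = sum(scores.values())
    return {h: s / z for h, s in scores.items()}

print(posterior(1))  # "dogs" keeps ~22% of the mass after one example
print(posterior(3))  # three Dalmatians: "dalmatians" takes ~99.7%
```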

When are people Bayesian, and why?
Low-level hypothesis (Shiffrin, Maloney, etc.)
Information format hypothesis (Gigerenzer)
Core capacities hypothesis
– Bayes can illuminate core human cognitive capacities for inductive inference – learning words and concepts, projecting properties of objects, causal inference, or action understanding: problems we solve effortlessly, unconsciously, and successfully in natural contexts, which a five-year-old solves better than any animal or computer.
– The mind is not good at explicit Bayesian reasoning about verbally or symbolically presented statistics, unless core capacities can be engaged.

When are people Bayesian, and why?
Low-level hypothesis (Shiffrin, Maloney, etc.)
Information format hypothesis (Gigerenzer)
Core capacities hypothesis
[Figure (Krynski & Tenenbaum): statistical vs. causal versions of the diagnosis problem; correct responding vs. base-rate neglect]

How to meet up with mainstream JDM research (i.e., heuristics & biases)?
1. How to reconcile the apparently contradictory messages of H&B and Bayesian models? Are people Bayesian or aren't they? When are they, when aren't they, and why?
2. How to integrate the H&B and Bayesian research approaches?

Reverse engineering
The goal is to reverse-engineer human inference:
– A computational understanding of how the mind works and why it works the way it does.
Even for core inferential capacities, we are likely to observe behavior that deviates from any ideal Bayesian analysis. These deviations are likely to be informative about how the mind works.

Analogy to visual illusions (Shepard)
– Highlight the problems the visual system is designed to solve: inferring world structure from images, not judging properties of the images themselves.
– Reveal the visual system's implicit assumptions about the physical world and the processes of image formation that are needed to solve these problems.
[Figure: visual illusion (Adelson)]

How do we interpret deviations from a Bayesian analysis?
H&B: People aren't Bayesian, but use some other means of inference.
– Base-rate neglect: representativeness heuristic
– Recency bias: availability heuristic
– Order of evidence effects: anchoring and adjustment
– …
Not so compelling as reverse engineering.
– What engineer would want to design a system based on "representativeness", without knowing how it is computed, why it is computed that way, what problem it attempts to solve, when it works, or how its accuracy and efficiency compare to some ideal computation or to other heuristics?

How do we interpret deviations from a Bayesian analysis? Multiple levels of analysis (Marr)
Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?
Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?
Hardware implementation

How do we interpret deviations from a Bayesian analysis? Multiple levels of analysis (Marr)
Computational theory (Bayes)
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?
Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?
Hardware implementation

Different philosophies
H&B:
– One canonical Bayesian analysis of any given task, and we know what it is.
– The ideal Bayesian solution can be computed.
– The question "Are people Bayesian?" is empirically meaningful on any given task.
Bayes+Marr:
– Many possible Bayesian analyses of any given task, and we need to discover which best characterize cognition.
– The ideal Bayesian solution can only be approximately computed.
– The question "Are people Bayesian?" is not an empirical one, at least not for an individual task. Bayes is a framework-level assumption, like distributed representations in connectionism or condition-action rules in ACT-R.

How do we interpret deviations from a Bayesian analysis? Multiple levels of analysis (Marr)
Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?
Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?
Hardware implementation

The centrality of causal inference (Griffiths & Tenenbaum)
In visual perception:
– Judge P(scene | image features) rather than P(image features | scene) or P(image features | other image features).
Coin flipping:
– Which sequence is more likely to come from flipping a fair coin, HHTHT or HHHHH?
Coincidences:
– How likely is it that 2 people in a random party of 25 share the same birthday? That 3 do in a party of 10? (See the sketch below.)
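As a quick check on the coincidence intuitions (a sketch assuming 365 equally likely birthdays and ignoring leap years; the slide itself gives no numbers): an exact computation for a shared pair among 25 people, and Monte Carlo for a shared triple among 10.

```python
import random
from collections import Counter

def p_pair_shares(n, days=365):
    """Exact P(at least one shared birthday), via the complement."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct

def p_k_share(n, k, days=365, trials=200_000):
    """Monte Carlo estimate of P(some birthday shared by >= k of n)."""
    hits = 0
    for _ in range(trials):
        counts = Counter(random.randrange(days) for _ in range(n))
        if max(counts.values()) >= k:
            hits += 1
    return hits / trials

print(p_pair_shares(25))  # ~0.57: a shared pair among 25 is likely
print(p_k_share(10, 3))   # ~0.001: three sharing among 10 is a real coincidence
```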

(Griffiths & Tenenbaum)
Rational measure of evidential support: the log likelihood ratio

    support = log [ P(data | h1) / P(data | h0) ]

[Figures: judgments of randomness; judgments of coincidence]
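Applied to the coin sequences above, with an illustrative choice for the alternative hypothesis (a coin of unknown bias, integrated out under a uniform prior; the slide does not specify the models), the support measure distinguishes HHTHT from HHHHH even though both have probability 1/32 under a fair coin:

```python
from math import log, comb

def p_random(seq):
    """P(sequence | fair coin): every sequence is equally likely."""
    return 0.5 ** len(seq)

def p_regular(seq):
    """P(sequence | coin of unknown bias), integrating the bias out
    under a uniform prior: the Beta-Bernoulli marginal likelihood
    1 / ((n + 1) * C(n, h))."""
    n, h = len(seq), seq.count("H")
    return 1.0 / ((n + 1) * comb(n, h))

def support_for_regularity(seq):
    """log P(seq | regular) - log P(seq | random): positive values
    mean the sequence looks non-random."""
    return log(p_regular(seq)) - log(p_random(seq))

print(support_for_regularity("HHTHT"))  # ~ -0.63: no evidence against fairness
print(support_for_regularity("HHHHH"))  # ~ +1.67: evidence for a biased process
```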

How do we interpret deviations from a Bayesian analysis? Multiple levels of analysis (Marr)
Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?
Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?
Hardware implementation

Assuming the world is simple
In visual perception:
– "Slow and smooth" prior on visual motion.
Causal induction:
– P(blicket) = 1/6, deterministic "activation law".
– After trials A+ (A alone activates the detector) and AB+ (A and B together activate it): P(A is a blicket | data) = 1, P(B is a blicket | data) ≈ 1/6.
– In a second condition (trials with objects A, B, C shown in the figure): P(A is a blicket | data) ≈ 3/4, P(B is a blicket | data) ≈ 1/4.
[Figure: blicket-detector trial sequences; see the sketch below]
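A sketch of the first computation, under the standard backwards-blocking reading of those trials (an assumption about the design, though it reproduces the slide's numbers exactly): enumerate the four hypotheses about which of A and B are blickets, score each against the deterministic activation law, and renormalize.

```python
from itertools import product

P_BLICKET = 1 / 6  # prior probability that any given object is a blicket

def posterior(trials):
    """Posterior over which of A and B are blickets, given trials of
    the form (objects_on_detector, detector_activated). Deterministic
    activation law: the detector lights iff some blicket is on it."""
    weights = {}
    for hyp in product([False, True], repeat=2):  # (A status, B status)
        prior = 1.0
        for is_blicket in hyp:
            prior *= P_BLICKET if is_blicket else 1 - P_BLICKET
        blickets = {name for name, on in zip("AB", hyp) if on}
        consistent = all(bool(objs & blickets) == activated
                         for objs, activated in trials)
        weights[hyp] = prior if consistent else 0.0
    z = sum(weights.values())
    return {h: w / z for h, w in weights.items()}

# Trials: A alone activates the detector; then A and B together do.
post = posterior([({"A"}, True), ({"A", "B"}, True)])
print(sum(w for (a, _), w in post.items() if a))  # P(A is a blicket) = 1.0
print(sum(w for (_, b), w in post.items() if b))  # P(B is a blicket) ~ 1/6
```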

Recognizing the world is complex (Kemp & Tenenbaum)
In visual perception:
– Need uncertainty about the coherence ratio and velocity of coherent motion. (Lu & Yuille)
Property induction:
– Properties should be distributed stochastically over a tree structure, not just focused on single branches.
Example argument: Gorillas have T9 cells. Seals have T9 cells. Horses have T9 cells.
Bayes with a single-branch prior: r = 0.50 (correlation with human judgments)

Recognizing the world is complex (Kemp & Tenenbaum)
In visual perception:
– Need uncertainty about the coherence ratio and velocity of coherent motion. (Lu & Yuille)
Property induction:
– Properties should be distributed stochastically over a tree structure, not just focused on single branches.
Example argument: Gorillas have T9 cells. Seals have T9 cells. Horses have T9 cells.
Bayes with a "mutation" prior: r = 0.92 (correlation with human judgments; see the sketch below)
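To make the contrast concrete, here is a toy mutation prior (the taxonomy, mutation rate, and species below are illustrative assumptions, not Kemp & Tenenbaum's actual tree or stimuli): a property flips on or off along each branch, so a scattered extension like {gorilla, seal, horse} gets small but nonzero prior probability, whereas a single-branch prior assigns it exactly zero.

```python
import random

# A toy taxonomy as (parent -> children); leaves are species.
TREE = {"root": ["primates", "marine", "ungulates"],
        "primates": ["gorilla", "chimp"],
        "marine": ["seal", "dolphin"],
        "ungulates": ["horse", "cow"]}
MUTATION_RATE = 0.1  # probability the trait flips state along an edge

def sample_property(node="root", has_it=False):
    """Sample which species have a novel property: the trait flips
    state along each edge with probability MUTATION_RATE."""
    extension = set()
    for child in TREE.get(node, []):
        child_has = has_it != (random.random() < MUTATION_RATE)
        if child in TREE:                     # internal node: recurse
            extension |= sample_property(child, child_has)
        elif child_has:                       # leaf: record the species
            extension.add(child)
    return extension

# Estimate the prior probability of the scattered extension
# {gorilla, seal, horse}, which is impossible under a single-branch prior.
target = {"gorilla", "seal", "horse"}
trials = 100_000
hits = sum(sample_property() == target for _ in range(trials))
print(hits / trials)  # small but nonzero under the mutation prior
```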

Example properties (Kemp & Tenenbaum):
– "has T9 hormones"
– "is found near Minneapolis"
– "can bite through wire"
– "carry E. Spirus bacteria"

How do we interpret deviations from a Bayesian analysis? Multiple levels of analysis (Marr)
Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?
Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?
Hardware implementation

Sampling-based approximate inference (Griffiths et al., Goodman et al.)
In visual perception:
– Temporal dynamics of bi-stability due to fast sampling-based approximation of a bimodal posterior (Schrater & Sundareswara).
Order effects in category learning:
– Particle filter (sequential Monte Carlo), an online approximate inference algorithm assuming stationarity.
Probability matching in classification decisions:
– Sampling-based approximations with guarantees of near-optimal generalization performance (see the sketch below).
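A minimal sketch of the sampling idea, with an illustrative posterior (the 0.7/0.3 split and the majority-vote decision rule are assumptions for the demo, in the spirit of the Goodman et al. line of work): an agent that classifies by drawing k posterior samples and taking the majority probability-matches when k = 1 and approaches maximizing as k grows.

```python
import random
from collections import Counter

# An illustrative posterior over two categories for some stimulus.
POSTERIOR = {"category_A": 0.7, "category_B": 0.3}

def sample_posterior():
    """Draw one category label from the posterior."""
    return random.choices(list(POSTERIOR),
                          weights=list(POSTERIOR.values()))[0]

def decide(k):
    """Classify by majority vote over k posterior samples
    (odd k avoids ties)."""
    votes = Counter(sample_posterior() for _ in range(k))
    return votes.most_common(1)[0][0]

for k in (1, 5, 25):
    choices = Counter(decide(k) for _ in range(10_000))
    print(k, choices["category_A"] / 10_000)
# k=1  -> ~0.70 (probability matching)
# k=25 -> ~0.98 (close to always choosing the more probable category)
```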

Conclusions
"Are people Bayesian?", "When are they Bayesian?"
– Maybe not the most interesting questions in the long run… What is the best way to reverse-engineer cognition at multiple levels of analysis?
Assuming core inductive capacities are approximately Bayesian at the computational-theory level offers several benefits:
– Explanatory power: why does cognition work?
– Fewer degrees of freedom in modeling
– A bridge to state-of-the-art AI and machine learning
– Tools to study the big questions: What are the goals of cognition? What does the mind know about the world? How is that knowledge represented? What are the processing mechanisms, and why do they work as they do?