Optimal predictions in everyday cognition
Tom Griffiths (Brown University) and Josh Tenenbaum (MIT)

Presentation transcript:

Optimal predictions in everyday cognition
Tom Griffiths (Brown University) and Josh Tenenbaum (MIT)
Outline: Predicting the future. Optimality and Bayesian inference. Results. The effects of prior knowledge.

Many people believe that perception is optimal… …but cognition is not. In particular, there is controversy over whether people's inferences follow Bayes' rule,

p(h | d) = p(d | h) p(h) / Σ_h' p(d | h') p(h')     (h: hypothesis, d: data)

that is, posterior probability ∝ likelihood × prior probability, normalized by a sum over the space of hypotheses. Bayes' rule indicates how a rational agent should update beliefs about hypotheses h in light of data d. Several results suggest people do not combine prior probabilities with data correctly (e.g., Tversky & Kahneman, 1974).

A puzzle
If they do not use priors, how do people…
– predict the future
– infer causal relationships
– identify the work of chance
– assess similarity and make generalizations
– learn languages and concepts
…and solve other inductive problems? Drawing strong conclusions from limited data requires using prior knowledge.

Strategy: examine the influence of prior knowledge in an inductive problem we solve every day.

Predicting the future
How often is Google News updated? Let t = time since the last update and t_total = time between updates. What should we guess for t_total given t?

More generally: you encounter a phenomenon that has existed for t units of time. How long will it continue into the future (i.e., what is t_total)? We could replace "time" with any other variable that ranges from 0 to some unknown upper limit.

Everyday prediction problems
– You read about a movie that has made $60 million to date. How much money will it make in total?
– You see that something has been baking in the oven for 34 minutes. How long until it's ready?
– You meet someone who is 78 years old. How long will they live?
– Your friend quotes to you from line 17 of his favorite poem. How long is the poem?
– You see taxicab #107 pull up to the curb in front of the train station. How many cabs are in this city?

Bayesian inference
p(t_total | t) ∝ p(t | t_total) p(t_total)     (posterior ∝ likelihood × prior)
Assuming random sampling, the likelihood is p(t | t_total) = 1/t_total for t ≤ t_total (and 0 otherwise).

What is the best guess for t_total? (Call it t*.) Not the maximal value of p(t_total | t): that is just t* = t. We use the posterior median, the value t* satisfying P(t_total < t* | t) = 0.5. [Plot: the posterior p(t_total | t) as a function of t_total, peaking at t_total = t and decaying beyond it.]

What should we use as the prior, p(t_total)? Gott (1993): use the uninformative prior p(t_total) ∝ 1/t_total. This yields a simple prediction rule: t* = 2t. For example, with t ≈ 4000 years, t* ≈ 8000 years.
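As a minimal numerical sketch (not from the talk), Gott's rule can be checked by discretizing t_total on a grid: with the 1/t_total likelihood and the 1/t_total prior, the posterior median lands at twice the observed duration. The truncation point and the test value t = 10 are arbitrary illustrative choices.

import numpy as np

# Check of Gott's rule (a sketch, not the authors' code). With prior
# p(t_total) ∝ 1/t_total and likelihood p(t | t_total) = 1/t_total for
# t_total >= t, the posterior is ∝ 1/t_total^2 on [t, ∞), whose median is 2t.
def gott_prediction(t, upper=1e6, n=2_000_000):
    grid = np.linspace(t, upper, n)         # truncate the improper posterior at `upper`
    posterior = grid ** -2.0                # likelihood (1/t_total) times prior (1/t_total)
    cdf = np.cumsum(posterior) / posterior.sum()
    return grid[np.searchsorted(cdf, 0.5)]  # posterior median: P(t_total < t* | t) = 0.5

print(gott_prediction(10.0))  # prints roughly 20, i.e. t* = 2t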
Predicting everyday events
This seems like a good strategy:
– You meet someone who is 35 years old. How long will they live? "70 years" seems reasonable.
But it is not so simple:
– You meet someone who is 78 years old. How long will they live?
– You meet someone who is 6 years old. How long will they live?
The answers depend on prior knowledge: the effects of priors.

Evaluating human predictions
Different domains with different priors:
– a movie has made $60 million [power-law]
– your friend quotes from line 17 of a poem [power-law]
– you meet a 78-year-old man [Gaussian]
– a movie has been running for 55 minutes [Gaussian]
– a U.S. congressman has served for 11 years [Erlang]
Prior distributions were derived from actual data. Five values of t were used for each scenario, and people predicted t_total: a total of 350 participants and ten scenarios. [Results plot: people's median predictions as a function of t, compared with the Bayesian model under the parametric prior, the empirical prior, and Gott's rule.]

Nonparametric priors
You arrive at a friend's house, and see that a cake has been in the oven for 34 minutes. How long will it be in the oven? People make good predictions despite the complex distribution of baking times.

No direct experience
You learn that in ancient Egypt, there was a great flood in the 11th year of a pharaoh's reign. How long did he reign? How long did the typical pharaoh reign in ancient Egypt? People identify the form of the distribution, but are mistaken about the parameters.

Conclusions
People produce accurate predictions for the duration and extent of everyday events. People have strong prior knowledge:
– the form of the prior (power-law or exponential)
– the distribution given that form (parameters)
– a non-parametric distribution when necessary
This reveals a surprising correspondence between probabilities in the mind and in the world, and suggests that people do use prior probabilities in making inductive inferences.
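To make the effect of the prior concrete, here is a short sketch (mine, not the authors' model code) of the same posterior-median computation under a power-law prior, a Gaussian prior, and an empirical prior built from samples, as in the cake scenario. All parameter values and the sample durations are illustrative assumptions, not the distributions fitted in the study.

import numpy as np

def predict(t, prior, grid):
    # Posterior-median prediction t*, given the phenomenon has lasted t so far.
    # Likelihood: p(t | t_total) = 1/t_total for t_total >= t (random sampling).
    likelihood = np.where(grid >= t, 1.0 / grid, 0.0)
    posterior = likelihood * prior
    cdf = np.cumsum(posterior) / posterior.sum()
    return grid[np.searchsorted(cdf, 0.5)]   # t* such that P(t_total < t* | t) = 0.5

grid = np.linspace(1.0, 300.0, 300_000)

# Illustrative prior families (exponent and Gaussian parameters are assumptions):
power_law = grid ** -1.5                               # e.g. movie grosses, poem lengths
gaussian = np.exp(-0.5 * ((grid - 75.0) / 15.0) ** 2)  # e.g. human life spans

# An empirical prior from made-up sample durations (cake baking times):
samples = np.array([25.0, 30.0, 35.0, 40.0, 45.0, 60.0, 90.0, 120.0])
counts, _ = np.histogram(samples, bins=60, range=(1.0, 300.0))
empirical = np.repeat(counts, len(grid) // 60).astype(float) + 1e-9  # step prior on the grid

for name, prior in [("power-law", power_law), ("Gaussian", gaussian), ("empirical", empirical)]:
    print(name, [float(predict(t, prior, grid)) for t in (6.0, 35.0, 78.0)])

A power-law prior yields roughly multiplicative predictions (t* scales with t), while a Gaussian prior pulls a small t up toward the mean and barely extends a large t, which is the qualitative pattern of domain differences described above.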