Causes and coincidences
Tom Griffiths
Cognitive and Linguistic Sciences, Brown University

“It could be that, collectively, the people in New York caused those lottery numbers to come up … If enough people all are thinking the same thing, at the same time, they can cause events to happen… It's called psychokinesis.”

[Figure: comet sightings separated by intervals of 76 and 75 years (Halley, 1752)]

The paradox of coincidences: how can coincidences simultaneously lead us to irrational conclusions and significant discoveries?

Outline
1. A Bayesian approach to causal induction
2. Coincidences
   i. what makes a coincidence?
   ii. rationality and irrationality
   iii. the paradox of coincidences
3. Explaining inductive leaps

Causal induction
Inferring causal structure from data: a task we perform every day…
– does caffeine increase productivity?
…and throughout science
– three comets or one?

Reverend Thomas Bayes

Bayes’ theorem

$$p(h \mid d) = \frac{p(d \mid h)\, p(h)}{\sum_{h' \in H} p(d \mid h')\, p(h')}$$

where h is a hypothesis and d is data: the posterior probability equals the likelihood times the prior probability, normalized by a sum over the space of hypotheses.
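A concrete, purely illustrative sketch of this computation over a discrete hypothesis space (the hypotheses and numbers below are invented, not from the talk):

```python
# Minimal sketch of Bayes' theorem over a discrete hypothesis space.

def posterior(priors, likelihoods):
    """Return p(h|d) for each h, given p(h) and p(d|h)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # sum over the space of hypotheses
    return {h: joint[h] / evidence for h in joint}

# Two invented hypotheses about a coin: fair vs. biased toward heads.
priors = {"fair": 0.9, "biased": 0.1}               # p(h)
likelihoods = {"fair": 0.5**10, "biased": 0.9**10}  # p(d|h), d = 10 heads

print(posterior(priors, likelihoods))  # posterior shifts toward "biased"
```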

Causal graphical models (Pearl, 2000; Spirtes et al., 1993)
– Variables: X, Y, Z
– Structure: a directed graph over the variables (here X → Z ← Y)
– Conditional probabilities: p(x), p(y), p(z | x, y)
Together these define a probability distribution over the variables, for both observation and intervention.
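A minimal sketch of how such a model defines a distribution usable for both observation and intervention; the structure X → Z ← Y follows the slide, but all probability values are invented:

```python
import random

# Causal graphical model X -> Z <- Y over binary variables.
# The joint distribution factorizes as p(x) p(y) p(z|x,y).

p_x = 0.3                            # p(X = 1), invented
p_y = 0.6                            # p(Y = 1), invented
p_z = {(0, 0): 0.1, (0, 1): 0.4,
       (1, 0): 0.5, (1, 1): 0.9}     # p(Z = 1 | X = x, Y = y), invented

def sample(intervene_x=None):
    """Draw (x, y, z). Passing intervene_x models the intervention do(X = x):
    X is set from outside, while p(y) and p(z|x,y) stay unchanged."""
    x = intervene_x if intervene_x is not None else int(random.random() < p_x)
    y = int(random.random() < p_y)
    z = int(random.random() < p_z[(x, y)])
    return x, y, z

print(sample())               # observation: a draw from the joint
print(sample(intervene_x=1))  # intervention: a draw from p(y) p(z|1,y)
```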

Bayesian causal induction
– Hypotheses: causal structures
– Likelihoods: the probability distribution over variables defined by each structure
– Priors: the a priori plausibility of structures
– Data: observations of variables

Causal induction from contingencies
“Does C cause E?” (rate on a scale from 0 to 100)

                    E present (e+)    E absent (e-)
C present (c+)            a                 b
C absent (c-)             c                 d

Buehner & Cheng (1997)
“Does the chemical cause gene expression?” (rate on a scale from 0 to 100), with the same contingency table: chemical present/absent versus gene expressed/not expressed.

Buehner & Cheng (1997) examined human judgments for all values of P(e+ | c+) and P(e+ | c-) in increments of 0.25. How can we explain these judgments? [Figure: people’s causal ratings across contingency conditions]

Bayesian causal induction
– Hypotheses: two causal structures, “cause” (B → E ← C: background B and candidate cause C both influence E) vs. “chance” (B → E only)
– Likelihoods: each cause has an independent opportunity to produce the effect (a noisy-OR parameterization)
– Priors: p for “cause”, 1 − p for “chance”
– Data: frequency of cause-effect co-occurrence

Bayesian causal induction
The evidence for a causal relationship is the likelihood ratio favoring the “cause” structure over the “chance” structure.
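A sketch of that evidence computation under the assumptions above (noisy-OR likelihoods); the uniform priors over the causal strengths w0 and w1 and the grid integration are my own simplifications:

```python
import numpy as np

# Log likelihood ratio ("evidence") for the cause structure (B -> E <- C)
# over the chance structure (B -> E), with noisy-OR likelihoods and
# uniform priors over the strengths w0 (background) and w1 (cause).

def support(n_e_c, n_c, n_e_nc, n_nc, grid=101):
    """n_e_c of n_c trials with the cause show the effect;
    n_e_nc of n_nc trials without it do. Returns log LR cause/chance."""
    w = np.linspace(0.001, 0.999, grid)

    # chance: only the background produces E, so p(e+) = w0 everywhere
    like_chance = (w ** (n_e_c + n_e_nc)
                   * (1 - w) ** ((n_c - n_e_c) + (n_nc - n_e_nc)))
    p_chance = like_chance.mean()        # approximates the integral over w0

    # cause: noisy-OR, p(e+|c+) = w1 + w0 - w1*w0 and p(e+|c-) = w0
    w0, w1 = np.meshgrid(w, w)
    p_cplus = w1 + w0 - w1 * w0
    like_cause = (p_cplus ** n_e_c * (1 - p_cplus) ** (n_c - n_e_c)
                  * w0 ** n_e_nc * (1 - w0) ** (n_nc - n_e_nc))
    p_cause = like_cause.mean()          # integral over (w0, w1)

    return np.log(p_cause / p_chance)

print(support(8, 8, 0, 8))  # strong contingency: clear evidence for a cause
print(support(4, 8, 4, 8))  # no contingency: no evidence (slightly negative)
```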

[Figure: model comparison on Buehner & Cheng (1997): people vs. Bayes (r = 0.97), ΔP (r = 0.89), and causal power (r = 0.88)]

Other predictions
Causal induction from contingency data:
– sample size effects
– judgments for incomplete contingency tables (Griffiths & Tenenbaum, in press)
More complex cases:
– detectors (Tenenbaum & Griffiths, 2003)
– explosions (Griffiths, Baraff, & Tenenbaum, 2004)
– simple mechanical devices

[Figure: the stick-ball machine, with balls A and B (Kushnir, Schulz, Gopnik, & Danks, 2003)]

Outline
1. A Bayesian approach to causal induction
2. Coincidences
   i. what makes a coincidence?
   ii. rationality and irrationality
   iii. the paradox of coincidences
3. Explaining inductive leaps

What makes a coincidence?

A common definition: coincidences are unlikely events
– “an event which seems so unlikely that it is worth telling a story about”
– “we sense that it is too unlikely to have been the result of luck or mere chance”

Coincidences are not just unlikely: HHHHHHHHHH vs. HHTHTHTTHT. Both sequences are equally likely under a fair coin (each has probability (1/2)^10), yet only the first seems like a coincidence.

Bayesian causal induction

                              Prior odds
                         high            low
Likelihood ratio  high   cause            ?
(evidence)        low      ?            chance

Bayesian causal induction

                              Prior odds
                         high            low
Likelihood ratio  high   cause        coincidence
(evidence)        low      ?            chance

What makes a coincidence?
A coincidence is an event that provides evidence for causal structure, but not enough evidence to make us believe that structure exists:
– the likelihood ratio is high
– the prior odds are low
– the posterior odds are middling
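In odds form, Bayes’ rule ties these three quantities together (a standard identity, spelled out here for clarity):

$$
\underbrace{\frac{p(\text{cause}\mid d)}{p(\text{chance}\mid d)}}_{\text{posterior odds}}
\;=\;
\underbrace{\frac{p(d\mid \text{cause})}{p(d\mid \text{chance})}}_{\text{likelihood ratio}}
\;\times\;
\underbrace{\frac{p(\text{cause})}{p(\text{chance})}}_{\text{prior odds}}
$$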

Bayesian causal induction
– Hypotheses: “cause” (C → E) vs. “chance” (no link between C and E)
– Likelihoods: under “cause”, 0 < p(E) < 1 (unknown); under “chance”, p(E) = 0.5
– Priors: p for “cause” (small), 1 − p for “chance”
– Data: frequency of effect in presence of cause
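A sketch of the resulting likelihood ratio. The slide only says 0 < p(E) < 1 under “cause”; assuming a uniform prior over that unknown probability, the marginal likelihood is the Beta-Bernoulli integral 1 / ((n + 1) C(n, k)):

```python
from math import comb, log2

# log2 [ p(d | cause) / p(d | chance) ] for k heads in n flips:
# "chance" fixes p(E) = 0.5; "cause" integrates the unknown p(E)
# against a uniform prior (my assumption for this sketch).

def log2_likelihood_ratio(k, n):
    p_chance = 0.5 ** n
    p_cause = 1.0 / ((n + 1) * comb(n, k))
    return log2(p_cause / p_chance)

print(log2_likelihood_ratio(10, 10))  # HHHHHHHHHH: high (about 6.5 bits)
print(log2_likelihood_ratio(5, 10))   # HHTHTHTTHT: low (negative)
```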

HHHHHHHHHH: likelihood ratio is high, prior odds are low, posterior odds are middling (coincidence)
HHTHTHTTHT: likelihood ratio is low, prior odds are low, posterior odds are low (chance)

HHHH: likelihood ratio is middling, prior odds are low, posterior odds are low (mere coincidence)
HHHHHHHHHH: likelihood ratio is high, prior odds are low, posterior odds are middling (suspicious coincidence)
HHHHHHHHHHHHHHHHHH: likelihood ratio is very high, prior odds are low, posterior odds are high (cause)

Mere and suspicious coincidences
A mere coincidence becomes a suspicious coincidence as evidence for a causal relation grows. The transition is produced by:
– an increase in the likelihood ratio (e.g., coin flipping)
– an increase in the prior odds (e.g., genetics vs. ESP)

Testing the definition
Provide participants with data from experiments. Manipulate:
– cover story: genetic engineering vs. ESP (prior)
– data: number of males/heads (likelihood)
– task: “coincidence or evidence?” vs. “how likely?”
Predictions:
– coincidences affected by prior and likelihood
– relationship between coincidence and posterior

[Figure: proportion of “coincidence” responses and posterior probability as a function of the number of heads/males]

Rationality and irrationality

                              Prior odds
                         high            low
Likelihood ratio  high   cause        coincidence
(evidence)        low      ?            chance

The bombing of London (Gilovich, 1991)

[Figure: people’s judgments as a function of changes in the spread, location, ratio, and number of points, relative to uniform]

Bayesian causal induction
– Hypotheses: “cause” (bombing targets a specific location) vs. “chance” (bombs fall uniformly)
– Likelihoods: uniform + regularity (“cause”) vs. uniform (“chance”)
– Priors: p for “cause”, 1 − p for “chance”
– Data: bomb locations
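The slide leaves “uniform + regularity” unspecified. As one concrete reading, the sketch below treats the regularity as a Gaussian blob with an unknown center that is integrated out on a grid; the blob width, mixing weight, and coordinates are all illustrative assumptions:

```python
import numpy as np

# Do point locations support a targeted "cause" (uniform + regularity)
# over pure "chance" (uniform)? Points live in the unit square, where
# the uniform density is 1, so log p(points | chance) = 0.

def log_lr_targeted(points, width=0.1, grid=25):
    """log p(points | uniform + regularity) - log p(points | uniform)."""
    centers = np.linspace(0, 1, grid)
    cx, cy = np.meshgrid(centers, centers)     # candidate target sites
    log_like = np.zeros_like(cx)
    for x, y in points:
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        gauss = np.exp(-d2 / (2 * width ** 2)) / (2 * np.pi * width ** 2)
        log_like += np.log(0.5 * gauss + 0.5)  # mixture: regularity + uniform
    p_cause = np.exp(log_like).mean()          # uniform prior on the center
    return np.log(p_cause)

clustered = [(0.50, 0.50), (0.52, 0.48), (0.49, 0.53), (0.51, 0.51)]
spread = [(0.1, 0.9), (0.8, 0.2), (0.3, 0.4), (0.9, 0.7)]
print(log_lr_targeted(clustered))  # positive: looks targeted
print(log_lr_targeted(spread))     # negative: looks uniform
```

The birthday model on the slides below has the same form, with a uniform distribution over calendar dates in place of the uniform map.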

[Figure: people’s judgments vs. Bayesian model across changes in spread, location, ratio, and number; r = 0.98]

Coincidences in date
May 14, July 8, August 21, December 25 vs. August 3, August 3, August 3, August 3

[Figure: people’s judgments of coincidences in date]

Bayesian causal induction
– Hypotheses: “cause” (some process selects people with related birthdays) vs. “chance”
– Likelihoods: uniform + regularity over dates (“cause”) vs. uniform over dates (“chance”)
– Priors: p for “cause”, 1 − p for “chance”
– Data: birthdays of those present

[Figure: people’s judgments vs. Bayesian model for coincidences in date]

Rationality and irrationality
People’s sense of the strength of coincidences gives a close match to the likelihood ratio (bombing and birthdays). This suggests that we accept false conclusions when our prior odds are insufficiently low.

                              Prior odds
                         high            low
Likelihood ratio  high   cause        coincidence
(evidence)        low      ?            chance

The paradox of coincidences
Prior odds can be low for two reasons:

Reason                      Consequence
Incorrect current theory    Significant discovery
Correct current theory      False conclusion

Attending to coincidences makes more sense the less you know.

Coincidences
– provide evidence for causal structure, but not enough to make us believe that structure exists
– are intimately related to causal induction: an opportunity to discover a theory is wrong
– are guided by a well-calibrated sense of when an event provides evidence of causal structure

Outline
1. A Bayesian approach to causal induction
2. Coincidences
   i. what makes a coincidence?
   ii. rationality and irrationality
   iii. the paradox of coincidences
3. Explaining inductive leaps

Explaining inductive leaps
How do people
– infer causal relationships
– identify the work of chance
– predict the future
– assess similarity and make generalizations
– learn functions, languages, and concepts
…from such limited data? What knowledge guides human inferences?

Which sequence seems more random? HHHHHHHHHH vs. HHTHTHTTHT

Subjective randomness
Typically evaluated in terms of p(d | chance). Assessing randomness is part of causal induction: randomness is evidence for a random generating process.

Randomness and coincidences
Evidence for a random generating process is the inverse of the strength of a coincidence.
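One way to make the relationship precise (my formulation, consistent with the strong negative correlation on the next slide): if randomness is evidence for a random generating process, then

$$
\text{randomness}(d) \;=\; \log\frac{p(d\mid \text{chance})}{p(d\mid \text{cause})} \;=\; -\,\text{strength of coincidence}
$$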

[Figure: randomness judgments vs. strength of coincidence; r = −0.94]

[Figure: “Pick a random number…”: people’s responses vs. Bayesian model]

Bayes’ theorem
inference = f(data, knowledge): the likelihood carries the data, the prior carries the knowledge.

Predicting the future
Human predictions match optimal predictions derived from an empirical prior.
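A sketch of such an optimal prediction: given that a process has lasted t so far, predict its total duration from the posterior p(t_total | t) ∝ p(t | t_total) p(t_total), where p(t | t_total) = 1/t_total treats t as a random point within the total span. The Gaussian prior below is a stand-in for a real empirical prior (e.g., over life spans):

```python
import numpy as np

def predict_total(t, support, prior):
    """Posterior median of t_total, given that the process has lasted t."""
    post = np.where(support >= t, prior / support, 0.0)  # p(t|tt) = 1/tt
    post /= post.sum()
    return support[np.searchsorted(np.cumsum(post), 0.5)]

ages = np.arange(1.0, 121.0)                    # support: 1..120 years
prior = np.exp(-0.5 * ((ages - 75) / 16) ** 2)  # stand-in empirical prior
print(predict_total(40.0, ages, prior))         # close to the prior median
print(predict_total(90.0, ages, prior))         # must exceed 90: upper tail
```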

Iterated learning (Briscoe, 1998; Kirby, 2001)
Each learner infers a hypothesis from the previous learner’s data and produces data for the next:
d0 → (inference, p(h|d)) → h1 → (production, p(d|h)) → d1 → (inference) → h2 → …
(Griffiths & Kalish, submitted)
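A minimal simulation of this chain with Bayesian agents; the Beta-binomial setup is my illustrative choice. Because each generation samples a hypothesis from the posterior and then data from the likelihood, the chain is a Gibbs sampler on p(h, d), so the distribution of hypotheses converges to the prior (the Griffiths & Kalish result):

```python
import random

ALPHA, BETA, N_FLIPS = 1, 5, 10   # prior Beta(1, 5) over a coin weight h

def iterate(generations=10000):
    heads = 5                     # d0: arbitrary initial data
    samples = []
    for _ in range(generations):
        # inference: sample h from the posterior Beta(a + heads, b + tails)
        h = random.betavariate(ALPHA + heads, BETA + N_FLIPS - heads)
        # production: the next learner's data, d ~ Binomial(N_FLIPS, h)
        heads = sum(random.random() < h for _ in range(N_FLIPS))
        samples.append(h)
    return samples

chain = iterate()
print(sum(chain) / len(chain))    # approaches the prior mean, 1/6
```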

[Figure: simulation of iterated learning across iterations]

Conclusion
– Many cognitive judgments are the result of challenging problems of induction.
– Bayesian statistics provides a formal framework for exploring how people solve these problems.
– This makes it possible to ask: how do we make surprising discoveries? how do we learn so much from so little? what knowledge guides our judgments?

Collaborators
– Causal induction: Josh Tenenbaum (MIT), Liz Baraff (MIT)
– Iterated learning: Mike Kalish (University of Louisiana)

Causes and coincidences
“coincidence” appears in 13/60 cases; p(“cause”) = 0.01, p(“cause” | “coincidence”) = 0.26.

A reformulation: unlikely kinds
Coincidences are events of an unlikely kind (e.g., a sequence with that number of heads). This deals with the obvious problem above: as kinds, p(10 heads) < p(5 heads, 5 tails), even though the specific sequences are equally likely.
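Worked out for clarity:

$$
p(\text{10 heads}) = \left(\tfrac{1}{2}\right)^{10} \approx 0.001,
\qquad
p(\text{5 heads, 5 tails}) = \binom{10}{5}\left(\tfrac{1}{2}\right)^{10} = \tfrac{252}{1024} \approx 0.246
$$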

Problems with unlikely kinds
Defining kinds: what kind do dates like these belong to?
– August 3, August 3, August 3, August 3
– January 12, March 22, March 22, July 19, October 1, December 8

Problems with unlikely kinds
– Defining kinds
– Counterexamples: P(4 heads) < P(2 heads, 2 tails), yet P(4 heads) > P(15 heads, 8 tails); intuitively, HHHH is a bigger coincidence than HHHHTHTTHHHTHTHHTHTTHHH, and a bigger coincidence than HHTT.
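The numbers behind the counterexamples, worked out for clarity:

$$
p(\text{4 heads}) = \tfrac{1}{16} \approx 0.063,
\quad
p(\text{2 heads, 2 tails}) = \tfrac{6}{16} = 0.375,
\quad
p(\text{15 heads, 8 tails}) = \binom{23}{15}\left(\tfrac{1}{2}\right)^{23} \approx 0.058
$$

So “four heads” is a less probable kind than “two heads, two tails” but a more probable kind than “fifteen heads, eight tails”, while intuition ranks HHHH as the bigger coincidence in both comparisons.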