A probabilistic approach to cognition

The Bayesian Brain: A probabilistic approach to cognition

Background

- Computational/Theoretical Neuroscience: describe the brain in the language of mathematics (as we do with the universe)
- Beyond giving a better explanation, could a (shared) mathematical description also bridge the brain/mind dichotomy?
- Could such a description also relate brains to machines (neuroscience in engineering departments)?
- The engineering approach to intelligence: the brain as a machine
- The brain as a computational device: what are the computations?
- Marr and his three levels of studying the mind/brain
- Lots of data available out there: try to find structure and a causal model of the world (making sense of data)
- AI: give instructions vs. give data (program vs. train)
- The Renaissance of Artificial Intelligence (three reasons)

The Bayesian approach

- The Bayesian brain: a possible mathematical framework (1990s) for how the brain works as a probabilistic inference machine
- Helmholtz (born 1821): the brain as a device making unconscious inferences about the world, inferring the state of the world from noisy and incomplete data
- Bayes (born circa 1701) and his theorem
- Probability theory: a system with no memory
- Conditional probabilities (e.g. married with kids)
- Probability of A & B being true
- Venn diagrams (e.g. dice rolling)
- Diagnosed with a rare disease: whether you have the disease can be checked with a medical test that is correct 90% of the time. If you take the test and it comes back positive, what is the probability that you have the disease? (A worked derivation follows below.)
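
A worked derivation of the rare-disease question, assuming the prior stated on a later slide (the disease affects 1 in 100) and reading "90% correct" as both sensitivity and specificity of 0.9:

```latex
% Rare-disease problem. Assumed: P(disease) = 1/100 (from a later slide),
% and "90% correct" = sensitivity = specificity = 0.9.
P(\text{disease} \mid +)
  = \frac{P(+ \mid \text{dis.})\,P(\text{dis.})}
         {P(+ \mid \text{dis.})\,P(\text{dis.}) + P(+ \mid \text{no dis.})\,P(\text{no dis.})}
  = \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.1 \times 0.99}
  = \frac{9}{108} \approx 0.083.
```

So even after a positive result the disease remains unlikely, because the prior is so low.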

Venn diagram

Bayes’ Law/Theorem
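
The transcript drops the slide's figure; the standard statement of the theorem, for a hypothesis H and data D, is:

```latex
% Bayes' theorem: posterior = likelihood x prior / evidence.
P(H \mid D) \;=\; \frac{P(D \mid H)\, P(H)}{P(D)},
\qquad
P(D) \;=\; \sum_i P(D \mid H_i)\, P(H_i).
```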

Bayes' law makes sense, and we use it every day to make (probabilistic) inferences:

- Is this person holding Protopapas' book a PMS student?
- How old am I? (sensory data is enough)
- How old is my mom? (need a prior)
- Familiar vs. unfamiliar song in the shower
- Coin flipping (HHHHH vs. HTTHT)
- Gestalt psychology
- The problem of forgetting priors (Tom the plumber/mathematician)
- The problem of strong priors: Bayes & Freud
- Examples from visual perception
- Ball-in-a-box video
- The rare-disease problem (test is 90% correct, disease is 1/100) (9/108)
- The Monty Hall problem (2/3 win if you switch)
- The cookie problem (box 1 holds 30 vanilla/10 chocolate, box 2 holds 20 vanilla/20 chocolate; you draw a vanilla cookie; P that it came from box 1?) (3/5)
- m&m's (the '95 change: green 10%→20%, yellow 20%→14%; one '94 bag and one '96 bag, one yellow and one green drawn; P that the yellow came from '94?) (20/27)
- Elvis Presley (Elvis had a twin brother who died at birth; P that he was identical? Twin birth rates: 1/125 fraternal, 1/300 identical) (5/11)

Several of these answers are checked numerically in the sketch below.
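
The bracketed answers can be verified mechanically. A minimal Python sketch (the `bayes_posterior` helper is mine, not from the slides), using the priors and likelihoods as read off the slide:

```python
# Checking the slide's worked answers with Bayes' rule:
# posterior is proportional to prior * likelihood, normalised over hypotheses.

def bayes_posterior(priors, likelihoods):
    """Return the normalised posterior over a list of hypotheses."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Rare disease: prior 1/100, test correct 90% of the time.
print(bayes_posterior([0.01, 0.99], [0.9, 0.1])[0])    # 9/108 ~ 0.083

# Cookies: box 1 is 30 vanilla / 10 chocolate, box 2 is 20 / 20; drew vanilla.
print(bayes_posterior([0.5, 0.5], [0.75, 0.5])[0])     # 3/5 = 0.6

# m&m's: hypothesis A = yellow from '94 and green from '96, B = the reverse.
print(bayes_posterior([0.5, 0.5], [0.20 * 0.20, 0.14 * 0.10])[0])  # 20/27 ~ 0.741

# Elvis: identical twins are always same-sex; fraternal twins are male half the time.
print(bayes_posterior([1/300, 1/125], [1.0, 0.5])[0])  # 5/11 ~ 0.455
```

(Monty Hall is easier to check by enumerating the three door choices than with this helper, so it is omitted here.)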

A very strong prior compensates for a weak likelihood (the boyfriend/girlfriend example)

Probability vs. Likelihood

- The number that is the probability of some observed outcomes given a set of parameter values is regarded as the likelihood of that set of parameter values given the observed outcomes
- The likelihood is about the stimulus (parameter/hypothesis): what are the likely stimuli that could give rise to this activation (data)?
- Likelihood is also referred to as inverse probability, as well as unnormalised probability (likelihoods need not sum to 1)
- Probability attaches to possible results, whereas likelihood attaches to hypotheses about how those data came about
- Sometimes people write P(D | H) = L(H | D)
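
A minimal sketch of the distinction, reusing the coin-flipping example from the earlier slide (the grid of 101 candidate biases is an arbitrary illustrative choice):

```python
# Probability vs. likelihood for five coin flips that all came up heads (HHHHH).
import numpy as np
from scipy.stats import binom

n, k = 5, 5  # five flips, five heads

# Probability: fix the parameter (a fair coin), vary the outcome.
# Summed over all possible outcomes, the probabilities equal 1.
print(sum(binom.pmf(k_, n, 0.5) for k_ in range(n + 1)))  # 1.0

# Likelihood: fix the observed outcome, vary the parameter.
# L(theta | data) = P(data | theta); it need not sum or integrate to 1.
thetas = np.linspace(0.0, 1.0, 101)
likelihood = binom.pmf(k, n, thetas)
print(likelihood.sum() * (thetas[1] - thetas[0]))         # ~1/6, not 1

# The maximum-likelihood bias for HHHHH is theta = 1.
print(thetas[np.argmax(likelihood)])                      # 1.0
```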

In perception, we can view the stimulus as something to be inferred by inverting the (noisy) sensory input: we make probabilistic inferences about the stimulus with the aid of priors acquired from experiencing the statistical regularities of the world.

Bayesian approach to perception: make inferences from noisy sensory input
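
A minimal sketch of this idea under common simplifying assumptions that the slides do not commit to (a one-dimensional stimulus, Gaussian prior, Gaussian sensory noise; the function below is mine): the posterior mean is a precision-weighted average of prior and measurement.

```python
# Hypothetical 1-D perceptual inference: infer a stimulus value from one
# noisy measurement, assuming a Gaussian prior and a Gaussian likelihood.
def gaussian_posterior(prior_mean, prior_var, measurement, noise_var):
    w_prior = 1.0 / prior_var   # precision of the prior
    w_sense = 1.0 / noise_var   # precision of the sensory evidence
    post_var = 1.0 / (w_prior + w_sense)          # precisions add
    post_mean = post_var * (w_prior * prior_mean + w_sense * measurement)
    return post_mean, post_var

# Reliable sensory input dominates the percept...
print(gaussian_posterior(0.0, 1.0, measurement=2.0, noise_var=0.1))   # mean ~ 1.82
# ...while noisy input is pulled strongly toward the prior.
print(gaussian_posterior(0.0, 1.0, measurement=2.0, noise_var=10.0))  # mean ~ 0.18
```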

Bayesian estimation steps

- Maximum Likelihood Estimation (MLE): used when we do not have any priors over our hypotheses
- From that to the Bayesian estimate, which takes the priors into account and gives us a posterior distribution
- The Maximum A Posteriori (MAP) estimate is the mode of the posterior distribution
- The final estimate might also take into account a cost function that biases the estimate towards avoiding cost (similar to signal detection theory, SDT)

A sketch contrasting MLE and MAP follows below.
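
A minimal sketch for the biased-coin example, assuming a Beta prior (a standard conjugate choice, but my assumption rather than something stated on the slides):

```python
# MLE vs. MAP for a coin bias theta after observing HHHHH.
k, n = 5, 5        # five heads out of five flips
a, b = 2.0, 2.0    # illustrative Beta(2, 2) prior: the coin is probably near-fair

theta_mle = k / n                            # peak of the likelihood
theta_map = (k + a - 1) / (n + a + b - 2)    # mode of the Beta(k+a, n-k+b) posterior

print(theta_mle)   # 1.0 -- the data alone say "always heads"
print(theta_map)   # 6/7 ~ 0.857 -- the prior pulls the estimate back toward fair
```

With a uniform prior (a = b = 1) the two estimates coincide, which is exactly the point of the next slide.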

If our prior distribution is uniform, then all that we have in hand is the likelihood: the posterior is proportional to it, and the MAP estimate coincides with the MLE…

Predictive coding

Our common model of the visual system consists of:
- Several stages, hierarchically organised from the retina onwards
- In this feedforward sweep the signal is sensory information that travels bottom-up
- This information gets transformed into increasing levels of complexity and abstraction at higher processing stages
- There is also top-down (modulatory?) feedback from the higher to the lower processing stages (and there is also functional specialisation)

Predictive coding is a Bayesian-inspired alternative:
- The hierarchical structure remains
- Higher stages make predictions about the incoming sensory information, and these predictions travel top-down
- They are compared with the actual sensory information at each stage
- What travels bottom-up is the error between the predicted and the actual signal (which also sounds economical)
- The brain thus only codes changes/surprise, i.e. information in the information-theoretic sense
- What we see is the result of what we expect to see
- Perception (even veridical perception) can be seen as "controlled hallucination"! (A. Clark, BBS 2013 review)

A toy sketch of the error-passing scheme follows below.
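
A toy sketch of the message flow just described: only the mismatch between prediction and input travels bottom-up, and the stage's estimate is updated by that error. (A single stage with an identity generative model and a fixed learning rate: a caricature chosen for illustration, far simpler than Friston's actual scheme.)

```python
# One predictive-coding stage: send a prediction down, pass only the
# prediction error up, and nudge the estimate by that error.
def predictive_coding_step(estimate, sensory_input, learning_rate=0.1):
    prediction = estimate                         # top-down prediction (identity model)
    error = sensory_input - prediction            # bottom-up signal: only the mismatch
    estimate = estimate + learning_rate * error   # update the estimate toward the input
    return estimate, error

estimate = 0.0
for t, x in enumerate([1.0, 1.0, 1.0, 1.0, 5.0]):  # a surprise at the last step
    estimate, error = predictive_coding_step(estimate, x)
    print(f"t={t}: estimate={estimate:.2f}, error={error:.2f}")
# Errors shrink while the input is predictable, then spike at the surprising input:
# the brain, on this view, only spends signal on what it failed to predict.
```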