Bayesian Models
Honors 207, Intro to Cognitive Science
David Allbritton
An introduction to Bayes' Theorem and Bayesian models of human cognition.

Bayes' Theorem: An Introduction ● What is Bayes' Theorem? ● What does it mean? ● An application: Calculating a probability ● What are distributions? ● Bayes' Theorem using distributions ● An application to cognitive modeling: perception

What is Bayes' Theorem? ● A theorem of probability theory ● It concerns "conditional probabilities" - Example 1: the probability that class is cancelled if it snows ● P(A | B) ● "The probability of A given B," where A = "class is cancelled" and B = "it snows" - Example 2: the probability that it snowed, if class was cancelled ● P(B | A) ● Bayes' Theorem tells us how these two conditional probabilities are related

What is Bayes' Theorem? (cont.) P(A | B) = [ P(B | A) * P(A) ] / P(B), that is, posterior probability = [ likelihood * prior ] / normalizing constant

What does it mean? ● The "prior," P(A), is the probability of A as estimated before you knew anything about B. ● The "likelihood," P(B | A), is the new information that will change your estimate of the probability of A; it is the likelihood of observing B if A were true. ● The "normalizing constant," P(B), just turns the resulting quantity into a probability (a number between 0 and 1); it is not that interesting to us.

An application: Calculating a probability "A cab was involved in a hit-and-run accident. There are two cab companies in town, the green (85% of the cabs in the city) and the blue (15%). A witness said that the cab in the accident was blue. Tests showed that the witness is 80% reliable in identifying cabs." Question: What is the probability that the cab in the accident was blue?
A = the cab is blue; B = the witness says the cab is blue.
P(A) = prior probability that the cab is blue = .15
P(B | A) = likelihood that the witness says the cab is blue when it really is blue = .8
P(B) = overall probability that the witness says the cab is blue = P(B|A)*P(A) + P(B|~A)*P(~A) = .8*.15 + .2*.85 = .29
P(A | B) = posterior probability that the cab is blue given that the witness says it is = [ P(B|A) * P(A) ] / P(B) = (.8 * .15) / .29 ≈ .41
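The same calculation can be scripted. Here is a minimal Python sketch of the cab problem (the function and variable names are my own illustration, not from the slides):

    def bayes_posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
        """Return P(A | B) for a binary hypothesis A using Bayes' Theorem."""
        # P(B) = P(B|A)*P(A) + P(B|~A)*P(~A)  (law of total probability)
        p_b = likelihood_b_given_a * prior_a + likelihood_b_given_not_a * (1 - prior_a)
        # P(A | B) = P(B|A)*P(A) / P(B)
        return likelihood_b_given_a * prior_a / p_b

    # Cab problem: P(blue) = .15; an 80%-reliable witness says "blue",
    # so P(says blue | blue) = .8 and P(says blue | green) = .2.
    print(round(bayes_posterior(0.15, 0.8, 0.2), 2))  # 0.41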

What are distributions? ● Demonstration:  ● Terms  Probability density function: area under the curve = 1  Normal distribution, Gaussian distribution  Standard deviation  Uniform distribution
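As a quick illustration of "area under the curve = 1," here is a small Python sketch (my own, not from the slides) that numerically integrates a standard normal density:

    import math

    def normal_pdf(x, mean=0.0, sd=1.0):
        """Density of a normal (Gaussian) distribution at x."""
        return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

    # A Riemann sum over a wide interval approximates the total area under the curve.
    dx = 0.001
    area = sum(normal_pdf(i * dx) * dx for i in range(-10000, 10000))
    print(round(area, 3))  # ~1.0

    # By contrast, a uniform distribution on [0, 1] has density 1 everywhere,
    # so its area is simply 1 * 1 = 1.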

Bayes' Theorem using distributions ● Posterior, prior and likelihood are not single values, but entire probability distributions (functions over possible values) ● P(theta | phi) = P(phi | theta) * P(theta) / P(phi) ● Posterior dist. = C * likelihood dist. * prior dist.  C is just a normalizing constant that makes the posterior distribution sum to 1, and can be ignored here ● Because the posterior is a distribution, we need a decision rule to choose a single value to output
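A minimal grid-approximation sketch in Python of this idea (the grid, prior, and likelihood parameters below are illustrative assumptions, not values from the slides): compute posterior(theta) as C * likelihood(phi | theta) * prior(theta) over a discrete set of candidate values.

    import math

    def normal_pdf(x, mean, sd):
        return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

    thetas = list(range(0, 181))                              # candidate values of theta (degrees)
    prior = [normal_pdf(t, 90.0, 20.0) for t in thetas]       # assumed prior over theta
    phi = 70.0                                                # the observed value
    likelihood = [normal_pdf(phi, t, 10.0) for t in thetas]   # assumed noise model for P(phi | theta)

    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    c = 1.0 / sum(unnormalized)                               # the normalizing constant C
    posterior = [c * u for u in unnormalized]                 # now sums to 1 over the grid
    print(round(sum(posterior), 6))                           # 1.0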

An application to cognitive modeling: perception ● Task: guess the distal angle theta that produced the observed proximal angle phi  The viewing angle is unknown, so theta is unknown ● p(theta | phi) = posterior; the result of combining our prior knowledge and perceptual information ● p(phi | theta) = likelihood; the probabilities for various values of theta to produce the observed value of phi ● p(theta) = prior; the probabilities of various values of theta that were known before phi was observed ● Gain function = rewards for correct guesses, penalties for incorrect ones ● Decision rule = a function of the posterior and the gain function
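For example, one simple decision rule is to output the guess with the highest expected gain under the posterior. A minimal Python sketch (the grid, posterior values, and gain function below are toy assumptions, not from the slides):

    thetas = [60, 65, 70, 75, 80]                 # toy grid of candidate distal angles (degrees)
    posterior = [0.05, 0.20, 0.40, 0.25, 0.10]    # toy posterior probabilities over the grid (sum to 1)

    def gain(guess, true_theta):
        # Reward 1 if the guess is within 5 degrees of the true angle, otherwise 0.
        return 1.0 if abs(guess - true_theta) <= 5 else 0.0

    def expected_gain(guess):
        return sum(p * gain(guess, t) for t, p in zip(thetas, posterior))

    best_guess = max(thetas, key=expected_gain)
    print(best_guess)  # 70: the guess that maximizes expected gain under this posterior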

More information: