Bayesian Essentials: Slides by Peter Rossi and David Madigan


1 Bayesian Essentials Slides by Peter Rossi and David Madigan

2 Distribution Theory 101 Marginal and Conditional Distributions: [figure: joint distribution of (X, Y) on the unit square; the marginal of X is the standard triangle distribution on (0, 1) and the conditional of Y given X is uniform on (0, X)]

3 Simulating from Joint To draw from the joint: i. Draw from marginal on X ii. Condition on this draw, and draw from conditional of Y|X
library(triangle)
NumDraws <- 10000                  # number of joint draws (value assumed; not given on the slide)
x <- rtriangle(NumDraws, 0, 1, 1)  # marginal of X: standard triangle on (0, 1) with mode at 1
y <- runif(NumDraws, 0, x)         # conditional of Y given X: uniform on (0, X)
plot(x, y)

4 Triangular Distribution If U ~ unif(0,1), then sqrt(U) has the standard triangle distribution. If U1, U2 ~ unif(0,1), then Y = max{U1, U2} has the standard triangle distribution.
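A quick R check of these two representations (a minimal sketch; the number of draws is an assumed value, not from the slides):
# Two ways to simulate the standard triangle distribution, which has density f(x) = 2x on (0, 1)
n <- 100000                        # assumed number of draws
u <- runif(n)
x1 <- sqrt(u)                      # method 1: square root of a uniform
x2 <- pmax(runif(n), runif(n))     # method 2: max of two independent uniforms
hist(x1, breaks = 50, freq = FALSE); curve(2 * x, add = TRUE)   # both should match f(x) = 2x
hist(x2, breaks = 50, freq = FALSE); curve(2 * x, add = TRUE)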

5 Sampling Importance Resampling [figure: target density f and proposal density g] Draw a big sample from g; sub-sample from that sample with probability proportional to f/g.
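A minimal R sketch of SIR under assumed choices (target f(x) = 2x on (0, 1), the standard triangle; proposal g = unif(0, 1); sample sizes made up):
# SIR: draw a big sample from g, then resample with probability proportional to the weights f/g
BigN <- 100000; SmallN <- 10000        # assumed sample sizes
g_draws <- runif(BigN)                 # big sample from the proposal g
w <- 2 * g_draws                       # importance weights f/g = 2x / 1
resampled <- sample(g_draws, SmallN, replace = TRUE, prob = w)
hist(resampled, breaks = 50, freq = FALSE); curve(2 * x, add = TRUE)   # close to the target f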

6 Metropolis [figure: target density f and proposal density g] Start with current = 0.5. To get the next value: draw a “proposal” from g; keep the proposal with probability f(proposal)/f(current), capped at 1; else keep current.
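A minimal R sketch of this scheme under assumed choices (target f(x) = 2x on (0, 1); independent uniform proposals, for which the acceptance ratio really does reduce to f(proposal)/f(current)):
# Independence Metropolis with uniform(0, 1) proposals and target density f(x) = 2x
f <- function(x) 2 * x
NumIter <- 10000                       # assumed chain length
draws <- numeric(NumIter)
current <- 0.5                         # start with current = 0.5, as on the slide
for (i in 1:NumIter) {
  proposal <- runif(1)                 # draw a proposal from g = unif(0, 1)
  if (runif(1) < f(proposal) / f(current)) current <- proposal   # keep with prob f(prop)/f(curr), capped at 1
  draws[i] <- current
}
hist(draws, breaks = 50, freq = FALSE); curve(2 * x, add = TRUE)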

7 The Goal of Inference
Make inferences about unknown quantities using available information.
Inference -- make probability statements.
Unknowns -- parameters, functions of parameters, states or latent variables, “future” outcomes, outcomes conditional on an action.
Information -- data-based; non data-based: theories of behavior; subjective views; mechanism; parameters are finite or in some range.

8 Bayes theorem
p(θ|D) ∝ p(D|θ) p(θ)
Posterior ∝ “Likelihood” × Prior
Modern Bayesian computing -- simulation methods for generating draws from the posterior distribution p(θ|D).
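A small R illustration of this relation on a grid (a toy setup assumed here: binomial likelihood, flat prior; none of the numbers come from the slides):
# Posterior ∝ likelihood × prior, evaluated pointwise on a grid of θ values
theta <- seq(0.001, 0.999, length.out = 999)
prior <- rep(1, length(theta))                       # flat prior (assumed)
lik <- dbinom(7, size = 10, prob = theta)            # assumed data: 7 successes in 10 trials
post <- lik * prior
post <- post / (sum(post) * (theta[2] - theta[1]))   # normalize so it integrates to about 1
plot(theta, post, type = "l")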

9 Summarizing the posterior Output from Bayesian inference: a possibly high-dimensional distribution p(θ|D). Summarize this object via simulation: marginal distributions of the parameters and of functions of them; don't just compute a single point summary. Contrast with sampling theory: point estimate/standard error; a summary of an irrelevant distribution; a bad (normal) summary; the limitations of asymptotics.

10 Metropolis Start somewhere with θ_current. To get the next value, generate a proposal θ_proposal. Accept with “probability” equal to the ratio of the posterior density at θ_proposal to that at θ_current, capped at 1; else keep θ_current.

11 Example Believe these measurements (D) come from N(μ,1) [the measurements are shown on the slide but not reproduced in this transcript]. Prior for μ? p(μ) = 2μ

12 Example continued What is p(D|μ) for y_1, …, y_10? Switch to R… Other priors? unif(0,1), norm(0,1), norm(0,100). Generating good candidates?
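One way the “switch to R” might look (a hedged sketch, not the original demo: the data vector is made up, since the measurements on the slide are not reproduced in this transcript):
# Independence Metropolis for μ with likelihood N(μ, 1) and prior p(μ) = 2μ on (0, 1)
y <- c(0.4, 1.1, 0.3, 0.8, 0.7, 0.2, 0.9, 0.5, 1.0, 0.6)   # assumed stand-in for y_1, ..., y_10
log_post <- function(mu) {
  if (mu <= 0 || mu >= 1) return(-Inf)                     # the prior p(μ) = 2μ lives on (0, 1)
  sum(dnorm(y, mean = mu, sd = 1, log = TRUE)) + log(2 * mu)
}
NumIter <- 20000
draws <- numeric(NumIter)
current <- 0.5
for (i in 1:NumIter) {
  proposal <- runif(1)                                     # uniform(0, 1) candidates
  if (log(runif(1)) < log_post(proposal) - log_post(current)) current <- proposal
  draws[i] <- current
}
hist(draws, breaks = 50, freq = FALSE)                     # posterior draws for μ
Swapping in the other priors from the slide only changes the log-prior term and the support check.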

13 Prediction See D, compute p(ỹ|D) = ∫ p(ỹ|θ) p(θ|D) dθ: the “Predictive Distribution” of a future observable ỹ.
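Given posterior draws of μ (for example the vector draws from the sketch above), the predictive distribution can be simulated directly under the same assumed N(μ, 1) model:
# Draw ỹ by drawing μ from the posterior and then ỹ | μ ~ N(μ, 1)
y_future <- rnorm(length(draws), mean = draws, sd = 1)
hist(y_future, breaks = 50, freq = FALSE)    # approximates p(ỹ | D)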

14 Bayes/Classical Estimators As the sample size grows, the prior washes out – it acts locally uniform!!! Bayes is consistent unless you have a dogmatic prior.

15 Bayesian Computations Before simulation methods, Bayesians used posterior expectations of various functions as the summary of the posterior: E[h(θ)|D] = ∫ h(θ) p(θ|D) dθ. If p(θ|D) is in a convenient form (e.g. normal), then this integral can be computed analytically for some h.
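With simulation, the same expectation becomes a simple average over posterior draws (a minimal sketch; draws stands for any vector of posterior draws and h for any function of interest, both assumed):
# Monte Carlo estimate of E[h(θ) | D] from posterior draws
h <- function(theta) theta^2     # an example function of the parameter (assumed)
mean(h(draws))                   # approximates the integral of h(θ) p(θ|D) over θ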

16 Conjugate Families Models with convenient analytic properties almost invariably come from conjugate families. Why do I care now? Conjugate models are used as building blocks, and they build intuition for Bayesian inference. Definition: A prior is conjugate to a likelihood if the posterior is in the same class of distributions as the prior. Basically, conjugate priors are like the posterior from some imaginary dataset with a diffuse prior.

17 Beta-Binomial model For n Bernoulli trials with s successes, the likelihood is p(D|θ) ∝ θ^s (1−θ)^(n−s). Need a prior!

18 Beta distribution The Beta(a, b) density: p(θ) ∝ θ^(a−1) (1−θ)^(b−1), θ ∈ (0, 1).

19 Posterior With a Beta(a, b) prior and the binomial likelihood, p(θ|D) ∝ θ^(a+s−1) (1−θ)^(b+n−s−1), i.e. θ|D ~ Beta(a + s, b + n − s).
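A minimal R sketch of this update (the prior parameters and the data are assumed, purely for illustration):
# Prior θ ~ Beta(a, b); data: s successes in n trials; posterior θ | D ~ Beta(a + s, b + n - s)
a <- 2; b <- 2                    # assumed prior parameters
n <- 20; s <- 14                  # assumed data
curve(dbeta(x, a, b), from = 0, to = 1, lty = 2)                  # prior density
curve(dbeta(x, a + s, b + n - s), from = 0, to = 1, add = TRUE)   # posterior density
theta_draws <- rbeta(10000, a + s, b + n - s)                     # posterior draws, if wanted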

20 Prediction Under the Beta-Binomial model, the predictive probability that the next trial is a success is E[θ|D] = (a + s)/(a + b + n).

21 Regression model y = Xβ + ε, ε ~ N(0, σ²I); the unknowns are the coefficient vector β and the error variance σ².

22 Bayesian Regression Prior: β|σ² ~ N(β̄, σ²A⁻¹) and σ² ~ ν s²/χ²_ν (an Inverted Chi-Square). Interpretation: as if from another dataset. Draw from prior?
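A minimal R sketch of drawing from this prior (all hyperparameter values are assumed, and a two-coefficient model is used only for concreteness):
# σ² ~ ν s² / χ²_ν (Inverted Chi-Square), then β | σ² ~ N(betabar, σ² A⁻¹)
nu <- 5; ssq <- 1                          # assumed prior degrees of freedom and scale
betabar <- c(0, 0); A <- 0.01 * diag(2)    # assumed prior mean and precision matrix
NumDraws <- 1000
sigmasq <- nu * ssq / rchisq(NumDraws, nu)                           # Inverted Chi-Square draws
beta <- matrix(0, NumDraws, 2)
for (i in 1:NumDraws) {
  beta[i, ] <- betabar + t(chol(sigmasq[i] * solve(A))) %*% rnorm(2)   # β | σ² draw
}
hist(sqrt(sigmasq), breaks = 50)           # implied prior on the error standard deviation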

23 Posterior p(β, σ²|y, X) ∝ p(y|X, β, σ²) p(β|σ²) p(σ²).

24 Combining quadratic forms Complete the square in β to combine the quadratic form from the likelihood with the quadratic form from the prior.

25 Posterior The conjugate result: β|σ², y, X is normal and σ²|y, X is an Inverted Chi-Square, which is what the simulation scheme on the next slide uses.

26 IID Simulations
Scheme: [y|X, β, σ²] [β|σ²] [σ²]
[β, σ²|y, X] = [σ²|y, X] [β|σ², y, X]
1) Draw [σ²|y, X]
2) Draw [β|σ², y, X]
3) Repeat

27 IID Simulator, cont.
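The slide's own code is not reproduced in this transcript; the following is a minimal sketch of such an IID simulator for the conjugate setup above, with simulated data and assumed prior values:
# IID draws: [σ² | y, X] from an Inverted Chi-Square, then [β | σ², y, X] from a normal
set.seed(1)
n <- 100; k <- 2
X <- cbind(1, rnorm(n))                                  # simulated design matrix (assumed)
beta_true <- c(1, 2); y <- X %*% beta_true + rnorm(n)    # simulated data (assumed)
betabar <- rep(0, k); A <- 0.01 * diag(k)                # assumed prior: β | σ² ~ N(betabar, σ² A⁻¹)
nu <- 3; ssq <- 1                                        # assumed prior: σ² ~ ν ssq / χ²_ν
W <- rbind(X, chol(A)); z <- c(y, chol(A) %*% betabar)   # stack the prior as extra "observations"
V <- solve(crossprod(W))                                 # (X'X + A)⁻¹
btilde <- V %*% crossprod(W, z)                          # posterior mean of β
s <- sum((z - W %*% btilde)^2)                           # residual sum of squares, prior rows included
R <- 5000; betadraw <- matrix(0, R, k); sigmasqdraw <- numeric(R)
for (r in 1:R) {
  sigmasqdraw[r] <- (nu * ssq + s) / rchisq(1, nu + n)                # 1) draw [σ² | y, X]
  betadraw[r, ] <- btilde + t(chol(sigmasqdraw[r] * V)) %*% rnorm(k)  # 2) draw [β | σ², y, X]
}
colMeans(betadraw)                                       # posterior means, close to beta_true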