Bayesian Prior and Posterior Study Guide for ES205 Yu-Chi Ho Jonathan T. Lee Nov. 24, 2000.



2 Outline
- Conditional Density
- Bayes Rule
- Conjugate Distribution
- Example
- Other Conjugate Distributions
- Application

3 Conditional Density
- The conditional probability density of w given that x has occurred, assuming p_x(x) ≠ 0:
  p(w | x) = p(w, x) / p_x(x)

4 Bayes Rule
- Replacing the joint density with the product form of the equation from slide 3 gives Bayes' rule:
  p(w | x) = p(x | w) p_w(w) / p_x(x)
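Bayes' rule on this slide can be illustrated with a small numeric sketch over a discrete parameter. The function name and the two-coin setup below are illustrative assumptions, not from the slides:

```python
# Discrete illustration of Bayes' rule:
#   posterior(w | x) = p(x | w) * prior(w) / p_x(x),
# where p_x(x) = sum over w of p(x | w) * prior(w).

def posterior(prior, likelihood, x):
    """prior: dict w -> p(w); likelihood: dict (w, x) -> p(x | w)."""
    p_x = sum(likelihood[(w, x)] * pw for w, pw in prior.items())
    return {w: likelihood[(w, x)] * pw / p_x for w, pw in prior.items()}

# Two candidate coins: fair (w = 0.5) and biased (w = 0.8), equal prior weight.
prior = {0.5: 0.5, 0.8: 0.5}
lik = {(0.5, "head"): 0.5, (0.8, "head"): 0.8,
       (0.5, "tail"): 0.5, (0.8, "tail"): 0.2}

post = posterior(prior, lik, "head")
# Observing a head shifts belief toward the biased coin:
# post[0.8] = 0.8 * 0.5 / (0.5 * 0.5 + 0.8 * 0.5) = 0.4 / 0.65
```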

5 Conjugate Distribution
- W: parameter of interest in some system
- X: independent and identically distributed observations of the system
- Since we know the model of the system, the conditional density of X | W can be computed directly.

6 Conjugate Distribution (cont.)
- If the prior distribution of W belongs to a given family and, for any sample size n and any values of the observations in the sample, the posterior distribution of W also belongs to that same family, then the family is called a conjugate family of distributions.

7 Example
- An urn contains white and red balls, with unknown w being the fraction of the balls that are red.
- Assume we can take n samples, X_1, …, X_n, from the urn with replacement, i.e., n i.i.d. samples. Each X_i follows a Bernoulli(w) distribution.

8 Example (cont.)
- The total number of red balls out of n trials, Y = X_1 + … + X_n, has a binomial(n, w) distribution.
- Assume the prior distribution of w is a beta distribution with parameters α and β.

9 Example (cont.)
- The posterior distribution of W given Y = y has parameters α + y and β + n − y, which is also a beta distribution.

10 Example (cont.)
- Updating formulas:
  α' = α + y (posterior (new) parameter = prior (old) parameter + # of red balls)
  β' = β + (n − y) (posterior (new) parameter = prior (old) parameter + # of white balls)
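The updating formulas above amount to simple addition, which a short sketch makes concrete. The function name and the sample numbers are illustrative assumptions:

```python
# Beta-binomial update from the urn example:
# prior Beta(alpha, beta), y red balls observed in n draws.

def beta_binomial_update(alpha, beta, y, n):
    """Return posterior (alpha', beta') after y successes in n trials."""
    return alpha + y, beta + (n - y)

# Uniform Beta(1, 1) prior; suppose 7 red balls in 10 draws.
a_post, b_post = beta_binomial_update(1, 1, 7, 10)
# a_post = 1 + 7 = 8, b_post = 1 + 3 = 4
post_mean = a_post / (a_post + b_post)  # posterior mean of w = 8 / 12
```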

11 Other Conjugate Distributions
- The observations form a Poisson distribution with an unknown value of the mean w.
- The prior distribution of w is a gamma distribution with parameters α and β.
- The posterior is also a gamma distribution, with parameters α + y and β + n, where y = X_1 + … + X_n.
- Updating formulas: α' = α + y, β' = β + n
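The Poisson-gamma update on this slide can be sketched the same way; the function name and the observation counts are illustrative assumptions:

```python
# Poisson-gamma update: Gamma(alpha, beta) prior (rate parameterization)
# on the Poisson mean w; posterior is Gamma(alpha + sum x_i, beta + n).

def poisson_gamma_update(alpha, beta, observations):
    """Return posterior (alpha', beta') after the given Poisson counts."""
    return alpha + sum(observations), beta + len(observations)

# Gamma(2, 1) prior; observed counts 3, 0, 2, 4 over n = 4 periods.
a_post, b_post = poisson_gamma_update(2.0, 1.0, [3, 0, 2, 4])
# a_post = 2 + 9 = 11.0, b_post = 1 + 4 = 5.0; posterior mean = 11 / 5
```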

12 Other Conjugate Distributions (cont.)
- The observations form a negative binomial distribution with a specified value of r and an unknown success probability w.
- The prior distribution of w is a beta distribution with parameters α and β.
- The posterior is also a beta distribution, with parameters α + rn and β + y, where y = X_1 + … + X_n.
- Updating formulas: α' = α + rn, β' = β + y
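A sketch of this update, assuming each observation X_i counts the failures before the r-th success (a common negative binomial convention); the function name and sample values are illustrative:

```python
# Negative binomial-beta update: Beta(alpha, beta) prior on the success
# probability w; posterior is Beta(alpha + r*n, beta + sum x_i).

def negbin_beta_update(alpha, beta, r, observations):
    """Return posterior (alpha', beta') for n negative binomial draws."""
    n = len(observations)
    return alpha + r * n, beta + sum(observations)

# Beta(1, 1) prior, r = 3; two observations with 2 and 5 failures.
a_post, b_post = negbin_beta_update(1, 1, r=3, observations=[2, 5])
# a_post = 1 + 3*2 = 7, b_post = 1 + 7 = 8
```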

13 Other Conjugate Distributions (cont.)
- The observations form a normal distribution with an unknown value of the mean w and specified precision r.
- The prior distribution of w is a normal distribution with mean μ and precision τ.
- The posterior is also a normal distribution, with mean (τμ + nr·x̄) / (τ + nr) and precision τ + nr, where x̄ is the sample mean.
- Updating formulas: μ' = (τμ + nr·x̄) / (τ + nr), τ' = τ + nr
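The normal-mean update is a precision-weighted average of the prior mean and the sample mean, which the sketch below makes explicit. The function name and the numbers are illustrative assumptions:

```python
# Normal-normal update: Normal(mu, precision tau) prior on the mean w of
# Normal(w, precision r) data. Posterior precision adds; posterior mean
# is a precision-weighted average of mu and the sample mean.

def normal_mean_update(mu, tau, r, observations):
    """Return posterior (mean, precision) for the unknown mean w."""
    n = len(observations)
    xbar = sum(observations) / n
    post_precision = tau + n * r
    post_mean = (tau * mu + n * r * xbar) / post_precision
    return post_mean, post_precision

m, p = normal_mean_update(mu=0.0, tau=1.0, r=4.0, observations=[1.0, 3.0])
# p = 1 + 2*4 = 9.0; m = (0 + 8 * 2.0) / 9 = 16 / 9
```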

14 Other Conjugate Distributions (cont.)
- The observations form a normal distribution with the specified mean m and unknown precision w.
- The prior distribution of w is a gamma distribution with parameters α and β.
- The posterior is also a gamma distribution, with parameters α + n/2 and β + ½ Σᵢ (xᵢ − m)².
- Updating formulas: α' = α + n/2, β' = β + ½ Σᵢ (xᵢ − m)²
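This last update can be sketched the same way; the function name and the data values are illustrative assumptions:

```python
# Normal precision update: Gamma(alpha, beta) prior on the precision w
# of Normal(m, 1/w) data with known mean m. Posterior is
# Gamma(alpha + n/2, beta + (1/2) * sum (x_i - m)^2).

def normal_precision_update(alpha, beta, m, observations):
    """Return posterior (alpha', beta') for the unknown precision w."""
    n = len(observations)
    ss = sum((x - m) ** 2 for x in observations)
    return alpha + n / 2, beta + ss / 2

a_post, b_post = normal_precision_update(2.0, 1.0, m=0.0,
                                         observations=[1.0, -1.0, 2.0])
# a_post = 2 + 3/2 = 3.5; b_post = 1 + (1 + 1 + 4)/2 = 4.0
```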

15 Summary of the Conjugate Distributions

  Observations                Prior    Posterior
  Bernoulli                   Beta     Beta
  Poisson                     Gamma    Gamma
  Negative binomial           Beta     Beta
  Normal (unknown mean)       Normal   Normal
  Normal (unknown precision)  Gamma    Gamma

16 Application
- Estimate the state of the system based on the observations: the Kalman filter.

17 References
- DeGroot, M. H., Optimal Statistical Decisions, McGraw-Hill.
- Ho, Y.-C., Lecture Notes, Harvard University.
- Larsen, R. J. and M. L. Marx, An Introduction to Mathematical Statistics and Its Applications, Prentice Hall, 1986.