Bayesian Prior and Posterior Study Guide for ES205 Yu-Chi Ho Jonathan T. Lee Nov. 24, 2000.



2 Outline
- Conditional Density
- Bayes Rule
- Conjugate Distribution
- Example
- Other Conjugate Distributions
- Application

3 Conditional Density
- The conditional probability density of w given that x has occurred, assuming p_x(x) ≠ 0:
  p(w | x) = p(w, x) / p_x(x)

4 Bayes Rule
- Replacing the joint density with the product form of the equation from slide 3 gives Bayes' rule:
  p(w | x) = p(x | w) p_w(w) / p_x(x)
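Bayes' rule on this slide can be illustrated with a small numeric sketch over a discrete parameter. The function name and the two-coin setup below are illustrative assumptions, not from the slides:

```python
# Discrete illustration of Bayes' rule:
#   posterior(w | x) = p(x | w) * prior(w) / p_x(x),
# where p_x(x) = sum over w of p(x | w) * prior(w).

def posterior(prior, likelihood, x):
    """prior: dict w -> p(w); likelihood: dict (w, x) -> p(x | w)."""
    p_x = sum(likelihood[(w, x)] * pw for w, pw in prior.items())
    return {w: likelihood[(w, x)] * pw / p_x for w, pw in prior.items()}

# Two candidate coins: fair (w = 0.5) and biased (w = 0.8), equal prior weight.
prior = {0.5: 0.5, 0.8: 0.5}
lik = {(0.5, "head"): 0.5, (0.8, "head"): 0.8,
       (0.5, "tail"): 0.5, (0.8, "tail"): 0.2}

post = posterior(prior, lik, "head")
# Observing a head shifts belief toward the biased coin:
# post[0.8] = 0.8 * 0.5 / (0.5 * 0.5 + 0.8 * 0.5) = 0.4 / 0.65
```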

5 Conjugate Distribution
- W: parameter of interest in some system
- X: independent and identically distributed observations of the system
- Since we know the model of the system, the conditional density of X | W can be computed directly.

6 Conjugate Distribution (cont.)
- If the prior distribution of W belongs to a given family and, for any sample size n and any values of the observations in the sample, the posterior distribution of W also belongs to that same family, then the family is called a conjugate family of distributions.

7 Example
- An urn contains white and red balls, with unknown w being the fraction of the balls that are red.
- Assume we can take n samples, X_1, …, X_n, from the urn with replacement, i.e., n i.i.d. samples. Each X_i follows a Bernoulli(w) distribution.

8 Example (cont.)
- The total number of red balls out of n trials, Y = X_1 + … + X_n, has a binomial(n, w) distribution.
- Assume the prior distribution of w is a beta distribution with parameters α and β.

9 Example (cont.)
- The posterior distribution of W given Y = y has parameters α + y and β + n − y, which is also a beta distribution.

10 Example (cont.)
- Updating formulas:
  α' = α + y (posterior (new) parameter = prior (old) parameter + # of red balls)
  β' = β + (n − y) (posterior (new) parameter = prior (old) parameter + # of white balls)
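The updating formulas above amount to simple addition, which a short sketch makes concrete. The function name and the sample numbers are illustrative assumptions:

```python
# Beta-binomial update from the urn example:
# prior Beta(alpha, beta), y red balls observed in n draws.

def beta_binomial_update(alpha, beta, y, n):
    """Return posterior (alpha', beta') after y successes in n trials."""
    return alpha + y, beta + (n - y)

# Uniform Beta(1, 1) prior; suppose 7 red balls in 10 draws.
a_post, b_post = beta_binomial_update(1, 1, 7, 10)
# a_post = 1 + 7 = 8, b_post = 1 + 3 = 4
post_mean = a_post / (a_post + b_post)  # posterior mean of w = 8 / 12
```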

11 Other Conjugate Distributions
- The observations form a Poisson distribution with an unknown value of the mean w.
- The prior distribution of w is a gamma distribution with parameters α and β.
- The posterior is also a gamma distribution, with parameters α + y and β + n, where y = X_1 + … + X_n.
- Updating formulas: α' = α + y, β' = β + n
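The Poisson-gamma update on this slide can be sketched the same way; the function name and the observation counts are illustrative assumptions:

```python
# Poisson-gamma update: Gamma(alpha, beta) prior (rate parameterization)
# on the Poisson mean w; posterior is Gamma(alpha + sum x_i, beta + n).

def poisson_gamma_update(alpha, beta, observations):
    """Return posterior (alpha', beta') after the given Poisson counts."""
    return alpha + sum(observations), beta + len(observations)

# Gamma(2, 1) prior; observed counts 3, 0, 2, 4 over n = 4 periods.
a_post, b_post = poisson_gamma_update(2.0, 1.0, [3, 0, 2, 4])
# a_post = 2 + 9 = 11.0, b_post = 1 + 4 = 5.0; posterior mean = 11 / 5
```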

12 Other Conjugate Distributions (cont.)
- The observations form a negative binomial distribution with a specified value of r and an unknown success probability w.
- The prior distribution of w is a beta distribution with parameters α and β.
- The posterior is also a beta distribution, with parameters α + rn and β + y, where y = X_1 + … + X_n.
- Updating formulas: α' = α + rn, β' = β + y
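A sketch of this update, assuming each observation X_i counts the failures before the r-th success (a common negative binomial convention); the function name and sample values are illustrative:

```python
# Negative binomial-beta update: Beta(alpha, beta) prior on the success
# probability w; posterior is Beta(alpha + r*n, beta + sum x_i).

def negbin_beta_update(alpha, beta, r, observations):
    """Return posterior (alpha', beta') for n negative binomial draws."""
    n = len(observations)
    return alpha + r * n, beta + sum(observations)

# Beta(1, 1) prior, r = 3; two observations with 2 and 5 failures.
a_post, b_post = negbin_beta_update(1, 1, r=3, observations=[2, 5])
# a_post = 1 + 3*2 = 7, b_post = 1 + 7 = 8
```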

13 Other Conjugate Distributions (cont.)
- The observations form a normal distribution with an unknown value of the mean w and specified precision r.
- The prior distribution of w is a normal distribution with mean μ and precision τ.
- The posterior is also a normal distribution, with mean (τμ + nr·x̄) / (τ + nr) and precision τ + nr, where x̄ is the sample mean.
- Updating formulas: μ' = (τμ + nr·x̄) / (τ + nr), τ' = τ + nr
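The normal-mean update is a precision-weighted average of the prior mean and the sample mean, which the sketch below makes explicit. The function name and the numbers are illustrative assumptions:

```python
# Normal-normal update: Normal(mu, precision tau) prior on the mean w of
# Normal(w, precision r) data. Posterior precision adds; posterior mean
# is a precision-weighted average of mu and the sample mean.

def normal_mean_update(mu, tau, r, observations):
    """Return posterior (mean, precision) for the unknown mean w."""
    n = len(observations)
    xbar = sum(observations) / n
    post_precision = tau + n * r
    post_mean = (tau * mu + n * r * xbar) / post_precision
    return post_mean, post_precision

m, p = normal_mean_update(mu=0.0, tau=1.0, r=4.0, observations=[1.0, 3.0])
# p = 1 + 2*4 = 9.0; m = (0 + 8 * 2.0) / 9 = 16 / 9
```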

14 Other Conjugate Distributions (cont.)
- The observations form a normal distribution with the specified mean m and unknown precision w.
- The prior distribution of w is a gamma distribution with parameters α and β.
- The posterior is also a gamma distribution, with parameters α + n/2 and β + ½ Σᵢ (xᵢ − m)².
- Updating formulas: α' = α + n/2, β' = β + ½ Σᵢ (xᵢ − m)²
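This last update can be sketched the same way; the function name and the data values are illustrative assumptions:

```python
# Normal precision update: Gamma(alpha, beta) prior on the precision w
# of Normal(m, 1/w) data with known mean m. Posterior is
# Gamma(alpha + n/2, beta + (1/2) * sum (x_i - m)^2).

def normal_precision_update(alpha, beta, m, observations):
    """Return posterior (alpha', beta') for the unknown precision w."""
    n = len(observations)
    ss = sum((x - m) ** 2 for x in observations)
    return alpha + n / 2, beta + ss / 2

a_post, b_post = normal_precision_update(2.0, 1.0, m=0.0,
                                         observations=[1.0, -1.0, 2.0])
# a_post = 2 + 3/2 = 3.5; b_post = 1 + (1 + 1 + 4)/2 = 4.0
```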

15 Summary of the Conjugate Distributions

  Observations                Prior    Posterior
  Bernoulli                   Beta     Beta
  Poisson                     Gamma    Gamma
  Negative binomial           Beta     Beta
  Normal (unknown mean)       Normal   Normal
  Normal (unknown precision)  Gamma    Gamma

16 Application
- Estimate the state of the system based on the observations: the Kalman filter.

17 References
- DeGroot, M. H., Optimal Statistical Decisions, McGraw-Hill.
- Ho, Y.-C., Lecture Notes, Harvard University.
- Larsen, R. J. and M. L. Marx, An Introduction to Mathematical Statistics and Its Applications, Prentice Hall, 1986.