Introduction to Bayesian statistics

Three approaches to probability:
- Axiomatic: probability by definition and properties
- Relative frequency: repeated trials
- Degree of belief (subjective): a personal measure of uncertainty

Problems for the frequency interpretation:
- The chance that a meteor strikes earth is 1%
- The probability of rain today is 30%
- The chance of getting an A on the exam is 50%

Problems of statistical inference

Test Ho: θ = 1 versus Ha: θ > 1.

Classical approach:
- P-value = P(data at least as extreme as observed | θ = 1)
- The P-value is NOT P(null hypothesis is true)
- Confidence interval [a, b]: what does it mean?

But the scientist wants to know:
- P(θ = 1 | Data)
- P(Ho is true) = ?

The problem: in the classical framework, θ is "not random", so such probabilities are undefined.

Bayesian statistics

A fundamental change in philosophy: θ is treated as a random variable. This allows us to assign a probability distribution to θ based on prior information. A 95% "confidence" interval [1.34 < θ < 2.97] then means what we "want" it to mean: P(1.34 < θ < 2.97) = 95%. Likewise, we can make the statement the P-value could not: P(null hypothesis is false | data).

Estimating P(Heads) for a biased coin

Parameter: p = P(Heads). Data: 0, 0, 0, 1, 0, 1, 0, 0, 1, 0. The natural estimate is p = 3/10 = 0.3. But what if we believe the coin is biased in favor of low probabilities? How can we incorporate such prior beliefs into the model? We'll see that the resulting estimate is p-hat = 0.22.

Bayes Theorem

For events A and B with P(B) > 0:

P(A | B) = P(B | A) P(A) / P(B)

Example

A population has 10% liars. A lie detector gets it "right" 90% of the time. Let A = {actual liar} and R = {lie detector reports liar}. The lie detector reports that a suspect is a liar. What is the probability that the suspect actually is a liar?
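
Working through Bayes' theorem with the slide's numbers gives P(A | R) = (.9)(.1) / [(.9)(.1) + (.1)(.9)] = 0.5. A minimal R sketch (R is used here because the later slides call S-PLUS/R functions such as qbeta):

    # P(A) = 0.10, P(R | A) = 0.90, P(R | not A) = 0.10
    p_A <- 0.10                       # prior probability of being a liar
    p_R_given_A    <- 0.90            # detector correctly flags a liar
    p_R_given_notA <- 0.10            # detector wrongly flags a truth-teller
    p_R <- p_R_given_A * p_A + p_R_given_notA * (1 - p_A)  # total probability
    p_R_given_A * p_A / p_R           # P(A | R) = 0.5

So even a "90% accurate" detector yields only a 50-50 verdict, because liars are rare.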

More general form of Bayes Theorem

If A_1, ..., A_k partition the sample space and P(B) > 0, then

P(A_i | B) = P(B | A_i) P(A_i) / [ P(B | A_1) P(A_1) + ... + P(B | A_k) P(A_k) ]

Example

Three urns. Urn A: 1 red, 1 blue. Urn B: 2 red, 1 blue. Urn C: 2 red, 3 blue. Roll a fair die: if it shows 1, pick Urn A; if 2 or 3, pick Urn B; if 4, 5, or 6, pick Urn C. Then draw one ball. The ball drawn is red. What is the probability it came from Urn C?
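
Applying the general form above, with priors from the die and likelihoods equal to each urn's proportion of red balls (a short R sketch; the answer 36/91 follows from the slide's numbers):

    prior <- c(A = 1/6, B = 2/6, C = 3/6)   # die roll
    lik   <- c(A = 1/2, B = 2/3, C = 2/5)   # P(red | urn)
    post  <- prior * lik / sum(prior * lik) # Bayes' rule over the partition
    post["C"]                               # 36/91, about 0.396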

Bayes Theorem for Statistics

Let θ represent the parameter(s) and X the data. At the level of densities, Bayes' theorem reads

f(θ | X) = f(X | θ) f(θ) / f(X)

The left-hand side is a function of θ, and the denominator on the right-hand side does not depend on θ, so:

Posterior distribution ∝ Likelihood × Prior distribution
Posterior dist'n = Constant × Likelihood × Prior dist'n

Goal: explore the posterior distribution of θ.

A simple estimation example

Biased coin estimation: P(Heads) = p = ? We observe 0-1 i.i.d. Bernoulli(p) trials. Let X be the number of heads in n trials. The likelihood is

f(X | p) = C(n, X) p^X (1 - p)^(n - X)

For the prior distribution, use an uninformative prior: the Uniform distribution on (0, 1), f(p) = 1. So the posterior distribution is proportional to the likelihood times the prior: f(p | X) ∝ f(X | p) f(p).

Coin estimation (cont'd)

The posterior density has the form f(p | x) = C p^x (1 - p)^(n - x): a Beta distribution with parameters x + 1 and n - x + 1. Data: 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, so n = 10 and x = 3. The posterior dist'n is Beta(3+1, 7+1) = Beta(4, 8).
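
A quick R check that this posterior is the Beta(4, 8) density (the plot call is illustrative):

    obs <- c(0, 0, 1, 0, 0, 0, 0, 1, 0, 1)   # data from the slide
    n <- length(obs); x <- sum(obs)           # n = 10, x = 3
    curve(dbeta(p, x + 1, n - x + 1), from = 0, to = 1,
          xname = "p", ylab = "posterior density")   # Beta(4, 8)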

Coin estimation (cont'd)

Posterior dist'n: Beta(4, 8). Mean: 0.33. Mode: 0.30. Median: about 0.32. The interval [qbeta(.025, 4, 8), qbeta(.975, 4, 8)] = [.11, .61] gives a 95% credible interval for p: P(.11 < p < .61 | X) = .95.
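
These summaries are reproducible in R:

    a <- 4; b <- 8                  # posterior Beta(4, 8)
    a / (a + b)                     # mean = 0.333
    (a - 1) / (a + b - 2)           # mode = 0.30
    qbeta(0.5, a, b)                # median, about 0.32
    qbeta(c(0.025, 0.975), a, b)    # 95% credible interval, about [0.11, 0.61]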

Prior distribution

Choice of a beta distribution for the prior, Beta(a, b), with density

f(p) = [Γ(a + b) / (Γ(a) Γ(b))] p^(a - 1) (1 - p)^(b - 1), 0 < p < 1

Posterior

Likelihood × Prior = [ p^x (1 - p)^(n - x) ] [ p^(a - 1) (1 - p)^(b - 1) ] = p^(x + a - 1) (1 - p)^(n - x + b - 1)

So the posterior distribution is Beta(x + a, n - x + b).

Prior distributions

Posterior summaries:
- Mean = (x + a) / (n + a + b)
- Mode = (x + a - 1) / (n + a + b - 2)
- Quantiles can be computed by integrating the beta density

For this example, the prior and posterior distributions have the same general form. Priors that have the same form as the posteriors are called conjugate priors; a sketch of the conjugate update appears below.
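
A minimal R sketch of the beta-binomial conjugate update (the helper name update_beta is illustrative, not from the slides):

    # Beta(a, b) prior + x heads in n trials -> Beta(a + x, b + n - x) posterior
    update_beta <- function(a, b, x, n) c(a = a + x, b = b + n - x)
    update_beta(a = 1, b = 1, x = 3, n = 10)   # uniform prior: Beta(4, 8), as before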

Data example

Maternal condition: placenta previa, an unusual condition of pregnancy in which the placenta is implanted very low in the uterus, preventing normal delivery. Is this related to the sex of the baby? The proportion of female births in the general population is 0.485. An early study in Germany found that in 980 placenta previa births, 437 were female (0.4459). Test Ho: p = 0.485 versus Ha: p < 0.485.

Placenta previa births

Assume a uniform prior, Beta(1, 1). The posterior is then Beta(438, 544). Posterior summaries: Mean = 0.446, Standard Deviation ≈ 0.016. 95% credible interval: [qbeta(.025, 438, 544), qbeta(.975, 438, 544)] = [.415, .477].
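
Again in R:

    a <- 438; b <- 544                          # posterior Beta(438, 544)
    a / (a + b)                                 # mean = 0.446
    sqrt(a * b / ((a + b)^2 * (a + b + 1)))     # sd, about 0.016
    qbeta(c(0.025, 0.975), a, b)                # about [0.415, 0.477]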

Sensitivity of the prior

Suppose we took a prior more concentrated about the null hypothesis value, e.g. prior ~ Normal(.485, .01). The posterior is then proportional to

p^437 (1 - p)^543 exp( -(p - .485)^2 / (2 (.01)^2) )

This is no longer a standard form: the constant of integration, the mean, other summary statistics, credible intervals, etc., require numerical methods. See the S-script: ourses/275w05/Scripts/Bayes.ssc
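
A grid-approximation sketch in R (the grid and the log-scale computation are assumptions for numerical stability, not taken from the original S-script):

    p <- seq(0.001, 0.999, by = 0.001)          # grid over (0, 1)
    logpost <- 437 * log(p) + 543 * log(1 - p) +
               dnorm(p, mean = 0.485, sd = 0.01, log = TRUE)
    w <- exp(logpost - max(logpost))            # avoid underflow before normalizing
    w <- w / sum(w)                             # normalized grid weights
    sum(p * w)                                  # posterior mean by numerical summation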