Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 12/4/12 Bayesian Inference SECTION 11.1, 11.2 More probability rules (11.1)

Presentation transcript:

Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 12/4/12 Bayesian Inference SECTION 11.1, 11.2 More probability rules (11.1) Bayes rule (11.2) Bayesian inference (not in book)

Posters
Good job! 5 minutes means 5 minutes. Names on poster. I am teaching faculty (for prediction intervals). Focus on what this tells you about course evaluations (real-world conclusions). Significance is not all that matters: which is higher? (gender, ethnicity, rank, etc.) I have your posters; stop by office hours if you want them back.

P(A and B)
P(A and B) = P(A if B) × P(B)

Duke Rank and Experience
60% of STAT 101 students rank their Duke experience as “Excellent,” and Duke was the first-choice school for 59% of those who ranked their Duke experience as excellent. What percentage of STAT 101 students had Duke as a first choice and rank their experience here as excellent?
a) 60% b) 59% c) 35% d) 41%
P(first choice and excellent) = P(first choice if excellent) × P(excellent) = 0.59 × 0.60 = 0.354

Summary

Breast Cancer Screening
A 40-year-old woman participates in routine screening and has a positive mammography. What’s the probability she has cancer?
a) 0-10% b) 10-25% c) 25-50% d) 50-75% e) 75-100%

Breast Cancer Screening
1% of women at age 40 who participate in routine screening have breast cancer. 80% of women with breast cancer get positive mammographies. 9.6% of women without breast cancer get positive mammographies. A 40-year-old woman participates in routine screening and has a positive mammography. What’s the probability she has cancer?

Breast Cancer Screening
A 40-year-old woman participates in routine screening and has a positive mammography. What’s the probability she has cancer?
a) 0-10% b) 10-25% c) 25-50% d) 50-75% e) 75-100%

Breast Cancer Screening
A 40-year-old woman participates in routine screening and has a positive mammography. What’s the probability she has cancer? What is this asking for?
a) P(cancer if positive mammography)
b) P(positive mammography if cancer)
c) P(positive mammography if no cancer)
d) P(positive mammography)
e) P(cancer)

Bayes Rule
We know P(positive mammography if cancer)… how do we get to P(cancer if positive mammography)? How do we go from P(A if B) to P(B if A)?

Bayes Rule
P(A and B) = P(A if B) × P(B) = P(B if A) × P(A), so
P(B if A) = P(A if B) × P(B) / P(A)   <- Bayes Rule

Rev. Thomas Bayes

Breast Cancer Screening
1% of women at age 40 who participate in routine screening have breast cancer. 80% of women with breast cancer get positive mammographies. 9.6% of women without breast cancer get positive mammographies.

P(positive)
1. Use the law of total probability to find P(positive).
2. Use Bayes Rule to find P(cancer if positive).
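The two steps can be sketched in Python. This is a minimal illustration rather than part of the original slides: the helper name `posterior` and its parameter names are assumptions chosen for this example, and the inputs are the screening figures quoted in the slides (1% prevalence, 80% positive if cancer, 9.6% positive if no cancer).

```python
def posterior(prior, p_data_if_true, p_data_if_false):
    """Return P(statement if data).

    Hypothetical helper for illustration; the names are not from the slides.
    """
    # Step 1: the law of total probability gives P(data).
    p_data = p_data_if_true * prior + p_data_if_false * (1 - prior)
    # Step 2: Bayes rule turns P(data if statement) into P(statement if data).
    return p_data_if_true * prior / p_data


# Step 1 written out for the screening example.
p_positive = 0.80 * 0.01 + 0.096 * 0.99
# Steps 1 and 2 together.
p_cancer_if_positive = posterior(0.01, 0.80, 0.096)

print(round(p_positive, 5))            # 0.10304
print(round(p_cancer_if_positive, 3))  # 0.078
```

Passing the prior and the two conditional probabilities separately keeps the law-of-total-probability step visible instead of hiding P(positive) as a precomputed input.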

[Figure: bins labeled Cancer, Cancer-free, and Everyone; balls marked C (cancer) or F (cancer-free), colored red for a positive result and green for a negative result.]
If we randomly pick a ball from the Cancer bin, it’s more likely to be red/positive. If we randomly pick a ball from the Cancer-free bin, it’s more likely to be green/negative. We randomly pick a ball from the Everyone bin. If the ball is red/positive, is it more likely to be from the Cancer or Cancer-free bin?

100,000 women in the population
  1%: 1,000 have cancer
    80%: 800 test positive
    20%: 200 test negative
  99%: 99,000 cancer-free
    9.6%: 9,504 test positive
    90.4%: 89,496 test negative
Thus, 800/(800+9,504) = 7.8% of positive results have cancer.
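The same counts can be reproduced with integer arithmetic. The sketch below is an illustration only; the variable names are assumptions made for this example, while the population size and percentages are the ones on the slide.

```python
population = 100_000

# Split the population by disease status (1% prevalence).
with_cancer = population * 1 // 100         # 1,000 women
cancer_free = population - with_cancer      # 99,000 women

# Split each group by test result.
true_positives = with_cancer * 80 // 100    # 80% of those with cancer
false_positives = cancer_free * 96 // 1000  # 9.6% of those without cancer

# Among all positive results, what share actually has cancer?
share_with_cancer = true_positives / (true_positives + false_positives)

print(true_positives, false_positives)  # 800 9504
print(f"{share_with_cancer:.1%}")       # 7.8%
```

Working with whole counts rather than probabilities makes it easy to see that false positives outnumber true positives by more than ten to one.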

Hypotheses
H0: no cancer
Ha: cancer
Data: positive mammography
p-value = P(statistic as extreme as observed if H0 true) = P(positive mammography if no cancer) = 0.096
The probability of getting a positive mammography just by random chance, if the woman does not have cancer, is 0.096.

Hypotheses
H0: no cancer
Ha: cancer
Data: positive mammography
You don’t really want the p-value, you want the probability that the woman has cancer! You want P(H0 true if data), not P(data if H0 true).

Hypotheses
H0: no cancer
Ha: cancer
Data: positive mammography
Using Bayes Rule:
P(Ha true if data) = P(cancer if data) = 0.078
P(H0 true if data) = P(no cancer if data) = 0.922
This tells a very different story than a p-value of 0.096!

Frequentist Inference
Frequentist inference considers what would happen if the data collection process (sampling or experiment) were repeated many times. Probability is considered to be the proportion of times an event would happen if repeated many times. In frequentist inference, we condition on some unknown truth, and find the probability of our data given this unknown truth.

Frequentist Inference
Everything we have done so far in class is based on frequentist inference. A confidence interval is created to capture the truth for a specified proportion of all samples. A p-value is the proportion of times you would get results as extreme as those observed, if the null hypothesis were true.
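The "proportion of all samples" reading of a confidence interval can be checked by simulation. The sketch below is an illustration under stated assumptions, none of which come from this slide: it uses a normal-approximation 95% interval for a proportion, a hypothetical true proportion of 0.5, samples of size 100, and a function name `coverage` invented for this example.

```python
import math
import random


def coverage(true_p=0.5, n=100, trials=2000, z=1.96, seed=1):
    """Fraction of simulated samples whose interval contains true_p.

    Hypothetical helper; all defaults are assumptions for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute the sample proportion.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        # Normal-approximation interval: p_hat +/- z * standard error.
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials


print(coverage())
```

The printed fraction should land near the nominal 95%, though not exactly on it, because the interval is approximate and the number of simulated samples is finite. The parameter stays fixed throughout; only the samples, and hence the intervals, vary.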

Bayesian Inference
Bayesian inference does not think about repeated sampling or repeating the experiment, but only what you can tell from your single observed data set. Probability is considered to be the subjective degree of belief in some statement. In Bayesian inference, we condition on the data, and find the probability of some unknown parameter, given the data.

Fixed and Random
In frequentist inference, the parameter is considered fixed and the sample statistic is random. In Bayesian inference, the statistic is considered fixed, and the parameter is considered random.

Bayesian Inference
Frequentist: P(data if truth)
Bayesian: P(truth if data)
How are they connected?

Bayesian Inference
Prior probability: probability of a statement being true, before looking at the data.
Posterior probability: probability of the statement being true, after updating the prior probability based on the data.

Breast Cancer
Before getting the positive result from her mammography, the prior probability that the woman has breast cancer is 1%. Given data (the positive mammography), update this probability using Bayes rule:
P(cancer if positive) = P(positive if cancer) × P(cancer) / P(positive) = (0.80 × 0.01) / (0.80 × 0.01 + 0.096 × 0.99) ≈ 0.078
The posterior probability of her having breast cancer is 0.078.

Paternity
A woman is pregnant. However, she slept with two different guys (call them Al and Bob) close to the time of conception, and does not know who the father is. What is the prior probability that Al is the father? The baby is born with blue eyes. Al has brown eyes and Bob has blue eyes. Update based on this information to find the posterior probability that Al is the father.

Eye Color
In reality eye color comes from several genes, and there are several possibilities, but let’s simplify here:
Brown is dominant, blue is recessive.
One gene comes from each parent.
BB, bB, Bb would all result in brown eyes.
Only bb results in blue eyes.
To make it a bit easier: you know that Al’s mother and the mother of the child both have blue eyes.

Paternity
What is the probability that Al is the father?
a) 1/2 b) 1/3 c) 1/4 d) 1/5 e) No idea

Paternity
Prior: P(Al) = P(Bob) = 1/2
P(blue eyes) = P(blue eyes and Al) + P(blue eyes and Bob)
= P(blue eyes if Al) × P(Al) + P(blue eyes if Bob) × P(Bob)
= 1/2 × 1/2 + 1 × 1/2 = 3/4
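The update can be carried through to the posterior with exact fractions. The sketch below is an illustration, not part of the slides: the dictionary names are assumptions made for this example, and the likelihoods rest on the simplified eye-color model above (Al has brown eyes but a blue-eyed mother, so he is taken to carry one blue gene; Bob, with blue eyes, carries two; the child's mother always passes on a blue gene).

```python
from fractions import Fraction

# Prior: before seeing the baby's eyes, both men are equally likely.
prior = {"Al": Fraction(1, 2), "Bob": Fraction(1, 2)}

# P(blue eyes if father): Al passes on his blue gene half the time,
# Bob always does.
likelihood = {"Al": Fraction(1, 2), "Bob": Fraction(1)}

# Law of total probability: P(blue eyes).
p_blue = sum(likelihood[man] * prior[man] for man in prior)

# Bayes rule for each candidate father.
posterior = {man: likelihood[man] * prior[man] / p_blue for man in prior}

print(p_blue)           # 3/4
print(posterior["Al"])  # 1/3
```

Under these assumptions the blue eyes shift the probability that Al is the father from 1/2 down to 1/3, which matches option b) in the question above.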

Bayesian Inference
Why isn’t everyone a Bayesian? You need some “prior belief” for the probability of the truth. Also, until recently, it was hard to be a Bayesian (it needed complicated math). Now, we can let computers do the work for us!

Inference
Both kinds of inference have the same goal, and it is a goal fundamental to statistics: to use information from the data to gain information about the unknown truth.

To Do
Read 11.2.
Do Project 2 (paper due 12/6).
Do Homework 9 (all practice problems).

Course Evaluations