HUDM4122 Probability and Statistical Inference


HUDM4122 Probability and Statistical Inference April 1, 2015

Questions from last class?

Continuing from last class…

Another way to think about Confidence Intervals: For a 95% Confidence Interval, 5% of the time the true value will be outside the confidence interval. We can write this proportion of 5% as $\alpha$. If $\alpha$ is 0.05, then we have a 95% Confidence Interval.

So, for a 95% Confidence Interval: $\alpha$ is 0.05, and the cumulative normal probability runs from 0.025 to 0.975. This means that the confidence interval bounds are -1.96 SE and +1.96 SE. These values are called $-Z_{\alpha/2}$ and $+Z_{\alpha/2}$: $-Z_{\alpha/2} = -1.96$ and $+Z_{\alpha/2} = +1.96$.

Why is it $Z_{\alpha/2}$? Because to get a 95% confidence interval, $\alpha = 0.05$ and $0.975 - 0.025 = 0.95$, where $0.025 = \alpha/2$ and $0.975 = 1 - (\alpha/2)$.
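As a check on the table lookups, the critical value can also be computed directly from the confidence level. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an illustrative assumption and does not come from the course materials.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Return Z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    # The upper bound sits at cumulative probability 1 - alpha/2 (0.975 for 95%).
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

The computed values carry more decimal places than a printed Z table, so they can differ from table-based answers in the last digit.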

So, formally: for a given $\alpha$, the confidence interval is $\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, where $\frac{\sigma}{\sqrt{n}}$ is just your standard error. And we can use $s$ (sample standard deviation) for $\sigma$ whenever the sample is sufficiently large.
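The formula can be sketched in a few lines of Python. The function name `mean_confidence_interval` and the sample numbers in the usage line (mean 100, standard deviation 15, n = 36) are hypothetical, chosen only for illustration; the sketch assumes a sample large enough that $s$ can stand in for $\sigma$.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Large-sample confidence interval: x_bar +/- Z_{alpha/2} * s / sqrt(n)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    standard_error = sample_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 100, standard deviation 15, n = 36.
low, high = mean_confidence_interval(100.0, 15.0, 36)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```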

Regarding this whole “sufficiently large” thing: MBB (Mendenhall, Beaver, & Beaver) say “sufficiently large” is when n > 30. That’s probably reasonable. We’ll come back to this issue in a couple weeks when we discuss the difference between Z statistical tests and t statistical tests.

Questions? Comments?

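For a 99% Confidence Interval: $\alpha$ is 0.01, and the cumulative normal probability runs from 0.005 to 0.995. Look in your table to get 0.005 and 0.995. That’s approximately -2.58 and +2.58 (the exact value, 2.576, falls between the table entries 2.57 and 2.58). This means that the confidence interval bounds are -2.58 SE and +2.58 SE: $-Z_{\alpha/2} = -2.58$ and $+Z_{\alpha/2} = +2.58$.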

You try it: 90% Confidence Interval What are the bounds?

You try it: 90% Confidence Interval. $\alpha$ is 0.10, and the cumulative normal probability runs from 0.05 to 0.95. Look in your table to get 0.05 and 0.95. That’s approximately -1.64 and +1.64 (more precisely, 1.645). This means that the confidence interval bounds are -1.64 SE and +1.64 SE: $-Z_{\alpha/2} = -1.64$ and $+Z_{\alpha/2} = +1.64$.

Comments? Questions?

Note: The smaller your $\alpha$, the larger your % Confidence Interval. In other words, the bigger your interval, the more certain you are. And the smaller your interval, the less certain you are.

What you can adjust: your level of certainty; your sample size. What you can’t adjust: the population mean; the population standard deviation.

So if you want to be more certain without widening your interval: get a bigger sample size. Which is not always easy.
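To see why a bigger sample helps, recall that the margin of error is $Z_{\alpha/2} \cdot s / \sqrt{n}$, so it shrinks with the square root of the sample size. The sketch below illustrates this for a 95% interval; the function name `margin_of_error` and the standard deviation of 10 are assumptions made for illustration.

```python
import math

Z_95 = 1.96  # Z_{alpha/2} for a 95% confidence interval


def margin_of_error(sd: float, n: int, z: float = Z_95) -> float:
    """Half-width of the confidence interval: z * sd / sqrt(n)."""
    return z * sd / math.sqrt(n)


# Hypothetical standard deviation of 10; each step quadruples the sample size.
for n in (25, 100, 400):
    print(f"n = {n:3d}: margin of error = {margin_of_error(10.0, n):.2f}")
```

Quadrupling the sample size only halves the margin of error, which is one reason larger samples are "not always easy" to justify.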

On to today’s class…

Today Chapter 8.6-8.7 in Mendenhall, Beaver, & Beaver Estimating Differences Between Means and Proportions

Often… We don’t just want to estimate one mean or proportion We want to estimate the difference between two means or proportions

Examples You conduct a randomized experiment on the effectiveness of ASSISTments versus Dreambox. Which one has higher learning gains? You test out two medicines in a randomized experiment. Which one leads to a higher proportion of patients surviving? You test the SAT scores of students who take honors classes versus regular classes. Which group has higher SAT scores?

Formally Once assigned and treated differently, these become two populations Population 1 Population 2

Each has their own set of statistics

What is the difference between means?

What is the difference between means? We know already that each group’s mean can be treated as a single value, a point estimate Or as a distribution

What is the difference between means? So it stands to reason that the difference between group means Can be treated as a single value, a point estimate Or as a distribution

What is the difference between means? The point estimate of the difference is the difference between the two point estimates of the mean. E.g. our best estimate of $(\mu_1 - \mu_2)$ is $(\bar{x}_1 - \bar{x}_2)$.

We can call this value The mean difference between groups Or the most likely value for the difference between groups

Example Students using ASSISTments gain 30 points pre-post. Students using Dreambox gain 5 points pre-post. What is the mean difference in learning gains? 30-5 = 25 points

You try it You test the SAT scores of students who take honors classes versus regular classes. Honors students average 650. Regular students average 480. What is the mean difference in SAT?

You try it You test the SAT scores of students who take honors classes versus regular classes. Honors students average 650. Regular students average 480. What is the mean difference in SAT? 170 points

The Standard Error for the difference between groups. For sufficiently large samples: $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Example: Students using ASSISTments gain 30 points pre-post, with a standard deviation of 10 points. Students using Dreambox gain 5 points pre-post, with a standard deviation of 8 points. There are 30 students in each condition. What is the standard error for the mean difference in learning gains? $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{10^2}{30} + \frac{8^2}{30}} = \sqrt{\frac{100}{30} + \frac{64}{30}} = 2.34$
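The same arithmetic can be checked in code. This is a minimal sketch of the large-sample standard error for a difference between two means; the function name `se_diff_means` is an illustrative assumption, and the inputs are the ASSISTments and Dreambox numbers from the example.

```python
import math


def se_diff_means(s1: float, n1: int, s2: float, n2: int) -> float:
    """Standard error of (x1_bar - x2_bar): sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# ASSISTments: s = 10, n = 30. Dreambox: s = 8, n = 30.
se = se_diff_means(10, 30, 8, 30)
print(f"SE = {se:.2f}")
```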

You try it You test the SAT scores of students who take honors classes versus regular classes. In your school, honors students average 650, with standard deviation 40. Regular students average 480, with standard deviation 20. Your school has 40 honors students and 200 regular students. What is the standard error for the mean difference in SAT?

$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{40^2}{40} + \frac{20^2}{200}} = \sqrt{\frac{1600}{40} + \frac{400}{200}} = \sqrt{40 + 2} = \sqrt{42} \approx 6.5$

By the Central Limit Theorem, the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal when both $n_1$ and $n_2$ are > 30, with $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

We’ll talk about cases with smaller data sets Later in the semester

Questions? Comments?

Since it is normally distributed You can compute the 95% Confidence Interval as we discussed in the last class

Example: Students using ASSISTments gain 30 points pre-post, with a standard deviation of 10 points. Students using Dreambox gain 5 points pre-post, with a standard deviation of 8 points. There are 30 students in each condition. What is the 95% Confidence Interval for the mean difference in learning gains? $\bar{x}_1 - \bar{x}_2 = 25$, $SE = 2.34$. 95% CI = [25 - (1.96)(2.34), 25 + (1.96)(2.34)] = [20.41, 29.59]
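Putting the point estimate and the standard error together, the interval can be sketched as follows. The function name `diff_means_ci` is an assumption made for illustration, and the sketch applies only under the large-sample condition (both groups above 30). Because it keeps the unrounded standard error and critical value, its bounds can differ from the hand calculation in the last decimal place.

```python
import math
from statistics import NormalDist


def diff_means_ci(mean1, s1, n1, mean2, s2, n2, confidence=0.95):
    """Large-sample CI for mu1 - mu2: (x1_bar - x2_bar) +/- Z_{alpha/2} * SE."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


# ASSISTments (gain 30, s = 10, n = 30) versus Dreambox (gain 5, s = 8, n = 30).
low, high = diff_means_ci(30, 10, 30, 5, 8, 30)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

The interval lies entirely above zero, which is consistent with a positive difference in learning gains between the two groups.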

You try it: You test the SAT scores of students who take honors classes versus regular classes. In your school, honors students average 650, with standard deviation 40. Regular students average 480, with standard deviation 20. Your school has 40 honors students and 200 regular students. What is the 95% Confidence Interval for the mean difference in SAT? $\bar{x}_1 - \bar{x}_2 = 170$, $SE = 6.5$

Estimating the Difference Between Two Proportions Coming from a binomial distribution

What is the difference between proportions?

What is the difference between proportions? We know already that each group’s proportion can be treated as a single value, a point estimate. Or as a distribution.

What is the difference between proportions? So it stands to reason that the difference between group proportions can be treated as a single value, a point estimate. Or as a distribution.

What is the difference between proportions? The point estimate of the difference is the difference between the two point estimates of the proportions. E.g. our best estimate of $(p_1 - p_2)$ is the difference between the sample proportions $(\hat{p}_1 - \hat{p}_2)$.

Example You test out two medicines in a randomized experiment. Fworplomycin leads to 80% of patients surviving, while Penicillin leads to 30% of patients surviving. What is the mean difference in proportion of survival? 80%-30%=50%

Example You compare students in two sections of HUDM4122. If they later take HUDM5122, 85% of students in Section A pass HUDM5122, while only 80% of students in Section B pass HUDM5122. What is the mean difference in pass rates? 85%-80%=5%

The Standard Error for the difference between proportions. For sufficiently large samples: $SE = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$, where $\hat{q} = 1 - \hat{p}$

By the Central Limit Theorem, the sampling distribution of $(\hat{p}_1 - \hat{p}_2)$ is approximately normal when samples are sufficiently large: $n_1 \hat{p}_1 > 5$ AND $n_1 \hat{q}_1 > 5$ AND $n_2 \hat{p}_2 > 5$ AND $n_2 \hat{q}_2 > 5$, with $SE = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$
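The four sample-size conditions are easy to check mechanically before using the normal approximation. The helper below is a sketch only; the name `normal_approximation_ok` and the `threshold` parameter are assumptions made for illustration, with the threshold of 5 taken from the rule stated above.

```python
def normal_approximation_ok(p1: float, n1: int, p2: float, n2: int,
                            threshold: float = 5.0) -> bool:
    """Check n*p > threshold and n*q > threshold for both groups (q = 1 - p)."""
    expected_counts = (
        n1 * p1, n1 * (1 - p1),  # group 1: expected successes and failures
        n2 * p2, n2 * (1 - p2),  # group 2: expected successes and failures
    )
    return all(count > threshold for count in expected_counts)


# Two groups of 50 with sample proportions 0.8 and 0.3.
print(normal_approximation_ok(0.8, 50, 0.3, 50))
# Small groups with an extreme proportion fail the check.
print(normal_approximation_ok(0.9, 20, 0.5, 20))
```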

Example You test out two medicines in a randomized experiment, with 50 subjects in each group. Fworplomycin leads to 80% of patients surviving, while Penicillin leads to 30% of patients surviving. What is the standard error of the mean difference between proportions?

$SE = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} = \sqrt{\frac{(0.8)(0.2)}{50} + \frac{(0.3)(0.7)}{50}} = 0.086$
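This calculation can be reproduced with a short function. The name `se_diff_proportions` is an illustrative assumption; the inputs are the two-medicine example with 50 subjects per group.

```python
import math


def se_diff_proportions(p1: float, n1: int, p2: float, n2: int) -> float:
    """Standard error of (p1_hat - p2_hat): sqrt(p1*q1/n1 + p2*q2/n2)."""
    q1 = 1 - p1
    q2 = 1 - p2
    return math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)


# 80% survival versus 30% survival, 50 subjects in each group.
se = se_diff_proportions(0.8, 50, 0.3, 50)
print(f"SE = {se:.3f}")
```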

Example You test out two medicines in a randomized experiment, with 50 subjects in each group. Fworplomycin leads to 80% of patients surviving, while Penicillin leads to 30% of patients surviving. What is the 95% Confidence Interval of the mean difference between proportions?

$(\hat{p}_1 - \hat{p}_2) = 0.5$, $SE = 0.086$. 95% CI = [0.5 - (1.96)(0.086), 0.5 + (1.96)(0.086)] = [0.33, 0.67]
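The full interval for a difference between two proportions can be sketched the same way. The function name `diff_proportions_ci` is an assumption made for illustration, and the sketch presumes the sample-size conditions for the normal approximation already hold.

```python
import math
from statistics import NormalDist


def diff_proportions_ci(p1, n1, p2, n2, confidence=0.95):
    """Large-sample CI for p1 - p2: (p1_hat - p2_hat) +/- Z_{alpha/2} * SE."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# 80% survival versus 30% survival, 50 subjects in each group.
low, high = diff_proportions_ci(0.8, 50, 0.3, 50)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

The two sample sizes are separate arguments, so the same function applies when the groups differ in size.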

You try it You compare students in two sections of HUDM4122. If they later take HUDM5122, 85% of students in Section A (40 students) pass HUDM5122, while only 80% of students in Section B (30 students) pass HUDM5122. What is the standard error of the mean difference in pass rates, and what is the 95% Confidence Interval?

Note that Sample sizes can be different between groups

Comments? Questions?

Final questions or comments for the day?

Upcoming Classes: 4/6 Z tests; 4/8 No class; 4/13 Types of Errors, HW7 due