More About Confidence Intervals

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Confidence Intervals Chapter 12.
Presentation transcript:

No:1 More About Confidence Intervals
Chapter 12

No:2 Recall:
A parameter is a population characteristic; its value is usually unknown. We estimate the parameter using sample information.
A statistic, or estimate, is a characteristic of a sample. A statistic estimates a parameter.
A confidence interval is an interval of values computed from sample data that is likely to include the true population value.
The confidence level for an interval describes our confidence in the procedure we used. We are confident that most of the confidence intervals we compute using a procedure will contain the true population value.

No:3 12.1 Examples of Different Estimation Situations
Situation 1. Estimating the proportion falling into a category of a categorical variable.
Example research questions:
What proportion of American adults believe there is extraterrestrial life?
In what proportion of British marriages is the wife taller than her husband?
Population parameter: $p$ = proportion in the population falling into that category.
Sample estimate: $\hat{p}$ = proportion in the sample falling into that category.

No:4 More Estimation Situations
Situation 2. Estimating the mean of a quantitative variable.
Example research questions:
What is the mean time that college students watch TV per day?
What is the mean pulse rate of women?
Population parameter: $\mu$ (spelled “mu” and pronounced “mew”) = population mean for the variable.
Sample estimate: $\bar{x}$ = the sample mean for the variable.

No:5 More Estimation Situations
Situation 3. Estimating the difference between two populations with regard to the proportion falling into a category of a qualitative variable.
Example research questions:
How much difference is there between the proportions that would quit smoking if taking the antidepressant bupropion (Zyban) versus if wearing a nicotine patch?
How much difference is there between men who snore and men who don’t snore with regard to the proportion who have heart disease?
Population parameter: $p_1 - p_2$ = difference between the two population proportions.
Sample estimate: $\hat{p}_1 - \hat{p}_2$ = difference between the two sample proportions.

No:6 More Estimation Situations
Situation 4. Estimating the difference between two populations with regard to the mean of a quantitative variable.
Example research questions:
How much difference is there in average weight loss for those who diet compared to those who exercise to lose weight?
How much difference is there between the mean foot lengths of men and women?
Population parameter: $\mu_1 - \mu_2$ = difference between the two population means.
Sample estimate: $\bar{x}_1 - \bar{x}_2$ = difference between the two sample means.

No:7 Independent Samples
Two samples are called independent samples when the measurements in one sample are not related to the measurements in the other sample. Independent samples can arise in several ways:
Random samples are taken separately from two populations and the same response variable is recorded.
One random sample is taken and a variable recorded, but units are categorized to form two populations.
Participants are randomly assigned to one of two treatment conditions, and the same response variable is recorded.

No:8 Paired Data: A Special Case of One Mean
Paired data (or paired samples): when pairs of variables are collected. Only interested in population (and sample) of differences, and not in the original data.
Each person measured twice. Two measurements of same characteristic or trait are made under different conditions.
Similar individuals are paired prior to an experiment. Each member of a pair receives a different treatment. Same response variable is measured for all individuals.
Two different variables are measured for each individual. Interested in amount of difference between two variables.

No:9 12.2 Standard Errors
Rough Definition: The standard error of a sample statistic measures, roughly, the average difference between the statistic and the population parameter. This “average difference” is over all possible random samples of a given size that can be taken from the population.
Technical Definition: The standard error of a sample statistic is the estimated standard deviation of the sampling distribution for the statistic.

No:10 Standard Error of a Sample Proportion
$\text{s.e.}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Example 12.1 Intelligent Life on Other Planets
Poll: Random sample of 935 Americans. Do you think there is intelligent life on other planets?
Results: 60% of the sample said “yes”, $\hat{p}$ = .60
$\text{s.e.}(\hat{p}) = \sqrt{\frac{.60(1-.60)}{935}} = .016$
The standard error of .016 is roughly the average difference between the statistic, $\hat{p}$, and the population parameter, $p$, for all possible random samples of $n$ = 935 from this population.

No:11 Standard Error of a Sample Mean
$\text{s.e.}(\bar{x}) = \frac{s}{\sqrt{n}}$, where $s$ is the sample standard deviation.
Example 12.2 Mean Hours Watching TV
Poll: Class of 175 students. In a typical day, about how much time do you spend watching television?

Variable  N    Mean  Median  TrMean  StDev  SE Mean
TV        175  2.09  2.000   1.950   1.644  0.124

$\text{s.e.}(\bar{x}) = \frac{1.644}{\sqrt{175}} = .124$

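No:12 Standard Error of the Difference Between Two Sample Proportions
$\text{s.e.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Example 12.3 Patches vs Antidepressant (Zyban)?
Study: $n_1 = n_2 = 244$ randomly assigned to each treatment
Zyban: 85 of the 244 Zyban users quit smoking, $\hat{p}_1$ = .348
Patch: 52 of the 244 patch users quit smoking, $\hat{p}_2$ = .213
So, $\text{s.e.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{.348(1-.348)}{244} + \frac{.213(1-.213)}{244}} = .040$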

No:13 Standard Error of the Difference Between Two Sample Means
$\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Example 12.4 Lose More Weight by Diet or Exercise?
Study: $n_1$ = 42 men on diet, $n_2$ = 47 men on exercise routine
Diet: Lost an average of 7.2 kg with std dev of 3.7 kg
Exercise: Lost an average of 4.0 kg with std dev of 3.9 kg
So, $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{(3.7)^2}{42} + \frac{(3.9)^2}{47}} = .81$
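
The four standard errors in Examples 12.1 to 12.4 can be checked with a few lines of Python. This is only a sketch: the function names are illustrative and not from the slides, and the inputs are the summary numbers quoted in the examples.

```python
import math

def se_proportion(p_hat, n):
    """Standard error of one sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

def se_mean(s, n):
    """Standard error of one sample mean, from the sample std dev s."""
    return s / math.sqrt(n)

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two independent sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def se_diff_means(s1, n1, s2, n2):
    """Unpooled standard error of the difference between two independent sample means."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Summary numbers taken from Examples 12.1 to 12.4
se_life = se_proportion(0.60, 935)                          # about .016
se_tv = se_mean(1.644, 175)                                 # about .124
se_quit = se_diff_proportions(85 / 244, 244, 52 / 244, 244) # about .040
se_weight = se_diff_means(3.7, 42, 3.9, 47)                 # about .81

print(round(se_life, 3), round(se_tv, 3), round(se_quit, 3), round(se_weight, 2))
```

All four functions share one pattern: an estimated variance of the statistic under the square root, with variances adding when two independent samples are compared.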

No:14 12.3 Approximate 95% CI
For sufficiently large samples, the interval
Sample estimate ± 2 × Standard error
is an approximate 95% confidence interval for a population parameter.
Note: The 95% confidence level describes how often the procedure provides an interval that includes the population value. For about 95% of all random samples of a specific size from a population, the confidence interval captures the population parameter.

No:15 Necessary Conditions
Sample Size Requirements:
For one proportion: Both $n\hat{p}$ and $n(1-\hat{p})$ are at least 5, preferably at least 10.
For one mean: $n$ is greater than 30.
For two proportions: $n\hat{p}$ and $n(1-\hat{p})$ are at least 5 (preferably 10) for each sample.
For two means: $n_1$ and $n_2$ are each greater than 30.

No:16 Necessary Conditions
Other Requirements:
The samples are randomly selected. In practice, it is sufficient to assume that samples are representative of the population for the question of interest.
For the confidence intervals for the difference between two proportions or two means, the two samples must be independent of each other.
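
The sample size requirements for the approximate interval are easy to encode as checks. The sketch below uses hypothetical helper names and assumes the rules are "at least 5 (preferably 10) successes and failures" for a proportion and "n greater than 30" for a mean. It cannot check random selection or independence, which depend on how the data were collected.

```python
def proportion_sample_ok(n, p_hat, minimum=5):
    """True when both n*p_hat and n*(1 - p_hat) reach the minimum count."""
    return n * p_hat >= minimum and n * (1 - p_hat) >= minimum

def mean_sample_ok(n):
    """True when the sample is large enough for the approximate interval (n > 30)."""
    return n > 30

# Example 12.1: n = 935, p_hat = .60 gives 561 "yes" and 374 "no" responses
print(proportion_sample_ok(935, 0.60, minimum=10))
# Example 12.2: n = 175 students
print(mean_sample_ok(175))
```

For two proportions or two means, the same check would be applied to each sample separately.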

No:17 Example 12.1 Intelligent Life? (cont)
Poll: Random sample of 935 Americans. Do you think there is intelligent life on other planets?
Results: 60% of the sample said “yes”, $\hat{p}$ = .60
Approximate 95% Confidence Interval: .60 ± 2(.016) => .60 ± .032 => .568 to .632
Note: For about 95% of all random samples from the population, the corresponding confidence interval captures the population parameter. We don’t know whether this particular interval does or does not capture the population value.

No:18 Example 12.2 Watching TV (cont)
Poll: Class of 175 students. In a typical day, about how much time do you spend watching television?
The sample mean was 2.09 hours and the sample standard deviation was 1.644 hours.
Approximate 95% Confidence Interval: 2.09 ± 2(.124) => 2.09 ± .248 => 1.842 to 2.338 hours
Note: We are 95% confident that the mean time that Penn State students spend watching television per day is somewhere between 1.842 and 2.338 hours.

No:19 Example 12.3 Patch vs Antidepressant (cont)
Study: $n_1 = n_2 = 244$ randomly assigned to each group
Zyban: 85 of the 244 Zyban users quit smoking, $\hat{p}_1$ = .348
Patch: 52 of the 244 patch users quit smoking, $\hat{p}_2$ = .213
So, $\hat{p}_1 - \hat{p}_2 = .348 - .213 = .135$ and $\text{s.e.}(\hat{p}_1 - \hat{p}_2) = .040$
Approximate 95% Confidence Interval: .135 ± 2(.040) => .135 ± .080 => .055 to .215
Note: Zyban had a higher success rate and the interval does not include the value 0, so it supports a difference between the success rates of the two methods.

No:20 Example 12.4 Diet vs Exercise (cont)
Study: $n_1$ = 42 men on diet, $n_2$ = 47 men on exercise routine
Diet: Lost an average of 7.2 kg with std dev of 3.7 kg
Exercise: Lost an average of 4.0 kg with std dev of 3.9 kg
So, $\bar{x}_1 - \bar{x}_2 = 7.2 - 4.0 = 3.2$ kg and $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = .81$ kg
Approximate 95% Confidence Interval: 3.2 ± 2(.81) => 3.2 ± 1.62 => 1.58 to 4.82 kg
Note: We are 95% confident the interval 1.58 to 4.82 kg covers the increased mean population weight loss for dieters compared to those who exercise. The interval does not cover 0, so a real difference is likely to hold for the population.
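
The "estimate ± 2 standard errors" rule used in the four continued examples reduces to one small function. In this sketch the name `approx_95_ci` is illustrative, and the inputs are the rounded estimates and standard errors quoted on the slides, so the results reproduce the slide arithmetic rather than a more precise calculation.

```python
def approx_95_ci(estimate, standard_error):
    """Approximate 95% CI for large samples: estimate +/- 2 standard errors."""
    margin = 2 * standard_error
    return estimate - margin, estimate + margin

# (estimate, standard error) pairs as rounded on the slides
examples = {
    "12.1 intelligent life (proportion)": (0.60, 0.016),
    "12.2 TV hours (mean)": (2.09, 0.124),
    "12.3 Zyban minus patch (difference in proportions)": (0.135, 0.040),
    "12.4 diet minus exercise, kg (difference in means)": (3.2, 0.81),
}

intervals = {name: approx_95_ci(est, se) for name, (est, se) in examples.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: {low:.3f} to {high:.3f}")
```

The multiplier 2 is a rounded stand-in for the exact normal value of about 1.96, which is why the interval is called approximate.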

No:21 12.4 General CI for One Mean or Paired Data
A Confidence Interval for a Population Mean:
$\bar{x} \pm t^* \times \text{s.e.}(\bar{x})$, with $\text{s.e.}(\bar{x}) = \frac{s}{\sqrt{n}}$,
where the multiplier $t^*$ is the value in a t-distribution with degrees of freedom = df = $n - 1$ such that the area between $-t^*$ and $t^*$ equals the desired confidence level. (Found from Table A.2.)
Conditions:
Population of measurements is bell-shaped and a random sample of any size is measured; OR
Population of measurements is not bell-shaped, but a large random sample is measured, $n \ge 30$.

No:22 Example 12.5 Mean Forearm Length
Data: Forearm lengths (cm) for a random sample of $n$ = 9 men
25.5, 24.0, 26.5, 25.5, 28.0, 27.0, 23.0, 25.0, 25.0
Note: Dotplot shows no obvious skewness and no outliers.
Multiplier $t^*$ from Table A.2 with df = 8 is $t^*$ = 2.31
95% Confidence Interval: 25.5 ± 2.31(.507) => 25.5 ± 1.17 => 24.33 to 26.67 cm
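
The forearm interval can be reproduced directly from the nine measurements. This sketch uses only the standard library, so the multiplier is typed in from the slide (t* = 2.31 for df = 8) rather than computed; the function name is illustrative.

```python
import math
import statistics

forearm_cm = [25.5, 24.0, 26.5, 25.5, 28.0, 27.0, 23.0, 25.0, 25.0]

def t_interval(data, t_star):
    """CI for a population mean: x-bar +/- t* * s / sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    se = statistics.stdev(data) / math.sqrt(n)  # stdev uses the n - 1 divisor
    margin = t_star * se
    return mean - margin, mean + margin, se

T_STAR_DF8 = 2.31  # 95% multiplier for df = 8, as read from Table A.2 on the slide
low, high, se = t_interval(forearm_cm, T_STAR_DF8)
print(f"s.e. = {se:.3f}, 95% CI: {low:.2f} to {high:.2f} cm")
```

With statistical software available, the multiplier would normally come from a t-distribution quantile function instead of a table.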

No:23 Example 12.6 What Students Sleep More?
Q: How many hours of sleep did you get last night, to the nearest half hour?

Class                    N    Mean  StDev  SE Mean
Stat 10 (stat literacy)  25   7.66  1.34   0.27
Stat 13 (stat methods)   148  6.81  1.73   0.14

Note: Bell-shape was reasonable for Stat 10 (with smaller n).
Notes:
Interval for Stat 10 is wider (smaller sample size).
Two intervals do not overlap => Stat 10 average significantly higher than Stat 13 average.
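
The two 95% intervals compared in this example can be rebuilt from the summary table. The sketch below assumes t* = 2.064 for df = 24 and an approximate t* = 1.976 for df = 147; both are standard table values typed in by hand, not numbers taken from the slide, and the rounded SE Mean column is used as given.

```python
def t_interval_from_summary(mean, se, t_star):
    """CI for a mean from summary statistics: mean +/- t* * s.e."""
    margin = t_star * se
    return mean - margin, mean + margin

# Assumed multipliers (not shown on the slide): 95% level, df = n - 1
T_STAR_DF24 = 2.064    # Stat 10, n = 25
T_STAR_DF147 = 1.976   # Stat 13, n = 148 (approximate)

stat10 = t_interval_from_summary(7.66, 0.27, T_STAR_DF24)
stat13 = t_interval_from_summary(6.81, 0.14, T_STAR_DF147)

width10 = stat10[1] - stat10[0]
width13 = stat13[1] - stat13[0]
overlap = stat10[0] <= stat13[1] and stat13[0] <= stat10[1]

print(f"Stat 10: {stat10[0]:.2f} to {stat10[1]:.2f} (width {width10:.2f})")
print(f"Stat 13: {stat13[0]:.2f} to {stat13[1]:.2f} (width {width13:.2f})")
print("Intervals overlap:", overlap)
```

The lower end of the Stat 10 interval and the upper end of the Stat 13 interval are very close, so the "no overlap" conclusion here is sensitive to rounding.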

No:24 Paired Data Confidence Interval
Data: two variables for $n$ individuals or pairs; use the difference $d = x_1 - x_2$.
Population parameter: $\mu_d$ = mean of differences for the population = $\mu_1 - \mu_2$.
Sample estimate: $\bar{d}$ = sample mean of the differences.
Standard deviation and standard error: $s_d$ = standard deviation of the sample of differences; $\text{s.e.}(\bar{d}) = \frac{s_d}{\sqrt{n}}$.
Confidence interval for $\mu_d$: $\bar{d} \pm t^* \times \text{s.e.}(\bar{d})$, where df = $n - 1$ for the multiplier $t^*$.

No:25 Example 12.7 Screen Time: Computer vs TV
Data: Hours spent watching TV and hours spent on computer per week for $n$ = 25 students.
Task: Make a 90% CI for the mean difference in hours spent using computer versus watching TV.
Note: Boxplot shows no obvious skewness and no outliers.

No:26 Example 12.7 Screen Time: Computer vs TV
Results: Multiplier $t^*$ from Table A.2 with df = 24 is $t^*$ = 1.71
90% Confidence Interval: 5.36 ± 1.71(3.05) => 5.36 ± 5.22 => 0.14 to 10.58 hours
Interpretation: We are 90% confident that the average difference between computer usage and television viewing for students represented by this sample is covered by the interval from 0.14 to 10.58 hours per week, with more hours spent on computer usage than on television viewing.
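
A paired-data interval is a one-sample t interval applied to the differences. The transcript gives only the summary numbers (mean difference 5.36, standard error 3.05, t* = 1.71), so this sketch works from those; the helper name `paired_ci` is illustrative.

```python
def paired_ci(mean_diff, se_diff, t_star):
    """CI for the mean of paired differences: d-bar +/- t* * s.e.(d-bar)."""
    margin = t_star * se_diff
    return mean_diff - margin, mean_diff + margin

# Summary values quoted in Example 12.7 (computer hours minus TV hours, n = 25)
T_STAR_90_DF24 = 1.71  # 90% multiplier for df = 24, as read from the slide
low, high = paired_ci(5.36, 3.05, T_STAR_90_DF24)
print(f"90% CI for the mean difference: {low:.2f} to {high:.2f} hours")
```

With the raw data, the differences would be formed first, and their sample mean and standard deviation would supply `mean_diff` and `se_diff`.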

No:27 12.5 General CI for Difference Between Two Means (Indep)
A CI for the Difference Between Two Means (Independent Samples):
$\bar{x}_1 - \bar{x}_2 \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
where $t^*$ is the value in a t-distribution with area between $-t^*$ and $t^*$ equal to the desired confidence level. The df used depends on whether equal population variances are assumed.

No:28 Necessary Conditions
Two samples must be independent.
Either populations of measurements are both bell-shaped, and random samples of any size are measured,
or large ($n \ge 30$) random samples are measured.

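No:29 Degrees of Freedom
The t-distribution is only approximately correct and the df formula is complicated (Welch’s approximation):
$\text{df} \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$
Statistical software can use this approximation, but if the calculation is done by hand, use a conservative df = smaller of $n_1 - 1$ and $n_2 - 1$.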

No:30 Example 12.8 Effect of a Stare on Driving
Randomized experiment: Researchers either stared or did not stare at drivers stopped at a campus stop sign, then timed how long (sec) it took each driver to proceed from the sign to a mark on the other side of the intersection.
No Stare Group (n = 14): 8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7
Stare Group (n = 13): 5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8
Task: Make a 95% CI for the difference between the mean crossing times for the two populations represented by these two independent samples.

No:31 Example 12.8 Effect of a Stare on Driving
Checking Conditions: Boxplots show …
No outliers and no strong skewness.
Crossing times in stare group generally faster and less variable.

No:32 Example 12.8 Effect of a Stare on Driving
Note: The df = 21 was reported by the computer package based on Welch’s approximation formula.
The 95% confidence interval for the difference between the population means is 0.14 seconds to 1.93 seconds.
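
The unpooled interval and Welch's degrees of freedom for the stare experiment can be recomputed from the listed crossing times. This is a standard-library sketch with an illustrative function name; the multiplier t* = 2.08 for df = 21 is an assumed table value, since the transcript does not show it.

```python
import math
import statistics

no_stare = [8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7]
stare = [5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8]

def welch_df(x, y):
    """Welch's approximate degrees of freedom for two independent samples."""
    v1 = statistics.variance(x) / len(x)   # s1^2 / n1
    v2 = statistics.variance(y) / len(y)   # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (len(x) - 1) + v2 ** 2 / (len(y) - 1))

def unpooled_ci(x, y, t_star):
    """CI for mu1 - mu2 using the unpooled standard error."""
    diff = statistics.mean(x) - statistics.mean(y)
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return diff - t_star * se, diff + t_star * se

df = welch_df(no_stare, stare)   # software typically reports this rounded down
T_STAR_DF21 = 2.08               # assumed 95% multiplier for df = 21
low, high = unpooled_ci(no_stare, stare, T_STAR_DF21)
print(f"Welch df = {df:.1f}, 95% CI: {low:.2f} to {high:.2f} seconds")
```

The conservative by-hand choice, df = smaller of n1 - 1 and n2 - 1 = 12, has a larger multiplier and therefore gives a somewhat wider interval.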

No:33 Equal Variance Assumption
Often reasonable to assume the two populations have equal population standard deviations, or equivalently, equal population variances: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Estimate of this variance based on the combined or “pooled” data is called the pooled variance. The square root of the pooled variance is called the pooled standard deviation:
$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

No:34 Pooled Standard Error
Pooled $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Note: Pooled df = $(n_1 - 1) + (n_2 - 1) = (n_1 + n_2 - 2)$.

No:35 Pooled Confidence Interval
Pooled CI for the Difference Between Two Means (Independent Samples):
$\bar{x}_1 - \bar{x}_2 \pm t^* \, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
where $t^*$ is found using a t-distribution with df = $(n_1 + n_2 - 2)$ and $s_p$ is the pooled standard deviation.

No:36 Example 12.9 Male and Female Sleep Times
Q: How much difference is there between how long female and male students slept the previous night?
Data: The 83 female and 65 male responses from students in an intro stat class.
Task: Make a 95% CI for the difference between the population mean sleep hours of females and males.
Note: We will assume equal population variances.

No:37 Example 12.9 Male and Female Sleep Times
Two-sample T for sleep [with “Assume Equal Variance” option]

Sex     N   Mean  StDev  SE Mean
Female  83  7.02  1.75   0.19
Male    65  6.55  1.68   0.21

Difference = mu (Female) – mu (Male)
Estimate for difference: 0.461
95% CI for difference: (-0.103, 1.025)
T-Test of difference = 0 (vs not =): T-Value = 1.62 P = 0.108 DF = 146
Both use Pooled StDev = 1.72

Notes:
Two sample standard deviations are very similar.
Sample mean for females higher than for males.
95% confidence interval contains 0 so cannot rule out that the population means may be equal.

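No:38 Example 12.9 Male and Female Sleep Times
Pooled standard deviation and pooled standard error “by-hand”:
$s_p = \sqrt{\frac{(83 - 1)(1.75)^2 + (65 - 1)(1.68)^2}{83 + 65 - 2}} = 1.72$
Pooled $\text{s.e.}(\bar{x}_1 - \bar{x}_2) = 1.72\sqrt{\frac{1}{83} + \frac{1}{65}} = 0.285$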
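
The pooled calculation for the sleep data can be sketched from the summary table: each sample variance is weighted by its degrees of freedom, and the pooled standard deviation is then scaled by the square root of 1/n1 + 1/n2. The function names are illustrative, and t* = 1.976 for df = 146 is an assumed approximate value, not one shown in the transcript.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation: variances weighted by their degrees of freedom."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

def pooled_se(sp, n1, n2):
    """Pooled standard error of the difference between two sample means."""
    return sp * math.sqrt(1 / n1 + 1 / n2)

# Summary statistics from the software output (females first, then males)
sp = pooled_sd(1.75, 83, 1.68, 65)
se = pooled_se(sp, 83, 65)

T_STAR_DF146 = 1.976   # assumed approximate 95% multiplier for df = 146
estimate = 0.461       # difference in means as reported by the software
low, high = estimate - T_STAR_DF146 * se, estimate + T_STAR_DF146 * se
print(f"sp = {sp:.2f}, pooled s.e. = {se:.3f}, 95% CI: {low:.2f} to {high:.2f}")
```

Because the inputs are rounded summary statistics, the endpoints agree with the software interval (-0.103, 1.025) only to about two decimal places.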

No:39 Pooled or Unpooled?
If sample sizes are equal, the pooled and unpooled standard errors are equal.
If sample standard deviations are similar, the assumption of equal population variance is reasonable and the pooled procedure can be used.
If sample sizes are very different, the pooled test can be quite misleading unless sample standard deviations are similar.
If the smaller standard deviation accompanies the larger sample size, we do not recommend using the pooled procedure.
If sample sizes are very different, the standard deviations are similar, and the larger sample size produced the larger standard deviation, the pooled procedure is acceptable because it will be conservative.

No:40 12.6 The Difference Between Two Proportions (Indep)
A CI for the Difference Between Two Proportions (Independent Samples):
$\hat{p}_1 - \hat{p}_2 \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
where $z^*$ is the value of the standard normal variable with area between $-z^*$ and $z^*$ equal to the desired confidence level.

No:41 Necessary Conditions
Condition 1: Sample proportions are available based on independent, randomly selected samples from the two populations.
Condition 2: All of the quantities $n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, and $n_2(1-\hat{p}_2)$ are at least 5 and preferably at least 10.

No:42 Example 12.10 Snoring and Heart Attacks
Q: Is there a relationship between snoring and risk of heart disease?
Data: Of 1105 snorers, 86 had heart disease. Of 1379 nonsnorers, 24 had heart disease.

No:43 Example 12.10 Snoring and Heart Attacks
Note: The higher the level of confidence, the wider the interval.
It appears that the proportion of snorers with heart disease in the population is about 4% to 8% higher than the proportion of nonsnorers with heart disease.
Risk of heart disease for snorers is about 4.5 times what the risk is for nonsnorers.
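
The snoring figures can be checked from the counts given in the example. In this sketch the z* multiplier comes from Python's `statistics.NormalDist`, and the function name is illustrative. Looping over several confidence levels shows the widening noted above, and the ratio of the two sample proportions gives the "about 4.5 times" relative risk.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """CI for p1 - p2 from two independent samples (large-sample z interval)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)  # area between -z* and z*
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Counts from Example 12.10: snorers first, then nonsnorers
for level in (0.90, 0.95, 0.99):
    lo, hi = two_proportion_ci(86, 1105, 24, 1379, confidence=level)
    print(f"{level:.0%} CI: {lo:.3f} to {hi:.3f}")

low, high = two_proportion_ci(86, 1105, 24, 1379)
relative_risk = (86 / 1105) / (24 / 1379)
print(f"Relative risk: {relative_risk:.1f}")
```

All four counts (86, 1019, 24, 1355) exceed 10, so the sample size condition for this interval holds.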