Week 8 Confidence Intervals for Means and Proportions.


1 Week 8 Confidence Intervals for Means and Proportions

2 Inference. Data are a single sample. We are interested in the underlying population, not the specific sample. The sample gives information about the population, but the randomness of the sample means uncertainty. This is called inference about the population.

3 Types of inference. Focus is on the value of a population parameter, e.g. a mean or proportion (probability). Estimation: what is the value of the parameter? Hypothesis testing: is the parameter equal to a specific value (usually zero)?

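4 Point estimate. To estimate a population parameter, use the corresponding sample statistic, e.g. the sample mean to estimate the population mean. There is likely to be an error in the estimate, e.g. the sample mean will rarely equal the population mean exactly. How big is the error likely to be?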

5 Error distribution. The error is random. Simulation from an ‘approx’ population could build up the error distribution. This shows how large the error from the actual sample data is likely to be.

6 Example. Silkworm survival after arsenic poisoning. How long will 1/4 survive? That is, what is the upper quartile of the survival times?

7 Simulation. Approx population: normal with the same mean & sd as the data. Target = UQ from this normal = 293.3 sec.

8 Simulation (cont). Sample UQs ≠ target. The simulation shows the error distribution. The error in the estimate (292 sec) is unlikely to be more than 10 sec.
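
The simulation idea on slides 7 and 8 could be sketched in Python roughly as follows. The population mean, sd and sample size (MEAN, SD, N) are hypothetical, since the transcript does not give them, so the target printed here is not the 293.3 sec from the slide.

```python
import random
import statistics
from statistics import NormalDist

# HYPOTHETICAL values: the slides do not give the data's mean, sd or sample size.
MEAN, SD, N = 270.0, 35.0, 80
N_SIMULATIONS = 1000

random.seed(1)  # fixed seed so the run is repeatable

# Target: upper quartile of the approximating normal population.
target = NormalDist(MEAN, SD).inv_cdf(0.75)

errors = []
for _ in range(N_SIMULATIONS):
    sample = [random.gauss(MEAN, SD) for _ in range(N)]
    # statistics.quantiles(..., n=4) returns three cut points; the last is the UQ.
    sample_uq = statistics.quantiles(sample, n=4)[2]
    errors.append(sample_uq - target)

# Summarise the simulated error distribution.
typical_error = statistics.stdev(errors)
share_within_10 = sum(abs(e) <= 10 for e in errors) / N_SIMULATIONS

print(f"target UQ = {target:.1f}")
print(f"sd of simulated errors = {typical_error:.2f}")
print(f"share of errors within 10 = {share_within_10:.2f}")
```

The list `errors` plays the role of the error distribution: its spread indicates how far a single sample UQ is likely to be from the target.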

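9 Error distn for proportion. Simulation is not needed.

| Quantity | Distribution | Mean | St devn |
|---|---|---|---|
| # success, $x$ | binomial$(n, \pi)$ | $n\pi$ | $\sqrt{n\pi(1-\pi)}$ |
| Propn(success), $p = x/n$ | approx normal | $\pi$ | $\sqrt{\pi(1-\pi)/n}$ |
| Error, $p - \pi$ | approx normal | $0$ | $\sqrt{\pi(1-\pi)/n}$ |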

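10 Standard error of proportion. Approx error distn: bias = 0, standard error = $\sqrt{p(1-p)/n}$ (the st devn of the error, with the sample proportion $p$ substituted for the unknown population proportion $\pi$).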

11 Teens and interracial dating. 1997 USA Today/Gallup Poll of teenagers across country: 57% of the 497 teens who go out on dates say they’ve been out with someone of another race or ethnic group. Point estimate: $p = 0.57$. Bias = 0. Standard error = $\sqrt{0.57 \times 0.43 / 497} = 0.022$.
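
The standard error for this poll (0.022 on the next slide) can be checked with a short Python sketch. The function name `standard_error_proportion` is only illustrative, and the formula assumed is the usual sqrt(p(1 - p)/n).

```python
import math


def standard_error_proportion(p: float, n: int) -> float:
    """Estimated standard error of a sample proportion: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)


# Values from the slide: 57% of 497 teens.
se = standard_error_proportion(0.57, 497)
print(f"standard error = {se:.3f}")  # prints "standard error = 0.022"
```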

12 Error distn (interracial dating). General normal error distn with $\mu = 0$, $\sigma = 0.022$. [Figure: normal curve of the error, horizontal axis marked at 0, ±0.022, ±0.044, ±0.066.] The error in the estimate, $p = 0.57$, is unlikely to be more than 0.05 and almost certainly less than 0.07.
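
As a rough check on "unlikely to be more than 0.05" and "almost certainly less than 0.07", the probabilities can be computed with Python's `statistics.NormalDist`. This is a sketch that assumes the error distn is exactly normal with mean 0 and st devn 0.022; the helper name `prob_error_within` is made up for the example.

```python
from statistics import NormalDist

# Error distribution from the slide: normal, mean 0, st devn 0.022.
error_dist = NormalDist(mu=0.0, sigma=0.022)


def prob_error_within(limit: float) -> float:
    """Probability that the error lies between -limit and +limit."""
    return error_dist.cdf(limit) - error_dist.cdf(-limit)


p_05 = prob_error_within(0.05)
p_07 = prob_error_within(0.07)
print(f"P(|error| < 0.05) = {p_05:.3f}")
print(f"P(|error| < 0.07) = {p_07:.4f}")
```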

13 Interval estimates. Survey 150 randomly selected students: 41% think marijuana should be legalized. If we report that between 33% and 49% of all students at the college think that marijuana should be legalized, how confident can we be that we are correct? Confidence interval: an interval of estimates that is ‘likely’ to capture the population value.

14 95% confidence interval. Legalise? $p = 0.41$, $n = 150$, so standard error = $\sqrt{0.41 \times 0.59 / 150} = 0.0402$. 70-95-100 rule of thumb: Prob(error < 2 x 0.0402) is approx 95%. We are 95% confident that $\pi$ is between 0.41 – 0.080 and 0.41 + 0.080, i.e. between 0.33 and 0.49 (the 95% Conf Interval).
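
The interval for the legalisation survey can be reproduced with a few lines of Python. This is a sketch of the "estimate ± 2 s.e." rule of thumb; the function name `proportion_ci` and its `multiplier` parameter are illustrative choices, not something from the slides.

```python
import math


def proportion_ci(p: float, n: int, multiplier: float = 2.0) -> tuple[float, float]:
    """Approximate confidence interval p ± multiplier × sqrt(p(1 - p)/n)."""
    se = math.sqrt(p * (1 - p) / n)
    margin = multiplier * se
    return p - margin, p + margin


# Survey values from the slide: p = 0.41, n = 150.
low, high = proportion_ci(0.41, 150)
print(f"95% C.I.: {low:.2f} to {high:.2f}")  # prints "95% C.I.: 0.33 to 0.49"
```

Passing `multiplier=1.96` gives the slightly narrower interval used when the error distn is taken to be exactly normal.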

15 Interpreting 95% C.I. The confidence interval is a function of the sample data, so it is random. It may not include the population parameter ($\pi$ here). In repeated samples, about 95% of CIs calculated as described will include $\pi$. We therefore say we are 95% confident that our single CI will include $\pi$.
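
The "repeated samples" interpretation can be illustrated by simulation. The sketch below draws many samples from a population whose proportion is known and counts how often the interval p ± 2 s.e. captures it. TRUE_PI, N and N_SAMPLES are hypothetical values picked for the illustration.

```python
import math
import random

TRUE_PI = 0.41    # HYPOTHETICAL true population proportion
N = 150           # sample size for each simulated survey
N_SAMPLES = 2000  # number of repeated samples

random.seed(2)  # fixed seed so the run is repeatable

covered = 0
for _ in range(N_SAMPLES):
    # One simulated survey: count the successes among N respondents.
    successes = sum(random.random() < TRUE_PI for _ in range(N))
    p = successes / N
    se = math.sqrt(p * (1 - p) / N)
    # Does this sample's interval include the true proportion?
    if p - 2 * se <= TRUE_PI <= p + 2 * se:
        covered += 1

coverage = covered / N_SAMPLES
print(f"share of intervals that include the true proportion = {coverage:.3f}")
```

Each individual interval either includes the true proportion or does not; the 95% describes the long-run share of intervals that do.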

16 Teens and interracial dating. Same 1997 USA Today/Gallup Poll: 57% of the 497 teens who go out on dates say they’ve been out with someone of another race or ethnic group. Point estimate: $p = 0.57$. Standard error = 0.022. 95% C.I. is 0.57 - 0.044 to 0.57 + 0.044, i.e. 0.526 to 0.614. We would prefer more decimals!

17 Teens and interracial dating. 95% C.I. is 0.526 to 0.614. We do not know whether $\pi$ is between 0.526 and 0.614. However, 95% of CIs calculated in this way will work. We are therefore 95% confident that $\pi$ is in (0.526, 0.614).

18 St error & width of 95% C.I. Smallest s.e. and C.I. width when: n is large; p is close to 0 or 1. Biggest s.e. and C.I. width when: n is small; p is close to 0.5.

19 Margin of error. Public opinion polls usually estimate several popn proportions. Each has its own “± 2 s.e.” describing accuracy. For a poll with n = 350:

| Question | propn | ± 2 x s.e. |
|---|---|---|
| Will vote for A | 0.45 | ± 0.0532 |
| Will vote for X | 0.04 | ± 0.0209 |
| Happy with govt | 0.66 | ± 0.0506 |
| Wants tax cut | 0.87 | ± 0.0360 |

20 Margin of error (cont). For the same poll (n = 350), the maximum possible ± 2 x s.e. occurs at p = 0.5 and is $2\sqrt{0.5 \times 0.5 / 350} = 1/\sqrt{350} = 0.053$. This maximum is quoted as the “margin of error” for the poll.
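
The "± 2 x s.e." column and the overall margin of error can be recomputed in Python. This is a sketch: the dictionary `poll` and the helper `two_se` are names made up for the example, and the maximum is taken at p = 0.5, where the s.e. is largest.

```python
import math

N = 350  # poll sample size from the slide

# Sample proportions from the poll table.
poll = {
    "Will vote for A": 0.45,
    "Will vote for X": 0.04,
    "Happy with govt": 0.66,
    "Wants tax cut": 0.87,
}


def two_se(p: float, n: int) -> float:
    """Two standard errors of a sample proportion."""
    return 2 * math.sqrt(p * (1 - p) / n)


margins = {question: two_se(p, N) for question, p in poll.items()}
for question, margin in margins.items():
    print(f"{question:<16} {poll[question]:.2f} ± {margin:.4f}")

# The largest possible value occurs at p = 0.5 and simplifies to 1/sqrt(n).
max_margin = two_se(0.5, N)
print(f"margin of error for poll = {max_margin:.4f}")
```

Quoting the single maximum value is conservative: every individual "± 2 s.e." in the table is no larger than it.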

21 Requirements for C.I. Sample should be randomly selected from population. “Large” sample size: at least 10 successes and 10 failures (though some say only 5 are needed). If the population is finite, it should be at least 10 times the sample size.

22 Case Study: Nicotine Patches vs Zyban. Study: New England Journal of Medicine (3/4/99). 893 participants randomly allocated to four treatment groups: placebo, nicotine patch only, Zyban only, and Zyban plus nicotine patch. Participants blinded: all used a patch (nicotine or placebo) and all took a pill (Zyban or placebo). Treatments used for nine weeks.

23 Nicotine Patches vs Zyban (cont). Conclusions: Zyban is effective (no overlap of Zyban and not-Zyban CIs). Nicotine patch is not particularly effective (overlap of patch and no-patch CIs).

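24 Error distribution for mean. Again, a simulation is unnecessary to find the error distribution (approx).

| Quantity | Distribution | Mean | St devn |
|---|---|---|---|
| Sample mean, $\bar{x}$ | approx normal | $\mu$ | $\sigma/\sqrt{n}$ |
| Error, $\bar{x} - \mu$ | approx normal | $0$ | $\sigma/\sqrt{n}$ |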

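25 Standard error of mean. Approx error distn: bias = 0, standard error = $s/\sqrt{n}$ (the st devn of the error, $\sigma/\sqrt{n}$, with the sample st devn $s$ substituted for the unknown $\sigma$).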

26 Poll: Class of 175 students. In a typical day, about how much time do you spend watching television?

Mean hours watching TV:

| n | Mean | Median | StDev |
|---|---|---|---|
| 175 | 2.09 | 2.000 | 1.644 |

Point estimate: $\bar{x} = 2.09$ hours. Bias = 0. Standard error = $1.644/\sqrt{175} = 0.124$.

27 Standard devn & standard error. Sample standard deviation: is approx $\sigma$; stays similar if n increases. Standard error of mean: is usually less than $\sigma$; decreases as n increases. Don’t get mixed up between the two!
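
The difference between the two can be seen by drawing samples of increasing size from one population. The sketch below uses a hypothetical normal population (MU and SIGMA are made-up values): the sample standard deviation stays near SIGMA, while the standard error of the mean shrinks as n grows.

```python
import math
import random
import statistics

MU, SIGMA = 2.0, 1.6  # HYPOTHETICAL population mean and st devn

random.seed(3)  # fixed seed so the run is repeatable

results = []
for n in (25, 100, 400, 1600):
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    s = statistics.stdev(sample)  # sample standard deviation, estimates SIGMA
    se = s / math.sqrt(n)         # standard error of the sample mean
    results.append((n, s, se))
    print(f"n = {n:>4}: st devn = {s:.3f}, standard error = {se:.3f}")
```

Each fourfold increase in n roughly halves the standard error, because it is divided by sqrt(n).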

28 Error distn (hours watching TV). General normal error distn with $\mu = 0$, $\sigma = 0.124$. [Figure: normal curve of the error, horizontal axis marked at 0, ±0.124, ±0.248, ±0.372.] The error in the estimate, $\bar{x} = 2.09$ hours, is unlikely to be more than 0.25 hrs and almost certainly less than 0.4 hrs.

29 General form for 95% C.I. [Figure: normal error distn, horizontal axis marked in multiples of the s.e.] If the error distn is normal with zero bias & we can find the s.e., then Prob(error is in ± 2 s.e.) is approx 0.95. 95% confidence interval: estimate ± 2 s.e. Or, if really sure the error distn is normal, 95% confidence interval: estimate ± 1.96 s.e.

30 95% confidence interval. Mean hrs watching TV? $\bar{x} = 2.09$ hrs, $n = 175$. 70-95-100 rule of thumb: Prob(error < 2 x 0.124) is approx 95%. We are 95% confident that $\mu$ is between 2.09 – 0.248 and 2.09 + 0.248, i.e. between 1.84 and 2.34 hours (the 95% C.I.).
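
The TV-watching interval follows the same "estimate ± 2 s.e." pattern and can be recomputed from the summary statistics of the class poll (n = 175, mean 2.09, st devn 1.644). The function name `mean_ci` and its parameters are illustrative only.

```python
import math


def mean_ci(mean: float, sd: float, n: int, multiplier: float = 2.0) -> tuple[float, float]:
    """Approximate confidence interval mean ± multiplier × sd/sqrt(n)."""
    se = sd / math.sqrt(n)
    margin = multiplier * se
    return mean - margin, mean + margin


# Summary statistics from the class poll.
low, high = mean_ci(mean=2.09, sd=1.644, n=175)
print(f"95% C.I.: {low:.2f} to {high:.2f} hours")  # prints "95% C.I.: 1.84 to 2.34 hours"
```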

31 Requirements for C.I. Sample should be randomly selected from population. “Large” sample size: n > 30 is often recommended. If the population is finite, it should be at least 10 times the sample size.

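32 Problem with small n. Known $\sigma$: the interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ works fine. Unknown $\sigma$: the interval $\bar{x} \pm 1.96\,s/\sqrt{n}$ has variable width, is less likely to include $\mu$, and has a confidence level less than 95%.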

33 C.I. for mean, small n. Solution is to replace 1.96 (or 2) by a bigger number. Look up t-tables with (n - 1) ‘degrees of freedom’.

| Sample size, n | d.f., n – 1 | $t_{n-1}$ |
|---|---|---|
| 100 | 99 | 1.98 |
| 30 | 29 | 2.05 |
| 10 | 9 | 2.26 |
| 5 | 4 | 2.78 |
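
A simulation can show why the bigger multiplier is needed. The sketch below takes many samples of size n = 5 from a hypothetical normal population (MU and SIGMA are made up) and compares how often the interval captures the population mean with 1.96 and with the tabulated t value for 4 d.f. (2.78).

```python
import math
import random
import statistics

MU, SIGMA = 25.0, 1.5  # HYPOTHETICAL population mean and st devn
N = 5                  # small sample size
T_4 = 2.78             # t-table value for n - 1 = 4 d.f. (from the table)
N_SAMPLES = 4000

random.seed(4)  # fixed seed so the run is repeatable


def covers(sample: list[float], multiplier: float) -> bool:
    """True if mean ± multiplier × s/sqrt(n) includes the population mean MU."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - multiplier * se <= MU <= mean + multiplier * se


hits_z = 0
hits_t = 0
for _ in range(N_SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    hits_z += covers(sample, 1.96)
    hits_t += covers(sample, T_4)

coverage_z = hits_z / N_SAMPLES
coverage_t = hits_t / N_SAMPLES
print(f"coverage with 1.96: {coverage_z:.3f}")
print(f"coverage with t = 2.78: {coverage_t:.3f}")
```

With 1.96 the intervals capture the mean noticeably less than 95% of the time; the t multiplier widens them enough to restore the stated confidence level.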

34 Example: Mean Forearm Length. Data: from a random sample of n = 9 men: 25.5, 24.0, 26.5, 25.5, 28.0, 27.0, 23.0, 25.0, 25.0 (cm). df = 8, $t_8 = 2.31$. 95% C.I.: $25.5 \pm 2.31(0.507)$ => $25.5 \pm 1.17$ => 24.33 to 26.67 cm.
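
The forearm calculation can be reproduced from the raw data with Python's `statistics` module. This is a sketch that takes the multiplier 2.31 directly from the slide rather than computing it; the variable names are illustrative.

```python
import math
import statistics

# Forearm lengths (cm) for the n = 9 men on the slide.
forearm_cm = [25.5, 24.0, 26.5, 25.5, 28.0, 27.0, 23.0, 25.0, 25.0]
T_8 = 2.31  # t value for 8 d.f., as given on the slide

n = len(forearm_cm)
mean = statistics.mean(forearm_cm)
s = statistics.stdev(forearm_cm)  # sample st devn (divisor n - 1)
se = s / math.sqrt(n)             # standard error of the mean
margin = T_8 * se

low, high = mean - margin, mean + margin
print(f"mean = {mean:.1f}, s.e. = {se:.3f}")
print(f"95% C.I.: {low:.2f} to {high:.2f} cm")  # prints "95% C.I.: 24.33 to 26.67 cm"
```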

35 What Students Sleep More? Q: How many hours of sleep did you get last night, to the nearest half hour?

| Class | n | Mean | StDev | SE Mean |
|---|---|---|---|---|
| Stat 10 (stat literacy) | 25 | 7.66 | 1.34 | 0.27 |
| Stat 13 (stat methods) | 148 | 6.81 | 1.73 | 0.14 |

Notes:
- CI for Stat 10 is wider (smaller sample size)
- Two intervals do not overlap
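
Both notes can be checked from the table. In the sketch below the t multipliers are assumptions that do not appear on this slide: 2.06 is the standard t-table value for 24 d.f., and 1.98 (the table value for 99 d.f.) is used as a close stand-in for 147 d.f.

```python
# (mean, SE mean, multiplier) for each class; means and SEs are from the table.
# The multipliers are ASSUMED t-table values, not given on the slide.
classes = {
    "Stat 10": (7.66, 0.27, 2.06),  # n = 25, so 24 d.f.
    "Stat 13": (6.81, 0.14, 1.98),  # n = 148, so 147 d.f.
}

intervals = {}
for name, (mean, se, t_value) in classes.items():
    margin = t_value * se
    intervals[name] = (mean - margin, mean + margin)
    low, high = intervals[name]
    print(f"{name}: {low:.2f} to {high:.2f} hours (width {high - low:.2f})")

# Intervals overlap only if each one starts before the other ends.
lo10, hi10 = intervals["Stat 10"]
lo13, hi13 = intervals["Stat 13"]
overlap = lo10 <= hi13 and lo13 <= hi10
print(f"intervals overlap: {overlap}")
```

Using the rule-of-thumb multiplier 2 for both classes leads to the same conclusions.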

36 Interpreting 95% C.I. The confidence interval is a function of the sample data, so it is random. It may not include the population parameter ($\mu$ here). In repeated samples, about 95% of CIs calculated as described will include $\mu$. We therefore say we are 95% confident that our single CI will include $\mu$.

