1 MET 136 Statistical Climatology - Lecture 11 Confidence Intervals Dr. Marty Leach San Jose State University Reading: Gonick Chapter 7
2 Sampling We previously studied how samples of large populations were distributed. Now, we’ll look at one sample, and study what we can determine from this alone.
4 Confidence Intervals Are used extensively in science. Used in election polls (watch it!). Example: The average global air temperature near the Earth's surface increased 0.74 ± 0.18 °C (1.33 ± 0.32 °F) during the 100 years ending in 2005 (IPCC 2007).
5 Example 1 Election Numbers x?g=252060cf-f1d3-49bc-80ed-24d0c9122b49&d=0 Let’s look at the numbers.
6 Poll Surveyed 661 likely voters, randomly selected (N=661). Result: $\hat{p} = 0.53$
8 Standard deviation of normal To determine the accuracy of this probability, we need to calculate the standard deviation: $\sigma(\hat{p}) = \sqrt{p(1-p)/N}$ Only problem… we don’t know the true probability, p.
11 Standard Error The only thing we can do is use the sampled probability $\hat{p}$ in place of $p$: $SE(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/N}$ This is called the standard error.
13 Standard Error So now we can estimate the confidence interval at the 95% level: $\hat{p} - 1.96\,SE(\hat{p}) \le p \le \hat{p} + 1.96\,SE(\hat{p})$ This says that 95% of the time, an interval built this way will contain the true probability p.
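15 Calculate confidence interval Let’s calculate the 95% confidence interval for the presidential poll in CA. N=661, $\hat{p} = 0.53$, so $SE(\hat{p}) = \sqrt{0.53 \times 0.47 / 661} \approx 0.0194$. So now p is within the range $0.53 \pm 1.96 \times 0.0194$, i.e. $p = 0.53 \pm 0.038$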
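The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name `proportion_ci` is made up for illustration, and z = 1.96 is the usual normal-table value for the 95% level.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Return (standard error, margin of error, lower, upper) for a sample proportion."""
    # The standard error uses the sampled p-hat in place of the unknown true p.
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    margin = z * se
    return se, margin, p_hat - margin, p_hat + margin

# Poll numbers from the slide: p-hat = 0.53, N = 661.
se, margin, lower, upper = proportion_ci(0.53, 661)
print(f"SE = {se:.4f}, margin = {margin:.3f}, interval = [{lower:.3f}, {upper:.3f}]")
```

Carrying the unrounded standard error through the multiplication is what gives a margin of about 0.038 rather than 0.037.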
17 Interpretation So what does this mean? $p = 0.53 \pm 0.038$, i.e. $0.492 \le p \le 0.568$. Slight oversight… Obama: 53, McCain: 43, Undecided/other: 4
18 20 samples with n=1000; assume true value p=0.5. The 95% confidence interval of each sample is shown. On average, 1 in 20 will not cover 0.5.
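The "1 in 20" claim can be tried out by simulation. The sketch below is an illustration, not the code behind the figure: `coverage_experiment` and its parameters are hypothetical names, and the random seed is fixed only so the run is repeatable.

```python
import math
import random

def coverage_experiment(n_samples=20, n=1000, p_true=0.5, z=1.96, seed=0):
    """Simulate n_samples polls of size n; count the 95% intervals that miss p_true."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_samples):
        # Each respondent says "yes" with probability p_true.
        successes = sum(1 for _ in range(n) if rng.random() < p_true)
        p_hat = successes / n
        se = math.sqrt(p_hat * (1.0 - p_hat) / n)
        lower, upper = p_hat - z * se, p_hat + z * se
        if not (lower <= p_true <= upper):
            misses += 1
    return misses

# With many more than 20 samples the miss rate should settle near 5%.
misses = coverage_experiment(n_samples=2000)
print(f"{misses} of 2000 intervals missed p = 0.5 ({misses / 2000:.1%})")
```

Any single batch of 20 samples may show zero, one, or several misses; the 1-in-20 figure is only a long-run average.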
20 Improve the results Suppose we want more confidence, say 99%. What can we do? Widen the confidence interval, or increase the sample size.
22 Example Redo the confidence interval at the 99% level. Result: $0.53 \pm 2.58 \times 0.0194$, so $p = 0.53 \pm 0.050$, i.e. $0.48 \le p \le 0.58$. But now our margin of error is larger… (e.g. I’m 100% confident the probability will be between 0 and 1!)
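The multipliers 1.96 and 2.58 come from the normal table, and can also be computed. A small sketch using `statistics.NormalDist` from the Python standard library follows; the helper name `z_for_confidence` is hypothetical, and the standard error 0.0194 is the rounded value for the poll ($\sqrt{0.53 \times 0.47 / 661}$).

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided critical value: `level` of the standard normal lies between -z and +z."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)

se = 0.0194  # rounded standard error of the poll
for level in (0.90, 0.95, 0.99):
    z = z_for_confidence(level)
    print(f"{level:.0%}: z = {z:.2f}, margin = {z * se:.3f}")
```

The margin grows with the confidence level because a larger share of the normal curve has to fit inside the interval.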
24 Sample Size But what if we are not happy that our error has gone up? The other way to keep the error down and the confidence high is to increase the sample size: $N = Z^2\, p^*(1-p^*) / E^2$ where Z is from the normal table (pg 84), p* is the estimate of the probability and E is the margin of error.
26 Example So now calculate the sample size required to produce a margin of error of 0.01 at a 99% confidence level. Result: More than 16,000 respondents! Limits to polling…
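The sample-size result can be reproduced with a short sketch, assuming the usual formula $N = Z^2 p^*(1-p^*)/E^2$ with Z = 2.58 for 99%. The slide does not say which p* was used, so both the poll estimate 0.53 and the conservative choice 0.5 are tried here as assumptions; `required_sample_size` is a made-up name.

```python
import math

def required_sample_size(margin, z, p_star=0.5):
    """Smallest whole N for which z * sqrt(p*(1 - p*) / N) does not exceed `margin`."""
    return math.ceil(z ** 2 * p_star * (1.0 - p_star) / margin ** 2)

# 99% confidence (z = 2.58) with a margin of error of 0.01.
print("p* = 0.53:", required_sample_size(0.01, 2.58, p_star=0.53))
print("p* = 0.50:", required_sample_size(0.01, 2.58))
```

Either choice of p* lands above 16,000. Because E is squared in the denominator, halving the margin of error takes four times as many respondents.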
27 Confidence intervals for the mean Now, we’ll look at confidence intervals for the mean, not the probability.
30 Standard Error The standard error of the mean is defined as: $SE(\bar{x}) = s/\sqrt{n}$ where s is the sample standard deviation and n is the sample size.
31 Example Suppose that you calculate the average winter low temperature in Silicon Valley during the last 25 years to be 41.5 °F, with a standard deviation of 3.2 °F. Compute the 95% confidence interval for the mean temperature. If temperatures below 40 °F are required for fruit to start growing in the valley, would you expect this to happen in a typical winter?
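One way to work this example is sketched below. It assumes the standard error of the mean $s/\sqrt{n}$ and, as a first pass, the normal multiplier 1.96; with only 25 years of data a t multiplier is arguably more appropriate. The function name `mean_ci_normal` is hypothetical.

```python
import math

def mean_ci_normal(mean, s, n, z=1.96):
    """Return (standard error, lower, upper) for a sample mean, normal approximation."""
    se = s / math.sqrt(n)  # standard error of the mean
    margin = z * se
    return se, mean - margin, mean + margin

# Numbers from the example: mean 41.5 F, standard deviation 3.2 F, 25 winters.
se, lower, upper = mean_ci_normal(41.5, 3.2, 25)
print(f"SE = {se:.2f} F, 95% interval = [{lower:.2f}, {upper:.2f}] F")
```

Under these assumptions the whole interval sits slightly above 40 °F, which is the comparison the question asks for.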
32 Student’s t We’ve discussed that as the sample size increases, the sampling distribution approaches a normal distribution. We can quantify this using the degrees of freedom. If you have $x_1, x_2, \ldots, x_n$ data points, then you have n-1 degrees of freedom. So, we can choose a t-distribution for n-1 degrees of freedom.
33 t-distribution
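35 Mean using a t-distribution So, using a t-distribution, the confidence interval for the mean is given by: $\mu = \bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}$ where $t_{\alpha/2}$ is taken from the t-distribution with n-1 degrees of freedom.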
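As a sketch of how the t-based interval $\bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}$ is used, the snippet below applies it to the winter-temperature numbers (mean 41.5 °F, s = 3.2 °F, n = 25). The Python standard library has no t-distribution, so the critical values are hard-coded from a standard t table; the dictionary `T_95` and the function `mean_ci_t` are illustrative names, and only a few degrees of freedom are included.

```python
import math

# Two-sided 95% critical values copied from a standard t table (small illustrative subset).
T_95 = {4: 2.776, 9: 2.262, 24: 2.064, 29: 2.045}

def mean_ci_t(mean, s, n, t_table=T_95):
    """95% confidence interval for the mean using Student's t with n - 1 degrees of freedom."""
    df = n - 1
    t = t_table[df]  # raises KeyError if df is not in the small table above
    margin = t * s / math.sqrt(n)
    return mean - margin, mean + margin

lower, upper = mean_ci_t(41.5, 3.2, 25)
print(f"t-based 95% interval: [{lower:.2f}, {upper:.2f}] F")
```

For 24 degrees of freedom the multiplier is 2.064 instead of 1.96, so the interval is a little wider than the normal-based one; the gap shrinks as n grows.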
36 Notation:
37 t-distribution table