Estimating with Confidence: Means and Proportions
Today, we will: (1) learn to get better estimates with our confidence intervals for means (using the t distribution); (2) learn to generate confidence intervals for proportions.
REVIEW: CIs. We have a sample mean but want to know where the population mean is. To answer the question “what is the population mean?” we construct confidence limits around the sample mean: 95% CI = $\bar{X} \pm z^* \cdot SE$, where $SE$ is the standard error of the sample mean and $z^*$ is the critical value from the normal probability distribution used to compute the upper and lower estimates. $z^*$ is the number of standard errors we need to capture a certain percent of cases (usually 95%).
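As a quick check on where critical values such as 1.96 come from, here is a minimal sketch using only Python's standard library. The helper name `z_critical` is made up for this illustration and is not part of the lecture.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    # Split the leftover probability equally between the two tails.
    upper_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_critical(level):.3f}")
```

For 95% confidence this returns roughly 1.96, the value used in the review formula.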
Can we do better? We can get better estimates under some circumstances! The z distribution (standard normal) is problematic when our sample size is small and the sampling distribution is non-normal. We need some way to have our distribution change shape to reflect the uncertainty we have as our sample size changes. That distribution is the t distribution.
Student’s t Distribution. It is a standardized probability distribution. Unlike z, t changes peakedness as the sample size varies, approaching the standard normal shape as the sample size increases. The formula for computing confidence limits with t is 95% CI = $\bar{X} \pm t^* \cdot SE$, with $t^*$ taken for the appropriate degrees of freedom. Degrees of freedom (df) control the size of the peak based on the sample size: df = n - 1.
t-table: critical values of t by degrees of freedom (d.f. = n - 1), with columns for the 80%, 90%, 95%, 98%, and 99% confidence levels.
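To see how the t critical value shrinks toward z as the degrees of freedom grow, the sketch below computes t* numerically with the standard library only. It integrates the t density with Simpson's rule and finds the critical value by bisection. This is one possible approach, not how printed tables are produced, and the function names are made up for this illustration.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, confidence=0.95):
    """Two-sided critical value t*, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen until the target is bracketed
        hi *= 2
    for _ in range(60):             # halve the bracket until it is tiny
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_star = NormalDist().inv_cdf(0.975)
for df in (4, 9, 24, 100, 1000):
    print(f"df = {df:>4}: t* = {t_critical(df):.3f}  (z* = {z_star:.3f})")
```

For df = 9 and df = 24 the result should agree with the table values 2.262 and 2.064 to three decimals, and for large df it settles just above 1.96.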
Steps for a C.I. using t: 1. Obtain the standard error. 2. Get a value of t from the t distribution. 3. Compute the interval (plug and chug).
Let’s Practice. Suppose we were interested in how frequently people vote. To study this, a researcher asks 10 people how many times they have voted in the last 5 Congressional elections. The average number of times a person in this sample voted was 2.7, with a standard deviation of 1.3.
Step 1: Obtain the Std. Error
Step 2: Get a value of t from the t distribution. d.f. = n - 1 = 9. Choose a level of risk (.05). The t critical value = 2.262.
Step 3: Compute the Interval. In repeated samples of the same size from the same population, 95% of samples would yield an interval that contains the true mean.
Now You Try. A man drives 30 miles to work every day. There are many stop lights on the way, so it seems to take a different length of time each day. He wants to estimate the average drive time. He times his drive over 25 days and finds a mean drive time of 38 minutes with a standard deviation of 9 minutes. Using a 95% level of confidence, estimate the average drive time with a confidence interval.
Step 1: Obtain the Std. Error: SE = 1.84. Step 2: Get a value of t from the t table: df = 24, α = .05, t = 2.064. Step 3: Calculate the confidence interval: 38 ± 2.064*1.84 = 38 ± 3.79. We are 95% confident that the true mean is between 34.21 and 41.79 minutes.
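The three steps used in both worked examples can be sketched in a few lines of Python. One assumption needs flagging: the slides' standard-error formula did not survive, and the 1.84 in the drive-time problem matches s / sqrt(n - 1) rather than s / sqrt(n), which would give 1.80. The function therefore defaults to the n - 1 version and exposes a switch. The function name and the `n_minus_1` parameter are made up for this illustration.

```python
import math

def t_interval(mean, sd, n, t_crit, n_minus_1=True):
    """Return (std_error, margin, lower, upper) for a CI around a sample mean.

    n_minus_1=True divides by sqrt(n - 1). This is an ASSUMPTION inferred from
    the 1.84 in the drive-time problem; pass False to use s / sqrt(n) instead.
    """
    denom = math.sqrt(n - 1) if n_minus_1 else math.sqrt(n)
    std_error = sd / denom        # Step 1: standard error
    margin = t_crit * std_error   # Step 3, with t_crit looked up in Step 2
    return std_error, margin, mean - margin, mean + margin

# (mean, sd, n, t critical value), all taken from the two problems above.
examples = (("voting", (2.7, 1.3, 10, 2.262)),
            ("drive time", (38, 9, 25, 2.064)))
for label, args in examples:
    se, margin, lower, upper = t_interval(*args)
    print(f"{label}: SE = {se:.2f}, {args[0]} ± {margin:.2f} "
          f"-> {lower:.2f} to {upper:.2f}")
```

With the default setting, the drive-time numbers match the hand calculation (SE 1.84, interval 34.21 to 41.79). Switching to `n_minus_1=False` gives an SE of 1.80 and a slightly narrower interval.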
Should I use z or t? With t, you get more accurate results for smaller sample sizes. As the degrees of freedom get larger and larger, the t distribution turns into the standard normal distribution (the z distribution). As a result, we should always use t. Why did I have to learn this stupid z thing?
What if I don’t have a mean? For example: the percentage of people who vote for Bush, or the proportion of the population that is in a certain category. We need another approach.
Surveys and experiments often produce counts, which we can turn into proportions. Proportion = f / n (count divided by sample size), e.g. 600/1000 = .60. Or multiply by 100 to get a percentage: .60*100 = 60%.
Sampling Error for Proportions. The proportion in a sample is not the same as the true population proportion. We can estimate confidence intervals for proportions just like for means.
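The Formulas. CI for means: CI = $\bar{X} \pm t^* \cdot SE$, where $SE$ is the standard error of the mean. CI for proportions: CI = $p \pm z^* \cdot \sqrt{p(1-p)/n}$. The difference is in the standard error and the critical value.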
p = proportion; 1 - p = “not p”, sometimes called q. If the proportion of people favoring abortion is p, then the proportion of people opposing abortion is 1 - p. If the sample size (n) is “large enough”, the sampling distribution from which you are drawing your one sample will approximate a normal probability distribution (a z distribution). This holds when n*p > 10 and n*(1 - p) > 10. If n*p < 10 or n*(1 - p) < 10, then we must use something else. We will not encounter that this semester.
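The “large enough” rule of thumb is easy to check mechanically. A minimal sketch follows; the function name and the sample figures are hypothetical and chosen only for illustration.

```python
def normal_approx_ok(count, n, threshold=10):
    """Rule of thumb: n*p > 10 and n*(1 - p) > 10.

    Since p = count / n, n*p is just the count in the category and
    n*(1 - p) is the count outside it, so the counts are compared directly.
    """
    return count > threshold and (n - count) > threshold

# Hypothetical samples, for illustration only.
print(normal_approx_ok(60, 100))  # 60 in the category, 40 outside
print(normal_approx_ok(3, 40))    # only 3 in the category
```

The first sample passes both conditions; the second fails because only 3 cases fall in the category.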
Steps for Computing a Confidence Interval for Proportions. 1. Convert the frequency count in your sample into a proportion: p = count / sample size (f/n), e.g. 600/1000 = .60. 2. Find the appropriate critical value of z. Use the last line on the t table, for infinite degrees of freedom (90%: 1.645, 95%: 1.96, etc.).
3. Calculate the standard error: $SE = \sqrt{p(1-p)/n}$. 4. Plug and chug: CI = $p \pm z^* \cdot SE$.
Practice Problem: Would a majority of all park visitors favor stricter controls on animals (requiring a leash)? Can you be 95% confident that more than half the visitors would approve stricter limits? In the survey, 89 of 150 visitors favored stricter restrictions. Step 1: Generate the proportion: p = 89/150 = 0.593, or 59.3%.
Step 2: Find the critical value of z (z table or bottom line of the t table). This works out to be 1.96. Step 3: Compute the standard error: $\sqrt{.593 \times .407 / 150} = .040$. Step 4: Plug and chug: .593 ± 1.96*.040 = .593 ± .078. The interval is .515 to .671. Because the whole interval lies above .50, we can be 95% confident that most visitors favor restrictions.
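The four steps for a proportion can be sketched as one small function, shown here on the park survey numbers (89 of 150). The function name is made up for this illustration, and the critical value comes from the standard library rather than a printed table.

```python
import math
from statistics import NormalDist

def proportion_interval(count, n, confidence=0.95):
    """Return (p, std_error, lower, upper) for a large-sample CI for a proportion."""
    p = count / n                                             # Step 1: proportion
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Step 2: critical z
    std_error = math.sqrt(p * (1 - p) / n)                    # Step 3: standard error
    margin = z_crit * std_error                               # Step 4: plug and chug
    return p, std_error, p - margin, p + margin

p, se, lower, upper = proportion_interval(89, 150)
print(f"p = {p:.3f}, SE = {se:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
print("Entire interval above .50:", lower > 0.5)
```

Because the code carries full precision through every step, the endpoints can differ from the hand-rounded figures in the third decimal place. The conclusion is the same: the lower limit stays above .50.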
A national SRS poll of n = 500 finds that 330 in the sample favor stronger gun controls. Stated in percent, 66% of this sample favors stronger gun controls. Step 1: What is the problem? What percent of people in the population favor stronger gun controls? Convert the frequency into a proportion: 330/500 = .66.
Step 2: Find the critical value of z. If we choose 95% confidence, we use 1.96. Step 3: Compute the Std. Err.: $\sqrt{.66 \times .34 / 500} \approx .02$. Step 4: Plug and chug: 95% CI = .66 ± 1.96*(.02) = .66 ± .0392. The estimated support for gun control is 66% ± 3.92%.
As the simple random sample (SRS) gets larger, the estimated error around the measure of central tendency gets smaller. The larger the sample, the smaller the chance of getting an atypical average.
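A short sketch makes the sample-size effect concrete. It holds the sample proportion at .66 (the gun-control poll) and varies n. Every sample size other than 500 is hypothetical, and the function name is made up for this illustration.

```python
import math

def margin_of_error(p, n, z_crit=1.96):
    """Half-width of the large-sample CI for a proportion."""
    return z_crit * math.sqrt(p * (1 - p) / n)

# p = .66 is the poll's sample proportion; sizes other than 500 are hypothetical.
for n in (125, 500, 2000, 8000):
    print(f"n = {n:>4}: margin = ±{margin_of_error(0.66, n):.4f}")
```

Each fourfold increase in n cuts the margin in half, because the standard error shrinks with the square root of the sample size. The n = 500 margin comes out a little above the .0392 from the hand calculation, since the slide rounds the standard error to .02 before multiplying.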
Choosing between t and z for Confidence Intervals. Proportions: always use z. Means: always use t. Why? If the sample size is large enough to use z, then the t table will give you the right value anyway.