Welcome to the Unit 8 Seminar Dr. Ami Gates

Presentation transcript:

MM207 Statistics: Welcome to the Unit 8 Seminar with Dr. Ami Gates

Confidence Intervals, Sampling Distributions, Margin of Error, and Sample Sizes, with Dr. Ami Gates

95% Confidence Interval

Confidence Intervals to Estimate the Population Mean Using the Sample Mean
Suppose a sample of 5100 statistics students shows a mean time to graduate with a combined BA/MS degree of 5.12 years, with a standard deviation of 1.71 years. Estimate the mean time required to graduate with a BA/MS for all statistics students.
In general, if we took every possible sample of a given size from a population, the mean of all those sample means would equal the population mean. Here we have only one sample, but its mean is still the best estimate of the population mean that we have. Therefore, our estimate of the mean time for students to graduate is 5.12 years.

Confidence Intervals and Margin of Error to Estimate the Population Mean Using the Sample Mean
Suppose a sample of 5100 statistics students shows a mean time to graduate with a combined BA/MS degree of 5.12 years, with a standard deviation of 1.71 years. Find the 95% confidence interval for the mean time required to get a BA/MS.
First, we need to find the margin of error using this estimate formula: margin of error = E ≈ 2s/sqrt(n), where "s" is the sample standard deviation and "n" is the sample size.
Our margin of error is E = 2(1.71)/sqrt(5100) = 3.42/71.4 = .048, which we can round to .05.
The 95% confidence interval is created by adding and subtracting the margin of error from the sample mean. Our sample mean is 5.12:
5.12 + .05 = 5.17
5.12 - .05 = 5.07
So the 95% CI is: 5.07 < population mean (μ) < 5.17
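The arithmetic on this slide can be checked with a short script. This is a minimal Python sketch, not part of the seminar materials; the function name `quick_ci_mean` is made up for illustration, and it uses the seminar's quick formula E = 2s/sqrt(n).

```python
import math

def quick_ci_mean(sample_mean, sample_sd, n):
    """Return (low, high, margin) for a 95% CI using E = 2s/sqrt(n)."""
    margin = 2 * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

# Values from the BA/MS graduation-time example.
low, high, margin = quick_ci_mean(5.12, 1.71, 5100)
print(f"E = {margin:.3f}")              # E = 0.048
print(f"{low:.2f} < mu < {high:.2f}")   # 5.07 < mu < 5.17
```

The script keeps the unrounded margin of error and rounds only when printing, which gives the same interval as the slide here.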

Our "Quick" Formula for Margin of Error for 95% Confidence Intervals
Important Note: The "quick" method of using E = 2s/sqrt(n) can be used because, with a very large sample size (n > 200) and a 95% interval (leaving 2.5% in each tail), the t-critical value lies between 1.96 and 1.98, which we round up to 2.
The technical formula for the margin of error E for a 95% CI with an unknown population standard deviation and mean is E = (t-critical value)*(sample standard deviation) / sqrt(n).
Again, because our sample sizes are so large, this "technical" formula is essentially the same as our "quick" formula. For our class, we will use the simplified "quick" formula of E = 2s/sqrt(n) to calculate the margin of error for very large samples. Recall that s is the sample standard deviation and n is the sample size. This is for a 95% CI.
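To see how little the rounding to 2 matters for a large sample, the sketch below compares the quick margin of error with one based on a critical value. It is an illustration under a stated assumption: the Python standard library has no t distribution, so the normal critical value (about 1.96) stands in for the large-sample t-critical value.

```python
import math
from statistics import NormalDist

s, n = 1.71, 5100   # sample standard deviation and size from the earlier example

# Assumption: for very large n the t-critical value is close to the
# normal critical value that leaves 2.5% in each tail.
critical = NormalDist().inv_cdf(0.975)

quick_margin = 2 * s / math.sqrt(n)
technical_margin = critical * s / math.sqrt(n)

print(f"critical value   = {critical:.3f}")
print(f"quick margin     = {quick_margin:.4f}")
print(f"technical margin = {technical_margin:.4f}")
```

The two margins differ by about one thousandth of a year here, which disappears once the interval is rounded to two decimals.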

Confidence Intervals for Proportions
A sample of 10000 people showed that 2100 (or 21%) of people prefer vanilla to chocolate. What is the 95% confidence interval for the proportion of all people that prefer vanilla to chocolate?
The sample proportion is p̂ = 2100/10000 = .21
To find the 95% confidence interval, we first need to calculate the margin of error "E". This formula will approximate E for us:
E = 2 * sqrt[ p̂(1 - p̂)/n ]
Here, p̂ is the sample proportion and "n" is the sample size.
E = 2 * sqrt( .21(.79) / 10000 ) = 2 * sqrt(.00001659) = .0081
So the 95% confidence interval runs from .21 - .0081 to .21 + .0081:
.202 < p < .218
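The same calculation as a small Python sketch. The function name `quick_ci_proportion` is hypothetical; the formula is the slide's approximation E = 2 * sqrt[p̂(1 - p̂)/n].

```python
import math

def quick_ci_proportion(successes, n):
    """Return (p_hat, margin, low, high) for an approximate 95% CI."""
    p_hat = successes / n
    margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, p_hat - margin, p_hat + margin

# Vanilla-versus-chocolate example: 2100 of 10000 prefer vanilla.
p_hat, margin, low, high = quick_ci_proportion(2100, 10000)
print(f"p-hat = {p_hat:.2f}, E = {margin:.4f}")   # p-hat = 0.21, E = 0.0081
print(f"{low:.3f} < p < {high:.3f}")              # 0.202 < p < 0.218
```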

Sampling Distributions
A tutoring service ran for 3 days. Here are the number of calls they received on those three days: 12, 10, 5
Assume that samples of size 2 are randomly selected with replacement from these three values. List all the possible samples and find the mean of each sample:
Sample      Mean of sample
12, 12      12
12, 10      11
12, 5       8.5
10, 12      11
10, 10      10
10, 5       7.5
5, 12       8.5
5, 10       7.5
5, 5        5

Sampling Distributions
Identify the probability of each sample and describe the sampling distribution of the sample means.
To find the probability of each sample, notice first that there are 9 samples. Each is equally likely. So, each has a probability of 1/9.
To describe the sampling distribution of the sample means, we must first group together all the samples that have the same mean and then look at the probability of getting that mean:
The mean of 12 occurs once: P(12) = 1/9
The mean of 11 occurs twice: P(11) = 2/9
The mean of 10 occurs once: P(10) = 1/9
The mean of 8.5 occurs twice: P(8.5) = 2/9
The mean of 7.5 occurs twice: P(7.5) = 2/9
The mean of 5 occurs once: P(5) = 1/9

To Find the Mean of the Sampling Distribution
The mean of 12 occurs once: P(12) = 1/9
The mean of 11 occurs twice: P(11) = 2/9
The mean of 10 occurs once: P(10) = 1/9
The mean of 8.5 occurs twice: P(8.5) = 2/9
The mean of 7.5 occurs twice: P(7.5) = 2/9
The mean of 5 occurs once: P(5) = 1/9
The mean of the sampling distribution = sum( x * P(x) ):
12*1/9 + 11*2/9 + 10*1/9 + 8.5*2/9 + 7.5*2/9 + 5*1/9 = 9
The mean of our original population is (12 + 10 + 5) / 3 = 9
Therefore, the mean of the sampling distribution and the mean of the original population are the same.
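The tutoring-service example is small enough to enumerate by machine. The following Python sketch is an illustration only (variable names are made up): it lists every sample of size 2 drawn with replacement, builds the sampling distribution of the mean, and confirms that its mean equals the population mean. Exact fractions are used so that 1/9 and 2/9 appear without rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

calls = [12, 10, 5]

# All ordered samples of size 2, drawn with replacement (3 * 3 = 9 samples).
samples = list(product(calls, repeat=2))
means = [Fraction(a + b, 2) for a, b in samples]

# Each sample is equally likely, so P(mean) = count / number of samples.
counts = Counter(means)
distribution = {m: Fraction(c, len(samples)) for m, c in counts.items()}

for m in sorted(distribution, reverse=True):
    print(f"P({float(m)}) = {distribution[m]}")

mean_of_means = sum(m * p for m, p in distribution.items())
population_mean = Fraction(sum(calls), len(calls))
print(mean_of_means, population_mean)   # 9 9
```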

Finding the Smallest Sample Size Needed for a Given Margin of Error
Suppose you want to estimate the mean distance between two molecules in an elephant. The margin of error that you want is .01 micrometers. Past studies suggest that a population standard deviation of .16 micrometers is reasonable. Estimate the minimum sample size required to estimate the population mean with the given accuracy.

Finding the Smallest Sample Size Needed for a Given Margin of Error
Here, we want to calculate the smallest sample size we will need to create a 95% confidence interval with a margin of error of .01. The formula is:
n = [(2*sigma)/E]^2
The sigma is the population standard deviation. The "E" is the desired margin of error. The "n" is the smallest sample size that will give us this error. We use a "2" because we want this sample size to work for a 95% CI and we know that 2 is a good estimate for the critical values at the tails.
Answer: Sample size "n" = [(2*.16)/.01]^2 = 32^2 = 1024
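As a quick check, the sample-size formula can be wrapped in a small helper. This is a sketch with a hypothetical function name, `min_sample_size`; it rounds up because a sample size must be a whole number and rounding down would give a margin of error slightly larger than requested.

```python
import math

def min_sample_size(sigma, margin):
    """Smallest n with 2*sigma/sqrt(n) <= margin, for a 95% CI."""
    raw = (2 * sigma / margin) ** 2
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 9))

# Molecule-distance example: sigma = .16 micrometers, E = .01 micrometers.
n = min_sample_size(0.16, 0.01)
print(n)   # 1024
```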

Finding the Sample Mean and Sample Standard Deviation
Suppose you collect the following sample data: 92, 356, 428, 360, 178, 232, 274, 372, 216, 156, 344, 46, 152, 192
What is the sample size? Here, the sample size is n = 14.
What is the mean for the sample? To get the mean, add all the numbers together and divide by the sample size. The answer is 3398/14 = 242.7.

Finding the Sample Mean and Sample Standard Deviation
What is the standard deviation for the sample? You can use this formula:
s = sqrt[ sum( (x - sample mean)^2 ) / (n - 1) ] = 115.6
Note: In Excel, the formula for this is =STDEV.S(A1:A14), assuming your data values are in cells A1 through A14. You can also do this in StatCrunch.

Calculating Sample Standard Deviation by Hand
s = sqrt[ sum( (x - sample mean)^2 ) / (n - 1) ]
STEPS:
1. Find the sample mean.
2. Subtract the sample mean from each data value.
3. Square each difference: (x - mean)^2
4. Sum all the squared values together.
5. Divide that sum by n - 1.
6. Take the square root of the result.
The next slide shows an Excel spreadsheet of these steps (color coded).

Calculating the Sample Standard Deviation by Hand (color coded)
Data    Sample mean    x - mean    (x - mean)^2
92      242.7          -150.7      22710.49
356     242.7          113.3       12836.89
428     242.7          185.3       34336.09
360     242.7          117.3       13759.29
178     242.7          -64.7       4186.09
232     242.7          -10.7       114.49
274     242.7          31.3        979.69
372     242.7          129.3       16718.49
216     242.7          -26.7       712.89
156     242.7          -86.7       7516.89
344     242.7          101.3       10261.69
46      242.7          -196.7      38690.89
152     242.7          -90.7       8226.49
192     242.7          -50.7       2570.49
Sum of (x - mean)^2: 173620.9
Divide by n - 1 = 13: 13355.45
Take the square root (solution): 115.5658
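The by-hand steps translate directly into code. The sketch below is illustrative rather than part of the seminar; it follows the six steps using the 14 data values from the spreadsheet and cross-checks the result against Python's built-in `statistics.stdev`, which also divides by n - 1. It uses the unrounded mean, so intermediate values can differ from the spreadsheet in the last decimal place.

```python
import math
import statistics

data = [92, 356, 428, 360, 178, 232, 274, 372, 216, 156, 344, 46, 152, 192]

n = len(data)
mean = sum(data) / n                               # step 1: sample mean
differences = [x - mean for x in data]             # step 2: x - mean
squared = [d ** 2 for d in differences]            # step 3: square each difference
total = sum(squared)                               # step 4: sum of squares
variance = total / (n - 1)                         # step 5: divide by n - 1
s = math.sqrt(variance)                            # step 6: square root

print(f"n = {n}, mean = {mean:.1f}, s = {s:.1f}")  # n = 14, mean = 242.7, s = 115.6
print(math.isclose(s, statistics.stdev(data)))     # True
```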

Using Your Calculated Sample Mean and Sample Standard Deviation
What is the best estimate for the population mean? The sample mean is our best estimate for the population mean. The sample mean is 242.7.
What is the margin of error? The margin of error E can be estimated by the equation:
E = 2s/sqrt(n) = (2*115.6)/sqrt(14) = 61.8

Using Your Calculated Sample Mean and Sample Standard Deviation
What is the 95% CI for the population mean?
The mean + E is 242.7 + 61.8 = 304.5
The mean - E is 242.7 - 61.8 = 180.9
So, the 95% CI for the population mean is 181 < mean (μ) < 305
Notice that I rounded to whole numbers because my data have no decimals in them. Rounding will depend on what the problem requests.
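The interval can be reproduced from the rounded values on the slides. The sketch below also prints, for comparison only, a margin based on a t critical value. That comparison is not in the seminar: it assumes the standard two-sided 95% t-table value of about 2.160 for 13 degrees of freedom, and it is included because the quick factor of 2 was introduced for very large samples while this sample has n = 14.

```python
import math

sample_mean, s, n = 242.7, 115.6, 14   # rounded values from the slides

# Quick formula used in the seminar: E = 2s/sqrt(n).
quick_margin = 2 * s / math.sqrt(n)
low, high = sample_mean - quick_margin, sample_mean + quick_margin
print(f"E = {quick_margin:.1f}, CI = {low:.1f} to {high:.1f}")

# Assumption for comparison only: t critical value of about 2.160 (13 df).
t_margin = 2.160 * s / math.sqrt(n)
print(f"t-based E = {t_margin:.1f}")
```

With a sample this small the t-based margin is somewhat wider than the quick one, so the quick interval should be read as an approximation.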

Sample Proportions and Sample Statistics
You select a random sample of 140 people at a chocolate conference that is attended by 1691 people. Within your sample, you find that 67 people secretly prefer vanilla. Based on your sample statistic, estimate how many people at the conference secretly prefer vanilla.
67/140 = .479 is our sample proportion p̂. This is our sample statistic.
Of the 1691 people at the conference, we can estimate using our sample proportion that: .479 * 1691 = 810
This tells us that an estimated 810 people at our conference secretly prefer vanilla.

Sample Proportions and Sample Statistics
Would you be more confident of your estimate if you sampled 300 people? Yes. A larger sample is more likely to provide a reliable estimate.
Suppose you found out that 400 people at the conference actually secretly prefer vanilla. What is the population proportion for our conference? 400/1691 = .237

Sampling Distributions
Sampling Distributions: A sampling distribution is a distribution of statistics obtained by selecting all the possible samples of a specific size from a population.
Distribution of Sample Means: A sampling distribution of the mean gives all the values the mean can take, along with the probability of getting each value if sampling is random from the null-hypothesis population.
Distribution of Sample Proportions: The distribution that results when we find the proportions (p̂) in all possible samples of a given size.

Sampling Error
Sampling Error: The discrepancy between the statistic obtained from the sample and the parameter for the population from which the sample was obtained. For example, the mean (x̄) calculated from a sample will not always equal the population mean (μ).

Central Limit Theorem*
Central Limit Theorem: For any population with mean μ and standard deviation σ, the distribution of sample means x̄ for sample size n will have a mean of μ and a standard deviation of σ/sqrt(n), and will approach a normal distribution as n approaches infinity (n > 30 is the general rule).
* See Page 217

Distribution of Sample Means Example
Consider the following data as a population: 2, 4, 6, 8
The population mean is 5.
The population standard deviation is 2.236.
Now we are going to take ALL possible samples of n = 2 from this population. We will calculate the mean for each sample.

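Sampling Distribution of Means for Samples of n = 2
Pick 1    Pick 2    Mean    Mean^2    Variance    Standard Deviation
2         2         2       4         0           0.000
2         4         3       9         2           1.414
2         6         4       16        8           2.828
2         8         5       25        18          4.243
4         2         3       9         2           1.414
4         4         4       16        0           0.000
4         6         5       25        2           1.414
4         8         6       36        8           2.828
6         2         4       16        8           2.828
6         4         5       25        2           1.414
6         6         6       36        0           0.000
6         8         7       49        2           1.414
8         2         5       25        18          4.243
8         4         6       36        8           2.828
8         6         7       49        2           1.414
8         8         8       64        0           0.000
Sum                 80      440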

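Central Limit Theorem Applied
Mean of the sample means = sum(x̄)/N = 80/16 = 5, which equals the population mean. So we have shown that the mean of the sample means is equal to μ, the population mean.
Standard deviation of the sample means = sqrt[ ( sum(x̄^2) - (sum(x̄))^2/N ) / N ] = sqrt[ (440 - (80)^2/16) / 16 ] (notice we divide by N since the 16 sample means are treated as a population) = sqrt(40/16) = sqrt(2.5) = 1.58
Now, we will calculate what the Central Limit Theorem tells us the standard deviation will be. It is σ_x̄ = σ/sqrt(n) = 2.236/sqrt(2) = 2.236/1.4142 = 1.58, which matches the value computed directly from the 16 sample means.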
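The comparison can be verified by enumerating all 16 samples. The Python sketch below is an illustration with made-up variable names; it computes the mean and standard deviation of the sample means directly and compares them with μ and σ/sqrt(n).

```python
import math
from itertools import product

population = [2, 4, 6, 8]
n = 2   # sample size

# Population mean and population standard deviation (divide by N).
mu = sum(population) / len(population)
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / len(population))

# Every ordered sample of size n drawn with replacement: 4 * 4 = 16 samples.
sample_means = [sum(s) / n for s in product(population, repeat=n)]

mean_of_means = sum(sample_means) / len(sample_means)
sd_of_means = math.sqrt(
    sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)
)
predicted_sd = sigma / math.sqrt(n)

print(mean_of_means, mu)                             # 5.0 5.0
print(f"{sd_of_means:.2f} {predicted_sd:.2f}")       # 1.58 1.58
```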

Distribution of Sample Proportions
The distribution of sample proportions is the distribution that results when we find the proportions (p̂) in all possible samples of a given size. The larger the sample size, the more closely this distribution approximates a normal distribution. In all cases, the mean of the distribution of sample proportions equals the population proportion. If only one sample is available, its sample proportion, p̂, is the best estimate for the population proportion, p.
Consider the distribution of sample proportions shown in Figure 8.7 (slide 30). Assume that its mean is p = 0.6 and its standard deviation is 0.1. Suppose you randomly select the following sample of 32 responses:
Y Y N Y Y Y Y N Y Y Y Y Y Y N Y Y N Y Y Y N Y Y N Y Y N Y N Y Y
Compute the sample proportion, p̂, for this sample. How far does it lie from the mean of the distribution? What is the probability of selecting another sample with a proportion greater than the one you selected?
The proportion of Y responses in this sample is p̂ = 24/32 = 0.75
Using a mean of 0.6 and a standard deviation of 0.1, we find that the sample statistic, p̂ = 0.75, has a standard score of (0.75 - 0.6) / 0.1 = 1.5
The sample proportion is 1.5 standard deviations above the mean of the distribution. Using Table 5.1, we see that a standard score of 1.5 corresponds to the 93rd percentile. The probability of selecting another sample with a proportion less than the one we selected is about 0.93, so the probability of selecting one with a greater proportion is about 1 - 0.93 = 0.07.
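The standard score and percentile can be checked without a table. This is a minimal Python sketch, assuming the distribution of sample proportions is treated as normal with the mean (0.6) and standard deviation (0.1) given in the example; `NormalDist` from the standard library replaces the lookup in Table 5.1.

```python
from statistics import NormalDist

responses = ("Y Y N Y Y Y Y N Y Y Y Y Y Y N Y "
             "Y N Y Y Y N Y Y N Y Y N Y N Y Y").split()

p_hat = responses.count("Y") / len(responses)

mean, sd = 0.6, 0.1                  # values given in the example
z = (p_hat - mean) / sd              # standard score

below = NormalDist().cdf(z)          # chance of a smaller sample proportion
above = 1 - below                    # chance of a larger sample proportion

print(f"p-hat = {p_hat}, z = {z:.1f}")                   # p-hat = 0.75, z = 1.5
print(f"P(less) = {below:.2f}, P(greater) = {above:.2f}")  # 0.93 and 0.07
```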

Margin of Error
The margin of error for the 95% confidence interval is
margin of error = E ≈ 2s/sqrt(n)
where s is the standard deviation of the sample and n is the sample size.
We find the 95% confidence interval by adding and subtracting the margin of error from the sample mean. That is, the 95% confidence interval ranges from (x̄ - margin of error) to (x̄ + margin of error).
We can write this confidence interval more formally as x̄ - E < μ < x̄ + E, or more briefly as x̄ ± E.

Constructing a Confidence Interval
A study finds that the average time spent by eighth-graders watching television is 6.7 hours per week, with a margin of error of 0.4 hour (for 95% confidence). Construct and interpret the 95% confidence interval.
The best estimate of the population mean is the sample mean, x̄ = 6.7 hours. We find the confidence interval by adding and subtracting the margin of error from the sample mean, so the interval extends from 6.7 - 0.4 = 6.3 hours to 6.7 + 0.4 = 7.1 hours. We can be 95% confident that the true mean time lies between 6.3 and 7.1 hours per week.

Using StatCrunch - Confidence Intervals
In the data set, select: STAT > Z Statistics > One-Sample > With Data
Select the variable.
Click Next.
Select confidence interval and the percent.
Click Calculate.

Interpreting the Confidence Interval Figure 8.10 This figure illustrates the idea behind confidence intervals. The central vertical line represents the true population mean, μ. Each of the 20 horizontal lines represents the 95% confidence interval for a particular sample, with the sample mean marked by the dot in the center of the confidence interval. With a 95% confidence interval, we expect that 95% of all samples will give a confidence interval that contains the population mean, as is the case in this figure, for 19 of the 20 confidence intervals do indeed contain the population mean. We expect that the population mean will not be within the confidence interval in 5% of the cases; here, 1 of the 20 confidence intervals (the sixth from the top) does not contain the population mean.

Determine Minimum Sample Size
Solve the margin of error formula E = 2s/sqrt(n) for n, which gives n = (2s/E)^2. When planning a study, the population standard deviation from previous studies is used in place of s.
You want to study housing costs in the country by sampling recent house sales in various (representative) regions. Your goal is to provide a 95% confidence interval estimate of the housing cost. Previous studies suggest that the population standard deviation is about $7,200. What sample size (at a minimum) should be used to ensure that the sample mean is within
a. $500 of the true population mean?
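The slide leaves the answer to part a open. One way to work it out is the sketch below, which applies n = (2*sigma/E)^2 with the standard deviation of $7,200 and the $500 margin from the problem. The function name is hypothetical, and the result is rounded up to the next whole house sale.

```python
import math

def min_sample_size(sigma, margin):
    """Smallest n with 2*sigma/sqrt(n) <= margin, for a 95% CI."""
    raw = (2 * sigma / margin) ** 2
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 9))

# Part a: sigma is about $7,200 and the desired margin of error is $500.
n_housing = min_sample_size(7200, 500)
print(n_housing)   # 830, since (14400/500)^2 = 28.8^2 = 829.44
```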

Core Logic of Hypothesis Testing
Hypothesis testing considers the probability that the result of a study could have come about if the experimental procedure had no effect.
If this probability is low, the scenario of no effect is rejected and the theory behind the experimental procedure is supported.

Hypothesis Testing Using Confidence Intervals
1. State the claim about the population mean.
2. Determine the desired confidence level.
3. Select a random sample from the population.
4. Calculate the confidence interval for the desired level of confidence.
5. If the claim is contained within the interval, the claim is reasonable; if it is not within the interval, the claim is not reasonable, at the given level of confidence.
See the Testing a Claim document in Doc Sharing.
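The decision rule in the last step is simple to express in code. The sketch below is illustrative only: the function name is made up, the sample mean and margin of error are taken from the eighth-grader television example (6.7 hours, E = 0.4), and the two claimed means (7.0 and 8.0 hours) are hypothetical values chosen to show both outcomes.

```python
def claim_is_reasonable(claimed_mean, sample_mean, margin):
    """True if the claimed population mean lies inside the confidence interval."""
    low, high = sample_mean - margin, sample_mean + margin
    return low <= claimed_mean <= high

# 95% CI from the television example: 6.3 to 7.1 hours.
sample_mean, margin = 6.7, 0.4

print(claim_is_reasonable(7.0, sample_mean, margin))   # True: 7.0 is inside the interval
print(claim_is_reasonable(8.0, sample_mean, margin))   # False: 8.0 is outside the interval
```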

Questions?