Objective: To estimate population means with various confidence levels.

1 Objective: To estimate population means with various confidence levels

2 » Descriptive Statistics – a summary or description of data (usually a calculated value)
  ˃ Ex) The mean grade for Ms. Halliday’s CHS Statistics students on the 5.1 – 5.4 Quiz was 89.1. (89.1 is the descriptive statistic.)
» Inferential Statistics – when we use sample data to make generalizations (inferences) about a population
  ˃ We use sample data to:
    + Estimate the value of a population parameter
    + Test a claim (or hypothesis) about a population
  ˃ Chapter 6 is dedicated to presenting the methods of inferential statistics.

3 » Here is a list of 99 body temperatures obtained by the University of Maryland. Calculate the descriptive statistics for these data (mean, standard deviation, sample size). Discuss the shape, center, and spread, as well as any outliers.
98.6 98.6 98.0 98.0 99.0 98.4 98.4 98.4 98.4 98.6 98.6
98.8 98.6 97.0 97.0 98.8 97.6 97.7 98.8 97.6 97.7 98.8
98.0 98.0 98.3 98.5 97.3 98.7 97.4 98.9 98.6 99.5 97.5
97.3 97.6 98.2 99.6 98.7 96.4 98.5 98.0 98.6 98.6 97.2
98.4 98.6 98.2 98.0 97.8 98.0 98.4 98.6 98.6 97.8 99.0
96.5 97.6 98.0 96.9 97.6 97.1 97.9 98.4 97.3 98.0 97.5
97.6 98.2 98.5 98.8 98.7 97.8 98.0 97.1 97.4 99.4 98.4
98.6 98.4 98.5 98.6 98.3 98.7 98.6 97.1 97.9 98.8 98.7
97.6 98.2 99.2 97.8 98.0 98.4 97.8 98.4 97.4 98.0 97.0
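The descriptive statistics requested on this slide can be computed with a short script. The following is a minimal sketch using only the Python standard library; the variable names and the choice of 0.5-degree histogram bins are illustrative assumptions, not part of the original slides.

```python
import math
import statistics
from collections import Counter

# The 99 body temperatures (degrees Fahrenheit) listed on the slide.
temps = [
    98.6, 98.6, 98.0, 98.0, 99.0, 98.4, 98.4, 98.4, 98.4, 98.6, 98.6,
    98.8, 98.6, 97.0, 97.0, 98.8, 97.6, 97.7, 98.8, 97.6, 97.7, 98.8,
    98.0, 98.0, 98.3, 98.5, 97.3, 98.7, 97.4, 98.9, 98.6, 99.5, 97.5,
    97.3, 97.6, 98.2, 99.6, 98.7, 96.4, 98.5, 98.0, 98.6, 98.6, 97.2,
    98.4, 98.6, 98.2, 98.0, 97.8, 98.0, 98.4, 98.6, 98.6, 97.8, 99.0,
    96.5, 97.6, 98.0, 96.9, 97.6, 97.1, 97.9, 98.4, 97.3, 98.0, 97.5,
    97.6, 98.2, 98.5, 98.8, 98.7, 97.8, 98.0, 97.1, 97.4, 99.4, 98.4,
    98.6, 98.4, 98.5, 98.6, 98.3, 98.7, 98.6, 97.1, 97.9, 98.8, 98.7,
    97.6, 98.2, 99.2, 97.8, 98.0, 98.4, 97.8, 98.4, 97.4, 98.0, 97.0,
]

n = len(temps)
mean = statistics.mean(temps)
sd = statistics.stdev(temps)  # sample standard deviation (divides by n - 1)

print(f"n = {n}, mean = {mean:.2f}, s = {sd:.2f}")
print(f"min = {min(temps)}, max = {max(temps)}")

# Text histogram with 0.5-degree bins (an arbitrary choice) to judge
# shape and spot possible outliers.
bins = Counter(math.floor(t * 2) / 2 for t in temps)
for left in sorted(bins):
    print(f"{left:5.1f} | {'*' * bins[left]}")
```

The printed histogram gives a quick look at shape (unimodal or not, roughly symmetric or skewed), while the minimum and maximum flag candidate outliers to discuss.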

5 » A confidence interval uses a sample statistic to estimate a population parameter.
  ˃ In other words, a confidence interval uses a sample mean and standard deviation to estimate the population mean for the particular population of interest.
» But, since samples vary, the statistics we use, and thus the confidence intervals we construct, vary as well.

8 » The figure below shows that some of our 95% confidence intervals (from 20 random samples) actually capture the true mean (the green horizontal line), while others do not:

9 » Our confidence is in the process of constructing the interval, not in any one interval itself.
» Thus, we expect 95% of all 95% confidence intervals to contain the true parameter that they are estimating.
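The claim that about 95% of all 95% confidence intervals capture the true parameter can be checked by simulation. This is a rough sketch under stated assumptions: the population is Normal with a hypothetical mean and standard deviation chosen only for illustration, each sample has 10 values, and the critical value 2.262 is the standard t-table entry for 95% confidence with 9 degrees of freedom.

```python
import random
import statistics

random.seed(1)      # fixed seed so the run is repeatable

TRUE_MEAN = 98.2    # hypothetical population mean (assumption)
TRUE_SD = 0.7       # hypothetical population standard deviation (assumption)
N = 10              # sample size, so df = 9
T_STAR = 2.262      # 95% critical value for df = 9, from a t-table

def interval_captures_mean():
    """Draw one random sample and report whether its interval covers TRUE_MEAN."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    xbar = statistics.mean(sample)
    margin = T_STAR * statistics.stdev(sample) / N ** 0.5
    return xbar - margin <= TRUE_MEAN <= xbar + margin

TRIALS = 5000
hits = sum(interval_captures_mean() for _ in range(TRIALS))
coverage = hits / TRIALS
print(f"{coverage:.1%} of {TRIALS} intervals captured the true mean")
```

Any single interval either contains the true mean or it does not; the 95% describes the long-run share of intervals that do.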

12 » To be more confident, we wind up being less precise.
  ˃ We need a wider interval (more plausible values) to be more certain.
» Because of this, every confidence interval is a balance between certainty and precision.
» The tension between certainty and precision is always there.
  ˃ Fortunately, in most cases we can be both sufficiently certain and sufficiently precise to make useful statements.

13 » The choice of confidence level is somewhat arbitrary, but keep in mind this tension between certainty and precision when selecting your confidence level.
» The most commonly chosen confidence levels are 90%, 95%, and 99% (but any percentage can be used).
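The trade-off between certainty and precision shows up directly in the margin of error. The sketch below compares the three common confidence levels for one hypothetical sample; the standard deviation and sample size are made-up illustration values, and Normal critical values are used as a large-sample approximation.

```python
from statistics import NormalDist

SD = 0.62   # hypothetical sample standard deviation (assumption)
N = 99      # hypothetical sample size (assumption)

margins = {}
for level in (0.90, 0.95, 0.99):
    # Two-sided critical value: leave (1 - level) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margins[level] = z_star * SD / N ** 0.5
    print(f"{level:.0%}: critical value {z_star:.3f}, "
          f"margin of error {margins[level]:.3f}")
```

Higher confidence requires a larger critical value, so the interval gets wider for the same data.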

15 » The sampling model that William S. Gosset found is known as Student’s t.
» The Student’s t-models form a whole family of related distributions that depend on a parameter known as degrees of freedom.
  ˃ We often denote degrees of freedom as df, and the model as t_df.
» When Gosset corrected the model for the extra uncertainty (from estimating the population standard deviation with the sample standard deviation), the margin of error got bigger.
  ˃ Your confidence intervals will be just a bit wider.
» By using the t-model, you’ve compensated for the extra variability in precisely the right way.

16 As the degrees of freedom increase, the t-models look more and more like the Normal model. In fact, the t-model with infinite degrees of freedom is exactly Normal.
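To see the t-models approaching the Normal model numerically, one option is to compare their density curves. The sketch below evaluates the Student's t density formula with the standard library and reports the largest gap from the standard Normal density on a grid; the grid and the particular degrees of freedom are arbitrary illustrative choices.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

normal = NormalDist()                     # standard Normal model
grid = [i / 10 for i in range(-40, 41)]   # x from -4.0 to 4.0

gaps = {}
for df in (2, 5, 30, 1000):
    # Largest vertical distance between the two curves on the grid.
    gaps[df] = max(abs(t_pdf(x, df) - normal.pdf(x)) for x in grid)
    print(f"df = {df:4d}: largest density gap = {gaps[df]:.5f}")
```

The gap shrinks steadily as the degrees of freedom grow, which is why t* and z* critical values are nearly identical for large samples.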

17 Conditions:
» Independence Assumption: The data values should be independent.
  ˃ Randomization Condition: The data arise from a random sample or suitably randomized experiment. Randomly sampled data (particularly from an SRS) are ideal.
  ˃ 10% Condition: When a sample is drawn without replacement, the sample should be no more than 10% of the population.

18 » Normal Population Assumption: We can never be certain that the data are from a population that follows a Normal model, but we can check the…
  ˃ Nearly Normal Condition: The sample data come from a distribution that is unimodal and symmetric.
    + Check this condition by making a histogram of raw data, if available. Otherwise, look for indications that the data follow a Normal distribution.
    + The smaller the sample size (n < 15 or so), the more closely the data should follow a Normal model.
    + For moderate sample sizes (n between 15 and 40 or so), the t works well as long as the data are unimodal and reasonably symmetric.
    + For larger sample sizes, the t methods are safe to use unless the data are extremely skewed.

21 » Practice finding the critical t* values for the following using the table:
  ˃ 95% confidence for a sample of 10
  ˃ 95% confidence for a sample of 20
  ˃ 95% confidence for a sample of 30
  ˃ 90% confidence for a sample of 57
  ˃ 99.5% confidence for a sample of 7
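Table answers to the practice items above can be checked numerically. The sketch below is one possible standard-library approach, not the method used in the slides: it integrates the t density with Simpson's rule and then finds t* by bisection, using df = n − 1. The function names are hypothetical helpers introduced for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_star(confidence, n):
    """Two-sided critical value for a sample of size n (df = n - 1)."""
    df = n - 1
    target = 1 - (1 - confidence) / 2   # area to the left of t*
    lo, hi = 0.0, 100.0
    for _ in range(50):                 # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

cases = [(0.95, 10), (0.95, 20), (0.95, 30), (0.90, 57), (0.995, 7)]
for confidence, n in cases:
    print(f"{confidence:.1%} confidence, n = {n:2d}, df = {n - 1:2d}: "
          f"t* = {t_star(confidence, n):.3f}")
```

Printed tables often skip some degrees of freedom (df = 56, for example), in which case the usual convention is to use the next smaller df listed, giving a slightly larger t* than the computed value.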

22 1. Check Conditions and show that you have checked these!
  ˃ Random Sample: Can we assume this?
  ˃ 10% Condition: Do you believe that your sample size is less than 10% of the population size?
  ˃ Nearly Normal:
    + If you have raw data, graph a histogram to check to see if it is approximately symmetric and sketch the histogram on your paper.
    + If you do not have raw data, check to see if the problem states that the distribution is approximately Normal.
2. State the test you are about to conduct (this will come in handy when we learn various intervals and inference tests)
  ˃ Ex) One sample t-interval
3. Show your calculations for your t-interval
4. Report your findings. Write a sentence explaining what you found.
  ˃ Ex) “We are 95% confident that the true mean weight of men is between 185 and 215 lbs.”

23 Amount of nitrogen oxides (NOX) emitted by light-duty engines (grams/mile): Construct a 95% confidence interval for the mean amount of NOX emitted by light-duty engines.
1.28 1.17 1.16 1.08 0.6 1.32 1.24 0.71 0.49 1.38 1.2 0.78
0.95 2.2 1.78 1.83 1.26 1.73 1.31 1.8 1.15 0.97 1.12 0.72
1.31 1.45 1.22 1.32 1.47 1.44 0.51 1.49 1.33 0.86 0.57 1.79
2.27 1.87 2.94 1.16 1.45 1.51 1.47 1.06 2.01 1.39
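One way to work this exercise in code is sketched below. It computes the one-sample t-interval, x̄ ± t* · s / √n, for the 46 NOX values. The critical value 2.014 for df = 45 is an assumption taken from software-style t values; a printed table without a df = 45 row would typically fall back to df = 40 (2.021), giving a slightly wider interval.

```python
import statistics

# NOX emissions (grams/mile) for the 46 light-duty engines on the slide.
nox = [
    1.28, 1.17, 1.16, 1.08, 0.60, 1.32, 1.24, 0.71, 0.49, 1.38, 1.20, 0.78,
    0.95, 2.20, 1.78, 1.83, 1.26, 1.73, 1.31, 1.80, 1.15, 0.97, 1.12, 0.72,
    1.31, 1.45, 1.22, 1.32, 1.47, 1.44, 0.51, 1.49, 1.33, 0.86, 0.57, 1.79,
    2.27, 1.87, 2.94, 1.16, 1.45, 1.51, 1.47, 1.06, 2.01, 1.39,
]

n = len(nox)
xbar = statistics.mean(nox)
s = statistics.stdev(nox)   # sample standard deviation (divides by n - 1)

T_STAR = 2.014              # assumed 95% critical value for df = n - 1 = 45

margin = T_STAR * s / n ** 0.5
lower, upper = xbar - margin, xbar + margin

print(f"n = {n}, mean = {xbar:.3f}, s = {s:.3f}")
print(f"95% t-interval for the mean: ({lower:.3f}, {upper:.3f}) grams/mile")
```

The conditions still need to be checked separately (random sample, 10% condition, and a histogram of the 46 values for the Nearly Normal Condition) before the interval is reported.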

24 » Given a set of data:
  ˃ Enter data into L1
  ˃ Set up STATPLOT to create a histogram to check the Nearly Normal Condition
  ˃ STAT → TESTS → 8:TInterval
  ˃ Choose Inpt: Data, then specify your data list (usually L1)
  ˃ Specify frequency – 1 unless you have a frequency distribution that tells you otherwise
  ˃ Choose the confidence level → Calculate
» Given sample mean and standard deviation:
  ˃ STAT → TESTS → 8:TInterval
  ˃ Choose Inpt: Stats → ENTER
  ˃ Specify the sample mean, standard deviation, and sample size
  ˃ Choose the confidence level → Calculate
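For readers without the calculator, the summary-statistics route can be mirrored in a few lines of Python. This is a minimal sketch: the function name `t_interval_from_stats` is hypothetical, the critical value must be looked up separately, and the example inputs (a mean of 31.0, a standard deviation of 4.25, a sample of 23, and t* = 1.717 for 90% confidence with df = 22) are illustrative assumptions rather than values given in the slides.

```python
import math

def t_interval_from_stats(xbar, s, n, t_star):
    """One-sample t-interval from summary statistics: xbar ± t* · s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Illustrative inputs (assumptions): sample mean, sample standard deviation,
# sample size, and the 90% critical value for df = 22 from a t-table.
lower, upper = t_interval_from_stats(xbar=31.0, s=4.25, n=23, t_star=1.717)
print(f"90% t-interval: ({lower:.1f}, {upper:.1f})")
```

Keeping t* as an explicit argument makes the dependence on both the confidence level and the degrees of freedom visible, which the calculator menu hides.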

25 » Interpretation of your confidence interval is key.
» What NOT to say:
  ˃ “90% of all the vehicles on Triphammer Road drive at a speed between 29.5 and 32.5 mph.”
    + The confidence interval is about the mean, not the individual values.
  ˃ “We are 90% confident that a randomly selected vehicle will have a speed between 29.5 and 32.5 mph.”
    + Again, the confidence interval is about the mean, not the individual values.
  ˃ “The mean speed of the vehicles is 31.0 mph 90% of the time.”
    + The true mean does not vary; it’s the confidence interval that would be different had we gotten a different sample.
  ˃ “90% of all samples will have mean speeds between 29.5 and 32.5 mph.”
    + The interval we calculate does not set a standard for every other interval; it is no more (or less) likely to be correct than any other interval.

26 » DO SAY:
  ˃ “90% of intervals that could be found in this way would cover the true value.”
  ˃ Or make it more personal and say, “I am 90% confident that the true mean is between 29.5 and 32.5 mph.”

