Confidence Intervals with Means

Presentation transcript:

Confidence Intervals with Means Chapter 9

Formula: statistic ± (critical value) × (standard deviation of statistic), where the product is the margin of error. For a mean with σ known: $\bar{x} \pm z^* \dfrac{\sigma}{\sqrt{n}}$

Student's t-distribution: Developed by William Gosset. Continuous distribution. Unimodal, symmetrical, bell-shaped density curve. Above the horizontal axis. Area under the curve equals 1. Based on degrees of freedom: df = n − 1.

How do t-distributions compare to the standard normal distribution? Shorter & more spread out. More area in the tails. As n increases, t-distributions become more like the standard normal distribution.

Formula: $\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$, that is, statistic ± (critical value) × (standard error), where the product is the margin of error. Standard error: the name for the standard deviation of the statistic when you substitute s for σ.

How to find t*: Use the t table, or invT(p, df) on the calculator. For 90% confidence you need the upper t* value with 5% above it, so 95% is below: use p = 0.95. Find these t*: 90% confidence when n = 5: t* = 2.132. 95% confidence when n = 15: t* = 2.145.
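The invT lookup can be imitated in plain Python. The sketch below is illustrative and not part of the original slides: the helper names t_pdf, t_cdf, and t_star are hypothetical, and the CDF is approximated by integrating the t density numerically (Simpson's rule) instead of calling a statistics library. It should reproduce the two table values above, 2.132 and 2.145.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Area to the left of x: Simpson's rule on [0, |x|], then symmetry about 0."""
    h = abs(x) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(x), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_star(confidence, df):
    """Critical value t* leaving `confidence` area in the middle (like invT)."""
    target = 1 - (1 - confidence) / 2   # area below the upper critical value
    lo, hi = 0.0, 100.0
    for _ in range(60):                 # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_star(0.90, 5 - 1), 3))    # 90% confidence, n = 5
print(round(t_star(0.95, 15 - 1), 3))   # 95% confidence, n = 15
```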

Steps for doing a confidence interval: 1) Assumptions. 2) Calculate the interval. 3) Write a statement about the interval in the context of the problem: "We are ________% confident that the true mean [context] is between ______ and ______."

Assumptions for t-inference: Have an SRS from the population (or randomly assigned treatments). σ unknown. Normal (or approximately normal) distribution, checked in one of three ways: it is given, the sample size is large, or a graph of the data supports it. Use only one of these methods to check normality.

Ex. 2) A medical researcher measured the pulse rate of a random sample of 20 adults and found a mean pulse rate of 72.69 beats per minute with a standard deviation of 3.86 beats per minute. Assume pulse rate is normally distributed. Compute a 95% confidence interval for the true mean pulse rate of adults. With df = 19, the interval is $72.69 \pm t^* \cdot \dfrac{3.86}{\sqrt{20}}$. We are 95% confident that the true mean pulse rate of adults is between 70.883 and 74.497 beats per minute.
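The arithmetic for Ex. 2 can be checked with a few lines of Python. This is a minimal sketch: the critical value 2.093 is an assumed t-table lookup (95% confidence, df = 19) that the slide does not state, and the variable names are illustrative.

```python
import math

n, x_bar, s = 20, 72.69, 3.86   # sample size, sample mean, sample SD (from the example)
t_crit = 2.093                  # assumed table value: t* for 95% confidence, df = 19

std_error = s / math.sqrt(n)    # standard error of the sample mean
margin = t_crit * std_error     # margin of error
lower, upper = x_bar - margin, x_bar + margin
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The printed interval agrees with the slide's 70.883 and 74.497.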

Ex. 3) Consumer Reports tested 14 randomly selected brands of vanilla yogurt and found the following numbers of calories per serving: 160, 200, 220, 230, 120, 180, 140, 130, 170, 190, 80, 120, 100, 170. Compute a 98% confidence interval for the average calorie content per serving of vanilla yogurt. We are 98% confident that the true mean calorie content per serving of vanilla yogurt is between 126.16 calories and 189.56 calories.
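A short sketch of the Ex. 3 calculation from the raw data, using only the Python standard library. The critical value 2.650 is an assumed t-table lookup (98% confidence, df = 13), so the endpoints can differ from the slide's calculator output in the last decimal place.

```python
import math
import statistics

calories = [160, 200, 220, 230, 120, 180, 140, 130, 170, 190, 80, 120, 100, 170]
n = len(calories)
x_bar = statistics.mean(calories)
s = statistics.stdev(calories)   # sample SD (divides by n - 1)
t_crit = 2.650                   # assumed table value: t* for 98% confidence, df = 13

margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"mean = {x_bar:.2f}, s = {s:.2f}")
print(f"98% CI: ({lower:.2f}, {upper:.2f})")

claimed = 120                    # the diet guide's claim, checked against the interval
print("120 inside the interval?", lower <= claimed <= upper)
```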

Ex 3 continued) A diet guide claims that you will get 120 calories from a serving of vanilla yogurt. What does this evidence indicate? Since 120 calories is not contained within the 98% confidence interval, the evidence suggests that the average calories per serving does not equal 120 calories. Note: confidence intervals tell us if something is NOT EQUAL, never less than or greater than!

Robust: An inference procedure is ROBUST if the confidence level or p-value doesn't change much when the normality assumption is violated. CIs & p-values deal with area in the tails: is that area changed greatly when there is skewness? Since there is more area in the tails of t-distributions, the tail area is not greatly affected if a distribution has some skewness. t-procedures can be used with some skewness, as long as there are no outliers. Larger n can have more skewness.

Find a sample size: If a certain margin of error m is wanted, then to find the sample size necessary for that margin of error, solve $m = z^* \dfrac{\sigma}{\sqrt{n}}$ for n: $n = \left(\dfrac{z^* \sigma}{m}\right)^2$. Always round up to the nearest person!

Ex 4) The heights of SHS male students are normally distributed with σ = 2.5 inches. How large a sample is necessary to be accurate within ±0.75 inches with a 95% confidence interval? n = 43
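The sample-size calculation in Ex 4 can be sketched as follows. It assumes the usual z-interval margin of error, m = z*·σ/√n, solved for n; NormalDist comes from Python's standard library and the variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma = 2.5        # population SD in inches (from the example)
margin = 0.75      # desired margin of error in inches
confidence = 0.95

z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
n_exact = (z_star * sigma / margin) ** 2                   # about 42.7
n_needed = math.ceil(n_exact)                              # always round up
print(n_needed)   # 43
```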

Some Cautions: The data MUST be an SRS from the population (or randomly assigned treatments). The formula is not correct for more complex sampling designs, e.g., stratified samples. There is no way to correct for bias in the data.

Cautions continued: Outliers can have a large effect on the confidence interval. You must know σ to do a z-interval, which is unrealistic in practice.

Hypothesis Tests One Sample Means

Steps for doing a hypothesis test: 1) Assumptions. 2) Write hypotheses & define the parameter, e.g. H0: μ = 12 vs Ha: μ (<, >, or ≠) 12. 3) Calculate the test statistic & p-value. 4) Write a statement in the context of the problem: "Since the p-value < (>) α, I reject (fail to reject) H0. There is (is not) sufficient evidence to suggest that Ha (in context)."

Assumptions for t-inference: Have an SRS from the population (or randomly assigned treatments). σ unknown. Normal (or approximately normal) distribution, checked in one of three ways: it is given, the sample size is large, or a graph of the data supports it. Use only one of these methods to check normality.

Formulas: σ unknown: $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

Calculating p-values: For a z-test statistic, use normalcdf(lb, ub) [using the standard normal curve]. For a t-test statistic, use tcdf(lb, ub, df).

Example 1: Bottles of a popular cola are supposed to contain 300 mL of cola. There is some variation from bottle to bottle. An inspector, who suspects that the bottler is under-filling, measures the contents of six randomly selected bottles: 299.4, 297.7, 298.9, 300.2, 297, 301. Is there sufficient evidence that the bottler is under-filling the bottles? Use α = .1

Assumptions: SRS? I have an SRS of bottles. Normal? How do you know? Since the boxplot is approximately symmetrical with no outliers, the sampling distribution is approximately normally distributed. Do you know σ? σ is unknown. What are your hypothesis statements? Is there a key word? H0: μ = 300, Ha: μ < 300, where μ is the true mean amount of cola in bottles. Plug values into the formula: p-value = .0880, α = .1. Compare your p-value to α & make a decision: Since p-value < α, I reject the null hypothesis. Write the conclusion in context in terms of Ha: There is sufficient evidence to suggest that the true mean cola in the bottles is less than 300 mL.
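The test statistic and p-value for the cola example can be reproduced from the six measurements. This is a sketch, not the slide's calculator procedure: the helpers t_pdf and t_cdf are hypothetical names, and the left-tail area (the role tcdf plays on the calculator) is approximated by Simpson's rule on the t density.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Area to the left of x: Simpson's rule on [0, |x|], then symmetry about 0."""
    h = abs(x) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(x), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

volumes = [299.4, 297.7, 298.9, 300.2, 297, 301]   # mL, from the example
mu_0 = 300       # hypothesized mean under H0
alpha = 0.1

n = len(volumes)
x_bar = statistics.mean(volumes)
s = statistics.stdev(volumes)
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = t_cdf(t_stat, n - 1)   # left-tail area, because Ha: mu < 300

print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The p-value agrees with the slide's .0880, so the decision at α = .1 is to reject H0.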

Example 3: The Wall Street Journal (January 27, 1994) reported that, based on sales in a chain of Midwestern grocery stores, President's Choice Chocolate Chip Cookies were selling at a mean rate of $1323 per week. Suppose a random sample of 30 weeks in 1995 in the same stores showed that the cookies were selling at an average rate of $1208 with a standard deviation of $275. Does this indicate that the sales of the cookies are lower than the earlier figure?

Assume: Have an SRS of weeks. The sampling distribution of mean sales is approximately normal due to the large sample size. σ unknown. H0: μ = 1323, Ha: μ < 1323, where μ is the true mean cookie sales per week. Since p-value < α of 0.05, I reject the null hypothesis. There is sufficient evidence to suggest that the sales of cookies are lower than the earlier figure. What is the potential error in context? What is a consequence of that error?
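The decision in Example 3 can be checked from the summary statistics with a critical-value comparison. In this sketch the value 1.699 is an assumed t-table lookup (upper 5% point, df = 29); the slide itself reports only that the p-value is below 0.05.

```python
import math

mu_0 = 1323                    # earlier reported mean weekly sales ($)
n, x_bar, s = 30, 1208, 275    # sample of weeks: size, mean, SD (from the example)
t_crit = 1.699                 # assumed table value: upper 5% point of t, df = 29

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
print(f"t = {t_stat:.2f}")

# One-sided test of Ha: mu < 1323 at alpha = 0.05: reject when t < -t_crit
reject = t_stat < -t_crit
print("reject H0" if reject else "fail to reject H0")
```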

Example 9: President’s Choice Chocolate Chip Cookies were selling at a mean rate of $1323 per week. Suppose a random sample of 30 weeks in 1995 in the same stores showed that the cookies were selling at the average rate of $1208 with standard deviation of $275. Compute a 90% confidence interval for the mean weekly sales rate. CI = ($1122.70, $1293.30) Based on this interval, is the mean weekly sales rate statistically less than the reported $1323?
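The 90% interval in Example 9 follows from the same summary statistics. As before, this is a sketch and 1.699 is an assumed t-table value (90% confidence, df = 29).

```python
import math

n, x_bar, s = 30, 1208, 275    # sample of weeks: size, mean, SD (from the example)
t_crit = 1.699                 # assumed table value: t* for 90% confidence, df = 29

margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"90% CI: (${lower:.2f}, ${upper:.2f})")

# The reported $1323 lies above the whole interval
print("1323 above the interval?", upper < 1323)
```

A 90% two-sided interval leaves 5% in each tail, so an interval that sits entirely below $1323 is consistent with rejecting H0 in the one-sided test at α = 0.05.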

Matched Pairs Test: A special type of t-inference

Matched Pairs: two forms. Form 1: Pair individuals by certain characteristics. Randomly select a treatment for individual A; individual B is assigned to the other treatment. The assignment of B is dependent on the assignment of A. Form 2: Individual persons or items receive both treatments. The order of treatments is randomly assigned, or before & after measurements are taken. The two measures are dependent on the individual.

Is this an example of matched pairs? 1) A college wants to see if there's a difference in the time it took last year's class to find a job after graduation and the time it took the class from five years ago to find work after graduation. Researchers take a random sample from both classes and measure the number of days between graduation and the first day of employment. No, there is no pairing of individuals; you have two independent samples.

Is this an example of matched pairs? 2) In a taste test, a researcher asks people in a random sample to taste a certain brand of spring water and rate it. Another random sample of people is asked to taste a different brand of water and rate it. The researcher wants to compare these samples. No, there is no pairing of individuals; you have two independent samples. If the same people tasted both brands in random order, then it would be an example of matched pairs.

Is this an example of matched pairs? 3) A pharmaceutical company wants to test its new weight-loss drug. Before giving the drug to a random sample, company researchers take a weight measurement on each person. After a month of using the drug, each person’s weight is measured again. Yes, you have two measurements that are dependent on each individual.

A whale-watching company noticed that many customers wanted to know whether it was better to book an excursion in the morning or the afternoon. To test this question, the company collected the following data on 15 randomly selected days over the past month. (Note: days were not consecutive.) [Table: whale sightings on days 1 to 15, morning and afternoon; values not preserved in the transcript.] Since you have two values for each day, they are dependent on the day, making this data matched pairs. First, you must find the differences for each day. You may subtract either way; just be careful when writing Ha.

[Table: days 1 to 15 with morning, afternoon, and difference rows; values not preserved in the transcript.] I subtracted: morning − afternoon. You could subtract the other way! Assumptions: Have an SRS of days for whale-watching. σ unknown. Since the normal probability plot is approximately linear, the distribution of differences is approximately normal. You need to state assumptions using the differences! Notice the granularity in this plot; it still displays a nice linear relationship!

Is there sufficient evidence that more whales are sighted in the afternoon? H0: μD = 0, Ha: μD < 0, where μD is the true mean difference in whale sightings (morning minus afternoon). Be careful writing your Ha! Think about how you subtracted: M − A. If afternoon is more, should the differences be + or −? Don't look at the numbers! If you subtract afternoon − morning, then Ha: μD > 0. Notice we used μD for differences, and it equals 0 because the null should be that there is NO difference.

Finishing the hypothesis test: In your calculator, perform a t-test using the differences (L3). Since p-value > α, I fail to reject H0. There is insufficient evidence to suggest that more whales are sighted in the afternoon than in the morning. Notice that if you subtracted A − M, then your test statistic would be t = +.945, but the p-value would be the same. How could I increase the power of this test?
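The mechanics of a matched-pairs test (take the differences, then run a one-sample t-test on them) can be sketched in Python. The counts below are HYPOTHETICAL, invented only to make the sketch runnable, because the slide's data table did not survive in the transcript; the output therefore does not match the slide's t = ±.945.

```python
import math
import statistics

# HYPOTHETICAL sightings for illustration only (not the slide's data)
morning   = [8, 7, 8, 9, 7, 7, 8, 9, 7, 8]
afternoon = [8, 9, 9, 7, 8, 8, 8, 10, 8, 7]

def paired_t(first, second):
    """One-sample t statistic on the differences first - second (H0: mu_D = 0)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)
    return d_bar / (s_d / math.sqrt(n)), n - 1

t_stat, df = paired_t(morning, afternoon)       # morning - afternoon
t_reversed, _ = paired_t(afternoon, morning)    # afternoon - morning

print(f"M - A: t = {t_stat:.3f}, df = {df}")
print(f"A - M: t = {t_reversed:.3f}")           # same size, opposite sign
```

Reversing the order of subtraction only flips the sign of t, which is why the direction of Ha has to match the way the differences were taken.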

Two-Sample Inference Procedures with Means

Remember: We will be interested in the difference of means, so we will use this to find the standard error: $\sigma_{x-y} = \sqrt{\sigma_x^2 + \sigma_y^2}$

Suppose we have a population of adult men with a mean height of 71 inches and standard deviation of 2.6 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation of 2.3 inches. Assume heights are normally distributed. Describe the distribution of the difference in heights between males and females (male − female). Normal distribution with $\mu_{x-y} = 6$ inches & $\sigma_{x-y} = 3.471$ inches.

[Figure: normal curves for female heights (mean 65) and male heights (mean 71), and for the difference = male − female (mean 6, σ = 3.471).]

a) What is the probability that the height of a randomly selected man is at most 5 inches taller than the height of a randomly selected woman? P((xM − xF) < 5) = normalcdf(−∞, 5, 6, 3.471) = .3866. b) What is the 70th percentile for the difference (male − female) in heights of a randomly selected man & woman? (xM − xF) = invNorm(.7, 6, 3.471) = 7.82
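The height-difference calculations can be reproduced with NormalDist from Python's standard library. This sketch assumes the two heights are independent, so their variances add; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_m, sigma_m = 71, 2.6    # men: mean and SD of height (inches)
mu_f, sigma_f = 65, 2.3    # women: mean and SD of height (inches)

mu_diff = mu_m - mu_f
sigma_diff = math.sqrt(sigma_m ** 2 + sigma_f ** 2)   # variances add for independent variables
diff = NormalDist(mu_diff, sigma_diff)

p_at_most_5 = diff.cdf(5)           # plays the role of normalcdf(-inf, 5, 6, 3.471)
percentile_70 = diff.inv_cdf(0.70)  # plays the role of invNorm(.7, 6, 3.471)

print(f"sigma of difference = {sigma_diff:.3f}")
print(f"P(difference < 5) = {p_at_most_5:.3f}")
print(f"70th percentile = {percentile_70:.2f}")
```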

Two-Sample Procedures with Means: When we compare, what are we interested in? The goal of these inference procedures is to compare the responses to two treatments or to compare the characteristics of two populations. We have INDEPENDENT samples from each treatment or population.

Assumptions: Have two SRSs from the populations or two randomly assigned treatment groups. Samples are independent. Both distributions are approximately normal: have large sample sizes, or graph BOTH sets of data. σ's unknown.

Formulas: Since in real life we will NOT know both σ's, we will do t-procedures.

Degrees of Freedom: Option 1: use the smaller of the two values n1 − 1 and n2 − 1. This will produce conservative results: higher p-values & lower confidence. Option 2: approximation used by technology. The calculator does this automatically!

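Confidence intervals: $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$. The square-root term is called the standard error.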

Pooled procedures: Used for two populations with the same variance. When you pool, you average the two sample variances to estimate the common population variance. DO NOT use on the AP Exam! We do NOT know the variances of the populations, so ALWAYS tell the calculator NO for pooling!

Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B. Assume the absorption time is normally distributed. Twelve people were randomly selected and given an oral dosage of brand A. Another 12 were randomly selected and given an equal dosage of brand B. The length of time in minutes for the drugs to reach a specified level in the blood was recorded. The results follow. Brand A: mean 20.1, SD 8.7, n = 12. Brand B: mean 18.9, SD 7.5, n = 12. Describe the shape & standard error for the sampling distribution of the differences in the mean speed of absorption. (answer on next screen)

Describe the sampling distribution of the differences in the mean speed of absorption. Normal distribution with S.E. = 3.316. Find a 95% confidence interval for the difference in mean lengths of time required for bodily absorption of each brand. (answer on next screen)

Assumptions (state assumptions!): Have 2 independent randomly assigned treatments. Given that the absorption rate is normally distributed. σ's unknown. Formula & calculations: From the calculator, df = 21.53; if using a table, use t* for df = 21 at the 95% confidence level (think "Price is Right": closest without going over). Conclusion in context: We are 95% confident that the true difference in mean lengths of time required for bodily absorption of each brand is between −5.685 minutes and 8.085 minutes.
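The standard error, both degrees-of-freedom options, and the interval can be sketched from the summary statistics. The technology approximation is assumed here to be the Welch-Satterthwaite formula, which reproduces the slide's df = 21.53; the helpers t_pdf, t_cdf, and t_star are hypothetical names that approximate the t critical value numerically rather than through a statistics library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution (df may be non-integer)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Area to the left of x: Simpson's rule on [0, |x|], then symmetry about 0."""
    h = abs(x) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(x), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_star(confidence, df):
    """Critical value t* found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

mean_a, sd_a, n_a = 20.1, 8.7, 12    # Brand A summary statistics
mean_b, sd_b, n_b = 18.9, 7.5, 12    # Brand B summary statistics

var_a, var_b = sd_a ** 2 / n_a, sd_b ** 2 / n_b
std_error = math.sqrt(var_a + var_b)

df_conservative = min(n_a, n_b) - 1                       # Option 1
df_welch = (var_a + var_b) ** 2 / (                       # Option 2 (assumed formula)
    var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))

margin = t_star(0.95, df_welch) * std_error
lower = (mean_a - mean_b) - margin
upper = (mean_a - mean_b) + margin

print(f"SE = {std_error:.3f}, df = {df_welch:.2f} (conservative df = {df_conservative})")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The endpoints agree with the slide's −5.685 and 8.085 to within rounding, which suggests the slide's interval used the calculator's df rather than the table value for df = 21.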

Note on confidence interval statements: Matched pairs: refer to "mean difference". Two-sample: refer to "difference of means".

Hypothesis Statements: H0: μ1 − μ2 = 0 with Ha: μ1 − μ2 < 0, Ha: μ1 − μ2 > 0, or Ha: μ1 − μ2 ≠ 0. Equivalently, H0: μ1 = μ2 with Ha: μ1 < μ2, Ha: μ1 > μ2, or Ha: μ1 ≠ μ2. Be sure to define BOTH μ1 and μ2!

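Hypothesis Test: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$. Since we usually assume H0 is true, $\mu_1 - \mu_2$ equals 0, so we can usually leave it out.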

The length of time in minutes for the drugs to reach a specified level in the blood was recorded. The results follow. Brand A: mean 20.1, SD 8.7, n = 12. Brand B: mean 18.9, SD 7.5, n = 12. Is there sufficient evidence that these drugs differ in the speed at which they enter the blood stream?

Assumptions (state assumptions!): Have 2 independent randomly assigned treatments. Given that the absorption rate is normally distributed. σ's unknown. Hypotheses (define variables!): H0: μA = μB, Ha: μA ≠ μB, where μA is the true mean absorption time for Brand A & μB is the true mean absorption time for Brand B. Formula & calculations. Conclusion in context: Since p-value > α, I fail to reject H0. There is not sufficient evidence to suggest that these drugs differ in the speed at which they enter the blood stream.

Suppose that the sample mean of Brand B is 16.5. Is Brand B then faster? No, I would still fail to reject the null hypothesis.
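Both versions of the headache-remedy test can be sketched from the summary statistics. The slides do not report the test statistics or p-values, so the numbers this code prints are the sketch's own output under stated assumptions: unpooled (Welch) degrees of freedom, a two-sided alternative for the original data, and a one-sided alternative (Brand A takes longer) for the "is Brand B faster?" follow-up. The helper names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution (df may be non-integer)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Area to the left of x: Simpson's rule on [0, |x|], then symmetry about 0."""
    h = abs(x) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(x), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def two_sample_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Unpooled two-sample t statistic and Welch degrees of freedom (H0: mu1 = mu2)."""
    v1, v2 = sd_1 ** 2 / n_1, sd_2 ** 2 / n_2
    t_stat = (mean_1 - mean_2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n_1 - 1) + v2 ** 2 / (n_2 - 1))
    return t_stat, df

# Original data: do the drugs differ? Two-sided, Ha: muA != muB
t_two, df_two = two_sample_t(20.1, 8.7, 12, 18.9, 7.5, 12)
p_two = 2 * (1 - t_cdf(abs(t_two), df_two))

# Follow-up: Brand B mean is 16.5. Is B faster? One-sided, Ha: muA > muB
t_one, df_one = two_sample_t(20.1, 8.7, 12, 16.5, 7.5, 12)
p_one = 1 - t_cdf(t_one, df_one)

print(f"two-sided: t = {t_two:.3f}, df = {df_two:.2f}, p-value = {p_two:.3f}")
print(f"one-sided: t = {t_one:.3f}, df = {df_one:.2f}, p-value = {p_one:.3f}")
```

Both p-values come out well above common significance levels such as 0.05, in line with the slides' "fail to reject" conclusions.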

Robustness: Two-sample procedures are more robust than one-sample procedures. BEST to have equal sample sizes! (but not necessary)