Hypothesis Testing
What is a Hypothesis?
Claim: Average Weekly Entertainment Spending is 55. How would you test that claim?
To make it interesting: what if an ad company will pay $1,000 to get a commercial message in front of the class if the Mojo level is greater than 55, but only $500 if the Mojo level is less than 55? I claim that the Mojo level is 55, because I'd like the $1,000. How could I convince the advertiser that the claim is true?
Take a sample. The sample mean will not equal the true population value. It will be close, but how close will it be?
How to Test a Hypothesis
[Histogram: Average Weekly Entertainment Spending, horizontal axis from 30.0 to 75.0. Mean = 49.8, Std. Dev = 11.90, N = 38]
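One way to put a number on "how close" is the standard error of the sample mean. The sketch below assumes the histogram's summary figures (mean 49.8, standard deviation 11.90, N = 38) describe a random sample from the class. The variable names are illustrative and do not come from the slides.

```python
import math

# Summary statistics read off the histogram slide
# (assumed here to describe a random sample).
sample_mean = 49.8
sample_sd = 11.90
n = 38
claimed_mean = 55.0

# Standard error: the typical distance between a sample mean and the true mean.
standard_error = sample_sd / math.sqrt(n)

# How many standard errors the sample mean sits from the claimed value.
gap_in_standard_errors = (sample_mean - claimed_mean) / standard_error

print(f"standard error = {standard_error:.2f}")
print(f"gap = {gap_in_standard_errors:.2f} standard errors")
```

Under these assumptions the sample mean lands a little under 2.7 standard errors below the claimed 55, which is the kind of gap the rest of the deck teaches how to judge.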
Hypothesis Testing
Statistical hypothesis testing is a formal, systematic approach to evaluating data and deciding whether the results from an observed sample can be generalized to a larger population, or whether the results might just be due to chance.
Who Likes Wallace and Gromit?
Potential hypotheses about Wallace and Gromit (W&G):
Males like W&G more than females do.
Greeks like W&G more than independents do.
People who have heard of W&G like it more than people who have not heard of it.
People who have watched W&G like it more than people who have not watched it.
People who have lived outside the USA for more than 6 months like W&G more than people who have not.
A greater percentage of people who have lived outside the USA have watched W&G previously.
Checking out Wallace and Gromit
Wallace and Gromit Questionnaire
Survey is anonymous
Age
Gender
Member of a fraternity or sorority
Ever lived outside the USA for more than 6 months
Ever heard of Wallace and Gromit
Ever watched Wallace and Gromit
If so, when was the last time watched
Rate on a scale from 1 to 5 (1 = strongly disagree; 5 = strongly agree):
  Wallace and Gromit is clever comedy
  Wallace and Gromit is my kind of entertainment
  Would watch Wallace and Gromit again at home
Hypothesis Testing Structure
1. Business question:
2. Null hypothesis (H0):
3. Alternative hypothesis (Ha):
4. Test statistic:
5. Rejection region:
6. Observed test statistic:
7. p-value:
8. Statistical conclusion:
9. Business conclusion:
Example 1: Over a period of years, a toothpaste has received a mean customer satisfaction rating of 5.9 out of 7. Because of a change in suppliers, there is concern that customer satisfaction may have decreased. In a sample of 60 customers, the mean rating is found to be 5.60, with a standard deviation of 0.87.
[Figure: Hypothesized Mean]
Toothpaste Example
1. Business question: Has Customer Satisfaction Changed?
2. Null hypothesis (H0): Try to Prove Wrong!
   $\mu_0 = \mu$ : True Mean Equals the Hypothesized Mean
   $5.9 = \mu$ : True Average Equals 5.9
3. Alternative hypothesis (Ha): What We Really Think!
   $\mu_0 \neq \mu$ : Something Has Changed; True Average Not Equal to 5.9
4. Test statistic: Test using a Z-score
5. Rejection region: If the Z-score, based on $\mu_0$, is unusual (far away from 0), reject the Null Hypothesis
Toothpaste Example
Example 1 data: hypothesized mean rating 5.90 out of 7; sample of n = 60 customers with mean rating 5.60 and standard deviation 0.87.
The Z-score:
$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{5.60 - 5.90}{0.87 / \sqrt{60}} = \frac{-0.30}{0.1123} = -2.67$
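The Z-score for Example 1 can be checked numerically. This is a minimal sketch; the function name `z_score` and its parameter names are illustrative, not taken from the slides.

```python
import math

def z_score(sample_mean, hypothesized_mean, sample_sd, n):
    """Z-score of a sample mean, measured against the hypothesized mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Toothpaste example: hypothesized mean 5.90, sample of 60 customers.
z = z_score(sample_mean=5.60, hypothesized_mean=5.90, sample_sd=0.87, n=60)
print(round(z, 2))
```

The sign is negative because the sample mean falls below the hypothesized mean.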
How Unusual is Z = -2.67?
[Figure: Hypothesized Mean]
Toothpaste Example
5. Rejection region: Depends on Confidence Level
   At the 95% Confidence Level, reject the Null if Z > 1.96 or Z < -1.96
95% Rejection Region
[Figure: regions labeled Reject, Accept, Reject]
Other Rejection Regions
90% Region: Reject if Z < -1.645 or Z > 1.645
99% Region: Reject if Z < -2.576 or Z > 2.576
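The cutoffs for these regions come from the standard normal distribution. A short sketch using Python's standard-library `statistics.NormalDist` shows how they can be recovered; the helper name `critical_value` is an assumption made for illustration.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided cutoff: reject H0 when |Z| exceeds this value."""
    tail = (1 - confidence) / 2          # probability in each tail
    return NormalDist().inv_cdf(1 - tail)

cutoffs = {level: critical_value(level) for level in (0.90, 0.95, 0.99)}
for level, cutoff in cutoffs.items():
    print(f"{level:.0%} region: reject if |Z| > {cutoff:.3f}")
```

A higher confidence level pushes the cutoff further from 0, so it takes a more unusual Z-score to reject the Null.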
Toothpaste Example
6. Observed test statistic: Already Calculated, z = -2.67
7. p-value: pr(z < -2.67 or z > 2.67)
8. Statistical conclusion: Reject Null Hypothesis, because -2.67 < -1.96
9. Business conclusion: Average Customer Satisfaction Has Changed; In Particular, Decreased
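The decision steps for the toothpaste example can be strung together in a few lines. This is a sketch under the slides' assumptions (a two-sided Z-test at the 95% confidence level); `two_sided_z_test` is a hypothetical helper, not a library function.

```python
import math
from statistics import NormalDist

def two_sided_z_test(sample_mean, hypothesized_mean, sample_sd, n, confidence=0.95):
    """Return (z, cutoff, reject) for a two-sided Z-test of the mean."""
    z = (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))
    cutoff = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Reject H0 when the observed Z falls in either tail of the rejection region.
    return z, cutoff, abs(z) > cutoff

z, cutoff, reject = two_sided_z_test(5.60, 5.90, 0.87, 60)
print(f"z = {z:.2f}, cutoff = {cutoff:.2f}, reject H0: {reject}")
```

Passing a different `confidence` value moves the cutoff, so the same sample can be judged against the 90% or 99% regions as well.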
p-value
The p-value is a measure of the evidence against H0; it is the probability of observing a test statistic at least as extreme as the observed z, given that H0 is true.
Small p-value: (accept/reject) H0?
Large p-value: (accept/reject) H0?
p-value
$p = \Pr(z < -2.67) + \Pr(z > 2.67) = 2 \times \Pr(z > 2.67) = 2 \times 0.0038 = 0.0076$
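The tail probability above is usually read from a Z table, but it can also be computed directly. A minimal sketch with the standard library follows; the function name `two_sided_p_value` is illustrative.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a Z at least as far from 0 as the observed one, under H0."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail   # both tails are equally likely by symmetry

p_value = two_sided_p_value(-2.67)
print(round(p_value, 4))
```

A p-value this far below 0.05 agrees with the rejection-region decision at the 95% confidence level.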