Statistics for Business and Economics Chapter 7 Inferences Based on Two Samples: Confidence Intervals & Tests of Hypotheses
Learning Objectives
1. Distinguish independent and related populations
2. Solve inference problems for two populations: mean, proportion, variance
3. Determine sample size
Thinking Challenge Who gets higher grades: males or females? Which program is faster to learn: Word or Excel? How would you try to answer these questions?
Target Parameters
Difference between means: $\mu_1 - \mu_2$
Difference between proportions: $p_1 - p_2$
Ratio of variances: $\sigma_1^2 / \sigma_2^2$
Independent & Related Populations
Independent:
1. Different data sources (unrelated, independent)
2. Use the difference between the two sample means, $\bar{X}_1 - \bar{X}_2$
Related:
1. Same data source (paired or matched; repeated measures, before/after)
2. Use the difference between each pair of observations, $d_i = x_{1i} - x_{2i}$
Two Independent Populations Examples
1. An economist wishes to determine whether there is a difference in mean family income for households in two socioeconomic groups.
2. An admissions officer of a small liberal arts college wants to compare the mean SAT scores of applicants educated in rural high schools and in urban high schools.
Two Related Populations Examples
1. Nike wants to see if there is a difference in durability of two sole materials. One type is placed on one shoe, the other type on the other shoe of the same pair.
2. An analyst for Educational Testing Service wants to compare the mean GMAT scores of students before and after taking a GMAT review course.
Thinking Challenge
Are they independent or related?
1. Miles per gallon ratings of cars before and after mounting radial tires
2. The life expectancy of light bulbs made in two different factories
3. Difference in hardness between two metals: one contains an alloy, one doesn’t
4. Tread life of two different motorcycle tires: one on the front, the other on the back
Two Population Inference
Two Populations
  Mean
    Independent: Z (large sample), t (small sample)
    Paired: t (paired sample)
  Proportion: Z
  Variance: F
Comparing Two Means
Comparing Two Independent Means
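Sampling Distribution
Population 1 (mean $\mu_1$, standard deviation $\sigma_1$): select a simple random sample of size $n_1$ and compute $\bar{X}_1$.
Population 2 (mean $\mu_2$, standard deviation $\sigma_2$): select a simple random sample of size $n_2$ and compute $\bar{X}_2$.
Compute $\bar{X}_1 - \bar{X}_2$ for every pair of samples. The astronomical number of $\bar{X}_1 - \bar{X}_2$ values forms the sampling distribution, which is centered at $\mu_1 - \mu_2$.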
Large-Sample Inference for Two Independent Means
Conditions Required for Valid Large-Sample Inferences about $\mu_1 - \mu_2$
Assumptions:
Independent, random samples
The sampling distribution of $\bar{X}_1 - \bar{X}_2$ can be approximated by the normal distribution when $n_1 \ge 30$ and $n_2 \ge 30$
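Large-Sample Confidence Interval for $\mu_1 - \mu_2$ (Independent Samples)
Confidence interval: $(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$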
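The large-sample interval for the difference between two independent means can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own procedure: the function name `large_sample_ci` and the summary statistics in the usage lines are hypothetical, and the sample standard deviations are assumed to stand in for the population values, as is usual when both samples are large.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2 (independent samples)."""
    # z_{alpha/2}: upper-tail critical value of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of the difference between the two sample means
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


# Hypothetical summary statistics, for illustration only
low, high = large_sample_ci(10.0, 2.0, 100, 9.0, 3.0, 100)
print(f"95% CI for mu1 - mu2: ({low:.3f}, {high:.3f})")
```

If the interval excludes 0, the data suggest the two population means differ at the chosen confidence level.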
Large-Sample Confidence Interval Example You’re a financial analyst for Charles Schwab. You want to estimate the difference in dividend yield between stocks listed on NYSE and NASDAQ. You collect the following data: NYSE NASDAQ Number Mean Std Dev What is the 95% confidence interval for the difference between the mean dividend yields?
Large-Sample Confidence Interval Solution
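Hypotheses for Means of Two Independent Populations
Research question: no difference vs. any difference. $H_0: \mu_1 - \mu_2 = 0$; $H_a: \mu_1 - \mu_2 \ne 0$
Research question: Pop 1 $\ge$ Pop 2 vs. Pop 1 < Pop 2. $H_0: \mu_1 - \mu_2 \ge 0$; $H_a: \mu_1 - \mu_2 < 0$
Research question: Pop 1 $\le$ Pop 2 vs. Pop 1 > Pop 2. $H_0: \mu_1 - \mu_2 \le 0$; $H_a: \mu_1 - \mu_2 > 0$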
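Large-Sample Test for $\mu_1 - \mu_2$ (Independent Samples)
Two independent sample Z-test statistic: $z = \dfrac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$, where $D_0$ is the hypothesized difference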
Large-Sample Test Example You’re a financial analyst for Charles Schwab. You want to find out if there is a difference in dividend yield between stocks listed on NYSE and NASDAQ. You collect the following data: NYSE NASDAQ Number Mean Std Dev Is there a difference in average yield ($\alpha = .05$)?
Large-Sample Test Solution
$H_0$: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)
$H_a$: $\mu_1 - \mu_2 \ne 0$ ($\mu_1 \ne \mu_2$)
$\alpha = .05$
$n_1 =$ , $n_2 =$
Critical Value(s): $z = \pm 1.96$
Test Statistic:
Decision: Reject at $\alpha = .05$
Conclusion: There is evidence of a difference in means
Large-Sample Test Thinking Challenge
You’re an economist for the Department of Education. You want to find out if there is a difference in spending per pupil between urban and rural high schools. You collect the following:
            Urban     Rural
Number      35        35
Mean        $6,012    $5,832
Std Dev     $602      $497
Is there any difference in population means ($\alpha = .10$)?
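Large-Sample Test Solution*
$H_0$: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)
$H_a$: $\mu_1 - \mu_2 \ne 0$ ($\mu_1 \ne \mu_2$)
$\alpha = .10$
$n_1 = 35$, $n_2 = 35$
Critical Value(s): $z = \pm 1.645$
Test Statistic: $z = \dfrac{(6012 - 5832) - 0}{\sqrt{\dfrac{602^2}{35} + \dfrac{497^2}{35}}} \approx 1.36$
Decision: Do not reject at $\alpha = .10$
Conclusion: There is no evidence of a difference in means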
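The urban versus rural spending test can be checked numerically. The sketch below uses the summary statistics from the thinking challenge; the function name `two_sample_z` is an assumption made for illustration, and the sample standard deviations are used in place of the unknown population values.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = d0 (independent samples)."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - d0) / se


alpha = 0.10
# Urban: n = 35, mean = 6012, sd = 602; Rural: n = 35, mean = 5832, sd = 497
z = two_sample_z(6012, 602, 35, 5832, 497, 35)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) > z_crit
print(f"z = {z:.2f}, critical values = +/-{z_crit:.3f}, reject H0: {reject}")
```

The statistic falls inside the non-rejection region, which agrees with the decision stated in the solution.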
Small-Sample Inference for Two Independent Means
Conditions Required for Valid Small-Sample Inferences about $\mu_1 - \mu_2$
Assumptions:
Independent, random samples
Populations are approximately normally distributed
Population variances are equal
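Small-Sample Confidence Interval for $\mu_1 - \mu_2$ (Independent Samples)
Confidence interval: $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{S_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$, with df $= n_1 + n_2 - 2$
Pooled variance: $S_p^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$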
Small-Sample Confidence Interval Example
You’re a financial analyst for Charles Schwab. You want to estimate the difference in dividend yield between stocks listed on the NYSE and NASDAQ. You collect the following data:
            NYSE    NASDAQ
Number      11      15
Mean
Std Dev
Assuming normal populations, what is the 95% confidence interval for the difference between the mean dividend yields?
Small-Sample Confidence Interval Solution
df $= n_1 + n_2 - 2 = 11 + 15 - 2 = 24$
$t_{.025} = 2.064$
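A small-sample interval based on the pooled variance might be computed as follows. This is a sketch under stated assumptions: the sample sizes (11 and 15) and the critical value $t_{.025} = 2.064$ are the ones given above, but the means and standard deviations passed in are hypothetical placeholders, and the function name `pooled_t_ci` is invented for illustration. The critical value is passed in by hand because the standard library has no t distribution.

```python
from math import sqrt


def pooled_t_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Pooled-variance confidence interval for mu1 - mu2 (equal variances assumed)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return df, sp2, (diff - t_crit * se, diff + t_crit * se)


# Means and standard deviations below are hypothetical, for illustration only
df, sp2, (low, high) = pooled_t_ci(5.0, 2.0, 11, 4.0, 2.0, 15, t_crit=2.064)
print(f"df = {df}, pooled variance = {sp2:.2f}, CI = ({low:.3f}, {high:.3f})")
```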
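Small-Sample Test for $\mu_1 - \mu_2$ (Independent Samples)
Two independent sample t-test statistic: $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{S_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $D_0$ is the hypothesized difference and df $= n_1 + n_2 - 2$
Pooled variance: $S_p^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$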
Small-Sample Test Example
You’re a financial analyst for Charles Schwab. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:
            NYSE    NASDAQ
Number      11      15
Mean
Std Dev
Assuming normal populations, is there a difference in average yield ($\alpha = .05$)?
Small-Sample Test Solution
$H_0$: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)
$H_a$: $\mu_1 - \mu_2 \ne 0$ ($\mu_1 \ne \mu_2$)
$\alpha = .05$
df $= 24$
Critical Value(s): $t = \pm 2.064$
Test Statistic:
Decision: Do not reject at $\alpha = .05$
Conclusion: There is no evidence of a difference in means
Small-Sample Test Thinking Challenge
You’re a research analyst for General Motors. Assuming equal variances, is there a difference in the average miles per gallon (mpg) of two car models ($\alpha = .05$)? You collect the following:
            Sedan   Van
Number      15      11
Mean
Std Dev
Small-Sample Test Solution*
$H_0$: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)
$H_a$: $\mu_1 - \mu_2 \ne 0$ ($\mu_1 \ne \mu_2$)
$\alpha = .05$
df $= 15 + 11 - 2 = 24$
Critical Value(s): $t = \pm 2.064$
Test Statistic:
Decision: Do not reject at $\alpha = .05$
Conclusion: There is no evidence of a difference in means
Paired-Difference Experiments: Small Sample
Paired-Difference Experiments
1. Compares means of two related populations (paired or matched; repeated measures, before/after)
2. Eliminates variation among subjects
Conditions Required for Valid Small-Sample Paired-Difference Inferences
Assumptions:
Random sample of differences
Both populations are approximately normally distributed
Paired-Difference Experiment Data Collection Table
Observation   Group 1    Group 2    Difference
1             $x_{11}$   $x_{21}$   $d_1 = x_{11} - x_{21}$
2             $x_{12}$   $x_{22}$   $d_2 = x_{12} - x_{22}$
i             $x_{1i}$   $x_{2i}$   $d_i = x_{1i} - x_{2i}$
n             $x_{1n}$   $x_{2n}$   $d_n = x_{1n} - x_{2n}$
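Paired-Difference Experiment Small-Sample Confidence Interval
Confidence interval: $\bar{d} \pm t_{\alpha/2} \dfrac{S_d}{\sqrt{n_d}}$, with df $= n_d - 1$
Sample mean: $\bar{d} = \dfrac{\sum_{i=1}^{n_d} d_i}{n_d}$
Sample standard deviation: $S_d = \sqrt{\dfrac{\sum_{i=1}^{n_d} (d_i - \bar{d})^2}{n_d - 1}}$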
Paired-Difference Experiment Confidence Interval Example
You work in Human Resources. You want to see if there is a difference in test scores after a training program. You collect the following test score data:
Name      Before (1)   After (2)
Sam       85           94
Tamika    94           87
Brian     78           79
Mike      87           88
Find a 90% confidence interval for the mean difference in test scores.
Computation Table
Observation   Before   After   Difference
Sam           85       94      -9
Tamika        94       87      7
Brian         78       79      -1
Mike          87       88      -1
Total                          -4
$\bar{d} = -1$, $S_d = 6.53$
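Paired-Difference Experiment Confidence Interval Solution
df $= n_d - 1 = 4 - 1 = 3$
$t_{.05} = 2.353$
$\bar{d} \pm t_{\alpha/2} \dfrac{S_d}{\sqrt{n_d}} = -1 \pm 2.353 \dfrac{6.53}{\sqrt{4}}$, which gives approximately $-8.68 \le \mu_d \le 6.68$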
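The paired-difference interval for the training scores can be reproduced with the standard library. The sketch below takes the before and after scores from the example and the critical value $t_{.05} = 2.353$ (3 df) from the solution; the variable names are illustrative choices.

```python
from math import sqrt
from statistics import mean, stdev

before = [85, 94, 78, 87]  # Sam, Tamika, Brian, Mike
after = [94, 87, 79, 88]

# Differences d_i = x_1i - x_2i (before minus after)
d = [b - a for b, a in zip(before, after)]
d_bar = mean(d)
s_d = stdev(d)  # sample standard deviation (divides by n_d - 1)
n_d = len(d)

t_crit = 2.353  # t_.05 with n_d - 1 = 3 df, taken from the slide
margin = t_crit * s_d / sqrt(n_d)
low, high = d_bar - margin, d_bar + margin
print(f"d_bar = {d_bar}, S_d = {s_d:.2f}, 90% CI = ({low:.2f}, {high:.2f})")
```

Because the interval contains 0, these four pairs give no evidence of a difference in mean scores.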
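Hypotheses for Paired-Difference Experiment
Research question: no difference vs. any difference. $H_0: \mu_d = 0$; $H_a: \mu_d \ne 0$
Research question: Pop 1 $\ge$ Pop 2 vs. Pop 1 < Pop 2. $H_0: \mu_d \ge 0$; $H_a: \mu_d < 0$
Research question: Pop 1 $\le$ Pop 2 vs. Pop 1 > Pop 2. $H_0: \mu_d \le 0$; $H_a: \mu_d > 0$
Note: $d_i = x_{1i} - x_{2i}$ for the $i$th observation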
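Paired-Difference Experiment Small-Sample Test Statistic
Test statistic: $t = \dfrac{\bar{d} - D_0}{S_d / \sqrt{n_d}}$, with df $= n_d - 1$
Sample mean: $\bar{d} = \dfrac{\sum_{i=1}^{n_d} d_i}{n_d}$
Sample standard deviation: $S_d = \sqrt{\dfrac{\sum_{i=1}^{n_d} (d_i - \bar{d})^2}{n_d - 1}}$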
Paired-Difference Experiment Small-Sample Test Example
You work in Human Resources. You want to see if a training program is effective. You collect the following test score data:
Name      Before   After
Sam       85       94
Tamika    94       87
Brian     78       79
Mike      87       88
At the .10 level of significance, was the training effective?
Null Hypothesis Solution
1. Was the training effective?
2. Effective means ‘Before’ < ‘After’.
3. Statistically, this means $\mu_B < \mu_A$.
4. Rearranging terms gives $\mu_B - \mu_A < 0$.
5. Defining $\mu_d = \mu_B - \mu_A$ and substituting into (4) gives $\mu_d < 0$.
6. The alternative hypothesis is $H_a: \mu_d < 0$, so the null hypothesis is $H_0: \mu_d = 0$.
Paired-Difference Experiment Small-Sample Test Solution
$H_0$: $\mu_d = 0$ ($\mu_d = \mu_B - \mu_A$)
$H_a$: $\mu_d < 0$
$\alpha = .10$
df $= 4 - 1 = 3$
Critical Value(s): $t = -1.638$
Test Statistic: $t = \dfrac{\bar{d} - D_0}{S_d / \sqrt{n_d}} = \dfrac{-1 - 0}{6.53 / \sqrt{4}} \approx -0.31$
Decision: Do not reject at $\alpha = .10$
Conclusion: There is no evidence training was effective
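The paired t statistic for the training example can be verified in the same way. In this sketch the scores come from the example, while the lower-tail critical value of $-1.638$ ($t_{.10}$ with 3 df) is a standard t-table lookup supplied by hand, which is an assumption of the sketch and not something the code computes.

```python
from math import sqrt
from statistics import mean, stdev

before = [85, 94, 78, 87]
after = [94, 87, 79, 88]
d = [b - a for b, a in zip(before, after)]  # d = Before - After

d0 = 0  # hypothesized mean difference under H0
t_stat = (mean(d) - d0) / (stdev(d) / sqrt(len(d)))

# Lower-tailed test (Ha: mu_d < 0); critical value assumed from a t table, 3 df
t_crit = -1.638
reject = t_stat < t_crit
print(f"t = {t_stat:.3f}, critical value = {t_crit}, reject H0: {reject}")
```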
Paired-Difference Experiment Small-Sample Test Thinking Challenge
You’re a marketing research analyst. You want to compare a client’s calculator to a competitor’s. You sample 8 retail stores. At the .01 level of significance, does your client’s calculator sell for less than their competitor’s?
Store   (1) Client   (2) Competitor
1       $ 10         $
Paired-Difference Experiment Small-Sample Test Solution*
$H_0$: $\mu_d = 0$ ($\mu_d = \mu_1 - \mu_2$)
$H_a$: $\mu_d < 0$
$\alpha = .01$
df $= 8 - 1 = 7$
Critical Value(s): $t = -2.998$
Test Statistic:
Decision: Reject at $\alpha = .01$
Conclusion: There is evidence client’s brand (1) sells for less
Comparing Two Population Proportions
Conditions Required for Valid Large-Sample Inference about p 1 – p 2 Assumptions Independent, random samples Normal approximation can be used if
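Large-Sample Confidence Interval for $p_1 - p_2$
Confidence interval: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$, where $\hat{q}_i = 1 - \hat{p}_i$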
Confidence Interval for p 1 – p 2 Example As personnel director, you want to test the perception of fairness of two methods of performance evaluation. 63 of 78 employees rated Method 1 as fair. 49 of 82 rated Method 2 as fair. Find a 99% confidence interval for the difference in perceptions.
Confidence Interval for p 1 – p 2 Solution
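One way to work the fairness example numerically is sketched below. The counts (63 of 78 and 49 of 82) and the 99% level come from the example; the function name `two_proportion_ci` is a hypothetical choice, and the interval uses the usual unpooled standard error for a difference between two sample proportions.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.99):
    """Large-sample confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error: each sample contributes its own p*q/n term
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


low, high = two_proportion_ci(63, 78, 49, 82)
print(f"99% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

The whole interval lies above 0, which suggests a larger share of employees rated Method 1 as fair.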
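Hypotheses for Two Proportions
Research question: no difference vs. any difference. $H_0: p_1 - p_2 = 0$; $H_a: p_1 - p_2 \ne 0$
Research question: Pop 1 $\ge$ Pop 2 vs. Pop 1 < Pop 2. $H_0: p_1 - p_2 \ge 0$; $H_a: p_1 - p_2 < 0$
Research question: Pop 1 $\le$ Pop 2 vs. Pop 1 > Pop 2. $H_0: p_1 - p_2 \le 0$; $H_a: p_1 - p_2 > 0$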
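Large-Sample Test for $p_1 - p_2$
Z-test statistic for two proportions: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} \hat{q} \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ is the pooled sample proportion and $\hat{q} = 1 - \hat{p}$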
Test for Two Proportions Example As personnel director, you want to test the perception of fairness of two methods of performance evaluation. 63 of 78 employees rated Method 1 as fair. 49 of 82 rated Method 2 as fair. At the .01 level of significance, is there a difference in perceptions?
Test for Two Proportions Solution
$H_0$: $p_1 - p_2 = 0$
$H_a$: $p_1 - p_2 \ne 0$
$\alpha = .01$
$n_1 = 78$, $n_2 = 82$
Critical Value(s): $z = \pm 2.58$
Test Statistic: $Z = +2.90$
Decision: Reject at $\alpha = .01$
Conclusion: There is evidence of a difference in proportions
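The reported statistic of $Z = +2.90$ can be reproduced with a pooled-proportion z test. This is a minimal sketch; the function name `two_proportion_z` is an assumption made for illustration, and the counts are those of the fairness example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0 using the pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


alpha = 0.01
z = two_proportion_z(63, 78, 49, 82)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) > z_crit
print(f"z = {z:.2f}, critical values = +/-{z_crit:.2f}, reject H0: {reject}")
```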
Test for Two Proportions Thinking Challenge You’re an economist for the Department of Labor. You’re studying unemployment rates. In MA, 74 of 1500 people surveyed were unemployed. In CA, 129 of 1500 were unemployed. At the .05 level of significance, does MA have a lower unemployment rate than CA?
Test for Two Proportions Solution*
$H_0$: $p_{MA} - p_{CA} = 0$
$H_a$: $p_{MA} - p_{CA} < 0$
$\alpha = .05$
$n_{MA} = 1500$, $n_{CA} = 1500$
Critical Value(s): $z = -1.645$
Test Statistic: $Z = -4.00$
Decision: Reject at $\alpha = .05$
Conclusion: There is evidence MA is less than CA
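For the one-sided unemployment comparison, a p-value gives the same decision as the critical-value approach. The sketch below uses the survey counts from the example; reporting a lower-tail p-value is an illustrative choice that the slides themselves do not show.

```python
from math import sqrt
from statistics import NormalDist

x_ma, n_ma = 74, 1500   # unemployed / surveyed in MA
x_ca, n_ca = 129, 1500  # unemployed / surveyed in CA

p_ma, p_ca = x_ma / n_ma, x_ca / n_ca
p_pool = (x_ma + x_ca) / (n_ma + n_ca)  # pooled proportion under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n_ma + 1 / n_ca))
z = (p_ma - p_ca) / se

# Lower-tailed test (Ha: p_MA - p_CA < 0): p-value is the area to the left of z
p_value = NormalDist().cdf(z)
alpha = 0.05
reject = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.5f}, reject H0: {reject}")
```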
Determining Sample Size
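Sample size for estimating $\mu_1 - \mu_2$: $n_1 = n_2 = \dfrac{z_{\alpha/2}^2 (\sigma_1^2 + \sigma_2^2)}{(ME)^2}$
Sample size for estimating $p_1 - p_2$: $n_1 = n_2 = \dfrac{z_{\alpha/2}^2 (p_1 q_1 + p_2 q_2)}{(ME)^2}$
ME = margin of error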
Sample Size Example What sample size is needed to estimate $\mu_1 - \mu_2$ with 95% confidence and a margin of error of 5.8? Assume prior experience tells us $\sigma_1 = 12$ and $\sigma_2 = 18$.
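The sample size for this example follows from the equal-sample-size formula $n_1 = n_2 = z_{\alpha/2}^2 (\sigma_1^2 + \sigma_2^2) / (ME)^2$. The sketch below plugs in the values from the example; the function name `sample_size_means` is hypothetical, and the result is rounded up because a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist


def sample_size_means(sigma1, sigma2, margin, confidence=0.95):
    """Equal sample sizes n1 = n2 needed to estimate mu1 - mu2 within `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * (sigma1**2 + sigma2**2) / margin**2)


n = sample_size_means(12, 18, 5.8)
print(f"n1 = n2 = {n}")
```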
Sample Size Example What sample size is needed to estimate $p_1 - p_2$ with 90% confidence and a width of .05 (that is, a margin of error of .025)?
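For the proportions example, a width of .05 corresponds to a margin of error of .025. No prior estimates of $p_1$ and $p_2$ are given, so the sketch below assumes the conservative value 0.5 for both; that assumption is not stated in the example and is marked in the code. The function name `sample_size_proportions` is also hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportions(p1, p2, margin, confidence=0.90):
    """Equal sample sizes n1 = n2 needed to estimate p1 - p2 within `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / margin**2)


width = 0.05
margin = width / 2  # margin of error is half the interval width
# Assumption: p1 = p2 = 0.5 (most conservative choice; not given in the example)
n = sample_size_proportions(0.5, 0.5, margin)
print(f"n1 = n2 = {n}")
```

Using 0.5 maximizes $p q$, so the resulting sample size is large enough whatever the true proportions turn out to be.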
Comparing Two Population Variances
Sampling Distribution
Population 1 (mean $\mu_1$, standard deviation $\sigma_1$): select a simple random sample of size $n_1$ and compute $S_1^2$.
Population 2 (mean $\mu_2$, standard deviation $\sigma_2$): select a simple random sample of size $n_2$ and compute $S_2^2$.
Compute $F = S_1^2 / S_2^2$ for every pair of samples of sizes $n_1$ and $n_2$. The astronomical number of $S_1^2 / S_2^2$ values forms the sampling distribution, which differs for different sample sizes.
Conditions Required for a Valid F-Test for Equal Variances
Assumptions:
Both populations are normally distributed (the test is not robust to violations)
Independent, random samples
F-Test for Equal Variances
Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_a: \sigma_1^2 \ne \sigma_2^2$, OR $H_0: \sigma_1^2 \ge \sigma_2^2$ (or $\le$) vs. $H_a: \sigma_1^2 < \sigma_2^2$ (or $>$)
Test statistic: $F = s_1^2 / s_2^2$
Two sets of degrees of freedom: $\nu_1 = n_1 - 1$; $\nu_2 = n_2 - 1$
Follows the F distribution
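F-Test for Equal Variances Critical Values
[Figure: F distribution with a "Reject $H_0$" region in each tail and "Do Not Reject $H_0$" between them. Note: each rejection region has area $\alpha/2$.]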
F-Test for Equal Variances Example
You’re a financial analyst for Charles Schwab. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:
            NYSE    NASDAQ
Number      21      25
Mean
Std Dev
Is there a difference in variances between the NYSE & NASDAQ at the .05 level of significance?
F-Test for Equal Variances Solution
$H_0$: $\sigma_1^2 = \sigma_2^2$
$H_a$: $\sigma_1^2 \ne \sigma_2^2$
$\alpha = .05$ (two-tailed, so $\alpha/2 = .025$ in each rejection region)
$\nu_1 = 20$, $\nu_2 = 24$
Critical Value(s):
Test Statistic:
Decision: Do not reject at $\alpha = .05$
Conclusion: There is no evidence of a difference in variances
F-Test for Equal Variances Thinking Challenge
You’re an analyst for the Light & Power Company. You want to compare the electricity consumption of single-family homes in two towns. You compute the following from a sample of homes:
            Town 1   Town 2
Number      25       21
Mean        $85      $68
Std Dev     $30      $18
At the .05 level of significance, is there evidence of a difference in variances between the two towns?
F-Test for Equal Variances Solution*
$H_0$: $\sigma_1^2 = \sigma_2^2$
$H_a$: $\sigma_1^2 \ne \sigma_2^2$
$\alpha = .05$ (two-tailed, so $\alpha/2 = .025$ in each rejection region)
$\nu_1 = 24$, $\nu_2 = 20$
Critical Value(s):
Test Statistic: $F = \dfrac{s_1^2}{s_2^2} = \dfrac{30^2}{18^2} \approx 2.78$
Decision: Reject at $\alpha = .05$
Conclusion: There is evidence of a difference in variances
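The F statistic for the two towns is simply the ratio of the sample variances. In the sketch below the standard deviations (30 and 18) come from the thinking challenge, while the sample sizes 25 and 21 are an assumption inferred from the stated degrees of freedom (24 and 20). The function name `f_statistic` is hypothetical, and the critical values would still need to come from an F table or a statistics package, since the standard library has no F distribution.

```python
def f_statistic(sd1, n1, sd2, n2):
    """Return F = s1^2 / s2^2 with its numerator and denominator df."""
    return sd1**2 / sd2**2, n1 - 1, n2 - 1


# Sample sizes 25 and 21 are assumed from the degrees of freedom (24, 20)
f_value, df1, df2 = f_statistic(30, 25, 18, 21)
print(f"F = {f_value:.2f} with ({df1}, {df2}) degrees of freedom")
```

For a two-tailed test at $\alpha = .05$, this value is compared with the upper and lower $\alpha/2 = .025$ critical values of the F distribution with these degrees of freedom.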
Conclusion
1. Distinguished independent and related populations
2. Solved inference problems for two populations: mean, proportion, variance
3. Determined sample size