Ka-fu Wong © 2007 ECON1003: Analysis of Economic Data Lesson9-1 Lesson 9: Confidence Intervals and Tests of Hypothesis Two or more samples.


The most important part of testing a hypothesis
Suppose we are interested in testing whether the population parameter ($\theta$) is equal to $k$.
$H_0: \theta = k$
$H_1: \theta \neq k$
First, we need to get a sample estimate ($q$) of the population parameter ($\theta$).
Second, we need to identify the sampling distribution of $q$, including its mean and variance.
Third, in most cases the test statistic takes the form $t = (q - k)/\sigma_q$, where $\sigma_q$ is the standard deviation of $q$ under the null. The form of $\sigma_q$ depends on what $q$ is.
Fourth, given the level of significance, determine the rejection region.

Testing a two-sided hypothesis at the 5% level of significance
$z = (q - \theta_0)/\mathrm{std}(q)$ is approximately normally distributed under the CLT.
[Figure: density of $q$ centered at $\theta_0$ (equivalently, of $z$ centered at 0), with a rejection region of probability $\alpha/2$ in each tail, beyond $\theta_0 \pm 1.96\,\sigma_q$.]

The most important part of constructing confidence intervals
Suppose we are interested in constructing a $(1-\alpha) \times 100\%$ confidence interval for the unknown population parameter ($\theta$), based on some sampling information.
First, we must have a sample estimate ($q$) of the population parameter ($\theta$).
Second, we need to identify the sampling distribution of $q$, including its mean and variance.
Third, in most cases the following statistic will be approximately normal or Student-t distributed: $t = (q - \theta)/\sigma_q$, where $\sigma_q$ is the standard deviation of $q$. The form of $\sigma_q$ depends on what $q$ is.
Fourth, given the confidence level, determine the upper and lower confidence limits for $\theta$: $q \pm t_{\alpha/2}\,\sigma_q$.

Constructing a 95% confidence interval for $\theta$
$z = (q - \theta)/\mathrm{std}(q)$ is approximately normally distributed under the CLT.
$q^*$: estimate of $\theta$ from a sample.
Lower limit: $q^* - 1.96\,\sigma_q$. Upper limit: $q^* + 1.96\,\sigma_q$.
[Figure: density of $z$ centered at 0 with probability $\alpha/2$ in each tail; the confidence interval spans the lower and upper limits around $q^*$.]

Examples of the population parameter of interest
Sampling distribution usually normal, due to the CLT:
  Population mean: $\theta = \mu$
  The difference of two population means: $\theta = \mu_1 - \mu_2$
  The sum of two population means: $\theta = \mu_1 + \mu_2$
  The sum of three population means: $\theta = \mu_1 + \mu_2 + \mu_3$
Sampling distribution usually chi-square:
  Population variance: $\theta = \sigma^2$
Sampling distribution usually F:
  Ratio of two population variances: $\theta = \sigma_1^2 / \sigma_2^2$

Distribution of linear combinations of random variables
If $m_1$, $m_2$, and $m_3$ are random variables that are independently normally distributed, then for constants $a$, $b$ and $c$, $z = a m_1 + b m_2 + c m_3$ is also normally distributed, with
$E(z) = aE(m_1) + bE(m_2) + cE(m_3)$
$\mathrm{Var}(z) = a^2\mathrm{Var}(m_1) + b^2\mathrm{Var}(m_2) + c^2\mathrm{Var}(m_3)$
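The mean and variance rules for a linear combination can be sketched in a few lines of Python. The function name and the input numbers here are hypothetical, chosen only for illustration; the variance rule assumes the variables are independent, as the slide states.

```python
def linear_combination_moments(coeffs, means, variances):
    """Mean and variance of sum(c_i * m_i) for independent random variables m_i."""
    mean = sum(c * m for c, m in zip(coeffs, means))
    # Coefficients enter the variance squared, so a coefficient of -1 still adds variance.
    var = sum(c ** 2 * v for c, v in zip(coeffs, variances))
    return mean, var

# Hypothetical inputs: z = m1 - m2 with E(m1)=10, E(m2)=4, Var(m1)=9, Var(m2)=16.
mean_z, var_z = linear_combination_moments([1, -1], [10.0, 4.0], [9.0, 16.0])
print(mean_z, var_z)
```

Because the coefficients are squared, the variance of a difference of independent variables is the sum of their variances, which is the rule the two-sample test statistics rely on.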

Distribution of sample variance
Let $x_1, x_2, \ldots, x_n$ be a random sample from a population. The sample variance is
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
The sampling distribution of $s^2$ has mean $\sigma^2$.
If the population is normally distributed, the following statistic has a $\chi^2$ distribution with $n - 1$ degrees of freedom:
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$

Distribution of a ratio of sample variances
The random variable
$F = \frac{s_x^2/\sigma_x^2}{s_y^2/\sigma_y^2}$
has an F distribution with $(n_x - 1)$ numerator degrees of freedom and $(n_y - 1)$ denominator degrees of freedom.

Two samples
Hypothesis testing
Constructing confidence interval

An example of hypothesis testing
To test the effect of an herbal treatment on improvement of memory, you randomly select two samples: one to receive the treatment (the experimental group) and one to receive a placebo (the control group). Results of a memory test taken one month later are given. The resulting difference between the two sample means is 4. Is this difference significant, or is it due to chance (sampling error)?

Two Sample Tests
Test for equal variances and test for equal means.
[Figure: for each test, sketches of the distributions of Population 1 and Population 2 under $H_0$ and under $H_1$.]

Comparing two populations
We wish to know whether the distribution of the differences in sample means has a mean of 0.
If both samples contain at least 30 observations, we use the z distribution as the test statistic.

Hypothesis Tests for Two Population Means
[Table: hypotheses for the two-tailed test, the upper one-tailed test, and the lower one-tailed test, each written in two formats (Format 1 and Format 2); Format 2 is marked as preferred.]

Two Independent Populations: Examples
1. An economist wishes to determine whether there is a difference in mean family income for households in two socioeconomic groups. Do HKU students come from families with higher income than CUHK students?
2. An admissions officer of a small liberal arts college wants to compare the mean SAT scores of applicants educated in rural high schools and in urban high schools. Do students from rural high schools have lower A-level exam scores than students from urban high schools?
Note: The SAT (originally the Scholastic Aptitude Test) is a standardized test for college admissions in the United States.

Two Dependent Populations: Examples
1. An analyst for Educational Testing Service wants to compare the mean GMAT scores of students before and after taking a GMAT review course. Get HKU graduates to take the A-Level English and Chinese exams again. Do they get higher A-Level English and Chinese exam scores than at the time they entered HKU?
2. Nike wants to see if there is a difference in durability of 2 sole materials. One type is placed on one shoe, the other type on the other shoe of the same pair.
Note: The Graduate Management Admission Test, better known by the acronym GMAT (pronounced G-mat), is a standardized test for determining aptitude to succeed academically in graduate business studies. The GMAT is used as one of the selection criteria by most respected business schools globally, most commonly for admission into an MBA program.

Thinking Challenge
Are they independent or dependent?
1. Miles per gallon ratings of cars before and after mounting radial tires
2. The life expectancies of light bulbs made in two different factories
3. Difference in hardness between 2 metals: one contains an alloy, one doesn't
4. Tread life of two different motorcycle tires: one on the front, the other on the back
Answers: 1 dependent, 2 independent, 3 independent, 4 dependent.

Comparing two populations
No assumptions about the shape of the populations are required.
The samples are from independent populations. Values in one sample have no influence on the values in the other sample(s).
Variance formula for independent random variables A and B: $V(A - B) = V(A) + V(B)$
The formula for computing the value of z is:
$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

EXAMPLE 1
Two cities, Bradford and Kane, are separated only by the Conewango River. There is competition between the two cities. The local paper recently reported that the mean household income in Bradford is $38,000 with a standard deviation of $6,000 for a sample of 40 households. The same article reported the mean income in Kane is $35,000 with a standard deviation of $7,000 for a sample of 35 households. At the .01 significance level, can we conclude the mean income in Bradford is more?

EXAMPLE 1 continued
Step 1: State the null and alternate hypotheses. $H_0: \mu_B \le \mu_K$; $H_1: \mu_B > \mu_K$
Step 2: State the level of significance. The .01 significance level is stated in the problem.
Step 3: Find the appropriate test statistic. Because both samples have more than 30 observations, we can use z as the test statistic.

Example 1 continued
Step 4: State the decision rule. The null hypothesis is rejected if z is greater than 2.33.
$H_0: \mu_B \le \mu_K$; $H_1: \mu_B > \mu_K$
[Figure: probability density of the z statistic, N(0,1), with the rejection region ($\alpha = 0.01$) in the upper tail beyond 2.33.]

Example 1 continued
Step 5: Compute the value of z and make a decision.
$z = \frac{38{,}000 - 35{,}000}{\sqrt{\frac{6{,}000^2}{40} + \frac{7{,}000^2}{35}}} = 1.98$
$H_0: \mu_B \le \mu_K$; $H_1: \mu_B > \mu_K$
[Figure: the computed value 1.98 falls below the critical value, outside the rejection region ($\alpha = 0.01$).]

Example 1 continued
The decision is to not reject the null hypothesis. We cannot conclude that the mean household income in Bradford is larger.

Example 1 continued
The p-value is: $P(z > 1.98) = .5000 - .4761 = .0239$
$H_0: \mu_B \le \mu_K$; $H_1: \mu_B > \mu_K$
[Figure: the p-value is the area under N(0,1) to the right of 1.98; it is larger than the rejection region of $\alpha = 0.01$.]
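The arithmetic in Example 1 can be checked with a short Python sketch. The function name `two_sample_z` is hypothetical; the inputs are the figures quoted in the example, and the critical value and p-value come from the standard normal distribution in Python's `statistics` module rather than from a printed table, so the last digits may differ slightly from the slide.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for the difference of two means, large independent samples."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Bradford (sample 1) versus Kane (sample 2), figures from Example 1.
z = two_sample_z(38_000, 6_000, 40, 35_000, 7_000, 35)

alpha = 0.01
critical = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
p_value = 1 - NormalDist().cdf(z)           # one-sided p-value
reject = z > critical

print(f"z = {z:.2f}, critical = {critical:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

The p-value exceeds 0.01, which agrees with the decision not to reject the null hypothesis.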

Small Sample Tests of Means
The t distribution is used as the test statistic if one or more of the samples have fewer than 30 observations.
The required assumptions are:
1. Both populations must follow the normal distribution.
2. The populations must have equal standard deviations.
3. The samples are from independent populations.

Small sample test of means continued
Finding the value of the test statistic requires two steps.
Step 1: Pool the sample variances.
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Why not $n_1 + n_2$ in the denominator?
Step 2: Determine the value of t from the following formula.
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

Small sample test of means continued
Why not $n_1 + n_2$?
For a single sample, $(n_1 - 1)$ is the degrees of freedom. One df is lost because the sample mean must be fixed before the sample variance is computed. Dividing by the df instead of $n_1$ makes $s_1^2$ an unbiased estimate of the population variance.
For the pooled sample, $(n_1 + n_2 - 2)$ is the degrees of freedom. Two dfs are lost because two sample means must be fixed before the pooled variance is computed. Dividing by the df instead of $(n_1 + n_2)$ makes $s_p^2$ an unbiased estimate of the population variance.

EXAMPLE 2
A recent EPA study compared the highway fuel economy of domestic and imported passenger cars. A sample of 15 domestic cars revealed a mean of 33.7 mpg with a standard deviation of 2.4 mpg. A sample of 12 imported cars revealed a mean of 35.7 mpg with a standard deviation of 3.9 mpg. At the .05 significance level, can the EPA conclude that the mpg is higher on the imported cars?

Example 2 continued
Step 1: State the null and alternate hypotheses. $H_0: \mu_D \ge \mu_I$; $H_1: \mu_D < \mu_I$
Step 2: State the level of significance. The .05 significance level is stated in the problem.
Step 3: Find the appropriate test statistic. Both samples have fewer than 30 observations, so we use the t distribution.

EXAMPLE 2 continued
Step 4: The decision rule is to reject $H_0$ if $t < -1.708$. There are $15 + 12 - 2 = 25$ degrees of freedom.
[Figure: probability density of the t statistic, t (df = 25), with the rejection region ($\alpha = 0.05$) in the lower tail.]

EXAMPLE 2 continued
Step 5: We compute the pooled variance:
$s_p^2 = \frac{(15 - 1)(2.4)^2 + (12 - 1)(3.9)^2}{15 + 12 - 2} = 9.918$

Example 2 continued
We compute the value of t as follows.
$t = \frac{33.7 - 35.7}{\sqrt{9.918\left(\frac{1}{15} + \frac{1}{12}\right)}} = -1.640$

Example 2 continued
The computed t of $-1.640$ is not less than the critical value of $-1.708$, so it does not fall in the rejection region ($\alpha = 0.05$). $H_0$ is not rejected. There is insufficient sample evidence to claim a higher mpg on the imported cars.
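A minimal Python sketch of the two-step pooled calculation, using the figures given in Example 2. The function name `pooled_t` is hypothetical, and the critical value 1.708 is taken from a standard t table for 25 degrees of freedom (one tail, 0.05) rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled variance, t statistic and degrees of freedom (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, t, df

# Domestic cars (sample 1) versus imported cars (sample 2).
pooled_var, t, df = pooled_t(33.7, 2.4, 15, 35.7, 3.9, 12)

T_CRITICAL = 1.708         # table value for df = 25, one-tail alpha = 0.05
reject = t < -T_CRITICAL   # lower-tail test: H1 is mu_D < mu_I

print(f"pooled variance = {pooled_var:.3f}, t = {t:.3f}, df = {df}, reject H0: {reject}")
```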

Hypothesis Testing Involving Paired Observations
Independent samples are samples that are not related in any way.
Dependent samples are samples that are paired or related in some fashion. For example:
If you wished to buy a car, you would look at the same car at two (or more) different dealerships and compare the prices.
If you wished to measure the effectiveness of a new diet, you would weigh the dieters at the start and at the finish of the program.

Hypothesis Testing Involving Paired Observations
Use the following test when the samples are dependent:
$t = \frac{\bar{d}}{s_d/\sqrt{n}}$
where $\bar{d}$ is the mean of the differences, $s_d$ is the standard deviation of the differences, and $n$ is the number of pairs (differences).

EXAMPLE 3
An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz and Avis. A random sample of eight cities revealed the following information. At the .05 significance level, can the testing agency conclude that there is a difference in the rental charge?

EXAMPLE 3 continued
City         Hertz ($)  Avis ($)
Atlanta      42         40
Chicago      56         52
Cleveland    45         43
Denver       48         48
Honolulu     37         32
Kansas City  45         48
Miami        41         39
Seattle      46         50

EXAMPLE 3 continued
Step 1: State the null and alternate hypotheses. $H_0: \mu_d = 0$; $H_1: \mu_d \neq 0$
Step 2: State the level of significance. The .05 significance level is stated in the problem.
Step 3: Find the appropriate test statistic. We can use t as the test statistic.

EXAMPLE 3 continued
Step 4: State the decision rule. $H_0$ is rejected if $t < -2.365$ or $t > 2.365$. We use the t distribution with 7 degrees of freedom.
$H_0: \mu_d = 0$; $H_1: \mu_d \neq 0$
[Figure: probability density of the t statistic, t (df = 7), with Rejection Region I (probability = 0.025) in one tail and Rejection Region II (probability = 0.025) in the other.]

Example 3 continued
City         Hertz ($)  Avis ($)  d    d^2
Atlanta      42         40         2    4
Chicago      56         52         4   16
Cleveland    45         43         2    4
Denver       48         48         0    0
Honolulu     37         32         5   25
Kansas City  45         48        -3    9
Miami        41         39         2    4
Seattle      46         50        -4   16
Sum                                8   78

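Example 3 continued
$\bar{d} = \frac{\sum d}{n} = \frac{8}{8} = 1.00$
$s_d = \sqrt{\frac{\sum d^2 - (\sum d)^2/n}{n - 1}} = \sqrt{\frac{78 - 8^2/8}{8 - 1}} = 3.1623$
$t = \frac{\bar{d}}{s_d/\sqrt{n}} = \frac{1.00}{3.1623/\sqrt{8}} = 0.894$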

Example 3 continued
Step 5: Because 0.894 is less than the critical value of 2.365, do not reject the null hypothesis. We cannot conclude that there is a difference in the mean amount charged by Hertz and Avis.
$H_0: \mu_d = 0$; $H_1: \mu_d \neq 0$
[Figure: the computed t lies between the two rejection regions (probability = 0.025 in each tail).]
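The paired test in Example 3 can be reproduced with the Python standard library. This is a sketch under two stated assumptions: the Denver row is read as 48 for both companies (the transcript shows a single 48 and a difference of 0), and the critical value 2.365 is taken from a standard t table for 7 degrees of freedom rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Daily rental cost by city, in the order Atlanta ... Seattle.
hertz = [42, 56, 45, 48, 37, 45, 41, 46]
avis = [40, 52, 43, 48, 32, 48, 39, 50]  # Denver read as 48 (assumption)

d = [h - a for h, a in zip(hertz, avis)]  # paired differences
n = len(d)
d_bar = mean(d)                           # mean of the differences
s_d = stdev(d)                            # sample standard deviation (n - 1 divisor)
t = d_bar / (s_d / sqrt(n))

T_CRITICAL = 2.365                        # table value for df = 7, two-tail alpha = 0.05
reject = abs(t) > T_CRITICAL

print(f"d_bar = {d_bar}, s_d = {s_d:.4f}, t = {t:.3f}, reject H0: {reject}")
```

Working on the differences turns the two dependent samples into a single sample, so the usual one-sample t formula applies.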

Two Sample Tests of Proportions
We investigate whether two independent samples came from populations with an equal proportion of successes.
The two samples are pooled using the following formula.
$p_c = \frac{X_1 + X_2}{n_1 + n_2}$
where $X_1$ and $X_2$ refer to the number of successes in the respective samples of $n_1$ and $n_2$.

Two Sample Tests of Proportions continued
The value of the test statistic is computed from the following formula.
$z = \frac{p_1 - p_2}{\sqrt{\frac{p_c(1 - p_c)}{n_1} + \frac{p_c(1 - p_c)}{n_2}}}$
Note: The form of the standard deviation reflects the assumption of independence of the two samples.

Example 4
Are unmarried workers more likely to be absent from work than married workers? A sample of 250 married workers showed 22 missed more than five days last year, while a sample of 300 unmarried workers showed 35 missed more than five days. Use a .05 significance level.

Example 4 continued
The null and the alternate hypotheses are:
$H_0: \pi_U \le \pi_M$
$H_1: \pi_U > \pi_M$
The null hypothesis is rejected if the computed value of z is greater than 1.65.

Example 4 continued
The pooled proportion is
$p_c = \frac{35 + 22}{300 + 250} = .1036$
The value of the test statistic is
$z = \frac{\frac{35}{300} - \frac{22}{250}}{\sqrt{\frac{.1036(1 - .1036)}{300} + \frac{.1036(1 - .1036)}{250}}} = 1.10$

Example 4 continued
The null hypothesis is not rejected. We cannot conclude that a higher proportion of unmarried workers than married workers miss more than five days in a year.
The p-value is: $P(z > 1.10) = .5000 - .3643 = .1357$
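Example 4 can be checked with the following Python sketch, which pools the two samples and then forms the z statistic. Variable names are illustrative; the p-value is computed from the unrounded z, so it differs in the third decimal from the table-based .1357.

```python
from math import sqrt
from statistics import NormalDist

# Workers who missed more than five days, and sample sizes.
x_unmarried, n_unmarried = 35, 300
x_married, n_married = 22, 250

p_unmarried = x_unmarried / n_unmarried
p_married = x_married / n_married

# Pooled proportion: under H0 both samples share one success probability.
p_pooled = (x_unmarried + x_married) / (n_unmarried + n_married)

standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_unmarried + 1 / n_married))
z = (p_unmarried - p_married) / standard_error
p_value = 1 - NormalDist().cdf(z)  # one-sided, H1: pi_U > pi_M
reject = z > 1.65                  # critical value quoted in the example

print(f"pooled p = {p_pooled:.4f}, z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```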

Two Sample Tests
Test for equal variances and test for equal means.
[Figure: for each test, sketches of the distributions of Population 1 and Population 2 under $H_0$ and under $H_1$.]

Hypothesis Tests of One Population Variance
If the population is normally distributed,
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$
follows a chi-square distribution with $(n - 1)$ degrees of freedom, where $\sigma^2$ is the population variance.
The test statistic for hypothesis tests about one population variance is
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma_0^2}$
where $\sigma_0^2$ is the variance under the null hypothesis.

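Decision Rules: Variance
Lower-tail test: $H_0: \sigma^2 \ge \sigma_0^2$; $H_1: \sigma^2 < \sigma_0^2$. Reject $H_0$ if $\chi^2_{n-1} < \chi^2_{n-1,\,1-\alpha}$.
Upper-tail test: $H_0: \sigma^2 \le \sigma_0^2$; $H_1: \sigma^2 > \sigma_0^2$. Reject $H_0$ if $\chi^2_{n-1} > \chi^2_{n-1,\,\alpha}$.
Two-tail test: $H_0: \sigma^2 = \sigma_0^2$; $H_1: \sigma^2 \neq \sigma_0^2$. Reject $H_0$ if $\chi^2_{n-1} > \chi^2_{n-1,\,\alpha/2}$ or $\chi^2_{n-1} < \chi^2_{n-1,\,1-\alpha/2}$.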

Hypothesis Tests for Two Variances
Two-tail test: $H_0: \sigma_x^2 = \sigma_y^2$; $H_1: \sigma_x^2 \neq \sigma_y^2$
Lower-tail test: $H_0: \sigma_x^2 \ge \sigma_y^2$; $H_1: \sigma_x^2 < \sigma_y^2$
Upper-tail test: $H_0: \sigma_x^2 \le \sigma_y^2$; $H_1: \sigma_x^2 > \sigma_y^2$
The two populations are assumed to be independent and normally distributed.

Hypothesis Tests for Two Variances (continued)
The random variable
$F = \frac{s_x^2/\sigma_x^2}{s_y^2/\sigma_y^2}$
has an F distribution with $(n_x - 1)$ numerator degrees of freedom and $(n_y - 1)$ denominator degrees of freedom.
Under the null that $\sigma_x^2 = \sigma_y^2$, we have
$F = \frac{s_x^2}{s_y^2}$

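Decision Rules: Two Variances
Let $s_x^2$ be the larger of the two sample variances.
Upper-tail test: $H_0: \sigma_x^2 \le \sigma_y^2$; $H_1: \sigma_x^2 > \sigma_y^2$. Reject $H_0$ if $F > F_{n_x-1,\,n_y-1,\,\alpha}$.
Two-tail test: $H_0: \sigma_x^2 = \sigma_y^2$; $H_1: \sigma_x^2 \neq \sigma_y^2$. The rejection region is $F > F_{n_x-1,\,n_y-1,\,\alpha/2}$.
[Figure: F densities starting at 0, with the "Reject $H_0$" region in the upper tail (area $\alpha$ for the upper-tail test, $\alpha/2$ for the two-tail test) and "Do not reject $H_0$" elsewhere.]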

Example: F Test
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE and NASDAQ. You collect the following data:
          NYSE  NASDAQ
Number    21    25
Mean
Std dev
Is there a difference in the variances between the NYSE and NASDAQ at the $\alpha = 0.10$ level?

F Test: Example Solution
Form the hypothesis test:
$H_0: \sigma_x^2 = \sigma_y^2$ (there is no difference between variances)
$H_1: \sigma_x^2 \neq \sigma_y^2$ (there is a difference between variances)
Degrees of freedom:
Numerator (NYSE has the larger standard deviation): $n_x - 1 = 21 - 1 = 20$ d.f.
Denominator: $n_y - 1 = 25 - 1 = 24$ d.f.
Find the F critical value for $\alpha/2 = .10/2 = .05$: $F_{20,\,24,\,.05} = 2.03$

F Test: Example Solution (continued)
$H_0: \sigma_x^2 = \sigma_y^2$; $H_1: \sigma_x^2 \neq \sigma_y^2$
The test statistic is $F = s_x^2 / s_y^2$.
The computed F is not in the rejection region ($\alpha/2 = .05$ in the upper tail), so we do not reject $H_0$.
Conclusion: There is not sufficient evidence of a difference in variances at $\alpha = .10$.

Two samples
Hypothesis testing
Constructing confidence interval

Dependent Samples
Tests of means of 2 related populations:
Paired or matched samples
Repeated measures (before/after)
Use the difference between paired values: $d_i = x_i - y_i$

Mean Difference
The $i$th paired difference is $d_i$, where $d_i = x_i - y_i$.
The point estimate for the population mean paired difference is $\bar{d}$:
$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$
The sample standard deviation is:
$s_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n - 1}}$

Confidence Interval for Mean Difference
The confidence interval for the difference between population means, $\mu_d$, is
$\bar{d} - t_{n-1,\,\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\,\alpha/2}\frac{s_d}{\sqrt{n}}$
where $n$ is the sample size (the number of matched pairs in the paired sample).

Paired Samples Example
Six people sign up for a weight loss program. You collect the following data:
[Table: weight for each person before ($x$) and after ($y$) the program, and the difference $d_i$.]
$\bar{d} = \frac{\sum d_i}{n} = 7.0$

Paired Samples Example (continued)
For a 95% confidence level, the appropriate t value is $t_{n-1,\,\alpha/2} = t_{5,\,.025} = 2.571$.
The 95% confidence interval for the difference between means, $\mu_d$, is $\bar{d} \pm t_{n-1,\,\alpha/2}\, s_d/\sqrt{n}$.
Since this interval contains zero, we cannot be 95% confident, given this limited data, that the weight loss program helps people lose weight.

Difference Between Two Means
Population means, independent samples:
$\sigma_x^2$ and $\sigma_y^2$ known: the confidence interval uses $z_{\alpha/2}$.
$\sigma_x^2$ and $\sigma_y^2$ unknown: the confidence interval uses a value from the Student's t distribution, with two cases:
  $\sigma_x^2$ and $\sigma_y^2$ assumed equal
  $\sigma_x^2$ and $\sigma_y^2$ assumed unequal

$\sigma_x^2$ and $\sigma_y^2$ Known
Population means, independent samples.
Assumptions:
Samples are randomly and independently drawn.
Both population distributions are normal.
Population variances are known.

$\sigma_x^2$ and $\sigma_y^2$ Known
Population means, independent samples.
When $\sigma_x$ and $\sigma_y$ are known and both populations are normal, the variance of $\bar{X} - \bar{Y}$ is
$\sigma^2_{\bar{X}-\bar{Y}} = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$
and the random variable
$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$
has a standard normal distribution.

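Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Known
The confidence interval for $\mu_x - \mu_y$ is:
$(\bar{x} - \bar{y}) - z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$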

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal
Population means, independent samples.
Assumptions:
Samples are randomly and independently drawn.
Populations are normally distributed.
Population variances are unknown but assumed equal.

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Equal
Forming interval estimates:
The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate $\sigma$.
Use a t value with $(n_x + n_y - 2)$ degrees of freedom.
The pooled variance is
$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$

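Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Equal
The confidence interval for $\mu_x - \mu_y$ is:
$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$
where
$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$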

Pooled Variance Example
You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in MHz):
[Table: number tested, sample mean, and sample standard deviation for CPU x and CPU y.]
Assume both populations are normal with equal variances, and use 95% confidence.

Ka-fu Wong © 2007 ECON1003: Analysis of Economic Data Lesson9-72 Calculating the Pooled Variance The pooled variance is: The t value for a 95% confidence interval is:

Ka-fu Wong © 2007 ECON1003: Analysis of Economic Data Lesson9-73 Calculating the Confidence Limits The 95% confidence interval is We are 95% confident that the mean difference in CPU speed is between and Mhz.

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Unequal
Population means, independent samples.
Assumptions:
Samples are randomly and independently drawn.
Populations are normally distributed.
Population variances are unknown and assumed unequal.

$\sigma_x^2$ and $\sigma_y^2$ Unknown, Assumed Unequal
Forming interval estimates:
The population variances are assumed unequal, so a pooled variance is not appropriate.
Use a t value with $\nu$ degrees of freedom, where
$\nu = \frac{\left(\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}\right)^2}{\frac{(s_x^2/n_x)^2}{n_x - 1} + \frac{(s_y^2/n_y)^2}{n_y - 1}}$

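Confidence Interval, $\sigma_x^2$ and $\sigma_y^2$ Unknown, Unequal
The confidence interval for $\mu_x - \mu_y$ is:
$(\bar{x} - \bar{y}) - t_{\nu,\,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{\nu,\,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}$
where
$\nu = \frac{\left(\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}\right)^2}{\frac{(s_x^2/n_x)^2}{n_x - 1} + \frac{(s_y^2/n_y)^2}{n_y - 1}}$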
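As an illustration of the unequal-variance case, the sketch below computes the standard Welch-Satterthwaite degrees of freedom and the unpooled standard error. The slides do not work a numerical example for this case, so the inputs are borrowed from the EPA fuel economy data of Example 2 purely for illustration, and the function name `welch_df` is hypothetical.

```python
from math import sqrt

def welch_df(sd_x, n_x, sd_y, n_y):
    """Welch-Satterthwaite approximate degrees of freedom for unequal variances."""
    a = sd_x ** 2 / n_x
    b = sd_y ** 2 / n_y
    return (a + b) ** 2 / (a ** 2 / (n_x - 1) + b ** 2 / (n_y - 1))

# Illustration only: Example 2 data (domestic = x, imported = y).
mean_x, sd_x, n_x = 33.7, 2.4, 15
mean_y, sd_y, n_y = 35.7, 3.9, 12

df = welch_df(sd_x, n_x, sd_y, n_y)
standard_error = sqrt(sd_x ** 2 / n_x + sd_y ** 2 / n_y)  # no pooling
t = (mean_x - mean_y) / standard_error

print(f"df = {df:.2f}, standard error = {standard_error:.4f}, t = {t:.3f}")
```

The resulting degrees of freedom are generally not an integer and are smaller than the pooled value of $n_x + n_y - 2$; a confidence interval would then use the t value for that df with the unpooled standard error.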

Two Population Proportions
Goal: Form a confidence interval for the difference between two population proportions, $P_x - P_y$.
The point estimate for the difference is $\hat{p}_x - \hat{p}_y$.
Assumptions: Both sample sizes are large (generally at least 40 observations in each sample).

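Two Population Proportions (continued)
The random variable
$Z = \frac{(\hat{p}_x - \hat{p}_y) - (P_x - P_y)}{\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}}$
is approximately normally distributed.
The confidence limits for $P_x - P_y$ are:
$(\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}$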

Example: Two Population Proportions
Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.
In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree.

Example: Two Population Proportions
Men: $\hat{p}_x = 26/50 = 0.52$
Women: $\hat{p}_y = 28/40 = 0.70$
For 90% confidence, $z_{\alpha/2} = 1.645$.
The confidence limits are:
$(0.52 - 0.70) \pm 1.645\sqrt{\frac{0.52(0.48)}{50} + \frac{0.70(0.30)}{40}}$
so $-0.3465 < P_x - P_y < -0.0135$.
Since this interval does not contain zero, we are 90% confident that the two proportions are not equal.
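The confidence limits for the college degree example can be verified with a short Python sketch that uses only the counts given in the example. Variable names are illustrative; the z value comes from the standard normal quantile function instead of a table.

```python
from math import sqrt
from statistics import NormalDist

# Sample counts from the example: men (x) and women (y) with a college degree.
p_men, n_men = 26 / 50, 50
p_women, n_women = 28 / 40, 40

confidence = 0.90
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}, about 1.645

# Unpooled standard error: each sample keeps its own proportion.
standard_error = sqrt(p_men * (1 - p_men) / n_men + p_women * (1 - p_women) / n_women)
difference = p_men - p_women
lower = difference - z * standard_error
upper = difference + z * standard_error

print(f"{lower:.4f} < P_x - P_y < {upper:.4f}")
```

Unlike the hypothesis test of equal proportions, the interval does not pool the samples, because no null hypothesis of equality is being imposed.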

Confidence Intervals for the Population Variance
The confidence interval is based on the sample variance, $s^2$.
Assumed: the population is normally distributed.
The random variable
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$
follows a chi-square distribution with $(n - 1)$ degrees of freedom.
The chi-square value $\chi^2_{n-1,\,\alpha}$ denotes the number for which
$P(\chi^2_{n-1} > \chi^2_{n-1,\,\alpha}) = \alpha$

Confidence Intervals for the Population Variance
The $100(1 - \alpha)\%$ confidence interval for the population variance is
$\frac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}$

Example
You are testing the speed of a computer processor. You collect the following data (in MHz):
                 CPU x
Sample size      17
Sample mean      3004
Sample std dev   74
Assume the population is normal. Determine the 95% confidence interval for $\sigma_x^2$.

Finding the Chi-square Values
$n = 17$, so the chi-square distribution has $(n - 1) = 16$ degrees of freedom.
$\alpha = 0.05$, so use the chi-square values with area $\alpha/2 = .025$ in each tail:
$\chi^2_{16,\,.025} = 28.85$ (upper tail, probability $\alpha/2 = .025$)
$\chi^2_{16,\,.975} = 6.91$ (lower tail, probability $\alpha/2 = .025$)

Calculating the Confidence Limits
The 95% confidence interval is
$\frac{(17 - 1)(74)^2}{28.85} < \sigma^2 < \frac{(17 - 1)(74)^2}{6.91}$
$3037 < \sigma^2 < 12680$
Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz.
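The variance interval for the CPU example can be reproduced as follows. This is a sketch that takes the two chi-square values for 16 degrees of freedom (28.85 and 6.91) from a table as constants, since the Python standard library has no chi-square quantile function.

```python
from math import sqrt

n, s = 17, 74  # sample size and sample standard deviation (MHz)

# Chi-square table values for n - 1 = 16 degrees of freedom.
CHI2_UPPER = 28.85  # area 0.025 to the right
CHI2_LOWER = 6.91   # area 0.025 to the left

sum_of_squares = (n - 1) * s ** 2
var_low = sum_of_squares / CHI2_UPPER   # the larger chi-square value gives the lower limit
var_high = sum_of_squares / CHI2_LOWER

sd_low, sd_high = sqrt(var_low), sqrt(var_high)
print(f"{var_low:.0f} < variance < {var_high:.0f}")
print(f"{sd_low:.1f} < std dev < {sd_high:.1f}")
```

The interval is not symmetric around $s^2 = 5476$ because the chi-square distribution is skewed to the right.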

- END -
Lesson 9: Confidence Intervals and Tests of Hypothesis: Two or more samples