St. Edward’s University


Slides by John Loucks, St. Edward’s University

Chapter 11: Comparisons Involving Proportions and a Test of Independence
  - Inferences About the Difference Between Two Population Proportions
  - Hypothesis Test for Proportions of a Multinomial Population
  - Test of Independence

Inferences About the Difference Between Two Population Proportions
  - Interval Estimation of p1 - p2
  - Hypothesis Tests About p1 - p2

Inferences About the Difference Between Two Population Proportions
Let:
  $p_1$ denote the proportion for population 1
  $p_2$ denote the proportion for population 2
To make an inference about $p_1 - p_2$, we select two independent random samples consisting of $n_1$ units from population 1 and $n_2$ units from population 2.
Let:
  $\bar{p}_1$ denote the sample proportion for population 1
  $\bar{p}_2$ denote the sample proportion for population 2

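Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
Expected Value: $E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2$
Standard Deviation (Standard Error): $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
where:
  $n_1$ = size of sample taken from population 1
  $n_2$ = size of sample taken from population 2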

Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
If the sample sizes are large, the sampling distribution of $\bar{p}_1 - \bar{p}_2$ can be approximated by a normal probability distribution.
The sample sizes are sufficiently large if all of these conditions are met:
  $n_1 p_1 > 5$
  $n_1(1 - p_1) > 5$
  $n_2 p_2 > 5$
  $n_2(1 - p_2) > 5$
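
The four large-sample conditions above can be checked mechanically. The sketch below is only an illustration: the function name `large_sample_ok` is hypothetical, and because the population proportions are unknown in practice, it assumes the sample proportions are plugged in as stand-ins. The numbers are taken from the Market Research Associates example later in this chapter.

```python
def large_sample_ok(n1, p1, n2, p2, threshold=5):
    """Return True if n*p and n*(1 - p) exceed the threshold for both samples.

    p1 and p2 are assumed to be sample proportions standing in for the
    unknown population proportions.
    """
    checks = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    return all(value > threshold for value in checks)


# Market Research Associates: 120 of 250 households, then 60 of 150 households
print(large_sample_ok(250, 120 / 250, 150, 60 / 150))
```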

Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
[Figure not reproduced in this transcript]

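Interval Estimation of p1 - p2
Interval Estimate: $\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$
where:
  Point Estimate is $\bar{p}_1 - \bar{p}_2$
  Margin of Error is $z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$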

Interval Estimation of p1 - p2
Example: Market Research Associates
Market Research Associates is conducting research to evaluate the effectiveness of a client’s new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households “aware” of the client’s product. The new campaign has been initiated with TV and newspaper advertisements running for three weeks.
A survey conducted immediately after the new campaign showed 120 of 250 households “aware” of the client’s product. Do the data support the position that the advertising campaign has increased awareness of the client’s product?

Point Estimator of the Difference Between Two Population Proportions
  $p_1$ = proportion of the population of households “aware” of the product after the new campaign
  $p_2$ = proportion of the population of households “aware” of the product before the new campaign
  $\bar{p}_1$ = sample proportion of households “aware” of the product after the new campaign = 120/250 = .48
  $\bar{p}_2$ = sample proportion of households “aware” of the product before the new campaign = 60/150 = .40
  $\bar{p}_1 - \bar{p}_2 = .48 - .40 = .08$

Interval Estimation of p1 - p2
For $\alpha = .05$, $z_{.025} = 1.96$:
  $.08 \pm 1.96 \sqrt{\frac{.48(.52)}{250} + \frac{.40(.60)}{150}}$
  $.08 \pm 1.96(.0510)$
  $.08 \pm .10$
Hence, the 95% confidence interval for the difference in before and after awareness of the product is -.02 to +.18.
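
As a cross-check on the arithmetic above, the interval can be recomputed in a few lines of Python. This is a minimal sketch using only the standard library; the function name `diff_proportions_ci` and its argument names are illustrative and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Interval estimate of p1 - p2 from two independent samples.

    x1, x2 are the counts of the response of interest; n1, n2 the sample sizes.
    """
    p1_bar = x1 / n1
    p2_bar = x2 / n2
    point_estimate = p1_bar - p2_bar
    # Unpooled standard error, as used for interval estimation
    std_error = sqrt(p1_bar * (1 - p1_bar) / n1 + p2_bar * (1 - p2_bar) / n2)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * std_error
    return point_estimate, std_error, margin, (point_estimate - margin, point_estimate + margin)


# Survey after the campaign: 120 of 250; survey before the campaign: 60 of 150
point, se, margin, (lower, upper) = diff_proportions_ci(120, 250, 60, 150)
print(f"point estimate = {point:.3f}, std. error = {se:.4f}, margin = {margin:.4f}")
print(f"95% interval: {lower:.3f} to {upper:.3f}")
```

Because the interval contains zero, it is consistent with the hypothesis test result reported later in the chapter.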

Interval Estimation of p1 - p2
Excel Formula Worksheet

Row  A     B     C                  D                                 E
1    Sur2  Sur1                     Survey 2 (from Popul.1)           Survey 1 (from Popul.2)
2    No    Yes   Sample Size        250                               150
3                No. of "Yes"       =COUNTIF(A2:A251,"Yes")           =COUNTIF(B2:B151,"Yes")
4                Samp. Propor.      =D3/D2                            =E3/E2
5
6                Confid. Coeff.     0.95
7                Lev. Of Signif.    =1-D6
8                z Value            =NORM.S.INV(1-D7/2)
9
10               Std. Error         =SQRT(D4*(1-D4)/D2+E4*(1-E4)/E2)
11               Marg. of Error     =D8*D10
12
13               Pt. Est. of Diff.  =D4-E4
14               Lower Limit        =D13-D11
15               Upper Limit        =D13+D11

Note: Rows 16-251 are not shown.

Interval Estimation of p1 - p2
Excel Value Worksheet

Row  A     B     C                  D                        E
1    Sur2  Sur1                     Survey 2 (from Popul.1)  Survey 1 (from Popul.2)
2    No    Yes   Sample Size        250                      150
3                No. of "Yes"       120                      60
4                Samp. Propor.      0.48                     0.40
5
6                Confid. Coeff.     0.95
7                Lev. Of Signif.    0.05
8                z Value            1.960
9
10               Std. Error         0.0510
11               Marg. of Error     0.0999
12
13               Pt. Est. of Diff.  0.080
14               Lower Limit        -0.020
15               Upper Limit        0.180

Note: Rows 16-251 are not shown.

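Hypothesis Tests about p1 - p2
Hypotheses
We focus on tests involving no difference between the two population proportions (i.e., $p_1 = p_2$).
  Left-tailed:   $H_0: p_1 - p_2 \geq 0$    $H_a: p_1 - p_2 < 0$
  Right-tailed:  $H_0: p_1 - p_2 \leq 0$    $H_a: p_1 - p_2 > 0$
  Two-tailed:    $H_0: p_1 - p_2 = 0$       $H_a: p_1 - p_2 \neq 0$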

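Hypothesis Tests about p1 - p2
Pooled Estimate of Standard Error of $\bar{p}_1 - \bar{p}_2$:
  $s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}(1 - \bar{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$
where:
  $\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$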

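Hypothesis Tests about p1 - p2
Test Statistic:
  $z = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
where $\bar{p}$ is the pooled estimate of the common population proportion.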

Hypothesis Tests about p1 - p2
Example: Market Research Associates
Can we conclude, using a .05 level of significance, that the proportion of households aware of the client’s product increased after the new advertising campaign?

Hypothesis Tests about p1 - p2
p-Value and Critical Value Approaches
1. Develop the hypotheses.
  $H_0: p_1 - p_2 \leq 0$
  $H_a: p_1 - p_2 > 0$
  $p_1$ = proportion of the population of households “aware” of the product after the new campaign
  $p_2$ = proportion of the population of households “aware” of the product before the new campaign

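Hypothesis Tests about p1 - p2
p-Value and Critical Value Approaches
2. Specify the level of significance.
  $\alpha = .05$
3. Compute the value of the test statistic.
  $\bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45$
  $s_{\bar{p}_1 - \bar{p}_2} = \sqrt{.45(.55) \left( \frac{1}{250} + \frac{1}{150} \right)} = .0514$
  $z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56$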

Hypothesis Tests about p1 - p2
p-Value Approach
4. Compute the p-value.
  For z = 1.56, the p-value = .0594
5. Determine whether to reject $H_0$.
  Because p-value > $\alpha$ = .05, we cannot reject $H_0$.
  We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.

Hypothesis Tests about p1 - p2
Critical Value Approach
4. Determine the critical value and rejection rule.
  For $\alpha = .05$, $z_{.05} = 1.645$
  Reject $H_0$ if z > 1.645
5. Determine whether to reject $H_0$.
  Because 1.56 < 1.645, we cannot reject $H_0$.
  We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.
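
The upper-tail test worked through above can be reproduced with a short script. This is a sketch under the same setup as the example (hypothesized difference of zero, pooled standard error); the function name `two_proportion_z_test` is illustrative. Note that the unrounded z gives a p-value of about .0597, slightly different from the .0594 obtained from the rounded z = 1.56.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Upper-tail test of H0: p1 - p2 <= 0 using the pooled standard error."""
    p1_bar = x1 / n1
    p2_bar = x2 / n2
    # Pooled estimate of the common proportion: (n1*p1_bar + n2*p2_bar) / (n1 + n2)
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_bar - p2_bar) / std_error
    p_value = 1 - NormalDist().cdf(z)  # area in the upper tail
    return pooled, std_error, z, p_value


pooled, std_error, z, p_value = two_proportion_z_test(120, 250, 60, 150)
print(f"pooled = {pooled:.3f}, std. error = {std_error:.4f}")
print(f"z = {z:.3f}, upper-tail p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```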

Hypothesis Tests about p1 - p2
Excel Formula Worksheet

Row  A     B     C                     D                                 E
1    Sur2  Sur1                        Survey 2 (from Popul.1)           Survey 1 (from Popul.2)
2    No    Yes   Sample Size           =COUNTA(A2:A251)                  =COUNTA(B2:B151)
3                Resp. of Interest
4                Count for Resp.       =COUNTIF(A2:A251,D3)              =COUNTIF(B2:B151,E3)
5                Sample Propor.        =D4/D2                            =E4/E2
6
7                Hypoth. Value
8                Point Est. of Diff.   =D5-E5
9
10               Pooled Est. of p      =(D2*D5+E2*E5)/(D2+E2)
11               Standard Error        =SQRT(D10*(1-D10)*(1/D2+1/E2))
12               Test Statistic        =(D8-D7)/D11
13
14               p-Value (lower tail)  =NORM.S.DIST(D12,TRUE)
15               p-Value (upper tail)  =1-NORM.S.DIST(D12,TRUE)
16               p-Value (two tail)    =2*MIN(D14,D15)

Note: Rows 17-251 are not shown.

Hypothesis Tests about p1 - p2
Excel Value Worksheet

Row  A     B     C                     D                        E
1    Sur2  Sur1                        Survey 2 (from Popul.1)  Survey 1 (from Popul.2)
2    No    Yes   Sample Size           250                      150
3                Resp. of Interest
4                Count for Resp.       120                      60
5                Sample Propor.        0.48                     0.40
6
7                Hypoth. Value
8                Point Est. of Diff.   0.08
9
10               Pooled Est. of p      0.450
11               Standard Error        0.0514
12               Test Statistic        1.557
13
14               p-Value (lower tail)  0.940
15               p-Value (upper tail)  0.060
16               p-Value (two tail)    0.120

Note: Rows 17-251 are not shown.

Hypothesis Test for Proportions of a Multinomial Population
In this case, each element of a population is assigned to one and only one of several classes or categories. Such a population is a multinomial population.
The multinomial distribution can be thought of as an extension of the binomial distribution.
On each trial of a multinomial experiment:
  - One and only one of the outcomes occurs
  - Each trial is assumed to be independent
  - The probabilities of the outcomes remain the same for each trial

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
1. State the null and alternative hypotheses.
  $H_0$: The population follows a multinomial distribution with specified probabilities for each of the k categories
  $H_a$: The population does not follow a multinomial distribution with specified probabilities for each of the k categories

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
2. Select a random sample and record the observed frequency, $f_i$, for each of the k categories.
3. Assuming $H_0$ is true, compute the expected frequency, $e_i$, in each category by multiplying the category probability by the sample size.

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
4. Compute the value of the test statistic.
  $\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$
where:
  $f_i$ = observed frequency for category i
  $e_i$ = expected frequency for category i
  k = number of categories
Note: The test statistic has a chi-square distribution with k - 1 df provided that the expected frequencies are 5 or more for all categories.

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population
5. Rejection rule:
  p-value approach: Reject $H_0$ if p-value < $\alpha$
  Critical value approach: Reject $H_0$ if $\chi^2 > \chi^2_\alpha$
where $\alpha$ is the significance level and there are k - 1 degrees of freedom

Multinomial Distribution Goodness of Fit Test
Example: Finger Lakes Homes (A)
Finger Lakes Homes manufactures four models of prefabricated homes: a two-story colonial, a log cabin, a split-level, and an A-frame. To help in production planning, management would like to determine if previous customer purchases indicate that there is a preference in the style selected.
The number of homes sold of each model for 100 sales over the past two years is shown below.

Model    Colonial  Log  Split-Level  A-Frame
# Sold   30        20   35           15

Multinomial Distribution Goodness of Fit Test
Hypotheses
  $H_0$: $p_C = p_L = p_S = p_A = .25$
  $H_a$: The population proportions are not $p_C = .25$, $p_L = .25$, $p_S = .25$, and $p_A = .25$
where:
  $p_C$ = population proportion that purchase a colonial
  $p_L$ = population proportion that purchase a log cabin
  $p_S$ = population proportion that purchase a split-level
  $p_A$ = population proportion that purchase an A-frame

Multinomial Distribution Goodness of Fit Test
Rejection Rule
With $\alpha = .05$ and k - 1 = 4 - 1 = 3 degrees of freedom, reject $H_0$ if p-value < .05 or $\chi^2 > 7.815$.
[Figure: chi-square distribution; do not reject $H_0$ to the left of 7.815, reject $H_0$ to the right]

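Multinomial Distribution Goodness of Fit Test
Expected Frequencies
  $e_1 = .25(100) = 25$    $e_2 = .25(100) = 25$
  $e_3 = .25(100) = 25$    $e_4 = .25(100) = 25$
Test Statistic
  $\chi^2 = \frac{(30 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(35 - 25)^2}{25} + \frac{(15 - 25)^2}{25} = 1 + 1 + 4 + 4 = 10$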

Multinomial Distribution Goodness of Fit Test
Conclusion Using the p-Value Approach

Area in Upper Tail          .10    .05    .025   .01     .005
$\chi^2$ Value (df = 3)     6.251  7.815  9.348  11.345  12.838

Because $\chi^2 = 10$ is between 9.348 and 11.345, the area in the upper tail of the distribution is between .025 and .01.
The p-value < $\alpha$. We can reject the null hypothesis.

Multinomial Distribution Goodness of Fit Test
Conclusion Using the Critical Value Approach
  $\chi^2 = 10 > 7.815$
We reject, at the .05 level of significance, the assumption that there is no home style preference.
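
The goodness of fit calculation for Finger Lakes Homes (A) can be sketched in plain Python. The helper names `chi2_upper_tail` and `goodness_of_fit` are illustrative, not part of the slides. To stay within the standard library, the upper-tail area is computed from the closed-form recurrence for integer degrees of freedom, which is an implementation choice made here; a statistics library would normally supply this.

```python
from math import erfc, exp, gamma, sqrt


def chi2_upper_tail(x, df):
    """Upper-tail area of a chi-square distribution for integer df >= 1 and x >= 0."""
    y = x / 2
    if df % 2 == 1:
        tail, d = erfc(sqrt(y)), 1  # df = 1 base case
    else:
        tail, d = exp(-y), 2        # df = 2 base case
    # Step up two degrees of freedom at a time until df is reached
    while d < df:
        d += 2
        tail += y ** (d / 2 - 1) * exp(-y) / gamma(d / 2)
    return tail


def goodness_of_fit(observed, probabilities):
    """Chi-square goodness of fit test for a multinomial population."""
    n = sum(observed)
    expected = [p * n for p in probabilities]
    statistic = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    df = len(observed) - 1
    return expected, statistic, df, chi2_upper_tail(statistic, df)


# Colonial, Log, Split-Level, A-Frame; H0: each proportion is .25
expected, chi2, df, p_value = goodness_of_fit([30, 20, 35, 15], [0.25] * 4)
print(f"expected = {expected}")
print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
```

The computed p-value of about .0186 falls between .01 and .025, in line with the table lookup above.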

Multinomial Distribution Goodness of Fit Test Excel Worksheet (showing data) Note: Rows 13-101 are not shown.

Multinomial Distribution Goodness of Fit Test
Excel Formula Worksheet

Row  C        D               E                        F          G       H      I
1             Hyp.            Observed                 Expect.            Sq'd.  Sq.Diff./
2    Categ.   Prop.           Frequency                Freq.      Diff.   Diff.  Exp.Freq.
3    Col.     0.25            =COUNTIF(B2:B101,"Col")  =D3*$E$7   =E3-F3  =G3^2  =H3/F3
4    Log      0.25            =COUNTIF(B2:B101,"Log")  =D4*$E$7   =E4-F4  =G4^2  =H4/F4
5    Split-L  0.25            =COUNTIF(B2:B101,"Spl")  =D5*$E$7   =E5-F5  =G5^2  =H5/F5
6    A-Fr.    0.25            =COUNTIF(B2:B101,"Afr")  =D6*$E$7   =E6-F6  =G6^2  =H6/F6
7    Total                    =SUM(E3:E6)                                        =SUM(I3:I6)
8
9             Categories      4
10            Test Statistic  =I7
11            Degr. of Free.  =E9-1
12            p-Value         =CHISQ.DIST.RT(E10,E11)

Note: Columns A-B and rows 13-101 are not shown.

Multinomial Distribution Goodness of Fit Test
Excel Value Worksheet

Row  C        D               E          F        G      H      I
1             Hyp.            Observed   Expect.         Sq'd.  Sq.Diff./
2    Categ.   Prop.           Frequency  Freq.    Diff.  Diff.  Exp.Freq.
3    Col.     0.25            30         25       5      25     1
4    Log      0.25            20         25       -5     25     1
5    Split-L  0.25            35         25       10     100    4
6    A-Fr.    0.25            15         25       -10    100    4
7    Total                    100                               10
8
9             Categories      4
10            Test Statistic  10
11            Degr. of Free.  3
12            p-Value         0.0186

Note: Columns A-B and rows 13-101 are not shown.

Test of Independence
Another important application of the chi-square distribution involves using sample data to test for the independence of two variables.
To test whether two variables are independent, one sample is selected and crosstabulation is used to summarize the data for the two variables simultaneously.

Test of Independence
1. Set up the null and alternative hypotheses.
  $H_0$: The column variable is independent of the row variable
  $H_a$: The column variable is not independent of the row variable
2. Select a random sample and record the observed frequency, $f_{ij}$, for each cell of the contingency table.
3. Compute the expected frequency, $e_{ij}$, for each cell.
  $e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}}$

Test of Independence
4. Compute the test statistic.
  $\chi^2 = \sum_{i} \sum_{j} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$
5. Determine the rejection rule.
  Reject $H_0$ if p-value < $\alpha$ or $\chi^2 > \chi^2_\alpha$
where $\alpha$ is the significance level and, with n rows and m columns, there are (n - 1)(m - 1) degrees of freedom.

Test of Independence
Example: Finger Lakes Homes (B)
Each home sold by Finger Lakes Homes can be classified according to price and to style. Finger Lakes’ manager would like to determine if the price of the home and the style of the home are independent variables.
The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either $200,000 or less or more than $200,000.

Price        Colonial  Log  Split-Level  A-Frame
≤ $200,000   18        6    19           12
> $200,000   12        14   16           3

Test of Independence
Hypotheses
  $H_0$: Price of the home is independent of the style of the home that is purchased
  $H_a$: Price of the home is not independent of the style of the home that is purchased

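Test of Independence
Expected Frequencies
Observed counts are shown with the expected frequencies in parentheses.

Price    Colonial    Log        Split-Level  A-Frame    Total
≤ $200K  18 (16.50)  6 (11.00)  19 (19.25)   12 (8.25)  55
> $200K  12 (13.50)  14 (9.00)  16 (15.75)   3 (6.75)   45
Total    30          20         35           15         100

Each expected frequency is (row total)(column total)/(sample size); for example, $e_{11} = (55)(30)/100 = 16.5$.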

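Test of Independence
Rejection Rule
With $\alpha = .05$ and (2 - 1)(4 - 1) = 3 d.f., reject $H_0$ if p-value < .05 or $\chi^2 > 7.815$.
Test Statistic
  $\chi^2 = \frac{(18 - 16.5)^2}{16.5} + \frac{(6 - 11)^2}{11} + \cdots + \frac{(3 - 6.75)^2}{6.75} = .1364 + 2.2727 + \cdots + 2.0833 = 9.149$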

Test of Independence
Conclusion Using the p-Value Approach

Area in Upper Tail          .10    .05    .025   .01     .005
$\chi^2$ Value (df = 3)     6.251  7.815  9.348  11.345  12.838

Because $\chi^2 = 9.149$ is between 7.815 and 9.348, the area in the upper tail of the distribution is between .05 and .025.
The p-value < $\alpha$. We can reject the null hypothesis.

Test of Independence
Conclusion Using the Critical Value Approach
  $\chi^2 = 9.149 > 7.815$
We reject, at the .05 level of significance, the assumption that the price of the home is independent of the style of home that is purchased.
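
The full test of independence for Finger Lakes Homes (B) can be sketched the same way: derive the expected frequencies from the row and column totals, then sum the squared differences. The names `independence_test` and `chi2_upper_tail` are illustrative, and the standard-library tail-area helper is an implementation choice made here rather than anything taken from the slides.

```python
from math import erfc, exp, gamma, sqrt


def chi2_upper_tail(x, df):
    """Upper-tail area of a chi-square distribution for integer df >= 1 and x >= 0."""
    y = x / 2
    if df % 2 == 1:
        tail, d = erfc(sqrt(y)), 1  # df = 1 base case
    else:
        tail, d = exp(-y), 2        # df = 2 base case
    while d < df:
        d += 2
        tail += y ** (d / 2 - 1) * exp(-y) / gamma(d / 2)
    return tail


def independence_test(table):
    """Chi-square test of independence for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # e_ij = (row i total)(column j total) / sample size
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (f - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for f, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, statistic, df, chi2_upper_tail(statistic, df)


# Rows: $200,000 or less, more than $200,000
# Columns: Colonial, Log, Split-Level, A-Frame
observed = [[18, 6, 19, 12], [12, 14, 16, 3]]
expected, chi2, df, p_value = independence_test(observed)
print(f"expected = {expected}")
print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
```

The p-value of about .0274 agrees with the CHISQ.TEST result shown in the Excel value worksheet for this example.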

Test of Independence
Excel Worksheet (showing data)
Columns: Home, Price ($), Style. Each row records one home sold, for example >200K / Colonial or <=200K / Log.
Note: Rows 11-101 are not shown.

Test of Independence
Excel Worksheet (showing Pivot Table)

Row  E              F           G    H           I        J
1    Count of Home  Preference
2    Price ($)      Colonial    Log  Split-Lev.  A-Frame  Grand Tot.
3    <=200K         18          6    19          12       55
4    >200K          12          14   16          3        45
5    Grand Total    30          20   35          15       100

Note: Columns A-D are not shown.

Test of Independence
Excel Formula Worksheet

Row  E                     F           G           H           I           J
1    Count of Home         Preference
2    Price ($)             Colonial    Log         Split-Lev.  A-Frame     Grand Tot.
3    <=200K                18          6           19          12          55
4    >200K                 12          14          16          3           45
5    Grand Total           30          20          35          15          100
6
7    Expected Frequencies
8
9                          =F5*J3/J5   =G5*J3/J5   =H5*J3/J5   =I5*J3/J5
10                         =F5*J4/J5   =G5*J4/J5   =H5*J4/J5   =I5*J4/J5

p-Value: =CHISQ.TEST(F3:I4,F9:I10)

Note: Columns A-D are not shown.

Test of Independence
Excel Value Worksheet

Row  E                     F           G      H           I        J
1    Count of Home         Preference
2    Price ($)             Colonial    Log    Split-Lev.  A-Frame  Grand Tot.
3    <=200K                18          6      19          12       55
4    >200K                 12          14     16          3        45
5    Grand Total           30          20     35          15       100
6
7    Expected Frequencies
8
9                          16.50       11.00  19.25       8.25
10                         13.50       9.00   15.75       6.75

p-Value: 0.0274

Note: Columns A-D are not shown.

End of Chapter 11