Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 16.1 Chapter 16 Chi-Squared Tests.

Presentation transcript:


A Common Theme…

What to do? | Data type? | Number of categories? | Statistical technique
Describe a population | Nominal | Two or more | $\chi^2$ goodness-of-fit test
Compare two populations | Nominal | Two or more | $\chi^2$ test of a contingency table
Compare two or more populations | Nominal | -- | $\chi^2$ test of a contingency table
Analyze relationship between two variables | Nominal | -- | $\chi^2$ test of a contingency table

One data type… two techniques.

Two Techniques… The first is a goodness-of-fit test applied to data produced by a multinomial experiment (a generalization of a binomial experiment); it is used to describe one population of data. The second uses data arranged in a contingency table to determine whether two classifications of a population of nominal data are statistically independent; this test can also be interpreted as a comparison of two or more populations. In both cases, we use the chi-squared ($\chi^2$) distribution.

The Multinomial Experiment… Unlike a binomial experiment, which has only two possible outcomes (e.g. heads or tails), a multinomial experiment has these properties:
- It consists of a fixed number, $n$, of trials.
- Each trial can have one of $k$ outcomes, called cells.
- Each probability $p_i$ remains constant.
- Our usual notion of probabilities holds, namely $p_1 + p_2 + \ldots + p_k = 1$.
- Each trial is independent of the other trials.
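The properties above can be illustrated with a small simulation. This is only a sketch using Python's standard library; the function name `simulate_multinomial` and the fixed seed are assumptions made for illustration, and the cell probabilities are borrowed from the market shares in Example 16.1.

```python
import random
from collections import Counter

def simulate_multinomial(n, probs, seed=0):
    """Run n independent trials; each trial lands in cell i with probability probs[i].

    Hypothetical helper for illustration only.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("cell probabilities must sum to 1")
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    cells = range(len(probs))          # cells are labelled 0 .. k-1
    outcomes = rng.choices(cells, weights=probs, k=n)
    counts = Counter(outcomes)
    return [counts.get(i, 0) for i in cells]

# k = 3 cells, n = 200 trials, constant probabilities on every trial
counts = simulate_multinomial(200, [0.45, 0.40, 0.15])
print(counts, sum(counts))             # the cell counts always add up to n
```

The observed counts from one run play the role of the frequencies $f_i$ that the goodness-of-fit test compares with the expected frequencies.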

Chi-squared Goodness-of-Fit Test… We test whether there is sufficient evidence to reject a specified set of values for $p_i$. To illustrate, our null hypothesis is: $H_0: p_1 = a_1, p_2 = a_2, \ldots, p_k = a_k$ (where $a_1, a_2, \ldots, a_k$ are the values of interest). Our research hypothesis is: $H_1$: at least one $p_i \neq a_i$.

Chi-squared Goodness-of-Fit Test… The test compares the actual (observed) frequency with the expected frequency of occurrences in each cell. Example 16.1… We compare market shares before and after an advertising campaign to see if there is a difference (i.e. if the advertising was effective in improving market share). $H_0: p_1 = a_1, p_2 = a_2, \ldots, p_k = a_k$, where $a_i$ is the market share before the campaign. If there was no change, we would expect not to reject $H_0$. If there is evidence to reject $H_0$ in favor of $H_1$: at least one $p_i \neq a_i$, what is a logical conclusion?

Example 16.1 (IDENTIFY)… Market shares before the advertising campaign: Company A, 45%; Company B, 40%; All Others, 15%. After the campaign, 200 customers were surveyed. The results: Company A, 102 customers preferred its product; Company B, 82 customers; All Others, 16 customers. Based on the pre-campaign shares, we would expect 45% of 200 customers (i.e. 90 customers) to prefer Company A's product. After the campaign, we observe 102 such customers. Does this mean the campaign was effective (at a 5% significance level)?

Example 16.1… Observed frequencies: A 102, B 82, Others 16. Expected frequencies: A 90, B 80, Others 30. Are these changes statistically significant?

Example 16.1 (IDENTIFY)… Our null hypothesis is: $H_0: p_{\text{CompanyA}} = .45, p_{\text{CompanyB}} = .40, p_{\text{Others}} = .15$ (i.e. the market shares pre-campaign), and our alternative hypothesis is: $H_1$: at least one $p_i \neq a_i$. To complete our hypothesis test we need a test statistic and a rejection region…

Chi-squared Goodness-of-Fit Test… Our chi-squared goodness-of-fit test statistic is given by: $\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$, where $f_i$ is the observed frequency and $e_i$ is the expected frequency of cell $i$. Note: this statistic is approximately chi-squared distributed with $k - 1$ degrees of freedom, provided the sample size is large. The rejection region is: $\chi^2 > \chi^2_{\alpha, k-1}$.

Example 16.1 (COMPUTE)… To calculate our test statistic, we lay out the data in a table for easier calculation by hand:

Company | Observed frequency $f_i$ | Expected frequency $e_i$ | Delta $(f_i - e_i)$ | Summation component $(f_i - e_i)^2 / e_i$
A | 102 | 90 | 12 | 1.60
B | 82 | 80 | 2 | 0.05
Others | 16 | 30 | -14 | 6.53
Total | 200 | 200 | | 8.18

Check that the observed and expected totals are equal.
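The hand calculation for Example 16.1 can be sketched in a few lines of Python. The observed counts and pre-campaign shares come from the example; the variable names are illustrative choices, not part of the original slides.

```python
# Observed counts after the campaign and pre-campaign market shares (Example 16.1)
observed = {"A": 102, "B": 82, "Others": 16}
shares = {"A": 0.45, "B": 0.40, "Others": 0.15}

n = sum(observed.values())                            # sample size: 200 customers
expected = {c: n * p for c, p in shares.items()}      # e_i = n * p_i

# One summation component per cell: (f_i - e_i)^2 / e_i
components = {
    c: (observed[c] - expected[c]) ** 2 / expected[c] for c in observed
}
chi2 = sum(components.values())

for company in observed:
    print(company, observed[company], round(expected[company], 2),
          round(components[company], 2))
print("chi-squared statistic:", round(chi2, 2))       # 8.18
```

The total of the expected frequencies equals the total of the observed frequencies (200), which is the check called for in the table.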

Example 16.1 (INTERPRET)… Our rejection region is: $\chi^2 > \chi^2_{\alpha, k-1} = \chi^2_{.05, 2} = 5.99$. Since our test statistic is 8.18, which is greater than this critical value, we reject $H_0$ in favor of $H_1$; that is, "There is sufficient evidence to infer that the proportions have changed since the advertising campaigns were implemented."

Example 16.1 (COMPUTE)… Note: Table 5 in Appendix B does not allow for the direct calculation of the p-value, so we have to use Excel.

Example 16.1… Note: There are a couple of different ways to calculate the p-value of the test: it can be computed manually from our table, or computed directly from the data.
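The slides compute the p-value in Excel. As an alternative sketch, not the slides' method: with exactly 2 degrees of freedom (k = 3 cells), the upper tail of the chi-squared distribution has the closed form $P(\chi^2_2 > x) = e^{-x/2}$, so both the p-value and the critical value for Example 16.1 can be obtained with the standard library alone. This shortcut applies only to 2 degrees of freedom.

```python
import math

chi2_stat = 8.18   # test statistic from Example 16.1
alpha = 0.05       # significance level
df = 3 - 1         # k - 1 degrees of freedom for k = 3 cells

# Closed form valid only for df == 2: P(chi2_2 > x) = exp(-x / 2)
assert df == 2
p_value = math.exp(-chi2_stat / 2)

# Inverting the same formula gives the critical value: x = -2 * ln(alpha)
critical_value = -2 * math.log(alpha)

reject_h0 = chi2_stat > critical_value
print("p-value:", round(p_value, 4))
print("critical value:", round(critical_value, 2))
print("reject H0:", reject_h0)
```

Comparing the p-value with the significance level and comparing the statistic with the critical value are two views of the same decision, so they always agree.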

Required Conditions… To use this technique, the sample size must be large enough that the expected value for each cell is 5 or more (i.e. $n p_i \geq 5$). If the expected frequency of a cell is less than five, combine it with other cells to satisfy the condition.
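The rule can be sketched as a small helper. The slides do not say which cells to combine, so the merging strategy here (repeatedly merging the cell with the smallest expected count into the next smallest) is an assumption, as are the function name `combine_small_cells` and the sample numbers, which are made up for illustration. In practice, cells are usually combined where the merged category still makes sense.

```python
def combine_small_cells(observed, probs, minimum=5.0):
    """Merge cells until every expected count n * p_i is at least `minimum`.

    Hypothetical helper: merges the smallest cell into the next smallest.
    Returns the merged observed counts and probabilities, sorted by probability.
    """
    n = sum(observed)
    cells = sorted(zip(observed, probs), key=lambda cell: cell[1])
    while len(cells) > 1 and n * cells[0][1] < minimum:
        (f1, p1), (f2, p2) = cells[0], cells[1]
        cells = [(f1 + f2, p1 + p2)] + cells[2:]
        cells.sort(key=lambda cell: cell[1])
    return [f for f, _ in cells], [p for _, p in cells]

# Made-up data: n = 100, expected counts are 60, 30, 6 and 4 (the last is below 5)
merged_f, merged_p = combine_small_cells([58, 31, 8, 3], [0.60, 0.30, 0.06, 0.04])
print(merged_f, merged_p)
print("degrees of freedom after merging:", len(merged_f) - 1)
```

Merging cells reduces $k$, so the degrees of freedom $k - 1$ shrink accordingly.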

Identifying Factors… Factors that identify the chi-squared goodness-of-fit test: $e_i = n p_i$

Chi-squared Test of a Contingency Table… The chi-squared test of a contingency table is used to determine whether there is enough evidence to infer that two nominal variables are related, and to infer that differences exist among two or more populations of nominal variables. To use these techniques, we need to classify the data according to two different criteria.

Example 16.2 (IDENTIFY)… The demand for an MBA program's optional courses and majors is quite variable from year to year. The research hypothesis is that the academic background of the students (i.e. their undergraduate degrees) affects their choice of major. A random sample of data on last year's MBA students was collected and summarized in a contingency table…

Example 16.2: The Data… Contingency table of Undergrad Degree (rows: BA, BEng, BBA, Other, Total) by MBA Major (columns: Accounting, Finance, Marketing, Total).

Example 16.2… Again, we are interested in determining whether or not the academic background of the students affects their choice of MBA major. Thus our research hypothesis is: $H_1$: The two variables are dependent. Our null hypothesis, then, is: $H_0$: The two variables are independent.

Example 16.2… In this case, our test statistic is: $\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$ (where $k$ is the number of cells in the contingency table, i.e. rows × columns). Our rejection region is: $\chi^2 > \chi^2_{\alpha, \nu}$, where the number of degrees of freedom is $\nu = (r - 1)(c - 1)$.

Example 16.2 (COMPUTE)… To calculate our test statistic, we need to calculate the expected frequency for each cell. The expected frequency of the cell in row $i$ and column $j$ is: $e_{ij} = \frac{(\text{row } i \text{ total})(\text{column } j \text{ total})}{\text{sample size}}$

Contingency Table Set-up…

Example 16.2 (COMPUTE)… Compute expected frequencies. For the cell in row 2 (BEng) and column 3 (Marketing): $e_{23} = (31)(47)/152 = 9.59$; compare this to $f_{23} = 7$.
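The expected-frequency rule (row total times column total, divided by the sample size) can be sketched as follows. The function names are illustrative assumptions. The single-cell check uses the totals quoted for Example 16.2 (31, 47 and 152), while the small 2 × 2 table is made-up data, since the full table values are not reproduced in this transcript.

```python
def expected_frequency(row_total, col_total, n):
    """Expected count for one cell under independence."""
    return row_total * col_total / n

def expected_table(observed):
    """Expected counts for every cell of a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[expected_frequency(r, c, n) for c in col_totals] for r in row_totals]

# Cell (2, 3) of Example 16.2: row total 31, column total 47, sample size 152
e_23 = expected_frequency(31, 47, 152)
print(round(e_23, 2))                      # 9.59

# Made-up 2 x 2 table, for illustration only
hypothetical = [[20, 30], [30, 20]]
print(expected_table(hypothetical))        # every expected count is 25.0
```

Expected counts preserve the row and column totals of the observed table, which is a useful sanity check.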

Example 16.2… We can now compare observed with expected frequencies for each combination of Undergrad Degree (BA, BEng, BBA, Other) and MBA Major (Accounting, Finance, Marketing), and calculate our test statistic.

Example 16.2 (INTERPRET)… We compare our test statistic $\chi^2$ with the critical value $\chi^2_{\alpha, \nu}$, where $\nu = (r - 1)(c - 1) = (4 - 1)(3 - 1) = 6$. Since our test statistic falls into the rejection region, we reject $H_0$: The two variables are independent, in favor of $H_1$: The two variables are dependent. That is, there is evidence of a relationship between undergrad degree and MBA major.
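The whole contingency-table procedure can be sketched end to end. The table below is made-up data, not the Example 16.2 data (those cell values are not reproduced in this transcript), and the function names are illustrative assumptions. For the p-value the sketch uses a closed form that holds only when the degrees of freedom are even: $P(\chi^2_{2m} > x) = e^{-x/2} \sum_{j=0}^{m-1} (x/2)^j / j!$. For odd degrees of freedom a statistics library or table is needed.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with df degrees of freedom > x), valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms

def contingency_test(observed):
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f in enumerate(row):
            e = row_totals[i] * col_totals[j] / n     # expected frequency e_ij
            chi2 += (f - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
    return chi2, df

# Hypothetical 3 x 2 table, for illustration only
table = [[30, 10], [20, 20], [10, 30]]
chi2, df = contingency_test(table)
p_value = chi2_upper_tail_even_df(chi2, df)
print("chi-squared:", round(chi2, 2), "df:", df, "p-value:", p_value)
print("reject H0 at 5%:", p_value < 0.05)
```

A table with 4 rows and 3 columns, as in Example 16.2, has 6 degrees of freedom, which is even, so the same closed form would apply there.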

Example 16.2 (COMPUTE)… We can also use the tools in Excel to process our data: Tools > Data Analysis Plus > Contingency Table. Compare the p-value.

Required Condition: Rule of Five… In a contingency table where one or more cells have expected values of less than 5, we need to combine rows or columns to satisfy the rule of five. Note: combining rows or columns changes the degrees of freedom as well.

Identifying Factors… Factors that identify the chi-squared test of a contingency table: