Contingency Tables: Independence and Homogeneity

Presentation transcript:

Section 11-3 Contingency Tables: Independence and Homogeneity

Contingency Table (or two-way frequency table) A contingency table is a table in which frequencies correspond to two variables. (One variable is used to categorize rows, and a second variable is used to categorize columns.) Contingency tables have at least two rows and at least two columns. (page 606 of Elementary Statistics, 10th Edition)

Case-Control Study of Motorcycle Drivers Is the color of the motorcycle helmet somehow related to the risk of crash-related injuries?

                            Black   White   Yellow/Orange   Row Totals
Controls (not injured)      491     377     31              899
Cases (injured or killed)   213     112     8               333
Column Totals               704     489     39              1232

(page 606 of Elementary Statistics, 10th Edition)

Test of Independence A test of independence tests the null hypothesis that there is no association between the row variable and the column variable in a contingency table. (For the null hypothesis, we will use the statement that “the row and column variables are independent.”) page 607 of Elementary Statistics, 10th Edition

(H) Hypothesis Statements The null hypothesis H0 is the statement that the row and column variables are independent; the alternative hypothesis H1 is the statement that the row and column variables are dependent.

(A) Assumptions/Requirements The sample data are randomly selected and are represented as frequency counts in a two-way table. For every cell in the contingency table, the expected frequency E is at least 5. (There is no requirement that every observed frequency must be at least 5. Also there is no requirement that the population must have a normal distribution or any other specific distribution.) page 607 of Elementary Statistics, 10th Edition

Expected Frequency $E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})}$, where the grand total is the total number of all observed frequencies in the table. (page 607 of Elementary Statistics, 10th Edition)
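
As a minimal sketch of the expected-frequency formula, the Python below computes E for every cell from the row and column totals. The function name `expected_counts` and the nested-list layout are illustrative assumptions, not part of the slides; the counts are the helmet-color data from the case-control slide.

```python
def expected_counts(observed):
    """Return E = (row total)(column total) / (grand total) for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Observed counts from the helmet-color table:
# rows = controls, cases; columns = black, white, yellow/orange.
observed = [[491, 377, 31],
            [213, 112, 8]]

expected = expected_counts(observed)
print(round(expected[0][0], 3))  # upper-left cell: 513.714
```

Working from totals rather than hard-coding them keeps the sketch usable for any table with at least two rows and two columns.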

(T) Test of Independence Test Statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is the observed frequency and E is the expected frequency of each cell. Critical Values: 1. Found in Table A-4 using degrees of freedom = (r – 1)(c – 1), where r is the number of rows and c is the number of columns. 2. Tests of independence are always right-tailed. This is the same chi-square formula as for multinomial tables. (page 607 of Elementary Statistics, 10th Edition)
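
The test statistic and degrees of freedom can be sketched in a few lines of Python. This is an illustration under the assumption that the table is given as a list of rows; `chi_square_statistic` is a hypothetical helper name, and the data are the helmet-color counts from the case-control slide.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence for cell (i, j).
            e = row_totals[i] * col_totals[j] / grand_total
            stat += (o - e) ** 2 / e

    # df = (r - 1)(c - 1)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


# Helmet-color data: rows = controls, cases; columns = black, white, yellow/orange.
stat, df = chi_square_statistic([[491, 377, 31],
                                 [213, 112, 8]])
print(round(stat, 3), df)
```

For the helmet data the result should land close to the value of 8.775 with 2 degrees of freedom reported later in the slides.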

(S) Statement The conclusion is stated in the same way as for other hypothesis tests.

Test of Independence This procedure cannot be used to establish a direct cause-and-effect link between the variables in question. Dependence means only that there is a relationship between the two variables.

Find the Expected Counts:

                            Black   White   Yellow/Orange   Row Totals
Controls (not injured)      491     377     31              899
Cases (injured or killed)   213     112     8               333
Column Totals               704     489     39              1232

For the upper left-hand cell, the row total is 899, the column total is 704, and the grand total is 1232: $E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})} = \frac{(899)(704)}{1232} = 513.714$

Find the Expected Counts (expected count shown in parentheses):

                            Black           White   Yellow/Orange   Row Totals
Controls (not injured)      491 (513.714)   377     31              899
Cases (injured or killed)   213             112     8               333
Column Totals               704             489     39              1232

$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})} = \frac{(899)(704)}{1232} = 513.714$

Find the Expected Counts (expected counts shown in parentheses):

                            Black           White           Yellow/Orange   Row Totals
Controls (not injured)      491 (513.714)   377 (356.827)   31 (28.459)     899
Cases (injured or killed)   213 (190.286)   112 (132.173)   8 (10.541)      333
Column Totals               704             489             39              1232

Calculate the expected count for every cell. To interpret this result for the upper left-hand cell, we can say that although 491 riders with black helmets were not injured, we would have expected the number to be 513.714 if crash-related injuries are independent of helmet color.
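
With all expected counts in hand, the requirement that every expected frequency be at least 5 can be checked mechanically. The sketch below is one way to do that; `smallest_expected` is a hypothetical helper name, and the counts are the helmet-color data from the slides.

```python
def smallest_expected(observed):
    """Return the smallest expected count in a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return min(r * c / grand_total for r in row_totals for c in col_totals)


# Helmet-color data: rows = controls, cases; columns = black, white, yellow/orange.
observed = [[491, 377, 31],
            [213, 112, 8]]

min_e = smallest_expected(observed)
requirement_met = min_e >= 5  # every expected count must be at least 5
print(round(min_e, 3), requirement_met)
```

Only the smallest expected count needs to be inspected: if it is at least 5, every cell satisfies the requirement. Note that the check applies to expected counts, not observed counts.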

Case-Control Study of Motorcycle Drivers Using a 0.05 significance level, test the claim that group (control or case) is independent of the helmet color. H0: Whether a subject is in the control group or case group is independent of the helmet color. (Injuries are independent of helmet color.) H1: The group and helmet color are dependent.

Case-Control Study of Motorcycle Drivers (expected counts shown in parentheses)

                            Black           White           Yellow/Orange   Row Totals
Controls (not injured)      491 (513.714)   377 (356.827)   31 (28.459)     899
Cases (injured or killed)   213 (190.286)   112 (132.173)   8 (10.541)      333
Column Totals               704             489             39              1232

Case-Control Study of Motorcycle Drivers H0: Row and column variables are independent. H1: Row and column variables are dependent. The test statistic is $\chi^2 = 8.775$, with $\alpha = 0.05$. The number of degrees of freedom is (r–1)(c–1) = (2–1)(3–1) = 2. The critical value (from Table A-4) is $\chi^2_{0.05,2} = 5.991$. The test statistic must be compared with this critical value.
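
For exactly 2 degrees of freedom, the right-tail area of the chi-square distribution has the closed form exp(-x/2), so the comparison can be sketched without a table. The code below relies on that special case only; for other degrees of freedom a table such as Table A-4 or a statistics library is needed. The function name `chi2_right_tail_df2` is an illustrative assumption.

```python
import math


def chi2_right_tail_df2(x):
    """Right-tail area P(chi-square > x), valid only for 2 degrees of freedom."""
    return math.exp(-x / 2)


alpha = 0.05
test_statistic = 8.775  # value reported in the slides

# Critical value: solve exp(-x / 2) = alpha for x.
critical_value = -2 * math.log(alpha)
p_value = chi2_right_tail_df2(test_statistic)

# Right-tailed test: reject H0 when the statistic exceeds the critical value.
reject_h0 = test_statistic > critical_value
print(round(critical_value, 3), reject_h0)
```

The critical-value rule and the P-value rule (reject when the P-value is below alpha) always lead to the same decision.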

Case-Control Study of Motorcycle Drivers (Figure 11-4, page 610 of Elementary Statistics, 10th Edition) Because the test statistic 8.775 exceeds the critical value 5.991, we reject the null hypothesis. It appears there is an association between helmet color and motorcycle safety.

Test of Homogeneity In a test of homogeneity, we test the claim that different populations have the same proportions of some characteristics. page 611 of Elementary Statistics, 10th Edition

How to Distinguish Between a Test of Homogeneity and a Test of Independence: Were predetermined sample sizes used for different populations (test of homogeneity), or was one big sample drawn so both row and column totals were determined randomly (test of independence)? The key to identifying a test of homogeneity is the use of predetermined sample sizes.

Example: Influence of Gender Using Table 11-6 with a 0.05 significance level, test the effect of pollster gender on survey responses by men. page 612 of Elementary Statistics, 10th Edition

Hypotheses H0: The proportions of agree/disagree responses are the same for the subjects interviewed by men and the subjects interviewed by women. H1: The proportions are different. page 612 of Elementary Statistics, 10th Edition

Calculations: Minitab output for the chi-square test of homogeneity, including the expected values used to check the assumptions. (page 613 of Elementary Statistics, 10th Edition)

Assumptions We have expected counts greater than five in all categories. Assume two random samples of survey responses.

Statement Since the P-value < 0.05, we reject H0. There is sufficient evidence to suggest that opinions differ between the subjects interviewed by men and the subjects interviewed by women.

Recap Contingency tables arrange categorical data in a table with at least two rows and at least two columns. The test of independence tests the claim that the row and column variables are independent of each other. The test of homogeneity tests the claim that different populations have the same proportions of some characteristics.