Contingency Tables (cross tabs)

Slides:



Advertisements
Similar presentations
CHI-SQUARE(X2) DISTRIBUTION
Advertisements

Basic Statistics The Chi Square Test of Independence.
Contingency Tables (cross tabs)  Generally used when variables are nominal and/or ordinal Even here, should have a limited number of variable attributes.
The Chi-Square Test for Association
Hypothesis Testing IV Chi Square.
Chapter 13: The Chi-Square Test
Chapter 11 Contingency Table Analysis. Nonparametric Systems Another method of examining the relationship between independent (X) and dependant (Y) variables.
PSY 340 Statistics for the Social Sciences Chi-Squared Test of Independence Statistics for the Social Sciences Psychology 340 Spring 2010.
CJ 526 Statistical Analysis in Criminal Justice
Chi-square Test of Independence
Inferential Statistics  Hypothesis testing (relationship between 2 or more variables)  We want to make inferences from a sample to a population.  A.
Crosstabs and Chi Squares Computer Applications in Psychology.
Chapter 11(1e), Ch. 10 (2/3e) Hypothesis Testing Using the Chi Square ( χ 2 ) Distribution.
Hypothesis Testing IV (Chi Square)
Cross Tabulation and Chi-Square Testing. Cross-Tabulation While a frequency distribution describes one variable at a time, a cross-tabulation describes.
1 Psych 5500/6500 Chi-Square (Part Two) Test for Association Fall, 2008.
The schedule moving forward… Today – Evaluation Research – Remaining time = start on quantitative data analysis Thursday – Methodology section due – Quantitative.
CJ 526 Statistical Analysis in Criminal Justice
Week 10 Chapter 10 - Hypothesis Testing III : The Analysis of Variance
Tests of Significance June 11, 2008 Ivan Katchanovski, Ph.D. POL 242Y-Y.
Chi-square (χ 2 ) Fenster Chi-Square Chi-Square χ 2 Chi-Square χ 2 Tests of Statistical Significance for Nominal Level Data (Note: can also be used for.
Chapter 9: Non-parametric Tests n Parametric vs Non-parametric n Chi-Square –1 way –2 way.
Copyright © 2012 by Nelson Education Limited. Chapter 10 Hypothesis Testing IV: Chi Square 10-1.
1 Chi-Square Heibatollah Baghi, and Mastee Badii.
Chapter 11 Hypothesis Testing IV (Chi Square). Chapter Outline  Introduction  Bivariate Tables  The Logic of Chi Square  The Computation of Chi Square.
Chi-Square X 2. Parking lot exercise Graph the distribution of car values for each parking lot Fill in the frequency and percentage tables.
Section 10.2 Independence. Section 10.2 Objectives Use a chi-square distribution to test whether two variables are independent Use a contingency table.
Reasoning in Psychology Using Statistics Psychology
Chapter 11: Chi-Square  Chi-Square as a Statistical Test  Statistical Independence  Hypothesis Testing with Chi-Square The Assumptions Stating the Research.
Week 13a Making Inferences, Part III t and chi-square tests.
Leftover Slides from Week Five. Steps in Hypothesis Testing Specify the research hypothesis and corresponding null hypothesis Compute the value of a test.
Chi-Square Analyses.
Outline of Today’s Discussion 1.The Chi-Square Test of Independence 2.The Chi-Square Test of Goodness of Fit.
Chapter 14 – 1 Chi-Square Chi-Square as a Statistical Test Statistical Independence Hypothesis Testing with Chi-Square The Assumptions Stating the Research.
ANOVA Knowledge Assessment 1. In what situation should you use ANOVA (the F stat) instead of doing a t test? 2. What information does the F statistic give.
Section 10.2 Objectives Use a contingency table to find expected frequencies Use a chi-square distribution to test whether two variables are independent.
I. ANOVA revisited & reviewed
Introduction to Marketing Research
Basic Statistics The Chi Square Test of Independence.
CHI-SQUARE(X2) DISTRIBUTION
Chi-Square (Association between categorical variables)
Chapter 9: Non-parametric Tests
Presentation 12 Chi-Square test.
10 Chapter Chi-Square Tests and the F-Distribution Chapter 10
Chapter 11 Chi-Square Tests.
Chapter Fifteen McGraw-Hill/Irwin
Association between two categorical variables
Hypothesis Testing Review
Qualitative data – tests of association
Hypothesis Testing Using the Chi Square (χ2) Distribution
PPA 501 – Analytical Methods in Administration
Is a persons’ size related to if they were bullied
Different Scales, Different Measures of Association
Consider this table: The Χ2 Test of Independence
Reasoning in Psychology Using Statistics
Chapter 11: Inference for Distributions of Categorical Data
Chapter 10 Analyzing the Association Between Categorical Variables
Chapter 11 Chi-Square Tests.
Analyzing the Association Between Categorical Variables
Reasoning in Psychology Using Statistics
Parametric versus Nonparametric (Chi-square)
Reasoning in Psychology Using Statistics
Copyright © Cengage Learning. All rights reserved.
UNIT V CHISQUARE DISTRIBUTION
S.M.JOSHI COLLEGE, HADAPSAR
Chapter 11 Chi-Square Tests.
Hypothesis Testing - Chi Square
Contingency Tables (cross tabs)
CHI SQUARE (χ2) Dangerous Curves Ahead!.
What is Chi-Square and its used in Hypothesis? Kinza malik 1.
Presentation transcript:

Contingency Tables (cross tabs) Generally used when variables are nominal and/or ordinal Even here, should have a limited number of variable attributes (categories) Some find these very intuitive…others struggle It is very easy to misinterpret these critters

Interpreting a Contingency Table WHAT IS IN THE INDIVIDUAL CELLS? The number of cases that fit in that particular cell In other words, frequencies (number of cases that fit criteria) For small tables, and/or small sample sizes, it may be possible to detect relationships by “eyeballing” frequencies. For most.. Convert to Percentages: a way to standardize cells and make relationships more apparent

Example 1 Random sample of UMD students to examine political party membership Are UMD students more likely to belong to particular political parties? Democrat Republican Independent Green (N=40 UMD students)

Univariate Table Null: The distribution is even across all categories. (N=40 UMD students) Categories F % Republican 12 30% Democrat 14 35% Independent 9 23% Green 5 10% FORMULA FOR X2 Fe (expected frequencies) = number of cases / k=category THESE SUM TO 4.6…

Example 2 A survey of 10,000 U.S. residents Is one’s political view related to attitudes towards police? What are the DV and IV? Convention for bivariate tables The IV is on the top of the table (dictates columns) The DV is on the side (dictates rows).

“Bivariate” Table Political Party Attitude Towards Police Total Repub Democrat Libertarian Socialist Favorable 2900 2100 180 30 5210 Unfav. 1900 1800 160 28 3888 4800 3900 340 58 9098

The Percentages of Interest Attitude Towards Police Political Party Total Repub Democrat Libertarian Socialist Favorable 2900 (60%) 2100 (54%) 180 (53%) 30 (52%) 5210 Unfav 1900 1800 160 28 3888 4800 3900 340 58 9098

The Test Statistic for Contingency Tables Chi Square, or χ2 Calculation Observed frequencies (your sample data) Expected frequencies (UNDER NULL) Intuitive: how different are the observed cell frequencies from the expected cell frequencies Degrees of Freedom: 1-way = K-1 2-way = (# of Rows -1) (# of Columns -1)

“One-Way” CHI SQUARE The most simple form of the Chi square is the one-way Chi square test For Univariate Table Do frequencies observed differ significantly from an even distribution? Null hypothesis: there are no differences across the categories in the population Formula:  fo =observed frequency in any category fe =expected frequency in any category

Chi Square: Steps Find the expected (under null hypothesis) cell frequencies Compare expected & observed frequencies cell by cell If null hypothesis is true, expected and observed frequencies should be close in value Greater the difference between the observed and expected frequencies, the greater the possibility of rejecting the null

Calculating χ2 χ2 = ∑ [(fo - fe)2 /fe] Where Fe= Row Marginal X Column Marginal N So, for each cell, calculate the difference between the actual frequencies (“observed”) and what frequencies would be expected if the null was true (“expected”). Square, and divide by the expected frequency. Add the results from each cell.

1-WAY CHI SQUARE 1-way Chi Square Example: There is an even distribution of membership across 4 political parties (N=40 UMD students) Find the expected cell frequencies (Fe = N / K) Categories Fo Fe Republican 12 10 Democrat 14 Independ. 9 Green 5 FORMULA FOR X2 Fe (expected frequencies) = number of cases / k=category THESE SUM TO 4.6…

1-WAY CHI SQUARE 1-way Chi Square Example: There is an even distribution of membership across 4 political parties (N=40 UMD students) Compare observed & expected frequencies cell-by-cell Categories Fo Fe fo - fe Republican 12 10 2 Democrat 14 4 Independ. 9 -1 Green 5 -5

1-WAY CHI SQUARE 1-way Chi Square Example: There is an even distribution of membership across 4 political parties (N=40 UMD students) Square the difference between observed & expected frequencies Categories Fo Fe fo - fe (fo - fe)2 Republican 12 10 2 4 Democrat 14 16 Independ. 9 -1 1 Green 5 -5 25

1-WAY CHI SQUARE 1-way Chi Square Example: There is an even distribution of membership across 4 political parties (N=40 UMD students) Divide that difference by expected frequency Categories Fo Fe fo - fe (fo - fe)2 (fo - fe)2 /fe Republican 12 10 2 4 0.4 Democrat 14 16 1.6 Independ. 9 -1 1 0.1 Green 5 -5 25 2.5 ∑= 4.6 FORMULA FOR X2 THESE SUM TO 4.6…

Interpreting Chi-Square Chi-square has no intuitive meaning, it can range from zero to very large As with other test statistics, the real interest is the “p value” associated with the calculated chi-square value Conventional testing = find χ2 (critical) for stated “alpha” (.05, .01, etc.) Reject if χ2 (observed) is greater than χ2 (critical) SPSS: find the exact probability of obtaining the χ2 under the null (reject if less than alpha)

Example of Chi-Square Sampling Distributions (Assuming Null is True)

Interpreting χ2 the old fashioned way: The UMD political party example Chi square = 4.6 df (1-way Chi square) = K-1 = 3 X2 (critical) (p<.05) = 7.815 (from Appendix C) Obtained (4.6) < critical (7.815) Decision Fail to reject the null hypothesis. There is not a significant difference in political party membership at UMD

2-WAY CHI SQUARE For use with BIVARIATE Contingency Tables Display the scores of cases on two different variables at the same time (rows are always DV & columns are always IV) Intersection of rows & columns is called “cells” Column & row marginal totals (a.k.a. “subtotals”) should always add up to N N=40 Packers Fan Vikings Fan TOTALS Have most of original teeth 14 7 21 Do not have most of original teeth 6 13 19 TOTALS: 20 40 [HANDOUT] BOARD: bivariate tables = “cross tabulation” or “contingency tables” What is plotted in each cell? The # of cases that fall into that unique category. Observed frequencies A.K.A.: OBSERVED FREQUENCIES

Null Hypothesis for 2-Way χ2 The two variables are independent Independence: Classification of a case into a category on one variable has no effect on the probability that the case will be classified into any category of the second variable Fan status (Viking vs. Packer) is not related to whether or not a person has most of their original teeth. Knowing someone’s fan status will not help you predict their tooth status.

2-WAY CHI SQUARE Find the expected frequencies Fe= Row Marginal X Column Marginal N “Have teeth” Row = (21 x 20)/40 =420/40=10.5 “Don’t have” Row = (19 x 20)/40 = 380/40= 9.5 N=40 Packers Fan Vikings Fan TOTALS Have most of original teeth 14 (10.5) 7 (10.5) 21 Do not have most of original teeth 6 (9.5) 13 (9.5) 19 TOTALS: 20 40

χ2 = ∑ [(fo - fe)2 /fe] (14-10.5) 2 = 12.25  12.25/10.5 = 1.17 (14-10.5) 2 = 12.25  12.25/10.5 = 1.17 (7-10.5) 2 = 12.25  12.25/10.5 = 1.17 (6-9.5) 2 = 12.25  12.25/9.5 = 1.29 (13-9.5) 2 = 12.25  12.25/9.5 = 1.29 Χ2observed = 4.92

2-WAY CHI SQUARE Compare expected & observed frequencies cell by cell X2(obtained) = 4.920 df= (r-1)(c-1) = 1 X 1 = 1 X2(critical) for alpha of .05 is 3.841 (Healey Appendix C) Obtained > Critical CONCLUSION: Reject the null: There is a relationship between the team that students root for and their opinion of Brett Favre (p<.05).

CHI SQUARE EXAMPLE #2 Is opinion of Governor Dayton related to geographic region of the state within the sample data? Calculate the proper percentages GEOGRAPHIC REGION ATTITUDE CITIES SUBURBS OUT-STATE TOTALS POSITIVE 9 14 12 35 NEGATIVE 16 11 13 40 25 75 [HANDOUT] Another day, another handout…

CHI SQUARE EXAMPLE #2 Is opinion of Governor Dayton significantly related to geographic region of the state? Is there a relationship in the POPULATION To answer this question, do a chi square test using survey data from sampled Minnesotans (N=75) GEOGRAPHIC REGION ATTITUDE CITIES SUBURBS OUT-STATE TOTALS POSITIVE 9 14 12 35 NEGATIVE 16 11 13 40 25 75 [HANDOUT] Another day, another handout…

CHI SQUARE EXAMPLE #2 Is opinion of Governor Dayton significantly related to geographic region of the state? First, find the EXPECTED FREQUENCIES Fe= Row Marginal X Column Marginal N GEOGRAPHIC REGION ATTITUDE CITIES SUBURBS OUT-STATE TOTALS POSITIVE 9 (11.67) 14 (11.67) 12 (11.67) 35 NEGATIVE 16 (13.33) 11 (13.33) 13 (13.33) 40 25 75 NOTICE THE COLOR CODING HERE…

CHI SQUARE EXAMPLE #2 Next, for each cell, calculate observed minus expected & square the difference (fo - fe)2 (9 – 11.67)2 = (-2.67)2 = 7.13 (14 – 11.67)2 = (2.33)2 = 5.43 (12 – 11.67)2 = (0.33)2 = 0.11 (16 – 13.33)2 = (2.67)2 = 7.13 (11 – 13.33)2 = (-2.33)2= 5.43 (13 – 13.33)2 = (-0.33)2= 0.11 GEOGRAPHIC REGION ATTITUDE CITIES SUBURBS OUT-STATE TOTALS POSITIVE 9 (11.67) 14 (11.67) 12 (11.67) 35 NEGATIVE 16 (13.33) 11 (13.33) 13 (13.33) 40 25 75

CHI SQUARE EXAMPLE #3 Then, divide each (observed – expected) squared difference by the appropriate expected value: 7.13 / 11.67 = 0.61 5.43 / 11.67 = 0.47 0.11 / 11.67 = 0.01 7.13 / 13.33 = 0.53 5.43 / 13.33 = 0.41 0.11 /13.33 = 0.01 Summing () these values gives you 2(obtained)= 2.04 GEOGRAPHIC REGION ATTITUDE CITIES SUBURBS OUT-STATE TOTALS POSITIVE 9 (11.67) 14 (11.67) 12 (11.67) 35 NEGATIVE 16 (13.33) 11 (13.33) 13 (13.33) 40 25 75

CHI SQUARE EXAMPLE #2 GEOGRAPHIC REGION ATTITUDE CITIES SUBURBS Is a 2(obtained) of 2.04 significant if alpha = .05? To answer this question: First, calculate degrees of freedom: df = (r – 1)(c – 1) = (2 – 1)(3 – 1) = 2 Then, go to Appendix C, where you find out that 2(critical) (2 df) = 5.991 2.04 (obtained) < 5.991 (critical) CONCLUSION: Fail to reject the null hypothesis of independence between geographic region and attitude toward Dayton GEOGRAPHIC REGION ATTITUDE CITIES SUBURBS OUT-STATE TOTALS POSITIVE 9 (11.67) 14 (11.67) 12 (11.67) 35 NEGATIVE 16 (13.33) 11 (13.33) 13 (13.33) 40 25 75 r = # of rows c = # of columns

Conclusions In the sample, those in the suburbs (56%) were more likely to hold positive attitudes towards Governor Dayton than those from the twin cities (36%) or those from out state (48%). However, none of these differences were statistically significant. We cannot conclude that support for the governor varies across regions in Minnesota. What could we change to make these percentages “statistically significant?”

SPSS Procedure Statistics Analyze Descriptive Statistics  Crosstabs Rows = DV Columns = IV Cells Column Percentages Statistics Chi square

SPSS Exercise Examine the relationship between sex (male/female) and another variable in the GSS data. Sex should be the IV (most things cannot change your sex). Is there a relationship in the sample? Percentages Is this relationship significant? Chi square