1
Applied Statistics Using SPSS
Topic: Chi-square tests By Prof Kelly Fan, Cal. State Univ., East Bay
2
Outline ALL variables must be categorical
Goal one: verify a distribution of Y
  One-sample Chi-square test (SPSS lesson 40)
Goal two: test the independence between two categorical variables
  Chi-square test for two-way contingency table (SPSS lesson 41)
  McNemar’s test for paired data (SPSS lesson 44)
  Measure the dependence (Phi and Kappa coefficients) (SPSS lesson 41, 44)
Goal three: test the independence between two categorical variables after controlling for a third factor
  Mantel-Haenszel Chi-square test (SPSS in class)
3
Example: Postpartum Depression Study
Are women equally likely to show an increase, no change, or a decrease in depression as a function of childbirth? Are the proportions associated with a decrease, no change, and an increase in depression from before to after childbirth the same?
4
Raw data vs. Grouped data
Grouped data are shown in the next slide.
ID   Name   Depression level after birth in comparison with before birth
1    ***    Same
2           Less depressed
3           More depressed
5
Example: Postpartum Depression Study
From a random sample of 60 women
Depression after birth in comparison with before birth   Observed frequencies   Hypothesized proportions   Expected frequencies
Less depressed (-1)                                       14                     1/3                        20
Neither less nor more depressed (0)                       33                     1/3                        20
More depressed (1)                                        13                     1/3                        20
6
One-sample Chi-Square Test
Must be a random sample
The sample size must be large enough so that expected frequencies are greater than or equal to 5 for 80% or more of the categories
7
One-sample Chi-Square Test
Ho: the hypothesized distribution is true vs. Ha: Ho is false
Test statistic: Chi-square = sum over all categories of (Oi - ei)^2 / ei
  Oi = the observed frequency of the i-th category
  ei = the expected frequency of the i-th category
Why the statistic follows a Chi-square distribution:
  Z^2 is Chi-square with df = 1 when Z is N(0,1)
  (Z1)^2 + (Z2)^2 is Chi-square with df = 2 when Z1, Z2 are N(0,1) and independent
  (Oi - ei)/sqrt(ei) is approximately N(0,1) as the sample size grows large
df = # of categories - 1; reject Ho if Chi-square is too large
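The calculation can be checked by hand. The following is a minimal Python sketch (not part of the SPSS lessons) that applies the statistic to the postpartum depression counts 14, 33, 13 with hypothesized proportions 1/3 each. The variable names are illustrative, and the p-value uses the closed form that holds only for df = 2.

```python
import math

# Observed counts from the postpartum depression example (n = 60):
# less depressed, no change, more depressed.
observed = [14, 33, 13]
hypothesized = [1 / 3, 1 / 3, 1 / 3]

n = sum(observed)
expected = [p * n for p in hypothesized]

# Pearson statistic: sum of (Oi - ei)^2 / ei over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# For df = 2 only, the upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.4f}")
```

The statistic comes out at 12.7 on 2 degrees of freedom, with a p-value well below 0.05, which agrees with rejecting Ho.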
8
SPSS Output
Weight your data by count first (Data >> Weight Cases)
Analyze >> Nonparametric Tests >> Legacy Dialogs >> Chi-square, count as test variable
9
Conclusion
Reject Ho (that women are equally likely to be less depressed, unchanged, or more depressed). The proportions associated with a decrease, no change, and an increase in depression from before to after childbirth are significantly different from 1/3, 1/3, 1/3.
10
Example: Postpartum Depression Study
Are the proportions associated with a change and no change from before to after childbirth the same?
Ho: P(change) = P(no change) = .50
Expected no. for change = .50 * 60 = 30
Expected no. for no change = .50 * 60 = 30
11
Example: Postpartum Depression Study
From a random sample of 60 women
Depression after birth in comparison with before birth   Observed frequencies   Hypothesized proportions   Expected frequencies
Same amount of depression (0)                             33                     1/2                        30
More or less depressed (1)                                27                     1/2                        30
12
SPSS Output
13
Conclusion
Change and no change in depression from before to after childbirth are equally likely.
Question: For those who do experience change, is it equally likely to be less or more depressed?
Answer: Yes. It is equally likely to be less or more depressed given that there is a change in depression level.
Overall, compared with before childbirth, the after-childbirth depression level is equally likely to change or not change. If a woman experiences change, she is equally likely to be less or more depressed.
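Both conclusions can be checked with a short Python sketch, offered as an illustration rather than a reproduction of the SPSS output. It assumes the second test compares the 14 less-depressed and 13 more-depressed women against an even split of 13.5 each. The helper name `chi_square_df1` is hypothetical.

```python
import math

def chi_square_df1(observed, expected):
    """Pearson chi-square statistic and p-value for two categories (df = 1)."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For df = 1, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# No change vs. change, out of 60 women.
stat_change, p_change = chi_square_df1([33, 27], [30, 30])

# Less vs. more depressed, among the 27 women who changed.
stat_direction, p_direction = chi_square_df1([14, 13], [13.5, 13.5])

print(f"change vs. no change: chi-square = {stat_change:.3f}, p = {p_change:.3f}")
print(f"less vs. more:        chi-square = {stat_direction:.3f}, p = {p_direction:.3f}")
```

Both p-values are far above 0.05, so neither null hypothesis is rejected.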
14
Two-way Contingency Tables
Report frequencies on two variables
Such tables are also called crosstabs.
15
Contingency Tables (Crosstabs)
1991 General Social Survey, frequency by race and party identification
                 Party Identification
Race     Democrat   Independent   Republican
White    341        105           405
Black    103        15            11
16
Crosstabs Analysis (Two-way Chi-square test)
Chi-square test for testing the independence between two variables. Under independence:
For a fixed column, the distribution of frequencies over rows stays the same regardless of the column
For a fixed row, the distribution of frequencies over columns stays the same regardless of the row
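As a rough illustration of the two-way test, the sketch below computes the expected counts (row total times column total divided by n) and the Pearson statistic for the 1991 General Social Survey table of race by party identification. It is plain Python rather than SPSS, and the p-value shortcut is valid only because this table has df = 2.

```python
import math

# 1991 General Social Survey: rows = race, columns = Democrat, Independent, Republican.
table = {
    "White": [341, 105, 405],
    "Black": [103, 15, 11],
}

rows = list(table.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(rows):
    for j, observed in enumerate(row):
        # Expected count under independence.
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (observed - expected) ** 2 / expected

df = (len(rows) - 1) * (len(col_totals) - 1)
p_value = math.exp(-chi_square / 2)  # exact upper tail for df = 2 only

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.2e}")
```

The statistic is roughly 79.4 on 2 degrees of freedom, so independence between race and party identification is rejected for these data.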
17
Measure of dependence for 2x2 tables
The phi coefficient measures the association between two categorical variables
-1 <= phi <= 1
| phi | indicates the strength of the association
If the two variables are both ordinal, then the sign of phi indicates the direction of association
18
SPSS Output
P. 332: Data >> Weight Cases >> Weight cases by, select count variable
P. 333: Analyze >> Descriptive Statistics >> Crosstabs, cell
19
Measure of dependence for non-2x2 tables
Cramér's V
Ranges from 0 to 1
V may be viewed as the association between two variables as a percentage of their maximum possible variation.
V = phi for 2x2, 2x3 and 3x2 tables
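A small sketch of Cramér's V, assuming the usual definition V = sqrt(Chi-square / (n * min(rows - 1, columns - 1))). It is applied to the 1991 General Social Survey race by party table, and the function name `cramers_v` is illustrative.

```python
import math

def cramers_v(rows):
    """Cramér's V for a two-way table given as a list of rows of counts."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    chi_square = sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(rows)
        for j, obs in enumerate(row)
    )
    # Scale by the smaller of (rows - 1) and (columns - 1).
    k = min(len(rows) - 1, len(col_totals) - 1)
    return math.sqrt(chi_square / (n * k))

# Race (White, Black) by party (Democrat, Independent, Republican).
gss_1991 = [[341, 105, 405], [103, 15, 11]]
v = cramers_v(gss_1991)
print(f"Cramér's V = {v:.3f}")
```

For this 2x3 table min(rows - 1, columns - 1) is 1, so V reduces to sqrt(Chi-square / n), a value of about 0.28.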
20
Fisher’s Exact Test for Independence
The Chi-squared tests are ONLY for large samples: the sample size must be large enough so that expected frequencies are greater than or equal to 5 for 80% or more of the categories
When this condition fails, use Fisher's exact test instead.
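To show what the exact test computes, here is a minimal sketch for a 2x2 table. The slides give no small-sample data, so the counts below are Fisher's classic tea-tasting layout, used purely as an illustration. The two-sided rule (summing all tables no more probable than the observed one) is one common convention and is assumed here. The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Illustrative tea-tasting table (not from the slides).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided exact p = {p_value:.4f}")
```

With only 8 observations every expected count is 2, so the large-sample rule fails and an exact p-value is the appropriate choice.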
21
SPSS Output
In the “crosstabs” window, click “exact”, then tick “exact”:
22
Matched-pair Data
Comparing categorical responses for two “paired” samples, when either
  each sample has the same subjects (that is, subjects are measured twice), or
  a natural pairing exists between each subject in one sample and a subject from the other sample (e.g. twins)
23
Example: Rating for Prime Minister
                 Second Survey
First Survey     Approve   Disapprove
Approve          794       150
Disapprove       86        570
24
Marginal Homogeneity
The probabilities of “success” for both samples are identical
E.g. the probability of “approve” is the same at the first and second surveys
25
McNemar Test (for 2x2 Tables only)
SPSS: Lesson 44
Ho: marginal homogeneity
Ha: no marginal homogeneity
Exact p-value
Approximate p-value (when n12 + n21 > 10)
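A sketch of both p-values for the prime minister ratings, assuming the discordant counts are n12 = 150 (approve, then disapprove) and n21 = 86 (disapprove, then approve). The approximate version uses the usual statistic (n12 - n21)^2 / (n12 + n21) with df = 1 and no continuity correction. The exact version is a two-sided binomial test on the discordant pairs.

```python
import math

# Discordant pairs from the two surveys.
n12 = 150  # approve at first survey, disapprove at second
n21 = 86   # disapprove at first survey, approve at second

# Approximate (asymptotic) McNemar test, chi-square with df = 1.
statistic = (n12 - n21) ** 2 / (n12 + n21)
p_asymptotic = math.erfc(math.sqrt(statistic / 2))

# Exact test: under Ho the smaller discordant count is Binomial(m, 0.5).
m = n12 + n21
k = min(n12, n21)
tail = sum(math.comb(m, i) for i in range(k + 1))
p_exact = min(1.0, 2 * tail / 2 ** m)

print(f"S = {statistic:.3f}, asymptotic p = {p_asymptotic:.2e}, exact p = {p_exact:.2e}")
```

Only the discordant cells enter the test. The 794 and 570 respondents who did not change their rating carry no information about marginal homogeneity.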
26
Output
In SPSS: Analyze >> Descriptive statistics >> crosstabs, in “statistics” tick “Kappa” and “McNemar”
McNemar's Test
  Statistic (S)
  DF
  Asymptotic Pr > S    <.0001
  Exact Pr >= S        E-05
Simple Kappa Coefficient (level of agreement)
  Kappa
  ASE
  95% Lower Conf Limit
  95% Upper Conf Limit
Sample Size = 1600
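The kappa coefficient can be computed directly from the 2x2 table of ratings. This is a minimal sketch assuming the standard definition kappa = (observed agreement - chance agreement) / (1 - chance agreement). It does not reproduce the standard error or confidence limits listed in the output.

```python
# Rows = first survey (approve, disapprove); columns = second survey.
table = [[794, 150], [86, 570]]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Observed agreement: proportion on the diagonal.
p_observed = sum(table[i][i] for i in range(len(table))) / n

# Agreement expected by chance from the margins alone.
p_chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"n = {n}, kappa = {kappa:.3f}")
```

Kappa is about 0.70, so agreement between the two surveys is well above chance even though the McNemar test finds a shift in the margins.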
27
SPSS Output
SPSS (p. 361): Analyze >> Nonparametric tests >> Legacy dialogs >> 2 related samples; in “two-samples tests” tick “McNemar” and click “exact”, then tick “exact” again
28
Stratified 2 by 2 Tables (Meta-Analysis)
Goal: to investigate the effect of the risk factor (lack of sleep) on the outcome (failing a test)

Test Results, Boys
Sleep    Fail   Pass
Low      20     100
High     15     150

Test Results, Girls
Sleep    Fail   Pass
Low      30     100
High     25     200
29
Cochran Mantel-Haenszel Test
After importing your dataset and naming the variables, click on: ANALYZE >> DESCRIPTIVE STATISTICS >> CROSSTABS
For ROWS, select the independent variable
For COLUMNS, select the dependent variable
For LAYERS, select the strata variable
Under STATISTICS, click on COCHRAN’S AND MANTEL-HAENSZEL STATISTICS
NOTE: You will want to code the data so that the outcome present (Yes) category has the lower value (e.g. 1) and the outcome absent (No) category has the higher value (e.g. 2). Do the same for the risk factor: 1 for exposure; 2 for no exposure. Use Value Labels to keep output straight.
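For readers who want to see what the menu option computes, here is a hedged sketch of the Cochran-Mantel-Haenszel statistic for the sleep and test-result tables, written in plain Python. It uses the version without continuity correction; some software applies a correction, so printed values can differ slightly. Variable names are illustrative.

```python
import math

# Each stratum: ((fail, pass) for low sleep, (fail, pass) for high sleep).
strata = {
    "Boys": ((20, 100), (15, 150)),
    "Girls": ((30, 100), (25, 200)),
}

difference_sum = 0.0  # sum over strata of (observed a - expected a)
variance_sum = 0.0    # sum over strata of the hypergeometric variance of a

for (a, b), (c, d) in strata.values():
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    difference_sum += a - row1 * col1 / n
    variance_sum += row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))

cmh = difference_sum ** 2 / variance_sum
p_value = math.erfc(math.sqrt(cmh / 2))  # chi-square upper tail, df = 1

print(f"CMH chi-square = {cmh:.3f}, p = {p_value:.4f}")
```

The statistic is about 12.5 on 1 degree of freedom with a p-value near 0.0004, so low sleep and failing remain associated after controlling for sex.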
30
SPSS Output
31
SAS Output

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic   Alternative Hypothesis    DF   Value   Prob
1           Nonzero Correlation                    0.0004
2           Row Mean Scores Differ
3           General Association

Breslow-Day Test for Homogeneity of the Odds Ratios
Chi-Square    0.1501
DF            1
Pr > ChiSq    0.6985

Common Odds Ratio and Relative Risks
Statistic                  Method            Value    95% Confidence Limits
Odds Ratio                 Mantel-Haenszel   2.2289   1.4185   3.5024
                           Logit             2.2318   1.4205   3.5064
Relative Risk (Column 1)   Mantel-Haenszel   1.9775   1.3474   2.9021
                           Logit             1.9822   1.3508   2.9087
Relative Risk (Column 2)   Mantel-Haenszel   0.8891   0.8283   0.9544
                           Logit             0.8936   0.8334   0.9582
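The Mantel-Haenszel common odds ratio in the output can be reproduced from the two sleep tables. The sketch below assumes the standard estimator, sum(a*d/n) divided by sum(b*c/n) over strata, and also prints the per-stratum odds ratios. It does not compute the confidence limits.

```python
# Each stratum: ((fail, pass) for low sleep, (fail, pass) for high sleep).
strata = {
    "Boys": ((20, 100), (15, 150)),
    "Girls": ((30, 100), (25, 200)),
}

numerator = 0.0
denominator = 0.0
stratum_odds_ratios = {}

for name, ((a, b), (c, d)) in strata.items():
    n = a + b + c + d
    stratum_odds_ratios[name] = (a * d) / (b * c)
    numerator += a * d / n
    denominator += b * c / n

mh_odds_ratio = numerator / denominator

for name, odds_ratio in stratum_odds_ratios.items():
    print(f"{name}: odds ratio = {odds_ratio:.2f}")
print(f"Mantel-Haenszel common odds ratio = {mh_odds_ratio:.4f}")
```

The stratum odds ratios are 2.0 for boys and 2.4 for girls, close enough to be consistent with the non-significant Breslow-Day test, and the pooled estimate matches the 2.2289 in the output.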