Published by Clementine Fields. Modified over 9 years ago.
1
INFO 515, Lecture #9
Action Research: More Crosstab Measures
Glenn Booker
2
Nominal Crosstab Tests
- Four more measures which could apply to nominal data in a crosstab:
  - Eta
  - Lambda
  - Goodman and Kruskal’s tau
  - Uncertainty coefficient
3
Eta Coefficient
- Used when the dependent variable uses an interval or ratio scale, and the independent variable is nominal or ordinal
- Eta (η) squared is the proportion of the dependent variable’s variance which is explained by the independent variable
- Eta squared ranges from 0 to 1; its value depends on which variable is treated as dependent, so it is reported as a directional measure
- This is the same eta from the end of lecture 6
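As a rough illustration of "proportion of variance explained", eta squared can be computed as the between-group sum of squares divided by the total sum of squares. The sketch below is not taken from the lecture; the function name and the toy data are hypothetical.

```python
def eta_squared(groups):
    """Between-group sum of squares divided by total sum of squares."""
    values = [v for g in groups.values() for v in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total

# Hypothetical data: an interval-scale outcome grouped by a nominal variable.
toy_groups = {"group_a": [1, 2, 3], "group_b": [4, 5, 6]}
eta_sq = eta_squared(toy_groups)
print(round(eta_sq, 3))  # 0.771
```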
4
Directional vs Symmetric
- Directional measures give a different answer depending on whether A is dependent on B, or B is dependent on A
- Symmetric measures don’t care which variable is dependent or independent
- Tests indicate whether there is a statistically significant relationship; measures, here, describe the strength of association
5
Directional Measures
- Directional measures help determine how much the dependent variable is affected by the independent variable
- Directional measures for nominal data:
  - Lambda (recommended)
  - Goodman and Kruskal’s tau
  - Uncertainty coefficient
6
Directional Measures
- Directional measures generally range from 0 to 1
- A value of 0 means the independent variable doesn’t help predict the dependent variable
- A value of 1 means the independent variable perfectly predicts the resulting dependent variable
7
Directional Measures
- In this context, either variable can be considered dependent or independent
  - Does A predict B?
  - Does B predict A?
- A “symmetric” value is the weighted average of the two possible selections (A predicts B, or B predicts A)
8
Proportional Reduction in Error
- Proportional Reduction in Error (PRE) measures find the fractional reduction in errors due to some factor (such as an independent variable)
- PRE = (Error without X - Error with X) / Error without X
- Two we’ll look at are Lambda, and Goodman and Kruskal’s Tau
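A minimal sketch of the PRE calculation, taking the errors made without X as the baseline. The function name and the error counts are hypothetical and only illustrate the arithmetic.

```python
def proportional_reduction_in_error(errors_without_x, errors_with_x):
    """PRE = (E1 - E2) / E1, where E1 is the error made without X."""
    if errors_without_x == 0:
        raise ValueError("no errors to reduce")
    return (errors_without_x - errors_with_x) / errors_without_x

# Hypothetical counts: 40 misclassified cases without X, 30 with X.
pre = proportional_reduction_in_error(40, 30)
print(pre)  # 0.25
```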
9
Lambda Coefficient
- Lambda has a symmetric option for output
- Its Value is the proportional reduction in error when the independent variable is used to predict the dependent one
- The Asymptotic Std. Error allows a 95% confidence interval to be made
- “Approx. T” is the Value divided by the standard error computed under the assumption that the parameter is zero (not the usual definition!)
10
Goodman and Kruskal’s Tau
- SPSS note: Goodman and Kruskal’s Tau is not directly selected; it appears only when Lambda is checked!
- Does not have a Symmetric option
- No Approx. T is reported
- Its significance is based on a chi-square approximation
- Otherwise similar to Lambda for interpretation
11
Uncertainty Coefficient
- Does have a symmetric option
- Does have an Approx. T
- Also based on chi square
- Goodman and Kruskal’s tau and the Uncertainty Coefficient may give results that conflict with Lambda, so use them cautiously!
12
Nominal Example
- Use “GSS91 political.sav” data set
- Use Analyze / Descriptive Statistics / Crosstabs…
- Select “region” for Row(s), and “relig” for Column(s)
- Under “Statistics…” select Lambda, and Uncertainty Coefficient
13
Nominal Example
[SPSS output tables not reproduced]
14
Nominal Example - Lambda
- Focus on the Lambda (λ) output first
- Lambda measures the percent of error reduction when using the independent variable to predict the dependent variable
- Calculation based on any desired outcome contributing to lambda
- Lambda ranges from 0 to 1
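The error-reduction idea behind lambda can be sketched from a table of counts: without the independent variable, the best guess is always the most common category of the dependent variable; with it, the best guess is the most common category within each row. The code below follows that standard definition rather than anything shown in the lecture, and the function name and toy counts are hypothetical.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the column variable from the row variable."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Errors when always guessing the overall modal column.
    errors_without = n - max(col_totals)
    # Errors when guessing the modal column within each row.
    errors_with = n - sum(max(row) for row in table)
    return (errors_without - errors_with) / errors_without

# Hypothetical 3x2 table of counts (rows = independent, columns = dependent).
toy = [[20, 10],
       [5, 25],
       [15, 25]]
lambda_cols = goodman_kruskal_lambda(toy)
# Swapping the roles of the variables gives a different (directional) answer.
lambda_rows = goodman_kruskal_lambda([list(col) for col in zip(*toy)])
print(round(lambda_cols, 3), round(lambda_rows, 3))  # 0.25 0.083
```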
15
Nominal Example
- As usual, we want Sig. < 0.050 for the meaning of lambda to be statistically significant
- If Region is dependent, then we see that religious preference is a significant (sig. = 0.000) predictor
- Knowing “relig” reduces the error in predicting a person’s region by (Value) 4.8% +/- (Std Error) 1.2%
16
Lambda Example
- The 95% confidence interval of that contribution (not shown) is 4.8 - 2*1.2 = 2.4% to 4.8 + 2*1.2 = 7.2%
- But “region” is not a significant predictor of “relig” (sig. = 0.099)
- Ignore the value of lambda if it isn’t significant
- The symmetric value is significant, and its Value is between the other two lambda values
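The interval arithmetic on this slide is the usual "value plus or minus two standard errors" approximation (2 standing in for 1.96). A small sketch, with a hypothetical helper name:

```python
def approx_95_ci(value, std_error):
    """Approximate 95% interval as value +/- 2 standard errors."""
    return value - 2 * std_error, value + 2 * std_error

# Lambda with region dependent: Value 4.8%, Std Error 1.2% (from the slide).
lower, upper = approx_95_ci(4.8, 1.2)
print(f"{lower:.1f}% to {upper:.1f}%")  # 2.4% to 7.2%
```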
17
G and K Tau Example
- Goodman and Kruskal’s tau (τ) is similar to lambda, but is based on predictions in the same proportion as the marginal totals (individual row or column subtotals)
- No symmetric value is given; it’s only directional
- Same method for interpretation, but notice that it finds both variables significant as dependent, and ‘relig’ is much stronger!
- (Output still from slide 13)
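"Predictions in the same proportion as the marginal totals" can be made concrete with the standard formula for Goodman and Kruskal's tau, which the slides do not spell out. The sketch below assumes that standard definition; the function name and test tables are hypothetical.

```python
def goodman_kruskal_tau(table):
    """Tau for predicting the column variable from the row variable."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Expected errors when guessing columns in proportion to the column totals.
    errors_without = n - sum(t * t for t in col_totals) / n
    # Expected errors when guessing in proportion to the counts within each row.
    errors_with = n - sum(
        sum(cell * cell for cell in row) / sum(row) for row in table if sum(row) > 0
    )
    return (errors_without - errors_with) / errors_without

# Hypothetical tables: perfect association and no association.
tau_perfect = goodman_kruskal_tau([[10, 0], [0, 10]])
tau_none = goodman_kruskal_tau([[10, 10], [10, 10]])
print(tau_perfect, tau_none)  # 1.0 0.0
```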
18
Uncertainty Coefficient Example
- Is a measure of association that indicates the proportional reduction in error when values of one variable are used to predict values of the other variable
- The program calculates both symmetric and directional versions of it
- Here, gives results similar to G and K Tau
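The slides do not give a formula for the uncertainty coefficient. Assuming the usual entropy-based definition (the share of the dependent variable's entropy removed by knowing the other variable), the directional and symmetric versions could be sketched as follows; all names and the toy tables are hypothetical.

```python
from math import log

def entropy(counts):
    """Shannon entropy (natural log) of a list of counts."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)

def uncertainty_coefficients(table):
    """Return (U(column | row), U(row | column), symmetric U)."""
    h_rows = entropy([sum(r) for r in table])
    h_cols = entropy([sum(c) for c in zip(*table)])
    h_joint = entropy([cell for r in table for cell in r])
    shared = h_rows + h_cols - h_joint  # information shared by the two variables
    return shared / h_cols, shared / h_rows, 2 * shared / (h_rows + h_cols)

# Hypothetical tables: perfect association and no association.
u_perfect = uncertainty_coefficients([[10, 0], [0, 10]])
u_none = uncertainty_coefficients([[10, 10], [10, 10]])
```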
19
Tests for 2x2 Tables
- Many special measures can be applied to a 2x2 table, including:
  - Relative risk
  - Odds ratio
- Look at these in the context of answering questions like: “Are people who approve of women working more likely to vote for a woman President?”
20
Tests for 2x2 Tables
- Use “GSS91 social.sav” data set
- Variables are “should women work” (fework) and “vote for woman president” (fepres)
- Isolate the cases using Data / Select Cases
- Use the If condition (fepres=1 | fepres=2) & (fework=1 | fework=2)
  - ‘|’ means ‘or’; ‘&’ means ‘and’
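For readers working outside SPSS, the same Select Cases condition can be expressed as an ordinary boolean filter. The records below are invented for illustration (codes other than 1 and 2 stand in for whatever non-answer codes the data set uses); only the variable names and the condition come from the slide.

```python
# Hypothetical records; only fepres/fework and the codes 1 and 2 come from the slide.
cases = [
    {"fepres": 1, "fework": 1},
    {"fepres": 2, "fework": 1},
    {"fepres": 8, "fework": 2},  # dropped: fepres is neither 1 nor 2
    {"fepres": 1, "fework": 9},  # dropped: fework is neither 1 nor 2
    {"fepres": 2, "fework": 2},
]

# (fepres=1 | fepres=2) & (fework=1 | fework=2)
selected = [
    c for c in cases
    if (c["fepres"] == 1 or c["fepres"] == 2)
    and (c["fework"] == 1 or c["fework"] == 2)
]
print(len(selected))  # 3
```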
21
Tests for 2x2 Tables
- Use Analyze / Descriptive Statistics / Crosstabs…
- Select “fework” for Row(s), and “fepres” for Column(s)
- For Statistics select Risk
- For Cells select Row percentages
- This gives 947 valid cases
22
Tests for 2x2 Tables
[Slide image not reproduced]
23
Tests for 2x2 Tables
[SPSS output not reproduced]
- ‘cohort’ = subset
24
Relative Risk
- The relative risk is a ratio of percentages
- It is very directional
- Those who (approve of women working) are 1.178 times as likely to (say they would vote for a woman president) as those who do not
- Based on 93.4%/79.3% = 1.178, the row percentages for the two “fework” groups
- Note the 95% confidence intervals for each ratio are given; roughly 1.09 to 1.27 for this example
25
Relative Risk
- Conversely, those who approve of women working are 0.317 times as likely to say they would not vote for a woman president (6.6/20.7 ≈ 0.32 from the rounded percentages), with a broader confidence interval of 0.22 to 0.47
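The two ratios quoted on the relative risk slides can be reproduced from the rounded row percentages. The dictionary layout and the row labels are assumptions made for this sketch (the odds ratio slides indicate that the 93.4/6.6 row is the group approving of women working); the small gap between 0.319 and the 0.317 on the slide comes from working with rounded inputs.

```python
# Row percentages by "fework" group, as quoted on the slides (rounded).
row_pct = {
    "approve_work": {"vote_yes": 93.4, "vote_no": 6.6},
    "disapprove_work": {"vote_yes": 79.3, "vote_no": 20.7},
}

# Relative risk for each "fepres" outcome: ratio of the two row percentages.
rr_vote_yes = row_pct["approve_work"]["vote_yes"] / row_pct["disapprove_work"]["vote_yes"]
rr_vote_no = row_pct["approve_work"]["vote_no"] / row_pct["disapprove_work"]["vote_no"]
print(round(rr_vote_yes, 3), round(rr_vote_no, 3))  # 1.178 0.319
```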
26
Odds Ratio
- The odds of an event are the ratio of (the probability that the event occurs) to (the probability that the event does not occur); an odds ratio compares the odds in two groups
- The odds ratio that someone who (approves of women working) also (would vote for a woman president) has two terms
- One is the odds of (voting for a woman president) among (those who approve of women working) (93.4/6.6=14.152)...
27
Odds Ratio
- ...divided by the odds of (voting for a woman president) among (those who would NOT approve of women working) (79.3/20.7=3.831)
- Hence the odds ratio is 14.152/3.831 = 3.694, or equivalently (93.4*20.7)/(6.6*79.3)
- Round off error, probably in the 6.6 value, kept us from getting the stated odds ratio of 3.712 (first row of output on slide 23)
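The odds ratio arithmetic above can be checked in a few lines. The variable names are hypothetical; the percentages are the rounded values quoted on the slides, which is why the result lands at 3.694 rather than the 3.712 in the SPSS output.

```python
# Odds of "would vote for a woman president" within each fework group,
# using the rounded row percentages from the slides.
odds_approve = 93.4 / 6.6       # among those who approve of women working
odds_disapprove = 79.3 / 20.7   # among those who do not

odds_ratio = odds_approve / odds_disapprove
cross_product = (93.4 * 20.7) / (6.6 * 79.3)  # equivalent cross-product form

print(round(odds_approve, 3), round(odds_disapprove, 3), round(odds_ratio, 3))
# 14.152 3.831 3.694
```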
28
Square Tables (RxR)
- Tables with the same number of rows as columns (RxR tables) also have special measures
- Cohen’s Kappa (κ), which measures the strength of agreement (did two people’s measurements match well?)
- Applies for R values of one nominal variable
29
Kappa
- Kappa is used only when the rows and columns have the same categories
  - Set of possible diagnoses achieved by two different doctors
  - Two sets of outcomes which are believed to be dependent on each other
- Kappa is zero when agreement is no better than chance (it can be negative when agreement is worse than chance), and is one when the diagonal has the only non-zero values
30
Kappa Example
- Example here is the educational level of one’s parents (maeduc and paeduc; as in ‘ma and pa education’)
- Use “GSS91 social.sav” data set
- Define new variables madeg and padeg, which are derived from maeduc and paeduc (convert years of education into rough levels of achievement)
31
Kappa Example
- New scale for madeg and padeg is:
  - Education <12 is code 1, “LT High School”
  - Education 12-15 is code 2, “High School”
  - Education 16 is code 3, “Bachelor degree”
  - Education 17+ is code 4, “Graduate”
- Use Analyze / Descriptive Statistics / Crosstabs…
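The recoding from years of education to the four levels is a simple threshold mapping. A sketch of the rule as stated on the slide, with a hypothetical function name (in SPSS this would be done with a recode step instead):

```python
def degree_level(years):
    """Map years of education to the (code, label) scale used for madeg/padeg."""
    if years < 12:
        return 1, "LT High School"
    if years <= 15:
        return 2, "High School"
    if years == 16:
        return 3, "Bachelor degree"
    return 4, "Graduate"

print(degree_level(14))  # (2, 'High School')
```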
32
Kappa Example
- Select “padeg” for Row(s), and “madeg” for Column(s)
- For Statistics select Kappa
- The basic crosstab just shows the data counts (next slide)
- Then we get the Kappa measure (slide after next)
- As usual, check to make sure the result is significant before going any further
33
Kappa Example
[Crosstab of data counts not reproduced]
34
Kappa Example
[Kappa measure output not reproduced]
35
Kappa Example
- Here the significance is 0.000, very clearly significant (< 0.050)
- This is confirmed by the approximate T of over 20; as before, this T is based on the null hypothesis
- The actual value of kappa and its standard error are 0.325 +/- 0.018
- What does this mean?
36
Kappa
- Kappa is judged on a fairly fixed scale:
  - Kappa below 0.40 indicates poor agreement beyond chance
  - Kappa from 0.40 to 0.75 is fair to good agreement
  - Kappa above 0.75 is strong agreement
- So in this case we are confident there is poor agreement between parents’ education
- Scale from J.L. Fleiss, 1981
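The slides report kappa from SPSS without showing how it is computed. Assuming the standard definition (observed agreement on the diagonal, corrected for the agreement expected by chance from the marginal totals), a sketch with the interpretation scale above might look like this. The function names and toy tables are hypothetical.

```python
def cohens_kappa(table):
    """Kappa for a square table whose rows and columns share categories."""
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)

def fleiss_label(kappa):
    """Interpretation scale quoted on the slide (Fleiss, 1981)."""
    if kappa < 0.40:
        return "poor"
    if kappa <= 0.75:
        return "fair to good"
    return "strong"

print(fleiss_label(0.325))  # poor (the value reported for padeg vs madeg)
```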
37
Ordinal Crosstab Measures
- Several association measures can be used for a table with R rows and C columns which contain ordinal data (and presumably R ≠ C):
  - Kendall’s tau-b
  - Kendall’s tau-c
  - (Goodman and Kruskal’s) Gamma (preferred)
  - Somers’ d
  - Spearman’s Correlation Coefficient
38
General RxC Table Measures
- Many are based on comparing pairs of cases on the two variables
- If B increases when A increases, the pair is concordant
- If B decreases when A increases, the pair is discordant
- If the two cases have the same value of A or of B, the pair is tied
39
General RxC Table Measures
- The number of concordant pairs is “P”
- The number of discordant pairs is “Q”
- The number of ties on X is “Tx”
- The number of ties on Y is “Ty”
- The smaller of the number of rows R and columns C is called “m”: m = min(R,C)
- Given this vocabulary, we can define many measures
40
General RxC Table Measures
- Kendall’s tau-b is tau-b = (P-Q) / sqrt[ (P+Q+Tx)*(P+Q+Ty) ]
- Kendall’s tau-c is tau-c = 2m*(P-Q) / [N^2*(m-1)]
- Gamma (γ) is Gamma = (P-Q) / (P+Q)
- Somers’ d is dy = (P-Q) / (P+Q+Ty) or dx = (P-Q) / (P+Q+Tx)
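These formulas can be tried out on a small table of counts. The sketch below makes two assumptions that the slides leave implicit: N is the total number of cases, and Tx and Ty count pairs tied on that variable only (pairs tied on both are excluded). Rows are treated as X and columns as Y; the function names and the toy table are hypothetical.

```python
from math import sqrt

def pair_counts(table):
    """Return (P, Q, Tx, Ty) for a table whose rows are X and columns are Y."""
    rows, cols = len(table), len(table[0])
    p = q = tx = ty = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i, rows):
                for l in range(cols):
                    if k == i and l <= j:
                        continue  # same cell, or a same-row pair already counted
                    weight = table[i][j] * table[k][l]
                    if k == i:
                        tx += weight  # same X, different Y
                    elif l == j:
                        ty += weight  # same Y, different X
                    elif l > j:
                        p += weight   # X and Y both increase: concordant
                    else:
                        q += weight   # X increases, Y decreases: discordant
    return p, q, tx, ty

def ordinal_measures(table):
    p, q, tx, ty = pair_counts(table)
    n = sum(sum(row) for row in table)       # N: total number of cases (assumed)
    m = min(len(table), len(table[0]))
    return {
        "tau_b": (p - q) / sqrt((p + q + tx) * (p + q + ty)),
        "tau_c": 2 * m * (p - q) / (n ** 2 * (m - 1)),
        "gamma": (p - q) / (p + q),
        "somers_dy": (p - q) / (p + q + ty),
        "somers_dx": (p - q) / (p + q + tx),
    }

toy = [[3, 1],
       [1, 3]]  # hypothetical counts
print(pair_counts(toy))        # (9, 1, 6, 6)
print(ordinal_measures(toy))
```

On this toy table gamma (0.8) is noticeably larger than tau-b, tau-c, and Somers' d (all 0.5), because gamma ignores tied pairs entirely.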
41
General RxC Table Measures
- All of the RxC measures are symmetric except Somers’ d, which has both symmetric and directional values given
- All are evaluated by their significance, which also has an approximate T score
- All are expressed by a Value +/- its Std Error
42
RxC Measures Example
- Use “GSS91 social.sav” data set
- Use Analyze / Descriptive Statistics / Crosstabs…
- Select “paeduc” for Row(s), and “maeduc” for Column(s)
- Under “Statistics…” select Eta, Correlations, Gamma, Somers’ d, Kendall’s tau-b and tau-c
43
RxC Measures Example
- This compares the number of years of education of one’s mother and father to see how strongly they affect one another
- The crosstab data table is very large, since it ranges from 0 to 20 for each category, with irregular gaps (we’re not using the simplified categories from the Kappa example)
- Hence we’re not showing it here!
44
RxC Measures Example
- Both measures show the mother’s education is a slightly better predictor
45
RxC Measures Example
- Directional measures: Somers’ d is significant
- It shows that there are about 55% +/- 2% more concordant pairs than discordant ones, excluding ties on the independent variable
- The Eta measure shows that around 69% of the variability of one parent’s education is shared with the other’s
46
RxC Measures Example
[Slide image not reproduced]
47
RxC Measures Example
- All of the symmetric measures are statistically significant, with approximate t values around 27-28
- The Kendall tau-b and tau-c measures disagree a little on the magnitude of the agreement
- Gamma and Spearman give fairly strong positive correlations
48
RxC Measures Example
- Spearman, like ‘r’, ranges from -1 to +1, and does not require a normal distribution
- Based on ordered categories, not their values
- Even ‘r’ can be calculated for this case, and it gives results similar to Gamma and Spearman
49
Yule’s Q
- A special case of gamma for a 2x2 table is called Yule’s Q
- It is appropriate for ordinal data in 2x2 tables; so values for each variable are Low/High, Yes/No, or similar
- Define Yule’s Q = (a*d - b*c) / (a*d + b*c)
- See PDF page 59 of Action Research handout for the definition of a, b, c, and d (cell labels)
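A small sketch of the Yule's Q formula. The cell labels are defined in the handout rather than here, so the layout below (a and b in the first row, c and d in the second) is an assumption, as are the function name and the counts.

```python
def yules_q(a, b, c, d):
    """Yule's Q for a 2x2 table assumed to be laid out as [[a, b], [c, d]]."""
    return (a * d - b * c) / (a * d + b * c)

# Hypothetical counts: [[3, 1], [1, 3]]
q = yules_q(3, 1, 1, 3)
print(q)  # 0.8
```

For a 2x2 table, a*d is the number of concordant pairs and b*c the number of discordant pairs, which is why Q coincides with gamma.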
50
Yule’s Q
- Measures the strength and direction of association from -1 (perfect negative association) to 0 (no association) to +1 (perfect positive association)
- Judge the results for Yule’s Q by the table on page 59 of Action Research handout; and see pages 58-64 for other related discussion