The Multigraph for Loglinear Models Harry Khamis Statistical Consulting Center Wright State University Dayton, Ohio, USA
OUTLINE
1. LOGLINEAR MODEL (LLM)
- two-way table
- three-way table
- examples
2. MULTIGRAPH
- construction
- maximum spanning tree
- conditional independencies
- collapsibility
3. EXAMPLES
Loglinear Model
Goal: Identify the structure of associations among a set of categorical variables.
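LLM: two variables
Two-way table of counts (rows: levels of X; columns: levels of Y):
X \ Y | 1 | 2 | 3 | … | J | Total
1 | $n_{11}$ | $n_{12}$ | $n_{13}$ | … | $n_{1J}$ | $n_{1+}$
2 | $n_{21}$ | $n_{22}$ | $n_{23}$ | … | $n_{2J}$ | $n_{2+}$
… | … | … | … | … | … | …
I | $n_{I1}$ | $n_{I2}$ | $n_{I3}$ | … | $n_{IJ}$ | $n_{I+}$
Total | $n_{+1}$ | $n_{+2}$ | $n_{+3}$ | … | $n_{+J}$ | $n$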
LLM: two variables
Example: Survey of High School Seniors in Dayton, Ohio
Collaboration: WSU Boonshoft School of Medicine and United Health Services of Dayton
Cigarette Use? | Marijuana Use? Yes | Marijuana Use? No | Total
Yes | | |
No | | |
Total | | |
LLM: two variables
Two discrete variables, X and Y
Model of independence: generating class is [X][Y]
LLM: two variables
LLM of independence:
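The formula on this slide did not survive extraction. In one common notation (an assumption, not necessarily the slide's own symbols), with $m_{ij}$ the expected count in cell $(i, j)$ and the $\lambda$ terms the model parameters, the independence model [X][Y] can be written as:

```latex
\[
\log m_{ij} = \lambda + \lambda_i^{X} + \lambda_j^{Y},
\qquad i = 1, \dots, I, \quad j = 1, \dots, J .
\]
```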
LLM: two variables
Saturated LLM: generating class is [XY]:
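The formula on this slide did not survive extraction. In one common notation (an assumption, not necessarily the slide's own symbols), with $m_{ij}$ the expected count in cell $(i, j)$, the saturated model [XY] adds an interaction term to the main effects of X and Y:

```latex
\[
\log m_{ij} = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \lambda_{ij}^{XY},
\qquad i = 1, \dots, I, \quad j = 1, \dots, J .
\]
```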
LLM: two variables
Interpretation | Generating Class | Probabilistic Model
X and Y independent | [X][Y] | $p_{ij} = p_{i+}\,p_{+j}$
X and Y dependent | [XY] | $p_{ij}$
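A minimal sketch of fitting the independence model [X][Y] to a two-way table, using the product form $p_{ij} = p_{i+}\,p_{+j}$ and the likelihood-ratio statistic G². The function names and the counts are hypothetical, chosen only for illustration.

```python
import math

def independence_fit(table):
    """Expected counts under [X][Y]: m_ij = n_i+ * n_+j / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def g_squared(observed, expected):
    """Likelihood-ratio statistic G^2 = 2 * sum n_ij * log(n_ij / m_ij)."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for n_ij, m_ij in zip(obs_row, exp_row):
            if n_ij > 0:  # a zero cell contributes nothing to the sum
                total += n_ij * math.log(n_ij / m_ij)
    return 2.0 * total

# Hypothetical 2x2 counts, invented purely for illustration.
counts = [[30, 10], [20, 40]]
fitted = independence_fit(counts)
print(fitted)
print(g_squared(counts, fitted))
```

A large G² relative to its degrees of freedom, (I - 1)(J - 1), suggests that [X][Y] fits poorly and that the saturated model [XY] is needed.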
LLM: three variables
Example: Dayton High School Data
Alcohol Use | Cigarette Use | Marijuana Use: Yes | Marijuana Use: No
Yes | Yes | |
Yes | No | |
No | Yes | 3 | 43
No | No | |
LLM: three variables
Saturated LLM, [XYZ]:
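The formula on this slide did not survive extraction. In one common notation (an assumption, not necessarily the slide's own symbols), with $m_{ijk}$ the expected count in cell $(i, j, k)$, the saturated three-way model contains every main effect and every interaction:

```latex
\[
\log m_{ijk} = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \lambda_k^{Z}
  + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}
  + \lambda_{ijk}^{XYZ} .
\]
```

Each smaller generating class corresponds to dropping terms from this expression; for example, [XY][XZ] drops $\lambda_{jk}^{YZ}$ and $\lambda_{ijk}^{XYZ}$.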
LLM: three variables
Interpretation | Generating Class | Probabilistic Model
mutual independence | [X][Y][Z] | $p_{ijk} = p_{i++}\,p_{+j+}\,p_{++k}$
joint independence | [XZ][Y] | $p_{ijk} = p_{i+k}\,p_{+j+}$
conditional independence | [XY][XZ] | $p_{ijk} = p_{ij+}\,p_{i+k}/p_{i++}$
homogeneous association * | [XY][XZ][YZ] | *
saturated model | [XYZ] | $p_{ijk}$
* nondecomposable model
Decomposable LLMs
- closed-form expression for MLEs
- closed-form expression for asymptotic variances (Lee, 1977)
- conditional G² statistic simplifies
- allow for causal interpretations
- easier to interpret the LLM
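As an illustration of the closed-form MLEs, the sketch below computes fitted counts for the decomposable model [XY][XZ] directly from marginal totals, $\hat m_{ijk} = n_{ij+}\,n_{i+k}/n_{i++}$, with no iterative fitting. The function name and the counts are hypothetical, and every $n_{i++}$ is assumed to be positive.

```python
from itertools import product

def fit_conditional_independence(n):
    """Closed-form fitted counts for [XY][XZ]: m_ijk = n_ij+ * n_i+k / n_i++.

    `n` is a nested list indexed as n[i][j][k] for factors X, Y, Z.
    """
    I, J, K = len(n), len(n[0]), len(n[0][0])
    m = [[[0.0] * K for _ in range(J)] for _ in range(I)]
    for i in range(I):
        n_i = sum(n[i][j][k] for j in range(J) for k in range(K))  # n_i++
        for j, k in product(range(J), range(K)):
            n_ij = sum(n[i][j])                                    # n_ij+
            n_ik = sum(n[i][jj][k] for jj in range(J))             # n_i+k
            m[i][j][k] = n_ij * n_ik / n_i
    return m

# Hypothetical 2x2x2 counts, invented purely for illustration.
counts = [[[10, 20], [30, 40]],
          [[5, 15], [25, 35]]]
fitted = fit_conditional_independence(counts)
print(fitted)
```

The fitted table reproduces the observed XY and XZ margins, which is what the generating class [XY][XZ] requires.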
3 Categorical Variables: X, Y, and Z
If [X ⊗ Y] and [Y ⊗ Z], then [X ⊗ Z]: FALSE!
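A small counterexample makes the point concrete. In the sketch below (a hypothetical distribution, not taken from the talk), X and Y are independent fair coins and Z is a copy of X, so [X ⊗ Y] and [Y ⊗ Z] both hold while X and Z are perfectly dependent.

```python
from itertools import product

# Hypothetical joint distribution p[(x, y, z)]: X and Y are independent
# fair coins and Z is an exact copy of X.
p = {(x, y, z): (0.25 if z == x else 0.0)
     for x, y, z in product((0, 1), repeat=3)}

def marginal(p, keep):
    """Sum the joint distribution down to the axes listed in `keep`."""
    out = {}
    for cell, prob in p.items():
        key = tuple(cell[a] for a in keep)
        out[key] = out.get(key, 0.0) + prob
    return out

def independent(p, a, b, tol=1e-12):
    """True if the factors on axes a and b are marginally independent."""
    pab, pa, pb = marginal(p, (a, b)), marginal(p, (a,)), marginal(p, (b,))
    return all(abs(pab[(u, v)] - pa[(u,)] * pb[(v,)]) <= tol
               for (u, v) in pab)

X, Y, Z = 0, 1, 2
print(independent(p, X, Y), independent(p, Y, Z), independent(p, X, Z))
```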
LLM: three variables
Interpretation | Generating Class | Probabilistic Model
mutual independence | [X][Y][Z] | $p_{ijk} = p_{i++}\,p_{+j+}\,p_{++k}$
joint independence | [XZ][Y] | $p_{ijk} = p_{i+k}\,p_{+j+}$
conditional independence | [XY][XZ] | $p_{ijk} = p_{ij+}\,p_{i+k}/p_{i++}$
homogeneous association | [XY][XZ][YZ] | $p_{ijk} = \psi_{ij}\,\phi_{ik}\,\omega_{jk}$
saturated model | [XYZ] | $p_{ijk}$
3 Categorical Variables: X, Y, and Z
If [Y ⊗ Z] for all X = 1, 2, …, then [Y ⊗ Z]: FALSE!
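The sketch below gives a hypothetical counterexample: Y and Z are independent within each level of X (odds ratio 1 in both conditional tables), yet they are strongly associated once the table is collapsed over X. The probabilities, and the assumption that the two levels of X are equally likely, are invented for illustration.

```python
from fractions import Fraction as F

def product_table(py, pz):
    """2x2 table for independent Y and Z with P(Y=1)=py and P(Z=1)=pz."""
    return [[py * pz, py * (1 - pz)],
            [(1 - py) * pz, (1 - py) * (1 - pz)]]

def odds_ratio(t):
    """Cross-product ratio of a 2x2 table; 1 means no association."""
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])

# Hypothetical conditional tables of (Y, Z), one per level of X.
given_x1 = product_table(F(9, 10), F(9, 10))
given_x2 = product_table(F(1, 10), F(1, 10))

# Collapse over X, assuming P(X=1) = P(X=2) = 1/2.
collapsed = [[(a + b) / 2 for a, b in zip(r1, r2)]
             for r1, r2 in zip(given_x1, given_x2)]

print(odds_ratio(given_x1), odds_ratio(given_x2), odds_ratio(collapsed))
```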
3 Categorical Variables: X, Y, and Z
If [Y ⊗ Z], then [Y ⊗ Z] for all X = 1, 2, 3, …: FALSE!
Which Treatment is Better?
TRIAL 1, CURED? Yes: Treatment A 40 (.20); Treatment B 30 (.15)
TRIAL 2, CURED? Yes: Treatment A (.85); Treatment B (.75)
Combine TRIALS 1 and 2, CURED? Yes: Treatment A 125 (.42); Treatment B 330 (.55)
“Ask Marilyn”, PARADE section, DDN, pages 6-7, April 28,
Florida Homicide Convictions Resulting in Death Penalty
ML Radelet and GL Pierce, Florida Law Review 43: 1-34, 1991
All victims:
Defendant’s Race | Death Penalty: Yes | Death Penalty: No
White | 53 (0.11) | 430
Black | 15 (0.08) | 176
By victim’s race:
Defendant’s Race | White Victim: Yes | White Victim: No | Black Victim: Yes | Black Victim: No
White | 53 (0.11) | 414 | 0 (0.00) | 16
Black | 11 (0.23) | 37 | 4 (0.03) | 139
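The reversal in the Florida data can be checked directly from the counts on the slide. The sketch below recomputes the death-penalty proportions within each victim group and after collapsing over victim's race; the variable and function names are illustrative only.

```python
# Counts from the slide: (death penalty yes, death penalty no).
by_victim = {
    "White victim": {"White": (53, 414), "Black": (11, 37)},
    "Black victim": {"White": (0, 16), "Black": (4, 139)},
}

def rate(yes, no):
    """Proportion of convictions that resulted in the death penalty."""
    return yes / (yes + no)

# Collapse over victim's race by adding the two partial tables.
collapsed = {}
for table in by_victim.values():
    for race, (yes, no) in table.items():
        y, n = collapsed.get(race, (0, 0))
        collapsed[race] = (y + yes, n + no)

for victim, table in by_victim.items():
    for race, cell in table.items():
        print(victim, race, round(rate(*cell), 2))
for race, cell in collapsed.items():
    print("Collapsed", race, round(rate(*cell), 2))
```

Within each victim group the proportion is higher for Black defendants, while in the collapsed table it is higher for White defendants.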
Multigraph Representation of LLMs
Vertices = generators of the LLM
Multiedges = edges that are equal in number to the number of indices shared by the two vertices being joined
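The construction can be sketched in a few lines. The representation below is an assumption made for illustration: each generator is a string of single-letter factor labels, and the multigraph is stored as a dictionary mapping each pair of vertices to the set of indices they share (the edge multiplicity is the size of that set).

```python
from itertools import combinations

def multigraph(generators):
    """Map each pair of generators sharing at least one index to the shared set."""
    edges = {}
    for g1, g2 in combinations(generators, 2):
        shared = set(g1) & set(g2)
        if shared:  # no shared index means no edge
            edges[(g1, g2)] = shared
    return edges

edges = multigraph(["AS", "ACR", "MCS", "MAC"])
for (g1, g2), shared in edges.items():
    print(g1, g2, len(shared), sorted(shared))
```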
Multigraph: three variables
Generating class: [XY][XZ]
Vertices: XY, XZ
Multiedges: XY and XZ share X (single edge)
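Examples of Multigraphs
Generating class: [AS][ACR][MCS][MAC]
Vertices: AS, ACR, MAC, MCS
Multiedges: ACR and MAC share A, C (double edge); MAC and MCS share M, C (double edge); AS and ACR share A; AS and MAC share A; AS and MCS share S; ACR and MCS share C (single edges)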
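Examples of Multigraphs
Generating class: [ABCD][ACE][BCG][CDF]
Vertices: ABCD, ACE, BCG, CDF
Multiedges: ABCD and ACE share A, C; ABCD and BCG share B, C; ABCD and CDF share C, D (double edges); ACE and BCG, ACE and CDF, BCG and CDF each share C (single edges)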
Maximum Spanning Tree
The maximum spanning tree of a multigraph M:
- is a tree (connected graph with no circuits)
- includes each vertex
- has the maximum sum of edges
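One way to find such a tree is Kruskal's algorithm with the edges taken heaviest first. This is a sketch of that standard technique rather than a procedure stated in the talk, and it assumes generators are given as strings of single-letter factor labels. If the multigraph is disconnected, it returns a spanning forest.

```python
from itertools import combinations

def max_spanning_tree(generators):
    """Kruskal's algorithm on the generator multigraph, heaviest edges first."""
    edges = [(len(set(a) & set(b)), a, b)
             for a, b in combinations(generators, 2)]
    edges = [e for e in edges if e[0] > 0]
    edges.sort(key=lambda e: -e[0])  # stable sort: ties keep input order

    parent = {g: g for g in generators}  # union-find forest

    def find(g):
        while parent[g] != g:
            g = parent[g]
        return g

    tree = []
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # joining two pieces creates no circuit
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree

tree = max_spanning_tree(["ABCD", "ACE", "BCG", "CDF"])
print(tree, sum(w for _, _, w in tree))
```

When several trees attain the maximum sum, as for [AS][ACR][MCS][MAC], the tree returned depends on the order in which the generators are listed.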
Examples of maximum spanning trees
Generating class: [XY][XZ]
Maximum spanning tree: the single branch joining XY and XZ (shared index X)
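Examples of maximum spanning trees
Generating class: [AS][ACR][MCS][MAC]
Maximum spanning tree: the double edges joining ACR to MAC and MAC to MCS, plus one single edge joining AS to one of the other vertices (sum of edges = 5)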
Examples of maximum spanning trees
Generating class: [ABCD][ACE][BCG][CDF]
Maximum spanning tree: ABCD joined to each of ACE, BCG, and CDF by a double edge (sum of edges = 6)
Fundamental Conditional Independencies for a Decomposable LLM
1. Let S be the set of indices in a branch of the maximum spanning tree.
2. Remove each factor of S from the multigraph, M; the resulting multigraph is M/S.
3. An FCI is determined as [C_1 ⊗ C_2 ⊗ … ⊗ C_k | S], where C_1, C_2, …, C_k are the sets of factors in the components of M/S.
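The three steps can be sketched as follows, assuming generators are strings of single-letter factor labels and using Kruskal's algorithm (a standard technique, not one named in the talk) for the maximum spanning tree. All function names are hypothetical. One extension is an assumption of this sketch: S = ∅ is also tried, so that a disconnected multigraph such as the one for [E][TB] yields an unconditional independence.

```python
from itertools import combinations

def max_spanning_tree(generators):
    """Kruskal's algorithm, heaviest edges first; returns (a, b, weight) branches."""
    edges = [(len(set(a) & set(b)), a, b)
             for a, b in combinations(generators, 2)]
    edges = sorted((e for e in edges if e[0] > 0), key=lambda e: -e[0])
    parent = {g: g for g in generators}

    def find(g):
        while parent[g] != g:
            g = parent[g]
        return g

    tree = []
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree

def components(vertices):
    """Factor sets of the connected components (vertices sharing an index are joined)."""
    comps = []
    for v in vertices:
        v = set(v)
        for c in [c for c in comps if c & v]:
            comps.remove(c)
            v |= c
        comps.append(v)
    return comps

def fcis(generators):
    """Return (sorted component labels, S label) for each S that splits M/S."""
    # Step 1: S ranges over the branches; the empty set is this sketch's addition.
    candidates = [set()] + [set(a) & set(b)
                            for a, b, _ in max_spanning_tree(generators)]
    result = []
    for s in candidates:
        reduced = [set(g) - s for g in generators]        # step 2: M/S
        comps = components([r for r in reduced if r])     # step 3: components
        if len(comps) > 1:
            labels = sorted("".join(sorted(c)) for c in comps)
            result.append((labels, "".join(sorted(s))))
    return result

print(fcis(["XY", "XZ"]))
print(fcis(["ACE", "MAC", "MCG"]))
```

For [XY][XZ] the result corresponds to [Y ⊗ Z | X]; for [ACE][MAC][MCG] it includes [E ⊗ M,G | A,C].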
FCIs
Generating class: [XY][XZ]
Multigraph: XY and XZ joined by one branch (shared index X)
S = {X}
M/S: components Y and Z
FCI: [Y ⊗ Z | X]
Collapsibility Conditions
Consider a conditional independence relationship of the form [C_1 ⊗ C_2 | S]. If the levels of all factors in C_1 are collapsed, then all relationships among the remaining factors are undistorted EXCEPT for relationships among factors in S.
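A numerical check of this condition, under stated assumptions: the hypothetical distribution below satisfies [Y ⊗ Z | X], so collapsing over Y (here C_1 = {Y}) should leave the X-Z relationship undistorted. The sketch compares the X-Z odds ratio within each level of Y with the X-Z odds ratio in the collapsed table. All probabilities are invented for illustration.

```python
from fractions import Fraction as F

# Hypothetical distribution satisfying [Y ⊗ Z | X]:
# p_ijk = P(X=i) * P(Y=j | X=i) * P(Z=k | X=i).
p_x = [F(1, 2), F(1, 2)]
p_y_given_x = [[F(1, 4), F(3, 4)], [F(2, 3), F(1, 3)]]
p_z_given_x = [[F(1, 5), F(4, 5)], [F(1, 2), F(1, 2)]]

p = [[[p_x[i] * p_y_given_x[i][j] * p_z_given_x[i][k]
       for k in range(2)] for j in range(2)] for i in range(2)]

def odds_ratio(t):
    """Cross-product ratio of a 2x2 table."""
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])

# X-Z odds ratio within each level of Y (partial association).
partial = [odds_ratio([[p[i][j][k] for k in range(2)] for i in range(2)])
           for j in range(2)]

# X-Z odds ratio after collapsing (summing) over Y.
collapsed = [[sum(p[i][j][k] for j in range(2)) for k in range(2)]
             for i in range(2)]
marginal = odds_ratio(collapsed)

print(partial, marginal)
```

The partial and marginal odds ratios agree, which is what "undistorted" means for the X-Z relationship in this example.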
FCIs
Generating class: [XY][XZ]
Multigraph: XY and XZ joined by one branch (shared index X)
S = {X}
M/S: components Y and Z
FCI: [Y ⊗ Z | X]
Example: Ob-Gyn Study (Darrocca, et al., 1996)
n = 201 pregnant mothers
Variables:
- E: EGA (Early, Late)
- B: Bishop score (High, Low)
- T: Treatment (Prostin, Placebo)
Example: Ob-Gyn Study
TREATMENT (T) | BISHOP SCORE (B) High, EGA (E) Early | High, Late | Low, Early | Low, Late
Prostin | | | |
Placebo | | | |
Best-fitting model: [E][TB]
Example: Ob-Gyn Study
Generating Class: [E][TB]
Multigraph: vertices E and TB, with no edge between them (no shared index)
FCI: [E ⊗ T,B]
Example: Ob-Gyn Study
Collapsed Table (collapse over EGA):
TREATMENT (T) | BISHOP SCORE (B): High | Low | Total
Prostin | 58 (0.55) | |
Placebo | 38 (0.40) | |
P =
Example: WSU-United Way Study
- M: Marijuana (No, Yes)
- A: Alcohol (No, Yes)
- C: Cigarettes (No, Yes)
- R: Race (Other, White)
- S: Sex (Female, Male)
Observed cell frequencies (n = 2,276):
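Example: WSU-United Way Study
Generating class: [ACE][MAC][MCG]
Multigraph, M: vertices ACE, MAC, MCG
Multiedges: ACE and MAC share A, C (double edge); MAC and MCG share M, C (double edge); ACE and MCG share C (single edge)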
Example: WSU-United Way Study
M: vertices ACE, MAC, MCG
S = {A, C}
M/S: vertices E, M, MG, forming the components {E} and {M, G}
FCI: [E ⊗ M,G | A,C]
A = Alcohol, C = Cigarette, E = Ethnic, G = Gender, M = Marijuana
Example: WSU PASS Program
“Preparing for Academic Success”
GPA below 2.0 at the end of first quarter
Example: WSU PASS Program
Variables (n = 972):
FACTOR | LABEL | LEVELS
Retention | R | 1=No, 2=Yes
Cohort | C | 1, 2, 3, 4
PASS Participation | P | 1=No, 2=Yes
Ethnic Group | E | 1=Caucasian, 2=African-American, 3=Other
Gender | G | 1=Male, 2=Female
Example: WSU PASS Program
The best-fitting LLM has generating class [EG][CP][RC][PG]
Multigraph, M: vertices EG, PG, CP, RC
Multiedges (all single): EG and PG share G; PG and CP share P; CP and RC share C
Example: WSU PASS Program
M: vertices EG, PG, CP, RC
S = {C}
M/S: vertices EG, PG, P, R, forming the components {E, G, P} and {R}
FCI: [E,G,P ⊗ R | C]
C = Cohort, E = Ethnic, G = Gender, P = PASS Participation, R = Retention
Example: Affinal Relations in Bosnia-Herzegovina
Data courtesy of Dr. Keith Doubt, Department of Sociology, Wittenberg University, Springfield, Ohio
N = 861 couples from Bosnia-Herzegovina are surveyed concerning affinal relations.
- M: Marriage Type (traditional, elopement)
- L: Location of Man and Wife (same, different)
- E: Ethnicity (Bosniak, Serb, Croat)
- S: Settlement (rural, urban)
Best-fitting model: [MLES]
Consider structural associations among M, L, and S for each ethnic group (E) separately.
Example: Affinal Relations in Bosnia-Herzegovina
- Bosniaks: [ML][LS]
- Serbs: [MS][SL]
- Croats: [M][L][S]
M: Marriage Type, L: Location of Man and Wife, S: Settlement
Conclusions
The generator multigraph uses mathematical graph theory to analyze and interpret LLMs in a facile manner.
Properties of the multigraph allow one to:
- Find all conditional independencies
- Determine all collapsibility conditions
REFERENCE
Khamis, H.J. (2011). The Association Graph and the Multigraph for Loglinear Models, SAGE series Quantitative Applications in the Social Sciences, No
Without data, you’re just one more person with an opinion