Using Multilevel Modeling in Institutional Research
Ling Ning & Mayte Frias, Senior Research Associate
Neil Huefner, Associate Director
Timo Rico, Executive Director
Research interests:
- Developing effect size measures to quantify multilevel interventions
- Exploring the robustness of MLM to assumption violations
- Methods to evaluate measurement invariance, and the impact of non-invariance on statistical modeling
Multilevel Data Structure
- In institutional research, the data structure in the population is usually hierarchical, with students nested within instructors, majors, departments, colleges and campus divisions.
- Sampling is conducted using multi-stage cluster sampling, rather than simple random sampling.
- These two factors give rise to multilevel data, in which lower-level units are nested within higher-level units.
Overview
- Motivations for using Multilevel Modeling (MLM) in Institutional Research
- Consequences of not using MLM with clustered data
- What is Multilevel Modeling (MLM)?
- Random intercept, random slope and cross-level models
- Computation software
How can the use of Multilevel Modeling (MLM) positively contribute to Institutional Research?
- Allows one to calculate standard errors more accurately
- Offers insight into how effects vary at higher levels within the structure
- Permits the inclusion of predictors at different levels within the model
Motivations for MLM | Standard Error Adjustment
- Conventional methods assume independent observations: knowing one student's information tells us nothing about another student's information.
- This is often not realistic in institutional research.
- Students in the same major or student support program share the same context, so they tend to be more similar to one another.
- The characteristics of one student carry information about the major, and therefore about other students in the same major.
Motivations for MLM | Standard Error Adjustment
Intraclass Correlation (ICC; ρ)
- One way to quantify the degree of clustering (i.e., the common effect associated with majors)
- ρ = τ00 / (τ00 + σ²), where τ00 = variance associated with majors and σ² = student variance
- Roughly speaking, the correlation between students in the same major
- Using the previous example, divide the total variance into these two parts
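Motivations for MLM | Standard Error Adjustment
Consequences of Clustering
- Clustering means overlapping information: clustered observations carry less information than the same number of independent observations, so the correct SE is larger.
- Ignoring clustering treats the overlapping information as if it were unique (inflated information), so the estimated SE is too small.
- An SE that is too small leads to an inflated Type I error rate (α) and spurious results.
[Figure: two overlapping circles representing the information from two students in the same cluster; the overall area is less than that of two separate circles.]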
Motivations for MLM | Standard Error Adjustment
Example: Effect of a Tutoring Program on Retention
- 1000 students nested within 63 majors
- 315 participants, 685 non-participants
- Model: Participation → Retention
Motivations for MLM | Standard Error Adjustment
Inflation of the nominal alpha level of 0.05 in the presence of intraclass correlation (ICC)

Average n      Value of the intraclass correlation (ICC)
per group      0.00    0.01    0.05    0.20
10             0.05    0.06    0.11    0.28
25             0.05    0.08    0.19    0.46
50             0.05    0.11    0.30    0.59
100            0.05    0.17    0.43    0.70

Source: Based on Barcikowski, R. S. (1981). Statistical power with group mean as the unit of analysis. Journal of Educational Statistics, 6, 267-285.
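One common way to see why even a small ICC matters is the Kish design effect, 1 + (n - 1)ρ, which approximates how much the sampling variance of a mean is inflated when n students are sampled per cluster. The slides do not state this formula, so the sketch below is an added illustration under that assumption; the function names are hypothetical. It does not reproduce Barcikowski's Type I error rates, only the trend behind them.

```python
import math


def design_effect(n_per_group: float, icc: float) -> float:
    """Kish design effect: variance inflation of a mean under clustering."""
    return 1.0 + (n_per_group - 1.0) * icc


def se_inflation(n_per_group: float, icc: float) -> float:
    """Factor by which the correct SE exceeds the naive (independence) SE."""
    return math.sqrt(design_effect(n_per_group, icc))


# Same grid of group sizes and ICC values as the table above.
for n in (10, 25, 50, 100):
    for icc in (0.00, 0.01, 0.05, 0.20):
        deff = design_effect(n, icc)
        print(f"n={n:>3}  ICC={icc:.2f}  design effect={deff:5.2f}  "
              f"SE inflation={se_inflation(n, icc):4.2f}")
```

With an ICC of 0 the design effect is exactly 1, so the naive SE is correct; it grows with both the ICC and the group size, which matches the direction of the table.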
What is MLM | Unconditional Random Intercept Model (RIM)
Level 1: log(Pij / (1 - Pij)) = β0j (j = 1, 2, …, 63)
Level 2: β0j = γ00 + U0j

Parameter                     Estimate   SE
Fixed effects
  Intercept (γ00)             2.91       1.26
  Tutoring (β1)               -          -
  STEM (γ01)                  -          -
  HS GPA (β2)                 -          -
  Tutoring*STEM (γ11)         -          -
Variance estimates
  Level-one variance (σ²)     3.29       -
  Intercept variance (τ00)    0.44       0.34
  Slope variance (τ11)        -          -
Model fit measures
  Deviance                    353.62
  AIC                         357.62
  BIC                         362.83
(- = not in the model or not reported)
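The variance estimates of the unconditional model can be turned into an ICC. The sketch below assumes the latent-variable formulation of the logistic model, in which the level-one variance is fixed at π²/3 (about 3.29, the value reported in the table); the function name `latent_icc` is illustrative and not part of the original slides.

```python
import math

# Variance of the standard logistic distribution, used as the level-one
# variance in the latent-variable formulation (assumption stated above).
LEVEL_ONE_VARIANCE = math.pi ** 2 / 3


def latent_icc(intercept_variance: float,
               level_one_variance: float = LEVEL_ONE_VARIANCE) -> float:
    """ICC = between-major variance / (between-major + student-level variance)."""
    return intercept_variance / (intercept_variance + level_one_variance)


# Intercept variance (tau00) reported for the unconditional RIM.
icc = latent_icc(0.44)
print(f"Level-one variance: {LEVEL_ONE_VARIANCE:.2f}")
print(f"ICC for the unconditional model: {icc:.3f}")
```

Under these assumptions roughly 12% of the variation in the latent retention propensity lies between majors.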
What is MLM | Two-level Random Intercept Model (RIM)
Level 1: log(Pij / (1 - Pij)) = β0j + β1 Tutoringij + β2 HS GPAij (j = 1, 2, …, 63)
Level 2: β0j = γ00 + γ01 STEMj + U0j
M1 = Unconditional RIM, M2 = Conditional RIM

Parameter                     M1 Est.   M1 SE    M2 Est.   M2 SE
Fixed effects
  Intercept (γ00)             2.91      1.26     3.12      1.41
  Tutoring (β1)               -         -        1.86*     0.60
  STEM (γ01)                  -         -        -0.81*    0.36
  HS GPA (β2)                 -         -        3.35      3.84
  Tutoring*STEM (γ11)         -         -        -         -
Variance estimates
  Level-one variance (σ²)     3.29      -        -         -
  Intercept variance (τ00)    0.44      0.34     0.32      0.68
  Slope variance (τ11)        -         -        -         -
Model fit measures
  Deviance                    353.62             331.86
  AIC                         357.62             341.86
  BIC                         362.83             354.88
(- = not in the model or not reported)
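Because the outcome is modeled on the logit scale, the fixed effects are easier to read as odds ratios. The sketch below exponentiates the M2 estimates for Tutoring and STEM; the helper name `odds_ratio` is illustrative, and the interpretation assumes the usual cluster-specific reading of a random intercept logistic model (comparisons are within a given major, holding the other predictors constant).

```python
import math


def odds_ratio(logit_coefficient: float) -> float:
    """Convert a coefficient on the logit scale into an odds ratio."""
    return math.exp(logit_coefficient)


# Conditional RIM (M2) estimates taken from the table above.
tutoring_or = odds_ratio(1.86)   # beta1, Tutoring
stem_or = odds_ratio(-0.81)      # gamma01, STEM

print(f"Tutoring odds ratio: {tutoring_or:.2f}")
print(f"STEM odds ratio:     {stem_or:.2f}")
```

Read this way, tutoring participants have roughly 6.4 times the odds of retention of non-participants in the same major, and the odds of retention in STEM majors are a little under half of those in non-STEM majors.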
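What is MLM | Random Intercept Model
Level 1: log(Pij / (1 - Pij)) = β0j + β1 Tutoringij + β2 HS GPAij (j = 1, 2, …, 63)
Level 2: β0j = γ00 + γ01 STEMj + U0j
[Figure: logit(Retention) plotted against Tutoring, with one line per major (Major 1, …, Major 5, …, Major 63) and a line for the average tutoring effect. Each major has its own intercept β0j around the average intercept γ00.]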
What is MLM | Random Slope Model
Level 1: log(Pij / (1 - Pij)) = β0j + β1j Tutoringij + β2 HS GPAij (j = 1, 2, …, 63)
Level 2: β0j = γ00 + γ01 STEMj + U0j
         β1j = γ10 + U1j
M1 = Unconditional model, M2 = Random intercept model, M3 = Random slope model

Parameter                     M1 Est.   M1 SE    M2 Est.   M2 SE    M3 Est.   M3 SE
Fixed effects
  Intercept (γ00)             2.91      1.26     3.12      1.41     3.22      1.52
  Tutoring (γ10)              -         -        1.86*     0.60     2.75*     0.10
  STEM (γ01)                  -         -        -0.81*    0.36     -1.11*    0.41
  HS GPA (β2)                 -         -        3.35      3.84     3.78      4.12
  Tutoring*STEM (γ11)         -         -        -         -        -         -
Variance estimates
  Level-one variance (σ²)     3.29      -        -         -        -         -
  Intercept variance (τ00)    0.44      0.34     0.32      0.68     0.13      0.48
  Slope variance (τ11)        -         -        -         -        1.33      1.05
Model fit measures
  Deviance                    353.62             331.86             319.56
  AIC                         357.62             341.86             331.56
  BIC                         362.83             354.88             347.19
(- = not in the model or not reported)
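The fit measures in the table can be cross-checked with simple arithmetic. The sketch below assumes the usual definition AIC = deviance + 2k, where k is the number of estimated parameters and the deviance is -2 times the log-likelihood; the slides do not state this, and the helper names are illustrative.

```python
# Deviance and AIC values copied from the table above.
fits = {
    "M1": {"deviance": 353.62, "aic": 357.62},
    "M2": {"deviance": 331.86, "aic": 341.86},
    "M3": {"deviance": 319.56, "aic": 331.56},
}


def implied_n_params(deviance: float, aic: float) -> int:
    """Solve AIC = deviance + 2k for k (assumed definition of AIC)."""
    return round((aic - deviance) / 2)


def deviance_drop(simpler: dict, richer: dict) -> float:
    """Reduction in deviance when moving from the simpler to the richer model."""
    return round(simpler["deviance"] - richer["deviance"], 2)


n_params = {name: implied_n_params(**fit) for name, fit in fits.items()}
drop_m1_m2 = deviance_drop(fits["M1"], fits["M2"])
drop_m2_m3 = deviance_drop(fits["M2"], fits["M3"])

print("Implied number of parameters:", n_params)
print("Deviance drop M1 -> M2:", drop_m1_m2)
print("Deviance drop M2 -> M3:", drop_m2_m3)
```

The implied parameter counts (2, 5 and 6) agree with the models as written: M1 estimates γ00 and τ00, M2 adds three fixed effects, and M3 adds the slope variance τ11. Lower deviance, AIC and BIC all point toward the random slope model among these three.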
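What is MLM | Random Slope Model
Level 1: log(Pij / (1 - Pij)) = β0j + β1j Tutoringij + β2 HS GPAij (j = 1, 2, …, 63)
Level 2: β0j = γ00 + γ01 STEMj + U0j
         β1j = γ10 + U1j
[Figure: logit(Retention) plotted against Tutoring, with one line per major (Major 1, …, Major 5, …, Major 63) and a line for the average tutoring effect. Each major has its own intercept β0j and its own slope β1j around the average intercept γ00.]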
What is MLM | Cross-level Model
Level 1: log(Pij / (1 - Pij)) = β0j + β1j Tutoringij + β2 HS GPAij (j = 1, 2, …, 63)
Level 2: β0j = γ00 + γ01 STEMj + U0j
         β1j = γ10 + γ11 STEMj + U1j
M1 = Unconditional model, M2 = Random intercept model, M3 = Random slope model, M4 = Cross-level model

Parameter                     M1 Est.   M1 SE    M2 Est.   M2 SE    M3 Est.   M3 SE    M4 Est.   M4 SE
Fixed effects
  Intercept (γ00)             2.91      1.26     3.12      1.41     3.22      1.52     3.68      1.54
  Tutoring (γ10)              -         -        1.86*     0.60     2.75**    0.10     2.29*     0.30
  STEM (γ01)                  -         -        -0.81*    0.36     -1.11**   0.41     -1.14*    -
  HS GPA (β2)                 -         -        3.35      3.84     3.78      4.12     3.82      4.13
  Tutoring*STEM (γ11)         -         -        -         -        -         -        5.39**    0.14
Variance estimates
  Level-one variance (σ²)     3.29      -        -         -        -         -        -         -
  Intercept variance (τ00)    0.44      0.34     0.32      0.68     0.26      0.48     0.24      -
  Slope variance (τ11)        -         -        -         -        1.33      1.05     1.61      2.13
Model fit measures
  Deviance                    353.62             331.86             319.56             311.64
  AIC                         357.62             341.86             331.56             323.64
  BIC                         362.83             354.88             347.19             340.88
(- = not in the model or not reported)
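Substituting the two level-2 equations into the level-1 equation gives a single combined equation, which makes the cross-level interaction explicit. The derivation below is added for illustration and follows directly from the model as written on this slide; HSGPA is shorthand for the HS GPA predictor.

```latex
\begin{aligned}
\log\frac{P_{ij}}{1-P_{ij}}
 &= \beta_{0j} + \beta_{1j}\,\mathrm{Tutoring}_{ij} + \beta_{2}\,\mathrm{HSGPA}_{ij} \\
 &= \left(\gamma_{00} + \gamma_{01}\,\mathrm{STEM}_{j} + U_{0j}\right)
  + \left(\gamma_{10} + \gamma_{11}\,\mathrm{STEM}_{j} + U_{1j}\right)\mathrm{Tutoring}_{ij}
  + \beta_{2}\,\mathrm{HSGPA}_{ij} \\
 &= \underbrace{\gamma_{00} + \gamma_{01}\,\mathrm{STEM}_{j} + \gamma_{10}\,\mathrm{Tutoring}_{ij}
  + \gamma_{11}\,\mathrm{STEM}_{j}\times\mathrm{Tutoring}_{ij} + \beta_{2}\,\mathrm{HSGPA}_{ij}}_{\text{fixed part}}
  + \underbrace{U_{0j} + U_{1j}\,\mathrm{Tutoring}_{ij}}_{\text{random part}}
\end{aligned}
```

The term with γ11 is the Tutoring*STEM row of the table: it describes how the tutoring effect differs between STEM and non-STEM majors, while U1j captures the variation in the tutoring effect across majors that STEM status does not explain.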
What is MLM | Retention in each major

63 Majors      Random intercept   Random slope
Major 1        3.01               2.85
Major 2        3.13               2.91
Major 3        2.51               2.12
Major 4        3.74               1.16
…
Major 60       3.49               2.32
Major 61       3.38               2.58
Major 62       4.06               2.73
Major 63       2.89               2.47
Mean           3.22               2.75
Variance       0.48               1.33

The means are the average intercept (γ00) and the average slope (γ10), which together define the overall average model.
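The table can be read through the level-2 equations: each major's intercept and slope equal the average plus a major-specific deviation. As a rough sketch, the snippet below computes those deviations for a few of the listed majors by subtracting the reported means; it ignores the STEM term and treats the reported means as γ00 and γ10, which is a simplification, and the function name is illustrative.

```python
# Reported means, taken as the average intercept and slope (simplifying assumption).
GAMMA_00 = 3.22
GAMMA_10 = 2.75

# (intercept, slope) pairs copied from the table above for three majors.
majors = {
    "Major 2": (3.13, 2.91),
    "Major 4": (3.74, 1.16),
    "Major 63": (2.89, 2.47),
}


def deviations(intercept: float, slope: float) -> tuple:
    """Major-specific deviations from the average intercept and slope."""
    return round(intercept - GAMMA_00, 2), round(slope - GAMMA_10, 2)


for name, (intercept, slope) in majors.items():
    u0, u1 = deviations(intercept, slope)
    print(f"{name}: intercept deviation = {u0:+.2f}, slope deviation = {u1:+.2f}")
```

Major 4, for example, starts from a higher baseline than average but shows a much smaller tutoring effect, which is the kind of between-major variation the slope variance τ11 summarizes.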
MLM | Computing Software
Specialized programs for fitting multilevel models:
- HLM
- MLwiN
General-purpose statistical software:
- SAS
- R
- Stata
- Mplus
MLM | Motivations Recap
1. Learning about effects that vary by higher-level cluster
e.g., a student support program that is more effective in some majors than others.
2. Using all the data to perform inferences for groups with small sample sizes
e.g., a program director wants to know how effective her program is for majors with a very small number of students.
3. Prediction
e.g., to predict a new student's outcome.
MLM | Motivations Recap
4. Analysis of data from cluster sampling
Many national surveys, e.g., PISA (the Programme for International Student Assessment), use a multi-stage probability sample design.
5. Including predictors at different levels
e.g., a student support program that is more effective in some majors than others because of a characteristic of the major, such as STEM or non-STEM.
6. Getting the right standard error
The estimated standard errors of regression coefficients may be wrong when multiple regression is used to analyze multilevel data.
MLM | Good References
- Hox, J. (2010). Multilevel Analysis: Techniques and Applications. Routledge. http://joophox.net/mlbook2/MLbook.htm
- Gelman, A., and Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge. http://www.stat.columbia.edu/~gelman/arm/
- Singer, J., and Willett, J. (2003). Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford. http://gseacademic.harvard.edu/alda/
- Snijders, T., and Bosker, R. (2012). Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling, 2nd Ed. Sage. http://www.stats.ox.ac.uk/~snijders/
MLM | Questions
Contact us: http://csaa.ucdavis.edu/contact.html