Four Innovative Applications of Hierarchical Linear Modeling (HLM) Shenyang Guo, Ph.D. University of North Carolina at Chapel Hill
Acknowledgment Support for this research is provided by the Discretionary Grants Program of Children’s Bureau to Shenyang Guo (PI). The project aims to develop innovative quantitative methods for child welfare research.
Overview of HLM Why HLM? The need to study multilevel influences on an outcome variable, and to run growth curve analysis. Central problem: intraclass correlation. Conceptually, one may view HLM as running a regression model several times, at 2 or 3 levels. Other names: multi-level analysis, mixed-effects model, random-effects model, growth curve analysis, random-coefficient regression model, covariance components model. The key idea is to estimate random effects. In addition to traditional regression coefficients, HLM estimates a set of random effects associated with each higher-level unit, which can be used to control for autocorrelation.
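As a brief illustration of the central problem, intraclass correlation is commonly quantified from an unconditional two-level model. The notation below ($Y_{ij}$, $\beta_{0j}$, $\gamma_{00}$, $\sigma^2$, $\tau_{00}$) follows common HLM usage and is an assumption here, since the slides do not define these symbols.

```latex
\begin{align*}
\text{Level 1: } & Y_{ij} = \beta_{0j} + r_{ij}, & r_{ij} &\sim N(0, \sigma^2) \\
\text{Level 2: } & \beta_{0j} = \gamma_{00} + u_{0j}, & u_{0j} &\sim N(0, \tau_{00}) \\
\text{Intraclass correlation: } & \rho = \frac{\tau_{00}}{\tau_{00} + \sigma^2}
\end{align*}
```

Here $\rho$ is the share of total variance that lies between higher-level units; when it is clearly above zero, observations within a unit are not independent and ordinary regression standard errors are too small.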
Four innovative models 1. Latent-variable analysis of HLM 2. Omnibus score of CBCL & TRF using latent-variable HLM 3. Meta analysis using HLM 4. Modeling multivariate change
Latent-variable analysis (1) Latent variables: variables that are not directly observed. Under this framework, any observed variable is an indicator and can be viewed as a latent true score plus measurement error. Statistical model for analyzing latent variables: structural equation modeling, which has (1) a measurement model (relations between indicators and latent variables) and (2) a structural model (relations among latent variables). In HLM, a latent-variable analysis likewise consists of two parts: a measurement model, and a structural model involving explanatory variables.
Latent-variable analysis (2) Example Sampson, Raudenbush, & Earls (1997, Science 277(15): ) applied this approach to analyze multilevel influences on collective efficacy, which they treat as a latent variable. Their three-level HLM treats the ten items collected from all survey respondents as level 1, and assumes that these items are jointly determined by a latent true score, “collective efficacy”, plus measurement error. Their model then explores how informants within neighborhoods (i.e., level 2) vary randomly around the neighborhood mean of “collective efficacy”, and how neighborhoods across the whole study area (i.e., level 3) vary randomly about the grand mean of “collective efficacy”.
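The three levels described above can be sketched in generic form as follows. This is an illustrative unconditional specification under assumed notation (item $i$, informant $j$, neighborhood $k$, item-indicator dummies $D_{pijk}$); it is not claimed to reproduce the exact model in the published paper.

```latex
\begin{align*}
\text{Level 1 (items): } & Y_{ijk} = \pi_{jk} + \sum_{p=1}^{9} \alpha_p D_{pijk} + e_{ijk} \\
\text{Level 2 (informants): } & \pi_{jk} = \eta_k + r_{jk} \\
\text{Level 3 (neighborhoods): } & \eta_k = \gamma + u_k
\end{align*}
```

In this sketch $\pi_{jk}$ is the informant's latent true score, $\eta_k$ is the neighborhood mean of collective efficacy, $\gamma$ is the grand mean, and $e_{ijk}$ carries the item-level measurement error.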
Omnibus score of CBCL & TRF (1) Disentangle multiple raters’ measurement error from clients’ true change (Guo & Hussey, 1999, Social Work Research 23(4): ). Ratings are likely to be collected by multiple raters (e.g., Achenbach instruments: CBCL, TRF, & YSR). Attrition can also occur among raters. None of the prior studies (before 1999) controlled for raters’ impact on ratings, though many used multiple raters to collect ratings. A theoretical framework for investigating multiple sources of measurement error: Cronbach’s Generalizability Theory.
Omnibus score of CBCL & TRF (2) The need: hypothetical data. [Figure: two raters’ ratings (Rater A, Rater B) on a single subject; panels 1a to 1d, each plotting Y against Time.]
Omnibus score of CBCL & TRF (3) Problems & solutions
Omnibus score of CBCL & TRF (4)
Omnibus score of CBCL & TRF (5) Model 1
Omnibus score of CBCL & TRF (6) Model 2
Omnibus score of CBCL & TRF (6) Illustrating example Acknowledgment to Dr. Richard Barth and Ms. Ariana Wall at UNC for their help. Data: National Survey of Child and Adolescent Well-being (NSCAW). We focus on externalizing and internalizing scores. Each child has four such scores: two from the caregiver (CBCL) and two from the teacher (TRF). The task: how to create one score? Variables employed in level 3 of Model 2: age, gender, race, social behavior, MBA reading score, MBA math score, count of risky behaviors of delinquency, count of risky behaviors of substance abuse, and count of risky behaviors of suicide attempt.
Omnibus score of CBCL & TRF (7) Correlation coefficients and descriptive statistics on disagreement between caregiver and teacher’s scores (N=448)
Omnibus score of CBCL & TRF (8) Evaluation Schemes:
C1 Caregiver's scores only: .5Ec + .5Ic
C2 Teacher's scores only: .5Et + .5It
C3 All 4 scores from both versions with equal weights: .25Ec + .25Ic + .25Et + .25It
C4 Similar to C3 but heavier weights given to caregiver's scores: .35Ec + .35Ic + .15Et + .15It
C5 Similar to C3 but heavier weights given to teacher's scores: .15Ec + .15Ic + .35Et + .35It
C6 Similar to C3, a 50/50 split between caregiver's and teacher's scores but heavier weights given to externalizing scores: .35Ec + .15Ic + .35Et + .15It
Omnibus score of CBCL & TRF (9) Evaluation Schemes (continued):
C7 Similar to C3, a 50/50 split between caregiver's and teacher's scores but heavier weights given to internalizing scores: .15Ec + .35Ic + .15Et + .35It
C8 Extreme value, low end: .5 [Min (Ec,Et)] + .5 [Min (Ic,It)]
C9 Extreme value, high end: .5 [Max (Ec,Et)] + .5 [Max (Ic,It)]
C10 Arbitrary, half high externalizing and half low internalizing: .5 [Max (Ec,Et)] + .5 [Min (Ic,It)]
C11 Arbitrary, half low externalizing and half high internalizing: .5 [Min (Ec,Et)] + .5 [Max (Ic,It)]
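The eleven comparison schemes can be computed mechanically from a child's four scores. The sketch below assumes that Ec and Ic are the caregiver's (CBCL) externalizing and internalizing scores and that Et and It are the teacher's (TRF) counterparts, and it assumes a weight of .5 on both terms of C10. The function name and the example scores are hypothetical.

```python
def composite_scores(Ec, Ic, Et, It):
    """Return the candidate composites C1..C11 for one child.

    Ec, Ic: caregiver externalizing / internalizing scores (assumed meaning).
    Et, It: teacher externalizing / internalizing scores (assumed meaning).
    """
    return {
        "C1": 0.5 * Ec + 0.5 * Ic,                           # caregiver only
        "C2": 0.5 * Et + 0.5 * It,                           # teacher only
        "C3": 0.25 * Ec + 0.25 * Ic + 0.25 * Et + 0.25 * It, # equal weights
        "C4": 0.35 * Ec + 0.35 * Ic + 0.15 * Et + 0.15 * It, # favor caregiver
        "C5": 0.15 * Ec + 0.15 * Ic + 0.35 * Et + 0.35 * It, # favor teacher
        "C6": 0.35 * Ec + 0.15 * Ic + 0.35 * Et + 0.15 * It, # favor externalizing
        "C7": 0.15 * Ec + 0.35 * Ic + 0.15 * Et + 0.35 * It, # favor internalizing
        "C8": 0.5 * min(Ec, Et) + 0.5 * min(Ic, It),         # low extremes
        "C9": 0.5 * max(Ec, Et) + 0.5 * max(Ic, It),         # high extremes
        "C10": 0.5 * max(Ec, Et) + 0.5 * min(Ic, It),        # high ext., low int.
        "C11": 0.5 * min(Ec, Et) + 0.5 * max(Ic, It),        # low ext., high int.
    }


# Hypothetical scores for a single child, for illustration only.
scores = composite_scores(Ec=60.0, Ic=52.0, Et=70.0, It=48.0)
for name, value in scores.items():
    print(f"{name}: {value:.1f}")
```

Each of these is a fixed-weight rule, which is what makes them useful as benchmarks against a model-based omnibus score.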
Omnibus score of CBCL & TRF (10) Evaluations
Omnibus score of CBCL & TRF (11) Use the score as a dependent variable
Omnibus score of CBCL & TRF (12) Use the score as independent variable:
Meta analysis using HLM (1) R & B (2002): Chapter 7 Meta analysis: research synthesis, or a “study of the studies”. Objective: summarize results from a series of related studies. Collect the following data from the literature review:
$\bar{Y}_{Ej}$: the mean outcome for the experimental group;
$\bar{Y}_{Cj}$: the mean outcome for the control group;
$S_j$: the pooled, within-group standard deviation;
$n_{Ej}$: the sample size of the experimental group;
$n_{Cj}$: the sample size of the control group;
where j indicates the jth study.
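Meta analysis using HLM (2) Based on these data, calculate the effect size:
$d_j = (\bar{Y}_{Ej} - \bar{Y}_{Cj}) / S_j$
And the variance of the effect size:
$V_j = \frac{n_{Ej} + n_{Cj}}{n_{Ej}\, n_{Cj}} + \frac{d_j^2}{2 (n_{Ej} + n_{Cj})}$
The square root of $V_j$ is called the “standard error of $d_j$”.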
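A minimal sketch of this calculation follows, assuming the standard large-sample formulas for a standardized mean difference: d_j = (mean_E - mean_C) / S_j and V_j = (n_E + n_C) / (n_E n_C) + d_j^2 / (2 (n_E + n_C)). The function name and the input values are hypothetical and do not come from any study in the slides.

```python
import math


def effect_size(mean_e, mean_c, sd_pooled, n_e, n_c):
    """Standardized mean difference d_j, its variance V_j, and S.E.(d_j)."""
    d = (mean_e - mean_c) / sd_pooled
    # Large-sample variance of d (assumed formula, see lead-in).
    v = (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))
    return d, v, math.sqrt(v)


# Hypothetical study: means 105 vs. 100, pooled SD 10, 50 subjects per group.
d, v, se = effect_size(mean_e=105.0, mean_c=100.0, sd_pooled=10.0, n_e=50, n_c=50)
print(f"d = {d:.3f}, V = {v:.5f}, S.E. = {se:.3f}")
```

Repeating this for every study yields the pairs (d_j, V_j) that the V-known model takes as input.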
Meta analysis using HLM (3) General model
Level 1: $d_j = \delta_j + e_j$
Level 2: $\delta_j = \gamma_0 + u_j$
or combined model: $d_j = \gamma_0 + u_j + e_j$, where $d_j \sim N(\gamma_0, \Delta_j)$ with $\Delta_j = \tau + V_j$.
In this model, we only have one subscript j to indicate study. This is a special case of the two-level model, in which subscript i is omitted because we don’t have original data at the study-subject level. V-known model: unlike the previous HLMs, this model has a known variance $V_j$, or S.E.$(d_j) = \sqrt{V_j}$.
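To make the combined model concrete, the sketch below estimates the grand mean and the between-study variance from pairs (d_j, V_j). It deliberately swaps in a simple method-of-moments (DerSimonian-Laird) estimator rather than the likelihood-based estimation that the HLM software performs, so its numbers will generally differ somewhat from HLM output. The function name and the two-study data are hypothetical.

```python
def v_known_moments(d, v):
    """Method-of-moments estimates for d_j = gamma0 + u_j + e_j.

    d: list of effect sizes d_j; v: list of known variances V_j.
    Returns (gamma0_hat, tau_hat, se_gamma0).
    """
    k = len(d)
    w = [1.0 / vj for vj in v]                       # fixed-effect weights
    sw = sum(w)
    fixed_mean = sum(wj * dj for wj, dj in zip(w, d)) / sw
    # Homogeneity statistic Q and its scaling constant.
    q = sum(wj * (dj - fixed_mean) ** 2 for wj, dj in zip(w, d))
    c = sw - sum(wj ** 2 for wj in w) / sw
    tau = max(0.0, (q - (k - 1)) / c)                # truncate at zero
    # Precision weights use the total variance Delta_j = tau + V_j.
    w_star = [1.0 / (tau + vj) for vj in v]
    gamma0 = sum(ws * dj for ws, dj in zip(w_star, d)) / sum(w_star)
    se_gamma0 = (1.0 / sum(w_star)) ** 0.5
    return gamma0, tau, se_gamma0


# Two hypothetical studies with equal precision but different effects.
gamma0, tau, se_gamma0 = v_known_moments(d=[0.0, 1.0], v=[0.01, 0.01])
print(f"gamma0 = {gamma0:.3f}, tau = {tau:.3f}, S.E. = {se_gamma0:.3f}")
```

The weights 1 / (tau + V_j) show why the V-known structure matters: each study is weighted by its own known sampling variance plus the shared between-study variance.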
Meta analysis using HLM (4) Use the DOS version of HLM to run the V-known model. Data look like this: …… Format of the raw data file: (a11,3f11.3). See HLM 5 manual pp
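The format string (a11,3f11.3) describes fixed-width records: an 11-character ID field followed by three numeric fields, each 11 characters wide with 3 decimals. A minimal sketch of writing one such record follows. The slides do not say what the three numeric columns hold, so the values, the ID, and the function name here are hypothetical.

```python
def format_record(study_id, *values):
    """Format one record as an 11-char ID plus three 11.3 numeric fields."""
    if len(values) != 3:
        raise ValueError("the (a11,3f11.3) layout expects exactly three numbers")
    # Pad or truncate the ID to 11 characters, then append three f11.3 fields.
    return f"{study_id:<11.11}" + "".join(f"{x:11.3f}" for x in values)


# Hypothetical record; the meaning of the three columns is an assumption.
line = format_record("study01", 0.03, 0.016, 2.0)
print(repr(line))
```

Every record produced this way is 44 characters long, which is what a fixed-format reader of this layout expects.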
Meta analysis using HLM (5) Experimental Studies of Teacher Expectancy Effects on Pupil IQ
Meta analysis using HLM (6) Running HLM, we obtain the following findings: The estimated grand-mean effect size is $\hat{\gamma}_0 = .084$, implying that, on average, experimental students scored about .084 standard deviation units above the controls. However, the estimated variance of the effect parameter is $\hat{\tau} = .019$. This corresponds to a standard deviation of .138 (i.e., $\sqrt{.019} = .138$), which implies that important variability exists in the true effect sizes. For example, an effect one standard deviation above the average would be $.084 + .138 = .222$, which is of nontrivial magnitude.
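The arithmetic in these findings can be checked in a few lines. The sketch also computes a 95% plausible range for the true effects, which is an added illustration that assumes the true effect sizes are normally distributed around the grand mean; that range is not reported in the slides.

```python
import math

gamma0_hat = 0.084   # estimated grand-mean effect size (from the slide)
tau_hat = 0.019      # estimated variance of the true effects (from the slide)

sd_true = math.sqrt(tau_hat)           # SD of the true effect sizes
one_sd_above = gamma0_hat + sd_true    # effect one SD above the average

# Assumed normality of true effects: about 95% lie within 1.96 SD of the mean.
lower = gamma0_hat - 1.96 * sd_true
upper = gamma0_hat + 1.96 * sd_true

print(f"SD = {sd_true:.3f}, one SD above = {one_sd_above:.3f}")
print(f"95% plausible range: ({lower:.3f}, {upper:.3f})")
```

Under that normality assumption, the range spans negative as well as clearly positive effects, which is one way to see why the variability is described as important.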
Multivariate change (1) In a cross-sectional study, we use correlation coefficients to assess the association of an outcome variable with other variables. In a longitudinal study, we have a similar task: we need to model multivariate change, that is, to ask whether two change trajectories (outcome measures) correlate over time. For details of this method, see MacCallum, R.C., & Kim, C. (2000). “Modeling multivariate change”, in Little, Schnabel, & Baumert (Eds.), Modeling Longitudinal and Multilevel Data. Lawrence Erlbaum Associates, pp
Multivariate change (2) What kinds of questions can be answered? Do the benefits clients gain from an intervention over time correlate negatively with the intervention’s side effects? Does clients’ change in physical health correlate with their change in mental health? Does a program’s designed change in outcome (e.g., abstinence from alcohol or substance abuse) correlate with clients’ level of depression? Use the software MLn/MLwiN to estimate the model. It is also possible to use SAS Proc Mixed.
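The underlying idea, correlating change in one outcome with change in another, can be illustrated with a simplified two-stage approach: fit an ordinary least-squares slope per subject for each outcome, then correlate the two sets of slopes. This is a stand-in for illustration only, not the multivariate multilevel model that MLn/MLwiN or SAS Proc Mixed would estimate, and it tends to understate the true correlation because each slope carries estimation error. All data below are simulated under hypothetical settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects = 200
times = np.arange(5.0)                      # five measurement occasions

# Hypothetical true slopes for two outcomes, correlated at 0.8.
slope_cov = np.array([[1.0, 0.8], [0.8, 1.0]])
true_slopes = rng.multivariate_normal([0.5, -0.3], slope_cov, size=n_subjects)

# Observed trajectories: subject-specific linear change plus occasion noise.
noise_sd = 0.5
y1 = true_slopes[:, [0]] * times + rng.normal(0, noise_sd, (n_subjects, times.size))
y2 = true_slopes[:, [1]] * times + rng.normal(0, noise_sd, (n_subjects, times.size))


def ols_slopes(y, t):
    """Per-subject OLS slope of y (subjects x occasions) on time t."""
    t_centered = t - t.mean()
    y_centered = y - y.mean(axis=1, keepdims=True)
    return y_centered @ t_centered / (t_centered @ t_centered)


# Stage 2: correlate the estimated rates of change across subjects.
slope_corr = np.corrcoef(ols_slopes(y1, times), ols_slopes(y2, times))[0, 1]
print(f"correlation between estimated slopes: {slope_corr:.2f}")
```

A multivariate multilevel model estimates this slope correlation directly from the random-effects covariance matrix, which avoids the attenuation built into the two-stage shortcut.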