© Murnane & Willett, Harvard University Graduate School of Education, 6/28/2016S290/Class #09 – Slide 1
S052: Applied Data Analysis
Everything in the Universe Is Grouped, or Nested, Within Something Else!
Bigger ticks have smaller ticks on their backs to bite ‘em.

In our data, “units” are often nested within other “units,” in a hierarchical or multilevel structure:
  Kids are nested within teachers or classes.
  Teachers and classes are nested within schools.
  Schools are nested within districts.
  Districts are nested within States.
  States are nested within countries.
  Countries are nested within planets.
  Planets are nested within galaxies.
  Galaxies are nested within universes.
  Universes are nested within the mind of God …

You need to respect these data hierarchies in your analysis, if you want your estimates (and your standard errors) to be unbiased:
  You must specify your regression models so that they adequately represent and respect the multilevel structure of your data.
  I introduced a way of doing this, earlier, using a “Random-Effects” strategy.
  There is also a “Fixed-Effects” strategy that achieves similar ends, and that possesses some superior properties, particularly as regards more liberal assumptions on the model.

Slide 2: Remembering the “Diversity Assessment” Example!
Here’s the data example I used to introduce the “random-effects” approach earlier:
  Broad Research Theme
  5131 kids nested in 57 schools

Slide 3: Remembering the “Diversity Assessment” Example!
[Codebook figure; the callouts label the variables by role:]
  Variable that identifies students in schools
  Individual-level outcome variable
  Individual-level question predictor
  Individual-level control predictor
  School-level question predictors
  School-level control predictor

Slide 4: There Is Variability At Two Levels, Within and Between School, In The Outcome!
Such questions are difficult enough to answer, but the problem is made more complex by the presence of multiple levels of variability in the outcome, LIVEWORK!
[Output figure; the callouts read:]
  Total Variability
  Between-School Variability (Schools)
  And the difference is the Within-School Variability
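
One way to put numbers on these two levels of variability is to fit an unconditional (no-predictor) random-intercepts model, using the same PROC MIXED syntax that appears in this deck's own listing. This is a sketch under assumptions, not the handout's code: the TITLE text is hypothetical, and the DAQ dataset is assumed to have been read in and sorted by SCHID already.

```sas
* Sketch only: unconditional means model that splits outcome variance by level;
* Assumes DAQ has already been created and sorted by SCHID;
PROC MIXED METHOD=ML COVTEST DATA=DAQ;
  TITLE5 'Unconditional Model -- Variance Components Only';
  MODEL LIVEWORK = / SOLUTION;
  RANDOM INTERCEPT / SUBJECT=SCHID;
RUN;
```

In the covariance parameter estimates, the intercept variance estimates the between-school variability and the residual variance estimates the within-school variability.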

Slide 5: So, What Is The Problem With Students Being Nested Within Schools?
Clearly, when students within a school share common experiences and contexts, they may behave similarly in observed and unobserved ways ...
This is problematic because, in standard OLS regression models:
  All unobserved effects reside in the residuals, which are assumed to be independent of each other.
  If students within the same school do share unobserved experiences that impact their values of the outcome, their residuals may then be correlated.
Then, if we fit an OLS regression model to such data, unthinkingly:
  The error independence assumption may be violated.
  The stochastic part of the regression model will be mis-specified.
  The standard errors associated with parameter estimates may be mis-estimated.
  All statistical inference may be incorrect.
The solution is to:
  Modify the regression model to represent the new reality.
  Specify the model so that each child’s residual is allowed to be correlated with the residuals of other children in the same school.
What is the best way to do this?

Slide 6: One Solution: Specify A “Random Effects of School” Multilevel Model
Earlier, I offered one kind of multilevel model to solve this problem; here is an example. For illustrative purposes, it contains only control predictors, FEMALE and ENROLL.
Subscripts at two levels: i = school, j = student.
Structural part of the model contains predictors at two levels:
  Predictors, like FEMALE, at the individual level, or level-1.
  Predictors, like ENROLL, at the school level, or level-2.
  Predictors at each level are distinguished by their subscripts.
  Regression parameters in the structural part of the model are referred to as “Fixed Effects.”
Stochastic part of the model contains a composite (or total) residual made up from two “Random Effects”:
  Residual ε_ij is the Random Effect of Student:
    Residual for student j in school i.
    Like a regular regression residual, it has a different, independent value for each student in each school.
    Normally distributed, zero mean, homoscedastic.
  Residual u_i is the Random Effect of School:
    School i’s contribution to the composite residual.
    Represents the effects, on the outcome, of the unobserved experiences shared by students within the school.
    Identical for all students in the same school.
    Normally distributed, zero mean, homoscedastic.
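
The model equation itself did not survive on this slide. Under the definitions given (i = school, j = student, fixed effects in the structural part, a composite residual in the stochastic part), it can be written roughly as follows. The coefficient names β0, β1, β2, the student-residual symbol ε, and the variance names σ_ε² and σ_u² are assumed notation, not taken from the slide.

```latex
\begin{aligned}
\text{LIVEWORK}_{ij} &= \underbrace{\beta_0 + \beta_1\,\text{FEMALE}_{ij} + \beta_2\,\text{ENROLL}_{i}}_{\text{structural part}}
  \;+\; \underbrace{(\varepsilon_{ij} + u_i)}_{\text{stochastic part}},\\
\varepsilon_{ij} &\sim N(0,\ \sigma_\varepsilon^2), \qquad u_i \sim N(0,\ \sigma_u^2).
\end{aligned}
```

FEMALE carries both subscripts because it varies from student to student; ENROLL carries only the school subscript because it takes one value per school.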

Slide 7: How Does This Solve The Problem?
[Diagram: the overall residual for student j in school i]
Now all students in the same school are assumed to share a common value of the school-level residual, so their composite residuals are naturally linked:
  Composite residuals for all students in School #1 are linked because they all contain school-level residual u_1.
  Composite residuals for all students in School #2 are linked because they all contain school-level residual u_2.
  … etc.
So, the model no longer requires that (composite) residuals be independent within-school.
If we can find a way to fit this new “random effects of school” model, and its hypothesized stochastic structure is correct, we are home free:
  GLS methods (Stata XTREG),
  MLE methods (SAS PROC MIXED).
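
The link between composite residuals can be made explicit with a short derivation. It assumes, in addition to what the slides state, that the student-level and school-level residuals are independent of each other; the symbols ε, σ_ε² and σ_u² are assumed notation for the student-level residual and the two residual variances.

```latex
\begin{aligned}
\operatorname{Var}(\varepsilon_{ij} + u_i) &= \sigma_\varepsilon^2 + \sigma_u^2,\\
\operatorname{Cov}(\varepsilon_{ij} + u_i,\ \varepsilon_{ij'} + u_i) &= \sigma_u^2
  \quad (j \neq j',\ \text{same school } i),\\
\operatorname{Cov}(\varepsilon_{ij} + u_i,\ \varepsilon_{i'j'} + u_{i'}) &= 0
  \quad (i \neq i',\ \text{different schools}).
\end{aligned}
```

Two students in the same school share u_i, so their composite residuals covary by σ_u²; students in different schools share nothing, so their composite residuals remain uncorrelated.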

Slide 8: Example of SAS Output From Fitting a “Random Effects of School” Multilevel Model
Earlier, we fitted these models with SAS PROC MIXED. Here’s the SAS code for refitting the final Random Intercepts Multilevel Model (now called “M4R”) in our earlier taxonomy (Data-Analytic Handout I.3(b).4):

  * Input the data, name and label the variables in the DAQ dataset;
  DATA DAQ;
    INFILE 'C:\DATA\S052\DAQ2.txt';
    INPUT SCHID LIVEWORK COMFORT FEMALE RACE
          PCT_B PCT_A PCT_H PCT_W ENROLL PCT_FL;
    LABEL SCHID    = 'School Identification Number'
          LIVEWORK = 'Attitude To Living/Working in Multiracial Settings'
          FEMALE   = 'Is Child Female (1=yes)?'
          PCT_W    = '% Caucasian Children in School'
          ENROLL   = 'Total School Enrollment';

  * Limit analytic sample to Latino/a & African-American students in order to
    simplify the analytic problem, format the student race/ethnicity
    descriptor, and add suitable additional transformed predictors;
  PROC FORMAT;
    VALUE RFMT 1 = 'Afro-American' 2 = 'Asian' 3 = 'Latino/a' 4 = 'Caucasian';
  DATA DAQ;
    SET DAQ;
    IF RACE=1 OR RACE=3;
    FORMAT RACE RFMT.;
    * Create dichotomous predictor to indicate whether child is Hispanic;
    IF RACE=3 THEN H=1; ELSE H=0;
    * Create quadratic transformation of PCT_W;
    PCT_W2 = PCT_W*PCT_W;

  * Sort the data by the School ID;
  PROC SORT DATA=DAQ;
    BY SCHID;

  * Fit the "random effects" specification of the Final Multilevel Model;
  PROC MIXED METHOD=ML MAXITER=200 COVTEST DATA=DAQ;
    TITLE5 'M4R: Final Model -- Random Effects Specification';
    MODEL LIVEWORK = FEMALE H ENROLL PCT_W PCT_W2 / SOLUTION;
    RANDOM INTERCEPT / SUBJECT=SCHID;
  RUN;

Notes on the listing:
  PROC MIXED fits the “random intercepts” or “random effects” multilevel model.
  In the MODEL statement, LIVEWORK is the outcome and the variables after the equals sign are the predictors.
  RANDOM INTERCEPT requests a “random effects” analysis.
  SUBJECT=SCHID: students are clustered within school, as defined by SCHID.

Slide 9: Fitted “Random-Effects of School” Model
And here are the parameter estimates, the approximate p-values and the goodness-of-fit statistics:
[Output table; the callouts read:]
  Estimated Fixed Effects
  Variance Components (estimated variances of the random effects)
  Estimated Intraclass Correlation
  Etc.
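
The estimated intraclass correlation on this slide is conventionally computed from the two estimated variance components. The formula below uses assumed notation: σ̂_u² for the estimated school-level (intercept) variance and σ̂_ε² for the estimated student-level (residual) variance.

```latex
\hat{\rho} \;=\; \frac{\hat{\sigma}_u^2}{\hat{\sigma}_u^2 + \hat{\sigma}_\varepsilon^2}
```

It is the share of the total residual variance that lies between schools, and it ranges from 0 (no clustering) to 1 (students within a school are identical once the predictors are taken into account).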

Slide 10: Another Solution? Introducing the “Fixed Effects of Schools” Multilevel Model
But this is not the only way to solve the problem. All you need to do is ensure that there are School Effects actually present in the model (to account for the lack of independence of the students within the school). You do not need to treat them as random; you can also treat them as fixed.
So, for instance, you can move the hypothesized school effects from the Stochastic to the Structural part of the model, without changing the algebraic properties of the specification one whit.
This, in practice, can easily be arranged by adding dummy predictors to distinguish schools, S2, …, S57 (with S1 removed as the reference school), whose parameters will then become the Fixed Effects of Schools.
This is tantamount to having a separate unique intercept in the model for each school.
Finally, we can fit the new Fixed Effects of School multilevel model with OLS methods again, because the re-specification has limited the remaining residual to its usual “independent across kids” form.
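
The re-specified equation did not survive on this slide. Following the description (school dummies S2 through S57 moved into the structural part, S1 as the reference school), it can be sketched as below. The coefficient names β0, β1, α_2, …, α_57 and the student-residual symbol ε are assumed notation, and the school-level predictor ENROLL is left out because it cannot be included alongside the school dummies.

```latex
\begin{aligned}
\text{LIVEWORK}_{ij} &= \beta_0 + \beta_1\,\text{FEMALE}_{ij}
  + \alpha_2\,\text{S2}_{i} + \alpha_3\,\text{S3}_{i} + \dots + \alpha_{57}\,\text{S57}_{i}
  + \varepsilon_{ij},\\
\text{intercept for school 1} &= \beta_0, \qquad
\text{intercept for school } k = \beta_0 + \alpha_k \quad (k = 2, \dots, 57).
\end{aligned}
```

Each α_k plays the role that u_k played in the random-effects model, but it is now an estimated parameter rather than a draw from a distribution.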

Slide 11: Fitting The “Fixed-Effects of Schools” Multilevel Model By OLS Methods
I fitted a parallel “Fixed Effects of Schools” model also in Data-Analytic Handout I.3(b).4:

  * Create a set of dichotomous (dummy) predictors to represent the schools;
  DATA DAQ;
    SET DAQ;
    IF SCHID=11 THEN S1=1; ELSE S1=0;
    IF SCHID=12 THEN S2=1; ELSE S2=0;
    IF SCHID=13 THEN S3=1; ELSE S3=0;
    IF SCHID=14 THEN S4=1; ELSE S4=0;
    IF SCHID=17 THEN S5=1; ELSE S5=0;
    * ... (remaining school dummies elided on the slide) ...;
    IF SCHID=7751 THEN S55=1; ELSE S55=0;
    IF SCHID=7791 THEN S56=1; ELSE S56=0;
    IF SCHID=7901 THEN S57=1; ELSE S57=0;

  * Refit final model w/ OLS & a fixed-effects of school specification;
  * Drop the dummy for first school as a reference category;
  * Notice that you cannot include any school-level predictors in the model
    once the school fixed-effects are present in the model;
  PROC REG DATA=DAQ;
    TITLE6 'M4F: Final Model -- Fixed-Effects Specification';
    MODEL LIVEWORK = S2-S57 FEMALE H;
  RUN;

Notes on the listing:
  The DATA step creates the school dummies.
  PROC REG conducts a regular OLS regression analysis, with LIVEWORK as the outcome.
  Include all the school dummies as predictors, except one (the “reference” school), as usual.
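
Writing out 57 IF statements by hand is the cumbersome part of this approach. A possible shortcut, offered here as an alternative sketch and not as the handout's method, is to let PROC GLM build the school indicators from a CLASS statement. It assumes the DAQ dataset with SCHID, LIVEWORK, FEMALE and H already exists.

```sas
* Sketch only: fixed effects of school, with SAS building the dummies;
* CLASS treats the LAST SCHID level as the reference school, not the first;
PROC GLM DATA=DAQ;
  CLASS SCHID;
  MODEL LIVEWORK = SCHID FEMALE H / SOLUTION;
RUN;
QUIT;
```

Because the choice of reference school only shifts the intercept and the school contrasts, the estimated coefficients on FEMALE and H should match those from the hand-coded dummy specification.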

Slide 12: Comparing the Fitted “Random-Effects” and “Fixed-Effects” of Schools Models
And here are the parameter estimates, the approximate p-values and the goodness-of-fit statistics:
  Estimated effects of student-level predictors are similar under both strategies, in this case.
  Effects of school-level predictors can be estimated in the Random Effects of School model, but not in the Fixed Effects of School model.
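
Why school-level predictors drop out of the fixed-effects model can be shown in one line. Any variable that takes a single value per school, such as ENROLL, is an exact linear combination of the intercept and the school dummies. In the sketch below, e_k is assumed notation for the enrollment of school k, and S_k,i is the dummy for school k.

```latex
\text{ENROLL}_{i} \;=\; e_1 \cdot 1 \;+\; \sum_{k=2}^{57} (e_k - e_1)\, S_{k,i}
```

For a student in school 1 every dummy is zero and the right-hand side is e_1; for a student in school k it is e_1 + (e_k - e_1) = e_k. ENROLL is therefore perfectly collinear with predictors already in the model, and OLS cannot estimate a separate coefficient for it.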

Slide 13: Which Approach Is Better?

Random Effects of Higher-Order Units Approach:
Model the unobserved impact of higher-level units (the u_i) as random effects, treating them as part of the stochastic (random) part of the regression model.
  Assumptions: new assumptions are imposed as a result of including the new random effects. The u_i’s must be normally distributed, have zero mean, be homoscedastic, and be independent of all the predictors in the model.
  Implementation: convenient to implement.
  Cost: cheap, statistically speaking. It only costs you one degree of freedom, for the parameter that represents the variance of the new random effect, σ_u^2.
  Variability explained: only the “observed” part of the between-higher-level-unit outcome variability is “explained” by predictors in the model. The remaining “unobserved” part remains entrapped in the residual variation.
  Flexibility: unrestrictive. Once you include the random effects of the higher-level units in the model, you can still include predictors that portray specific characteristics of the higher-level units.

Fixed Effects of Higher-Order Units Approach:
Model the unobserved impact of the higher-level units (the u_i) as fixed effects, treating them as predictors in the structural (fixed) part of the regression model.
  Assumptions: new assumptions are NOT imposed and, in fact, it doesn’t matter if the higher-level fixed effects are correlated with other predictors in the model.
  Implementation: cumbersome to implement, especially when you have many higher-level units.
  Cost: expensive, statistically speaking. It costs you many degrees of freedom, one for each fixed effect of a higher-level unit included in the model (minus 1).
  Variability explained: all of the between-higher-level-unit outcome variability, both “observed” and “unobserved,” is “explained” by predictors in the model, namely the higher-level fixed effects.
  Flexibility: restrictive. Once you include the fixed effects of the higher-level units in the model, you can no longer introduce other interesting higher-level predictors alongside them.
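
For the running example, with 57 schools, the degrees-of-freedom comparison works out as follows. This is simple arithmetic from the counts stated in the deck; the labels df_RE and df_FE are assumed shorthand for the extra degrees of freedom each approach spends on school effects.

```latex
\begin{aligned}
\text{df}_{\text{RE}} &= 1 \quad (\text{the single variance parameter } \sigma_u^2),\\
\text{df}_{\text{FE}} &= 57 - 1 = 56 \quad (\text{one dummy per school, minus the reference school}).
\end{aligned}
```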