Unit 3a: Introducing the Multilevel Regression Model
© Andrew Ho, Harvard Graduate School of Education, Unit 3a – Slide 1
Revisiting the assumption of independent population residuals
Identifying and visualizing a multilevel structure
Contrasting “Total,” “Between,” and “Within” regression models

Course Roadmap: Unit 3a
Multiple Regression Analysis (MRA)
Do your residuals meet the required assumptions? Test for residual normality. Use influence statistics to detect atypical datapoints.
If your residuals are not independent, replace OLS by GLS regression analysis: use individual growth modeling, or specify a multilevel model (today’s topic area).
If time is a predictor, you need discrete-time survival analysis…
If your outcome is categorical, you need to use binomial logistic regression analysis (dichotomous outcome) or multinomial logistic regression analysis (polytomous outcome).
If you have more predictors than you can deal with: create taxonomies of fitted models and compare them; form composites of the indicators of any common construct; conduct a Principal Components Analysis; use Cluster Analysis; use Factor Analysis (EFA or CFA?).
If your outcome vs. predictor relationship is non-linear: use non-linear regression analysis, or transform the outcome or predictor.
Residual Independence Assumption
Once you have addressed linearity and measurement error conditions, then you should consider the following assumptions about population residuals (a.k.a. errors)… in this rough order of diminishing priority:
“In the population, … the errors must be independent from observation to observation.”

How Does Failure of the Assumption Affect OLS Regression Analysis?
If the population residuals are correlated across observations, then OLS-estimated standard errors will be too small. So, t-statistics will be inflated, and null hypotheses will be rejected more frequently than is proper (increased Type I error).

Students within the same school share many common unobserved experiences that may impact their values of the outcome in a similar way.
In a regression model, all unobserved effects end up in the residuals, and so the residuals of students in the same school may lose their required independence. Then, OLS regression analysis will provide incorrect standard errors and inference.
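One way to see the inflated Type I error is a small simulation. The Python sketch below is not from the slides (which use Stata). It assumes a hypothetical setup: 20 schools with 30 students each, a school-level predictor with no true relationship to the outcome, and equal school-level and student-level residual variance. The function names and all parameter values are illustrative assumptions.

```python
import math
import random


def ols_t_statistic(x, y):
    """Return the t-statistic for the slope in a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # assumes independent residuals
    return slope / se_slope


def rejection_rate(n_schools=20, n_students=30, n_reps=200, seed=1):
    """Share of replications in which OLS rejects a true null at the 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        x, y = [], []
        for _ in range(n_schools):
            school_x = rng.gauss(0, 1)       # school-level predictor, unrelated to y
            school_effect = rng.gauss(0, 1)  # shared unobserved school experience
            for _ in range(n_students):
                x.append(school_x)
                y.append(school_effect + rng.gauss(0, 1))
        # 1.96 is the large-sample two-sided 5% critical value.
        if abs(ols_t_statistic(x, y)) > 1.96:
            rejections += 1
    return rejections / n_reps


rate = rejection_rate()
print(f"Nominal Type I error: 0.05; simulated OLS rejection rate: {rate:.2f}")
```

Because the null hypothesis is true by construction, any rejection rate well above 0.05 reflects standard errors that are too small, not a real effect.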
RQ: What is the relationship between SES and math achievement?
We can download datasets from online sources directly!
This is an oft-analyzed dataset from the High School and Beyond survey, which tracks young adults through schooling and their careers. These data are a subsample of 7185 students from 160 schools assessed on their mathematics achievement at a single timepoint. We also have their socioeconomic status, estimated from a composite scale that incorporates parental education, parental occupation, and parental income.
I’m including some generally useful formatting that allows you to reorder variables and capitalize all variable names.
Understanding a Multilevel Structure
A random sort allows us to sample 10 random observations from our dataset.
A “School ID” code tells us which students are from the same school.
We see that we’ve sampled 10 students from different schools. They differ on math achievement and SES, but we know regression inferences will be flawed if we don’t take school membership into account.
Identifying Your Grouping Variable with xtset
This command allows you to identify your “grouping variable” for subsequent commands. A grouping variable can be a classroom, school, district, state, or hospital, as long as there are multiple observations within this group and multiple groups. A grouping variable is also often a participant or patient on whom multiple measures are gathered.
This allows you to identify a single observation from each group. Seems silly, for now.
This is one way to count the number of schools that you have. How many 1s? So we have 160 schools. Still silly, but you’ll see…
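The idea of flagging one observation per group and then counting the 1s can be sketched in Python. This is an illustrative analogue, not the Stata code from the slide; the function name `tag_first_in_group` and the school IDs are hypothetical.

```python
def tag_first_in_group(group_ids):
    """Return 0/1 flags with exactly one 1 per distinct group ID (the first seen)."""
    seen = set()
    tags = []
    for gid in group_ids:
        if gid in seen:
            tags.append(0)
        else:
            seen.add(gid)
            tags.append(1)
    return tags


# Hypothetical school IDs, one per student record.
school_ids = [1010, 1010, 1035, 1077, 1077, 1077, 1035]

tags = tag_first_in_group(school_ids)
n_schools = sum(tags)  # counting the 1s counts the groups
print(tags, n_schools)
```

The tag is also useful later for plotting or summarizing school-level quantities exactly once per school.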
The Super-Helpful egen Command with its by Option
The egen command allows you to create new variables by your grouping variable. Want counts, means, standard deviations, or alternative identifiers by school? Look to egen.
Instead of wild integer IDs with gaps between them, this allows you to number your schools 1…N.
Checking group sizes should always be a knee-jerk reaction with multilevel data: what is the distribution of the number of observations per group? When group sizes vary, we call the design “unbalanced.” We often (but do not always) have “balanced” designs in measurements of individuals, where a participant ID is the grouping variable, and we measure all participants the same number of times.
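As a rough Python analogue of these by-group operations (not the Stata code from the slide), the sketch below renumbers gappy IDs 1…N, attaches each school's size to its students, and tabulates the distribution of group sizes. The IDs and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical school IDs, one per student record.
school_ids = [1010, 1010, 1035, 1077, 1077, 1077, 1035]

# Map the original IDs to consecutive integers 1..N in sorted order.
sequential = {gid: i for i, gid in enumerate(sorted(set(school_ids)), start=1)}
school_num = [sequential[gid] for gid in school_ids]

# Number of observations in each school, attached to every student record.
sizes = Counter(school_ids)
group_size = [sizes[gid] for gid in school_ids]

# Distribution of group sizes: how many schools have each size?
size_distribution = Counter(sizes.values())

# The design is balanced only if every group has the same size.
is_balanced = len(size_distribution) == 1

print(school_num, group_size, dict(size_distribution), is_balanced)
```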
Exploratory Data Analysis for Multilevel Structures
This is a naïve perspective that does not take the grouping variable into account. It’s not a bad place to start as long as you understand that grouping lurks underneath.
Tabulate and graph the means and variances for each group, or a random sample of groups. Try to distinguish “within-group” variation, the standard deviations and spreads of each school distribution, from “between-group” variation, the variation in school means across schools.
Here, we’re looking for differences in central tendency across groups, or “between-group variation,” vs. the average spread within each group, or “within-group variation.” We can also visualize heteroscedasticity.
Between-Group Variation on the Outcome Variable
We can easily calculate group means using the egen command.
We show the distribution of school average mathematics achievement. Compare the variance of this distribution to the unconditional, total distribution of the MATHACH variable (the relationship should not be surprising).
Within-Group Centering or “De-Meaning” to Visualize Within-Group Variation
We center all of the within-school distributions on the grand mean. The plot looks a bit off because box plots show medians, not means.
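Centering on the grand mean amounts to subtracting each student's school mean and adding back the grand mean. The Python sketch below illustrates this on made-up data; the school labels, scores, and variable names are assumptions, and the slides themselves use Stata.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (school, math score).
data = [
    ("A", 4.0), ("A", 6.0), ("A", 8.0),
    ("B", 10.0), ("B", 14.0),
    ("C", 15.0), ("C", 17.0), ("C", 22.0),
]

# Collect scores by school and compute each school's mean.
scores_by_school = defaultdict(list)
for school, score in data:
    scores_by_school[school].append(score)
school_mean = {school: mean(scores) for school, scores in scores_by_school.items()}

# Grand mean across all students.
grand_mean = mean(score for _, score in data)

# De-mean within school, then re-center every school on the grand mean.
centered = [(school, score - school_mean[school] + grand_mean) for school, score in data]

# After centering, every school has the same mean, so only within-school spread remains.
centered_by_school = defaultdict(list)
for school, value in centered:
    centered_by_school[school].append(value)
centered_means = {school: mean(values) for school, values in centered_by_school.items()}
print(centered_means)
```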
Decomposing and distinguishing variance with the xtsum command
Overall: unconditional standard deviation.
“Between”: standard deviation of school means.
“Within”: standard deviation of de-meaned scores.
Number of observations and groups, respectively.
Average group size.
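The three standard deviations can be reproduced by hand. The Python sketch below uses made-up data and sample standard deviations from the `statistics` module. The divisor conventions are an assumption and may not match xtsum in every detail; the variable names are illustrative.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy data: (school, math score).
data = [
    ("A", 4.0), ("A", 6.0), ("A", 8.0),
    ("B", 10.0), ("B", 14.0),
    ("C", 15.0), ("C", 17.0), ("C", 22.0),
]

scores = [score for _, score in data]
by_school = defaultdict(list)
for school, score in data:
    by_school[school].append(score)

school_means = {school: mean(values) for school, values in by_school.items()}
grand_mean = mean(scores)

# Overall: standard deviation of all scores, ignoring schools.
sd_overall = stdev(scores)

# Between: standard deviation of the school means (one value per school).
sd_between = stdev(school_means.values())

# Within: standard deviation of scores de-meaned by school and re-centered on the grand mean.
sd_within = stdev(score - school_means[school] + grand_mean for school, score in data)

n_obs = len(scores)
n_groups = len(by_school)
avg_group_size = n_obs / n_groups

print(f"overall={sd_overall:.3f} between={sd_between:.3f} within={sd_within:.3f}")
print(f"N={n_obs} groups={n_groups} average group size={avg_group_size:.2f}")
```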
Not-so-good, old-fashioned regression, completely ignoring school membership and possible correlations among residuals.
Contrasting “Between,” “Total,” and “Within” Regression Lines
Between: Means on Means, ignores within-group variation.
Total: Points on Points, ignores group membership.
Within: Demeaned Points on Demeaned Points. By demeaning, we have taken group membership into account!
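The three regression lines can be contrasted on a small made-up dataset. In the Python sketch below (not the Stata analysis from the slides), every school has a within-school slope of 1, while the school means rise much more steeply. The data, the helper `ols_slope`, and the unweighted treatment of school means are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical toy data: (school, predictor, outcome).
data = [
    ("A", 1, 2), ("A", 2, 3), ("A", 3, 4),
    ("B", 4, 9), ("B", 5, 10), ("B", 6, 11),
    ("C", 7, 16), ("C", 8, 17), ("C", 9, 18),
]

# Total: points on points, ignoring group membership.
total_slope = ols_slope([x for _, x, _ in data], [y for _, _, y in data])

# School means of predictor and outcome.
xs, ys = defaultdict(list), defaultdict(list)
for school, x, y in data:
    xs[school].append(x)
    ys[school].append(y)
x_bar = {school: mean(values) for school, values in xs.items()}
y_bar = {school: mean(values) for school, values in ys.items()}

# Between: means on means, one point per school, ignoring within-group variation.
schools = sorted(x_bar)
between_slope = ols_slope([x_bar[s] for s in schools], [y_bar[s] for s in schools])

# Within: demeaned points on demeaned points.
within_slope = ols_slope(
    [x - x_bar[school] for school, x, _ in data],
    [y - y_bar[school] for school, _, y in data],
)

print(f"total={total_slope:.3f} between={between_slope:.3f} within={within_slope:.3f}")
```

In this toy example the total slope falls between the within and between slopes, because the total line blends both sources of variation.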