Multilevel Analysis By Zach Andersen, Jon Durrant, Jayson Talakai
OUTLINE: Jon – What is Multilevel Regression; Jayson – The Model; Zach – R code applications / examples
WHAT IS MULTILEVEL REGRESSION? Regression models specified at multiple levels to account for dependencies in nested data. It is not a two-stage procedure; all levels are estimated at once.
EXAMPLES: Students in schools. Individuals by area. Employees in organizations. Firms in various industries. Repeated observations on a person. https://www.youtube.com/watch?v=wom6uPdI-P4
WHEN TO USE A MULTILEVEL MODEL? Individual units (often people) with group indicators (e.g., schools, areas). A dependent variable measured at level 1. More than one person per group. Generally we need at least 5 groups, preferably more (an ugly rule of thumb). https://www.youtube.com/watch?v=wom6uPdI-P4
WHEN TO USE A MULTILEVEL MODEL? Use a multilevel model whenever your data are grouped (or nested) into categories (or clusters). It allows for the study of effects that vary by group. Regular regression ignores the average variation between groups and may lack the ability to generalize. http://www.princeton.edu/~otorres/Multilevel101.pdf
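As a rough illustration of what "ignoring the variation between groups" can do, the Python sketch below fits one pooled least-squares line and then one line per group. The data, the group names, and the `ols_slope` helper are all hypothetical and chosen only to make the contrast obvious; nothing here comes from the original slides, and it is not itself a multilevel fit.

```python
# Hypothetical toy data (not from the slides): three groups with very
# different baselines, and y falling by 1 for each unit of x inside a group.
groups = {
    "group_1": ([0, 1, 2], [0, -1, -2]),
    "group_2": ([10, 11, 12], [20, 19, 18]),
    "group_3": ([20, 21, 22], [40, 39, 38]),
}


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Pooled regression: stack every observation and ignore group membership.
all_x = [xi for x, _ in groups.values() for xi in x]
all_y = [yi for _, y in groups.values() for yi in y]
pooled_slope = ols_slope(all_x, all_y)

# Separate regression inside each group.
within_slopes = {name: ols_slope(x, y) for name, (x, y) in groups.items()}

print(f"pooled slope: {pooled_slope:.3f}")
for name, slope in within_slopes.items():
    print(f"{name} slope: {slope:.3f}")
```

In this constructed example the pooled slope is positive (about 1.97) even though the slope inside every group is exactly -1, because the pooled line is driven by the differences between group baselines.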
DATA STRUCTURE AND DEPENDENCE: Independence makes sense sometimes and keeps statistical theory relatively simple. E.g., standard error(sample average) = s/√n requires that the n observations are independent. But data often have structure, and observations have things in common: same area, same school, repeated observations on the same person. Such observations usually cannot be regarded as independent. https://www.youtube.com/watch?v=wom6uPdI-P4
Multilevel Models https://www.youtube.com/watch?v=wrTiCfgGdro
PROBLEMS CAUSED BY CORRELATION: Imprecise parameter estimates. Incorrect standard errors.
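The "incorrect standard errors" point can be checked with a small simulation. The Python sketch below is only an illustration under assumed settings (20 groups of 10, group-level and residual standard deviations both equal to 1, a fixed seed); none of these numbers come from the slides. It compares the usual standard error s/√n, which assumes independence, with how much the sample mean actually varies across repeated clustered samples.

```python
import random
import statistics


def simulate_clustered_sample(n_groups, n_per_group, sd_group, sd_resid, rng):
    """Draw y = group effect + residual; members of a group share one effect."""
    values = []
    for _ in range(n_groups):
        group_effect = rng.gauss(0.0, sd_group)
        for _ in range(n_per_group):
            values.append(group_effect + rng.gauss(0.0, sd_resid))
    return values


rng = random.Random(42)  # fixed seed so the run is repeatable
n_groups, n_per_group = 20, 10  # assumed design, for illustration only

sample_means = []
naive_standard_errors = []
for _ in range(500):
    y = simulate_clustered_sample(n_groups, n_per_group, 1.0, 1.0, rng)
    sample_means.append(statistics.fmean(y))
    # s / sqrt(n): the textbook formula that treats all n values as independent
    naive_standard_errors.append(statistics.stdev(y) / len(y) ** 0.5)

# How much the sample mean really varies from one clustered sample to the next.
empirical_sd = statistics.stdev(sample_means)
mean_naive_se = statistics.fmean(naive_standard_errors)

print(f"average naive SE:          {mean_naive_se:.3f}")
print(f"actual SD of sample means: {empirical_sd:.3f}")
```

Under these assumptions the naive formula should understate the real uncertainty by a factor of roughly two. That is in line with the usual design-effect approximation 1 + (m - 1) × ICC for groups of size m, which is 5.5 here, so the standard error is off by about √5.5.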
A SIMPLE 2-LEVEL HIERARCHY: Level 2: School 1, School 2. Level 1: Student 1, Student 2, Student 3 within each school. https://www.youtube.com/watch?v=wom6uPdI-P4
PEOPLE ARE AT LEVEL 1? The first level of a hierarchy is not necessarily a person. https://www.youtube.com/watch?v=wom6uPdI-P4
A SIMPLE 2-LEVEL HIERARCHY: Level 2: Industry 1, Industry 2. Level 1: Firm 1, Firm 2, Firm 3 within each industry. https://www.youtube.com/watch?v=wom6uPdI-P4
A SIMPLE 2-LEVEL HIERARCHY: Level 2: Person 1, Person 2. Level 1: Event 1, Event 2, Event 3 within each person. https://www.youtube.com/watch?v=wom6uPdI-P4
BRIEF HISTORY: Problems of single-level analysis, cross-level inferences, and the ecological fallacy. https://www.youtube.com/watch?v=wom6uPdI-P4
DISCUSSION AS TO WHY A NORMAL REGRESSION CAN BE A POOR MODEL: Reality might not conform to the assumptions of linear regression (independence). In nature, observations tend to cluster: a random person in Lubbock is more likely to be a student than a random person in another city (populations cluster, so observations are not independent). Different clusters react differently. https://www.youtube.com/watch?v=wom6uPdI-P4
EXTENSIONS: The focus was initially on hierarchical structures, especially students in schools. Also longitudinal and geographical studies. More recently the focus has moved to non-hierarchical situations such as cross-classified models (a single lower-level unit is part of more than one group).
INTRACLASS CORRELATION (ICC): The share of level 1 variance explained by the group (level 2). The ICC is the ratio of group-level (between-group) variance to the total variance. Formula: ICC = between-group variance / total variance. http://en.wikipedia.org/wiki/Intraclass_correlation
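One simple way to turn the ICC ratio into a number is the one-way ANOVA (method-of-moments) estimator for equally sized groups. The Python sketch below is an illustration under that assumption, not the presenters' method; the `icc_oneway` helper and the school scores are hypothetical. Mixed-model software would normally estimate the two variance components by maximum likelihood instead.

```python
def icc_oneway(groups):
    """One-way ANOVA estimate of the ICC, assuming equally sized groups."""
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equally sized groups")
    n = sizes.pop()   # observations per group
    j = len(groups)   # number of groups

    group_means = {name: sum(values) / n for name, values in groups.items()}
    grand_mean = sum(group_means.values()) / j

    # Mean squares between and within groups.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means.values()) / (j - 1)
    ms_within = sum(
        (y - group_means[name]) ** 2
        for name, values in groups.items()
        for y in values
    ) / (j * (n - 1))

    # Between-group variance component; truncated at zero if negative.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return var_between / (var_between + ms_within)


# Hypothetical test scores for three schools (made-up numbers).
scores = {
    "school_1": [1, 2, 3],
    "school_2": [4, 5, 6],
    "school_3": [7, 8, 9],
}
icc = icc_oneway(scores)
print(f"ICC = {icc:.3f}")
```

For these made-up scores the between-group variance estimate is 26/3 and the within-group variance is 1, so the ICC is 26/29, or about 0.897: nearly all of the variation lies between schools.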
MULTILEVEL MODELING: Random or Fixed Effects. What are random and fixed effects? When should you use random and fixed effects? Types of random effects models. The Model. Assumptions of the model. Building a multilevel model.
Fixed vs random effects. Note: any time you see the word “population,” substitute the word “processes.” http://www2.sas.com/proceedings/forum2008/374-2008.pdf
INTRODUCING THE MODEL
Types of Models: Random Intercepts Model. Intercepts are allowed to vary: the score on the dependent variable for each individual observation is predicted by an intercept that varies across groups. http://en.wikipedia.org/wiki/Multilevel_model
Types of Models: Random Slopes Model. Slopes are allowed to differ across groups. This model assumes that intercepts are fixed (the same across different contexts). http://en.wikipedia.org/wiki/Multilevel_model http://www.strath.ac.uk/aer/materials/5furtherquantitativeresearchdesignandanalysis/unit4/randomslopemodelling/
Types of Models: Random Intercepts and Slopes Model. Includes both random intercepts and random slopes. It is likely the most realistic type of model, although it is also the most complex. http://en.wikipedia.org/wiki/Multilevel_model
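The three model types can be written side by side in a common textbook notation. The symbols below are an assumption for illustration (the equations on the original slides did not survive): i indexes level 1 units, j indexes groups, x is a single predictor, the u terms are group-level deviations with mean zero, and e is the level 1 residual.

```latex
% Assumed notation: unit i in group j, one predictor x_{ij}.
\begin{align*}
\text{Random intercepts:} \quad
  & y_{ij} = (\beta_0 + u_{0j}) + \beta_1 x_{ij} + e_{ij} \\
\text{Random slopes:} \quad
  & y_{ij} = \beta_0 + (\beta_1 + u_{1j})\, x_{ij} + e_{ij} \\
\text{Random intercepts and slopes:} \quad
  & y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\, x_{ij} + e_{ij}
\end{align*}
```

Read this way, the random intercepts model shifts each group's line up or down by u_{0j}, the random slopes model tilts each group's line by u_{1j}, and the combined model does both.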
Assumptions for Multilevel Models: Modification of assumptions. Linearity and normality assumptions are retained. Homoscedasticity and independence of observations need to be adjusted. Observations within a group are more similar to each other than to observations in different groups. Groups are independent of other groups, but observations within a group are not. http://en.wikipedia.org/wiki/Multilevel_model
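The adjusted independence assumption can be made concrete for the simplest case, a random intercept model with no predictors. The notation is assumed for illustration and is not taken from the slides: u_j is the group effect, e_{ij} is the level 1 residual, and the two are taken to be independent of each other.

```latex
% Assumed model: random intercept only, u_j independent of e_{ij}.
\begin{align*}
y_{ij} &= \beta_0 + u_j + e_{ij},
  \qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2), \\
\operatorname{Var}(y_{ij}) &= \sigma_u^2 + \sigma_e^2, \\
\operatorname{Cov}(y_{ij},\, y_{i'j}) &= \sigma_u^2
  \quad (i \neq i', \text{ same group}), \\
\operatorname{Cov}(y_{ij},\, y_{i'j'}) &= 0
  \quad (j \neq j', \text{ different groups}).
\end{align*}
```

Two observations from the same group share u_j, so their correlation is \sigma_u^2 / (\sigma_u^2 + \sigma_e^2), which is the intraclass correlation; observations from different groups share nothing and remain uncorrelated.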
Multilevel Model: Example http://faculty.smu.edu/kyler/training/AERA_overheads.pdf
Multilevel Model: Level 1 Regression Equation http://faculty.smu.edu/kyler/training/AERA_overheads.pdf
Multilevel Model continued: http://faculty.smu.edu/kyler/training/AERA_overheads.pdf
Multilevel Model continued: http://faculty.smu.edu/kyler/training/AERA_overheads.pdf
Multilevel Model continued: http://faculty.smu.edu/kyler/training/AERA_overheads.pdf
Adding a Random Sample Component http://faculty.smu.edu/kyler/training/AERA_overheads.pdf
EXAMPLES IN R: Example of group effects without multilevel modeling. Example of the Covariance Theorem. Example of the Random Intercept Model.
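The R examples from the talk are not reproduced in this text. As a rough stand-in for the random intercept example, the Python sketch below works through the simplest case by hand: a random intercept model with no predictors and equally sized groups, estimated by the method of moments. The data, the school names, and the `fit_random_intercepts` helper are hypothetical; in R this kind of model would typically be fit with a mixed-model package such as lme4, which uses (restricted) maximum likelihood rather than the shortcut shown here.

```python
def fit_random_intercepts(groups):
    """Method-of-moments fit of y_ij = b0 + u_j + e_ij for equally sized groups."""
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equally sized groups")
    n = sizes.pop()   # observations per group
    j = len(groups)   # number of groups

    group_means = {name: sum(values) / n for name, values in groups.items()}
    grand_mean = sum(group_means.values()) / j

    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means.values()) / (j - 1)
    ms_within = sum(
        (y - group_means[name]) ** 2
        for name, values in groups.items()
        for y in values
    ) / (j * (n - 1))

    var_resid = ms_within
    var_group = max(0.0, (ms_between - ms_within) / n)

    # Shrinkage weight: how far each group mean is trusted over the grand mean.
    if var_group > 0:
        shrink = var_group / (var_group + var_resid / n)
    else:
        shrink = 0.0

    # Predicted group intercepts: group means pulled toward the grand mean.
    intercepts = {
        name: grand_mean + shrink * (m - grand_mean)
        for name, m in group_means.items()
    }
    return grand_mean, var_group, var_resid, intercepts


# Hypothetical scores for three schools (made-up numbers).
scores = {
    "school_a": [2, 4, 6],
    "school_b": [8, 10, 12],
    "school_c": [14, 16, 18],
}
grand_mean, var_group, var_resid, intercepts = fit_random_intercepts(scores)

print(f"overall intercept: {grand_mean:.3f}")
print(f"group variance: {var_group:.3f}, residual variance: {var_resid:.3f}")
for name, value in intercepts.items():
    print(f"{name}: predicted intercept {value:.3f}")
```

Each predicted intercept sits between the school's own mean and the overall mean. That partial pooling is the practical difference between a random intercept model and fitting each group separately.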