What is multilevel modelling? Realistically complex modelling Structures that generate dependent data Dataframes for modelling Distinguishing between.

Presentation transcript:

What is multilevel modelling?
Realistically complex modelling
Structures that generate dependent data
Dataframes for modelling
Distinguishing between variables and levels (fixed and random classifications)
Why should we use multilevel modelling as compared to other approaches?
Going further and sources of support

Multilevel Models
AKA random-effects models, hierarchical models, variance-components models, random-coefficient models, mixed models.
First known application: 1861, a one-way random-effects model. Several telescopic observations were made on the same night for several different nights; the variance was separated into between-night and within-night variation.
Modern-day version: 1986, publication of algorithms (linked to software) for dealing with unbalanced data and complex variance functions.

Realistically complex modelling
Statistical models as a formal framework of analysis with a complexity of structure that matches the system being studied.
Four KEY notions:
1: Modelling data with a complex structure. A large range of structures that ML can handle routinely; eg houses nested in neighbourhoods.
2: Modelling heterogeneity. Standard regression models 'averages', ie the general relationship; ML additionally models variances; eg individual house prices vary from neighbourhood to neighbourhood.
3: Modelling dependent data. Potentially complex dependencies in the outcome over time, over space, over context; eg houses within a neighbourhood tend to have similar prices.
4: Modelling contextuality: micro & macro relations; eg individual house prices depend on individual property characteristics and on neighbourhood characteristics.

Modelling data with complex structure
1: Hierarchical structures: model all levels simultaneously
a) People nested within places: two-level model
b) People nested within households within places: three-level model
Note: imbalance allowed!

Non-hierarchical structures
a) cross-classified structure
b) multiple membership with weights
So far unit diagrams; now ...

CLASSIFICATION DIAGRAMS
a) 3-level hierarchical structure: People, Neighbourhoods, Regions
b) cross-classified structure: Students, Neighbourhoods, Schools
c) multiple membership structure: People, Neighbourhoods
Conjecture: all quantitative social designs are a combination of these three structures?

Combining structures: crossed-classifications and multiple membership relationships
Unit diagram: School S1 S2 S3 S4; Pupils P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12; Area A1 A2 A3
Pupil 1 moves in the course of the study from residential area 1 to 2 and from school 1 to 2.
Pupil 7 has moved areas but still attends the same school.
Pupil 8 has moved schools but still lives in the same area.
Classification diagram: Student, School, Area
Now, in addition to schools being crossed with residential areas, pupils are multiple members of both areas and schools.
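A minimal sketch of how multiple-membership weights might be built for a pupil who changes school. It assumes a common convention that is not stated on the slide: each pupil's weights are proportional to time spent in each school and sum to 1. The function name and the term counts are hypothetical.

```python
def membership_weights(terms_by_school):
    """Turn time spent in each school into weights that sum to 1.

    Assumption (not from the slides): weights are proportional to the
    number of terms a pupil attended each school.
    """
    total = sum(terms_by_school.values())
    return {school: terms / total for school, terms in terms_by_school.items()}

# Hypothetical attendance records: a pupil who moved from S1 to S2,
# and a pupil who stayed in S1 throughout.
pupil_1 = membership_weights({"S1": 2, "S2": 4})
pupil_2 = membership_weights({"S1": 6})

print(pupil_1)
print(pupil_2)
```

A pupil who never moves gets a single weight of 1, so the ordinary hierarchical structure is a special case of the same representation.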

A data-frame for examining neighbourhood effects on price of houses (form needed for MLwiN)
Classifications or levels: House i, N'hood j
Response: Price ij
Explanatory variables: No of Rooms ij, House type ij, N'hood type j
Table values as extracted (row alignment lost): 1 75 6 Semi Suburb 2 71 8 3 91 7 Det 68 4 Ter Central 37 67 82 85 5 54 9 Terr 43 66 55
Questions for multilevel (random coefficient) models:
What is the between-neighbourhood variation in price taking account of size of house?
Are large houses more expensive in central areas?
Are detached houses more variable in price?

Two-level repeated measures design: classifications, units and dataframes
Classification diagram: Person, Measurement Occasion
Unit diagram: Persons P1 P2 P3 .....; Occasions O1 O2 O3 O4, O1 O2, O1 O2 O3
a) in long form (form needed for MLwiN)
Classifications or levels    Response     Explanatory variables
Occasion i    Person j       Income ij    Age ij    Gender j
1             1              75           25        F
2             1              85           26        F
3             1              95           27        F
1             2              82           32        M
2             2              91           33        M
1             3              88           45
2             3              93           46
3             3              96           47
b) in short form
Person    Inc-Occ1    Inc-Occ2    Inc-Occ3    Age-Occ1    Age-Occ2    Age-Occ3    Gender
1         75          85          95          25          26          27          F
2         82          91          *           32          33          *           M
3         88          93          96          45          46          47
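The move from short (one row per person) to long form (one row per occasion) can be sketched in plain Python. This is only an illustration using the income and age values from the slide; the record layout and the function name are assumptions, `None` stands in for the `*` missing-value marker, and gender is left out because it is not fully recoverable from the transcript.

```python
# Short-form records: one entry per person, one list slot per occasion.
# None marks a missing measurement (assumed equivalent to * on the slide).
wide = [
    {"person": 1, "income": [75, 85, 95], "age": [25, 26, 27]},
    {"person": 2, "income": [82, 91, None], "age": [32, 33, None]},
    {"person": 3, "income": [88, 93, 96], "age": [45, 46, 47]},
]

def wide_to_long(records):
    """Return one row per (occasion, person), skipping missing occasions."""
    rows = []
    for rec in records:
        pairs = zip(rec["income"], rec["age"])
        for occasion, (income, age) in enumerate(pairs, start=1):
            if income is None:
                continue  # unbalanced data: no row for this occasion
            rows.append({"occasion": occasion, "person": rec["person"],
                         "income": income, "age": age})
    return rows

long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

Dropping the missing occasion rather than the whole person is what lets the long form keep person 2 in the analysis, which ties in with the earlier note that imbalance is allowed.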

Distinguishing Variables and Levels
Unit diagram: House H1 H2 H3, H1 H2 H3, H1 H2, H1 H2 H3 H4; N'hood N1 N2, N1 N2; N'hood type Suburb, Central
NO! N'hood type is not a random classification but a fixed classification, and therefore an attribute of a level; ie a VARIABLE.
Random classification: units can be regarded as a random sample from a wider population of units. Eg houses and n'hoods.
Fixed classification: a small fixed number of categories. Eg Suburb and Central are not two types sampled from a large number of types; on the basis of these two we cannot generalise to a wider population of types of n'hoods.
Classifications or levels: House i, N'hood j, Type k
Response: Price ijk
Explanatory variables: Rooms ijk, House type ijk
Table values as extracted (row alignment lost): 1 Suburb 75 6 Det 2 71 4 3 91 7 F Central 68 9 37 M Etc

Analysis Strategies for Multilevel Data
What are the alternatives; and why use multilevel modelling?

Analysis Strategies: I Group-level analysis
Move up the scale: analyse only at the macro level. Aggregate to level 2 and fit a standard regression model.
Problem: cannot infer individual-level relationships from group-level relationships (ecological or aggregation fallacy).
Example: research on school effects
Response: current score on a test, turned into an average for each of j schools
Predictor: past score, turned into an average for each of j schools
Model: regress means on means
Means-on-means analysis is meaningless! The mean does not reflect the within-group relationship: the same mean can arise from three very different within-school relations (elitist, egalitarian, bizarre!).
Aitkin, M., Longford, N. (1986), "Statistical modelling issues in school effectiveness studies", Journal of the Royal Statistical Society, Series A, Vol. 149 No.1, pp.1-43.

I Group-level analysis (continued)
Aggregate to level 2 and fit a standard regression model.
Problem: cannot infer individual-level relationships from group-level relationships (ecological or aggregation fallacy).
Robinson (1950) demonstrated the problem by calculating the correlation between illiteracy and ethnicity in the USA in 1930 at two scales of analysis, individual (97 million people) and aggregate (48 States): very different results!
The ECOLOGICAL FALLACY
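The gap between group-level and within-group relationships can be illustrated with a toy calculation. The scores below are invented purely for illustration (they are not Robinson's or Aitkin and Longford's data), and the school names and helper function are hypothetical: the school means rise together, yet within every school the relationship runs the other way.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Invented data: (prior score, current score) for pupils in three schools.
schools = {
    "A": ([1, 2, 3], [5, 4, 3]),
    "B": ([4, 5, 6], [8, 7, 6]),
    "C": ([7, 8, 9], [11, 10, 9]),
}

# Means-on-means regression: one point per school.
mean_x = [sum(x) / len(x) for x, _ in schools.values()]
mean_y = [sum(y) / len(y) for _, y in schools.values()]
between = slope(mean_x, mean_y)

# Separate regression within each school.
within = {name: slope(x, y) for name, (x, y) in schools.items()}

print("between-school slope:", between)
print("within-school slopes:", within)
```

Here the between-school slope is positive while every within-school slope is negative, so reading the aggregate result as a statement about individual pupils would get the sign wrong.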

What does an individual analysis miss?
Subramanian, SV, Jones, K, et al (2009) 'Revisiting Robinson: The perils of individualistic and ecological fallacy', International Journal of Epidemiology
Re-analysis as a two-level model (97m people in 48 States)
Individual (single-level) model of (logit) illiteracy & ethnicity: who is illiterate?
Two-level model (People in States): does this vary from State to State? Cross-level interactions between people and places?

Analysis Strategies (cont.) III Contextual analysis
Analyse individual-level data but include group-level predictors.
Problem: assumes all group-level variance can be explained by group-level predictors; incorrect SEs for group-level predictors.
Example: do pupils in single-sex schools experience higher exam attainment?
Structure: 4059 pupils in 65 schools
Response: Normal score across all London pupils aged 16
Predictor: Girls' and Boys' schools compared to Mixed schools
Parameter                          Single level      Multilevel
Cons (Mixed school)                -0.098 (0.021)    -0.101 (0.070)
Boy school                          0.122 (0.049)     0.064 (0.149)
Girl school                         0.245 (0.034)     0.258 (0.117)
Between school variance (σ²u)                         0.155 (0.030)
Between student variance (σ²e)      0.985 (0.022)     0.848 (0.019)
SEs in parentheses
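Two quantities can be read off the single-sex school table with simple arithmetic. The sketch below uses only the estimates reported on the slide. The variance partition coefficient (the share of residual variance lying between schools) is a standard derived quantity that the slide itself does not compute, and the variable names are illustrative.

```python
# Variance estimates from the multilevel column of the table.
between_school = 0.155
between_student = 0.848

# Share of the residual variance that lies between schools.
vpc = between_school / (between_school + between_student)

# Standard errors: (single-level, multilevel) for each school-level term.
standard_errors = {
    "Cons (Mixed school)": (0.021, 0.070),
    "Boy school": (0.049, 0.149),
    "Girl school": (0.034, 0.117),
}

# How many times larger the multilevel SE is than the single-level SE.
se_ratio = {name: multi / single
            for name, (single, multi) in standard_errors.items()}

print(f"VPC: {vpc:.3f}")
for name, ratio in se_ratio.items():
    print(f"{name}: multilevel SE is {ratio:.1f} times the single-level SE")
```

Roughly 15% of the residual variance is between schools, and each school-level standard error is about three times larger once that clustering is modelled, which is why the Boy school effect looks significant in the single-level model but not in the multilevel one.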

Analysis Strategies (cont.) IV Analysis of covariance (fixed effects model)
Include dummy variables for each and every group.
Problems:
What if the number of groups is very large, eg households?
No single parameter assesses between-group differences
Cannot make inferences beyond groups in sample
Cannot include group-level predictors as all degrees of freedom at the group level have been consumed
Target of inference: individual School versus schools

Analysis Strategies (cont.)
V Fit single-level model but adjust standard errors for clustering (GEE approach)
Problems: treats groups as a nuisance rather than of substantive interest; no estimate of between-group variance; not extendible to more levels and complex heterogeneity.
VI Multilevel (random effects) model
Partition residual variance into between- and within-group (level 2 and level 1) components. Allows for unobservables at each level; corrects standard errors; micro AND macro models analysed simultaneously; avoids ecological fallacy and atomistic fallacy; richer set of research questions.
BUT (as usual) need well-specified model and assumptions met.
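The partition of residual variance in strategy VI can be written out for the simplest case. The notation below is the conventional two-level random-intercept model, given here as an assumed illustration and not taken from the slides: pupil i in school j, with one predictor x.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \\
u_j &\sim N(0, \sigma_u^2) \quad \text{(level 2: between-group)}, \\
e_{ij} &\sim N(0, \sigma_e^2) \quad \text{(level 1: within-group)}, \\
\operatorname{Var}(y_{ij} \mid x_{ij}) &= \sigma_u^2 + \sigma_e^2 .
\end{align}
```

The fixed part (the betas) carries the average relationship, while the two variance terms are the between-group and within-group components that a single-level model would lump into one residual.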

Type of questions tackled by ML: fixed AND random effects
Even with only a 'simple' hierarchical 2-level structure. Eg 2-level model: current attainment given prior attainment of pupils (1) in schools (2)
Do boys make greater progress than girls? (F: ie averages)
Are boys more or less variable in their progress than girls? (R: modelling variances)
What is the between-school variation in progress? (R)
Is School X different from other schools in the sample in its effect? (F)
...

Type of questions tackled by ML (cont.)
Are schools more variable in their progress for pupils with low prior attainment? (R)
Does the gender gap vary across schools? (R)
Do pupils make more progress in denominational schools? (F) (correct SEs)
Are pupils in denominational schools less variable in their progress? (R)
Do girls make greater progress in denominational schools? (F) (cross-level interaction) (correct SEs)
More generally, a focus on variances: segregation and inequality are all about differences between units.

Resources
Centre for Multilevel Modelling: http://www.cmm.bris.ac.uk
Provides access to general information about multilevel modelling and MLwiN.
Email discussion group: http://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=multilevel (with searchable archives)

http://www.cmm.bristol.ac.uk/

http://www.cmm.bristol.ac.uk/learning-training/course.shtml

http://www.cmm.bristol.ac.uk/links/index.shtml

http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/index.shtml

The MLwiN manuals are another training resource http://www.cmm.bristol.ac.uk/MLwiN/download/manuals.shtml

Texts
Comprehensive but demanding!: Goldstein
Thorough but a little dated: Snijders & Bosker
Approachable: Hox
Authoritative: de Leeuw & Meijer
Applications: education, O'Connell & McCoach
Applications: health, Leyland & Goldstein
http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-support/books.shtml

Why should we use multilevel models? Sometimes: single level models can be seriously misleading!