Term 4, 2006, BIO656--Multilevel Models: 140.656 Multi-Level Statistical Models

Presentation transcript:

Slide 1: Multi-Level Statistical Models
If you did not receive the welcome email from me, email me at:

Slide 2: ROOM CHANGE, AGAIN!
Starting Thursday, March 30th, lectures will be in W2030
Labs will still be in W2009

Slide 4: Prerequisites, Resources and Grading

Slide 5: Learning Objectives

Slide 6: Content & Approach

Slide 7: Approach
Lectures include basic illustrations and case studies, structuring an approach and interpreting results
– Labs address computing and amplify the foregoing
My approach is formal, but not “mathematical”
To understand MLMs, you need a very good understanding of single-level models
– If you understand these, you are ready to multi-level!

Slide 8: Structure

Slide 9: RULES FOR HOMEWORK, MID-TERM AND PROJECT
Homework
– Must be individually prepared, but you can get help
– Due dates should be honored
– Turn in hard copy for grading
The in-class midterm
– Must be prepared absolutely independently
– During the exam, no advice or information can be obtained from others
– You can use your notes and reference materials
The term project
– Must be individually prepared, but you can get help
– Must be electronically submitted

Slide 10: Handouts and the Web
Virtually all course materials will be on the web
– Check frequently for updates
I’ve provided hard copy of the general information sheet
However, other lectures will be on the web in PowerPoint format and won’t be handed out
– Download to your computer so you have an electronic version of each part
– Print if you need hard copy, but do it 4 or 6 to a page to save paper
More generally, try to “go electronic,” printing sparingly

Slide 11: COMPUTING & DATA
We will support WinBUGS and Stata
We provide partial support for SAS, which should be used only by current SAS users; we aren’t teaching it from scratch
Some homeworks require use of WinBUGS and another “traditional” program (Stata, SAS, R, ...)
We provide datasets, including some in the WinBUGS examples

Slide 12: WHY BUGS?
Freeware!
In MLMs, it’s important to see distributions, e.g., the skewness of the sampling distribution of variance component estimates
It’s important to incorporate all uncertainties in estimating random effects
Note that WinBUGS isn’t very data-input friendly
And, it’s difficult to produce P-values

Slide 13: STATISTICAL MODELS
A statistical model is an approximation
Almost never is there a “correct” or “best” model; there is no holy grail
A model is a tool for structuring a statistical approach and addressing a scientific question
An effective model combines the data with prior information to address a question

Slide 14: MULTI-LEVEL MODELS
Biological, physical, psycho/social processes that influence health occur at many levels:
– Cell → Organ → Person → Family → Neighborhood → City → Society → ... → Solar system
– Crew → Vessel → Fleet ...
– Block → Block Group → Tract ...
– Visit → Patient → Physician → Clinic → HMO ...
Covariates can be at each level
Many “units of analysis”
More modern and flexible parlance and approach: “many variance components”

Slide 15: Example: Alcohol Abuse
– Cell: neurochemistry
– Organ: ability to metabolize ethanol
– Person: genetic susceptibility to addiction
– Family: alcohol abuse in the home
– Neighborhood: availability of bars
– Society: regulations; organizations; social norms

Slide 16: ALCOHOL ABUSE: A multi-level, interaction model
Interaction between
– Existence of bars & state drunk-driving laws
– Alcohol abuse in a family & ability to metabolize ethanol
– Genetic predisposition to addiction & household environment
– State regulations about intoxication & job requirements

Slide 17: Many names for similar, but not identical, models, analyses and goals
– Multi-level models
– Random effects models
– Mixed models
– Random coefficient models
– Hierarchical models
– Bayesian models

Slide 18: We don’t need MLMs
If your question is about slopes on regressors, you can run a standard regression and (usually) get valid slope estimates
– $Y = \beta_0 + \beta_1(\text{areal monitor}) + \beta_2(\text{home monitor}) + \cdots$
– $Y = \beta_0 + \beta_1(\text{zipcode income}) + \beta_2(\text{personal income}) + \cdots$
– $\mathrm{logit}(P) = \cdots$
Analysis can be followed by computing a “robust” SE to get valid inferences
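A minimal sketch of the “standard regression plus robust SE” route, assuming a single regressor and a cluster-robust (sandwich) standard error with no small-sample correction. The simulated zip-code data, the function name and every parameter value are hypothetical and are not taken from the course materials.

```python
import random
from collections import defaultdict

def ols_slope_with_cluster_robust_se(x, y, cluster):
    """Simple-regression slope with naive and cluster-robust standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xc = [xi - x_bar for xi in x]                      # centered regressor
    sxx = sum(v * v for v in xc)
    slope = sum(v * yi for v, yi in zip(xc, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Naive SE: treats all observations as independent with equal variance.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    se_naive = (sigma2 / sxx) ** 0.5

    # Cluster-robust SE: add up (centered x) * residual within each cluster,
    # then square the cluster totals (the "meat" of the sandwich).
    score = defaultdict(float)
    for v, e, g in zip(xc, resid, cluster):
        score[g] += v * e
    se_robust = (sum(s * s for s in score.values()) / sxx ** 2) ** 0.5
    return slope, se_naive, se_robust

# Hypothetical data: 200 zip codes, 10 people each, true slope 2.0.
rng = random.Random(1)
x, y, cluster = [], [], []
for g in range(200):
    x_g = rng.gauss(0, 1)      # zip-code-level regressor
    b_g = rng.gauss(0, 3)      # shared zip-code effect, induces correlation
    for _ in range(10):
        x.append(x_g)
        y.append(1.0 + 2.0 * x_g + b_g + rng.gauss(0, 2))
        cluster.append(g)

slope, se_naive, se_robust = ols_slope_with_cluster_robust_se(x, y, cluster)
print(f"slope={slope:.2f}  naive SE={se_naive:.3f}  robust SE={se_robust:.3f}")
```

In this setup the slope estimate stays close to the true value, while the naive SE understates the uncertainty because it ignores the shared zip-code effect.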

Slide 19: We do need MLMs
If your question is about variance components, you need to build the multi-level model
– $Y_{ijkl} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \epsilon_{ijkl}$
– $\mathrm{Var}(Y_{ijkl}) = \mathrm{Var}(\epsilon_{ijkl}) = V_{\text{Hospital}} + V_{\text{Clinic}} + V_{\text{Physician}} + V_{\text{Patient}} + V_{\text{unexplained}}$
These variances depend on what Xs are in the model
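One way to read the variance decomposition on this slide, offered as a sketch: assume the error term splits into mutually independent, mean-zero effects, one per level. The symbols $h_i$, $c_{ij}$, $p_{ijk}$, $a_{ijkl}$ and $e_{ijkl}$ are illustrative names and do not appear in the slides.

```latex
\begin{align*}
\epsilon_{ijkl} &= h_i + c_{ij} + p_{ijk} + a_{ijkl} + e_{ijkl},\\
\mathrm{Var}(\epsilon_{ijkl}) &= V_{\text{Hospital}} + V_{\text{Clinic}} + V_{\text{Physician}}
  + V_{\text{Patient}} + V_{\text{unexplained}},\\
\mathrm{Cov}(\epsilon_{ijkl}, \epsilon_{ijkl'}) &= V_{\text{Hospital}} + V_{\text{Clinic}} + V_{\text{Physician}}
  \quad (l \neq l').
\end{align*}
```

Under that assumption, two patients of the same physician share the hospital, clinic and physician effects, which is where their covariance comes from.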

Slide 20: We do need MLMs
To create a broad class of correlation structures
– Longitudinal correlations
– Nested correlations
To structure the improvement of unit-level estimates (latent effects) and to make unit-level predictions

Slide 21: MLMs are effective in
– Producing “working models” that incorporate stochastic realities
– Producing efficient population estimates
– Broadening the inference beyond “these units”
– Protecting against some types of informative missing data processes
– Producing correlation structures
– Generating “overdispersed” versions of standard models
– Structuring estimation of latent effects
But MLMs can be fragile, and care is needed

Slide 22: MLMs are not and should not be
– A religion
– A truth
– The only way to model multi-level data!

Slide 23: Improving individual-level estimates
Similar to the BUGS rat data
Dependent variable ($Y_{ij}$) is weight for rat “i” at age $X_{ij}$
– $i = 1, \ldots, I\ (=10)$; $j = 1, \ldots, J\ (=5)$
– $X_{ij} = X_j = (-14, -7, 0, 7, 14) = (8-22,\ 15-22,\ 22-22,\ 29-22,\ 36-22)$
– $Y_{ij} = b_{i0} + b_{i1} X_j + \epsilon_{ij}$
– As usual, the intercept depends on the centering
Analyses
– Each rat has its own line
– All rats follow the same line: $b_{i0} = \beta_0$, $b_{i1} = \beta_1$
– A compromise between these two
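The first two analyses on this slide can be sketched with ordinary least squares. The weights below are simulated with hypothetical parameter values (they are not the BUGS rat data), and `fit_line` is an illustrative helper name.

```python
import random

AGES = [8, 15, 22, 29, 36]
X = [age - 22 for age in AGES]          # centered ages: -14, -7, 0, 7, 14

def fit_line(x, y):
    """Least-squares intercept and slope for one set of (x, y) points."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical weights for I = 10 rats at J = 5 ages.
rng = random.Random(0)
rats = []
for i in range(10):
    b0 = rng.gauss(240, 5)              # rat-specific intercept (assumed values)
    b1 = rng.gauss(6, 1)                # rat-specific slope (assumed values)
    rats.append([b0 + b1 * xj + rng.gauss(0, 6) for xj in X])

# Analysis 1: each rat has its own line.
per_rat = [fit_line(X, y) for y in rats]

# Analysis 2: all rats follow the same line (pool every observation).
pooled = fit_line(X * len(rats), [yij for y in rats for yij in y])

print("pooled intercept, slope:", pooled)
print("first rat intercept, slope:", per_rat[0])
```

Because every rat is measured at the same centered ages, the pooled intercept and slope equal the averages of the per-rat intercepts and slopes. The third analysis, the compromise, pulls each rat's line toward the pooled one.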

Slide 24: Each rat has its own (LSE, MLE) line (with the population line)
(Figure: per-rat fitted lines, with the population line labeled “Pop line”)

Slide 25: A multi-level model: each rat has its own line, but the lines come from the same distribution
The $b_{i0}$ are independent $N(\beta_0, \tau_0^2)$
The $b_{i1}$ are independent $N(\beta_1, \tau_1^2)$
Overdispersion
– Sample variance of the OLS estimated intercepts: $345 = SE_{\text{int}}^2 + \tau_0^2 = 320 + \tau_0^2 \Rightarrow \tau_0^2 = 25,\ \tau_0 = 5$
– Sample variance of the OLS estimated slopes: $4.25 = SE_{\text{slope}}^2 + \tau_1^2 = 3.25 + \tau_1^2 \Rightarrow \tau_1^2 = 1.00,\ \tau_1 = 1.00$
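The overdispersion arithmetic on this slide is a method-of-moments calculation: the spread of the OLS estimates equals sampling noise plus true between-rat spread. In the sketch below, the sampling variances 320 and 3.25 are the values implied by the slide's totals (345 - 25 and 4.25 - 1.00). The function name and the truncation at zero are assumptions made for illustration.

```python
import math

def between_variance(sample_var_of_estimates, sampling_var):
    """Moment estimate: observed variance of estimates minus sampling variance.

    Truncated at zero, a common convention (an assumption here), because
    the difference can come out negative in small samples.
    """
    return max(0.0, sample_var_of_estimates - sampling_var)

tau0_sq = between_variance(345.0, 320.0)   # intercepts
tau1_sq = between_variance(4.25, 3.25)     # slopes

print("tau0^2 =", tau0_sq, " tau0 =", math.sqrt(tau0_sq))
print("tau1^2 =", tau1_sq, " tau1 =", math.sqrt(tau1_sq))
```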

Slide 26: A compromise: each rat has its own line, but the lines come from the same distribution
(Figure: compromise lines, with the population line labeled “Pop line”)
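One standard way to form such a compromise is the normal-normal shrinkage estimate, sketched here under the assumption that the sampling variance and the between-rat variance are known. The slides do not state this formula; the function name and the example intercepts (260 and 240) are hypothetical.

```python
def shrink(ols_est, pop_mean, sampling_var, between_var):
    """Pull a unit's own estimate toward the population mean.

    B = sampling_var / (sampling_var + between_var) is the weight on the
    population mean: the noisier the unit's own estimate, the larger B.
    """
    b = sampling_var / (sampling_var + between_var)
    return pop_mean + (1 - b) * (ols_est - pop_mean)

# Intercepts with sampling variance 320 and between-rat variance 25.
compromise = shrink(ols_est=260.0, pop_mean=240.0,
                    sampling_var=320.0, between_var=25.0)
print(round(compromise, 2))
```

With these variances the weight on the population mean is 320/345, so the rat's own estimate moves most of the way toward the population line.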

Slide 27: ONE-WAY RANDOM EFFECTS ANOVA

Slide 28: Simulated “Neighborhood Clustering”
Random mean for each of 10 neighborhoods (J = 10)
– $b_1, b_2, \ldots, b_{10}$ iid $N(10, 9)$
Random deviation from the neighborhood mean for each of 10 persons in each neighborhood (n = 10)
– $Y_{ij} = b_j + e_{ij}$, with $e_{ij}$ iid $N(0, 4)$
Conditional independence (given $b_j$) produces, marginally:
– Over-dispersion: the variance of each point is 13 (= 4 + 9)
– Correlation: measurements within each cluster are correlated
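The simulation described on this slide can be sketched as follows. The function names and seeds are illustrative. The larger run with 2000 neighborhoods is an added assumption, used only because a 10-by-10 sample is too small to check the marginal variance and correlation reliably.

```python
import random
from statistics import mean, variance

def simulate_neighborhoods(n_neighborhoods=10, n_persons=10, seed=0):
    """Y_ij = b_j + e_ij with b_j ~ N(10, 9) and e_ij ~ N(0, 4)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_neighborhoods):
        b_j = rng.gauss(10, 3)                     # variance 9 -> sd 3
        data.append([b_j + rng.gauss(0, 2)         # variance 4 -> sd 2
                     for _ in range(n_persons)])
    return data

def pearson(u, v):
    """Sample correlation between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u)
    sv = sum((b - mv) ** 2 for b in v)
    return cov / (su * sv) ** 0.5

slide_sized = simulate_neighborhoods()             # 10 x 10, as on the slide

# Larger run to check the marginal properties.
big = simulate_neighborhoods(n_neighborhoods=2000, seed=1)
total_var = variance([y for hood in big for y in hood])
within_corr = pearson([hood[0] for hood in big], [hood[1] for hood in big])

print(f"total variance ~ {total_var:.2f} (theory: 13)")
print(f"correlation of two neighbors ~ {within_corr:.2f} (theory: 9/13)")
```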

Slide 30: Intra-class Correlation (ICC)
Correlation of two observations in the same cluster:
– $\mathrm{ICC} = \mathrm{Var}(\text{Between})/\mathrm{Var}(\text{Total}) = 1 - \mathrm{Var}(\text{Within})/\mathrm{Var}(\text{Total})$
Estimated ICC: 0.67 = (...)/9.8
True ICC: 0.69 = 9/(9 + 4) = 9/13
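A hedged sketch of one common way to estimate the ICC from balanced clustered data, using one-way ANOVA mean squares. The slides do not say which estimator produced 0.67, so this is an illustration and not a reconstruction of that number. The function name and the simulated data are assumptions.

```python
import random
from statistics import mean

def icc_oneway(data):
    """ANOVA (method-of-moments) ICC for equal-sized clusters."""
    k, n = len(data), len(data[0])
    cluster_means = [mean(g) for g in data]
    grand_mean = mean(cluster_means)          # equals the overall mean when balanced
    ms_between = n * sum((m - grand_mean) ** 2 for m in cluster_means) / (k - 1)
    ms_within = sum((y - m) ** 2
                    for g, m in zip(data, cluster_means)
                    for y in g) / (k * (n - 1))
    var_between = max(0.0, (ms_between - ms_within) / n)   # truncate at zero
    return var_between / (var_between + ms_within)

# Simulated check with between variance 9 and within variance 4.
rng = random.Random(2)
sim = []
for _ in range(2000):
    b_j = rng.gauss(10, 3)
    sim.append([b_j + rng.gauss(0, 2) for _ in range(10)])

icc_hat = icc_oneway(sim)
print(f"estimated ICC ~ {icc_hat:.2f} (true value 9/13 ~ {9 / 13:.2f})")
```

With only 10 clusters, as on the slide, the estimate is much noisier than in this large run.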

Slide 31: (Figure; label: V(b))

Slide 37: (Figure; labels: regression line, Pop line, 45° line)

Slide 44: WEIGHTED MEANS

Slide 52: INFERENCE SPACE (Sanders)
The choice between fixed and random effects depends in part on the reference population (the inference space)
– These studies or people
– Studies or people like these

Slide 53: Random Effects should replace “unit of analysis”
Models contain fixed effects, random effects (via variance components) and other correlation-inducers
There are many “units” and so, in effect, no single set of units
Random effects induce unexplained (co)variance
Some of the unexplained (co)variance may be explicable by including additional covariates
MLMs are one way to induce a structure and estimate the REs

Slide 54: PLEASE DO THIS
If you did not receive the welcome email from me, email me at:

Slide 55: ROOM CHANGE, AGAIN!
Starting Thursday, March 30th, lectures will be in W2030
Labs will still be in W2009

Slide 56: END OF PART I