When the Mean isn’t Enough

Presentation transcript:

When the Mean isn’t Enough: Methods for Assessing Individual Differences Using SAS. Melissa McTernan, PhD, CSU Sacramento

Talk Outline. Introduction: common focus on population means; costs of means-only analysis; motivation to look beyond the mean. Methods for assessing individual differences in SAS: preparing the data; visualizing the individual; mixed effects models.

The Average Californian: Do you know her? The average person in California*: is 35 years old; is Latinx; is a woman; is overweight; is a Democrat and a …; shares a household with 1.95 other people; pays $1,851/mo for her mortgage; has a 28-minute commute to work; works 41.3 hours a week in a retail job; has $11,760 of student debt; drinks 4.8 alcoholic beverages a week; … and has spent $295 on spontaneous purchases while under the influence. *Based on finder.com data and data from the US Census and the Bureau of Labor Statistics.

Common Focus on the Pop. Mean: Many commonly used methods produce only mean estimates. t-tests: Is the mean for group A different from the mean for group B? ANOVAs: Do groups A, B, and C have different means? Linear (simple or multiple) regression: How much does Y change for a one-unit change in X, for the average individual?

Limitations of Means-Only Methods: The mean may or may not be a good summary of the data. Even if the mean is a good summary, it still may not represent ANY individual in the population (recall the example of the “average Californian”). For which of these distributions is the mean more representative of the group?

Limitations of Means-Only Methods: Implications. Overgeneralization can lead us down an ugly path… Clinical and medical interventions may work well for the average person but may actually be harmful for individuals in certain subgroups. A focus on the “average” may hide disparities. Example: A longitudinal study may show that, on average, student performance is increasing. Without looking at the individual learning curves, we miss the important fact that some students’ performance is not increasing, or is declining.

Methods for Assessing Individual Differences: Using NLSY97 data, 2006-2008. Variables: overall outlook on life, whether the participant has health care coverage, and a continuous measure of general health.

Preparing the Data: “Wide” format vs. “long” format

Preparing the Data: ARRAY statements are a simple way to reshape the data, but inefficient for large datasets. PROC TRANSPOSE is more efficient, but limited to reshaping a single repeated measure at a time.
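
A minimal sketch of both reshaping approaches follows. It assumes a hypothetical wide data set `nlsy_wide` with one row per person, an `id` variable, and yearly columns `health06`-`health08` and `covered06`-`covered08`. All of these names are illustrative assumptions, not the presenter's actual variable names.

```sas
/* Option 1: DATA step with ARRAY statements.
   Several repeated measures can be reshaped in one pass. */
data nlsy_long;
  set nlsy_wide;                      /* hypothetical wide data set */
  array h{3} health06 health07 health08;
  array c{3} covered06 covered07 covered08;
  do wave = 1 to 3;
    year    = 2005 + wave;            /* 2006, 2007, 2008 */
    time    = wave - 1;               /* 0, 1, 2: years since 2006 */
    health  = h{wave};
    covered = c{wave};
    output;                           /* one row per person per year */
  end;
  keep id year time health covered;
run;

/* Option 2: PROC TRANSPOSE for one repeated measure.
   The input must be sorted by the BY variable. */
proc sort data=nlsy_wide;
  by id;
run;

proc transpose data=nlsy_wide
               out=health_long(rename=(col1=health))
               name=source_var;       /* holds the original column name */
  by id;
  var health06 health07 health08;
run;
```

With PROC TRANSPOSE, a second repeated measure such as coverage would need its own call, followed by a merge on `id` and year.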

Visualizing Individual Differences: PROC SGPLOT to visualize complex data. It allows you to build upon a base chart by adding layers of chart components. Examples: build a histogram with a density plot overlay; build a scatterplot, then overlay a line of best fit. Spaghetti plots for longitudinal data: plot a trajectory for each individual across time, then overlay the mean trajectory.
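
The layering idea can be sketched as below. The example assumes a hypothetical long data set `nlsy_long` (one row per person per year, sorted by `id` and `year`) with variables `id`, `year`, and `health`. The names, colors, and line widths are illustrative choices, not taken from the original slides.

```sas
/* Histogram with a density overlay: two statements, two layers */
proc sgplot data=nlsy_long;
  histogram health;
  density health;                     /* normal density by default */
run;

/* Spaghetti plot: compute the mean trajectory first ... */
proc means data=nlsy_long noprint nway;
  class year;
  var health;
  output out=mean_traj mean=mean_health;
run;

/* ... stack it under the individual rows ... */
data plot_data;
  set nlsy_long mean_traj(keep=year mean_health);
run;

/* ... then draw the two layers */
proc sgplot data=plot_data noautolegend;
  /* Layer 1: one thin line per person */
  series x=year y=health / group=id
         lineattrs=(color=lightgray pattern=solid thickness=1);
  /* Layer 2: the mean trajectory on top */
  series x=year y=mean_health / lineattrs=(color=red thickness=3);
  xaxis label="Year" values=(2006 to 2008 by 1);
  yaxis label="General health";
run;
```

Statements are drawn in the order they appear, so the mean line is listed last to keep it visible above the individual trajectories.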

Visualizing Individual Differences [Figure: plot built up in layers, labeled Layer 1 and Layer 2]

Visualizing Individual Differences: What information would we be missing if we only plotted the red line?

Visualizing Individual Differences: PROC SGPANEL also takes advantage of layering. It lets you compare plots side by side using the PANELBY statement. Now we can look at individuals within subgroups, within the sample at large, rather than a mean trajectory across all groups and all people.
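
A sketch of a paneled spaghetti plot, under the same assumptions as a long data set `nlsy_long` with `id`, `year`, and `health`, plus a person-level grouping variable hypothetically named `covered_base` (for example, coverage status at the first wave). The grouping variable is assumed constant within person so that each trajectory stays in one panel.

```sas
/* Mean trajectory within each subgroup */
proc means data=nlsy_long noprint nway;
  class covered_base year;
  var health;
  output out=group_means mean=mean_health;
run;

/* Stack the subgroup means under the individual rows */
data panel_data;
  set nlsy_long group_means(keep=covered_base year mean_health);
run;

proc sgpanel data=panel_data noautolegend;
  panelby covered_base / columns=2;   /* one panel per subgroup */
  /* Layer 1: individual trajectories */
  series x=year y=health / group=id
         lineattrs=(color=lightgray pattern=solid);
  /* Layer 2: subgroup mean trajectory */
  series x=year y=mean_health / lineattrs=(color=red thickness=3);
  colaxis label="Year";
  rowaxis label="General health";
run;
```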

Accounting for Individual Differences with PROC MIXED in Longitudinal Statistical Analyses

Accounting for Individual Differences with PROC MIXED: Linear mixed effects models allow you to add random effects to account for individual differences in model parameters. Add a random intercept to account for variance in the intercept across individuals. Add a random slope to account for variance in the slope across individuals. First, let’s look at a model that only provides information about the typical person (i.e., a fixed effects model, or a model without any random effects).
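
One common way to write a linear growth model with a random intercept and a random slope is shown below. The notation is an assumption introduced here for illustration (it does not appear in the slides): $y_{ti}$ is the outcome for person $i$ at occasion $t$, $\beta_0$ and $\beta_1$ are the fixed (average) intercept and slope, and $b_{0i}$ and $b_{1i}$ are person $i$'s deviations from them.

```latex
\begin{aligned}
y_{ti} &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\,\mathrm{time}_{ti} + \varepsilon_{ti},\\
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix}
  &\sim N\!\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix} \sigma^2_{0} & \sigma_{01} \\ \sigma_{01} & \sigma^2_{1} \end{pmatrix}
  \right),
\qquad \varepsilon_{ti} \sim N(0, \sigma^2_{\varepsilon}).
\end{aligned}
```

Setting $b_{0i} = b_{1i} = 0$ for every person recovers the fixed-effects-only model, which describes only the typical trajectory $\beta_0 + \beta_1\,\mathrm{time}$. The variances $\sigma^2_0$ and $\sigma^2_1$ quantify how much individuals differ in intercept and slope.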

Accounting for Individual Differences with PROC MIXED: First, let’s look at a model that only provides information about the typical person …
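
The code from this slide did not survive in the transcript. A fixed-effects-only growth model of the kind described might look like the following sketch, which assumes a hypothetical long data set `nlsy_long` with a continuous outcome `health` and a variable `time` coded as years since 2006 (0, 1, 2).

```sas
/* Fixed effects only: one intercept and one slope for everyone */
proc mixed data=nlsy_long method=ml;
  model health = time / solution;     /* SOLUTION prints the estimates */
run;
```

The output contains a single intercept and slope describing the average trajectory, with all person-to-person variation absorbed into the residual variance.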

Accounting for Individual Differences with PROC MIXED: Add a RANDOM statement to add variance components for the growth parameters.
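
In PROC MIXED, variance components for the growth parameters (a random intercept and a random slope) are requested with the RANDOM statement. The sketch below is not the slide's original code. It assumes a hypothetical long data set `nlsy_long` with `id`, a continuous outcome `health`, and `time` coded as years since 2006.

```sas
proc mixed data=nlsy_long method=ml covtest;
  class id;
  model health = time / solution;     /* fixed (average) intercept and slope */
  /* Added statement: person-specific deviations in intercept and slope */
  random intercept time / subject=id type=un;
run;
```

TYPE=UN estimates the intercept variance, the slope variance, and their covariance. COVTEST adds Wald tests for those covariance parameters, which should be read cautiously because variances lie on the boundary of the parameter space.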

Accounting for Individual Differences with PROC NLMIXED in Longitudinal Statistical Analyses

Accounting for Individual Differences with PROC NLMIXED: PROC NLMIXED is more flexible than PROC MIXED. It fits non-linear mixed effects models; the outcome may be non-normally distributed (e.g., binary); and it supports user-defined log-likelihood functions. [Figure annotation: variance in random intercept]
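
A minimal sketch of a random-intercept logistic growth model in PROC NLMIXED. It assumes a hypothetical long data set `nlsy_long`, sorted by `id`, with a 0/1 outcome `covered` and `time` coded as years since 2006. The parameter names (`b0`, `b1`, `s2u`) and starting values are arbitrary illustrative choices.

```sas
/* NLMIXED starts a new subject whenever the SUBJECT= value changes,
   so the data should be sorted (or at least grouped) by id. */
proc sort data=nlsy_long;
  by id;
run;

proc nlmixed data=nlsy_long;
  parms b0=0 b1=0 s2u=1;              /* starting values */
  eta = b0 + b1*time + u0;            /* linear predictor with random intercept */
  p   = 1 / (1 + exp(-eta));          /* inverse logit */
  model covered ~ binary(p);
  random u0 ~ normal(0, s2u) subject=id;  /* s2u: variance in random intercept */
run;
```

Because the model is written out as programming statements, the linear predictor and link can be replaced with any user-specified function.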

Accounting for Individual Differences with PROC GLIMMIX in Longitudinal Statistical Analyses

Accounting for Individual Differences with PROC GLIMMIX: PROC GLIMMIX is also very flexible. It fits generalized linear mixed models; the outcome may be non-normally distributed (e.g., binary).
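
A sketch of the same kind of random-intercept logistic model in PROC GLIMMIX, again assuming a hypothetical long data set `nlsy_long` with `id`, a 0/1 outcome `covered`, and `time` coded as years since 2006.

```sas
/* METHOD=QUAD requests adaptive quadrature; omit it to use the
   default pseudo-likelihood estimation. */
proc glimmix data=nlsy_long method=quad;
  class id;
  model covered(event='1') = time / dist=binary link=logit solution;
  random intercept / subject=id;      /* person-specific intercepts */
run;
```

Unlike PROC NLMIXED, the model is specified with MODEL and RANDOM statements in the style of PROC MIXED, and the distribution and link are chosen through options.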

Comparing NLMIXED and GLIMMIX: PROC GLIMMIX defaults to a pseudo-likelihood approach for estimating model parameters. PROC NLMIXED defaults to maximum likelihood (ML), approximated with adaptive Gauss-Hermite quadrature. Note: This is the same approach that PROC GLIMMIX uses when METHOD=QUAD is specified in the PROC GLIMMIX statement.

Conclusions: What are the take-aways from this presentation?

Conclusions: Visualizing only average trends or estimating only average parameters may provide information about the typical person, but that is often not useful. The “typical” person may not even exist! SAS software offers many procedures for data management, data visualization, and data analysis that preserve information about the individual. Making use of these procedures and incorporating them into practice can lead to more effective and informed interventions/responses.

Contact Information: Melissa McTernan, PhD, Sac State University, Sacramento, CA. Email: mcternan@csus.edu