Biostatistics Case Studies Peter D. Christenson Biostatistician Session 3: Missing Data in Longitudinal Studies.


Case Study
Hall S et al. A comparative study of Carvedilol, slow release Nifedipine, and Atenolol in the management of essential hypertension. J of Cardiovascular Pharmacology 1991;18(4):S35-38.

Case Study Outline
Subjects randomized to one of 3 drugs for controlling hypertension:
A: Carvedilol (new)
B: Nifedipine (standard)
C: Atenolol (standard)
Blood pressure and HR measured at baseline and 4 post-treatment periods.
Primary analysis is unclear, but changes over time in HR and bp are compared among the 3 groups.

Available Data: sitting dbp
[Table: number of subjects in groups A, B and C, as reported in the paper and as found in the data, by visit number and week: Screen, Baseline dbp, and four Post visits.]

Sitting dbp from Figure 2

Group A: Baseline and Final dbp
Columns: Week 0; Last Value: Pre Week 8; Week 8; Final; Δ
Graph: N= ± 0.52 N= ± 0.96 N= ± ± ?
Completers: N= ± 0.53 N= ± 0.96 N= ± ± 1.10
Last Observation Carried Forward (LOCF): N= ± 0.52 N= ± 3.47 N= ± 0.96 N= ± ± 1.11

Wanted: Use N=100 w/o LOCF
Combine:
Info on true 8 week change in 83 subjects.
Info on baseline only in 17 subjects.
Use week0-week8 correlation in 83 subjects.
More generally: Suppose 9 subjects had only week 0 and 8 subjects had only week 8. Then there are really 2 experiments, 1 paired (N=83) and 1 unpaired (N1=9 and N2=8).
Combining involves weighting Δs from the 2 experiments.
Does not impute (substitute) values for the 17 unknown values.
Generalize further to >2 time periods and >1 treatment, etc.
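
The idea of weighting Δs from two experiments can be sketched in Python as an inverse-variance weighted average. This only illustrates the weighting principle and is not the mixed-model estimator itself; the function name `combine_estimates` and all numeric inputs are hypothetical.

```python
import math

def combine_estimates(delta_paired, se_paired, delta_unpaired, se_unpaired):
    """Inverse-variance weighted average of two independent estimates of one change."""
    w_paired = 1.0 / se_paired ** 2        # more precise estimate gets more weight
    w_unpaired = 1.0 / se_unpaired ** 2
    total = w_paired + w_unpaired
    combined = (w_paired * delta_paired + w_unpaired * delta_unpaired) / total
    combined_se = math.sqrt(1.0 / total)   # never larger than either input SE
    return combined, combined_se

# Hypothetical inputs: a precise paired estimate and a noisy unpaired one.
delta, se = combine_estimates(12.0, 1.0, 14.0, 3.0)
print(delta, se)
```

The combined estimate sits close to the paired one because the paired experiment carries most of the information.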

Mixed Models
Mixed models implement our need here.
"Mixed" means combination of fixed effects (e.g., drugs; want info on those particular drugs) and random effects (e.g., centers or patients; not interested in the particular ones in the study).
AKA multilevel models, hierarchical models.
Very flexible: they incorporate unequal patient variability, correlation, pairing, repeated values at multiple levels (e.g., sitting and standing dbp in Fig 2, or subjects clustered within a family when genetics is an issue), and data missing at random.
More assumptions required than typical analyses.

Data Structure for Software
Need (one row per patient per week): patient week dbp etc
Not (one row per patient): patient wk0 wk2 wk4 wk6 wk
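
A minimal Python sketch of converting the one-row-per-patient layout into the one-row-per-measurement layout is shown here. The column names `wk0` and `wk8`, the helper `wide_to_long`, and the dbp values are hypothetical; a missing visit simply contributes no row.

```python
def wide_to_long(wide_rows, week_columns):
    """Turn one-row-per-patient records into one-row-per-measurement records."""
    long_rows = []
    for row in wide_rows:
        for column, week in week_columns.items():
            value = row.get(column)
            if value is None:          # missing visit: emit no row for it
                continue
            long_rows.append({"patient": row["patient"], "week": week, "dbp": value})
    return long_rows

# Hypothetical data: patient 2 dropped out before week 8.
wide = [
    {"patient": 1, "wk0": 102, "wk8": 90},
    {"patient": 2, "wk0": 98, "wk8": None},
]
long_format = wide_to_long(wide, {"wk0": 0, "wk8": 8})
for record in long_format:
    print(record)
```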

Software
Need to use a mixed model module. Often, options are unclear.
Use:
SPSS: Analyze > Mixed
SAS: proc mixed
Repeated measures modules with options for random factors do not typically handle missing data, e.g.:
SPSS: Analyze > GLM > Repeated > … Random
SAS: proc glm; model …; random …;
These are not in general OK, but will work with certain balanced patterns of missing data.

Mixed Models in SPSS
Select Analyze > Mixed > Linear.

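Mixed Models in SAS
Select Solutions > Analysis > Analyst > Statistics > ANOVA > Mixed models
Alternatively, typical code is:
proc mixed;
  class week patient;
  model dbp = week / ddfm=satterthwaite;           /* Satterthwaite degrees of freedom */
  lsmeans week / cl;                               /* weekly means with confidence limits */
  estimate 'Week Diff' week 1 -1;                  /* difference between the two weeks */
  repeated week / subject=patient type=un rcorr;   /* unstructured covariance; print correlations */
  title 'Mixed Model N= Unstructured';
run;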

Model 1 Results
Estimated Change (columns: Label, Estimate, Standard Error, DF, t Value, Pr > |t|):
Week Diff <.0001
So, Δ = 12.61 ± 1.04 incorporates observations.
Estimated Means (columns: Effect, week, Estimate, Standard Error):
week
week

Group A: Baseline and Final dbp Update
Columns: Week 0; Last Value: Pre Week 8; Week 8; Final; Δ
Graph: N= ± 0.52 N= ± 0.96 N= ± ± 1.04
Completers: N= ± 0.53 N= ± 0.96 N= ± ± 1.10
Last Observation Carried Forward (LOCF): N= ± 0.52 N= ± 3.47 N= ± 0.96 N= ± ± 1.11
Is model appropriate? Depends on assumed covariance pattern.

Model 1 Covariance Pattern: Compound Symmetry
Software Output:
Estimated R Correlation Matrix for patient 4 (columns: Row Col1 Col)
Covariance Parameter Estimates (columns: Cov Parm, Subject, Estimate): CS patient; Residual
Output Interpretation:
Estimated Covariance Pattern: variance (7.06)² at each week.
Note that this model assumes that variability among subjects is the same at each week, and that there is a correlation between the weeks (estimated at ).
But: Week 0 SD = 5.2, Week 8 SD = 8.8

Model 2 Covariance Pattern: Unstructured
Software Output:
Estimated R Correlation Matrix for patient 4 (columns: Row Col1 Col)
Covariance Parameter Estimates (columns: Cov Parm, Subject, Estimate): UN(1,1) patient; UN(2,1) patient; UN(2,2) patient
Output Interpretation:
Estimated Covariance Pattern: variance (5.21)² at week 0 and (8.79)² at week 8.
This model allows different variability among subjects at each week, and a correlation between the weeks (estimated at 0.011).
This better models the SDs: Week 0 SD = 5.2, Week 8 SD = 8.8

Model 3 Covariance: Heterogeneous Uncorrelated
Software Output:
Estimated R Correlation Matrix for patient 4 (columns: Row Col1 Col)
Covariance Parameter Estimates (columns: Cov Parm, Subject, Estimate): UN(1,1) patient; UN(2,1) patient 0; UN(2,2) patient
Output Interpretation:
Estimated Covariance Pattern: variance (5.21)² at week 0 and (8.79)² at week 8, with covariance 0.
This model allows different variability among subjects at each week, but no correlation between the two weeks.
Matches: Week 0 SD = 5.2, Week 8 SD = 8.8
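
For two time points, the three covariance patterns fitted in Models 1 to 3 can be written as follows. The symbols are notation introduced here, not taken from the slides: σ is a common standard deviation, σ₀ and σ₈ are the week 0 and week 8 standard deviations, and ρ is the between-week correlation.

```latex
\[
\Sigma_{\text{CS}} =
\begin{pmatrix} \sigma^2 & \rho\,\sigma^2 \\ \rho\,\sigma^2 & \sigma^2 \end{pmatrix},
\qquad
\Sigma_{\text{UN}} =
\begin{pmatrix} \sigma_0^2 & \rho\,\sigma_0\sigma_8 \\ \rho\,\sigma_0\sigma_8 & \sigma_8^2 \end{pmatrix},
\qquad
\Sigma_{\text{HU}} =
\begin{pmatrix} \sigma_0^2 & 0 \\ 0 & \sigma_8^2 \end{pmatrix}
\]
```

Compound symmetry (CS) and heterogeneous uncorrelated (HU) each have 2 parameters, and unstructured (UN) has 3, so both simpler patterns are special cases of the unstructured one.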

Choice of Covariance Pattern
Columns: Model; Covariance Pattern; -2 Log Likelihood
1: Comp Sym; Corr & = SDs
2: Unstructured; Corr & ≠ SDs
3: Heterog Uncorr; 0 Corr & ≠ SDs
Use a likelihood ratio test to check whether a more complex model significantly improves the fit to the data. Models must be "nested".
Is model 2 significantly better than model 1? Χ² = 24.2 has a Χ² distribution with d.f. = difference in # of estimated parameters (here 3-2) if model 2 is not an improvement.
P-value = Prob(Χ² > 24.2) < 0.0001, so model 2 is needed.
Final choice: model 3.
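
The p-value for the model 2 versus model 1 comparison can be checked with a short Python sketch. It assumes only the statistic 24.2 and the 1 d.f. stated on the slide, and uses the identity that a chi-square variable with 1 d.f. is a squared standard normal; the helper name `chi2_sf_1df` is hypothetical.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 d.f."""
    # X = Z**2 with Z standard normal, so P(X > x) = P(|Z| > sqrt(x)).
    return math.erfc(math.sqrt(x / 2.0))

lr_statistic = 24.2                 # likelihood ratio statistic from the slide
p_value = chi2_sf_1df(lr_statistic)
print(p_value < 0.0001)
```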

Model 3 Results
Estimated Change (columns: Label, Estimate, Standard Error, DF, t Value, Pr > |t|):
Week Diff <.0001
Thus, use Δ = 12.61 ± 1.10 from observations.
Estimated Means (columns: Effect, week, Estimate, Standard Error, DF):
week
week

Conclusions for Group A Week 0 to Week 8 dbp Δ
Last observation carried forward overestimates dbp at week 8.
Essentially 0 correlation between residual week 0 and week 8 dbp.
Use mixed model with heterogeneous uncorrelated covariance pattern.
This mixed model is equivalent to a 2-sample t-test with unequal variances using Satterthwaite's weighting.
This equivalence would not hold if either (1) some subjects only had dbp at week 8, or (2) correlation was stronger between weeks 0 and 8, which usually happens.
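
The stated equivalence with an unequal-variance 2-sample t-test can be checked numerically. This Python sketch uses the SDs quoted on the slides (5.21 and 8.79) and assumes N=100 at week 0 and N=83 at week 8, which is a reading of the "Wanted" slide rather than a value stated with the model output; the function name `welch_se_and_df` is hypothetical.

```python
import math

def welch_se_and_df(sd1, n1, sd2, n2):
    """Standard error of a mean difference and Satterthwaite d.f. (unequal variances)."""
    v1 = sd1 ** 2 / n1                 # squared SE of the first mean
    v2 = sd2 ** 2 / n2                 # squared SE of the second mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df

# Week 0: SD 5.21, assumed N=100. Week 8: SD 8.79, assumed N=83.
se, df = welch_se_and_df(5.21, 100, 8.79, 83)
print(round(se, 2), round(df, 1))
```

Under these assumptions the standard error rounds to 1.10, in line with the ± 1.10 reported for Model 3.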

Generalize: Group A with all 5 Time Periods
Columns: Covariance Pattern; Parameters; -2 Log Likelihood
Compound Symmetry
Heterogeneous Uncorrelated
Toeplitz
Heterogeneous Toeplitz
Unstructured
Since LR = 30.7 is large for a Χ² with 6 d.f., there is substantial unstructured correlation over weeks.

Conclusions: Repeated Measures with Mixed Models
Very useful for missing data.
Requires more than usual assumptions.
Mild deviations from assumed covariance pattern do not have a large influence.
Software can be intimidating because many model assumptions must be specified, since the method is so general and flexible.
May be difficult to apply unbiasedly in clinical trials, where the primary analysis needs to be specifically detailed.
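
A Python sketch of the parameter counts for these covariance patterns and of the likelihood ratio p-value is given here. The counts follow the usual definitions of the structures. Treating the 6 d.f. as the unstructured versus heterogeneous Toeplitz comparison is an assumption that happens to match those counts for 5 time periods, and the helper names are hypothetical.

```python
import math

def covariance_parameter_counts(t):
    """Number of covariance parameters for t time periods under each pattern."""
    return {
        "compound symmetry": 2,
        "heterogeneous uncorrelated": t,
        "Toeplitz": t,
        "heterogeneous Toeplitz": 2 * t - 1,
        "unstructured": t * (t + 1) // 2,
    }

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even d.f. only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even d.f.")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

counts = covariance_parameter_counts(5)
# Assumed comparison: unstructured versus heterogeneous Toeplitz.
df = counts["unstructured"] - counts["heterogeneous Toeplitz"]
p_value = chi2_sf_even_df(30.7, df)   # 30.7 is the LR statistic from the slide
print(df, p_value)
```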