EPI 5344: Survival Analysis in Epidemiology. Time varying covariates. March 24, 2015. Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa



Objectives
– Introduce time varying covariates
– Methods of inclusion into Cox models
– SAS (computer) issues

Introduction (1)
Does heart transplantation improve survival?
– Epidemiological study with ID measures
– Observational study (not an RCT)

Introduction (2)
Assume that transplant has no effect on survival
– IDR = 1.0
Candidates for transplant
– 2 year follow-up
– No losses
– 50% of people get a transplant
  – Always occurs on the first anniversary of entering the study
– 25% of the group die in the first year
– 25% of first-year survivors die in the second year

Introduction (3)
[Figure: follow-up ignoring transplant status]

Introduction (4)
[Figure: stratified by transplant status, transplant done]

Introduction (5)
[Figure: stratified by transplant status, NO transplant done]

Introduction (6)
What is the observed IDR under this method of analysis?
– Transplant ID = 0.133/yr
– No transplant ID = 0.526/yr
– Observed IDR = 0.133/0.526 ≈ 0.25
– Correct IDR = 1.0
STRONG BIAS
Doing an RCT does NOT fix this issue as long as transplant is not done at time ‘0’
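The biased rates above can be reproduced with a short person-time calculation. This is a sketch under two assumptions that are not stated on the slides: a hypothetical cohort of 1000 candidates, and deaths occurring at the mid-point of the year in which they happen. The variable names are illustrative only.

```python
# Hypothetical cohort of 1000 candidates; deaths assumed to occur mid-year.
N = 1000.0
n_tx = 0.50 * N          # transplanted at t = 1 yr, so all survived year 1
n_no = N - n_tx          # never transplanted
d1_no = 0.25 * N         # all first-year deaths are in the untransplanted group
s1_no = n_no - d1_no     # untransplanted subjects alive at 1 yr
d2_tx = 0.25 * n_tx      # second-year deaths, transplant group
d2_no = 0.25 * s1_no     # second-year deaths, no-transplant group

# Naive analysis: ALL follow-up of an eventually transplanted person is
# credited to the transplant group, including the year before transplant.
pt_tx_naive = n_tx * 1.0 + (n_tx - d2_tx) * 1.0 + d2_tx * 0.5
pt_no_naive = d1_no * 0.5 + s1_no * 1.0 + (s1_no - d2_no) * 1.0 + d2_no * 0.5

id_tx_naive = d2_tx / pt_tx_naive
id_no_naive = (d1_no + d2_no) / pt_no_naive
idr_naive = id_tx_naive / id_no_naive

print(round(id_tx_naive, 3), round(id_no_naive, 3), round(idr_naive, 2))
```

The transplant group is credited with a full death-free first year for every member, which is what drives its rate down.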

Introduction (7)
How do we fix this?
– No-one is at risk of dying with a transplant until the transplant has taken place
Solution using epi methods:
– People who never have a transplant
– People who have a transplant
  – Accumulate PT (and events) to the non-transplant group until the transplant occurs
  – Accumulate PT (and events) to the transplant group only after the transplant occurs

Introduction (8)
[Figure: CORRECT WAY, no transplant done]

Introduction (9)
[Figure: CORRECT WAY, transplant done]

Introduction (10)
What is the observed IDR under this method of analysis?
– Transplant ID = 0.286/yr
– No transplant ID = 0.286/yr
– Observed IDR = 1.0
– Correct IDR = 1.0
TIME VARYING COVARIATE: transplant status
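The corrected rates follow from re-allocating the pre-transplant person-time. As a sketch, again assuming a hypothetical cohort of 1000 candidates with deaths at mid-year (neither is stated on the slides):

```python
# Hypothetical cohort of 1000 candidates; deaths assumed to occur mid-year.
N = 1000.0
n_tx = 0.50 * N          # transplanted at t = 1 yr
d1 = 0.25 * N            # first-year deaths (all before any transplant)
s1_no = N - n_tx - d1    # never-transplanted subjects alive at 1 yr
d2_tx = 0.25 * n_tx      # second-year deaths after transplant
d2_no = 0.25 * s1_no     # second-year deaths without transplant

# Correct allocation: person-time counts as 'transplant' only AFTER transplant.
pt_tx = (n_tx - d2_tx) * 1.0 + d2_tx * 0.5
# Year 1 is entirely untransplanted follow-up for everyone in the cohort.
pt_no = d1 * 0.5 + (N - d1) * 1.0 + (s1_no - d2_no) * 1.0 + d2_no * 0.5

id_tx = d2_tx / pt_tx
id_no = (d1 + d2_no) / pt_no
idr = id_tx / id_no

print(round(id_tx, 3), round(id_no, 3), round(idr, 2))
```

Only the denominators and their group labels change relative to the naive analysis; the deaths are the same.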

Time Varying Covariates (1)
Exposures can change during follow-up
– People stop/start smoking
– BP increases
– Air pollution varies from year to year
Hazard often depends more strongly on recent values than on original exposure
– Not always true
– Can depend on cumulative exposure
– Lagged exposure

Time Varying Covariates (2)
Produces non-proportional hazards
– Change in exposure level causes hazard to change in one group
Still proportional conditional on the value of the time varying exposure.


Time Varying Covariates (3)
Before t*, HR = 1.0
After t*, HR* < 1.0
NOT PH over all time
If we ignore the time of exposure and just treat these as two groups with PH, we get a biased estimate of the hazard ratio
– A type of average of 1.0 and HR* (> HR*)

Time Varying Covariates (4)
BUT:
– before t*, hazards are proportional
– after t*, hazards are proportional
The true impact of the exposure is HR* and only occurs after t*
Need an analysis approach to reflect this

Time Varying Covariates (5)
Is this hard to do?
– YES and NO
Consider a situation where all subjects start off as ‘unexposed’ but, at some time in the future, some people become exposed

Time Varying Covariates (6)
Standard Cox Model
Time Varying Cox Model
[Equations not preserved; an annotation marks the only change between the two models]
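In common textbook notation (an assumption; these are not the slide's own equations), the two models for a single covariate can be written as follows. The symbols h_0, β and x_i are illustrative choices.

```latex
\begin{align*}
\text{Standard Cox model:}     \quad & h_i(t) = h_0(t)\,\exp(\beta x_i) \\
\text{Time varying Cox model:} \quad & h_i(t) = h_0(t)\,\exp\bigl(\beta x_i(t)\bigr)
\end{align*}
```

The only difference is that the covariate is allowed to depend on time: x_i becomes x_i(t).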

Time Varying Covariates (7)
The theory really is this simple!
WHY?
RISK SETS

Time Varying Covariates (8)
Likelihood function for Cox model is computed at each time point when an event occurs
– Depends only on subjects “at risk” at the event time
– RISK SET
x_ij is the value of ‘x’ AT THE TIME of this event

Time Varying Covariates (9)
Fixed covariates:
– x_ij is the same at all times
Time varying covariates:
– Use the x_ij which corresponds to the event time of this risk set
– Keep doing this over all risk sets

Time Varying Covariates (10)
So why isn’t it simple to do this?
Practical issues intrude!
To fit a time varying covariate, SAS needs to know the value of the covariate for every risk set.
– Need to compute a value of the covariate at the time of every event.
Interpretation is also tricky (later)

Time Varying Covariates (11)
Example
– 4 subjects
– 2 get transplant at t = 15 & t = 25
– Want to include a time-varying covariate for transplant status.

ID  Outcome  Time of event  Transplant  Time of transplant
1   dead     10             N           .
2   dead     20             Y           15
3   dead     30             N           .
4   dead     40             Y           25

4 risk sets at t = 10, 20, 30, & 40

Time Varying Covariates (12)
[Table not preserved; columns: Risk set, ID, X_trans]
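The risk-set table for the 4-subject example can be derived directly from the data on the previous slide. The following is a minimal sketch; the function and field names are hypothetical, and the transplant indicator is taken to be 1 when the transplant happened on or before the risk-set time.

```python
# The four subjects from the example (all died; None = no transplant).
subjects = [
    {"id": 1, "event_time": 10, "transplant_time": None},
    {"id": 2, "event_time": 20, "transplant_time": 15},
    {"id": 3, "event_time": 30, "transplant_time": None},
    {"id": 4, "event_time": 40, "transplant_time": 25},
]


def risk_set_table(subjects):
    """Return (risk-set time, subject id, x_trans) for every subject at risk."""
    rows = []
    for t in sorted({s["event_time"] for s in subjects}):
        for s in subjects:
            if s["event_time"] >= t:  # still under follow-up at time t
                tx = s["transplant_time"]
                x_trans = 1 if tx is not None and tx <= t else 0
                rows.append((t, s["id"], x_trans))
    return rows


for row in risk_set_table(subjects):
    print(row)
```

Subject 4, for instance, has x_trans = 0 in the risk sets at t = 10 and t = 20 but x_trans = 1 at t = 30 and t = 40.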

Time Varying Covariates (13)
Two ways to do this in SAS:
– Use programming statements in ‘Proc Phreg’.
– Re-structure the data set and use a different method of describing the model to SAS
  – Counting Process Input.
Other programmes have similar options and choices

Time Varying Covariates (14)
We’ll look at both ways.
– Some things can only be done in the Phreg programming approach
– Counting Process input has some strong benefits.
– Counting process approach can be tricky to use with age as the time scale

Proc Phreg programming (1)
SAS lets you include programme statements within PROC PHREG:

    proc phreg data=njb1;
      model surv*vs(0)=age sex x1;
      if (surv > 20) then x1 = 2; else x1 = 1;
    run;

Proc Phreg programming (2)
This code is processed once for each risk set
‘surv’ is the time when the risk set occurs
– It is NOT the survival time for the subject
‘x1’ is the value of the variable in the subject at the time of the specific risk set under consideration.
– Here, it is ‘1’ if the risk set occurs at or before time 20 but ‘2’ otherwise
File can get VERY BIG
Hard to de-bug your code
– But, SAS 9.4 allows ‘out’ statements to be used

Stanford Heart Transplant Study

Standard phreg analysis.
Defines the ‘transplant’ status in the ‘data step’ using code like this:

    data njb1;
      set stanford;
      if (dot = .) then trans = 0; else trans = 1;
    run;

    proc phreg data=njb1;
      model time*cens(0)=trans;
    run;

Trans = 1 means the subject:
a) Had a transplant
b) Lived long enough to have a transplant

Hazard curves look something like this.
[Figure: hazard curves for the Transplant and No Transplant groups, with the transplant time marked]
In the interval before the transplant time, HR = 0, so the overall HR is biased

Stanford Heart Transplant Study: with time varying effect
[Table not preserved; columns: ID, Surv1, Dead, Wait]
For each event time, we need to define the transplant variable for every subject still in the risk set
– plant = 0: no transplant by risk set time
– plant = 1: transplant done on or before risk set time

[Tables not preserved (two slides); columns: Risk set time, IDs, Wait time, plant]

SAS Code to create ‘plant’ and run analysis

    proc phreg data=stan;
      model surv1*dead(0)=plant surg ageaccpt / ties=exact;
      /* evaluated once per risk set: surv1 is the risk-set time here */
      if (wait > surv1 or wait = .) then plant = 0; else plant = 1;
    run;

Counting Process Input (1)
Counting processes are a different way to look at survival
– mathematically more powerful
– essentially, each subject follows a ‘process’
  – ‘count up’ the events they experience
  – can handle recurrent events
  – enhances modeling of exposure.
Don’t need to know all this to use SAS counting process style input.

Counting Process Input (2)
Data set needs to be restructured.
To date
– one record per subject
– To code covariate changes, need multiple variables
  – value at baseline (v1)
  – time of first change (t1) and new value (v2)
  – and so on
– Need to use ‘phreg’ programming to define value at risk set.

Counting Process Input (3)
New approach
– Similar to piece-wise exponential model
– Split data for each subject into multiple records
  – Define intervals where every covariate is constant
  – (t1, t2]: SAS treats each interval as open at the start and closed at the stop time
  – Each interval has one line (record) of data
– Intervals continue until:
  – Subject censored
  – Subject has outcome event.

Counting Process Input (4)
Need to re-structure data file
– Each interval needs a record in the data set
– Need to code
  – Start of this interval
  – End of this interval
  – Outcome status at end of interval
  – Value of time varying covariate(s) during the interval
  – Values of fixed covariates, etc.

Counting Process Input (5)
Let’s use data from the Stanford Heart Transplant Study
– the same data as before.
But, we only include transplant status
– Ignore other variables for now.
– Only have one time varying covariate.

[Tables not preserved. Original data columns: ID, Surv1, Dead, Wait. Re-structured data columns: ID, Start, Stop, Status, plant]

SAS Code to re-structure data

    DATA stanlong;
      SET allison.stan;
      plant=0;
      start=0;
      IF (trans=0) THEN DO;          /* never transplanted: one record */
        dead2=dead;
        stop=surv1;
        IF (stop=0) THEN stop=.1;    /* avoid a zero-length interval */
        OUTPUT;
      END;
      ELSE DO;                       /* transplanted: two records */
        stop=wait;                   /* record 1: entry to transplant */
        IF (stop=0) THEN stop=.1;
        dead2=0;
        OUTPUT;
        plant=1;                     /* record 2: transplant to end of follow-up */
        start=wait;
        IF (stop=.1) THEN start=.1;
        stop=surv1;
        dead2=dead;
        OUTPUT;
      END;
    RUN;

The three IF statements involving .1 only guard against zero-length intervals; the basic version of the code is the same without them.
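The record-splitting logic can be sketched outside SAS as well. The function below mirrors the data step for a single subject; the function name, the use of None for "no transplant", and the example values are all hypothetical.

```python
EPS = 0.1  # stand-in stop time used to avoid zero-length intervals


def split_record(surv1, dead, wait):
    """Split one subject into counting-process records.

    surv1: follow-up time; dead: 1 = died, 0 = censored;
    wait: time to transplant, or None if never transplanted.
    Returns a list of dicts with start, stop, dead2 and plant.
    """
    if wait is None:
        stop = surv1 if surv1 > 0 else EPS
        return [{"start": 0, "stop": stop, "dead2": dead, "plant": 0}]
    first_stop = wait if wait > 0 else EPS
    return [
        # Before transplant: still alive at the end of this interval.
        {"start": 0, "stop": first_stop, "dead2": 0, "plant": 0},
        # After transplant: carries the subject's final outcome.
        {"start": first_stop, "stop": surv1, "dead2": dead, "plant": 1},
    ]


# Hypothetical subject: transplanted at day 30, died at day 100.
for rec in split_record(surv1=100, dead=1, wait=30):
    print(rec)
```

A death can only appear on a subject's last record, which is why the pre-transplant record always has dead2 = 0.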

SAS Code for counting-process input analysis

    PROC PHREG DATA=stanlong;
      MODEL (start,stop)*dead2(0)=plant surg ageaccpt / TIES=EFRON;
    RUN;

Identical to previous time-varying analysis

Time Varying Covariates (15)
Types of time varying covariates
Internal (endogenous)
– Change in the covariate is related to the behaviour of the subject.
– Measurement requires subject to be under periodic examination
  – Blood pressure
  – Cholesterol
  – Smoking
– More challenging for analysis
  – Often part of causal pathway

Time Varying Covariates (16)
External (exogenous)
– Variables which vary independently of the subject’s normal biological processes.
– The values do not depend on subject-specific information
– Measurement does not require subject monitoring
  – Hourly pollen count

Time Varying Covariates (17)
Some pattern types
– Non-reversible dichotomy
  – Transplant
– Reversible dichotomy
  – Smoking
  – Drug use
– Continuous variable
  – Cholesterol

Time Varying Covariates (18)
Some issues
– Need for valid measures for all subjects at all follow-up times
  – Missing data
  – ‘coarse’ measurement intervals
  – Imputation
  – Interpolation
– Computationally intense
– Reverse causation effects
– Intermediate variables in the causal pathway

Time Varying Covariates (19)
Some logical fallacies
Can not use the future to predict the future!
Example #1
– Recruit a cohort of neonates
  – Age at entry = 0 for all subjects
  – Not useful as a predictor
– Suggestion is made to use average age during follow-up to predict outcome
– INVALID
  – Average age during follow-up depends on ‘future’ information
  – High average age is due to long survival

Time Varying Covariates (20)
Intermediaries (Internal covariates)
RCT of anti-hypertensive treatment
Outcome: time to stroke
Main Q: Does the drug reduce the rate of stroke?
Model 1: ln(HR) = β1 (drug)
BUT, we measured BP on all subjects during follow-up.
– Why not include this as a time-varying covariate?

Time Varying Covariates (21)
Intermediaries (cont)
Model 1: ln(HR) = β1 (drug)
Model 2: ln(HR) = β1* (drug) + β2 BP(t)
Results
– Model 1, β1: p < [value not preserved]
– Model 2, β1*: p = 0.6
WHY?

Time Varying Covariates (22)
Drug → drop in BP → drop in stroke risk
Effect of drug on stroke is already accounted for in the BP term
Estimate from model of ‘drug’ effect is the effect of the drug after adjusting for changes in BP
– That is, after adjusting for the drug effect.

SAS examples (1)
Study of prisoners released from jail
– One year follow-up
– Monitor every week
  – If subject was re-arrested, record the week of the arrest
  – Recidivated
– Key question
  – Does financial security post-release reduce risk of recidivism?


SAS examples (2)
Study also collected information about employment status for every week of follow-up after release
– Time varying covariate
Hypothesis
– Being in full-time employment reduces the risk of recidivism.

Data layout for employment information
[Table not preserved; columns: ID, EMP1, EMP2, EMP3, ..., EMP52, one row per subject]

    PROC PHREG DATA=allison.recid;
      MODEL week*arrest(0)=fin age race wexp mar paro prio employed / TIES=EFRON;
      ARRAY emp(*) emp1-emp52;
      employed=emp[week];
    RUN;

BUT: if you get arrested in week 10, you can’t work full-time in week 10
– REVERSE CAUSATION
– Lagged exposure

    title 'Single week lag';
    PROC PHREG data=allison.recid;
      WHERE week>1;
      MODEL week*arrest(0)=fin age race wexp mar paro prio employed / TIES=EFRON;
      ARRAY emp(*) emp1-emp52;
      employed=emp[week-1];
    RUN;
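The array lookup and its lagged variant can be sketched as a small function. This is illustrative only: the function name and the example employment history are hypothetical, and the list is assumed to hold the 52 weekly indicators emp1 to emp52 in order.

```python
def employed_at(emp, week, lag=0):
    """Employment indicator used for the risk set at `week`, lagged by `lag` weeks.

    emp: sequence of weekly 0/1 indicators, where emp[0] is week 1.
    """
    source_week = week - lag
    if source_week < 1:
        # Mirrors the need for 'WHERE week>1' in the one-week-lag model:
        # there is no week 0 to look up.
        raise ValueError("no employment value available before week 1")
    return emp[source_week - 1]


# Hypothetical subject: employed in weeks 1-9, arrested (so not working) in week 10.
emp = [1] * 9 + [0] * 43

print(employed_at(emp, 10))         # unlagged value at the arrest week
print(employed_at(emp, 10, lag=1))  # value from the week before
```

For this subject the unlagged value at week 10 is 0 even though they were employed right up to the arrest, which is the reverse-causation problem the lag is meant to avoid.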

SAS examples (3)
Allison looks at some other models
– Other lag intervals
– cumulative work experience
Worth reviewing for code examples and interpretation

SAS examples (4)
Albumin and death
– Question: Does a falling serum albumin predict an increased likelihood of death?

SAS examples (5)
Albumin measured on the first day of each month
– Ad-hoc measurement
– Not available on every day of the month
Can not use ‘average’ albumin around death date
– No post-death value
Use ‘closest’ value before risk set date

    DATA bloodcount;
      INFILE 'c:\blood.dat';
      INPUT deathday status alb1-alb12;
      ARRAY alb(*) alb1-alb12;
      status2=0;
      deathmon=CEIL(deathday/30.4);     /* month in which follow-up ends */
      DO j=1 TO deathmon;
        start=(j-1)*30.4;
        stop=start+30.4;
        albumin=alb(j);                 /* value measured at start of month j */
        IF (j=deathmon) THEN DO;
          status2=status;
          stop=deathday;                /* final interval ends at the event or censoring day */
        END;
        OUTPUT;
      END;
    RUN;

    PROC PHREG DATA=bloodcount;
      MODEL (start,stop)*status2(0)=albumin;
    RUN;

Uses counting process style input
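The monthly splitting can be checked with a short sketch. It assumes the final interval is meant to end on deathday itself, so that every record has start < stop; the function name and the example albumin values are hypothetical.

```python
import math

MONTH = 30.4  # average month length in days, as in the SAS code


def monthly_intervals(deathday, status, alb):
    """Return (start, stop, status2, albumin) records, one per month of follow-up.

    alb: monthly albumin values, where alb[0] is the month-1 measurement.
    """
    deathmon = math.ceil(deathday / MONTH)
    rows = []
    for j in range(1, deathmon + 1):
        start = (j - 1) * MONTH
        stop = start + MONTH
        status2 = 0
        if j == deathmon:
            # Last record: close the interval at the death/censoring day.
            stop = deathday
            status2 = status
        rows.append((start, stop, status2, alb[j - 1]))
    return rows


# Hypothetical subject who died on day 50 (during month 2).
for row in monthly_intervals(50, 1, [4.0, 3.5, 3.0]):
    print(row)
```

Each record carries the albumin value measured at the start of its month, so the covariate used in any risk set is always the most recent earlier measurement.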

SAS examples (6)
Alcohol cirrhosis and survival
– Prothrombin time (a measure of blood clotting) is hypothesized as a predictor of survival
– A cohort of men was followed up
– Lab measures were taken at ‘clinically relevant’ times
  – No pattern to the times
  – Varied for each subject

    DATA alcocount;
      SET allison.alco;
      time1=0;
      time11=.;
      ARRAY t(*) time1-time11;
      ARRAY p(*) pt1-pt10;
      dead2=0;
      DO j=1 TO 10 WHILE (t(j) NE .);
        start=t(j);
        pt=p(j);
        stop=t(j+1);
        IF (t(j+1)=.) THEN DO;
          stop=surv;
          dead2=dead;
        END;
        OUTPUT;
      END;
    RUN;

    PROC PHREG DATA=alcocount;
      MODEL (start,stop)*dead2(0)=pt;
    RUN;

Uses counting process style input
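The same idea for irregular measurement times can be sketched as below. Each measurement opens an interval that runs until the next measurement, and the last one runs to the end of follow-up. The function name and the example values are hypothetical, and the first measurement is assumed to be at time 0.

```python
def irregular_intervals(times, pts, surv, dead):
    """Return (start, stop, dead2, pt) records for one subject.

    times: measurement times in increasing order, with times[0] == 0;
    pts: prothrombin time measured at each of those times;
    surv: end of follow-up; dead: 1 = died, 0 = censored.
    """
    rows = []
    last = len(times) - 1
    for k, (start, pt) in enumerate(zip(times, pts)):
        if k == last:
            # Final measurement: interval runs to death or censoring.
            rows.append((start, surv, dead, pt))
        else:
            # Subject is known to be alive at the next measurement.
            rows.append((start, times[k + 1], 0, pt))
    return rows


# Hypothetical subject: measured at days 0, 90 and 200, died at day 365.
for row in irregular_intervals([0, 90, 200], [20, 35, 50], surv=365, dead=1):
    print(row)
```

The number of records differs from subject to subject, which the counting-process layout handles without any change to the model statement.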
