
Survival Analysis II
Reading: VGSM 7.2.4 - 7.2.10
John Kornak, April 5, 2011
Project description due today (note: it does not have to be for survival data; see the project guidelines)
Homework #1 due next Tuesday, April 12
Reading for next lecture: VGSM
Lab 2 on the web site

Survival review: key concepts
- Survival data: right censoring
- Linear/logistic regression inadequate
- Kaplan-Meier / log-rank: non-parametric
- Hazard function: "instantaneous risk"
- The effect of a 1-unit change in a predictor on survival is given in terms of the "hazard ratio": the relative hazard
- Proportional hazards assumption: the ratio of hazards is constant over time
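For reference, the "instantaneous risk" wording corresponds to the usual definition of the hazard function:
h(t) = lim as Δt→0 of Pr( t ≤ T < t + Δt | T ≥ t ) / Δt,
where T is the survival time; proportional hazards means that the ratio of two such hazard functions does not change with t.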

Review Cox Model Assumes Proportional Hazards Do not need to estimate baseline hazard (only relative hazards) Can summarize predictor effects based on coefficients, β, or in terms of hazard ratios, exp(β) Hazard ratios work better for interpretation Math works better based on coefficients (easy to go back and forth)

Regression review
Proportional hazards model: log(hazard ratio) depends linearly on the regression coefficients
h(t|x) = h0(t) exp(β1 x1 + … + βp xp)
log( h(t|x) / h0(t) ) = β1 x1 + … + βp xp
Cf. the log odds in logistic regression and the outcome in linear regression: each depends linearly on the regression coefficients
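It follows that the hazard ratio for a one-unit increase in a single predictor x_j, holding the others fixed, is
HR_j = h(t | …, x_j + 1, …) / h(t | …, x_j, …) = exp(β_j),
which is why predictor effects can be reported either as coefficients β or as hazard ratios exp(β).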

Review of Interactions in regression

Binary Interactions (no interaction). Outcome = diastolic BP; a = smoking effect, b = drinking effect.
            Drink = 0   Drink = 1
Smoke = 0       0           b
Smoke = 1       a         a + b

Binary Interactions (interaction c ≠ 0). Outcome = diastolic BP; a = smoking effect, b = drinking effect, c = interaction.
            Drink = 0   Drink = 1
Smoke = 0       0           b
Smoke = 1       a       a + b + c
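The two tables above correspond to a model for mean diastolic BP of the following form (a sketch; the intercept β0 is the Smoke = 0, Drink = 0 cell, and the table entries are differences from that cell):
E[BP] = β0 + a × Smoke + b × Drink + c × (Smoke × Drink)
With c = 0 the smoking effect a is the same at both levels of drinking; with c ≠ 0 it is not.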

Binary Interaction with a Continuous Variable

Case 1 [figure: y (e.g., a cognition score) plotted against age, for Dx = 1 (AD, Alzheimer's) and Dx = 0 (HC, Healthy Control)]

Case 2 [figure: y vs. age for the AD and HC groups]

Case 3 [figure: y vs. age for the AD and HC groups]

Case 4 [figure: y vs. age for the AD and HC groups]

Cox Model - Wald test and CIs
Confidence intervals and Wald tests are based on the fact that the coefficient estimate β̂ has an approximate normal distribution (rule of thumb: at least … events)
Tests and confidence intervals are based on the estimated coefficients β, not on the hazard ratios directly
95% CI for the HR:
Upper limit: exp( β̂ + 1.96 × SE(β̂) )
Lower limit: exp( β̂ - 1.96 × SE(β̂) )
Wald test: Z = β̂ / SE(β̂) (i.e., approximately normal)
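As a worked illustration with hypothetical numbers (not from the BAC data): suppose β̂ = 0.69 with SE(β̂) = 0.35. Then
HR = exp(0.69) ≈ 2.0
95% CI: exp(0.69 ± 1.96 × 0.35) = ( exp(0.004), exp(1.376) ) ≈ (1.00, 3.96)
Wald Z = 0.69 / 0.35 ≈ 1.97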

Lung Cancer Data
40 subjects with bronchioloalveolar carcinoma (BAC, a lung cancer)
Each subject underwent a Positron Emission Tomography (PET) scan
Fludeoxyglucose (18F-FDG) uptake was measured in standard units; the variable fdgavid indicates whether the tumor Standard Uptake Value (SUV) is > 2.5 (Y/N)
12 subjects died during follow-up

Wald test and CI on the hazard ratio scale
. stcox fdgavid
No. of subjects = 40   Number of obs = 40
No. of failures = 12
[Stata output: hazard ratio, SE, z, p-value, and 95% CI for fdgavid]
CIs are calculated from the coefficients, not from the hazard ratio directly:
Upper limit: exp( β̂ + 1.96 × SE(β̂) )
Lower limit: exp( β̂ - 1.96 × SE(β̂) )
Wald test: Z = β̂ / SE(β̂) -- NOT HR / SE(HR)

Likelihood Ratio Tests
Test for the effect of predictor(s) by comparing the log-likelihoods of two models
Fit models with and without the predictor(s) to be tested
-2 times the difference in log-likelihoods is compared to a chi-squared distribution
Especially important to use when the number of failures is small and the HR is far from 1 (strong effect)
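Concretely, when k coefficients are dropped from the model the statistic is
LR = -2 × ( log L_reduced - log L_full ),
which is compared to a chi-squared distribution with k degrees of freedom.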

LR test for fdgavid
. stcox fdgavid tumorsize multi
. est store A
Fits the model with all predictors (the reference model), then asks Stata to save its log-likelihood under the name "A"
. stcox tumorsize multi
. est store B
Fits the model leaving out fdgavid
. lrtest A B    (or just lrtest A, which defaults to the previous model)
Compares the log-likelihoods

Reference Model
No. of subjects = 40   Number of obs = 40
No. of failures = 12
[Stata output: hazard ratios, SEs, z, p-values, and 95% CIs for fdgavid, tumorsize, multifocal]
Callouts: few failures; fairly large HR; "non"-significant Wald test

Likelihood Ratio Test
The current model (without fdgavid): LR chi2(2) = 8.76
[Stata output: hazard ratios for tumorsize and multifocal]
. lrtest A
Likelihood-ratio test: LR chi2(1) = 5.09   (Assumption: . nested in A)
The two log-likelihoods compared are those of the model without fdgavid and the model with fdgavid; -2 times their difference is the LR chi2(1) = 5.09, a significant likelihood ratio test

Likelihood Ratio Test
A: stcox fdgavid tumorsize multi   (reference model)
B: stcox tumorsize multi   (fdgavid dropped)
C: stcox fdgavid multi   (tumorsize dropped)
. lrtest A B   (Assumption: B nested in A)
. lrtest A C   (Assumption: C nested in A)
[Stata output: the p-values for both tests, and the hazard-ratio table for model C showing fdgavid and multifocal]
fdgavid & tumorsize: high association

Likelihood Ratio vs. Wald Two tests for the same null hypothesis Typically very close in results Will disagree when sample size small and HR are far from 1 or if colinearity is present When they disagree, the likelihood ratio test is more reliable. LR test always better -- just less convenient to compute

Binary Predictors No. of subjects = 40 Number of obs = 40 No. of failures = 12 Time at risk = LR chi2(1) = 1.26 Log likelihood = Prob > chi2 = _t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval] over3cm | “over3cm” coded 0/1 0 = tumor less than 3 cm 1 = tumor greater than 3 cm relative hazard for ≥ 3cm compared to < 3 cm = 1.95 Hazard is about double!

Binary Predictors: suggest 0/1 coding
- A one-point change is easy to interpret
- Makes the baseline hazard an identifiable group, e.g., those with tumors < 3 cm
- Simplifies lincoms when we consider interactions (we will model interactions soon…)
- Get the same answer if coded 10/11
- Get the same significance but a different HR if coded 0/2
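A quick way to see the last point (a sketch; size2 is a hypothetical recoding, and 1.95 is the over3cm hazard ratio from the earlier slide):
. gen byte size2 = 2*over3cm
. stcox size2
Because the coefficient for size2 is half the over3cm coefficient, the reported per-unit HR is exp(β/2) = sqrt(1.95) ≈ 1.40, while the Wald and LR tests are unchanged; lincom 2*size2, hr recovers the 2-unit contrast, HR ≈ 1.95.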

Reversed Coding
. recode over3cm 0=1 1=0, gen(less3cm)
. stcox less3cm
No. of subjects = 40   Number of obs = 40
No. of failures = 12   LR chi2(1) = 1.26
[Stata output: hazard ratio, SE, z, p-value, and 95% CI for less3cm]
"less3cm" coded 0/1: 0 = tumor greater than 3 cm, 1 = tumor less than 3 cm
LR and Wald tests are the same. The HR and its CI are the reciprocals of those for over3cm (HR = 1/1.95 ≈ 0.51)

Issue: Zero Hazard Ratio
No. of subjects = 40   Number of obs = 40
No. of failures = 12   LR chi2(1) = 3.48
[Stata output: the estimated hazard ratio for fdg0 is essentially zero (about 6.53e-…), with an enormous standard error]
Hazard ratio equals zero: the LR test looks OK, but the Wald test and CIs have broken down
No deaths in those with tumor SUV = 0:
                Alive   Dead
Tumor SUV = 0      4      0
Tumor SUV > 0     24     12

Reverse the Reference
No. of subjects = 40   Number of obs = 40
No. of failures = 12   LR chi2(1) = 3.48
fdg_gt0: 1 = SUV > 0, 0 = SUV = 0
[Stata output: the estimated hazard ratio for fdg_gt0 is enormous (about 2.07e+…)]
The hazard ratio is effectively infinite; the LR test is the same as before, but the Wald test and CIs still don't work

Interpretation “Zero of four subjects with a SUV of 0 died while 12/36 subjects with SUV > 0 died (hazard ratio = 0); the effect was borderline statistically significant (p=0.06)”

Zero/Infinite HR
- Two sides of the same coin: depends on the reference category
- Occurs when a category has either 0% or 100% events; often happens with lots of categories
- Use likelihood ratio tests: they're fine
- The Wald test performs poorly
- Confidence intervals: see a statistician, who can calculate a likelihood-ratio-based CI
- Sometimes categories can be consolidated to handle the issue

Categorical Predictors
Fit in Stata: stcox i.categoricalpredictor
Lots of different possible tests and comparisons:
- Overall versus trend tests (if ordinal)
- Making pairwise comparisons
Can also use this syntax when a binary predictor is not coded 0/1

PBC Data 312 patients: Primary Biliary Cirrhosis (PBC) Randomized trial: DPCA vs. Placebo 125 subjects died 15 predictors: hepatomegaly, spiders, bilirubin, etc.

Cox Model
. stcox sex i.histol
No. of subjects = 312   No. of failures = 125   LR chi2(4)
[Stata output: hazard ratios for sex and for histology grades 2, 3, and 4 relative to grade 1]
Pairwise comparisons, e.g., 3 vs 2 and 4 vs 3:
. lincom 3.histol - 2.histol, hr
. lincom 4.histol - 3.histol, hr
Is histology a significant predictor?

Overall vs. Trend Tests
- Both have the same null hypothesis: no difference in event risk between the groups
- But different alternative hypotheses: overall: at least one group is different; trend: there is a trend across the groups
- Use trend tests only for ordered predictors (no trend test for ethnicity)
- When a trend exists, a trend test is (typically) more powerful
- For ordinal predictors it is also more interpretable

Trend vs. Overall Tests
Trend test:
. test -1*2.histol + 3.histol + 3*4.histol = 0
The appropriate linear combination from p. 82 of VGSM; the resulting p-value indicates a survival trend with pathology grade.
Overall test (Wald test):
. testparm i.histol
( 1) _Ihistol_2 = 0   ( 2) _Ihistol_3 = 0   ( 3) _Ihistol_4 = 0
chi2(3) test, p < 0.0001: at least one group is different.
Overall test (LR test):
. stcox sex
. est store M_wo
. lrtest M_w M_wo   (M_w is the model with histology, stored earlier)
chi2(3) test comparing the models with and without histology.
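One way to see where the -1, +1, +3 weights come from (assuming equally spaced scores for the four histology grades): the standard linear-trend contrast in the four group log hazards is -3μ1 - 1μ2 + 1μ3 + 3μ4. Writing each μj as μ1 + βj, with β1 = 0 for the reference grade, this reduces to -β2 + β3 + 3β4, which is exactly the combination tested above.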

PBC data: age (in days) as predictor No. of subjects = 312 Number of obs = 312 No. of failures = 125 Time at risk = LR chi2(1) = Log likelihood = Prob > chi2 = _t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval] age_days | HR is nearly one Wald and LR tests highly significant

PBC data: age (decades) as predictor No. of subjects = 312 Number of obs = 312 No. of failures = 125 Time at risk = LR chi2(1) = Log likelihood = Prob > chi2 = _t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval] age_decades | Wald and LR tests exactly the same HR is greater! HR per year is about 1.04

Continuous Predictors
The HR is greatly affected by the scale of measurement (e.g., age in decades, years or days)
Statistical significance is unaffected, because the SE is proportional to the coefficient
Choose an interpretable unit change in the predictor
Can rescale by (1) defining a new variable, (2) using lincom, or (3) direct calculation (no need for this except as an exercise)

(1) Define a new variable
Let age_days be age in days:
. gen age_decades = age_days/3650
. stcox age_decades
About 3650 days per decade; gives the HR for being one decade older
Dividing by -3650 instead would give the effect of being one decade younger
The simplest method; works for every kind of regression

(2) lincom
Let age_days be age in days:
. stcox age_days
. lincom 3650*age_days, hr
Gives the effect of a decade (being 3650 days older): a 1-unit change in decades is a 3650-unit change in days
The hr option is important: otherwise you get the coefficient, not the HR
Less effort than redefining variables, but easier to make mistakes
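Method (3), direct calculation, is simply exponentiating the rescaled coefficient by hand (a sketch, assuming the model with age_days has just been fit):
. stcox age_days
. display exp(3650*_b[age_days])
This multiplies the per-day log hazard ratio by 3650 before exponentiating, giving the per-decade HR; it should match what lincom 3650*age_days, hr reports.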

Confounding in the Cox Model Handled the same way as other regression models Confounders added into model Interpretation: HR of a 1-unit change holding all other predictors constant All predictors adjust for each other

UNOS Kidney Example
Interest: how recipients of cadaveric-donor kidneys fare compared to recipients of living-donor kidneys
Crude HR = 1.97, 95% CI (1.63, 2.40)
What might differ between living and cadaveric recipients: previous transplant, year of transplant, HLA match (0-2 loci vs. 3+)?
Such differences could lead to an inflated crude HR

Adjusted Model No. of subjects = 9678 Number of obs = 9678 No. of failures = 407 Time at risk = LR chi2(4) = Log likelihood = Prob > chi2 = _t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval] txtype | prevtx | year | ge3hla | Attenuated HR for transplant type vs crude HR = 1.97

Interpretation “The hazard ratio of mortality for the recipient of a cadaveric kidney is 1.41 compared to living kidney (p=0.01), adjusting for year of transplant, history of previous transplants and degree of HLA compatibility. The 95% CI for the hazard ratio is 1.09 to 1.83”

Is there confounding?
The only way to know whether there is confounding is to compare the crude and adjusted HRs
Screening candidate variables based on their associations with mortality and with txtype is too insensitive: a variable that is very predictive of mortality but only slightly different between txtype groups can still be an important confounder
Examining the associations is a way of understanding potential confounding, not a screening method for confounding
Is the difference between an HR of 2.0 and 1.4 clinically important? Yes: the txtype association is confounded

Mediation
How much of the better prognosis of living-donor recipients is explained by closer HLA match (ge3hla) and shorter transport time for the donor organ (cold_isc)?
This is a question of mediation: to what extent do these variables mediate the txtype/mortality relationship?

After Adjustment Log likelihood = Prob > chi2 = _t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval] txtype | ge3hla | cold_isc | Reduction in txtype HR due to HLA and cold ischemia time is evidence of mediation

Mediation Measure
% mediation = 100% × (β_crude - β_adj) / β_crude
Here β_crude = log(1.97) and β_adj = log(1.46), giving approximately 44%.
"Approximately 44% of the mortality difference between living and cadaveric kidney recipients is explained by differences in HLA match and cold ischemia time"
See Sect. 4.5 of VGSM for details
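The 44% figure can be checked directly from the two hazard ratios quoted above (a quick arithmetic check in Stata):
. display (ln(1.97) - ln(1.46))/ln(1.97)
which returns approximately 0.44.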

Interaction
Addressed the same way across all regression methods:
- Create product terms (use # or ##, Stata 11)
- A test of the product terms reveals interaction
- Understand the interaction through a series of lincom commands
Example: predictors of graft failure in UNOS. Is there an interaction between previous transplant and year of transplant?

Results. stcox prevtx year Log likelihood = Prob > chi2 = _t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval] prevtx | year | stcox prevtx ## c.year Log likelihood = Prob > chi2 = _t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval] prevtx | 8.39e e e e+84 year | prevtx#| c.year | 2 |

What gives? There is a huge HR for prevtx. Isn’t this an example of colinearity? There might be some colinearity. It is a minor issue The big issue: the HR gives the effect of prevtx when all other predictors are equal to zero (including year !) It’s huge because it’s a meaningless extrapolation!

HR Interpretation
2.prevtx: HR for previous transplant in year 0
year: HR for one more year of transplant with no previous transplant, i.e. the effect of +1 year when prevtx = 1 (the reference group)
2.prevtx#c.year: an effect modifier; its HR is not easily interpreted on its own
[Stata output: the prevtx row shows an absurdly large HR (8.39e+…, with confidence limits reaching about e+84)]

Advice
- Don't fixate on the model HRs: they may not correspond to something meaningful (sometimes yes, sometimes not)
- Instead, look at the test for the product term
- Follow with a series of lincom statements
- If you do this, there is no colinearity issue
1. What is the effect of prevtx in 1990?
2. What is the effect of prevtx in 1995?
3. What is the effect of prevtx in 2000?

Effect of Previous Transplant in a given year
Setting up the lincom: for a transplant year Y,
                         prevtx   year   product = 2.prevtx#c.year
previous transplant         1      Y         Y
no previous transplant      0      Y         0
difference                  1      0         Y
. lincom 2.prevtx + Y*2.prevtx#c.year, hr   (with Y replaced by the year of interest)

lincom
Effect of previous transplant in each year of interest, via the same lincom with only the year multiplier changed:
. lincom 2.prevtx + (year)*2.prevtx#c.year, hr
[Stata output: one result line, labelled (1), with the HR and 95% CI for each of the three years]
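In general, under this model the hazard ratio for previous transplant in calendar year Y is exp(β_prevtx + Y × β_interaction). A sketch of the three calls, with the year multipliers matching the years quoted on the interpretation slide that follows:
. lincom 2.prevtx + 1990*2.prevtx#c.year, hr
. lincom 2.prevtx + 1995*2.prevtx#c.year, hr
. lincom 2.prevtx + 2000*2.prevtx#c.year, hr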

Interpretation The effect of previous transplant on risk of graft failure varies by year of transplant (p<0.0001). The relative hazards (and 95% Conf. Int.) for the previous transplant are 2.7 ( ), 1.9 ( ) and 1.4 ( ), in the years 1990, 1995 and 2000, respectively.

Centered Regression
. gen cyear = year - 1995
No. of subjects = 9678   Number of obs = 9678
No. of failures = 2501
[Stata output: hazard ratios for 2.prevtx, cyear, and 2.prevtx#c.cyear]
Year centered at 1995. Only the 2.prevtx estimate has changed: it now corresponds to the effect of previous transplant in 1995

Effect of Previous Transplant in 1990 (cyear = -5)
                         prevtx   cyear   product = 2.prevtx#c.cyear
previous transplant         1      -5          -5
no previous transplant      0      -5           0
difference                  1       0          -5
. lincom 2.prevtx - 5*2.prevtx#c.cyear, hr

lincom (centered)
Effect of previous transplant in 1990 (cyear = -5):
. lincom 2.prevtx - 5*2.prevtx#c.cyear, hr
Effect of previous transplant in 1995 (cyear = 0):
. lincom 2.prevtx + 0*2.prevtx#c.cyear, hr
Effect of previous transplant in 2000 (cyear = +5):
. lincom 2.prevtx + 5*2.prevtx#c.cyear, hr
[Stata output: one result line, labelled (1), with the HR and 95% CI for each year]

Interaction
Same advice as in previous models: create the product term, test the product term, then calculate lincoms
Colinearity is not the problem
Centering gives the same test for interaction and doesn't change the lincom results
Centering makes some coefficients more interpretable, but can make the lincoms error-prone (if you forget the variable is centered)
Don't forget the hr option in those lincom commands

Effect of Sex: PBC data crude Men do worse: HR=1.6, p=0.04

Men: Higher Copper
[figure: copper (ug/day) by sex; median 135 ug/day for men vs. 67 ug/day for women]

Adjusted Survival Curves Would like to visualize the adjusted effects of variables Can make survival prediction based on a Cox model S(t|x): survivor function (event-free proportion at time t) for someone with predictors x

Under the Cox Model
S(t|x) = { S0(t) } ^ exp(β1 x1 + … + βp xp)
The β's are the coefficients from the Cox model; in the Cox output we see estimates of exp(βj)
In the background, Stata calculates estimates of S0(t), the baseline survivor function: the survivor function when all predictors equal zero
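This form follows from the relation between the survivor function and the cumulative hazard. With H(t|x) = H0(t) exp(β1 x1 + … + βp xp),
S(t|x) = exp( -H(t|x) ) = [ exp( -H0(t) ) ] ^ exp(β1 x1 + … + βp xp) = { S0(t) } ^ exp(β1 x1 + … + βp xp).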

Adjusted Curve
Look at the effect of x1 (sex) adjusting for x2 (copper)
Create two curves differing by sex but with the same value of x2, otherwise we are not adjusting for copper (adjustment = effect of sex with copper held constant)
But what value for x2? The chosen value will affect the curves; let's use the overall mean or median

Adjusted Curves
. stcox sex copper
. stcurve, survival at1(sex=0) at2(sex=1)
stcurve: gives predicted curves; survival: graph survival (not hazard); at1/at2: values for curves 1 and 2
By default copper is fixed at its overall mean = 97.6, so the above is equivalent to:
. stcurve, survival at1(sex=0 copper=97.6) at2(sex=1 copper=97.6)
[Stata output: hazard ratios for sex and copper]

Adjusted Curves
[figure: adjusted survival curves by sex, drawn once with copper set to 97.6 (the mean) and once with copper set to 73 (the median)]
The reference value for copper matters

Compare Curves
[figure: curves using sex-specific mean copper values: male copper = 154, female copper = 90]
Adjusting for the value of copper matters

Adjusted/Predicted Curves
Can be useful for visualizing the effect of a predictor
Must choose reference values for the confounders:
- often the mean for a continuous variable
- the most common category for a categorical one
stcurve is a flexible tool for creating adjusted or predicted survival curves

Next Lecture
Reading: VGSM, Ch.
Proportional hazards assumption: diagnostic checks
Time-dependent covariates
Stratification
Followed by a homework session