Bernard Rosner Channing Division of Network Medicine

Presentation transcript:

Analysis of Lethal Cancer Risk Among a Cohort of Initially Disease-free Women Bernard Rosner Channing Division of Network Medicine Harvard Medical School Boston, MA 02115 Joint Statistical Meetings Vancouver, BC August 1, 2018 This work was supported by NCI grant T32 CA 9001 and CA 87969.

Background Pre-diagnosis factors may influence the likelihood that a cancer causes a patient’s death. Several methods have been used to evaluate associations with lethal cancer among an initially disease-free population, but each has limitations. In this talk we present a novel two-stage method that separately estimates association of pre-diagnosis risk factors with cancer incidence and with survival among cancer cases and combines them to yield a single measure of association.

Nurses’ Health Study Established in 1976 among 121,700 US female registered nurses, age 30-55. Followed every 2 years by mail questionnaire to inquire about lifestyle factors, health behaviors and medical history. This analysis began in 1984, since this was the first year in which postmenopausal estrogen and progesterone (E+P) therapy was used with appreciable frequency. Follow-up was through 2010. The goal is to relate pre-diagnosis risk factors to death due to breast cancer. 8675 total breast cancer cases, 1382 deaths due to breast cancer.

Notation: t0 = beginning of follow-up; T1 = time from t0 to breast cancer diagnosis; T2 = time from breast cancer diagnosis to breast cancer death; t = total follow-up time from t0; Xt = exposure of interest at time t; Z1t = vector of pre-diagnosis covariates measured at time t included in models of breast cancer incidence; Z2t = vector of pre-diagnosis covariates measured at time t included in models of breast cancer survival (among cases); Vt = vector of post-diagnosis covariates measured at time t included in models of breast cancer survival (among cases).

Types of models considered 1. Time to event analysis using baseline covariates (TTE) 2. Time to event analysis using time-dependent covariates (TDC) 3. Time to event analysis with updated covariates through breast cancer diagnosis (TDX) 4. Ordered Multiple Event Analysis (Prentice, Williams and Peterson) (PWP) 5. Two-stage combined incidence and survival model (2S)

Time-to-event Analysis using Baseline Covariates (TTE) The time scale is the time between baseline and end of follow-up or death, whichever occurred first. Follow-up was censored at the date of death, but only women who died of breast cancer were counted as cases. Women diagnosed with breast cancer who did not die during follow-up were censored in 2010.

TTE (continued) Non-cases were censored at the earliest of the date of diagnosis of any other cancer (except non-melanoma skin cancer), the time of death, or the end of follow-up. Issue: There may be a long interval between baseline and the end of follow-up, and many risk factors may have changed. If more recent exposure is important, this may bias HR estimates.
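The TTE censoring rules can be sketched in a few lines of Python. This is only an illustration of the rules as stated on the slides, not the study's code: the `Woman` record and its field names are hypothetical, dates are treated as calendar years, and the 1984 baseline and 2010 end of follow-up are taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

BASELINE = 1984.0          # start of follow-up (from the slides)
END_OF_FOLLOW_UP = 2010.0  # administrative end of follow-up (from the slides)


@dataclass
class Woman:
    # Hypothetical record layout; all dates are calendar years.
    death: Optional[float] = None            # death from any cause
    died_of_breast_cancer: bool = False      # cause of death was breast cancer
    other_cancer_dx: Optional[float] = None  # diagnosis of another cancer


def tte_outcome(w: Woman) -> Tuple[float, int]:
    """Return (years of follow-up, event indicator) under the TTE rules."""
    # Only breast cancer deaths during follow-up count as events.
    if w.died_of_breast_cancer and w.death is not None and w.death <= END_OF_FOLLOW_UP:
        return w.death - BASELINE, 1
    # Everyone else is censored at the earliest of: another cancer,
    # death from another cause, or the end of follow-up.
    candidates = [END_OF_FOLLOW_UP]
    if w.death is not None:
        candidates.append(w.death)
    if w.other_cancer_dx is not None:
        candidates.append(w.other_cancer_dx)
    return min(candidates) - BASELINE, 0


lethal_case = tte_outcome(Woman(death=2000.0, died_of_breast_cancer=True))
survivor = tte_outcome(Woman())
other_cancer = tte_outcome(Woman(other_cancer_dx=1995.0, death=2001.0))
```

Note that exposure never enters this function: under TTE only the baseline (1984) covariates are used, which is exactly the limitation the slide raises.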

Time-to-event analysis with time-dependent covariates (TDC) The Andersen-Gill method was used to update covariates every 2 years. Issue: Some covariates may change dramatically after diagnosis, thus biasing HR estimates if the association between an exposure and incidence differs from the association between that exposure and death among cases.

Time-to-event Analysis with Updated Covariates until Breast Cancer Diagnosis (TDX) Issues: There may be a long time period between breast cancer diagnosis and death due to breast cancer (i.e., T2 large). Therefore, lethal cases may have their exposure measured at an earlier time than non-cases. If there are secular trends in risk factors over time, this may bias HR estimates.
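The two updating schemes (TDC and TDX) differ only in when the exposure stops being refreshed. A minimal sketch of the counting-process (start, stop] data layout that such analyses typically use is given below; the function name, arguments, and example weights are hypothetical and are not taken from the study.

```python
def counting_process_rows(exposure_by_year, start, stop, event, freeze_at=None, step=2):
    """Split follow-up from start to stop into (t_start, t_stop, exposure, event) rows.

    exposure_by_year: dict mapping questionnaire year -> exposure value
                      (must contain an entry at or before `start`)
    freeze_at:        if given (e.g. year of diagnosis), the exposure is not
                      updated after that year (TDX-style); if None, it is
                      updated throughout follow-up (TDC-style)
    """
    rows = []
    t = start
    while t < stop:
        t_next = min(t + step, stop)
        # Use the most recent questionnaire at or before the cutoff.
        cutoff = t if freeze_at is None else min(t, freeze_at)
        year = max(y for y in exposure_by_year if y <= cutoff)
        # The event, if any, is attached to the final interval only.
        is_last = t_next == stop
        rows.append((t, t_next, exposure_by_year[year], int(event and is_last)))
        t = t_next
    return rows


# Hypothetical weights (kg) that drop sharply after a 1987 diagnosis.
weights = {1984: 60, 1986: 62, 1988: 70, 1990: 55}
tdc_rows = counting_process_rows(weights, 1984, 1991, event=True)
tdx_rows = counting_process_rows(weights, 1984, 1991, event=True, freeze_at=1987)
```

In the TDC rows the final interval carries the post-diagnosis weight of 55, while in the TDX rows every interval after 1987 keeps the last pre-diagnosis value of 62.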

Ordered Multiple Event Analysis (Prentice-Williams-Peterson conditional models; Biometrika 1981;68(2):373-79) Non-cases: Cases: Issue: The regression coefficients are assumed to be the same for all exposures before and after diagnosis, which may be unrealistic.

Two-stage method (2S) - Notation S1(t1) = survival function for incidence = Pr(T1 > t1), where t1 = time from baseline. S2(t2) = survival function for mortality due to breast cancer = Pr(T2 > t2), where t2 is time from breast cancer diagnosis. The corresponding hazard functions are h1(t1) and h2(t2), respectively.

Two-stage method (2S) - Rationale In a standard survival analysis we are interested in estimating the survival function S(t) and the hazard function h(t) where t = time from baseline to the event of interest (in this case death due to breast cancer). The risk set R(t) = set of subjects who are at risk, but have not developed the endpoint as of time t.

Two-stage method (2S) – Rationale (cont.) Usually, all subjects in R(t) are at risk of developing the endpoint at time t, and the probability of getting the endpoint between time t and t + Δt is S(t) - S(t + Δt), while the probability of not developing the endpoint by time t is S(t). Hence, the risk of developing the endpoint between time t and t + Δt is risk = [S(t) - S(t + Δt)] / S(t) ≈ h(t)Δt.
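As a quick numerical check of the relationship between the conditional risk and the hazard, the sketch below assumes an exponential survival function with a constant hazard; the rate of 0.01 per year and the interval width are arbitrary illustrative values.

```python
import math


def survival(t, rate):
    """Exponential survival function S(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


def conditional_risk(t, dt, rate):
    """Risk of the endpoint in (t, t + dt] among those event-free at t."""
    s_t = survival(t, rate)
    return (s_t - survival(t + dt, rate)) / s_t


rate = 0.01  # hypothetical constant hazard per year
dt = 0.1     # short interval, in years
risk = conditional_risk(t=5.0, dt=dt, rate=rate)
hazard_approximation = rate * dt
```

For a short interval the two quantities agree closely, which is why the hazard can be read as the instantaneous risk among those still in the risk set.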

Two-stage method rationale (cont.) However, this construct doesn’t take into account that the only subjects who are at risk of dying of breast cancer between time t and t + Δt are women who already have the disease at time t. But we can estimate this risk separately, based on the survival curves for incidence and mortality mentioned previously:

Two-stage method – Rationale (cont.) Numerator = probability of getting breast cancer shortly after time t1 x probability of dying of breast cancer shortly after t2. Denominator = probability of not getting lethal breast cancer by time t1+t2, which is approximately the probability of not getting breast cancer by time t1+t2 if disease incidence is low.

Two-stage method – Rationale (cont.) Equation (1) expresses the risk of lethal breast cancer as the probability of getting breast cancer between time t1 and t1 + Δt1 years and dying of breast cancer between t2 and t2 + Δt2 years after diagnosis, divided by the probability of remaining disease free over t1+t2 years.

Two-stage model – Inclusion of Covariates Fit separate Cox proportional hazards models for incidence (equation 2) and for mortality among cases (equation 3).
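The two model equations on this slide did not survive in the transcript. A plausible form, assuming standard Cox models written in the notation defined earlier, is sketched below. The coefficient symbols (β1, γ1, β2, γ2, δ) and baseline hazards (h01, h02) are introduced here for illustration and may differ from the symbols on the original slide.

```latex
\begin{align}
h_1(t_1 \mid X, Z_1) &= h_{01}(t_1)\,\exp\!\left(\beta_1 X + \gamma_1^{\top} Z_1\right)
  && \text{(incidence)} \\
h_2(t_2 \mid X, Z_2, V) &= h_{02}(t_2)\,\exp\!\left(\beta_2 X + \gamma_2^{\top} Z_2 + \delta^{\top} V\right)
  && \text{(mortality among cases)}
\end{align}
```

In this form, exp(β1) is the hazard ratio for incidence and exp(β2) is the hazard ratio for breast cancer death among cases, each per unit increase in the exposure X.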

Two-stage model – Inclusion of Covariates (cont.) We now consider the log hazard ratio (HR) comparing a subject with exposure x+1 vs. a subject with exposure x at time 0, where all other pre-diagnosis covariates are the same at time 0 and post-diagnosis covariates are the same at time t2. Based on equations 1-3, this is given by: (4)

Two-stage Model – Inclusion of Covariates (cont.) Note that if we have time-dependent covariates (as we do in the Nurses’ Health Study), then t1 = 0 and equation 4 reduces to:

Two-stage Model – Inclusion of Covariates (cont.) Since are generally small we approximate equation 5 by a 1st order Taylor series expansion about which yields: (6) In general ln(HR) is a function of time since breast cancer diagnosis = t2. However, for cancers where both incidence and mortality rates are not high (7)
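Equations 4 to 7 are missing from the transcript, so the following is only a sketch of how an approximation of this kind can arise, not a reproduction of the slides. It assumes that, with t1 = 0, the combined hazard is proportional to the product of the stage-specific hazards times S2(t2)/S1(t2), that both stages follow proportional hazards models with exposure coefficients β1 and β2 (symbols introduced here), and that all other covariates are held fixed.

```latex
\begin{align*}
h(t_2 \mid x) &\propto h_1(0 \mid x)\, h_2(t_2 \mid x)\, \frac{S_2(t_2 \mid x)}{S_1(t_2 \mid x)},
\qquad S_k(t \mid x) = S_{0k}(t)^{\exp(\beta_k x)}, \quad k = 1, 2 \\
\ln \mathrm{HR}(t_2) &= \ln h(t_2 \mid x+1) - \ln h(t_2 \mid x) \\
&= \beta_1 + \beta_2
 + \left(e^{\beta_2 (x+1)} - e^{\beta_2 x}\right) \ln S_{02}(t_2)
 - \left(e^{\beta_1 (x+1)} - e^{\beta_1 x}\right) \ln S_{01}(t_2) \\
&\approx \beta_1 + \beta_2
 \quad \text{when } S_{01}(t_2) \text{ and } S_{02}(t_2) \text{ are both close to } 1
\end{align*}
```

Under these assumptions the log hazard ratio depends on t2 only through the baseline survival terms, and it collapses to the sum of the two stage-specific log hazard ratios when incidence and mortality are both low. That is consistent with the summary's remark that the goal is to add the effects of a risk factor across stages.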

Two-stage model - Inference Standard methods of inference are performed based on either equations 6 or 7 assuming asymptotic normality.

Simulation Study - design We simulated data using an exponential distribution to mimic the HRs for breast cancer incidence and death due to breast cancer for a variety of risk factors. We simulated 4000 datasets of 100,000 observations each under 6 different combinations of HRincidence and HRsurvival, with the expected HR simulated using the 2S method. 28-year breast cancer incidence rates and 30-year breast cancer survival rates were taken from NHS data (similar to SEER rates). The results for TTE, PWP, 2S (eq. 6), 2S (eq. 7) for a subset of the simulations are given on the next slide.
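A scaled-down version of such a simulation can be sketched with the Python standard library. This is not the authors' simulation code: the baseline rates, sample size, and single 28-year follow-up window are hypothetical choices, the stage-specific HRs are estimated as simple exponential rate ratios rather than by Cox regression, and the combined estimate assumes the simplified 2S estimate is the product of the two stage-specific HRs.

```python
import math
import random


def simulate_group(n, incidence_rate, death_rate, follow_up, rng):
    """Simulate n women; return (cases, incidence person-years,
    breast cancer deaths, post-diagnosis person-years)."""
    cases = deaths = 0
    py_incidence = py_survival = 0.0
    for _ in range(n):
        t1 = rng.expovariate(incidence_rate)  # time from baseline to diagnosis
        if t1 >= follow_up:
            py_incidence += follow_up         # never diagnosed during follow-up
            continue
        cases += 1
        py_incidence += t1
        remaining = follow_up - t1
        t2 = rng.expovariate(death_rate)      # time from diagnosis to death
        if t2 < remaining:
            deaths += 1
            py_survival += t2
        else:
            py_survival += remaining          # alive at end of follow-up
    return cases, py_incidence, deaths, py_survival


def rate_ratio(events_exposed, py_exposed, events_unexposed, py_unexposed):
    """Ratio of event rates; estimates the HR when hazards are constant."""
    return (events_exposed / py_exposed) / (events_unexposed / py_unexposed)


rng = random.Random(2018)
true_hr_incidence, true_hr_survival = 2.0, 2.0
base_incidence, base_death = 0.005, 0.02      # hypothetical rates per year

unexposed = simulate_group(20000, base_incidence, base_death, 28.0, rng)
exposed = simulate_group(20000, base_incidence * true_hr_incidence,
                         base_death * true_hr_survival, 28.0, rng)

hr_incidence = rate_ratio(exposed[0], exposed[1], unexposed[0], unexposed[1])
hr_survival = rate_ratio(exposed[2], exposed[3], unexposed[2], unexposed[3])
hr_two_stage_simplified = hr_incidence * hr_survival  # assumed simplified 2S form
```

Repeating this over many datasets and several (HRincidence, HRsurvival) combinations, and recording bias and confidence-interval coverage for each method, gives a study of the kind described on the slide.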

Simulation Study Results Methods HRincidence HRsurvival Variable TTE 2S (eq. 6) 2S (eq. 7) PWP 1.0 Bias* -0.00 Coverage** 95.1 95.0 2.0 Bias -0.02 0.00 -0.01 -0.10 Coverage 93.0 94.9 94.8 5.0 0.04 0.10 -0.50 88.2 88.1 0.0 -0.04 -0.60 75.9 88.6 * mean observed ** % of 95% confidence intervals that include the true

Simulation Study - Discussion Under the null hypothesis that exposure is not associated with either incidence or mortality, all the methods have little bias and adequate coverage. Under the alternative hypothesis (Ha) that exposure is related to incidence and/or survival, the PWP method has substantial bias and low coverage probability. Under Ha, the TTE method has a slight negative bias and a coverage probability less than 95%. Most of the person-time for the TTE method is for incidence.

Simulation Study – Discussion (cont.) Under Ha, the 2S method incorporating survival probabilities at the 2nd stage has low bias and adequate coverage probability. Under Ha where the exposure has an effect on survival, the simplified 2S method has some positive bias and coverage probability < 95%. Overall, the 2S method incorporating survival probabilities at the 2nd stage performs best.

Nurses’ Health Study – Data Analysis Postmenopausal women, follow-up from 1984-2010: 2,532,073 person-years of follow-up, 8675 incident breast cancer cases, 1382 deaths due to breast cancer. Goal: to assess pre-diagnostic risk factors that predict deaths due to breast cancer. On the next slide we show results for weight change since age 18, an established risk factor for breast cancer, using the different methods of analysis mentioned previously in this talk.

Nurses’ Health Study – Data Analysis
Weight change since age 18: gained > 30 kg vs. stayed within 5 kg (reference, HR = 1.0)
Analysis: HR (95% CI)
Incidence of breast cancer: 1.56 (1.42-1.71)
Breast cancer survival among cases: 1.11 (0.86-1.43)
TTE*: 2.32 (1.77-3.03)
TDC**: 0.73 (0.56-0.95)
TDX***: 1.41 (1.10-1.80)
2S (eq. 6): 1.73 (1.33-2.25)
2S (eq. 7): 1.72 (1.32-2.25)
PWP: 1.49 (1.13-1.96)
# cases/breast cancer deaths: 8675 / 1382. Person-years: 2,439,134 (incidence); 101,348 (survival among cases); 2,532,073 (total follow-up).
* covariates are from the baseline questionnaire (1984) and not updated; ** covariates are updated throughout follow-up; *** covariates are updated until diagnosis for cases and until 2010 for non-cases. + results are adjusted for age at menarche, age and type of menopause, age at each birth, hormone therapy, smoking (pack-years), history of benign breast disease, family Hx of breast cancer, physical activity (met-hrs/wk), BMI at age 18 and alcohol intake.

Nurses’ Health Study – Data Analysis - Discussion Weight change of > 30 kg since age 18 is associated with an increased incidence of breast cancer (HR = 1.56, 95% CI = 1.42-1.71) and a small but not statistically significant increase in breast cancer deaths among cases (HR = 1.11, 95% CI = 0.86-1.43). The TTE method reflecting early weight change (HR = 2.32, 95% CI = 1.77-3.03) and the TDC method of updating weight after breast cancer diagnosis (HR = 0.73, 95% CI = 0.56-0.95) provided dramatically different results. The TDX method of updating weight until diagnosis (HR = 1.41, 95% CI = 1.10-1.80) provided results intermediate between TTE and TDC.

Nurses’ Health Study – Data Analysis – Discussion (cont.) The TDC method is confounded by weight change in response to breast cancer treatment modalities and the TTE method is confounded by large changes in weight since 1984. The TDX method underestimates the HR because weight for cases is updated until diagnosis while weight for non-cases is updated until 2010, thus ignoring the secular trend of an increase in weight over time.

Nurses’ Health Study – Data Analysis – Discussion (cont.) The PWP method (HR = 1.49, 95% CI = 1.13-1.96) is essentially a weighted average of the HR for incidence and survival (emphasizing the former due to the larger number of person-years). The 2S method based on equation 6 (HR = 1.73, 95% CI = 1.33-2.25) integrates the HR for incidence and mortality into one cumulative HR and seems appropriate for this design. The simplified 2S method (eq. 7) yields essentially the same results as the original 2S method (eq. 6). Not all breast cancer risk factors showed differences as large between methods.
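The reported stage-specific results can be combined by hand as a sanity check. The sketch below assumes that the simplified 2S estimate is the sum of the two log hazard ratios and that the two stage-specific estimates are independent; neither assumption is stated explicitly in the transcript. Standard errors are back-calculated from the published 95% confidence intervals.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of ln(HR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported results for gaining > 30 kg since age 18 (from the slides).
hr_incidence, ci_incidence = 1.56, (1.42, 1.71)
hr_survival, ci_survival = 1.11, (0.86, 1.43)

# Assumed combination rule: add log hazard ratios, add variances.
log_hr = math.log(hr_incidence) + math.log(hr_survival)
se = math.sqrt(se_from_ci(*ci_incidence) ** 2 + se_from_ci(*ci_survival) ** 2)

hr_combined = math.exp(log_hr)
ci_combined = (math.exp(log_hr - 1.96 * se), math.exp(log_hr + 1.96 * se))
```

The combined estimate and interval come out close to the reported 2S results, which supports reading the 2S hazard ratio as approximately the product of the incidence and survival hazard ratios.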

Summary We have presented several approaches for assessing risk factors for lethal cancer among disease-free women. The key difference between this design and the ordinary survival analysis design is that a subject must encounter two events, (a) getting breast cancer and (b) dying of breast cancer, to be considered a “case.” Thus, a person is not in the risk set for the 2nd event until they have realized the 1st event.

Summary (cont.) Some traditional approaches such as TTE or TDC don’t seem appropriate because, in the former case, risk factors may change substantially over a long period of time and, in the latter case, risk factors may be influenced by treatment after diagnosis. The TDX method is also inappropriate because it ignores secular trends in risk factors after diagnosis. The usual “multiple events” analysis, such as models for breast cancer recurrence, of which PWP is a prototype, also doesn’t seem appropriate since (a) there is an assumption that effects of risk factors are the same at each stage, and (b) the real goal is to add effects of a risk factor over multiple stages rather than to average them.

Summary (cont.) The 2S method seems appropriate for this design and with approximation can yield a single HR estimate and can be implemented using standard Cox regression software at each stage. It might be applicable to other diseases but would require an extension for diseases that are sometimes immediately fatal at the 1st stage, e.g., heart disease where some subjects may die immediately after a heart attack while others survive and subsequently may die of heart disease at a later age.