Analysis of Complex Survey Data

Presentation transcript:

Analysis of Complex Survey Data Day 4: Survival analysis and Cox proportional hazards models

Nonparametric Survival Analysis
Kaplan-Meier Method (also called Product-Limit Method)
Life Table Method (also called Actuarial Method)

Nonparametric Survival Analysis
A statistical method to study time to an event:
1) Divide the risk period into many small time intervals
2) Treat each interval as a small cohort analysis
3) Combine the results for the intervals

Basic Concepts of Survival Analysis
Censoring
Time to an event
Survival Function

Censoring
An observation is censored when the subject has not experienced the event (outcome) by the end of the study, or when the subject withdraws from the study (lost to follow-up, or died from another disease).
Survival analysis assumes that loss to follow-up (LTF) and competing-cause censoring are random (independent of exposure and outcome).
Survival analysis is most useful with longitudinal complex surveys (e.g., PSID, AddHealth). It can also be used in cross-sectional studies that incorporate retrospective age-of-onset information.
Right censoring. Example: study of the time to death after lung cancer diagnosis.
Known time to death: those who died (uncensored observations).
Unknown time to death as of the end of the study period: those who survived (censored observations).
Interval censoring. The subject experiences the event (outcome) within an interval, but the exact time is not known. Examples:
(1) Framingham Heart Study. The ages at which subjects first developed coronary heart disease (CHD) are usually known exactly. However, the age at first occurrence of the subcategory angina pectoris may be known only to lie between two clinical examinations, approximately two years apart.
(2) Annual HIV testing. A person tests negative at the end of year 2 and is found to be infected at the end of year 3. The time of infection is interval censored between year 2 and year 3.

Censoring Example
Cohort size at start: 1,000, followed for 1 year
Number with disease: 28
Number LTF: 15
If we assume all dropped out on the 1st day of the study, rate of disease per year = 28 / (1,000 - 15) = 28 / 985 = .0284
If we assume all dropped out on the last day of the study, probability of disease = 28 / 1,000 = .0280
If the drop-out rate is constant over the period, the best estimate of when subjects dropped out is the midpoint; the probability of disease is then 28 / (1,000 - 7.5) = 28 / 992.5 = .0282
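The three estimates in the censoring example differ only in how long each lost subject is assumed to have been at risk. A minimal Python sketch of that arithmetic follows; the function name and its `ltf_fraction_at_risk` parameter are illustrative and not part of the original slides.

```python
def disease_risk(n_start, n_cases, n_ltf, ltf_fraction_at_risk):
    """Risk estimate when each lost-to-follow-up (LTF) subject is assumed
    to have been at risk for the given fraction of the follow-up year."""
    # Each LTF subject removes (1 - fraction) of a person from the denominator.
    effective_n = n_start - n_ltf * (1.0 - ltf_fraction_at_risk)
    return n_cases / effective_n

# Cohort from the slide: 1,000 subjects, 28 cases, 15 LTF.
first_day = disease_risk(1000, 28, 15, 0.0)  # 28 / 985
last_day = disease_risk(1000, 28, 15, 1.0)   # 28 / 1000
midpoint = disease_risk(1000, 28, 15, 0.5)   # 28 / 992.5

print(round(first_day, 4), round(last_day, 4), round(midpoint, 4))
```

The midpoint assumption is the same one the life table method uses when it defines the effective sample size.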

Survival Function
The probability of surviving beyond a specific time t: S(t) = 1 - F(t), where F(t) is the cumulative probability distribution for the endpoint (e.g., death).

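[Figure: tree diagram of four successive time periods. At each period a subject either survives to the next stage (S1, S2, S3, S4, with conditional survival probabilities n, o, p, q) or fails (F).]
Probability of survival at each new time period = probability at that time period, conditional on "surviving" to that interval.
Probability of survival to S4 = n * o * p * q
Failures (F) = deaths, cases, or losses to follow-up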

Life Table Method
A classical method of estimating the survival function in epidemiology and actuarial science.
Time is partitioned into a fixed sequence of intervals (not necessarily of equal length).
Interval lengths are arbitrary; the larger the interval, the larger the bias.
Useful for large samples.

The LIFETEST Procedure
Stratum 1: platelet = 0
Life Table Survival Estimates

Interval          Number   Number     Effective     Cond. Prob.   Cond. Prob.
[Lower, Upper)    Failed   Censored   Sample Size   of Failure    Std. Error    Survival   Failure
[ 0, 10)          4        0          9.0           0.4444        0.1656        1.0000     0
[10, 20)          2        1          4.5           0.4444        0.2342        0.5556     0.4444
[20, 30)          0        0          2.0           0             0             0.3086     0.6914
[30, 40)          1        0          2.0           0.5000        0.3536        0.3086     0.6914
[40, 50)          0        0          1.0           0             0             0.1543     0.8457
[50, 60)          1        0          1.0           1.0000        0             0.1543     0.8457

Effective sample size (n*): whenever there is censoring (withdrawal or loss), we assume that, on average, those individuals who became lost or withdrawn during the interval were at risk for half the interval.
Censoring at the midpoint of the interval is equivalent to assuming that the distribution of censoring times is uniform within the interval.
Thus, effective sample size (n*) = n - ½ (number censored)
E.g., effective sample size (1st interval) = 9 - ½ (0) = 9
E.g., effective sample size (2nd interval) = 5 - ½ (1) = 4.5

The LIFETEST Procedure, Stratum 1: platelet = 0 (same life table as on the previous slide)
Conditional probability of failure, P(F) = number failed / effective sample size. It estimates the probability that a patient will die during the interval, given that he/she made it to the start of the interval.
E.g., P(F) (1st interval) = 4/9 = .4444
E.g., P(F) (2nd interval) = 2/4.5 = .4444
Survival probability (in each interval) = 1 - failure probability (in each interval)
Cumulative survival at the start of an interval = cumulative survival at the start of the previous interval * survival probability in the previous interval
E.g., cumulative survival at the start of the 2nd interval = 1 * (1 - .4444) = .5556
E.g., cumulative survival at the start of the 3rd interval = 1 * (1 - .4444) * (1 - .4444) = .5556 * .5556 = .3086
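The life table columns can be recomputed from the failed and censored counts alone. The Python sketch below is only an illustration of the arithmetic described on the slides (it is not the SAS LIFETEST implementation and omits the standard errors); the function and key names are made up for this example.

```python
def life_table(n_start, counts):
    """Actuarial life table.

    counts: list of (failed, censored) per interval, in time order.
    Returns one dict per interval with the effective sample size, the
    conditional probability of failure, and the cumulative survival at
    the start of the interval.
    """
    rows = []
    n_entering = n_start
    survival = 1.0  # cumulative survival at the start of the first interval
    for failed, censored in counts:
        # Censored subjects are assumed at risk for half the interval.
        effective_n = n_entering - 0.5 * censored
        cond_fail = failed / effective_n if effective_n > 0 else 0.0
        rows.append({
            "effective_n": effective_n,
            "cond_fail": cond_fail,
            "survival_at_start": survival,
        })
        survival *= 1.0 - cond_fail
        n_entering -= failed + censored
    return rows

# Counts from the platelet = 0 stratum: 9 subjects, six 10-unit intervals.
counts = [(4, 0), (2, 1), (0, 0), (1, 0), (0, 0), (1, 0)]
rows = life_table(9, counts)
for row in rows:
    print(row["effective_n"], round(row["cond_fail"], 4),
          round(row["survival_at_start"], 4))
```

Run on these counts, the sketch should reproduce the Effective Sample Size, Conditional Probability of Failure, and Survival columns of the table.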

Kaplan-Meier (Product-Limit) Method
Time is partitioned into variable intervals: whenever a case arises, a new time interval is set up.
Uses the actual censoring and event times.
If the longest follow-up times are censored (censored times > last event time), the average duration will be underestimated by the KM method.

Kaplan-Meier Method
[Figure: follow-up timelines for six patients, plotted against months since enrollment (axis ticks at 4, 10, 14, 24).]
Patient 1: died
Patient 2: lost to follow-up
Patient 3: died
Patient 4: died
Patient 5: lost to follow-up
Patient 6: died

Kaplan-Meier Method
Columns:
(1) Time to death from starting treatment (months)
(2) Number alive at each time
(3) Number who died at each time
(4) Hazard: proportion who died at that time, (3)/(2)
(5) Proportion who survived at that time, 1.00 - (4)
(6) Cumulative survival

(1)    (2)    (3)    (4)     (5)     (6)
 4     6      1      .167    .833    .833
10     4      1      .250    .750    .625
14     3      1      .333    .667    .417
24     1      1      1.00    .000    .000
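The product-limit calculation for the six-patient example can be sketched in a few lines of Python. The slides give the four death times (4, 10, 14, 24 months) but not the exact times at which patients 2 and 5 were lost to follow-up; the values 7 and 20 below are assumptions chosen only to fall between the neighbouring death times, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit estimate.

    times:  follow-up time for each subject
    events: 1 if the subject died at that time, 0 if censored
    Returns a list of (time, n_at_risk, deaths, cumulative_survival),
    one entry per distinct death time.
    """
    survival = 1.0
    table = []
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        table.append((t, at_risk, deaths, survival))
    return table

# Patients 1-6; 7 and 20 are ASSUMED censoring times (not given on the slides).
times = [4, 7, 10, 14, 20, 24]
events = [1, 0, 1, 1, 0, 1]

km = kaplan_meier(times, events)
for t, at_risk, deaths, s in km:
    print(t, at_risk, deaths, round(s, 3))
```

Any censoring times that keep the same ordering relative to the death times would give the same estimates, because censored subjects only affect the number at risk at later death times.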

Kaplan-Meier Plot (N=6)
[Figure: step curve of % surviving against months after enrollment. Survival drops to .833 at month 4, .625 at month 10, .417 at month 14, and .0 at month 24.]

Kaplan-Meier Curve (N = 5,398)
[Figure: Kaplan-Meier curves for three groups: Tort, No Fault 1, No Fault 2.]
Source: Cassidy JD et al., "Effect of eliminating compensation for pain and suffering on the outcome of insurance claims for whiplash injury," N Engl J Med 2000;342:1179-1186

Median Survival Time
[Figure: median survival time read from the curves for Tort, No Fault 1, and No Fault 2.]

Semi-Parametric Methods
There is no need to choose a particular probability distribution to represent survival time.
Can incorporate time-dependent covariates. Example: exposure that increases over time, as with drug dosage or with workers in hazardous occupations.

Cox Proportional Hazards Model
Basic model of the hazard for individual i at time t:
$h_i(t) = h_0(t) \exp\{\beta_1 x_{i1} + \cdots + \beta_k x_{ik}\}$
Here $h_0(t)$ is the baseline hazard function (non-negative), and the exponent is a linear function of fixed covariates.
Taking the logarithm of both sides:
$\log h_i(t) = \log h_0(t) + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$
There is no need to specify the functional form of the log baseline hazard function, $\log h_0(t)$.

Cox Proportional Hazards Model
Consider the hazard ratio of two individuals i and j:
$h_i(t) = h_0(t) \exp\{\beta_1 x_{i1} + \cdots + \beta_k x_{ik}\}$
$h_j(t) = h_0(t) \exp\{\beta_1 x_{j1} + \cdots + \beta_k x_{jk}\}$
Hazard ratio: $h_i(t) / h_j(t) = \exp\{\beta_1 (x_{i1} - x_{j1}) + \cdots + \beta_k (x_{ik} - x_{jk})\}$
The baseline hazard cancels, so the hazard functions are multiplicatively related and the hazard ratio is constant over survival time.
The hazards of any two individuals are proportional.
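To see the proportionality numerically, the sketch below evaluates the hazards of two individuals at several times under an arbitrary baseline hazard and checks that their ratio does not change. The coefficients, covariate values, and baseline function are hypothetical and chosen only for illustration.

```python
import math

def hazard(t, x, betas, baseline):
    """Cox model hazard: baseline(t) * exp(sum of beta_k * x_k)."""
    linear_predictor = sum(b * xk for b, xk in zip(betas, x))
    return baseline(t) * math.exp(linear_predictor)

def hazard_ratio(x_i, x_j, betas):
    """Ratio of hazards for individuals i and j; the baseline cancels."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(betas, x_i, x_j)))

# Hypothetical values, for illustration only.
betas = [0.7, -0.2]
x_i = [1, 3.0]
x_j = [0, 1.0]

def baseline(t):
    # Any non-negative function of time would do here.
    return 0.01 * t ** 1.5

ratios = [hazard(t, x_i, betas, baseline) / hazard(t, x_j, betas, baseline)
          for t in (1, 5, 20)]
hr = hazard_ratio(x_i, x_j, betas)
print(ratios, hr)
```

Replacing `baseline` with a different non-negative function changes both hazards but leaves every entry of `ratios` equal to `hr`.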

Cox Proportional Hazards Model: Partial Likelihood Estimation
Estimates the β coefficients of the Cox model without having to specify the baseline hazard function $h_0(t)$.
The partial likelihood depends only on the order in which events occur, not on the exact times of occurrence.
Partial likelihood estimates are not fully efficient, because information about the exact times of event occurrence is lost.
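The claim that only the ordering of events matters can be checked with a small sketch of the log partial likelihood. This version assumes a single covariate and no tied event times; the data reuse the six-patient follow-up pattern with assumed censoring times (7 and 20) and a hypothetical binary covariate, none of which come from the slides.

```python
import math

def log_partial_likelihood(beta, times, events, x):
    """Cox log partial likelihood for one covariate, assuming no tied event times.

    For each observed event, the subject's risk score exp(beta * x) is compared
    with the total risk score of everyone still at risk at that time.
    """
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored subjects contribute only through risk sets
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        denom = sum(math.exp(beta * x[j]) for j in risk_set)
        total += beta * x[i] - math.log(denom)
    return total

# Illustrative data: assumed censoring times and a hypothetical covariate.
times = [4, 7, 10, 14, 20, 24]
events = [1, 0, 1, 1, 0, 1]
x = [1, 0, 1, 0, 1, 0]

# Replace the times by their ranks: same ordering, different exact values.
ranks = [1, 2, 3, 4, 5, 6]

ll_times = log_partial_likelihood(0.5, times, events, x)
ll_ranks = log_partial_likelihood(0.5, ranks, events, x)
print(ll_times, ll_ranks)
```

Because the risk sets are identical for `times` and `ranks`, the two values agree, which is the sense in which the exact event times carry no information for the partial likelihood.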

Interpretation of Coefficients
No intercept: the baseline hazard $h_0(t)$ is an arbitrary function of time and cancels out of the estimating equations.
$e^{\beta}$: hazard ratio.
Indicator variables (coded as 0 and 1): $e^{\beta}$ is the ratio of the estimated hazard for those with a value of 1 on X to the estimated hazard for those with a value of 0 on X (controlling for other covariates).
Quantitative (continuous) variables: $100(e^{\beta} - 1)$ is the estimated percent change in the hazard for each one-unit increase in X. For example, for the variable AGE, $e^{\beta} = 1.5$ gives 100(1.5 - 1) = 50: for each one-year increase in age at diagnosis, the hazard of death goes up by an estimated 50 percent, controlling for other covariates.
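The percent-change reading of a coefficient is a one-line conversion. The helper below is an illustrative sketch (its name and the `units` parameter are not from the slides); it also shows that the effect compounds multiplicatively over several units rather than adding.

```python
import math

def percent_change_in_hazard(beta, units=1.0):
    """Estimated percent change in the hazard for a `units` increase in X."""
    return 100.0 * (math.exp(beta * units) - 1.0)

# AGE example from the slide: exp(beta) = 1.5, so beta = log(1.5).
beta_age = math.log(1.5)

one_year = percent_change_in_hazard(beta_age)        # 100 * (1.5 - 1)
two_years = percent_change_in_hazard(beta_age, 2.0)  # 100 * (1.5**2 - 1)
print(round(one_year, 1), round(two_years, 1))
```

A two-year increase therefore corresponds to a hazard ratio of 1.5 squared, not to twice the one-year percent change.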

Lab 4: estimating survival curves and Cox models in SUDAAN