M2 Medical Epidemiology


M2 Medical Epidemiology Survival Analysis; Evidence-Based Medicine

Survival Rates and Censoring: problems with naïve analyses; ubiquity of censored data; methods of adjustment for censoring; survival curves and their interpretation.

Problems with Naïve Survival Analyses The table below was obtained several decades ago by averaging ages at death from death certificates of physicians who had practiced in different specialties. The investigators concluded that there was a dose-response effect associating specialties with higher exposure to radiation with shorter life expectancy. Why was their conclusion nonsense?

Problems with Naïve Survival Analyses The following headlines, the second of which is from the Champaign-Urbana News-Gazette, feature a researcher who has repeatedly reached very surprising conclusions about smoking and mortality.

Problems with Naïve Survival Analyses These include: Women are at greater risk of smoking-related death than men. Filter cigarettes are more dangerous than unfiltered cigarettes. Several years ago other investigators, using a similarly designed study that received even more publicity, reached the conclusion that left-handers have life expectancy 9 years below that of right-handers. Numerous “plausible” explanations were advanced to explain this. These conclusions were reached by the same method used by the researchers above. Other similar studies have erroneously concluded that effective medical innovations have no impact. Can you explain all this silliness by a single flaw in the design of the research?

Problems with Naïve Survival Analyses Death certificate studies such as those above can be grossly biased, because they don't start with a defined cohort. By selecting into the study population only those people who have already died, they systematically exclude the people who live longest. If the age distributions of two groups differ to start with, the resulting comparisons can lead to ridiculous scientific conclusions due to selection bias.

Problems with Naïve Survival Analyses Thus, as a rule, only studies that begin with defined cohorts are valid. But a cohort must usually be assembled over a period of time, since people don’t all get sick at once. Members of the cohort may drop out of the study or become lost to follow-up for numerous reasons, and some suffer the outcome of interest earlier than others. Hence, some members of the cohort may be observed for much shorter periods than others. Methods that don’t take these differences in observation periods into account are subject to substantial measurement bias, because the process of monitoring for the outcome differs between individuals.

Problems with Naïve Survival Analyses One way to take observation times into account is to calculate incidence density rates using person-years. This method works well when incidence density is stable over the period of study. However, usually this assumption is false. In the study of long-term survival, we all know that the incidence density of death (the mortality rate) rises with age, which increases over time. Studies of surgical outcomes must deal with perioperative mortality, which is often much higher than later mortality. Mortality rates from cancer change substantially after treatment; 5-year survival for some cancers is regarded as cure. The rate of complications of certain diseases, such as diabetes, increases greatly with duration of the disease. For these situations, calculating overall incidence density rates using person-years pools information from different times in an inappropriate way. Other methods are necessary.
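
A small sketch of why pooling person-years can mislead when the rate is not stable. The cohort below is hypothetical (a surgical series with high perioperative mortality), and every number in it is invented for illustration; none comes from the slides.

```python
# Hypothetical follow-up data: (period, deaths, person-years at risk).
# All numbers are invented for illustration only.
periods = [
    ("first month after surgery", 10, 8.0),
    ("rest of first year", 5, 80.0),
    ("years 2 to 5", 12, 300.0),
]


def incidence_density(deaths, person_years):
    """Events per person-year of observation."""
    return deaths / person_years


# Pooled rate: a single number for the whole follow-up.
total_deaths = sum(d for _, d, _ in periods)
total_py = sum(py for _, _, py in periods)
pooled_rate = incidence_density(total_deaths, total_py)

# Period-specific rates show how much the pooled figure hides.
period_rates = {label: incidence_density(d, py) for label, d, py in periods}

print(f"pooled: {pooled_rate:.3f} deaths per person-year")
for label, rate in period_rates.items():
    print(f"{label}: {rate:.3f} deaths per person-year")
```

With these made-up figures the pooled rate sits far below the perioperative rate and above the later rates, so it describes no period of follow-up well.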

Ubiquity of Censored Data The term “survival data” is used in Medicine and Public Health to describe “time to event” data, where the event is any occurrence of importance to health. Time to death, renal failure, second myocardial infarction, first asthma attack after change in therapy, or time to recovery are all “survival data” in this technical sense. (CAUTION) Survival data in Medicine and Public Health are also usually "censored data," because for some subjects we know only that they have survived at least a certain period of time, but we don’t know when death or other outcome will occur. We stop most clinical trials or vaccine field trials before most subjects die or experience an unfavorable outcome. Otherwise, it would take too long to get an answer and researchers couldn’t get tenure.

Methods of adjustment for censoring To interpret censored survival data well, we must: avoid pooling information from different subjects inappropriately; avoid pooling information from different times inappropriately; take censoring into account without introducing bias into the analysis. We use one of two methods: actuarial (Cutler-Ederer) or product-limit (Kaplan-Meier). Both stem from the same fundamental approach: the fundamental equation of survival analysis.

Probability of surviving 2 periods (example: 2 years) Start with the probability of surviving the 1st year (e.g. 80%). Next take the probability of surviving the 2nd year: not 2 years, only the 2nd year, i.e. of those alive after 1 year, what is the probability of surviving the 2nd year (e.g. 70%)? Then the probability of surviving 2 years is 80% × 70% = 56%. Can we just divide the number surviving 2 years by the starting number? NO, because some patients withdraw or are lost to follow-up before the 2 years are up.

So all we need is the percent surviving each time period. We get that by calculating the percent dying during the time period. Example: we start with 90 patients. During the first year 20 withdraw and 16 die. Is the probability of dying during the 1st year 16 divided by 90, or 16 divided by 70? Halfway between: 16 divided by 80.

Actuarial method Number dying during the period divided by (number alive at the beginning of the period minus half of the number withdrawn). 16/(90 − 20/2) = 16/80 = 20%, so 80% survive.

2nd year We are starting with 90 − 20 − 16 = 54. During the 2nd year 8 are lost to follow-up and 15 die. Probability of dying in the 2nd year is 15/(54 − 4) = 30%, so 70% survive the 2nd year. So the probability of surviving 2 years is 80% × 70% = 56%.
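
The two-year arithmetic above can be checked with a short sketch. The function name is a hypothetical helper, not anything from the slides; the counts are the ones in the example. The last lines show what the naive shortcut (people still under observation at 2 years divided by the starting number) would give instead.

```python
def actuarial_interval_survival(alive_at_start, deaths, withdrawn):
    """Conditional probability of surviving one interval (actuarial method).

    Half of the withdrawals are removed from the denominator, as if each
    withdrawn patient had been at risk for half the interval.
    """
    at_risk = alive_at_start - withdrawn / 2
    return 1 - deaths / at_risk


# Year 1: 90 start, 20 withdraw, 16 die.
p_year1 = actuarial_interval_survival(90, deaths=16, withdrawn=20)
# Year 2: 54 start, 8 are lost to follow-up, 15 die.
p_year2 = actuarial_interval_survival(54, deaths=15, withdrawn=8)

# Product of the conditional probabilities.
two_year_survival = p_year1 * p_year2

# Naive shortcut: treats every withdrawn patient as if they had died.
still_observed = 90 - 20 - 16 - 8 - 15
naive = still_observed / 90

print(f"year 1: {p_year1:.2f}, year 2: {p_year2:.2f}")
print(f"two-year survival (product): {two_year_survival:.2f}")
print(f"naive proportion: {naive:.3f}")
```

The naive proportion comes out well below 56% because it counts the 28 withdrawn patients as failures.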

Methods of adjustment for censoring Survival data, converted from chronological to biological time:

Methods of adjustment for censoring Fundamental equation of survival analysis Suppose we select a set of times, symbolized by $t_1, t_2, \ldots, t_k$. These represent not calendar time, but durations from a clinically defined starting point such as diagnosis or treatment. Suppose that patients are observed for different durations after this starting point, usually over different intervals of calendar time, as in the previous slide. We are interested in the probabilities that a patient survives until each of the given times $t_1, t_2, \ldots, t_k$ after the starting point. Why? These are useful measures of prognosis in clinical practice, both for their own sakes and as complements of the cumulative incidences (CIs) of death at $t_1, t_2, \ldots, t_k$. We may also use them for comparing cohorts with different exposures in observational epidemiological studies, and for comparing treatment effects in experimental clinical trials.

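Methods of adjustment for censoring To estimate these probabilities, we use the Fundamental Equation of Survival Analysis:

$$
\begin{aligned}
\Pr\{\text{surviving through } t_j\} &= 1 - \Pr\{\text{death by } t_j\} \\
&= \Pr\{\text{surviving through } t_1\} \\
&\quad \times \Pr\{\text{surviving through } t_2 \mid \text{survival through } t_1\} \\
&\quad \times \Pr\{\text{surviving through } t_3 \mid \text{survival through } t_2\} \\
&\quad \times \Pr\{\text{surviving through } t_4 \mid \text{survival through } t_3\} \\
&\quad \times \cdots \\
&\quad \times \Pr\{\text{surviving through } t_j \mid \text{survival through } t_{j-1}\}
\end{aligned}
$$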

Methods of adjustment for censoring Thus, the probability of surviving a given duration is expressed as the product of the probability of surviving an initial interval with the conditional probabilities of surviving successive subsequent intervals, having survived all previous intervals. Each of these terms may be separately estimated by pooling data from relevant persons with possibly non-concurrent experiences!

Methods of adjustment for censoring Notation: $O_x$ = # alive at beginning of interval $x$; $D_x$ = # dying during interval $x$; $W_x$ = # withdrawn from study or lost to follow-up during interval $x$.

Methods of adjustment for censoring Cutler-Ederer (Actuarial) Approach Intervals are specified in advance. $\Pr\{\text{dying during interval } x\} = D_x / (O_x - W_x/2)$. $\Pr\{\text{surviving interval } x\} = 1 - \Pr\{\text{dying during interval } x\}$.

Kaplan-Meier Keep track of withdrawals all the time. Don’t touch the curve until someone dies. Probability of dying is number dying at this point divided by number still available at the time of death.

Example You start with 15 patients. You are notified about withdrawals as they occur. On July 3rd you are notified about 2 deaths (on the same day!). You look at the number withdrawn up to that point and find there have been 5. You divide 2 by (15 minus 5): 2/10 = 20%.

Contd On July 3rd you take your line straight down from 100% to 80%. So probability of dying is number dying at any point divided by number alive at beginning of previous period minus all withdrawals during that period.

Next Now we have only 8 patients. On December 23rd, 1 patient dies. Between July 3rd and December 23rd, 2 patients are withdrawn. Divide 1 by (8 minus 2): 1/6 = 16.7%. Probability of surviving the 2nd period is 83.3%. Probability of surviving 2 time periods is 80% × 83.3% = 66.7%. So on December 23rd you take the line straight down from 80% to 66.7%.

Where do you read the curve? At the end of the line.

Methods of adjustment for censoring Product-Limit (Kaplan-Meier) Approach Intervals are determined by the times of death: infinitesimally small intervals around each death time and, in between, intervals during which no deaths occur. $\Pr\{\text{surviving intervals between deaths}\} = 1$. $\Pr\{\text{dying at the } x\text{th death time}\} = D_x / O_x$, where $O_x$ counts those still under observation just before that death time.
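
The 15-patient example from the earlier slides can be replayed with a minimal product-limit sketch. The function and the event-list layout are assumptions made for illustration; they are not a reference implementation of any statistics package.

```python
def kaplan_meier(n_start, events):
    """Product-limit estimate from a time-ordered list of events.

    events: list of (label, kind, count), kind being "withdrawn" or "death".
    Returns a list of (label, survival) pairs, one per death time.
    """
    at_risk = n_start
    survival = 1.0
    curve = []
    for label, kind, count in events:
        if kind == "death":
            # The curve only steps down at a death time.
            survival *= 1 - count / at_risk
            curve.append((label, survival))
        elif kind != "withdrawn":
            raise ValueError(f"unknown event kind: {kind}")
        # Deaths and withdrawals both leave the risk set.
        at_risk -= count
    return curve


events = [
    ("before July 3rd", "withdrawn", 5),
    ("July 3rd", "death", 2),
    ("between July 3rd and December 23rd", "withdrawn", 2),
    ("December 23rd", "death", 1),
]
curve = kaplan_meier(15, events)
for label, s in curve:
    print(f"{label}: {s:.3f}")
```

Withdrawals never move the curve; they only shrink the denominator used at the next death time.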

Methods of adjustment for censoring Kaplan-Meier (product-limit) and Cutler-Ederer (actuarial) survival plots of the same data. Which is which?

Actuarial methods of adjustment for censoring Estimated chance that someone who starts the interval will die within the interval: $q_x = D_x / (O_x - W_x/2)$. Estimated chance that someone who starts the interval will survive through it: $p_x = 1 - q_x$. Chance of surviving from the beginning of the study to the end of the interval: $P_x = p_x \times p_{x-1} \times p_{x-2} \times \cdots \times p_1 = p_x \times P_{x-1}$.

Actuarial Method of adjustment for censoring $q_1 = D_1 / (O_1 - W_1/2) = 27 / (146 - 3/2) = .1869$; $p_1 = 1 - q_1 = 1 - .1869 = .8131$; $P_x = p_x \times p_{x-1} \times p_{x-2} \times \cdots \times p_1 = p_x \times P_{x-1}$, so $P_1 = p_1 = .8131$.

Actuarial Method of adjustment for censoring $P_1 = p_1 = .8131$; $q_2 = D_2 / (O_2 - W_2/2) = 18 / (116 - 10/2) = .1622$; $p_2 = 1 - q_2 = 1 - .1622 = .8378$; $P_x = p_x \times p_{x-1} \times p_{x-2} \times \cdots \times p_1 = p_x \times P_{x-1}$, so $P_2 = p_2 \times p_1 = .8378 \times .8131 = .6812$.
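
The same life-table steps can be written as a loop. This is a sketch under the notation above (O, D, W per interval); the function name is hypothetical, and only the two intervals given on the slides are used.

```python
def actuarial_life_table(intervals):
    """Cutler-Ederer life table.

    intervals: list of (O, D, W) = (alive at start, deaths, withdrawn).
    Returns rows of (q, p, P): interval death probability, interval
    survival probability, and cumulative survival to the interval's end.
    """
    rows = []
    cumulative = 1.0
    for alive, deaths, withdrawn in intervals:
        q = deaths / (alive - withdrawn / 2)
        p = 1 - q
        cumulative *= p  # P_x = p_x * P_(x-1)
        rows.append((q, p, cumulative))
    return rows


# (O_x, D_x, W_x) for the first two intervals of the worked example.
table = actuarial_life_table([(146, 27, 3), (116, 18, 10)])
for x, (q, p, P) in enumerate(table, start=1):
    print(f"interval {x}: q={q:.4f} p={p:.4f} P={P:.4f}")
```

Because the code carries full precision between steps, its two-interval survival is about .6813; the .6812 on the slide comes from multiplying the already rounded values .8378 and .8131.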

Cox Proportional Hazards Car going at constant 20 MPH through varying traffic, curves etc. Risk of accident varies instantaneously according to traffic, road condition etc. Another car going through exact same roads and traffic but at 40 MPH. Risk of accident is twice(?) as much at every instant.

Proportional hazards Hazard varies over time, but the ratio of the hazards remains constant. Sir David Cox in 1972 introduced a method to calculate the proportional hazard without calculating the actual time-dependent hazard. This proportional hazard can be “adjusted” for covariates (Cox regression). Output: HR, the hazard ratio (similar to OR). Breslow introduced a way to estimate the hazard at any particular time.
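
A small numerical illustration of the proportional hazards assumption itself; it does not fit a Cox model. The baseline hazards and the hazard ratio of 2 are invented, echoing the two-car analogy. When the hazard ratio is constant, the exposed survival curve equals the baseline curve raised to the power HR.

```python
import math

# Hypothetical baseline hazard (events per unit time) in successive
# unit-length intervals; it varies over time, like risk on a changing road.
baseline_hazard = [0.05, 0.20, 0.10, 0.02]
hazard_ratio = 2.0  # assumed constant at every instant


def survival_curve(hazards):
    """Survival at the end of each unit interval: exp(-cumulative hazard)."""
    cumulative = 0.0
    curve = []
    for h in hazards:
        cumulative += h
        curve.append(math.exp(-cumulative))
    return curve


s_baseline = survival_curve(baseline_hazard)
s_exposed = survival_curve([hazard_ratio * h for h in baseline_hazard])

for s0, s1 in zip(s_baseline, s_exposed):
    # Under proportional hazards, S1(t) = S0(t) ** HR.
    print(f"S0={s0:.3f}  S1={s1:.3f}  S0**HR={s0 ** hazard_ratio:.3f}")
```

The last two columns agree at every time point, whatever shape the baseline hazard takes.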

Survival Curves and their Interpretation Survival curves always start at 1.0 = 100% on the vertical axis, and can only decline. The only issue is how fast they decline. Further, if one follows patients long enough, all curves describing actual survival (in contrast to some other outcome that doesn't affect everyone eventually) end at zero. The issue is therefore not where they end, but how much higher one curve is relative to another, or the area between the curves. This is no surprise: it’s just the cumulative incidence issue in another form, since survival “rates” are just complements, with respect to 1, of cumulative incidences.

Survival Curves and their Interpretation Trends in survival curves may be much less accurate towards the right end than at the beginning, because fewer people contribute to the computation at the right end, most subjects having been observed for shorter intervals. However, this problem of unreliability may be somewhat mitigated by the tendency of the true survival curve to flatten out in many real situations. Note that it’s not as much the height at the end that’s less accurate as it is the slope at the end. This point is important in understanding prognostic estimates made near the ends of the curves, as described below.

Survival Curves and their Interpretation Later Prognosis Survival curves can be used to estimate the outlook for a patient who has already survived a certain length of time, by dividing the height of the curve later by its present height. Thus, if a patient who has survived a myocardial infarction for 2 years wants to know the chances of surviving another year, divide the 3-year survival rate by the 2-year survival rate. This gives the estimated fraction, of those who survived the first 2 years, who will make it through another year.
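
The division described above, as a short sketch. The curve values are hypothetical numbers chosen for illustration, not data from any study, and the helper name is an assumption.

```python
def conditional_survival(curve, survived_to, target):
    """Chance of reaching `target` given survival to `survived_to`.

    curve maps time (years) to cumulative survival probability.
    """
    if target < survived_to:
        raise ValueError("target must not precede survived_to")
    return curve[target] / curve[survived_to]


# Hypothetical post-infarction survival curve (illustrative numbers only).
curve = {0: 1.00, 1: 0.85, 2: 0.80, 3: 0.76}

p = conditional_survival(curve, survived_to=2, target=3)
print(f"chance of surviving year 3 given 2 years survived: {p:.2f}")
```

With these made-up heights, 0.76 / 0.80 gives a 95% chance of getting through the third year.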

Survival Curves and their Interpretation From the blurry curve below, can you determine roughly the chance that someone who has already survived for three years will survive for two more?

Survival Curves and their Interpretation Reiterating a previous point, survival analysis is applied to the development of any irreversible outcome, not just mortality. It is also frequently applied to the first occurrence of a reversible outcome as well. Survival curves are sometimes plotted with a logarithmic vertical scale, especially when the mortality rate is roughly constant. In that case the survival curves look like straight lines. Watch the scale or you can be badly misled.
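
Why a roughly constant mortality rate gives a straight line on a logarithmic scale can be seen in one step. The symbol $\lambda$ for the constant rate is introduced here for illustration; it does not appear on the slides.

```latex
S(t) = e^{-\lambda t}
\quad\Longrightarrow\quad
\log S(t) = -\lambda t
```

The logarithm of survival falls linearly in $t$ with slope $-\lambda$, so a steeper line means a higher mortality rate.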

Survival Curves and their Interpretation In interpreting survival curves, the choice of starting point is critical as well as the shapes of the curves. For instance, if you evaluate a screening program by starting at time of diagnosis, and compare survival from diagnosis of a screened and unscreened group, then screening will always look good. Why?

Survival Curves and their Interpretation ...screening will always look good. Why? Because survival for the screened group is being measured from an earlier point in the disease process than for an unscreened group. This is called "lead-time bias," a measurement bias. It may be that an apparent survival advantage in the screened group simply reflects the extent by which screening moved up the date of diagnosis of the disease, rather than any impact of early detection and treatment on true survival. Beware this trap!
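
A toy timeline makes the trap concrete. All numbers are invented, and the sketch deliberately assumes that early detection changes nothing about the date of death.

```python
# Hypothetical patient timeline in years since disease onset
# (all numbers invented for illustration).
death_at = 10.0
symptomatic_diagnosis_at = 6.0  # diagnosed when symptoms appear
screen_diagnosis_at = 4.0       # screening finds the disease 2 years earlier

# Assumption: early detection has no effect at all on the date of death.
survival_unscreened = death_at - symptomatic_diagnosis_at
survival_screened = death_at - screen_diagnosis_at
lead_time = symptomatic_diagnosis_at - screen_diagnosis_at

print(f"survival from diagnosis, unscreened: {survival_unscreened} years")
print(f"survival from diagnosis, screened:   {survival_screened} years")
print(f"apparent gain: {survival_screened - survival_unscreened} years "
      f"(lead time: {lead_time} years)")
```

The apparent gain in survival equals the lead time exactly, even though the patient dies on the same day in both scenarios.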

Survival Curves and their Interpretation The figure above compares three survival curves, but gives no indication of how reliable these curves are. They might be from large samples or very small samples, and be statistically very stable or highly variable. We can't tell.

Survival Curves and their Interpretation The graph to the right is more informative. With each curve, at the end of each interval, is the number who survived the interval without a recurrence ($O_x - D_x - W_x$), shown as a fraction of the number ($O_x$) who reached the start of the interval without a recurrence. We see from the curves that they are based on only a few patients. Specifically, we see that even though things look encouraging after two years, there is very little information in these data about that period of time.

Survival Curves and their Interpretation This plot gives information about variability in a different form, by using standard error bars for each survival rate. Just like a mean, a proportion, or any other statistic, a survival rate has a standard error that reflects how variable the statistic is from sample to sample under the same conditions.

Survival Curves and their Interpretation The standard error bars give us more direct information than the sample sizes as to how precisely the survival rate at each time is estimated by the given set of data. The error bars below show the survival rates are quite imprecise.
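
The slides do not say how the error bars were computed. One standard choice is Greenwood's formula, sketched below as an assumption; the risk sets are those of the 15-patient Kaplan-Meier example from the earlier slides, not the data behind the plotted curves.

```python
import math


def greenwood_standard_error(steps):
    """Product-limit survival and its Greenwood standard error.

    steps: list of (at_risk, deaths) at each death time so far.
    Greenwood's formula: SE = S * sqrt(sum of d / (n * (n - d))).
    """
    survival = 1.0
    variance_sum = 0.0
    for at_risk, deaths in steps:
        survival *= 1 - deaths / at_risk
        variance_sum += deaths / (at_risk * (at_risk - deaths))
    return survival, survival * math.sqrt(variance_sum)


# 2 deaths among 10 at risk, then 1 death among 6 at risk.
survival, se = greenwood_standard_error([(10, 2), (6, 1)])
print(f"S = {survival:.3f}, SE = {se:.3f}")
```

With so few patients at risk, the standard error is large relative to the estimate, which is the kind of imprecision the error bars are meant to expose.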

Survival Curves and their Interpretation The figure below tries to combine the best features of the previous two, by including both the number of individuals observed to survive each interval, and standard error bars for the survival rates plotted at the end of each interval. This makes the figure “busy,” but more informative than the others we have seen.

Survival Curves and their Interpretation The plot below compares survival of lung cancer patients diagnosed during three successive decades. Visually, the increase in long-term survival looks quite noticeable. What special feature of this plot makes the visual impression exaggerate the beneficial trend?

Survival Curves and their Interpretation The literature is also replete with plots of cumulative probabilities of events over time, such as the plot below. These are obtained by the same method as survival plots. The only difference is that, rather than plot the survival probability, the researchers subtract it from one first.