Disease Occurrence II Main Points to be Covered Incidence rates (person-time incidence) “Average” incidence rate –Calculating “average” incidence rate.


1 Disease Occurrence II Main Points to be Covered Incidence rates (person-time incidence) “Average” incidence rate –Calculating “average” incidence rate –Uses of incidence rates Instantaneous incidence (hazard) rate Cumulative incidence and incidence rate: different but related Assumptions of survival and person-time analyses Competing risks

2 Rate versus Risk Two basic measures of the occurrence of new events (disease) –Cumulative incidence=Risk=Probability of event in a given time period –Incidence rate=Rate=events per unit time Last week we discussed the concept of cumulative incidence –Commonly calculated by the Kaplan-Meier method when different follow-up times exist Incidence rate of disease is somewhat less intuitive but is the more fundamental measure

3 The Three Elements in Measures of Disease Incidence E = an event = a disease diagnosis or death N = number of at-risk persons in the population under study T = time period during which the events are observed

4 Measures of Incidence The proportion of individuals who experience the event in a defined time period (E/N during some time T) = cumulative incidence The number of events per amount of person-time observed (E/NT) = incidence rate. –Average incidence rate (“incidence rate”) –Instantaneous incidence rate (“hazard” or “hazard rate”)

5 “Average” Incidence Rates The numerator is the same as for incidence based on a proportion of persons: the number of events (E) The denominator is the sum of the follow-up times for each individual The resulting ratio E/NT is not a proportion and may be greater than 1 Its value depends on the unit of time used

6 How to Calculate an Average Incidence Rate: Obtaining the Denominator Method 1: If have exact entry, censoring, and event times for each person, can sum person-time for each person for denominator Method 2: If no individual data but have the time interval and average population size, can take their product as denominator –Some datasets may only have the average population size at risk

8 Rate: 6/9.583 = 0.626 per person-year = 62.6 per 100 person-years
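
Method 1 (summing each person's follow-up time) can be sketched in Python. The four-person cohort below is hypothetical, since the individual data behind this slide are not reproduced in the transcript; only the totals, 6 events over 9.583 person-years, come from the slide. The function names are illustrative.

```python
def incidence_rate(events, person_time):
    """Average incidence rate: events divided by total person-time at risk."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return events / person_time


def person_time_from_individuals(follow_up):
    """Collapse individual records into (events, total person-time).

    follow_up is a list of (years_observed, had_event) pairs, one per person.
    """
    events = sum(1 for _, had_event in follow_up if had_event)
    person_years = sum(years for years, _ in follow_up)
    return events, person_years


# Hypothetical cohort of 4 people (illustration only, not the slide's data).
example = [(2.0, False), (0.5, True), (1.25, True), (1.0, False)]
ex_events, ex_py = person_time_from_individuals(example)
example_rate = incidence_rate(ex_events, ex_py)  # 2 events / 4.75 person-years

# Totals reported on the slide: 6 events in 9.583 person-years.
slide_rate = incidence_rate(6, 9.583)
print(f"{slide_rate:.3f} per person-year = {100 * slide_rate:.1f} per 100 person-years")
```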

9 Method 2: Using average number of persons at risk during time interval 10 persons at baseline; 1 person at end of 2 years (6 deaths + 3 censored before 2 years = 9 losses) Formula: Average number of persons at risk = (N baseline + N end) / 2 = (10 + 1) / 2 = 5.5 Rate = 6 / (5.5 persons × 2 years) = 6/11 person-years = 0.545 per person-year or 54.5 per 100 person-years OR: 1 person with 2 years of follow-up and 9 with “some” follow-up. Assume each of the 9 contributes half the interval: 1(2) + 9(2)(1/2) = 11 person-years
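
The grouped-data calculation on this slide can be checked with a few lines of Python. This is a minimal sketch; the function name is an assumption, and it relies on the slide's own premise that deaths and censoring are spread evenly over the interval.

```python
def average_population_rate(n_start, n_end, events, interval_years):
    """Rate from grouped data: events / (average population x interval length)."""
    average_at_risk = (n_start + n_end) / 2       # mean of baseline and end counts
    person_years = average_at_risk * interval_years
    return events / person_years, person_years


# Slide values: 10 persons at baseline, 1 at the end of 2 years, 6 deaths.
rate, person_years = average_population_rate(
    n_start=10, n_end=1, events=6, interval_years=2
)
print(f"{person_years} person-years, rate = {rate:.3f} per person-year")
```

Compared with the individual-data result of 0.626 per person-year, the lower grouped estimate suggests that losses in this small example were not evenly spread over the two years.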

10 Average incidence rate based on grouped vs. individual data Szklo and Nieto use incidence rate when based on group data (average population at risk) and incidence density when based on individual data This terminology distinction is not followed by most Average population method assumes uniform occurrence of events and of censoring during the interval (like life table)

11 Incidence rate value depends on the time units used Incidence rate of 8 cases per 100 person-years: Could also report as cases per 100 person-months: (8/100 person-yrs) * (1 yr/12 mos) = 0.67 cases per 100 person-months Or as cases per 100 person-weeks: (8/100 person-yrs) * (1 yr/52 weeks) = 0.15 cases per 100 person-weeks All identify the same incidence rate
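
The unit conversions above amount to dividing the per-person-year rate by the number of smaller units in a year. A short Python sketch (the helper name is illustrative):

```python
def convert_rate(rate_per_person_year, units_per_year):
    """Re-express a per-person-year rate per smaller person-time unit."""
    return rate_per_person_year / units_per_year


rate_per_py = 8 / 100                                    # 8 cases per 100 person-years
per_100_person_months = 100 * convert_rate(rate_per_py, 12)
per_100_person_weeks = 100 * convert_rate(rate_per_py, 52)

print(f"{per_100_person_months:.2f} per 100 person-months")
print(f"{per_100_person_weeks:.2f} per 100 person-weeks")
```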

12 Reporting Average Incidence Rate Person-time concept may seem unfamiliar because often described as “annual rate” or “annual rate per 100,000 persons” or “per 100,000 persons” (i.e., person-time denominator is not made explicit) Example: “The incidence of Pediatric Cardiomyopathy in two regions of the United States” (NEJM, 2003) –467 cases of cardiomyopathy in registry of 38 centers (New England, Southwest) 1996 - 1999 –denominator “population estimates…1990 census with an in- and out-migration algorithm” ages 1 - 18 –“overall annual incidence of 1.13 per 100,000 children” Better to make person-time explicit: “incidence among children was 1.13 per 100,000 person-years”

13 Waiting Time Property of Incidence Rates Waiting time to an event is the reciprocal of the incidence rate (1/rate) –E.g., if the rate is 300 per 100 person-years, the reciprocal is 1 / (300/100 person-years) = (1/3) person-year –Average waiting time between events is 0.33 person-year = 4 person-months

14 Assumptions of Average Incidence Rate Estimation The rate is constant for the time period during which it is calculated –Rates calculated over long time periods may be less meaningful “A” time units of follow-up on “B” persons is the same as “B” time units on “A” persons –E.g. Observing 20 deaths in 200 persons followed for 50 years gives the same incidence rate as 20 deaths in 10,000 persons followed 1 year

15 When is the rate not constant? Event rate may change with follow-up time (e.g. age effect, cumulative exposure effect) –Example from text: risk of bronchitis for 3 smokers followed 30 years is not the same as the risk for 90 smokers followed 1 year. Cumulative effects of exposure. Event rate may change with calendar time (cohort or period effect)

16 Survival changing over calendar time

17 Why Use Average Incidence Rates? 1. To calculate incidence from population-based disease registries, where the persons at risk cannot all be individually followed

18 (1) Calculating a rate from a population-based registry of diagnoses Research question: What is the incidence rate for first diagnoses of breast cancer in Marin County and how does it compare with rates from other counties? Nearly all new breast cancer diagnoses are reported to the SEER cancer registry How to obtain a denominator for a rate?

19 Large Population Incidence Rates “Since the production of stable rates for cancers at most individual sites requires a population of at least one million subjects, the logistic and financial problems of attempting to maintain a constant surveillance system [of everyone in the population] are usually prohibitive.” Breslow and Day, Statistical Methods in Cancer Research Solution: Do surveillance of all the cancer diagnoses and estimate the population denominator to get person-time at risk. To get an incidence rate person-time denominator by the group method requires only an estimate of the average population size during the year (=the population at mid-year).

20 Average Population (Group data) rates versus individual data rates If losses are perfectly uniform, total person-time calculation for the denominator (and thus the rate) is the same whether based on average population size or individual follow-up For large populations the rate will be nearly identical calculated by either method

21 Potential Weakness of Using Census Data Calculating rates from census population data is very useful but caution is required as a full census is only done every 10 years Interim estimates of population change are made by the Census but over 10 years denominators may become inaccurate

22 Invasive Breast Cancer Incidence Rates for Marin County versus Other California, 1995-2000
Year    Marin County    Other California*
1995    162.1           145.9
1996    187.8           145.4
1997    176.6           150.9
1998    176.6           155.9
1999    190.7           157.8
2000    157.5           153.9
Rates per 100,000 person-years
*Excluding 5 Bay Area Counties

23 The estimates of breast cancer incidence (number of new cancers per year) most recently reported for Marin and other areas of the country were based on 1990 census information. Data from Census 2000 have enabled researchers to recalculate rates for Marin. Preliminary results show that revised incidence rates for Marin County based on the 2000 census are substantially lower than the rates calculated using 1990 census information. The discrepancy between using the 1990 and 2000 census data is due to projected population growth differing considerably from actual population growth. Census Denominators for Incidence Rates are Estimates

24 Why Use Average Incidence Rates? 1. To calculate incidence from population-based disease registries 2. To compare disease incidence in a cohort (individual-level data) with rate from the general population OR to compare incidences between 2 or more general populations

25 (2) Comparing a rate from a cohort to the rate in the general population A cohort study of petroleum refinery workers followed up subjects for mortality for 36 years and found 765 deaths. Research question: Was the cohort mortality incidence high, low, or just average for those calendar years? How would you calculate the mortality incidence in the cohort?

26 Example of Using Incidence Rates for Cohort Comparisons Cohort of petrochemical workers –6,588 white male employees of Texas plant –Mortality determined from 1941-1977 –137,745 person-years of follow-up time –765 deaths Overall death rate = 765 / 137,745 person-years = 5.6 per 1000 person-years Question: Is this a high death rate? Austin SG, et al., J Occupat Med, 1983

27 Cohort of petrochemical workers Could calculate KM estimate of cumulative incidence (for 36 years of follow-up), but what is the comparison group? Using the incidence rate, the observed rate can be compared to the rate that would be expected if the rate from a reference population (eg, U.S. population) is applied to the cohort

28 Standardized Mortality Ratio If U.S. death rates for age-sex-race-calendar period groups applied to the cohort, 924 deaths were expected in the cohort versus the 765 observed. Ratio of 765 observed/924 expected = 0.83. This is called a Standardized Mortality Ratio (SMR).

29 Obtaining an expected rate for comparison
Group (Age, sex, race, yrs)   Workers pers-yrs in group   US death rate           Expected N deaths   Observed N deaths
W, M, 40 - 45, 1941-45        1,234                       0.11 / 100 pers.-yrs    1.36                1
W, M, 45 - 50, 1941-45        2,312                       0.15 / 100 pers.-yrs    3.47                3
---etc.---                    ----                        ---
Total                         137,745                                             924                 765
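
The expected-deaths column is each stratum's person-years multiplied by the reference rate, and the SMR is total observed over total expected. A Python sketch follows, using only the two strata shown on the slide; the remaining strata are elided there, so the overall totals (924 expected, 765 observed) are taken as given rather than recomputed. Function names are illustrative.

```python
def expected_deaths(person_years, reference_rate_per_person_year):
    """Deaths expected in a stratum if the reference rate applied to the cohort."""
    return person_years * reference_rate_per_person_year


def smr(observed, expected):
    """Standardized mortality ratio: observed deaths / expected deaths."""
    return observed / expected


# The two strata listed on the slide: (worker person-years, US rate per person-year).
shown_strata = [(1234, 0.11 / 100), (2312, 0.15 / 100)]
per_stratum = [expected_deaths(py, rate) for py, rate in shown_strata]

# Totals over all strata, as reported on the slides.
overall_smr = smr(observed=765, expected=924)

print([round(x, 2) for x in per_stratum])
print(f"SMR = {overall_smr:.2f}")
```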

30 End stage renal disease: example of using both cumulative incidence and incidence rates in the same analysis for different purposes
Cumulative incidence within cohorts defined by age at diagnosis
Age        5 yr    10 yr    15 yr
5 - 9      13%     21%      27%
10 - 14    12%     21%      30%
15 - 19    14%     21%      28%
Standardized mortality ratios in renal disease children compared with national child mortality rates
Therapy began    5 - 9 yrs    10 - 14 yrs    15 - 19 yrs
1963 - 72        236          111            52
1973 - 82        122          71             20
1983 - 92        30           37             19
McDonald et al., NEJM 2004

31 Another example of SMR: Is mortality higher after a fracture? Bliuc et al. JAMA 2009

32 (2b) Comparing hip fracture incidence in different populations [Table not recovered: rates per 100,000 person-years, standardized to the 1990 non-Hispanic white US population]

33 per 1,000 person-years

34 Why Use Average Incidence Rates? 1. To calculate incidence from population-based disease registries 2. To compare disease incidence in a cohort with a rate from the general population OR to compare incidence in 2 or more populations 3. To estimate incidence of outcomes associated with exposures that might change over time in given individuals

35 (3) To estimate incidence of outcomes associated with exposures that might change over time in individuals Research question: In a Medicaid database is there an association between use of non-aspirin non-steroidal anti-inflammatory drugs (NSAID) and coronary artery disease (CAD)? How would you study the relationship between NSAID use and CAD?

36 Calculating stratified average incidence rates in cohorts For persons followed in a cohort some potential risk factors may be fixed but some may be variable –gender is fixed –taking medications or getting regular exercise are behaviors that can change over time Adding up person-time in an exposure category to get a denominator of time at risk is a way to deal with risk factors that change over time

37 Analysis of changing exposure and disease incidence Tennessee Medicaid data base, 1987-1998: are NSAIDs associated with CAD risk? Same person could both use and not use NSAIDs at different times over the 11 years Ray, Lancet, 2002

38 Analysis of changing exposure with average incidence rates Person-time totaled for using and not using NSAIDs; MI or CAD death outcome 181,441 periods of “new” NSAIDS use in 128,002 individuals; 181,441 periods of non-use in 134,642 individuals (matched by age, sex, and calendar date) A person can contribute to the denominator both for use and non-use but only after a 365 day “wash out” period between use and non-use

39 Analysis of changing exposure with average incidence rates
             Person-yrs    CHD      Rate per 1000 pers-yrs
Users        275,565       3,313    12.02
Non-users    257,069       3,049    11.86
Rate ratio = 12.02 / 11.86 = 1.01
Concluded no evidence that NSAIDS reduced risk of CHD events
Ray, Lancet, 2002
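
The rates and the rate ratio on this slide follow directly from the event counts and person-years. A short Python check (the helper name is illustrative):

```python
def rate_per_1000(events, person_years):
    """Incidence rate expressed per 1,000 person-years."""
    return 1000 * events / person_years


users = rate_per_1000(3313, 275565)       # person-time accumulated while using NSAIDs
non_users = rate_per_1000(3049, 257069)   # person-time accumulated while not using
rate_ratio = users / non_users

print(f"users {users:.2f}, non-users {non_users:.2f}, rate ratio {rate_ratio:.2f}")
```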

40 Calculating Rates in STATA
Declare data set survival data:
. stset timevar, fail(failvar)
. strate             gives person-years rate
. strate groupvar    gives rates within groups
Example: Biliary cirrhosis time to death data
. use biliary cirrhosis data, clear
. stset time, fail(d)
. strate
    D     Y         Rate      Lower     Upper
    96    747.04    0.1285    0.1052    0.1570
. strate treat
    Treat      D     Y        Rate     Lower    Upper
    Placebo    49    355.0    0.138    0.104    0.183
    Active     47    392.0    0.120    0.090    0.160

41 Immediate Commands in STATA
STATA can be used like a calculator for various computations without a data set, through what it calls immediate commands.
Example, to calculate the confidence interval around a person-time rate:
. cii #person-time units #events, poisson
E.g. 6 events occur in 10 person-years of follow-up:
. cii 10 6, poisson
95% CI = 0.220 – 1.306
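
For readers without Stata, the interval above can be approximated in plain Python. The sketch below assumes the reported limits are exact Poisson limits (the kind usually attributed to Garwood), which is consistent with the slide's 0.220 to 1.306. It finds them by bisection on the Poisson distribution function; all function names are illustrative.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def bisect_root(func, lo, hi, iterations=200):
    """Root of func on [lo, hi], assuming func changes sign across the interval."""
    sign_lo = func(lo) > 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (func(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_rate_ci(events, person_time, level=0.95):
    """Exact confidence limits for the rate events / person_time."""
    alpha = 1 - level
    if events == 0:
        lower = 0.0
    else:
        # Lower limit: the mean at which P(X >= events) equals alpha / 2.
        lower = bisect_root(
            lambda mu: 1 - poisson_cdf(events - 1, mu) - alpha / 2,
            0.0,
            float(events),
        )
    # Upper limit: the mean at which P(X <= events) equals alpha / 2.
    hi = events + 10.0
    while poisson_cdf(events, hi) > alpha / 2:
        hi *= 2
    upper = bisect_root(
        lambda mu: poisson_cdf(events, mu) - alpha / 2, float(events), hi
    )
    return lower / person_time, upper / person_time


low, high = exact_poisson_rate_ci(events=6, person_time=10)
print(f"rate = {6 / 10:.3f}, 95% CI = {low:.3f} to {high:.3f}")
```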

42 Instantaneous Incidence Rate So far, we have considered the “average” incidence rate for an interval The hazard function h( t) gives the instantaneous potential per unit time for the event to occur, given that the individual has survived up to time t.

43 Hazard Function ( Conditional Failure Rate) Numerator is a conditional probability, of the form: Probability that the event will occur in the time interval between t and t + Δt, given that the survival time, T, is greater than or equal to t.

44 Denominator is time

45 Instantaneous probability of failure (event)
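
The formula that slides 43 to 45 annotate did not survive in this transcript. The standard definition that matches their description (a conditional probability in the numerator, time in the denominator, taken as the interval shrinks) is given below; the notation is the usual textbook one and is assumed rather than copied from the slides.

```latex
h(t) = \lim_{\Delta t \to 0}
       \frac{P\left(t \le T < t + \Delta t \mid T \ge t\right)}{\Delta t}
```

Here T is the survival time, so h(t) is a rate per unit time rather than a probability and can exceed 1.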

46 Properties of Hazard Function

47 Hazard function for mortality in general population [Figure not recovered; x-axis: Years] Information provided here versus an average incidence rate

48 Hazard Function in STATA
Illustrate with this dataset, shown previously for calculating average incidence rate in STATA:
Declare data set survival data:
. stset timevar, fail(failvar)
. strate             gives person-years rate
Example: Biliary cirrhosis time to death data
. use biliary cirrhosis data, clear
. stset time, fail(d)
. strate
    D     Y         Rate      Lower     Upper
    96    747.04    0.1285    0.1052    0.1570
Average incidence rate = 0.1285 deaths per person-year

49 Hazard function in Stata
. sts graph, hazard
K-M survival curve for same data
Average incidence rate = 0.13 deaths per person-year
10 yr cum survival = 0.2375
More information in this plot

50 Hypoglycemia with Oral Antidiabetic Drugs Smoothed hazard estimate of hypoglycaemia over time by drug cohort. h (t) = hazard over time. Vlckova et al. Drug Safety. 32:409-418, 2009. Methods: …Observation time for each patient and incidence rate (IR) per 1000 patient- years of treatment for hypoglycaemia was calculated for each drug cohort. Smoothed hazard estimates were plotted over time...

51 Difference between an Incidence Rate and Cumulative Incidence Rate can be thought of as how likely an event is to happen at any moment in time Cumulative incidence is the result of applying that rate to a defined population for a specified period of time Velocity of a car versus distance traveled in a certain time Average incidence rate is calculated by using data from a time period, but it is just an average. Estimate is meaningful if the rate is constant over time (constant hazard), but difficult to interpret if the rate changes over time.

52 Illustration of Incidence Rate versus Cumulative Incidence The mortality rate in the U.S. population in 2001 was 855 per 100,000 person-years (or 0.855 per 100 person-years) If everyone alive at the beginning of the period were followed for 5 years, the cumulative incidence of death (if the rate held constant) would be 4.2% at 5 years; at 10 years it would be 8.2%.

53 Relationship between Incidence Rate and Cumulative Incidence A constant rate produces an exponential cumulative incidence (or survival) distribution If the constant incidence rate is known, the cumulative incidence/survival function can be derived from it, or vice versa, from $S(t) = e^{-\lambda t}$ and $F(t) = 1 - S(t) = 1 - e^{-\lambda t}$, where S(t) = cumulative survival and F(t) = cumulative incidence; e = 2.71828; λ = rate; t = time units
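
Under the constant-rate assumption, cumulative incidence is 1 minus exp(-rate × time). The Python sketch below applies that relationship to the 2001 U.S. mortality rate quoted on the preceding slide; the function names are illustrative.

```python
import math


def cumulative_survival(rate, t):
    """S(t) under a constant rate: exp(-rate * t)."""
    return math.exp(-rate * t)


def cumulative_incidence(rate, t):
    """F(t) = 1 - S(t) under a constant rate."""
    return 1 - cumulative_survival(rate, t)


us_rate = 855 / 100_000          # deaths per person-year, U.S. 2001
five_year = cumulative_incidence(us_rate, 5)
ten_year = cumulative_incidence(us_rate, 10)

print(f"5 years: {100 * five_year:.1f}%  10 years: {100 * ten_year:.1f}%")
```

The 10-year figure is slightly less than twice the 5-year figure because the rate acts on a shrinking pool of survivors.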

54 Constant Rate Incidence Rate Cumulative incidence

55 Effect of high and low constant incidence rates on cumulative incidence
                         Cumulative Incidence at:
Inc. Rate                1 year    2 years    5 years    20 years
1 per 100 pers.-yrs.     0.0100    0.0198     0.0488     0.1813
25 per 100 pers.-yrs.    0.2212    0.3935     0.7135     0.9933

56 Relationship between hazard function and survival function in the special case of a Constant Hazard Rate $S(t) = e^{-\lambda t}$, where $h(t) = \lambda$

57 From average incidence rate to cumulative survival/incidence #1 Previous example: Biliary cirrhosis time to death data Average incidence rate = 0.1285 deaths per person-year S(t) = exp(-0.1285 deaths/person-yr * 10 yr) = 0.277 = cumulative survival over 10 yrs Cumulative survival (10 yrs) from KM: 0.237
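
The same relationship can be run in reverse to see what constant rate a given cumulative survival would imply. This Python sketch does so for the biliary cirrhosis figures on this slide; the function name is illustrative, and the constant-rate assumption is the slide's own.

```python
import math


def rate_from_survival(survival, t):
    """Constant rate implied by cumulative survival S(t): rate = -ln(S) / t."""
    if not 0 < survival <= 1 or t <= 0:
        raise ValueError("need 0 < survival <= 1 and t > 0")
    return -math.log(survival) / t


# Forward: the average rate of 0.1285 per person-year over 10 years.
model_survival = math.exp(-0.1285 * 10)

# Reverse: the constant rate that would reproduce the KM survival of 0.237.
implied_rate = rate_from_survival(0.237, 10)

print(f"model S(10) = {model_survival:.3f}, implied rate = {implied_rate:.3f} per person-year")
```

The implied rate is higher than the observed average of 0.1285, which is one way to see that the hazard in these data was probably not constant over follow-up.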

58 From average incidence rate to cumulative survival/incidence #2 Breast cancer incidence rate in Marin in 2000: 157.5 cases per 100,000 person-years What is the cumulative incidence of breast cancer, assuming this constant rate, over 20 yrs? S(t) = exp(-157.5 cases/100,000 person-yrs * 20 yrs) = 0.97 survival over 20 yrs F(t) = 1-0.97 = 0.03 incidence over 20 yrs

59 Survival and Hazard Functions

61 Cumulative incidence Incidence rate

62 Competing Risks When studying incidence of repeat cardiac bypass surgery after initial bypass surgery, how to handle deaths that occur before repeat surgery? More generally, how should deaths be handled in any study where the outcome is not death? Options for the deaths: –censor at time of death? Could do this with KM, but as we’ll see in next few slides, this has drawbacks. –exclude these persons entirely? –other?

63 Competing Risks Definition: In a study of incidence of some event of interest, a “competing risk event”: –precludes the occurrence of the event of interest (e.g., death precludes all outcomes) or –substantially impacts incidence of event of interest (e.g., prophylactic ovary removal in study of ovarian cancer) “Competing” because it can occur prior to event of interest Two flavors: –Occurs independent of event of interest (e.g., accidental deaths in study of brain cancer incidence) –Related to event of interest (i.e., its occurrence is influenced by factors which influence event of interest; e.g., death from COPD in study of lung cancer incidence)

64 Why not just censor at time of competing risk events as in Kaplan-Meier calculation? Competing risks events that are independent of event of interest –censoring means you believe that event of interest can occur after death –inferences do not pertain to our world’s reality Competing risks events that are related to event of interest –violates basic assumption of censoring in K-M, which is that it is non-informative

65 Solution for Competing Risk Events Instead of censoring, account for competing risk events explicitly as additional separate outcomes in the analysis Technique called “cumulative incidence approach” or “cumulative incidence function” In successive time intervals: –calculate joint probability of a) experiencing none of the events (either event of interest or any of competing events) up through beginning of interval and b) experiencing event of interest during the time interval –add up these joint probabilities within successive time intervals to get cumulative incidence of the event of interest Explained in Satagopan et al. Brit. J. Cancer 2004 (optional reading)
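
The interval-by-interval calculation described above can be sketched in Python. This is an illustration of the idea under stated assumptions, not the stcompet implementation: the five records are hypothetical, status codes follow the 0 = censored, 1 = event of interest, 2 = competing event convention, and the function names are made up for this example.

```python
def cumulative_incidence_function(records, cause=1):
    """Cumulative incidence of `cause` in the presence of competing events.

    records: list of (time, status); status 0 = censored, `cause` = event of
    interest, any other non-zero value = a competing event.
    Returns a list of (time, cumulative incidence) at each distinct time.
    """
    event_free = 1.0   # probability of no event of any type before this time
    cif = 0.0
    curve = []
    for t in sorted({time for time, _ in records}):
        at_risk = sum(1 for time, _ in records if time >= t)
        d_cause = sum(1 for time, s in records if time == t and s == cause)
        d_any = sum(1 for time, s in records if time == t and s != 0)
        # Joint probability: event-free so far AND event of interest now.
        cif += event_free * d_cause / at_risk
        event_free *= 1 - d_any / at_risk
        curve.append((t, cif))
    return curve


def one_minus_km(records, cause=1):
    """Naive alternative: 1 - Kaplan-Meier with competing events censored."""
    surv = 1.0
    for t in sorted({time for time, _ in records}):
        at_risk = sum(1 for time, _ in records if time >= t)
        d_cause = sum(1 for time, s in records if time == t and s == cause)
        surv *= 1 - d_cause / at_risk
    return 1 - surv


# Hypothetical data: (time, status).
hypothetical = [(1, 1), (2, 2), (3, 0), (4, 1), (5, 0)]
cif_curve = cumulative_incidence_function(hypothetical)
final_cif = cif_curve[-1][1]
naive = one_minus_km(hypothetical)

print(f"cumulative incidence = {final_cif:.2f}, 1 - KM = {naive:.2f}")
```

In this toy example, censoring the competing event gives a larger estimate than the cumulative incidence approach, because the person who had the competing event is treated as if still able to have the event of interest.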

66 Implementing Cumulative Incidence Approach in Stata
Download user-written file “stcompet” from internet
–command: ssc install stcompet, replace
Within survival analysis dataset, instead of failure variable having only two values (0 = censor; 1=event), make sure there are additional values for each competing event type
–e.g., 0=censor; 1=event of interest; 2=death
stset data, with special attention towards telling Stata which is the event of interest (e.g., if “outcome” is failure variable)
–command: stset time, failure(outcome=1)
Declare competing events & calculate cumulative incidence
–command: stcompet cif=ci, compet1(2)
Tease out event of interest-specific cumulative incidence
–command: gen cuminc1 = cif if outcome==1

67 Implementing Cumulative Incidence Approach in Stata
List the cumulative incidence function for event of interest
–commands: sort time
           list time cuminc1 if outcome==1
Graph the cumulative incidence function for event of interest
–command: twoway line cuminc1 _t, connect(J) sort ytitle(Cumulative incidence) xtitle(Time)
For details
–command: help stcompet

68 Summary Points Incidence rate (or density) –E/NT –Not a proportion, time in denominator Average incidence rate can be calculated with individual or average population data. Allows: –Incidence estimates in large populations that are not fully enumerated –Comparison with population reference rates –Accumulation of time at risk for different exposure strata Instantaneous incidence (hazard) rate –Hazard function – insight into changes in rate during follow-up –Basis for proportional hazards models Assumptions for average incidence rate –Uniform rate during time period; Censored would have same experience as those remaining –Special situation of Competing Risks

