Measures of disease occurrence October Epidemiology 511 W. A. Kukull
Defining disease (health events) What disease features do cases have in common? What disease features make cases different from non-cases? How can we observe disease features? –Interview –Exam –Lab test or autopsy
Observing onset Clinical diagnosis: hx, signs, symptoms Pathological diagnosis: examination of biological specimens, e.g., biopsy, labs Insidious onset Abrupt onset Recurrent: many “onsets” possible Persistent/Chronic
Defining a Population What characteristics do members of the chosen population have? How are member characteristics different from non-members? –Geography: residents of County –Individual features: 75 – 79 y.o. men –Time period:
Population and time Closed population: once defined, no new persons may enter. –Disease occurrence and death reduce pool –Airline passengers on a non-stop Open population: new members may be added, loss may occur –Non-diseased persons may be lost –Boeing machinists employed
Who is “at-risk” ? Susceptible: the probability you could get the disease is NOT zero. –Does not mean you are especially likely to get the disease, or suffer the health event. Non-Susceptible: the probability you could get the disease IS Zero. –Persons who have had their appendix removed are non-susceptible to future appendicitis
Goals Define disease (or health event) Define population Find all cases in the population –Existing cases –New cases Create measures of case frequency per population
Counts: “Numerator data” Number of people with the disease “ We report 5 cases of Parkinson’s disease in year olds” Numerator data: often hard to interpret without knowing the size of the population giving rise to the cases –Very rare or unusual occurrences
Cases per year
Problems determining disease Diagnostic criteria Poor recognition Survey errors –respondents –interviewers Hospital data not meant for research
Creating a frequency measure: Critical questions Count cases in relation to the population at-risk (per time) If each of the cases had not developed disease, would they have been in the population (denominator)? If each of the non-cases in the population had developed disease would they have been included as a case? The answers should be “yes”
Mortality for selected causes per 100,000 population (hypothetical data)
Prevalence How common is the disease today? EXISTING CASES at a specified time / persons in defined population at that time “47% of persons over 85 years old, in East Boston were demented, in 1990.” A “snapshot” view of the disease at a single point in time (a.k.a. point prevalence) NOT a measure of risk and NOT a Rate
Incidence: counting the new cases that occur with time Cumulative Incidence (a “risk”) –NEW CASES / initial pop-at-risk –“The incidence of nasal papilloma in Seattle was 6 per million population in 1984” Incidence rate (a “rate”) –NEW CASES / at-risk time –“Stroke incidence is 5 per 100,000 person-years”
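The two incidence measures on this slide differ only in their denominators. The following is a minimal sketch using a made-up five-person closed cohort followed for up to 5 years; the data and the function names are illustrative assumptions, not part of the lecture.

```python
# Hypothetical closed cohort: (years at risk, developed disease?)
# A person stops contributing at-risk time at disease onset or at study end.
follow_up = [
    (5.0, False),
    (2.5, True),
    (5.0, False),
    (1.0, True),
    (5.0, False),
]

def cumulative_incidence(records):
    """New cases / initial population at risk (a proportion, i.e. a risk)."""
    new_cases = sum(1 for _, is_case in records if is_case)
    return new_cases / len(records)

def incidence_rate(records):
    """New cases / total person-time at risk (a rate, per person-year)."""
    new_cases = sum(1 for _, is_case in records if is_case)
    person_years = sum(years for years, _ in records)
    return new_cases / person_years

ci = cumulative_incidence(follow_up)
ir = incidence_rate(follow_up)
print(f"Cumulative incidence over 5 years: {ci:.2f}")
print(f"Incidence rate: {ir:.3f} per person-year")
```

Here 2 new cases among 5 people at risk give the cumulative incidence, while the same 2 cases over 18.5 person-years give the incidence rate.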
Prevalent case bias Longer disease duration increases chance of selection [Figure: “Time” axis; “Cross-sectional Sample”]
Mortality: an incidence-like measure [Deaths from disease X in 19xx] divided by [midyear population] “The annual CHD mortality rate dropped from 370 per 100,000 in 1968 to 270 per 100,000 in 1975” Risk of dying from disease X, during the time interval, for someone in the population
Disease Frequency Relationships P = I * D –prevalence = incidence times average duration of the diseased state –Robust when I and D are stable and P is <10% M = I * C – Mortality = incidence times Case Fatality Rate –this holds when I and C are approximately stable over time
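As a quick numeric check of the two relationships above, here is a small sketch with invented inputs (5 new cases per 1,000 person-years, a mean duration of 4 years, and a case fatality of 20%). The function names are hypothetical, and the results are approximations that assume stable incidence, duration and case fatality.

```python
def prevalence_from_incidence(incidence_rate, mean_duration_years):
    """P = I * D; a steady-state approximation, reasonable when P is below about 10%."""
    return incidence_rate * mean_duration_years

def mortality_from_incidence(incidence_rate, case_fatality):
    """M = I * C; assumes incidence and case fatality are roughly stable over time."""
    return incidence_rate * case_fatality

# Hypothetical inputs (not from the lecture)
i = 5 / 1000          # new cases per person-year
d = 4.0               # average duration of the diseased state, in years
c = 0.20              # case fatality (proportion of cases who die of the disease)

p = prevalence_from_incidence(i, d)
m = mortality_from_incidence(i, c)
print(f"Prevalence: {p * 100:.1f}%")
print(f"Mortality: {m * 1000:.1f} per 1,000 person-years")
```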
Example: Prevalence, incidence and duration Where is disease risk highest?
Comparing measures (“Rate” used in a broad sense) Crude Rates –overall, summary rate for a population or comparison group –may differ between populations due to other factors, e.g., age distribution –usually not used for inter-population comparisons Specific Rates: can “always” be compared
Standardized Rates Alternative to Crude rate when a single summary rate is needed for comparison –example: when age distributions are different and disease is age related –“fictitious” summary rates are computed, reflecting the state “if the populations had the same age distributions”
Example (Direct) (After Jekel, Katz & Elmore, 2001) Age Population Size: “A” Age-Specific rate Expected number Population size “B” Age-Specific rate Expected number Young 1, , Middle aged 5, , Older 4, , Total 10, , CR= ,000 = 4.51% CR= ,000 = 3.08%
Example (Direct) Standard Population Age Population Size: A+B Age-Specific rate: “A” Expected number Population size A+B Age-Specific rate: “B” Expected number Young 5, , Middle aged 10, , Older 5, , Total 20, , Standardized rate = ,000 = 3.03% Standardized rate = ,000 = 6.05%
Direct Standardization (there will be an exercise in homework) Choose a “standard population” Multiply (age)-specific rates from pop#1 by standard pop age groups; repeat for pop#2 Sum the pop#1 numbers and divide by total standard population; repeat for pop #2 Compare! This adjusts for the confounding effect of age
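The steps above can be sketched in a few lines. All numbers below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the values from the Jekel, Katz & Elmore example, and the function name is an assumption for illustration.

```python
# Hypothetical age-specific rates (events per person) for two populations
rates_pop1 = {"young": 0.001, "middle": 0.005, "older": 0.020}
rates_pop2 = {"young": 0.002, "middle": 0.004, "older": 0.015}

# Hypothetical standard population (persons in each age group)
standard_pop = {"young": 5000, "middle": 10000, "older": 5000}

def directly_standardized_rate(specific_rates, standard):
    """Apply a population's age-specific rates to the standard age groups,
    sum the expected events, and divide by the total standard population."""
    expected = {age: specific_rates[age] * n for age, n in standard.items()}
    return sum(expected.values()) / sum(standard.values())

dsr1 = directly_standardized_rate(rates_pop1, standard_pop)
dsr2 = directly_standardized_rate(rates_pop2, standard_pop)
print(f"Pop #1 standardized rate: {dsr1 * 1000:.2f} per 1,000")
print(f"Pop #2 standardized rate: {dsr2 * 1000:.2f} per 1,000")
```

Because both summary rates use the same standard age distribution, any remaining difference between them cannot be due to age structure.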
Indirect Standardization An alternative method of standardization –use when you know the total deaths and your age distribution but not your age-specific rates Apply (age)-specific rates from a standard population to your age distribution to compute “expected” deaths [Observed deaths] / [expected deaths] * 100 = SMR (standardized mortality ratio)
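A minimal sketch of the SMR calculation follows. The standard rates, the study population's age distribution and the observed death count are all invented for illustration, and the function names are hypothetical.

```python
# Hypothetical age-specific death rates from a standard population (per person)
standard_rates = {"young": 0.001, "middle": 0.005, "older": 0.020}

# Hypothetical study population: only its age distribution and total deaths are known
study_pop = {"young": 2000, "middle": 3000, "older": 1000}
observed_deaths = 48

def expected_deaths(age_distribution, rates):
    """Deaths expected if the study population experienced the standard rates."""
    return sum(rates[age] * n for age, n in age_distribution.items())

def smr(observed, expected):
    """Standardized mortality ratio: observed / expected * 100."""
    return observed / expected * 100

expected = expected_deaths(study_pop, standard_rates)
ratio = smr(observed_deaths, expected)
print(f"Expected deaths: {expected:.1f}")
print(f"SMR: {ratio:.0f}")
```

An SMR above 100 means more deaths were observed than the standard rates would predict for a population with this age distribution.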
Direct and Indirect Standardization
Summary rates Magnitude depends on choice of standard population Give “what if” comparison between groups Specific rates are usually preferable (and are comparable)
Proportional Mortality [# Deaths from a specific cause] divided by [all cause deaths] for a given time period Example: The proportion of all deaths (in NYC males 15-25) that were due to homicide in 1998 This is neither a risk nor a rate; the denominator is all deaths.
Proportions of all death due to specific causes (hypothetical data)
Proportionate Mortality Ratio PMR= [observed deaths in population A] / [expected deaths based on the proportion in the population B] Sometimes seen in occupational studies
Proportional mortality and PMR Often used when you don’t know the number of persons in the population Frequently used in Occupational Studies Can be Misleading –if all cause death rate differs; cause specific rates can differ greatly but proportionate mortality may stay the same
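To see why proportional measures can mislead, consider a sketch with two invented populations in which the cause-specific death rate in B is twice that in A, but the all-cause rate is also twice as high. All figures and names below are hypothetical. The PMR is computed as observed deaths divided by the deaths expected from the reference proportion.

```python
def proportional_mortality(cause_deaths, all_deaths):
    """Share of all deaths due to one cause (not a risk and not a rate)."""
    return cause_deaths / all_deaths

def pmr(observed_cause_deaths, all_deaths, reference_proportion):
    """Observed cause deaths / deaths expected from the reference proportion."""
    expected = all_deaths * reference_proportion
    return observed_cause_deaths / expected

# Hypothetical deaths per 100,000 persons in one year
pop_a = {"cause_x": 50, "all_causes": 500}
pop_b = {"cause_x": 100, "all_causes": 1000}

pm_a = proportional_mortality(pop_a["cause_x"], pop_a["all_causes"])
pm_b = proportional_mortality(pop_b["cause_x"], pop_b["all_causes"])
pmr_a_vs_b = pmr(pop_a["cause_x"], pop_a["all_causes"], pm_b)
rate_ratio_b_vs_a = pop_b["cause_x"] / pop_a["cause_x"]

print(f"Proportional mortality: A = {pm_a:.0%}, B = {pm_b:.0%}")
print(f"PMR (A relative to B): {pmr_a_vs_b:.2f}")
print(f"Cause-specific rate ratio (B / A): {rate_ratio_b_vs_a:.1f}")
```

The proportions are identical and the PMR shows no excess, even though the cause-specific death rate in B is double that in A.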
PMR In Bantu laborers in South Africa, 91% of cancer deaths were due to liver cancer Usually liver cancer accounts for about 1% of cancer deaths Therefore Bantus have an unusually high liver cancer death rate
Example: Mortality per 100,000 in 19xx (After MacMahon & Trichopoulos) PMR overstated excess of liver Ca in Bantu and did not reveal great difference at other sites
Sources of Morbidity Data Disease registries Insurance Plans State L&I Medicare/ HCFA; VA, armed forces –CDC web sites, MMWR Hospitals Industries, Schools Surveys and specific studies
Sources of Mortality Data US Vital Statistics State Vital Statistics Individual death certificates Disease registries Health maintenance organizations cdc.wonder.gov
Causes of death seen on death certificates (after Gordis) A mother died in infancy Deceased had never been fatally sick Died suddenly, nothing serious Went to bed feeling well, but woke up dead Died suddenly without the aid of a physician Cardio-Respiratory arrest
Rate confusion “Rate,” used loosely, includes: proportions, ratios, risk and instantaneous rate (Δt) Proportions include the numerator in the denominator (e.g., prevalence is a proportion but neither a risk nor a rate) Ratios: numerator and denominator may be different groups, e.g., male/female ratio
Rates and Risks Rate: –denominator in person-time; time must be part of the measure –average population during the observation time Risk: –result of rates that prevailed over a period –denominator: persons at-risk at beginning; a closed population followed over time –time is not a dimension but used descriptively to specify period of observation
Incidence Density and Cumulative Incidence ID = [new cases] / [person-years] –technically the rate CI = [new cases] / [initial pop-at-risk] –the cumulative effect of the ID on pop-at-risk over a specified time period –technically a risk $CI_t = 1 - e^{-ID \cdot t}$ –to estimate the cumulative effect of a rate [ID] on a population after “t” years (units of time)
Example Calculation $CI_t = 1 - e^{-ID \cdot t}$ Where: e = … base of natural logs (or just push the ‘e’ button on your calculator) ID = incidence density rate (= 124.7 per 1000) t = years of observation (2, 5, 10 or 20) So, for t = 2, e is raised to the “power” [-(0.1247)(2)] Then the result is subtracted from 1 to yield CI
Example: Constant mortality rate (ID) of 124.7 per 1000 person-years. What is the cumulative risk (CI) at 2, 5, 10 and 20 years? [$CI_t = 1 - e^{-ID \cdot t}$]
Number of Years    Cumulative Risk of Death
2                  22%
5                  46%
10                 71%
20                 92%
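The worked example can be checked with a short script. This is a sketch that assumes the rate of 124.7 per 1000 person-years stays constant over the whole period; the function name is hypothetical.

```python
import math

def cumulative_incidence_from_rate(rate_per_person_year, years):
    """CI_t = 1 - e^(-ID * t), assuming a constant incidence density ID."""
    return 1 - math.exp(-rate_per_person_year * years)

incidence_density = 124.7 / 1000   # deaths per person-year

risks = {t: cumulative_incidence_from_rate(incidence_density, t)
         for t in (2, 5, 10, 20)}

for t, risk in risks.items():
    print(f"{t:>2} years: cumulative risk = {risk:.0%}")
```

The risk does not simply scale with time (20 years is not ten times the 2-year risk) because the pool of people still at risk shrinks as deaths occur.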