

Research Designs Commonly Used In Epidemiology

One of the basic concepts in research designs that try to discern cause is that our selection of subjects and our selection of dependent and independent variables must be consistent with the biological basis of disease. In 1965 A. Bradford Hill proposed a set of 9 criteria for concluding “cause”. They include:
1. Strength of association
2. Consistency
3. Specificity
4. Temporal relationship
5. Dose effect
6. Experimental evidence
7. Biological plausibility

In the previous example 200 births were randomly selected from the population, producing the following results (a bar denotes absence: $\bar{E}$ is unexposed, $\bar{D}$ is disease-free):

Joint probability: $P(D \,\&\, E) = 0.035$
Marginal probability: $P(D) = 0.07$
Conditional probability: $P(D \mid E) = 0.119$

$$RR = \frac{P(D \mid E)}{P(D \mid \bar{E})} = 2.39$$

$$OR = \frac{P(D \mid E)}{P(\bar{D} \mid E)} \div \frac{P(D \mid \bar{E})}{P(\bar{D} \mid \bar{E})} = 2.58$$

$$AR = \frac{P(D) - P(D \mid \bar{E})}{P(D)} = 0.29$$

$$ER = P(D \mid E) - P(D \mid \bar{E}) = 0.069$$

Because all births in the population had an equal chance of being selected, joint, marginal, and conditional probabilities can be estimated with reasonable confidence.

The type of research design illustrated by the previous examples is a population-based design (often called a cross-sectional study). This design is very useful for obtaining estimates of the different probabilities discussed and has a very simple procedure.
1. Obtain a simple random sample of size n from the study population
2. Measure the presence or absence of both D and E for all of the sampled individuals
3. Calculate away...
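The "calculate away" step for a population-based sample can be sketched in a few lines of Python. This is only an illustration: the function name `risk_profile` is made up here, and the four cell counts are an assumption, since the slides report probabilities rather than counts. The counts were picked so that a sample of 200 births gives the joint, marginal, and conditional probabilities reported for the earlier example.

```python
def risk_profile(a, b, c, d):
    """Risk measures from a 2x2 table of counts from a simple random sample.

    a = D and E, b = not-D and E, c = D and not-E, d = not-D and not-E.
    """
    n = a + b + c + d
    p_d_and_e = a / n                 # joint probability
    p_d = (a + c) / n                 # marginal probability
    p_d_given_e = a / (a + b)         # conditional probability, exposed
    p_d_given_not_e = c / (c + d)     # conditional probability, unexposed
    odds_exposed = p_d_given_e / (1 - p_d_given_e)
    odds_unexposed = p_d_given_not_e / (1 - p_d_given_not_e)
    return {
        "P(D & E)": p_d_and_e,
        "P(D)": p_d,
        "P(D | E)": p_d_given_e,
        "RR": p_d_given_e / p_d_given_not_e,
        "OR": odds_exposed / odds_unexposed,
        "AR": (p_d - p_d_given_not_e) / p_d,
        "ER": p_d_given_e - p_d_given_not_e,
    }


# ASSUMED counts (not listed in the slides), chosen to reproduce the
# reported probabilities for a random sample of 200 births.
profile = risk_profile(7, 52, 7, 134)
for name, value in profile.items():
    print(f"{name}: {value:.3f}")
```

Because the sample is a simple random sample of the whole population, every quantity in the dictionary is a legitimate estimate; the designs that follow give up some of them.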

When the population incidence of E (the hypothesized risk factor for the disease in question) is pretty low, population-based designs may not yield enough people with E to calculate risk profiles associating E and D. One way to get around this is to conduct a cohort study. This type of research design is primarily concerned with defining two sub-populations based on E (or level of E) within the target population, creating two (or more) cohorts: sub-groups with one defining common characteristic.

Basic Procedures:
1. Identify two subgroups of the population based on the presence or absence of E (E and $\bar{E}$)
2. Take a separate random sample of equal n from each of the two subgroups
3. Measure the presence or absence of D in both random samples
4. Calculate away...

An example of a cohort study might entail defining our E and $\bar{E}$ as unmarried and married, respectively, and then randomly selecting 100 birth records for each category of exposure. We then determine the presence or absence of our variable D for each record, possibly producing the following data set:

| | Low birthweight | Normal birthweight | Total |
|---|---|---|---|
| Unmarried | 12 | 88 | 100 |
| Married | 5 | 95 | 100 |
| Total | 17 | 183 | 200 |

Because the numbers of “married” and “unmarried” birth records are equal (and this is certainly NOT representative of the population), neither joint probabilities nor marginal probabilities for the population can be estimated from this sample. Only conditional probabilities can be estimated, because only associations between D or $\bar{D}$ and E or $\bar{E}$ will be representative of the population. Because AR includes a marginal probability [P(D), which depends on P(E)], it cannot be calculated either. Only RR, OR, and ER can be calculated with this type of design.

So, using the data obtained from a COHORT design, we can calculate the following sample statistics (and if I weren't so lazy I would also have calculated confidence intervals for each):

$$RR = \frac{P(D \mid E)}{P(D \mid \bar{E})} = \frac{12/100}{5/100} = 2.40$$

$$OR = \frac{P(D \mid E)}{P(\bar{D} \mid E)} \div \frac{P(D \mid \bar{E})}{P(\bar{D} \mid \bar{E})} = \frac{(12/100)/(88/100)}{(5/100)/(95/100)} = 2.59$$

$$ER = P(D \mid E) - P(D \mid \bar{E}) = (12/100) - (5/100) = 0.070$$
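The confidence intervals skipped above can be sketched as follows. The slides do not say which interval method the author had in mind, so this is an assumption: the usual large-sample (Wald) intervals, computed on the log scale for RR and OR and directly for ER. The function name `cohort_estimates` is made up for this illustration.

```python
import math


def cohort_estimates(d_exp, n_exp, d_unexp, n_unexp, z=1.96):
    """Point estimates and approximate 95% Wald intervals for a cohort 2x2 table.

    d_exp / n_exp: diseased and total among the exposed sample.
    d_unexp / n_unexp: diseased and total among the unexposed sample.
    """
    p1 = d_exp / n_exp          # P(D | E)
    p0 = d_unexp / n_unexp      # P(D | not-E)

    rr = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    er = p1 - p0

    # Large-sample standard errors (log scale for the two ratios).
    se_log_rr = math.sqrt(1 / d_exp - 1 / n_exp + 1 / d_unexp - 1 / n_unexp)
    se_log_or = math.sqrt(
        1 / d_exp + 1 / (n_exp - d_exp) + 1 / d_unexp + 1 / (n_unexp - d_unexp)
    )
    se_er = math.sqrt(p1 * (1 - p1) / n_exp + p0 * (1 - p0) / n_unexp)

    def ratio_interval(estimate, se_log):
        return (estimate * math.exp(-z * se_log), estimate * math.exp(z * se_log))

    return {
        "RR": (rr, ratio_interval(rr, se_log_rr)),
        "OR": (odds_ratio, ratio_interval(odds_ratio, se_log_or)),
        "ER": (er, (er - z * se_er, er + z * se_er)),
    }


# Cohort example from the slides: 12/100 unmarried and 5/100 married low-birthweight births.
results = cohort_estimates(12, 100, 5, 100)
for name, (estimate, (low, high)) in results.items():
    print(f"{name} = {estimate:.2f}  (95% CI {low:.2f} to {high:.2f})")
```

With only 100 records per cohort, the intervals for RR and OR both include 1, which is a reminder that the point estimates alone say little about how confident we can be.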

When the incidence of D in the population is pretty low, population-based designs and E-based cohort designs may not yield enough people with D to calculate risk profiles associating D and E. One way to get around this is to conduct a case-control study. This type of research design defines two sub-populations based on D (D and $\bar{D}$) within the target population.

Basic Procedures:
1. Identify two subgroups of the population based on the presence or absence of D (D and $\bar{D}$)
2. Take a separate random sample of equal n from each of the two subgroups
3. Measure the presence or absence of E in both random samples
4. Calculate away...

An example of a case-control study using the same birth data would entail defining our D as low birthweight and $\bar{D}$ as normal birthweight and then randomly selecting 100 birth records from each category of disease. We would then measure our E (unmarried and married) for each record in both of the random samples, possibly producing the following data set:

| | Low birthweight | Normal birthweight | Total |
|---|---|---|---|
| Unmarried | 50 | 28 | 78 |
| Married | 50 | 72 | 122 |
| Total | 100 | 100 | 200 |

Because the numbers of “low” and “normal” birth records are equal (and this is certainly NOT representative of the population), neither joint probabilities nor marginal probabilities for the population can be estimated from this sample. Only conditional probabilities can be estimated, because only associations between D or $\bar{D}$ and E or $\bar{E}$ will be representative of the population. In addition, only those associations which condition on disease outcome can be estimated, leaving OR as the only (accurate) calculation possible. This is because the frequency of D in the sampling process was decided by the investigator, so any P of D, regardless of the conditioned variable, will not be relevant to the population.
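A small calculation can make the point about OR concrete. The sketch below takes a population 2x2 table, works out the table one would expect on average when 100 cases and 100 controls are sampled from it, and compares the two. The population counts are an assumption chosen to match the probabilities quoted for the earlier population-based example; the helper names are made up for this illustration.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio; a = D&E, b = notD&E, c = D&notE, d = notD&notE."""
    return (a * d) / (b * c)


def naive_rr(a, b, c, d):
    """'RR' computed as if the table rows were representative cohorts."""
    return (a / (a + b)) / (c / (c + d))


def expected_case_control(population, n_cases, n_controls):
    """Expected cell counts when cases and controls are sampled separately."""
    a, b, c, d = population
    cases_total = a + c
    controls_total = b + d
    return (
        n_cases * a / cases_total,        # exposed cases
        n_controls * b / controls_total,  # exposed controls
        n_cases * c / cases_total,        # unexposed cases
        n_controls * d / controls_total,  # unexposed controls
    )


# ASSUMED population-representative counts (not given in the slides).
population = (7, 52, 7, 134)
sampled = expected_case_control(population, n_cases=100, n_controls=100)

print("OR, population table:   ", round(odds_ratio(*population), 2))
print("OR, case-control table: ", round(odds_ratio(*sampled), 2))
print("RR, population table:   ", round(naive_rr(*population), 2))
print("RR, case-control table: ", round(naive_rr(*sampled), 2))
```

Under these assumed counts the odds ratio is identical in both tables, while the "RR" computed from the case-control table is pulled toward 1 because the investigator fixed the share of cases at one half.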

So, we can calculate the OR:

$$OR = \frac{(50/100)/(50/100)}{(28/100)/(72/100)} = 2.57$$

As mentioned previously, if D is rare in both exposed and unexposed populations (you know whether this is true from population-based studies), then OR is a close approximation of RR. With some clever rearrangements of the formulas (and using Bayes' formula twice), AR can be defined as a conditioned function which incorporates RR; therefore we can still get a close estimate of AR by using the OR in place of RR (rearrangements of formulas not shown). Thus it is still possible to calculate (within defined confidence limits) estimates of the OR, RR, and AR parameters with a simple case-control design. (Estimates of RR and AR will not be quite as accurate as with population-based designs, but that is just a limitation of this type of design.)
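One standard rearrangement, which may or may not be the one the author had in mind, writes AR in terms of quantities that condition on disease: $AR = P(E \mid D)\,(RR - 1)/RR$. Since $P(E \mid D)$ can be estimated from the cases and OR stands in for RR under the rarity assumption, a rough AR follows directly. The sketch below applies this to the case-control counts above; the function names are made up for this illustration.

```python
def case_control_or(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds among cases divided by exposure odds among controls."""
    return (exposed_cases / unexposed_cases) / (exposed_controls / unexposed_controls)


def approx_attributable_risk(exposed_cases, unexposed_cases,
                             exposed_controls, unexposed_controls):
    """AR estimated as P(E | D) * (OR - 1) / OR, with OR standing in for RR.

    ASSUMPTION: D is rare enough that OR is close to RR.
    """
    p_e_given_d = exposed_cases / (exposed_cases + unexposed_cases)
    or_estimate = case_control_or(
        exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    )
    return p_e_given_d * (or_estimate - 1) / or_estimate


# Case-control example from the slides: 50 of 100 cases and 28 of 100 controls unmarried.
or_hat = case_control_or(50, 50, 28, 72)
ar_hat = approx_attributable_risk(50, 50, 28, 72)
print(f"OR = {or_hat:.2f}, approximate AR = {ar_hat:.2f}")
```

The result lands near the AR of 0.29 reported for the population-based example, though a little higher, which fits the remark that these estimates are not quite as accurate.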

Although the simple case-control design has limitations in terms of estimating the population's risk profiles, one can avoid the major ones by being very clever about selecting controls. Probably the most important limitation is that the majority of the diseases we are concerned about (breast cancer, lung cancer, CAD, ...) are not really rare, which violates the rarity assumption needed to estimate RR from OR. Yet, in order to have sufficient numbers of people in our studies, case-control designs are by far the most efficient and cost-effective to use. A common characteristic of these diseases is that the interval of risk is long; it takes many years to produce the disease. To put it another way, as the duration of exposure increases, total risk for the disease increases, resulting in a relatively high incidence in older people. This aspect of the disease process can be taken advantage of by carefully selecting controls.

One way to handle this is to perform what is called risk-set sampling or density sampling. For each case, one or more controls are selected. The clever part is that the controls are stratified on the basis of time. (Think of it as a form of time-standardized incidence-exposure rate, similar to the age-standardized mortality rate.) Because of the long duration of the risk period, if the period is divided into small time-frames (for example, every 5 years for a cumulative total of 30 years), then you produce a series of small sub-groups of cases and controls, one for each time period (6 separate case-control groups for this example, each with a separate set of risk profile calculations, just LINKED over time). Notice that this is similar to producing a series of sub-groups with levels of exposure as previously discussed for linear regression analysis (a series of separate risk/disease groups LINKED by level of exposure). This time it is time duration of exposure and not level of exposure. Because the separate case-control groups actually comprise a series of sub-sets or sub-samples from the total sample, this version of a case-control design is called a nested case-control design (there are a variety of variations of this nested design that we won't bother with).
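The control-selection step can be sketched as follows. This is a variation on the 5-year time-frames described above: instead of fixed frames, controls are drawn at each case's own event time from everyone still under follow-up and disease-free at that moment, which is how risk-set sampling is commonly implemented. The cohort records, field names, and function name are all hypothetical.

```python
import random


def risk_set_sample(subjects, controls_per_case=1, seed=0):
    """Draw controls for each case from the subjects still at risk at the case's event time.

    subjects: list of dicts with 'id', 'exit_time' (years of follow-up until
    disease onset or end of observation) and 'is_case'.
    Returns a list of (case_id, event_time, [control_ids]).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    cases = sorted((s for s in subjects if s["is_case"]), key=lambda s: s["exit_time"])
    matched = []
    for case in cases:
        t = case["exit_time"]
        # Risk set: everyone else still being followed after time t.
        # Subjects who become cases later are still eligible as controls now.
        risk_set = [s for s in subjects if s["id"] != case["id"] and s["exit_time"] > t]
        k = min(controls_per_case, len(risk_set))
        controls = rng.sample(risk_set, k)
        matched.append((case["id"], t, [c["id"] for c in controls]))
    return matched


# HYPOTHETICAL cohort followed for up to 30 years (not data from the slides).
cohort = [
    {"id": 1, "exit_time": 4.0, "is_case": True},
    {"id": 2, "exit_time": 30.0, "is_case": False},
    {"id": 3, "exit_time": 12.5, "is_case": True},
    {"id": 4, "exit_time": 9.0, "is_case": False},
    {"id": 5, "exit_time": 22.0, "is_case": True},
    {"id": 6, "exit_time": 30.0, "is_case": False},
    {"id": 7, "exit_time": 18.0, "is_case": False},
    {"id": 8, "exit_time": 27.0, "is_case": True},
]

matched_sets = risk_set_sample(cohort, controls_per_case=2)
for case_id, event_time, control_ids in matched_sets:
    print(f"case {case_id} at year {event_time}: controls {control_ids}")
```

Each matched set is its own small case-control group tied to a point in time, which is the sense in which the groups are linked over time.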

The neat thing about these designs is that within each individual time-frame the incidence of disease is rare, so the rarity assumption is NOT violated and the various components of the risk profiles can be calculated. This risk over time also introduces a new statistical term: Relative Hazard (RH). It uses the same basic calculations as RR, except that time duration of exposure (5 years vs. 10 years vs. 15 years...) is the important concept rather than the level of exposure (10 cigarettes smoked/day vs. 20 vs. 30 vs. ...). As in case-control designs for rare diseases where OR estimates RR, the calculated OR in this version of a nested design estimates RH. This multiple-groups-over-time design allows for a linear regression analysis in order to produce those cool graphs showing associations between P of D and time. With this design, the change in P for each increment of time would represent EH in the same way that each change in P for each level of exposure represents ER.

Variations on this nested case-control concept allow for a variety of sophisticated studies (none of them discussed here) that are capable of controlling for different kinds of population factors that limit the utility of simple case-control designs. As a result, nested case-control designs in their various forms are the dominant research designs in epidemiology. Of course, if you have records of the entire population at your disposal (often possible in countries with single-payer health-care systems), then population-based case-control designs can be used, yielding much more power (power refers to the ability to observe differences or associations with a high degree of confidence). This leads to the last major topic: exactly how do we determine our “confidence” level (as opposed to confidence limits) in “observing” differences or associations? That is the realm of statistical analysis, and of how statistical analysis estimates the probability that your observed associations or differences happened by random chance or by exposure (treatment?).
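As a small preview of that topic, here is a sketch of one common way to put a number on "random chance": a large-sample z-test comparing two proportions, applied to the cohort example (12 of 100 unmarried vs. 5 of 100 married low-birthweight births). The choice of test is an assumption made for illustration; the slides do not name a method.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided large-sample z-test for the difference between two proportions."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under "no association"
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z_stat, p_value = two_proportion_z_test(12, 100, 5, 100)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.3f}")
```

The p-value is the probability of seeing a difference at least this large if exposure made no difference at all; it is not the probability that the association is due to chance.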

Papers for discussion next time (I'm sure you have already read them long ago, so a quick review should suffice):
1. Friedenreich (both of 'em)
2. Thune
3. Dees