Jeffrey E. Korte, PhD BMTRY 747: Foundations of Epidemiology II


1 Lecture 14: Case-control studies: further design considerations and analysis
Jeffrey E. Korte, PhD BMTRY 747: Foundations of Epidemiology II Department of Public Health Sciences Medical University of South Carolina Spring 2015

2 Control groups Population, hospital, neighborhood, friends
If the control group is over-represented with exposed people: the odds ratio will be under-estimated; this may be a problem with hospital controls. Low participation rates: may be a particular problem with population-based control groups or random-digit dialing.
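The direction of this bias can be checked with a small numeric sketch. All counts below are hypothetical and chosen only for illustration; `odds_ratio` is a helper name introduced here, not something from the lecture.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a single 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical cases: 60 exposed, 40 unexposed.
# Hypothetical representative controls: 30 exposed, 70 unexposed.
or_representative = odds_ratio(60, 40, 30, 70)

# Same cases, but controls over-represented with exposed people:
# 45 exposed, 55 unexposed.
or_biased = odds_ratio(60, 40, 45, 55)

print(or_representative, or_biased)
```

With the same cases, raising the proportion of exposed controls pulls the odds ratio toward (or below) the null.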

3 Control groups Recall bias
May be strong in some studies, e.g. among mothers who delivered a baby with a congenital malformation. A good control group might be mothers who delivered babies with other malformations.

4 Control groups Sometimes no single control group is obviously best
Useful to have more than one type of control group analyzed. Tests robustness/validity of the observed association. Could be performed in a single study, or across different studies by different investigators. Example on the next slide.

5 Compare control groups
Estrogen replacement therapy (ERT) and risk of endometrial cancer. Relatively strong associations seen when using controls recruited through random-digit dialing, or several groups of hospital controls. One study showed less of an association, using hospital controls admitted for dilation and curettage (D&C). Which comparison is more valid?

6 Compare control groups
D&C controls might be subject to detection bias similar to that affecting endometrial cancer cases. Using estrogen replacement therapy → higher likelihood of vaginal bleeding, leading to: being diagnosed with endometrial cancer (case), or being recommended to have a D&C (control). Similar selection pressures → more valid measures of association?

7 Compare control groups
To try to settle the issue: Case-control study was performed using three simultaneous sets of controls

8 Compare control groups
Outcome: results strengthened the view that ERT led to endometrial cancer. A dose-response relationship with duration was seen for hospital (GYN) and community controls. Bleeding was associated with estrogen use only among the D&C controls (not among cases or gynecology controls). Conclusion: the D&C control group (but not the case group) was overrepresented with exposed women.

9 WHI RCT Follow-up to this issue: Women’s Health Initiative
Estrogen plus progestin (E+P): 19% decrease in endometrial cancer (progestin cancels harmful effects of unopposed estrogen in the uterus). But: E+P associated with 5-fold increase in women needing endometrial biopsies (E+P causes bleeding). Estrogen-alone arm: no effect on heart disease (contrary to cohort studies); stopped early (March 2004) because of increased risk of stroke.

10 Case-control studies: exclusions
Subjects may be excluded to maximize the validity of comparing cases and controls. Exclusion criteria should be the same for cases and controls (gender, age, medical conditions, etc.). If the disease is in a removable organ (e.g. gallbladder, uterus, appendix, etc.), controls without that organ should be excluded (they were not at risk for the disease). Cases and controls with no likelihood of exposure should be excluded (e.g. males in a study of oral contraceptives [OC]).

11 Case-control studies: exclusions
Cases and controls with no likelihood of exposure should be excluded (e.g. males in a study of OC and bladder cancer). If included: higher sample size may increase precision. But: the reason for being at zero risk of exposure (i.e. being male) may be related to another imperfectly measured risk factor (e.g. cigarette smoking or occupational exposures). Result: residual confounding; or males can be placed in a separate stratum that would provide no useful information (no cases or controls exposed). Conclusion: unwise to include an unexposed subgroup in the analysis (often impossible to know how [or whether] lack of exposure is related to disease).

12 Case-control studies: exclusions
Cases and controls who have had their disease for more than 6 months or a year should be excluded (better to have incident cases and controls). Limit to English speakers? Can translate and back-translate a questionnaire to ensure reliability. Decision affects generalizability; may be dependent on local population characteristics. Questionnaire language can be included as a covariate in the analysis.

13 Analysis of case-control studies: control of confounding
Matching (design phase). Stratification and summary estimates, or multivariate modeling (analysis phase). Mixture: match on some variables, control for others in analysis; or match loosely on some variables, control for them more tightly (and control for other variables) in analysis.

14 Analysis of case-control studies: control of confounding
Analysis: Mantel-Haenszel summary odds ratios can be calculated for stratified 2x2 tables. Alternative for more complex situations: logistic regression. Unmatched data: unconditional logistic regression. Matched data: conditional logistic regression. Both use maximum likelihood estimation to derive coefficients (β) for the intercept and predictor variables.
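A minimal sketch of the Mantel-Haenszel summary odds ratio for stratified 2x2 tables follows. The function name, the table layout (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls) and the stratum counts are assumptions made for this illustration, not data from the lecture.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    strata: iterable of (a, b, c, d) tuples, one per stratum, where
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n    # weighted cross-product for cases exposed
        denominator += b * c / n  # weighted cross-product for controls exposed
    return numerator / denominator

# Two hypothetical strata (e.g. two age groups).
hypothetical_strata = [(10, 20, 5, 40), (30, 15, 20, 25)]
summary_or = mantel_haenszel_or(hypothetical_strata)
print(round(summary_or, 3))
```

With a single stratum the estimate reduces to the ordinary cross-product ratio ad/bc.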

15 Unconditional logistic regression
The model always yields values for the probability of disease between 0 and 1. Multiplicative model: assumes that individual variables multiply the odds of disease by an amount that is the same regardless of the value of other variables. Coefficient is for a one-unit increase in the exposure variable.
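For reference, the model described on this slide can be written in its standard form. The symbols x_1, ..., x_k for the predictors and p for the probability of disease are notation introduced here, not taken from the slides.

```latex
\[
\log\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k,
\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}}
\]
\[
\mathrm{OR}_{\,x_j \to x_j + 1} = e^{\beta_j}
\]
```

The second expression for p always lies strictly between 0 and 1, and the odds ratio for a one-unit increase in x_j does not depend on the other predictors, which is the multiplicative assumption.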

16 Logistic regression: example
Case-control study of ovarian cancer. 62 cases: aged from 6 Connecticut hospitals. 1068 unmatched controls: women aged 45-74, admitted for surgery not related to gynecology. Main risk factors: parity and oral contraceptive (OC) use. Age: potential confounder.

17

18 Alternative: code age categorically

19 ORs for parity and OC use are robust to age coding strategies Model 1
OR for OC use: exp(-0.604) = 0.55. LCL for OC use: exp(-0.604 - (1.96 × 0.623)) = 0.16. UCL for OC use: exp(-0.604 + (1.96 × 0.623)) = 1.9. Log odds of ovarian cancer in a 43-year-old nulliparous woman who used OCs: β0 + (0.016 × 43) + ((-0.604) × 1).
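The arithmetic on this slide can be reproduced with a short sketch. The coefficient (-0.604), standard error (0.623) and age coefficient (0.016) are the values quoted above; the function names are introduced here for illustration, and the intercept β0 is left as an argument because the transcript does not give it.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower limit, upper limit) from a log-odds coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Coefficient and standard error for OC use quoted on the slide.
or_oc, lcl_oc, ucl_oc = odds_ratio_with_ci(-0.604, 0.623)
print(f"OR = {or_oc:.2f}, 95% CI {lcl_oc:.2f} to {ucl_oc:.1f}")

def log_odds(beta0, age, oc_use, beta_age=0.016, beta_oc=-0.604):
    """Log odds for a nulliparous woman, following the slide's formula.

    Assumes the parity terms are zero for nulliparous women;
    beta0 is not given in the transcript and must be supplied.
    """
    return beta0 + beta_age * age + beta_oc * oc_use
```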

20 -2 log likelihood
The “deviance” between nested models is twice the log likelihood ratio comparing the nested model to the full model. “Log likelihoods” can be subtracted and the difference multiplied by 2 to obtain the deviance. Deviance is distributed as a chi-square with degrees of freedom equal to the number of extra variables in the full model.

21 -2 log likelihood Objective method for comparing 2 nested models
SAS provides “-2 log likelihood” for each model: the difference between the two models' -2 log likelihood values is the deviance (the factor of 2 is already included, so no further multiplication is needed). Note: SAS provides -2 log likelihood for “intercept alone” and “intercept plus covariates”. Run PROC LOGISTIC twice for 2 nested models; obtain -2 log likelihood for each model; subtract to obtain the deviance.
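A minimal sketch of the comparison, assuming the full model has exactly one extra variable (1 degree of freedom). The -2 log likelihood values are hypothetical, and `deviance_test_1df` is a name introduced here. For 1 df the chi-square tail probability equals erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math

def deviance_test_1df(neg2ll_reduced, neg2ll_full):
    """Deviance and p-value for two nested models differing by one variable.

    Inputs are the "-2 log likelihood" values reported for each model,
    so the deviance is their plain difference.
    """
    deviance = neg2ll_reduced - neg2ll_full
    if deviance < 0:
        raise ValueError("the full model should not fit worse than the nested model")
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(deviance / 2.0))
    return deviance, p_value

# Hypothetical -2 log likelihood values for a nested and a full model.
deviance, p_value = deviance_test_1df(500.0, 496.16)
print(round(deviance, 2), round(p_value, 3))
```

For more than one extra variable the degrees of freedom change and a general chi-square routine would be needed instead of the 1-df shortcut.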

22 Example 2: matched c-c study
Cases: newly diagnosed prolapsed lumbar intervertebral disc. Individual matching on gender, age, and hospital service or radiologist’s office. 217 cases and 217 controls. One risk factor of interest: driving a motor vehicle.
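For 1:1 matched pairs and a single binary exposure, the conditional maximum likelihood estimate of the odds ratio reduces to the ratio of discordant pairs. The sketch below uses hypothetical pair counts, since the transcript does not include the study's table; the function names are introduced here.

```python
def matched_pair_or(case_only_exposed, control_only_exposed):
    """Odds ratio from discordant pairs in a 1:1 matched case-control study."""
    if control_only_exposed == 0:
        raise ValueError("no pairs with only the control exposed; OR is undefined")
    return case_only_exposed / control_only_exposed

def mcnemar_chi_square(case_only_exposed, control_only_exposed):
    """McNemar chi-square statistic (1 df, no continuity correction)."""
    total_discordant = case_only_exposed + control_only_exposed
    return (case_only_exposed - control_only_exposed) ** 2 / total_discordant

# Hypothetical discordant pairs: 40 with only the case exposed,
# 20 with only the control exposed. Concordant pairs do not contribute.
pair_or = matched_pair_or(40, 20)
pair_chi2 = mcnemar_chi_square(40, 20)
print(pair_or, round(pair_chi2, 2))
```

Adding further covariates, such as place of residence, requires fitting a conditional logistic regression rather than this closed-form ratio.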

23 Example 2: matched c-c study

24 Control for suburban/city residence

25 Conditional logistic regression
Model 1 is nested within model 2. Model 1 constrains the coefficient for place of residence to be equal to 0.

26 Deviance: compare fit of 2 models
Model building can involve many alternatives depending on theoretical and other considerations

27 Effect modification True “full model” will include interaction terms
Maximum flexibility, maximum degrees of freedom used. Both “main effect” terms and the interaction term must be in the model simultaneously, to allow for people with one factor, the other factor, or both (easiest to understand if all are coded 0/1, but the model works the same way regardless of coding).
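A short sketch of how the main-effect and interaction coefficients combine for two factors coded 0/1. The coefficient values are hypothetical and the function name is introduced here for illustration.

```python
import math

def or_vs_reference(x1, x2, b1, b2, b12):
    """Odds ratio relative to people with neither factor (x1 = x2 = 0)."""
    return math.exp(b1 * x1 + b2 * x2 + b12 * x1 * x2)

# Hypothetical coefficients for factor 1, factor 2, and their interaction.
b1, b2, b12 = 0.5, 0.3, 0.4

or_factor1_only = or_vs_reference(1, 0, b1, b2, b12)
or_factor2_only = or_vs_reference(0, 1, b1, b2, b12)
or_both = or_vs_reference(1, 1, b1, b2, b12)

# Without the interaction term the joint OR is just the product of the two.
or_both_no_interaction = or_vs_reference(1, 1, b1, b2, 0.0)
print(or_factor1_only, or_factor2_only, or_both, or_both_no_interaction)
```

The interaction coefficient is what lets the joint odds ratio depart from the product of the two single-factor odds ratios.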

28 Variable coding
Higher order terms (e.g. square)? May allow better fit/prediction if the relationship is curvilinear; definitely will complicate interpretation. Interaction terms? May allow better fit/prediction; definitely will complicate interpretation, especially if you fit multiple interactions.

29 Variable coding
Dummy variables? May improve fit/prediction; may make interpretation easier. Need large enough sample size. Need to choose categories carefully.
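As a sketch of dummy (indicator) coding, the helper below turns a continuous value such as age into 0/1 indicators, with the lowest category as the reference. The cut points are hypothetical and `dummy_code` is a name introduced here.

```python
def dummy_code(value, cutpoints):
    """Return one 0/1 indicator per non-reference category.

    cutpoints must be in ascending order; values below the first
    cut point fall in the reference category (all indicators 0).
    """
    category = sum(value >= c for c in cutpoints)
    return [1 if category == k else 0 for k in range(1, len(cutpoints) + 1)]

# Hypothetical age categories: <50 (reference), 50-59, 60-69, 70+.
age_cutpoints = [50, 60, 70]
print(dummy_code(43, age_cutpoints))
print(dummy_code(65, age_cutpoints))
```

Each indicator gets its own coefficient in the model, which is why sparse categories demand a larger sample.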

30 Variable coding Alternative variable coding?
Continuous, ordinal, or categorical variables may be coded differently. Single variables can be constructed to reflect multiple measures. Need to base your decisions on theory, published evidence, scientific reasoning, and the limitations of your data.

