Measuring covariate data_Presentation (November 14, 2007)
1 Measuring covariate data in subsets of study populations: Design options. Jean-François Boivin, MD, ScD, McGill University. 19 August 2007
2
3 16th International Conference on Pharmacoepidemiology, Barcelona, 2000
4 What about missing covariate data?
5 Option #1: Do not research that topic
6 Option #2: Conduct study without covariates. Scientifically reasonable for certain questions. Example: Sharpe et al.
7 British Journal of Cancer 2002: The effects of tricyclic antidepressants on breast cancer risk. Genotoxicity in Drosophila. Comparison of antidepressants: 6 genotoxic vs 4 nongenotoxic. Confounding unlikely.
8 Option #3: “Confounding by other determinants was studied in analyses with data obtained by interviewing samples of subjects…”
9 List different sampling strategies: “Confounding by other determinants was studied in analyses with data obtained by interviewing samples of subjects…” a) ? b) ? c) ? d) ?
10
11
12 Two-stage sampling
13 Entire population (= truth). Tables of E+/E- by D+/D- for Obese, Not obese, and All; OR=0.5, OR=2.5. Counts as extracted: 12, ,200 10,400 | 22,200 10,540 32,740 | 2, , ,000 | 2, ,000 10,100
14 Tables of E+/E- by D+/D- for Obese, Not obese, and All. Labels: not available; computerized databases. Counts as extracted: 22,200 10,540 | 2, ,000 10,100
15 Two-stage sampling
16 Two-stage sampling. Tables of E+/E- by D+/D- for Obese, Not obese, and All. OR1 biased; OR2 biased. 250 x 250 = 1. Counts as extracted: 250/ 2, ,000 10,100 32,
17 Statistical analysis; further design issues: White. AJE 1982; Walker. Biometrics 1982; Cain, Breslow. AJE 1988; Weinberg, Wacholder. Biometrics 1990; Weinberg, Sandler. AJE 1991
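The bias in the second-stage stratum odds ratios noted on the two-stage sampling slides can be illustrated with a small calculation. This is a minimal sketch, not the estimators from the papers cited above: the population counts are hypothetical (they are not the slide's numbers), the second stage is assumed to draw 250 subjects from each disease-by-exposure cell, and expected counts are used in place of a random draw. Under those assumptions, each stratum OR computed from the sample is off by the cross-product ratio of the four sampling fractions, and dividing by that ratio recovers the population value.

```python
from fractions import Fraction

STRATA = ("obese", "not obese")
CELLS = (("D+", "E+"), ("D+", "E-"), ("D-", "E+"), ("D-", "E-"))

# Hypothetical full-population counts (illustrative only, not the slide's data).
population = {
    ("D+", "E+", "obese"): 800, ("D+", "E-", "obese"): 200,
    ("D-", "E+", "obese"): 8000, ("D-", "E-", "obese"): 1000,
    ("D+", "E+", "not obese"): 100, ("D+", "E-", "not obese"): 400,
    ("D-", "E+", "not obese"): 2000, ("D-", "E-", "not obese"): 4000,
}

def odds_ratio(a, b, c, d):
    """Cross-product ratio with a = D+E+, b = D+E-, c = D-E+, d = D-E-."""
    return Fraction(a) * d / (Fraction(b) * c)

def stratum_or(table, stratum):
    """Odds ratio within one covariate stratum of a (D, E, stratum) table."""
    return odds_ratio(*(table[cell + (stratum,)] for cell in CELLS))

# Stage 1: only the disease-by-exposure totals are known (no covariate).
first_stage = {cell: sum(population[cell + (s,)] for s in STRATA) for cell in CELLS}
crude_or = odds_ratio(*(first_stage[cell] for cell in CELLS))

# Stage 2 (assumed design): 250 subjects sampled from each first-stage cell.
N_PER_CELL = 250
fraction = {cell: Fraction(N_PER_CELL, first_stage[cell]) for cell in CELLS}
expected = {key: count * fraction[key[:2]] for key, count in population.items()}

# Bias factor: cross-product ratio of the four sampling fractions.
bias = odds_ratio(*(fraction[cell] for cell in CELLS))

for s in STRATA:
    naive = stratum_or(expected, s)
    print(s, "true:", float(stratum_or(population, s)),
          "second-stage sample:", float(naive),
          "corrected:", float(naive / bias))
```

With equal numbers sampled per cell, the bias factor is the reciprocal of the first-stage crude OR, so the correction amounts to multiplying each second-stage stratum OR by the crude OR.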
18
19 Option 1: No study Option 2: No covariate measurement Option 3: 2-stage sampling Option 4: Case only measurement
20 Ray et al. Archives of Internal Medicine 1991
21 Cyclic antidepressants and the risk of hip fracture
22 Confounding: Quick review. Tables of E+/E- by D+/D- for Obese, Not obese, and All. RR=0.5 (N1=?, N2=?); RR=0.5 (N3=?, N4=?). With N1=1,000, N2=1,000, N3=1,000, N4=1,000: cross-product ratio =1.
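The quick review can be checked numerically. The sketch below assumes that N1 and N2 are the exposed and unexposed group sizes in one stratum (taken here to be the obese) and N3 and N4 those in the other; this labelling is an assumption, as are the baseline risks, which are hypothetical. It shows that when both strata share RR=0.5 and the cross-product ratio of the group sizes is 1, collapsing over the covariate leaves the RR unchanged, whereas an unbalanced allocation distorts it.

```python
def cross_product_ratio(n1, n2, n3, n4):
    """Exposure-covariate association in the cohort: (N1 * N4) / (N2 * N3)."""
    return (n1 * n4) / (n2 * n3)

def crude_rr(n1, n2, n3, n4, stratum_rr=0.5, risk_obese=0.10, risk_not_obese=0.02):
    """Crude risk ratio after collapsing over obesity.

    n1, n2: exposed and unexposed obese subjects (assumed labelling).
    n3, n4: exposed and unexposed non-obese subjects (assumed labelling).
    risk_obese, risk_not_obese: hypothetical risks among the unexposed.
    """
    exposed_cases = stratum_rr * (risk_obese * n1 + risk_not_obese * n3)
    unexposed_cases = risk_obese * n2 + risk_not_obese * n4
    return (exposed_cases / (n1 + n3)) / (unexposed_cases / (n2 + n4))

# Balanced allocation, as on the slide: cross-product ratio of 1.
print(cross_product_ratio(1000, 1000, 1000, 1000), crude_rr(1000, 1000, 1000, 1000))

# Hypothetical unbalanced allocation: exposure concentrated among the obese.
print(cross_product_ratio(1800, 200, 200, 1800), crude_rr(1800, 200, 200, 1800))
```

In the second call the crude RR lands on the other side of 1 even though the RR is 0.5 in both strata, which is the confounding the slide is reviewing.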
23 Case-control study. Tables of E+/E- by D+/D- for Obese, Not obese, and All. OR=0.5; cross-product ratio =1. Counts as extracted: ,500 | 1,000 3,000
24 Cyclic antidepressants and the risk of hip fracture
25 Covariate data on cases only. Tables of E+/E- by D+/D- for Obese, Not obese, and All. Sources: computerized database; medical record review. Counts as extracted: 2, | 20,000 10,100 | 22,200 10,540 | ?? ??
26 Covariate data on cases only. Tables of E+/E- by D+/D- for Obese, Not obese, and All, with OR1 and OR2. Assume OR1 = OR2; then: cross-product ratio =1 implies no confounding. Counts as extracted: 2, | ?? ?? | 2, ,000 10,100 | 22,200 10,540
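The reasoning on the cases-only slides can be verified with a toy table. In this minimal sketch all counts are hypothetical; in a real study the covariate breakdown of the controls would be unknown, and it is filled in here only to check the logic. When the stratum ORs are equal and the exposure-by-covariate cross-product ratio among cases is 1, the same ratio must be 1 among controls, and the crude OR then equals the stratum OR.

```python
from fractions import Fraction

def odds_ratio(a, b, c, d):
    """Cross-product ratio (a * d) / (b * c) of a 2x2 table."""
    return Fraction(a * d, b * c)

# Hypothetical counts. Only `cases` would be observed by covariate in practice.
cases = {"obese": {"E+": 60, "E-": 240}, "not obese": {"E+": 40, "E-": 160}}
controls = {"obese": {"E+": 100, "E-": 200}, "not obese": {"E+": 400, "E-": 800}}

# Exposure-by-covariate cross-product ratio among cases only.
case_only_ratio = odds_ratio(cases["obese"]["E+"], cases["obese"]["E-"],
                             cases["not obese"]["E+"], cases["not obese"]["E-"])

# Stratum-specific exposure-disease odds ratios (OR1 and OR2).
stratum_ors = {
    s: odds_ratio(cases[s]["E+"], cases[s]["E-"], controls[s]["E+"], controls[s]["E-"])
    for s in cases
}

# Crude odds ratio after collapsing over the covariate.
crude_or = odds_ratio(
    sum(cases[s]["E+"] for s in cases), sum(cases[s]["E-"] for s in cases),
    sum(controls[s]["E+"] for s in controls), sum(controls[s]["E-"] for s in controls),
)

print("case-only ratio:", case_only_ratio)
print("stratum ORs:", {s: float(v) for s, v in stratum_ors.items()})
print("crude OR:", float(crude_or))
```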
27 What if confounding seems to be present? Extensions
28
29 Option 1: No study Option 2: No covariate measurement Option 3: 2-stage sampling Option 4: Case only measurements Suissa, Edwardes. 1997
30 Confounder data on cases only. Tables of E+/E- by D+/D- for Obese and Not obese. Cross-product ratio =10: confounding plausible. Counts as extracted: 2, | ? ? ??
31 Epidemiology 1997. Extensions of Ray’s method to presence of confounding. Requires additional data from external sources.
32 Confounding; no interaction. Theophylline: tables of E+/E- by D+/D- for Smoker, Nonsmoker, and All. Counts as extracted: ,1544, | % of 4, | % of 4, | % of 4,080 obtained from population survey | % of 4,080
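To see why an external covariate prevalence can be enough when confounding is present but interaction is absent, the following sketch may help. It is an illustration under stated assumptions and not necessarily the estimator published by Suissa and Edwardes: the function names and all counts are hypothetical, and the covariate prevalence among controls is assumed to come from an outside source such as a population survey. With no interaction, the exposure-by-covariate odds ratio among controls equals the one observed among cases, so the controls' 2x2 table is determined by its two margins and that odds ratio.

```python
import math

def controls_exposed_and_positive(n_exposed, n_positive, n_total, psi):
    """Controls who are both exposed and covariate-positive, given the two
    margins of the controls' exposure-by-covariate table and its odds ratio psi."""
    if abs(psi - 1.0) < 1e-12:
        return n_exposed * n_positive / n_total
    b = psi * (n_exposed + n_positive) + (n_total - n_exposed - n_positive)
    disc = b * b - 4.0 * (psi - 1.0) * psi * n_exposed * n_positive
    return (b - math.sqrt(disc)) / (2.0 * (psi - 1.0))

def adjusted_or(a1, b1, a0, b0, controls_exposed, controls_total, prevalence):
    """Covariate-adjusted OR, assuming no interaction.

    a1, b1: exposed and unexposed cases who are covariate-positive.
    a0, b0: exposed and unexposed cases who are covariate-negative.
    prevalence: external estimate of covariate prevalence among controls.
    """
    psi = (a1 * b0) / (a0 * b1)          # exposure-covariate OR among cases
    n_positive = prevalence * controls_total
    x = controls_exposed_and_positive(controls_exposed, n_positive, controls_total, psi)
    return (a1 * (n_positive - x)) / (b1 * x)

# Hypothetical data: covariate known for cases, only exposure known for controls.
result = adjusted_or(a1=150, b1=100, a0=20, b0=120,
                     controls_exposed=500, controls_total=1200, prevalence=400 / 1200)
crude = ((150 + 20) * (1200 - 500)) / ((100 + 120) * 500)
print("crude OR:", round(crude, 3), "adjusted OR:", round(result, 3))
```

The example data were built from a full table with a common stratum OR of 0.5, so the adjusted value can be compared against a known answer.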
33 Extensions of Ray’s method to presence of interaction. Requires further additional data from external sources. Suissa, Edwardes. 1997
34 No interaction. OR=0.5. Tables of E+/E- by D+/D- for Obese and Not obese. Counts as extracted: 12, ,200 10,400 | 2, , ,000
35 Option 1: No study Option 2: No covariate measurement Option 3: 2-stage sampling Option 4: Case only measurements Suissa, Edwardes Others: Multi-stage sampling; Partial questionnaires; Propensity score adjustments
36
37
38 Monotone missingness
39 Wacholder S, et al.
40 [Figure: nine panels, each a grid of covariates (Cov) by subjects (Subject … n)]
41 Wacholder S, et al.: Restricted to a small number of discrete covariates
42 Methodologic research: Propensity score calibration. Stürmer et al. AJE 2005, 2007
43 Propensity score: Summarizes information about several covariates into a single number. Used for matching, stratification, regression.
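As a concrete reading of "a single number", the sketch below computes a propensity score in the simplest possible way: for a handful of discrete covariates, it takes the proportion of exposed subjects within each covariate pattern. This is an assumption-laden illustration, not the procedure of Stürmer et al.; in practice the score is usually estimated with a regression model, and the records, field names, and function name here are hypothetical.

```python
from collections import defaultdict

def empirical_propensity(records, covariates, exposure="exposed"):
    """Proportion exposed within each observed covariate pattern."""
    exposed = defaultdict(int)
    total = defaultdict(int)
    for row in records:
        pattern = tuple(row[name] for name in covariates)
        total[pattern] += 1
        exposed[pattern] += row[exposure]
    return {pattern: exposed[pattern] / total[pattern] for pattern in total}

# Hypothetical subjects: two binary covariates and a binary exposure.
records = (
    [{"obese": 1, "smoker": 1, "exposed": 1}] * 3
    + [{"obese": 1, "smoker": 1, "exposed": 0}] * 1
    + [{"obese": 0, "smoker": 0, "exposed": 1}] * 1
    + [{"obese": 0, "smoker": 0, "exposed": 0}] * 3
)

covariates = ("obese", "smoker")
scores = empirical_propensity(records, covariates)

# Attach the single-number summary to each subject for later
# matching, stratification, or regression adjustment.
scored = [dict(row, score=scores[tuple(row[c] for c in covariates)]) for row in records]
print(scores)
```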
44 Stürmer et al. 2005. Main cohort (selected covariates): “error-prone” scores estimated; regression coefficients estimated. Sample (additional covariates): gold standard scores; regression calibration. Advantage: multivariable technique.
45 “Until the validity and limitation of… [propensity score calibration] have been assessed in different settings, the method should be seen as a sensitivity analysis.” Stürmer et al. 2005
46
47
48 Stage 1: 278 cases in 4561 pregnancies Stage 2: 244 cases non cases
49
50 “Relatively few examples of two- and three-phase sampling designs for case-control studies have appeared to date in the epidemiologic literature. This is unfortunate, because the stratified designs are easy to implement and can result in substantial savings.” NE Breslow (2000)
51 Consent for second-stage interviews: Cases: 49% Controls: 39%
52