1 Two-stage sampling
JF Boivin
Version: 14 November 2007
S:\BOIVIN\695\Winter 2007\Two-stage Sampling.ppt
2 1980s-1990s: Progress in use of administrative drug databases
3 Advantages
- Large
- Population-based
- Valid prescription data
- Long time periods
4 Disadvantages
- Missing data on certain outcomes
- Temporal sequence not always clear
  - Glucocorticoids → cataracts
  - Cataract surgery → glucocorticoids
- Lack of data on confounders
5 NSAIDs and breast cancer
6 Previous research
- Poor exposure data
  - Dose
  - Duration
  - Self-reports
- Small numbers
- Short follow-up
- Inadequate control of confounding
7 NSAIDs and breast cancer
- Cases: Saskatchewan cancer registry
- Controls: Saskatchewan population
- Drug exposure: 15 yr of computerized information
- Missing:
  - Over-the-counter drugs
  - Other confounding factors: menarche, menopause, pregnancies, obesity
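8 Entire population (= truth)
[Figure: 2x2 tables of exposure (E+, E−) by outcome (cancer, no cancer), one each for Obese, Not obese, and All. Cell counts not recoverable.]
OR = (value not recoverable); OR = 2.5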
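A minimal sketch of what the stratified tables on this slide convey: when a confounder such as obesity is related to both exposure and outcome, the crude OR from the "All" table can differ from the stratum-specific ORs. All counts below are hypothetical (the slide's cell counts are not available), and the function names are illustrative only.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a = E+ cancer, b = E- cancer,
    c = E+ no cancer, d = E- no cancer."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary OR over a list of (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


# Hypothetical counts, chosen only for illustration.
obese = (60, 20, 300, 200)
not_obese = (10, 30, 100, 600)

# The "All" table is the cell-by-cell sum of the two strata.
all_women = tuple(x + y for x, y in zip(obese, not_obese))

print(odds_ratio(*obese))                      # 2.0
print(odds_ratio(*not_obese))                  # 2.0
print(round(odds_ratio(*all_women), 2))        # 2.8 (crude, confounded)
print(round(mantel_haenszel_or([obese, not_obese]), 2))  # 2.0 (adjusted)
```

In this made-up example the exposure doubles the odds within each obesity stratum, but ignoring obesity inflates the estimate to 2.8.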
9 [Figure: 2x2 tables of exposure (E+, E−) by outcome (cancer, no cancer), one each for Obese, Not obese, and All. Labels on the figure: "not available" and "computerized databases".]
10 What to do about missing confounder data?
11 Option #1 Do not conduct research on that topic
12 Option #2
Cohort or case-control study without data on confounder
[Figure: 2x2 tables of exposure (E+, E−) by outcome (cancer, no cancer), one each for Obese women, Not obese, and All women. Eight cells are marked "?".]
13 Advantages
- Cheaper
- May be scientifically reasonable for certain questions
14 Option #3
Collect covariate data on a sample of the study subjects
- two-stage samples
- three-stage samples
- partial questionnaire
- case series only
- etc.
15 Two-stage sample
Sampling approaches:
- simple random
- balanced
- etc.
16 Two-stage balanced design
[Figure: 2x2 tables of exposure (E+, E−) by outcome (cancer, no cancer), one each for Obese, Not obese, and All. Annotation on the figure: "(I)".]
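A balanced second-stage selection can be sketched as follows, assuming "balanced" means drawing the same number of subjects from each of the four exposure-by-outcome cells of the first-stage data. The record layout, the function name `balanced_second_stage`, and all counts are hypothetical.

```python
import random
from collections import defaultdict


def balanced_second_stage(records, n_per_cell, seed=0):
    """Draw up to n_per_cell subjects from each (exposed, case) cell.

    Returns the selected records and the sampling fraction per cell.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["exposed"], rec["case"])].append(rec)

    sample, fractions = [], {}
    for cell, members in cells.items():
        k = min(n_per_cell, len(members))  # take everyone if the cell is small
        sample.extend(rng.sample(members, k))
        fractions[cell] = k / len(members)
    return sample, fractions


# Hypothetical first-stage cell sizes (exposed, case) -> number of subjects.
counts = {(True, True): 80, (False, True): 60,
          (True, False): 400, (False, False): 800}
records = []
for (exposed, case), n in counts.items():
    for _ in range(n):
        records.append({"id": len(records), "exposed": exposed, "case": case})

sample, fractions = balanced_second_stage(records, n_per_cell=40)
print(len(sample))   # 160
print(fractions)     # sampling fraction for each of the four cells
```

The sampling fractions differ sharply between cells (here from 0.5 down to 0.05), so they have to be carried into the analysis.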
17 White JE. A two-stage design for the study of the relationship between a rare exposure and a rare disease. AJE 1982.
Cain KC, Breslow NE. Logistic regression analysis and efficient design for two-stage studies. AJE 1988.
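One simple way to use second-stage data, sketched under stated assumptions: weight each second-stage subject by the inverse of the sampling fraction of its exposure-by-outcome cell, rebuild the obesity-specific tables, and summarize them with a Mantel-Haenszel OR. This is an illustrative weighting approach, not necessarily the estimator used in the papers cited above. All counts and names are hypothetical, and the sketch gives a point estimate only; a valid variance would have to account for the two-stage design.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary OR over (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def weighted_stratum_tables(stage1_totals, stage2_counts):
    """Rebuild confounder-specific 2x2 tables from a two-stage sample.

    stage1_totals: {(exposed, case): N} from the full database.
    stage2_counts: {(exposed, case): {stratum: n}} from the subsample.
    """
    cells_by_stratum = {}
    for cell, by_stratum in stage2_counts.items():
        # Inverse sampling fraction for this exposure-by-outcome cell.
        weight = stage1_totals[cell] / sum(by_stratum.values())
        for stratum, n in by_stratum.items():
            cells_by_stratum.setdefault(stratum, {})[cell] = weight * n
    # Order cells as a, b, c, d: E+ case, E- case, E+ non-case, E- non-case.
    order = [(True, True), (False, True), (True, False), (False, False)]
    return {s: tuple(t.get(c, 0.0) for c in order)
            for s, t in cells_by_stratum.items()}


# Hypothetical first stage: whole database, no obesity data.
stage1_totals = {(True, True): 80, (False, True): 60,
                 (True, False): 400, (False, False): 800}
# Hypothetical second stage: 40 interviewed per cell, obesity recorded.
stage2_counts = {
    (True, True): {"obese": 30, "not_obese": 10},
    (False, True): {"obese": 20, "not_obese": 20},
    (True, False): {"obese": 20, "not_obese": 20},
    (False, False): {"obese": 10, "not_obese": 30},
}

tables = weighted_stratum_tables(stage1_totals, stage2_counts)
crude = (80 * 800) / (60 * 400)
adjusted = mantel_haenszel_or(list(tables.values()))
print(round(crude, 2))     # 2.67
print(round(adjusted, 2))  # 2.0
```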
18 Consent for interviews
- Cases: 49%
- Controls: 39%
(Sharpe et al., Saskatchewan study)
19 Other related sampling designs
- three-stage sampling
- partial questionnaire
- confounder data on cases only
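20 Confounder data on cases only
[Figure: 2x2 tables of exposure (E+, E−) by outcome (cancer, no cancer), one each for Obese, Not obese, and All. Labels on the figure: "medical record review" and "computerized databases". Four cells are marked "?".]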