Bias in Epidemiology Wenjie Yang
“The search for subtle links between diet, lifestyle, or environmental factors and disease is an unending source of fear but often yields little certainty.” (Epidemiology faces its limits. Science 1995; 269:)
Residential radon and lung cancer: Sweden: yes; Canada: no
DDT metabolite in the bloodstream and breast cancer, abortion: maybe yes, maybe no
Electromagnetic fields (EMF): Canada & France: leukemia; America: brain cancer
What can be wrong in the study?
Random error: results in low precision of the epidemiological measure (the measure is not precise, but true)
1 Imprecise measuring
2 Too small groups
Systematic error (= bias): results in low validity of the epidemiological measure (the measure is not true)
1 Selection bias
2 Information bias
3 Confounding
Random errors
Systematic errors
Errors in epidemiological studies
[Figure: error plotted against study size, with curves for systematic error (bias) and random error (chance)]
Random error
Low precision because of
–Imprecise measuring
–Too small groups
Decreases with increasing group size
Can be quantified by a confidence interval
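As a rough illustration of how random error shrinks with group size, the sketch below computes the half-width of a 95% confidence interval for a proportion at several sample sizes. The Wald (normal-approximation) interval, the proportion of 0.2, and the function name wald_half_width are assumptions chosen for illustration; none of them comes from the slides.

```python
import math


def wald_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation CI for a proportion p with n subjects."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical proportion of 0.2; only the sample size changes.
for n in (100, 1000, 10000):
    print(n, round(wald_half_width(0.2, n), 4))
```

The interval narrows in proportion to 1/sqrt(n), so a hundredfold increase in group size cuts the half-width by a factor of ten. Systematic error is not reduced this way.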
Bias in epidemiology
1 Concept of bias
2 Classification and controlling of bias
2.1 Selection bias
2.2 Information bias
2.3 Confounding bias
Overestimate? Underestimate?
Random error: Definition: Deviation of results and inferences from the truth, occurring only as a result of the operation of chance.
Bias: Definition: Systematic, non-random deviation of results and inferences from the truth.
2 Classification and controlling of bias
[Figure: timeline of a study. Assembling subjects: selection bias; collecting data: information bias; analyzing data: confounding bias]
VALIDITY OF EPIDEMIOLOGIC STUDIES
[Figure: reference population and study population (exposed, unexposed); labels: external validity, internal validity]
2.1 Selection bias
Definition: Because of an improper or limited method of assembling subjects, the study population cannot represent the target population, and a deviation arises from this.
Several common selection biases:
(1) Admission bias (Berkson's bias)
There are 50,000 male citizens aged [...] years old in a community. The prevalences of hypertension and lung cancer are considerably high. Researcher A wants to know whether hypertension is a risk factor for lung cancer and conducts a case-control study in the community.
                   case     control    sum
Hypertension       1000     9000       10000
No hypertension    4000     36000      40000
sum                5000     45000      50000
χ² = 0, OR = (1000 × 36000) / (9000 × 4000) = 1
Researcher B conducts another case-control study in the community hospital (chronic gastritis patients as controls).
No association between hypertension and chronic gastritis
Admission rate
Lung cancer & hypertension: 20%
Lung cancer without hypertension: 20%
Chronic gastritis & hypertension: 10%
Chronic gastritis without hypertension: 5%
                   case           control        sum
Hypertension       200 (1000)     200 (2000)     400
No hypertension    800 (4000)     400 (8000)     1200
sum                1000 (5000)    600 (10000)    1600
                   case    control    sum
Hypertension       40      100        140
No hypertension    160     200        360
sum                200     300        500
χ² = 10.58, P < 0.01, OR = (40 × 200) / (100 × 160) = 0.5
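The arithmetic behind the two studies can be checked with a short sketch. The counts are the ones shown on the slides (cells taken from the OR formulas and from the hospital table, where the numbers in parentheses are read as the source population and the numbers outside as admitted patients; that reading is an assumption). The helper names odds_ratio and chi_square are illustrative, and the chi-square is the uncorrected Pearson statistic.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a, b = exposed cases, exposed controls;
    c, d = unexposed cases, unexposed controls."""
    return (a * d) / (b * c)


def chi_square(a, b, c, d):
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Researcher A: community study.
community = (1000, 9000, 4000, 36000)
# Researcher B: people in the community with lung cancer or chronic gastritis.
population = (1000, 2000, 4000, 8000)
# Researcher B: those of them who were admitted to hospital.
admitted = (200, 200, 800, 400)
# Researcher B: the sample actually analysed.
hospital_sample = (40, 100, 160, 200)

# Admission rate in each cell: admitted / population.
admission_rates = [adm / pop for adm, pop in zip(admitted, population)]

print("community OR:", odds_ratio(*community))
print("population OR:", odds_ratio(*population))
print("admission rates:", admission_rates)
print("admitted OR:", odds_ratio(*admitted))
print("hospital sample chi-square:", round(chi_square(*hospital_sample), 2))
```

In the population the odds ratio is 1, yet among admitted patients it drops to 0.5, because hypertensive and non-hypertensive gastritis patients are admitted at different rates. The spurious "protective" association comes from selection alone.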
(2) Prevalence-incidence bias (Neyman's bias)
Risk factor A; prognostic factor B
A: [Table: case / control by exposed / unexposed; cell counts not recovered] χ² = 13.33, P < 0.01, OR = 3
B: [Table: case / control by exposed / unexposed; cell counts not recovered] χ² = 8.47, P < 0.01, OR = 2.0
(3) Non-respondent bias
Survey technique for sensitive questions (example: abortion)
Abortion: yes / no
Number of subjects: N
Proportion of red balls: A
Number of subjects whose answer is “1”: K
Abortion rate: X
Example: N = 1000, A = 40%, K = 540, X = ?
N × A × X + N × (1 − A) × (1 − X) = K
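The equation on the slide is linear in X, so it can be solved directly. The sketch below does this for the slide's numbers; the function name randomized_response_rate is illustrative and not from the source.

```python
def randomized_response_rate(n, a, k):
    """Solve n*a*x + n*(1 - a)*(1 - x) = k for x.

    n: number of subjects, a: proportion of red balls,
    k: number of subjects who answered "1".
    """
    if a == 0.5:
        # With a = 0.5 the x terms cancel and the rate cannot be estimated.
        raise ValueError("a must differ from 0.5")
    return (k / n - (1 - a)) / (2 * a - 1)


x = randomized_response_rate(n=1000, a=0.40, k=540)
print(f"Estimated abortion rate: {x:.2f}")
```

With N = 1000, A = 40% and K = 540 the equation becomes 400X + 600(1 − X) = 540, which gives X = 0.30. No individual answer reveals a respondent's true status, yet the group rate is still recoverable.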
(4) Detection signal bias
Example: endometrial cancer and estrogen intake
[Figure: 50%. Early stage: 90%; medium stage: 30%; terminal stage: 5%]
Estrogen intake → uterine bleeding → frequent check-ups → early detection
(5) Susceptibility bias
[Figure: physical check, drop out; groups E and UE]
2.2 Information Bias
(1) Recall bias
(2) Reporting bias
(3) Diagnostic/exposure suspicion bias
(4) Measurement bias
2.3 Confounding bias Definition: The apparent effect of the exposure of interest is distorted because the effect of an extraneous factor is mistaken for or mixed with the actual exposure effect.
Properties of a confounder:
–A confounding factor must be a risk factor for the disease.
–The confounding factor must be associated with the exposure under study in the source population.
–A confounding factor must not be affected by the exposure or the disease.
–The confounder cannot be an intermediate step in the causal path between the exposure and the disease.
2.3.2 Control of confounding bias
1 In the design phase
1) Restriction
2) Randomization
3) Matching
2 In the analysis phase
1) Stratified analysis (Mantel-Haenszel method)
2) Standardization
3) Logistic regression analysis
A case-control study of oral contraceptives (OC) and myocardial infarction (MI)
[Table: OC exposure by MI / control; cell counts not recovered]
χ² = 5.84, P < 0.05; cOR = 1.68, OR 95% C.I. (1.10, 2.56)
Is age a potential confounding factor?
Age distribution in the 2 groups
[Table: columns age (year), MI proportion (%), case proportion (%), OR; rows 25~, 30~, 35~, 40~, 45~, total; values not recovered]
OC exposure proportion in different age groups (%)
[Table: for MI and for control, columns age (year), OC +, OC −, sum, exposure proportion (%); rows 25~, 30~, 35~, 40~, 45~, sum; values not recovered]
χ² = 38.99, P < 0.01; χ² = [value missing], P < 0.01
Stratified analysis (OC and MI by age; cell counts not recovered)
age (year)    OR           OR 95% C.I.
25~           7.2          (1.64, 31.65)
30~           [missing]    (3.96, 19.98)
35~           [missing]    (0.53, 4.24)
40~           [missing]    (1.36, 10.04)
45~           [missing]    (1.26, 12.10)
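Woolf's chi-square test (homogeneity of the stratum-specific ORs)
χ² = 6.212, ν = 5 − 1 = 4, P > 0.05 (the 5% critical value for ν = 4 is 9.488)
The stratum ORs do not differ significantly, so they can be pooled into one OR.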
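The tail probability for this homogeneity statistic can be checked without a statistics library, because the chi-square survival function has a closed form when the degrees of freedom are even. The sketch below is only that special case, and the function name chi2_sf_even_df is illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    half = x / 2
    # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


p = chi2_sf_even_df(6.212, 4)
print(round(p, 3))
```

The result is about 0.18, well above 0.05, so a statistic of 6.212 on 4 degrees of freedom gives no evidence that the odds ratios differ across the five age strata.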
OR_MH = 3.97
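The stratum cell counts did not survive in these slides, so the pooled value cannot be recomputed here. As a minimal sketch of the Mantel-Haenszel estimator, OR_MH = Σ(a·d/n) / Σ(b·c/n) summed over strata, the code below uses two purely hypothetical strata; the numbers and the function names are assumptions for illustration and are not the OC and MI data.

```python
def mantel_haenszel_or(strata):
    """Pooled OR over strata; each stratum is (a, b, c, d) with
    a, b = exposed cases, exposed controls; c, d = unexposed cases, unexposed controls."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


def crude_or(strata):
    """OR from the single table obtained by adding all strata together."""
    a, b, c, d = (sum(col) for col in zip(*strata))
    return (a * d) / (b * c)


# Hypothetical data: the OR is 2.0 within each stratum.
strata = [(10, 20, 30, 120), (40, 10, 60, 30)]

print("crude OR:", round(crude_or(strata), 2))
print("Mantel-Haenszel OR:", round(mantel_haenszel_or(strata), 2))
```

In this made-up example the crude OR differs from the pooled OR even though both strata share the same OR, which is the signature of confounding by the stratifying factor. The slides show the same kind of gap between cOR = 1.68 and OR_MH = 3.97.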
Analytic epidemiology: case-control study. HIV “carried” by mosquitoes?
[Table: HIV+ / controls by mosquito exposure / no exposure; cell counts not recovered]
O.R. = 5.38
Analytic epidemiology: stratification for confounding; case-control study. HIV “carried” by mosquitoes?
[Tables: HIV / controls by mosquito exposure / no exposure, shown separately for females and males; cell counts not recovered]
Stratum-specific O.R. = 1.21 and O.R. = 1.27