Bias Thanks to T. Grein
The truth is out there ... (Agent Mulder)
Errors in epidemiological studies. Two broad types of error: random error and systematic error (bias). Definition of bias: any systematic error in an epidemiological study that results in an incorrect estimate of the association between exposure and risk of disease.
Errors in epidemiological studies [Figure: random error (chance) and systematic error (bias) plotted against study size. Source: Rothman, 2002]
Should I believe my measurement? Grocery store A and Legionella: OR = 11.6. True association (causal or non-causal)? Chance? Confounding? Bias?
Categories of bias: selection bias, information bias, confounding
Selection bias: errors in the process of identifying the study population. Non-random selection of subjects, related to their case/control status or exposure status.
Selection bias in case-control studies
Selection bias. OR = 6. How representative are hospitalised trauma patients of the population which gave rise to the cases?
Selection bias. OR = 6; OR = 36. Hospital admissions with severe injuries are more likely to be heavy drinkers than the source population.
Selection bias: case detection influenced by exposure status. Overestimation of “a” → overestimation of OR. Example: OC (oral contraceptive) use → breakthrough bleeding → increased screening for uterine cancer.
Selection bias: exposed cases have a different chance of admission than controls. Overestimation of “a” → overestimation of OR. Example: Professor “Pulmo”, head of respiratory department, world authority on asbestos exposure.
Selection bias: only survivors of a highly fatal disease enter the study. Underestimation of “a” → underestimation of OR. Example: age is a risk factor for death.
Selection bias (non-response): cases of myocardial infarction who are smokers are less likely to take part in the study. Underestimation of “a” → underestimation of OR. NB: no bias if the % non-response is the same in smokers and non-smokers.
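The direction of the bias in the case-control examples above can be checked numerically. The following is a minimal sketch, not taken from the slides: it assumes the usual 2x2 layout (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls), and both the source-population counts and the selection probabilities are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for a 2x2 table (assumed layout: a = exposed cases,
    b = exposed controls, c = unexposed cases, d = unexposed controls)."""
    return (a * d) / (b * c)


def select(cells, probs):
    """Scale each cell by its probability of ending up in the study."""
    return tuple(n * p for n, p in zip(cells, probs))


# Hypothetical source population (a, b, c, d); the true OR is 2.
source = (40, 100, 60, 300)

# Hypothetical selection probabilities for (a, b, c, d).
scenarios = {
    "equal selection in all cells": (0.5, 0.5, 0.5, 0.5),
    "exposed cases detected twice as often": (1.0, 0.5, 0.5, 0.5),
    "exposed cases half as likely to take part": (0.5, 1.0, 1.0, 1.0),
}

results = {name: odds_ratio(*select(source, probs))
           for name, probs in scenarios.items()}

for name, value in results.items():
    print(f"{name}: OR = {value:.2f}")
```

With these made-up numbers, equal selection leaves the OR at 2, over-detection of exposed cases inflates “a” and the OR, and non-response among exposed cases deflates both.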
Selection bias in cohort studies
Healthy worker effect Source: Rothman, 2002
Non-response bias
                 Lung cancer
                 yes     no    total
Smoker            90    910     1000
Non-smoker        10    990     1000
Non-response bias: 10% of smokers dare to respond
                 Lung cancer
                 yes     no    total
Smoker             9     91      100
Non-smoker        10    990     1000
No bias!
Non-response bias: 50% of cases that smoked lost to follow-up
                 Lung cancer
                 yes     no    total
Smoker            45    910      955
Non-smoker        10    990     1000
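The three lung cancer tables can be compared by computing the risk ratio for each. This is a minimal sketch using the counts from the slides; the function name and scenario labels are illustrative only.

```python
def risk_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


# (smoker cases, smoker total, non-smoker cases, non-smoker total)
scenarios = {
    "full response": (90, 1000, 10, 1000),
    "10% of smokers respond": (9, 100, 10, 1000),
    "50% of smoking cases lost to follow-up": (45, 955, 10, 1000),
}

ratios = {label: risk_ratio(*cells) for label, cells in scenarios.items()}

for label, rr in ratios.items():
    print(f"{label}: RR = {rr:.2f}")
```

When smokers respond less often but independently of their disease status, the risk ratio stays at 9. When the losses are concentrated among smokers who develop lung cancer, it drops to about 4.7.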
Loss to follow-up: bias due to differences in completeness of follow-up between comparison groups. Example: study of disease risk in migrants. Migrants are more likely to return to their place of origin when they have the disease → lost to follow-up → lower disease rate among the exposed (= migrants).
Categories of bias: selection bias, information bias
Information bias: systematic error in the measurement of information on exposure or outcome. Differences in accuracy of exposure data between cases and controls, or of outcome data between different exposure groups. Study subjects are classified in the wrong category.
Misclassification: measurement error leads to assigning the wrong exposure or outcome category. Non-differential: unrelated to exposure and outcome status; weakens the measure of association. Differential: related to exposure and outcome status; the measure of association can be distorted in either direction.
Differential bias: reporting bias, observer bias, recall bias, interviewer bias
Recall bias (differential): cases remember exposure differently than controls. Overestimation of “a” → overestimation of OR. Example: mothers of children with malformations will remember past exposures better than mothers of healthy children.
Interviewer bias (differential): investigator asks cases and controls differently about exposure. Overestimation of “a” → overestimation of OR. Example: investigator may probe listeriosis cases about consumption of soft cheese.
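The effect of differential recall can be sketched with a small calculation. The example below is illustrative only: the true counts and the reporting sensitivities are hypothetical, and it assumes that cases report every true exposure while controls report only 60% of theirs, so that exposed cases are over-represented relative to exposed controls.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc (a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls)."""
    return (a * d) / (b * c)


def reported(exposed, unexposed, sensitivity):
    """Counts after only a fraction `sensitivity` of the truly exposed
    report their exposure; the rest are counted as unexposed."""
    reported_exposed = exposed * sensitivity
    return reported_exposed, unexposed + (exposed - reported_exposed)


# Hypothetical true exposure status: (exposed, unexposed).
true_cases = (100, 100)
true_controls = (100, 200)

# Differential recall (assumed): cases recall fully, controls recall 60%.
a, c = reported(*true_cases, sensitivity=1.0)
b, d = reported(*true_controls, sensitivity=0.6)

true_or = odds_ratio(true_cases[0], true_controls[0],
                     true_cases[1], true_controls[1])
observed_or = odds_ratio(a, b, c, d)

print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

Under these assumptions the true OR of 2 is observed as 4. Reversing the sensitivities would push the estimate the other way, which is why differential misclassification can distort in either direction.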
Non-differential misclassification: misclassification does not depend on the values of other variables. Exposure classification is unrelated to disease status, or disease classification is unrelated to exposure status. Consequence: weakening of the measure of association (“bias towards the null”).
Non-differential misclassification. Cohort study: alcohol → laryngeal cancer
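The tables for this cohort example did not survive in the text, so the sketch below uses hypothetical counts to illustrate the principle. It assumes that exposure is misclassified with the same sensitivity and specificity (both 0.8, an arbitrary choice) in subjects with and without the disease.

```python
def risk_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Observed (exposed, unexposed) counts after exposure misclassification."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    obs_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return obs_exposed, obs_unexposed


# Hypothetical true cohort: (exposed, unexposed) counts.
diseased = (50, 10)
healthy = (950, 990)

# Non-differential: identical error rates regardless of disease status.
SENS, SPEC = 0.8, 0.8
obs_dis = misclassify_exposure(*diseased, SENS, SPEC)
obs_healthy = misclassify_exposure(*healthy, SENS, SPEC)

true_rr = risk_ratio(diseased[0], diseased[0] + healthy[0],
                     diseased[1], diseased[1] + healthy[1])
observed_rr = risk_ratio(obs_dis[0], obs_dis[0] + obs_healthy[0],
                         obs_dis[1], obs_dis[1] + obs_healthy[1])

print(f"true RR = {true_rr:.2f}, observed RR = {observed_rr:.2f}")
```

With these numbers the true risk ratio of 5 shrinks to about 2.3, which is the bias towards the null described above.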
Bias in randomised controlled trials. Gold standard: randomised, placebo-controlled, double-blinded study; least biased. Exposure randomly allocated to subjects: minimises selection bias. Masking of exposure status in subjects and study staff: minimises information bias.
Bias in prospective cohort studies. Loss to follow-up: the major source of bias in cohort studies. Assume that all do / do not develop the outcome? Ascertainment and interviewer bias: some concern, since knowing the exposure may influence how the outcome is determined. Non-response, refusals: little concern, since bias arises only if related to both exposure and outcome. Recall bias: no problem, since exposure is determined at the time of enrolment.
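The question of what to assume about subjects lost to follow-up can be explored with a simple sensitivity analysis. This is a minimal sketch with a hypothetical cohort; it recomputes the risk ratio under the two extreme assumptions named on the slide (none of the lost subjects develop the outcome, or all of them do).

```python
def risk_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


# Hypothetical cohort: enrolled subjects, subjects followed up, observed cases.
exposed = {"enrolled": 1000, "followed": 800, "cases": 40}
unexposed = {"enrolled": 1000, "followed": 950, "cases": 19}


def lost(group):
    """Number of subjects lost to follow-up in a group."""
    return group["enrolled"] - group["followed"]


# Analysis restricted to subjects with complete follow-up.
rr_followed = risk_ratio(exposed["cases"], exposed["followed"],
                         unexposed["cases"], unexposed["followed"])

# Assumption 1: nobody who was lost developed the outcome.
rr_none = risk_ratio(exposed["cases"], exposed["enrolled"],
                     unexposed["cases"], unexposed["enrolled"])

# Assumption 2: everybody who was lost developed the outcome.
rr_all = risk_ratio(exposed["cases"] + lost(exposed), exposed["enrolled"],
                    unexposed["cases"] + lost(unexposed), unexposed["enrolled"])

print(f"followed only: {rr_followed:.2f}")
print(f"lost develop no outcome: {rr_none:.2f}")
print(f"lost all develop outcome: {rr_all:.2f}")
```

If the conclusion holds under both assumptions, loss to follow-up is unlikely to explain the finding; if it does not, the result should be interpreted with caution.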
Bias in retrospective cohort & case-control studies. Ascertainment bias, participation bias, interviewer bias: exposure and disease have already occurred, so differential selection / interviewing of the compared groups is possible. Recall bias: cases (or ill) may remember exposures differently than controls (or healthy).
Minimising selection bias: precise case and exposure definitions; clear definition of study population; controls representative of source population; classification of exposed and non-exposed without knowing disease status (retrospective cohort); aim for high response and follow-up (check on non-responders and losses to follow-up).
Minimising information bias: standardise measurement instruments (closed, precise, clear questions, field tested); standardise interviews (training); administer instruments equally to cases and controls (exposed/unexposed); use multiple controls.
Question to you: suppose a computer error in data entry: the exposed group is classified as unexposed and the unexposed group as exposed. What effect does this error have on the result? Is it bias? If so, what type? If not, what type of error?
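One way to explore the numerical side of this question is to swap the exposure labels in a known table and recompute the measure of association. The sketch below reuses the lung cancer counts from the non-response slides; the variable names are illustrative, and how to classify the error is left to the reader.

```python
def risk_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


# (cases, total) per group, taken from the lung cancer table.
smokers = (90, 1000)
non_smokers = (10, 1000)

# Correct coding: smokers are the exposed group.
rr_correct = risk_ratio(*smokers, *non_smokers)

# Data-entry error: the two exposure labels are exchanged.
rr_swapped = risk_ratio(*non_smokers, *smokers)

print(f"correct coding: RR = {rr_correct:.2f}")
print(f"labels swapped: RR = {rr_swapped:.2f}")
```

The swapped estimate is the reciprocal of the correct one, so an apparent risk factor would be reported as an apparently protective factor of the same strength.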