1
Biases in clinical research
Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University
2
Learning objectives
  Describe the threats to causal inferences in clinical studies
  Understand the role of random variability in clinical studies
  Describe, understand, and learn how to control the 3 main types of bias:
    Confounding
    Information bias / measurement error
    Selection bias
  Discuss the concept of generalizability of study results
DCR Chapters 4 and 9
3
Threats to causal inference
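[Figure: from research question to study findings]
  Truth in the universe (research question): target population, phenomena of interest
  Truth in the study (study plan): intended sample, intended variables
  Findings in the study (actual study): actual subjects, actual measurements
  Design links the research question to the study plan; implementation links the study plan to the actual study
  Random and systematic error can enter at both steps
  Inferring from the findings to the truth in the study: INTERNAL VALIDITY
  Inferring from the truth in the study to the truth in the universe: EXTERNAL VALIDITY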
4
Bias
  Systematic difference between the true value and the measured value
  How close is the measured value to the true value?
  Synonym for bias: lack of validity
  Validity: on average, the measurement estimates the true value
5
Real example of random error!
Body weight, bathroom scale True body weight 180 lbs Inconsistent scale, but set correctly at 0 lbs Moments apart: 1st measurement: lbs 2nd measurement: lbs 3rd measurement: lbs 4th measurement: lbs
6
Body weight, bathroom scale
Real example of bias! Body weight, bathroom scale True body weight 180 lbs Consistent scale, but fail to set it at 0 lbs: reads -5 lbs Moments apart: 1st measurement: 175 lbs 2nd measurement: 175 lbs 3rd measurement: 175 lbs 4th measurement: 175 lbs
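The two bathroom-scale slides can be mimicked with a short simulation. This is a minimal sketch, not part of the original deck: the function name `simulate_scale` and the 2-lb noise level are assumptions made for illustration; only the 180-lb true weight and the 5-lb offset come from the slides.

```python
import random
import statistics

TRUE_WEIGHT = 180.0  # lbs, as on the slides


def simulate_scale(n, offset, noise_sd, seed=0):
    """Return n readings: true weight + systematic offset + random noise."""
    rng = random.Random(seed)
    return [TRUE_WEIGHT + offset + rng.gauss(0.0, noise_sd) for _ in range(n)]


# Inconsistent but correctly zeroed scale: random error only.
# The noise SD of 2 lbs is an assumed value, not taken from the slides.
noisy = simulate_scale(n=1000, offset=0.0, noise_sd=2.0)

# Consistent but mis-zeroed scale: bias only (reads 5 lbs low).
biased = simulate_scale(n=1000, offset=-5.0, noise_sd=0.0)

mean_noisy = statistics.mean(noisy)
mean_biased = statistics.mean(biased)
sd_noisy = statistics.pstdev(noisy)
sd_biased = statistics.pstdev(biased)

print(f"noisy scale:  mean = {mean_noisy:.1f} lbs, SD = {sd_noisy:.1f} lbs")
print(f"biased scale: mean = {mean_biased:.1f} lbs, SD = {sd_biased:.1f} lbs")
```

Averaging many readings removes most of the random error but leaves the systematic offset untouched, which is the distinction the two slides draw.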
7
Threats to causal inference
Lack of precision Random variability - by chance We may observe an association that does not exist or may fail to observe an existing association Lack of internal validity Bias - Systematic errors Confounding Information bias / measurement error Selection bias
8
Threats to causal inference
(continued)
Incorrect assessment of the direction of causality:
  We believe that A → B
  But, in reality, B → A
Lack of external validity (generalizability):
  True effect in the study population
  But does not apply to other populations
9
Smoking and CVD mortality – NHANES II Mortality Study
Sample size: 9,205 Length of follow-up: 16 years Prevalence at baseline Current smokers: 32.2% Former smokers: 26.8% Never smokers: 41.0% Hazard ratio for all-cause mortality: Current vs. never smokers: 2.08 (95% CI 1.75 – 2.48) Former vs. never smokers: 1.32 (95% CI 1.11 – 1.56)
10
Smoking and CVD mortality – NHANES II Mortality Study
[Figure: hazard ratios for mortality in 20 random samples of N = 500, shown separately for ex-smokers and current smokers]
11
Smoking and CVD mortality – NHANES II Mortality Study
[Figure: hazard ratios for mortality comparing current to never smokers; panels: N = 500, N = 1,000, N = 5,000, N = 9,205]
12
Smoking and CVD mortality – NHANES II Mortality Study
[Figure: hazard ratios for mortality comparing former to never smokers; panels: N = 500, N = 1,000, N = 5,000, N = 9,205]
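The role of sample size in random variability can be illustrated with a small simulation. This is a sketch under stated assumptions, not a re-analysis of NHANES II: it uses simple risk ratios rather than hazard ratios, an assumed baseline risk of 20%, an assumed true risk ratio of 2.0, and a hypothetical helper named `estimate_rr`.

```python
import random
import statistics


def estimate_rr(n, rng, risk_unexposed=0.2, true_rr=2.0):
    """Simulate one study of size n (half exposed) and return its risk ratio."""
    half = n // 2
    cases_exposed = sum(rng.random() < risk_unexposed * true_rr for _ in range(half))
    cases_unexposed = sum(rng.random() < risk_unexposed for _ in range(half))
    return (cases_exposed / half) / (cases_unexposed / half)


rng = random.Random(42)
spread = {}
for n in (500, 1000, 5000):
    # 100 repeated random samples of the same size.
    estimates = [estimate_rr(n, rng) for _ in range(100)]
    spread[n] = statistics.stdev(estimates)
    print(f"N = {n:>5}: min = {min(estimates):.2f}, "
          f"max = {max(estimates):.2f}, SD = {spread[n]:.3f}")
```

The estimates scatter around the assumed true value in every case, but the scatter narrows as N grows. No systematic error is involved: the variation is due to chance alone.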
13
Streptokinase in AMI – Meta-analysis
Lau J, et al. N Engl J Med 1992;327:248-54
14
Bias – definition Deviation of results or inferences from the truth
Any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth Last JM, ed. A dictionary of epidemiology, 4th ed. Oxford, Oxford University Press, 2001
15
Bias – classification Many different biases have been described
Sackett DL. Bias in analytic research. J Chron Dis 1979;32:51-63 Delgado-Rodriguez M, Llorca J. J Epidemiol Community Health 2004;58:635-41 3 general types of biases: Confounding Misclassification / Information bias Selection bias
16
From causal effect to data
17
What is the counterfactual?
Example: The risk experience an exposed individual would have had had he/she not been exposed, with all else being equal. The risk experience an unexposed individual would have had had he/she been exposed, with all else being equal. Not possible to follow the same person at the same time with and without exposure! The counterfactual is like a parallel hypothetical universe We use the counterfactual to describe our ideal reference group
18
Extension of the counterfactual model to a group of individuals
Is the risk of heart attack higher in obese individuals than it would have been if the same individuals had not been obese?
  Actual state (at a given time and place): obese individuals 1 to i → heart attack, risk R(obese)
  Counterfactual state (at the same time and place): the same people, not obese, individuals 1 to i → heart attack, risk R(not obese)
19
Counterfactual model: all else being equal?
How to address the real-world question: Is the risk of heart attack higher in obese individuals than in non-obese individuals?
  Exposed group (at a given time and place): obese individuals 1 to i → heart attack, risk R(obese)
  Comparison group (at a given time and place): NOT the same people, not obese individuals i+1 to n → heart attack, risk R(not obese)
These individuals should differ from the exposed individuals only on the exposure.
20
Advantages of randomization
Intermediate Epidemiology nd Term 2003
Produces comparable groups with respect to observed and to unobserved factors
  Control of confounding
  Control of selection bias of study participants at baseline
Helps define exposure to intervention
  Randomization is the time of onset of follow-up (time 0)
  Clearly defined groups in terms of intervention
Provides a firm basis for statistical inferences
Adds credibility to the findings
Causal inference
21
Observational “effect” of antihypertensive medication – ARIC
ARIC included 5,504 hypertensives
  Systolic blood pressure ≥ 140 mmHg
  Diastolic blood pressure ≥ 90 mmHg
Use of antihypertensive medication
  Yes: 4,003 subjects (73%)
  No: 1,484 subjects (27%)
  Missing: 17
22
Observational “effect” of antihypertensive medication – ARIC
[Figure: survival curves for treated and untreated participants; y-axis: proportion free of CHD; x-axis: survival time (y)]
HR treated vs. untreated = 1.49 (1.22 – 1.82), P < 0.001
23
Comparison of treated and untreated hypertensives in ARIC
                            Untreated      Treated        p
                            (n = 1,484)    (n = 4,003)
Age (y)                     55.3 (5.7)     55.6 (5.6)     0.09
Gender (% female)                                         < 0.001
Race (% white)
Center (% “W”)                                            < 0.001
BMI (kg/m²)                 28.6 (5.8)     29.9 (5.9)     < 0.001
Waist-hip ratio             0.94 (0.07)    0.95 (0.07)    0.001
SBP (mmHg)                  (15.2)         (20.0)         < 0.001
DBP (mmHg)                  87.1 (11.9)    78.4 (11.6)    < 0.001
Serum chol. (mmol/l)        5.64 (1.12)    5.68 (1.16)    0.17
HDL-chol. (mmol/l)          1.36 (0.46)    1.27 (0.42)    < 0.001
Current smokers (%)
Current drinkers (%)                                      < 0.001
Diabetics (%)                                             < 0.001
Prevalent CHD (%)                                         < 0.001
Prevalent angina (%)                                      < 0.001
Prevalent stroke/TIA (%)                                  < 0.001
Values are means (SD) or percentages
24
Confounders have to …
  Cause the disease (or be a surrogate measure of a cause), AND
  Be associated with exposure (i.e., be distributed differently between exposed and unexposed), AND
  Not be affected by exposure (i.e., not be an intermediate variable in the causal pathway)
Note: all 3 conditions are necessary for a variable to be a confounder
25
Concepts of confounding
Response (R) Exposure (E)
26
Concepts of confounding
C is associated with E C causes R CONFOUNDING Response (R) C = 1 C = 0 Exposure (E)
27
Concepts of confounding
C is associated with E C causes R CONFOUNDING Response (R) C = 1 C = 0 Exposure (E)
28
Concepts of confounding
C does not cause R C is not associated with E Response (R) C = 1 C = 0 Exposure (E)
29
Concepts of confounding
C is associated with E C does not cause R Response (R) C = 1 C = 0 Exposure (E)
31
Confounding Smith GD, et al. BMJ 1997;315:1641-5
32
Asking about sex … Smith GD, et al. BMJ 1997;315:1641-5
33
Comparability of exposure groups
Smith GD, et al. BMJ 1997;315:1641-5
34
Sex and mortality – Results
Smith GD, et al. BMJ 1997;315:1641-5
35
Sex and mortality – Recommendations!
Smith GD, et al. BMJ 1997;315:1641-5
36
Causal associations Death from myocardial infarction Low sex frequency
Poor health
37
Marmor M, et al. Lancet 1982;1:1083-7
38
Hypotheses Marmor M, et al. Lancet 1982;1:1083-7
39
Methods Marmor M, et al. Lancet 1982;1:1083-7
40
Results and interpretation
Marmor M, et al. Lancet 1982;1:1083-7
41
Amyl nitrite, HIV infection, and AIDS
[Causal diagram with nodes: sexual behavior, HIV infection, AIDS, use of amyl nitrite]
HIV infection causes AIDS
HIV infection and use of amyl nitrite were associated in homosexual men
42
Confounders are factors that …
Cause the disease (or are surrogates for causal factors) AND Have a different distribution in exposed and unexposed populations (i.e., are associated with the exposure in the study sample) Both conditions need to be present to have confounding We will also need the additional condition that the confounder is not affected by the exposure
43
Mediator
Mediators are:
  Affected by exposure
  On causal pathway between exposure and disease
  Cause of disease
Mediators are intermediate variables, translating at least part of the effect of exposure on disease
44
Causal diagram – Physical activity, HDL cholesterol, and MI
[Causal diagram: low physical activity → low HDL cholesterol → myocardial infarction]
Low physical activity is a cause of low HDL cholesterol
Low HDL cholesterol is a cause of myocardial infarction
HOWEVER, low HDL cholesterol is an intermediate variable in the causal pathway between physical activity and myocardial infarction
47
Uncontrolled confounding
Unmeasured confounders Unknown confounders Known confounders that are too expensive or difficult to measure Residual confounding Confounder is measured imperfectly, and cannot be controlled completely
48
Results and interpretation
Marmor M, et al. Lancet 1982;1:1083-7
49
Methods to control for confounding
In the design of the study
  Randomization
  Restriction
  Matching (primarily in case-control studies)
In the analysis
  Standardization
  Stratification
  Multivariate models
  Propensity scores
  Inverse probability weighting
  Sensitivity analysis
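Stratification, one of the analytic methods listed above, can be illustrated with a small calculation. This is a minimal sketch using hypothetical counts invented for illustration (they do not come from any study cited in these slides); the Mantel-Haenszel summary risk ratio is used here as one common way to pool stratum-specific estimates.

```python
# Hypothetical strata of a confounder C (illustrative counts only).
# Each stratum: (cases_exposed, n_exposed, cases_unexposed, n_unexposed)
strata = {
    "C = 1": (160, 800, 40, 200),  # risk 0.20 in both exposure groups
    "C = 0": (10, 200, 40, 800),   # risk 0.05 in both exposure groups
}


def crude_risk_ratio(strata):
    """Risk ratio ignoring the confounder (all strata collapsed)."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across strata."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
    return num / den


crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR    = {crude:.3f}")
print(f"adjusted RR = {adjusted:.3f}")
```

In these made-up data C raises the risk of disease and is more common among the exposed, so the crude risk ratio suggests an effect even though the risk ratio is 1 within each stratum of C.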
50
Enrollment and follow-up in HERS
Grady D, et al. JAMA 2002;288:49-57
51
Grady D, et al. JAMA 2002;288:49-57
53
van Vollenhoven, et al. Lupus 1999;8:181-7
54
Restriction to lifetime non-smokers to avoid confounding by smoking
55
Kabat GC, et al. Cancer 1986;57:362-7
56
In practice (I) …
Prior knowledge on the biological and other causal relationships is needed to properly identify which variables to adjust for
Do NOT apply statistical criteria to decide if the conditions for confounding are present
  Testing for the association of confounder with exposure and of confounder with disease
  Stepwise selection procedures
Consider if exposed and unexposed subjects are comparable with respect to their risk of disease (except for exposure)
57
In practice (II) …
Consider which determinants of disease may be responsible for the lack of comparability
  Elaborate causal diagram
  Identify causal factors that may be different between exposed and unexposed
Obtain information on potential confounders
  Measuring confounders with error will result in residual confounding after adjustment
Use statistical techniques to adjust for potential confounders
58
From causal effect to data
Phillips CV. Epidemiology 2003;14:459-66
59
Selection bias
The measure of association observed in the study sample is different from the measure of association in the source population
Selection into the study is affected both by the exposure (or by a cause of the exposure) AND by a cause of the outcome (in cohort studies) or by the outcome (in case-control studies)

Source population      Disease    No disease
  Exposed                 A           B
  Non-exposed             C           D

Study population       Disease    No disease
  Exposed                 a           b
  Non-exposed             c           d
60
Exposed group (n = 1,000); unexposed group (n = 2,000)
Risk in the exposed = 10 / 1000 = 0.01
Risk in the unexposed = 5 / 2000 = 0.0025
Relative risk = risk in the exposed / risk in the unexposed = 0.01 / 0.0025 = 4
61
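Exposed group (n = 1,000); unexposed group (n = 2,000)
Cases: 15 (10 with a history of exposure, 5 without)
Controls: 15 (10 with a history of exposure, 5 without)
Odds of exposure among cases = 10 / 5 = 2
Odds of exposure among controls = 10 / 5 = 2
Odds ratio = 2 / 2 = 1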
62
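Example 1) Is fat intake a risk factor for colorectal cancer?
Cases: newly diagnosed colorectal cancer patients from multiple centers
Controls: health-screening examinees without colorectal cancer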
63
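Exposed group (n = 1,000); unexposed group (n = 2,000)
Cases: 15 (10 with a history of exposure, 5 without)
Controls: 15 (1 with a history of exposure, 14 without)
Odds of exposure among cases = 10 / 5 = 2
Odds of exposure among controls = 1 / 14 = 0.0714
Odds ratio = 2 / 0.0714 = 28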
64
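Example 2) Is smoking a risk factor for bladder cancer?
Cases: newly diagnosed bladder cancer patients from multiple centers
Controls: health-screening examinees without bladder cancer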
65
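Exposed group (n = 1,000); unexposed group (n = 2,000)
Cases: 15 (10 from the exposed group, 5 from the unexposed group)
Controls: 15 (5 from the exposed group, 10 from the unexposed group)
Odds of exposure among cases = 10 / 5 = 2
Odds of exposure among controls = 5 / 10 = 0.5
Odds ratio = 2 / 0.5 = 4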
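The arithmetic on the preceding case-control slides can be checked with a few lines of code. This is a minimal sketch using the counts from those slides; the function names `relative_risk` and `odds_ratio` are illustrative choices, not part of the original material.

```python
def relative_risk(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed (cohort data)."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def odds_ratio(cases_exp, cases_unexp, controls_exp, controls_unexp):
    """Exposure odds among cases divided by exposure odds among controls."""
    return (cases_exp / cases_unexp) / (controls_exp / controls_unexp)


# Source population: 10 cases among 1,000 exposed, 5 cases among 2,000 unexposed.
rr = relative_risk(10, 1000, 5, 2000)

# Same 15 cases (10 exposed, 5 unexposed), three different control groups of 15.
scenarios = {
    "controls 10 exposed / 5 unexposed": odds_ratio(10, 5, 10, 5),
    "controls 1 exposed / 14 unexposed": odds_ratio(10, 5, 1, 14),
    "controls 5 exposed / 10 unexposed": odds_ratio(10, 5, 5, 10),
}

print(f"relative risk in the source population: {rr:.2f}")
for label, value in scenarios.items():
    print(f"{label}: OR = {value:.2f}")
```

Only the control group whose exposure split (5 : 10) mirrors the source population (1,000 : 2,000) gives an odds ratio matching the relative risk of 4. Controls that over- or under-represent exposure pull the odds ratio to 1 or 28.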
67
MacMahon B, et al. N Engl J Med 1981;304:630-3
68
MacMahon B, et al. N Engl J Med 1981;304:630-3
70
Selection bias in Case-Control Study
71
Proportion drinking coffee (%)
74
Selection bias in Case-Control Study
75
Selection bias in cohort studies
Immigrative selection bias
  Selection into the cohort is affected both by exposure (or by a cause of exposure) and by risk of disease
Emigrative selection bias
  Selection out of the cohort (losses to follow-up) is affected both by exposure (or by a cause of exposure) and by risk of disease
76
Samaha FF, et al. N Engl J Med 2003;348:2074-81
77
Samaha FF, et al. N Engl J Med 2003;348:2074-81
79
Selection bias due to losses to follow-up in RCT of Atkins diet
Assigned diet Outcome (weight loss) Selection Age, other factors
80
Minimizing selection bias
Random sampling from source population Limit losses to follow up Sensitivity analysis
81
Healthy Worker Effect
82
Dibbs E, et al. Circulation 1982;65:943-6
84
From causal effect to data
Phillips CV. Epidemiology 2003;14:459-66
85
Measurement error can affect …
Exposure Outcome Confounders Mediators Modifying factors
86
Error components
Measured value = True value + Error
Error = Bias + Random error
  Bias: systematic component of the error
  Random error: random component
87
Quantification of measurement error
Dichotomous variables Sensitivity, specificity Kappa statistic Categorical variables Spearman correlation coefficient Continuous variables Coefficient of variation Intraclass correlation coefficient (reliability coefficient)
88
Differential vs. non-differential errors
Non-differential measurement error Measurement error in the variable in question (e.g., the exposure) does not depend on the levels of other variables (e.g., the outcome, confounders, etc) Differential measurement error Measurement error depends on the levels of other variables (for instance, when sensitivity and specificity for measuring disease are different in exposed and unexposed participants)
89
Non-differential measurement error – Dichotomous exposure & outcome – Errors in measuring exposure
TRUE TABLE (N = 2000, P(E) = 50%, P(D | unexposed) = 10%, RR = 2.0)

                   Diseased: Yes       No     Total
  Exposed: Yes              200       800      1000
  Exposed: No               100       900      1000
  Total                     300      1700      2000

TABLE WITH MISCLASSIFIED EXPOSURE (Sensitivity = 80%, Specificity = 100%, Observed RR = 1.71)

                   Diseased: Yes       No     Total
  Exposed: Yes              160       640       800
  Exposed: No            100+40   900+160      1200
  Total                     300      1700      2000
90
Non-differential measurement error – Dichotomous exposure & outcome – Errors in measuring exposure
TRUE TABLE (N = 2000, P(E) = 50%, P(D | unexposed) = 10%, RR = 2.0)

                   Diseased: Yes       No     Total
  Exposed: Yes              200       800      1000
  Exposed: No               100       900      1000
  Total                     300      1700      2000

TABLE WITH MISCLASSIFIED EXPOSURE (Sensitivity = 100%, Specificity = 90%, Observed RR = 1.91)

                   Diseased: Yes       No     Total
  Exposed: Yes           200+10    800+90      1100
  Exposed: No                90       810       900
  Total                     300      1700      2000
91
Non-differential measurement error – Dichotomous exposure & outcome – Errors in measuring exposure
TRUE TABLE (N = 2000, P(E) = 50%, P(D | unexposed) = 10%, RR = 2.0)

                   Diseased: Yes       No     Total
  Exposed: Yes              200       800      1000
  Exposed: No               100       900      1000
  Total                     300      1700      2000

TABLE WITH MISCLASSIFIED EXPOSURE (Sensitivity = 80%, Specificity = 90%, Observed RR = 1.60)

                   Diseased: Yes       No     Total
  Exposed: Yes           160+10    640+90       900
  Exposed: No             90+40   810+160      1100
  Total                     300      1700      2000
92
Non-differential measurement error – Dichotomous exposure & outcome – Errors in measuring exposure
In this case, measurement error will induce a bias towards the null, unless …
  The test is uninformative or misleading
  The true effect is null
The magnitude of the bias depends on:
  Sensitivity and specificity
  The prevalence of the exposure
  The risk of the disease
  The magnitude of the true effect
  The measure of association used
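The three worked tables on non-differential exposure misclassification can be reproduced with a short calculation. This is a minimal sketch: the function names and the dictionary layout are illustrative assumptions, and the counts are expected values rather than a random simulation.

```python
# True 2x2 table from the slides: (diseased, not diseased) per exposure group.
TRUE_TABLE = {"exposed": (200, 800), "unexposed": (100, 900)}


def misclassify_exposure(true_table, sensitivity, specificity):
    """Expected table when exposure is measured with the same error in
    diseased and non-diseased participants (non-differential error)."""
    observed = {"exposed": [0.0, 0.0], "unexposed": [0.0, 0.0]}
    for col in (0, 1):  # 0 = diseased, 1 = not diseased
        exp = true_table["exposed"][col]
        unexp = true_table["unexposed"][col]
        # Truly exposed classified as exposed + truly unexposed false positives.
        observed["exposed"][col] = sensitivity * exp + (1 - specificity) * unexp
        # Truly exposed missed + truly unexposed classified correctly.
        observed["unexposed"][col] = (1 - sensitivity) * exp + specificity * unexp
    return observed


def risk_ratio(table):
    d1, h1 = table["exposed"]
    d0, h0 = table["unexposed"]
    return (d1 / (d1 + h1)) / (d0 / (d0 + h0))


true_rr = risk_ratio(TRUE_TABLE)
rr_sens80 = risk_ratio(misclassify_exposure(TRUE_TABLE, 0.8, 1.0))
rr_spec90 = risk_ratio(misclassify_exposure(TRUE_TABLE, 1.0, 0.9))
rr_both = risk_ratio(misclassify_exposure(TRUE_TABLE, 0.8, 0.9))

print(f"true RR = {true_rr:.2f}")
print(f"sensitivity 80%, specificity 100%: RR = {rr_sens80:.2f}")
print(f"sensitivity 100%, specificity 90%: RR = {rr_spec90:.2f}")
print(f"sensitivity 80%, specificity 90%:  RR = {rr_both:.2f}")
```

All three observed risk ratios lie between 1 and the true value of 2.0, consistent with bias towards the null in this setting.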
93
Non-differential measurement error – Dichotomous exposure & outcome – Errors in measuring exposure
94
Non-differential measurement error – Dichotomous exposure & outcome – Errors in measuring exposure
95
Non-differential measurement error – Dichotomous exposure & outcome – Errors in measuring exposure
96
Non-differential measurement error – Dichotomous exposure & outcome – Errors in measuring disease
TRUE TABLE (N = 2000, P(E) = 50%, P(D | unexposed) = 10%, RR = 2.0)

                   Diseased: Yes       No     Total
  Exposed: Yes              200       800      1000
  Exposed: No               100       900      1000
  Total                     300      1700      2000

TABLE WITH MISCLASSIFIED DISEASE (Sensitivity = 80%, Specificity = 90%, Observed RR = 1.41)

                   Diseased: Yes       No     Total
  Exposed: Yes           160+80    720+40      1000
  Exposed: No             80+90    810+20      1000
  Total                     410      1590      2000
104
Regression towards the mean
When a variable is measured with random error and we select participants with observed extreme values, their true underlying values are on average closer to the population mean Measure BP and select participants with SBP > 140 mmHg Consequences Inconsistencies in diagnosis and classification Biases in evaluation of interventions Inefficiency in planning studies
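Regression towards the mean can be demonstrated by simulating two noisy blood pressure measurements per person. This is a sketch under assumed parameters (true SBP with mean 125 and SD 15 mmHg, random measurement error with SD 10 mmHg); none of these values come from the slides, only the 140 mmHg selection threshold does.

```python
import random
import statistics

rng = random.Random(1)

# Assumed, illustrative distributions (not from the slides).
true_sbp = [rng.gauss(125, 15) for _ in range(20000)]
visit1 = [t + rng.gauss(0, 10) for t in true_sbp]  # first noisy measurement
visit2 = [t + rng.gauss(0, 10) for t in true_sbp]  # independent repeat

# Select participants on the basis of an extreme first measurement.
selected = [i for i, v in enumerate(visit1) if v > 140]

mean_visit1 = statistics.mean(visit1[i] for i in selected)
mean_true = statistics.mean(true_sbp[i] for i in selected)
mean_visit2 = statistics.mean(visit2[i] for i in selected)

print(f"selected participants: {len(selected)}")
print(f"mean SBP at selection (visit 1): {mean_visit1:.1f}")
print(f"mean true SBP of those selected: {mean_true:.1f}")
print(f"mean SBP at repeat (visit 2):    {mean_visit2:.1f}")
```

The repeat measurement is lower on average than the selection measurement even though nothing was done to the participants, which is why an uncontrolled before-after comparison can mistake regression towards the mean for a treatment effect.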
105
Differential measurement error – Dichotomous exposure & outcome – Errors in measuring disease
TRUE TABLE (N = 2000, P(E) = 50%, P(D | unexposed) = 10%, RR = 2.0)

                   Diseased: Yes       No     Total
  Exposed: Yes              200       800      1000
  Exposed: No               100       900      1000
  Total                     300      1700      2000

TABLE WITH MISCLASSIFIED DISEASE (Sens in exposed = 90%, Sens in unexposed = 80%, Spec in exposed = 100%, Spec in unexposed = 100%, Observed RR = 2.25)

                   Diseased: Yes       No     Total
  Exposed: Yes              180    800+20      1000
  Exposed: No                80    900+20      1000
  Total                     260      1740      2000
106
Differential measurement error – Dichotomous exposure & outcome – Errors in measuring disease
TRUE TABLE (N = 2000, P(E) = 50%, P(D | unexposed) = 10%, RR = 2.0)

                   Diseased: Yes       No     Total
  Exposed: Yes              200       800      1000
  Exposed: No               100       900      1000
  Total                     300      1700      2000

TABLE WITH MISCLASSIFIED DISEASE (Sens in exposed = 100%, Sens in unexposed = 100%, Spec in exposed = 90%, Spec in unexposed = 80%, Observed RR = 1.00)

                   Diseased: Yes       No     Total
  Exposed: Yes           200+80       720      1000
  Exposed: No           100+180       720      1000
  Total                     560      1440      2000
107
Differential measurement error
Can bias measures of association in any direction The magnitude can be substantial, even with small differences in sensitivity or specificity In cohort studies, an important concern is differential classification of disease as a function of exposure Diagnostic bias Surveillance bias Mask follow-up procedures and outcome assessment
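The worked tables on errors in measuring disease, both non-differential and differential, follow from one expected-value calculation. This is a minimal sketch: the function name `observed_rr` and the dictionary layout are illustrative assumptions, and the sensitivity and specificity values are the ones given on the slides.

```python
# True 2x2 table from the slides: (diseased, not diseased) per exposure group.
TRUE_TABLE = {"exposed": (200, 800), "unexposed": (100, 900)}


def observed_rr(true_table, sens, spec):
    """Expected observed risk ratio when disease is misclassified.

    sens and spec are dicts keyed by exposure group, so the error may be
    the same in both groups (non-differential) or different (differential).
    """
    risks = {}
    for group, (diseased, healthy) in true_table.items():
        # True cases detected + healthy participants falsely labelled diseased.
        observed_cases = sens[group] * diseased + (1 - spec[group]) * healthy
        risks[group] = observed_cases / (diseased + healthy)
    return risks["exposed"] / risks["unexposed"]


both = lambda value: {"exposed": value, "unexposed": value}

rr_nondiff = observed_rr(TRUE_TABLE, both(0.8), both(0.9))
rr_diff_sens = observed_rr(TRUE_TABLE, {"exposed": 0.9, "unexposed": 0.8}, both(1.0))
rr_diff_spec = observed_rr(TRUE_TABLE, both(1.0), {"exposed": 0.9, "unexposed": 0.8})

print(f"non-differential (sens 80%, spec 90%):  RR = {rr_nondiff:.2f}")
print(f"differential sensitivity (90% vs 80%):  RR = {rr_diff_sens:.2f}")
print(f"differential specificity (90% vs 80%):  RR = {rr_diff_spec:.2f}")
```

With a true risk ratio of 2.0, the differential scenarios move the estimate in opposite directions (2.25 and 1.00), whereas the non-differential scenario attenuates it to 1.41.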
108
Main points on measurement error
Measurement errors are pervasive in epidemiological studies Non-differential, independent errors in exposure or outcome tend to bias associations towards the null, but there are exceptions Differential or dependent errors can bias the association in either direction If sensitivity / specificity or ICC are known from validation studies, we can correct the measures of association
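The last point, correcting a measure of association when sensitivity and specificity are known, can be sketched for the simplest case. The example assumes non-differential exposure misclassification with known sensitivity 80% and specificity 90%, applied to the expected counts of the corresponding worked table; the function name `correct_exposure_counts` is hypothetical, and real validation data would add uncertainty that this sketch ignores.

```python
def correct_exposure_counts(obs_exposed, obs_unexposed, sensitivity, specificity):
    """Back-calculate true exposed/unexposed counts within one disease column.

    Solves obs_exposed = sens * E + (1 - spec) * (total - E) for E.
    Requires sensitivity + specificity != 1.
    """
    total = obs_exposed + obs_unexposed
    true_exposed = (obs_exposed - (1 - specificity) * total) / (sensitivity + specificity - 1)
    return true_exposed, total - true_exposed


# Observed (misclassified) counts: classified exposed vs classified unexposed.
d_exp, d_unexp = correct_exposure_counts(170, 130, 0.8, 0.9)  # diseased
h_exp, h_unexp = correct_exposure_counts(730, 970, 0.8, 0.9)  # not diseased

observed_rr = (170 / (170 + 730)) / (130 / (130 + 970))
corrected_rr = (d_exp / (d_exp + h_exp)) / (d_unexp / (d_unexp + h_unexp))

print(f"observed RR  = {observed_rr:.2f}")
print(f"corrected RR = {corrected_rr:.2f}")
```

The correction recovers the true table (200, 800, 100, 900) and the true risk ratio of 2.0 because the counts here are exact expected values.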
109
Strategies for increasing accuracy of measurements
Standardize measurement methods in an operations manual Train and certify observers Refine the instruments Automate instruments and procedures Calibrate equipment Make unobtrusive measurements Blind measurements Take repeated measurements
110
Generalizability (external validity)
111
PPV Information Sheet
112
South African miners
113
Generalizability is a judgment
Consider if the same biological / social mechanisms apply in the target population as in the source population
Consider if the prevalence of factors that may modify the effect of the exposure is different in the target and in the source population
  E.g., genetic determinants
Be careful
114
Thank you for your attention