1
Bias, Confounding and effect modification
Jamlick Karumbi, SIRCLE
2
Outline
What?
Types
How does it happen?
Examples
Control/minimizing
3
In any study, an association between an outcome and an exposure can be due to:
A true association, or
Bias
Confounding
Random error (chance)
The last three affect the internal validity of a study; dealing with them does not by itself improve its external validity.
4
Bias
A systematic deviation of results or inferences from the truth.
Any trend in the collection, analysis, interpretation, publication or review of data that can lead to conclusions that are systematically different from the truth (Last, 2001).
A process at any stage of inference tending to produce results that depart systematically from the true values (Fletcher et al, 1988).
Systematic error in the design or conduct of a study (Szklo et al, 2000).
It is a form of differential/selective misclassification.
5
Identifying a bias
In a case-control study to examine risk factors for lung cancer, cases are people admitted with lung cancer, and controls are people admitted to the same hospital with emphysema. The study finds no association between smoking and lung cancer.
6
Identifying a random error
Suppose four people want to stop smoking: two of them take nicotine pills to help them stop, and two don’t. The two taking the pills succeed in stopping the habit. A significance test shows P = 0.03. We can then conclude that nicotine helps people stop smoking.
7
Bias
Random error vs bias (non-random error)
8
Bias vs random error (chance)
Systematic error                                        Random error
Errors don’t cancel out even with a larger sample size  Errors cancel out with larger samples
Leads to inaccurate results                             Leads to imprecision (wide confidence intervals)
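The contrast in this table can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true value (120), the systematic bias (+5) and the noise level (SD 10) are hypothetical numbers chosen for illustration, not taken from the slides.

```python
import random
import statistics

def simulate_mean(n, true_value=120.0, bias=0.0, noise_sd=10.0, seed=0):
    """Mean of n simulated measurements.

    Each reading = true value + systematic bias + random (Gaussian) noise.
    All parameter values are hypothetical and only serve the illustration.
    """
    rng = random.Random(seed)
    readings = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)

for n in (10, 1_000, 100_000):
    random_only = simulate_mean(n)           # random error only
    with_bias = simulate_mean(n, bias=5.0)   # random error plus systematic error
    print(f"n={n:>6}  random error only: {random_only:7.2f}  with bias: {with_bias:7.2f}")
```

As n grows, the mean with random error only settles near the true value, while the biased mean settles near the true value plus the bias: a larger sample improves precision but does not remove systematic error.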
9
Bias
The description of bias differs depending on the study design:
In descriptive/observational studies, bias is due to the study population being unrepresentative of the general population being described.
In analytical studies, bias is due to the comparison groups not being comparable.
10
Types of bias
Mainly divided into two:
Selection bias
Information bias
11
Selection bias
Systematic differences between those included in the study and those excluded.
Example: mailing a questionnaire to the community to estimate the prevalence of disability. Any problem with that?
In descriptive studies it occurs when the study population is not representative of the reference population.
12
Selection bias
In analytical studies, it occurs when the comparison groups are not similar.
Case-control studies:
  Cases not representative of all cases in the population
  Controls not representative of the population from which the cases were selected
Cohort studies:
  Choice of the unexposed group (healthy worker effect)
  Different follow-up times between the comparison groups
13
Examples
Self-selection bias
Also called volunteer bias.
The healthy worker effect is a good example: workers are generally healthier than the general population.
People in employment may be less or more likely to offer themselves for a study.
Self-referral, etc. For an HIV test, who will volunteer?
14
Examples
Diagnostic bias
Also called hospital admission bias or work-up bias.
Arises from prior knowledge of the exposure by the investigator.
Example, in a case-control study: the outcome is pulmonary disease and the exposure is smoking. A radiologist who is aware of a patient’s smoking status when reading the x-ray may look more carefully for abnormalities and so differentially select cases.
15
Examples
Loss to follow-up bias
Can also be referred to as withdrawal bias.
If it is differential, the association will be distorted.
Non-response bias
In a postal survey on alcohol intake, heavy drinkers might be less likely to respond, leading to an underestimate of alcohol intake.
16
Minimizing selection bias
Ensure that study participants are representative of the target population; a control should be someone who would have been a case had he or she developed the outcome.
Ensure response rates are as high as possible and, if they are not, examine differences between responders and non-responders.
17
Information bias
An error in exposure or outcome measurement which results in systematic differences in the information collected between comparison groups.
Sources include:
  Subject variation
  Observer variation
  Tool deficiencies (ruler vs calipers)
  Technical errors during measurement
Divided into:
  Interviewer/observer bias
  Reporting bias
18
Information bias
Interviewer bias may be in terms of:
Probing/asking
Recording
Interpreting
19
Reporting bias
It results from subject variation, whereby subjects report with different levels of accuracy depending on their exposure or outcome status. It is more common in case-control studies.
Those with the outcome tend to have greater sensitivity for recalling an exposure, e.g. those who develop flu may be more likely to remember the exposure than those who don’t.
20
Also, individuals who are aware that they are being followed may behave differently, the so-called Hawthorne effect.
Reporting and observer bias can result in misclassification: assigning subjects to the wrong exposure or outcome category.
Misclassification can be differential or non-differential.
21
Minimizing information bias
Blinding: double blinding if possible, but at least of the investigators and interviewers.
Cross-checking data against several sources, e.g. medical records.
In questionnaires, multiple questions enquiring about the same information may act as a double check.
Consider how you conduct a survey: face to face or by mail.
22
Caution!
It is very important that all potential sources of bias are identified at the time of study design because, unlike random error or confounding, you can’t adjust or make an allowance for bias at the analysis stage.
24
In any study, an association between an outcome and an exposure can be due to:
A true association, or
Bias
Confounding
Random error (chance)
The last three affect the internal validity of a study; dealing with them does not by itself improve its external validity.
25
Confounding
Suppose a study found an association between coffee drinking and lung cancer. Do we ban coffee, or are there alternative explanations?
26
Confounding
It is a situation where the association between an exposure and an outcome is entirely or partially due to another exposure.
Some call it ‘confusion of effects’.
It distorts the effect, leading to under- or overestimation; it can even change the direction of an effect.
27
Characteristics of a confounder
Must be associated with the exposure in the source population (controls).
Should be an independent risk factor for the outcome among the non-exposed.
Must not be on the causal pathway between the exposure of interest and the outcome.
28
Confounder not a result of the exposure
e.g., association between grey hair (exposure) and heart attack (outcome): is age a confounder?
e.g., association between age (exposure) and heart attack (outcome): is grey hair a confounder?
29
Confounding
To be a confounding factor, three conditions must be met:
[Diagram: Exposure, Outcome, Third variable (confounder)]
Be associated with the exposure, without being a consequence of the exposure
Be associated with the outcome, independently of the exposure
Not be an intermediary (not on the causal pathway)
30
Confounding
[Diagram: Grey hair (exposure), Heart attack (outcome), Age (third variable)]
Age is correlated with heart attack and is a risk factor even if there is no grey hair.
31
Confounding?
[Diagram: Age (exposure), Heart attack (outcome), Grey hair (third variable)]
Grey hair is correlated with age but is not a risk factor in the young. Is it a confounder?
32
Confounding
[Diagram: Coffee (exposure), Lung cancer (outcome), Smoking (third variable)]
Smoking is correlated with coffee drinking and is a risk factor even for those who do not drink coffee.
33
Confounding?
[Diagram: Smoking (exposure), Lung cancer (outcome), Yellow fingers (third variable)]
Not related to the outcome
Not an independent risk factor
Not a confounder
34
Confounding?
[Diagram: Water source (exposure), Cholera (outcome), Pathogens (third variable)]
On the causal pathway
35
Other examples
Exposure               Outcome         Risk factor (confounder?)
Area of residence      Melanoma        Exposure to sunlight
Atmospheric pollution  Bronchitis      Smoking
Use of antacids        Gastric cancer  H. pylori infection
36
How does confounding occur?
Let’s take our example of coffee drinking and lung cancer, and see how it works numerically. Suppose that a cohort study finds that coffee consumption is associated with an increased risk of lung cancer; the basic data were as shown below. What is the risk ratio?
            Cancer   No cancer
Coffee       450       200
No coffee    300       250
37
Risk ratio = risk in exposed / risk in unexposed
So the risk ratio is given by: (450 / 650) / (300 / 550) = 1.3
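The crude calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name `risk_ratio` is a hypothetical helper introduced here, and the counts are the ones from the coffee and lung cancer table.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

# Coffee drinkers: 450 with cancer out of 450 + 200 subjects.
# Non coffee drinkers: 300 with cancer out of 300 + 250 subjects.
crude_rr = risk_ratio(450, 450 + 200, 300, 300 + 250)
print(f"Crude risk ratio: {crude_rr:.2f}")
```

The unrounded value is about 1.27, which the slide reports as 1.3.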
38
Is smoking a potential confounder?
If we now re-analyse the data, grouping the subjects according to smoking habit, we have the table below. Is smoking a potential confounder?
                          Cancer   No cancer
Smokers       Coffee        400       100
              No coffee     200        50
Non-smokers   Coffee         50       100
              No coffee     100       200
39
What are the risk ratios for lung cancer from drinking coffee among smokers and among non-smokers?
40
Calculate the risk ratio among smokers
Risk ratio = risk in exposed / risk in unexposed
So here the calculation is: RR = (400/500) / (200/250) = 1
41
Now calculate the risk ratio among non-smokers
Risk ratio = risk in exposed / risk in unexposed
So here the calculation is: RR = (50/150) / (100/300) = 1
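The two stratum-specific calculations can be repeated side by side with the crude one. This is a sketch only; `risk_ratio` and the dictionary layout are hypothetical names, while the counts are those used in the smoker and non-smoker calculations above.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

# (coffee cases, coffee total, no-coffee cases, no-coffee total) per stratum
strata = {
    "smokers": (400, 500, 200, 250),
    "non-smokers": (50, 150, 100, 300),
}

stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}
crude_rr = risk_ratio(450, 650, 300, 550)

for name, rr in stratum_rr.items():
    print(f"Risk ratio among {name}: {rr:.2f}")
print(f"Crude risk ratio: {crude_rr:.2f}")
```

Both stratum-specific risk ratios equal 1 while the crude ratio is above 1, which is the pattern expected when smoking confounds the coffee and lung cancer association.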
42
Positive and Negative Confounding
In the example we used, the effect of the confounding variable (cigarette smoking) was to cause an apparent association to be observed between an exposure (coffee) and an outcome (lung cancer), when in fact no association existed. Where the effect of a confounder is to make the observed association between exposure and outcome appear stronger (i.e. to increase the risk ratio or odds ratio), this may be called positive confounding.
43
Positive and Negative Confounding
This effect can also work in the opposite direction: confounding can make the association between an exposure and an outcome appear weaker than it really is. This is called negative confounding.
44
Identification of potential confounders
There is no magic or formula, but:
Think about exposures that are biologically plausible as risk factors for the outcome in question.
Carry out a comprehensive literature review to find which exposures have been found to be risk factors in previous studies.
45
Controlling for confounding
Can be done at the study design stage or at the analysis stage.
At the design stage it can be controlled through:
  Restriction
  Matching
  Randomization
At the analysis stage:
  Stratification
  Regression modeling
46
Restriction
Limiting the study population to those with the same level of exposure to the potential confounder.
Advantages:
  Simple
  No need to measure confounder
Disadvantages:
  Generalizability lost
  Few subjects, hence may reduce statistical power
47
Matching
Mainly done in case-control studies: cases and controls are selected in such a way that they are similar to each other in terms of a potential confounder.
A matched design must be accompanied by a matched analysis.
Advantages:
  Control of hard-to-measure confounders
Disadvantages:
  Recruitment of suitable matches can be difficult
  Have to perform a matched analysis
  Cannot examine the effect of the variable(s) that have been matched on
48
Randomization
Usually applicable in interventional studies.
Advantage:
  Controls for known and unknown confounders
Disadvantages:
  Not doable in observational studies
  Ethical limitations
  Large number of subjects may be required
49
Analysis stage
Two main methods:
Stratification
Regression modelling
50
Stratification
An extension of restriction where data are stratified by the confounder and each stratum is analysed separately.
We assume that the effect of the confounder has been removed within each stratum.
If strata are too broad, there may still be significant differences within a stratum; if too narrow, a stratum may contain too few individuals.
Inaccurate measurement of the confounder may lead to subjects being assigned to the wrong stratum.
51
Stratification
Inappropriate stratification, whether from strata that are too broad or from inaccurate measurement, means confounding will not be completely adjusted for. This is referred to as residual confounding.
After adjusting for confounding within the strata, a summary or pooled estimate is reported (Mantel-Haenszel estimate).
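As an illustration of a pooled estimate, the sketch below applies the standard Mantel-Haenszel formula for a risk ratio to the coffee and lung cancer data stratified by smoking from earlier in these slides. The function name `mantel_haenszel_rr` and the tuple layout are hypothetical; the slides name the estimate but do not show how it is computed, so this is one common formulation rather than the author's own working.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio.

    Each stratum is (exposed_cases, exposed_total, unexposed_cases, unexposed_total).
    RR_MH = sum(a_i * n0_i / N_i) / sum(c_i * n1_i / N_i)
    where a = exposed cases, n1 = exposed total,
          c = unexposed cases, n0 = unexposed total, N = n1 + n0.
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        stratum_total = n1 + n0
        numerator += a * n0 / stratum_total
        denominator += c * n1 / stratum_total
    return numerator / denominator

coffee_by_smoking = [
    (400, 500, 200, 250),  # smokers
    (50, 150, 100, 300),   # non-smokers
]

pooled_rr = mantel_haenszel_rr(coffee_by_smoking)
print(f"Mantel-Haenszel pooled risk ratio: {pooled_rr:.2f}")
```

Because both strata have a risk ratio of 1, the pooled estimate is also 1, in contrast to the crude risk ratio of about 1.3.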
52
Regression
A statistical modeling technique which adjusts for several variables at the same time.
53
Testing for confounding
There is NO statistical test for confounding!
However, if the estimate of effect changes after adjusting for a potential confounder, i.e. it is different from the crude estimate, then we can say there is confounding.
E.g. a crude odds ratio of 2.4 which comes down to 1.5 after stratifying by age.
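The comparison of crude and adjusted estimates can be expressed as a relative change. In the sketch below, the helper `relative_change` is hypothetical, and the 10% cut-off is an assumption: it is a commonly quoted convention, not something stated in these slides.

```python
def relative_change(crude, adjusted):
    """Absolute change from crude to adjusted estimate, as a fraction of the crude."""
    return abs(crude - adjusted) / crude

crude_or = 2.4     # crude odds ratio from the slide
adjusted_or = 1.5  # odds ratio after stratifying by age, from the slide

# ASSUMPTION: a 10% change-in-estimate threshold; a convention, not from the slides.
THRESHOLD = 0.10

change = relative_change(crude_or, adjusted_or)
suggests_confounding = change > THRESHOLD
print(f"Change in estimate: {change:.1%}, suggests confounding: {suggests_confounding}")
```

This is a descriptive rule of thumb rather than a statistical test, which is consistent with the point made on the slide.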
55
Effect modification
If we conducted a study on the effect of protein supplements on muscle gain among people who are emaciated (cachexic), it is likely that there will be a strong association. However, among normal people, the supplements may have little or no effect. This is referred to as effect modification.
56
Cont’d
It is a situation whereby the association between an exposure and an outcome varies with the level of a third factor.
It is also referred to as interaction, or sometimes effect measure modification.
Effect modification is NOT the same as confounding!
It may help us understand an association and, when detected, should always be reported.
57
Effect modification
The way to distinguish confounding and interaction is to stratify the data according to the factor under investigation:
If the stratum-specific rate ratios (or odds ratios or risk ratios) differ from the unstratified rate ratio, and there is little variation between the stratum-specific rate ratios, this is evidence for confounding.
If there is variation in the stratum-specific rate ratios (more than is due to chance), this is evidence of interaction.
58
Worked example
A study to examine the association between alcohol intake and liver cancer. The data collected are shown below for deaths due to liver cancer (per 100,000 person-years) amongst alcohol takers and non-drinkers.
Non-drinkers   Alcohol takers   Rate ratio
165            396              2.40
Meaning of the rate ratio?
59
If we separate the data by age group, the rate ratio varies with age
The effect is highest in the eldest.
Age group   Non-drinkers   Alcohol takers   Rate ratio
<45         133            267               2.0
45-54        30            102               3.4
55-64         2             27              13.5
The rate ratio varies with age group, showing an interaction between age and alcohol intake.
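The age-specific rate ratios in this worked example can be recomputed directly from the tabulated figures. A minimal sketch, with hypothetical variable names, using only the numbers shown on the slides:

```python
# Deaths due to liver cancer per 100,000 person-years: (non-drinkers, alcohol takers)
rates_by_age = {
    "<45": (133, 267),
    "45-54": (30, 102),
    "55-64": (2, 27),
}

# Rate ratio = rate in alcohol takers / rate in non-drinkers, within each age group
rate_ratios = {
    age: drinkers / non_drinkers
    for age, (non_drinkers, drinkers) in rates_by_age.items()
}
crude_rate_ratio = 396 / 165  # unstratified figures from the worked example

for age, ratio in rate_ratios.items():
    print(f"Age {age}: rate ratio {ratio:.1f}")
print(f"Crude rate ratio: {crude_rate_ratio:.2f}")
```

The stratum-specific ratios differ markedly from one another, which is the pattern the slides describe as interaction rather than confounding.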
60
Confounding vs effect modification
Confounding                                          Effect modification
A nuisance effect which should be controlled         A real effect which should be reported and not controlled
Effect measure is the same in all categories         Effect measure differs between the categories
but different from the crude estimate                and is also different from the crude
Adjust and give a pooled effect                      Report stratum-specific estimates
Can a factor be both a confounder and an effect modifier? If yes, how do you report that?
61
Confounding or Effect Modification
[Diagram: Birth weight (exposure), Leukaemia (outcome), Sex (third variable)]
Can sex be responsible for the birth weight association in leukaemia?
- Is it correlated with birth weight?
- Is it correlated with leukaemia independently of birth weight?
- Is it on the causal pathway?
- Can it be associated with leukaemia even if birth weight is low?
- Is sex distribution uneven in comparison groups?
62
Confounding or Effect Modification
[Diagram: Birth weight, Leukaemia, Sex] Overall OR = 1.5
Does the birth weight association differ in strength according to sex?
Boys: birth weight and leukaemia, OR = 1.8
Girls: birth weight and leukaemia, OR = 0.9
63
Example: a study on asbestos exposure and lung cancer
Crude data:
               Asbestos exposure
               Yes        No
Lung cancer    54         20
Person-years   80,000     160,000
What is the crude rate ratio? Potential confounders?
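One way to work out the crude rate ratio asked for here is sketched below. The helper name `rate_per_100k` is hypothetical; the counts and person-years are those in the crude data.

```python
def rate_per_100k(events, person_years):
    """Incidence rate expressed per 100,000 person-years."""
    return events / person_years * 100_000

rate_asbestos = rate_per_100k(54, 80_000)      # exposed to asbestos
rate_no_asbestos = rate_per_100k(20, 160_000)  # not exposed
crude_rate_ratio = rate_asbestos / rate_no_asbestos

print(f"Rate, asbestos: {rate_asbestos:.1f} per 100,000 person-years")
print(f"Rate, no asbestos: {rate_no_asbestos:.1f} per 100,000 person-years")
print(f"Crude rate ratio: {crude_rate_ratio:.1f}")
```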
64
Stratification by confounder (smoking)
               Smokers                   Non-smokers
               Asbestos   No asbestos    Asbestos   No asbestos
Lung cancer    50         10             4          10
Person-years   60,000     40,000         20,000     120,000
Rate ratio among smokers? Rate ratio among non-smokers? Do we have confounding, effect modification or both?
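The stratum-specific rate ratios can be sketched as follows. One figure is an assumption: the slide text lists only three lung cancer counts (50, 10, 4), so the count for non-smokers without asbestos exposure is taken here as 10, inferred from the crude total of 20 unexposed cases minus the 10 among smokers. The helper name `rate_ratio` is hypothetical.

```python
def rate_ratio(exposed_events, exposed_py, unexposed_events, unexposed_py):
    """Rate in the exposed divided by rate in the unexposed."""
    return (exposed_events / exposed_py) / (unexposed_events / unexposed_py)

rr_smokers = rate_ratio(50, 60_000, 10, 40_000)

# ASSUMPTION: 10 cases among non-smokers without asbestos exposure,
# inferred as 20 (crude unexposed total) - 10 (smokers); not printed on the slide.
rr_non_smokers = rate_ratio(4, 20_000, 10, 120_000)

rr_crude = rate_ratio(54, 80_000, 20, 160_000)

print(f"Smokers: {rr_smokers:.2f}")
print(f"Non-smokers: {rr_non_smokers:.2f}")
print(f"Crude: {rr_crude:.2f}")
```

Under that assumption, both stratum-specific ratios are lower than the crude ratio and also differ from each other, which can then be read against the confounding vs effect modification criteria in these slides.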
65
Effect modification
There is no controlling or adjusting for effect modification!
67
Measurement error: validity and reliability
There are many potential sources of measurement error.
The magnitude of errors in both exposure and outcome measures needs to be known.
Validity: how close to the truth is our measure?
Reliability: how consistent is our measure when used by different observers or at different times?
68
Validity
Hard to measure for complex outcomes or exposures, e.g. quality of life.
Normally estimated by comparing results to true values (gold standard).
A validity coefficient is used:
  Continuous data: Pearson correlation coefficient
  Categorical data: sensitivity, specificity
Sensitivity: proportion of true positives identified as such
Specificity: proportion of true negatives identified as such
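Sensitivity and specificity against a gold standard can be computed from four counts. The sketch below uses hypothetical counts chosen only for illustration; none of these numbers come from the slides.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of true positives (gold standard) identified as positive."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of true negatives (gold standard) identified as negative."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical validation data: measure under study vs gold standard
tp, fn = 90, 10   # gold-standard positives: correctly / incorrectly classified
tn, fp = 160, 40  # gold-standard negatives: correctly / incorrectly classified

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"Sensitivity: {sens:.2f}, specificity: {spec:.2f}")
```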
69
Precision/Reliability
Can also loosely be referred to as the repeatability of a study.
The degree to which a measurement has nearly the same value when measured several times.
A precise measurement has less random error than an imprecise measurement.
Greater precision enhances statistical power.
It reflects the consistency and dependability of a set of measures.
71
How?
There are generally three main sources of imprecision or random error:
Observer variability
  Intra-observer
  Inter-observer
Biological (subject) variability
Instrument variability
72
Assessing precision
Depends on the type of data you are dealing with:
Continuous
Categorical
73
Assessing precision
Continuous data
Use the intraclass correlation coefficient (ICC), also called the reliability coefficient, which estimates the proportion of the total variance that is due to true differences between subjects rather than measurement error.
Unpaired data:
  Standard deviation; variance (ANOVA)
  Confidence interval
Paired data (inter- and intra-rater reliability):
  Correlation coefficient (Pearson, Spearman, ICC)
  Coefficient of variation (within-subject SD/mean)
  Bland-Altman plots (plot of mean vs. difference within subjects)
74
Categorical data
Tested using the mean pair agreement index between two observers.
E.g. mean index = number of agreements / total number of pairs between observers, i.e. (a+d)/(a+b+c+d)
                              Observer 2
                              Smokers   Non-smokers
Observer 1   Smokers          a         b
             Non-smokers      c         d
75
Categorical data
Another way to test is the kappa statistic, which incorporates the effect of chance.
It usually takes a value between 0 and 1 (negative values are possible when agreement is worse than chance).
There is no rule of thumb, but:
  <0.5 = poor agreement
  >0.75 = excellent agreement
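To show how kappa corrects the simple agreement index for chance, the sketch below computes both from a 2x2 table laid out as in the a, b, c, d notation used for the two observers. The cell counts are hypothetical, and the function names are introduced here for illustration only.

```python
def agreement_index(a, b, c, d):
    """Observed agreement: proportion of pairs on which both observers agree."""
    return (a + d) / (a + b + c + d)

def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 table.

    a: both say smoker, d: both say non-smoker,
    b and c: the two kinds of disagreement.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from each observer's marginal totals
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical counts, for illustration only
a, b, c, d = 40, 10, 5, 45

observed_agreement = agreement_index(a, b, c, d)
kappa = cohen_kappa(a, b, c, d)
print(f"Agreement index: {observed_agreement:.2f}, kappa: {kappa:.2f}")
```

With these counts the observers agree on 85% of subjects, but half of that agreement would be expected by chance alone, so kappa is noticeably lower than the raw agreement index.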
76
Maximizing precision
Use a standardized operations manual.
Provide training, support and supervision in equal measure to all data collectors.
Minimize the number of data collectors, but ensure efficiency is still maintained.
Pretest the measuring instruments to be used in the fieldwork.
Automate data collection.
77
Perform repeated measures (where feasible)
Quality control during data entry
78
Impact of measurement errors
Non-differential misclassification (random error) usually reduces the observed strength of association.
Differential misclassification (bias) may overestimate, underestimate or have no effect on the observed strength of association.