Matching in case control studies
Lecture notes
Yvan Hutin
[Map: Cases of acute hepatitis (E) by residence, Girdharnagar, Gujarat, India, 2008. Legend: attack rate per 1,000 (> 40, 30-39, 20-29, >0-10); water pumping station; leak; drain overflow.]
Risk of hepatitis by place of residence, Girdharnagar, Gujarat, India, 2008

Source of water                      Hepatitis   No hepatitis   Total
Leaking pipes / overflowing drain    144         8,694          8,838
No leakages / overflowing drain      89          12,436         12,525
Total                                233         21,130         21,363

RR = 2.3, chi-square = 41.1, df = 1, P < 0.001
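The relative risk on this slide can be recomputed from the four cells of the table. The sketch below is only an illustration: the function name is made up for these notes, and the chi-square it computes is the uncorrected Pearson statistic. The slide does not say which variant of the test was used, so the statistic may differ slightly from the figure reported above.

```python
def risk_ratio_and_chi2(a, b, c, d):
    """a, b: ill / not ill among the exposed; c, d: ill / not ill among the unexposed."""
    n = a + b + c + d
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    # Uncorrected Pearson chi-square for a 2 x 2 table (assumed variant)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return rr, chi2


# Counts taken from the Girdharnagar table
rr, chi2 = risk_ratio_and_chi2(144, 8694, 89, 12436)
print(f"RR = {rr:.1f}, chi-square = {chi2:.1f}")
```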
[Map: Attack rate of acute hepatitis (E) by zone of residence, Baripada, Orissa, India, 2004. Legend: attack rate of 0-0.9, 1-9.9, 10-19.9 and 20+ per 1,000; underground water supply; pump from river bed; Chipat river.]
Case-control study methods, acute hepatitis outbreak, Baripada, Orissa, India, 2004
- Cases: all cases identified through active case search
- Controls: equal number of controls selected from affected wards, but in households without cases
- Data collection: reported source of drinking water, common events, restaurants
Consumption of pipeline water among acute hepatitis cases and controls, Baripada, Orissa, India, 2004

                                Acute hepatitis   Control   Total
Drank pipeline water            493               134       627
Did not drink pipeline water    45                404       449
Total                           538               538       1,076

Adjusted odds ratio = 33, 95% confidence interval: 23-47
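As a rough check on the order of magnitude, the crude odds ratio can be computed from the four cells of this table. The sketch below uses Woolf's logarithmic approximation for the confidence interval. That choice is an assumption: the slide reports an adjusted estimate and does not state how the estimate or its interval were obtained.

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """a, b: exposed cases / controls; c, d: unexposed cases / controls."""
    odds_ratio = (a * d) / (b * c)
    # Woolf's method: standard error of the log odds ratio (assumed method)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts taken from the Baripada table
odds_ratio, lower, upper = crude_odds_ratio(493, 134, 45, 404)
print(f"OR = {odds_ratio:.0f}, 95% CI: {lower:.0f}-{upper:.0f}")
```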
Key elements
- The concept of matching
- The matched analysis
- Pros and cons of matching
Controlling a confounding factor
- Stratification
- Restriction
- Matching
- Randomization
- Multivariate analysis
The concept of matching
- Confounding is anticipated
  - Adjustment will be necessary
- Preparation of the strata a priori
- Recruitment of cases and controls
  - By strata
  - To ensure sufficient strata size
- If cases are made identical to controls for the matching variable, the difference must be explained by the exposure investigated
Consequence
- The problem: confounding
- Is solved with another problem: introduction of more confounding, so that stratified analysis can eliminate it
Definition of matching
- Creation of a link between cases and controls
- This link is:
  - Based upon common characteristics
  - Created when the study is designed
  - Kept through the analysis
Types of matching strategies
- Frequency matching
  - Large strata
- Set matching
  - Small strata
  - Sometimes very small (1/1: pairs)
[Figure: Unmatched control group. Cases and controls form two separate pools: a bag of cases and a bag of controls.]
[Figure: Matched control group. Cases and controls form sets that cannot be dissociated.]
Matching: False preconceived ideas
- Matching is necessary for all case-control studies
- Matching needs to be done on age and sex
- Matching is a way to adjust the number of controls to the number of cases
Matching: True statements
- Matching can get you into trouble
- Matching can be useful to quickly recruit controls
Matching criteria
- Potential confounding factors, that is, criteria:
  - Associated with exposure
  - Associated with the outcome
- Criteria:
  - Unique or multiple
  - Always justified
Risk factors for microsporidiosis among HIV infected patients
- Case control study
- Exposure: food preferences
- Potential confounder: CD4 / mm3
- Matching by CD4 category
- Analysis by CD4 categories
Mantel-Haenszel adjusted odds ratio
$$OR_{M-H} = \frac{\sum_i [(a_i d_i) / T_i]}{\sum_i [(b_i c_i) / T_i]}$$
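The Mantel-Haenszel formula can be sketched in a few lines. Each stratum is one 2 x 2 table with a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls and T = a + b + c + d. The function name and the three strata below are hypothetical, invented only to show the arithmetic; they are not data from any study in these notes.

```python
def mantel_haenszel_or(strata):
    """Each stratum is (a, b, c, d): exposed cases, exposed controls,
    unexposed cases, unexposed controls."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        total = a + b + c + d
        if total == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / total
        denominator += b * c / total
    if denominator == 0:
        raise ValueError("odds ratio undefined: no stratum has b * c > 0")
    return numerator / denominator


# Hypothetical counts for three strata (invented for illustration only)
example_strata = [(12, 6, 8, 14), (9, 5, 11, 15), (4, 3, 16, 17)]
mh_or = mantel_haenszel_or(example_strata)
print(round(mh_or, 2))
```

With a single stratum the function reduces to the usual ad/bc.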
Matched analysis by set (pairs of 1 case / 1 control): concordant pairs
- Cases and controls have the same exposure
- No ad or bc product: no input to the calculation

Case and control both exposed: no effect
               Cases   Controls   Total
Exposed        1       1          2
Non exposed    0       0          0
Total          1       1          2

Case and control both unexposed: no effect
               Cases   Controls   Total
Exposed        0       0          0
Non exposed    1       1          2
Total          1       1          2
Matched analysis by set (pairs of 1 case / 1 control): discordant pairs
- Cases and controls have different exposures
- ad and bc products: input to the calculation

Case exposed, control unexposed: positive association
               Cases   Controls   Total
Exposed        1       0          1
Non exposed    0       1          1
Total          1       1          2

Case unexposed, control exposed: negative association
               Cases   Controls   Total
Exposed        0       1          1
Non exposed    1       0          1
Total          1       1          2
The Mantel-Haenszel odds ratio...
$$OR_{M-H} = \frac{\sum_i [(a_i d_i) / T_i]}{\sum_i [(b_i c_i) / T_i]}$$
…becomes the matched odds ratio
$$OR_{M-H} = \frac{\sum \text{discordant sets, case exposed}}{\sum \text{discordant sets, control exposed}}$$
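To see why the formula simplifies, treat every pair as its own stratum with T = 2. Concordant pairs contribute zero to both sums. Each discordant pair contributes 1/2 to the numerator (case exposed) or to the denominator (control exposed), so the halves cancel. The sketch below checks this on a hypothetical list of pairs; the function names and the data are invented for illustration.

```python
def mh_or_from_pairs(pairs):
    """pairs: list of (case_exposed, control_exposed) booleans, one per matched pair.
    Every pair is treated as a stratum of size T = 2."""
    numerator = 0.0
    denominator = 0.0
    for case_exposed, control_exposed in pairs:
        a = int(case_exposed)      # exposed case
        b = int(control_exposed)   # exposed control
        c = 1 - a                  # unexposed case
        d = 1 - b                  # unexposed control
        numerator += a * d / 2
        denominator += b * c / 2
    return numerator / denominator  # undefined if no pair has the control alone exposed


def matched_or(pairs):
    """Ratio of the two kinds of discordant pairs."""
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    return case_only / control_only


# Hypothetical pairs: 4 both exposed, 6 case only, 2 control only, 5 neither
example_pairs = (
    [(True, True)] * 4 + [(True, False)] * 6 + [(False, True)] * 2 + [(False, False)] * 5
)
print(mh_or_from_pairs(example_pairs), matched_or(example_pairs))
```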
…and the analysis can be done with paper clips!
- Concordant questionnaires: trash
- Discordant questionnaires: on the scale
  - The "exposed case" pairs weigh for a positive association
  - The "exposed control" pairs weigh for a negative association
Analysis of matched case control studies with more than one control per case
- Sort out the sets according to the exposure status of the cases and controls
- Count reconstituted case-control pairs for each type of set
- Multiply the number of discordant pairs in each type of set by the number of sets
- Calculate the odds ratio using the f/g formula
Example for 1 case / 2 controls
- Sets with case exposed: +/++, +/+-, +/--
- Sets with case unexposed: -/++, -/+-, -/--
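The steps above can be sketched as follows for sets that all have the same number of controls per case. In a set whose case is exposed, each unexposed control forms one "case exposed, control unexposed" pair (counted in f). In a set whose case is unexposed, each exposed control forms one "case unexposed, control exposed" pair (counted in g). The function name and the set counts are hypothetical, chosen only to show the counting.

```python
def matched_or_from_sets(set_counts, controls_per_case):
    """set_counts maps (case_exposed, n_controls_exposed) to the number of such sets.
    Assumes every set has exactly `controls_per_case` controls."""
    f = 0  # reconstituted pairs: case exposed, control unexposed
    g = 0  # reconstituted pairs: case unexposed, control exposed
    for (case_exposed, n_controls_exposed), n_sets in set_counts.items():
        if case_exposed:
            f += (controls_per_case - n_controls_exposed) * n_sets
        else:
            g += n_controls_exposed * n_sets
    return f, g, f / g


# Hypothetical numbers of sets of each type for 1 case / 2 controls
example_sets = {
    (True, 2): 5,    # +/++ : no discordant pair
    (True, 1): 10,   # +/+- : one discordant pair per set
    (True, 0): 8,    # +/-- : two discordant pairs per set
    (False, 2): 2,   # -/++ : two discordant pairs per set
    (False, 1): 6,   # -/+- : one discordant pair per set
    (False, 0): 12,  # -/-- : no discordant pair
}
f, g, odds_ratio_sets = matched_or_from_sets(example_sets, controls_per_case=2)
print(f, g, odds_ratio_sets)
```

With a constant number of controls per case, every set has the same total T, so T cancels out of the Mantel-Haenszel ratio and the result is again f/g.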
The old 2 x 2 table...

             Cases   Controls   Total
Exposed      a       b          L1
Unexposed    c       d          L0
Total        C1      C0         T

Odds ratio: ad/bc
... is difficult to recognize!

                        Controls
                        Exposed   Unexposed   Total
Cases   Exposed         e         f           a
        Unexposed       g         h           c
        Total           b         d           P (T/2)

Odds ratio: f/g
The McNemar chi-square
$$\chi^2_{McN} = \frac{(f - g)^2}{f + g}$$
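A minimal sketch of the McNemar statistic, where f and g are the two counts of discordant pairs. The p-value uses the fact that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x / 2)). The function name and the example counts are hypothetical.

```python
import math


def mcnemar_chi2(f, g):
    """f: pairs with only the case exposed; g: pairs with only the control exposed."""
    if f + g == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    chi2 = (f - g) ** 2 / (f + g)
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts of discordant pairs
chi2_mcn, p_mcn = mcnemar_chi2(20, 10)
print(round(chi2_mcn, 2), round(p_mcn, 3))
```

Concordant pairs (e and h) do not appear in the formula, which matches their absence from the matched odds ratio.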
Matching: Advantages
- Easy to communicate
- Useful for strong confounding factors
- May increase power of small studies
- May ease control recruitment
- Suits studies where only one factor is studied
- Allows looking for interaction with matching criteria
Matching: Disadvantages
- Must be understood by the author
- Is deleterious in the absence of confounding
- Can decrease power
- Can complicate control recruitment
- Is limiting if more than one factor is studied
- Does not allow examining the matching criteria
Matching with a variable associated with exposure, but not with illness (overmatching)
- Reduces variability
- Increases the number of concordant pairs
- Has deleterious consequences:
  - If matched analysis: reduction of power
  - If match broken: odds ratio biased towards one
Hidden matching (“crypto-matching”)
- Some control recruitment strategies amount de facto to matching
  - Neighbourhood controls
  - Friend controls
- Matching must be identified and taken into account in the analysis
Matching for operational reasons
- Outbreak investigation setting
- Friend or neighbour controls are a common choice
- Advantages:
  - Allows identifying controls fast
  - Will take care of gross confounding factors
- May result in some overmatching, which places the investigator on “the safe side”
Breaking the match
- Rationale
  - Matching may limit the analysis
  - Matching may have been decided for operational purposes
- Procedure
  - Conduct matched analysis
  - Conduct unmatched analysis
  - Break the match if the results are unchanged
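The procedure can be sketched from the pair table (e, f, g, h) alone. The matched odds ratio is f/g, and the unmatched one is rebuilt from the margins as ad/bc, with a = e + f, b = e + g, c = g + h and d = f + h. The slides do not say how close the two results must be to count as "unchanged"; the 10% tolerance below is purely an assumption, and the function names and counts are hypothetical.

```python
def matched_and_unmatched_or(e, f, g, h):
    """e: both exposed; f: case only exposed; g: control only exposed; h: neither."""
    matched = f / g
    a = e + f  # exposed cases
    b = e + g  # exposed controls
    c = g + h  # unexposed cases
    d = f + h  # unexposed controls
    unmatched = (a * d) / (b * c)
    return matched, unmatched


def results_unchanged(matched, unmatched, tolerance=0.10):
    """Assumed rule: relative difference within `tolerance` (hypothetical threshold)."""
    return abs(unmatched - matched) / matched <= tolerance


# Hypothetical pair counts
matched, unmatched = matched_and_unmatched_or(e=40, f=20, g=10, h=30)
print(matched, unmatched, results_unchanged(matched, unmatched))
```

In this invented example the unmatched estimate is closer to one than the matched estimate, so the match would be kept.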
Take home messages
- Matching is a difficult technique
- Matching design means matched analysis
- Matching can always be avoided