Matched Designs Need Matched Analysis
Incorrect unmatched analysis

. cc cc exp, exact

[Output: 2x2 table of cases and controls by exposure (Exposed, Unexposed, Total, Proportion Exposed), followed by point estimates and exact 95% confidence intervals for the odds ratio, Attr. frac. ex. and Attr. frac. pop, and the 1-sided and 2-sided Fisher's exact P. Values not shown.]

This analysis ignores the fact that a matching control was found for each case. The apparent ‘sample size’ is 1242 (621 cases plus 621 controls), and yet there is no evidence of a disease-exposure relationship.
Correct classical analysis

. reshape wide cc exp, i(pair) j(ct)
(note: j = 1 2)

Data                      long   ->   wide
Number of obs.            1242   ->   621
Number of variables          4   ->   5
j variable (2 values)       ct   ->   (dropped)
xij variables:
                            cc   ->   cc1 cc2
                           exp   ->   exp1 exp2

. mcc exp2 exp1

[Output: matched 2x2 table of cases (rows) by controls (columns). Surviving entries: row total for Exposed = 106; overall total = 621 pairs; McNemar's chi2(1) = 5.76. Prob > chi2, the exact McNemar significance probability, and the exact odds ratio are not shown.]

The effective ‘sample size’ is only 21, the number of discordant pairs. Even so, the p-value is less than 5%, and the estimated odds ratio is very different from the one in the incorrect analysis.
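As a cross-check outside Stata, the arithmetic behind McNemar's test can be sketched in Python. The discordant counts 16 and 5 are an inference, not a value read from the table: they follow from the 21 discordant pairs and the Pr(k >= 16) shown on the binomial slide, and they reproduce the reported chi2(1) = 5.76. Which direction the odds ratio runs (16/5 or 5/16) depends on how the table is oriented, so the 3.2 below is an assumption about orientation.

```python
# Discordant pair counts (inferred from the slides, see the note above).
b, c = 16, 5

def mcnemar_chi2(b, c):
    """McNemar's chi-squared statistic (1 df, no continuity correction)."""
    return (b - c) ** 2 / (b + c)

def matched_odds_ratio(b, c):
    """Matched-pairs odds ratio: the ratio of the two discordant counts."""
    return b / c

chi2 = mcnemar_chi2(b, c)
odds_ratio = matched_odds_ratio(b, c)
print(f"McNemar chi2(1) = {chi2:.2f}")   # McNemar chi2(1) = 5.76
print(f"odds ratio      = {odds_ratio:.2f}")  # odds ratio      = 3.20
```

Only the discordant pairs enter either formula, which is why the 600 concordant pairs contribute nothing to the test.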
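Exact p-value is just the binomial

Under the null hypothesis each of the 21 discordant pairs is equally likely to fall in either direction, so the exact McNemar p-value is a binomial test of k = 16 successes out of N = 21 with p = 0.5.

. bitesti

[Command arguments and output values not shown. Output columns: N, Observed k, Expected k, Assumed p, Observed p; then Pr(k >= 16) and Pr(k <= 16) (one-sided tests) and the two-sided probability.]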
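The binomial tail itself is short enough to compute by hand. The following Python sketch assumes N = 21 discordant pairs, k = 16, and p = 0.5, as implied by the surrounding slides; the function name is made up for this illustration.

```python
from math import comb

def binom_upper_tail(n, k, p):
    """Pr(K >= k) for K ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_discordant, k = 21, 16
one_sided = binom_upper_tail(n_discordant, k, 0.5)

# With p = 0.5 the distribution is symmetric, so the two-sided exact
# McNemar p-value is twice the one-sided tail (capped at 1).
two_sided = min(1.0, 2 * one_sided)

print(f"one-sided: {one_sided:.4f}, two-sided: {two_sided:.4f}")
# one-sided: 0.0133, two-sided: 0.0266
```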
Conditional logistic regression version of the correct classical analysis

. clogit exp cc, group(pair)
note: multiple positive outcomes within groups encountered.
note: 600 groups (1200 obs) dropped due to all positive or all negative outcomes.

Conditional (fixed-effects) logistic regression     Number of obs = 42

[Output: LR chi2(1) and the coefficient table for cc (Coef., Std. Err., z, P>|z|, 95% Conf. Interval). Values not shown.]

. clogit exp cc, group(pair) or

[Same notes and Number of obs = 42; the table reports the Odds Ratio for cc in place of the coefficient. Values not shown.]

P-values and CIs are based on the normal approximation to the binomial. The 600 concordant pairs are correctly ‘dropped’, leaving the 21 discordant pairs (42 observations).
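For 1:1 matching the conditional likelihood has a closed-form solution, so the Wald quantities that clogit reports can be sketched directly from the discordant counts. This is a minimal sketch, assuming the discordant counts are 16 and 5 (inferred from the earlier slides) and using the standard results log OR = ln(b/c) and SE = sqrt(1/b + 1/c); it is not a reproduction of the lost Stata output.

```python
from math import erfc, exp, log, sqrt

def matched_pair_wald(b, c, z_crit=1.959964):
    """Wald summary for the matched-pairs log odds ratio.

    b, c: counts of the two kinds of discordant pairs.
    Returns (log_or, se, z, two-sided p, (ci_low, ci_high) for the OR).
    """
    log_or = log(b / c)
    se = sqrt(1 / b + 1 / c)
    z = log_or / se
    p_two_sided = erfc(abs(z) / sqrt(2))  # 2 * (1 - Phi(|z|))
    ci = (exp(log_or - z_crit * se), exp(log_or + z_crit * se))
    return log_or, se, z, p_two_sided, ci

log_or, se, z, p_value, (ci_low, ci_high) = matched_pair_wald(16, 5)
print(f"OR = {exp(log_or):.2f}, z = {z:.2f}, p = {p_value:.3f}")
print(f"95% CI for OR: ({ci_low:.2f}, {ci_high:.2f})")
```

Because this p-value comes from a normal approximation, it will not match the exact binomial p-value exactly.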
4 matching controls per case: LA Study of endometrial cancer

. use "C:\Mdsc643.02\la.dta", clear
(LA Study of Endometrial Cancer)

. desc

Contains data from C:\Mdsc643.02\la.dta
  obs:    315                  LA Study of Endometrial Cancer
 size: 15,120 (98.6% of memory free)   (_dta has notes)

variable   storage  display  value
name       type     format   label   variable label
row        float    %9.0g
age        float    %9.0g            Age (yr)
gbd        float    %9.0g    yn      Gall Bladder Disease
hyp        float    %9.0g    yn      Hypertension
obe        float    %9.0g    yn      Obesity
est        float    %9.0g    yn      Estrogen (Any) Use
conj       float    %9.0g    cl      Conjugated Dose
dur        float    %9.0g            Estrogen Duration (mo)
ned        float    %9.0g    yn      Non Estrogen Drug
cc         float    %9.0g    ccl     Case/Control
quint      float    %9.0g            4 Controls: 1 Case

Sorted by: quint
Incorrect analysis

. cc cc est, exact

[Output: 2x2 table of cases and controls by estrogen use. Surviving entries: Cases, 56 exposed and 7 unexposed; Attr. frac. ex. = .873; Attr. frac. pop = .776. The control counts, odds ratio, confidence intervals, and 1-sided and 2-sided Fisher's exact P are not shown.]
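The few numbers that survive in this output are enough for a consistency check. The sketch below uses the standard case-control relationships (attributable fraction among the exposed = (OR - 1)/OR, and population attributable fraction = that value times the proportion of cases exposed). The implied odds ratio is only approximate because .873 is a rounded figure.

```python
# Values that survive in the output above.
exposed_cases, unexposed_cases = 56, 7
af_exposed = 0.873  # Attr. frac. ex.

# AF_exposed = (OR - 1) / OR, so OR = 1 / (1 - AF_exposed).
implied_or = 1 / (1 - af_exposed)

# AF_population = AF_exposed * (proportion of cases that are exposed).
prop_cases_exposed = exposed_cases / (exposed_cases + unexposed_cases)
af_population = af_exposed * prop_cases_exposed

print(f"implied unmatched OR ~ {implied_or:.2f}")
print(f"Attr. frac. pop      = {af_population:.3f}")  # 0.776
```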
Classical analysis

. drop row
. sort quint cc
. by quint: gen otf=_n
. reshape wide cc age gbd hyp obe est conj dur ned, i(quint) j(otf)
(note: j = 1 2 3 4 5)

Data                      long   ->   wide
Number of obs.             315   ->   63
Number of variables         11   ->   46
j variable (5 values)      otf   ->   (dropped)
xij variables:
                            cc   ->   cc1 cc2 ... cc5
                           age   ->   age1 age2 ... age5
                           gbd   ->   gbd1 gbd2 ... gbd5
                           hyp   ->   hyp1 hyp2 ... hyp5
                           obe   ->   obe1 obe2 ... obe5
                           est   ->   est1 est2 ... est5
                          conj   ->   conj1 conj2 ... conj5
                           dur   ->   dur1 dur2 ... dur5
                           ned   ->   ned1 ned2 ... ned5

Because each matched set is sorted by cc before otf is generated, the four controls receive otf = 1 to 4 and the case receives otf = 5.
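A new table

. gen sumcon=est1+est2+est3+est4
. gen sumcas=est5
. table sumcas sumcon

[Output: cross-tabulation of sumcas (case exposed, 0 or 1) by sumcon (number of exposed controls, 0 to 4). Cell counts not shown.]

There are 5 concordant sets, in which all five members have the same exposure. In each remaining set, if m of the 5 members are exposed, then under the null hypothesis the case is one of the exposed with probability m/5. The exact p-values are therefore based on binomial tests with p = 1/5, 2/5, 3/5 and 4/5.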
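The null probabilities 1/5, 2/5, 3/5 and 4/5 can be checked by brute-force enumeration. This is a small sketch, assuming sets of one case plus four controls and treating every choice of which m members are exposed as equally likely under the null; labelling the case as member 0 is arbitrary.

```python
from fractions import Fraction
from itertools import combinations

SET_SIZE = 5  # 1 case + 4 controls; the case is labelled member 0

def null_prob_case_exposed(m, set_size=SET_SIZE):
    """Share of the equally likely ways to pick m exposed members
    (out of set_size) that include the case."""
    arrangements = list(combinations(range(set_size), m))
    with_case = sum(1 for exposed in arrangements if 0 in exposed)
    return Fraction(with_case, len(arrangements))

for m in range(SET_SIZE + 1):
    print(m, null_prob_case_exposed(m))
```

For m = 0 and m = 5 the probability is 0 or 1 regardless of any association, which is why concordant sets carry no information.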
Components to the p-value

. bitesti

[Command arguments and output values not shown. Output columns: N, Observed k, Expected k, Assumed p, Observed p; then Pr(k >= 3) and Pr(k <= 3) (one-sided tests) and Pr(k >= 3) (two-sided test).]
note: lower tail of two-sided p-value is empty

. bitesti

[Command arguments and output values not shown. Same layout, with Pr(k >= 17), Pr(k <= 17), and Pr(k >= 17) (two-sided test).]
note: lower tail of two-sided p-value is empty

. return list

scalars:
    r(p) = [mantissa not shown]e-06
Next 2 p-values

. bitesti

[Command arguments and output values not shown. Output columns: N, Observed k, Expected k, Assumed p, Observed p; then Pr(k >= 16) and Pr(k <= 16) (one-sided tests) and the two-sided probability.]

. bitesti

[Command arguments and output values not shown. Same layout, with Pr(k >= 15), Pr(k <= 15), and the two-sided probability.]
Correct p-value

TITLE
    STB-49 sbe28. Meta-analysis of p values.

DESCRIPTION/AUTHOR(S)
    STB insert by Aurelio Tobias, Statistical Consultant, Madrid, Spain.
    After installation, see help metap.

INSTALLATION FILES
    sbe28/metap.ado
    sbe28/metap.hlp

ANCILLARY FILES
    sbe28/fleiss.dta
Using the STB ado file

. input pvar
[the component p-values, one per line; values not shown]
end

. metap pvar

Meta-analysis of p_values

Method  |  chi2   p_value   studies
Fisher  |  [values not shown]
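The output labels the combination method as Fisher's. A minimal Python sketch of Fisher's method follows: the statistic is -2 times the sum of the log p-values, referred to a chi-squared distribution with 2k degrees of freedom. The four p-values in the example are hypothetical placeholders, not the values entered on the slide, and the function is an illustration rather than a reimplementation of metap.

```python
from math import exp, log

def fisher_combined(p_values):
    """Fisher's method for combining independent p-values.

    Returns (chi2, df, combined_p). Uses the closed-form survival function
    of a chi-squared distribution with even df = 2k:
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(p_values)
    chi2 = -2.0 * sum(log(p) for p in p_values)
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return chi2, 2 * k, exp(-half) * total

# Hypothetical component p-values, for illustration only.
chi2, df, combined_p = fisher_combined([0.04, 0.20, 0.01, 0.30])
print(f"chi2({df}) = {chi2:.2f}, combined p = {combined_p:.4f}")
```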
Conditional logistic version

. clogit est cc, group(quint)
note: multiple positive outcomes within groups encountered.
note: 5 groups (25 obs) dropped due to all positive or all negative outcomes.

Conditional (fixed-effects) logistic regression     Number of obs = 290

[Output: LR chi2(1) and the coefficient table for cc (Coef., Std. Err., z, P>|z|, 95% Conf. Interval). Values not shown.]

. clogit est cc, group(quint) or

[Same notes and Number of obs = 290; LR chi2(1), Prob > chi2, Log likelihood, Pseudo R2, and the Odds Ratio table for cc. Values not shown.]
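Reversing Case/Control and Exposure

. clogit cc est, group(quint)

Conditional (fixed-effects) logistic regression     Number of obs = 315

[Output: LR chi2(1), Prob > chi2, Log likelihood, Pseudo R2, and the coefficient table for est (Coef., Std. Err., z, P>|z|, 95% Conf. Interval). Values not shown.]

. clogit cc est, group(quint) or

[Same fit with Number of obs = 315; the table reports the Odds Ratio for est. Values not shown.]

With case status as the outcome, every matched set contains exactly one case and four controls, so no groups are dropped and all 315 observations are used. The conditional likelihoods of the two formulations are proportional, so the estimated odds ratio should be the same either way.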
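Why the two formulations agree can be checked numerically. The Python sketch below writes out the conditional likelihood contribution of one 1:4 matched set under each formulation, for a set in which m of the 5 members are exposed, and confirms that their ratio does not depend on beta (the log odds ratio). The function names are made up for this illustration.

```python
from math import comb, exp

N_CONTROLS = 4  # 1:4 matching, so each set has 5 members

def lik_exposure_outcome(beta, m, case_exposed):
    """Exposure as outcome, case status as covariate: probability of the
    observed exposure pattern given that m of the 5 members are exposed."""
    numer = exp(beta * case_exposed)
    denom = comb(N_CONTROLS, m - 1) * exp(beta) + comb(N_CONTROLS, m)
    return numer / denom

def lik_case_outcome(beta, m, case_exposed):
    """Case status as outcome, exposure as covariate: probability that the
    observed member is the case given that the set has exactly one case."""
    numer = exp(beta * case_exposed)
    denom = m * exp(beta) + (N_CONTROLS + 1 - m)
    return numer / denom

def likelihood_ratio(beta, m, case_exposed):
    return lik_exposure_outcome(beta, m, case_exposed) / lik_case_outcome(beta, m, case_exposed)

for m in range(1, 5):
    ratios = [likelihood_ratio(beta, m, 1) for beta in (-1.0, 0.0, 0.5, 2.0)]
    print(m, [round(r, 6) for r in ratios])  # constant within each row
```

A ratio that is constant in beta means the two log likelihoods differ only by an additive constant, so they are maximized at the same beta. The reported log likelihood values can still differ.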
Assessment of potential confounder

. clogit est hyp cc, group(quint) or
note: multiple positive outcomes within groups encountered.
note: 5 groups (25 obs) dropped due to all positive or all negative outcomes.

[Iteration log, iterations 0 to 3; log likelihood values not shown.]

Conditional (fixed-effects) logistic regression     Number of obs = 290

[Output: LR chi2(2), Prob > chi2, Log likelihood, Pseudo R2, and the Odds Ratio table for hyp and cc (Odds Ratio, Std. Err., z, P>|z|, 95% Conf. Interval). Values not shown.]
Assessment of age as a potential modifier (even though age was a part of the matching criteria)

. gen ac=age*cc
. clogit est hyp cc ac, group(quint) or
note: multiple positive outcomes within groups encountered.
note: 5 groups (25 obs) dropped due to all positive or all negative outcomes.

[Iteration log, iterations 0 to 3; log likelihood values not shown.]

Conditional (fixed-effects) logistic regression     Number of obs = 290

[Output: LR chi2(3), Prob > chi2, Log likelihood, Pseudo R2, and the Odds Ratio table for hyp, cc and ac (Odds Ratio, Std. Err., z, P>|z|, 95% Conf. Interval). Values not shown.]
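Notice that age*cc is included in the model even though age itself is not. This is a special case in which we CAN interpret a model with an interaction term even though one of the constituents of the interaction is not included in the model. Because age was a matching criterion, it is (nearly) constant within each matched set, so its main effect is absorbed by the conditioning on the set.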
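With an age*cc term in the model, the odds ratio is no longer a single number but a function of age. A minimal sketch of that calculation follows, assuming the model contains coefficients for cc and for ac = age*cc on the log-odds scale. The coefficient values are hypothetical placeholders, since the fitted values are not shown above.

```python
from math import exp

def odds_ratio_at_age(b_cc, b_ac, age):
    """Odds ratio implied at a given age by a log-odds model containing
    cc and the interaction ac = age*cc."""
    return exp(b_cc + b_ac * age)

# Hypothetical log-odds coefficients, for illustration only.
b_cc, b_ac = 3.0, -0.02

for age in (55, 65, 75):
    print(age, round(odds_ratio_at_age(b_cc, b_ac, age), 2))
```

Note that Stata's or option reports exp(b), so the table entries for cc and ac would have to be logged before being used this way; exp(b_ac) is the multiplicative change in the odds ratio per additional year of age.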