1
Chapter 18 Cross-Tabulated Counts
11/15/2018 Chapter 18 Cross-Tabulated Counts November 18 Basic Biostat
2
In Chapter 18:
18.1 Types of Samples
18.2 Naturalistic and Cohort Samples
18.3 Chi-Square Test of Association
18.4 Test for Trend
18.5 Case-Control
18.6 Matched Pairs
3
Types of Samples
I. Naturalistic Samples ≡ simple random sample or complete enumeration of the population
II. Purposive Cohorts ≡ select fixed number of individuals in each exposure group
III. Case-Control ≡ select fixed number of diseased and non-diseased individuals
4
Naturalistic (Type I) Sample
Random sample of study base
5
Naturalistic (Type I) Sample
Random sample of study base
How did we study CMV (the exposure) and restenosis (the disease) with a naturalistic sample?
A population was identified and sampled.
The sample was classified as CMV+ and CMV−.
The outcome (restenosis) was studied and compared in the groups.
6
Purposive Cohorts (Type II sample)
Fixed numbers in exposure groups
How would I study CMV and restenosis with a purposive cohort design?
A population of CMV+ individuals would be identified. From this population, select, say, 38 individuals.
A population of CMV− individuals would be identified. From this population, select, say, 38 individuals.
The outcome (restenosis) would be studied and compared among the groups.
7
Case-control (Type III sample)
Set number of cases and non-cases
How would I study CMV and restenosis with a case-control design?
A population of patients who experienced restenosis (cases) would be identified. From this population, select, say, 38 individuals.
A population of patients who did not restenose (controls) would be identified. From this population, select, say, 38 individuals.
The exposure (CMV) would be studied and compared among the groups.
8
Case-Control (Type III sample)
Set number of cases and non-cases
9
Naturalistic Sample Illustrative Example
SRS of N = 585
Cross-classify education level (categorical exposure) and smoking status (categorical disease)
Tally into an R-rows-by-C-columns “cross-tab”

Edu.     Smoke+   Smoke−   Total
HS         12       38       50
JC         18       67       85
JC+        27       95      122
UG         32      239      271
Grad        5       52       57
Total      94      491      585
10
Table Margins

Educ.    Smoke+   Smoke−   Total
HS         12       38       50
JC         18       67       85
Some       27       95      122
UG         32      239      271
Grad        5       52       57
Total      94      491      585

Row margins: the right-hand Total column
Column margins: the bottom Total row
Total: N = 585
11
Naturalistic & Cohort Samples
R-by-2 Table

          +     −     Total
Grp 1     a1    b1    n1
Grp 2     a2    b2    n2
  ↓
Grp R     aR    bR    nR
Total     m1    m2    N
12
Example
Prevalence of smoking by education: prevalence in group i = ai ÷ ni
Example, prevalence in group 1: 12 ÷ 50 = 0.240
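The prevalence calculation can be sketched in a few lines of Python. This is only an illustration using the counts from the education-by-smoking cross-tab; the names `table` and `prevalence` are made up for this sketch, and the third row is labelled "Some" as in the later slides.

```python
# Smokers (a_i) and row totals (n_i) from the education-by-smoking cross-tab.
table = {
    "HS": (12, 50),
    "JC": (18, 85),
    "Some": (27, 122),
    "UG": (32, 271),
    "Grad": (5, 57),
}


def prevalence(smokers, total):
    """Prevalence in one group: smokers divided by the group total."""
    return smokers / total


prevalences = {group: prevalence(a, n) for group, (a, n) in table.items()}

for group, p in prevalences.items():
    print(f"{group:5s} {p:.3f}")
```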
13
Relative Risks
Let group 1 represent the least exposed group
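As a rough sketch, each relative risk is the prevalence in group i divided by the prevalence in group 1. The code below assumes the row order of the education table, so that HS plays the role of group 1; the function name `relative_risks` is hypothetical.

```python
# (label, smokers, row total) in the row order of the education table.
counts = [
    ("HS", 12, 50),
    ("JC", 18, 85),
    ("Some", 27, 122),
    ("UG", 32, 271),
    ("Grad", 5, 57),
]


def relative_risks(rows):
    """Return RR_i = p_i / p_1, using the first row as the baseline group."""
    _, a1, n1 = rows[0]
    p1 = a1 / n1
    return {label: (a / n) / p1 for label, a, n in rows}


rrs = relative_risks(counts)

for label, rr in rrs.items():
    print(f"{label:5s} RR = {rr:.2f}")
```

The baseline group always has RR = 1 by construction; the other values are read relative to it.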
14
Illustration: RRs
Note trend
15
k Levels of Response
Efficacy of Echinacea. Randomized controlled clinical trial: echinacea vs. placebo in treatment of URI in children.
Response variable ≡ severity of illness
Source: JAMA 2003, 290(21)
16
Echinacea Example
Purposive cohorts, row percents
% severe, echinacea = 48 / 329 = .146 = 14.6%
% severe, placebo = 40 / 367 = .109 = 10.9%
Echinacea group fared worse than placebo
17
§18.3 Chi-Square Test of Association
A. Hypotheses. H0: no association in population; Ha: association in population
B. Test statistic – by hand or computer
C. P-value. Via Table E or software
18
Chi-Square Example
H0: no association in the population
Ha: association in the population
Data

Degree   Smoke +   Smoke −   Tot
HighS       12        38      50
JC          18        67      85
Some        27        95     122
UG          32       239     271
Grad         5        52      57
Total       94       491     585
19
Expected Frequencies (under H0)
Degree   Smoke +                     Smoke −                      Total
HighS    (50 × 94) ÷ 585 = 8.034     (50 × 491) ÷ 585 = 41.966     50
JC       13.658                      71.342                        85
Some     19.603                      102.397                      122
UG       43.545                      227.455                      271
Grad     9.159                       47.841                        57
Total    94                          491                          585
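The expected frequencies follow the rule (row total × column total) ÷ N. A minimal Python sketch, with the hypothetical helper name `expected_counts`, reproduces the table from the observed counts:

```python
# Observed counts: rows are HighS, JC, Some, UG, Grad; columns are Smoke +, Smoke -.
observed = [
    [12, 38],
    [18, 67],
    [27, 95],
    [32, 239],
    [5, 52],
]


def expected_counts(obs):
    """Expected count under H0 for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


expected = expected_counts(observed)

for row in expected:
    print("  ".join(f"{e:8.3f}" for e in row))
```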
20
Chi-Square Hand Calc.
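The hand calculation sums (observed − expected)² ÷ expected over all cells, with df = (R − 1)(C − 1). The sketch below is one way to check the arithmetic for the education-by-smoking data; the function name `pearson_chi_square` is illustrative only.

```python
# Observed counts: rows are HighS, JC, Some, UG, Grad; columns are Smoke +, Smoke -.
observed = [
    [12, 38],
    [18, 67],
    [27, 95],
    [32, 239],
    [5, 52],
]


def pearson_chi_square(obs):
    """Return (chi-square statistic, degrees of freedom) for an R-by-C table."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(obs):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count under H0
            stat += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


x2_stat, df = pearson_chi_square(observed)
print(f"X2 = {x2_stat:.2f} with {df} df")
```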
21
Chi-Square P-value
X2stat = 13.20 with 4 df
Table E: find the 4 df row, bracket the chi-square statistic, and look up the tail regions (approx. P-value)
The bracketing values for this example are 11.14 (P = .025) and 13.28 (P = .01); thus .01 < P < .025

Right tail   0.975   0.25   0.20   0.15   0.10   0.05   0.025   0.01    0.005
df = 4       0.48    5.39   5.99   6.74   7.78   9.49   11.14   13.28   14.86
22
Illustration: X2stat= 13.20 with 4 df
The P-value = AUC in the tail beyond X2stat
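Software gives the exact tail area rather than a bracket. As a sketch that needs only the standard library, the code below uses the closed-form right-tail area of the chi-square distribution, which is available when the degrees of freedom are even (as with 4 df here). The function name `chi2_right_tail_even_df` is hypothetical; a statistics package would normally be used instead.

```python
import math


def chi2_right_tail_even_df(x, df):
    """Right-tail area of a chi-square distribution with an even number of df.

    For df = 2k the tail area is exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2
    k = df // 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))


p_value = chi2_right_tail_even_df(13.20, 4)
print(f"P = {p_value:.4f}")
```

The result falls between the Table E bracketing probabilities of .01 and .025.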
23
WinPEPI > Compare2 > F1
Input screen (row 5 not visible)
Output
24
Continuity Corrected Chi-Square
Two different chi-square statistics; both used in practice
Pearson’s (“uncorrected”) chi-square
Yates’ continuity-corrected chi-square
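For reference, the two statistics in their usual textbook form are written out below, with O_i the observed count and E_i the expected count in cell i. The notation is an assumption here, since the slide's own formula image did not survive extraction.

```latex
X^2_{\text{Pearson}} = \sum_{\text{all cells}} \frac{(O_i - E_i)^2}{E_i}
\qquad
X^2_{\text{Yates}} = \sum_{\text{all cells}} \frac{\left(\lvert O_i - E_i \rvert - \tfrac{1}{2}\right)^2}{E_i}
```

The continuity correction shrinks each deviation by one half before squaring, so the Yates statistic is never larger than the Pearson statistic.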
25
Chi-Square, cont.
1. How the chi-square works. When observed values = expected values, the chi-square statistic is 0. As the differences between observed and expected values get larger, evidence against H0 mounts.
2. Avoid chi-square tests in small samples. Do not use a chi-square test when more than 20% of the cells have expected values that are less than 5.
26
Chi-Square, cont.
3. Supplement chi-squares with measures of association. Chi-square statistics do not quantify effects (need RR, RD, or OR).
4. Chi-square and z tests (Ch 17) produce identical P-values. For a 2-by-2 table, the relationship between the statistics is z² = X².
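The link between the two tests can be checked numerically. The sketch below assumes the z test uses the pooled standard error and, purely for illustration, collapses the echinacea data to a 2-by-2 table (severe vs. not severe: 48 of 329 on echinacea, 40 of 367 on placebo). Both function names are hypothetical.

```python
import math


def two_proportion_z(a1, n1, a2, n2):
    """z statistic for two independent proportions, pooled standard error."""
    p1, p2 = a1 / n1, a2 / n2
    pooled = (a1 + a2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def pearson_chi_square_2x2(a1, n1, a2, n2):
    """Uncorrected Pearson chi-square for the same 2-by-2 table."""
    obs = [[a1, n1 - a1], [a2, n2 - a2]]
    row_totals = [n1, n2]
    col_totals = [a1 + a2, (n1 - a1) + (n2 - a2)]
    n = n1 + n2
    stat = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n
            stat += (obs[i][j] - e) ** 2 / e
    return stat


z = two_proportion_z(48, 329, 40, 367)
x2 = pearson_chi_square_2x2(48, 329, 40, 367)
print(f"z squared = {z ** 2:.6f}, X2 = {x2:.6f}")
```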
27
18.4 Test for Trend
See pp. 431 – 436
28
§18.5 Case-Control Sampling
Identify all cases in source population
Randomly select non-cases (controls) from source population
Ascertain exposure status of subjects
Cross-tabulate
Efficient way to study rare outcomes
29
Case-Control Sampling
Select non-case at random when case occurs
Miettinen. Am J Epidemiol 1976; 103,
30
Odds Ratio
Cross-tabulate exposure (E) & disease (D)

       D+    D−
E+     a1    b1
E−     a2    b2

Calculate the cross-product ratio: OR = (a1 × b2) ÷ (b1 × a2)
OR stochastically = RR
31
BD1 Data
Cases: esophageal cancer
Controls: noncases selected at random from electoral lists
Exposure: alcohol consumption dichotomized at 80 g/day
Relative risk associated with exposure
32
(1 − α)100% CI for the OR
33
90% CI for OR – Example

       D+     D−
E+     96     109
E−     104    666
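A minimal sketch of the calculation, assuming the usual log-scale (Wald-type) interval: the OR is the cross-product ratio, the standard error of ln(OR) is the square root of the sum of reciprocal cell counts, and the limits are exponentiated back. The function name `odds_ratio_ci` is hypothetical, and the slide's own formula may differ in detail.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a1, b1, a2, b2, conf=0.95):
    """Cross-product odds ratio with a log-scale confidence interval."""
    or_hat = (a1 * b2) / (b1 * a2)
    se_ln_or = math.sqrt(1 / a1 + 1 / b1 + 1 / a2 + 1 / b2)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # e.g. about 1.645 for 90%
    lower = math.exp(math.log(or_hat) - z * se_ln_or)
    upper = math.exp(math.log(or_hat) + z * se_ln_or)
    return or_hat, lower, upper


# E+ row: 96 cases, 109 controls; E- row: 104 cases, 666 controls.
or_hat, lower, upper = odds_ratio_ci(96, 109, 104, 666, conf=0.90)
print(f"OR = {or_hat:.2f}, 90% CI = ({lower:.2f}, {upper:.2f})")
```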
34
WinPEPI > Compare2 > A.
Data entry and output
WinPEPI’s Mid-P interval is similar to ours
35
Ordinal Exposure
Break data up into multiple tables, using the least exposed level as baseline each time
36
Ordinal Exposure
Dose-response
37
18.6 Matched Pairs
Cohort matched pairs: each exposed individual uniquely matched to non-exposed individual
Case-control matched pairs: each case uniquely matched to a control
Controls for matching (confounding) factor
Requires special matched-pair analysis
38
Matched-Pairs, Cohort

              Exposed D+   Exposed D−
Non-exp D+        a            b
Non-exp D−        c            d
39
Matched-Pairs, Case-Control
              Case E+   Case E−
Control E+       a         b
Control E−       c         d
40
Matched-Pairs Case-Cntl Example
Cases = colon polyps; Controls = no polyps
Exposure = low fruit & veg consumption

            Case E+   Case E−
Cntl E+        ?         24
Cntl E−       45

88% higher risk w/ low fruit/veg consumption
41
Matched-Pairs - Example
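In a matched-pair analysis only the discordant pairs carry information, so the odds ratio is c ÷ b. The sketch below uses the polyps data (b = 24, c = 45) and assumes a log-scale interval with standard error sqrt(1/c + 1/b); the function name `matched_pair_or_ci` and the 95% level are assumptions of this illustration, not necessarily what the slide computed.

```python
import math
from statistics import NormalDist


def matched_pair_or_ci(b, c, conf=0.95):
    """Matched-pair odds ratio c / b with a log-scale confidence interval.

    b: pairs where only the control is exposed
    c: pairs where only the case is exposed
    """
    or_hat = c / b
    se_ln_or = math.sqrt(1 / c + 1 / b)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    lower = math.exp(math.log(or_hat) - z * se_ln_or)
    upper = math.exp(math.log(or_hat) + z * se_ln_or)
    return or_hat, lower, upper


or_hat, lower, upper = matched_pair_or_ci(b=24, c=45)
print(f"OR = {or_hat:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An OR of 1.875 corresponds to the roughly 88% higher risk quoted for low fruit and vegetable consumption.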
42
WinPEPI > PairEtc > A.
Input
Output
43
Hypothesis Test Matched Pairs
A. H0: OR = 1
B. McNemar’s test (z or chi-square)
C. P-value from z stat
Avoid if fewer than 5 discordancies expected
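As a sketch of steps B and C, McNemar's statistic in its uncorrected form is z = (c − b) ÷ sqrt(c + b), and its square is the corresponding chi-square with 1 df. The example reuses the discordant counts from the polyps case-control data (b = 24, c = 45); the function name `mcnemar_z_test` is hypothetical, and a two-sided P-value is assumed.

```python
import math
from statistics import NormalDist


def mcnemar_z_test(b, c):
    """Uncorrected McNemar test from the two discordant-pair counts.

    Returns (z statistic, two-sided P-value).
    """
    z = (c - b) / math.sqrt(c + b)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


z_stat, p_value = mcnemar_z_test(b=24, c=45)
print(f"z = {z_stat:.3f}, chi-square = {z_stat ** 2:.3f}, P = {p_value:.4f}")
```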
44
Twins Mortality Example
Smoker D+ Smoker D− Non-smoker D+ 5 Non-smoker D− 17