September 15. In Chapter 18: 18.1 Types of Samples 18.2 Naturalistic and Cohort Samples 18.3 Chi-Square Test of Association 18.4 Test for Trend 18.5 Case-Control.

1 September 15

2 In Chapter 18: 18.1 Types of Samples 18.2 Naturalistic and Cohort Samples 18.3 Chi-Square Test of Association 18.4 Test for Trend 18.5 Case-Control Samples 18.6 Matched Pairs

3 §18.1 Types of Samples The prior chapter considered categorical response variables with two possible outcomes. This chapter considers categorical variables with any number of possible outcomes.

4 Types of Samples, cont. Data may be generated by: I. Naturalistic Samples. An SRS with data then cross-classified according to the explanatory variable and response variable. II. Purposive Cohort Samples. Fixed numbers of individuals selected according to the explanatory factor. III. Case-Control Samples. Fixed numbers of individuals selected according to the outcome variable.

5 Naturalistic Samples Take an SRS from the population; then cross-classify individuals with respect to explanatory and response variables.

6 Purposive Cohort Samples Select predetermined numbers of exposed and nonexposed individuals; then ascertain outcomes in individuals.

7 Case-Control Samples Identify individuals who are positive for the outcome (cases); then sample the population for negative (controls).

8 §18.2 Naturalistic and Cohort Samples Data from a naturalistic sample are shown in this 5-by-2 table. For uniformity, always put the explanatory variable in the rows of such tables. Totals are tallied in the table margins.

                 Smoke +   Smoke −   Total
High school         12        38       50
Assoc. degree       18        67       85
Some college        27        95      122
UG degree           32       239      271
Grad degree          5        52       57
Total               94       491      585

9 Marginal Distributions For naturalistic samples (only), describe the marginal distributions. These may be reported graphically or in terms of percentages. Top figure: column marginal distribution. Bottom figure: row marginal distribution.

10 Conditional Percents The relationship between the row variable and column variable is explored with conditional percents. Two types of conditional percents: Row percents: use in cohort and naturalistic samples (describe prevalence and incidence). Column percents: use in case-control samples.

11 Incidence and Prevalence (Naturalistic and Cohort Samples only) The table below demonstrates R-by-C table notation (R rows and C columns). For naturalistic and cohort samples, row percents in column 1 represent group incidences or prevalences.

            Smoke +   Smoke −   Total
Group 1       a1        b1        n1
Group 2       a2        b2        n2
  ↓            ↓         ↓         ↓
Group R       aR        bR        nR
Total         m1        m2        N

12 Prevalences - Example This table shows prevalence by education level. Example of calculation, prevalence in group 1: a1 / n1 = 12 / 50 = 0.24 (24%).
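The row-percent calculation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the dictionary `counts` and the helper `prevalence` are names chosen here, and the counts are those of the education-by-smoking table in §18.2.

```python
# Counts from the education-by-smoking table in §18.2: (smokers, nonsmokers)
counts = {
    "High school":   (12, 38),
    "Assoc. degree": (18, 67),
    "Some college":  (27, 95),
    "UG degree":     (32, 239),
    "Grad degree":   (5, 52),
}

def prevalence(a, b):
    """Row proportion in column 1: a / (a + b), i.e. a_i / n_i."""
    return a / (a + b)

prevalences = {group: prevalence(a, b) for group, (a, b) in counts.items()}

for group, p in prevalences.items():
    print(f"{group:14s} {100 * p:5.1f}%")
```

Each printed value is a row percent for the Smoke + column, the quantity the slide calls the group prevalence.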

13 Relative Risks, R-by-2 Tables Let group 1 represent the least exposed group. The relative risk for group i compares its incidence or prevalence with that of group 1: RR_i = p_i / p_1, where p_i = a_i / n_i.

14 RRs, R-by-2 Tables, Example This table lists the RRs for the illustrative data. Example of calculation. Notice the downward dose-response in the RRs.
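A small sketch of the relative-risk calculation for the illustrative data follows. It assumes the usual definition RR_i = p_i / p_1 with group 1 (high school) as the reference group; the variable names are chosen here for illustration only.

```python
# (group label, smokers a_i, row total n_i) from the §18.2 table
rows = [
    ("High school",   12, 50),
    ("Assoc. degree", 18, 85),
    ("Some college",  27, 122),
    ("UG degree",     32, 271),
    ("Grad degree",    5, 57),
]

# Prevalence in the reference group (group 1)
_, a1, n1 = rows[0]
p1 = a1 / n1

# Relative risk of each group versus group 1: RR_i = (a_i / n_i) / (a_1 / n_1)
relative_risks = [(label, (a / n) / p1) for label, a, n in rows]

for label, rr in relative_risks:
    print(f"{label:14s} RR = {rr:.2f}")
```

By construction the reference group has RR = 1; the remaining values show how prevalence changes with education level relative to that baseline.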

15 Odds Ratios, R-by-2 Tables (optional) The odds of an event is the ratio of successes to failures: odds_i = a_i / b_i. The odds ratio associated with exposure level i in an R-by-2 table is OR_i = (a_i / b_i) / (a_1 / b_1). Interpretation: ORs are read much like RRs, e.g., OR ≈ 1 implies no association (see chapter for details).
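The odds ratios for an R-by-2 table can be sketched as below. The definition used, OR_i = (a_i / b_i) / (a_1 / b_1) with group 1 as the reference, is an assumption consistent with the slide's description of odds as successes over failures; the helper names are hypothetical.

```python
# (group label, smokers a_i, nonsmokers b_i) from the §18.2 table
rows = [
    ("High school",   12, 38),
    ("Assoc. degree", 18, 67),
    ("Some college",  27, 95),
    ("UG degree",     32, 239),
    ("Grad degree",    5, 52),
]

def odds(a, b):
    """Odds = successes / failures."""
    return a / b

# Odds in the reference group (group 1)
reference_odds = odds(rows[0][1], rows[0][2])

# Odds ratio of each group versus group 1
odds_ratios = [(label, odds(a, b) / reference_odds) for label, a, b in rows]

for label, or_i in odds_ratios:
    print(f"{label:14s} OR = {or_i:.2f}")
```

Because smoking is not rare in these data, the ORs sit somewhat further from 1 than the corresponding RRs.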

16

17 Responses with More than Two Levels of Outcome Efficacy of Echinacea. A randomized controlled clinical trial pitted echinacea vs. placebo in the treatment of upper respiratory symptoms in children. The response variable was severity of illness, classified as mild, moderate, or severe. Source: JAMA 2003, 290(21), 2824-30

18 Echinacea, Conditional Distributions Row percents are calculated to determine the incidence of each outcome. Example of calculation, top right table cell (data on prior slide): % severe w/echinacea = 48 / 329 × 100% = 14.6%. Conclusion: the treatment group fared slightly worse than the control group; 14.6% of the treatment group experienced severe symptoms compared to 10.9% of the control group.

19 §18.3 Chi-Square Test of Association A. Hypotheses. H0: no association in population versus Ha: association in population. B. Test statistic. X² stat = Σ (O − E)² / E, summed over all table cells, with df = (R − 1)(C − 1). C. P-value. Convert the X² stat to a P-value with Table E or a software program.

20 Chi-Square Test - Example Data below reveal a negative association between smoking and education level. Let us test H0: no association in the population vs. Ha: association in the population.

21 χ², Expected Frequencies

22 Chi-Square Statistic - Example
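The expected frequencies and the chi-square statistic for the illustrative data can be reproduced with a short script. This is a sketch assuming the standard definitions (expected count = row total × column total / N, and Pearson's statistic as the sum of (O − E)² / E over all cells); the function names are chosen here for illustration.

```python
# Observed counts from the §18.2 table: rows are education levels,
# columns are (Smoke +, Smoke −)
observed = [
    (12, 38),
    (18, 67),
    (27, 95),
    (32, 239),
    (5, 52),
]

def expected_counts(table):
    """Expected count for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def pearson_chi_square(table):
    """Pearson's statistic: sum over cells of (O - E)^2 / E."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

x2 = pearson_chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (R - 1)(C - 1)

for row in expected_counts(observed):
    print("  ".join(f"{e:7.2f}" for e in row))
print(f"X2 = {x2:.2f} with {df} df")
```

The result agrees with the value quoted on the P-value slide (X² stat = 13.20 with 4 df).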

23 Chi-Square Test, P-value X² stat = 13.20 with 4 df. Using Table E, find the row for 4 df. Find the chi-square values in this row that bracket 13.20. Bracketing values are 11.14 (P = .025) and 13.28 (P = .01). Thus, .01 < P < .025 (closer to .01).

Probability in right tail
df    0.98   0.25   0.20   0.15   0.10   0.05   0.025   0.01    0.005
4     0.48   5.39   5.99   6.74   7.78   9.49   11.14   13.28   14.86

24 Illustrative example X² stat = 13.20 with 4 df. The P-value = area under the curve (AUC) in the tail beyond the X² stat.
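The tail area can also be computed directly rather than bracketed from a table. The sketch below uses the closed-form upper tail of the chi-square distribution that holds when the degrees of freedom are even; the function name is hypothetical, and in practice a statistics package (for example `scipy.stats.chi2.sf`) would normally be used instead.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability P(X^2 > x) for a chi-square variable.

    Valid only for even df, where the tail has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

p_value = chi2_upper_tail_even_df(13.20, 4)
print(f"P = {p_value:.4f}")
```

The computed P-value falls between the table bounds of .01 and .025, close to .01, as the bracketing step indicates.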

25 Chi-Square By Computer Here are results for the illustrative data from WinPepi > Compare2.exe > Program F Categorical Data

26 Yates’ Continuity-Corrected Chi-Square Statistic Two different chi-square statistics are used in practice. Pearson’s chi-square statistic (covered) is X² stat = Σ (O − E)² / E. Yates’ continuity-corrected chi-square statistic is X² stat,c = Σ (|O − E| − ½)² / E. The continuity-corrected method produces smaller chi-square statistics and larger P-values.
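The two statistics can be compared side by side. The sketch below assumes the usual forms, Σ (O − E)² / E for Pearson and Σ (|O − E| − 0.5)² / E for Yates, and applies both to the illustrative 5-by-2 table purely for comparison; the `correction` parameter is a name introduced here.

```python
# Observed counts from the §18.2 table: (Smoke +, Smoke −) by education level
observed = [(12, 38), (18, 67), (27, 95), (32, 239), (5, 52)]

def expected_counts(table):
    """Expected count for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def chi_square(table, correction=0.0):
    """Sum of (|O - E| - correction)^2 / E over all cells.

    correction=0.0 gives Pearson's statistic; correction=0.5 gives the
    continuity-corrected form. The corrected form is only sensible when
    every |O - E| is at least 0.5, which holds for these data.
    """
    expected = expected_counts(table)
    return sum(
        (abs(o - e) - correction) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

pearson = chi_square(observed)
yates = chi_square(observed, correction=0.5)
print(f"Pearson X2 = {pearson:.2f}, continuity-corrected X2 = {yates:.2f}")
```

The corrected statistic comes out smaller than Pearson's, which is why it yields the larger P-value.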

27 Chi-Square, cont. 1. How the chi-square works. When observed values = expected values, the chi-square statistic is 0. As the differences between observed and expected values grow, the statistic gets larger and evidence against H0 mounts. 2. Avoid chi-square tests in small samples. Do not use a chi-square test when more than 20% of the cells have expected values that are less than 5.

28 Chi-Square, cont. 3. Supplement chi-squares with measures of association. Chi-square statistics do not measure the strength of association. Use descriptive statistics or RRs to quantify “strength”. 4. For 2-by-2 tables, chi-square and z tests (Ch 17) produce identical P-values. The relationship between the statistics is: X² stat = (z stat)².

29 §18.4 Test for Trend See pp. 431–436

30 §18.5 Case-Control Samples Case-control sampling method: (1) Identify all cases in the population. (2) From the same source population, randomly select a series of non-cases (controls). (3) Ascertain the exposure status of cases and controls. (4) Cross-tabulate the exposure status of cases and controls. This provides an efficient way to study rare outcomes.

31 Incidence Density Sampling This advanced concept allows students to see that case-control studies are a type of longitudinal “time-failure” design. As cases are identified in the population, select at random one or more noncases (controls) for each case at the time of occurrence.
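The selection rule can be sketched as follows. This is an illustrative toy, not a procedure from the slides: the cohort in `event_times` is invented, and the function `incidence_density_sample` and its parameters are hypothetical names. For each case, controls are drawn at random from those still free of the outcome at the moment the case occurs.

```python
import random

def incidence_density_sample(event_times, controls_per_case=1, seed=0):
    """Pick controls for each case from the risk set at the case's onset time.

    event_times maps a person id to the time the person became a case,
    or None if the person never became a case during follow-up.
    Returns a list of (case_id, onset_time, [control_ids]).
    """
    rng = random.Random(seed)
    cases = sorted((t, pid) for pid, t in event_times.items() if t is not None)
    matched = []
    for onset, case_id in cases:
        # Risk set: everyone other than the case who is still a noncase at
        # this time. A person who becomes a case later is still eligible now.
        risk_set = [
            pid for pid, t in event_times.items()
            if pid != case_id and (t is None or t > onset)
        ]
        k = min(controls_per_case, len(risk_set))
        matched.append((case_id, onset, rng.sample(risk_set, k)))
    return matched

# Hypothetical cohort: onset time in years, or None for persons who stay noncases
event_times = {"p1": 2.0, "p2": None, "p3": 5.5, "p4": None, "p5": 9.0, "p6": None}

matched_sets = incidence_density_sample(event_times, controls_per_case=2, seed=1)
for case_id, onset, controls in matched_sets:
    print(f"case {case_id} at t={onset}: controls {controls}")
```

Sampling from the risk set at each onset time is what ties the design to follow-up time, so that the resulting odds ratio can be read as a rate ratio.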

32 Odds Ratio Cross-tabulate the counts of cases and controls according to their exposure status:

              Cases   Controls   Total
Exposed         a1       b1        n1
Nonexposed      a2       b2        n2
Total           m1       m2        N

The odds ratio is the cross-product ratio: OR = (a1 × b2) / (a2 × b1). With incidence density sampling, the OR is a direct estimate of the rate ratio in the population!

33 Case-Control Illustrative Example Cases: men diagnosed with esophageal cancer. Controls: noncases selected at random from electoral lists in the same region. Exposure: alcohol consumption dichotomized at 80 g/day. Interpretation: the rate ratio associated with high alcohol consumption is about 5.6.

34 (1 − α)100% CI for the OR Note the use of the natural logarithmic scale.

35 90% CI for the OR – Example

        Cases   Cntls
E+        96     109
E−       104     666
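The odds ratio and its confidence interval for these data can be worked through in code. The sketch assumes the common log-scale (Wald-type) interval, exp(ln OR ± z × SE) with SE = sqrt(1/a1 + 1/b1 + 1/a2 + 1/b2); the slide's own formula is not reproduced in this transcript, so treat that choice, along with the function name, as an assumption.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(a1, b1, a2, b2, confidence=0.90):
    """Cross-product odds ratio with a log-scale confidence interval.

    a1, b1: exposed cases and exposed controls
    a2, b2: nonexposed cases and nonexposed controls
    """
    or_hat = (a1 * b2) / (a2 * b1)
    se_ln_or = math.sqrt(1 / a1 + 1 / b1 + 1 / a2 + 1 / b2)
    # Two-sided critical value, e.g. about 1.645 for 90% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(or_hat) - z * se_ln_or)
    upper = math.exp(math.log(or_hat) + z * se_ln_or)
    return or_hat, lower, upper

# Esophageal cancer and alcohol data from the slide
or_hat, lower, upper = odds_ratio_ci(a1=96, b1=109, a2=104, b2=666)
print(f"OR = {or_hat:.2f}, 90% CI ({lower:.2f}, {upper:.2f})")
```

The point estimate matches the rate ratio of about 5.6 quoted in the illustrative example; working on the log scale keeps the interval limits positive and asymmetric around the OR.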

36 Case-Control - Example Results from WinPepi > Compare2.exe > A. WinPepi uses a slightly different formula than ours; the Mid-P results are similar to ours.

37 Case-Control Studies with Multiple Levels of Exposure With an ordinal exposure, compare each exposure level to the non-exposed group (next slide):

38 Case-Control, Ordinal Levels of Exposure Note dose-response relationship

39 §18.6 Matched Pairs With matched-pair samples, each participant is carefully matched to a unique individual as part of the selection process. This technique is used to mitigate confounding by the matching factor. Both cohort and case-control samples may avail themselves of matching.

40 Here’s the notation for matched-pair case-control data:

              Case E+   Case E−
Control E+       a         b
Control E−       c         d

The odds ratio associated with exposure is: OR = c / b. The confidence interval is: exp(ln OR ± z × sqrt(1/c + 1/b)).

41 Matched Pairs - Example A matched case-control study found 45 pairs in which the case but not the control had a low fruit/veg diet; it found 24 pairs in which the control but not the case had a low fruit/veg diet.

            Case E+    Case E−
Cntl E+     unknown      24
Cntl E−       45       unknown

The odds ratio suggests 88% higher risk in low fruit/veg consumers.

42 Matched Pair Example, cont. Data are compatible with ORs between 1.14 and 3.07. WinPepi’s PairEtc.exe program A calculates exact confidence intervals for ORs from matched-pair data. Hand-calculated limits will be similar except in small samples.
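A hand calculation for the matched-pair example can be sketched as below. It assumes the usual discordant-pair estimator (pairs with only the case exposed divided by pairs with only the control exposed) and a log-scale interval with SE = sqrt(1/c + 1/b), computed at an assumed 95% confidence level; the function and argument names are introduced here for illustration.

```python
import math
from statistics import NormalDist

def matched_pair_or_ci(case_only_exposed, control_only_exposed, confidence=0.95):
    """Matched-pair odds ratio from the discordant pairs, with a log-scale CI."""
    c = case_only_exposed       # case E+, control E−
    b = control_only_exposed    # case E−, control E+
    or_hat = c / b
    se_ln_or = math.sqrt(1 / c + 1 / b)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(or_hat) - z * se_ln_or)
    upper = math.exp(math.log(or_hat) + z * se_ln_or)
    return or_hat, lower, upper

# Low fruit/veg diet example: 45 and 24 discordant pairs
or_hat, lower, upper = matched_pair_or_ci(45, 24)
print(f"OR = {or_hat:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

Only the discordant pairs enter the calculation, which is why the unknown concordant cells in the table do not matter. The limits land close to the 1.14 and 3.07 quoted on the slide.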

43 Hypothesis Test, Matched Pairs A. H0: OR = 1. B. McNemar’s test statistic. C. P-values. Convert the z stat to a P-value with Table B or Table F. If fewer than 5 discordancies are expected, use an exact binomial procedure (see text).

44 Hypothesis Test, Example

               Case E+    Case E−
Control E+     unknown      24
Control E−       45       unknown
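The test for these data can be sketched as follows. The slide's formula is not reproduced in this transcript, so the code assumes the uncorrected z form of McNemar's statistic, z = (c − b) / sqrt(c + b), where c and b are the two discordant-pair counts; some texts subtract a continuity correction of 1 from |c − b|, which would give a slightly smaller statistic.

```python
import math
from statistics import NormalDist

def mcnemar_z_test(case_only_exposed, control_only_exposed):
    """McNemar's test for matched pairs (z form, no continuity correction).

    Returns the z statistic and the two-sided P-value.
    """
    c = case_only_exposed       # case E+, control E−
    b = control_only_exposed    # case E−, control E+
    z = (c - b) / math.sqrt(c + b)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

z_stat, p_value = mcnemar_z_test(45, 24)
print(f"z = {z_stat:.2f}, two-sided P = {p_value:.4f}")
```

With 69 discordant pairs, the expected count in each discordant cell under H0 is well above 5, so the normal approximation rather than the exact binomial procedure is reasonable here.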

