September 15. In Chapter 18: 18.1 Types of Samples 18.2 Naturalistic and Cohort Samples 18.3 Chi-Square Test of Association 18.4 Test for Trend 18.5 Case-Control.

Presentation transcript:

September 15

In Chapter 18: 18.1 Types of Samples 18.2 Naturalistic and Cohort Samples 18.3 Chi-Square Test of Association 18.4 Test for Trend 18.5 Case-Control Samples 18.6 Matched Pairs

§18.1 Types of Samples The prior chapter considered categorical response variables with two possible outcomes. This chapter considers categorical variables with any number of possible outcomes.

Types of Samples, cont. Data may be generated by: I. Naturalistic Samples. An SRS with data then cross-classified according to the explanatory variable and response variable. II. Purposive Cohort Samples. Fixed numbers of individuals selected according to the explanatory factor. III. Case-Control Samples. Fixed numbers of individuals selected according to the outcome variable.

Naturalistic Samples Take an SRS from the population; then cross-classify individuals with respect to explanatory and response variables.

Purposive Cohort Samples Select predetermined numbers of exposed and nonexposed individuals; then ascertain outcomes in individuals.

Case-Control Samples Identify individuals who are positive for the outcome (cases); then sample the population for negative (controls).

§18.2 Naturalistic and Cohort Samples Data from a naturalistic sample are shown in this 5-by-2 table. Let us always put the explanatory variable in the rows of such tables (for uniformity). Totals are tallied in the table margins. [Table: rows are High school, Assoc. degree, Some college, UG degree, Grad degree, Total; columns are Smoke +, Smoke −, Total.]

Marginal Distributions For naturalistic samples (only), describe marginal distributions. These may be reported graphically or in terms of percentages. Top figure: column marginal distribution. Bottom figure: row marginal distribution.

Conditional Percents The relationship between the row variable and column variable is explored with conditional percents. Two types of conditional percents: Row percents: use in cohort and naturalistic samples (describe prevalence and incidence). Column percents: use in case-control samples.

Incidence and Prevalence (Naturalistic and Cohort Samples only) The table below demonstrates R-by-C table notation (R rows and C columns). For naturalistic and cohort samples, row percents in column 1 represent group incidences or prevalences.
           Smoke+   Smoke−   Total
Group 1    a1       b1       n1
Group 2    a2       b2       n2
⋮          ⋮        ⋮        ⋮
Group R    aR       bR       nR
Total      m1       m2       N

Prevalences - Example This table shows prevalence by education level. Example of calculation, prevalence group 1: $\hat{p}_1 = a_1 / n_1$

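Relative Risks, R-by-2 Tables Let group 1 represent the least exposed group. Relative risks are calculated as follows: $RR_i = \hat{p}_i / \hat{p}_1$, where $\hat{p}_i = a_i / n_i$ is the incidence or prevalence in group i.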

RRs, R-by-2 Tables, Example This table lists RRs for the illustrative data. Example of calculation. Notice the downward dose-response in RRs.
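The row-percent and relative-risk calculations can be sketched in a few lines of Python. This is a minimal illustration, not the slides' worked example: the counts in `table` are hypothetical (the education and smoking counts are not preserved in this transcript), and the function names are made up for the sketch.

```python
# Hypothetical R-by-2 counts (NOT the data from the slides).
# Each row is (a_i, b_i) = (outcome present, outcome absent);
# row 0 is the least exposed (reference) group.
table = [(10, 40), (18, 42), (30, 30)]

def row_proportions(rows):
    """Return p_i = a_i / n_i for each row, where n_i = a_i + b_i."""
    return [a / (a + b) for a, b in rows]

def relative_risks(rows):
    """Return RR_i = p_i / p_1, using the first row as the reference group."""
    p = row_proportions(rows)
    return [p_i / p[0] for p_i in p]

prevalences = row_proportions(table)
rrs = relative_risks(table)
for i, (p, rr) in enumerate(zip(prevalences, rrs), start=1):
    print(f"group {i}: prevalence = {p:.3f}, RR = {rr:.2f}")
```

By construction the reference group always has RR = 1, so a dose-response pattern shows up as RRs that move steadily away from 1 across rows.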

Odds Ratios, R-by-2 Tables (optional) The odds of an event is the ratio of successes to failures: $o_i = a_i / b_i$. The odds ratio associated with exposure level i in an R-by-2 table is $OR_i = o_i / o_1 = (a_i / b_i) / (a_1 / b_1)$. Interpretation: ORs are read like RRs, e.g., OR ≈ 1 implies no association (see chapter for details).
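A short sketch of the odds and odds-ratio calculation for an R-by-2 table follows. The counts are hypothetical and the function name `odds_ratios` is an assumption made for illustration; the first row is treated as the reference (least exposed) group.

```python
# Hypothetical R-by-2 counts (NOT the data from the slides).
# Each row is (a_i, b_i) = (successes, failures); row 0 is the reference group.
table = [(10, 40), (18, 42), (30, 30)]

def odds_ratios(rows):
    """Return OR_i = (a_i / b_i) / (a_1 / b_1) for each row."""
    a1, b1 = rows[0]
    reference_odds = a1 / b1
    return [(a / b) / reference_odds for a, b in rows]

ors = odds_ratios(table)
for i, value in enumerate(ors, start=1):
    print(f"group {i}: OR = {value:.2f}")
```

With these counts the ORs sit farther from 1 than the corresponding risk ratios would, which is typical when the outcome is common.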

Responses with More than Two Levels of Outcome Efficacy of Echinacea. A randomized controlled clinical trial pitted echinacea vs. placebo in the treatment of upper respiratory symptoms in children. The response variable was severity of illness classified as: mild, moderate or severe. Source: JAMA 2003, 290(21).

Echinacea, Conditional Distributions Row percents are calculated to determine the incidence of each outcome. Example of calculation, top right table cell (data on prior slide): % severe w/echinacea = 48 / 329 × 100% = 14.6%. Conclusion: the treatment group fared slightly worse than the control group: 14.6% of the treatment group experienced severe symptoms compared to 10.9% of the control group.

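§18.3 Chi-Square Test of Association A. Hypotheses. $H_0$: no association in population versus $H_a$: association in population. B. Test statistic. $X^2_{\text{stat}} = \sum (O_i - E_i)^2 / E_i$, where $O_i$ and $E_i$ are the observed and expected counts in cell i, with df = (R − 1)(C − 1). C. P-value. Convert the $X^2_{\text{stat}}$ to a P-value with Table E or a software program.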

Chi-Square Test - Example Data below reveal a negative association between smoking and education level. Let us test $H_0$: no association in the population vs. $H_a$: association in the population.

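$\chi^2$, Expected Frequencies The expected frequency in each cell under $H_0$ is $E_i = (\text{row total} \times \text{column total}) / N$.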

Chi-Square Statistic - Example
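The expected frequencies and Pearson's chi-square statistic can be computed with a short Python sketch. The `observed` counts below are hypothetical (the slide's table values are not preserved in this transcript), and the function names are assumptions for illustration only.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def pearson_chi_square(observed):
    """Return (X2, df) where X2 = sum of (O - E)^2 / E over all cells."""
    expected = expected_counts(observed)
    x2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return x2, df

# Hypothetical 3-by-2 table (NOT the data from the slides).
observed = [[10, 40], [18, 42], [30, 30]]
x2, df = pearson_chi_square(observed)
print(f"X2 = {x2:.2f} with {df} df")
```

Expected counts preserve the table margins, so each row of `expected_counts(observed)` sums to the same row total as the observed data.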

Chi-Square Test, P-value The $X^2_{\text{stat}}$ has 4 df. Using Table E, find the row for 4 df. Find the chi-square values in this row that bracket the $X^2_{\text{stat}}$. Bracketing values are 11.14 (P = .025) and 13.28 (P = .01). Thus, .01 < P < .025 (closer to .01).

Illustrative example The $X^2_{\text{stat}}$ has 4 df. The P-value = area under the curve (AUC) in the tail beyond the $X^2_{\text{stat}}$.
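As a sketch of the software route to a P-value, the right-tail area of a chi-square distribution has a closed form when the degrees of freedom are even, which covers the 4 df case here. The function name is hypothetical, and the inputs used below are the standard table critical values for 4 df rather than the slide's test statistic, which is not preserved in this transcript.

```python
import math

def chi_square_right_tail_even_df(x2, df):
    """Right-tail area P(X2 > x2) for a chi-square with even df.

    Uses the identity exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires positive, even df")
    half = x2 / 2
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series

# Table critical values for 4 df should give back their tail areas.
p_upper = chi_square_right_tail_even_df(11.14, 4)  # close to .025
p_lower = chi_square_right_tail_even_df(13.28, 4)  # close to .01
print(f"P(11.14) = {p_upper:.4f}, P(13.28) = {p_lower:.4f}")
```

Any statistic that falls between the two critical values has a P-value between the two tail areas, which is the bracketing argument made with the table. For odd df a statistics library would be needed instead of this shortcut.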

Chi-Square By Computer Here are results for the illustrative data from WinPepi > Compare2.exe > Program F Categorical Data

Yates’ Continuity-Corrected Chi-Square Statistic Two different chi-square statistics are used in practice. Pearson’s chi-square statistic (covered) is $X^2_{\text{stat}} = \sum (O_i - E_i)^2 / E_i$. Yates’ continuity-corrected chi-square statistic is $X^2_{\text{stat,c}} = \sum (|O_i - E_i| - 0.5)^2 / E_i$. The continuity-corrected method produces smaller chi-square statistics and larger P-values.
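The difference between the two statistics is easy to see numerically. The sketch below computes both for one hypothetical 2-by-2 table; the counts and the function name are assumptions for illustration, and the correction is applied exactly as in the textbook formula (subtracting 0.5 from each absolute deviation, without clamping at zero).

```python
def chi_square_2x2(observed, yates=False):
    """Pearson chi-square for a table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    correction = 0.5 if yates else 0.0
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            total += (abs(o - e) - correction) ** 2 / e
    return total

# Hypothetical 2-by-2 table (NOT data from the slides).
observed = [[20, 30], [10, 40]]
pearson = chi_square_2x2(observed)
corrected = chi_square_2x2(observed, yates=True)
print(f"Pearson = {pearson:.3f}, Yates = {corrected:.3f}")
```

Because the corrected statistic is smaller, its right-tail area is larger, so the corrected test is the more conservative of the two.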

Chi-Square, cont. 1. How the chi-square works. When observed values = expected values, the chi-square statistic is 0. As the differences between observed and expected values get larger, the chi-square statistic grows and evidence against $H_0$ mounts. 2. Avoid chi-square tests in small samples. Do not use a chi-square test when more than 20% of the cells have expected values that are less than 5.
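The small-sample rule can be checked mechanically before running the test. This is a minimal sketch with hypothetical tables and a made-up function name; the thresholds (expected count of 5, 20% of cells) are the ones stated on the slide.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def chi_square_is_appropriate(observed, min_expected=5, max_fraction=0.20):
    """True unless more than max_fraction of cells have expected < min_expected."""
    cells = [e for row in expected_counts(observed) for e in row]
    n_small = sum(1 for e in cells if e < min_expected)
    return n_small / len(cells) <= max_fraction

# Hypothetical tables (NOT data from the slides).
large_table = [[10, 40], [18, 42], [30, 30]]
small_table = [[1, 2], [3, 20]]
print(chi_square_is_appropriate(large_table))
print(chi_square_is_appropriate(small_table))
```

Note that the rule is applied to expected counts, not observed counts: a table can contain a small observed cell and still pass.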

Chi-Square, cont. 3. Supplement chi-squares with measures of association. Chi-square statistics do not measure the strength of association. Use descriptive statistics or RRs to quantify “strength”. 4. For 2-by-2 tables, chi-square and z tests (Ch 17) produce identical P-values. The relationship between the statistics is: $X^2_{\text{stat}} = (z_{\text{stat}})^2$
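The equivalence of the two tests in a 2-by-2 table can be verified numerically. The sketch below assumes the pooled two-proportion z statistic and the uncorrected Pearson chi-square (using the standard 2-by-2 shortcut formula); the counts and function names are hypothetical.

```python
import math

def two_proportion_z(a1, n1, a2, n2):
    """Pooled z statistic for comparing proportions a1/n1 and a2/n2."""
    p1, p2 = a1 / n1, a2 / n2
    pooled = (a1 + a2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def pearson_chi_square_2x2(a1, b1, a2, b2):
    """Uncorrected Pearson chi-square via N(a1*b2 - b1*a2)^2 / (margins)."""
    n = a1 + b1 + a2 + b2
    numerator = n * (a1 * b2 - b1 * a2) ** 2
    denominator = (a1 + b1) * (a2 + b2) * (a1 + a2) * (b1 + b2)
    return numerator / denominator

# Hypothetical 2-by-2 counts (NOT data from the slides).
a1, b1, a2, b2 = 20, 30, 10, 40
z = two_proportion_z(a1, a1 + b1, a2, a2 + b2)
x2 = pearson_chi_square_2x2(a1, b1, a2, b2)
print(f"z^2 = {z ** 2:.4f}, X2 = {x2:.4f}")
```

The identity holds for the uncorrected statistic only; a continuity-corrected chi-square will not equal the square of the uncorrected z.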

§18.4 Test for Trend See pp. 431–436.

§18.5 Case-Control Samples Case-control sampling method: 1. Identify all cases in the population. 2. From the same source population, randomly select a series of non-cases (controls). 3. Ascertain the exposure status of cases and controls. 4. Cross-tabulate the exposure status of cases and controls. This provides an efficient way to study rare outcomes.

Incidence Density Sampling This advanced concept allows students to see that case-control studies are a type of longitudinal “time-failure” design. As cases are identified in the population, select at random one or more noncases (controls) for each case at the time of occurrence.

Odds Ratio Cross-tabulate the count of cases and controls according to their exposure status:
              Cases   Controls   Total
Exposed       a1      b1         n1
Nonexposed    a2      b2         n2
Total         m1      m2         N
The odds ratio is the cross-product ratio: $OR = (a_1 b_2) / (a_2 b_1)$. With incidence density sampling, the OR is a direct estimate of the rate ratio in the population!

Case-Control Illustrative Example Cases: men diagnosed with esophageal cancer. Controls: noncases selected at random from electoral lists in the same region. Exposure = alcohol consumption dichotomized at 80 g/day. Interpretation: The rate ratio associated with high alcohol consumption is about 5.6.
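The cross-product ratio behind this interpretation can be sketched as follows. Only the nonexposed counts (104 cases, 666 controls) survive in this transcript, on a later slide; the exposed counts of 96 cases and 109 controls are an assumption taken from the widely reproduced version of this esophageal cancer dataset, and the function name is hypothetical.

```python
def cross_product_odds_ratio(a1, b1, a2, b2):
    """OR = (a1 * b2) / (a2 * b1) for a 2-by-2 case-control table.

    a1, b1: exposed cases and exposed controls
    a2, b2: nonexposed cases and nonexposed controls
    """
    return (a1 * b2) / (a2 * b1)

# Nonexposed row (104, 666) appears in the transcript.
# Exposed row (96, 109) is ASSUMED; it is not preserved in the transcript.
or_hat = cross_product_odds_ratio(a1=96, b1=109, a2=104, b2=666)
print(f"OR = {or_hat:.2f}")
```

Under these assumed counts the estimate agrees with the "about 5.6" quoted in the interpretation.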

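(1 − α)100% CI for the OR $e^{\ln \widehat{OR} \pm z_{1-\alpha/2} \cdot SE}$, where $SE = \sqrt{1/a_1 + 1/b_1 + 1/a_2 + 1/b_2}$ is the standard error of $\ln \widehat{OR}$. Note use of the natural logarithmic scale.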

90% CI for the OR – Example
       Cases   Cntls
E+     …       …
E−     104     666
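A confidence interval on the log scale can be sketched with the standard library alone. The sketch assumes the usual large-sample standard error of ln(OR), the square root of the sum of reciprocal cell counts; the slide's own formula is not preserved in this transcript. The exposed counts (96 cases, 109 controls) are likewise an assumption from the commonly cited version of this dataset, while the nonexposed counts (104, 666) come from the slide.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(a1, b1, a2, b2, confidence=0.90):
    """Return (OR, lower, upper) using a normal interval for ln(OR)."""
    or_hat = (a1 * b2) / (a2 * b1)
    se_ln_or = math.sqrt(1 / a1 + 1 / b1 + 1 / a2 + 1 / b2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    log_or = math.log(or_hat)
    lower = math.exp(log_or - z * se_ln_or)
    upper = math.exp(log_or + z * se_ln_or)
    return or_hat, lower, upper

# Exposed row (96, 109) is ASSUMED; nonexposed row (104, 666) is from the slide.
or_hat, lower, upper = odds_ratio_ci(96, 109, 104, 666, confidence=0.90)
print(f"OR = {or_hat:.2f}, 90% CI = ({lower:.2f}, {upper:.2f})")
```

Working on the log scale keeps the limits positive and makes the interval asymmetric around the point estimate, which suits a ratio measure.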

Case-Control - Example Results from WinPepi > Compare2.exe > A. WinPepi uses a slightly different formula than ours; the Mid-P results are similar to ours.

Case-Control Studies with Multiple Levels of Exposure With an ordinal exposure, compare each exposure level to the non-exposed group (next slide):

Case-Control, Ordinal Levels of Exposure Note dose-response relationship

§18.6 Matched Pairs With matched-pair samples, each participant is carefully matched to a unique individual as part of the selection process. This technique is used to mitigate confounding by the matching factor. Both cohort and case-control samples may avail themselves of matching.

Here’s the notation for matched-pair case-control data:
              Case E+   Case E−
Control E+    a         b
Control E−    c         d
The odds ratio associated with exposure is: $\widehat{OR} = c / b$. The confidence interval is: $e^{\ln \widehat{OR} \pm z_{1-\alpha/2} \sqrt{1/c + 1/b}}$

Matched Pairs - Example A matched case-control study found 45 pairs in which the case but not the control had a low fruit/veg diet; it found 24 pairs in which the control but not the case had a low fruit/veg diet.
           Case E+   Case E−
Cntl E+    unknown   24
Cntl E−    45        unknown
The odds ratio (45 / 24 ≈ 1.88) suggests 88% higher risk in low fruit/veg consumers.

Matched Pair Example, cont. Data are compatible with ORs between 1.14 and 3.07. WinPepi’s PairEtc.exe program A calculates exact confidence intervals for ORs from matched-pair data. Hand-calculated limits will be similar except in small samples.
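The matched-pair odds ratio and an approximate interval can be reproduced from the two discordant counts alone. This sketch assumes a 95% confidence level and the usual large-sample standard error of ln(OR) for matched pairs, the square root of 1/b + 1/c; neither is stated explicitly in the transcript, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def matched_pair_or_ci(b, c, confidence=0.95):
    """OR = c / b with a normal interval on the log scale.

    b: pairs where only the control is exposed
    c: pairs where only the case is exposed
    """
    or_hat = c / b
    se_ln_or = math.sqrt(1 / b + 1 / c)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    log_or = math.log(or_hat)
    return or_hat, math.exp(log_or - z * se_ln_or), math.exp(log_or + z * se_ln_or)

# Discordant pair counts from the slide: 24 and 45.
or_hat, lower, upper = matched_pair_or_ci(b=24, c=45)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the limits land close to the 1.14 and 3.07 quoted on the slide; small differences in the last digit can arise from rounding intermediate values. Concordant pairs (both exposed or both unexposed) do not enter the calculation, which is why the slide can leave them as unknown.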

Hypothesis Test, Matched Pairs A. $H_0$: OR = 1. B. McNemar’s test statistic: $z_{\text{stat}} = (c - b) / \sqrt{c + b}$, where b and c are the counts of discordant pairs. C. P-value. Convert the z stat to a P-value with Table B or Table F. If fewer than 5 discordancies are expected, use an exact binomial procedure (see text).

Hypothesis Test, Example
              Case E+   Case E−
Control E+    unknown   24
Control E−    45        unknown
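McNemar's test for these data can be sketched as below. The sketch assumes the uncorrected z form of the statistic, (c − b) / sqrt(c + b), and a two-sided P-value from the standard normal distribution; the slide's worked numbers are not preserved in this transcript, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def mcnemar_z_test(b, c):
    """Return (z, two-sided P) for McNemar's test on discordant counts b and c."""
    z = (c - b) / math.sqrt(c + b)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Discordant pair counts from the slide: 24 (control only) and 45 (case only).
z, p_value = mcnemar_z_test(b=24, c=45)
print(f"z = {z:.2f}, two-sided P = {p_value:.4f}")
```

With 69 discordant pairs in total, the normal approximation is reasonable here; with very few discordant pairs an exact binomial procedure would be the better choice, as the previous slide notes.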