Examples and SAS introduction: -Violations of the rare disease assumption -Use of Fisher’s exact test January 14, 2004.



1. When can the OR mislead?

When is the OR a good approximation of the RR? General rule of thumb: “The OR is a good approximation as long as the probability of the outcome in the unexposed is less than 10%.”
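The rule of thumb can be illustrated numerically. The sketch below is only an illustration: it holds the risk ratio fixed at an arbitrary value (0.93, chosen here for illustration) and shows how the odds ratio drifts away from it as the outcome becomes more common in the unexposed group. The function names and the baseline risks are hypothetical, not part of the original slides.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def or_for_fixed_rr(rr, p_unexposed):
    """Odds ratio implied by a given risk ratio and baseline (unexposed) risk."""
    p_exposed = rr * p_unexposed
    return odds(p_exposed) / odds(p_unexposed)


if __name__ == "__main__":
    rr = 0.93  # assumed, illustrative risk ratio
    for p0 in (0.01, 0.05, 0.10, 0.50, 0.90):
        print(f"baseline risk {p0:.2f}: RR = {rr:.2f}, OR = {or_for_fixed_rr(rr, p0):.3f}")
```

With a rare outcome the two measures nearly coincide; with a common outcome the OR sits much further from 1 than the RR does.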

From: “The Effect of Race and Sex on Physicians' Recommendations for Cardiac Catheterization,” New England Journal of Medicine, February 25, 1999, Volume 340:618-626. Study overview: Researchers developed a computerized survey instrument to assess physicians' recommendations for managing chest pain. Actors portrayed patients with particular characteristics (race and sex) in scripted interviews about their symptoms. 720 physicians at two national meetings each viewed a recorded interview and were given other data about a hypothetical patient. Each physician then made recommendations about that patient's care.


Their results… The Media Reports: “Doctors were only 60 percent as likely to order cardiac catheterization for women and blacks as for men and whites. For black women, the doctors were only 40 percent as likely to order catheterization.”

Media headlines on Feb 25th, 1999… Wall Street Journal: “Study suggests race, sex influence physicians' care.” New York Times: “Doctor bias may affect heart care, study finds.” Los Angeles Times: “Heart study points to race, sex bias.” Washington Post: “Georgetown University study finds disparity in heart care; doctors less likely to refer blacks, women for cardiac test.” USA Today: “Heart care reflects race and sex, not symptoms.” ABC News: “Health care and race”

A closer look at the data… The authors failed to report the risk ratios:
RR for women: .847/.906 = .93
RR for black race: .847/.906 = .93
Correct conclusion: Only a 7% decrease in chance of being offered correct treatment.

Lessons learned: 90% outcome is not rare! OR is a poor approximation of the RR here, magnifying the observed effect almost 6-fold. Beware! Even the New England Journal doesn’t always get it right! SAS automatically calculates both, so check how different the two values are even if the RR is not appropriate. If they are very different, you have to be very cautious in how you interpret the OR.

SAS code and output for generating OR/RR from 2x2 table

          Cath   No Cath   Total
Female     305      55      360
Male       326      34      360

data cath_data;
  input IsFemale GotCath Freq;
  datalines;
1 1 305
1 0 55
0 1 326
0 0 34
;
run;

data cath_data; *Fix quirky reversal of SAS 2x2 tables;
  set cath_data;
  *After recoding, 0 = female and 0 = got cath, so Female is Row 1 and Cath is Column 1;
  IsFemale=1-IsFemale;
  GotCath=1-GotCath;
run;

proc freq data=cath_data;
  tables IsFemale*GotCath /measures;
  weight freq;
run;

SAS output

Statistics for Table of IsFemale by GotCath

Estimates of the Relative Risk (Row1/Row2)

Type of Study                  Value     95% Confidence Limits
Case-Control (Odds Ratio)     0.5784     0.3669     0.9118
Cohort (Col1 Risk)            0.9356     0.8854     0.9886
Cohort (Col2 Risk)            1.6176     1.0823     2.4177

Sample Size = 720
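As a cross-check on the listing above, the point estimates can be recomputed by hand from the four cell counts. The sketch below is in Python rather than SAS. The confidence limits use the usual large-sample (Wald) interval on the log scale, which is an assumption about how the printed limits were obtained, although it should reproduce them to about three decimals. All variable and function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Cell counts from the 2x2 table: rows = female / male, columns = cath / no cath.
a, b = 305, 55   # female: cath, no cath
c, d = 326, 34   # male:   cath, no cath
n1, n2 = a + b, c + d  # row totals (360 each)

z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval


def wald_ci(estimate, se_log):
    """95% interval built on the log scale (assumed method)."""
    return exp(log(estimate) - z * se_log), exp(log(estimate) + z * se_log)


# Odds ratio of catheterization, female vs. male.
odds_ratio = (a * d) / (b * c)
or_ci = wald_ci(odds_ratio, sqrt(1 / a + 1 / b + 1 / c + 1 / d))

# Risk ratio for column 1 (cath), female vs. male.
rr_cath = (a / n1) / (c / n2)
rr_cath_ci = wald_ci(rr_cath, sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2))

# Risk ratio for column 2 (no cath), female vs. male.
rr_nocath = (b / n1) / (d / n2)
rr_nocath_ci = wald_ci(rr_nocath, sqrt(1 / b - 1 / n1 + 1 / d - 1 / n2))

if __name__ == "__main__":
    print(f"OR          {odds_ratio:.4f}  ({or_ci[0]:.4f}, {or_ci[1]:.4f})")
    print(f"RR cath     {rr_cath:.4f}  ({rr_cath_ci[0]:.4f}, {rr_cath_ci[1]:.4f})")
    print(f"RR no cath  {rr_nocath:.4f}  ({rr_nocath_ci[0]:.4f}, {rr_nocath_ci[1]:.4f})")
```

Because 305/360 = .847 and 326/360 = .906, the "Col1 Risk" row is the risk ratio for receiving catheterization, while the odds ratio describes the same data on the odds scale.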

Furthermore…stratification shows…

2. Example of Fisher’s Exact Test

Fisher’s “Tea-tasting experiment” (p. 40 Agresti) Claim: Fisher’s colleague (call her “Cathy”) claimed that, when drinking tea, she could distinguish whether milk or tea was added to the cup first. To test her claim, Fisher designed an experiment in which she tasted 8 cups of tea (4 cups had milk poured first, 4 had tea poured first). Null hypothesis: Cathy’s guessing abilities are no better than chance. Alternative hypotheses: Right-tail: She guesses right more than expected by chance. Left-tail: She guesses wrong more than expected by chance.

Fisher’s “Tea-tasting experiment” (p. 40 Agresti) Experimental Results:

                 Guess poured first
Poured First     Milk     Tea     Total
Milk               3       1        4
Tea                1       3        4
Total              4       4        8

Fisher’s Exact Test Step 1: Identify tables that are as extreme or more extreme than what actually happened: Here she identified 3 out of 4 of the milk-poured-first teas correctly. Is that good luck or real talent? The only way she could have done better is if she identified 4 of 4 correct.

Observed table:

                 Guess poured first
Poured First     Milk     Tea     Total
Milk               3       1        4
Tea                1       3        4

More extreme table:

                 Guess poured first
Poured First     Milk     Tea     Total
Milk               4       0        4
Tea                0       4        4

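Fisher’s Exact Test Step 2: Calculate the probability of each table (assuming fixed marginals). With all margins fixed at 4, the count X in the Milk/Milk cell follows a hypergeometric distribution:

Observed table (3 of 4 correct): $P(X=3) = \binom{4}{3}\binom{4}{1} / \binom{8}{4} = 16/70 = .229$
More extreme table (4 of 4 correct): $P(X=4) = \binom{4}{4}\binom{4}{0} / \binom{8}{4} = 1/70 = .014$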

Step 3: To get the left-tail and right-tail p-values, consider the probability mass function of X, where X = the number of correct identifications of the cups with milk poured first. “Right-hand tail probability” (testing against the alternative that she guesses right more than expected by chance): P(X ≥ 3), p=.243. “Left-hand tail probability” (testing against the alternative that she’s systematically wrong): P(X ≤ 3), p=.986.
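The tail probabilities can be checked by enumerating the probability mass function directly. The sketch below is in Python rather than SAS and assumes the hypergeometric model with every margin fixed at 4. Exact fractions are used so no rounding enters until printing. The function and variable names are hypothetical.

```python
from fractions import Fraction
from math import comb


def pmf(x, n_milk=4, n_tea=4, n_guess_milk=4):
    """P(X = x): x milk-first cups among the n_guess_milk cups guessed as milk-first."""
    ways = comb(n_milk, x) * comb(n_tea, n_guess_milk - x)
    return Fraction(ways, comb(n_milk + n_tea, n_guess_milk))


probs = {x: pmf(x) for x in range(5)}  # X can be 0, 1, 2, 3 or 4
observed = 3

right_tail = sum(p for x, p in probs.items() if x >= observed)
left_tail = sum(p for x, p in probs.items() if x <= observed)
# Two-sided: add up every table no more probable than the observed one.
two_sided = sum(p for p in probs.values() if p <= probs[observed])

if __name__ == "__main__":
    for x, p in probs.items():
        print(f"P(X = {x}) = {p} = {float(p):.4f}")
    print(f"right tail = {float(right_tail):.4f}")
    print(f"left tail  = {float(left_tail):.4f}")
    print(f"two-sided  = {float(two_sided):.4f}")
```

The two-sided rule used here (summing all tables with probability no larger than the observed table) is one common convention; other definitions of a two-sided exact p-value exist.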

SAS code and output for generating Fisher’s Exact statistics for 2x2 table

                 Guess poured first
Poured First     Milk     Tea     Total
Milk               3       1        4
Tea                1       3        4

data tea;
  input MilkFirst GuessedMilk Freq;
  datalines;
1 1 3
1 0 1
0 1 1
0 0 3
;
run;

data tea; *Fix quirky reversal of SAS 2x2 tables;
  set tea;
  *After recoding, 0 = milk first and 0 = guessed milk, so that cell is (1,1);
  MilkFirst=1-MilkFirst;
  GuessedMilk=1-GuessedMilk;
run;

proc freq data=tea;
  tables MilkFirst*GuessedMilk /exact;
  weight freq;
run;

SAS output

Statistics for Table of MilkFirst by GuessedMilk

Statistic                       DF      Value      Prob
Chi-Square                       1     2.0000    0.1573
Likelihood Ratio Chi-Square      1     2.0930    0.1480
Continuity Adj. Chi-Square       1     0.5000    0.4795
Mantel-Haenszel Chi-Square       1     1.7500    0.1859
Phi Coefficient                        0.5000
Contingency Coefficient                0.4472
Cramer's V                             0.5000

WARNING: 100% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Fisher's Exact Test
Cell (1,1) Frequency (F)            3
Left-sided Pr <= F             0.9857
Right-sided Pr >= F            0.2429
Table Probability (P)          0.2286
Two-sided Pr <= P              0.4857

Sample Size = 8
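The warning in the listing can be understood by recomputing the expected counts and the Pearson statistics by hand. The sketch below is in Python rather than SAS and uses the standard 2x2 shortcut formulas. For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x/2)), so no statistics library is needed. Variable and function names are hypothetical.

```python
from math import erfc, sqrt

# Tea-tasting table: rows = poured first (milk, tea), columns = guess (milk, tea).
a, b, c, d = 3, 1, 1, 3
n = a + b + c + d
r1, r2 = a + b, c + d   # row totals
c1, c2 = a + c, b + d   # column totals

# Expected count in each cell under independence: row total * column total / n.
expected = [r * col / n for r in (r1, r2) for col in (c1, c2)]


def p_value_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(x / 2))


# Pearson chi-square and its continuity-adjusted (Yates) version for a 2x2 table.
denom = r1 * r2 * c1 * c2
chi2 = n * (a * d - b * c) ** 2 / denom
chi2_adj = n * max(0, abs(a * d - b * c) - n / 2) ** 2 / denom

if __name__ == "__main__":
    print("expected counts:", expected)
    print(f"chi-square          = {chi2:.4f}, p = {p_value_df1(chi2):.4f}")
    print(f"continuity-adjusted = {chi2_adj:.4f}, p = {p_value_df1(chi2_adj):.4f}")
```

Every expected count is 2, well below 5, which is why the large-sample chi-square p-values are flagged as unreliable here and Fisher's exact test is preferred.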