More Contingency Tables & Paired Categorical Data Lecture 8.

A Larger Contingency Table
A 4-by-2 contingency table. (Made-up data filled into empty cells from last class.)
Exercise Level       Cold/Flu   No Cold/Flu   Total
No Exercise*             79         138        217
Light Exercise           96         126        222
Mod. Exercise            30          43         73
Heavy Exercise*         115         123        238
Totals (Marginal)       320         430        750

Estimated Distributions
The Conditional Distributions are the distributions of the response within each level of the predictor. For example:
No Exercise: 79/217 = .364 experienced cold/flu; 138/217 = .636 didn’t
Light Exercise: 96/222 = .432; 126/222 = .568
Etc.
The Marginal Distribution is the distribution of the responses if we ignore information about the predictor.
Cold/flu: 320/750 = .427
No cold/flu: 430/750 = .573

To Summarize Distributions in a Table
Exercise Level       Cold/Flu           No Cold/Flu
No Exercise*         79/217 = .364      138/217 = .636
Light Exercise       96/222 = .432      126/222 = .568
Mod. Exercise        30/73 = .411       43/73 = .589
Heavy Exercise*      115/238 = .483     123/238 = .517
Totals (Marginal)    320/750 = .427     430/750 = .573
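As a rough sketch, the conditional and marginal distributions in this table can be reproduced in Python from the cell counts. The counts are the ones implied by the fractions above (the 123 for Heavy Exercise / No Cold/Flu is inferred as 238 - 115), and the function and variable names are illustrative, not part of the lecture.

```python
# Observed counts per exercise level: (cold/flu, no cold/flu).
# Assumption: 123 is inferred from the row total 238 minus 115.
counts = {
    "No Exercise": (79, 138),
    "Light Exercise": (96, 126),
    "Mod. Exercise": (30, 43),
    "Heavy Exercise": (115, 123),
}


def conditional_distributions(table):
    """Proportion of each response within each level of the predictor."""
    return {
        level: tuple(cell / sum(row) for cell in row)
        for level, row in table.items()
    }


def marginal_distribution(table):
    """Proportion of each response, ignoring the predictor."""
    column_totals = [sum(column) for column in zip(*table.values())]
    grand_total = sum(column_totals)
    return tuple(total / grand_total for total in column_totals)


conditional = conditional_distributions(counts)
marginal = marginal_distribution(counts)

for level, (p_cold, p_no_cold) in conditional.items():
    print(f"{level:18s} {p_cold:.3f}  {p_no_cold:.3f}")
print(f"{'Totals (Marginal)':18s} {marginal[0]:.3f}  {marginal[1]:.3f}")
```

Under the null hypothesis of independence, every row of conditional proportions would estimate the same distribution as the marginal row.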

Expected Values Under the Null
Exercise Level       Cold/Flu               No Cold/Flu             Total
No Exercise*         217*.427 ≈ 92.59       217*.573 ≈ 124.41       217
Light Exercise       222*.427 ≈ 94.72       222*.573 ≈ 127.28       222
Mod. Exercise        73*.427 ≈ 31.15        73*.573 ≈ 41.85         73
Heavy Exercise*      238*.427 ≈ 101.55      238*.573 ≈ 136.45       238
Totals (Marginal)    320                    430                     750
The values are approximate because the estimated probabilities (.427 and .573) are rounded. Note that we avoided some round-off error by calculating directly from the totals, e.g., 217*320/750.
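A minimal sketch of the expected-count calculation, assuming the row and column totals stated in the lecture (217, 222, 73, 238; 320 and 430; 750 overall). Each expected count is computed as row total * column total / grand total, which avoids the round-off that comes from multiplying by .427 or .573. The names used here are hypothetical.

```python
# Totals taken from the contingency table in the lecture.
row_totals = {
    "No Exercise": 217,
    "Light Exercise": 222,
    "Mod. Exercise": 73,
    "Heavy Exercise": 238,
}
column_totals = {"Cold/Flu": 320, "No Cold/Flu": 430}
grand_total = 750


def expected_counts(rows, columns, total):
    """Expected cell counts under independence: row total * column total / n."""
    return {
        level: {
            response: row_total * column_total / total
            for response, column_total in columns.items()
        }
        for level, row_total in rows.items()
    }


expected = expected_counts(row_totals, column_totals, grand_total)

for level, cells in expected.items():
    formatted = "  ".join(f"{value:8.2f}" for value in cells.values())
    print(f"{level:15s} {formatted}")
```

The expected counts in each column add up to the observed column total, which is a quick check on the arithmetic.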

Test Statistic and Sampling Distribution
A test of independence of the two variables (Exercise Level and Cold/Flu) will be carried out using a chi-square test statistic with (r-1)(c-1) = (4-1)(2-1) = 3 degrees of freedom. The test statistic is calculated as
$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

Hypothesis Test
Assumptions: Random Independent Sample; Groups collected independently; “Large Sample”
Hypotheses: H0: conditional distributions equal; HA: conditional distributions not all equal
Test Statistic: Chi-square = 6.69, compared to chi-square dist’n with 3 d.f.

Hypothesis Test, cont.
P-value/Rejection Region: Critical values are 7.815 for .05 significance, 7.407 for .06, 7.060 for .07, and 6.251 for .10. Since 6.69 < 7.815, we fail to reject at the 0.05 level. The p-value is between .07 and .10.
Conclusion: At the type 1 error rate of .05, we fail to reject the null hypothesis. There is not enough evidence to say that the probability of getting a cold/flu depends on the exercise level.
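The test above can be sketched in Python using only the standard library. The observed counts are the ones implied by the lecture's fractions (123 is inferred as 238 - 115). The p-value function is an assumption of this sketch: it uses a closed form for the chi-square upper tail that holds only for 3 degrees of freedom, so a general-purpose analysis would normally rely on a statistics library instead.

```python
import math

# Observed counts per exercise level: (cold/flu, no cold/flu).
observed = [(79, 138), (96, 126), (30, 43), (115, 123)]


def chi_square_statistic(table):
    """Pearson chi-square: sum over cells of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for cell, column_total in zip(row, column_totals):
            expected = row_total * column_total / grand_total
            statistic += (cell - expected) ** 2 / expected
    return statistic


def chi_square_upper_tail_3df(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


statistic = chi_square_statistic(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_upper_tail_3df(statistic)

print(f"chi-square = {statistic:.2f} on {degrees_of_freedom} d.f.")
print(f"p-value = {p_value:.3f}")
print("reject H0 at .05" if p_value < 0.05 else "fail to reject H0 at .05")
```

Computing the p-value directly removes the need to bracket it between tabulated critical values, although the conclusion at the .05 level is the same.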

Matched Categorical Data
Data may be matched/paired with respect to the risk factor or the response.
Matching on risk factor (not directly discussed in text): Differences of proportions, relative risks, and odds ratios are all appropriate. The formulas and the set-up of the contingency table will be different. We will focus on odds ratios, which will be calculated in the same way as for the matched case-control study.
Matching on response (Matched Case-Control Study): Only the odds ratio is an appropriate measure of the association between the risk factor and the response.
In both cases, inference focuses on the pair.

A Matched Case-Control Study on CAD
Each of 59 adults with Coronary Artery Disease (CAD) was matched with an adult who did not develop CAD but was of the same gender, age, ethnicity, and socioeconomic status. Of interest was whether drinking 2 or more glasses of red wine (on average) per week was associated with development of CAD.

Table for Matched Case-Control Data
Do NOT use the standard contingency table that summarizes information about the individual subjects. Instead, use the following table to summarize information about the pairs.
                    Cases (CAD)
                    >= 2    < 2
Controls   >= 2      15      14
           < 2       10      20

Physician Adherence Study - Matching on Predictor
Suppose that investigators were interested in whether a particular educational intervention had an effect on whether physicians prescribe a particular treatment plan for their asthma patients. 75 physicians are rated on whether they prescribe the treatment plan both before and after the educational intervention.
                Before
                Yes    No
After    Yes     22    25
         No      12    16

Estimation and Inference for Matched Categorical Data
CANNOT use formulas for CI of odds ratio given before because the two groups of subjects (whether “exposure” groups or case/control groups) are not chosen independently.
Inferences will be based on the discordant pairs, that is, the pairs in which the members “disagree”
on the predictor variable, for case-control studies;
on the response variable, when subjects are matched with respect to the predictor.

Labeling Cell (Pair) Counts & Estimation of Odds Ratio
Odds ratio is estimated as R/S.
Interpretation: The odds that a person in group 2 is “exposed” are R/S times the odds that a group 1 member is “exposed.”
Or: The odds that an “exposed” person is in group 2 are R/S times the odds that an “unexposed” person is in group 2.
                 Group 1
                 Yes    No
Group 2   Yes            R
          No      S

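CI for Odds Ratio
The 95% confidence interval for the (natural) log of the odds ratio is
$$\ln(R/S) \pm 1.96\sqrt{\frac{1}{R} + \frac{1}{S}}$$
Exponentiating the two endpoints gives the 95% confidence interval for the odds ratio itself.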

CAD Example – Odds Ratio
                    Cases (CAD)
                    >= 2    < 2
Controls   >= 2      15      14
           < 2       10      20
There are more pairs in which the case drinks fewer than 2 and the control drinks at least 2 (14 pairs) than pairs in which the case drinks at least 2 and the control drinks fewer than 2 (10 pairs). Thus, >= 2 has a “protective effect.”
The odds ratio is 14/10 = 1.4.
The odds of not developing CAD for someone who has at least two drinks per week are 1.4 times the odds for someone who has fewer than two. Equivalently, the odds of developing CAD for those who drink fewer than two drinks per week are 1.4 times the odds for someone who drinks at least two drinks per week.

CAD Eg. – CI for Odds Ratio
The 95% CI for the log of the OR is log(1.4) +/- 1.96*sqrt(1/14 + 1/10) = (-.475, 1.148)
The 95% CI for the OR is (.622, 3.152)
With 95% confidence, the odds of developing CAD for those who drink fewer than two drinks per week are between .622 and 3.152 times the odds for someone who drinks at least two drinks per week.
This interval includes 1; therefore, the effect of drinking at least two drinks per week is not significant.
However, the interval is very wide, so…

Physician Intervention: Odds Ratio
                Before
                Yes    No
After    Yes     22    25
         No      12    16
Note that there are more pairs in which the physician prescribes the treatment plan after the intervention but not before (25) than in which the physician prescribes the treatment plan before but not after (12).
The odds ratio is calculated as 25/12 = 2.083.
The odds that a physician will prescribe the treatment plan after the intervention are 2.083 times the odds that a physician will prescribe it before the intervention.

Physicians – CI for Odds Ratio
The 95% CI for the log of the odds ratio is ln(2.083) +/- 1.96*sqrt(1/25 + 1/12) = (.046, 1.422)
The 95% CI for the odds ratio is (1.047, 4.145)
There is a significant effect of the intervention since 1 is not included in the interval.
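Both confidence intervals above follow the same recipe, which can be sketched as a small Python function. This is an illustration under the lecture's large-sample formula, taking only the two discordant-pair counts as input. The function name and its return layout are hypothetical.

```python
import math


def matched_pairs_odds_ratio_ci(r, s, z=1.96):
    """Odds ratio R/S for matched pairs, with a CI built on the log scale.

    r and s are the two discordant-pair counts; z=1.96 gives a 95% interval.
    """
    odds_ratio = r / s
    log_odds_ratio = math.log(odds_ratio)
    margin = z * math.sqrt(1 / r + 1 / s)
    log_interval = (log_odds_ratio - margin, log_odds_ratio + margin)
    interval = (math.exp(log_interval[0]), math.exp(log_interval[1]))
    return odds_ratio, log_interval, interval


# CAD study: 14 and 10 discordant pairs.
cad_or, cad_log_ci, cad_ci = matched_pairs_odds_ratio_ci(14, 10)
# Physician study: 25 and 12 discordant pairs.
phys_or, phys_log_ci, phys_ci = matched_pairs_odds_ratio_ci(25, 12)

for name, estimate, interval in [
    ("CAD", cad_or, cad_ci),
    ("Physicians", phys_or, phys_ci),
]:
    includes_one = interval[0] <= 1 <= interval[1]
    print(f"{name}: OR = {estimate:.3f}, 95% CI = ({interval[0]:.3f}, {interval[1]:.3f}), "
          f"includes 1: {includes_one}")
```

Because the code carries full precision through to the end, its third decimal place can differ slightly from limits obtained by exponentiating already-rounded log values.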

Hypothesis Testing in Matched Designs
Again, the test involves comparing the discordant pairs. In particular, if the predictor and response are independent, one would expect the population proportion of each type of discordant pair to be equal.
If there is inequality in the sample, is it possible that the inequality is just due to chance?

Hypothesis Test – The Steps
Assumptions: Random, independent selection of pairs; Large Sample (R+S > 10)
Hypotheses: H0: Predictor and Response are independent variables; HA: Predictor and Response are associated
Test Statistic: $\chi^2 = (R-S)^2/(R+S)$. With Yates’ continuity correction, $\chi^2 = (|R-S|-1)^2/(R+S)$.
P-value: Compare to chi-square dist’n with 1 d.f.
Conclusion: per usual

CAD – Hypothesis Test
Assumptions: Random, independent selection of pairs; Large Sample (R+S = 24 > 10)
Hypotheses: H0: Drinking and CAD are independent variables; HA: Drinking and CAD are associated
Test Statistic: (14-10)^2/(14+10) = 16/24 = .667
P-value: Table A5.7: p-value is between .4386 and
Conclusion: Insufficient evidence to reject the null that says that Drinking is not associated with CAD.

Physician – Hypothesis Test
Assumptions: Random, independent selection of pairs; Large Sample (R+S = 37 > 10)
Hypotheses: H0: Participation in intervention and prescription of treatment plan are independent variables; HA: Participation in intervention and prescription of treatment plan are associated
Test Statistic: (25-12)^2/(25+12) = 169/37 = 4.568
P-value: between .0339 and
Conclusion: At the 0.05 significance level, reject the null in favor of the alternative that the intervention does have an effect on whether physicians prescribe the treatment plan.
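The two matched-pairs tests above (this discordant-pairs test is commonly known as McNemar's test) can be sketched in Python. The p-value uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so the upper tail can be written with `math.erfc`. The function name and the `continuity_correction` flag are illustrative choices, and the correction is off by default because the lecture says it may be ignored.

```python
import math


def discordant_pairs_test(r, s, continuity_correction=False):
    """Chi-square test on discordant pairs, 1 d.f.: (R - S)^2 / (R + S).

    With the Yates correction the numerator becomes (|R - S| - 1)^2.
    """
    difference = abs(r - s)
    if continuity_correction:
        difference = max(difference - 1, 0)
    statistic = difference ** 2 / (r + s)
    # Upper tail of chi-square with 1 d.f. equals P(|Z| > sqrt(statistic)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


cad_statistic, cad_p = discordant_pairs_test(14, 10)
phys_statistic, phys_p = discordant_pairs_test(25, 12)

for name, statistic, p_value in [
    ("CAD", cad_statistic, cad_p),
    ("Physicians", phys_statistic, phys_p),
]:
    decision = "reject H0" if p_value < 0.05 else "fail to reject H0"
    print(f"{name}: chi-square = {statistic:.3f}, p = {p_value:.4f}, {decision} at .05")
```

Calling `discordant_pairs_test(25, 12, continuity_correction=True)` gives the Yates-corrected version, which yields a somewhat smaller statistic and a larger p-value.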

Homework
Textbook Reading: Chapter 29, first two sections. Repeat Chapter 9 (has info about OR for paired case-control studies). (Last time: Chapter 8, Chapter 26)
When doing calculations for this class, you may ignore the Yates’ continuity correction.
Homework Problems