Please turn off cell phones, pagers, etc. The lecture will begin shortly.



Exam 2 summary
Exam 2 had 40 items. The test results indicated that there were three bad items. The test was also longer and more difficult than anticipated. The grades were adjusted:
adjusted score = (# items correct × 2.5) + 12

Distribution of adjusted scores
n = 310, mean = 82.7, median = 85, min = 35, max = 107, SD = 14.7

Lecture 21
This lecture will finish topics from Chapter 12:
1. Review of probability and odds (Section 12.2)
2. Measures of association in 2×2 tables (Section 12.2)
3. Choosing the baseline (Section 12.3)
Due to time constraints, there will not be a quiz today. There will be a quiz on Friday with odds ∞.

1. Review of probability and odds
Probability and odds are numerical measures of how likely an event is to occur.
Probability is a number between 0 and 1.
Odds is a number between 0 and ∞.

Conversion formulas
Given one, you can find the other by these formulas:
odds = probability / (1 - probability)
probability = odds / (1 + odds)
Examples
prob = .2 corresponds to odds = .2/.8 = .25
prob = .75 corresponds to odds = .75/.25 = 3
odds = .5 corresponds to prob = .5/1.5 = .333
odds = 9 corresponds to prob = 9/10 = .9
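The two conversion formulas can be checked with a few lines of Python. This is a minimal sketch; the function names prob_to_odds and odds_to_prob are made up for illustration and do not come from the lecture.

```python
def prob_to_odds(prob: float) -> float:
    """Convert a probability in [0, 1) to odds: p / (1 - p)."""
    if not 0 <= prob < 1:
        raise ValueError("probability must be in [0, 1)")
    return prob / (1 - prob)


def odds_to_prob(odds: float) -> float:
    """Convert non-negative odds to a probability: odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


# The four examples from the slide
print(round(prob_to_odds(0.2), 3))   # prob = .2
print(round(prob_to_odds(0.75), 3))  # prob = .75
print(round(odds_to_prob(0.5), 3))   # odds = .5
print(round(odds_to_prob(9), 3))     # odds = 9
```

Converting a probability to odds and back should return the original probability, which is a quick way to confirm that the two formulas are inverses of each other.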

Estimating probabilities and odds
Measure a binary variable for a sample of n subjects and summarize the results as a frequency table:

         Freq
Yes      a
No       b
Total    n

Estimated probability of yes = a / n = Yes / (Yes + No)
Estimated odds of yes = a / b = Yes / No

Example
Twenty likely voters were asked, “Do you approve of the President’s job performance?”
YYNYNNYYNNNNYNYNNYNY

              Freq
Approve       9
Disapprove    11

Estimate the probability and odds of approval.
Estimated probability = 9/20 = .45
Estimated odds = 9/11 = .82
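The tally in this example can be reproduced directly from the twenty responses. The sketch below assumes the responses are stored as a string of Y and N characters; the variable names are illustrative only.

```python
# The twenty responses from the example (Y = approve, N = disapprove)
responses = "YYNYNNYYNNNNYNYNNYNY"

approve = responses.count("Y")
disapprove = responses.count("N")
n = len(responses)

est_prob = approve / n           # Yes / (Yes + No)
est_odds = approve / disapprove  # Yes / No

print(approve, disapprove, n)
print(round(est_prob, 2), round(est_odds, 2))
```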

2. Measures of association in 2×2 tables
Recall that with two continuous variables, a useful measure of association is the correlation coefficient.
For two binary variables, the most common measures of association are
- Relative risk
- Odds ratio
The relative risk is a ratio of probabilities. The odds ratio is a ratio of odds.

Estimating the relative risk
- Compute the proportion for each row
- Divide one proportion by the other
Example

           Heart attack?
           Yes      No        Total
Aspirin    104      10,933    11,037
Placebo    189      10,845    11,034
Total      293      21,778    22,071

Proportion with heart attack
Aspirin    104 / 11,037 = .0094
Placebo    189 / 11,034 = .0171
The estimated relative risk is .0094 / .0171 = 0.55
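The two steps (row proportions, then their ratio) can be sketched in Python using the aspirin counts above. The helper name relative_risk is hypothetical.

```python
def relative_risk(yes1: int, total1: int, yes2: int, total2: int) -> float:
    """Ratio of the proportion of 'yes' in row 1 to the proportion in row 2."""
    return (yes1 / total1) / (yes2 / total2)


# Aspirin row: 104 heart attacks out of 11,037
# Placebo row: 189 heart attacks out of 11,034
p_aspirin = 104 / 11037
p_placebo = 189 / 11034
rr = relative_risk(104, 11037, 189, 11034)

print(round(p_aspirin, 4), round(p_placebo, 4), round(rr, 2))
```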

Estimating the odds ratio
- Compute the odds for each row
- Divide one odds by the other
Example

           Heart attack?
           Yes      No        Total
Aspirin    104      10,933    11,037
Placebo    189      10,845    11,034
Total      293      21,778    22,071

Estimated odds of heart attack
Aspirin    104 / 10,933 = .0095
Placebo    189 / 10,845 = .0174
The estimated odds ratio is .0095 / .0174 = 0.55

Easier way to estimate the odds ratio
If the frequencies in the 2×2 table are

a    b
c    d

then the estimated odds ratio is (a × d) / (b × c).
Example

           Heart attack?
           Yes      No        Total
Aspirin    104      10,933    11,037
Placebo    189      10,845    11,034
Total      293      21,778    22,071

The estimated odds ratio is (104 × 10,845) / (10,933 × 189) = 0.55
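The row-by-row calculation and the cross-product shortcut are algebraically the same quantity, which a short Python sketch can confirm on the aspirin data. Both function names are made up for illustration.

```python
def odds_ratio_from_rows(a: int, b: int, c: int, d: int) -> float:
    """Odds of 'yes' in row 1 (a/b) divided by odds of 'yes' in row 2 (c/d)."""
    return (a / b) / (c / d)


def odds_ratio_cross_product(a: int, b: int, c: int, d: int) -> float:
    """Shortcut: (a * d) / (b * c)."""
    return (a * d) / (b * c)


# a, b = aspirin yes/no; c, d = placebo yes/no
a, b, c, d = 104, 10933, 189, 10845
or_rows = odds_ratio_from_rows(a, b, c, d)
or_cross = odds_ratio_cross_product(a, b, c, d)

print(round(or_rows, 2), round(or_cross, 2))
```

The two results agree up to floating-point rounding, so the shortcut only saves arithmetic; it does not change the estimate.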

Interpreting the relative risk
A relative risk of 1.0 means that the proportion of “yes” in row 1 is the same as the proportion of “yes” in row 2. This means that there is no evidence of a relationship between the explanatory variable and the response variable.
Example

             Cold this year?
             Yes    No     Total
Vitamin C    45     55     100
Placebo      54     66     120
Total        99     121    220

Proportion with cold
Vitamin C    45 / 100 = .45
Placebo      54 / 120 = .45
RR = .45 / .45 = 1.0
Vitamin C appears to be no more effective than a placebo for preventing the common cold.

A relative risk greater than 1.0 means that the proportion of “yes” in row 1 is greater than the proportion of “yes” in row 2.
RR = 1.2 means that the proportion is 20% greater
RR = 1.5 means that the proportion is 50% greater
RR = 2.0 means that the rate is doubled
RR = 3.0 means that the rate is tripled
Example

             Throat cancer?
             Yes    No       Total
Smoker       15     1,985    2,000
Nonsmoker    13     2,487    2,500
Total        28     4,472    4,500

Rate of throat cancer
Smoker       15 / 2000 = .0075
Nonsmoker    13 / 2500 = .0052
RR = .0075 / .0052 = 1.44
The rate of throat cancer among smokers is 44% higher than among nonsmokers.

A relative risk less than 1.0 means that the proportion of “yes” in row 1 is lower than the proportion of “yes” in row 2.
RR = 0.90 means that the rate is 10% lower
RR = 0.75 means that the rate is 25% lower
RR = 0.50 means that the rate is 50% lower
RR = 0.20 means that the rate is 80% lower
Example

           Heart attack?
           Yes      No        Total
Aspirin    104      10,933    11,037
Placebo    189      10,845    11,034
Total      293      21,778    22,071

Rate of heart attack
Aspirin    104 / 11,037 = .0094
Placebo    189 / 11,034 = .0171
RR = .0094 / .0171 = 0.55
The rate of heart attack for those who took aspirin was 45% lower than for those who took a placebo.

Interpreting the odds ratio
The interpretation of an odds ratio is very similar to the interpretation of a relative risk. The only difference is that an odds ratio expresses the increase or decrease in terms of odds rather than rates.
OR = 1.0 means that there is no evidence of a relationship
OR = 1.2 means that the odds of “yes” in row 1 are 20% higher than the odds of “yes” in row 2
OR = 0.60 means that the odds of “yes” in row 1 are 40% lower than the odds of “yes” in row 2
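The "percent higher" and "percent lower" readings used for both relative risks and odds ratios come from the same arithmetic: subtract 1 from the ratio and multiply by 100. A minimal sketch follows, with a hypothetical helper name and rounding to whole percents as an assumption.

```python
def describe_ratio(ratio: float) -> str:
    """Express a relative risk or odds ratio as a percent difference from the baseline row."""
    pct = round((ratio - 1) * 100)
    if pct == 0:
        return "no difference"
    direction = "higher" if pct > 0 else "lower"
    return f"{abs(pct)}% {direction}"


for r in (1.0, 1.2, 1.44, 2.0, 0.90, 0.60, 0.55):
    print(r, "->", describe_ratio(r))
```

For a relative risk the percent difference refers to rates (proportions); for an odds ratio it refers to odds.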

Example
From the 2002 General Social Survey: “Should it be possible for a pregnant woman to obtain a legal abortion if the woman wants it for any reason?”
2×2 table: sex (Men, Women) by support for legal abortion for any reason (Yes, No).
OR = (215 × …) / (… × 269) = 1.13
The estimated odds of support are 13% higher among men than among women. Based on this sample, men appear to be slightly more likely than women to support legalized abortion for any reason.

Is it real?
Based on the data from the last example, can we really conclude that the level of support for legalized abortion “for any reason” is greater among men than among women?
Perhaps not. The odds ratio of 1.13 is only an estimate, and it is not far from 1.0.
How far away from 1.0 does an estimate need to be for us to conclude that the effect is real, and not just due to random chance? That depends on the margins of error, which in turn depend on the sample sizes in the two groups (men and women).
Techniques for judging whether the effect is real or not will be discussed next week.

3. Choosing the baseline
The relative risk is a ratio of proportions:
RR = (proportion of “yes” in one row) / (proportion of “yes” in the other row)
In our examples thus far, we have used
- Row 1 of the 2×2 table for the numerator
- Row 2 of the 2×2 table for the denominator
But we are free to use either row as the numerator or denominator, as long as we interpret the result correctly.

Example

             Throat cancer?
             Yes    No       Total
Smoker       15     1,985    2,000
Nonsmoker    13     2,487    2,500
Total        28     4,472    4,500

Rate of throat cancer
Smoker       15 / 2000 = .0075
Nonsmoker    13 / 2500 = .0052
If we use “nonsmoker” as the numerator and “smoker” as the denominator, we get RR = .0052 / .0075 = 0.69
Interpretation: The estimated rate of throat cancer is 31% lower among non-smokers than among smokers.
If we use “smoker” as the numerator and “nonsmoker” as the denominator, we get RR = .0075 / .0052 = 1.44
Interpretation: The estimated rate of throat cancer is 44% higher among smokers than among non-smokers.
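Swapping the numerator and denominator turns the relative risk into its reciprocal, so the two versions always multiply to 1. The sketch below computes both directly from the counts (15 of 2,000 smokers, 13 of 2,500 nonsmokers); the variable names are illustrative. Working from the counts gives about 0.69 for the nonsmoker-to-smoker ratio.

```python
smoker_rate = 15 / 2000      # .0075
nonsmoker_rate = 13 / 2500   # .0052

# Baseline = nonsmokers (denominator)
rr_smoker_vs_nonsmoker = smoker_rate / nonsmoker_rate
# Baseline = smokers (denominator)
rr_nonsmoker_vs_smoker = nonsmoker_rate / smoker_rate

print(round(rr_smoker_vs_nonsmoker, 2))
print(round(rr_nonsmoker_vs_smoker, 2))
# The two ratios are reciprocals of each other
print(round(rr_smoker_vs_nonsmoker * rr_nonsmoker_vs_smoker, 6))
```

Note that the percent descriptions are not symmetric: a rate that is 44% higher in one direction is about 31% lower in the other, because each percentage is measured against a different baseline.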

Which way is better?
Both ways of presenting the relative risk are correct. But the second way (RR = 1.44) is a little easier to understand.
In this example, “non-smoking” is the normative condition, and “smoking” is the condition that is potentially hazardous.
If one of the two groups (either row 1 or row 2) can be regarded as
- normative behavior
- a control group (e.g., “placebo” or “nothing”)
- treatment as usual
- the majority
then it makes sense to use that group as the denominator.
The group in the denominator becomes the baseline for assessing the risk level of the group in the numerator.

Another example

           Heart attack?
           Yes      No        Total
Aspirin    104      10,933    11,037
Placebo    189      10,845    11,034
Total      293      21,778    22,071

In this example, the placebo group is the control group, and those who are taking aspirin are receiving the “new” or “novel” treatment. So it makes sense to use the placebo group as the baseline.
Rate of heart attack
Aspirin    104 / 11,037 = .0094
Placebo    189 / 11,034 = .0171
RR = .0094 / .0171 = 0.55