Please turn off cell phones, pagers, etc. The lecture will begin shortly.


Lecture 20
This lecture will introduce topics from Chapter 12:
1. 2×2 frequency tables (Section 12.1)
2. Probability and odds (Section 12.2)
3. Measures of association in 2×2 tables (Section 12.2)

1. 2×2 frequency tables
In Chapters 10-11, we learned how to describe relationships among continuous variables:
- scatterplots
- correlation coefficients
- regression analysis
Now we begin to examine relationships between categorical variables. More specifically, we’ll consider relationships between variables that are binary.

What is a binary variable?
A binary variable is a measurement that has only two possible outcomes. Binary variables are also known as dichotomous variables.
Examples:
- sex (male or female)
- treatment in a two-armed experiment (e.g. aspirin or placebo)
- whether a subject has a trait or condition (e.g. cancer or no cancer)
- survival after a specified period of time (alive or dead)

Frequency table for a binary variable
Suppose we take a sample of n subjects and record a binary variable for each subject. For example: ask 16 students, “Are you registered to vote?”

Y Y N Y Y N Y Y N N Y Y Y N N Y

A frequency table (or contingency table) records the number of subjects in each category:

        Yes    No    Total
Freq     10     6       16

Proportions and percentages
Once you have the frequency table, you can compute the proportion in each category by dividing its frequency by the sample size. Multiply the proportion by 100 to get the percentage.

        Yes    No    Total
Freq     10     6       16

Registered:    Proportion = 10/16 = .625    Percentage = .625 × 100 = 62.5%
Unregistered:  Proportion = 6/16 = .375     Percentage = .375 × 100 = 37.5%
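As a rough illustration (not part of the original slides), the tally and the proportions can be reproduced in a few lines of Python. The variable names are made up for this sketch, and the response string is the 16 answers from the voter-registration example.

```python
from collections import Counter

# The 16 answers to "Are you registered to vote?" (Y = yes, N = no).
responses = "YYNYYNYYNNYYYNNY"

# Frequency table: number of subjects in each category.
freq = Counter(responses)
n = sum(freq.values())  # sample size

for category in ("Y", "N"):
    proportion = freq[category] / n
    print(f"{category}: freq={freq[category]}  "
          f"proportion={proportion:.3f}  percentage={proportion * 100:.1f}%")
```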

Two binary variables
Suppose that you now have two binary variables for each subject (values listed in subject order):

Sex:          M M F M F F M F F F M F M M F M
Registered?:  N N Y Y Y N N N Y Y Y Y Y N Y Y

The 2×2 frequency table (also known as a 2×2 contingency table) records the number of subjects in each of the four possible categories. Here the rows are Male and Female, and the columns are Yes and No for “Registered?”.
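A minimal sketch of how the four cell counts could be tallied. It assumes that the i-th sex value and the i-th registration value listed above belong to the same subject; that pairing is an assumption about how the slide's data table was laid out, and the variable names are illustrative.

```python
from collections import Counter

# Assumption: position i in both lists refers to the same subject.
sex = "M M F M F F M F F F M F M M F M".split()
registered = "N N Y Y Y N N N Y Y Y Y Y N Y Y".split()

# Count each (sex, registered) combination: the four cells of the 2x2 table.
table = Counter(zip(sex, registered))

print("        Yes  No")
for label, code in (("Male", "M"), ("Female", "F")):
    print(f"{label:<7} {table[(code, 'Y')]:>3} {table[(code, 'N')]:>3}")
```

Under that pairing assumption, the column totals agree with the one-variable table: 10 registered and 6 not registered.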

Rows and columns
When creating a 2×2 table, it’s customary to make
- the rows correspond to the explanatory variable
- the columns correspond to the response variable

Right:
               Heart attack?
              Yes        No
Aspirin       104    10,933
Placebo       189    10,845

Wrong:
               Aspirin   Placebo
Heart attack       104       189
No attack       10,933    10,845

Margins
We often add an extra row and column to hold the row and column totals. These are called the margins.

               Heart attack?
              Yes        No     Total
Aspirin       104    10,933    11,037
Placebo       189    10,845    11,034
Total         293    21,778    22,071

The grand total (the sample size n) is 11,037 + 11,034 or 293 + 21,778 = 22,071.

Row proportions
To uncover the relationship between the explanatory (row) variable and the response (column) variable, compute the row proportions and percentages:
- choose one of the first two columns
- divide by the third column (the row total)

               Heart attack?
              Yes        No     Total
Aspirin       104    10,933    11,037
Placebo       189    10,845    11,034
Total         293    21,778    22,071

            Proportion with heart attack
Aspirin     104 / 11,037 = .0094
Placebo     189 / 11,034 = .0171
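The margins and row proportions can be checked with a short Python sketch. The nested-dictionary layout and the names used here are just one convenient choice for illustration, not something prescribed by the lecture.

```python
# Rows = explanatory variable (treatment), columns = response (heart attack?).
table = {
    "Aspirin": {"Yes": 104, "No": 10933},
    "Placebo": {"Yes": 189, "No": 10845},
}

# Margins: row totals, column totals, and the grand total n.
row_totals = {group: sum(cells.values()) for group, cells in table.items()}
col_totals = {col: sum(table[group][col] for group in table) for col in ("Yes", "No")}
n = sum(row_totals.values())

# Row proportions: the "Yes" count divided by the row total.
row_props = {group: table[group]["Yes"] / row_totals[group] for group in table}

for group in table:
    print(f"{group}: {table[group]['Yes']} / {row_totals[group]} = {row_props[group]:.4f}")
print("Column totals:", col_totals, "Grand total:", n)
```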

Percentages and rates
When the row proportions are small, it is customary to express them as percentages or as rates per 1,000, per 10,000, per 100,000, etc.
proportion × 100 = percent
proportion × 1,000 = rate per 1,000
proportion × 10,000 = rate per 10,000

Rate of heart attack
            proportion       %    per 1,000
Aspirin          .0094    0.94          9.4
Placebo          .0171    1.71         17.1
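The rescaling is plain multiplication, as this small Python sketch shows. The helper name `as_rates` is hypothetical and exists only for this example.

```python
def as_rates(proportion):
    """Re-express a proportion as a percent and as rates per 1,000 and per 10,000."""
    return {
        "percent": proportion * 100,
        "per_1000": proportion * 1_000,
        "per_10000": proportion * 10_000,
    }

# Row proportions from the aspirin example.
for group, proportion in (("Aspirin", 0.0094), ("Placebo", 0.0171)):
    rates = as_rates(proportion)
    print(f"{group}: {rates['percent']:.2f}%  "
          f"{rates['per_1000']:.1f} per 1,000  "
          f"{rates['per_10000']:.0f} per 10,000")
```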

2. Probability and odds
Probability is a number between 0 and 1 that indicates how likely it is that an event will occur:
- probability = 0 means that the event will never occur
- probability = 1 means that the event will always occur
- probability = 0.5 means that the event is just as likely to occur as not
Values close to zero indicate that the event is unlikely; values close to one indicate that it is likely.

Odds
Another measure of how likely an event is to occur is the odds. Odds range from 0 to ∞:
- odds = 0 means that the event will never occur
- odds = ∞ means that the event will always occur
- odds = 1 (often written as 1:1, which is the same as 1/1) means that the event is just as likely to occur as not
- odds = 2 (often written as 2:1, which is the same as 2/1) means that the event is twice as likely to occur as not
Odds close to zero indicate that the event is unlikely; large odds indicate that it is likely.

Odds as ratios
Gamblers sometimes express odds as a ratio a:b where b is something other than 1. For example, they may say “the odds are 3:2.” Note that odds of 3:2 are the same as 3/2 = 1.5. So if you ever see odds expressed as a:b, divide a by b to re-express the odds as a single number between 0 and ∞.

Odds and probability
Odds and probability are not the same!
Prob = 0      corresponds to odds = 0
Prob = .25    corresponds to odds = .333
Prob = .50    corresponds to odds = 1
Prob = .667   corresponds to odds = 2
Prob = .75    corresponds to odds = 3
Prob = .8     corresponds to odds = 4

Converting probability to odds
Given a probability, you can find the odds by the formula

odds = probability / (1 - probability)

Examples
Prob = .5    corresponds to odds = .5 / .5 = 1
Prob = .7    corresponds to odds = .7 / .3 = 2.33
Prob = .9    corresponds to odds = .9 / .1 = 9
Prob = .99   corresponds to odds = .99 / .01 = 99

Converting odds to probability
Given the odds, you can find the probability by the formula

probability = odds / (1 + odds)

Examples
odds = .5   corresponds to prob = .5/1.5 = 0.333
odds = 3    corresponds to prob = 3/4 = 0.75
odds = 10   corresponds to prob = 10/11 = 0.909
odds = 25   corresponds to prob = 25/26 = 0.962
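The two conversions are inverses of each other: odds = probability / (1 - probability) and probability = odds / (1 + odds). A small Python sketch makes this concrete. The function names are illustrative, and the handling of the endpoints (probability 1, infinite odds) is one reasonable convention rather than something the slides specify.

```python
import math

def prob_to_odds(p):
    """odds = p / (1 - p); a probability of 1 maps to infinite odds."""
    if not 0 <= p <= 1:
        raise ValueError("probability must be between 0 and 1")
    return math.inf if p == 1 else p / (1 - p)

def odds_to_prob(odds):
    """probability = odds / (1 + odds); infinite odds map to probability 1."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return 1.0 if math.isinf(odds) else odds / (1 + odds)

for p in (0.5, 0.7, 0.9, 0.99):
    print(f"prob = {p}  ->  odds = {prob_to_odds(p):.2f}")

for odds in (0.5, 3, 10, 25):
    print(f"odds = {odds}  ->  prob = {odds_to_prob(odds):.3f}")
```

Converting a probability to odds and back returns the original probability, up to floating-point rounding.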

Rare events
For rare events (probabilities close to zero), odds and probabilities are nearly the same.

Examples
Prob = .001   corresponds to odds = .001001
Prob = .01    corresponds to odds = .0101
Prob = .02    corresponds to odds = .0204
Prob = .03    corresponds to odds = .0309

When discussing rare events, the distinction between odds and probability is often unimportant.

Estimating probabilities from frequency tables
The sample proportion

Sample proportion = (# of subjects having the trait) / (# of subjects in the sample)

is an estimate of the probability that a subject chosen at random from the population has the trait.

Example: “Are you registered to vote?”

        Yes    No    Total
Freq     10     6       16

The proportion registered is 10/16 = .625
The proportion not registered is 6/16 = .375

Estimating odds from frequency tables
The sample odds

Sample odds = (# of subjects having the trait) / (# of subjects not having the trait)

is an estimate of the odds that a subject chosen at random from the population has the trait.

Example

        Yes    No    Total
Freq     10     6       16

The estimated odds of being registered is 10/6 = 1.67
The estimated odds of not being registered is 6/10 = 0.6
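A short Python check of the sample odds, using the counts from the voter-registration table. The variable names are illustrative. The last line confirms that the sample odds computed directly from the counts agree with converting the sample proportion to odds.

```python
import math

yes, no = 10, 6          # counts from the frequency table
n = yes + no             # sample size

prop_yes = yes / n       # sample proportion registered
odds_yes = yes / no      # sample odds of being registered
odds_no = no / yes       # sample odds of not being registered

print(f"proportion registered = {prop_yes:.3f}")
print(f"odds of being registered = {odds_yes:.2f}")
print(f"odds of not being registered = {odds_no:.1f}")

# Same answer as proportion / (1 - proportion), up to rounding error.
print(math.isclose(odds_yes, prop_yes / (1 - prop_yes)))
```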

3. Measures of association in 2×2 tables
Recall that with two continuous variables, a useful measure of association is the correlation coefficient. For two binary variables, the most common measures of association are
- Relative risk
- Odds ratio
The relative risk is a ratio of proportions. The odds ratio is a ratio of odds.

Estimating the relative risk
- Compute the proportions for each row
- Divide one proportion by the other

               Heart attack?
              Yes        No     Total
Aspirin       104    10,933    11,037
Placebo       189    10,845    11,034
Total         293    21,778    22,071

            Proportion with heart attack
Aspirin     104 / 11,037 = .0094
Placebo     189 / 11,034 = .0171

Example
The estimated relative risk is .0094 / .0171 = 0.55
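The relative-risk calculation can be sketched in Python as follows. The function name and its argument order (yes and no counts for the first row, then for the second row) are choices made for this illustration.

```python
def relative_risk(yes_1, no_1, yes_2, no_2):
    """Ratio of row proportions: (yes_1 / row total 1) / (yes_2 / row total 2)."""
    prop_1 = yes_1 / (yes_1 + no_1)
    prop_2 = yes_2 / (yes_2 + no_2)
    return prop_1 / prop_2

# Aspirin row first, placebo row second.
rr = relative_risk(104, 10933, 189, 10845)
print(f"estimated relative risk = {rr:.2f}")
```

Swapping the two rows gives the reciprocal, so it matters which group is placed in the numerator when the result is reported.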

Computing the odds ratio
- Compute the odds for each row
- Divide one odds by the other

               Heart attack?
              Yes        No     Total
Aspirin       104    10,933    11,037
Placebo       189    10,845    11,034
Total         293    21,778    22,071

            Estimated odds of heart attack
Aspirin     104 / 10,933 = .0095
Placebo     189 / 10,845 = .0174

Example
The estimated odds ratio is .0095 / .0174 = 0.55

Easier way to estimate the odds ratio
If the frequencies in the 2×2 table are

a   b
c   d

then the estimated odds ratio is (a×d) / (b×c).

               Heart attack?
              Yes        No     Total
Aspirin       104    10,933    11,037
Placebo       189    10,845    11,034
Total         293    21,778    22,071

Example
The estimated odds ratio is (104 × 10,845) / (10,933 × 189) = 0.55
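As a quick check, the cross-product form (a×d) / (b×c) gives the same estimate as dividing the two row odds. The Python sketch below uses an illustrative function name, with a, b as the first row and c, d as the second row of the table.

```python
import math

def odds_ratio(a, b, c, d):
    """Cross-product estimate of the odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Aspirin row: a = 104, b = 10,933. Placebo row: c = 189, d = 10,845.
or_cross = odds_ratio(104, 10933, 189, 10845)

# The longer route: odds for each row, then their ratio.
or_rows = (104 / 10933) / (189 / 10845)

print(f"cross-product odds ratio = {or_cross:.2f}")
print(f"ratio of row odds        = {or_rows:.2f}")
print(math.isclose(or_cross, or_rows))
```

Because heart attacks are rare in both groups, the odds ratio here is close to the relative risk of 0.55 computed from the row proportions.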