Published by Prosper Reeves. Modified over 9 years ago.
1
Please turn off cell phones, pagers, etc. The lecture will begin shortly.
2
Lecture 20
This lecture will introduce topics from Chapter 12.
1. 2 × 2 frequency tables (Section 12.1)
2. Probability and odds (Section 12.2)
3. Measures of association in 2×2 tables (Section 12.2)
3
1. 2 × 2 frequency tables
In Chapters 10-11, we learned how to describe relationships among continuous variables:
- scatterplots
- correlation coefficients
- regression analysis
Now we begin to examine relationships between categorical variables. More specifically, we’ll consider relationships between variables that are binary.
4
What is a binary variable?
A binary variable is a measurement that has only two possible outcomes. These are also known as dichotomous variables.
Examples:
- sex (male or female)
- treatment in a two-armed experiment (e.g. aspirin or placebo)
- whether a subject has a trait or condition (e.g. cancer or no cancer)
- survival after a specified period of time (alive or dead)
5
Frequency table for a binary variable
Suppose we take a sample of n subjects and record a binary variable for each subject. For example: ask 16 students, “Are you registered to vote?”

YYNYYNYYNNYYYNNY

A frequency table (or contingency table) records the number of subjects in each category:

         Yes   No   Total
Freq      10    6      16
6
Proportions and percentages
Once you have the frequency table, you can compute the proportions and percentages in each category by dividing the frequencies by the sample size.

         Yes   No   Total
Freq      10    6      16

Registered: Proportion = 10/16 = 0.625, Percentage = 0.625 × 100 = 62.5%
Unregistered: Proportion = 6/16 = 0.375, Percentage = 0.375 × 100 = 37.5%
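The tally and the proportions above can be reproduced in a few lines of Python. This is only a minimal sketch: the variable names are illustrative and not part of the lecture.

```python
from collections import Counter

# The 16 responses to "Are you registered to vote?" from the example.
responses = "YYNYYNYYNNYYYNNY"

counts = Counter(responses)   # frequency of each answer
n = sum(counts.values())      # sample size

for answer in ("Y", "N"):
    proportion = counts[answer] / n
    print(f"{answer}: freq={counts[answer]}, "
          f"proportion={proportion:.3f}, percent={proportion * 100:.1f}%")
```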
7
Two binary variables
Suppose that you now have two binary variables for each subject.

Subject   Sex   Registered?
1         M     N
2         M     N
3         F     Y
4         M     Y
5         F     Y
6         F     N
7         M     N
8         F     N
9         F     Y
10        F     Y
11        M     Y
12        F     Y
13        M     Y
14        M     N
15        F     Y
16        M     Y

The 2×2 frequency table (also known as a 2×2 contingency table) records the number of subjects in each of the four possible categories.

           Registered?
           Yes   No
Male         4    4
Female       6    2
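As a rough illustration, the four cell counts can be tallied directly from the subject listing. The two strings below transcribe the Sex and Registered? columns in subject order; the names `sex`, `registered`, and `cells` are assumptions made for this sketch.

```python
from collections import Counter

# Sex and registration status for subjects 1-16, in order.
sex        = "MMFMFFMFFFMFMMFM"
registered = "NNYYYNNNYYYYYNYY"

# Count each (sex, registered) pair to fill the four cells of the 2x2 table.
cells = Counter(zip(sex, registered))

print("         Yes  No")
print(f"Male     {cells[('M', 'Y')]:>3}  {cells[('M', 'N')]:>2}")
print(f"Female   {cells[('F', 'Y')]:>3}  {cells[('F', 'N')]:>2}")
```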
8
Rows and columns
When creating a 2×2 table, it’s customary to make the
- rows correspond to the explanatory variable
- columns correspond to the response variable

Right:
             Heart attack?
             Yes      No
Aspirin      104      10,933
Placebo      189      10,845

Wrong:
                Aspirin   Placebo
Heart attack    104       189
No attack       10,933    10,845
9
Margins
We often add an extra row and column to hold the row and column totals. These are called the margins.

             Heart attack?
             Yes      No        Total
Aspirin      104      10,933    11,037
Placebo      189      10,845    11,034
Total        293      21,778    22,071

Row totals: 104 + 10,933 = 11,037 and 189 + 10,845 = 11,034.
Column totals: 104 + 189 = 293 and 10,933 + 10,845 = 21,778.
Grand total (sample size n): 11,037 + 11,034 = 22,071, or equivalently 293 + 21,778 = 22,071.
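The margins are just row sums, column sums, and their common total. A small sketch, assuming the table is stored as a dictionary keyed by row label (a layout chosen here for illustration only):

```python
# Aspirin study counts: rows are treatments,
# columns are (heart attack, no heart attack).
table = {
    "Aspirin": (104, 10_933),
    "Placebo": (189, 10_845),
}

# Row margins: total subjects in each treatment group.
row_totals = {group: sum(cells) for group, cells in table.items()}

# Column margins: total with and without a heart attack.
col_totals = tuple(sum(col) for col in zip(*table.values()))

# The grand total can be reached from either margin.
grand_total = sum(row_totals.values())

print(row_totals, col_totals, grand_total)
```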
10
Row proportions

             Heart attack?
             Yes      No        Total
Aspirin      104      10,933    11,037
Placebo      189      10,845    11,034
Total        293      21,778    22,071

To uncover the relationship between the explanatory (row) variable and the response (column) variable, compute the row proportions and percentages:
- choose one of the first two columns (here, Yes)
- divide each count in that column by the third column (the row total)

             Proportion with heart attack
Aspirin      104 / 11,037 = .0094
Placebo      189 / 11,034 = .0171
11
Percentages and rates
When the row proportions are small, it is customary to express them as percentages, rates per 1,000, per 10,000, per 100,000, etc.
proportion × 100 = percent
proportion × 1,000 = rate per 1,000
proportion × 10,000 = rate per 10,000

Rate of heart attack
             proportion   %       per 1,000
Aspirin      .0094        0.94    9.4
Placebo      .0171        1.71    17.1
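The row proportions and their rescaled versions can be checked with a short calculation. This is a sketch under the assumption that each row is stored as (heart attacks, row total); the dictionary `rates` is an illustrative name.

```python
# (heart attacks, row total) for each treatment group.
rows = {"Aspirin": (104, 11_037), "Placebo": (189, 11_034)}

rates = {}
for group, (attacks, total) in rows.items():
    proportion = attacks / total
    # Same quantity on three scales: proportion, percent, rate per 1,000.
    rates[group] = (proportion, proportion * 100, proportion * 1_000)

for group, (prop, pct, per_1000) in rates.items():
    print(f"{group}: proportion={prop:.4f}, percent={pct:.2f}, per 1,000={per_1000:.1f}")
```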
12
2. Probability and odds
Probability is a number between 0 and 1 that indicates how likely it is that an event will occur.
- probability = 0 means that the event will never occur
- probability = 1 means that the event will always occur
- probability = 0.5 means that the event is just as likely to occur as not
Values close to zero indicate that the event is unlikely; values close to one indicate that it is likely.
13
Odds
Another measure of how likely an event is to occur is odds. Odds range from 0 to ∞: values close to 0 indicate that the event is unlikely, and large values indicate that it is likely.
- odds = 0 means that the event will never occur
- odds = ∞ means that the event will always occur
- odds = 1 (often written as 1:1, which is the same as 1/1) means that the event is just as likely to occur as not
- odds = 2 (often written as 2:1, which is the same as 2/1) means that the event is twice as likely to occur as not
14
Odds as ratios Gamblers sometimes express odds as a ratio a:b where b is something other than 1. For example, they may say “the odds are 3:2.” Note that odds of 3:2 are the same as 3/2 = 1.5 So if you ever see odds expressed as a:b, you should divide a by b to re-express the odds as a number between 0 and ∞.
15
Odds and probability
Odds and probability are not the same! Probability runs from 0 to 1, while odds run from 0 to ∞.

Prob     Odds
0        0
.25      0.333
.50      1
.667     2
.75      3
.8       4
16
Converting probability to odds
Given a probability, you can find the odds by the formula

odds = probability / (1 - probability)

Examples
Prob = .5 corresponds to odds = .5 / .5 = 1
Prob = .7 corresponds to odds = .7 / .3 = 2.33
Prob = .9 corresponds to odds = .9 / .1 = 9
Prob = .99 corresponds to odds = .99 / .01 = 99
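The conversion is a one-line function. The sketch below uses a hypothetical helper name, `probability_to_odds`, and rejects p = 1 because the odds would be infinite.

```python
def probability_to_odds(p):
    """Return the odds p / (1 - p) for a probability 0 <= p < 1."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1 (p = 1 gives infinite odds)")
    return p / (1 - p)

# The four examples from the slide.
for p in (0.5, 0.7, 0.9, 0.99):
    print(f"prob = {p}: odds = {probability_to_odds(p):.2f}")
```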
17
Converting odds to probability
Given the odds, you can find the probability by the formula

probability = odds / (1 + odds)

Examples
odds = .5 corresponds to prob = .5 / 1.5 = 0.333
odds = 3 corresponds to prob = 3/4 = 0.75
odds = 10 corresponds to prob = 10/11 = 0.909
odds = 25 corresponds to prob = 25/26 = 0.962
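Going from odds back to probability uses probability = odds / (1 + odds). A minimal sketch, with `odds_to_probability` as an illustrative name rather than anything defined in the lecture:

```python
def odds_to_probability(odds):
    """Return the probability odds / (1 + odds) for finite odds >= 0."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)

# The four examples from the slide.
for odds in (0.5, 3, 10, 25):
    print(f"odds = {odds}: prob = {odds_to_probability(odds):.3f}")
```

Odds quoted as a:b can be passed in as `a / b`, for example `odds_to_probability(3 / 2)` for odds of 3:2.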
18
Rare events
For rare events (probabilities close to zero), odds and probabilities are nearly the same.

Examples
Prob = .001 corresponds to odds = .001001
Prob = .01 corresponds to odds = .0101
Prob = .02 corresponds to odds = .0204
Prob = .03 corresponds to odds = .0309

When discussing rare events, the distinction between odds and probability is often unimportant.
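One way to see why the two agree for rare events: the difference odds - prob equals prob² / (1 - prob), which shrinks quickly as prob approaches zero. The sketch below prints that gap for the slide's values; the final value, 0.5, is not from the slide and is added here only as a contrast.

```python
def odds_gap(p):
    """Return (odds, odds - p) for a probability 0 <= p < 1."""
    odds = p / (1 - p)
    return odds, odds - p

# Slide values, plus 0.5 as an added (non-slide) contrast case.
for p in (0.001, 0.01, 0.02, 0.03, 0.5):
    odds, gap = odds_gap(p)
    print(f"prob = {p}: odds = {odds:.6f}, gap = {gap:.6f}")
```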
19
Estimating probabilities from frequency tables
The sample proportion

Sample proportion = (# of subjects having the trait) / (# of subjects in the sample)

is an estimate of the probability that a subject chosen at random from the population has the trait.

Example: “Are you registered to vote?”

         Yes   No   Total
Freq      10    6      16

The proportion registered is 10/16 = .625
The proportion not registered is 6/16 = .375
20
Estimating odds from frequency tables
The sample odds

Sample odds = (# of subjects having the trait) / (# of subjects not having the trait)

is an estimate of the odds that a subject chosen at random from the population has the trait.

Example

         Yes   No   Total
Freq      10    6      16

The estimated odds of being registered are 10/6 = 1.67
The estimated odds of not being registered are 6/10 = 0.6
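The sample odds can be computed straight from the two counts, and they agree with converting the sample proportion via odds = p / (1 - p). A brief sketch with illustrative variable names:

```python
# Counts from the voter-registration example.
yes, no = 10, 6

odds_registered = yes / no        # sample odds of being registered
odds_not_registered = no / yes    # sample odds of not being registered

# Cross-check: the same odds obtained from the sample proportion.
p = yes / (yes + no)
odds_from_proportion = p / (1 - p)

print(f"{odds_registered:.2f} {odds_not_registered:.2f} {odds_from_proportion:.2f}")
```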
21
3. Measures of association in 2×2 tables
Recall that with two continuous variables, a useful measure of association is the correlation coefficient. For two binary variables, the most common measures of association are
- Relative risk
- Odds ratio
The relative risk is a ratio of proportions. The odds ratio is a ratio of odds.
22
Estimating the relative risk
- Compute the proportions for each row
- Divide one proportion by the other

             Heart attack?
             Yes      No        Total
Aspirin      104      10,933    11,037
Placebo      189      10,845    11,034
Total        293      21,778    22,071

             Proportion with heart attack
Aspirin      104 / 11,037 = .0094
Placebo      189 / 11,034 = .0171

Example
The estimated relative risk is .0094 / .0171 = 0.55
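The two steps translate directly into code. In this sketch the aspirin proportion is divided by the placebo proportion, matching the example; the variable names are assumptions made for illustration.

```python
# Heart attacks and row totals from the aspirin study table.
aspirin_attacks, aspirin_total = 104, 11_037
placebo_attacks, placebo_total = 189, 11_034

# Step 1: proportion with a heart attack in each row.
risk_aspirin = aspirin_attacks / aspirin_total
risk_placebo = placebo_attacks / placebo_total

# Step 2: divide one proportion by the other.
relative_risk = risk_aspirin / risk_placebo

print(f"relative risk = {relative_risk:.2f}")
```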
23
Computing the odds ratio
- Compute the odds for each row
- Divide one odds by the other

             Heart attack?
             Yes      No        Total
Aspirin      104      10,933    11,037
Placebo      189      10,845    11,034
Total        293      21,778    22,071

             Estimated odds of heart attack
Aspirin      104 / 10,933 = .0095
Placebo      189 / 10,845 = .0174

Example
The estimated odds ratio is .0095 / .0174 = 0.55
24
Easier way to estimate the odds ratio
If the frequencies in the 2×2 table are

a   b
c   d

then the estimated odds ratio is (a×d) / (b×c).

             Heart attack?
             Yes      No        Total
Aspirin      104      10,933    11,037
Placebo      189      10,845    11,034
Total        293      21,778    22,071

Example
The estimated odds ratio is (104 × 10,845) / (10,933 × 189) = 0.55
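The cross-product shortcut (a×d) / (b×c) gives the same number as dividing the two row odds, since (a/b) / (c/d) = (a×d) / (b×c). A short sketch that checks both routes on the aspirin data, using a, b, c, d as in the layout above:

```python
# Cell counts: a, b are the Aspirin row (Yes, No); c, d are the Placebo row.
a, b = 104, 10_933
c, d = 189, 10_845

# Route 1: odds for each row, then their ratio.
odds_ratio_from_rows = (a / b) / (c / d)

# Route 2: the cross-product shortcut.
odds_ratio_cross = (a * d) / (b * c)

print(f"{odds_ratio_from_rows:.2f} {odds_ratio_cross:.2f}")
```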