Statistics 200 Lecture #7 Tuesday, September 13, 2016


Textbook: Sections 4.1 through 4.2
Objectives (for two categorical variables and their relationship):
• Understand two-way tables of counts (a.k.a. contingency tables)
• Describe and calculate two types of conditional percentages: row percentages and column percentages
• Calculate and interpret risk, relative risk, increased risk
• Calculate and interpret odds, odds ratio
• Also: Discuss Exam #1

Two Categorical Variables Summarized in Contingency Tables
Categorical variables are non-numeric variables that have a finite number of possible values. Just as we can examine relationships between quantitative variables, we can also examine relationships between categorical variables. To do so, we use two-way tables, a.k.a. contingency tables.

Example: Two Survey Questions
300 students were asked…
1. Do you like to take “selfies”? (yes) (no)
2. What is your sex? (female) (male)
Question: Is there an association (relationship) between the answers to the two questions?

Example 1: 2 × 2 Contingency Table
                Yes    No    Row Total
Female           90   110       200
Male             30    70       100
Column Total    120   180       300
Notice that the raw data (counts) are in the interior cells of the two-way table (the blue area on the slide). Along the margins are totals. These totals are sometimes called marginals.

Displaying data from a two-way table
Use a bar graph! Does it look like there is a relationship between sex and whether or not someone takes selfies?

Many quantities to calculate
• Conditional percentages: row percents and column percents
• Risk, relative risk, percent increase or decrease in risk
• Odds and the odds ratio

Conditional percentages
Conditional percentages are percentages calculated within either rows or columns of the two-way table. We use them to see distributions of frequencies within rows or columns.
• Column percentages: percentage of observations in a particular column category that are in a specified category of the row.
• Row percentages: percentage of observations in a particular row category that are in a specified category of the column.

Easier to see in an example! Calculating column percentages
                Yes    Yes column %    No    Row Total
Female           90      a. 75%       110       200
Male             30      b. 25%        70       100
Column Total    120                   180       300
Column % = (# observations in cell) / (column total)
a. Female, Yes: 90 / 120 = 0.75, or 75%
b. Male, Yes: 30 / 120 = 0.25, or 25%

Easier to see in an example! Calculating row percentages
                Yes         No          Row Total
Female           90 (a. 45%)  110 (b. 55%)   200
Male             30           70             100
Column Total    120          180             300
Row % = (# observations in cell) / (row total)
a. Female, Yes: 90 / 200 = 0.45, or 45%
b. Female, No: 110 / 200 = 0.55, or 55%
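
The row and column percentages for the selfie table can also be computed programmatically. The following is a minimal sketch rather than anything from the lecture itself; the function names `row_percents` and `column_percents` and the nested-dictionary layout are illustrative assumptions.

```python
# Counts from the selfie survey: outer keys are rows (sex),
# inner keys are columns (answer). The layout is an assumption for illustration.
counts = {
    "Female": {"Yes": 90, "No": 110},
    "Male": {"Yes": 30, "No": 70},
}

def row_percents(table):
    """Percent of each row total that falls in each column category."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: 100 * n / row_total for col, n in cells.items()}
    return result

def column_percents(table):
    """Percent of each column total that falls in each row category."""
    columns = list(next(iter(table.values())).keys())
    col_totals = {col: sum(cells[col] for cells in table.values()) for col in columns}
    return {
        row: {col: 100 * n / col_totals[col] for col, n in cells.items()}
        for row, cells in table.items()
    }

row_pct = row_percents(counts)
col_pct = column_percents(counts)
print("Row percents:", row_pct)
print("Column percents:", col_pct)
```

The Female row reproduces the 45% and 55% worked out by hand, and the Yes column reproduces the 75% and 25%. The same call also gives the Male row percentages (30% Yes, 70% No), which were not computed on the slide.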

More on row percents
Sample interpretation: 45% of women surveyed said they take selfies.
We use row percents more than column percents.
We say that there is a relationship between the variables if the rows in a two-way table have different distributions of row percentages.

Risk
When an outcome is undesirable, we can describe the risk for that outcome: the proportion of individuals within a group that fall in that category, often expressed as a percent. Closely related to row percents. You will have to know this very simple formula:
Risk = (number in category) / (total number in group)

Example: risk
We look at summarized data from women who have been screened for the breast cancer gene BRCA1 and who have also been followed to see whether or not they develop breast cancer by the age of 70.
            Cancer    No cancer    Total
BRCA1         60         40         100
No BRCA1      12         88         100
Total         72        128         200

Example: risk (continued)
From the BRCA1 table:
For women with BRCA1, the risk of developing breast cancer by age 70 is 60/100 = .6, or 60%.
For women without BRCA1, the risk of developing breast cancer by age 70 is 12/100 = .12, or 12%.
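
As a quick check of the two risks, here is a minimal sketch in Python. The helper name `risk` is an illustrative assumption; it simply divides the count with the outcome by the group total, as in the slide.

```python
def risk(count_with_outcome, group_total):
    """Risk = number in the group with the outcome / total number in the group."""
    if group_total <= 0:
        raise ValueError("group_total must be positive")
    return count_with_outcome / group_total

risk_brca1 = risk(60, 100)     # women with BRCA1
risk_no_brca1 = risk(12, 100)  # women without BRCA1

print(f"BRCA1: {risk_brca1:.0%}, no BRCA1: {risk_no_brca1:.0%}")
```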

Relative Risk
To see how risk relates to the explanatory variable.
Ratio of risks in two different categories of the explanatory variable.
We sometimes call the denominator the baseline risk.

Example: relative risk
For women with BRCA1, the relative risk of developing breast cancer by age 70, compared with women without BRCA1, is
(60/100) / (12/100) = 5
Interpretation: Women with BRCA1 have 5 times the risk of developing breast cancer that women without BRCA1 have.
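
The ratio above can be sketched in code as follows. This is an illustration only: the function name `relative_risk` is an assumption, and `fractions.Fraction` is used so that the arithmetic is exact rather than subject to floating-point rounding.

```python
from fractions import Fraction

def relative_risk(risk_group, risk_baseline):
    """Ratio of the risk in one group to the baseline risk (the denominator)."""
    if risk_baseline == 0:
        raise ZeroDivisionError("baseline risk is zero; relative risk is undefined")
    return risk_group / risk_baseline

# Risks taken from the BRCA1 example: 60/100 with the gene, 12/100 without.
rr = relative_risk(Fraction(60, 100), Fraction(12, 100))
print(rr)
```

Swapping the two arguments makes the BRCA1 group the baseline and gives a relative risk of 1/5, which is why it matters to state which group is in the denominator.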

Features of Relative Risk
• If two groups have the same risk, their relative risk is 1.
• If the numerator has a bigger risk, the relative risk will be greater than 1.
• If the denominator has a bigger risk, the relative risk will be less than 1.

Percent increase or decrease in risk
We can also present an increase or decrease in risk as a percent change. You can find the percent increase in risk with two equivalent formulas:
Percent increase in risk = (Difference in risks / Baseline risk) × 100%
Percent increase in risk = (relative risk – 1) × 100%

Example: percent increase in risk
Using the BRCA1 risks (.6 with the gene, .12 without):
Percent increase in risk = (relative risk – 1) × 100% = (5 – 1) × 100% = 400%
Percent increase in risk = (Difference in risks / Baseline risk) × 100% = (.6 – .12) / .12 × 100% = (.48/.12) × 100% = 4 × 100% = 400%
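
To confirm that the two formulas agree on the BRCA1 numbers, here is a small sketch. The variable names are illustrative assumptions, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

risk_brca1 = Fraction(60, 100)     # risk with BRCA1
baseline_risk = Fraction(12, 100)  # risk without BRCA1 (the baseline)

# Formula 1: difference in risks divided by the baseline risk, times 100%.
pct_from_difference = (risk_brca1 - baseline_risk) / baseline_risk * 100

# Formula 2: (relative risk - 1) times 100%.
relative_risk = risk_brca1 / baseline_risk
pct_from_relative_risk = (relative_risk - 1) * 100

print(pct_from_difference, pct_from_relative_risk)
```

The two agree because (risk – baseline) / baseline is algebraically the same as (risk / baseline) – 1.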

Interpretation: percent increase in risk
Women with BRCA1 have a risk of cancer that is 400% higher than the risk of cancer for women without BRCA1.
When the risk is smaller than the baseline risk, we call the percent change in the risks a percent decrease.

Odds
Compares the chance an event happens to the chance that it does not. Phrased as “3 to 1” or “1 to 2”, for example, although a ratio is implied.
From the BRCA1 table, the odds of cancer for women with BRCA1 are 60 to 40. We can simplify to say 3 to 2.
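
Simplifying "60 to 40" to "3 to 2" amounts to dividing both counts by their greatest common divisor. A minimal sketch, assuming a hypothetical helper named `odds_as_phrase`:

```python
from math import gcd

def odds_as_phrase(count_event, count_no_event):
    """Express odds as 'a to b' in lowest terms."""
    divisor = gcd(count_event, count_no_event)
    return f"{count_event // divisor} to {count_no_event // divisor}"

odds_brca1 = odds_as_phrase(60, 40)     # cancer vs. no cancer, with BRCA1
odds_no_brca1 = odds_as_phrase(12, 88)  # cancer vs. no cancer, without BRCA1

print(odds_brca1)
print(odds_no_brca1)
```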

Odds ratio
Sometimes, it’s useful to compare the odds of two different groups of individuals.
Odds ratio = (Odds in category 1) / (Odds in category 2)
For the BRCA1 table: (60/40) / (12/88) = 1.5 / .136 = 11.03
(The .136 is 12/88 rounded to three decimal places; without that rounding, the ratio is exactly 11.)
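
The odds ratio can be checked with exact arithmetic. In this sketch (variable names are illustrative assumptions), the unrounded ratio comes out to exactly 11, while repeating the slide's arithmetic with 12/88 rounded to .136 gives the 11.03 quoted in the lecture.

```python
from fractions import Fraction

odds_brca1 = Fraction(60, 40)     # odds of cancer with BRCA1
odds_no_brca1 = Fraction(12, 88)  # odds of cancer without BRCA1

# Exact odds ratio, with no intermediate rounding.
odds_ratio = odds_brca1 / odds_no_brca1

# The slide's arithmetic, which rounds 12/88 to .136 before dividing.
rounded_version = 1.5 / 0.136

print(odds_ratio, round(rounded_version, 2))
```

The small gap between the two values comes entirely from rounding the denominator before dividing.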

Odds ratio interpretation
We found that the odds ratio that compares the odds of cancer for women with BRCA1 to the odds for women without BRCA1 is 11.03.
Interpretation: The odds of cancer for women with BRCA1 are 11.03 times the odds for women without BRCA1.

Numbers we use to summarize 2×2 Tables
One group (from the table):
• Individual risk: risk for one group
• Odds: compares two possible outcomes within one group
Comparing two groups (single number):
• Relative risk (ratio)
• Increased risk (percent)

Match the quantity with its statistical name (important skill!)
• Women athletes who play certain sports are five times as likely to tear their anterior cruciate ligament (ACL) when compared against men who play the same sports. Five times as likely: relative risk of 5.
• Americans, by 2-1, predict that the agreement to raise the debt ceiling will make the nation worse rather than better. Two to one: odds.
• 49.8 percent of Americans who are 16 years or older are married, according to the Bureau of Labor Statistics. 49.8% of a single group: risk.

Things to watch out for
When reading reports that cite risk-type statistics, you should ask yourself some questions:
• What are the actual risks? What is the baseline risk?
• What is the population for which the reported risk or relative risk applies?
• What is the time period for the risk?
For instance: In 2000, it was reported that there were 54 unprovoked shark attacks. But wait! When you take into account the number of people at the beach, that’s only 1 attack for every 11.5 million beach visits.

Review: If you understood today’s lecture, you should be able to solve 4.3, 4.5, 4.7, 4.9, 4.15, 4.17, 4.19, 4.21, 4.25, 4.29
Recall Objectives (for two categorical variables):
• Understand two-way tables of counts (a.k.a. contingency tables)
• Describe and calculate two types of conditional percentages: row percentages and column percentages
• Calculate and interpret risk, relative risk, increased risk
• Calculate and interpret odds, odds ratio