AP Statistics Causation & Relations in Categorical Data.

Presentation transcript:

AP Statistics Causation & Relations in Categorical Data

HW Questions???

Causation • The old adage: “Correlation does not necessarily imply causation.” • In statistics we can often find a connection, a correlation, or an association between two variables, but it is much harder to prove that the explanatory variable actually causes the response variable to respond.

Example 1 • X = number of completed passes a quarterback throws • Y = passing yardage for the quarterback • It’s reasonable to assume that the more passes he completes, the more yards he’ll rack up.

Example 2 • X = number of ounces of alcohol consumed • Y = fine motor skill ability • Again, it’s reasonable to assume that x is actually causing y to respond.

Common Response • Two variables show a strong association because a third variable is causing both of them to respond. • X = a student’s ACT score • Y = a student’s SAT score • It’s fair to assume that one’s intelligence, or lack of it, will cause high, or low, scores on both tests. Having a high ACT score doesn’t cause the SAT score to be high.
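A small simulation can make the common-response idea concrete. This is only a sketch under assumptions that are not in the slides: `z` is a hypothetical "underlying ability" variable, and `x` and `y` are made-up standardized scores that each depend on `z` plus independent noise. Neither score is computed from the other, yet they end up strongly correlated.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical lurking variable z (not from the slides).
z = [random.gauss(0, 1) for _ in range(1000)]

# x and y each respond to z; y is never computed from x, or x from y.
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [zi + random.gauss(0, 0.5) for zi in z]

r = pearson_r(x, y)
print(f"correlation between x and y: {r:.2f}")
```

The correlation comes out high even though changing `x` directly would do nothing to `y`, which is the situation the ACT/SAT slide describes.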

Confounding • Two variables are confounded when their effects on a response variable cannot be distinguished from each other. • It looks like x is causing y, but another variable z is also acting on y, and it’s hard to sort out which is doing what.

Example 1 • X = number of Mentos dropped into the soda bottle • Y = height of the soda spray • One might think that x causes y to respond (more Mentos = higher spray), but if z, the temperature of the soda, is also changing, you can’t tell which variable did what.

Example 2 • X = latitude at which a person lives • Y = lifespan • The variables are associated, but it’s hard to know whether living at a higher latitude causes you to live longer, or whether it just happens that poorer countries tend to be in the tropics and it’s the poverty that is reducing the lifespan.

How to establish causation??? • You need a controlled experiment, in which the effects of lurking variables are controlled or minimized. See chapter 5!!

Relations in Categorical Data • What if we want to see whether there is an association between categorical variables? We can’t make a scatterplot, compute a correlation, or run a regression. • Instead, we make a two-way table.

Example 1: College Students

Age Group      Female    Male     Total
15 – 17        …         …        …
…              …,668     4,697    10,…
… – 34         1,904     1,589    3,…
… or older     1,…       …        …,630
Total          9,321     7,317    16,639

Conditional and Marginal Distribution • The marginal distribution is the distribution of one variable alone: a row or column total divided by the grand total. Ex.: % of college students who are male. • The conditional distribution is the distribution of one variable among only those individuals with a given value of the other variable. Ex.: % of women among 15 – 17 year olds.
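As a rough sketch of the marginal calculation, the snippet below uses only the Total row of the college-students table (9,321 female, 7,317 male, 16,639 overall). The variable names are illustrative. The helper `conditional_share` is a hypothetical function showing the shape of a conditional calculation: one cell count divided by the total for the group being conditioned on.

```python
# Totals taken from the "Total" row of the college-students table.
female_total = 9321
male_total = 7317
grand_total = 16639

# Marginal distribution of gender: each column total over the grand total.
marginal_gender = {
    "Female": female_total / grand_total,
    "Male": male_total / grand_total,
}

for gender, share in marginal_gender.items():
    print(f"{gender}: {share:.1%}")

def conditional_share(cell_count, group_total):
    """Share of a group that falls in one cell, e.g. women among one age group.

    Hypothetical helper: cell_count is one table cell and group_total is the
    total of the row (or column) being conditioned on.
    """
    return cell_count / group_total
```

The only difference between the two calculations is the denominator: the grand total for a marginal distribution, the group total for a conditional one.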

Looking for association • If age and gender were not associated among college students, we’d expect the conditional distribution of gender to be roughly the same in every age group. If there is a big disparity, we might conclude that age and gender in college are associated. • Compute the conditional distribution of gender within each age group.

Do Medical Helicopters Save Lives? • A businessman is trying to cut costs at a hospital and knows that the helicopter program is quite expensive. He gets some data on whether or not the program is effective.

                  Helicopter   Road
Victim died       64           260
Victim survived   136          840
Total             200          1,100

Doesn’t look good • The conditional death rates by transport type are 64/200 = 32% by helicopter and 260/1100 ≈ 24% by road. • As the hospital statistician, what might you do to try to save the helicopter program? I.e., what lurking variables are out there?
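The two percentages above can be checked with a few lines. This sketch uses only the counts quoted on the slide (64 deaths out of 200 helicopter transports, 260 out of 1,100 road transports). The dictionary names are illustrative.

```python
# Counts quoted on the slide.
deaths = {"Helicopter": 64, "Road": 260}
totals = {"Helicopter": 200, "Road": 1100}

# Conditional death rate given transport type: deaths / total for that type.
death_rate = {mode: deaths[mode] / totals[mode] for mode in deaths}

for mode, rate in death_rate.items():
    print(f"{mode}: {rate:.1%} of victims died")
```

On the combined data the helicopter looks worse, which is exactly the comparison the businessman would point to.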

Statistics to the Rescue… • Since helicopters are more likely to respond to serious accidents… [Table: Died, Survived, and Total counts for Helicopter and Road, shown separately for Serious Accidents and Less Serious accidents]

Simpson’s Paradox • The reversal of the direction of a comparison or association when data from several groups are combined to form a single group. • I.e., when data aren’t broken out by the relevant lurking variables, the combined figures might not be a true representation.
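To see how such a reversal can happen, here is a sketch with a hypothetical breakdown by accident severity. The per-severity counts are assumptions made up for illustration, not values from the slides. They were chosen only so that they add up to the combined figures quoted earlier (64 deaths out of 200 by helicopter, 260 out of 1,100 by road).

```python
# HYPOTHETICAL (died, total) counts per severity and transport type.
# They are illustrative only, picked so the sums match 64/200 and 260/1100.
strata = {
    "Serious": {"Helicopter": (48, 100), "Road": (60, 100)},
    "Less serious": {"Helicopter": (16, 100), "Road": (200, 1000)},
}

def rate(died, total):
    """Death rate as a fraction."""
    return died / total

# Death rate within each severity group.
within = {
    severity: {mode: rate(*counts) for mode, counts in group.items()}
    for severity, group in strata.items()
}

# Death rate after combining the severity groups.
combined = {}
for mode in ("Helicopter", "Road"):
    died = sum(strata[s][mode][0] for s in strata)
    total = sum(strata[s][mode][1] for s in strata)
    combined[mode] = rate(died, total)

for severity, rates in within.items():
    print(severity, {mode: f"{r:.0%}" for mode, r in rates.items()})
print("Combined", {mode: f"{r:.0%}" for mode, r in combined.items()})
```

With these assumed counts the helicopter has the lower death rate within each severity group but the higher death rate overall, because most helicopter transports are serious accidents while most road transports are not.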

Homework • 4.33, 4.36, 4.37; 4.52 – 54, 4.60