Chapter 4: More on Two-Variable Data

“Each of us is a statistical impossibility around which hover a million other lives that were never destined to be born.” Loren Eiseley

4.1 Some models for scatterplots with non-linear data
Exponential growth: a growth or decay function. Form: y = a·b^x
Power function. Form: y = a·x^b

Logarithms
Rules for logarithms:
log(AB) = log A + log B
log(A/B) = log A - log B
log(A^p) = p log A

In other words… The log of a product is the sum of the logs. The log of a quotient is the difference of the logs. The log of a power is the power times the log.
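The log rules are what make these models fittable with an ordinary regression line. Assuming the usual forms y = a·b^x (exponential) and y = a·x^b (power), taking logs gives log y = log a + x log b and log y = log a + b log x, both of which are straight lines. The sketch below is only an illustration under those assumptions: the function names and the sample data are made up for the example, and NumPy's `polyfit` stands in for whatever least-squares tool the course uses.

```python
import numpy as np

def fit_exponential(x, y):
    """Fit y = a * b**x by regressing log10(y) on x."""
    slope, intercept = np.polyfit(x, np.log10(y), 1)
    # log y = log a + x log b, so a = 10**intercept and b = 10**slope
    return 10 ** intercept, 10 ** slope

def fit_power(x, y):
    """Fit y = a * x**b by regressing log10(y) on log10(x)."""
    slope, intercept = np.polyfit(np.log10(x), np.log10(y), 1)
    # log y = log a + b log x, so a = 10**intercept and b = slope
    return 10 ** intercept, slope

# Hypothetical, noise-free data generated from known models
x = np.arange(1, 11, dtype=float)
a_exp, b_exp = fit_exponential(x, 3.0 * 1.5 ** x)
a_pow, b_pow = fit_power(x, 2.0 * x ** 0.5)

print("exponential fit: a =", round(a_exp, 3), "b =", round(b_exp, 3))
print("power fit:       a =", round(a_pow, 3), "b =", round(b_pow, 3))
```

With real data the transformed scatterplot should be checked for linearity before trusting either fit.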

4.2 Interpreting Correlation and Regression
Overview: Correlation and regression need to be interpreted with CAUTION. Two variables may be strongly associated, but this DOES NOT MEAN that one causes the other. High correlation does not imply causation! We need to consider lurking variables and common response.

Extrapolation
The use of a regression line or curve to make a prediction outside the range of values of the explanatory variable x that was used to obtain the line or curve. Such predictions often cannot be trusted.
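A small numerical sketch can show why extrapolated predictions are suspect. The data here are hypothetical: y is generated as x squared but observed only for x from 1 to 10, and a least-squares line is fitted to those points. The helper names `predict` and `is_extrapolation` are invented for this example.

```python
import numpy as np

# Hypothetical data: the true relationship is y = x**2, observed only on 1..10
x_obs = np.arange(1, 11, dtype=float)
y_obs = x_obs ** 2

# Least-squares line fitted to the observed range
slope, intercept = np.polyfit(x_obs, y_obs, 1)

def predict(x):
    """Prediction from the fitted line."""
    return slope * x + intercept

def is_extrapolation(x):
    """True when x lies outside the observed range of the explanatory variable."""
    return x < x_obs.min() or x > x_obs.max()

# Compare the prediction error inside and outside the observed range
inside_error = abs(predict(5.0) - 5.0 ** 2)
outside_error = abs(predict(30.0) - 30.0 ** 2)

print("x = 5  extrapolation?", is_extrapolation(5.0), "error:", round(inside_error, 1))
print("x = 30 extrapolation?", is_extrapolation(30.0), "error:", round(outside_error, 1))
```

The line describes the observed range tolerably, but far outside that range the error grows much larger, because nothing in the data constrains the line there.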

Lurking Variable
A variable that affects the relationship between the variables in the study but is NOT INCLUDED among the variables studied. Example: a strong positive association might exist between shirt size and intelligence for teenage boys. A lurking variable is AGE: shirt size and intelligence among teenage boys both generally increase with age.

If there is a strong association between two variables x and y, any one of the following statements could be true:
- x causes y. Association DOES NOT imply causation, but causation could exist.
- Both x and y are responding to changes in some unobserved variable or variables. This is called common response.
- The effect of x on y is hopelessly mixed up with the effects of other variables on y. This is called confounding. It is always a potential problem in observational studies, and it can be somewhat controlled in experiments with a control group and a treatment group.
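Common response can be illustrated with a simulation of the shirt-size example. Everything in this sketch is assumed for illustration: the sample size, the age range, the noise levels, and the variable names are invented, and age is made the only thing driving both shirt size and test score. The two variables then correlate strongly, yet the association largely disappears once age is accounted for.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical model: age drives both variables; neither affects the other
age = rng.uniform(13, 19, size=n)
shirt = age + rng.normal(0, 1, size=n)
score = age + rng.normal(0, 1, size=n)

# Raw correlation between shirt size and score
r_raw = np.corrcoef(shirt, score)[0, 1]

def residuals(y, x):
    """Residuals of y after a least-squares line on x is removed."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Correlation after removing the linear effect of age from both variables
r_adjusted = np.corrcoef(residuals(shirt, age), residuals(score, age))[0, 1]

print("correlation ignoring age:     ", round(r_raw, 2))
print("correlation adjusting for age:", round(r_adjusted, 2))
```

In real data the lurking variable is by definition not measured, so this adjustment is usually unavailable; the simulation only shows what the unobserved variable can do.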

4.3 Relations in Categorical Data
Overview: So far, we have studied relationships with a quantitative response variable. We can also see relations between two or more categorical variables by setting up tables.

Notation
Prob(X) is the probability that X is true. Prob(X/Y) is the probability that X is true, given that Y is true.

Two-way Table
Describes the relationship between two categorical variables: a row variable and a column variable.
Row totals and column totals give the MARGINAL DISTRIBUTIONS of the two variables separately. They DO NOT give any information about the relationship between the variables.
A two-way table can be used in the calculation of probabilities.

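Example: 200 employees of a company are classified according to the table below, where A, B, and C are mutually exclusive.

          Have A   Have B   Have C   Totals
Female        20       40       60      120
Male          30       10       40       80
Totals        50       50      100      200

(The counts in the table are the ones implied by the probabilities worked out in the rest of the example.)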

Example (cont’d):
What is the probability that a randomly chosen person is female? Prob(F) = 120/200 = 60%
What is the probability that a randomly chosen person has property A? Prob(A) = 50/200 = 25%
If a randomly chosen person has property B, what is the probability that the person is female? Prob(F/B) = 40/50 = 80%. Note: this equals Prob(B and F)/Prob(B).

Example (cont’d):
If a randomly chosen person has property C, what is the probability that the individual is male? Prob(M/C) = 40/100 = 40%. Note: this equals Prob(C and M)/Prob(C).
If a randomly chosen person has B or C, what is the probability that the person is male? Prob(M/B or C) = 50/150 = 33.3%
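The marginal and conditional probabilities in this example can be checked with a few lines of code. The cell counts below are an assumption: they are the values implied by the totals and ratios quoted in the example (120 females out of 200, 50 with A, 50 with B, 100 with C, and so on). The helper function names are invented for this sketch.

```python
import numpy as np

rows = ["Female", "Male"]
cols = ["A", "B", "C"]

# Assumed cell counts, inferred from the ratios quoted in the example
counts = np.array([[20, 40, 60],
                   [30, 10, 40]])

total = counts.sum()
row_totals = counts.sum(axis=1)   # marginal distribution of sex (as counts)
col_totals = counts.sum(axis=0)   # marginal distribution of property (as counts)

def prob_row(row):
    """Marginal probability of a row category, e.g. Prob(F)."""
    return row_totals[rows.index(row)] / total

def prob_col(col):
    """Marginal probability of a column category, e.g. Prob(A)."""
    return col_totals[cols.index(col)] / total

def prob_row_given_cols(row, given_cols):
    """Conditional probability of a row category given one or more columns."""
    idx = [cols.index(c) for c in given_cols]
    return counts[rows.index(row), idx].sum() / col_totals[idx].sum()

print("Prob(F)        =", prob_row("Female"))
print("Prob(A)        =", prob_col("A"))
print("Prob(F/B)      =", prob_row_given_cols("Female", ["B"]))
print("Prob(M/C)      =", prob_row_given_cols("Male", ["C"]))
print("Prob(M/B or C) =", round(prob_row_given_cols("Male", ["B", "C"]), 3))
```

Each conditional probability divides by the total of the group being conditioned on, not by the overall total of 200.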

Simpson’s Paradox
The reversal of the direction of a comparison or an association when data from several groups are combined to form a single group. The lurking variables in Simpson’s paradox are categorical. It is an extreme form of the fact that observed associations can be misleading when there are lurking variables.

Example of Simpson’s Paradox
First half of BB season: hits, times at bat, and batting avg. for Caldwell and Wilson.
Second half of BB season: hits, times at bat, and batting avg. for Caldwell and Wilson.
Batting avgs. for entire season: Caldwell: 110/400 = .275; Wilson: 30/105 = .286
Caldwell had a better avg. than Wilson in each half; however, Caldwell ends up with a LOWER OVERALL avg. than Wilson.
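The reversal can be reproduced numerically. The half-season splits in this sketch are hypothetical: they were chosen only so that they add up to the season totals quoted above (110/400 for Caldwell and 30/105 for Wilson) and should not be read as the original table values.

```python
from fractions import Fraction

# Hypothetical (hits, at bats) per half, chosen to match the quoted season totals
halves = {
    "Caldwell": [(60, 200), (50, 200)],
    "Wilson":   [(29, 100), (1, 5)],
}

def average(hits, at_bats):
    """Batting average as an exact fraction."""
    return Fraction(hits, at_bats)

def season_average(splits):
    """Average over the whole season: total hits over total at bats."""
    total_hits = sum(hits for hits, _ in splits)
    total_at_bats = sum(at_bats for _, at_bats in splits)
    return Fraction(total_hits, total_at_bats)

for player, splits in halves.items():
    per_half = [round(float(average(h, ab)), 3) for h, ab in splits]
    overall = round(float(season_average(splits)), 3)
    print(player, "halves:", per_half, "season:", overall)
```

The reversal comes from the unequal numbers of at bats: almost all of Wilson's at bats fall in the half where both players hit better, so his season average is weighted toward that half.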