Chapter 3 Unusual points and cautions in regression.

Look for Outliers & Influential Observations Does the age at which a child begins to talk predict later scores on a test of mental ability? Analyze the following data (using the technology toolbox!)

Remember the toolbox... 1. Answer the key questions. 2. Graph the data. 3. Calculate numerical summaries. 4. When possible, use a mathematical model to represent the data. 5. Interpretation.

1. Key Questions Who? What? When? Where? Why? How? By Whom?

2. Graph

3. Numerical Summaries Which numerical summaries should I report? s_x = 7.947, s_y = [value missing], r = -0.640, r^2 = 0.41

4. Model How do we express the data with a model? The equation of the LSRL is [equation missing]

Residual Plot

Interpretation? What do the graphs, numerical summaries, and model tell you?

Outliers Child 19 is an outlier in the y-direction, with a score so high that we should check for a mistake in recording it. (In fact, it is correct.) Child 18 is an outlier in the x-direction.

Influential Points This picture adds a second regression line (blue), calculated after leaving out Child 18. This one point moves the line quite a bit. In fact, the equation of the new least-squares line is [equation missing], with r = -0.33.

Be aware... In the regression setting, not all outliers are influential. Influential points often have small residuals because they pull the regression line toward themselves. The surest way to verify that a point is influential is to find the regression line both with and without the suspect point.
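The with-and-without check can be sketched in a few lines of Python. This is a minimal illustration on made-up numbers, not the Gesell data: the arrays `x` and `y` and the helper `fit_line` are hypothetical, and the last point is deliberately placed far out in the x-direction.

```python
import numpy as np

# Hypothetical data (NOT the Gesell data): the last point is far out in x.
x = np.array([8, 9, 10, 11, 12, 13, 14, 15, 40], dtype=float)
y = np.array([100, 108, 96, 104, 98, 106, 94, 102, 60], dtype=float)

def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line of y on x."""
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 fit: [slope, intercept]
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r

# Fit once with every point, once with the suspect (last) point left out.
slope_all, intercept_all, r_all = fit_line(x, y)
slope_wo, intercept_wo, r_wo = fit_line(x[:-1], y[:-1])

# Residual of the suspect point, measured from the line fitted to ALL points.
resid_suspect = y[-1] - (intercept_all + slope_all * x[-1])

print(f"all points:     slope = {slope_all:.3f}, r = {r_all:.3f}")
print(f"suspect removed: slope = {slope_wo:.3f}, r = {r_wo:.3f}")
print(f"residual of suspect point = {resid_suspect:.2f}")
```

In this made-up example the slope and correlation change sharply when the point is dropped, yet the point's own residual is small, which is why a residual plot alone can miss an influential observation.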

Gesell scores continued The original data have r^2 = 0.41. That is, the LSRL relating the age at which a child begins to talk to Gesell score explains 41% of the variation in scores on this later test of mental ability. This relationship is strong enough to be interesting to parents. If we leave out Child 18, r^2 drops to only 0.11 (11%). The apparent strength of the association was largely due to a single influential observation. Wow!
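As a quick check, both percentages follow from squaring the correlations reported on the earlier slides (r = -0.640 with all children, r = -0.33 without Child 18):

```latex
r^2 = (-0.640)^2 \approx 0.41, \qquad r^2 = (-0.33)^2 \approx 0.11
```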

Gesell scores continued What should the researcher do? Without Child 18, the evidence for a connection between the variables vanishes. If she keeps Child 18, she needs data on other children who were also slow to begin talking so that the analysis no longer depends so heavily on just one child.

Beware the Lurking Variable

Ice cream causes drowning? The amount of ice cream consumed and the number of drowning deaths are positively associated. This might lead someone to wonder if eating ice cream causes drowning. What other variable might influence both ice cream consumption and drowning deaths?

Nonsense Correlation How close is the linear relationship between these two variables? Guess the correlation.

Nonsense Correlation Now look at the labels on the bottom and side... Is the amount of goods imported to the United States really related to private health spending? No. In fact, any two variables that both increase over time will show a strong positive association. This does not mean that one variable explains or influences the other.

Nonsense Correlation Nonsense correlations are real correlations. Just make sure you understand that association does not imply causation.
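A small simulation can illustrate how a shared upward trend produces a nonsense correlation. This is only a sketch: the two series below are invented (they are not the real import or health-spending figures), and the names `imports` and `health` are just labels for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(50)

# Two hypothetical series that both rise over time for unrelated reasons.
imports = 100 + 5.0 * years + rng.normal(0, 10, size=years.size)
health = 20 + 2.0 * years + rng.normal(0, 4, size=years.size)

# Correlation of the raw levels: dominated by the shared trend.
r_levels = np.corrcoef(imports, health)[0, 1]

# Correlation of year-to-year changes: the common trend is removed.
r_changes = np.corrcoef(np.diff(imports), np.diff(health))[0, 1]

print(f"r for levels  = {r_levels:.3f}")
print(f"r for changes = {r_changes:.3f}")
```

The levels are strongly correlated even though neither series was built from the other; the year-to-year changes, which no longer share the trend, should show a much weaker association.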

Lurking Variable Hides Relationship A housing study in Hull, England compared overcrowding with lack of toilets. The researchers expected the two to be correlated...

Lurking Variable Hides Relationship but found the correlation to be only r = 0.08!

But when they looked again... A lurking variable, the amount of public housing, actually divided the data into two clusters, which when looked at as a whole made the variables look uncorrelated. [Figure: scatterplot with axes "overcrowding" and "lack of public toilets". One cluster is labeled "These areas had lots of public housing (and more toilets)"; the other is labeled "These areas had less public housing (and fewer toilets)".]
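The effect can be reproduced with a toy data set. The numbers below are invented for illustration (they are not the Hull data): within each cluster the two variables move together, but the cluster with lots of public housing sits lower on the "lack of toilets" axis, so the pooled correlation nearly vanishes.

```python
import numpy as np

# Hypothetical index values, NOT the Hull study data.
# Cluster A: areas with less public housing.
overcrowd_a = np.array([1, 2, 3, 4, 5], dtype=float)
lack_a = np.array([5.2, 5.8, 7.0, 8.2, 8.8])

# Cluster B: areas with lots of public housing (more crowded, more toilets).
overcrowd_b = np.array([6, 7, 8, 9, 10], dtype=float)
lack_b = np.array([3.6, 4.2, 5.4, 6.6, 7.2])

def corr(u, v):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(u, v)[0, 1]

r_a = corr(overcrowd_a, lack_a)  # within cluster A
r_b = corr(overcrowd_b, lack_b)  # within cluster B
r_all = corr(np.concatenate([overcrowd_a, overcrowd_b]),
             np.concatenate([lack_a, lack_b]))  # clusters pooled

print(f"r within A = {r_a:.3f}")
print(f"r within B = {r_b:.3f}")
print(f"r pooled   = {r_all:.3f}")
```

Plotting the points with a different color per cluster is usually the quickest way to spot this kind of hidden structure.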

Beware Correlations Based on Averaged Data Correlations based on averages are usually higher than correlations based on individual scores.
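A short simulation suggests why this happens: averaging within groups smooths out individual variation, so the averages hug the line more tightly than the individuals do. The setup below is an assumption made for illustration (10 groups of 100 simulated individuals, with y equal to x plus noise), not data from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 10 groups (x = 1..10), 100 individuals per group.
groups = np.repeat(np.arange(1, 11), 100)
x = groups.astype(float)
y = x + rng.normal(0, 5, size=x.size)  # large individual-level scatter

# Correlation computed from individual observations.
r_individual = np.corrcoef(x, y)[0, 1]

# Correlation computed from the group averages of y.
group_ids = np.unique(groups)
y_means = np.array([y[groups == g].mean() for g in group_ids])
r_averages = np.corrcoef(group_ids, y_means)[0, 1]

print(f"r from individuals    = {r_individual:.3f}")
print(f"r from group averages = {r_averages:.3f}")
```

The underlying relationship is identical in both calculations; only the amount of scatter around it differs, so a high correlation between averages should not be read as a strong relationship for individuals.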