Lesson 13: Things To Watch out for

Extrapolating Murder The data and regression line for U.S. states relating x = percentage of single-parent families to y = annual murder rate (number of murders per 100,000 people in the population) are given below. Using the equation for the regression line stated in the figure, find the predicted murder rate at x = 0. Questions. a) What is the problem with this prediction? b) What is the general lesson to be learned from this example?

Extrapolating Murder: Conclusions Questions. a) What is the problem with this prediction? The predicted murder rate is -8.25 murders per 100,000 people, which is nonsense because a murder rate cannot be negative. b) What is the general lesson to be learned from this example? In the data, the percentage of single-parent families ranged from 14% to 30%. Following the trend down to 0% single-parent families produces a prediction too far from the data to be reliable.

Winning Marathon Times Winning times in the Boston marathon have followed a straight line decreasing trend from 160 minutes in 1927 to 130 minutes in 2004. After fitting a regression line to the winning times, it can be predicted that the winning time in the year 2300 will be about 13 minutes. Questions. a) What is the problem with this prediction? b) What is the general lesson to be learned from this example?

Winning Marathon Times: Conclusions Questions. a) What is the problem with this prediction? It is not feasible for a person to run a marathon in 13 minutes. Other factors, such as biological limitations, will become more dominant. b) What is the general lesson to be learned from this example? We cannot make reliable predictions too far from the range of the original data.

Summary: Extrapolation is Dangerous Extrapolation refers to using a regression line to predict y-values for x-values outside of the range of data. This is riskier as we move farther from that range.
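The marathon example can be checked with a few lines of arithmetic. The sketch below is only an approximation: it assumes a straight line drawn through the two endpoint values quoted on the slide (160 minutes in 1927 and 130 minutes in 2004), not the actual least-squares fit to all winning times, so its prediction for 2300 (roughly 15 minutes) differs slightly from the slide's "about 13 minutes". The helper names `line_through` and `predict` are made up for this illustration.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# Endpoint values quoted on the slide: (year, winning time in minutes).
# Assumption: these two points stand in for the fitted regression line.
slope, intercept = line_through((1927, 160.0), (2004, 130.0))


def predict(year):
    """Predicted winning time in minutes for a given year."""
    return intercept + slope * year


# Inside the observed range the line gives plausible values.
prediction_1965 = predict(1965)

# Far outside the range the same line gives an impossible time.
prediction_2300 = predict(2300)

# Year in which the line predicts a winning time of zero minutes.
zero_year = -intercept / slope

print(f"slope: {slope:.3f} minutes per year")
print(f"predicted time in 1965: {prediction_1965:.1f} minutes")
print(f"predicted time in 2300: {prediction_2300:.1f} minutes")
print(f"line reaches zero minutes in about {zero_year:.0f}")
```

The line behaves sensibly between 1927 and 2004, but nothing in the arithmetic stops it from heading toward zero (and then negative) winning times, which is why predictions far outside the data range should not be trusted.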

TV Watching and the Birth Rate The plot below shows recent data on x = the number of televisions per 100 people and y = the birth rate (number of births per 1000 people) for six African and Asian nations. The regression line y = 29.8 – 0.024x applies to the data for these six countries. If the data for the United States (81, 15.2) is added, the regression line for all seven points is y = 31.2 – 0.195x. In addition, the correlation is r = -0.051 without the United States and r = -0.935 with the United States. Questions. a) In what ways does the United States appear to be an outlier? b) What would you conclude about the strength of the linear association without the United States, and with the United States? c) What is the general lesson to be learned from this example?

TV Watching and the Birth Rate: Conclusions Questions. a) In what ways does the United States appear to be an outlier? With respect to both the x variable and the y variable, as well as the trend. b) What would you conclude about the strength of the linear association without the United States, and with the United States? It is not a strong association without the United States, but is quite strong with the United States. c) What is the general lesson to be learned from this example? Outliers can have quite an effect on the regression line and the correlation.

Education and Expensive Homes A regression line for the data x = number of years of education and y = annual income for 100 people shows a modest positive trend, except for one person who dropped out after the 10th grade but is now a multi-millionaire. The correlation for the data including all 100 people is r = -0.28. Questions. a) What can we say about the association if r = -0.28? b) Is your answer to part a) what we would want to report in our findings? c) What is the general lesson to be learned from this example?

Education and Expensive Homes: Conclusions Questions. a) What can we say about the association if r = -0.28? It has a negative direction. b) Is your answer to part a) what we would want to report in our findings? No, the trend for the data, with the exception of this one outlier, had a positive direction. This is a better reflection of the data. c) What is the general lesson to be learned from this example? Outliers can change the direction of the association.

Summary: Be Cautious of Influential Outliers An observation can be an outlier in its x-value, in its y-value, or with respect to the trend of the other observations. Regression outliers are outliers with respect to the trend of the other observations. An observation has a large effect on the regression line and/or the correlation when two conditions hold: its x-value is relatively low or high compared to the rest of the data, and the observation is a regression outlier.
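The effect of a single influential point on the correlation can be reproduced with a small calculation. The seven data points below are invented for illustration and are not the television and birth-rate data from the slide. Six points have small x-values and almost no linear trend, and a seventh has a very large x-value and a low y-value. The function `pearson_r` is a plain implementation of the usual correlation formula, written out so the sketch has no dependencies.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data (NOT the slide's data): six points with no clear trend.
xs = [1, 2, 3, 4, 5, 6]
ys = [30, 28, 31, 29, 30, 29]

# One extra point that is extreme in x and far below the others in y.
outlier_x, outlier_y = 81, 15

r_without = pearson_r(xs, ys)
r_with = pearson_r(xs + [outlier_x], ys + [outlier_y])

print(f"r without the outlier: {r_without:.3f}")
print(f"r with the outlier:    {r_with:.3f}")
```

With these made-up values the correlation is close to zero for the six points alone and strongly negative once the seventh point is included, which matches the pattern described in the television example: one observation that is extreme in x and off the trend can dominate the result.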

Eating Ice Cream and Drowning The “Gold Coast” of Australia is famous for its magnificent beaches. Because of strong rip tides, however, each year many people drown. Data collected each month for x = number of gallons of ice cream sold in refreshment stands along the beach that month and y = number of people who drowned that month shows a positive correlation. Questions. a) Do you think the following is a good conclusion: Eating ice cream at the beach is a contributing factor in deaths from drowning? b) Can you identify another variable that could be responsible for this association? c) What is the general lesson to be learned from this example?

Eating Ice Cream and Drowning: Conclusions Questions. a) Do you think the following is a good conclusion: Eating ice cream at the beach is a contributing factor in deaths from drowning? No, there is an association between the variables, but eating ice cream does not necessarily cause someone to drown. b) Can you identify another variable that could be responsible for this association? Mean temperature for the month. c) What is the general lesson to be learned from this example? Association is not the same as causation, and other variables may be influencing the association.

Smoking May Be Beneficial To Your Health A survey of 1,314 women in the United Kingdom during 1972-1974 asked each woman whether she was a smoker. Twenty years later, a follow-up survey observed whether each woman was dead or still alive. The explanatory variable was smoker/non-smoker and the response variable was survival status after 20 years. We find that 24% of smokers died and 31% of non-smokers died. There was a greater survival rate for the smokers. Questions. a) Do you think the following is a good conclusion: People who smoke tend to have a better survival rate? b) Can you identify another variable that could be responsible for this association? c) What is the general lesson to be learned from this example?

Smoking May Be Beneficial To Your Health: Conclusions Questions. a) Do you think the following is a good conclusion: People who smoke tend to have a better survival rate? No, in this study there was a positive association between smoking and survival rate, but smoking was not necessarily responsible for the better survival rate for smokers. b) Can you identify another variable that could be responsible for this association? Age. c) What is the general lesson to be learned from this example? Association is not the same as causation, and other variables may be influencing the association.

Summary: Association Does Not Imply Causation A lurking variable is a variable, usually unobserved, that influences the association between the variables of primary interest. A confounding variable is observed and possibly influences the variables of primary interest. In the ice cream and drowning example, the lurking variable might be the mean temperature for the month. In the smoking example, the lurking variable was age.

Smoking May Be Beneficial To Your Health The study found that 24% of smokers died and 31% of non-smokers died.

Simpson’s Paradox The direction of an association between two variables can change after we include a third variable and analyze the data at separate levels of the third variable. This is called Simpson’s paradox.
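A small table of counts shows how the reversal can happen. The counts below are invented for illustration and are not the counts from the United Kingdom study. They assume two hypothetical age groups in which smokers are mostly younger and non-smokers are mostly older, so age acts as the third variable.

```python
# Hypothetical counts (NOT the study's data): (number of women, number who died).
counts = {
    "younger": {"smoker": (400, 40), "non-smoker": (200, 10)},
    "older": {"smoker": (100, 60), "non-smoker": (400, 200)},
}


def death_rate(total, died):
    """Proportion of a group that died."""
    return died / total


# Death rates at each separate level of the third variable (age group).
rates_by_age = {
    age: {status: death_rate(*cell) for status, cell in groups.items()}
    for age, groups in counts.items()
}

# Death rates after pooling the age groups together.
overall = {}
for status in ("smoker", "non-smoker"):
    total = sum(counts[age][status][0] for age in counts)
    died = sum(counts[age][status][1] for age in counts)
    overall[status] = death_rate(total, died)

for age, rates in rates_by_age.items():
    print(f"{age}: smokers {rates['smoker']:.0%}, non-smokers {rates['non-smoker']:.0%}")
print(f"overall: smokers {overall['smoker']:.0%}, non-smokers {overall['non-smoker']:.0%}")
```

In this made-up table smokers have the higher death rate within each age group, yet the lower death rate overall, because most smokers fall in the younger group where few people die. The direction of the association flips once the data are analyzed separately by age.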