Chapters 14 and 15 – Linear Regression and Correlation



Contingency tables are useful for displaying information on two qualitative variables.

Scatter plots are useful for displaying information on two quantitative variables.

What type of relationship is present in the following scatter plot?
- No relationship
- Linear relationship
- Quadratic relationship
- Other type of relationship

We can quantify how strong the linear relationship is by calculating a correlation coefficient. It is easier to let technology do the calculation! You have multiple options: a calculator, Minitab, Excel, or websites.
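
For reference, one standard way to write the sample correlation coefficient for n data pairs is sketched below. The symbols $\bar{x}$ and $\bar{y}$ (the sample means) and the index notation are assumptions made here, not notation taken from the slides.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when x and y tend to sit on the same side of their means together and negative when they tend to sit on opposite sides, which is where the sign of r comes from.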

Calculation example: Correlation Coefficient = -0.492. Correlation Coefficient is abbreviated by r, so r = -0.492.
x y 5 7 2 6 3 8 6 4 5 5 4 6

TI Calculator: Type the x data into L1 and the y data into L2, then go to VARS -> Statistics -> EQ -> r.
r = -0.492
Note: It does not matter which is the x data and which is the y data when computing r.
x y 5 7 2 6 3 8 6 4 5 5 4 6

Consider the following data:
x    y
2    14
14   10
57   28
14   16
23   16
56   18
8    1
Choices: r = -0.734, r = 0.538, r = 0.734, r = 0.466, or r = -0.538

Consider the following data:
x1   x2
0    -4
8    -6
7    -8
9    -9
5    -5
8    -8
8    -7
6    -9
Choices: r = -0.034, r = -0.724, r = -0.545, r = -0.983, or r = -0.241

Properties of the Correlation Coefficient
- −1 ≤ r ≤ 1
- If r < 0, then there is a negative relationship between the two variables.
- If r > 0, then there is a positive relationship between the two variables.
- r only measures a linear relationship.
- The greater |r|, the stronger the linear relationship.
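
The point that r only measures a linear relationship can be illustrated with a small sketch. The data below are hypothetical (y is exactly x squared), and `pearson_r` is a helper name assumed for this example, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect quadratic relationship, symmetric about x = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_quadratic = pearson_r(xs, ys)
print(round(r_quadratic, 3))
```

Even though y is completely determined by x here, r comes out at essentially zero, because there is no linear trend for r to pick up.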

The correlation coefficient is 0.734. There is a positive relationship.
x    y
2    14
14   10
57   28
14   16
23   16
56   18
8    1
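
As a minimal sketch of letting technology do the calculation, the following Python snippet computes r for the seven data pairs on this slide. It assumes the values are read row by row as (x, y) pairs; `pearson_r` is a hypothetical helper name, not part of the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Data from the slide, assumed to be (x, y) pairs read row by row.
x = [2, 14, 57, 14, 23, 56, 8]
y = [14, 10, 28, 16, 16, 18, 1]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)  # swap the roles of x and y

print(round(r_xy, 3))  # 0.734
print(round(r_yx, 3))  # same value: the order of the variables does not matter
```

Swapping the two lists gives the same r, which matches the earlier note that it does not matter which variable is treated as x.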

The correlation coefficient is -0.724. There is a negative relationship.
x1   x2
0    -4
8    -6
7    -8
9    -9
5    -5
8    -8
8    -7
6    -9

Guess the correlation: r = -0.821, r = -0.759, r = 0.388, or r = 0.674

Guess the correlation: r = 0.121, r = 0.372, r = 0.644, or r = 0.865

Guess the correlation: r = 0.372, r = 0.522, r = 0.644, or r = 0.865

Guess the correlation: r = -0.034, r = -0.299, r = -0.438, or r = -0.601

Guess the correlation: r = -0.004, r = -0.156, r = -0.441, or r = -0.699

Guess the correlation: r = 0.7484, r = 0.3156, r = 0.0116, or r = -0.2994

Guess the correlation: r = 0.7484, r = 0.2676, r = 0.0018, or r = -0.1944

Fill in the blank: If one variable tends to increase linearly as the other variable increases, the variables are __________ correlated. Choices: positively, negatively, not.

Fill in the blank: If one variable tends to increase linearly as the other variable decreases, the variables are __________ correlated. Choices: positively, negatively, not.

If there is a correlation (relationship) between two variables, it does not necessarily mean there is a causal relationship between the two variables (one variable affects the other).

Nobel Prize and McDonalds data set
Country, Nobel Prize Count, McDonalds Count:
Austria 11 148; Czech Republic 2 60; Denmark 13 99; Finland 93; Greece 48; Hungary 3 76; Iceland 1; Ireland 5 62; Luxembourg 6; Norway 8 55; Portugal 91; Slovakia 10; Turkey 133; United States 270 12804
The correlation coefficient of this data set is closest to what value? -0.999, 0.999, 0.099, or -0.099

True or false: The correlation between the number of Nobel Prizes awarded and the number of McDonald’s restaurants for select countries is strong. Therefore, we can correctly conclude that if a country were to build more McDonald’s restaurants, its inhabitants would be more likely to receive Nobel Prizes.

Nobel Prize and McDonalds data set: A confounding variable is a variable that is not accounted for but can affect both variables being studied.

Recall the equation of a line is $y = mx + b$, where m is the slope of the line and b is the y-intercept. In statistics we use this notation: $y = \beta_0 + \beta_1 x$, where $\beta_1$ is the slope and $\beta_0$ is the y-intercept. The values of $\beta_1$ and $\beta_0$ are unknown and must be estimated from the data.

The values of $\beta_1$ and $\beta_0$ are unknown and are estimated using a method called “least squares.” This method picks the line that minimizes the sum of the squared errors of all the data points. What is an error?

An error is the vertical distance between a data point and the line and is abbreviated as $\varepsilon$.

The method of least squares picks the line that makes this sum the smallest: $\varepsilon_1^2 + \varepsilon_2^2 + \varepsilon_3^2 + \cdots + \varepsilon_n^2$.
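
Written out in symbols, the least squares criterion and the estimates it leads to can be sketched as follows. The hat notation $\hat{\beta}_0$, $\hat{\beta}_1$ for the estimates and the sample means $\bar{x}$, $\bar{y}$ are assumptions introduced here, not notation from the slides.

```latex
\mathrm{SSE}(\beta_0, \beta_1) = \sum_{i=1}^{n} \varepsilon_i^2
  = \sum_{i=1}^{n} \bigl(y_i - \beta_0 - \beta_1 x_i\bigr)^2

\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```

These are the values of the intercept and slope at which the sum of squared errors is as small as possible for a straight-line fit.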

We will let computers calculate the line of best fit (the least squares line) because deriving it requires multivariable calculus.
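
A rough sketch of what the computer does is shown below, using the standard closed-form least squares estimates for one explanatory variable. The data values are hypothetical and made up for illustration, and the function names are assumptions of this example.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, b0, b1)

# Nudging the intercept or the slope away from the fitted values
# should never give a smaller sum of squared errors.
nudged = [
    sum_squared_errors(xs, ys, b0 + 0.5, b1),
    sum_squared_errors(xs, ys, b0 - 0.5, b1),
    sum_squared_errors(xs, ys, b0, b1 + 0.2),
    sum_squared_errors(xs, ys, b0, b1 - 0.2),
]
print(round(b0, 3), round(b1, 3), round(best_sse, 3))
print(all(best_sse <= sse for sse in nudged))
```

The comparison with the nudged lines mirrors the slides that follow: any line other than the least squares line has a larger sum of squared errors.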

The regression line below is a poor fit of the data and results in high error.

The regression line below is a better fit of the data and results in lower error.

The regression line below is the line of best fit or the least squares line.

Review of properties of a line! Consider: $y = 12 + 2.4x$, where x measures time in hours and y measures distance in miles. What is the interpretation of the slope?
- An increase of 1 hour results in an increase of 2.4 miles.
- A decrease of 1 hour results in an increase of 2.4 miles.
- A decrease of 2.4 miles results in a decrease of 1 mile.
- An increase of 2.4 miles results in a decrease of 1 mile.

A study looked at the weight (in hundreds of pounds) and mpg of 82 vehicles. Following is the scatter plot:

The line of best fit is: MPG = 68.2 - 1.11 Weight

The line of best fit is: MPG = 68.2 - 1.11 Weight. What does the slope tell us?
- An increase in mpg of 1 results in an increase in weight of 111 pounds.
- A decrease in mpg of 1 results in an increase in weight of 111 pounds.
- An increase in weight of 100 pounds results in a decrease in gas mileage of 1.11 mpg.
- An increase in weight of 100 pounds results in an increase in gas mileage of 1.11 mpg.

The line of best fit is: MPG = 68.2 - 1.11 Weight. What does the y-intercept tell us?
- A car with a weight of 0 lbs gets 68.2 mpg.
- A car with a weight of 100 lbs gets 68.2 mpg.
- A car with a weight of 1000 lbs gets 68.2 mpg.
- A car with a weight of 2000 lbs gets 68.2 mpg.
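
To see how the slope and y-intercept of this fitted line behave, here is a small sketch that plugs weights into MPG = 68.2 - 1.11 Weight. The 3000 lb and 3100 lb vehicles are hypothetical examples chosen for illustration, and `predict_mpg` is a helper name assumed here. Weight is in hundreds of pounds, as stated in the study description.

```python
def predict_mpg(weight_hundreds_lb):
    """Predicted MPG from the fitted line; weight is in hundreds of pounds."""
    return 68.2 - 1.11 * weight_hundreds_lb

mpg_3000_lb = predict_mpg(30)  # a 3000 lb vehicle has Weight = 30
mpg_3100_lb = predict_mpg(31)  # the same vehicle, 100 lb heavier
change_per_100_lb = mpg_3100_lb - mpg_3000_lb
mpg_at_zero_weight = predict_mpg(0)  # the y-intercept

print(round(mpg_3000_lb, 2))
print(round(change_per_100_lb, 2))
print(mpg_at_zero_weight)
```

The difference between the two predictions is the slope, and the prediction at Weight = 0 is the y-intercept. A vehicle weighing 0 lbs is not realistic, so the intercept is best read as a feature of the line rather than a meaningful prediction.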

Consider the following data set and graph. True or false: The graph is of this data.
X Y 6 5 2 7 3 8 4

A direct relationship means an increase in one variable results in an increase in the other. This is also a positive correlation.
An inverse relationship means an increase in one variable results in a decrease in the other. This is also a negative correlation.

- There is a negative correlation between the two variables, which indicates a direct relationship between femur length and horse height.
- There is a positive correlation between the two variables, which indicates an inverse relationship between femur length and horse height.
- There is a negative relationship between the two variables, which indicates an inverse relationship between femur length and horse height.
- There is a positive relationship between the two variables, which indicates a direct relationship between femur length and horse height.
- None of the above