Linear Regression Least Squares Method: an introduction.

Presentation transcript:

We are given the following ordered pairs: (1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2), (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2). They are shown in a scatterplot.
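A minimal sketch (assuming matplotlib is available; it is not part of the slides) of how such a scatterplot could be produced from the pairs above:

```python
# Plot the eight ordered pairs listed above as a scatterplot.
import matplotlib.pyplot as plt

data = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
        (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]
xs, ys = zip(*data)

plt.scatter(xs, ys)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Scatterplot of the given ordered pairs")
plt.show()
```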

If we draw a line, not necessarily the best line, but a line, as shown, we can begin to consider how well it fits the data. From each data point we construct a vertical line segment to the line. The length of this segment indicates the error: the difference between the predicted and actual y-values. The error itself may be positive or negative, so squaring it gives all positive values, which is an advantage when finding a total. The sum of the squares gives us a measure of the scatter of the data away from the line.
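As a concrete illustration (a minimal sketch; the function name and the candidate lines are my own, not taken from the slides), here is how the sum of squared vertical distances can be computed for any proposed line y = m·x + b:

```python
# Compute the sum of squared vertical distances from the data points
# to a candidate line y = m*x + b.
data = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
        (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]

def sum_of_squares(m, b, points):
    """Total squared vertical distance from each point to the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in points)

# Compare a few candidate lines, much as the slides do by eye:
print(sum_of_squares(0.0, 2.45, data))   # horizontal line through the mean of y
print(sum_of_squares(-1.0, 5.0, data))   # a line with negative slope (poor fit)
print(sum_of_squares(0.8, 0.5, data))    # a line close to the least-squares fit
```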

We try drawing another line; this time a horizontal line is shown. The squares are still fairly large.

This line seems like a better fit. It has a positive slope and an intercept that seems reasonable. The total sum of the squares is less than that for the two previous lines.

Wow, this line with a negative slope does not fit so well. The sum of the squares will be very large. This would make a very poor model.

Again, this line looks much better. This is clearly a better model than some of the earlier attempts.

This line does not look as good as the purple or pink ones.

This line has larger squares than some of the others. This is not the best model.

This is the line based on calculations; it is very similar to the purple one. The equation, computed from the data above, is approximately ŷ = 0.78x + 0.50. (The graphs are just approximations and are not exact.)

This shows the approximate sums of the squares for the previous examples. Each colored square represents the total area of the squares for an earlier example; the smaller this quantity, the better the model. The yellow was the worst of the proposals, and the dark green the best. Fortunately, we have a technique that allows us to go straight to the equation without all the guesswork.
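A minimal sketch of that technique (variable names are my own): the least-squares slope and intercept can be computed directly from the data with the standard formulas, with no guessing of lines.

```python
# Compute the least-squares slope and intercept directly from the data.
data = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
        (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]

n = len(data)
x_bar = sum(x for x, _ in data) / n
y_bar = sum(y for _, y in data) / n

# Slope = sum of (x - x_bar)(y - y_bar) divided by sum of (x - x_bar)^2;
# the intercept then follows from the two means.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in data)
         / sum((x - x_bar) ** 2 for x, _ in data))
intercept = y_bar - slope * x_bar

print(f"y-hat = {slope:.3f} x + {intercept:.3f}")  # roughly 0.781 x + 0.497
```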

Below are the standardized coordinates. Each ordered pair (x, y) is now represented as (z_x, z_y). The best-fit line will pass through the origin. Remember, to standardize is to calculate a z-score.
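A minimal sketch of the standardization step (using the sample means and sample standard deviations; the variable names are my own):

```python
# Convert each (x, y) pair to z-scores using the sample means and
# sample standard deviations.
import statistics as stats

data = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
        (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]
xs = [x for x, _ in data]
ys = [y for _, y in data]

x_bar, s_x = stats.mean(xs), stats.stdev(xs)
y_bar, s_y = stats.mean(ys), stats.stdev(ys)

# Each z-score measures how many standard deviations a value lies from its mean.
standardized = [((x - x_bar) / s_x, (y - y_bar) / s_y) for x, y in data]
print(standardized)
```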

The line drawn is the best fit line. The difference between each data point and the line is shown.

In order to find the best fit line, we want to minimize the quantity Σ(z_y − ẑ_y)² / (n − 1), where ẑ_y is the y-value predicted by the line in standardized coordinates. This is the standardized sum of the squares of the differences, divided by the degrees of freedom to adjust for sample size.
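As a sketch of why minimizing this quantity leads to the familiar result (my own derivation, not from the slides; it assumes the best-fit line in standardized coordinates passes through the origin with slope m, as described above):

```latex
% Quantity to minimize over the slope m (line through the origin in z-coordinates):
Q(m) \;=\; \frac{1}{n-1}\sum_{i=1}^{n}\bigl(z_{y_i} - m\,z_{x_i}\bigr)^{2}
% Set the derivative with respect to m equal to zero and solve:
\qquad
\frac{dQ}{dm} \;=\; \frac{-2}{n-1}\sum_{i=1}^{n} z_{x_i}\bigl(z_{y_i} - m\,z_{x_i}\bigr) \;=\; 0
\;\Longrightarrow\;
m \;=\; \frac{\sum_i z_{x_i} z_{y_i}}{\sum_i z_{x_i}^{2}}
  \;=\; \frac{(n-1)\,r}{n-1} \;=\; r
% using \sum_i z_{x_i} z_{y_i} = (n-1) r and \sum_i z_{x_i}^2 = n-1, which follow
% from the definitions of the z-scores and of the correlation r.
```

Converting back from standardized to original coordinates turns this slope r into r·(s_y / s_x).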

The equation of the best fit line is ŷ = b₀ + b₁x, where b₁ = r(s_y / s_x) and b₀ = ȳ − b₁x̄. This means that the equation can be found if you have the correlation and the means and standard deviations for both x and y, even without knowing all of the data values. We usually make use of technology to carry out these calculations, and the formulas are always provided, but do know how to use them.
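A minimal sketch of checking these formulas on the data above (using Python's statistics module; statistics.correlation requires Python 3.10 or newer):

```python
# Verify that b1 = r * (s_y / s_x) and b0 = y_bar - b1 * x_bar reproduce
# the same least-squares line as the direct computation shown earlier.
import statistics as stats

data = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
        (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]
xs = [x for x, _ in data]
ys = [y for _, y in data]

r = stats.correlation(xs, ys)            # Pearson correlation (Python 3.10+)
b1 = r * stats.stdev(ys) / stats.stdev(xs)
b0 = stats.mean(ys) - b1 * stats.mean(xs)

print(f"y-hat = {b1:.3f} x + {b0:.3f}")  # matches the directly computed line
```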