Simple Linear Regression

Presentation transcript:

Simple Linear Regression and Correlation

Correlation question: From 1983 to 2001 in the state of Tennessee, were motor gasoline consumption and ethanol consumption significantly related to each other? In a correlation problem, one is interested in measuring the strength of the relationship between variables.

Regression question: From 1983 to 2001 in the state of Tennessee, could the ethanol consumption in one year have been used to predict motor gasoline consumption in the following year? In a regression problem, one is interested in predicting one variable (called the dependent variable) based on another variable (called the independent variable).

Simple Linear Regression: the key word is “Linear”, meaning a straight line.

What is the equation for a straight line? Do you recall $y = mx + b$? Here x is the independent variable, and y is the dependent variable. What is $m$? Answer: the slope. What is $b$? Answer: the y-intercept. In the text, the equation is given by $\hat{y} = b_0 + b_1 x$, where $b_0$ is the y-intercept and $b_1$ is the slope.

The General Simple Linear Regression Problem: Given a random sample of related x and y values, find the values of the slope and the y-intercept that yield the “best” fit to these points.

Visually (scatter plot of Y against X): Given a random sample of related x and y values, find the values of the slope and the y-intercept that yield the “best” fit to these points. What does “best” mean? By “best” we mean the smallest error in prediction.

Error Defined: If one picks an arbitrary point in the random sample, $(X_i, Y_i)$, how “far” is the point from the line? $Y_i$ is the actual y-value, and $\hat{Y}_i$ is the predicted y-value (the value on the line). The error is the difference between them: $\text{Error}_i = Y_i - \hat{Y}_i$. By “best” we mean the smallest error in prediction.

General Problem Restated: Given a random sample of related x and y values, find the values of the slope and the y-intercept that yield the smallest error over all the sample, where $\text{Error}_i = Y_i - \hat{Y}_i$. What would you want? The errors for the points above the line should balance the errors for the points below the line, resulting in a sum of zero: $\sum (Y_i - \hat{Y}_i) = 0$. Unfortunately, there are an infinite number of lines possessing this property. Any line that passes through the point $(\bar{X}, \bar{Y})$ will have this property, because it is a property of the mean.
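The claim that every line through the point of means has errors summing to zero can be checked numerically. The sketch below is only an illustration: it uses the ten sample points read off the scatter graph later in this presentation, and the slopes tried are hypothetical values chosen for the demonstration.

```python
# Sample data read off the scatter graph in this presentation.
X = [14, 13, 10, 12, 11, 10, 8, 7, 7, 8]
Y = [10000, 10000, 8000, 9000, 8000, 7000, 5000, 5000, 4000, 4000]

x_bar = sum(X) / len(X)
y_bar = sum(Y) / len(Y)


def sum_of_errors(slope):
    """Sum of (actual - predicted) for the line with this slope through (x_bar, y_bar)."""
    intercept = y_bar - slope * x_bar
    return sum(y - (intercept + slope * x) for x, y in zip(X, Y))


# Hypothetical slopes, including a falling line and a flat line.
error_sums = {slope: sum_of_errors(slope) for slope in (-500.0, 0.0, 400.0, 900.0)}
for slope, total in error_sums.items():
    print(f"slope {slope:8.1f}: sum of errors = {total:.6f}")
```

Every slope gives a sum of errors of zero (up to rounding), which is why this criterion alone cannot single out one best line.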

General Problem Restated in Terms of Least Squares: Given a random sample of related x and y values, find the values of the slope and the y-intercept that yield the smallest sum of the squares of the errors (SSE) over all the sample. That is, find the value of $b_0$ and the value of $b_1$ that will minimize $SSE = \sum (Y_i - \hat{Y}_i)^2$, where $\hat{Y}_i = b_0 + b_1 X_i$.
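Unlike the sum of the errors, the sum of squared errors does distinguish between lines. A minimal sketch, assuming the sample points from the scatter graph later in this presentation; the two candidate lines are hypothetical and both pass through the point of means (10, 7000).

```python
# Sample data read off the scatter graph in this presentation.
X = [14, 13, 10, 12, 11, 10, 8, 7, 7, 8]
Y = [10000, 10000, 8000, 9000, 8000, 7000, 5000, 5000, 4000, 4000]


def sse(b0, b1):
    """Sum of squared errors for the line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))


# Two hypothetical candidate lines (illustration only).
sse_flat = sse(7000.0, 0.0)       # horizontal line at the mean of Y
sse_rising = sse(-2000.0, 900.0)  # a rising line through (10, 7000)

print(f"flat line:   SSE = {sse_flat:,.0f}")
print(f"rising line: SSE = {sse_rising:,.0f}")
```

Both lines have errors that sum to zero, but the rising line has a far smaller SSE, so the least squares criterion prefers it.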

Solution of the Least Squares Problem: Find the value of $b_0$ and the value of $b_1$ that will minimize $SSE = \sum (Y_i - \hat{Y}_i)^2$, where $\hat{Y}_i = b_0 + b_1 X_i$. Noting that SSE is a function of two variables, we can restate the problem once again: find the value of $b_0$ and the value of $b_1$ that will minimize $f(b_0, b_1) = \sum (Y_i - b_0 - b_1 X_i)^2$. Finding the values of variables that will maximize/minimize a function is a calculus problem. Because calculus is not a prerequisite to this course, the details are omitted, but the process results in two equations and two unknowns.
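For readers who have had calculus, the omitted step can be sketched as follows. This is the standard textbook derivation, not necessarily the one the author had in mind: set both partial derivatives of $f$ to zero and rearrange.

```latex
\begin{align*}
f(b_0, b_1) &= \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 \\
\frac{\partial f}{\partial b_0} &= -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) = 0
  &&\Longrightarrow\quad n\,b_0 + b_1 \sum X_i = \sum Y_i \\
\frac{\partial f}{\partial b_1} &= -2 \sum_{i=1}^{n} X_i\,(Y_i - b_0 - b_1 X_i) = 0
  &&\Longrightarrow\quad b_0 \sum X_i + b_1 \sum X_i^2 = \sum X_i Y_i
\end{align*}
```

The two equations on the right are the two equations in two unknowns mentioned above.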

The Normal Equations: In algebraic form, $n\,b_0 + b_1 \sum X_i = \sum Y_i$ and $b_0 \sum X_i + b_1 \sum X_i^2 = \sum X_i Y_i$; or, in matrix form, $\begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum X_i Y_i \end{bmatrix}$. There are many ways to solve a system of two equations and two unknowns. If you have a favorite, feel free to use it. Two relationships that I expect you to know are: $b_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$ and $b_0 = \bar{Y} - b_1 \bar{X}$. Now the specifics are introduced with an example.
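One way to solve the two normal equations is Cramer's rule. The sketch below is one possible approach rather than the method the course prescribes; it assumes the standard normal equations for a straight-line fit and uses the sample points from the scatter graph later in this presentation.

```python
# Sample data read off the scatter graph in this presentation.
X = [14, 13, 10, 12, 11, 10, 8, 7, 7, 8]
Y = [10000, 10000, 8000, 9000, 8000, 7000, 5000, 5000, 4000, 4000]

n = len(X)
sum_x = sum(X)
sum_y = sum(Y)
sum_xx = sum(x * x for x in X)
sum_xy = sum(x * y for x, y in zip(X, Y))

# Normal equations:
#   n     * b0 + sum_x  * b1 = sum_y
#   sum_x * b0 + sum_xx * b1 = sum_xy
det = n * sum_xx - sum_x ** 2               # determinant of the 2x2 coefficient matrix
b0 = (sum_y * sum_xx - sum_x * sum_xy) / det  # y-intercept
b1 = (n * sum_xy - sum_x * sum_y) / det       # slope

print(f"b0 = {b0:.2f}, b1 = {b1:.3f}")
```

The result agrees with the slope (910.714) and y-intercept (-2,107.14) interpreted later in the presentation.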

The Random Sample

Generate Graph First, graph the data. The scatter plot of the data may indicate that a linear model is totally inappropriate and a waste of time. The following three slides give some examples of nonlinear patterns. Following the nonlinear examples, the graph of the data in the random sample is constructed.

Examples of a Nonlinear Pattern (three figures), followed by an Example of a Nonlinear Pattern Transformed to a Linear Pattern (figure).

The Scatter Graph (x-axis: Number of Commercials; y-axis: H2O Consumption). Plotted points: (14, 10000), (13, 10000), (10, 8000), (12, 9000), (11, 8000), (10, 7000), (8, 5000), (7, 5000), (7, 4000), (8, 4000).

The Scatter Graph (with “guesstimated” line): Find the slope and the y-intercept of the line that is the “best” fit to these points.

The Initial Calculations

Some Basic Formulas

X = Number of Commercials; Y = Water Consumption (gallons). The sample regression equation is $\hat{Y} = -2107.14 + 910.714X$.

Interpretation of the Slope and the Y-intercept (X = Number of Commercials; Y = Water Consumption in gallons): Interpret the slope. (What does the slope mean in terms of the problem?) For each additional commercial, we expect the water consumption to increase by 910.714 gallons. Interpret the y-intercept. (What does the y-intercept mean in terms of the problem?) If there are no commercials, we expect the water consumption to be negative 2,107.14 gallons. Think about it.

If the water consumption is a negative 2,107.14 gallons, which way is the water flowing in the pipe from the reservoir to the city? We know that the water does not flow back into the reservoir. Does this result mean that the regression model is worthless? (Diagram: “Welcome to Mulvany, Tennessee”, showing the reservoir, the river, the water plant, the sensor line, and the city.)

Interpolation versus Extrapolation: Between the smallest (7) and the largest (14) values of X used to compute the sample regression model (the relevant range), we may interpolate with statistical significance. Predicting for values of X below 7 or above 14 is extrapolation, which is how the negative consumption at zero commercials arises. To determine if the model has statistical significance, we still have to perform some more calculations.
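The distinction can be made explicit in code by having a prediction report whether it falls inside the relevant range. This is only a sketch: the function name `predict` and the tuple it returns are hypothetical choices, and the coefficients are the rounded values quoted on the interpretation slide.

```python
# Rounded coefficients quoted in this presentation (gallons, gallons per commercial).
B0 = -2107.14
B1 = 910.714

# Relevant range: smallest and largest X in the sample.
X_MIN, X_MAX = 7, 14


def predict(commercials):
    """Return (predicted gallons, True if the prediction is an interpolation)."""
    is_interpolation = X_MIN <= commercials <= X_MAX
    return B0 + B1 * commercials, is_interpolation


inside = predict(10)   # within the relevant range
outside = predict(0)   # extrapolation far below the smallest X

print(inside)
print(outside)
```

Calling `predict(0)` reproduces the negative intercept and flags it as an extrapolation, so the nonsensical value does not by itself make the model worthless inside the relevant range.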

Calculation of SSE by Definition: First, you insert the $X_i$ values into the sample regression equation to calculate the predicted values. Second, you calculate the deviations of the points from the line. Finally, you calculate the squares of the deviations of the points from the line and sum them to obtain SSE.
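The three steps above can be sketched directly. The data are the ten points from the scatter graph; fitting the line with the deviation formulas for the slope and intercept is an assumption about the route taken, though any correct solution of the normal equations gives the same coefficients.

```python
# Sample data read off the scatter graph in this presentation.
X = [14, 13, 10, 12, 11, 10, 8, 7, 7, 8]
Y = [10000, 10000, 8000, 9000, 8000, 7000, 5000, 5000, 4000, 4000]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least squares slope and intercept.
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
ss_xx = sum((x - x_bar) ** 2 for x in X)
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Step 1: predicted values from the sample regression equation.
predicted = [b0 + b1 * x for x in X]
# Step 2: deviations of the points from the line.
deviations = [y - y_hat for y, y_hat in zip(Y, predicted)]
# Step 3: square the deviations and sum them.
sse = sum(d ** 2 for d in deviations)

print(f"SSE = {sse:,.2f}")
```

For this sample the result is about 3,553,571.43 squared gallons.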

Calculation of SSE by “Backing” into it: Total variation = variation explained by regression + variation not explained by regression, that is, $SST = SSR + SSE$. Therefore, $SSE = SST - SSR$. However, $SST = \sum (Y_i - \bar{Y})^2$ and $SSR = \sum (\hat{Y}_i - \bar{Y})^2$. Hence, $SSE = \sum (Y_i - \bar{Y})^2 - \sum (\hat{Y}_i - \bar{Y})^2$.
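As a check on the decomposition, SSE obtained by subtraction can be compared with SSE computed from the definition. A minimal sketch, assuming the sample points from the scatter graph and the usual definitions of the total and explained variation; the variable names are illustrative.

```python
# Sample data read off the scatter graph in this presentation.
X = [14, 13, 10, 12, 11, 10, 8, 7, 7, 8]
Y = [10000, 10000, 8000, 9000, 8000, 7000, 5000, 5000, 4000, 4000]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least squares slope and intercept.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
b0 = y_bar - b1 * x_bar

sst = sum((y - y_bar) ** 2 for y in Y)              # total variation
ssr = sum(((b0 + b1 * x) - y_bar) ** 2 for x in X)  # variation explained by regression
sse_backed = sst - ssr                              # variation not explained, by subtraction

# SSE straight from the definition, for comparison.
sse_direct = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))

print(f"SST = {sst:,.2f}, SSR = {ssr:,.2f}")
print(f"SSE (backed into) = {sse_backed:,.2f}, SSE (by definition) = {sse_direct:,.2f}")
```

The two SSE values agree up to floating-point rounding, which is what makes the subtraction shortcut legitimate.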

Calculation of the Standard Error of the Estimate: The error variance is $s_e^2 = \dfrac{SSE}{n - 2}$, and the standard error of the estimate is $s_e = \sqrt{\dfrac{SSE}{n - 2}} = 666.48$ gallons. Interpretation: The “typical” error made when predicting the number of gallons of water consumed based on the number of commercials is about 666.48 gallons.
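The 666.48 figure can be reproduced from the sample. The sketch below assumes the ten points from the scatter graph and the usual n - 2 degrees of freedom for a straight-line fit; the variable names are illustrative.

```python
import math

# Sample data read off the scatter graph in this presentation.
X = [14, 13, 10, 12, 11, 10, 8, 7, 7, 8]
Y = [10000, 10000, 8000, 9000, 8000, 7000, 5000, 5000, 4000, 4000]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least squares slope and intercept.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))

error_variance = sse / (n - 2)            # two parameters estimated, so n - 2
std_error_estimate = math.sqrt(error_variance)

print(f"error variance = {error_variance:,.2f}")
print(f"standard error of the estimate = {std_error_estimate:.2f} gallons")
```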

The Question: At the .05 level of significance, is there evidence that a linear relationship exists between the number of commercials and water consumption? We have almost enough calculated to be able to answer the question. Just one more calculation is needed.

Calculation of the Standard Error of the Slope (also called the standard error of the regression coefficient, b1)
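The formula on this slide did not survive in the transcript. The sketch below assumes the standard textbook expression, the standard error of the estimate divided by the square root of the sum of squared deviations of X from its mean, applied to the sample points from the scatter graph.

```python
import math

# Sample data read off the scatter graph in this presentation.
X = [14, 13, 10, 12, 11, 10, 8, 7, 7, 8]
Y = [10000, 10000, 8000, 9000, 8000, 7000, 5000, 5000, 4000, 4000]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

ss_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / ss_xx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
std_error_estimate = math.sqrt(sse / (n - 2))

# Assumed textbook formula: s_b1 = s_e / sqrt(sum of (X - X_bar)^2).
std_error_slope = std_error_estimate / math.sqrt(ss_xx)

print(f"standard error of the slope = {std_error_slope:.2f} gallons per commercial")
```

Under that assumption the standard error of the slope is roughly 89.06 gallons per commercial, small relative to the slope of 910.714.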

Test Statistics for Regression Now, what is ? Well, that’s another story. or
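The test statistic formulas on this slide were lost in the transcript. As a hedged illustration, the sketch below assumes the usual t test of the slope with the null hypothesis that the population slope is zero, and compares the result with the two-tailed critical value 2.306 (a t-table entry for the .05 level with n - 2 = 8 degrees of freedom, hard-coded here rather than computed).

```python
import math

# Sample data read off the scatter graph in this presentation.
X = [14, 13, 10, 12, 11, 10, 8, 7, 7, 8]
Y = [10000, 10000, 8000, 9000, 8000, 7000, 5000, 5000, 4000, 4000]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

ss_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / ss_xx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
std_error_slope = math.sqrt(sse / (n - 2)) / math.sqrt(ss_xx)

# Assumed null hypothesis: the population slope is 0 (no linear relationship).
hypothesized_slope = 0.0
t_stat = (b1 - hypothesized_slope) / std_error_slope

# Two-tailed critical value for alpha = .05 and 8 degrees of freedom, from a t table.
T_CRITICAL = 2.306
reject_null = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, reject H0: {reject_null}")
```

With these assumptions the statistic is far beyond the critical value, which would indicate evidence of a linear relationship between the number of commercials and water consumption at the .05 level.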

The End