Regression Analysis Chapter 10.


Regression and Correlation
Techniques used to establish whether there is a mathematical relationship between two or more variables, so that the behavior of one variable can be used to predict the behavior of others. Applicable to “Variables” data only.
“Regression” provides a functional relationship (Y = f(X)) between the variables; the function represents the “average” relationship.
“Correlation” tells us the direction and the strength of the relationship.
The analysis starts with a scatter plot of Y vs X.

Simple Linear Regression
What is it? It determines whether Y depends on X and provides a mathematical equation for the relationship (continuous data).
Examples: process conditions and product properties; sales and advertising budget.
[Figure: scatter plot of y against x with candidate lines] Does Y depend on X? Which line is correct?

Simple Linear Regression
A simple linear relationship can be described mathematically by Y = mX + b, where m = slope = rise / run and b = Y intercept = the Y value at the point where the line intersects the Y axis.
Simple linear regression will help you to understand the relationship between the process output (Y) and any factor that may affect it (X). Understanding this relationship will allow you to predict Y, given a value of X. This is especially useful when the Y variable of interest is difficult or expensive to measure. For example, measuring the strength of a hardened steel part would require a destructive test. A model of the relationship between annealing temperature and strength would be useful in predicting the material's strength.

Simple Linear Regression
slope = rise / run = (6 - 3) / (10 - 4) = 1/2
intercept = 1
Y = 0.5X + 1
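The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch: the two points (4, 3) and (10, 6) are inferred from the slope calculation on the slide, and the function name `line_through` is an illustrative choice, not something from the original deck.

```python
def line_through(x1, y1, x2, y2):
    """Return (slope, intercept) of the straight line through two points."""
    if x1 == x2:
        raise ValueError("a vertical line has no finite slope")
    slope = (y2 - y1) / (x2 - x1)   # rise / run
    intercept = y1 - slope * x1     # Y value where the line crosses X = 0
    return slope, intercept

# Points assumed from the slide's calculation: (4, 3) and (10, 6)
m, b = line_through(4, 3, 10, 6)
print(f"Y = {m}X + {b}")
```

Both assumed points satisfy the resulting equation, which is a quick way to confirm the slope and intercept agree with each other.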

Simple regression example
An agent for a residential real estate company in a large city would like to predict the monthly rental cost for apartments based on the size of the apartment as defined by square footage. A sample of 25 apartments in a particular residential neighborhood was selected to gather the information.

The data on size and rent for the 25 apartments will be analyzed in EXCEL.
Size Rent 850 950 1450 1600 1085 1200 1232 1500 718 1485 1700 1136 1650 726 935 700 875 956 1150 1100 1400 1285 1985 2300 1369 1800 1175 1225 1245 1259 896 1361 1040 755 1000 800 1750

Scatter plot
The scatter plot suggests that there is a ‘linear’ relationship between Rent and Size.

Interpreting EXCEL output
Regression equation: Rent = 177.121 + 1.065*Size

Interpretation of the regression coefficient
What does the coefficient of Size mean? For every additional square foot, predicted Rent goes up by $1.065.

Using regression for prediction
Predict monthly rent when apartment size is 1000 square feet.
Regression equation: Rent = 177.121 + 1.065*Size
Thus, when Size = 1000, Rent = 177.121 + 1.065*1000 = $1242 (rounded).
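The same prediction can be reproduced in code. The sketch below simply evaluates the fitted equation quoted on the slide; the names `INTERCEPT`, `SLOPE`, and `predict_rent` are illustrative and do not come from the original deck.

```python
INTERCEPT = 177.121  # intercept from the EXCEL output quoted above
SLOPE = 1.065        # dollars of monthly rent per square foot

def predict_rent(size_sqft):
    """Evaluate the fitted line Rent = 177.121 + 1.065 * Size."""
    return INTERCEPT + SLOPE * size_sqft

rent_1000 = predict_rent(1000)
print(f"Predicted rent for 1000 sq ft: ${round(rent_1000)}")
```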

Using regression for prediction – Caution!
The regression equation is valid only over the range over which it was estimated, so we should interpolate. Do not use the equation to predict Y when the X value is not within the range of data used to develop the equation. Extrapolation can be risky.
Thus, we should not use the equation to predict rent for an apartment whose size is 500 square feet, since this value is not in the range of size values used to create the regression equation.
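One way to enforce this caution in practice is to have the prediction function refuse inputs outside the fitted range. This is a sketch under stated assumptions: the bounds 700 and 1985 square feet are assumed for illustration (substitute the actual minimum and maximum Size in the sample), and `predict_rent_checked` is a hypothetical helper name.

```python
INTERCEPT = 177.121  # from the regression equation quoted in the slides
SLOPE = 1.065
SIZE_MIN = 700       # ASSUMED smallest apartment size in the sample
SIZE_MAX = 1985      # ASSUMED largest apartment size in the sample

def predict_rent_checked(size_sqft, lo=SIZE_MIN, hi=SIZE_MAX):
    """Predict rent, but only when size lies inside the fitted range."""
    if not lo <= size_sqft <= hi:
        raise ValueError(
            f"size {size_sqft} is outside the fitted range [{lo}, {hi}]; "
            "refusing to extrapolate"
        )
    return INTERCEPT + SLOPE * size_sqft

try:
    predict_rent_checked(500)   # the out-of-range case from the slide
except ValueError as err:
    print(err)
```

Raising an error rather than returning a number makes the extrapolation visible to whoever calls the function.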

Why extrapolation is risky
[Figure: fitted line extended beyond the sample data, labeled “Extrapolated relationship”]
In this figure, we fit our regression model using sample data, but the linear relation implicit in our regression model does not hold outside our sample. By extrapolating, we are making erroneous estimates!

Correlation (r)
The “correlation coefficient”, r, is a measure of the strength and the direction of the linear relationship between two variables. Values of r range from +1 (perfect direct relationship), through 0 (no linear relationship), to –1 (perfect inverse relationship). It measures the degree of scatter of the points around the “Least Squares” regression line.
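To make the definition concrete, here is a minimal sketch of the usual Pearson formula for r in plain Python. The data points are made up for illustration and are not the apartment data; `pearson_r` is an illustrative name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative points (NOT the apartment data)
xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])    # points exactly on a rising line
r_down = pearson_r(xs, [10, 8, 6, 4, 2])  # points exactly on a falling line
print(r_up, r_down)
```

Points lying exactly on a rising line give r at the upper end of the range, and points on a falling line give r at the lower end.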

Coefficient of correlation from EXCEL
The sign of r is the same as that of the coefficient of X (Size) in the regression equation (in our case the sign is positive). The upward trend in the scatter plot also indicates that the sign should be positive. r = 0.85 suggests a fairly ‘strong’ correlation between size and rent.

Coefficient of determination (r²)
The “coefficient of determination”, r-squared (sometimes written R-squared), is the proportion of the variation in Y that is attributable to variation in X.

Getting r² from EXCEL
It is important to remember that r-squared is never negative. It is the square of the coefficient of correlation r. In our case, r² = 0.72 suggests that 72% of the variation in Rent is explained by the variation in Size. The higher the value of r², the better the simple regression model fits the data.
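The “explained variation” reading of r-squared can be sketched directly: fit the least-squares line, then compare the unexplained variation (SSE) with the total variation (SST). The data below are made up for illustration, not the apartment sample, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def r_squared(xs, ys):
    """Share of the variation in y explained by the fitted line."""
    b0, b1 = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - mean_y) ** 2 for y in ys)                     # total
    return 1 - sse / sst

# Made-up illustrative points (NOT the apartment data)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(r_squared(xs, ys), 3))
```

For a simple regression this value equals the square of the correlation coefficient, which is why r = 0.85 and r-squared = 0.72 in the slides agree.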

Standard error (SE)
Standard error measures the variability or scatter of the observed values around the regression line.

Getting the standard error (SE) from EXCEL
In our example, the standard error associated with estimating rent is $194.60.
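For reference, the standard error of the estimate in simple regression is commonly defined as follows. The notation is an assumption added here, not taken from the slides: Y_i is an observed value, \hat{Y}_i is the corresponding value on the fitted line, and n is the sample size (25 in this example, giving n - 2 = 23 degrees of freedom).

```latex
s_{YX} = \sqrt{\frac{\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2}{n - 2}}
```

It is expressed in the units of Y, so a value of $194.60 describes the typical size of the gap between observed and predicted rents.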

Is the simple regression model statistically valid?
It is important to test whether the regression model developed from sample data is statistically valid. For simple regression, we can use 2 approaches to test whether the coefficient of X is equal to zero:
using the t-test
using ANOVA

Is the coefficient of X equal to zero?
In both cases, the hypothesis we test is H0: slope = 0 against H1: slope ≠ 0.
What could we say about the linear relationship between X and Y if the slope were zero?

Using coefficient information for testing if slope=0
t-stat = 7.740 and P-value = 7.52E-08 (= 7.52 × 10^-8 = 0.0000000752). The P-value is very small. If it is smaller than our α level, we reject the null hypothesis; otherwise we do not. If α = 0.05, we reject the null and conclude that the slope is not zero. The same result holds at α = 0.01 because the P-value is smaller than 0.01. Thus, at the 0.05 (or 0.01) level, we conclude that the slope is NOT zero, implying that our model is statistically valid.
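The decision rule described here is just a comparison of the P-value with the chosen significance level. A minimal sketch, using the P-value quoted on the slide; `reject_null` is an illustrative helper name, not part of the original material.

```python
P_VALUE = 7.52e-08  # P-value for the slope, from the EXCEL coefficient table

def reject_null(p_value, alpha):
    """Reject H0 (slope = 0) when the P-value is below the alpha level."""
    return p_value < alpha

for alpha in (0.05, 0.01):
    decision = "reject" if reject_null(P_VALUE, alpha) else "do not reject"
    print(f"alpha = {alpha}: {decision} H0 (slope = 0)")
```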

Using ANOVA for testing if slope=0 in EXCEL
F = 59.91376 and P-value = 7.51833E-08. The P-value is again very small. If it is smaller than our α level, we reject the null hypothesis; otherwise we do not. Thus, at the 0.05 (or 0.01) level, the slope is NOT zero, implying that our model is statistically valid. This is the same conclusion we reached using the t-test.
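The two tests agree because, in a simple regression with one X variable, the F statistic equals the square of the slope's t statistic, and r-squared can be recovered from F and the sample size. The sketch below checks both relationships against the figures quoted in the slides; the variable names are illustrative, and small differences come from the t statistic being rounded to three decimals.

```python
T_STAT = 7.740     # t statistic for the slope (rounded, from the slides)
F_STAT = 59.91376  # F statistic from the ANOVA table
N = 25             # number of apartments in the sample

t_squared = T_STAT ** 2
# For simple regression: F = (r^2 / (1 - r^2)) * (n - 2), solved for r^2
r2_from_f = F_STAT / (F_STAT + (N - 2))

print(f"t^2 = {t_squared:.2f}, F = {F_STAT:.2f}")
print(f"r^2 implied by F = {r2_from_f:.2f}")
```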

Confidence interval for the slope of Size
The 95% CI tells us, with 95% confidence, that for every 1 square foot increase in apartment Size, the average Rent increases by between $0.78 and $1.35.
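The interval can be approximately reconstructed from numbers already quoted in the slides. This is a sketch under two assumptions: the slope's standard error is backed out from the reported t statistic (t = slope / SE), and the critical value 2.0687 is the standard two-sided 95% t-table value for 23 degrees of freedom, which is not shown in the transcript.

```python
SLOPE = 1.065    # coefficient of Size from the EXCEL output
T_STAT = 7.740   # t statistic reported for the slope
T_CRIT = 2.0687  # ASSUMED t-table value: two-sided 95%, 23 degrees of freedom

se_slope = SLOPE / T_STAT      # standard error implied by t = slope / SE
margin = T_CRIT * se_slope     # half-width of the confidence interval
ci_low, ci_high = SLOPE - margin, SLOPE + margin

print(f"95% CI for slope: {ci_low:.2f} to {ci_high:.2f}")
```

Because the interval does not contain zero, it leads to the same conclusion as the t-test and ANOVA at the 0.05 level.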

Summary
Simple regression is a statistical tool that attempts to fit a straight line relationship between X (independent variable) and Y (dependent variable).
The scatter plot gives us a visual clue about the nature of the relationship between X and Y.
EXCEL, or other statistical software, is used to ‘fit’ the model; a good model will be statistically valid and will have a reasonably high R-squared value.
A good model is then used to make predictions; when making predictions, be sure to confine them within the domain of X values used to fit the model (i.e. interpolate); we should avoid extrapolation.