CHAPTER 15 Simple Linear Regression and Correlation


CHAPTER 15 Simple Linear Regression and Correlation to accompany Introduction to Business Statistics seventh edition, by Ronald M. Weiers Presentation by Priscilla Chaffe-Stengel Donald N. Stengel © 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied, or duplicated, or posted to a publicly accessible website, in whole or in part.

Chapter 15 - Key Concept
Regression analysis generates a “best-fit” mathematical equation that can be used in predicting the values of the dependent variable as a function of the independent variable.

Direct vs Inverse Relationships
Direct relationship: As x increases, y increases. The graph of the model rises from left to right. The slope of the linear model is positive.
Inverse relationship: As x increases, y decreases. The graph of the model falls from left to right. The slope of the linear model is negative.

Simple Linear Regression Model
Probabilistic Model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where
$y_i$ = a value of the dependent variable, y
$x_i$ = a value of the independent variable, x
$\beta_0$ = the y-intercept of the regression line
$\beta_1$ = the slope of the regression line
$\varepsilon_i$ = random error, the residual
Deterministic Model: $\hat{y}_i = b_0 + b_1 x_i$, where $b_0$ and $b_1$ are the sample estimates of $\beta_0$ and $\beta_1$, and $\hat{y}_i$ is the predicted value of y in contrast to the actual value of y.

Determining the Least Squares Regression Line
Regression line: $\hat{y} = b_0 + b_1 x$
Slope: $b_1 = \dfrac{\sum x_i y_i - n\,\bar{x}\,\bar{y}}{\sum x_i^2 - n\,\bar{x}^2}$
y-intercept: $b_0 = \bar{y} - b_1 \bar{x}$
We actually used a slightly different formula to calculate the slope, which we discovered by first computing the value of r. Before we go to the next slide, let’s use Excel to look at some correlations.
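
The notes do not say which alternative slope formula was used. One standard identity that starts from r, offered here only as a likely candidate, follows from writing both quantities with the sums of squares $S_{xy} = \sum x_i y_i - n\bar{x}\bar{y}$, $S_{xx} = \sum x_i^2 - n\bar{x}^2$ and $S_{yy} = \sum y_i^2 - n\bar{y}^2$ (the S symbols are shorthand introduced for this sketch, not notation from the slides):

$$
b_1 = \frac{S_{xy}}{S_{xx}}, \qquad r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} \quad\Longrightarrow\quad b_1 = r\,\sqrt{\frac{S_{yy}}{S_{xx}}} = r\,\frac{s_y}{s_x}
$$

Here $s_x$ and $s_y$ are the sample standard deviations of x and y, so once r is known the slope is r rescaled by the ratio of the two spreads.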

NFL Defense stats scatter plots w/ regression lines.

Simple Linear Regression: An Example
Problem 15.9: For a sample of 8 employees, a personnel director has collected the following data on ownership of company stock, y, versus years with the firm, x.
x:    6   12   14    6    9   13   15    9
y:  300  408  560  252  288  650  630  522
(a) Determine the least squares regression line and interpret its slope.
(b) For an employee who has been with the firm 10 years, what is the predicted number of shares of stock owned?
Here’s an example right out of the book. Since we already used the formula to compute the least squares regression line once, let’s use Excel to figure this one out. When using Excel, select Line Fit Plots.

Excel Output, Problem 15.9, cont.
[Excel regression output, with callouts marking the y-intercept and the slope]
The values I’m concerned with right now are the y-intercept and the slope. Can you use these to create our linear regression equation? Notice, also, the values of r, r square, and adjusted r square. What do they tell us?
If an employee has been with the firm 10 years, how many shares of stock would we expect him to have? 10(38.75) + 44.3 = 431.8

Problem 15.9, cont.
Interpretation of the slope: For every additional year an employee works for the firm, the employee acquires an estimated 38.8 additional shares of stock.
If $x_1 = 10$, the point estimate for the number of shares of stock that this employee owns is:
$\hat{y} = 44.314 + 38.7558x = 44.314 + 38.7558(10) = 431.872 \approx 432$ shares
There’s the equation and the answer to the 10 years question.
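
As a cross-check on the Excel result, the slope and intercept can be recomputed directly from the Problem 15.9 data with the usual least squares formulas. This is a minimal sketch in plain Python; the function name `least_squares` is made up for illustration and is not something from the textbook or from Excel.

```python
# Problem 15.9 data from the slides: years with the firm (x), shares owned (y).
x = [6, 12, 14, 6, 9, 13, 15, 9]
y = [300, 408, 560, 252, 288, 650, 630, 522]


def least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: (sum of x*y - n*x_bar*y_bar) / (sum of x^2 - n*x_bar^2)
    s_xy = sum(xi * yi for xi, yi in zip(xs, ys)) - n * x_bar * y_bar
    s_xx = sum(xi ** 2 for xi in xs) - n * x_bar ** 2
    b1 = s_xy / s_xx
    # Intercept: the fitted line passes through (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(x, y)
y_hat_10 = b0 + b1 * 10  # point estimate for 10 years with the firm

print(f"y-hat = {b0:.3f} + {b1:.4f} x")
print(f"predicted shares at x = 10: {y_hat_10:.3f}")
```

The printed coefficients should agree with the 44.314 and 38.7558 shown on the slide, and the prediction with 431.872.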

Interval Estimates Using the Regression Model
Confidence Interval for the Mean of y: places an upper and lower bound around the point estimate for the average value of y given x.
Prediction Interval for an Individual y: places an upper and lower bound around the point estimate for an individual value of y given x.
Confidence interval: for a score of x on the dexterity test, what is the CI for the mean productivity of everyone who got score x on the test?
Prediction interval: if an individual scores x on the test, what is the PI for that one individual’s productivity?
Staying with the stock shares problem.

To Form Interval Estimates
The Standard Error of the Estimate, $s_{y,x}$:
The standard deviation of the distribution of the data points above and below the regression line, that is, of the residuals e (the distances between actual and predicted values of y).
The square root of MSE given by ANOVA.
$s_{y,x} = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$
To develop the interval estimates, we need to know the standard error of the estimate. This is the standard deviation describing the dispersion of the data above and below the regression line. Here’s the formula, but if I go back one slide, I will discover that Excel already computed this for me. Standard error = 91.48

Equations for the Interval Estimates
Confidence Interval for the Mean of y:
$\hat{y} \pm t_{\alpha/2} \cdot s_{y,x} \sqrt{\dfrac{1}{n} + \dfrac{(x\ \text{value} - \bar{x})^2}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}}$
Prediction Interval for the Individual y:
$\hat{y} \pm t_{\alpha/2} \cdot s_{y,x} \sqrt{1 + \dfrac{1}{n} + \dfrac{(x\ \text{value} - \bar{x})^2}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}}$
Our predicted value of y-hat given x = 10 will be the midpoint of our estimate. In this case y-hat = 432.

Using Intervals – Problem 15.9
For employees who worked 10 years for the firm, what is the 95% confidence interval for their mean share holdings?
This calls for a confidence interval on the average number of shares owned by employees who worked for the firm 10 years. So we will use:
$\hat{y} \pm t_{\alpha/2} \cdot s_{y,x} \sqrt{\dfrac{1}{n} + \dfrac{(x\ \text{value} - \bar{x})^2}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}}$

Standard Error of the Estimate, Definitional Equation
 x     y    Predicted y   Squared Residual
 6    300     276.8488          535.9763
12    408     509.3837        10278.6589
14    560     586.8953          723.3598
 6    252     276.8488          617.4647
 9    288     393.1163        11049.4321
13    650     548.1395        10375.5544
15    630     625.6512           18.9124
 9    522     393.1163        16611.0135
                        Sum = 50210.3721
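
The table above can be reproduced from the raw data: fit the line, compute each predicted y and squared residual, sum them, and take the square root of the sum divided by n – 2. The sketch below assumes nothing beyond the Problem 15.9 data; the variable names are illustrative.

```python
import math

# Problem 15.9 data from the slides.
x = [6, 12, 14, 6, 9, 13, 15, 9]
y = [300, 408, 560, 252, 288, 650, 630, 522]
n = len(x)

# Least squares slope and intercept.
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar) / (
    sum(a * a for a in x) - n * x_bar ** 2
)
b0 = y_bar - b1 * x_bar

# Predicted values and squared residuals, one per data point.
predicted = [b0 + b1 * xi for xi in x]
sq_resid = [(yi - pi) ** 2 for yi, pi in zip(y, predicted)]

sse = sum(sq_resid)              # sum of squared residuals
s_yx = math.sqrt(sse / (n - 2))  # standard error of the estimate

for xi, yi, pi, ri in zip(x, y, predicted, sq_resid):
    print(f"{xi:3d} {yi:5d} {pi:10.4f} {ri:12.4f}")
print(f"SSE = {sse:.4f}, s_yx = {s_yx:.4f}")
```

The sum should match the 50210.3721 in the table, and the standard error should match the 91.48 that Excel reports.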

Evaluating the Confidence Interval
Since n = 8, df = 8 – 2 = 6 and $t_{\alpha/2} = 2.447$. From our prior analyses, $\sum x = 84$, $\sum x^2 = 968$, and the predicted y = 431.872.
$s_{y,x} = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\dfrac{50{,}210.3721}{8 - 2}} = 91.4789$
$\hat{y} \pm t_{\alpha/2} \cdot s_{y,x} \sqrt{\dfrac{1}{n} + \dfrac{(x\ \text{value} - \bar{x})^2}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}} = 431.872 \pm 2.447\,(91.4789) \sqrt{\dfrac{1}{8} + \dfrac{(10 - 10.5)^2}{968 - \frac{84^2}{8}}} = 431.872 \pm 2.447\,(91.4789)(0.3576) = 431.872 \pm 80.057$
This is as far as I want to go with this one. If we could use the Data Analysis Plus CD, Excel would give us this data. But we can evaluate the results.

Interpreting the Confidence Interval
Based on our calculations, we would have 95% confidence that the mean number of shares for persons working for the firm 10 years will be between:
431.872 – 80.057 = 351.815 and 431.872 + 80.057 = 511.929
Written in interval notation: (351.815, 511.929)
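
The confidence interval arithmetic can be checked in a few lines. This sketch takes every input from the slides (n = 8, the sums 84 and 968, the standard error 91.4789, the point estimate 431.872) and hard-codes the t value 2.447 rather than looking it up, so it is a check on the arithmetic only.

```python
import math

# Inputs taken from the slides for Problem 15.9.
n = 8
sum_x, sum_x2 = 84, 968   # sum of x and sum of x^2
s_yx = 91.4789            # standard error of the estimate
t_crit = 2.447            # t for 95% confidence, df = n - 2 = 6 (from the slide)
y_hat = 431.872           # point estimate at x = 10
x_value = 10

x_bar = sum_x / n
s_xx = sum_x2 - sum_x ** 2 / n  # 968 - 84^2 / 8

# Confidence interval for the MEAN of y: no leading "1 +" under the root.
half_width = t_crit * s_yx * math.sqrt(1 / n + (x_value - x_bar) ** 2 / s_xx)
lower, upper = y_hat - half_width, y_hat + half_width

print(f"half-width = {half_width:.3f}")
print(f"95% CI for the mean: ({lower:.3f}, {upper:.3f})")
```

Up to rounding in the last digit, the result should agree with the interval (351.815, 511.929) given above.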

Using Intervals – Problem 15.9
An employee worked 10 years for the firm. What is the 95% prediction interval for her share holdings?
This calls for a prediction interval on the number of shares owned by an individual employee who worked for the firm 10 years. So we will use:
$\hat{y} \pm t_{\alpha/2} \cdot s_{y,x} \sqrt{1 + \dfrac{1}{n} + \dfrac{(x\ \text{value} - \bar{x})^2}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}}$

Evaluating the Prediction Interval - Problem 15.9
Since n = 8, df = 8 – 2 = 6 and $t_{\alpha/2} = 2.447$. From our prior analyses, $\sum x = 84$, $\sum x^2 = 968$, and the predicted y = 431.872.
$\hat{y} \pm t_{\alpha/2} \cdot s_{y,x} \sqrt{1 + \dfrac{1}{n} + \dfrac{(x\ \text{value} - \bar{x})^2}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}} = 431.872 \pm 2.447\,(91.4789) \sqrt{1 + \dfrac{1}{8} + \dfrac{(10 - 10.5)^2}{968 - \frac{84^2}{8}}} = 431.872 \pm 2.447\,(91.4789)(1.0620) = 431.872 \pm 237.734$

Interpreting the Prediction Interval – Problem 15.9
Based on our calculations, we would have 95% confidence that the number of shares held by an employee who has worked for the firm 10 years will be between:
431.872 – 237.734 = 194.138 and 431.872 + 237.734 = 669.606
Written in interval notation: (194.138, 669.606)

Comparing the Two Intervals
Notice that the confidence interval for the mean is much narrower than the prediction interval for the individual value. There is greater fluctuation among individual values than among group means.
Both are centered at the point estimate, $\hat{y} = 431.872$.
This is the same type of situation we got when we covered sampling distributions.
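
To see the two intervals side by side, the sketch below computes both half-widths from the same slide inputs. The only difference between them is the extra 1 under the square root for the individual prediction. The helper name `half_width` and its `individual` flag are hypothetical, chosen for this illustration.

```python
import math

# Inputs from the slides for Problem 15.9.
n, sum_x, sum_x2 = 8, 84, 968
s_yx, t_crit, y_hat = 91.4789, 2.447, 431.872


def half_width(x_value, individual):
    """Half-width of the 95% interval at x_value.

    individual=False: confidence interval for the mean of y.
    individual=True:  prediction interval for a single y (adds 1 under the root).
    """
    x_bar = sum_x / n
    s_xx = sum_x2 - sum_x ** 2 / n
    inside = 1 / n + (x_value - x_bar) ** 2 / s_xx
    if individual:
        inside += 1
    return t_crit * s_yx * math.sqrt(inside)


ci_half = half_width(10, individual=False)
pi_half = half_width(10, individual=True)

print(f"CI: {y_hat} +/- {ci_half:.3f}")
print(f"PI: {y_hat} +/- {pi_half:.3f}")
print(f"PI is about {pi_half / ci_half:.1f} times as wide as the CI")
```

Both intervals share the center 431.872, and the prediction interval is the wider of the two because it also has to cover the scatter of one individual around the mean.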

Coefficient of Correlation
A measure of the:
Direction of the linear relationship between x and y. If x and y are directly related, r > 0. If x and y are inversely related, r < 0.
Strength of the linear relationship between x and y. The larger the absolute value of r, the more the value of y depends in a linear way on the value of x.

Testing for Linearity
Key Argument:
If the value of y does not change linearly with the value of x, then using the mean value of y is the best predictor for the actual value of y. This implies $\hat{y} = \bar{y}$ is preferable.
If the value of y does change linearly with the value of x, then using the regression model gives a better prediction for the value of y than using the mean of y. This implies $\hat{y} = b_0 + b_1 x$ is preferable.

Coefficient of Determination
A measure of the:
Strength of the linear relationship between x and y. The larger the value of $r^2$, the more the value of y depends in a linear way on the value of x.
Amount of variation in y that is related to variation in x.
Ratio of variation in y that is explained by the regression model divided by the total variation in y.
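
The "explained over total" ratio can be made concrete with the Problem 15.9 data. The sketch below computes the total variation (SST), the unexplained variation (SSE), and the explained variation (SSR = SST – SSE), then compares SSR/SST with the square of r computed directly. The abbreviations SST, SSE and SSR follow common ANOVA usage; the variable names are illustrative.

```python
import math

# Problem 15.9 data from the slides.
x = [6, 12, 14, 6, 9, 13, 15, 9]
y = [300, 408, 560, 252, 288, 650, 630, 522]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Sums of squares and cross-products about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

sst = s_yy                                                    # total variation in y
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
ssr = sst - sse                                               # explained by the model

r_squared = ssr / sst
r = s_xy / math.sqrt(s_xx * s_yy)  # coefficient of correlation

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}, r*r = {r * r:.4f}")
```

In simple linear regression the two routes give the same number, so the coefficient of determination is literally the square of the coefficient of correlation.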

Three Tests for Linearity
1. Testing the Coefficient of Correlation
H0: $\rho = 0$. There is no linear relationship between x and y.
H1: $\rho \neq 0$. There is a linear relationship between x and y.
Test Statistic: $t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$
2. Testing the Slope of the Regression Line
H0: $\beta_1 = 0$. There is no linear relationship between x and y.
H1: $\beta_1 \neq 0$. There is a linear relationship between x and y.
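
As an illustration of the first test, the sketch below computes r for the Problem 15.9 data and plugs it into t = r / sqrt((1 – r²)/(n – 2)). The comparison uses the critical value 2.447 for df = 6 that the interval slides already use; treating the test as two-tailed at the 0.05 level is an assumption made for this example.

```python
import math

# Problem 15.9 data from the slides.
x = [6, 12, 14, 6, 9, 13, 15, 9]
y = [300, 408, 560, 252, 288, 650, 630, 522]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

r = s_xy / math.sqrt(s_xx * s_yy)  # sample coefficient of correlation

# Test statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))

t_crit = 2.447  # df = 6, two-tailed, alpha = 0.05 (assumed level)
reject_h0 = abs(t_stat) > t_crit

print(f"r = {r:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```

With these data the statistic exceeds the critical value, so at the assumed level the sample supports a linear relationship between years with the firm and shares owned.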

Three Tests for Linearity
3. The Global F-test
H0: There is no linear relationship between x and y.
H1: There is a linear relationship between x and y.
Test Statistic: $F = \dfrac{MSR}{MSE} = \dfrac{SSR/1}{SSE/(n - 2)}$
Note: At the level of simple linear regression, the global F-test is equivalent to the t-test on $\beta_1$. When we conduct regression analysis of multiple variables, the global F-test will take on a unique function.
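
The equivalence noted above can be checked numerically on the Problem 15.9 data. The sketch computes F = MSR/MSE and, separately, the slope statistic t = b1 / s_b1 with s_b1 = s_yx / sqrt(S_xx). That slope formula is the standard one and is not shown on the slides, so treat it as an assumption of this example.

```python
import math

# Problem 15.9 data from the slides.
x = [6, 12, 14, 6, 9, 13, 15, 9]
y = [300, 408, 560, 252, 288, 650, 630, 522]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
sst = sum((yi - y_bar) ** 2 for yi in y)

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ssr = sst - sse

# Global F-test: one numerator df, n - 2 denominator df.
msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse

# t-test on the slope: standard error of b1, then t = b1 / s_b1.
s_yx = math.sqrt(mse)
s_b1 = s_yx / math.sqrt(s_xx)
t_slope = b1 / s_b1

print(f"F = {f_stat:.4f}, t = {t_slope:.4f}, t^2 = {t_slope ** 2:.4f}")
```

The squared t statistic and F agree to floating-point precision, which is why the two tests report the same p-value in simple linear regression.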

Excel Output, Problem 15.9
[Excel regression output, with callouts marking the coefficient of correlation, the coefficient of determination, the global F test statistic for the test of H0: $\beta_1 = 0$, and the calculated t for the test of H0: $\beta_1 = 0$]
Note that: (1) both t and F have the same p-value, and (2) $t^2 = F$.
Finish with a discussion of the problems with linear regression (see earlier notes).
Look at the baseball regression equation for 2010.
Homework: 15.71, 15.79, 15.37, 15.1
In class: 15.43 (XR15043), 15.57 (XR15057)