Statistics for Business and Economics Chapter 10 Simple Linear Regression.


Learning Objectives
1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Least Squares
4. Compute Regression Coefficients
5. Explain Correlation
6. Predict Response Variable

Models

Representation of some phenomenon
Mathematical model is a mathematical expression of some phenomenon
Often describe relationships between variables
Types
  – Deterministic models
  – Probabilistic models

Deterministic Models
Hypothesize exact relationships
Suitable when prediction error is negligible
Example: force is exactly mass times acceleration
  – \(F = m \cdot a\)

Probabilistic Models
Hypothesize two components
  – Deterministic
  – Random error
Example: sales volume (y) is 10 times advertising spending (x) + random error
  – \(y = 10x + \varepsilon\)
  – Random error may be due to factors other than advertising
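A minimal Python sketch of the sales example, splitting each simulated value into its deterministic part (10x) and its random error. The normal error distribution, the value of `sigma`, the seed, and the function name `simulate_sales` are assumptions made for illustration; none of them comes from the slides.

```python
import random

def simulate_sales(ad_spend, sigma=2.0, seed=0):
    """Return (x, y, error) triples for the model y = 10x + error."""
    rng = random.Random(seed)              # fixed seed so runs are repeatable
    rows = []
    for x in ad_spend:
        error = rng.gauss(0.0, sigma)      # random component (assumed normal)
        y = 10 * x + error                 # deterministic part plus random error
        rows.append((x, y, error))
    return rows

for x, y, error in simulate_sales([1, 2, 3, 4, 5]):
    print(f"x={x}  deterministic={10 * x}  error={error:+.2f}  y={y:.2f}")
```

Setting `sigma=0` collapses the sketch to the deterministic model, where every y is exactly 10x.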

Types of Probabilistic Models
Probabilistic models
  – Regression models
  – Correlation models

Regression Models

Regression Models
Answers ‘What is the relationship between the variables?’
Equation used
  – One numerical dependent (response) variable: what is to be predicted
  – One or more numerical or categorical independent (explanatory) variables
Used mainly for prediction and estimation

Regression Modeling Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random error term
  – Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation

Model Specification

Specifying the Model
1. Define variables
  – Conceptual (e.g., advertising, price)
  – Empirical (e.g., list price, regular price)
  – Measurement (e.g., $, units)
2. Hypothesize nature of relationship
  – Expected effects (i.e., coefficients’ signs)
  – Functional form (linear or non-linear)
  – Interactions

Model Specification Is Based on Theory
Theory of field (e.g., sociology)
Mathematical theory
Previous research
‘Common sense’

Thinking Challenge: Which Is More Logical?
[Figure: four candidate plots of Sales against Advertising]

Types of Relationships
[Figure: scatterplots of Y against X illustrating strong relationships, weak relationships, and no relationship]

Types of Regression Models
Regression models
  – Simple (1 explanatory variable): linear or non-linear
  – Multiple (2+ explanatory variables): linear or non-linear

Linear Regression Model

Linear Regression Model
Relationship between variables is a linear function:
\(y = \beta_0 + \beta_1 x + \varepsilon\)
  – \(y\): dependent (response) variable
  – \(x\): independent (explanatory) variable
  – \(\beta_0\): population y-intercept
  – \(\beta_1\): population slope
  – \(\varepsilon\): random error

Line of Means
\(E(y) = \beta_0 + \beta_1 x\) (line of means)
  – \(\beta_0\) = y-intercept
  – \(\beta_1\) = slope = (change in y) / (change in x)
[Figure: the line of means plotted against x]

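Population & Sample Regression Models
[Figure: a random sample drawn from a population whose relationship is unknown]

Population Linear Regression Model
[Figure: population regression line in the (x, y) plane; each observed value deviates from the line by a random error \(\varepsilon_i\)]

Sample Linear Regression Model
[Figure: fitted sample regression line; each observed value deviates from the line by an estimated random error \(\hat{\varepsilon}_i\); an unsampled observation is also marked]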

Estimating Parameters: Least Squares Method

Scattergram
1. Plot of all \((x_i, y_i)\) pairs
2. Suggests how well model will fit
[Figure: scattergram of y against x]

Thinking Challenge
How would you draw a line through the points? How do you determine which line ‘fits best’?
[Figure: scattergram of y against x]

Least Squares
‘Best fit’ means the differences between actual y values and predicted y values are a minimum
  – But positive differences offset negative ones
Least squares minimizes the sum of the squared differences (SSE)
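A short Python sketch of why the differences are squared before summing. The five (x, y) pairs are an assumption: the example data table did not survive in this transcript, so the values are chosen to be consistent with the results the slides report later (intercept -.1, slope .7, SSE = 1.1). The helper names `errors` and `sse` are illustrative.

```python
# Assumed illustrative data (not taken from the transcript)
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

def errors(b0, b1):
    """Differences between actual y and the line b0 + b1*x."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def sse(b0, b1):
    """Sum of squared differences for the line b0 + b1*x."""
    return sum(e ** 2 for e in errors(b0, b1))

# A flat line at the mean of y: positive and negative errors cancel
print("flat line:    sum of errors =", sum(errors(2.0, 0.0)),
      " SSE =", sse(2.0, 0.0))

# The line with intercept -.1 and slope .7: errors also cancel, SSE is smaller
print("sloped line:  sum of errors =", round(sum(errors(-0.1, 0.7)), 10),
      " SSE =", round(sse(-0.1, 0.7), 10))
```

Both lines have raw errors that sum to zero, so the plain sum cannot tell them apart; the squared sum can, and it is smaller for the sloped line.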

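Least Squares Graphically
[Figure: fitted line in the (x, y) plane with vertical deviations \(\hat{\varepsilon}_1, \hat{\varepsilon}_2, \hat{\varepsilon}_3, \hat{\varepsilon}_4\) from four data points]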

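Coefficient Equations
Slope: \(\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}}\), where \(SS_{xy} = \sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}\) and \(SS_{xx} = \sum x_i^2 - \dfrac{(\sum x_i)^2}{n}\)
y-intercept: \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\)
Prediction equation: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x\)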
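The coefficient equations can be sketched in Python as below, assuming the usual shortcut forms: slope = SSxy / SSxx with SSxy = Σxy − (Σx)(Σy)/n and SSxx = Σx² − (Σx)²/n, and intercept = ȳ − slope·x̄. The function name `least_squares` and the data are assumptions; the data are chosen to be consistent with the estimates the slides report for the advertising example (-.1 and .7).

```python
def least_squares(x, y):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi * xi for xi in x) - sum(x) ** 2 / n
    slope = ss_xy / ss_xx
    intercept = sum(y) / n - slope * sum(x) / n
    return intercept, slope

# Assumed illustrative data (not taken from the transcript)
ad_dollars = [1, 2, 3, 4, 5]
sales_units = [1, 1, 2, 2, 4]

b0, b1 = least_squares(ad_dollars, sales_units)
print(f"prediction equation: y-hat = {b0:.2f} + {b1:.2f} x")
```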

Computation Table
x_i      y_i      x_i^2      y_i^2      x_i y_i
x_1      y_1      x_1^2      y_1^2      x_1 y_1
x_2      y_2      x_2^2      y_2^2      x_2 y_2
:        :        :          :          :
x_n      y_n      x_n^2      y_n^2      x_n y_n
Σx_i     Σy_i     Σx_i^2     Σy_i^2     Σx_i y_i

Interpretation of Coefficients
1. Slope (\(\hat{\beta}_1\))
  – Estimated y changes by \(\hat{\beta}_1\) for each 1-unit increase in x
  – If \(\hat{\beta}_1 = 2\), then Sales (y) is expected to increase by 2 for each 1-unit increase in Advertising (x)
2. Y-intercept (\(\hat{\beta}_0\))
  – Average value of y when x = 0
  – If \(\hat{\beta}_0 = 4\), then average Sales (y) is expected to be 4 when Advertising (x) is 0

Least Squares Example
You’re a marketing analyst for Hasbro Toys. You gather the following data:
[Data table: Ad $ (x), Sales in units (y)]
Find the least squares line relating sales and advertising.

Scattergram: Sales vs. Advertising
[Figure: Sales plotted against Advertising]

Parameter Estimation Solution Table
[Table: columns x_i, y_i, x_i^2, y_i^2, x_i y_i]

Parameter Estimation Solution

Parameter Estimation Computer Output
[Table: ‘Parameter Estimates’ with columns Variable, DF, Parameter Estimate, Standard Error, T for H0: Param=0, Prob>|T|; the INTERCEP row gives \(\hat{\beta}_0\) and the ADVERT row gives \(\hat{\beta}_1\)]

Coefficient Interpretation Solution
1. Slope (\(\hat{\beta}_1\))
  – Sales Volume (y) is expected to increase by .7 units for each $1 increase in Advertising (x)
2. Y-intercept (\(\hat{\beta}_0\))
  – Average value of Sales Volume (y) is -.10 units when Advertising (x) is 0
  – Difficult to explain to marketing manager
  – Expect some sales without advertising

Regression Line Fitted to the Data
[Figure: least squares line drawn through the plot of Sales against Advertising]

Least Squares Thinking Challenge
You’re an economist for the county cooperative. You gather the following data:
[Data table: Fertilizer (lb.), Yield (lb.)]
Find the least squares line relating crop yield and fertilizer.

Scattergram: Crop Yield vs. Fertilizer*
[Figure: Yield (lb.) plotted against Fertilizer (lb.)]

Parameter Estimation Solution Table*
[Table: columns x_i, y_i, x_i^2, y_i^2, x_i y_i]

Parameter Estimation Solution*

Coefficient Interpretation Solution*
1. Slope (\(\hat{\beta}_1\))
  – Crop Yield (y) is expected to increase by .65 lb. for each 1 lb. increase in Fertilizer (x)
2. Y-intercept (\(\hat{\beta}_0\))
  – Average Crop Yield (y) is expected to be 0.8 lb. when no Fertilizer (x) is used
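A Python sketch of the computation-table route to these two estimates. The fertilizer and yield values are an assumption: the data table is missing from this transcript, so the four pairs below are chosen because they reproduce the reported intercept (0.8) and slope (.65). Variable names are illustrative.

```python
# Assumed data (not preserved in the transcript); chosen to match the
# reported estimates of 0.8 (intercept) and 0.65 (slope).
fertilizer = [4, 6, 10, 12]           # x, lb.
crop_yield = [3.0, 5.5, 6.5, 9.0]     # y, lb.
n = len(fertilizer)

# Print the computation table: one row per observation
print(f"{'x':>6}{'y':>7}{'x^2':>8}{'y^2':>8}{'xy':>8}")
for x, y in zip(fertilizer, crop_yield):
    print(f"{x:>6}{y:>7}{x * x:>8}{y * y:>8}{x * y:>8}")

# Column totals feed the coefficient equations
sum_x = sum(fertilizer)
sum_y = sum(crop_yield)
sum_x2 = sum(x * x for x in fertilizer)
sum_xy = sum(x * y for x, y in zip(fertilizer, crop_yield))

ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
slope = ss_xy / ss_xx
intercept = sum_y / n - slope * sum_x / n
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```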

Regression Line Fitted to the Data*
[Figure: least squares line drawn through the plot of Yield (lb.) against Fertilizer (lb.)]

Probability Distribution of Random Error

Linear Regression Assumptions
1. Mean of probability distribution of error, \(\varepsilon\), is 0
2. Probability distribution of error has constant variance
3. Probability distribution of error, \(\varepsilon\), is normal
4. Errors are independent

Error Probability Distribution
[Figure: identical error distributions centered on the line \(E(y) = \beta_0 + \beta_1 x\) at \(x_1\), \(x_2\), and \(x_3\)]

Random Error Variation
Variation of actual y from predicted y, \(\hat{y}\)
Measured by standard error of regression model
  – Sample standard deviation of \(\hat{\varepsilon}\): \(s\)
Affects several factors
  – Parameter significance
  – Prediction accuracy

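Variation Measures
[Figure: for a point \((x_i, y_i)\), the deviations from the mean \(\bar{y}\) and from the fitted line]
  – Total sum of squares: \(\sum (y_i - \bar{y})^2\)
  – Unexplained sum of squares: \(\sum (y_i - \hat{y}_i)^2\)
  – Explained sum of squares: \(\sum (\hat{y}_i - \bar{y})^2\)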

Estimation of \(\sigma^2\)
\(s^2 = \dfrac{SSE}{n - 2}\), where \(SSE = \sum (y_i - \hat{y}_i)^2\)
\(s = \sqrt{s^2}\)

Calculating SSE, \(s^2\), \(s\) Example
You’re a marketing analyst for Hasbro Toys. You gather the following data:
[Data table: Ad $ (x), Sales in units (y)]
Find SSE, \(s^2\), and \(s\).

Calculating SSE Solution
[Table: columns include x_i and y_i]
SSE = 1.1

Calculating \(s^2\) and \(s\) Solution
\(s^2 = \dfrac{SSE}{n - 2} = \dfrac{1.1}{5 - 2} = .36667\)
\(s = \sqrt{.36667} = .6055\)
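A Python sketch of the SSE, s², and s calculation for the advertising example. The five data pairs are an assumption (the table is missing from this transcript); they are chosen to be consistent with the reported estimates of -.1 and .7 and with SSE = 1.1.

```python
import math

# Assumed illustrative data (not taken from the transcript)
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
b0, b1 = -0.1, 0.7                    # estimates reported in the slides

n = len(x)
# SSE: squared differences between observed y and predicted y
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)                    # n - 2 degrees of freedom
s = math.sqrt(s2)

print(f"SSE = {sse:.2f}, s^2 = {s2:.5f}, s = {s:.5f}")
```

Dividing by n − 2 rather than n reflects the two parameters (intercept and slope) estimated from the data.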

Residual Analysis
The residual for observation i, \(e_i\), is the difference between its observed and predicted value
Check the assumptions of regression by examining the residuals
  – Examine for linearity assumption
  – Evaluate independence assumption
  – Evaluate normal distribution assumption
  – Examine for constant variance for all levels of X (homoscedasticity)
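A Python sketch of computing the residuals that these checks start from. The data are an assumption (chosen to agree with the advertising example's reported estimates, -.1 and .7), and the two sums at the end are a quick arithmetic check on the fit, not a test of the four assumptions themselves; those are usually judged from plots of the residuals.

```python
# Assumed illustrative data (not taken from the transcript)
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
b0, b1 = -0.1, 0.7                    # estimates reported in the slides

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]   # e_i = observed - predicted

print(f"{'x':>4}{'y':>4}{'y-hat':>8}{'e':>8}")
for xi, yi, pi, ei in zip(x, y, predicted, residuals):
    print(f"{xi:>4}{yi:>4}{pi:>8.2f}{ei:>8.2f}")

# For a least squares line with an intercept, both of these are zero
# up to floating-point rounding.
total = sum(residuals)
weighted = sum(xi * ei for xi, ei in zip(x, residuals))
print(f"sum of residuals = {total:.10f}, sum of x*residual = {weighted:.10f}")
```

Plotting `residuals` against `x` (or against `predicted`) gives the residual plots used to judge linearity and constant variance.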

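Residual Analysis for Linearity
[Figure: plots of Y against x and of residuals against x, contrasting a ‘Not Linear’ pattern with a ‘Linear’ one]

Residual Analysis for Independence
[Figure: plots of residuals against X, contrasting ‘Not Independent’ with ‘Independent’]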

Checking for Normality
Examine the stem-and-leaf display of the residuals
Examine the boxplot of the residuals
Examine the histogram of the residuals
Construct a normal probability plot of the residuals

Residual Analysis for Normality
When using a normal probability plot, normal errors will approximately display in a straight line
[Figure: normal probability plot, Percent against Residual]

Residual Analysis for Equal Variance
[Figure: plots of Y against x and of residuals against x, contrasting ‘Non-constant variance’ with ‘Constant variance’]

Simple Linear Regression Example: Excel Residual Output
[Table: RESIDUAL OUTPUT with columns Predicted House Price and Residuals]
Does not appear to violate any regression assumptions

Evaluating the Model: Testing for Significance

Test of Slope Coefficient
Shows if there is a linear relationship between x and y
Involves population slope \(\beta_1\)
Hypotheses
  – \(H_0: \beta_1 = 0\) (no linear relationship)
  – \(H_a: \beta_1 \neq 0\) (linear relationship)
Theoretical basis is sampling distribution of slope

Sampling Distribution of Sample Slopes
All possible sample slopes
  – Sample 1: 2.5
  – Sample 2: 1.6
  – Sample 3: 1.8
  – Sample 4: 2.1
  – and so on, for a very large number of sample slopes
[Figure: population line with Sample 1 and Sample 2 lines; sampling distribution of \(\hat{\beta}_1\) centered at \(\beta_1\) with standard error \(S_{\hat{\beta}_1}\)]

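Slope Coefficient Test Statistic
\(t = \dfrac{\hat{\beta}_1}{S_{\hat{\beta}_1}}\), with \(df = n - 2\)
where \(S_{\hat{\beta}_1} = \dfrac{s}{\sqrt{SS_{xx}}}\) and \(SS_{xx} = \sum x_i^2 - \dfrac{(\sum x_i)^2}{n}\)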

Test of Slope Coefficient Example
You’re a marketing analyst for Hasbro Toys. You find \(\hat{\beta}_0 = -.1\), \(\hat{\beta}_1 = .7\) and \(s = .6055\).
[Data table: Ad $ (x), Sales in units (y)]
Is the relationship significant at the .05 level of significance?

Solution Table
[Table: columns x_i, y_i, x_i^2, y_i^2, x_i y_i]

Test Statistic Solution

Test of Slope Coefficient Solution H 0 :  1 = 0 H a :  1  0  .05 df  = 3 Critical Value(s): t Reject H 0.025

Test of Slope Coefficient Solution Test Statistic: Decision: Reject at  =.05 Conclusion: There is evidence of a relationship
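A Python sketch of the slope test, using t = slope / (s / √SSxx) with n − 2 degrees of freedom. Two things are assumptions: the x values (the data table is missing from this transcript, so they are chosen to agree with the reported estimates), and the hard-coded critical value 3.182, which is the standard t-table entry for a two-tailed test at .05 with 3 degrees of freedom.

```python
import math

# Assumed x values (not taken from the transcript)
x = [1, 2, 3, 4, 5]
n = len(x)

b1 = 0.7                              # slope estimate reported in the slides
sse = 1.1                             # SSE reported in the slides
s = math.sqrt(sse / (n - 2))          # standard error of the regression model

ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
se_b1 = s / math.sqrt(ss_xx)          # standard error of the slope
t_stat = b1 / se_b1

T_CRIT = 3.182                        # t table: two-tailed, alpha = .05, df = 3
reject_h0 = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.3f}, critical value = ±{T_CRIT}, reject H0: {reject_h0}")
```

Under these assumed data the statistic exceeds the critical value, which agrees with the slide's decision to reject at the .05 level.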

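Test of Slope Coefficient Computer Output
[Table: ‘Parameter Estimates’ with columns Variable, DF, Parameter Estimate, Standard Error, T for H0: Param=0, Prob>|T|; rows INTERCEP and ADVERT. In the ADVERT row the estimate is \(\hat{\beta}_1\), the standard error is \(S_{\hat{\beta}_1}\), the T value is \(t = \hat{\beta}_1 / S_{\hat{\beta}_1}\), and Prob>|T| is the p-value]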

Correlation Models

Answers ‘How strong is the linear relationship between two variables?’
Coefficient of correlation
  – Sample correlation coefficient denoted r
  – Values range from –1 to +1
  – Measures degree of association
  – Does not indicate cause–effect relationship

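Coefficient of Correlation
\(r = \dfrac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}\)
where
\(SS_{xx} = \sum x_i^2 - \dfrac{(\sum x_i)^2}{n}\), \(SS_{yy} = \sum y_i^2 - \dfrac{(\sum y_i)^2}{n}\), \(SS_{xy} = \sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}\)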

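Coefficient of Correlation Values
[Figure: scale for r running from –1.0 through –.5, 0, and +.5 to +1.0]
r = –1.0: perfect negative correlation
r = 0: no linear correlation
r = +1.0: perfect positive correlation
From 0 toward +1.0: increasing degree of positive correlation
From 0 toward –1.0: increasing degree of negative correlation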

Coefficient of Correlation Example
You’re a marketing analyst for Hasbro Toys.
[Data table: Ad $ (x), Sales in units (y)]
Calculate the coefficient of correlation.

Solution Table
[Table: columns x_i, y_i, x_i^2, y_i^2, x_i y_i]

Coefficient of Correlation Solution
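A Python sketch of the correlation calculation, assuming the usual form r = SSxy / √(SSxx · SSyy). The function name `correlation` and the data are assumptions; the data table is missing from this transcript, and the values below are chosen because they give r ≈ .904, the figure quoted later in the slides.

```python
import math

def correlation(x, y):
    """Sample coefficient of correlation r for paired data."""
    n = len(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi * xi for xi in x) - sum(x) ** 2 / n
    ss_yy = sum(yi * yi for yi in y) - sum(y) ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Assumed illustrative data (not taken from the transcript)
ad_dollars = [1, 2, 3, 4, 5]
sales_units = [1, 1, 2, 2, 4]

r = correlation(ad_dollars, sales_units)
print(f"r = {r:.3f}")
```

The same function returns values at the ends of the scale for perfectly linear data, for example `correlation([1, 2, 3], [2, 4, 6])` for a perfect positive relationship.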

Coefficient of Correlation Thinking Challenge
You’re an economist for the county cooperative. You gather the following data:
[Data table: Fertilizer (lb.), Yield (lb.)]
Find the coefficient of correlation.

Solution Table*
[Table: columns x_i, y_i, x_i^2, y_i^2, x_i y_i]

Coefficient of Correlation Solution*

Coefficient of Determination
Proportion of variation ‘explained’ by relationship between x and y
\(0 \le r^2 \le 1\)
\(r^2 = (\text{coefficient of correlation})^2\)

Examples of Approximate \(r^2\) Values
\(r^2 = 1\): perfect linear relationship between X and Y; 100% of the variation in Y is explained by variation in X
[Figure: two plots of Y against X with all points on a line]

Examples of Approximate \(r^2\) Values
\(0 < r^2 < 1\): weaker linear relationships between X and Y; some but not all of the variation in Y is explained by variation in X
[Figure: two plots of Y against X with points scattered about a line]

Examples of Approximate \(r^2\) Values
\(r^2 = 0\): no linear relationship between X and Y; the value of Y does not depend on X (none of the variation in Y is explained by variation in X)
[Figure: plot of Y against X with no linear pattern]

Coefficient of Determination Example
You’re a marketing analyst for Hasbro Toys. You know r = .904.
[Data table: Ad $ (x), Sales in units (y)]
Calculate and interpret the coefficient of determination.

Coefficient of Determination Solution
\(r^2 = (\text{coefficient of correlation})^2 = (.904)^2 = .817\)
Interpretation: About 81.7% of the sample variation in Sales (y) can be explained by using Ad $ (x) to predict Sales (y) in the linear model.
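A Python sketch that reaches the same figure from the variation side, using r² = 1 − SSE / SSyy (explained share of the total sum of squares). The y values are an assumption, chosen to be consistent with the reported SSE = 1.1 and r = .904; the small gap between the two results below comes only from r being rounded to three decimals.

```python
# Assumed y values (not taken from the transcript)
y = [1, 1, 2, 2, 4]
sse = 1.1                                        # unexplained sum of squares (from the slides)

y_bar = sum(y) / len(y)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)       # total sum of squares

r_squared = 1 - sse / ss_yy                      # proportion of variation explained
r_squared_from_r = 0.904 ** 2                    # squaring the rounded r from the slides

print(f"r^2 from sums of squares: {r_squared:.4f}")
print(f"r^2 from rounded r:       {r_squared_from_r:.4f}")
```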

\(r^2\) Computer Output
[Table: Root MSE, R-square, Dep Mean, Adj R-sq, C.V.; R-square is \(r^2\), and Adj R-sq is \(r^2\) adjusted for number of explanatory variables & sample size]

Using the Model for Prediction & Estimation

Prediction With Regression Models
Types of predictions
  – Point estimates
  – Interval estimates
What is predicted
  – Population mean response \(E(y)\) for given x: a point on the population regression line
  – Individual response (\(y_i\)) for given x

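What Is Predicted
[Figure: at \(x = x_p\), the mean of y on the population line \(E(y) = \beta_0 + \beta_1 x\), the individual value y, and the prediction \(\hat{y}\) on the fitted line \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x\)]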

Confidence Interval Estimate for Mean Value of y at \(x = x_p\)
\(\hat{y} \pm t_{\alpha/2}\, s \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}\)
\(df = n - 2\)

Factors Affecting Interval Width
1. Level of confidence (\(1 - \alpha\)): width increases as confidence increases
2. Data dispersion (\(s\)): width increases as variation increases
3. Sample size: width decreases as sample size increases
4. Distance of \(x_p\) from mean \(\bar{x}\): width increases as distance increases
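The four factors can be seen in a small Python sketch of the interval half-width, assuming the usual form t · s · √(1/n + (x_p − x̄)² / SSxx) for the mean of y. The function name `half_width` and every number passed to it are illustrative assumptions, not values taken from the slides.

```python
import math

def half_width(t_crit, s, n, x_p, x_bar, ss_xx):
    """Half-width of the confidence interval for the mean of y at x_p."""
    return t_crit * s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)

# Illustrative baseline (assumed values)
base = dict(t_crit=3.182, s=0.6055, n=5, x_bar=3.0, ss_xx=10.0)

# Factor 4: the interval widens as x_p moves away from the mean of x
for x_p in (3, 4, 5):
    print(f"x_p = {x_p}: half-width = {half_width(x_p=x_p, **base):.3f}")

# Factors 1 to 3, each changed on its own with everything else held at the
# baseline (in practice a larger sample also changes t, s, and SSxx)
wider_t = half_width(x_p=4, **{**base, "t_crit": 5.841})
wider_s = half_width(x_p=4, **{**base, "s": 1.2})
bigger_n = half_width(x_p=4, **{**base, "n": 50})
print(f"higher confidence: {wider_t:.3f}, larger s: {wider_s:.3f}, larger n: {bigger_n:.3f}")
```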

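Why Distance from Mean?
[Figure: Sample 1 and Sample 2 lines plotted against x, with \(\bar{x}\), \(x_1\), and \(x_2\) marked on the axis; the lines are farther apart at \(x_2\), labeled ‘Greater dispersion than \(x_1\)’]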

Confidence Interval Estimate Example
You’re a marketing analyst for Hasbro Toys. You find \(\hat{\beta}_0 = -.1\), \(\hat{\beta}_1 = .7\) and \(s = .6055\).
[Data table: Ad $ (x), Sales in units (y)]
Find a 95% confidence interval for the mean sales when advertising is $4.

Solution Table
[Table: columns x_i, y_i, x_i^2, y_i^2, x_i y_i]

Confidence Interval Estimate Solution
[Worked solution; the annotation ‘x to be predicted’ marks \(x_p = 4\)]
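A Python sketch of the 95% confidence interval for mean sales at $4 of advertising, assuming the usual form ŷ ± t · s · √(1/n + (x_p − x̄)² / SSxx). The x values are an assumption (the data table is missing from this transcript; they are chosen to agree with the reported estimates), and 3.182 is the standard t-table value for .025 in each tail with 3 degrees of freedom.

```python
import math

# Assumed x values (not taken from the transcript)
x = [1, 2, 3, 4, 5]
n = len(x)
x_bar = sum(x) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)

b0, b1 = -0.1, 0.7                    # estimates reported in the slides
s = math.sqrt(1.1 / (n - 2))          # from SSE = 1.1
T_CRIT = 3.182                        # t table: alpha/2 = .025, df = 3

x_p = 4                               # x to be predicted
y_hat = b0 + b1 * x_p                 # point estimate
margin = T_CRIT * s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)
ci = (y_hat - margin, y_hat + margin)

print(f"y-hat = {y_hat:.2f}, 95% CI for mean sales: ({ci[0]:.3f}, {ci[1]:.3f})")
```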

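Prediction Interval of Individual Value of y at \(x = x_p\)
\(\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}\)
Note the additional 1 under the square root.
\(df = n - 2\)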

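Why the Extra ‘S’?
[Figure: at \(x = x_p\), the expected (mean) y on the line \(E(y) = \beta_0 + \beta_1 x\), the individual y we're trying to predict, which differs from the mean by \(\varepsilon\), and the prediction \(\hat{y}\) on the fitted line \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\)]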

Prediction Interval Example
You’re a marketing analyst for Hasbro Toys. You find \(\hat{\beta}_0 = -.1\), \(\hat{\beta}_1 = .7\) and \(s = .6055\).
[Data table: Ad $ (x), Sales in units (y)]
Predict the sales when advertising is $4. Use a 95% prediction interval.

Solution Table
[Table: columns x_i, y_i, x_i^2, y_i^2, x_i y_i]

Prediction Interval Solution
[Worked solution; the annotation ‘x to be predicted’ marks \(x_p = 4\)]
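A Python sketch of the 95% prediction interval for an individual sales value at $4 of advertising, assuming the usual form ŷ ± t · s · √(1 + 1/n + (x_p − x̄)² / SSxx). As before, the x values are an assumption chosen to agree with the reported estimates, and 3.182 is the t-table value for 3 degrees of freedom. The confidence-interval margin is computed alongside for comparison.

```python
import math

# Assumed x values (not taken from the transcript)
x = [1, 2, 3, 4, 5]
n = len(x)
x_bar = sum(x) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)

b0, b1 = -0.1, 0.7                    # estimates reported in the slides
s = math.sqrt(1.1 / (n - 2))          # from SSE = 1.1
T_CRIT = 3.182                        # t table: alpha/2 = .025, df = 3

x_p = 4
y_hat = b0 + b1 * x_p
distance_term = 1 / n + (x_p - x_bar) ** 2 / ss_xx

ci_margin = T_CRIT * s * math.sqrt(distance_term)        # mean of y
pi_margin = T_CRIT * s * math.sqrt(1 + distance_term)    # individual y: extra 1
pi = (y_hat - pi_margin, y_hat + pi_margin)

print(f"95% prediction interval: ({pi[0]:.3f}, {pi[1]:.3f})")
print(f"margin for mean: {ci_margin:.3f}, margin for individual: {pi_margin:.3f}")
```

The extra 1 accounts for the random error of a single observation around its mean, so the prediction interval is always wider than the confidence interval at the same x.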

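Interval Estimate Computer Output
[Table: columns Obs, Dep Var SALES, Predict Value, Std Err Predict, Low95% Mean, Upp95% Mean, Low95% Predict, Upp95% Predict; annotations mark the predicted y when x = 4, \(S_{\hat{y}}\) (Std Err Predict), the confidence interval (Low95% and Upp95% Mean), and the prediction interval (Low95% and Upp95% Predict)]

Confidence Intervals v. Prediction Intervals
[Figure: confidence and prediction bands around the fitted line \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\), with \(\bar{x}\) marked on the x axis]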

Conclusion
1. Described the Linear Regression Model
2. Stated the Regression Modeling Steps
3. Explained Least Squares
4. Computed Regression Coefficients
5. Explained Correlation
6. Predicted Response Variable