© 2001 Prentice-Hall, Inc. Statistics for Business and Economics Simple Linear Regression Chapter 10
Learning Objectives
1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Ordinary Least Squares
4. Compute Regression Coefficients
5. Predict Response Variable
6. Interpret Computer Output
Models
1. Representation of Some Phenomenon
2. Mathematical Model Is a Mathematical Expression of Some Phenomenon
3. Often Describe Relationships Between Variables
4. Types
   - Deterministic Models
   - Probabilistic Models
Deterministic Models
1. Hypothesize Exact Relationships
2. Suitable When Prediction Error Is Negligible
3. Example: Force Is Exactly Mass Times Acceleration
   \( F = m \cdot a \)
Probabilistic Models
1. Hypothesize 2 Components
   - Deterministic
   - Random Error
2. Example: Sales Volume Is 10 Times Advertising Spending + Random Error
   \( Y = 10X + \varepsilon \)
   - Random Error May Be Due to Factors Other Than Advertising
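The two components of the sales example can be simulated to see how the random error scatters observations around the deterministic part. This is a minimal sketch: the function name `simulate_sales`, the spending levels, and the normally distributed error with standard deviation 5 are illustrative assumptions, since the slide specifies neither a distribution nor a scale for the error.

```python
import random

def simulate_sales(advertising, error_sd=5.0, seed=0):
    """Generate sales as the deterministic part 10 * X plus a random error."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Assumption: the error is normal with mean 0 and standard deviation error_sd.
    return [10 * x + rng.gauss(0.0, error_sd) for x in advertising]

advertising = [1, 2, 3, 4, 5]                        # hypothetical spending levels
sales = simulate_sales(advertising)                  # probabilistic model
exact = simulate_sales(advertising, error_sd=0.0)    # error switched off

for x, y in zip(advertising, sales):
    print(f"X = {x}: deterministic part = {10 * x}, simulated sales = {y:.1f}")
```

With `error_sd=0.0` the model collapses to the deterministic relationship Y = 10X.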
Types of Probabilistic Models

Regression Models
1. Answer ‘What Is the Relationship Between the Variables?’
2. Equation Used
   - 1 Numerical Dependent (Response) Variable
      - What Is to Be Predicted
   - 1 or More Numerical or Categorical Independent (Explanatory) Variables
3. Used Mainly for Prediction & Estimation
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
Model Specification
Specifying the Model
1. Define Variables
   - Conceptual (e.g., Advertising, Price)
   - Empirical (e.g., List Price, Regular Price)
   - Measurement (e.g., $, Units)
2. Hypothesize Nature of Relationship
   - Expected Effects (i.e., Coefficients’ Signs)
   - Functional Form (Linear or Non-Linear)
   - Interactions

Model Specification Is Based on Theory
1. Theory of Field (e.g., Sociology)
2. Mathematical Theory
3. Previous Research
4. ‘Common Sense’

Thinking Challenge: Which Is More Logical?
Types of Regression Models
Regression Models
   - Simple (1 Explanatory Variable)
      - Linear
      - Non-Linear
   - Multiple (2+ Explanatory Variables)
      - Linear
      - Non-Linear
Linear Equations

Linear Regression Model
1. Relationship Between Variables Is a Linear Function
   \[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
   - \( Y_i \): Dependent (Response) Variable
   - \( X_i \): Independent (Explanatory) Variable
   - \( \beta_0 \): Population Y-Intercept
   - \( \beta_1 \): Population Slope
   - \( \varepsilon_i \): Random Error
Population & Sample Regression Models
   - Population: Unknown Relationship
   - Random Sample Drawn from the Population

Population Linear Regression Model
[Figure: population regression line; \( \varepsilon_i \) = random error of an observed value.]

Sample Linear Regression Model
[Figure: sample regression line with an observed value and an unsampled observation; \( \hat{\varepsilon}_i \) = random error.]
Estimating Parameters: Least Squares Method

Scattergram
1. Plot of All \( (X_i, Y_i) \) Pairs
2. Suggests How Well Model Will Fit
Thinking Challenge
How would you draw a line through the points? How do you determine which line ‘fits best’?
Least Squares
1. ‘Best Fit’ Means Differences Between Actual Y Values & Predicted Y Values Are a Minimum
   - But Positive Differences Offset Negative
2. LS Minimizes the Sum of the Squared Differences (SSE)
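The offsetting problem can be seen numerically. In the sketch below, two different lines both have raw differences that sum to zero, yet their sums of squared differences (SSE) are clearly different. The data values are hypothetical, picked for illustration only, and the helper names `residuals` and `sse` are not from the slides.

```python
def residuals(x, y, b0, b1):
    """Differences between actual Y and the Y predicted by the line b0 + b1 * X."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def sse(x, y, b0, b1):
    """Sum of squared differences for the line b0 + b1 * X."""
    return sum(e ** 2 for e in residuals(x, y, b0, b1))

# Hypothetical illustrative data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

flat_line = (2.0, 0.0)     # horizontal line through the mean of Y
candidate = (-0.1, 0.7)    # a sloped line through the same points

for name, (b0, b1) in [("flat line", flat_line), ("sloped line", candidate)]:
    total = abs(sum(residuals(x, y, b0, b1)))
    print(f"{name}: |sum of differences| = {total:.3f}, SSE = {sse(x, y, b0, b1):.3f}")
```

Because positive and negative differences cancel for both lines, only the squared criterion distinguishes them.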
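Least Squares Graphically

Coefficient Equations
   - Sample Slope: \( \hat{\beta}_1 = SS_{xy} / SS_{xx} \), where \( SS_{xy} = \sum (X_i - \bar{X})(Y_i - \bar{Y}) \) and \( SS_{xx} = \sum (X_i - \bar{X})^2 \)
   - Sample Y-intercept: \( \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \)
   - Prediction Equation: \( \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i \)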
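The coefficient equations translate directly into code. This is a minimal sketch assuming the standard least squares formulas (slope = SS_xy / SS_xx, intercept = mean of Y minus slope times mean of X). The function name `least_squares` and the data are hypothetical; the data were chosen only because they give round estimates.

```python
def least_squares(x, y):
    """Return (b0, b1), the sample Y-intercept and sample slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx          # sample slope
    b0 = y_bar - b1 * x_bar     # sample Y-intercept
    return b0, b1

# Hypothetical illustrative data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

b0, b1 = least_squares(x, y)
print(f"prediction equation: Y-hat = {b0:.2f} + {b1:.2f} X")
```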
Computation Table

Interpretation of Coefficients
1. Slope (\( \hat{\beta}_1 \))
   - Estimated Y Changes by \( \hat{\beta}_1 \) for Each 1 Unit Increase in X
   - If \( \hat{\beta}_1 = 2 \), then Sales (Y) Is Expected to Increase by 2 for Each 1 Unit Increase in Advertising (X)
2. Y-Intercept (\( \hat{\beta}_0 \))
   - Average Value of Y When X = 0
   - If \( \hat{\beta}_0 = 4 \), then Average Sales (Y) Is Expected to Be 4 When Advertising (X) Is 0
Parameter Estimation Example
You’re a marketing analyst for Hasbro Toys. You gather data on advertising (Ad $) and sales (units). What is the relationship between sales & advertising?

Scattergram: Sales vs. Advertising
[Figure: Sales plotted against Advertising.]

Parameter Estimation Solution Table

Parameter Estimation Solution
Prediction equation: \( \hat{Y} = -0.1 + 0.7X \)
Coefficient Interpretation Solution
1. Slope (\( \hat{\beta}_1 \))
   - Sales Volume (Y) Is Expected to Increase by 0.7 Units for Each $1 Increase in Advertising (X)
2. Y-Intercept (\( \hat{\beta}_0 \))
   - Average Value of Sales Volume (Y) Is -0.10 Units When Advertising (X) Is 0
   - Difficult to Explain to Marketing Manager
   - Expect Some Sales Without Advertising
Parameter Estimation Computer Output
Parameter Estimates
Variable    DF    Parameter Estimate    Standard Error    T for H0: Param=0    Prob>|T|
INTERCEP
ADVERT
The Parameter Estimate column holds \( \hat{\beta}_k \): \( \hat{\beta}_0 \) in the INTERCEP row and \( \hat{\beta}_1 \) in the ADVERT row.
Parameter Estimation Thinking Challenge
You’re an economist for the county cooperative. You gather data on fertilizer (lb.) and yield (lb.). What is the relationship between fertilizer & crop yield?

Scattergram: Crop Yield vs. Fertilizer*
[Figure: Yield (lb.) plotted against Fertilizer (lb.).]

Parameter Estimation Solution Table*

Parameter Estimation Solution*
Prediction equation: \( \hat{Y} = 0.8 + 0.65X \)
Coefficient Interpretation Solution*
1. Slope (\( \hat{\beta}_1 \))
   - Crop Yield (Y) Is Expected to Increase by 0.65 lb. for Each 1 lb. Increase in Fertilizer (X)
2. Y-Intercept (\( \hat{\beta}_0 \))
   - Average Crop Yield (Y) Is Expected to Be 0.8 lb. When No Fertilizer (X) Is Used
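Both interpretations can be checked by plugging the estimates into the prediction equation. The sketch uses only the coefficients stated in the two solutions (0.7 and -0.1 for sales, 0.65 and 0.8 for crop yield). The helper name `predict` and the X values tried are illustrative choices.

```python
def predict(x, b0, b1):
    """Point estimate of Y from the prediction equation b0 + b1 * X."""
    return b0 + b1 * x

# Sales example: intercept -0.1, slope 0.7.
sales_at_4 = predict(4, -0.1, 0.7)

# Crop yield example: intercept 0.8, slope 0.65.
yield_at_0 = predict(0, 0.8, 0.65)                           # the Y-intercept
yield_gain = predict(11, 0.8, 0.65) - predict(10, 0.8, 0.65)  # one more lb. of fertilizer

print(f"predicted sales at $4 of advertising: {sales_at_4:.2f} units")
print(f"predicted yield with no fertilizer:   {yield_at_0:.2f} lb.")
print(f"yield gain per extra lb. fertilizer:  {yield_gain:.2f} lb.")
```

The one-unit difference recovers the slope, and X = 0 recovers the intercept, which is exactly what the two interpretations state.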
Probability Distribution of Random Error

Linear Regression Assumptions
1. Mean of Probability Distribution of Error Is 0
2. Probability Distribution of Error Has Constant Variance
3. Probability Distribution of Error Is Normal
4. Errors Are Independent

Error Probability Distribution
Random Error Variation
1. Variation of Actual Y from Predicted Y
2. Measured by Standard Error of Regression Model
   - Sample Standard Deviation of \( \hat{\varepsilon} \), s
3. Affects Several Factors
   - Parameter Significance
   - Prediction Accuracy
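The standard error of the regression model can be computed from the differences between actual and predicted Y. The sketch below assumes the usual simple linear regression formula s = sqrt(SSE / (n - 2)), which the slide text does not spell out. The data are hypothetical, and the line -0.1 + 0.7X is used because it is the least squares fit for these values.

```python
import math

def standard_error(x, y, b0, b1):
    """Sample standard deviation of the errors around the line b0 + b1 * X."""
    n = len(x)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))   # n - 2 because two coefficients are estimated

# Hypothetical illustrative data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

s = standard_error(x, y, b0=-0.1, b1=0.7)
print(f"s = {s:.4f}")
```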
Measures of Variation in Regression
1. Total Sum of Squares (\( SS_{yy} \))
   - Measures Variation of Observed \( Y_i \) Around the Mean \( \bar{Y} \)
2. Explained Variation (SSR)
   - Variation Due to Relationship Between X & Y
3. Unexplained Variation (SSE)
   - Variation Due to Other Factors

Variation Measures
   - Total Sum of Squares: \( (Y_i - \bar{Y})^2 \)
   - Unexplained Sum of Squares: \( (Y_i - \hat{Y}_i)^2 \)
   - Explained Sum of Squares: \( (\hat{Y}_i - \bar{Y})^2 \)

Coefficient of Determination
1. Proportion of Variation ‘Explained’ by Relationship Between X & Y
   \[ r^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{SS_{yy} - SSE}{SS_{yy}}, \qquad 0 \le r^2 \le 1 \]
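The proportion of explained variation can be computed from the total and unexplained sums of squares. This is a sketch under the assumption that r² = (SS_yy - SSE) / SS_yy. The data are hypothetical and the function name `r_squared` is not from the slides.

```python
def r_squared(x, y, b0, b1):
    """Proportion of the variation in Y explained by the line b0 + b1 * X."""
    y_bar = sum(y) / len(y)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)                         # total variation
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))      # unexplained variation
    return (ss_yy - sse) / ss_yy

# Hypothetical illustrative data; -0.1 + 0.7X is their least squares line.
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

r2 = r_squared(x, y, b0=-0.1, b1=0.7)
print(f"r^2 = {r2:.4f}")
```

For these values roughly 82% of the variation in Y is explained by X. The interpretation follows the same pattern for any data set.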
Coefficient of Determination Examples
[Figure: scattergrams illustrating \( r^2 = 1 \), \( r^2 = 0.8 \), and \( r^2 = 0 \).]

Coefficient of Determination Example
You’re a marketing analyst for Hasbro Toys. From data on advertising (Ad $) and sales (units), you find \( \hat{\beta}_0 = -0.1 \) & \( \hat{\beta}_1 = 0.7 \). Interpret the coefficient of determination.

\( r^2 \) Computer Output
   - Root MSE: s
   - R-square: \( r^2 \)
   - Adj R-sq: \( r^2 \) Adjusted for Number of Explanatory Variables & Sample Size
   - Also Reported: Dep Mean, C.V.
Evaluating the Model: Testing for Significance

Test of Slope Coefficient
1. Shows If There Is a Linear Relationship Between X & Y
2. Involves Population Slope \( \beta_1 \)
3. Hypotheses
   - \( H_0: \beta_1 = 0 \) (No Linear Relationship)
   - \( H_a: \beta_1 \ne 0 \) (Linear Relationship)
4. Theoretical Basis Is Sampling Distribution of Slope
Sampling Distribution of Sample Slopes
All Possible Sample Slopes
   - Sample 1: 2.5
   - Sample 2: 1.6
   - Sample 3: 1.8
   - Sample 4: 2.1
   - ... (Very Large Number of Sample Slopes)
[Figure: sampling distribution of \( \hat{\beta}_1 \), centered at \( \beta_1 \) with standard error \( S_{\hat{\beta}_1} \).]
Slope Coefficient Test Statistic
\[ t = \frac{\hat{\beta}_1}{S_{\hat{\beta}_1}}, \qquad S_{\hat{\beta}_1} = \frac{s}{\sqrt{SS_{xx}}}, \qquad SS_{xx} = \sum (X_i - \bar{X})^2, \qquad df = n - 2 \]

Test of Slope Coefficient Example
You’re a marketing analyst for Hasbro Toys. From data on advertising (Ad $) and sales (units), you find \( b_0 = -0.1 \) and \( b_1 = 0.7 \) and compute s. Is the relationship significant at the .05 level?

Solution Table

Test of Slope Parameter Solution
   - \( H_0: \beta_1 = 0 \)
   - \( H_a: \beta_1 \ne 0 \)
   - \( \alpha = .05 \)
   - df = 3
   - Critical Value(s): \( t = \pm 3.182 \)
   - Test Statistic: \( t = \hat{\beta}_1 / S_{\hat{\beta}_1} \)
   - Decision: Reject at \( \alpha = .05 \)
   - Conclusion: There Is Evidence of a Relationship

Test Statistic Solution
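The test of the slope can be walked through in code. This is a sketch assuming the standard statistic t = b1 / (s / sqrt(SS_xx)) with n - 2 degrees of freedom. The data are hypothetical values that reproduce b0 = -0.1 and b1 = 0.7 with df = 3, and the critical value 3.182 is the two-tailed t-table entry for α = .05 with 3 degrees of freedom.

```python
import math

# Hypothetical illustrative data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_xy / ss_xx                      # sample slope
b0 = y_bar - b1 * x_bar                 # sample Y-intercept
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))            # standard error of the regression model
s_b1 = s / math.sqrt(ss_xx)             # standard error of the slope

t_stat = b1 / s_b1
T_CRITICAL = 3.182                      # t-table value: alpha = .05 two-tailed, df = 3
reject = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, df = {n - 2}, reject H0: {reject}")
```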
Test of Slope Parameter Computer Output
Parameter Estimates
Variable    DF    Parameter Estimate    Standard Error    T for H0: Param=0    Prob>|T|
INTERCEP
ADVERT
   - Parameter Estimate: \( \hat{\beta}_k \)
   - Standard Error: \( S_{\hat{\beta}_k} \)
   - T for H0: Param=0: \( t = \hat{\beta}_k / S_{\hat{\beta}_k} \)
   - Prob>|T|: P-Value
Using the Model for Prediction & Estimation

Prediction With Regression Models
1. Types of Predictions
   - Point Estimates
   - Interval Estimates
2. What Is Predicted
   - Population Mean Response E(Y) for Given X
      - Point on Population Regression Line
   - Individual Response (\( Y_i \)) for Given X

What Is Predicted
Confidence Interval Estimate of Mean Y
\[ \hat{Y} \pm t_{\alpha/2} \, s \sqrt{\frac{1}{n} + \frac{(X_p - \bar{X})^2}{SS_{xx}}}, \qquad SS_{xx} = \sum (X_i - \bar{X})^2, \qquad df = n - 2 \]

Factors Affecting Interval Width
1. Level of Confidence (\( 1 - \alpha \))
   - Width Increases as Confidence Increases
2. Data Dispersion (s)
   - Width Increases as Variation Increases
3. Sample Size
   - Width Decreases as Sample Size Increases
4. Distance of \( X_p \) from Mean \( \bar{X} \)
   - Width Increases as Distance Increases

Why Distance from Mean?
[Figure: estimates at X values farther from \( \bar{X} \) show greater dispersion than at \( X_1 \).]
Confidence Interval Estimate Example
You’re a marketing analyst for Hasbro Toys. From data on advertising (Ad $) and sales (units), you find \( b_0 = -0.1 \) and \( b_1 = 0.7 \) and compute s. Estimate the mean sales when advertising is $4 at the .05 level.

Solution Table

Confidence Interval Estimate Solution
   - X to Be Predicted: \( X_p = 4 \)
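The interval for mean sales at $4 of advertising can be sketched as follows, assuming the standard formula Y-hat ± t · s · sqrt(1/n + (X_p - X-bar)² / SS_xx). The data are hypothetical values that reproduce b0 = -0.1 and b1 = 0.7. The function name `mean_confidence_interval` is illustrative, and 3.182 is the t-table value for α = .05 (two-tailed) with 3 degrees of freedom.

```python
import math

def mean_confidence_interval(x, y, x_p, t_crit):
    """Confidence interval for the mean of Y at X = x_p from a least squares fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    y_hat = b0 + b1 * x_p
    half_width = t_crit * s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)
    return y_hat - half_width, y_hat + half_width

# Hypothetical illustrative data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

low, high = mean_confidence_interval(x, y, x_p=4, t_crit=3.182)
print(f"95% confidence interval for mean sales at X = 4: ({low:.3f}, {high:.3f})")
```

Moving `x_p` away from the mean of X (here 3) enlarges the second term under the square root and widens the interval.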
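Prediction Interval of Individual Response
\[ \hat{Y} \pm t_{\alpha/2} \, s \sqrt{1 + \frac{1}{n} + \frac{(X_p - \bar{X})^2}{SS_{xx}}}, \qquad SS_{xx} = \sum (X_i - \bar{X})^2, \qquad df = n - 2 \]
Note! The leading 1 under the square root does not appear in the confidence interval estimate of mean Y.

Why the Extra ‘S’?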
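The effect of the extra term can be seen by computing both intervals side by side. The sketch assumes the standard formulas, in which the prediction interval adds 1 under the square root to account for the scatter of an individual response around the mean. The data, the function name `interval_at`, and the `individual` flag are hypothetical. The value 3.182 is the t-table entry for α = .05 (two-tailed), df = 3.

```python
import math

def interval_at(x, y, x_p, t_crit, individual=False):
    """Interval for Y at X = x_p: mean response by default, individual response if flagged."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    extra = 1.0 if individual else 0.0     # the extra term for an individual response
    half_width = t_crit * s * math.sqrt(extra + 1 / n + (x_p - x_bar) ** 2 / ss_xx)
    y_hat = b0 + b1 * x_p
    return y_hat - half_width, y_hat + half_width

# Hypothetical illustrative data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

ci = interval_at(x, y, x_p=4, t_crit=3.182)
pi = interval_at(x, y, x_p=4, t_crit=3.182, individual=True)
print(f"confidence interval (mean Y):       ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"prediction interval (individual Y): ({pi[0]:.3f}, {pi[1]:.3f})")
```

Both intervals are centered on the same point estimate, but the prediction interval is always the wider of the two.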
Interval Estimate Computer Output
Columns: Obs, Dep Var SALES, Pred Value, Std Err Predict, Low95% Mean, Upp95% Mean, Low95% Predict, Upp95% Predict
   - Pred Value: Predicted Y When X = 4 (\( \hat{Y} \))
   - Std Err Predict: \( S_{\hat{Y}} \)
   - Low95% Mean, Upp95% Mean: Confidence Interval
   - Low95% Predict, Upp95% Predict: Prediction Interval

Hyperbolic Interval Bands
Correlation Models
1. Answer ‘How Strong Is the Linear Relationship Between 2 Variables?’
2. Coefficient of Correlation Used
   - Population Correlation Coefficient Denoted \( \rho \) (Rho)
   - Values Range from -1 to +1
   - Measures Degree of Association
3. Used Mainly for Understanding

Sample Coefficient of Correlation
1. Pearson Product Moment Coefficient of Correlation, r:
   \[ r = \frac{SS_{xy}}{\sqrt{SS_{xx} \, SS_{yy}}} \]
   where \( SS_{xy} = \sum (X_i - \bar{X})(Y_i - \bar{Y}) \), \( SS_{xx} = \sum (X_i - \bar{X})^2 \), and \( SS_{yy} = \sum (Y_i - \bar{Y})^2 \)
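The sample coefficient of correlation can be computed from the same sums of squares used for the regression coefficients. This is a sketch assuming the standard Pearson formula r = SS_xy / sqrt(SS_xx · SS_yy). The data and the function name `pearson_r` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product moment coefficient of correlation for paired samples."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical illustrative data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

r = pearson_r(x, y)
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}")
```

In simple linear regression the square of r equals the coefficient of determination, and r carries the sign of the slope.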
Coefficient of Correlation Values
   - -1: Perfect Negative Correlation
   - Between -1 and 0: Increasing Degree of Negative Correlation Toward -1
   - 0: No Correlation
   - Between 0 and +1: Increasing Degree of Positive Correlation Toward +1
   - +1: Perfect Positive Correlation

Coefficient of Correlation Examples
[Figure: scattergrams illustrating r = 1, r = -1, r = .89, and r = 0.]
Test of Coefficient of Correlation
1. Shows If There Is a Linear Relationship Between 2 Numerical Variables
2. Same Conclusion as Testing Population Slope \( \beta_1 \)
3. Hypotheses
   - \( H_0: \rho = 0 \) (No Correlation)
   - \( H_a: \rho \ne 0 \) (Correlation)
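The claim that the correlation test and the slope test reach the same conclusion can be checked numerically. The sketch assumes the standard statistic t = r · sqrt(n - 2) / sqrt(1 - r²) for the correlation test, which does not appear on the slide, and compares it with the slope statistic t = b1 / (s / sqrt(SS_xx)). The data are hypothetical.

```python
import math

# Hypothetical illustrative data (not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Test statistic based on the sample coefficient of correlation.
r = ss_xy / math.sqrt(ss_xx * ss_yy)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Test statistic based on the sample slope.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
t_from_slope = b1 / (s / math.sqrt(ss_xx))

print(f"t from r = {t_from_r:.4f}, t from slope = {t_from_slope:.4f}")
```

The two statistics agree up to floating-point rounding, so both tests reject or fail to reject together.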
Conclusion
1. Described the Linear Regression Model
2. Stated the Regression Modeling Steps
3. Explained Ordinary Least Squares
4. Computed Regression Coefficients
5. Predicted Response Variable
6. Interpreted Computer Output
End of Chapter