Regression The basic problem Regression and Correlation Accuracy of prediction in regression Hypothesis testing Regression with multiple predictors
The Basic Problem How do we predict one variable from another? How does one variable change as the other changes? Cause and effect (can only be inferred if it makes theoretical sense)
An Example Effect of Dow Jones performance on Darts performance (to what degree can Dow Jones predict Darts performance?) Landwehr, J.M. & Watkins, A.E. (1987) Exploring Data: Teacher’s Edition. Palo Alto, CA: Dale Seymour Publications.
The Data
Relationship can be represented by line of best fit
Why use regression? We may want to make a prediction. More likely, we want to understand the relationship. How fast does Darts rise with one unit rise in Dow Jones?
Regression Line Formula: $\hat{Y} = bX + a$, where $\hat{Y}$ = the predicted value of Y (Darts) and X = Dow value
Regression Coefficients “Coefficients” are a and b. b = slope (also called rate of change): the change in predicted Y for a one-unit change in X. a = intercept: the value of predicted Y when X = 0.
Calculation Slope: $b = \mathrm{cov}_{XY} / s_X^2$ Intercept: $a = \bar{Y} - b\bar{X}$
For Our Data b = 11.13/5.43 = 2.04 a = 14.52 - 2.04*5.95 = 2.37 See SPSS printout on next slide
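The slide's arithmetic (covariance of X and Y divided by the variance of X, then mean of Y minus slope times mean of X) can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure: the function name `fit_line` and the toy data are assumptions made for the example, and the data are not the Dow/Darts values.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (b, a) for the least-squares line Y-hat = b*X + a."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    # Sample covariance of X and Y, and sample variance of X (both use n - 1).
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    b = cov_xy / var_x          # slope
    a = y_bar - b * x_bar       # intercept
    return b, a

# Hypothetical data lying exactly on y = 2x + 1 (NOT the Dow/Darts data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
b, a = fit_line(xs, ys)
print(b, a)
```

Because the toy points fall exactly on a line, the fit recovers a slope of 2 and an intercept of 1. With real data the points scatter around the line and the same two formulas give the line of best fit.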
SPSS Printout for One Predictor [SPSS output table; annotation: R², percentage of variance]
SPSS Printout, cont. [SPSS output table; annotations: Is regression significant? Intercept. Slope. Error of prediction.]
Note: The values we obtained are shown on printout. The intercept is labeled “constant.” Slope is labeled by name of predictor variable.
Making a Prediction Suppose that we want to predict the Darts score for a new Dow score of 25. We predict that Darts will be at 23.65 when Dow is at 25. Check with data: what is the real value of Darts when Dow is 25?
Prediction Residual
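Errors of Prediction Residual variance: the variability of the observed values around the predicted values, $s^2_{Y-\hat{Y}} = \sum (Y - \hat{Y})^2 / (N - 2)$ Standard error of estimate: the standard deviation of the observed values around the predicted values, $s_{Y-\hat{Y}} = \sqrt{\sum (Y - \hat{Y})^2 / (N - 2)}$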
Standard Error of Estimate A common measure of the accuracy of our predictions We want it to be as small as possible.
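A rough sketch of how the standard error of estimate could be computed from residuals. The function name, the toy data, and the N - 2 denominator (the usual choice for one predictor) are assumptions for illustration. The slope 0.6 and intercept 2.2 are the least-squares values for this toy data, not for the Dow/Darts data.

```python
from math import sqrt

def standard_error_of_estimate(xs, ys, b, a):
    """Standard deviation of the residuals Y - Y-hat for the line Y-hat = b*X + a."""
    # Residual = observed value minus predicted value.
    residuals = [y - (b * x + a) for x, y in zip(xs, ys)]
    ss_residual = sum(e ** 2 for e in residuals)
    # Assumption: divide by N - 2, since two coefficients (a and b) were estimated.
    residual_variance = ss_residual / (len(xs) - 2)
    return sqrt(residual_variance)

# Hypothetical data (NOT the Dow/Darts data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b, a = 0.6, 2.2   # least-squares slope and intercept for this toy data
se = standard_error_of_estimate(xs, ys, b, a)
print(round(se, 3))   # about 0.894
```

The smaller this value, the closer the observed points lie to the regression line.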
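r² as % Predictable Variability Define sums of squares: $SS_Y = \sum (Y - \bar{Y})^2$, $SS_{\hat{Y}} = \sum (\hat{Y} - \bar{Y})^2$, $SS_{residual} = \sum (Y - \hat{Y})^2$. Then $r^2 = SS_{\hat{Y}} / SS_Y$, the proportion of the variability in Y that is predictable from X.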
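As a hedged illustration of r² as a ratio of sums of squares, the sketch below computes the total and residual sums of squares and takes the predictable share. The function name and the toy data are hypothetical. The predicted values come from the least-squares line Y-hat = 0.6X + 2.2 fitted to that toy data.

```python
def r_squared(ys, predicted):
    """Proportion of the variability in Y accounted for by the predictions."""
    y_bar = sum(ys) / len(ys)
    ss_y = sum((y - y_bar) ** 2 for y in ys)                         # total
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # unexplained
    ss_predicted = ss_y - ss_residual                                # explained
    return ss_predicted / ss_y

# Hypothetical data (NOT the Dow/Darts data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
predicted = [0.6 * x + 2.2 for x in xs]
r2 = r_squared(ys, predicted)
print(round(r2, 2))   # 0.6, i.e. 60% of the variability in Y is predictable
```

The subtraction used for the explained sum of squares only holds when the predictions come from a least-squares fit with an intercept, which is the case here.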
Major Points Predicting one dependent variable from multiple predictor variables Example with Product Advisor Data Multiple correlation Regression equation Predictions
The Problem In the product advisor study, we asked participants to rate the system on a number of aspects, e.g., usefulness, ease of use, trust, kind of product information, number of ratings, etc. Let's think of overall usefulness as our dependent variable. Which of the above factors can predict overall usefulness? What percentage of the variance in overall usefulness do they explain? Which factors play the most important role?
Product Advisor Data Kliewer, W., Lepore, S.J., Oskin, D., & Johnson, P.D. (1998) The role of social and cognitive processes in children’s adjustment to community violence. Journal of Consulting and Clinical Psychology, 66, 199-209.
Correlational Matrix
Regression Results (linear regression using method “Enter”) [SPSS output table; annotation: R², percentage of variance]
Regression is significant Importance of each variable Is contribution significant?
Regression Coefficients Slopes and an intercept. Each variable adjusted for all others in the model. Just an extension of slope and intercept in simple regression SPSS output on next slide
Regression Equation: $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_p X_p$ A separate coefficient for each variable; these are slopes. An intercept (here called $b_0$ instead of a).
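To make the form of the multiple regression equation concrete, here is a minimal sketch of a prediction from an intercept plus one slope per predictor. The function name, coefficient values, and ratings are all hypothetical and are not taken from the product advisor SPSS output.

```python
def predict(b0, slopes, values):
    """Predicted Y = intercept + sum of (slope * predictor value)."""
    if len(slopes) != len(values):
        raise ValueError("need exactly one value per slope")
    return b0 + sum(b * x for b, x in zip(slopes, values))

# Hypothetical coefficients for two predictors (e.g. ease of use and trust).
b0 = 1.0
slopes = [0.5, 0.25]
ratings = [4.0, 6.0]

y_hat = predict(b0, slopes, ratings)
print(y_hat)   # 1.0 + 0.5*4.0 + 0.25*6.0 = 4.5

# Raising the first predictor by one unit, holding the other fixed,
# changes the prediction by exactly that predictor's slope.
print(predict(b0, slopes, [5.0, 6.0]) - y_hat)   # 0.5
```

The second print shows what "each variable adjusted for all others" means in practice: a slope describes the change in predicted Y for a one-unit change in its own predictor while the other predictors stay constant.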