Stat 1301 More on Regression
Outline of Lecture
1. Regression Effect and Regression Fallacy
2. Regression Line as Least Squares Line
3. Extrapolation
4. Multiple Regression
1. Regression Effect and Regression Fallacy
Hypothetical Grades for the First 2 Tests in a Class of STAT 1301
Test 1: AVG x = 75, SD x = 10
Test 2: AVG y = 75, SD y = 10
r = 0.7
Test-Retest Situation
Predict the score on Test 2 for a student whose Test 1 score was...
(a) 95
(b) 60
Regression Line: $\hat{Y} = 0.7X + 22.5$
(slope = r × SD y / SD x = 0.7; the line passes through the point of averages (75, 75), so the intercept is 75 − 0.7 × 75 = 22.5)
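The two predictions can be worked out directly from the summary statistics on the slide. The following is a minimal sketch; the function name `predict_test2` is made up for illustration.

```python
# Summary statistics from the slide (Test 1 = x, Test 2 = y).
avg_x, sd_x = 75.0, 10.0
avg_y, sd_y = 75.0, 10.0
r = 0.7

# The regression line passes through the point of averages
# and has slope r * SD_y / SD_x.
slope = r * sd_y / sd_x
intercept = avg_y - slope * avg_x


def predict_test2(test1_score):
    """Predicted Test 2 score for a given Test 1 score (hypothetical helper)."""
    return intercept + slope * test1_score


for score in (95, 60):
    print(score, "->", round(predict_test2(score), 1))
```

The student with 95 is predicted to score 89, and the student with 60 is predicted to score 64.5. Both predictions sit closer to the average of 75 than the Test 1 scores did.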
The Regression Effect
Test-retest situation:
- Bottom group on Test 1 does better on Test 2
- Top group on Test 1 falls back on Test 2
The Regression Fallacy
Attributing the regression effect to something besides natural spread around the line.
Regression Effect - Explanation
Students scoring 95 on Test 1: 3 categories
(a) Students who will average 95 for the course
(b) Great students having a bad day
(c) “Pretty good” students having a good day
- There are more students in category (c) than in (b)
- Thus, we expect the “average” performance for those who scored 95 on Test 1 to drop
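The "good day / bad day" explanation can be checked with a small simulation. This sketch assumes a model that is not in the lecture: each score is a student's underlying ability plus independent luck on the day, both normally distributed, with the variances chosen so that each test has AVG 75, SD 10 and the two tests have r = 0.7.

```python
import random

random.seed(1301)  # fixed seed so the sketch is reproducible

AVG, R = 75.0, 0.7
# Assumed model: total variance 100 (SD 10) split into ability and luck.
SD_ABILITY = (R * 100) ** 0.5       # ability variance 70
SD_LUCK = ((1 - R) * 100) ** 0.5    # luck variance 30


def simulate_student():
    """Return (test1, test2) for one hypothetical student."""
    ability = random.gauss(AVG, SD_ABILITY)
    test1 = ability + random.gauss(0, SD_LUCK)
    test2 = ability + random.gauss(0, SD_LUCK)
    return test1, test2


students = [simulate_student() for _ in range(100_000)]

# Test 2 scores of students who scored about 95, and about 60, on Test 1.
top = [t2 for t1, t2 in students if 93 <= t1 <= 97]
bottom = [t2 for t1, t2 in students if 58 <= t1 <= 62]

top_avg_test2 = sum(top) / len(top)
bottom_avg_test2 = sum(bottom) / len(bottom)
print("Scored about 95 on Test 1, Test 2 average:", round(top_avg_test2, 1))
print("Scored about 60 on Test 1, Test 2 average:", round(bottom_avg_test2, 1))
```

Nothing changes between the two tests in this model, yet the top group falls back and the bottom group improves. That is the regression effect arising from spread alone.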
Regression Effect - Examples
- 4-yr-olds with IQs of 120 typically have adult IQs around [...].
- 4-yr-olds with IQs of 70 typically have adult IQs around 85.
- Of major league baseball teams with winning records, typically 2/3 win fewer games the next year.
Note:
- The regression effect does not explain a change in averages
- If r > 0:
  if X is above AVGx, then the predicted Y must be above AVGy
  if X is below AVGx, then the predicted Y must be below AVGy
2. Regression Line as Least Squares Line
What line is “closest” to the points?
Among all lines, the regression line has the smallest RMS size of vertical deviations from the points to the line.
The regression line is also called the least squares line.
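The least squares property can be illustrated numerically. The sketch below uses a small made-up data set (not from the lecture), computes the regression line, and checks that nudging the slope or intercept only makes the RMS error larger.

```python
from math import sqrt

# Hypothetical (x, y) data, made up for illustration only.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]


def rms_error(slope, intercept):
    """RMS size of the vertical deviations from the points to the line."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in points]
    return sqrt(sum(squared) / len(points))


n = len(points)
avg_x = sum(x for x, _ in points) / n
avg_y = sum(y for _, y in points) / n

# Regression slope: sum of products of deviations over sum of squared x-deviations.
slope = (sum((x - avg_x) * (y - avg_y) for x, y in points)
         / sum((x - avg_x) ** 2 for x, _ in points))
intercept = avg_y - slope * avg_x

best = rms_error(slope, intercept)
print("regression line: slope", round(slope, 3), "intercept", round(intercept, 3))
print("RMS error of regression line:", round(best, 4))

# Any other line tried here has a larger RMS error.
for d_slope, d_intercept in [(0.1, 0), (-0.1, 0), (0, 0.5), (0, -0.5), (0.1, -0.3)]:
    other = rms_error(slope + d_slope, intercept + d_intercept)
    print("nudged line RMS error:", round(other, 4), "(larger)" if other > best else "")
```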
3. Extrapolation
- Predicting beyond the range of predictor variables
- NOT a good idea
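One way to see the danger is to fit a line to data that actually follow a curve. The data below are hypothetical (y = x squared, observed only for x from 0 to 5); the line does tolerably well inside that range and badly outside it.

```python
# Hypothetical data following a curve, observed only for x from 0 to 5.
xs = [0, 1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

n = len(xs)
avg_x = sum(xs) / n
avg_y = sum(ys) / n

# Regression line fitted to the observed range only.
slope = (sum((x - avg_x) * (y - avg_y) for x, y in zip(xs, ys))
         / sum((x - avg_x) ** 2 for x in xs))
intercept = avg_y - slope * avg_x


def predict(x):
    """Prediction from the fitted line."""
    return intercept + slope * x


inside = predict(3)     # within the observed range; the true value is 9
outside = predict(10)   # far beyond the observed range; the true value is 100
print("x = 3:  predicted", round(inside, 1), "true", 3 ** 2)
print("x = 10: predicted", round(outside, 1), "true", 10 ** 2)
```

The data give no warning of this, because the relationship beyond x = 5 was never observed.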
4. Multiple Regression
Using more than one independent variable to predict the dependent variable.
Example: Predict Y = son’s height
Using X_1 = father’s height, X_2 = mother’s height
Equation: $Y = m_1 X_1 + m_2 X_2 + b$
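A minimal sketch of fitting such an equation by least squares follows. The heights are made up for illustration (not real data), and the sons' heights are generated exactly from assumed coefficients m_1 = 0.5, m_2 = 0.3, b = 16 so that the fit can be checked against them.

```python
# Hypothetical heights in inches, made up for illustration (not real data).
fathers = [66, 68, 70, 72, 74, 69]
mothers = [62, 65, 63, 66, 64, 67]
# Sons' heights generated exactly from assumed coefficients, with no noise.
sons = [0.5 * f + 0.3 * m + 16 for f, m in zip(fathers, mothers)]


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of products of deviations from the averages."""
    avg_u, avg_v = mean(u), mean(v)
    return sum((a - avg_u) * (b - avg_v) for a, b in zip(u, v))


# Least squares with two predictors: solve the 2x2 normal equations
#   s11*m1 + s12*m2 = s1y
#   s12*m1 + s22*m2 = s2y
s11, s22, s12 = cross(fathers, fathers), cross(mothers, mothers), cross(fathers, mothers)
s1y, s2y = cross(fathers, sons), cross(mothers, sons)
det = s11 * s22 - s12 ** 2

m1 = (s1y * s22 - s2y * s12) / det
m2 = (s2y * s11 - s1y * s12) / det
# The fitted equation passes through the point of averages.
b = mean(sons) - m1 * mean(fathers) - m2 * mean(mothers)

print("m1 =", round(m1, 3), " m2 =", round(m2, 3), " b =", round(b, 3))
```

Because the hypothetical data contain no noise, the fit recovers the assumed coefficients. With real heights the points would scatter around the fitted equation.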