1
Regression Wisdom Chapter 9
2
Underlying Assumption
When fitting a model to data, we assume all of the data come from the same group. Watch out for clusters or multimodal distributions. These can indicate subgroups in the data that may need to be analyzed separately.
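To see why subgroups matter, here is a minimal sketch on made-up data (the two groups and all numbers are hypothetical, chosen only for illustration). Within each group y falls as x rises, yet a single line fitted to the pooled data slopes upward.

```python
def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: two clusters, each with a downward trend.
group_a_x, group_a_y = [1, 2, 3], [3, 2, 1]
group_b_x, group_b_y = [6, 7, 8], [8, 7, 6]

slope_a = slope(group_a_x, group_a_y)          # trend inside group A
slope_b = slope(group_b_x, group_b_y)          # trend inside group B
slope_pooled = slope(group_a_x + group_b_x,    # trend when groups are mixed
                     group_a_y + group_b_y)

print(slope_a, slope_b, slope_pooled)
```

The pooled slope describes neither group, which is why a scatterplot showing clusters is a warning sign.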
3
Making Predictions Using the Linear Model
Interpolation: predicting a y value by putting an x value into the model, where the x value is within the range of the x data. The prediction is only as good as the model. The closer r is to +1 or -1, the more comfortable we are with the prediction.
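A minimal sketch of interpolation, assuming a small made-up data set (the numbers and the helper names `fit_line` and `predict` are illustrative, not from the slides). The helper fits a least-squares line and reports whether the requested x falls inside the observed range.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b1, b0

def predict(xs, ys, x_new):
    """Predict y at x_new and label the prediction by where x_new falls."""
    b1, b0 = fit_line(xs, ys)
    inside = min(xs) <= x_new <= max(xs)
    kind = "interpolation" if inside else "extrapolation"
    return b0 + b1 * x_new, kind

# Hypothetical data, roughly linear.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

y_hat, kind = predict(xs, ys, 3.5)    # 3.5 lies between 1 and 5
print(round(y_hat, 3), kind)
```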
4
Making a Prediction (pg 2)
Extrapolation: predicting a y value by putting in an x value that is outside the range of the x data. The prediction is on shaky ground because there is no evidence of what the relationship looks like outside the data. If you must extrapolate into the future, at least don’t believe the prediction will come true.
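A minimal sketch of how extrapolation can go wrong, assuming a made-up situation in which the true relationship is y = x squared but only x from 1 to 5 is observed. The fitted line is tolerable inside that range and far off outside it.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return b1, mean_y - b1 * mean_x

def true_y(x):
    # Assumed "true" curve, used only for this illustration.
    return x ** 2

xs = [1, 2, 3, 4, 5]
ys = [true_y(x) for x in xs]
b1, b0 = fit_line(xs, ys)

# Prediction error inside the data range (x = 3) and far outside it (x = 20).
error_inside = abs((b0 + b1 * 3) - true_y(3))
error_outside = abs((b0 + b1 * 20) - true_y(20))
print(error_inside, error_outside)
```

Nothing in the data from 1 to 5 reveals how badly the line fails at 20, which is the point of the warning.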
5
Outliers and Influential Points
Any point that stands away from the others can be called an outlier and deserves our attention. Points with extreme x values have high leverage and can greatly affect a regression line. Influential points are those that change the model substantially when they are removed. They can make weak correlations look strong and strong correlations look weak.
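A minimal sketch of one influential point making a weak correlation look strong. The data are made up: five points with almost no linear pattern, plus one hypothetical point far from the rest.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical cloud with little linear association.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 4, 2, 3]

r_without = correlation(xs, ys)
# Add a single point with an extreme x value (high leverage).
r_with = correlation(xs + [20], ys + [20])

print(round(r_without, 2), round(r_with, 2))
```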
6
IMPORTANT!
No matter how strong the relationship (r value), we can NOT state that the explanatory variable causes the response variable. Correlation does not imply causation. There may be lurking variables that are not accounted for and that relate the two variables. We do not have enough tools yet to talk about cause and effect.
7
The Must List
The relationship must be linear. Check residuals. Examine the largest residual. Don’t extrapolate unless required. Watch out for outliers. Examine them and their effects on the model. Watch out for high leverage and influential points. Consider running two different models, one with the high leverage points included and one with the high leverage points excluded.
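A minimal sketch of the "two models" check, on made-up data with one hypothetical high leverage point at x = 15. It fits the line with and without that point and lists the residuals of the full model.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return b1, mean_y - b1 * mean_x

# Hypothetical data: the last point has an extreme x value.
xs = [1, 2, 3, 4, 5, 15]
ys = [2, 4, 6, 8, 10, 5]

slope_with, intercept_with = fit_line(xs, ys)
slope_without, intercept_without = fit_line(xs[:-1], ys[:-1])

# Residual = observed y minus predicted y, using the model that includes every point.
residuals = [y - (intercept_with + slope_with * x) for x, y in zip(xs, ys)]
largest = max(residuals, key=abs)

print(round(slope_with, 3), round(slope_without, 3))
print([round(e, 2) for e in residuals], round(largest, 2))
```

In this example the high leverage point pulls the line toward itself, so its own residual is not the largest one. That is one reason to compare the two fits rather than rely on residuals alone.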
8
The Must List pg 2
Don’t automatically discount outliers. If you remove enough points you can get the data to fit anything. Some data just do not model well. When that happens, describe it and stop the analysis. Beware of lurking variables. We can not talk about cause and effect. Watch out for summary data. Means and medians are less variable than raw data, so a correlation based on them can look artificially strong.
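A minimal sketch of the summary data warning, using made-up values: five groups, each with five y values scattered around 2x. The correlation computed from the group means is much stronger than the one computed from the raw data.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical raw data: at each x, y = 2x plus a fixed spread of offsets.
offsets = [-6, -3, 0, 3, 6]
group_xs = [1, 2, 3, 4, 5]
raw_x = [x for x in group_xs for _ in offsets]
raw_y = [2 * x + d for x in group_xs for d in offsets]

# Summary data: one mean y per x value.
mean_y = [sum(2 * x + d for d in offsets) / len(offsets) for x in group_xs]

r_raw = correlation(raw_x, raw_y)
r_means = correlation(group_xs, mean_y)
print(round(r_raw, 2), round(r_means, 2))
```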