Published by Donna Park. Modified over 8 years ago.
1
TODAY we will:
- Review what we have learned so far about regression
- Develop the ability to use residual analysis to assess whether a model (LSRL) is appropriate for predictions
- Understand how the standard error (Se) is used in regression analysis
2
Review
- How to describe a scatterplot
- Correlation coefficient (r)
- Math vs. stats: equation of a line vs. LSRL
- Interpret slope and y-intercept
- What is a residual (or error)?
3
Review: How to describe a scatterplot
- Trend: positive or negative
- Form: linear or non-linear
- Strength: weak, moderate, or strong
Correlation coefficient (r): -1 ≤ r ≤ 1
- Strength: r close to 1 or -1 means a strong linear association; r close to 0 means a weak or no linear association
- Trend: positive association (as the x variable increases, the y variable also increases); negative association (as the x variable increases, the y variable decreases)
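As a rough illustration of how r behaves, here is a minimal sketch that computes the correlation coefficient from its definition. The function name `pearson_r` and the three small data sets are made up for this example; they are not from the presentation.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # perfectly linear, increasing
y_down = [10, 8, 6, 4, 2]   # perfectly linear, decreasing
y_curve = [4, 1, 0, 1, 4]   # clear pattern, but U-shaped rather than linear

r_up = pearson_r(x, y_up)        # 1.0
r_down = pearson_r(x, y_down)    # -1.0
r_curve = pearson_r(x, y_curve)  # 0.0
print(r_up, r_down, r_curve)
```

The U-shaped data give r = 0 even though x and y are clearly related, which is why r should only be read as a measure of linear association.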
4
Review: Math vs. Stats, equation of a line vs. LSRL
Math line: y = mx + b
Stats line (LSRL): ŷ = a + bx, where a is the y-intercept and b is the slope
5
Review: Interpret slope and y-intercept
Slope: for every one-unit increase in x, the predicted y increases (or decreases) on average by the slope.
Y-intercept: when x = 0, the predicted value of y is "a".
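To make the interpretation concrete, the sketch below fits a least-squares line with the usual textbook formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). The helper `fit_lsrl` and the data are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # y-intercept
    return a, b

# Hypothetical data, invented for illustration only
x = [0, 1, 2, 3, 4]
y = [1.0, 3.1, 4.9, 7.2, 8.8]

a, b = fit_lsrl(x, y)  # a is about 1.06, b is about 1.97
print(f"For every one-unit increase in x, predicted y changes by about {b:.2f}.")
print(f"When x = 0, the predicted y is about {a:.2f}.")
```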
6
Review: What is a residual (or error)?
Residual (error) = observed y value - predicted y value
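A minimal sketch of the residual calculation, assuming a hypothetical fitted line ŷ = 1 + 2x and made-up observations (neither comes from the presentation):

```python
# Hypothetical fitted line y-hat = 1 + 2x (assumed for illustration)
a, b = 1, 2

# Hypothetical observations
x = [0, 1, 2, 3]
y = [1.5, 2.5, 5.5, 6.5]

predicted = [a + b * xi for xi in x]
# residual = observed y - predicted y
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print(predicted)  # [1, 3, 5, 7]
print(residuals)  # [0.5, -0.5, 0.5, -0.5]
```

A positive residual means the point lies above the line; a negative residual means it lies below.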
7
Use Residual Analysis to assess if the model (LSRL) is appropriate for making predictions
8
Correlation, Linearity, and Outliers
- Only use linear correlation to interpret the data when there is a linear relationship.
- An outlier can strongly influence the correlation.
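The effect of a single outlier can be seen in a small sketch. The data below are invented: five points with no linear association, plus one extreme point. The helper `pearson_r` is a hypothetical name for the standard formula.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: no linear association among the first five points
x = [1, 2, 3, 4, 5]
y = [2, 3, 1, 3, 2]
r_without = pearson_r(x, y)

# Add one extreme point far from the rest
r_with = pearson_r(x + [20], y + [20])

print(f"r without outlier: {r_without:.2f}")
print(f"r with outlier:    {r_with:.2f}")
```

With the extreme point included, r jumps from essentially 0 to above 0.9, even though the other five points show no linear trend at all.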
9
Fitting a Model for Prediction (Fitting the LSRL for Prediction)
Deterministic vs. stochastic models: a stochastic model allows random variation (signal and noise).
Messages:
- All models are wrong, but some are useful.
- A model is not the reality.
- Residual analysis addresses directly the problem of signal and noise.
10
Signal and Noise
13
Types of Residual Plots
Different plots can highlight different departures or problems in the prediction model.
1) Residuals vs. fitted
2) Histogram
3) P-P plot
4) Residuals vs. order
Note: these plots are from software output (Minitab).
15
Residual vs. Fitted Value Plot
Three common defects may be revealed by plotting residuals vs. fitted values:
1) Outliers
2) Progressive change in the variance: a band of uniform width is what we want; a funnel shape means unequal variance (consider a transformation)
3) Inadequacy of the model: curvature suggests the wrong model; a linear trend going up suggests a wrong calculation
16
Residual vs. Fitted
17
Let's look at an example to see what a "well-behaved" residual plot looks like.
18
Scatterplot
Some researchers (Urbano-Marquez et al., 1989) were interested in determining whether or not alcohol consumption was linearly related to muscle strength. The researchers measured the total lifetime consumption of alcohol (x) on a random sample of n = 50 alcoholic men. They also measured the strength (y) of the deltoid muscle in each person's nondominant arm. A fitted line plot of the resulting data (alcoholarm.txt) looks like:
19
[Figures: scatterplot; residual plot (residuals vs. fitted)]
20
Let's look at an example to see what a "not so well-behaved" residual plot looks like.
21
What do you notice in this scatterplot?
[Figures: scatterplot with an outlier marked (axis label: foot length); residual plot of residuals vs. predicted (fitted) values]
22
[Figure: residuals vs. predicted (fitted) values]
23
Outlier removed
[Figure: residuals vs. predicted (fitted) values]
24
Let's look at an example to see what a "not well-behaved" residual plot looks like.
25
26
Heteroscedasticity
When the requirement of constant variance is violated, we have a condition of heteroscedasticity. Diagnose heteroscedasticity by plotting the residuals against the predicted y (ŷ).
[Figure: residuals vs. ŷ; the spread of the residuals increases with ŷ]
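Alongside the plot, one informal check is to compare the spread of the residuals for small fitted values with the spread for large fitted values. The sketch below is only a rough numerical companion to the residual plot, not a formal test; the function `spread_by_half` and the data are hypothetical.

```python
import statistics

def spread_by_half(fitted, residuals):
    """Standard deviation of residuals for the lower and upper half of fitted values."""
    pairs = sorted(zip(fitted, residuals))  # order points by fitted value
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.pstdev(low), statistics.pstdev(high)

# Hypothetical residuals whose spread grows with the fitted value (funnel shape)
fitted = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 2.0, -2.0]

low_sd, high_sd = spread_by_half(fitted, residuals)
ratio = high_sd / low_sd
print(f"spread (low fitted):  {low_sd:.3f}")
print(f"spread (high fitted): {high_sd:.3f}")
print(f"ratio: {ratio:.1f}")
```

A ratio far from 1, as in this made-up example, points to the funnel shape described above; a ratio near 1 is consistent with a band of uniform width.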
29
Signal and Noise
30
Residual plots: fitted vs. residuals
Homoscedasticity vs. heteroscedasticity
A residual plot is a scatterplot of the standardized residuals against the fitted values.
31
Let's look at an example to see what a "not well-behaved" residual plot looks like.
32
How does a non-linear regression function show up on a residual vs. fits plot? The answer: The residuals depart from 0 in some systematic manner, such as being positive for small x values, negative for medium x values, and positive again for large x values. Any systematic (non-random) pattern is sufficient to suggest that the regression function is not linear.
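The sign pattern described here can be reproduced with a small sketch: fit a straight line to data that actually follow a curve and look at the residuals. The data (y = x squared) and the helper `fit_lsrl` are assumptions made for illustration.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical curved data: y = x squared
x = [0, 1, 2, 3, 4]
y = [xi ** 2 for xi in x]

a, b = fit_lsrl(x, y)  # a = -2.0, b = 4.0
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
signs = ["+" if r > 0 else "-" for r in residuals]

print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
print(signs)      # ['+', '-', '-', '-', '+']
```

The residuals are positive at both ends and negative in the middle, a systematic pattern that signals the straight line is the wrong model for these data.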
37
2) The random errors are normally distributed and centered at zero
Histograms and P-P plots check the normality assumption. A histogram shows whether the residuals are centered at zero and bell shaped. Q-Q plots are better for judging the normal shape, because histogram bins can be manipulated and the normal shape may therefore be difficult to see in some cases.
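The points of a normal Q-Q plot can be computed by hand: sort the residuals and pair each one with the standard normal quantile for its position. The sketch below uses the plotting positions (i + 0.5) / n, which is one common convention among several; software such as Minitab may use a slightly different one. The residuals are made up.

```python
from statistics import NormalDist

def normal_qq_points(residuals):
    """Pair each sorted residual with a standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i + 0.5) / n: an assumed convention, not the only one
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical residuals, invented for illustration only
residuals = [-1.2, 0.3, 0.8, -0.4, 0.1, -0.6, 1.1, -0.1]

points = normal_qq_points(residuals)
for quantile, residual in points:
    print(f"{quantile:6.2f}  {residual:5.2f}")
```

If the residuals are roughly normal, these points fall close to a straight line when plotted; strong bends or stray points at the ends suggest skewness or outliers.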
38
Histograms of residuals
What to look for? Centered at zero, bell shaped, no outliers.
How strict?
What does it mean when the histogram is skewed?
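Two quick numbers can back up what the histogram shows: the mean of the residuals (should be near zero) and a skewness coefficient (near zero for a symmetric shape, positive for a long right tail). This is a sketch using the simple moment-based skewness; the function name and both data sets are hypothetical.

```python
def mean_and_skewness(residuals):
    """Mean and moment-based skewness (third moment / second moment ** 1.5)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    return mean, m3 / m2 ** 1.5

# Hypothetical residuals, invented for illustration only
symmetric = [-2, -1, 0, 1, 2]
right_skewed = [-1, -1, -1, -1, 4]  # one large positive residual

mean_sym, skew_sym = mean_and_skewness(symmetric)         # 0.0, 0.0
mean_right, skew_right = mean_and_skewness(right_skewed)  # 0.0, 1.5
print(mean_sym, skew_sym)
print(mean_right, skew_right)
```

Both sets are centered at zero, but only the first is symmetric; the second has the long right tail that a skewed histogram would show.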
40
R, R-squared, Se: four-in-one residual plots
42
Look at this graph: are the residuals normal?
45
Here's the corresponding normal probability plot of the residuals:
48
Residuals vs. order plot
A "residuals vs. order plot" is a way of detecting a particular form of non-independence of the error terms, namely serial correlation. If the data are obtained in a time (or space) sequence, a residuals vs. order plot helps to see if there is any correlation between the error terms that are near each other in the sequence. The plot is only appropriate if you know the order in which the data were collected! Highlight this, underline this, circle this,..., er, on second thought, don't do that if you are reading it on a computer screen. Do whatever it takes to remember it though; it is a very common mistake made by people new to regression analysis. So, what is this residuals vs. order plot all about? As its name suggests, it is a scatter plot with residuals on the y axis and the order in which the data were collected on the x axis. Here's an example of a well-behaved residuals vs. order plot:
49
Residuals vs. Order
The residuals bounce randomly around the residual = 0 line, as we would hope. In general, residuals exhibiting normal random noise around the residual = 0 line suggest that there is no serial correlation.
50
Residuals vs. Order
A residuals vs. order plot may instead exhibit a (positive) trend, which suggests that the error terms are serially correlated.
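A common numerical companion to the residuals vs. order plot is the Durbin-Watson statistic: the sum of squared differences between consecutive residuals divided by the sum of squared residuals. Values near 2 are consistent with no serial correlation, and values well below 2 suggest positive serial correlation. The statistic is not mentioned in the presentation; it is brought in here as an illustration, and the residual sequences are made up. The residuals must be listed in the order the data were collected.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in collection order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals in collection order, invented for illustration only
trending = [-2, -1, 0, 1, 2]        # steady upward drift over time
bouncing = [1, -1, -1, 1, 1, -1]    # no drift; bounces around zero

dw_trending = durbin_watson(trending)  # 4 / 10 = 0.4
dw_bouncing = durbin_watson(bouncing)  # 12 / 6 = 2.0
print(dw_trending, dw_bouncing)
```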
51
R-squared (R²) and residual standard error (Se)
Residual analysis is more important than a high R².
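The sketch below computes both quantities with the usual simple-regression formulas: R² = 1 - SSE / SST and Se = sqrt(SSE / (n - 2)). The data are hypothetical: y = x squared, with fitted values from the least-squares line ŷ = -2 + 4x for those points.

```python
import math

def r_squared_and_se(ys, fitted):
    """R-squared and residual standard error for a simple linear regression."""
    n = len(ys)
    mean_y = sum(ys) / n
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation
    sst = sum((y - mean_y) ** 2 for y in ys)             # total variation
    r2 = 1 - sse / sst
    se = math.sqrt(sse / (n - 2))  # n - 2: two estimated coefficients
    return r2, se

# Hypothetical curved data (y = x squared) with a straight-line fit
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]
fitted = [-2 + 4 * xi for xi in x]  # least-squares line for these points

r2, se = r_squared_and_se(y, fitted)
print(f"R-squared: {r2:.3f}")
print(f"Se:        {se:.3f}")
```

R² comes out above 0.9 even though the true relationship is curved, so a high R² alone does not show that the line is an appropriate model; the residuals still have to be checked.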
52
Residual Activity https://www.causeweb.org/repository/StarLibrary/activities/miller2001/