Bellwork (Why do we want scattered residual plots?): 10/2/15 I feel like I didn’t explain this well, so this is residual plots redux! Copy down these two scatterplots
3.2B&C LSRL: stdev of the Residuals & r²
If she loves you more each and every day, by linear regression she hated you before you met.
Objectives: CALCULATE residuals CONSTRUCT and INTERPRET residual plots DETERMINE how well a line fits observed data
A scatterplot of home price in thousands of dollars vs. home size in thousands of square feet shows a relatively linear, positive association with r = 0.85. There are no unusual points and the residual plot shows random scatter.
1. If a home's size is one stdev above the mean home size, how many stdev above the mean would you expect the sale price to be?
So, I would expect its sale price to be 0.85 stdev above the mean.
2. What would you predict about the sale price of a home 2 SD below the average size?
So, I would expect its sale price to be 1.70 stdev below the mean.
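Both answers come from the same rule: on the least-squares line, the predicted response is r standard deviations from its mean for every one standard deviation the explanatory value is from its mean. Here is a minimal sketch of that arithmetic, assuming r = 0.85 (the value the two answers above imply); the function name is made up for illustration.

```python
def predicted_z_y(r, z_x):
    """Predicted number of stdevs from the mean of y,
    given x is z_x stdevs from the mean of x (LSRL rule: z_y = r * z_x)."""
    return r * z_x

r = 0.85  # assumption: the correlation implied by the answers on these slides

print(predicted_z_y(r, 1))   # home size 1 stdev above the mean -> 0.85
print(predicted_z_y(r, -2))  # home size 2 stdev below the mean -> -1.7
```

Notice the prediction is always closer to the mean (in stdev units) than the home size is, because |r| is less than 1.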
Consider the linear regression: 3. What are the units of the slope?
Consider the linear regression: 4. Interpret the slope.
Consider the linear regression: 5. By how much would the value of my home increase after the addition of 500 sq ft?
Consider the linear regression: 6. How much would you expect to pay for a 3000 sq ft home?
Homework Comments If form isn’t linear, what is it? Correlation vs Association Form, strength (r), direction, outliers Only use correlation/scatter plots for quantitative data r has no units
Let’s go back to the Handspan vs. Height activity
r²
Computer Output
BAC
Computer Output
Scatterplot and Residual Plot
Questions
Definition: If we use a least-squares regression line to predict the values of a response variable y from an explanatory variable x, the standard deviation of the residuals (AKA s) is given by: $s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}} = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$ (Whaaat?!)
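If the formula looks scary, it may help to see it as a recipe: find each residual (observed y minus predicted y), square them, add them up, divide by n − 2, and take the square root. The sketch below follows that recipe. The five data points and the helper names are made up for illustration; they are not from the Handspan vs. Height activity.

```python
from math import sqrt

def lsrl(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def residual_stdev(xs, ys):
    """Standard deviation of the residuals: sqrt(sum of residual^2 / (n - 2))."""
    a, b = lsrl(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return sqrt(sum(e ** 2 for e in residuals) / (len(xs) - 2))

# Hypothetical data, chosen only to keep the arithmetic small
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s = residual_stdev(xs, ys)
print(round(s, 3))  # 0.894: predictions are typically off by about 0.89 units of y
```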
The standard deviation of the residuals gives us a numerical estimate of the typical size of our prediction errors (AKA residuals.) There is another numerical quantity that tells us how well the least-squares regression line predicts values of the response y.
Definition: The coefficient of determination r² is the fraction of the variation in the values of y that is accounted for by the least-squares regression line of y on x. We can calculate r² using the following formula: $r^2 = 1 - \dfrac{SSE}{SST}$, where $SSE = \sum \text{residual}^2$ and $SST = \sum (y_i - \bar{y})^2$. (Whaaat?!)
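One way to make sense of r² is to compare two sums of squared errors: SSE, the error left over when we predict with the regression line, and SST, the error when we just predict the mean of y every time. Then r² = 1 − SSE/SST. The sketch below works that out by hand; the data and the function name are hypothetical, picked only so the numbers come out cleanly.

```python
def r_squared(xs, ys):
    """Coefficient of determination via r^2 = 1 - SSE/SST."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Least-squares slope and intercept
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    a = y_bar - b * x_bar

    # SSE: squared errors around the regression line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # SST: squared errors around the mean of y (the "mean model")
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys)
print(round(r2, 3))  # 0.6: the line accounts for 60% of the variation in y
```

For these points SSE = 2.4 and SST = 6, so the line removes 60% of the error we would have had using the mean alone.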
NL Statistics for MLB
Team            Games Won    Runs Scored
Atlanta         (not shown)  (not shown)
Chicago Cubs    84           738
Cincinnati      73           722
Colorado        67           758
Florida         64           581
Houston         85           716
LA              81           675
Montreal        94           732
NY Mets         59           672
Philly          97           877
Pittsburgh      75           707
San Diego       61           679
San Fran        (not shown)  (not shown)
St. Louis       (not shown)  (not shown)
Miles driven and the price of a used Honda CR-V (table columns: Miles Driven, Price; data not shown)
Figure: one data value (x, y), its error w.r.t. the mean model, and its error under the new model. Call the error w.r.t. the mean model 10 units! If the new model eliminates 8 of those units, then the proportion of error eliminated by the new model for this data point = 8/10 = 0.8. Conceptually, if we computed a proportion in the same way for each data point and combined them sensibly, we would end up with r². r² is the proportion of error (variability) in the response variable (y) accounted for by the given model (w.r.t. the mean model).
Bellwork: 10/5/15 Describe what each of these residual plots is telling us about their linear regression
Homework Comments Define your variables. #37: “The slope is which means that the typical highway gas mileage increases on average by mpg for each 1 mpg increase in city mileage.” “An increase in city mileage of 1 mpg is associated with a predicted increase of mpg on average in highway mileage” Reference residual plot CONTEXT!! Meaning!
FRQ Practice Sewing machines Study time