Chapter 8 Linear Regression.


1 Chapter 8 Linear Regression

2 Topics Linear Regression Residuals Coefficient of Determination

3 Regression Regression: fitting a model to two or more sets of paired data.
Goal: to relate two (or more) quantitative variables and describe the association between them.

4 Least-Squares Criterion
How should we choose a line of best fit for the data points? Least-squares regression line: the straight line that has the smallest possible sum of squared residuals. Residual: the difference between the actual y-value observed in the data and the predicted y-value given by the regression equation (residual = observed y − predicted y).
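The criterion above can be sketched numerically. The data below are made up for illustration (they are not the class measurements), and the helper names are assumptions; the point is only that moving the line away from the least-squares line increases the sum of squared residuals.

```python
# Made-up illustration data (not the class measurements from the course webpage).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def residuals(xs, ys, intercept, slope):
    """Observed y minus predicted y for the line y = intercept + slope * x."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def sum_squared_residuals(xs, ys, intercept, slope):
    """The quantity that the least-squares criterion minimizes."""
    return sum(e ** 2 for e in residuals(xs, ys, intercept, slope))


# For this data the least-squares line works out to y = 2.2 + 0.6x.
sse_best = sum_squared_residuals(xs, ys, 2.2, 0.6)

# Two other candidate lines: one with a shifted intercept, one with a steeper slope.
sse_shifted = sum_squared_residuals(xs, ys, 2.5, 0.6)
sse_steeper = sum_squared_residuals(xs, ys, 2.2, 0.7)

print(sse_best, sse_shifted, sse_steeper)
```

Both alternative lines give a larger sum of squared residuals than the least-squares line.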

5 Example Go to class measurements from the course webpage.

6 Regression Equation The regression equation for a set of n data points is $\hat{y} = b_0 + b_1 x$, where $b_1$ is the slope and $b_0$ is the y-intercept.
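A minimal sketch of how the slope and intercept of a least-squares line are usually computed from the data. The formulas used here (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) are the standard ones; the function name, the symbols, and the data are assumptions made for illustration and are not taken from the slides.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sxy: sum of products of deviations; Sxx: sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Made-up illustration data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(xs, ys)
print(f"y-hat = {b0:.2f} + {b1:.2f} x")
```

One consequence of the intercept formula is that the fitted line always passes through the point of means.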

7 Residual Plot The residual plot is the plot consisting of the x-values on the x-axis and the corresponding residuals on the y-axis.

8 Example In MLB in 2005, there was a correlation of r = 0.46 between the number of runs a team scored and the number of wins it earned. The average number of runs scored by a team was … with a standard deviation of …. The average number of wins was 81 with a standard deviation of …. The scatterplot shows the data to be fairly linear.

9 Example ctd. Write the equation of the regression line.
Explain what the y-intercept indicates. Interpret the slope of the regression line. Predict the number of wins for a team that scored 696 runs. How effective does it appear this line is for predicting the number of wins based on runs scored? As a team, would you rather have a positive or negative residual?
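When only summary statistics are available, as in this example, the regression line can be built from r, the two means, and the two standard deviations: slope = r · s_y / s_x and intercept = mean of y minus slope times mean of x. The sketch below uses r = 0.46 and the mean of 81 wins from the slide; the mean runs and both standard deviations are PLACEHOLDER values invented for illustration, not the 2005 MLB figures.

```python
def line_from_summary(r, x_mean, x_sd, y_mean, y_sd):
    """Return (intercept, slope) of the regression line from summary statistics."""
    slope = r * y_sd / x_sd
    intercept = y_mean - slope * x_mean
    return intercept, slope


R = 0.46          # correlation given on the slide
WINS_MEAN = 81.0  # mean wins given on the slide

# PLACEHOLDERS: hypothetical values, NOT the real 2005 MLB statistics.
RUNS_MEAN = 700.0
RUNS_SD = 50.0
WINS_SD = 10.0

intercept, slope = line_from_summary(R, RUNS_MEAN, RUNS_SD, WINS_MEAN, WINS_SD)


def predict_wins(runs):
    """Predicted wins for a team scoring the given number of runs."""
    return intercept + slope * runs


r_squared = R ** 2  # fraction of the variation in wins accounted for by runs
print(f"wins-hat = {intercept:.2f} + {slope:.4f} * runs, r^2 = {r_squared:.4f}")
```

Whatever the actual standard deviations are, r² = 0.46² ≈ 0.21, so the line accounts for only about 21% of the variation in wins.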

10 Outlier Example Select States Data from the course webpage.

11 Criterion for Finding a Regression Line
1) Straight Line Criterion: Before finding a regression line for a set of data points, draw a scatter diagram. If the data points do not appear to be scattered about a straight line, do not determine a regression line.

12 Criterion ctd. 2) Quantitative Variable Condition: We only use regression on sets of quantitative data. Ex) We will not look for associations between favorite lunch and favorite TV show using regression.

13 Criterion ctd. 3) Outlier Condition: A single outlier, high-leverage point, or influential observation can make the correlation appear stronger or weaker than it actually is.
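A small sketch of the outlier condition, using made-up data (not the States data from the course webpage). Adding one far-away point that falls in line with the trend inflates r, while one far-away point that falls off the trend can even flip its sign.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up base data with a moderately strong positive association.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_base = correlation(xs, ys)

# One extra high-leverage point that follows the trend.
r_in_line = correlation(xs + [20], ys + [20])

# One extra high-leverage point far from the trend.
r_off_line = correlation(xs + [20], ys + [0])

print(round(r_base, 3), round(r_in_line, 3), round(r_off_line, 3))
```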

14 Example Open the Fathom Data File Baby Weight from the course webpage.
Construct the scatterplot. Do the age and weight appear to be correlated? Find the regression line. Create the residual plot. Estimate the baby’s weight at 5 weeks. Estimate the baby’s weight at 1 year. Estimate the age when the baby will weigh 20 pounds.

15 Residuals ctd. One way to think about the fit of a model is to ask “what does the model miss?” The residual plot helps to show this. Select the Fathom Data File “Cricket Chirps” from the Course Webpage.
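To illustrate what a residual plot can reveal, the sketch below fits a line to made-up curved data (not the Cricket Chirps file) and lists the (x, residual) pairs that would be plotted. The helper name is an assumption.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


# Made-up data that actually follow a curve (y = x squared).
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

b0, b1 = fit_line(xs, ys)

# Points of the residual plot: x on the horizontal axis, residual on the vertical axis.
residual_points = [(x, y - (b0 + b1 * x)) for x, y in zip(xs, ys)]

for x, e in residual_points:
    print(x, e)
```

The residuals are positive at both ends and negative in the middle. A curved pattern like this in the residual plot signals that a straight line misses part of the relationship, even though the residuals still sum to zero.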

16 Coefficient of Determination
r² is known as the coefficient of determination. What values can r² take? r² gives the percentage of the variation in the response variable that is explained by the linear relationship with the explanatory variable.
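A minimal sketch of the coefficient of determination on made-up data. It computes r² as 1 − SSE/SST (the share of total variation not left in the residuals) and checks that this matches the square of the correlation coefficient; the variable names are assumptions.

```python
import math

# Made-up illustration data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
sxx = sum((x - x_mean) ** 2 for x in xs)
syy = sum((y - y_mean) ** 2 for y in ys)  # SST: total variation in y

slope = sxy / sxx
intercept = y_mean - slope * x_mean

# SSE: variation left over in the residuals after fitting the line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - sse / syy
r = sxy / math.sqrt(sxx * syy)

print(round(r_squared, 4), round(r ** 2, 4))
```

Because SSE can be no smaller than 0 and, for the least-squares line, no larger than SST, r² always lies between 0 and 1.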

17 Used Toyota Corollas The Fathom Data File Corolla Prices lists the sale value for 17 different Toyota Corollas. A) Find a regression line for the price as a function of age. B) Interpret the slope and y-intercept. C) What is the coefficient of determination? Interpret this value. D) What is the correlation coefficient? E) Predict the age at which a Corolla will have a value of $2000.
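Part E asks for an x-value given a target y-value, which means solving the regression equation for x. The sketch below shows the arithmetic with a HYPOTHETICAL line (intercept 12000, slope −1000 dollars per year); these coefficients are invented and are not the fit to the Corolla Prices file.

```python
def solve_for_x(intercept, slope, target_y):
    """Solve target_y = intercept + slope * x for x."""
    if slope == 0:
        raise ValueError("a horizontal line never reaches a different y-value")
    return (target_y - intercept) / slope


# HYPOTHETICAL coefficients for illustration only (not from the data file).
HYP_INTERCEPT = 12000.0  # predicted price of a brand-new car, in dollars
HYP_SLOPE = -1000.0      # predicted change in price per year of age

age_at_2000 = solve_for_x(HYP_INTERCEPT, HYP_SLOPE, 2000.0)
print(age_at_2000)
```

If the resulting age falls outside the range of ages in the data, the answer is an extrapolation and should be treated with caution.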

18 Final Note on Regression
Linear regression is not the only type of regression. Other types include exponential, quadratic, polynomial, and logarithmic regression. Sometimes the data might appear to be linear based on the coefficient of determination, but you can get an even better fit with an exponential, polynomial, or other model. For this reason, be sure to analyze what you know to be true about the behavior of the data if you can. For example, population growth is frequently exponential or logistic in nature.
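A sketch of this point on made-up data that grow exponentially (y = 3 · 2^x). A straight line gives a fairly high r², yet fitting a line to ln(y) instead, which is one common way to fit an exponential model, recovers the data almost exactly. The helper names are assumptions.

```python
import math


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def sum_squared_errors(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions))


# Made-up exponential data: starts at 3 and doubles each step.
xs = [0, 1, 2, 3, 4, 5]
ys = [3 * 2 ** x for x in xs]

# Straight-line model fitted directly to y.
lin_b0, lin_b1 = fit_line(xs, ys)
lin_predictions = [lin_b0 + lin_b1 * x for x in xs]

# Exponential model y = a * growth**x, fitted as a line to ln(y).
log_b0, log_b1 = fit_line(xs, [math.log(y) for y in ys])
a = math.exp(log_b0)
growth = math.exp(log_b1)
exp_predictions = [a * growth ** x for x in xs]

y_mean = sum(ys) / len(ys)
sst = sum((y - y_mean) ** 2 for y in ys)
sse_linear = sum_squared_errors(ys, lin_predictions)
sse_exponential = sum_squared_errors(ys, exp_predictions)
r2_linear = 1 - sse_linear / sst

print(round(r2_linear, 3), round(sse_linear, 1), round(sse_exponential, 6))
```

The linear r² of roughly 0.82 could pass for a decent fit, but the exponential model leaves essentially no residual error on the same data.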

