Residuals.

2nd Day: Bear Example

1) Use the data in the table to find the value of the linear correlation coefficient r.

Length (in)   Weight (lb)
53            80
67.5          344
72            416
72            348
73.5          262
68.5          360
73            332
37            34

2) Based on these data, does there appear to be a relationship between the length of a bear and its weight? If so, what is the relationship? Comment on the form, direction, and strength.
3) Find the residuals and make a residual plot. Is the LSRL a good model for the data? Why?
4) If a researcher anesthetizes a bear and uses a tape measure to find that it is 71 inches long, how do we use that length to predict the bear's weight?

Residual = observed y − predicted y. A residual plot graphs the residuals on the vertical axis against the explanatory variable on the horizontal axis. The plot magnifies the residuals and makes patterns easier to see. The mean of the least-squares residuals is always zero.
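The slides do this work on a TI-83; the same residual computation can be sketched in Python. The data below are hypothetical, chosen only to illustrate the arithmetic.

```python
# residual = observed y - predicted y, for a least-squares fit.
# Hypothetical data, just to illustrate the computation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Residuals: these are what a residual plot graphs against x.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / n  # always (numerically) zero
```

Printing `mean_residual` confirms the slide's claim: for a least-squares fit it is zero up to floating-point rounding.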

Residual Plot
The line y = 0 helps orient us.
TI-83: enter the data from the table on p. 234, find the vital statistics, then find the residuals for the data.

Coefficient of Determination
A numerical quantity that tells us how well the LSRL predicts values of y. r² has two components (SSM and SSE). It shows us how much better the LSRL is at predicting y than if we just used ȳ as our prediction for every point. If we have little information for predicting y (or if r is weak), we use ȳ as a predictor of y instead of ŷ.

Example
Data set: x = 0, 3, 6; y = 0, 10, 2
Association between x and y: positive, but weak. x̄ = 3, ȳ = 4.
Some would use ȳ as a predictor of y, since r = .1890 (weak!) and we have little information for predicting y.
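As a check on this slide's numbers, the summary statistics can be reproduced with a short Python sketch (not part of the original lesson, which uses a calculator):

```python
import math

xs = [0, 3, 6]
ys = [0, 10, 2]

n = len(xs)
x_bar = sum(xs) / n  # 3.0
y_bar = sum(ys) / n  # 4.0

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

# Correlation coefficient.
r = sxy / math.sqrt(sxx * syy)
print(round(r, 4))  # 0.189
```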

SSM (Sum of Squares about the Mean)
Measures the total variation of the y-values if we use ȳ to make predictions. Here ȳ = 4, and the total area of these 3 squares is a measure of the total sample variability.

x     y     (y − ȳ)²
0     0     16
3     10    36
6     2     4

SSM = 16 + 36 + 4 = 56

Sum of Squares for Error (SSE)
The sum of the squares of the deviations of the points about the LSRL. If x is a good predictor of y, then the deviations and SSE will be small. If all the points fall exactly on a regression line, SSE = 0.
LSRL: ŷ = 3 + (1/3)x, with y-intercept 3 and passing through (x̄, ȳ) = (3, 4) (always the case).

x     y     (y − ŷ)²
0     0     9
3     10    36
6     2     9

SSE = 9 + 36 + 9 = 54

Coefficient of Determination
The difference SSM − SSE measures the amount of variation of y that can be explained by the regression line of y on x. The ratio r² = (SSM − SSE)/SSM is the proportion of the total sample variability that is explained by the least-squares regression of y on x. For data set A, r² = (56 − 54)/56 = .0357. That is, only 3.57% of the variation in y is explained by the least-squares regression of y on x. Check with the calculator.

Points to Remember
If x is a poor predictor of y, then SSM and SSE are about the same. In our example, SSM = 56 and SSE = 54, so the LSRL is a poor prediction line.

Understanding Regression
When you report a regression, r² is a measure of how successful the regression was in explaining the response (y). When you see a correlation, square it to get a better feel for the strength of the association. Perfect correlation means r² = 1, so 100% of the variation in one variable is accounted for by the linear relationship with the other variable. If r = −.7 or +.7, then r² = .49 and about half the variation is accounted for by the linear relationship.
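Squaring the correlations mentioned above makes the point concrete (a quick illustrative sketch):

```python
# r-squared for the correlations discussed on this slide.
for r in (1.0, 0.7, -0.7, 0.189):
    print(r, "->", round(r * r, 4))
# 1.0 -> 1.0 (all variation explained), +/-0.7 -> 0.49 (about half),
# 0.189 -> 0.0357 (very weak, as in the earlier example)
```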

Another Example

3.3: Correlation and Regression Wisdom (Residual Plot: Helps identify outliers)

Outlier vs. Influential Point
Child 19 is an outlier: it has a large residual, but it doesn't affect the regression line much because other points with similar x-values are nearby. Child 18 is an influential point: it has a small residual (it lies close to the line) but is far out in the x-direction, so it has a strong influence on the regression line.

Misc.
Not all outliers are influential. The LSRL is most likely to be heavily influenced by observations that are outliers in the x direction. Influential points often have small residuals, since they pull the LSRL toward themselves. To check a point: find the LSRL with and without the suspect point. If the line moves more than a small amount, the point is influential.
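The "fit with and without the suspect point" check can be sketched in Python. The data here are made up: five points near a line plus one point far out in the x direction.

```python
# Influence check: refit the LSRL with and without a suspect point.
def lsrl(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x, _ in points))
    return slope, y_bar - slope * x_bar

base = [(1, 2.0), (2, 2.9), (3, 4.1), (4, 5.0), (5, 6.1)]
suspect = (20, 2.0)  # an outlier in the x direction

slope_without, _ = lsrl(base)
slope_with, _ = lsrl(base + [suspect])
# The slope moves dramatically, so the suspect point is influential.
print(round(slope_without, 2), round(slope_with, 2))  # 1.03 -0.07
```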

Strong positive linear association: the correlation is r = .9749. Since r² = .9504, the regression of y on x will explain 95% of the variation in the values of y.

The AP Statistics exam was first administered in May 1997 to the largest first-year group in any discipline in the AP program. Since that time, the number of students taking the exam has grown at an impressive rate. Here are the actual data. Begin by entering them into your calculator lists.

Year    # students
1997    7,667
1998    15,486
1999    25,240
2000    34,118
2001    40,259
2002    49,824
2003    58,230
2004    65,878
2005    76,786

1. Use your calculator to construct a scatterplot of these data, using 1997 as Year 1, 1998 as Year 2, etc. Describe what you see.
2. Find the equation of the least-squares line on your calculator. Record the equation below. Be sure to define any variables used.
3. Interpret the slope of the least-squares line in context.
4. How many students would you predict took the AP Statistics exam in 2006? Show your method.
5. Construct a residual plot. Sketch it in the space below. Comment on what the residual plot tells you about the quality of your linear model.
6. Interpret the value of r² from your calculator in the context of this problem.
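For readers without a TI-83 at hand, questions 2 and 4 can be sketched in Python using the by-hand least-squares formulas (this is a check, not the calculator output the exercise asks for):

```python
# Fit the LSRL to the AP Statistics exam counts, with 1997 coded as Year 1.
years = list(range(1, 10))  # Year 1 = 1997, ..., Year 9 = 2005
counts = [7667, 15486, 25240, 34118, 40259, 49824, 58230, 65878, 76786]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(counts) / n

slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(years, counts))
         / sum((x - x_bar) ** 2 for x in years))
intercept = y_bar - slope * x_bar

# Question 4: predict the 2006 count (Year 10) from the fitted line.
pred_2006 = intercept + slope * 10
print(round(slope, 1), round(pred_2006))
```

The slope, roughly 8,500 students per year, is the quantity question 3 asks you to interpret in context.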