Unit 3 – Association: Contingency, Correlation, and Regression Lesson 3-3 Linear Regression, Residuals, and Variation.

3-3 Learning Objectives 1) Regression Line 2) Regression Equation 3) Residuals 4) Calculating the Regression Line/Equation 5) Slope vs. Correlation 6) Squared Correlation (Variation)

Objective 1: REGRESSION LINE A regression line predicts the value of the response variable (y) for any given value of the explanatory variable (x). In previous math classes we called it the line of best fit, because it is the best possible line to fit the given data. FYI: The line itself is calculated using the method of 'least squares', a calculation that minimizes the sum of the squared residuals (the differences between each observed y value and the y value the line predicts).
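To make the 'least squares' idea concrete, here is a minimal sketch in Python. The data points and the competing line are made up for illustration and are not from the lesson. It fits the least-squares line and shows that a slightly different line has a larger sum of squared residuals.

```python
# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared vertical gaps between each point and the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Return (a, b), the intercept and slope of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


a, b = least_squares_line(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)

# A nearby line chosen by hand (hypothetical) for comparison.
other = sum_squared_residuals(2.0, 0.7, xs, ys)

print(f"least-squares line: y-hat = {a:.2f} + {b:.2f}x")
print(f"sum of squared residuals: {best:.2f} (least squares) vs {other:.2f} (other line)")
```

Any other choice of intercept and slope should give a sum at least as large as the least-squares one.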

Objective 1: REGRESSION LINE EXAMPLE A: What is the predicted value of y when x = 45? y = 200 (approx.) NOT ON GUIDED NOTES!! The regression line runs through the calculated point $(\bar{x}, \bar{y})$, the mean of the x values and the mean of the y values.

Objective 2: REGRESSION EQUATION The regression equation is the exact equation of the regression line. It tells us the predicted value of y for any value of x: $\hat{y} = a + bx$, where $\hat{y}$ ('y hat') is the predicted value of y, $a$ is the y-intercept, and $b$ is the slope. Think about it: How does this differ from a similar equation you have learned in previous math courses?

Objective 2: REGRESSION EQUATION The y-intercept (a): Tells us the predicted value for y when x = 0. Helps in plotting the line (to see where it starts). May not have any interpretative value if no observations had x values near 0. The slope (b): Gets multiplied by the x value. Measures the change in the predicted value of the response variable (y) for a 1 unit increase in the explanatory variable (x).

Objective 2: REGRESSION EQUATION Example B: Anthropologists predict the height of a human using their remains with the following regression equation: $\hat{y} = 61.4 + 2.4x$. Understand the variables: $\hat{y}$ is the predicted height and $x$ is the length of a femur (thighbone), both measured in centimeters. Interpret the slope in the context: A 1 cm increase in femur length results in a 2.4 cm increase in predicted height. Interpret the y-intercept in the context: A person with a femur length of 0 cm is predicted to be 61.4 cm tall. See, sometimes the y-intercept has no contextual value!

Objective 2: REGRESSION EQUATION Example B: Anthropologists predict the height of a human using their remains with the following regression equation. Use the regression equation to predict the height of a person whose recovered femur length was 50 centimeters: $\hat{y} = 61.4 + 2.4(50) = 181.4$ cm. Since 1 cm is about 0.0328 feet, 181.4(0.0328) is about 5.95 feet, or roughly 5 feet 11 inches.
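The prediction in Example B can be sketched in a few lines of Python. This assumes the femur equation is y-hat = 61.4 + 2.4x, which matches the 181.4 cm prediction for a 50 cm femur. The function name `predict_height_cm` is a hypothetical helper, not something from the lesson.

```python
# Assumed coefficients: intercept 61.4 and slope 2.4, chosen because they
# reproduce the 181.4 cm prediction for a 50 cm femur in Example B.
INTERCEPT_CM = 61.4
SLOPE = 2.4
CM_PER_INCH = 2.54


def predict_height_cm(femur_cm):
    """Predicted height (cm) from femur length (cm): y-hat = a + b*x."""
    return INTERCEPT_CM + SLOPE * femur_cm


height_cm = predict_height_cm(50)

# Convert centimeters to feet and inches for a more familiar reading.
total_inches = height_cm / CM_PER_INCH
feet, inches = divmod(total_inches, 12)

print(f"predicted height: {height_cm:.1f} cm")
print(f"that is about {int(feet)} feet {inches:.0f} inches")
```

Converting through inches (2.54 cm per inch) avoids the rounding drift that comes from using a rounded feet-per-centimeter factor.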

Objective 2: REGRESSION EQUATION So regression lines and equations allow us to predict a single value of the response variable. But we should not expect all subjects at that value of x to have the same value of y. Variability occurs in the y values (levels of change). It's only a prediction!

Objective 3: RESIDUALS A single regression line/equation is used to represent the relationship between many data points, so some of the actual data will fall above the regression line and some below it. We call the difference between where a value is and where the regression equation says it should be its residual. (A residual is found by taking the real y value and subtracting the predicted y value: residual = $y - \hat{y}$.)

Objective 3: RESIDUALS A few things to note about residuals: A residual measures the size of the prediction error. (Copy this plot onto your notes.) Is there error between the points and the line?

Objective 3: RESIDUALS A few things to note about residuals: It is the vertical distance between the point and the regression line. Some points are above, some below, at various distances.

Objective 3: RESIDUALS A few things to note about residuals: Each and every observation has a residual. All 5 points have one here. Is it possible for a point to have a residual of 0?

Objective 3: RESIDUALS A few things to note about residuals: A large residual indicates an unusual observation. Which point has the largest residual? What would we call it?

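Objective 3: RESIDUALS A few things to note about residuals: All residuals when summed will equal zero, because the positive residuals (points above the line) and the negative residuals (points below the line) cancel out.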

Objective 3: RESIDUALS Example C: From earlier we saw the predicted value for a person with femur bone length = 50 cm was 181.4 cm. If an actual person who had a 50 cm femur bone had a height of 192 cm, what is their residual? 192 - 181.4 = 10.6 cm. They were 10.6 cm taller than was predicted.
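As a quick check of the residual ideas above, the sketch below computes the Example C residual (observed 192 cm, predicted 181.4 cm). It then fits a least-squares line to a small made-up data set to show that the residuals of such a line sum to zero, apart from floating-point rounding. The data set and the helper names are hypothetical.

```python
def residual(observed_y, predicted_y):
    """Residual = observed y minus predicted y (positive means above the line)."""
    return observed_y - predicted_y


# Example C: actual height 192 cm, predicted height 181.4 cm.
femur_residual = residual(192, 181.4)
print(f"Example C residual: {femur_residual:.1f} cm")

# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Fit the least-squares line y-hat = a + b*x.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# One residual per observation; some positive, some negative.
residuals = [residual(y, a + b * x) for x, y in zip(xs, ys)]
print("residuals:", [round(r, 2) for r in residuals])
print("sum of residuals (should be essentially 0):", round(sum(residuals), 10))
```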

Objective 4: CALCULATING REGRESSION Your objective is to take all the given information and convert it into a prediction/regression equation. Slope: $b = r \left( \frac{s_y}{s_x} \right)$. Take the standard deviation of y divided by the standard deviation of x, then multiply by r, which is the correlation of the data. Y-intercept: $a = \bar{y} - b\bar{x}$. Take b (from the formula above) and multiply by the mean of the x values, then subtract the result from the mean of the y values.
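The two steps above (slope from r and the standard deviations, then intercept from the means) can be sketched as a small Python function. The summary values used here are invented for illustration, and `regression_from_summary` is a hypothetical name.

```python
def regression_from_summary(x_bar, y_bar, s_x, s_y, r):
    """Return (a, b) for y-hat = a + b*x from summary statistics.

    b = r * (s_y / s_x)      slope
    a = y_bar - b * x_bar    y-intercept
    """
    b = r * (s_y / s_x)
    a = y_bar - b * x_bar
    return a, b


# Hypothetical summary statistics, made up for illustration only.
x_bar, y_bar = 10.0, 50.0
s_x, s_y = 2.0, 8.0
r = 0.5

a, b = regression_from_summary(x_bar, y_bar, s_x, s_y, r)
print(f"y-hat = {a:.1f} + {b:.1f}x")

# The line always passes through the point (x_bar, y_bar).
print("prediction at the mean of x:", a + b * x_bar)
```

Because the intercept is defined as the mean of y minus b times the mean of x, plugging the mean of x back into the equation returns the mean of y.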

Objective 4: CALCULATING REGRESSION Example D: Generate the regression equation with the given information (pure math example) = =

Objective 4: CALCULATING REGRESSION Using a calculator to find regression equations.

Objective 5: SLOPE VS CORRELATION
CORRELATION: Describes the strength of the linear association. Does not change when the units change (stuck between -1 and 1). Does not depend upon which variable is the response and which is the explanatory (flipping them won't change the strength). Does not have specific predictive power.
SLOPE: Does not tell us the strength of the association (tells us the rate of change). Changes based on the units. The two variables must be identified as response and explanatory. Used to predict values of the response variable for given values of the explanatory variable.
BOTH: Give us the direction of the association (positive or negative).
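The contrast between slope and correlation can be checked numerically. The sketch below uses made-up femur and height values (not data from the lesson). It shows that converting x from centimeters to inches changes the slope but not r, and that swapping the explanatory and response variables changes the slope but not r.

```python
from math import sqrt


def _sums(xs, ys):
    """Return (s_xy, s_xx, s_yy): sums of cross-products and squared deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy, s_xx, s_yy


def correlation(xs, ys):
    """Correlation r between xs and ys."""
    s_xy, s_xx, s_yy = _sums(xs, ys)
    return s_xy / sqrt(s_xx * s_yy)


def slope(xs, ys):
    """Least-squares slope when xs is explanatory and ys is the response."""
    s_xy, s_xx, _ = _sums(xs, ys)
    return s_xy / s_xx


# Hypothetical measurements, made up for illustration only.
femur_cm = [40, 44, 48, 50, 52]
height_cm = [158, 166, 176, 182, 186]
femur_in = [x / 2.54 for x in femur_cm]  # same lengths, different units

print("r (cm):     ", round(correlation(femur_cm, height_cm), 4))
print("r (inches): ", round(correlation(femur_in, height_cm), 4))
print("slope (cm):     ", round(slope(femur_cm, height_cm), 4))
print("slope (inches): ", round(slope(femur_in, height_cm), 4))

# Swapping explanatory and response: r is unchanged, the slope is not.
print("r swapped:    ", round(correlation(height_cm, femur_cm), 4))
print("slope swapped:", round(slope(height_cm, femur_cm), 4))
```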

Objective 6: SQUARED CORRELATION Looking at the last calculator screen above (in your notes), which of those items have we not discussed yet? $r^2$. Squaring the correlation is a way to measure the proportion of the variation in the response variable (y-values) that is accounted for by the linear relationship with the explanatory variable (x-values). A correlation of r = .90, for example, gives $r^2 = .90^2 = .81$. So 81% of the variation in y-values can be explained by the x-values (strong). (Squaring also removes negative values.)

Objective 6: SQUARED CORRELATION Example E: Current GPA (response) was measured for 50 students, along with their self-reported study time, which had a correlation of r = .68, and their number of absences, which had a correlation of r = -.48. Interpret the direction and strength of the correlations (r): Study time: .68 is a strong, positive correlation. Absences: -.48 is a weak, negative correlation. Interpret the variation in response due to each explanatory variable ($r^2$): Study time: $.68^2 = .46$, so 46% of the variation in GPA can be accounted for by study time. Absences: $(-.48)^2 = .23$, so 23% of the variation in GPA can be accounted for by absences.
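The squared-correlation arithmetic is easy to verify. This short Python sketch squares the three correlations mentioned in Objective 6 (.90, .68, and -.48) and reports each as a percentage of variation explained. The labels in the dictionary are just descriptive names chosen here.

```python
# Correlations quoted in Objective 6; the dictionary keys are labels chosen here.
correlations = {
    "illustration (r = .90)": 0.90,
    "study time (r = .68)": 0.68,
    "absences (r = -.48)": -0.48,
}

# r squared is the proportion of variation in y accounted for by x.
r_squared = {label: r ** 2 for label, r in correlations.items()}

for label, value in r_squared.items():
    print(f"{label}: r^2 = {value:.4f}, about {value * 100:.0f}% of variation explained")
```

Note that the negative correlation still gives a positive proportion, since squaring removes the sign.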

'Running' Out of Interest: REVIEW AND APPLY The scatterplot (page 4 in your notes) represents the average number of hours per week a newly purchased treadmill is used by a single person, for each month of the first year of ownership. (For example, in the first month they used it 10 hours the first week, 9 the second week, 10 the third week, and 11 the fourth week, for an average of 10 hours in the first month.) Work through each part referring to the entire lesson. Check your work when finished.
SOLUTIONS (parts A through E):
Slope: For every 1 month of owning the treadmill, predicted weekly hours of use decrease by about .66 (about 40 minutes).
Y-intercept: At x = 0 (brand new), the predicted use is 9.86 hours.
Predicted points on the line: (3, 7.86) and (13, 1.22).
X-intercept: 9.86/.665 = 14.8, so the line reaches (14.8, 0).
Residual: At month 10 the actual value was 2 and the prediction is 3.21, so 2 - 3.21 = -1.21 (the treadmill was used 1.21 hours less than predicted).
Squared correlation: .808, so about 81% of the variation in hours using the treadmill is accounted for by the time they have owned it.
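The treadmill solutions can be reproduced with a short Python sketch. It assumes the regression equation is y-hat = 9.86 - 0.665x, pieced together from the y-intercept of 9.86 and the 9.86/.665 calculation in the solutions. The function name `predict_hours` is hypothetical.

```python
# Assumed coefficients, taken from the numbers quoted in the solutions.
INTERCEPT = 9.86   # predicted weekly hours at month 0
SLOPE = -0.665     # change in predicted weekly hours per month of ownership


def predict_hours(month):
    """Predicted average weekly hours of use in a given month of ownership."""
    return INTERCEPT + SLOPE * month


# Predictions at months 3 and 13.
for month in (3, 13):
    print(f"month {month}: predicted {predict_hours(month):.3f} hours")

# X-intercept: the month at which predicted use reaches 0 hours.
x_intercept = INTERCEPT / -SLOPE
print(f"predicted use hits zero at about month {x_intercept:.1f}")

# Residual at month 10, where the observed value was 2 hours.
observed = 2
month_10_residual = observed - predict_hours(10)
print(f"residual at month 10: {month_10_residual:.2f} hours")
```

The x-intercept falls beyond the first year of data, so reading it as the month when use stops entirely would be an extrapolation.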