
Describing Relationships

Least-Squares Regression
- A method for finding a line that summarizes the relationship between two variables, but only in a specific setting: when one variable is explanatory and the other is the response
- Different people may draw different lines by eye on a scatterplot; least-squares regression gives us an exact procedure/formula

Regression Line
- A straight line that describes how a response variable y changes as an explanatory variable x changes
- Unlike correlation, regression requires explanatory and response variables
- We often use the regression line to make a prediction
- No line will pass through all the points
- We use the line to predict y from x, so we want a line that is as close as possible to the points in the vertical direction
- The regression line makes the vertical distances of the points in the scatterplot from the line as small as possible

Example 3.8 Page 150

Least-squares regression line (LSRL)
- A model used when the data show a linear trend
- The line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible
- We can't just go with what visually looks like the best fit
- Picture on page 151 (shown on next 2 slides)
- Error = observed − predicted
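
A quick sketch of what "as small as possible" means, using made-up numbers: the sum of squared vertical distances from the least-squares line is compared with the sum for a slightly different line.

import numpy as np

# Hypothetical (x, y) data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares slope and intercept (polyfit returns highest-degree coefficient first)
b, a = np.polyfit(x, y, 1)

def sum_sq_vertical_distances(slope, intercept):
    # Sum of squared vertical distances (residuals) from the given line
    return float(np.sum((y - (intercept + slope * x)) ** 2))

print("Least-squares line:      ", sum_sq_vertical_distances(b, a))
print("Slightly different line: ", sum_sq_vertical_distances(b + 0.3, a))  # always at least as large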

LSRL
- Equation: ŷ = a + bx
- Slope: b = r(s_y / s_x)
- Intercept: a = ȳ − b·x̄
- Notice these use x̄, ȳ, s_x, s_y, and r

Why do we use ŷ?
- y denotes the observed value
- ŷ (read "y hat") denotes the predicted value
- When you are solving regression problems, be careful to distinguish between y and ŷ

Two key facts
- Every least-squares regression line passes through the point (x̄, ȳ)
- The slope of the least-squares line is equal to the product of the correlation and the quotient of the standard deviations: b = r(s_y / s_x)
- Commit these two facts to memory and you will be able to find equations of least-squares lines

Constructing Least-Squares Regression without Raw Data
Use the following summary statistics to construct the equation of the least-squares line:

Constructing LSRL
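
The data from the original slide did not come through in this transcript, so here is a minimal stand-in with made-up summary statistics showing how the construction goes:

# Hypothetical summary statistics, for illustration only
x_bar, y_bar = 10.0, 25.0   # means of x and y
s_x, s_y = 2.0, 5.0         # standard deviations of x and y
r = 0.8                     # correlation

b = r * (s_y / s_x)         # slope
a = y_bar - b * x_bar       # intercept

print(f"LSRL: y-hat = {a:.2f} + {b:.2f}x")   # prints: LSRL: y-hat = 5.00 + 2.00x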

The magic of the calculator
- The calculator will find the information for you: STAT → CALC → 8:LinReg(a+bx)
- Make sure that your r and r² values appear (run DiagnosticOn if they do not)
- When you write your equation, do not forget to use ŷ; it means "predicted value"
- To plot the line on the scatterplot by hand, use the equation to find ŷ for two values of x, one near each end of the range of x in the data

Warm up
Find the equation of the least-squares regression line (LSRL) that models the relationship between corrosion depth and strength, and find the correlation coefficient. Then use the prediction model (LSRL) to answer the following:
- What is the predicted strength of concrete with a corrosion depth of 25 mm?
- What is the predicted strength of concrete with a corrosion depth of 40 mm?

- r tells us there is a strong, negative, LINEAR relationship between depth of corrosion and strength of concrete.
- b (the slope) tells us that for every increase of 1 mm in depth of corrosion, we PREDICT a 0.28 decrease in the strength of the concrete.
- a (the y-intercept or constant) tells us the PREDICTED strength when the corrosion depth is 0 mm.

How to add a regression line to your scatterplot on TI-84

Explanatory vs. Response
- The distinction between explanatory and response variables is essential in regression: switching them results in a different least-squares regression line.
- Note: the correlation, r, does NOT depend on the distinction between explanatory and response variables.

Computer output for LSRL
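
The output image itself is not reproduced here; as a rough stand-in, here is a sketch of comparable output produced with Python's scipy (the data are made up):

import numpy as np
from scipy import stats

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.3, 4.1, 5.8, 8.2, 9.9, 12.1])

result = stats.linregress(x, y)

print("Predictor   Coef")
print(f"Constant    {result.intercept:.4f}")
print(f"x           {result.slope:.4f}")
print(f"r = {result.rvalue:.4f},  R-sq = {result.rvalue ** 2:.1%}")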

Coefficient of Determination
The coefficient of determination, r², gives the proportion (percent) of the variability in y that is explained by the least-squares regression on x. For example, 71% of the variability in death rates due to heart disease can be explained by the LSRL on alcohol consumption. That is, alcohol consumption provides a fairly good prediction of the death rate due to heart disease, but other factors also contribute to this rate, so our prediction will be off somewhat.

Coefficient of Determination (r²)
- r vs. r²: the correlation measures the strength and direction of the association; the coefficient of determination measures how successful the regression is at explaining the variation in the response
- Some computer output calls it R-sq

Calculating r²
- r² = (SST − SSE) / SST = 1 − SSE/SST
- SST: sum of squares about the mean, Σ(y − ȳ)²
- SSE: sum of squares for error, Σ(y − ŷ)²
- Interpretation: the proportion of the total sample variability that is explained by the least-squares regression of y on x
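
A small sketch with made-up data confirming that 1 − SSE/SST equals the square of the correlation r:

import numpy as np

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.3, 9.7])

b, a = np.polyfit(x, y, 1)            # least-squares slope and intercept
y_hat = a + b * x                     # predicted values

sse = np.sum((y - y_hat) ** 2)        # sum of squares for error
sst = np.sum((y - np.mean(y)) ** 2)   # sum of squares about the mean

r = np.corrcoef(x, y)[0, 1]
print("1 - SSE/SST:", 1 - sse / sst)
print("r squared:  ", r ** 2)         # same value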

Example 3.13, page 164
r = ____   r² = ____
This means over 99% of the variation in gas consumption is accounted for by the linear relationship with degree-days.

r = 0.7842   r² = 0.615
This means that the linear relationship between distance and velocity explains about 61.5% of the variation in either variable.

Warmup
Sarah's parents are concerned that she seems short for her age. Their doctor has the following record of Sarah's height:
Age (months):
Height (cm):
a) Make a scatterplot of these data.
b) Using your calculator, find the equation of the least-squares regression line of height on age.
c) Use your regression line to predict Sarah's height at age 40 years (480 months). Convert your prediction to inches (2.54 cm = 1 inch). Does this make sense? Explain.

Residuals
- The difference between the observed value of the response variable and the value predicted by the regression line
- Residual = observed y − predicted y = y − ŷ
- The mean of the residuals is always 0 (sometimes not exact on the calculator due to roundoff error)
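
A brief sketch with made-up data that computes residuals and checks that they average to zero:

import numpy as np

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 3.8, 6.1, 7.9, 10.2])

b, a = np.polyfit(x, y, 1)       # least-squares fit
residuals = y - (a + b * x)      # observed y minus predicted y

print("residuals:        ", np.round(residuals, 3))
print("mean of residuals:", residuals.mean())   # essentially 0, up to roundoff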

Example 3.14 page 167

- Predict the Gesell score for a child who first spoke at 15 months.
- What is the residual? The residual is positive because the observed point lies above the regression line.

Residual Plot
The horizontal line at residual = 0 corresponds to the regression line from before.

Residual plots
- A scatterplot of the regression residuals against the explanatory variable
- Residual plots help us assess how well a regression line fits the data
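
A minimal sketch, using made-up data, of producing a residual plot in Python with matplotlib:

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 4.3, 5.6, 8.4, 9.7, 12.2])

b, a = np.polyfit(x, y, 1)
residuals = y - (a + b * x)

plt.scatter(x, residuals)
plt.axhline(0, color="red")   # the line residual = 0 stands in for the regression line
plt.xlabel("x (explanatory variable)")
plt.ylabel("Residual")
plt.title("Residual plot")
plt.show()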

Calculator can produce graph of residuals (page 174)

Some facts:
- A good residual plot has a uniform scatter of points above and below the line residual = 0
- There should be no systematic pattern or unusual individual observations
- A curved pattern shows that the relationship is not linear
- Increasing spread about the line as x increases indicates that predictions of y will be less accurate for larger x (likewise, decreasing spread means less accurate predictions for smaller x)

More facts:
- Individual points with large residuals are outliers in the vertical (y) direction because they lie far from the line that describes the overall pattern
- Individual points that are extreme in the x direction may not have large residuals, but they can be very important

Influential Observations
- Outlier: an observation that lies outside the overall pattern of the other observations
- Influential point: one that, if removed, would markedly change the result of the calculation
- Outliers in the x direction are often influential for the least-squares regression line; they often have small residuals because they pull the regression line toward themselves
- You can see what removal of such a point does to the line on page 172 (next slide)
- Child 19 has less influence than child 18

Outliers/Influential Points  Does the age of a child’s first word predict his/her mental ability? Consider the following data on (age of first word, Gesell Adaptive Score) for 21 children. Does the highlighted point markedly affect the equation of the LSRL? If so, it is “influential”. Test by removing the point and finding the new LSRL. Influential?
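
The data table from the slide is not in this transcript, but the "remove the point and refit" test looks like this in code (the data below are made up, with one point far out in the x direction):

import numpy as np

# Hypothetical data with one extreme-x point (the last one), for illustration only
x = np.array([8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 40.0])
y = np.array([110.0, 105.0, 100.0, 102.0, 98.0, 95.0, 60.0])

b_all, a_all = np.polyfit(x, y, 1)            # fit with the suspect point included
b_out, a_out = np.polyfit(x[:-1], y[:-1], 1)  # fit with it removed

print(f"With the point:    y-hat = {a_all:.1f} + {b_all:.2f}x")
print(f"Without the point: y-hat = {a_out:.1f} + {b_out:.2f}x")
# A large change in slope/intercept suggests the point is influential.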

Interpolation vs. Extrapolation
- Interpolation: predicting a value of y for an x that lies within the range of the data used to fit the line
- Extrapolation: predicting a value of y for an x that lies outside the range of the data used to fit the line
- Beware of extrapolation! Such predictions are often not accurate.

Applets
- Adding a point to a scatterplot
- Regression by eye
- Regression line and residuals