Algebra Review The equation of a straight line y = mx + b


Algebra Review The equation of a straight line is y = mx + b. m is the slope – the change in y over the change in x, or rise over run. b is the y-intercept – the value where the line cuts the y-axis. Before we look at the application of straight lines to statistics, let’s briefly review a little algebra.

Review y = 3x + 2. When x = 0, y = 2 (the y-intercept). When x = 3, y = 11. The change in y (+9) divided by the change in x (+3) gives the slope, 3.
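
As a quick check of this arithmetic, here is a minimal Python sketch (not part of the original slides; the function name line is just illustrative):

    # Evaluate y = 3x + 2 at two x values and recover the slope.
    def line(x):
        return 3 * x + 2

    y0 = line(0)                   # 2, the y-intercept
    y1 = line(3)                   # 11
    slope = (y1 - y0) / (3 - 0)    # (11 - 2) / 3 = 3.0
    print(y0, y1, slope)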

Linear Regression Example: Tar (mg) and CO (mg) in cigarettes. y, Response: CO (mg). x, Explanatory: Tar (mg). Cases: 25 brands of cigarettes. With linear regression, we use the data on the relationship between two variables, the response variable y and the explanatory variable x, to come up with values of the slope and intercept for a line that comes close to the data points on the scatter plot.

Correlation Coefficient Tar and CO r = 0.9575 The calculation of the regression line depends on the value of the correlation coefficient. Because the correlation coefficient measures the strength of the linear association between two variables, it is not surprising that this coefficient plays an important role in determining the line that best fits the data.

Linear Regression There is a strong positive linear association between tar and CO. What is the equation of the line that models the relationship between tar and CO?

Linear Model The linear model is the equation of a straight line through the data. A point on the straight line through the data gives a predicted value of y, denoted ŷ (y-hat). In Chapter 6 we talked about a Normal model to describe the distribution of a population. We are now going to consider a model for the relationship between two variables. The relationship will be modeled with a straight line.

Residual The difference between the observed value of y and the predicted value of y, ŷ, is called the residual. Residual = y - ŷ. The residual, the difference between an observed and a predicted value, is the basis for coming up with an estimate of this linear model.

Residual A residual indicates how far away a point is from a model line.

Line of “Best Fit” There are lots of straight lines that go through the data. The line of “best fit” is the line for which the sum of squared residuals is the smallest – the least squares line. But how do we draw the line? There are any number of straight lines that would “fit” the data. Which one do we choose?
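
To illustrate the idea, here is a minimal Python sketch (not from the slides) that compares the sum of squared residuals for two candidate lines over a small set of hypothetical (x, y) points; the least squares line is the one that makes this sum as small as possible:

    import numpy as np

    # Hypothetical data, for illustration only.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    def sum_sq_resid(intercept, slope):
        # Sum of squared residuals for the candidate line intercept + slope * x.
        resid = y - (intercept + slope * x)
        return np.sum(resid ** 2)

    print(sum_sq_resid(0.0, 2.0))    # one candidate line
    print(sum_sq_resid(0.5, 1.8))    # another candidate; compare the two sums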

Line of “Best Fit” Least squares slope: b1 = r(sy/sx), where r is the correlation coefficient and sy and sx are the standard deviations of y and x. Intercept: b0 = y-bar - b1(x-bar), where y-bar and x-bar are the means of y and x. Where do these equations come from? Finding the smallest sum of squared residuals is a minimization problem. You can use calculus to solve that minimization problem. When you use calculus, you end up with two equations and two unknowns (the slope and the intercept). Solving for the slope and intercept gives you the equations above.
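
A minimal Python sketch of these formulas, using hypothetical data in place of the 25 Tar/CO measurements (which the transcript does not reproduce):

    import numpy as np

    # Hypothetical tar (x) and CO (y) values, for illustration only.
    x = np.array([4.0, 8.0, 12.0, 16.0, 20.0])
    y = np.array([5.5, 9.0, 12.4, 16.6, 18.9])

    r = np.corrcoef(x, y)[0, 1]               # correlation coefficient
    b1 = r * y.std(ddof=1) / x.std(ddof=1)    # slope: b1 = r * sy / sx
    b0 = y.mean() - b1 * x.mean()             # intercept: b0 = y-bar - b1 * x-bar
    print(b1, b0)
    # np.polyfit(x, y, 1) returns the same least squares slope and intercept.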

Summary of the Data Summary statistics for Tar, x, and CO, y (table not reproduced in the transcript).

Least Squares Estimates For the Tar and CO data, the least squares estimates are slope b1 = 0.801 and intercept b0 = 2.743.

Interpretations Slope – for every 1 mg increase in tar, the CO content increases, on average, by 0.801 mg. Intercept – there is not a reasonable interpretation of the intercept in this context because one wouldn’t see a cigarette with 0 mg of tar.

Predicted CO = 2.743 + 0.801*Tar A residual indicates how far away a point is from a model line.

Prediction The least squares line, Predicted CO = 2.743 + 0.801*Tar, can be used to predict the CO content for a given tar content.
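
A minimal Python sketch of prediction with this line (the coefficients 2.743 and 0.801 are taken from the slides; the function name predicted_co is just illustrative):

    def predicted_co(tar_mg):
        # Least squares line from the slides: Predicted CO = 2.743 + 0.801 * Tar.
        return 2.743 + 0.801 * tar_mg

    print(predicted_co(16.0))   # about 15.56 mg of CO for a brand with 16.0 mg of tar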

Residual Tar, x = 16.0 mg. CO, y = 16.6 mg. Predicted, ŷ = 15.56 mg. Residual = y - ŷ = 16.6 - 15.56 = 1.04 mg. Calculation of a residual for a particular x, y pair.
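
The same residual calculation as a minimal Python sketch (the values 16.0, 16.6, 2.743, and 0.801 are taken from the slides):

    observed_co = 16.6                     # mg, for the brand with 16.0 mg of tar
    predicted_co = 2.743 + 0.801 * 16.0    # 15.559, about 15.56 mg
    residual = observed_co - predicted_co
    print(round(residual, 2))              # 1.04 mg: this point lies above the fitted line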

Residuals Residuals help us see if the linear model makes sense. Plot residuals versus the explanatory variable. If the plot is a random scatter of points, then the linear model is the best we can do. Plotting residuals versus the explanatory variable helps us see if a linear model is appropriate. We will see more of this in Chapter 10.
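
A minimal matplotlib sketch of such a residual plot, again using hypothetical data in place of the actual cigarette measurements:

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical tar (x) and CO (y) values, for illustration only.
    x = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 30.0])
    y = np.array([5.0, 9.2, 12.1, 16.6, 18.5, 23.5])

    b1, b0 = np.polyfit(x, y, 1)       # least squares slope and intercept
    residuals = y - (b0 + b1 * x)

    plt.scatter(x, residuals)
    plt.axhline(0, linestyle="--")     # reference line at residual = 0
    plt.xlabel("Tar (mg)")
    plt.ylabel("Residual (mg)")
    plt.show()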

You will always have about the same number of residuals above zero as below zero. What is important is the lack of any trend or curvature in the residuals.

Interpretation of the Plot The residuals appear to have a pattern. For values of Tar between 0 and 20, the residuals tend to increase. The brand with Tar = 30 appears to have a large residual.

r² or R² The square of the correlation coefficient gives the amount of variation in y that is accounted for, or explained, by the linear relationship with x. The square of the correlation coefficient is on a scale from 0 to 1, where 0 indicates no linear association and 1 indicates a perfect linear association.

Tar and CO r = 0.9575. r² = (0.9575)² = 0.917, or 91.7%. 91.7% of the variation in CO content can be explained by the linear relationship with Tar content.
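
A quick Python check of this calculation (r = 0.9575 is the value given in the slides):

    r = 0.9575
    print(round(r ** 2, 3))   # 0.917, i.e. 91.7% of the variation in CO explained by Tar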

Regression Conditions Quantitative variables – both variables should be quantitative. Linear model – does the scatter plot show a reasonably straight line? Outliers – watch out for outliers as they can be very influential.

Regression Cautions Beware of extraordinary points. Don’t extrapolate beyond the data. Don’t infer that x causes y just because there is a good linear model relating the two variables. Don’t choose a model based on R² alone.