Simple Linear Regression. Chapter 6 of Sharon Lawner Weinberg and Sarah Knapp Abramowitz, Statistics Using SPSS: An Integrative Approach, Second Edition.

Simple Linear Regression. When two variables are related, you may use one to predict the other. The variable being predicted is often called the dependent, criterion, or outcome variable. The variable used to predict it is called the independent variable, predictor, or regressor. In predicting one variable from another, we are not suggesting that one variable is a cause of the other.

Overview of Topics Covered
- Simple linear regression when both the independent and dependent variables are scale
- The scatterplot: graphing the "best-fitting" linear equation
- The simple linear regression equation: Ŷ = bX + a
- The standardized regression equation
- R as a measure of the goodness of fit of the model
- Why the scatterplot is important
- Simple linear regression when the independent variable is dichotomous

An Example: Graphing the Simple Linear Regression Equation on the Scatterplot for Predicting Calories from Fat Go to Graphs on the Main Menu bar, Scatter, and Define. Put CALORIES in the box for the Y-Axis and FAT in the box for the X-Axis. Click OK. Once the graph appears in the Output Navigator, click it twice to go into Edit Mode. Go to Elements on the menu bar, Fit Line at Total. Click on Elements, Show Data Labels. Note: By convention, the dependent variable is on the y-axis and the independent variable is on the x-axis.
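For readers working outside SPSS, here is a minimal Python sketch of the same graph. The fat and calories values are made-up stand-ins, not the textbook's Hamburg data set, and the plotting route is an analogue of, not a substitute for, the SPSS menu steps above.

```python
import numpy as np
import matplotlib.pyplot as plt

fat = np.array([9.0, 13.0, 21.0, 30.0, 28.0])              # grams of fat (illustrative values)
calories = np.array([260.0, 320.0, 420.0, 530.0, 530.0])   # calories (illustrative values)

b, a = np.polyfit(fat, calories, deg=1)   # slope and intercept of the least squares line

plt.scatter(fat, calories)                # dependent variable on the y-axis, by convention
xs = np.linspace(fat.min(), fat.max(), 100)
plt.plot(xs, b * xs + a)                  # analogue of SPSS's "Fit Line at Total"
plt.xlabel("FAT (grams)")
plt.ylabel("CALORIES")
plt.show()
```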

The Best-Fitting Line for Predicting Calories from Fat

Using the Scatterplot to Predict the Number of Calories of a Burger with 28 Grams of Fat. Answer: Approximately 520 calories. Note: A Big Mac has 28 grams of fat but actually has 530 calories. The predicted value of 520 calories departs from the actual value by 10 calories. This difference, d, between the actual and predicted values is called the residual or error. In equation form, we may say that in this case, d = Y − Ŷ = 530 − 520 = 10.

Creating the Best-Fitting Line: Averaging (a Function of) the d's. While d = 10 in this case, d may take a different value for each of the other four cases in this data set. The line that best fits our data, in the sense that it provides the most accurate predictions of the dependent variable, is the one that gives rise to the overall smallest possible set of d values. The overall set of d values is summarized as an average, and, in particular, as an average of the squared d's rather than of the d's themselves, so as to avoid positive and negative d's cancelling out when forming the average. By squaring, we add only positive terms to the average.

Creating the Best-Fitting Line: The Least Squares Criterion Because we take an average of the squared d’s to find the best-fitting line, the criterion for creating that line is called the Least Squares Criterion. The best-fitting line is called the (least squares) regression line or (least squares) prediction line. The equation of the regression line is called the linear regression equation.
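A small sketch of the criterion at work, assuming the same made-up fat/calorie values as above: the least squares line attains a smaller average of squared d's than any line we perturb away from it.

```python
import numpy as np

fat = np.array([9.0, 13.0, 21.0, 30.0, 28.0])              # illustrative values
calories = np.array([260.0, 320.0, 420.0, 530.0, 530.0])   # illustrative values

def mean_squared_residual(b, a):
    predicted = b * fat + a
    d = calories - predicted       # residuals: actual minus predicted
    return np.mean(d ** 2)         # the average of the squared d's

b_ls, a_ls = np.polyfit(fat, calories, deg=1)   # the least squares solution
print(mean_squared_residual(b_ls, a_ls))        # smallest attainable average
print(mean_squared_residual(b_ls + 1, a_ls))    # any other line does worse
print(mean_squared_residual(b_ls, a_ls + 20))   # ... in either coefficient
```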

The Regression Equation. Although the derivation of the regression equation is beyond the scope of this course, the equation of the regression line itself is really quite simple. The regression line is given by Ŷ = bX + a, where b = r·(S_Y / S_X) and a = Ȳ − b·X̄.
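A sketch of these formulas in Python, using np.polyfit as an independent check (the data are again the illustrative values, not the textbook's):

```python
import numpy as np

x = np.array([9.0, 13.0, 21.0, 30.0, 28.0])            # fat (illustrative)
y = np.array([260.0, 320.0, 420.0, 530.0, 530.0])      # calories (illustrative)

r = np.corrcoef(x, y)[0, 1]        # Pearson correlation r_XY
b = r * (np.std(y) / np.std(x))    # slope: r times the ratio of standard deviations
a = y.mean() - b * x.mean()        # intercept: a = Ybar - b * Xbar

print(b, a)
print(np.polyfit(x, y, deg=1))                   # same slope and intercept
print(np.isclose(b * x.mean() + a, y.mean()))    # the line passes through (Xbar, Ybar)
```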

Interpreting Regression Coefficients: Slope, b The value of the slope, b, gives the change in the predicted value of Y, on average, for each unit increase in X. If b is positive, then Y increases with an increase in X. If b is negative, then Y decreases with an increase in X.

Interpreting Regression Coefficients: Intercept, a The value of the intercept, a, is the predicted value of Y when X = 0. It is only meaningful when 0 is a meaningful value for the variable X and data close to X = 0 have been collected. Based on the equation for a, we may note that a is defined so that when X = X̄, the predicted value of Y is Ȳ; that is, the point (X̄, Ȳ) lies on the regression line.

An Example: Using SPSS to Obtain the Linear Regression Equation for Predicting Calories from Fat using the Hamburg Data Set Go to Analyze on the Main Menu bar, Regression, Linear. Put CALORIES in the box for the Dependent variable and FAT in the box for the Independent variable. To obtain the set of predicted Y values and their residuals, click Save. In the boxes labeled Predicted Values and Residuals, click the boxes next to Unstandardized. Click Continue, and OK. In the data window, you will see that two new variables, PRE_1 and RES_1, have been created which give predicted and residual calories for each hamburger.
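A rough Python analogue of this SPSS run, using statsmodels; the data values are illustrative stand-ins, and PRE_1/RES_1 correspond to the fitted values and residuals the code prints.

```python
import numpy as np
import statsmodels.api as sm

fat = np.array([9.0, 13.0, 21.0, 30.0, 28.0])              # illustrative values
calories = np.array([260.0, 320.0, 420.0, 530.0, 530.0])   # illustrative values

X = sm.add_constant(fat)                  # adds the intercept column
model = sm.OLS(calories, X).fit()

print(model.params)                       # intercept a and slope b (the Coefficients table)
print(model.fittedvalues)                 # analogue of the saved PRE_1 variable
print(model.resid)                        # analogue of the saved RES_1 variable
print(model.predict([[1.0, 28.0]]))       # predicted calories for a 28-grams-of-fat burger
```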

An Example Continued: Writing the Regression Equation The regression equation may be obtained from the output in the Coefficients table.

An Example: Interpreting the Regression Coefficients, b and a. The value of the slope, b = 13.71, tells us that a one-gram increase in the fat content of a burger is associated with an increase of 13.71 calories, on average. The value of the intercept, a, is not meaningful in this example because burgers with 0 grams of fat would be quite different from the burgers in this data set. Here Y = Calories and X = Fat.

An Example Continued: Using the Regression Equation to Predict the Calories of a Burger with 28 Grams of Fat. Substituting X = 28 into the regression equation gives the predicted number of calories, where Y = Calories and X = Fat.

An Example Continued: Reviewing the Data with PRE_1 and RES_1. [Table: Burger, Fat, Calories, Cheese, PRE_1, RES_1, with one row each for the Hamburger, Cheeseburger, Quarter Pounder, Quarter Pounder with cheese, and Big Mac.]

The Standardized Regression Equation. The regression equation for predicting the standardized (z-score) values of Y from the standardized (z-score) values of X is ẑ_Y = r·z_X: the standardized slope equals the correlation r, and the standardized intercept is 0.

Measuring the Goodness of Fit of the Model: R. R is defined as the correlation between the actual and predicted values of Y. In the case of simple linear regression, R = |r|, the absolute value of the Pearson correlation between X and Y. As such, R may be used to measure how well the regression model fits the data.
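A sketch verifying both claims numerically on the same illustrative data: the standardized equation makes the same predictions as the unstandardized one, and R equals |r|.

```python
import numpy as np

x = np.array([9.0, 13.0, 21.0, 30.0, 28.0])            # illustrative values
y = np.array([260.0, 320.0, 420.0, 530.0, 530.0])      # illustrative values

r = np.corrcoef(x, y)[0, 1]
z_x = (x - x.mean()) / x.std()
z_y_hat = r * z_x                          # the standardized regression equation

b, a = np.polyfit(x, y, deg=1)
y_hat = b * x + a                          # the unstandardized predictions
R = np.corrcoef(y, y_hat)[0, 1]            # correlation of actual and predicted Y

print(np.allclose(z_y_hat * y.std() + y.mean(), y_hat))   # same predictions, unstandardized
print(np.isclose(R, abs(r)))                               # R = |r| in simple regression
```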

Drawing the Correct Conclusions: Illustrating with Anscombe's Data. Consider four data sets of (X, Y) pairs, each with the following identical set of summary statistics: X̄ = 9.0, Ȳ = 7.5, S_X = 3.17, S_Y = 1.94, r_XY = .82. Regression line: Ŷ = 0.5X + 3. Question: Can we draw the same conclusions about each set of data? That is, are the four scatterplots corresponding to these data sets the same? Let's see.
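The four data sets are Anscombe's published quartet (Anscombe, 1973), so the check below uses the actual values; the verification code itself is a sketch.

```python
import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]   # shared X values for panels I-III
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8] * 7 + [19] + [8] * 3,
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

for name, (x, y) in quartet.items():
    x, y = np.array(x, float), np.array(y, float)
    r = np.corrcoef(x, y)[0, 1]
    b, a = np.polyfit(x, y, deg=1)
    print(f"{name}: Xbar={x.mean():.1f} Ybar={y.mean():.2f} r={r:.2f} "
          f"line: Yhat = {b:.2f}X + {a:.2f}")
# Every panel prints essentially the same statistics; only the scatterplots
# reveal how differently the four data sets actually behave.
```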

Anscombe’s Data: Panels I & II

Anscombe’s Data: Panels III & IV

The Moral of the Story: As Illustrated by Anscombe’s Data Summary statistics, including the linear regression model, may be misleading or in some way fail to capture the salient features of data. The use of graphical displays, such as the scatterplot, is critical in the process of assessing how appropriate a given model is for describing a set of data.

Simple Linear Regression: When the Independent Variable is Dichotomous. An Example: Predict the number of calories in a burger by whether or not the burger has cheese.
- Find the regression equation using SPSS.
- Predict the calories for a burger with cheese. How else might we interpret this value?
- Predict the calories for a burger without cheese. How else might we interpret this value?
- Find and interpret the intercept, a.
- Find and interpret the slope, b.

Simple Linear Regression: When the Independent Variable is Dichotomous. The regression equation is Ŷ = 110X + 350, where X = CHEESE (coded 1 for burgers with cheese, 0 for burgers without) and Y = CALORIES.

Simple Linear Regression: When the Independent Variable is Dichotomous. The predicted calories for a burger with cheese is (110)(1) + 350 = 460. Intuitively, this is the mean number of calories for burgers with cheese. The predicted calories for a burger without cheese is (110)(0) + 350 = 350. Intuitively, this is the mean number of calories for burgers without cheese.

Simple Linear Regression: When the Independent Variable is Dichotomous The slope of the regression equation is 110, indicating that burgers with cheese have 110 more calories, on average, than burgers without cheese. The intercept of the regression equation is 350, indicating that burgers without cheese are predicted to have 350 calories. Note that in this example, the value X=0 is meaningful and therefore, so is the intercept.
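A sketch of why these interpretations hold, using hypothetical calorie values chosen so the fitted equation matches the slide's Ŷ = 110X + 350: with a 0/1 predictor, the intercept is the mean of the X = 0 group and the slope is the difference between the two group means.

```python
import numpy as np

cheese = np.array([0, 0, 1, 1])                    # 1 = with cheese, 0 = without (hypothetical)
calories = np.array([300.0, 400.0, 410.0, 510.0])  # hypothetical values

b, a = np.polyfit(cheese, calories, deg=1)

print(a)                              # 350.0: the intercept
print(b)                              # 110.0: the slope
print(calories[cheese == 0].mean())   # 350.0: mean without cheese, same as the intercept
print(calories[cheese == 1].mean()
      - calories[cheese == 0].mean())  # 110.0: difference in group means, same as the slope
```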