
1 The Basics of Regression

2 Remember back in your prior school daze some algebra? You might recall the equation for a line as being y = mx + b. Or maybe you had the form y = a + bx. Maybe you even had another form. Did you? Notice how the y term is on the left of the equal sign. It looks like y is all by itself, but actually it is called the dependent variable: the value of y depends on the value of x, and x is the independent variable. On the right side, the coefficient attached to x is called the slope. The slope can be positive, negative, or even zero. The term on the right with no x attached to it is called the y-intercept, or intercept for short. The intercept, too, can be positive, negative, or zero.
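To make the slope and intercept concrete, here is a minimal Python sketch; the slope and intercept values in it are made up purely for illustration. It evaluates y = mx + b at a few x values, and each one-unit step in x moves y by exactly the slope, which is the picture on the next two slides.

    # Evaluate the line y = m*x + b at a few x values.
    # The slope and intercept below are made-up illustration values.
    m = 0.5   # slope: how far y moves when x increases by one unit
    b = 2.0   # intercept: the value of y when x = 0
    for x in [0, 1, 2, 3]:
        y = m * x + b     # y depends on x, so y is the dependent variable
        print(f"x = {x}, y = {y}")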

3 [Graph: x and y axes with three lines that all meet the y axis at the same height.] That height is called the intercept. Here I show three different lines with the same intercept, but different lines could have different intercepts. Intercepts can even be negative.

4 [Graph: x and y axes with a dot on a line, a step of 1 unit to the right in the x direction, and an unknown (?) move back to the line in the y direction.] The dot on the line is represented by an x value and a y value. Say we move from the dot one unit away in the x direction. The slope then tells us how far we have to go in the y direction to get back to the line. Note that on the upward sloping (to the right) line, when we go over to the right in x we have to go up in y. On the flat line we wouldn't move in the y direction at all, and on the downward sloping line we would move down to the line.

5 Now, in algebra, we might have a specific line, with particular numbers for the intercept and slope. Then we can say exactly what y is for each x; for instance, with the (made-up) line y = 1 + 2x, when x = 0, y = 1, when x = 1, y = 3, and so on. In algebra every point fits exactly on the line.

6 Now, let’s use an example to see how what we have just been thinking about is related to statistics. Say a chain of pizza joints has stores in many college towns, and say it is wondering whether the sales in these towns are related to the size of the college in terms of student population. Sales would be the y variable because sales are thought to depend on the population; the student population would be the x variable. On the next screen I have data from 10 of the stores. Each row is a store, giving that store's student population and its sales. Then we put each store as a dot in the scatter diagram.
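The actual store-by-store numbers are on the data slide; as a sketch of the setup, the snippet below uses made-up (population, sales) pairs for ten hypothetical stores and draws the scatter diagram with matplotlib. The units (students in thousands, sales in thousands of dollars) are also assumptions for illustration.

    import matplotlib.pyplot as plt

    # Made-up stand-ins for ten stores: student population (1000s) and sales ($1000s).
    population = [3, 5, 8, 10, 12, 15, 18, 20, 24, 27]
    sales      = [70, 80, 95, 110, 120, 140, 150, 165, 190, 205]

    plt.scatter(population, sales)            # one dot per store
    plt.xlabel("Student population (1000s)")  # the independent variable, x
    plt.ylabel("Sales ($1000s)")              # the dependent variable, y
    plt.title("Sales vs. student population")
    plt.show()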

7 Do the dots fit exactly on a line like in algebra? No, but maybe a line can be put through the data so that the line can be used to represent the data.

8 Math form. It is thought that in the population the variables x and y are related in the following general form: y = B0 + B1x + e, where B0 is the y-intercept of the line, B1 is the slope of the line, and e is an error term that captures all those influences on y not picked up by x. The error term reflects the fact that not all the points are directly on the line. So, we think there is a regression line out there that expresses the relationship between x and y, and we have to go find it. In fact, we take a sample and get an estimate of the regression line.
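One way to see what the error term e does is to simulate y values from an assumed population line and add noise. Everything in this sketch (the values of B0 and B1, the x values, and the spread of the noise) is made up for illustration:

    import random

    random.seed(1)
    B0, B1 = 50.0, 5.0             # assumed population intercept and slope
    for x in [3, 8, 12, 18, 24]:   # a few made-up x values
        e = random.gauss(0, 10)    # error term: influences on y not picked up by x
        y = B0 + B1 * x + e        # the population relationship, plus noise
        print(f"x = {x:2d}, y = {y:6.1f}, e = {e:6.1f}")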

9 Later we will see a method to get an estimate, but for now say we have the method. When we have a sample of data from a population, we will say in general that the regression line is estimated to be ŷ = b0 + b1x, where the ‘hat’ refers to the estimated value of y. Once we have this estimated line we are right back to algebra: y hat values are exactly on the line. Now, for each value of x we have data values, called y’s, and we have the one value on the line, called y hat.
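With a sample in hand, the estimates b0 and b1 can be obtained from a standard routine; for example, numpy's polyfit with degree 1 returns the least squares slope and intercept. The data below are the same made-up numbers used in the scatterplot sketch, not the presentation's actual data.

    import numpy as np

    # Made-up sample: ten (population, sales) pairs.
    x = np.array([3, 5, 8, 10, 12, 15, 18, 20, 24, 27], dtype=float)
    y = np.array([70, 80, 95, 110, 120, 140, 150, 165, 190, 205], dtype=float)

    b1, b0 = np.polyfit(x, y, deg=1)   # degree-1 fit returns (slope, intercept)
    y_hat = b0 + b1 * x                # fitted values: these sit exactly on the line
    print(f"estimated line: y_hat = {b0:.2f} + {b1:.2f} * x")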

10 At each x, a deviation, or residual, is the data value minus the y hat value. The method we use to find the line is called the (ordinary) least squares method. From the data of our example, the least squares method gives a specific estimated equation, y hat = b0 + b1x with numerical values for b0 and b1 (does it look like the algebra you saw before?). Now, go back to the slide with the data and create a y hat column, that is, the values of y on the line (you don't have to, but think about it). You get this column by taking the population value of x in each row and plugging it into the line to get y hat. The differences between the sales values and the y hat values are the deviations to which I refer.
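Continuing the earlier sketch (repeated here so it runs on its own, still on made-up data), the y hat column and the residuals are computed row by row as the data value minus the value on the line:

    import numpy as np

    x = np.array([3, 5, 8, 10, 12, 15, 18, 20, 24, 27], dtype=float)
    y = np.array([70, 80, 95, 110, 120, 140, 150, 165, 190, 205], dtype=float)

    b1, b0 = np.polyfit(x, y, deg=1)
    y_hat = b0 + b1 * x        # the y hat column: values of y on the line
    residuals = y - y_hat      # deviation = data value minus y hat value
    for xi, yi, yh, r in zip(x, y, y_hat, residuals):
        print(f"x = {xi:4.0f}  y = {yi:6.1f}  y_hat = {yh:6.1f}  residual = {r:6.1f}")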

11 Ordinary least squares. The typical method used to pick the line through the data is called the ordinary least squares line. This method is the one that minimizes the sum of squared deviations of the data points from the line. The line has desirable properties (not proven here): 1) It is unbiased: if many samples were taken, the average of the intercepts and slopes from those samples would equal the population intercept and slope. 2) It is consistent: as the sample gets large, the estimated intercept and slope approach the population intercept and slope.
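The least squares slope and intercept also have a simple closed form, b1 = sum((xi - xbar)(yi - ybar)) / sum((xi - xbar)^2) and b0 = ybar - b1*xbar, which minimizes the same sum of squared deviations that polyfit does for a straight line. A short sketch on the same made-up data:

    import numpy as np

    x = np.array([3, 5, 8, 10, 12, 15, 18, 20, 24, 27], dtype=float)
    y = np.array([70, 80, 95, 110, 120, 140, 150, 165, 190, 205], dtype=float)

    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)  # slope
    b0 = y_bar - b1 * x_bar                                            # intercept
    print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")   # matches np.polyfit(x, y, deg=1)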

12 One last point in this section. When you see a scatterplot like the one I had before, you should look at the pattern in the dots, reading them from left to right. 1) If the dots go uphill, suggesting a positive slope, the sample is suggesting a positive relationship between the variables: the two variables tend to move in the same direction, so higher values of x go with higher values of y. 2) If the dots go downhill, the sample is suggesting a negative relationship between the variables. 3) If the dots are flat, the sample is suggesting there is no relationship between the variables.
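A numeric counterpart to eyeballing the dots is the sign of the estimated slope itself: positive when the dots run uphill, negative when they run downhill, near zero when they are flat. A minimal check on the same made-up data:

    import numpy as np

    x = np.array([3, 5, 8, 10, 12, 15, 18, 20, 24, 27], dtype=float)
    y = np.array([70, 80, 95, 110, 120, 140, 150, 165, 190, 205], dtype=float)

    b1, b0 = np.polyfit(x, y, deg=1)
    if b1 > 0:
        print(f"slope = {b1:.2f} > 0: x and y tend to move in the same direction")
    elif b1 < 0:
        print(f"slope = {b1:.2f} < 0: x and y tend to move in opposite directions")
    else:
        print("slope = 0: no linear relationship suggested")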