Simple Linear Regression


Simple Linear Regression In many scientific investigations, one is interested in how one quantity is related to another: for example, the distance traveled and the time spent driving, or one's age and height. Generally, there are two types of relationships between a pair of variables: deterministic and probabilistic. A deterministic relationship is exact. For instance, distance as a function of driving time is S = S₀ + v·t, where S is the distance traveled, S₀ is the initial distance (the intercept), v is the speed (the slope), and t is the time spent driving. [Figure: plot of distance S against time t, a straight line with slope v and intercept S₀.]
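
To make the deterministic case concrete, here is a minimal sketch that evaluates S = S₀ + v·t directly; the numeric values of S₀, v, and t are hypothetical, chosen only for illustration.

```python
# Deterministic relationship: distance = initial distance + speed * time.
# All values below are hypothetical, chosen only for illustration.
s0 = 5.0    # initial distance (the intercept)
v = 60.0    # speed (the slope)
t = 2.0     # time spent driving

s = s0 + v * t   # exact: there is no error term
print(s)         # 125.0
```

Given the same S₀, v, and t, the result is always the same; there is no randomness to model.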

Probabilistic Relationship On many other occasions we face a different situation: one variable is related to another only loosely, as in a scatter plot of height against age. [Figure: scatter plot of height (vertical axis) against age (horizontal axis).] Here we cannot definitely predict one's height from one's age as we could in the deterministic case.

Linear Regression Statistically, the way to characterize the relationship between two variables, as shown before, is to use a linear model: y = a + b·x + ε. Here, x is called the independent variable, y is called the dependent variable, ε is the error term, a is the intercept, and b is the slope. [Figure: the line y = a + b·x with intercept a and slope b; the error ε is the vertical gap between a data point and the line.]
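
To see the model in action, here is a minimal sketch that generates data from y = a + b·x + ε with numpy; the parameter values (a_true, b_true, the error spread, and the variable names) are assumptions for illustration, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters, chosen only for illustration.
a_true, b_true = 2.0, 0.5     # intercept and slope
n = 50                        # number of observations

x = rng.uniform(0, 10, size=n)      # independent variable
eps = rng.normal(0, 1.0, size=n)    # error term epsilon
y = a_true + b_true * x + eps       # dependent variable
```

Unlike the deterministic case, rerunning this with a different seed gives different y values for the same x: the error term ε is what makes the relationship probabilistic.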

Least Squares Line Given some pairs of data for the independent and dependent variables, we may draw many lines through the scattered points. [Figure: scatter plot with several candidate lines drawn through the points.] The least squares line is the line that minimizes the vertical distances between the points and the line. In other words, the least squares line minimizes the error term ε.

Least Squares Method For notational convenience, the fitted line through the points is written as ŷ = a + b·x, while the linear model we wrote before is y = a + b·x + ε. If we use the value on the line, ŷ, to estimate y, the difference is (y − ŷ). For points above the line the difference is positive, while for points below the line it is negative. [Figure: a data point y, its fitted value ŷ on the line, and the difference (y − ŷ) shown as the vertical gap between them.]

Error Sum of Squares For some points the value of (y − ŷ) is positive (points above the line), and for others it is negative (points below the line). If we simply add these up, the positive and negative values cancel. Therefore we square each difference and sum the squares. This sum is called the Error Sum of Squares: SSE = Σ(yᵢ − ŷᵢ)². The constants a and b are estimated so that the error sum of squares is minimized, hence the name least squares.
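
A minimal sketch of the SSE computation, assuming arrays x and y and candidate coefficients a and b are already available (for example from the simulation above); the function name sse is my own.

```python
import numpy as np

def sse(x, y, a, b):
    """Error Sum of Squares for the line y_hat = a + b*x."""
    y_hat = a + b * x                        # fitted values on the line
    return float(np.sum((y - y_hat) ** 2))   # squared differences, summed
```

Squaring is what prevents the positive and negative differences from canceling; it also penalizes large misses more heavily than small ones.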

Estimating Regression Coefficients If we solve for the regression coefficients a and b by minimizing SSE, the solutions are b = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² and a = ȳ − b·x̄, where xᵢ is the ith independent variable value, yᵢ is the dependent variable value corresponding to xᵢ, and x̄ and ȳ are the means of x and y.
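
The closed-form solutions above translate directly into code. A minimal sketch with numpy follows; the helper name fit_line is my own.

```python
import numpy as np

def fit_line(x, y):
    """Least squares estimates of the intercept a and slope b."""
    x_bar, y_bar = x.mean(), y.mean()
    b = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    a = y_bar - b * x_bar
    return a, b
```

Run on the simulated data from the earlier sketch, fit_line should recover estimates close to the hypothetical a_true = 2.0 and b_true = 0.5.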

Interpretation of a and b The constant b is the slope, which gives the change in y (the dependent variable) for a change of one unit in x (the independent variable). If b > 0, x and y are positively correlated, meaning y increases as x increases; if b < 0, x and y are negatively correlated, meaning y decreases as x increases. The constant a is the intercept, the value of y when x = 0. [Figure: two panels, a line rising from intercept a when b > 0 and a line falling from intercept a when b < 0.]

Correlation Coefficient Although we now have a regression line to describe the relationship between the dependent and independent variables, the line alone is not enough to characterize the strength of that relationship. [Figure: two scatter plots with fitted lines; in (1) the points lie tightly around the line, in (2) they are widely scattered.] Clearly the relationship between x and y in (1) is stronger than in (2), even though the line in (2) is also a best-fit line. The statistic that characterizes the strength of the relationship is the correlation coefficient, whose square is R².

How Is R² Calculated? If we use ȳ to represent y, the error is (y − ȳ). If instead we use ŷ to represent y, the error is reduced to (y − ŷ). Thus (ŷ − ȳ) is the improvement gained by using ŷ rather than ȳ, and this holds for every point in the graph. To account for the total improvement, we sum the improvements (ŷ − ȳ) over all points. Here we face the same cancellation issue as when calculating a variance, so we square each difference and sum the squared differences over all points.

R Square The sum of the squared improvements is the Regression Sum of Squares, SSR = Σ(ŷᵢ − ȳ)², and the sum of the squared errors about the mean is the Total Sum of Squares, SST = Σ(yᵢ − ȳ)². R² = SSR/SST indicates the percentage of the variance in y explained by the regression. We already calculated SSE (the Error Sum of Squares) while estimating a and b; in fact, the following relationship holds: SST = SSR + SSE.
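
A minimal sketch that computes all three sums of squares and R²; the helper name r_squared is my own, and the decomposition check assumes a and b are the least squares estimates (SST = SSR + SSE does not hold for an arbitrary line).

```python
import numpy as np

def r_squared(x, y, a, b):
    """SST, SSR, SSE, and R^2 for the fitted line y_hat = a + b*x."""
    y_hat = a + b * x
    sst = np.sum((y - y.mean()) ** 2)       # Total Sum of Squares
    ssr = np.sum((y_hat - y.mean()) ** 2)   # Regression Sum of Squares
    sse = np.sum((y - y_hat) ** 2)          # Error Sum of Squares
    # Holds when a, b come from least squares, e.g. fit_line above.
    assert np.isclose(sst, ssr + sse)       # SST = SSR + SSE
    return float(ssr / sst)
```

An R² near 1 corresponds to the tight scatter of graph (1) above; an R² near 0 corresponds to the loose scatter of graph (2).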

A Simple Linear Regression Example The following survey data show how much a family spends on food in relation to household income (x = income in thousands of dollars, y = percent of income left after spending on food).
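
The survey data themselves are not reproduced in this transcript. As a stand-in, the sketch below runs the whole procedure end to end on hypothetical (x, y) pairs of the same form; the numbers are invented for illustration and are not the slide's survey data.

```python
import numpy as np

# Hypothetical data, NOT the slide's survey data: x = income in thousands
# of dollars, y = percent of income left after spending on food.
x = np.array([20.0, 30.0, 40.0, 55.0, 70.0, 90.0])
y = np.array([62.0, 68.0, 73.0, 78.0, 82.0, 88.0])

# Least squares estimates of the intercept a and slope b.
x_bar, y_bar = x.mean(), y.mean()
b = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
a = y_bar - b * x_bar

# Percent of variance in y explained by the regression.
y_hat = a + b * x
r2 = np.sum((y_hat - y_bar) ** 2) / np.sum((y - y_bar) ** 2)

print(f"y_hat = {a:.2f} + {b:.3f} * x,  R^2 = {r2:.3f}")
```

Here b > 0 would say that, in these illustrative numbers, families with higher incomes keep a larger percentage of income after food spending, and R² says how much of the variation in that percentage the fitted line accounts for.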