Regression Analysis: Intro to OLS Linear Regression


Regression Analysis
Defined as the analysis of the statistical relationship among variables. In its simplest form there are only two variables:
- Dependent or response variable (labeled Y)
- Independent or predictor variable (labeled X)

Statistical Relationships: A Warning
As with correlation and other measures of statistical association, a relationship does not guarantee or even imply causality between the variables. Also be aware of the difference between a mathematical or functional relationship based upon theory and a statistical relationship based upon data and its imperfect fit to a mathematical model.

Simple Linear Regression
The basic function for linear regression is Y = f(X), but the equation typically takes the following form:
Y = α + βX + ε
- α (alpha): an intercept component of the model, the model's value for Y when X = 0
- β (beta): a coefficient that loosely denotes the nature of the relationship between Y and X; more specifically, the slope of the line that specifies the model
- ε (epsilon): a term that represents the errors associated with the model
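The model above can be sketched by simulating data from it. This is a minimal illustration, not part of the lecture's example: the values of α, β, and the noise scale below are made up, and numpy is assumed to be available.

```python
import numpy as np

# Simulate the simple linear regression model Y = alpha + beta*X + epsilon.
# alpha, beta, and the noise scale are hypothetical illustrative values.
rng = np.random.default_rng(0)

alpha, beta = 2.0, 0.5            # intercept and slope (made up)
x = np.linspace(0.0, 10.0, 50)    # predictor values
epsilon = rng.normal(0.0, 1.0, 50)  # error term: zero-mean random noise
y = alpha + beta * x + epsilon    # response generated by the model
```

With ε set to zero the relationship would be exactly linear; the error term is what makes the observed points scatter around the line.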

Example
The model for a specific observation is written Y_i = α + βX_i + ε_i, where i is a "counter" (index) representing the ith observation in the data set.

Accompanying Scatterplot

Accompanying Scatterplot with Regression Equation

What does the additional info mean?
- α (alpha): 138 cones
- β (beta): −16 cones per $1 increase in cost
- ε (epsilon): still present, as evidenced by the fact that the model does not fit the data perfectly
- R²: a new term, the Coefficient of Determination. A value of 0.71 is fairly good, considering that R² is scaled between 0 and 1, with 1 indicating a model in perfect agreement with the data

Coefficient of Determination
In this simple example, R² is indeed the square of R. Recall that R is often the symbol for the Pearson Product Moment Correlation (PPMC), which is a parametric measure of association between two variables. Here R(X, Y) = −0.84, so R² = (−0.84)² ≈ 0.71.
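The R² = R² identity above is easy to verify numerically. The data points below are made up purely for demonstration (roughly decreasing, like the cone data); numpy is assumed to be available.

```python
import numpy as np

# Check that R^2 from a simple linear regression equals the square of
# Pearson's r. The data values are arbitrary illustrative numbers.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([9.0, 7.0, 6.5, 4.0, 3.0])

r = np.corrcoef(x, y)[0, 1]          # Pearson product-moment correlation

# Fit the least-squares line and compute R^2 from the residuals
beta, alpha = np.polyfit(x, y, 1)
y_hat = alpha + beta * x
ss_res = np.sum((y - y_hat) ** 2)    # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2) # total sum of squares
r_squared = 1.0 - ss_res / ss_tot
```

This equivalence holds for simple (one-predictor) linear regression; with multiple predictors, R² instead equals the squared correlation between Y and the fitted values.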

A Digression into History
Adrien-Marie Legendre: the original author of "the method of least squares," published in 1805.

The guy that got the credit: Carl Friedrich Gauss, the "giant" of early statistics, who published his theory of least squares in 1821.

Back on Topic: a recap of PPMC, or r
From last semester: the PPMC coefficient is essentially the sum of the products of the z-scores for each variable, divided by the degrees of freedom. Its computation can take on a number of forms depending on your resources.

What it looks like in equation form:
r = Σ(z_X · z_Y) / (n − 1)
The sample covariance is the same expression without the sample standard deviations in the denominator:
cov(X, Y) = Σ(X_i − X̄)(Y_i − Ȳ) / (n − 1)
Covariance measures how two variables covary, and it is this measure that serves as the numerator in Pearson's r:
r = cov(X, Y) / (s_X · s_Y)
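The two forms above (the z-score form and the covariance form) give the same r, which can be checked directly. The data values below are arbitrary; numpy is assumed.

```python
import numpy as np

# Pearson's r two ways: covariance over the product of standard deviations,
# and the z-score form. The data points are made up for illustration.
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.0, 3.0, 4.0, 8.0, 9.0])
n = len(x)

# Sample covariance (n - 1 in the denominator)
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)
r = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

# Same result from the z-score form: sum of z_x * z_y over (n - 1)
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
r_z = np.sum(zx * zy) / (n - 1)
```

Note the `ddof=1` argument: numpy's `std` defaults to the population (n-denominator) form, while the sample formulas above divide by n − 1.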

Take-home message
- Correlation is a measure of association between two variables.
- Covariance is a measure of how the two variables vary with respect to one another.
- Both are parametrically based statistical measures; note that PPMC is based upon z-scores.
- Z-scores are based upon the normal (Gaussian) distribution; thus these measures, as well as linear regression based upon the method of least squares, are predicated upon the assumption of normality and other parametric assumptions.

OLS defined
OLS stands for Ordinary Least Squares, the method of estimation used in linear regression. Its defining (and nominal) criterion is that it minimizes the errors associated with predicting values for Y. It uses a least *squares* criterion because a simple "least deviations" criterion would allow positive and negative deviations from the model to cancel each other out (the same logic used in computing the variance and a host of other statistical measures).
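The cancellation problem is easy to demonstrate: a deliberately bad line can have signed errors that sum to zero, yet a much larger sum of squared errors than the OLS fit. The data and candidate line below are made up for illustration; numpy is assumed.

```python
import numpy as np

# Why OLS squares the deviations: raw residuals can cancel, squared
# residuals cannot. Data and the "bad" line are arbitrary.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 4.0, 3.0])

def residuals(alpha, beta):
    """Signed prediction errors y - (alpha + beta*x)."""
    return y - (alpha + beta * x)

# A flat line through the mean of y: its signed errors sum to zero
# even though it ignores x entirely.
bad = residuals(2.5, 0.0)

# The squared-error criterion still distinguishes it from the OLS fit.
beta_hat, alpha_hat = np.polyfit(x, y, 1)
good = residuals(alpha_hat, beta_hat)
```

Here `bad.sum()` is 0 despite the poor fit, while the sum of *squared* residuals correctly ranks the OLS line as better.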

The math behind OLS
Recall that the linear regression equation for a single independent variable takes this form:
Y = α + βX + ε
Since Y and X are known for all i and the error term is fixed by the data, minimizing the model errors really comes down to our choice of α and β.

We minimize
S = Σ_{i=1..n} (Y_i − α − βX_i)²
where S is the total sum of squared deviations from i = 1 to n over all Y and X for a given α and β. The α and β that minimize S can be found by taking the partial derivative of S with respect to each and setting it equal to zero:
∂S/∂α = −2 Σ(Y_i − α − βX_i) = 0
∂S/∂β = −2 Σ X_i(Y_i − α − βX_i) = 0
which can be further simplified to the two normal equations:
ΣY_i = nα + β ΣX_i
ΣX_iY_i = α ΣX_i + β ΣX_i²

Refer to page 436 for the text's more detailed description of the computations for solving for α and β. Given the normal equations, we can easily solve for the simpler α via algebra: since X̄ is the sum of all X_i from 1 to n divided by n, and the same can be said for Ȳ, we are left with
α = Ȳ − βX̄
Since the means of both X and Y can be obtained from the data, we can calculate the intercept α very simply once we know the slope β.

Once we have a simple equation for α, we can plug it into the equation for β and then solve for the slope of the regression equation. Multiply through by n, isolate β, and we have
β = (n ΣX_iY_i − ΣX_i ΣY_i) / (n ΣX_i² − (ΣX_i)²)

α, the regression intercept: α = Ȳ − βX̄
β, the regression slope: β = Σ(X_i − X̄)(Y_i − Ȳ) / Σ(X_i − X̄)²
(This deviation form of β is algebraically equivalent to the summation form on the previous slide.)
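The closed-form estimates can be checked against a library fit. The data below are arbitrary illustrative numbers; numpy is assumed.

```python
import numpy as np

# Closed-form OLS estimates for the intercept and slope, checked against
# numpy's least-squares polynomial fit. Data values are made up.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])

# Slope: sum of cross-deviations over sum of squared x-deviations
beta = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
# Intercept: alpha = ybar - beta * xbar
alpha = y.mean() - beta * x.mean()

# Reference fit (polyfit returns highest-degree coefficient first)
beta_np, alpha_np = np.polyfit(x, y, 1)
```

Both routes give the same line, which is a useful sanity check when implementing the formulas by hand.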

Given this info, let's head over to the lab and get some hands-on practice using the small and relatively simple ice cream sales data set. We will cover the math behind the coefficient of determination on Thursday and introduce regression with multiple independent variables.