Class 4 Ordinary Least Squares CERAM February-March-April 2008 Lionel Nesta Observatoire Français des Conjonctures Economiques

Introduction to Regression
 Ideally, the social scientist is interested not only in knowing the intensity of a relationship, but also in quantifying how much one variable changes when another variable changes by one unit.
 Regression analysis is a technique that examines the relation of a dependent variable to one or more independent (explanatory) variables.
 Simple regression: y = f(X)
 Multiple regression: y = f(X, Z)
 Let us start with simple regressions.

Scatter Plot of Fertilizer and Production

Objective of Regression
 It is time to ask: “What is a good fit?”
 “A good fit is one that makes the error small.”
 “The best fit is the one that makes the error smallest.”
 Three candidates:
1. To minimize the sum of all errors
2. To minimize the sum of absolute values of errors
3. To minimize the sum of squared errors

To minimize the sum of all errors: the problem of sign (positive and negative errors cancel each other out).

To minimize the sum of absolute values of errors: the problem of the middle point (a fit with a single error of +3 is judged no worse than one with errors of –1 and +2).

To minimize the sum of squared errors: solves both problems.

To minimize the sum of squared errors
 Overcomes the sign problem
 Goes through the middle point
 Squaring emphasizes large errors
 Easily manageable
 Has a unique minimum
 Has a unique (and best) solution
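A minimal numerical sketch of these three candidate criteria in action, assuming a small made-up dataset (not the CERAM_BIO data) and only numpy:

```python
import numpy as np

# Made-up toy data, purely for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # roughly y = 2x

def criteria(a, b):
    """Return the three candidate criteria for the line y = a + b*x."""
    e = y - (a + b * x)                    # errors (residuals)
    return e.sum(), np.abs(e).sum(), (e ** 2).sum()

good = criteria(a=0.0, b=2.0)              # a sensible line
flat = criteria(a=y.mean(), b=0.0)         # a flat line through the mean of y

print("good line -> sum(e)=%.2f  sum|e|=%.2f  sum(e^2)=%.2f" % good)
print("flat line -> sum(e)=%.2f  sum|e|=%.2f  sum(e^2)=%.2f" % flat)
# The sum of raw errors is (near) zero for BOTH lines: the sign problem.
# The sum of squared errors clearly singles out the good line.
```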

Scatter Plot of Fertilizer and Production

Scatter Plot of R&D and Patents (log)

The Simple Regression Model: y_i = α + β·x_i + ε_i
y_i : dependent variable (to be explained)
x_i : independent variable (explanatory)
α : first parameter of interest (intercept)
β : second parameter of interest (slope)
ε_i : error term

The Simple Regression Model

To minimize the sum of squared errors: choose α and β so as to minimize Σ_i ε_i² = Σ_i (y_i − α − β·x_i)².
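A sketch of the resulting closed-form estimators on made-up numbers (not the CERAM_BIO data): setting the derivatives of Σ ε_i² with respect to α and β to zero gives β̂ = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)² and α̂ = ȳ − β̂·x̄.

```python
import numpy as np

# Illustrative data only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form OLS estimators obtained by minimising sum(eps_i**2)
x_bar, y_bar = x.mean(), y.mean()
beta_hat  = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
alpha_hat = y_bar - beta_hat * x_bar
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")

# Cross-check: a degree-1 polynomial fit minimises the same criterion
print(np.polyfit(x, y, deg=1))             # returns [slope, intercept]
```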

Application to CERAM_BIO Data using Excel

Interpretation
 When the log of R&D (per asset) increases by one unit, the log of patents (per asset) increases by 1.748.
 Remember! A change in the log of x is a relative change in x itself.
 Hence a 1% increase in R&D (per asset) entails a 1.748% increase in the number of patents (per asset).
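A quick arithmetic check of this elasticity reading of a log-log coefficient; the 1.748 is the slope reported above, everything else is illustrative:

```python
beta = 1.748                     # slope of the log-log regression reported above

# If log(y) = alpha + beta*log(x), then y = exp(alpha) * x**beta,
# so multiplying x by 1.01 (a 1% increase) multiplies y by 1.01**beta.
factor = 1.01 ** beta
print(f"y is multiplied by {factor:.5f}, i.e. a {100 * (factor - 1):.2f}% increase")
# Roughly 1.75%: the statement "a 1% increase in x gives a beta% increase in y"
# is a first-order approximation, exact only for infinitesimal changes.
```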

Application to Data using SPSS: Analyze  Regression  Linear

Assessing the Goodness of Fit
 It is important to ask whether a specification provides a good prediction of the dependent variable, given values of the independent variable.
 Ideally, we want an indicator of the proportion of variance of the dependent variable that is accounted for, or explained, by the statistical model.
 This amounts to comparing the variance of the predictions (ŷ) with the variance of the residuals (ε), since by construction the two sum to the overall variance of the dependent variable (y).

Overall Variance

Decomposing the overall variance (1)

Decomposing the overall variance (2)

Coefficient of determination R²  R² is a statistic which provides information on the goodness of fit of the model.
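A sketch of the decomposition behind R², on the same made-up data: the total sum of squares splits into the part explained by the fitted values and the residual part, and R² is the explained share.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# OLS fit (closed form, as above)
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x

tss = np.sum((y - y.mean()) ** 2)          # total sum of squares
ess = np.sum((y_hat - y.mean()) ** 2)      # explained ("fitted") sum of squares
rss = np.sum((y - y_hat) ** 2)             # residual sum of squares

print(f"TSS = {tss:.3f}, ESS + RSS = {ess + rss:.3f}")   # identical by construction
print(f"R^2 = ESS/TSS = {ess / tss:.4f} = 1 - RSS/TSS = {1 - rss / tss:.4f}")
```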

Fisher’s F Statistic
 Fisher’s statistic is relevant as a form of ANOVA on SS_fit, which tells us whether the regression model brings significant (in a statistical sense) information.

Model      SS         df           MS = SS / df
Fitted     SS_fit     p            MS_fit = SS_fit / p
Residual   SS_resid   N - p - 1    MS_resid = SS_resid / (N - p - 1)
Total      SS_total   N - 1

F = MS_fit / MS_resid
p: number of parameters, N: number of observations
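A sketch of how the table yields the F statistic in the simple regression case (p = 1), still on made-up data; scipy is assumed to be available for the tail probability.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
N, p = len(y), 1                               # N observations, p slope parameters

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x

ss_fit   = np.sum((y_hat - y.mean()) ** 2)     # df = p
ss_resid = np.sum((y - y_hat) ** 2)            # df = N - p - 1
ms_fit, ms_resid = ss_fit / p, ss_resid / (N - p - 1)

F = ms_fit / ms_resid
p_value = stats.f.sf(F, p, N - p - 1)          # upper-tail probability
print(f"F({p}, {N - p - 1}) = {F:.2f}, p-value = {p_value:.4f}")
```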

Application to Data using SPSS: Analyze  Regression  Linear

What the R² is not
The R² does not tell us whether:
 the independent variables are a true cause of the changes in the dependent variable
 the correct regression was used
 the most appropriate set of independent variables has been chosen
 there is collinearity present in the data
 the model could be improved by using transformed versions of the existing set of independent variables

Inference on β
 We have estimated β̂.
 Therefore we must test whether the estimated parameter is significantly different from 0, and, by way of consequence, we must say something about the distribution (the mean and variance) of the true but unobserved β*.

The mean and variance of β
 It is possible to show that β̂ is a good approximation, i.e. an unbiased estimator, of the true parameter β*.
 The variance of β̂ is defined as the ratio of the mean square of the errors over the sum of squared deviations of the explanatory variable: Var(β̂) = MSE / Σ_i (x_i − x̄)².

The confidence interval of β
 We must now define the confidence interval of β, at 95%. To do so, we use the mean and variance of β̂ and define the t value as follows: t = (β̂ − β*) / SE(β̂), which follows a Student distribution with N − 2 degrees of freedom.
 Therefore, the 95% confidence interval of β is: β̂ ± t_(α/2) × SE(β̂).
 If the 95% CI does not include 0, then β is significantly different from 0.
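A sketch of the standard error and 95% confidence interval of β̂, continuing the same made-up example (with one explanatory variable and an intercept, the degrees of freedom are N − 2; scipy supplies the Student-t critical value).

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
N = len(y)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
resid = y - (a + b * x)

mse   = np.sum(resid ** 2) / (N - 2)           # mean square of the errors
var_b = mse / np.sum((x - x.mean()) ** 2)      # Var(beta_hat) = MSE / sum((x - x_bar)^2)
se_b  = np.sqrt(var_b)

t_crit = stats.t.ppf(0.975, df=N - 2)          # two-sided 95% critical value
low, high = b - t_crit * se_b, b + t_crit * se_b
print(f"beta_hat = {b:.3f}, se = {se_b:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
# If this interval does not contain 0, beta is significantly different from 0 at the 5% level.
```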

Student t Test for β
 We are also in a position to test hypotheses on β:
H0: β* = 0
H1: β* ≠ 0
Decision rule:
Accept H0 if | t | < t_(α/2)
Reject H0 if | t | ≥ t_(α/2)
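The corresponding t test, continuing the same sketch: the statistic is β̂ divided by its standard error, compared with the two-sided critical value at α = 5% (equivalently summarised by a p-value).

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
N = len(y)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
resid = y - (a + b * x)
se_b = np.sqrt(np.sum(resid ** 2) / (N - 2) / np.sum((x - x.mean()) ** 2))

t_stat  = b / se_b                             # test statistic under H0: beta* = 0
t_crit  = stats.t.ppf(0.975, df=N - 2)         # alpha = 5%, two-sided
p_value = 2 * stats.t.sf(abs(t_stat), df=N - 2)

decision = "reject H0" if abs(t_stat) >= t_crit else "accept H0"
print(f"t = {t_stat:.2f}, critical value = {t_crit:.2f}, p = {p_value:.4f} -> {decision}")
# In the simple regression case, t**2 equals the F statistic of the ANOVA table.
```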

Application to Data using SPSS: Analyze  Regression  Linear

Assignments on CERAM_BIO
 Regress the number of patents on R&D expenses and consider:
1. The quality of the fit
2. The significance and direction of the effect of R&D expenses
3. The interpretation of the result in an economic sense
 Repeat steps 1 to 3 using:
 R&D expenses divided by one million (you need to generate a new variable for that)
 The log of R&D expenses
 What do you observe? Why?