Lecture Ten. Part I: Regression. Part II: Experimental Method.

Presentation transcript:

1 Lecture Ten

2 Lecture Part I: Regression Part II: Experimental Method

3 Outline: Regression The Assumptions of Least Squares The Pathologies of Least Squares Diagnostics for Least Squares

4 Assumptions The expected value of the error is zero, E[e(t)] = 0. The error is independent of the explanatory variable, E{e(t)[x(t) − Ex(t)]} = 0. The errors are independent of one another, E[e(i)e(j)] = 0 for i not equal to j. The variance is homoskedastic, E[e(i)²] = E[e(j)²]. The error is normal with mean zero and constant variance.

5 Error Variable: Required Conditions The error ε is a critical part of the regression model. Four requirements involving the distribution of ε must be satisfied. –The probability distribution of ε is normal. –The mean of ε is zero: E(ε) = 0. –The standard deviation of ε is σ_ε, a constant, for all values of x. –The errors associated with different values of y are all independent.

6 The Normality of ε From the first three assumptions we have: y is normally distributed with mean E(y) = β0 + β1x and a constant standard deviation σ_ε. [Figure: normal curves for y centered at E(y|x1) = β0 + β1x1, E(y|x2) = β0 + β1x2, and E(y|x3) = β0 + β1x3. The standard deviation remains constant, but the mean value changes with x.]
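The model on this slide can be illustrated with a short simulation. This is a minimal sketch, not part of the lecture: it generates data from y = β0 + β1x + ε with normal, constant-variance errors (the assumptions above) and recovers the slope and intercept by least squares. The parameter values are arbitrary.

```python
import random

# Sketch (illustrative parameters): simulate y = b0 + b1*x + e,
# with e normal, mean zero, constant standard deviation.
random.seed(0)
b0, b1, sigma = 2.0, 0.5, 1.0
x = [i / 10 for i in range(200)]
y = [b0 + b1 * xi + random.gauss(0, sigma) for xi in x]

# Least-squares estimates of slope and intercept.
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1_hat = sxy / sxx
b0_hat = ybar - b1_hat * xbar
print(b1_hat, b0_hat)  # should be close to 0.5 and 2.0
```

Because the errors satisfy the assumptions, the estimates land close to the true parameters, and they get closer as the sample grows.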

7 Pathologies Cross-section data: the error variance is heteroskedastic. For example, it could vary with firm size. Consequence: not all the available information is used efficiently, and better estimates of the standard errors of the regression parameters are possible. Time-series data: the errors are serially correlated, i.e., autocorrelated. Consequence: inefficiency.

8 Pathologies (Cont.) The explanatory variable is not independent of the error. Consequence: inconsistency, i.e., larger sample sizes do not lead to lower standard errors for the parameters, and the parameter estimates (slope etc.) are biased. The error is not distributed normally. For example, there may be fat tails. Consequence: using the normal distribution may understate the true 95% confidence intervals.

9 Pathologies (Cont.) Multicollinearity: the independent variables may be highly correlated. As a consequence, they do not truly represent separate causal factors but instead reflect a common causal factor.

10 Regression Diagnostics - I The three conditions required for the validity of the regression analysis are: –the error variable is normally distributed. –the error variance is constant for all values of x. –the errors are independent of each other. How can we diagnose violations of these conditions?

11 Residual Analysis Examining the residuals (or standardized residuals) helps detect violations of the required conditions. Example 18.2 – continued: –Nonnormality: use Excel to obtain the standardized residual histogram. Examine the histogram and look for a bell-shaped diagram with a mean close to zero.

12 Diagnostics (Cont.) Multicollinearity may be suspected if the t-statistics for the coefficients of the explanatory variables are not significant but the coefficient of determination is high. The correlation between the explanatory variables can then be calculated to see whether it is high.
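As a sketch of this second check (made-up data, not from the lecture), the sample correlation between two explanatory variables can be computed directly:

```python
import math

# Hypothetical explanatory variables: x2 is nearly a multiple of x1,
# so the two are highly collinear.
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x2 = [2.1, 3.9, 6.2, 8.0, 10.1, 11.8, 14.2, 16.1, 17.9, 20.0]

n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n
cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
s1 = math.sqrt(sum((a - m1) ** 2 for a in x1))
s2 = math.sqrt(sum((b - m2) ** 2 for b in x2))
r = cov / (s1 * s2)  # sample correlation coefficient
print(r)  # a value near 1 signals multicollinearity
```

A correlation this close to 1 means the two regressors carry nearly the same information, which is what inflates the standard errors of their separate coefficients.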

13 Diagnostics Is the error normal? Using EViews, with the view menu in the regression window, a histogram of the distribution of the estimated error is available, along with the coefficients of skewness and kurtosis, and the Jarque-Bera statistic testing for normality.
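EViews reports the Jarque-Bera statistic directly; as a sketch of what it computes, JB = (n/6)[S² + (K − 3)²/4], where S is the sample skewness and K the sample kurtosis of the residuals. The tiny symmetric sample below is illustrative only.

```python
def jarque_bera(e):
    # JB = (n/6) * (S^2 + (K - 3)^2 / 4); values near 0 are
    # consistent with normally distributed residuals.
    n = len(e)
    mean = sum(e) / n
    m2 = sum((v - mean) ** 2 for v in e) / n
    m3 = sum((v - mean) ** 3 for v in e) / n
    m4 = sum((v - mean) ** 4 for v in e) / n
    s = m3 / m2 ** 1.5   # skewness (0 for a symmetric sample)
    k = m4 / m2 ** 2     # kurtosis (3 for a normal distribution)
    return (n / 6) * (s ** 2 + (k - 3) ** 2 / 4)

# Illustrative residuals: symmetric, so skewness is exactly zero.
print(jarque_bera([-2, -1, 0, 1, 2]))
```

Under normality JB is approximately chi-squared with 2 degrees of freedom, so large values (roughly above 6) cast doubt on the normality of the errors.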

14 Diagnostics (Cont.) To detect heteroskedasticity: if there are sufficient observations, plot the estimated errors against the fitted values of the dependent variable.

15 Heteroscedasticity When the requirement of a constant variance is violated, we have a condition of heteroscedasticity. Diagnose heteroscedasticity by plotting the residuals against the predicted values ŷ. [Figure: residuals plotted against ŷ; the spread of the residuals increases with ŷ.]
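A rough numerical version of this visual check (a sketch with simulated data, not the lecture's method): generate errors whose spread grows with the fitted value, then confirm that the absolute residuals are positively correlated with the fitted values.

```python
import random

random.seed(1)
# Hypothetical fitted values; the error standard deviation is
# proportional to the fitted value, i.e. heteroskedastic.
fitted = [1 + 0.1 * i for i in range(300)]
resid = [random.gauss(0, 0.1 * f) for f in fitted]

# Correlation between |residual| and fitted value: a clearly
# positive value mirrors the fanning-out pattern in the plot.
n = len(fitted)
a = [abs(e) for e in resid]
ma, mf = sum(a) / n, sum(fitted) / n
cov = sum((u - ma) * (v - mf) for u, v in zip(a, fitted))
sa = sum((u - ma) ** 2 for u in a) ** 0.5
sf = sum((v - mf) ** 2 for v in fitted) ** 0.5
corr = cov / (sa * sf)
print(corr)  # positive, since the spread increases with fitted y
```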

16 Homoscedasticity When the requirement of a constant variance is not violated, we have a condition of homoscedasticity. Example continued.

17 Diagnostics ( Cont.) Autocorrelation: The Durbin-Watson statistic is a scalar index of autocorrelation, with values near 2 indicating no autocorrelation and values near zero indicating autocorrelation. Examine the plot of the residuals in the view menu of the regression window in EViews.
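The Durbin-Watson statistic itself is simple to compute: DW = Σ(e_t − e_{t−1})² / Σe_t². The sketch below (simulated errors, not from the lecture) contrasts independent errors with strongly autocorrelated ones.

```python
import random

def durbin_watson(e):
    # Sum of squared successive differences over the sum of squares.
    # Near 2: no autocorrelation; near 0: positive autocorrelation.
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v ** 2 for v in e)
    return num / den

random.seed(2)
iid = [random.gauss(0, 1) for _ in range(500)]

# AR(1) errors: e(t) = 0.9 * e(t-1) + u(t), strongly autocorrelated.
ar = [0.0]
for _ in range(499):
    ar.append(0.9 * ar[-1] + random.gauss(0, 1))

print(durbin_watson(iid))  # near 2
print(durbin_watson(ar))   # well below 2
```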

18 Non-Independence of Error Variables –Data collected over time constitute a time series. –If the errors are independent, no pattern should be observed when the residuals are examined over time. –When a pattern is detected, the errors are said to be autocorrelated. –Autocorrelation can be detected by graphing the residuals against time.

19 Non-Independence of Error Variables [Figure: two plots of residuals against time. Patterns in the appearance of the residuals over time indicate that autocorrelation exists. Left: note the runs of positive residuals, replaced by runs of negative residuals. Right: note the oscillating behavior of the residuals around zero.]

20 Fix-Ups The error is not distributed normally. For example, consider a regression of personal income on explanatory variables. Sometimes a transformation, such as regressing the natural logarithm of income on the explanatory variables, may make the error closer to normal.
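A sketch of why the log transformation helps (simulated, income-like data, not the lecture's dataset): income distributions are strongly right-skewed, and taking logs removes most of that skewness.

```python
import math
import random

def skewness(v):
    # Sample skewness: third central moment over m2^(3/2).
    n = len(v)
    m = sum(v) / n
    m2 = sum((x - m) ** 2 for x in v) / n
    m3 = sum((x - m) ** 3 for x in v) / n
    return m3 / m2 ** 1.5

random.seed(3)
# Hypothetical right-skewed "incomes": lognormal draws.
income = [math.exp(random.gauss(10, 0.8)) for _ in range(2000)]
log_income = [math.log(y) for y in income]

print(skewness(income))      # large and positive (fat right tail)
print(skewness(log_income))  # near zero after the transformation
```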

21 Fix-ups (Cont.) If the explanatory variable is not independent of the error, look for a substitute that is highly correlated with the dependent variable but is independent of the error. Such a variable is called an instrument.

22 Data Errors May lead to outliers. Typos may lead to outliers, and looking for outliers is a good way to check for serious typos.

23 Outliers An outlier is an observation that is unusually small or large. Several possibilities need to be investigated when an outlier is observed: –There was an error in recording the value. –The point does not belong in the sample. –The observation is valid. Identify outliers from the scatter diagram. It is customary to suspect that an observation is an outlier if its |standardized residual| > 2.
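The |standardized residual| > 2 rule can be sketched as follows (a tiny made-up dataset with one planted outlier, using the simple residual-over-s standardization rather than a studentized residual):

```python
# Made-up data: y = x exactly, except one planted outlier at x = 5.
x = list(range(11))
y = [float(i) for i in x]
y[5] = 15.0  # the outlier

# Fit the least-squares line.
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
     sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar

# Standardize residuals by s, the standard error of the regression.
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = (sum(e ** 2 for e in resid) / (n - 2)) ** 0.5
flagged = [i for i, e in enumerate(resid) if abs(e / s) > 2]
print(flagged)  # → [5]
```

Only the planted point exceeds the threshold; the remaining residuals standardize to about 0.3 in absolute value.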

24 [Figure: two scatter diagrams. Left: an outlier causes a shift in the regression line. Right: some outliers may be very influential, pulling the fitted line toward the influential observation.]

25 Procedure for Regression Diagnostics Develop a model that has a theoretical basis. Gather data for the two variables in the model. Draw the scatter diagram to determine whether a linear model appears to be appropriate. Determine the regression equation. Check the required conditions for the errors. Check for the existence of outliers and influential observations. Assess the model fit. If the model fits the data, use the regression equation.

26 Part II: Experimental Method

27 Outline Critique of Regression

28 Critique of Regression Samples of opportunity rather than random samples Uncontrolled causal variables –omitted variables –unmeasured variables Insufficient theory to properly specify the regression equation

29 Experimental Method: # Examples Deterrence Aspirin Miles per Gallon

30 Deterrence and the Death Penalty

31 Isaac Ehrlich Study of the Death Penalty –Homicide rate per capita –Control variables: –probability of arrest –probability of conviction given charged –probability of execution given conviction –Causal variables: –labor force participation rate –unemployment rate –percent population aged years –permanent income –trend

32 Long Swings in the Homicide Rate in the US. Source: Report to the Nation on Crime and Justice

33 Ehrlich Results: Elasticities of Homicide with Respect to Controls. Source: Isaac Ehrlich, “The Deterrent Effect of Capital Punishment”

34 Critique of Ehrlich by Death Penalty Opponents –Time period used: a period of declining probability of execution –Ehrlich did not include probability of imprisonment given conviction as a control variable –The causal variables included are unconvincing as causes of homicide

35 United States Bureau of Justice Statistics

36 Experimental Method Police intervention in family violence

37 United States Bureau of Justice Statistics

38 United States Bureau of Justice Statistics

39 Police Intervention with Experimental Controls –A 911 call from a family member –The case is randomly assigned for “treatment” –A police patrol responds and visits the household –Police calm down the family members –Based on the treatment randomly assigned, the police carry out the sanctions

40 Why Is Treatment Assigned Randomly? –To control for unknown causal factors –Assign known numbers of cases, for example equal numbers, to each treatment –With this procedure, there should be an even distribution of difficult cases in each treatment group
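The randomization step above can be sketched in a few lines (hypothetical case IDs; a simple shuffle-and-split that gives equal numbers in each treatment group):

```python
import random

# Hypothetical case IDs for 100 domestic-violence calls.
cases = list(range(100))

# Shuffle, then split in half: equal numbers per treatment, and
# unknown causal factors are spread evenly across groups on average.
random.seed(4)
random.shuffle(cases)
code_blue = cases[:50]   # verbal warning
code_gold = cases[50:]   # night in jail
print(len(code_blue), len(code_gold))  # → 50 50
```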

41 [Flow diagram: a 911 call (characteristics of household participants unknown) → random assignment to code blue or code gold → the patrol responds and settles the household → code blue: verbally warn the husband; code gold: take the husband to jail for the night]

42 Experimental Method: Clinical Trials Doctors volunteer and are randomly assigned to two groups: the treatment group takes an aspirin a day; the control group takes a placebo (sugar pill) per day. After 5 years, the 11,037 experimentals have 139 heart attacks (fatal and non-fatal), p_E = 139/11,037 ≈ 0.0126; after 5 years, the 11,034 controls have 239 heart attacks, p_C = 239/11,034 ≈ 0.0217.

43 Conclusions from the Clinical Trials Hypotheses: H0: p_C = p_E, or p_C − p_E = 0; Ha: p_C − p_E > 0. Statistic: Z = [(p̂_C − p̂_E) − (p_C − p_E)] / SE(p̂_C − p̂_E). Var(p̂_C − p̂_E) = Var(p̂_C) + Var(p̂_E); recall, from the variance for a proportion, SE(p̂_C − p̂_E) = {[p̂_C(1 − p̂_C)]/n_C + [p̂_E(1 − p̂_E)]/n_E}^1/2 = {[0.0217(1 − 0.0217)/11,034] + [0.0126(1 − 0.0126)/11,037]}^1/2 ≈ 0.00175, so z = (0.0217 − 0.0126)/0.00175 ≈ 5.2.
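The z computation can be reproduced directly from the trial counts:

```python
import math

# Heart-attack counts from the aspirin trial slides:
# controls (placebo) and experimentals (aspirin).
n_c, attacks_c = 11034, 239
n_e, attacks_e = 11037, 139

p_c = attacks_c / n_c  # about 0.0217
p_e = attacks_e / n_e  # about 0.0126

# Standard error of the difference of two sample proportions.
se = math.sqrt(p_c * (1 - p_c) / n_c + p_e * (1 - p_e) / n_e)
z = (p_c - p_e) / se
print(round(z, 1))  # → 5.2
```

A z of 5.2 is far beyond any conventional critical value, so the null hypothesis of equal heart-attack rates is rejected decisively.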

44 Experimental Method Experimental Design: Paired Comparisons

Table 1: Miles Per Gallon for Brand A and Brand B