EC 30031 - Prof. Buckles
Multiple Regression Analysis: y = β0 + β1x1 + β2x2 + … + βkxk + u
2. Inference


Multiple Regression Analysis
y = β0 + β1x1 + β2x2 + … + βkxk + u
2. Inference

Assumptions of the Classical Linear Model (CLM)
So far, we know that given the Gauss-Markov assumptions, OLS is BLUE. In order to do classical hypothesis testing, we need to add another assumption (beyond the Gauss-Markov assumptions): assume that u is independent of x1, x2, …, xk and that u is normally distributed with zero mean and variance σ², that is, u ~ Normal(0, σ²).

CLM Assumptions (cont)
We can summarize the population assumptions of the CLM as follows: y|x ~ Normal(β0 + β1x1 + … + βkxk, σ²). While for now we just assume normality, it is clearly sometimes not the case (e.g., when the dependent variable is a dummy). Large samples will let us drop the normality assumption.

The Homoskedastic Normal Distribution with a Single Explanatory Variable
[Figure: at each value of x (e.g. x1, x2), y has a normal density f(y|x) with constant variance, centered on the regression line E(y|x) = β0 + β1x.]

Normal Sampling Distributions
Under the CLM assumptions, conditional on the sample values of the independent variables, β̂j ~ Normal[βj, Var(β̂j)], so that (β̂j − βj)/sd(β̂j) ~ Normal(0, 1). The β̂j are normally distributed because they are linear combinations of the errors.

The t Test
Under the CLM assumptions, (β̂j − βj)/se(β̂j) ~ t(n − k − 1). Note this is a t distribution (vs. normal) because we have to estimate σ² by σ̂²; the degrees of freedom are n − k − 1.

The t Test (cont)
Knowing the sampling distribution for the standardized estimator allows us to carry out hypothesis tests. Start with a null hypothesis, for example H0: βj = 0. If we fail to reject the null, then we fail to reject the hypothesis that xj has no effect on y, controlling for the other x's.

The t Test (cont)
To perform our test we first need to form the t statistic for β̂j: t = β̂j / se(β̂j). We will then use our t statistic along with a rejection rule to determine whether to reject the null hypothesis, H0: βj = 0.
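As a minimal Python sketch of this ratio (the estimate and standard-error values below are made up purely for illustration):

```python
# t statistic for H0: beta_j = 0: the OLS estimate divided by its standard error.
def t_stat(beta_hat, se):
    return beta_hat / se

# Hypothetical values: an estimate of 0.5 with a standard error of 0.25.
print(t_stat(0.5, 0.25))  # 2.0
```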

t Test: One-Sided Alternatives
Besides our null, H0, we need an alternative hypothesis, H1, and a significance level. H1 may be one-sided or two-sided: H1: βj > 0 and H1: βj < 0 are one-sided, while H1: βj ≠ 0 is a two-sided alternative. If we want to have only a 5% probability of rejecting H0 when it is really true, then we say our significance level is 5%.

One-Sided Alternatives (cont)
Having picked a significance level, α, we look up the (1 − α)th percentile in a t distribution with n − k − 1 df and call this c, the critical value. We can reject the null hypothesis if the t statistic is greater than the critical value; if the t statistic is less than the critical value, we fail to reject the null.

One-Sided Alternatives (cont)
yi = β0 + β1xi1 + … + βkxik + ui, with H0: βj = 0 versus H1: βj > 0.
[Figure: t distribution with the rejection region of area α in the right tail beyond the critical value c; fail to reject for t ≤ c.]

One-Sided vs. Two-Sided
Because the t distribution is symmetric, testing H1: βj < 0 is straightforward: the critical value is just the negative of before. We can reject the null if the t statistic is less than −c; if the t statistic is greater than −c, we fail to reject the null. For a two-sided test, we set the critical value based on α/2, and reject H0 in favor of H1: βj ≠ 0 if the absolute value of the t statistic is greater than c.
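The one- and two-sided rejection rules can be sketched as follows; the critical value c is assumed to have already been looked up in a t table:

```python
# Rejection rules for the t test, given a critical value c > 0
# from the t(n - k - 1) distribution.
def reject_right_tail(t, c):
    # H1: beta_j > 0: reject H0 if t > c
    return t > c

def reject_left_tail(t, c):
    # H1: beta_j < 0: the critical value is just -c; reject H0 if t < -c
    return t < -c

def reject_two_sided(t, c):
    # H1: beta_j != 0: c is based on alpha/2; reject H0 if |t| > c
    return abs(t) > c
```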

Two-Sided Alternatives
yi = β0 + β1xi1 + … + βkxik + ui, with H0: βj = 0 versus H1: βj ≠ 0.
[Figure: t distribution with rejection regions of area α/2 in each tail, beyond −c and c; fail to reject for |t| ≤ c.]

Summary for H0: βj = 0
Unless otherwise stated, the alternative is assumed to be two-sided. If we reject the null, we typically say "xj is statistically significant at the α% level". If we fail to reject the null, we typically say "xj is statistically insignificant at the α% level".

Testing Other Hypotheses
A more general form of the t statistic recognizes that we may want to test something like H0: βj = aj. In this case, the appropriate t statistic is t = (β̂j − aj)/se(β̂j); the standard t test is the special case aj = 0.
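A sketch of this more general t statistic (the numeric values used below are hypothetical):

```python
# t statistic for the general null H0: beta_j = a_j.
def t_stat_general(beta_hat, a_j, se):
    # Subtract the hypothesized value before dividing by the standard error.
    return (beta_hat - a_j) / se

# Hypothetical values: estimate 1.5, null value 1.0, se 0.25.
print(t_stat_general(1.5, 1.0, 0.25))  # 2.0
```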

Confidence Intervals
Another way to use classical statistical testing is to construct a confidence interval, using the same critical value as was used for a two-sided test. A (1 − α) confidence interval is defined as β̂j ± c · se(β̂j), where c is the (1 − α/2) percentile in a t(n − k − 1) distribution.
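A sketch of the interval, assuming the two-sided critical value c has already been obtained from a t table (all numbers hypothetical):

```python
# (1 - alpha) confidence interval: estimate +/- critical value * standard error.
def conf_interval(beta_hat, se, c):
    return (beta_hat - c * se, beta_hat + c * se)

# Hypothetical values: estimate 1.0, se 0.5, critical value 2.0.
print(conf_interval(1.0, 0.5, 2.0))  # (0.0, 2.0)
```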

Computing p-values for t Tests
An alternative to the classical approach is to ask, "What is the smallest significance level at which the null would be rejected?" So, compute the t statistic and then look up what percentile it is in the appropriate t distribution: this is the p-value. The p-value is the probability of observing a t statistic as extreme as we did if the null were true.

Stata and p-values, t Tests, etc.
Most computer packages will compute the p-value for you, assuming a two-sided test. If you really want a one-sided alternative, just divide the two-sided p-value by 2. Stata provides the t statistic, p-value, and 95% confidence interval for H0: βj = 0, in the columns labeled "t", "P > |t|" and "[95% Conf. Interval]", respectively.

Testing a Linear Combination
Suppose that instead of testing whether β1 is equal to a constant, you want to test whether it is equal to another parameter, that is, H0: β1 = β2. Use the same basic procedure for forming a t statistic: t = (β̂1 − β̂2)/se(β̂1 − β̂2).

Testing Linear Combo (cont)
Since Var(β̂1 − β̂2) = Var(β̂1) + Var(β̂2) − 2Cov(β̂1, β̂2), the standard error of the difference is se(β̂1 − β̂2) = {[se(β̂1)]² + [se(β̂2)]² − 2s12}^(1/2), where s12 is an estimate of Cov(β̂1, β̂2).

Testing a Linear Combo (cont)
So, to use the formula, we need s12, which standard output does not report. Many packages have an option to get it, or will just perform the test for you. In Stata, after reg y x1 x2 … xk you would type test x1 = x2 to get a p-value for the test. More generally, you can always restate the problem to get the test you want.
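A sketch of the t statistic for H0: β1 = β2, assuming s12 (the estimated covariance of the two coefficient estimates) has been recovered from the software; all numeric inputs below are hypothetical:

```python
import math

# t statistic for H0: beta_1 = beta_2.
def t_stat_diff(b1, b2, se1, se2, s12):
    # se(b1 - b2) = sqrt(se1^2 + se2^2 - 2*s12), with s12 = est. Cov(b1, b2)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * s12)
    return (b1 - b2) / se_diff
```

With hypothetical inputs b1 = 1.0, b2 = 0.5, se1 = 0.3, se2 = 0.4 and a zero covariance, the denominator is sqrt(0.09 + 0.16) = 0.5, giving a t statistic of about 1.0.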

Example: Suppose you are interested in the effect of campaign expenditures on outcomes. The model is voteA = β0 + β1 log(expendA) + β2 log(expendB) + β3 prtystrA + u. The null is H0: β1 = −β2, or H0: θ1 = β1 + β2 = 0. Since β1 = θ1 − β2, substitute in and rearrange: voteA = β0 + θ1 log(expendA) + β2 [log(expendB) − log(expendA)] + β3 prtystrA + u.

Example (cont): This is the same model as originally, but now you get a standard error for β1 + β2 = θ1 directly from the basic regression. Any linear combination of parameters could be tested in a similar manner. Other examples of hypotheses about a single linear combination of parameters: β1 = 1 + β2; β1 = 5β2; β1 = −(1/2)β2; etc.

Multiple Linear Restrictions
Everything we have done so far has involved testing a single linear restriction (e.g. β1 = 0, or β1 = β2). However, we may want to jointly test multiple hypotheses about our parameters. A typical example is testing "exclusion restrictions": we want to know if a group of parameters are all equal to zero.

Testing Exclusion Restrictions
Now the null hypothesis might be something like H0: βk−q+1 = 0, …, βk = 0. The alternative is just H1: H0 is not true. We can't just check each t statistic separately, because we want to know if the q parameters are jointly significant at a given level; it is possible for none of them to be individually significant at that level.

Exclusion Restrictions (cont)
To do the test we need to estimate the "restricted model" without xk−q+1, …, xk included, as well as the "unrestricted model" with all of the x's included. Intuitively, we want to know if the change in SSR is big enough to warrant inclusion of xk−q+1, …, xk.

The F Statistic
The F statistic is F = [(SSRr − SSRur)/q] / [SSRur/(n − k − 1)], where r denotes the restricted model and ur the unrestricted model, q = the number of restrictions (dfr − dfur), and n − k − 1 = dfur. The F statistic is always positive, since the SSR from the restricted model can't be less than the SSR from the unrestricted model. Essentially, the F statistic measures the relative increase in SSR when moving from the unrestricted to the restricted model.
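A sketch of this calculation; the SSR values, q, n, and k below are hypothetical:

```python
# F statistic for q exclusion restrictions:
# F = [(SSR_r - SSR_ur)/q] / [SSR_ur/(n - k - 1)]
def f_stat(ssr_r, ssr_ur, q, n, k):
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))

# Hypothetical values: SSR_r = 120, SSR_ur = 100, q = 2, n = 103, k = 2.
print(f_stat(120.0, 100.0, 2, 103, 2))  # 10.0
```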

The F Statistic (cont)
To decide if the increase in SSR when we move to a restricted model is "big enough" to reject the exclusions, we need to know about the sampling distribution of our F stat. Not surprisingly, F ~ F(q, n − k − 1), where q is referred to as the numerator degrees of freedom and n − k − 1 as the denominator degrees of freedom.

The F Statistic (cont)
[Figure: the F(q, n − k − 1) density f(F), with critical value c: reject H0 at the α significance level if F > c; otherwise fail to reject.]

The R² Form of the F Statistic
Because the SSRs may be large and unwieldy, an alternative form of the formula is useful. We use the fact that SSR = SST(1 − R²) for any regression, so we can substitute in for SSRr and SSRur to obtain F = [(R²ur − R²r)/q] / [(1 − R²ur)/(n − k − 1)].
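A sketch of the R² form; the R² values and degrees of freedom below are made up for illustration:

```python
# R^2 form of the F statistic:
# F = [(R2_ur - R2_r)/q] / [(1 - R2_ur)/(n - k - 1)]
def f_stat_r2(r2_ur, r2_r, q, n, k):
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - k - 1))

# Hypothetical values: R2_ur = 0.75, R2_r = 0.5, q = 1, n = 104, k = 3.
print(f_stat_r2(0.75, 0.5, 1, 104, 3))
```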

Overall Significance
A special case of exclusion restrictions is to test H0: β1 = β2 = … = βk = 0. Since the R² from a model with only an intercept will be zero, the F statistic is simply F = [R²/k] / [(1 − R²)/(n − k − 1)].
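A sketch of this special case, where the restricted model has only an intercept (hypothetical numbers again):

```python
# Overall-significance F statistic: the restricted model contains only an
# intercept, so its R^2 is zero and the R^2 form simplifies to
# F = [R^2/k] / [(1 - R^2)/(n - k - 1)].
def f_overall(r2, n, k):
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# Hypothetical values: R^2 = 0.5, n = 103, k = 2.
print(f_overall(0.5, 103, 2))
```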

General Linear Restrictions
The basic form of the F statistic will work for any set of linear restrictions. First estimate the unrestricted model, then estimate the restricted model; in each case, make note of the SSR. Imposing the restrictions can be tricky: you will likely have to redefine variables again.

Example: Use the same voting model as before. The model is voteA = β0 + β1 log(expendA) + β2 log(expendB) + β3 prtystrA + u, and now the null is H0: β1 = 1, β3 = 0. Substituting in the restrictions gives voteA = β0 + log(expendA) + β2 log(expendB) + u, so use voteA − log(expendA) = β0 + β2 log(expendB) + u as the restricted model.

F Statistic Summary
Just as with t statistics, p-values can be calculated by looking up the percentile in the appropriate F distribution. Stata will do this if you enter: display fprob(q, n − k − 1, F), where the appropriate values of F, q, and n − k − 1 are used. If only one exclusion is being tested, then F = t², and the p-values will be the same.