Multivariate Regression Model


Multivariate Regression Model: y = β0 + β1x1 + β2x2 + β3x3 + … + ε. The OLS estimates b0, b1, b2, b3, … are sample statistics used to estimate β0, β1, β2, β3, … respectively. y is the DEPENDENT variable; each of the xj is an INDEPENDENT variable.
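The OLS estimates can be computed directly. A minimal sketch in Python (using numpy and made-up illustrative data and parameter values, not data from these slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1, x2, x3 = rng.normal(size=(3, n))
# Illustrative "true" parameters; the noise term plays the role of epsilon.
y = 2.0 + 1.5 * x1 - 0.5 * x2 + 0.8 * x3 + rng.normal(scale=0.1, size=n)

X = np.column_stack([np.ones(n), x1, x2, x3])   # column of ones for beta0
b, *_ = np.linalg.lstsq(X, y, rcond=None)       # OLS estimates b0, b1, b2, b3
print(np.round(b, 2))                           # close to 2.0, 1.5, -0.5, 0.8
```

With a small error variance, the printed b-values land close to the true β-values used to generate the data.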

Conditions: Each explanatory variable xj is assumed (IA) to be deterministic or non-random; (IB) to come from a 'fixed' population; (IC) to have a variance V(xj) which is not 'too large'. The above assumptions are best suited to a situation of a controlled experiment.

Assumptions concerning the random term ε: (IIA) E(εi) = 0 for all i; (IIB) Var(εi) = σε², constant for all i; (IIC) Cov(εi, εk) = 0 for any i ≠ k; (IID) each εi has a normal distribution.

Properties of b0, b1, b2, b3: 1. Each of these statistics is a linear function of the y values. 2. Therefore, they all have normal distributions. 3. Each is an unbiased estimator; that is, E(bk) = βk.

4. Each bk is the most efficient of all unbiased estimators.

Thus, each of b0, b1, b2, … is the Best Linear Unbiased Estimator (BLUE) of the respective parameter.

Conclusion: Each estimator bi has a normal distribution with mean βi and variance σbi², where σbi² is unknown.
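The unbiasedness claim can be illustrated by Monte Carlo simulation: across many simulated samples, the average of the OLS estimates is close to the true β-values. A sketch (illustrative true β-values, assumed numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
beta = np.array([1.0, 2.0, -1.0])   # illustrative true beta0, beta1, beta2
n, reps = 30, 2000
estimates = np.empty((reps, 3))
for r in range(reps):
    x1, x2 = rng.normal(size=(2, n))
    X = np.column_stack([np.ones(n), x1, x2])
    y = X @ beta + rng.normal(size=n)          # errors satisfy IIA-IID
    estimates[r], *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(estimates.mean(axis=0), 2))     # close to 1.0, 2.0, -1.0
```

The individual estimates vary from sample to sample (they are random variables), but their average across replications sits on top of the true parameters, which is exactly what E(bk) = βk says.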

Income (£ per week) of an individual is regressed on a constant, education (in years), age (in years) and wealth inheritance (in £), using EViews. The number of observations is 20 and the regression output is given below:

Variable      Coefficient   Std. Error   t-Stat   Prob.
C             …             …            …        …
AGE           8.85          …            …        …
EDUCATION     …             …            …        …
WEALTH        1.51          …            …        …

Significance Level (α): The maximum probability of a Type 1 error equals the significance level.

p-value: The smaller the p-value, the more significant is the test.

The proposed regression model is: Income = β0 + β1(Age) + β2(Education) + β3(Wealth Inheritance) … (A). We are proposing that Income is the variable dependent on three independent variables: Age, Education and Wealth.

β0 is a constant; it measures the effect on Income of other deterministic factors not included in the model. β1, β2, β3 measure the effect of a marginal change in Age, Education and Wealth, respectively.

However, we recognise that there may be other random factors affecting the dependent variable Income. So we add a random variable ε to the model, which now becomes: Income = β0 + β1(Age) + β2(Education) + β3(Wealth Inherited) + ε … (B)

We use the least squares technique to estimate model B. Our estimate of the proposed model B is: Ye = b0 + 8.85·AGE + b2·EDUCATION + 1.51·WEALTH INHERITANCE. Here Ye is the estimated value of income.

The least-squares estimates of the β-values are denoted by b-values: b0 is the estimate of β0, b1 is the estimate of β1, b2 is the estimate of β2 and b3 is the estimate of β3. In our case, b1 = 8.85 (the AGE coefficient) and b3 = 1.51 (the WEALTH coefficient).

We next make the following assumptions on the specification of model B so that the least-squares method produces ‘good’ estimators.

i. ε is normally distributed with mean 0 and an unknown variance σε². In the context of model B, ε can be thought of as a luck factor which can be good (positive values) or bad (negative values). If the positive and negative values cancel out on average, we can say that its mean value is 0.

ii. The ε values are uncorrelated across the population. (Whether or not you are lucky does not influence my being lucky/unlucky.) iii. The ε values have the same variance (σε²) across the population. (Every individual is exposed to the same extent/chance of good or bad luck.)

iv. The ε values are uncorrelated with the independent variables Age, Education and Wealth Inheritance. (For example, an old person is as likely to be lucky as a young one; a university graduate is as likely to be unlucky as someone with no A-levels.)

We now test (at 10% significance) the following hypothesis: Education has a positive effect on income. Step 1: Set up the hypotheses. H0: β2 = 0 (Education has no effect); H1: β2 > 0 (Education has a positive effect). This is a one-tailed test.

Step 2: Select the statistic. The estimator b2 is the test statistic. Step 3: Identify the distribution of b2.

Assumptions i–iv above imply that b2 is the Best Linear (in the dependent variable Income) Unbiased Estimator of β2.

Since b2 is unbiased, E(b2) = β2. b2 has a normal distribution because it is linear in Income. Thus, b2 ~ N(β2, σb2²), where σb2² is unknown.

Step 4: Construct the test statistic. The test statistic t ≡ (b2 − β2) / (standard error of b2) has a Student's t-distribution with 20 − 4 = 16 degrees of freedom. We use the standard error of b2 because we do not know what σb2 is.
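Steps 4 and 5 can be sketched in Python on simulated data (the data and true coefficients below are illustrative placeholders, not the EViews output from these slides):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 20, 4                        # 20 observations, 4 coefficients -> 16 d.o.f.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([5.0, 1.0, 2.0, 0.5]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)          # OLS estimates
resid = y - X @ b
s2 = resid @ resid / (n - k)                   # unbiased estimate of sigma^2
se = np.sqrt(s2 * np.linalg.inv(X.T @ X).diagonal())

t_stat = b[2] / se[2]                          # test H0: beta2 = 0
t_crit = 1.337                                 # t-table: one tail, alpha = 0.10, 16 d.o.f.
print(t_stat > t_crit)                         # True -> reject H0
```

The standard error formula se(bk) = sqrt(s² · [(XᵀX)⁻¹]kk) is the standard OLS result; EViews reports the same quantity in its Std. Error column.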

As β2 = 0 under the null hypothesis (H0), t = b2 / (standard error of b2). EViews therefore gives us a t-statistic for EDUCATION, together with the corresponding probability value.

In Excel, select fx / TDIST. For X, enter the t-statistic value; the degrees of freedom are 16. EViews calculates a two-tail probability, so the number of tails is 2. Excel then returns the 2-tail probability. Since we are performing a one-tail test, take half of this probability value.
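The Excel TDIST step can be replicated in Python, assuming scipy is available; the t-value 2.5 below is an illustrative placeholder, not the value from the slides' output:

```python
from scipy import stats

t_value, df = 2.5, 16                       # placeholder t-statistic, 16 d.o.f.
p_two_tail = 2 * stats.t.sf(t_value, df)    # what Excel's TDIST(x, 16, 2) returns
p_one_tail = p_two_tail / 2                 # halve it for a one-tail test

t_crit = stats.t.ppf(1 - 0.10, df)          # one-tail critical value at alpha = 0.10
print(round(p_one_tail, 4), round(t_crit, 3))
```

`stats.t.sf` is the upper-tail probability, so doubling it reproduces Excel's two-tail TDIST, and `stats.t.ppf` recovers the t-table critical value (about 1.337 for 16 d.o.f. at α = 0.10).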

Step 5: Compare with the critical value tC. tC = 1.337 for a one-tailed test with significance level (α) = 0.1 and d.o.f. = 16, and tC < t, so the test statistic exceeds the critical value.

Step 6: Draw the conclusion. The test is significant: reject H0 at 10% and at 5% (the one-tail p-value is below 0.05) but not at 1% (it is above 0.01). Step 7: Interpret the result. The data support (with at least 98% confidence) the hypothesis that EDUCATION is an important explanatory variable affecting income.

In rejecting H0, we are prone to making a Type 1 error. The probability of a Type 1 error here is nothing but the area to the right of the t-statistic, i.e. the one-tail p-value.

Example 2: Use output 2 to test the hypothesis (at 5% significance) that weightgain is proportional to foodvalue. The model: y = α + βx + ε, with the assumptions below (Lec 17). Step 1: Set up the hypotheses. H0: α = 0 (proportionality); H1: α ≠ 0 (non-proportionality). Step 2: The estimator a of the intercept α is the test statistic.

Conditions: The explanatory variable x is assumed (IA) to be deterministic or non-random; (IB) to come from a 'fixed' population; (IC) to have a variance V(x) which is not 'too large'. The above assumptions are best suited to a situation of a controlled experiment.

Assumptions concerning the random term ε: (IIA) E(εi) = 0 for all i; (IIB) Var(εi) = σε², constant for all i; (IIC) Cov(εi, εj) = 0 for any i ≠ j; (IID) each εi has a normal distribution.

Step 3: Thus, a ~ N(α, σa²), where σa² is unknown. Step 4: Therefore, the test statistic t ≡ (a − α) / (standard error of a) has a Student's t-distribution with 10 − 2 = 8 degrees of freedom.
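Steps 3–5 of this example can be sketched on simulated data (all numbers below are illustrative placeholders, not output 2):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
x = rng.uniform(1.0, 5.0, size=n)                    # plays the role of foodvalue
y = 3.0 + 2.0 * x + rng.normal(scale=0.3, size=n)    # true intercept alpha = 3

X = np.column_stack([np.ones(n), x])
b = np.linalg.solve(X.T @ X, X.T @ y)                # a = b[0], slope = b[1]
resid = y - X @ b
s2 = resid @ resid / (n - 2)
se = np.sqrt(s2 * np.linalg.inv(X.T @ X).diagonal())

t_stat = b[0] / se[0]                                # test H0: alpha = 0
t_crit = 2.306                                       # t-table: two tails, alpha = 0.05, 8 d.o.f.
print(abs(t_stat) > t_crit)                          # True -> reject proportionality
```

Because the test is two-tailed, it is |t| that is compared with the critical value; the simulated data were generated with a nonzero intercept, so the test rejects H0.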

Step 5: Compare with the critical value tC. tC = 2.306 for a two-tailed test with significance level 0.05 and d.o.f. = 8, and the test statistic exceeds tC. Step 6: Draw the conclusion. The p-value is < 0.05; the test is significant, so reject H0 at 5%. Step 7: Interpret. Weightgain is not simply proportional to foodvalue: foodvalue is not the only variable that affects weightgain.

Example 3: Use output 3 to test (at 5% significance) the following hypothesis: Exercise has a negative effect on weight gain. The proposed regression model is: Weightgain = β0 + β1(Foodvalue) + β2(Exercise) + ε.

Step 1: Set up the hypotheses. H0: β2 = 0 (Exercise has no effect); H1: β2 < 0 (Exercise has a negative effect).