Dummy Variables

Introduction Discuss the use of dummy variables in Financial Econometrics. Examine the issue of normality and the use of dummy variables to correct any problem. Show how dummy variables affect the regression. Assess the use of intercept and slope dummy variables.

The Normality Assumption In general we assume the error term is normally distributed. Financial data often fail this assumption because of their volatile nature and the number of outliers. The normality of the error term can be tested using the Bera-Jarque test, which tests for the presence of skewness (non-symmetry) and excess kurtosis (fat tails).

Bera-Jarque Test In effect, this test for normality checks whether the coefficients of skewness and excess kurtosis are jointly equal to 0.
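
A common textbook form of the statistic is sketched below. The notation ($W$, $T$, $b_1$, $b_2$, $u$) is assumed here and is not taken from the slides: $T$ is the sample size, $b_1$ the coefficient of skewness and $b_2$ the coefficient of kurtosis, so $b_2 - 3$ is the excess kurtosis. In practice the moments are estimated from the regression residuals.

```latex
W = T\left[\frac{b_1^{2}}{6} + \frac{(b_2 - 3)^{2}}{24}\right],
\qquad
b_1 = \frac{E[u^{3}]}{(\sigma^{2})^{3/2}},
\qquad
b_2 = \frac{E[u^{4}]}{(\sigma^{2})^{2}}
```

Both terms inside the bracket are zero for a normal distribution, so large values of $W$ count as evidence against normality.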

Bera-Jarque Test Under the null hypothesis that the distribution is normal, the statistic asymptotically follows the chi-squared distribution with 2 degrees of freedom. For example, if we get a Bera-Jarque statistic of 4.78 and the 5% critical value is 5.99, then as 4.78 < 5.99 we would not reject the null hypothesis that the error term is normally distributed. Most computer programmes report this statistic.
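
The decision rule above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the output of any particular econometrics package. The function names and the sample residuals are hypothetical; the only numbers taken from the slide are the statistic 4.78 and the critical value 5.99.

```python
import math

def bera_jarque(residuals):
    """Return (statistic, p_value) for the Bera-Jarque normality test."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of the residuals (dividing by n, not n - 1).
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    stat = n * (skewness ** 2 / 6.0 + excess_kurtosis ** 2 / 24.0)
    # A chi-squared variable with 2 degrees of freedom has survival
    # function exp(-x / 2), so no statistics library is needed here.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# 5% critical value of chi-squared(2): -2 * ln(0.05), roughly 5.99.
CRITICAL_5PCT = -2.0 * math.log(0.05)

def reject_normality(stat, critical=CRITICAL_5PCT):
    """True when the statistic exceeds the critical value."""
    return stat > critical

# The slide's example: 4.78 is below 5.99, so normality is not rejected.
slide_decision = reject_normality(4.78)

# Hypothetical residuals with one large outlier, for illustration only.
sample = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, -0.05, 0.2, -0.15, 5.0]
stat, p_value = bera_jarque(sample)
```

With the hypothetical sample, the single large value drives both the skewness and the kurtosis terms, which is the situation the following slides address.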

Remedies for non-normality Non-normality is often caused by a couple of observations in the tails of the distribution; these observations are often termed outliers. The simplest way to solve the problem is to use a dummy variable, often called an impulse dummy variable, which takes the value of 1 for the outlier observation and 0 for every other observation. This has the effect of forcing the residual for this observation to 0. To determine where the outlier is, we could simply plot the residuals against time.
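
The claim that an impulse dummy forces the outlier's residual to 0 can be checked numerically. The sketch below relies on a standard result: adding an impulse dummy gives the same intercept and slope as dropping the outlier from the sample, and the dummy coefficient equals the outlier's prediction error. Rather than calling a regression library, the code verifies that these values satisfy the OLS first-order conditions on the full sample. The data and variable names are hypothetical.

```python
def simple_ols(x, y):
    """Closed-form OLS for y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    return ybar - b * xbar, b

# Hypothetical data: the observation at index 5 is an outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 25.0, 14.2, 15.9]
outlier = 5
dummy = [1.0 if t == outlier else 0.0 for t in range(len(x))]

# Fit on the sample without the outlier ...
x_rest = [xi for t, xi in enumerate(x) if t != outlier]
y_rest = [yi for t, yi in enumerate(y) if t != outlier]
a, b = simple_ols(x_rest, y_rest)
# ... and let the dummy coefficient absorb the outlier's prediction error.
d = y[outlier] - (a + b * x[outlier])

# Residuals of y = a + b*x + d*dummy on the full sample.
resid = [yi - (a + b * xi + d * di) for xi, yi, di in zip(x, y, dummy)]

# OLS first-order conditions: residuals orthogonal to every regressor
# (constant, x, and the impulse dummy).
foc = [
    sum(resid),
    sum(r * xi for r, xi in zip(resid, x)),
    sum(r * di for r, di in zip(resid, dummy)),
]
```

All three entries of `foc` are zero up to rounding, and `resid[outlier]` is zero, so the outlier no longer influences the estimated intercept and slope.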

Non-normality The use of this type of dummy variable is controversial, as some argue it is an artificial method of improving the regression that in effect removes the influence of this particular observation. However, an outlier can have an excessively strong effect on a model, giving an unrealistic result, so it needs to be taken into account.

Dummy Variable for Single Outlier In a regression of stock prices against income for the UK, an outlier was noticed for September 1992 (1992 month 9), when the UK left the Exchange Rate Mechanism (ERM). A dummy variable was added to account for this. This produced the following result:

Dummy Variables The previous set of results can be interpreted in the usual way. In this case the dummy variable has a significant t-statistic (4), so the outlier has a significant effect on the regression; put another way, the UK leaving the ERM had a significant effect on UK stock prices. In many cases, however, the outlier will be more difficult to interpret and may not correspond to a particular event.

Dummy Variables Dummy variables are discrete variables taking a value of 0 or 1. They are often called on/off variables, being on when they are 1. Dummy variables can be used either as explanatory variables or as the dependent variable. When they act as the dependent variable there are specific problems with how the regression is interpreted; when they act as explanatory variables, however, they can be interpreted in the same way as other variables.

Types of Explanatory Dummy Variable
Qualitative dummy variables: e.g. age, sex, race, health.
Seasonal dummy variables: the number depends on the nature of the data, so quarterly data requires three dummy variables, etc.
Dummy variables that represent a change in policy:
–Intercept dummy variables, that pick up a change in the intercept of the regression
–Slope dummy variables, that pick up a change in the slope of the regression

Dummy Variables If y is a teacher's salary, with $D_i = 1$ for a non-smoker and $D_i = 0$ for a smoker, we can model this in the following way: $y_i = \alpha + \beta D_i + u_i$, where $u_i$ is the error term.

Dummy Variables This produces an average salary for a smoker of $E(y \mid D_i = 0) = \alpha$. The average salary of a non-smoker will be $E(y \mid D_i = 1) = \alpha + \beta$. If $\beta$ is positive, this suggests that non-smokers receive a higher salary than smokers.
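
In a regression on a constant and a single dummy, the OLS estimates reduce to group means: the intercept is the mean of the D = 0 group and the dummy coefficient is the difference between the two group means. The sketch below illustrates this with hypothetical salary figures; the helper name `group_mean` is also an assumption made for the example.

```python
def group_mean(values, dummy, level):
    """Mean of `values` over observations whose dummy equals `level`."""
    picked = [v for v, d in zip(values, dummy) if d == level]
    return sum(picked) / len(picked)

# Hypothetical salaries (in thousands); D = 1 for non-smokers, 0 for smokers.
salary = [30.0, 32.0, 31.0, 35.0, 36.0, 37.0]
D = [0, 0, 0, 1, 1, 1]

# Intercept estimate: average salary of smokers, E(y | D = 0).
alpha_hat = group_mean(salary, D, 0)
# Dummy coefficient: E(y | D = 1) - E(y | D = 0).
beta_hat = group_mean(salary, D, 1) - alpha_hat
```

Here `alpha_hat` is 31.0 and `beta_hat` is 5.0, so the fitted average for non-smokers is 36.0.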

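Dummy Variables Equally, we could have used the dummy variable in a model with other explanatory variables. In addition to the dummy variable we could also add years of experience (x), to give: $y_i = \alpha + \beta D_i + \gamma x_i + u_i$, where $\gamma$ (a symbol assumed here) is the coefficient on experience.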

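Dummy Variables [Figure: salary y plotted against experience x, with one line for non-smokers and one for smokers; the intercepts are labelled α and α+β.]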

Seasonal Dummy Variables The use of seasonal dummy variables is widespread in finance because of calendar effects on asset prices, such as the day-of-the-week effect. They take the same format as other dummy variables, e.g. a January dummy variable would take the value of 1 for every observation in January and 0 otherwise. For monthly data we include 11 dummy variables, for quarterly data 3, and so on; that is, in a regression with a constant we have as many dummies as months, quarters etc. minus 1. The excluded month acts as the reference category, so the coefficients on all the other dummies measure differences from this reference month.
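
Building the dummy columns is mechanical, as the sketch below suggests. The function `seasonal_dummies` and its arguments are hypothetical names chosen for illustration; it creates one 0/1 column per season and leaves out the chosen reference season.

```python
def seasonal_dummies(periods, n_seasons, reference):
    """Build one 0/1 column per season except the reference season.

    `periods` holds the season of each observation (1..n_seasons).
    Returns a dict mapping season number to its list of 0/1 values.
    """
    return {
        s: [1 if p == s else 0 for p in periods]
        for s in range(1, n_seasons + 1)
        if s != reference
    }

# Two years of quarterly observations: Q1, Q2, Q3, Q4, Q1, ...
quarters = [1, 2, 3, 4, 1, 2, 3, 4]
# Q4 is the reference category, so only Q1-Q3 dummies are created.
dummies = seasonal_dummies(quarters, n_seasons=4, reference=4)
```

The same function with `n_seasons=12` yields the 11 columns used for monthly data.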

Seasonal Dummy variables Consider the following model of share prices for a gas and electricity firm, where the share price is regressed against 3 seasonal dummy variables (using quarterly data).

Seasonal Dummy variables The regression cannot be carried out if all the seasonal dummies are added alongside a constant (i.e. 4 for quarterly data), as there is perfect multicollinearity. Although we can use the t-test to determine if an individual seasonal dummy is significant, we usually use an F-test to determine if they are jointly significant.
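
The source of the perfect multicollinearity is easy to see on a small hypothetical quarterly sample: for every observation exactly one of the four dummies equals 1, so the four columns add up to the constant column. The variable names below are assumptions made for the illustration.

```python
quarters = [1, 2, 3, 4, 1, 2, 3, 4]
constant = [1] * len(quarters)

# One 0/1 column for each of the four quarters.
all_four = [[1 if q == s else 0 for q in quarters] for s in (1, 2, 3, 4)]

# Row by row, the four dummies sum to the constant column, so the
# regressors are perfectly collinear when a constant is included.
row_sums = [sum(col[t] for col in all_four) for t in range(len(quarters))]
sums_to_constant = row_sums == constant

# With the Q4 dummy dropped, the remaining three columns no longer
# sum to the constant column.
three = all_four[:3]
row_sums_three = [sum(col[t] for col in three) for t in range(len(quarters))]
three_sum_to_constant = row_sums_three == constant
```

`sums_to_constant` is `True` and `three_sum_to_constant` is `False`, which is why one season is left out as the reference category.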

Slope Dummy Variables The type of dummy variable considered so far is the intercept dummy variable. We could also use dummy variables to model changes in the slope of the regression line; these are known as slope or interaction dummy variables. We can include either type of dummy variable or, more commonly, both types in a regression, to account for changes in the intercept and slope of the regression line.

Slope Dummy Variables The slope dummy variable consists of a term which is the product of an explanatory variable and dummy variable (Dx):

Slope Dummy Variable Consider the following results from a demand for bank loans (bl) model, with house prices (hp) as the explanatory variable. The dummy variable takes the value of 0 before 1979 and 1 afterwards. The slope dummy determines the change in lending as a result of changes to the credit laws, which made it easier to borrow based on the value of a person's house.

Slope Dummy variables We then get two separate regression lines, before and after 1979, with different intercepts and slope coefficients:
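
As a hedged illustration of how the two regression lines relate to the dummy coefficients (these are not the slide's own results), the sketch below uses a standard equivalence: a model with a constant, x, an intercept dummy D and a slope dummy D·x gives the same fit as separate simple regressions on the D = 0 and D = 1 sub-samples. The data, the helper `simple_ols` and all variable names are hypothetical. The code confirms the equivalence by checking the OLS first-order conditions of the pooled model.

```python
def simple_ols(x, y):
    """Closed-form OLS for y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    return ybar - b * xbar, b

# Hypothetical data: D = 0 in the first regime, D = 1 in the second.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.0, 5.1, 6.9, 9.0, 8.0, 11.1, 13.9, 17.0]
D = [0, 0, 0, 0, 1, 1, 1, 1]

def subsample(level):
    xs = [xi for xi, di in zip(x, D) if di == level]
    ys = [yi for yi, di in zip(y, D) if di == level]
    return xs, ys

# Separate regression lines for each regime.
a0, b0 = simple_ols(*subsample(0))
a1, b1 = simple_ols(*subsample(1))

# Implied coefficients of y = a0 + b0*x + c*D + d*(D*x).
intercept_shift = a1 - a0   # coefficient on the intercept dummy D
slope_shift = b1 - b0       # coefficient on the slope dummy D*x

resid = [
    yi - (a0 + b0 * xi + intercept_shift * di + slope_shift * di * xi)
    for xi, yi, di in zip(x, y, D)
]
# First-order conditions for the regressors 1, x, D and D*x.
foc = [
    sum(resid),
    sum(r * xi for r, xi in zip(resid, x)),
    sum(r * di for r, di in zip(resid, D)),
    sum(r * di * xi for r, xi, di in zip(resid, x, D)),
]
```

The line for the first regime has intercept `a0` and slope `b0`; the line for the second has intercept `a0 + intercept_shift` and slope `b0 + slope_shift`.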

Test for Structural Stability Although the Chow test is usually used to test for a structural break, an alternative test involving the dummy variables can also be used. It involves running two regressions. The first includes the dummy variables (unrestricted model) and the second excludes them (restricted model); the residual sum of squares (RSS) is collected from each. Use the F-test formula to produce the F-statistic and compare it with the critical value, the null hypothesis being that the regression is structurally stable.
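
The F-test formula referred to above is the usual comparison of restricted and unrestricted residual sums of squares. A minimal sketch follows; the function name, argument names and the numbers in the example are hypothetical and are not results from the slides.

```python
def f_statistic(rss_restricted, rss_unrestricted,
                n_restrictions, n_obs, n_params_unrestricted):
    """F = ((RSS_R - RSS_U) / m) / (RSS_U / (T - k)).

    m is the number of restrictions (here, the number of dummy terms
    dropped), T the number of observations and k the number of
    parameters in the unrestricted model.
    """
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / (n_obs - n_params_unrestricted)
    return numerator / denominator

# Hypothetical example: an intercept dummy and a slope dummy are dropped
# (m = 2) from a model with k = 4 parameters estimated on T = 54 points.
f_stat = f_statistic(
    rss_restricted=120.0,
    rss_unrestricted=100.0,
    n_restrictions=2,
    n_obs=54,
    n_params_unrestricted=4,
)
```

In this example `f_stat` is 5.0, which would be compared with the critical value of the F distribution with (2, 50) degrees of freedom.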

The Dummy Variable Approach to Testing for a Structural Break Instead of two separate regressions on each sub-sample, as in the Chow test, we just need the single regression with the dummy variables (as well as the one without the dummy variables). The dummy variable approach allows us to test a variety of hypotheses about any structural break. The dummy variable approach allows us to determine if it is the intercept or the slope that is different. Using the Chow test requires testing of sub-samples, which reduces the degrees of freedom.

Conclusion When running a regression, we assume the error term is normally distributed. The Bera-Jarque test is used to determine if the error term is normally distributed. To overcome non-normality, we can use an impulse dummy variable to account for any outliers. Dummy variables have a variety of uses, mostly being used to model qualitative effects. Dummy variables can be in either intercept or slope form.