Valuation 4: Econometrics
Why econometrics?
What are the tasks?
Specification and estimation
Hypothesis testing
Example study
Last week we looked at
What is so special about environmental goods?
Theory of consumer demand for market goods
Welfare effects of a price change: equivalent variation versus compensating variation
Consumer demand for environmental goods
Welfare effects of a quantity change: equivalent surplus versus compensating surplus
Theory and practice
Why econometrics?
Analysis
–To test the validity of economic theories
Policy making
–To assess the outcome of alternative government economic policies
Forecasting or prediction
–To predict the value of other variables
What are the tasks?
Specification
–From an economic model to an econometric model
Estimation
Testing hypotheses
Predictions
Specification – the function
Include all relevant exogenous variables
Functional form: is the relationship linear?
The parameters to be estimated are constant for all observations
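The formula on this slide did not survive, so the following is only a sketch of the kind of econometric model meant here, assuming a single explanatory variable. The symbols (y, x, the betas and the disturbance epsilon) are assumptions chosen for illustration, not taken from the original slide.

```latex
% Linear specification with one explanatory variable (assumed notation)
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i , \qquad i = 1, \dots, n
% \beta_0, \beta_1: parameters, the same for every observation i
% \varepsilon_i:    disturbance, collecting everything the model leaves out
```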
Specification – disturbance (1) Expected value is zero
Specification – disturbance (2) Variance is constant –Homoscedasticity vs. heteroscedasticity
Specification – disturbance (3)
Disturbances are not autocorrelated
Disturbances are normally distributed
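The disturbance assumptions listed on slides (1) to (3) can be written compactly as follows. This is a sketch in assumed notation: the disturbance is written as epsilon and its variance as sigma squared, neither of which appears in the surviving slide text.

```latex
\begin{align*}
  E(\varepsilon_i) &= 0
    && \text{(expected value is zero)} \\
  \operatorname{Var}(\varepsilon_i) &= \sigma^2 \quad \text{for all } i
    && \text{(homoscedasticity)} \\
  \operatorname{Cov}(\varepsilon_i, \varepsilon_j) &= 0 \quad \text{for } i \neq j
    && \text{(no autocorrelation)} \\
  \varepsilon_i &\sim N(0, \sigma^2)
    && \text{(normality)}
\end{align*}
```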
Specification – disturbance (4)
OLS – Point estimates
Disturbance vs. residual
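A minimal sketch of how OLS point estimates and residuals are obtained for one explanatory variable, using the usual closed-form least-squares formulas. The function name and the toy numbers are made up for illustration. The residual is the observable counterpart of the unobservable disturbance: it is the gap between the observed value and the fitted line.

```python
from statistics import fmean


def ols_simple(x, y):
    """Fit y = b0 + b1 * x by least squares; return (b0, b1, residuals)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary across observations")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    # Residual = observed value minus fitted value
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return b0, b1, residuals


# Made-up toy data, not SOEP values
income = [1.0, 2.0, 3.0, 4.0, 5.0]
size = [2.0, 4.1, 5.9, 8.2, 9.8]
b0, b1, resid = ols_simple(income, size)
```

With an intercept in the model, the residuals sum to zero by construction, which is a quick sanity check on any implementation.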
OLS – R²
OLS – hypothesis testing
t-test
F-test
p-values
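As a rough illustration of the t-test on a slope coefficient, the sketch below computes the standard error, the t statistic and an approximate two-sided p-value. Two assumptions are worth flagging: the function name and the data are made up, and the p-value uses the standard normal distribution instead of the t distribution, which is only reasonable in large samples.

```python
from math import sqrt
from statistics import NormalDist, fmean


def slope_t_test(x, y):
    """Return (b1, se_b1, t, p_approx) for H0: slope = 0 in y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)            # estimated disturbance variance
    se_b1 = sqrt(s2 / sxx)        # standard error of the slope
    t = b1 / se_b1
    # Large-sample approximation: normal instead of t(n - 2) distribution
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return b1, se_b1, t, p_approx


# Made-up toy data, far too small for the normal approximation to be accurate
income = [1.0, 2.0, 3.0, 4.0, 5.0]
size = [2.0, 4.1, 5.9, 8.2, 9.8]
b1, se_b1, t_stat, p_value = slope_t_test(income, size)
```

In a regression with a single explanatory variable, the F statistic for the model equals the square of this t statistic.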
Data and variables
Data
–Cross-section
–Time-series
–Panel data
Variables
–Continuous
–Discrete, including dummy variables
–Proxy variables
Functional forms (columns: Function, Implicit price)
–Linear
–Quadratic
–Semi-log
–Logarithm
–Inverse
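The table entries for this slide were lost. The following is a sketch of common textbook versions of these forms, assuming a price P explained by a single attribute z, with the implicit price taken as the derivative of P with respect to z. The notation and the exact variants (for example, which side carries the logarithm in the semi-log form) are assumptions and may differ from the original slide.

```latex
\begin{tabular}{lll}
  Form        & Function                               & Implicit price $\partial P / \partial z$ \\
  \hline
  Linear      & $P = \alpha + \beta z$                 & $\beta$ \\
  Quadratic   & $P = \alpha + \beta z + \gamma z^2$    & $\beta + 2 \gamma z$ \\
  Semi-log    & $\ln P = \alpha + \beta z$             & $\beta P$ \\
  Logarithmic & $\ln P = \alpha + \beta \ln z$         & $\beta P / z$ \\
  Inverse     & $P = \alpha + \beta / z$               & $-\beta / z^2$ \\
\end{tabular}
```

Only in the linear form is the implicit price constant; in all other forms it depends on the level of the attribute, the price, or both.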
Functional forms – Diagnostics
RESET test
R² is of limited use
Box-Cox test
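For orientation, the two tests named above are usually built on the following constructions. This is a sketch in assumed notation; the number of powers of the fitted values used in the RESET regression varies between textbooks and software.

```latex
% RESET: add powers of the fitted values and test their joint significance
y_i = x_i'\beta + \gamma_1 \hat{y}_i^{2} + \gamma_2 \hat{y}_i^{3} + u_i ,
\qquad H_0 : \gamma_1 = \gamma_2 = 0

% Box-Cox transformation: nests the linear and logarithmic forms
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\[1ex]
  \ln y                            & \text{if } \lambda = 0
\end{cases}
```

A value of lambda equal to 1 corresponds to the linear form and a value of 0 to the logarithmic form, so testing restrictions on lambda compares these forms.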
Example using the SOEP data
The German Socio-Economic Panel Study (SOEP) offers micro data for research in the social and economic sciences
The SOEP is a wide-ranging, representative longitudinal study of private households in Germany and provides information on all household members
Topics include household composition, occupational biographies, employment, earnings, health and satisfaction indicators
The panel was started in 1984; in 2005, nearly 12,000 households and more than 21,000 persons were sampled
We use household-level data for the year 1997 and run an OLS regression with one explanatory variable
We try to explain differences in dwelling size (square meters) by differences in household income
Example results
. use "C:\data\kdd\data1.dta", clear
(SOEP'97 (Kohler/Kreuter))
. regress sqm hhinc
      Source |   SS   df   MS          Number of obs =
             |                         F(  1,  3124) =
       Model |                         Prob > F      =
    Residual |                         R-squared     =
             |                         Adj R-squared =
       Total |                         Root MSE      =
         sqm |   Coef.   Std. Err.   t   P>|t|   [95% Conf. Interval]
       hhinc |
       _cons |
Results: The estimated coefficients
How do square meters occupied change with higher income?
What is the estimated size given a certain income?
Are the results significant?
What does the confidence interval tell us?
How does the estimated size for a household compare to the observed size?
         sqm |   Coef.   Std. Err.   t   P>|t|   [95% Conf. Interval]
       hhinc |
       _cons |
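The questions above can be answered mechanically once the coefficient table is available. The sketch below shows the arithmetic only. The coefficient and standard error are hypothetical placeholders, since the actual estimates are not reproduced here, and the interval uses a normal critical value as a large-sample stand-in for the t critical value that Stata reports.

```python
from statistics import NormalDist


def predicted_size(b0, b1, income):
    """Estimated dwelling size for a given household income."""
    return b0 + b1 * income


def confidence_interval(coef, std_err, level=0.95):
    """Approximate interval: coef +/- z * std_err (normal critical value)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return coef - z * std_err, coef + z * std_err


# HYPOTHETICAL values for illustration only, not the SOEP estimates
B0_HYPOTHETICAL = 50.0       # intercept (_cons)
B1_HYPOTHETICAL = 0.02       # slope on hhinc
SE_B1_HYPOTHETICAL = 0.001   # standard error of the slope

size_hat = predicted_size(B0_HYPOTHETICAL, B1_HYPOTHETICAL, 2000.0)
low, high = confidence_interval(B1_HYPOTHETICAL, SE_B1_HYPOTHETICAL)
# The slope is significant at the 5% level if the interval excludes zero
significant = not (low <= 0.0 <= high)
```

The difference between a household's observed size and `size_hat` is that household's residual.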
Estimates and observed values
Results: Analysis of variance
Sum of squares
–The model explains only a small part of the TSS (MSS = TSS − RSS)
–The higher the MSS and the smaller the RSS, the "better" our model
Degrees of freedom
–We have 3125 total degrees of freedom (n − 1), of which 1 is consumed by the model, leaving 3124 for the residual
Mean square error
–Defined as the residual sum of squares divided by the corresponding degrees of freedom
      Source |   SS   df   MS
       Model |
    Residual |
       Total |
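The decomposition behind the analysis-of-variance table can be sketched as follows. The notation (fitted values with a hat, k for the number of explanatory variables excluding the intercept) is assumed; the degrees-of-freedom line uses the figures stated on the slide.

```latex
\begin{align*}
  \underbrace{\sum_i (y_i - \bar{y})^2}_{TSS}
    &= \underbrace{\sum_i (\hat{y}_i - \bar{y})^2}_{MSS}
     + \underbrace{\sum_i (y_i - \hat{y}_i)^2}_{RSS} \\
  n - 1 &= k + (n - k - 1)
    \qquad \text{here: } 3125 = 1 + 3124 \\
  \text{Mean square error} &= \frac{RSS}{n - k - 1}
\end{align*}
```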
Results: Model fit
The F-statistic
–Tests that all coefficients except the intercept are zero
–In our example it has 1 numerator and 3124 denominator degrees of freedom
The R-squared
–MSS/TSS = 1 − RSS/TSS
The adjusted R-squared
–Adjusts for the number of explanatory variables k and the number of observations n
The root mean square error
–Root MSE = √(RSS / residual degrees of freedom)
Number of obs = 3126
F(  1,  3124) =
Prob > F      =
R-squared     =
Adj R-squared =
Root MSE      =
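To make the fit measures concrete, here is a minimal sketch that computes them from observed and fitted values using the standard definitions. The function name, the dictionary keys and the toy data are made up; `k` is assumed to count explanatory variables excluding the intercept.

```python
from math import sqrt
from statistics import fmean


def fit_statistics(y, fitted, k):
    """Model-fit measures for an OLS fit with k regressors plus an intercept."""
    n = len(y)
    y_bar = fmean(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total SS
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual SS
    mss = tss - rss                                         # model SS
    df_model, df_resid = k, n - k - 1
    return {
        "r2": mss / tss,
        "adj_r2": 1 - (rss / df_resid) / (tss / (n - 1)),
        "f": (mss / df_model) / (rss / df_resid),
        "root_mse": sqrt(rss / df_resid),
    }


# Made-up toy data; fitted values come from the line 0.09 + 1.97 * x, x = 1..5
size = [2.0, 4.1, 5.9, 8.2, 9.8]
size_hat = [2.06, 4.03, 6.00, 7.97, 9.94]
stats = fit_statistics(size, size_hat, k=1)
```

The adjusted R-squared is never larger than the R-squared, and the gap widens as k grows relative to n.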
Diagnostics Homoskedasticity: Expected value:
Diagnostics – 2
. hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
Ho: Constant variance
Variables: fitted values of sqm
chi2(1)     =
Prob > chi2 =
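A rough sketch of the classical Breusch-Pagan score test with the fitted values as the only test variable, which is believed to correspond to what the output above reports. The squared residuals are scaled by their mean, regressed on the fitted values, and half the explained sum of squares of that auxiliary regression is compared with a chi-squared distribution with 1 degree of freedom. The function name and the data are made up, and this is not Stata's own code.

```python
from math import erfc, sqrt
from statistics import fmean


def breusch_pagan(resid, fitted):
    """Return (chi2, p) of the score test for H0: constant variance."""
    n = len(resid)
    sigma2 = sum(e * e for e in resid) / n      # ML variance estimate RSS / n
    g = [e * e / sigma2 for e in resid]         # scaled squared residuals
    f_bar, g_bar = fmean(fitted), fmean(g)
    sff = sum((f - f_bar) ** 2 for f in fitted)
    sfg = sum((f - f_bar) * (gi - g_bar) for f, gi in zip(fitted, g))
    ess = sfg ** 2 / sff                        # explained SS of g on fitted
    chi2 = ess / 2
    # Upper tail of chi-squared(1): P(Z^2 > c) = erfc(sqrt(c / 2))
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Made-up residuals whose spread grows with the fitted value
fitted_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6]
chi2_stat, p_value = breusch_pagan(residuals, fitted_values)

# Residuals of constant magnitude give a statistic of zero
chi2_flat, p_flat = breusch_pagan([1.0, -1.0, 1.0, -1.0], [1.0, 2.0, 3.0, 4.0])
```

A small p-value leads to rejecting constant variance, in which case the usual OLS standard errors are unreliable.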
Multiple regression
. regress sqm hhinc hhsize east owner
      Source |   SS   df   MS          Number of obs =
             |                         F(  4,  3120) =
       Model |                         Prob > F      =
    Residual |                         R-squared     =
             |                         Adj R-squared =
       Total |                         Root MSE      =
         sqm |   Coef.   Std. Err.   t   P>|t|   [95% Conf. Interval]
       hhinc |
      hhsize |
        east |
       owner |
       _cons |
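Written out, the regress command above estimates a model of the following form. The variable names are those in the command; the Greek symbols are assumed notation, and treating east and owner as dummy variables is an assumption based on their names.

```latex
\text{sqm}_i = \beta_0
  + \beta_1 \, \text{hhinc}_i
  + \beta_2 \, \text{hhsize}_i
  + \beta_3 \, \text{east}_i
  + \beta_4 \, \text{owner}_i
  + \varepsilon_i
```

Each coefficient measures the change in dwelling size associated with a one-unit change in that variable, holding the other three constant. The coefficient on a dummy variable is a shift in the intercept for that group.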