How Low Can House Prices Fall? (Quite a bit)

Learning Outcomes
1. Expand the regression model to allow for multiple X variables
2. Formalise the hypothesis test procedure using test statistics
3. Look at more general hypothesis tests
   a) Multiple coefficients
   b) Inequality hypotheses
4. Formalise a procedure for using regression for prediction

House Prices
We all know the story of the financial crisis
  – House price bubble
  – Banks borrow abroad to finance the bubble
  – Bubble bursts and banks can't get their money back
  – Bankrupt banks bailed out by the state
  – State bankrupted and bailed out by the Troika
See nama.biz if interested

Two Questions
1. Could we have used econometrics to test for the presence of a bubble?
2. Now that the bubble has burst, can we use econometrics to say how far prices have to fall?
Yes to both

Immediate relevance
Should you buy or rent now?
How big is the remaining hole in the banks?
What about other bubbles? China?

Look at data
Look at aggregate macro time series data of Irish house prices in housing.dta
Stata: line psecd year
Certainly a rapid rise
Was it justified?
  – Incomes and population were rising at the time
  – By enough?
Econometrics can answer this question

House Prices

How to Answer the Two Questions
Make use of the conditional expectation interpretation of a regression
  – Recall that the regression line gives E(Y|X)
So we will use OLS to give us the expected price of a house conditional on income, population etc.
Answers
1. If the actual price is systematically above the expected price we have evidence of a bubble
2. After the bubble bursts the price will fall to the conditional expectation

Extending OLS to Many Xs
We need to understand how OLS works when there are many independent (RHS) variables
Recall: $E(Y|X) = \beta_1 + \beta_2 X$
Generalise to: $E(Y|X) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki}$
So the full model becomes: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$

Interpreting $\beta_k$
Each parameter $\beta_1, \beta_2, \dots, \beta_k$ measures the isolated effect of $X_1, X_2, \dots, X_k$ on the dependent variable Y
These are partial regression coefficients
In terms of calculus, $\beta_k$ is a partial derivative: the effect of changing one variable while keeping all others constant

Interpreting OLS
OLS still gives the best line
The only difference is that the “line” isn't a line any more, it is a multi-dimensional hyperplane
The actual data still deviate from the “line”
The “line” is still the conditional expectation
So if confused, use the intuition from the single RHS variable case

Intuition of single X case is still valid Show three data points for illustration

Maths of OLS
The formulae for OLS are much more complicated
Really need matrix algebra to write them down
But the idea is the same
  – Choose estimates $b_0, \dots, b_k$ to minimise the sum of squared deviations
The computer does it for us with the regress command
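To make the "minimise the sum of squared deviations" idea concrete, here is a minimal Python sketch (not the Stata used in this lecture). It solves the normal equations (X'X)b = X'y, which are the first-order conditions of that minimisation. The function name `ols` and the toy data are illustrative assumptions, not part of the lecture's dataset.

```python
def ols(X, y):
    """Return OLS estimates b that solve the normal equations (X'X) b = X'y.

    X is a list of rows; each row starts with 1.0 for the constant term.
    """
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical toy data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
X = [[1.0, a, b] for a, b in zip(x1, x2)]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]

b0, b1, b2 = ols(X, y)
print(b0, b1, b2)  # should recover the coefficients 1, 2 and 3 up to rounding
```

Because the toy data contain no error term, the estimates reproduce the generating coefficients; with real data the same calculation gives the best-fitting hyperplane instead.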

A Preliminary Answer
As with many projects we first need an economic model
  – Just like Keynes' consumption function
Our model will assert that real house prices are a function of per capita real income and the per capita housing stock
Need to generate variables from the raw data
Generate
  – Real house prices: gen price=psecd/p*100
  – Real per capita income: gen inc_pc=gni/pop*1000
  – Real per capita housing stock: gen hstock_pc=hstock/pop
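The three transformations are plain arithmetic, sketched here in Python for a single year. The interpretation of the raw variables is an assumption (psecd as a nominal house price, p as a price index with base 100, gni and pop in units such that the factor 1000 gives income per person), and the input numbers are hypothetical, not values from housing.dta.

```python
def real_price(psecd, p):
    """Deflate a nominal price by a price index p (assumed base = 100)."""
    return psecd / p * 100


def income_per_capita(gni, pop):
    """Income per head; the factor 1000 mirrors the Stata command."""
    return gni / pop * 1000


def housing_stock_per_capita(hstock, pop):
    """Dwellings per head of population."""
    return hstock / pop


# Hypothetical values for one year.
price = real_price(psecd=200000.0, p=80.0)
inc_pc = income_per_capita(gni=150000.0, pop=4000.0)
hstock_pc = housing_stock_per_capita(hstock=1600.0, pop=4000.0)
print(price, inc_pc, hstock_pc)
```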

OLS Estimates of Model
regress price inc_pc hstock_pc

      Source |       SS       df       MS              Number of obs =      41
-------------+------------------------------           F(  2,    38) =  210.52
       Model |  6.7142e+11     2  3.3571e+11           Prob > F      =  0.0000
    Residual |  6.0598e+10    38  1.5947e+09           R-squared     =  0.9172
-------------+------------------------------           Adj R-squared =  0.9129
       Total |  7.3202e+11    40  1.8301e+10           Root MSE      =   39934

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      inc_pc |   16.15503   2.713043     5.95   0.000     10.66276     21.6473
   hstock_pc |   124653.9   389266.5     0.32   0.751    -663374.9    912682.6
       _cons |  -190506.4   80474.78    -2.37   0.023    -353419.1   -27593.71
------------------------------------------------------------------------------

Predicting Prices
We can use this model to predict prices
Recall that OLS gives us an estimate of the conditional expectation function
Put in the numbers
  – $E(Y|X) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki}$
  – E(Price|inc_pc, hstock_pc) = -190506.4 + 124653.9*hstock_pc + 16.15503*inc_pc

Predicting Prices
We can use the generate command to create a variable with this conditional expectation
  – gen pred=-190506.4 + 124653.9*hstock_pc+ 16.15503*inc_pc
So useful that there is a special command
  – predict pred
Compare the two on a graph
  – line price pred year
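The prediction is simply the fitted equation evaluated at chosen values of the regressors. A minimal Python sketch follows, using the coefficients reported in the regression output; the input values and the "actual" price are hypothetical and serve only to show how the gap between actual and expected price would be computed.

```python
# Coefficients as reported in the regression output.
B_CONS = -190506.4
B_INC_PC = 16.15503
B_HSTOCK_PC = 124653.9


def predict_price(inc_pc, hstock_pc):
    """Estimated E(price | inc_pc, hstock_pc) from the fitted model."""
    return B_CONS + B_INC_PC * inc_pc + B_HSTOCK_PC * hstock_pc


# Hypothetical inputs for a single year (not taken from housing.dta).
pred = predict_price(inc_pc=30000.0, hstock_pc=0.35)
actual = 400000.0        # hypothetical actual real price
gap = actual - pred      # positive gap: price above its conditional expectation
print(round(pred, 3), round(gap, 3))
```

A gap that stays positive year after year is what the lecture treats as evidence of a bubble.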

Predicted vs Actual Prices

Interpretation
The predicted line is our estimate of what we would expect the price to be given the level of income and the housing stock
The actual price was systematically above this expected price from 2003
This could be seen as evidence of a bubble
Note how it fits in to the discussion of the time, which said that prices went up but this was justified given the rise in incomes
We show that prices went up more than we would “expect” for the observed increase in incomes
So we show that prices were too high in a precise sense

More can be done
We can test hypotheses about the variables to see if they have effects consistent with theory
If the effects were different from our preconceived notions we may be wary of trusting the estimates
  – i.e. we got a fluke sample

Housing Stock Has no Effect?
1. Test the null hypothesis that housing stock has no effect on prices
   $H_0: \beta_H = 0$
   $H_1: \beta_H \neq 0$
2. Calculate the distribution of $b_{OLS}$ assuming that $H_0$ is true
3. Find our estimate on the distribution
4. What is the probability that our estimate would have come from this distribution?
5. Does this lead us to believe the null hypothesis?

Quite possible to get an estimate of 124653.9 if the true value is 0.0
Note that this has nothing to do with the scale of the estimate: the estimate is a big number but it is not statistically different from zero
Calculate the probability
  – $P(b_{OLS} \geq 124653.9 \mid \beta_H = 0.0)$
  – $= P(Z \geq (124653.9 - 0)/389266.5)$
  – $= P(Z \geq 0.32) = 0.37$
Clearly this is much larger than the usual threshold values of 1%, 5% or 10%
So we cannot reject the null hypothesis
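The probability calculation can be checked with a short Python sketch using the standard library's `statistics.NormalDist`. It treats the standard error as known, as this slide does, so it gives the one-tail normal probability rather than Stata's t-based p-value.

```python
from statistics import NormalDist

b_hstock = 124653.9    # estimated coefficient on hstock_pc
se_hstock = 389266.5   # its standard error

# Standardise the estimate under the null hypothesis beta_H = 0.
z = (b_hstock - 0.0) / se_hstock

# Upper-tail probability P(Z >= z) from the standard normal distribution.
p_upper = 1.0 - NormalDist().cdf(z)

print(round(z, 2), round(p_upper, 2))  # prints 0.32 0.37
```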

Comments
We could reject the null if our threshold was 40%
  – Seems very extreme
  – Think of the criminal trial metaphor
Cannot reject the idea that the effect of housing stock is zero even though the estimated effect is about 125,000!
  – Scale of the coefficient has NO impact on statistical significance
Result does seem unlikely as it contradicts theory
  – How to resolve this contradiction?
  – Look carefully at both theory and estimates
  – Sniff test!
Can simplify the test procedure

Test Statistics
Clearly there is a large degree of commonality between our tests even though they were on different data
So we can systematise things a little better
The key part of each test was calculating Z using one of the key properties of normal distributions

So now we only ever have to deal with one distribution, the “standard” normal
The two diagrams correspond but the Z distribution will be the same every time
Note also how the construction of Z explicitly removes the issue of scale
  – The standard error has the same scale as the coefficient
Streamlined test procedure
1. State the hypothesis
2. Calculate Z assuming $H_0$ is true
3. Compare the calculated value of Z with the standard normal distribution

The Housing Stock Example
1. State the hypothesis we want to test
   $H_0: \beta_H = 0$
   $H_1: \beta_H \neq 0$
2. Calculate the test statistic assuming that $H_0$ is true
   z = (124653.9 - 0)/389266.5 = 0.32
3. Find our estimate on the distribution
   – Either find the test statistic on the standard normal distribution
   – Or compare with one of the traditional threshold (“critical”) values: 2.58 (1%), 1.96 (5%), 1.64 (10%)
4. |Z| is less than all of the critical values
5. So we cannot reject the null hypothesis

Comment
We will reject the idea that $\beta_H = 0.0$ if there is overwhelming evidence that $\beta_H$ is bigger or smaller
The evidence is our estimate (about 125,000)
Is this big enough? It looks huge
But the standard error is huge also, so there is a very wide distribution of estimates
So the probability of a large estimate arriving by fluke is high
Remove the scale from the problem by calculating the test statistic: Z=0.32

Comment
Is this big enough?
Traditionally 1.96 would be the “critical value” because there is only a 5% probability of |Z|>1.96 arising as a fluke
  – “Beyond reasonable doubt”
Free to decide for ourselves (p-value)
  – p-value=Pr(|Z|>0.32)=0.75
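Both routes to the decision (comparing |Z| with a critical value, or reading off the p-value) can be reproduced in a few lines of Python. This is a sketch using the standard normal approximation; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
z = 124653.9 / 389266.5   # test statistic for H0: beta_H = 0

# Two-sided p-value: probability of a draw at least this far from zero.
p_two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))

# Two-sided critical values for the usual significance levels.
decisions = {}
for alpha in (0.01, 0.05, 0.10):
    crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    decisions[alpha] = abs(z) > crit   # True means "reject the null"
    print(alpha, round(crit, 2), decisions[alpha])

print(round(p_two_sided, 2))  # prints 0.75
```

The loop reproduces the familiar critical values 2.58, 1.96 and 1.64, and the null is not rejected at any of the three levels.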

Issues in Hypothesis Testing
Test of significance
“t-test”
Rule of thumb
General procedure
Significance level
P-value

Test of Significance
A test of $H_0: \beta = 0$ is given the special name of “test of significance”
The test statistic is simple: $Z = (b_{OLS} - 0)/se(b_{OLS}) = b_{OLS}/se(b_{OLS})$
It is calculated by most statistical software
Simple eyeball test of significance
A variable is or is not “statistically significant”
Not the same as economically significant

t-test
Strictly speaking the Z test is only valid when $\sigma^2$, the variance of u, is known, as it is used to calculate se(b)
$\sigma^2$ will almost never be known and will have to be estimated
When it is estimated, the distribution of the standardised estimator (and therefore the test statistic) is no longer normal
It has a t-distribution
Typically thicker tails than the normal. Why?

t-test
The precise shape of the t distribution depends on the degrees of freedom: N-K
  – N is the number of observations
  – K is the number of estimated coefficients, including the constant
  – So the critical values will vary with N-K
Fortunately t≈Z when N-K is large
Stata reports the t-test for statistical significance automatically (see over)

OLS Estimates of Model
regress price inc_pc hstock_pc

      Source |       SS       df       MS              Number of obs =      41
-------------+------------------------------           F(  2,    38) =  210.52
       Model |  6.7142e+11     2  3.3571e+11           Prob > F      =  0.0000
    Residual |  6.0598e+10    38  1.5947e+09           R-squared     =  0.9172
-------------+------------------------------           Adj R-squared =  0.9129
       Total |  7.3202e+11    40  1.8301e+10           Root MSE      =   39934

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      inc_pc |   16.15503   2.713043     5.95   0.000     10.66276     21.6473
   hstock_pc |   124653.9   389266.5     0.32   0.751    -663374.9    912682.6
       _cons |  -190506.4   80474.78    -2.37   0.023    -353419.1   -27593.71
------------------------------------------------------------------------------

The t column is the t-test of statistical significance
The P>|t| column is the p-value of the t-test

Rule of Thumb
Easy to “learn off” test procedure
  – Calculate the test statistic
  – Reject the hypothesis if the test statistic is greater than 2 in absolute terms
  – Useful for “eyeball” tests of significance
  – Works because the 5% critical value is close to 2 unless N-K is small
Stata command “test” does the whole procedure automatically
  – But it reports an F-test
  – For a single coefficient this is the square of the t-test
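The eyeball test can be applied to every row of the regression output at once. The Python sketch below recomputes each t ratio from the reported coefficient and standard error and applies the |t| > 2 rule; the dictionary layout is just an illustrative choice.

```python
# (coefficient, standard error) pairs as reported in the regression output.
estimates = {
    "inc_pc": (16.15503, 2.713043),
    "hstock_pc": (124653.9, 389266.5),
    "_cons": (-190506.4, 80474.78),
}

# t ratio for the test of significance H0: beta = 0.
t_ratios = {name: coef / se for name, (coef, se) in estimates.items()}

# Rule of thumb: statistically significant if |t| exceeds 2.
significant = {name: abs(t) > 2.0 for name, t in t_ratios.items()}

# For a single coefficient the F statistic is the square of the t ratio.
f_hstock = t_ratios["hstock_pc"] ** 2

for name, t in t_ratios.items():
    print(name, round(t, 2), significant[name])
```

The recomputed ratios match the t column of the output (5.95, 0.32 and -2.37), so income and the constant pass the rule of thumb while the housing stock does not.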

T-Test of Two-sided Hypotheses
1. State the null and alternative hypothesis
   – $H_0: \beta = c$, $H_1: \beta \neq c$
   – Note that when the hypothesis is “two sided” the null will be rejected if the estimate is very big or very small
2. Choose a significance level, i.e. the threshold for rejection (denoted by $\alpha$)
   – 1%, 5%, 10% or another
   – Free to choose but there are consequences (later)
3. Calculate the test statistic
   – $T = (b - c)/se(b)$

4. Find the test statistic on the distribution
   Two methods. Hint: it always helps to draw the distribution
   A. The critical value method
      a) Find the critical values on the distribution. Look up tables and/or Stata
      b) You will need the significance level and the degrees of freedom
      c) The +/- critical values define the rejection region
      d) Reject the null if in the rejection region, i.e. if the test statistic is greater than the critical value in absolute terms

   B. The p-value method
      a) Find the probability that a draw from the distribution of the test statistic would be greater than the absolute value of the test statistic actually observed (one tail)
      b) The p-value is twice this calculated value
      c) Reject the null if $p < \alpha$
5. Clearly state the result, noting the significance level
   – This is very important
   – Can reject at one $\alpha$ and fail to reject at another

The Housing Stock Example Again
1. $H_0: \beta_H = 0$, $H_1: \beta_H \neq 0$ (clearly two sided)
2. Significance level: choose 1%, 5% or 10%
3. Calculate the test statistic assuming that $H_0$ is true
   t = (124653.9 - 0)/389266.5 = 0.32
4. Find our estimate on the distribution
   A. Critical value method
      a) df = 41-3 = 38; for the 1% significance level use the Stata command di invttail(38,0.005), with 0.025 and 0.05 as the second argument for 5% and 10%
      b) Or look up tables
      c) Critical values are: 2.71; 2.02; 1.68 (hint: check this makes sense on the diagram)
      d) The t-stat is less than the critical value at every significance level
   B. P-value method
      a) $\Pr(|t| > 0.32) = \Pr(t > 0.32) + \Pr(t < -0.32)$
      b) Stata command: di 2*ttail(38,0.32)
      c) P-value is 0.75
      d) $p > \alpha$
5. So we cannot reject the null hypothesis at the 1%, 5% or 10% significance levels
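For readers without Stata to hand, the same numbers can be approximated with the Python standard library. The sketch below integrates the t density numerically (Simpson's rule) and finds critical values by bisection. The helpers `t_pdf`, `t_upper_tail` and `t_critical` are illustrative names written for this example; they are not library functions and are not how Stata computes `ttail` or `invttail`.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Two-sided critical value c with P(|T| > c) = alpha, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 2 * t_upper_tail(mid, df) > alpha:
            lo = mid   # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2


df = 41 - 3                               # observations minus coefficients
t_stat = (124653.9 - 0.0) / 389266.5      # test statistic under H0
p_value = 2 * t_upper_tail(abs(t_stat), df)
crit = {alpha: t_critical(alpha, df) for alpha in (0.01, 0.05, 0.10)}
reject = {alpha: abs(t_stat) > c for alpha, c in crit.items()}

print(round(p_value, 2), {a: round(c, 2) for a, c in crit.items()}, reject)
```

Both methods agree: the p-value is well above every conventional significance level, and the test statistic lies inside all three pairs of critical values.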

Comments and Hints
Always draw the diagram and label it clearly
Check on the diagram that higher critical values correspond to lower significance levels
Both imply a smaller rejection region, so we are less likely to reject
Remember these tests are two sided
  – Two regions, each with half the significance level
  – Careful when looking up the critical values
  – Why? We can reject if t is extremely small or extremely large

Up to you whether you use the critical value method or the p-value method
  – Critical value easier initially
  – P-value more common now because of computers
  – Need to understand both
Always indicate the significance level that you are working with
  – Crucial for the exam
You are free to choose $\alpha$
  – Certain values are typically used but this is convention
  – One reason for the popularity of p-values: can see instantly at what $\alpha$ you would reject

Choosing Significance Level
Roughly: the probability of a fluke
When we choose a critical value we choose a significance level also:
  – 2.58 (1%), 1.96 (5%), 1.64 (10%) for large df
If we reject the null because |t|>1.96, we say we reject the null at the 5% significance level
We acknowledge that there is a 5% chance that |t|>1.96 even though the null is true
This is a Type I error: rejecting a true null
  – Criminal trial: convicting the innocent

The test is set up to make this as low as possible
  – i.e. reject only if there is overwhelming evidence
Why not make it zero? Can't, because we would never reject any null
  – Criminal trial: always acquit
Type II error: failing to reject a false null
All this matters because setting up a hypothesis test is setting up a procedure that is deliberately biased against rejecting
  – Compare the size of the rejection region
Make sure that is what you want for your null
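The meaning of the significance level as the Type I error rate can be illustrated by simulation. The Python sketch below draws test statistics from a standard normal, so the null is true by construction, and counts how often a two-sided test rejects. The function name, number of draws and seed are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist


def rejection_rate(alpha, draws=20000, seed=0):
    """Share of true-null test statistics rejected at significance level alpha."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = sum(abs(rng.gauss(0.0, 1.0)) > crit for _ in range(draws))
    return rejections / draws


rate_5 = rejection_rate(0.05)
rate_1 = rejection_rate(0.01)
print(rate_5, rate_1)  # each should be close to its own alpha
```

Lowering the significance level shrinks the rejection region and the share of true nulls rejected, which is the trade-off against Type II error described on the slide.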
