Dealing with location in the valuation of office rents in London: Multilevel and semi-parametric modelling. Aniel Anand, 1st July 2015.

Background
Data collected by the Valuation Office Agency.
Developing analytical tools to assist in the valuation of property.
Valuation professionals indicate that location is a major driver of property valuation.
What analytical challenges does this present?

Why is location important?
Properties within the same location are likely to have similar rents.

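Impact if location is not captured
Excluding location from the regression can result in:
- Omitted variable bias
- Spatially correlated error terms
- Under-estimated standard errors
With location ($X_2$) included: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
With location omitted: $Y = \beta_0 + \beta_1 X_1 + \varepsilon$, where $\operatorname{Cov}(\varepsilon_i, \varepsilon_j) \neq 0$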
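
The omitted variable bias listed on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are synthetic, the coefficients (1, 2, 3), the 0.8 correlation strength and the function names are all hypothetical, and location is reduced to a single numeric score rather than anything in the VOA data.

```python
import random

def slope_one(x, y):
    """OLS slope of y on a single regressor x (intercept included)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slopes_two(x1, x2, y):
    """OLS slopes of y on x1 and x2, solving the 2x2 normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def simulate(n=5000, seed=0):
    """Return beta_1 estimated with and without the location variable."""
    rng = random.Random(seed)
    x2 = [rng.gauss(0, 1) for _ in range(n)]         # hypothetical location score
    x1 = [0.8 * loc + rng.gauss(0, 1) for loc in x2]  # attribute correlated with location
    y = [1 + 2 * a + 3 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    b1_full, _ = slopes_two(x1, x2, y)
    b1_short = slope_one(x1, y)
    return b1_full, b1_short

b1_full, b1_short = simulate()
print(f"beta_1 with location included: {b1_full:.2f}")
print(f"beta_1 with location omitted:  {b1_short:.2f}")
```

In this set-up the short regression absorbs part of the location effect into $\beta_1$: its slope converges to roughly 2 + 3 × 0.8 / 1.64 ≈ 3.46 rather than the true value of 2.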

OLS regression – England and Wales
Rent premiums in major cities and coastal towns.
Clustered residuals show there is a location effect.
How can this location effect be measured?

Identifying Spatial Autocorrelation - Variograms
A variogram plots the average dissimilarity (semivariance) between pairs of residuals against the geographic distance separating them.
Spatial autocorrelation is present over the rising part of the curve, before it levels off.
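
A minimal sketch of how an empirical variogram of regression residuals could be computed, assuming planar coordinates and equal-width distance bins. The function name, the binning scheme and the toy transect are illustrative assumptions, not the presenter's implementation.

```python
import math

def empirical_variogram(coords, residuals, bin_width, n_bins):
    """Return (bin midpoint, semivariance, pair count) for each non-empty bin.

    Semivariance in a bin is half the mean squared difference between
    the residuals of all pairs whose separation falls in that bin.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            k = int(d // bin_width)
            if k < n_bins:  # pairs beyond the last bin are ignored
                sums[k] += (residuals[i] - residuals[j]) ** 2
                counts[k] += 1
    return [((k + 0.5) * bin_width, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_bins) if counts[k] > 0]

# Toy transect: four properties 1 unit apart with steadily rising residuals.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
residuals = [0.0, 1.0, 2.0, 3.0]
for midpoint, gamma, pairs in empirical_variogram(coords, residuals, 1.0, 4):
    print(f"distance ~{midpoint}: semivariance {gamma} from {pairs} pairs")
```

On real residuals, the distance at which the semivariance stops rising gives a rough idea of how far the spatial dependence extends.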

Identifying Spatial Autocorrelation - Moran’s I test
- Values range from −1 (perfect dispersion) to +1 (perfect correlation).
- A zero value indicates a random spatial pattern.
$H_0$: There is no spatial autocorrelation.
$H_1$: There is spatial autocorrelation.
Convert the Moran’s I value to a Z score, then test at the 5% level of significance by comparing the p-value with 0.05.
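
The test described on this slide can be sketched as follows. Two assumptions are worth flagging: the spatial weights are simple binary rook-adjacency weights on a regular grid, and the Z score is taken from a permutation reference distribution rather than from an analytical variance formula, which the slides do not specify. All function names are hypothetical.

```python
import random
import statistics

def morans_i(values, weights):
    """Moran's I for `values`, with `weights` a dict {(i, j): w_ij}, i != j."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(weights.values())
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    return (n / s0) * cross / sum(d * d for d in dev)

def permutation_z(values, weights, n_perm=999, seed=0):
    """Observed I and its Z score against randomly shuffled values."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    sims = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        sims.append(morans_i(shuffled, weights))
    z = (observed - statistics.mean(sims)) / statistics.stdev(sims)
    return observed, z

def rook_weights(n_rows, n_cols):
    """Binary weights linking each grid cell to its horizontal/vertical neighbours."""
    w = {}
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:
                w[(i, i + 1)] = w[(i + 1, i)] = 1.0
            if r + 1 < n_rows:
                w[(i, i + n_cols)] = w[(i + n_cols, i)] = 1.0
    return w

# Hypothetical residual surface with a smooth gradient across a 6 x 6 grid.
grid_values = [r + c for r in range(6) for c in range(6)]
i_obs, z = permutation_z(grid_values, rook_weights(6, 6))
print(f"Moran's I = {i_obs:.3f}, permutation Z = {z:.2f}")
```

A Z score above about 1.96 in absolute value would lead to rejecting $H_0$ at the 5% level in a two-sided test.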

OLS regression – including location
Including locational characteristics in the regression.
Spatial autocorrelation is still evident.
More complex regression models are required.

Multilevel modelling
Controls for different levels of the spatial hierarchy.
Decomposes the variance such that the Level 1 residuals, $\varepsilon$, are uncorrelated.

Building the multilevel model
Model selection: choose the model with the lowest AIC, using forward and backward selection methods.
Two-level random intercept (RI) / random slope (RS) model at local authority (LA) level:
RI: $Y_{ij} = \beta_0 + \beta_1 x_{ij} + \beta_2 x_j + u_{0j} + \varepsilon_{ij}$
RS: $Y_{ij} = \beta_0 + \beta_1 x_{ij} + \beta_2 x_j + u_{0j} + u_{1j} x_j + \varepsilon_{ij}$
Extending to four levels (down to Lower Layer Super Output Area, LSOA, level):
$Y_{ijkl} = \beta_0 + \beta_1 x_{ijkl} + \beta_2 x_{jkl} + \beta_3 x_{kl} + \beta_4 x_l + u_{0jkl} + u_{0kl} + u_{0l} + u_{1jkl} x_{jkl} + \varepsilon_{ijkl}$
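
To show what the random intercept $u_{0j}$ does, the sketch below splits the variance of a simulated two-level data set into a between-area part and a within-area part. It makes several simplifying assumptions: there are no covariates, every area has the same number of properties, and a method-of-moments (one-way ANOVA) estimator stands in for the likelihood-based fitting and AIC comparison that multilevel software would perform. The names and simulated values are hypothetical.

```python
import random

def variance_components(groups):
    """Method-of-moments variance split for a balanced two-level design.

    groups: list of equally sized lists, one list of observations per area.
    Returns (between-area variance, within-area variance, intraclass correlation).
    """
    j = len(groups)
    n = len(groups[0])
    grand = sum(sum(g) for g in groups) / (j * n)
    means = [sum(g) / n for g in groups]
    msb = n * sum((m - grand) ** 2 for m in means) / (j - 1)
    msw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (j * (n - 1))
    var_u = max((msb - msw) / n, 0.0)  # truncate negative estimates at zero
    return var_u, msw, var_u / (var_u + msw)

def simulate_groups(n_groups=200, n_per_group=20, sd_u=2.0, sd_e=1.0, seed=0):
    """Simulate Y_ij = 10 + u_0j + e_ij with independent normal u_0j and e_ij."""
    rng = random.Random(seed)
    groups = []
    for _ in range(n_groups):
        u = rng.gauss(0, sd_u)  # area-level random intercept
        groups.append([10 + u + rng.gauss(0, sd_e) for _ in range(n_per_group)])
    return groups

var_u, var_e, icc = variance_components(simulate_groups())
print(f"between-area variance: {var_u:.2f}")
print(f"within-area variance:  {var_e:.2f}")
print(f"share of variance at area level: {icc:.2f}")
```

A large area-level share means that much of the similarity between nearby rents is absorbed by $u_{0j}$, leaving the Level 1 residuals closer to uncorrelated.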

Chosen multilevel model – Inner London
The multilevel model has reduced the spatial autocorrelation effect.
Spatial autocorrelation occurs for properties up to about 130 metres apart.
Can we do any better?

Generalised Additive Model
Relaxes the constraint of modelling location by a linear function.
Uses the x,y co-ordinates of each property to model the spatial component more flexibly.
$Y_i = \beta_0 + \beta_1 x_i + s(x_{1i}, x_{2i}) + \varepsilon_i$, where $s(x_{1i}, x_{2i})$ is the smoothing effect for location.
Reduces to the standard OLS model when there is no smoothing effect.
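
The semi-parametric structure of this model, a linear term plus a smooth function of the co-ordinates, can be sketched with a simple backfitting loop. This is an illustration under loud assumptions: a Gaussian kernel smoother stands in for the spline-type smooth that GAM software would normally fit, the bandwidth is fixed by hand, and the data (two hypothetical high-rent pockets on a 10 x 10 map) are simulated.

```python
import math
import random

def kernel_smooth(coords, values, bandwidth):
    """Gaussian-kernel weighted average of `values` at every location."""
    smoothed = []
    for ax, ay in coords:
        num = den = 0.0
        for (bx, by), v in zip(coords, values):
            w = math.exp(-((ax - bx) ** 2 + (ay - by) ** 2) / (2 * bandwidth ** 2))
            num += w * v
            den += w
        smoothed.append(num / den)
    return smoothed

def backfit(x, coords, y, bandwidth=0.7, n_iter=5):
    """Alternate between the linear part and the spatial smooth s(x1, x2)."""
    n = len(y)
    s = [0.0] * n
    b0 = b1 = 0.0
    for _ in range(n_iter):
        partial = [yi - si for yi, si in zip(y, s)]  # remove current smooth
        mx, mp = sum(x) / n, sum(partial) / n
        b1 = (sum((xi - mx) * (ri - mp) for xi, ri in zip(x, partial))
              / sum((xi - mx) ** 2 for xi in x))
        b0 = mp - b1 * mx
        s = kernel_smooth(coords, [yi - b0 - b1 * xi for yi, xi in zip(y, x)], bandwidth)
        mean_s = sum(s) / n
        s = [si - mean_s for si in s]  # centre the smooth for identifiability
    return b0, b1, s

def simulate(n=300, seed=0):
    """Synthetic rents: 1 + 2*x plus two spatial pockets plus noise."""
    rng = random.Random(seed)
    coords = [(2.0, 2.0), (5.0, 9.5)]  # a pocket centre and a point far from both pockets
    coords += [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]
    x = [rng.uniform(0, 5) for _ in coords]

    def pockets(px, py):
        return (3 * math.exp(-((px - 2) ** 2 + (py - 2) ** 2) / 2)
                + 3 * math.exp(-((px - 8) ** 2 + (py - 7) ** 2) / 2))

    y = [1 + 2 * xi + pockets(px, py) + rng.gauss(0, 0.5)
         for xi, (px, py) in zip(x, coords)]
    return x, coords, y

x, coords, y = simulate()
b0, b1, s = backfit(x, coords, y)
print(f"estimated beta_1: {b1:.2f}")
print(f"smooth at pocket centre: {s[0]:.2f}, away from pockets: {s[1]:.2f}")
```

If the fitted smooth were zero everywhere, the loop would collapse to an ordinary least squares fit of $Y$ on $x$, which mirrors the last point on the slide.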

Non-linear effect of location
The deviations of the fitted spatial smooth away from zero indicate the extent of the non-linear effect that location has in the model.
There are several distinct pockets of high-rent areas in London, which is why a non-linear model is more appropriate.

Chosen GAM – Inner London
Spatial autocorrelation is reduced to properties up to about 20 metres apart.
However, the GAM shows evidence of spatial heterogeneity (negative Moran’s I).
The GAM possibly over-compensates for the spatial effect.

Conclusion
The root mean square error (RMSE) is used to assess the prediction precision of the model.
The multilevel model over-fits the spatial element of the model.
Rents of neighbouring properties do little to add to the predictive power of the model.
Which model should we choose?
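
For reference, the RMSE used to compare the models is the square root of the mean squared prediction error. The sketch below computes it for two sets of predictions on the same hold-out properties. The rents and predictions are invented purely for illustration and do not come from the presentation.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical hold-out rents and predictions from two candidate models.
observed = [250.0, 310.0, 420.0, 500.0]
model_a = [260.0, 300.0, 400.0, 520.0]
model_b = [240.0, 330.0, 450.0, 470.0]

scores = {"model A": rmse(observed, model_a), "model B": rmse(observed, model_b)}
for name, score in scores.items():
    print(f"{name}: RMSE = {score:.2f}")
print("lower RMSE:", min(scores, key=scores.get))
```

Computing the RMSE on properties that were not used for fitting is what makes over-fitting visible: a model can have small in-sample errors and still predict held-out rents poorly.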

Questions