Evaluating generalised calibration / Fay-Herriot model in CAPEX
Tracy Jones, Angharad Walters, Ria Sanderson and Salah Merad (Office for National Statistics)


Overview
Introduction
Generalised calibration estimation
Fay-Herriot model
Conclusions and further work

Introduction
Quarterly survey of capital expenditure (Capex)
  – Sample size: 28,000
  – Stratified by industry and size
  – Main user is National Accounts
  – Many zeros and some very large values
Aim to reduce costs and respondent burden
  – Reduce the sample size whilst maintaining quality
Investigated two strategies
  – Calibration estimation in cut-off sampling
  – Fay-Herriot model

Current cut-off sampling
Businesses with < 20 employees are not sampled
G-weights adjusted to account for this
Size bands:
  – Not sampled (<20)
  – Sampled (20-299)
  – Fully enumerated (300+)

Extension of cut-off sampling
Extend to a cut-off of < 50 employees
Sample size reduced by about 9,000
Reduce bias introduced through cut-off sampling
Size bands:
  – Not sampled (<50)
  – Sampled (50-299)
  – Fully enumerated (300+)

Relationship between acquisitions and employment

Direct calibration
Find set of weights w_i such that:
  – distance(d, w) is minimised while [calibration constraint]
Solution: [equation]
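The direct calibration step can be illustrated with a minimal sketch. It assumes the chi-square distance, sum of (w_i - d_i)^2 / d_i, and a single auxiliary variable x with known population total t_x; in that case the solution has the closed form w_i = d_i (1 + lambda * x_i). The distance function and notation on the original slide are not recoverable from the transcript, so the function name, the choice of distance and the toy data below are all assumptions.

```python
def direct_calibration_weights(d, x, t_x):
    """Linear (chi-square distance) calibration to one auxiliary total.

    d   : design weights (assumed known)
    x   : auxiliary variable for the sampled units
    t_x : known population total of x
    Returns weights w with sum(w_i * x_i) == t_x.
    """
    sum_dx = sum(di * xi for di, xi in zip(d, x))
    sum_dxx = sum(di * xi * xi for di, xi in zip(d, x))
    # lambda solves: sum_dx + lambda * sum_dxx = t_x
    lam = (t_x - sum_dx) / sum_dxx
    return [di * (1.0 + lam * xi) for di, xi in zip(d, x)]


# Toy data (hypothetical): four sampled businesses, x = employment
d = [10.0, 10.0, 10.0, 10.0]
x = [20.0, 50.0, 120.0, 250.0]
t_x = 5000.0

w = direct_calibration_weights(d, x, t_x)
calibrated_total = sum(wi * xi for wi, xi in zip(w, x))
```

After calibration the weighted sample total of x reproduces t_x, which is the defining property of the calibrated weights.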

Generalised calibration (Deville 2002)
The set of calibration equations [equation]
can be generalised to yield the set of equations [equation]

Generalised calibration
In the context of cut-off sampling, Haziza et al (2010) assumed a linear function F of the form [equation]
and obtained weights [equation]
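A small sketch may help show how generalised calibration differs from direct calibration. It assumes the linear form F(u) = 1 + u, so that w_i = d_i (1 + lambda * z_i), with the calibration equation still imposed on x: sum of w_i * x_i = t_x. Only one x and one z are used to keep the algebra scalar. The exact expressions on the slide were lost, so this is the generic linear case rather than a reproduction of the authors' formula, and all names and data are hypothetical.

```python
def generalised_calibration_weights(d, x, z, t_x):
    """Generalised (instrument-variable style) linear calibration.

    d   : design weights
    x   : calibration variable with known population total t_x
    z   : variable entering the weight adjustment F(lambda * z) = 1 + lambda * z
    Returns weights w_i = d_i * (1 + lambda * z_i) with sum(w_i * x_i) == t_x.
    """
    sum_dx = sum(di * xi for di, xi in zip(d, x))
    sum_dxz = sum(di * xi * zi for di, xi, zi in zip(d, x, z))
    # lambda solves: sum_dx + lambda * sum_dxz = t_x
    lam = (t_x - sum_dx) / sum_dxz
    return [di * (1.0 + lam * zi) for di, zi in zip(d, z)]


# Toy data (hypothetical): x = employment, z = turnover
d = [5.0, 5.0, 5.0, 5.0]
x = [60.0, 120.0, 200.0, 280.0]
z = [900.0, 2500.0, 3100.0, 7000.0]
t_x = 4500.0

w = generalised_calibration_weights(d, x, z, t_x)
calibrated_total = sum(wi * xi for wi, xi in zip(w, x))
```

Setting z equal to x recovers the direct linear calibration weights, which is one way to see generalised calibration as an extension of the direct method.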

Applying generalised calibration
Cut-off set deterministically based on employment
Consider two auxiliary variables:
  – x, well correlated with the variable of interest: employment from the business register
  – z, well correlated with the probability of being above the cut-off: turnover from the business register

Generalised calibration estimation
2008 sample data, bands 2 to 4
3 estimates
  – Ratio estimate using full sample data
  – Ratio estimate with extended g-weight adjustment
  – Generalised calibration estimate
Relative difference compared to ratio estimate using full sample data
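The comparison measure can be sketched as follows. The example assumes a simple (unstratified) ratio estimator of a total and a signed relative difference against the full-sample benchmark; the survey itself is stratified by industry and size, so this is only an illustration of the arithmetic, with hypothetical function names and toy numbers.

```python
def ratio_estimate(y_sample, x_sample, x_total):
    """Ratio estimate of the total of y, using a known population total of x."""
    return x_total * sum(y_sample) / sum(x_sample)


def relative_difference_pct(estimate, benchmark):
    """Signed relative difference of an estimate from a benchmark, in percent."""
    return 100.0 * (estimate - benchmark) / benchmark


# Toy data (hypothetical): y = acquisitions, x = employment
benchmark = ratio_estimate([10.0, 30.0, 60.0], [20.0, 50.0, 130.0], 1000.0)
alternative = 550.0  # stand-in for an estimate from the reduced sample
diff_pct = relative_difference_pct(alternative, benchmark)
```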

Results
Relative difference (in %) compared to ratio estimate using full sample data

Period | Ratio estimate with g-weight adjustment for band 2 | Generalised calibration estimate
Q… | …% | 32.1%
Q… | …% | 41.6%
Q… | …% | 32.8%
Q… | …% | 37.0%

Industries with largest contribution to total acquisitions in size-bands 2-4

Summary – Extension of cut-off sampling
Adjusted g-weights method performs better overall
Generalised calibration estimation does not consistently improve on simple method in any industry
Residual relationship between x and p

Fay-Herriot model
Combine direct estimate with synthetic estimate
Fay-Herriot aggregate level model fitted to obtain synthetic estimator: [equation], i = 1, 2, …, m

Fay-Herriot model - BLUP
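The equations on these two slides did not survive in the transcript. For orientation, the textbook form of the Fay-Herriot area-level model and its BLUP is given below; the symbols (direct estimate \hat{\theta}_i, sampling variance \psi_i, random-effect variance \sigma_u^2, covariates \mathbf{x}_i) are the usual ones and are assumed here, not taken from the slides.

```latex
\begin{align}
\hat{\theta}_i &= \theta_i + e_i, & e_i &\sim (0, \psi_i), \\
\theta_i &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + u_i, & u_i &\sim (0, \sigma_u^2),
  \qquad i = 1, 2, \dots, m, \\
\tilde{\theta}_i &= \gamma_i\,\hat{\theta}_i + (1 - \gamma_i)\,\mathbf{x}_i^{\top}\tilde{\boldsymbol{\beta}},
  & \gamma_i &= \frac{\sigma_u^2}{\sigma_u^2 + \psi_i}.
\end{align}
```

The BLUP is a weighted combination of the direct estimate and the synthetic (regression) estimate, with more weight on the direct estimate when \gamma_i is close to 1.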

Fay-Herriot model
2008 sample data
Two variables: total acquisitions and total disposals
Auxiliary variables for Fay-Herriot model
  – VAT turnover and expenditure
Scaled estimates and auxiliary variables using the total number of employees
Fitted mixed model

Plot of Residuals against Predicted (mixed model with no transformation)

Transformation
Transformation needed
Implementation of BLUP becomes complicated
  – noted by Chandra and Chambers, 2006

Plot of Residuals against Predicted (mixed model)

Plot of Residuals against Predicted (linear model without random effects)

Back transformation
Used back transformation to obtain synthetic estimate (Chambers and Dorfman, 2003)
Calculation of gamma: variance of random effects required back transformation

Evaluating use of Fay-Herriot model
Gamma very high
  – Gamma using back transformation may not be suitable
Investigated combined estimate using a fixed value for gamma
Evaluation is via re-sampling
  – Reduced the sample size by 25% (about 6,000 units)
  – Repeated sub-sampling
  – Set gamma to 0.7
  – Calculated a combined estimate
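The re-sampling evaluation can be sketched in a few lines. The example assumes simple random sub-sampling without replacement, an expansion estimator for the direct estimate, and a synthetic estimate that is held fixed across replicates; the actual study used a stratified design and a model-based synthetic estimate, so the function names, the estimator and the toy data are all illustrative assumptions.

```python
import random


def combined_estimate(direct, synthetic, gamma=0.7):
    """Composite of direct and synthetic estimates with a fixed gamma."""
    return gamma * direct + (1.0 - gamma) * synthetic


def subsample_replicates(values, pop_size, synthetic,
                         n_reps=200, keep=0.75, gamma=0.7, seed=1):
    """Repeatedly drop 25% of the sample and recompute the combined estimate."""
    rng = random.Random(seed)
    n_keep = max(1, round(keep * len(values)))
    replicates = []
    for _ in range(n_reps):
        sub = rng.sample(values, n_keep)
        # Expansion estimate of the population total from the sub-sample
        direct = pop_size * sum(sub) / n_keep
        replicates.append(combined_estimate(direct, synthetic, gamma))
    return replicates


# Toy data (hypothetical): many zeros and one very large value
sample_values = [0.0, 0.0, 5.0, 12.0, 0.0, 40.0, 3.0, 250.0]
reps = subsample_replicates(sample_values, pop_size=100, synthetic=3500.0)
```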

Evaluating use of Fay-Herriot model
Estimated bias and MSE of combined estimate: [equations]
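The formulas on this slide were lost, so the sketch below uses the standard definitions as an assumption: bias is the mean of the replicate estimates minus a benchmark (here taken to be the direct estimate from the full sample), and MSE is the replicate variance plus squared bias. The bias ratio is computed as bias divided by the standard error, which is a common convention but may not match the definition used in the slides.

```python
from math import sqrt
from statistics import mean, pvariance


def bias_and_mse(replicates, benchmark):
    """Summarise replicate estimates against a benchmark value."""
    bias = mean(replicates) - benchmark
    variance = pvariance(replicates)
    return {
        "bias": bias,
        "relative_bias_pct": 100.0 * bias / benchmark,
        "variance": variance,
        "mse": variance + bias ** 2,
        # Assumed definition: bias relative to the standard error
        "bias_ratio": bias / sqrt(variance) if variance > 0 else float("inf"),
    }


# Toy replicates (hypothetical)
summary = bias_and_mse([98.0, 102.0, 100.0, 104.0], benchmark=100.0)
```

With these definitions a combined estimate can have a smaller variance than the direct estimate and still a larger MSE, if the squared bias outweighs the variance reduction.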

Results
Average of the direct estimates very similar to the direct estimate from the full sample
Variance of the synthetic estimate is small
Variance of combined estimate lower than variance of direct estimate from full sample
Bias is high in most industries
  – Relative bias also large
High bias ratio resulted in higher Mean Square Error in most divisions

Results – Acquisitions Q

Division | Percentage bias (bands 2 to 4) | Percentage bias (bands 2 to 5) | Bias ratio | Percentage difference in MSE
…
Overall

Conclusions and further work
Cut-off sampling with g-weight adjustment performed best
  – Know this has bias
More work to be done
  – Impact on growth
  – Modelling at unit level
  – Additional covariates
  – Alternative estimation methods
Model-based direct approach (Chandra and Chambers, 2006)

Questions