Evaluating generalised calibration / Fay-Herriot model in CAPEX
Tracy Jones, Angharad Walters, Ria Sanderson and Salah Merad (Office for National Statistics)

Overview
Introduction
Generalised calibration estimation
Fay-Herriot model
Conclusions and further work

Introduction
Quarterly survey of capital expenditure (Capex)
–Sample size: 28,000
–Stratified by industry and size
–Main user is National Accounts
–Many zeros and some very large values
Aim to reduce costs and respondent burden
–Reduce the sample size whilst maintaining quality
Investigated two strategies
–Calibration estimation in cut-off sampling
–Fay-Herriot model

Current cut-off sampling
Businesses with < 20 employees are not sampled
G-weights adjusted to account for this
Size bands: Not sampled (<20), Sampled (20-299), Fully enumerated (300+)

Extension of cut-off sampling
Extend to a cut-off of < 50 employees
Sample size reduced by about 9,000
Reduce bias introduced through cut-off sampling
Size bands: Not sampled (<50), Sampled (50-299), Fully enumerated (300+)

Relationship between acquisitions and employment
Direct calibration
Find a set of weights w_i such that:
–distance(d, w) is minimised
–while the calibration equations hold
Solution

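As a minimal sketch of direct calibration, the code below assumes the chi-square distance, sum of (w_i - d_i)^2 / d_i, which is one common choice and not necessarily the one used in the talk. Under that assumption the calibrated weights have the closed form w_i = d_i (1 + x_i' lambda). The function name, the toy design weights and the toy totals are all hypothetical.

```python
import numpy as np

def linear_calibration_weights(d, X, totals):
    """Calibrated weights under the chi-square distance (linear case).

    d      : (n,) design weights
    X      : (n, p) auxiliary values for the sampled units
    totals : (p,) known population totals of the auxiliaries
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(d), -1)
    totals = np.atleast_1d(np.asarray(totals, dtype=float))
    # Solve (sum_i d_i x_i x_i') lam = totals - sum_i d_i x_i
    T = X.T @ (d[:, None] * X)
    lam = np.linalg.solve(T, totals - X.T @ d)
    # w_i = d_i * (1 + x_i' lam)
    return d * (1.0 + X @ lam)

# Toy data (illustrative only): an intercept and employment as auxiliaries.
d = np.array([10.0, 10.0, 5.0, 5.0])
X = np.array([[1.0, 25.0], [1.0, 40.0], [1.0, 120.0], [1.0, 250.0]])
totals = np.array([32.0, 2600.0])  # assumed population count and employment
w = linear_calibration_weights(d, X, totals)
```

By construction the weighted sample totals `X.T @ w` reproduce `totals`, which is the calibration constraint.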
Generalised calibration (Deville 2002)
The set of calibration equations
can be generalised to yield the set of equations

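For reference, the two sets of equations are usually written as below. This is the standard textbook form, given here as an assumption about the notation: d_i are design weights, F is the calibration function, t_x is the vector of known auxiliary totals, and z_i is a vector of instruments with the same dimension as x_i.

```latex
% Standard calibration equations (weights w_i = d_i F(x_i^T \lambda))
\sum_{i \in s} d_i \, F\!\left(x_i^{\top}\lambda\right) x_i = t_x

% Generalised calibration: instruments z_i replace x_i inside F
\sum_{i \in s} d_i \, F\!\left(z_i^{\top}\lambda\right) x_i = t_x
```

The constraint is still on the totals of x; only the argument of F changes.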
Generalised calibration
In the context of cut-off sampling, Haziza et al. (2010) assumed a linear function F of the form
and obtained weights

Applying generalised calibration
Cut-off set deterministically based on employment
Consider two auxiliary variables:
–x: well correlated with the variable of interest (employment from the business register)
–z: well correlated with the probability of being above the cut-off (turnover from the business register)

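The setup above can be sketched as follows, assuming the linear form F(u) = 1 + u so that w_i = d_i (1 + z_i' lambda). The exact form used by Haziza et al. (2010) is not reproduced here, so treat this as an illustration only. The function name and the toy employment, turnover and total values are hypothetical.

```python
import numpy as np

def generalised_calibration_weights(d, X, Z, totals):
    """Generalised calibration with a linear F(u) = 1 + u (assumed form).

    d      : (n,) design weights of units above the cut-off
    X      : (n, p) calibration variables (here: employment)
    Z      : (n, p) instruments (here: turnover), same shape as X
    totals : (p,) population totals of X, including units below the cut-off
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(d), -1)
    Z = np.asarray(Z, dtype=float).reshape(len(d), -1)
    totals = np.atleast_1d(np.asarray(totals, dtype=float))
    # Solve (sum_i d_i x_i z_i') lam = totals - sum_i d_i x_i
    A = X.T @ (d[:, None] * Z)
    lam = np.linalg.solve(A, totals - X.T @ d)
    # w_i = d_i * (1 + z_i' lam)
    return d * (1.0 + Z @ lam)

# Toy data (illustrative only).
d = np.array([4.0, 4.0, 2.0, 2.0])
employment = np.array([60.0, 90.0, 150.0, 280.0])  # x
turnover = np.array([1.2, 2.5, 4.0, 9.5])          # z
totals = np.array([1800.0])                        # assumed register employment total
w = generalised_calibration_weights(d, employment, turnover, totals)
```

The weights still reproduce the employment total, but the adjustment to each design weight is driven by turnover rather than by employment itself.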
Generalised calibration estimation
2008 sample data, bands 2 to 4
Three estimates
–Ratio estimate using full sample data
–Ratio estimate with extended g-weight adjustment
–Generalised calibration estimate
Relative difference compared to ratio estimate using full sample data

Results
Relative difference (in %) compared to ratio estimate using full sample data
Period   Ratio estimate with g-weight adjustment for band 2   Generalised calibration estimate
Q        %                                                    32.1%
Q        %                                                    41.6%
Q        %                                                    32.8%
Q        %                                                    37.0%

Industries with largest contribution to total acquisitions in size-bands 2-4
Summary – Extension of cut-off sampling
Adjusted g-weights method performs better overall
Generalised calibration estimation does not consistently improve on the simple method in any industry
Residual relationship between x and p

Fay-Herriot model
Combine direct estimate with synthetic estimate
Fay-Herriot aggregate level model fitted to obtain synthetic estimator, i = 1, 2, …, m

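For reference, the aggregate (area) level Fay-Herriot model is usually written as below. The notation is the standard one and is an assumption here: the direct estimate for domain i is the true value plus a sampling error with known variance psi_i, and the true value follows a linear model with a random domain effect u_i.

```latex
\hat{\theta}_i = \theta_i + e_i, \qquad
\theta_i = x_i^{\top}\beta + u_i, \qquad i = 1, 2, \ldots, m,

u_i \sim (0, \sigma_u^2), \qquad e_i \sim (0, \psi_i), \qquad
\text{synthetic estimator: } x_i^{\top}\hat{\beta}
```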
Fay-Herriot model - BLUP
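A minimal sketch of the BLUP under the standard Fay-Herriot model, assuming the random-effect variance sigma2_u is known (when it is estimated, the same formula gives the EBLUP). The BLUP is gamma_i times the direct estimate plus (1 - gamma_i) times the synthetic estimate, with gamma_i = sigma2_u / (sigma2_u + psi_i). The function name and toy numbers are hypothetical.

```python
import numpy as np

def fay_herriot_blup(y, X, psi, sigma2_u):
    """BLUP for the area-level model y_i = x_i' beta + u_i + e_i.

    y        : (m,) direct estimates
    X        : (m, p) domain-level auxiliary variables
    psi      : (m,) known sampling variances of the direct estimates
    sigma2_u : variance of the random effects (taken as known here)
    """
    y = np.asarray(y, dtype=float)
    psi = np.asarray(psi, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    v = sigma2_u + psi                      # marginal variance of each y_i
    W = X / v[:, None]                      # V^{-1} X for diagonal V
    beta = np.linalg.solve(X.T @ W, W.T @ y)  # generalised least squares
    synthetic = X @ beta
    gamma = sigma2_u / v                    # shrinkage factor per domain
    blup = gamma * y + (1.0 - gamma) * synthetic
    return blup, gamma, synthetic

# Toy data (illustrative only): five domains, intercept plus one covariate.
y = np.array([1.2, 0.8, 2.5, 1.9, 3.1])
X = np.column_stack([np.ones(5), [0.5, 0.4, 1.1, 0.9, 1.6]])
psi = np.array([0.2, 0.3, 0.1, 0.4, 0.25])
blup, gamma, synthetic = fay_herriot_blup(y, X, psi, sigma2_u=0.5)
```

Domains with a large sampling variance psi_i get a small gamma_i, so their estimate is pulled towards the synthetic value.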
Fay-Herriot model
2008 sample data
Two variables: total acquisitions and total disposals
Auxiliary variables for Fay-Herriot model
–VAT turnover and expenditure
Scaled estimates and auxiliary variables using the total number of employees
Fitted mixed model

Plot of Residuals against Predicted (mixed model with no transformation)
Transformation
Transformation needed
Implementation of BLUP becomes complicated
–noted by Chandra and Chambers (2006)

Plot of Residuals against Predicted (mixed model)
Plot of Residuals against Predicted (linear model without random effects)
Back transformation
Used back transformation to obtain synthetic estimate (Chambers and Dorfman, 2003)
Calculation of gamma: variance of random effects required back transformation

Evaluating use of Fay-Herriot model
Gamma very high
–Gamma using back transformation may not be suitable
Investigated combined estimate using a fixed value for gamma
Evaluation is via re-sampling
–Reduced the sample size by 25% (about 6,000 units)
–Repeated sub-sampling
–Set gamma to 0.7
–Calculated a combined estimate

Evaluating use of Fay-Herriot model
Estimated bias and MSE of combined estimate

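The re-sampling evaluation can be sketched as follows. Two assumptions are made: the reference value is the direct estimate from the full sample, and MSE is estimated as the variance across sub-sample replicates plus the squared bias. The function names and the toy replicate values are hypothetical; only the fixed gamma of 0.7 comes from the slides.

```python
import numpy as np

def combined_estimate(direct, synthetic, gamma=0.7):
    """Fixed-gamma combination of direct and synthetic estimates."""
    return gamma * direct + (1.0 - gamma) * synthetic

def bias_and_mse(replicates, reference):
    """Estimate bias, variance and MSE from sub-sample replicates."""
    est = np.asarray(replicates, dtype=float)
    bias = est.mean() - reference
    variance = est.var()            # spread across replicates
    mse = variance + bias ** 2      # equals mean((est - reference)**2)
    return bias, variance, mse

# Toy replicates for one domain (illustrative only).
direct = np.array([98.0, 103.0, 101.0, 97.0, 106.0])
synthetic = np.array([110.0, 111.0, 109.0, 110.0, 110.0])
combined = combined_estimate(direct, synthetic)
bias, variance, mse = bias_and_mse(combined, reference=100.0)
```

A synthetic estimate with small variance but a systematic offset lowers the variance of the combined estimate while adding bias, which is the trade-off the evaluation measures.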
Results
Average of the direct estimates very similar to the direct estimate from the full sample
Variance of the synthetic estimate is small
Variance of combined estimate lower than variance of direct estimate from full sample
Bias is high in most industries
–Relative bias also large
High bias ratio resulted in higher mean square error in most divisions

Results – Acquisitions Q
Division   Percentage bias (bands 2 to 4)   Percentage bias (bands 2 to 5)   Bias ratio   Percentage difference in MSE
Overall

Conclusions and further work
Cut-off sampling with g-weight adjustment performed best
–Known to be biased
More work to be done
–Impact on growth
–Modelling at unit level
–Additional covariates
–Alternative estimation methods: model-based direct approach (Chandra and Chambers, 2006)

Questions