Investigating improvements in quality of survey estimates by updating auxiliary information in the sampling frame using returned and modelled data Alan.


1 Investigating improvements in quality of survey estimates by updating auxiliary information in the sampling frame using returned and modelled data
Alan Bentley, Salah Merad and Kevin Moore

2 Overview
Motivation; Modelling; Evaluation of benefits to estimation.

3 Motivation
Employment headcount is the current size stratifier, with bands 0-9; 10-19; 20-49; 50-99; 100-299; 300+.
Issues: burden on businesses with a large number of part-time employees; homogeneity of strata.
Full Time Equivalent (FTE) employees is suggested as an alternative: FTE = Full Time + 0.5*Part Time.
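As a rough illustration of why the choice of size measure matters, the sketch below computes FTE from the slide's formula and looks up the size band for a made-up business. The function names and the example business are hypothetical, and treating the bands as inclusive upper bounds (so that a non-integer FTE such as 9.5 falls into 10-19) is an assumption, since the slides do not say how FTE values map to bands.

```python
# Size bands from the slide. Using them as inclusive upper bounds is an
# assumption made for this sketch.
BAND_UPPER_BOUNDS = [
    (9, "0-9"),
    (19, "10-19"),
    (49, "20-49"),
    (99, "50-99"),
    (299, "100-299"),
]


def fte(full_time, part_time):
    """FTE = Full Time + 0.5 * Part Time, as defined on the slide."""
    return full_time + 0.5 * part_time


def size_band(size):
    """Return the size band label for a headcount or FTE value."""
    for upper, label in BAND_UPPER_BOUNDS:
        if size <= upper:
            return label
    return "300+"


# Hypothetical business: 4 full-time and 20 part-time employees.
headcount = 4 + 20
print(size_band(headcount))   # 20-49
print(size_band(fte(4, 20)))  # 10-19
```

With headcount as the stratifier this business sits in the 20-49 band; with FTE it drops to 10-19, which is the kind of shift the burden issue refers to.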

4 Motivation
The register is updated via a sample survey, the Business Register and Employment Survey (BRES): large businesses are updated every year, small businesses less often.
Regression modelling is suggested to improve the timeliness of frame data: predict Full Time and Part Time, or Full Time Equivalent, for every local unit.

5 Data Available
Survey data (current Business Register): employees; region; industry; age; time of last update; number of local units in enterprise group.
Administrative data: employees (from PAYE, Pay As You Earn); turnover (from VAT, Value Added Tax).

6 Data Structure
[Diagram: data sources BR, BRS, PAYE and VAT, labelled "at least one of"]

7 Regression Modelling
Dependent variable: FTE.
Modelling for businesses with employment below 100.

8 Regression Modelling
The model identified includes the following covariates: register employees; PAYE employees; VAT turnover; number of local units in enterprise group; time of last update; region; industry; and significant interactions of these.
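A minimal sketch of this kind of regression is given below, assuming a log-scale model with only two of the listed covariates. The data are synthetic and generated purely for illustration, the coefficients used to simulate them are arbitrary, and `fit_ols` is a hypothetical helper; none of this reproduces the authors' actual model or results.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    Returns the coefficients (intercept first), the residuals and R-squared.
    """
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2)
    return beta, resid, r2


# Synthetic data (an assumption for illustration only): businesses with
# register employment between 1 and 99, and a noisy PAYE count.
rng = np.random.default_rng(0)
n = 500
register_emp = rng.integers(1, 100, size=n)
paye_emp = np.maximum(1, register_emp + rng.integers(-3, 4, size=n))
log_fte = 0.9 * np.log(register_emp) + rng.normal(0.0, 0.3, size=n)

# Log-transformed covariates; the real model also has region, industry,
# VAT turnover, time of last update and interactions.
X = np.column_stack([np.log(register_emp), np.log(paye_emp)])
beta, resid, r2 = fit_ols(X, log_fte)
print("coefficients:", beta)
print("R-squared:", r2)
```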

9 Variable Transformations

10 Log Transformation

11 Model Residuals

12 Model Residuals – After Noise Added

13 Test for Constant Variance
Breusch-Pagan test for heteroscedasticity: squared residuals are regressed against the covariates in the substantive model.
Under the null hypothesis of constant variance, the test statistic is asymptotically chi-squared distributed.
Strong evidence to reject the null hypothesis: residuals appear to have non-constant variance.
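The test can be sketched as follows, assuming the n·R² (Lagrange multiplier) form of the statistic, in which the squared residuals are regressed on the covariates and n times the R-squared of that auxiliary regression is compared with a chi-squared critical value. The slide does not say which variant was used, so this form is an assumption, and the data are synthetic with the error spread deliberately growing with the covariate.

```python
import numpy as np

# 5% critical value of the chi-squared distribution with 1 degree of freedom.
CHI2_1DF_5PCT = 3.841


def ols_residuals(X, y):
    """Residuals from an OLS fit of y on X with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ beta


def breusch_pagan_lm(X, resid):
    """n * R-squared from regressing squared residuals on the covariates."""
    sq = resid**2
    aux_resid = ols_residuals(X, sq)
    r2 = 1.0 - np.sum(aux_resid**2) / np.sum((sq - sq.mean()) ** 2)
    return len(sq) * r2


# Synthetic heteroscedastic data (illustration only): the error standard
# deviation is proportional to x.
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(1.0, 10.0, size=n)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, size=n) * x

resid = ols_residuals(x, y)
lm = breusch_pagan_lm(x, resid)
reject = lm > CHI2_1DF_5PCT
print("LM statistic:", lm, "reject constant variance:", reject)
```

With one covariate in the auxiliary regression the reference distribution has one degree of freedom; with more covariates the critical value would need to change accordingly.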

14 Explanatory Power of the Model
Model                                                   R² (%)
Full Model                                              81.5
Simple Model (register employees as only predictor)     79.6

15 Domain analysis of R²
Industry                   Simple Model   Full Model   Difference
Manufacturing              82.1           84.2         2.1
Electricity, Gas & Water   68.0           68.8         0.9
Construction               62.9           68.1         5.2
Wholesale                  81.6           83.4         1.8
Hotels and Restaurants     66.3           73.3         7.0

16 Model validation by data splitting
The full data are split into training and validation sets (50%).
Data set     R² (%)
Training     81.7
Validation   81.4

17 Model validation by bootstrap
Bootstrap samples are drawn from the full data by sampling with replacement, following Efron (1983).
Over-optimism is less than 0.05%.
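One common way to estimate over-optimism with the bootstrap is sketched below. It is an assumption that the authors used this particular variant: the model is refitted on each bootstrap sample, and the R-squared on that sample is compared with the R-squared the same fit achieves on the original data. The data are synthetic and the helper names are hypothetical.

```python
import numpy as np


def fit(X1, y):
    """OLS coefficients for a design matrix that already has an intercept."""
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def r_squared(X1, y, beta):
    """R-squared of the predictions X1 @ beta against y."""
    resid = y - X1 @ beta
    return 1.0 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2)


def bootstrap_optimism(X1, y, n_boot=200, seed=0):
    """Mean of (R2 on bootstrap sample) - (R2 of the same fit on full data)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        beta_b = fit(X1[idx], y[idx])
        apparent = r_squared(X1[idx], y[idx], beta_b)
        on_original = r_squared(X1, y, beta_b)
        diffs.append(apparent - on_original)
    return float(np.mean(diffs))


# Synthetic data (illustration only).
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0.0, 5.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.5, size=n)
X1 = np.column_stack([np.ones(n), x])

apparent_r2 = r_squared(X1, y, fit(X1, y))
optimism = bootstrap_optimism(X1, y)
print("apparent R2:", apparent_r2)
print("estimated optimism:", optimism)
print("optimism-corrected R2:", apparent_r2 - optimism)
```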

18 Back-transformation
Simple back-transformation will give underestimates of the dependent variable on the original scale.
Wooldridge (2000) gives an adjustment for the log back-transformation.
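The formula itself did not survive in this transcript. As a sketch only, the code below assumes the normal-theory adjustment, in which the naive prediction exp(predicted log y) is multiplied by exp(sigma²/2), with sigma² estimated from the log-scale residuals. Whether this or another adjustment (for example one based on averaging the exponentiated residuals) was shown on the slide is not known; the function name and inputs are hypothetical.

```python
import numpy as np


def back_transform(log_pred, resid, n_params):
    """Back-transform log-scale predictions to the original scale.

    log_pred : predictions of log(y)
    resid    : log-scale residuals from the fitted model
    n_params : number of fitted coefficients, including the intercept

    Returns (naive, adjusted), where adjusted applies the factor
    exp(sigma2 / 2) that is appropriate if the log-scale errors are normal.
    """
    log_pred = np.asarray(log_pred, dtype=float)
    resid = np.asarray(resid, dtype=float)
    sigma2 = np.sum(resid**2) / (len(resid) - n_params)
    naive = np.exp(log_pred)
    adjusted = np.exp(sigma2 / 2.0) * naive
    return naive, adjusted


# Hypothetical log-scale predictions and residuals, for illustration only.
naive, adjusted = back_transform(
    log_pred=[1.0, 2.0, 3.0],
    resid=[0.1, -0.1, 0.2, -0.2],
    n_params=1,
)
print("naive:", naive)
print("adjusted:", adjusted)
```

Because the factor exp(sigma²/2) is always at least 1, the adjusted predictions are never below the naive ones, which matches the slide's point that the simple back-transformation underestimates.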

19 Benefits to business survey estimation
Surveys considered: Monthly Production Inquiry (MPI); Monthly Inquiry into Distribution Services Sector (MIDSS).
The evaluation uses an expansion estimator and, assuming Neyman allocation, the variance due to stratification.
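The slide's formulas are missing from this transcript. The sketch below uses the standard textbook forms, which may differ in notation or detail from what the authors showed: the expansion estimator of a total is the sum over strata of N_h times the stratum sample mean, and under Neyman allocation (n_h proportional to N_h·S_h) the variance of that estimator is (Σ N_h S_h)²/n − Σ N_h S_h². All inputs are hypothetical.

```python
def expansion_estimate(pop_sizes, samples):
    """Estimated total: sum over strata of N_h * (stratum sample mean)."""
    return sum(N * sum(s) / len(s) for N, s in zip(pop_sizes, samples))


def neyman_allocation(n, pop_sizes, std_devs):
    """Stratum sample sizes proportional to N_h * S_h (not rounded)."""
    weights = [N * S for N, S in zip(pop_sizes, std_devs)]
    total = sum(weights)
    return [n * w / total for w in weights]


def neyman_variance(n, pop_sizes, std_devs):
    """Variance of the expansion estimator under Neyman allocation."""
    sum_ns = sum(N * S for N, S in zip(pop_sizes, std_devs))
    sum_ns2 = sum(N * S * S for N, S in zip(pop_sizes, std_devs))
    return sum_ns**2 / n - sum_ns2


# Hypothetical two-stratum frame: stratum sizes, standard deviations of the
# survey variable within strata, and a total sample size of 30.
pop_sizes = [100, 50]
std_devs = [2.0, 10.0]
print(neyman_allocation(30, pop_sizes, std_devs))
print(neyman_variance(30, pop_sizes, std_devs))
```

A better stratifier makes strata more homogeneous, which lowers the S_h values and therefore this variance for a fixed sample size.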

20 Impact on Monthly Surveys
Variance indicator, by stratification variable
Stratification Variable   MPI Turnover   MIDSS Turnover
Register Employment       32.4           181.5
Register FTE              31.9           141.7
Modelled FTE              31.6           133.0
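To make the comparison easier to read, the sketch below expresses each variance indicator as a percentage reduction relative to stratifying on register employment. The figures are taken from the slide; the percentage-reduction summary and the names used are additions for illustration.

```python
# Variance indicators from the slide, by survey and stratification variable.
VARIANCE_INDICATOR = {
    "MPI Turnover": {
        "Register Employment": 32.4,
        "Register FTE": 31.9,
        "Modelled FTE": 31.6,
    },
    "MIDSS Turnover": {
        "Register Employment": 181.5,
        "Register FTE": 141.7,
        "Modelled FTE": 133.0,
    },
}


def pct_reduction(baseline, alternative):
    """Percentage reduction of alternative relative to baseline."""
    return 100.0 * (baseline - alternative) / baseline


reductions = {
    survey: {
        name: pct_reduction(values["Register Employment"], value)
        for name, value in values.items()
        if name != "Register Employment"
    }
    for survey, values in VARIANCE_INDICATOR.items()
}

for survey, by_variable in reductions.items():
    for name, pct in by_variable.items():
        print(f"{survey}, {name}: {pct:.1f}% lower than Register Employment")
```

On these figures the reduction for MPI is about 1.5% with register FTE and 2.5% with modelled FTE, while for MIDSS it is about 22% and 27% respectively.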

21 Concluding Remarks
Model identified for predicting FTE employees: high R² and high predictive power; non-constant variance; large reliance on one covariate, employment headcount.
Benefits to sample design and estimation: FTE is a useful frame variable; greatest benefit to sampling in service industries; additional benefit from modelling appears small.

22 Areas for further work
Improvements to modelling: heteroscedasticity (multilevel modelling?); more recent data (2005-2008); BRES data.
Improvements to evaluation: impact on other business sample surveys; impact at industry level; impact under ratio estimation; correlations between modelled FTE and survey variables (FTE as auxiliary); pilot study.

23 Questions?
Thank you for listening.
Contact: alan.bentley@ons.gov.uk

