Implementation of a more efficient way of collecting data
SBS: use of administrative data
Statistics Belgium, June 2009
Content
  Prefill: rearrange data streams
  Electronic data collection
  Streamlining statistical production
  Sampling
  Estimation methods
Prefill: rearrange data streams
  Before: different parts collected via 2 separate channels:
    Postal survey with follow-up mailings
    Annex to balance sheet (XML or paper)
  Postal survey starting in April replaced by websurvey starting in September
    Annual accounts must be filed with the Central Balance Sheet Office within seven months after the end of the financial year
  (Sub)totals from Balance Sheet prefilled in webform
    Link between Company Accounts and questionnaire is clarified
    Better understanding of what is asked
    Coherence of data submitted by respondent
  Websurvey application allows respondent to upload data from Company Accounts (XBRL)
    No need for separate channels
Prefill: rearrange data streams
  Timely access to Balance Sheet data becomes more important
  Before:
    Data available via FTP
    Manual intervention: 4 persons
  Now:
    Data available via Web Service
    Update fully automated: application runs on a schedule on the server
    Latest Balance Sheet data prefilled in SBS webform
Electronic data collection
  Basis: Eurostat XBRL reference architecture
    Developed in XBRL Pilot Project
    EU Public Licence
    http://forge.osor.eu/projects/eurostat-xbrl/
  Added:
    Generation of prefilled forms
    Upload of stored instances
    Integration with user management
    Complex business rules
    Printing to PDF
    Enhanced look and feel
Electronic data collection
  95% of balance sheets are filed electronically with the National Bank (NBB)
  Major accounting software vendors have incorporated XBRL reporting
  SBS taxonomy is integrated with NBB taxonomy: 1 “Data Type” and 1 “Value List” at national level
  Software vendors are ready to create SBS XBRL instances
  Respondents can import accounts data at the push of a button
Electronic data collection
  Login at NSI’s website
  List of forms and their status
Electronic data collection
  Import XBRL instance
  Fill out webform
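The import step can be illustrated with a minimal sketch. It assumes a heavily simplified XBRL instance; the sbs namespace URI, the element names (Turnover, Employees), the identifier and the form field names are all hypothetical and are not taken from the actual SBS or NBB taxonomy. It only shows the idea of copying reported facts into matching, still empty webform fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XBRL instance. The "sbs" namespace URI, the
# element names and the identifier are invented for illustration only.
INSTANCE = """<?xml version="1.0"?>
<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
      xmlns:sbs="http://example.org/sbs-taxonomy">
  <context id="FY2008">
    <entity>
      <identifier scheme="http://example.org/vat">BE0123456789</identifier>
    </entity>
    <period><startDate>2008-01-01</startDate><endDate>2008-12-31</endDate></period>
  </context>
  <unit id="EUR"><measure>iso4217:EUR</measure></unit>
  <unit id="pure"><measure>pure</measure></unit>
  <sbs:Turnover contextRef="FY2008" unitRef="EUR" decimals="0">1250000</sbs:Turnover>
  <sbs:Employees contextRef="FY2008" unitRef="pure" decimals="0">12</sbs:Employees>
</xbrl>
"""

SBS_NAMESPACE = "http://example.org/sbs-taxonomy"  # assumption, not the real URI


def extract_facts(xml_text, namespace):
    """Return {local element name: text value} for top-level facts in `namespace`."""
    root = ET.fromstring(xml_text)
    prefix = "{" + namespace + "}"
    facts = {}
    for child in root:
        # ElementTree expands qualified names to "{namespace-uri}localname".
        if child.tag.startswith(prefix):
            facts[child.tag[len(prefix):]] = (child.text or "").strip()
    return facts


def prefill_form(form_fields, facts):
    """Copy imported facts into matching empty fields; leave other fields untouched."""
    filled = dict(form_fields)
    for name, value in facts.items():
        if name in filled and filled[name] is None:
            filled[name] = value
    return filled


facts = extract_facts(INSTANCE, SBS_NAMESPACE)
form = prefill_form({"Turnover": None, "Employees": None, "Investments": None}, facts)
print(form)
```

Fields without a matching fact (here the hypothetical Investments field) stay empty, so the respondent still fills them out in the webform.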
Streamlining statistical production
  Example: administrative data
    Centralised validation and cleaning of administrative data
    Systematic use of statistical software to detect and prevent errors
    Module code versioning
    Quality reporting
Sampling
  Two competing options: Generalized Hidiroglou-Lavallée versus ad-hoc strata bounds
  Stratification designs:
    Bi-dimensional strata (geography and sector of activity)
    Each stratum in 3 classes:
      Large (exhaustive)
      Medium (simple random sampling)
      Small (zero probability of being sampled)
  Once the sample target has been redefined by dropping the statistical units that are out of scope according to an economic criterion, the strata bounds are defined. This is done with a modified Hidiroglou-Lavallée (HL) algorithm for log-linear and heteroscedastic linear regression relationships between the auxiliary variable and the target variable, using the Neyman allocation.
  Sector of activity (4-digit NACE) x size class
    Strata   Employees   and/or   VAT in millions €
    5        50+                  -
    4        20 - 49     or       > 8,00
    3        10 - 19              4,00 - 8,00
    2        5 - 9                2,00 - 4,00
    1        1 - 4                0,80 - 2,00
                         and      < 0,80
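The Neyman allocation mentioned above distributes a fixed sample size over the sampled strata in proportion to N_h x S_h, where N_h is the number of units in stratum h and S_h the standard deviation of the target variable in that stratum. The sketch below covers only this allocation step, not the HL algorithm for the strata bounds; the stratum names and figures are hypothetical, and the result is neither rounded nor capped at N_h.

```python
def neyman_allocation(strata, n_total):
    """Allocate n_total over strata proportionally to N_h * S_h.

    strata: dict mapping stratum name -> (N_h, S_h)
    Returns a dict mapping stratum name -> n_h (unrounded, not capped at N_h).
    """
    weights = {name: size * std for name, (size, std) in strata.items()}
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("all strata have zero size or zero variance")
    return {name: n_total * w / total_weight for name, w in weights.items()}


# Hypothetical sampled ("medium") strata: (number of units, standard deviation).
medium_strata = {
    "medium_a": (1000, 2.0),
    "medium_b": (500, 6.0),
}

allocation = neyman_allocation(medium_strata, n_total=100)
print(allocation)
```

In this made-up case the smaller but more variable stratum receives the larger share (60 of 100 units), which is the point of the allocation: sample size goes where the variability is.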
Sampling
  Simulation study based on SBS sampling frame
Estimation methods
  Sampling and estimation: one strategy
  Strata by:
    Activity (4-digit NACE)
    8 size classes (number of employees and/or VAT threshold)
  Exhaustive sample in largest stratum (holds at least 50% of turnover in sector of activity)
  Other strata: positive coordination with T in T+1 and T+2, followed by negative coordination
    T: questionnaire, T+1 and T+2: estimation, T+3: questionnaire…
  Rotating sample design: sample renewed consecutively in Industry, Commerce, Services
  Nedyalkova, D., Pea, J. and Tillé, Y. (2008a). Sampling Procedures for Coordinating Stratified Samples: Methods Based on Microstrata. Int. Stat. Rev., 76, 368-386.
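The three-year rotation (questionnaire in T, estimation in T+1 and T+2, questionnaire again in T+3) can be sketched as a small schedule function. The base year and the assumption that Industry starts the cycle, followed by Commerce and Services in consecutive years, are illustrative choices; the sketch does not implement the sample coordination itself.

```python
SECTORS = ["Industry", "Commerce", "Services"]
CYCLE_LENGTH = 3  # questionnaire in T, estimation in T+1 and T+2


def collection_mode(sector, year, base_year=2009):
    """Return "questionnaire" or "estimation" for a sector in a given year.

    base_year is a hypothetical year in which Industry receives the
    questionnaire; the other sectors follow one and two years later.
    """
    offset = SECTORS.index(sector)
    in_renewal_year = (year - base_year - offset) % CYCLE_LENGTH == 0
    return "questionnaire" if in_renewal_year else "estimation"


# Print the schedule for four consecutive years.
for year in range(2009, 2013):
    modes = {sector: collection_mode(sector, year) for sector in SECTORS}
    print(year, modes)
```

Under these assumptions exactly one sector is surveyed by questionnaire each year, while the other two are covered by estimation.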
Estimation methods
  In this example:
    All sampled enterprises in Industry receive a questionnaire
    New enterprises that enter the sample receive a questionnaire
  Total sample increased from +/- 41.000 to 55.000 to compensate for the precision loss caused by estimating
  Number of enterprises that receive a questionnaire every year reduced by almost 50%
  Total number of enterprises that receive a questionnaire reduced by 35% (average)