ESSnet DWH: business register
Pieter Vlag
Outline
- Central role in a statistical DWH of the statistical units and the population frame, which includes the number of enterprises, total turnover derived from Value Added Tax (VAT) data, and total employment derived from social security data
- How to deal with different units in different sources?
- Feedback of revised unit, population, turnover and employment data in the DWH to the original sources (SBR, VAT, social security data)
Definition of a statistical data warehouse
The broad definition of a data warehouse to be used in this ESSnet is therefore: ‘A common conceptual model for managing all available data of interest, enabling the NSI to (re)use this data to create new data/new outputs, to produce the necessary information and perform reporting and analysis, regardless of the data’s source.’
The statistical DWH (1)
As the staging area is “core business” for NSIs, the term statistical DWH is used for the staging area plus the warehouse.
The statistical DWH (2)
Necessity of population frame
Data sources: admin data, survey 1, survey 2, big data
- Different sources cover different enterprises -> information about which enterprises?
- Timing of availability of the sources differs -> when is a complete description available?
Statistical DWH with a population frame
Population
- Data source 1: admin data 1
- Data source 2: big data
- Data source 3: survey 1
- Data source 4: survey 2
Advantage: the coverage of the DWH is known (e.g. which enterprises are included in the DWH)
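The coverage idea can be illustrated with a minimal sketch. All enterprise IDs, source contents and the function name `coverage_report` are hypothetical and only serve the illustration; the sketch assumes every source has already been linked to enterprise IDs.

```python
# Hypothetical population frame: enterprise IDs that define the target population.
population_frame = {"E001", "E002", "E003", "E004", "E005", "E006"}

# Hypothetical data sources, each covering a different set of enterprises.
sources = {
    "admin data 1": {"E001", "E002", "E003"},
    "big data": {"E002", "E005", "E999"},  # E999 is not in the frame
    "survey 1": {"E001", "E004"},
    "survey 2": {"E003"},
}


def coverage_report(frame, sources):
    """Return per-source coverage of the frame and the frame units no source covers."""
    per_source = {}
    seen = set()
    for name, ids in sources.items():
        inside = ids & frame
        per_source[name] = {
            "covered": len(inside),
            "rate": len(inside) / len(frame),
            "outside_frame": sorted(ids - frame),
        }
        seen |= inside
    uncovered = sorted(frame - seen)
    return per_source, uncovered


per_source, uncovered = coverage_report(population_frame, sources)
for name, stats in per_source.items():
    print(name, stats)
print("not covered by any source:", uncovered)
```

Without the frame, neither the coverage rate per source nor the list of enterprises missing from every source could be computed.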
Units and target population
The population should be known for the preparation phase, the integration phase and the actual data warehouse:
- the actual data warehouse, e.g. “about which enterprises is there information?”
- its preparation phase, e.g. when linking data sources
Population aspects:
- Statistical unit (source: SBR)
- Number of enterprises (source: SBR)
- Turnover (source: VAT, via SBR?)
- Employment (source: social security, via SBR?)
Proposal I
Only the statistical unit (= enterprise) is used
- for data linking
- for processing in the statistical DWH
Justification: most obvious choice, ESSnet on Consistency, maintenance
Ideal world versus reality
In the ideal world
- only one unique ID exists for each enterprise
- the definition of the enterprise corresponds with the statistical unit
In practice
- several countries don’t have a unique ID
- different units exist (legal, tax, etc.)
Therefore...
ESSnet DWH: business register
[Figure: units by stage]
- Input: legal unit, “accounting” unit, “VAT unit”, “other tax” units, other units
- In S-DWH processing: enterprise (= statistical unit), enterprise group
- Output: enterprise, local unit, LKAU, KAU, enterprise group
Unit base
Complexity of the unit base depends on
- the scope of the statistical DWH
- national legislation (practices) with respect to enterprise units
The unit base is closely related to the Business Register. If it is complex, the recommendation is to place this base outside the Business Register, for reasons of
- maintenance
- more flexibility in case of new inputs and outputs
- more transparency in case of linking errors
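A unit base kept as a separate lookup could look like the sketch below. The unit IDs, the `(unit type, unit ID)` key layout and the function `turnover_by_enterprise` are assumptions for illustration; real unit bases may also need many-to-many relations and validity periods, which are left out here.

```python
from collections import defaultdict

# Hypothetical unit base: (unit type, source unit ID) -> enterprise ID.
unit_base = {
    ("vat", "V10"): "E001",
    ("vat", "V11"): "E001",  # two VAT units belong to one enterprise
    ("vat", "V20"): "E002",
    ("legal", "L77"): "E002",
}

# Hypothetical VAT records: (VAT unit ID, turnover).
vat_records = [("V10", 100.0), ("V11", 50.0), ("V20", 70.0), ("V99", 5.0)]


def turnover_by_enterprise(records, unit_base, unit_type="vat"):
    """Aggregate source-unit turnover to enterprises; collect units that fail to link."""
    totals = defaultdict(float)
    unlinked = []
    for unit_id, turnover in records:
        enterprise = unit_base.get((unit_type, unit_id))
        if enterprise is None:
            unlinked.append(unit_id)  # kept visible instead of silently dropped
            continue
        totals[enterprise] += turnover
    return dict(totals), unlinked


totals, unlinked = turnover_by_enterprise(vat_records, unit_base)
print(totals)
print("unlinked VAT units:", unlinked)
```

Keeping the lookup apart from the register makes linking errors easy to list (here `V99`), and a new input type only adds entries under a new unit type.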
Position of Business Register in statistical DWH
[Figure. Inputs: survey, units, tax, big data, other. Population frame: SBR, VAT, employment. Steps: GSBPM 5.1 “link & integrate”; GSBPM “process”; GSBPM “calculate aggregates”; check processing. “Data warehouse”. Outputs: output 1, output 2, output 3.]
SBR and statistical DWH (1)
- SBR = source of units + population (number of enterprises)
- VAT = source of turnover
- Social security = source of employment
Population, turnover and employment, together and integrated, are the authentic source to which all other data are linked.
It is assumed that the authentic source is correct unless proven otherwise.
SBR and statistical DWH (2)
Does this mean that the SBR (and the VAT and employment registers) is part of the statistical DWH?
Not necessarily: a copy of the population characteristics for period t can be derived from the SBR and used in the statistical DWH.
“SBR outside the statistical DWH” (~50 % preference of NSIs)
- Pros: easier maintenance, no conflicts with surveys
- Cons: feedback to the SBR needed in case of adjustments
Alternatively, the SBR is integrated in the statistical DWH.
“SBR inside the statistical DWH” (~50 % preference of NSIs)
- Pros: no feedback to the SBR needed
- Cons: maintenance (especially with VAT + employment)
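The “SBR outside” option can be sketched as a frozen copy taken for period t. The register contents, the attribute `activity` and the function `snapshot` are hypothetical; a real SBR would be a database, not a dictionary.

```python
import copy

# Hypothetical live SBR: enterprise ID -> population characteristics.
sbr = {
    "E001": {"activity": "retail", "active": True},
    "E002": {"activity": "transport", "active": True},
}


def snapshot(register, period):
    """Derive an independent copy of the register for one reference period."""
    return {"period": period, "units": copy.deepcopy(register)}


# Copy used inside the statistical DWH for period t.
frame_t = snapshot(sbr, "t")

# Later maintenance of the live SBR does not affect the copy.
sbr["E001"]["activity"] = "wholesale"
sbr["E003"] = {"activity": "construction", "active": True}

print(frame_t["units"]["E001"]["activity"])  # still the value at period t
print(sorted(frame_t["units"]))
```

The copy stays stable while surveys and other sources are linked to it. The price is that adjustments made inside the DWH do not reach the SBR unless they are fed back explicitly.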
SBR and statistical DWH (3)
Does this mean that totals of VAT turnover and “register” employment are calculated within the SBR?
Not necessarily. Especially for STS and specialised low-aggregate estimates, the following may be desired:
- knowledge of (other sources of) the branch
- thorough analyses
- estimation techniques
In this case a separate system for estimating VAT turnover and “register” employment is advised. The decision is up to the NSIs.
Option of definition SBR in statistical DWH (2 extremes)
[Figure. Inputs: survey, units, tax, big data, other. Population frame: SBR, VAT, employment. Steps: GSBPM 5.1 “link & integrate”; GSBPM “process”; GSBPM “calculate aggregates”; check processing. “Data warehouse”. Outputs: output 1, output 2, output 3.]
Feedback to SBR
Only if “SBR outside”.
In case of conflicting information between data sources and the authentic source (and indirectly the SBR), two questions arise:
- When to incorporate corrections in the statistical DWH? When sure of an influential error.
- When to incorporate corrections in the SBR? At certain time periods (end of year, etc.)
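The two timing questions can be sketched as a small feedback loop. The classes `Correction` and `FeedbackLoop`, the flags `influential` and `confirmed`, and the rule that every correction accepted in the DWH is later forwarded to the SBR are assumptions for illustration; how an error is judged influential is outside the sketch.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Correction:
    enterprise_id: str
    attribute: str
    new_value: object
    influential: bool  # assumed to be decided by an upstream check
    confirmed: bool    # assumed: an analyst is sure the source value is wrong


@dataclass
class FeedbackLoop:
    dwh_frame: dict
    sbr: dict
    pending_for_sbr: list = field(default_factory=list)

    def report(self, c):
        """Apply a correction in the DWH at once only if it is a confirmed influential error."""
        if c.influential and c.confirmed:
            self.dwh_frame[c.enterprise_id][c.attribute] = c.new_value
            self.pending_for_sbr.append(c)  # the SBR is updated later
            return True
        return False

    def flush_to_sbr(self):
        """Forward queued corrections to the SBR, e.g. at the end of the year."""
        applied = len(self.pending_for_sbr)
        for c in self.pending_for_sbr:
            self.sbr[c.enterprise_id][c.attribute] = c.new_value
        self.pending_for_sbr.clear()
        return applied


sbr = {"E001": {"turnover": 100.0}, "E002": {"turnover": 70.0}}
dwh_frame = copy.deepcopy(sbr)  # "SBR outside": the DWH works on a copy
loop = FeedbackLoop(dwh_frame, sbr)

loop.report(Correction("E001", "turnover", 1000.0, influential=True, confirmed=True))
loop.report(Correction("E002", "turnover", 71.0, influential=False, confirmed=True))

sbr_before_flush = sbr["E001"]["turnover"]  # SBR not yet touched
applied = loop.flush_to_sbr()
print(dwh_frame["E001"]["turnover"], sbr_before_flush, sbr["E001"]["turnover"], applied)
```

Between the correction and the flush, the DWH copy and the SBR differ on purpose. That window is the maintenance cost of the “SBR outside” option.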
Last slide
Thank you for your attention. Any questions or comments?