“The infrastructure for the SBS-Frame production in ISTAT”


1 “The infrastructure for the SBS-Frame production in ISTAT”
National Institute of Statistics – Italy
Francesco Altarocca, ISTAT
Diego Bellisai, ISTAT
Antonio Laureti Palma, ISTAT
CoE Workshop on Data Warehousing, Wiesbaden, 23–24 November 2016

2 Summary
- SBS-Frame production process
- The SBS-Frame Data Warehouse
- The data-centric workflow
- Frame-SBS application

3 SBS-Frame in the context of business statistics
The SBS Frame is an archive of the main annual economic variables for all active Italian enterprises (around 4.4 million units), derived from administrative data sources. The Frame lets ISTAT obtain, by simple summation over units, the main economic aggregates required by the Eurostat SBS (Structural Business Statistics) Regulation. It is the new base for the National Accounts ESA 2010 (SEC 2010) estimates. The Frame also lets ISTAT overcome the limited estimation domains of sample surveys: accurate estimates become possible for a large number of sub-populations.
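To make "aggregates obtained by sum" concrete, here is a minimal sketch in Python. It assumes the Frame can be read as one record per enterprise; the field names (`unit_id`, `ateco`, `turnover`) and the values are illustrative assumptions, not taken from the actual Frame. Because every unit is present, any sub-population total is just a grouped sum, with no sampling weights involved.

```python
from collections import defaultdict

# Hypothetical micro-data: one record per enterprise.
# Field names and values are illustrative only.
frame = [
    {"unit_id": 1, "ateco": "10", "turnover": 120.0},
    {"unit_id": 2, "ateco": "10", "turnover": 80.0},
    {"unit_id": 3, "ateco": "47", "turnover": 50.0},
]

def aggregate_by_sum(rows, domain_key, variable):
    """Sum one variable over all units, grouped by an arbitrary domain."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[domain_key]] += row[variable]
    return dict(totals)

# Totals by economic activity code; any other field could serve as the domain.
totals_by_ateco = aggregate_by_sum(frame, "ateco", "turnover")
```

Changing `domain_key` is all it takes to move to a different estimation domain, which is the practical advantage over a sample survey designed for fixed domains.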

4 The Frame’s Sources
| ID | Source | Description | Supplier | Units | #vars |
|---|---|---|---|---|---|
| FS | Financial Statements | annual profit and loss statements of limited liability companies | Chambers of Commerce | 750K | ~300 |
| SS | Sector Studies survey | SMEs with turnover in [30K-7.5M] euros | Italian Revenue Agency | 3.5M | ~60 |
| UN | Tax returns form | unified model of tax declarations by legal form, containing economic information for different legal forms | | 4.4M | |
| IRAP | Regional Tax on Productive Activities form | model of declaration for Regional Tax on Productive Activities payment | | | ~70 |
| SME | Small Medium Ent. Survey | sample survey on enterprises with less than 100 employees | ISTAT | 100K | ~200 |
| RACLI | Labour Cost by Enterprise Reg. | Register of Labour Cost by Enterprise | | 1.5M | ~20 |
| SBR | Business Register | Italian official Business Register of Active Enterprises | | | |

5 The Frame’s Sources: integrated micro-data view
[Diagram: integrated micro-data view. Rows are the SBR units, 1 to N (4.3 million); column labels include Units ID, Ateco, NEm, TBR, PC, WS, WH, SC and the variables Y1 ... Yk ... Yp. Source coverage: UNICO 97% of SMEs; SS 80% of SMEs; RACLI ~33% of SMEs; FS 16% of SMEs; Survey; not covered ~4%.]

6 pre-treatment of the sources

7 Production process: pre-treatment

8 SBS-Frame process features
- external administrative sources
- variability of the sources
- annual supply
- iterative activities
- many actors’ interactions
- different actor skills
- complex workflow
- distributed computing
- tracking methodological choices
- replicability of results
- documenting processes
- storing distributed knowledge
- safety in the production process

9 Frame-SBS Data Warehouse
To support the SBS process we created an ad hoc S-DWH with the following characteristics:
- specialized in structural business statistics;
- metadata-driven model;
- based on 4 layers;
- an easy-to-use environment for accessing complex data;
- basic software interfaces for data I/O;
- control of information visibility.
The S-DWH is based on two sub-modules:
- View Builder: module for building queries, with the possibility of a data preview;
- View Manager: module for managing the views created.
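As a rough illustration of what a metadata-driven query builder with a data preview could look like, the sketch below composes a SELECT statement from names checked against a metadata dictionary and runs it with a row limit. Everything here is an assumption made for the example: the `METADATA` dictionary, the `frame_units` table, its columns, and the use of an in-memory SQLite database. It is not a description of the actual View Builder.

```python
import sqlite3

# Hypothetical metadata: the columns each warehouse table exposes.
METADATA = {"frame_units": ["unit_id", "ateco", "turnover"]}

def build_view_query(table, columns, preview_rows=None):
    """Compose a SELECT from metadata-checked names; optionally limit rows for a preview."""
    allowed = METADATA[table]  # KeyError if the table is not described in the metadata
    unknown = [c for c in columns if c not in allowed]
    if unknown:
        raise ValueError(f"columns not in metadata: {unknown}")
    query = f"SELECT {', '.join(columns)} FROM {table}"
    if preview_rows is not None:
        query += f" LIMIT {int(preview_rows)}"
    return query

# Tiny in-memory stand-in for the warehouse, with illustrative values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE frame_units (unit_id INTEGER, ateco TEXT, turnover REAL)")
conn.executemany(
    "INSERT INTO frame_units VALUES (?, ?, ?)",
    [(1, "10", 120.0), (2, "10", 80.0), (3, "47", 50.0)],
)

query = build_view_query("frame_units", ["unit_id", "turnover"], preview_rows=2)
preview = conn.execute(query).fetchall()
```

Checking names against the metadata before building the string keeps the builder from emitting queries on tables or columns the user is not meant to see, which is one simple way to read "control of information visibility".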

10 data-centric workflow application
We decided to manage the complexity of the procedure with a workflow application. We built it on a data-centric approach, i.e. fully integrated with the S-DWH. The data-centric workflow is a specialized statistical workflow designed to compose and execute a flow of statistical procedures. Basic requirements are:
- a collaborative environment;
- support for statistical software development;
- support for iterative hypothesis-and-test activities;
- sharing knowledge and tracking choices.

11 data-centric workflow, model and features
Basic elements: Process, Phase, Task.
- a process manager manages Phases;
- a phase manager manages at least one Phase;
- tasks are the basic elements;
- each task is associated with software modules (SAS, PL/SQL, R).
[Diagram: Process P; Phase H1, Phase H2; Task T1, Task T2, Task T3, Task T4.]
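The Process, Phase, Task hierarchy can be sketched as a small data model. This is only one possible reading of the slide: the class fields, the manager names, the script paths and the assignment of T1 and T2 to H1 and of T3 and T4 to H2 are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    name: str
    language: str   # e.g. "SAS", "PL/SQL" or "R"
    script: str     # hypothetical path to the software module

@dataclass
class Phase:
    name: str
    manager: str    # the phase manager responsible for this phase
    tasks: List[Task] = field(default_factory=list)

@dataclass
class Process:
    name: str
    manager: str    # the process manager, who manages the phases
    phases: List[Phase] = field(default_factory=list)

    def all_tasks(self) -> List[Task]:
        """Flatten the hierarchy into the list of tasks, phase by phase."""
        return [task for phase in self.phases for task in phase.tasks]

# Assumed layout: which tasks belong to which phase is not stated on the slide.
process = Process(
    name="P",
    manager="process-manager",
    phases=[
        Phase("H1", "phase-manager-a", [Task("T1", "SAS", "t1.sas"), Task("T2", "R", "t2.R")]),
        Phase("H2", "phase-manager-b", [Task("T3", "PL/SQL", "t3.sql"), Task("T4", "R", "t4.R")]),
    ],
)
task_names = [task.name for task in process.all_tasks()]
```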

12 data-centric workflow characteristics
Flexible process management:
- abstraction for decoupling source variability from domain variables (mapper);
- easy adaptation to variable information needs;
- iterative and incremental process (hypothesis, test);
- dependency management;
- scientific WF vs industrial WF.
Automatic process documentation and information gathering:
- storing and versioning scripts;
- keeping track of task executions and logs;
- interactive log viewer for early problem detection;
- production process replicability.
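Dependency management and execution tracking can be illustrated with a minimal sketch, assuming tasks declare the tasks they depend on and each run appends an entry to a log. The task graph, the no-op actions and the log fields are hypothetical; the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from datetime import datetime, timezone
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
dependencies = {
    "T1": set(),
    "T2": {"T1"},
    "T3": {"T1"},
    "T4": {"T2", "T3"},
}

def run_tasks(dependencies, actions):
    """Run tasks in dependency order and record one log entry per execution."""
    log = []
    for name in TopologicalSorter(dependencies).static_order():
        started = datetime.now(timezone.utc).isoformat()
        try:
            actions[name]()
            status = "ok"
        except Exception as exc:
            status = f"failed: {exc}"
        log.append({"task": name, "started": started, "status": status})
        if status != "ok":
            break  # stop early so downstream tasks never run on bad input
    return log

# Placeholder actions; in practice each would launch a SAS, PL/SQL or R module.
actions = {name: (lambda: None) for name in dependencies}
execution_log = run_tasks(dependencies, actions)
```

Keeping the log as plain records makes it easy to feed a viewer, and stopping at the first failure is one simple way to surface problems early.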

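13 SBS-Frame logical view
[Diagram: SBS-Frame logical view. The data-centric workflow runs the PROCESSES, made of phases (labels: source interpretation, integration), and reaches the DATA in the S-DWH through a DW access layer. IN: the sources Bilanci (BIL, financial statements), Studi di Settore (SS, sector studies), UNICO (UN) and IRAP. OUT: FS, SS, UN, IRAP.]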

14 Frame-SBS application
[Screenshot labels: VIEWER, data-centric workflow, MAPPING.]

15 thanks for your attention

