1
ESSnet ValiDat Foundation
Mutual Interests – Requirements for a European Infrastructure from a Member State perspective
2
Motivation
When Eurostat started its internal project on validation policy, the goals were rather limited:
- Decrease the number of data transmissions from the member states to Eurostat
- Increase the transparency of the validation rules applied to the national aggregates
More pragmatically:
- Data ping-pong causes costs and problems in delivering data punctually
- EDIT as a temporary solution/workaround
- Discussion in the EDAMIS and Validation User Group (EVUG)
- Controversial discussion about the attractiveness of the ESS.VIP Validation in the ESS
3
Hindrances
Why is the attractiveness evaluated so controversially?
Objections of member states to a more centralized validation policy:
- No benefits: “What do we gain from it?”
- Higher costs: “They have not considered the specific national situation!”
- Eurostat as an intrusive body: “Do they want to control and steer national production?”
4
Solutions
How can we overcome these hindrances?
- Increase the benefits for member states by incorporating the whole statistical production chain
- Take account of national situations by jointly analysing the diversity of systems and goals
- Intensify the co-operation of the NSIs and Eurostat
  - Involve member states and Eurostat in the discussion
  - Accept the different focuses of national and international statistical offices
5
Schema of goals
[Figure: two diagrams of the relation between EStat and NSI, one labelled "Focus on transmission", the other "Focus on validation process (end to end)". © Luca Gramaglia]
6
Implications
The wider focus of the second goal has some implications:
- More involvement of NSIs (ugly, but necessary!)
- More complexity of the system
- Higher costs
- But greater benefit for each party
7
Infrastructure as proposed by Eurostat
8
Workflow as proposed by Eurostat
9
Structural elements of proposed Eurostat infrastructure
- Repository/Registry
  - Data structure (here SDMX)
  - Validation rules
- Central validation service(s)
  - Structural validation services
  - Content-based validation services
- Specification tool (validation tool graphical user interface)
10
Limitations of proposed Eurostat infrastructure
- Repository/Registry
  - Focuses on SDMX (suitable mainly for macro data)
  - Resources split across two repositories
- Validation service(s)
  - Planned only for data transmitted to Eurostat
  - Centralized service (data have to be transmitted, instead of validation rules being downloaded for a decentralized service)
- Specification tool (validation tool graphical user interface)
  - Will be specific to validation of Eurostat rules
11
Structural elements of a proposed ESS infrastructure
- Modifications to Eurostat's tools and services required
  - Repository/Registry: new requirements
  - Validation service(s): new requirements
  - Specification tool: new requirements
- New services required for some Member States
  - Mapping and transformation services
  - Individual solutions needed
12
Requirements in detail
- Scope: Production chain
- Scope: Specification and evaluation
- Language and standards
- Other functional requirements
  - Roles
  - Metadata
  - Versioning
  - Metrics
13
Scope: Which processes should be supported?
[Figure: process diagram with four positions marked "HERE"]
14
Production chain
- One result of the survey, also dealt with in the handbook, is the observation that validation is used in several phases of the production chain
- In GSBPM, validation is mentioned twice
  - “5.3 Review and validate” in the processing phase
  - “6.2 Validate outputs” in the analysis phase
- In practice, validation is also part of the data collection and the dissemination (and data transmission) phases
- From a member state perspective, the Eurostat solution focuses on the very last step of the whole process
15
Production chain: Data collection
- Validation is integrated in web forms and other channels of data entry
- In terms of efficiency, validating in the data collection phase is highly desirable (Simon: Validation, the sooner the better!)
- From a Eurostat perspective, receiving data from the MS is also a kind of data collection (but this is a bit philosophical)
- Must: A system should be able to feed data entry systems with suitable rules!
16
Data collection and validation levels
In terms of levels, structural validation (“level 0”), “level 1” and, occasionally, “level 2” validation are applied
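As a rough illustration of the first two levels, the sketch below separates a structural check (level 0) from a simple consistency check inside one record (read here as level 1). The field names and the rule are hypothetical and are not taken from the survey or the handbook.

```python
from typing import Any

# Hypothetical record structure for a data-entry form (an assumption for illustration).
REQUIRED_FIELDS = {"region": str, "employed": int, "population": int}


def structural_errors(record: dict[str, Any]) -> list[str]:
    """Level 0: does the record have the expected fields and types?"""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for {name}")
    return errors


def content_errors(record: dict[str, Any]) -> list[str]:
    """Level 1 (as read here): consistency of values within one record."""
    errors = []
    if record["employed"] > record["population"]:
        errors.append("employed exceeds population")
    return errors


def validate(record: dict[str, Any]) -> list[str]:
    """Run content checks only if the structure is sound."""
    errors = structural_errors(record)
    return errors if errors else content_errors(record)
```

Running the structural check first means the content rules can assume that every field they touch exists, which is one reason to validate as early as the data entry step.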
17
Production chain: Data processing
- Is mainly done by the NSIs on the basis of the “raw” micro data (either from a survey or using administrative data)
- Discriminates between validation and editing of the data
- Is probably the most time-consuming activity and should be automated or should at least support manual work
- Faces the more complex validation levels 2 to 4
- The validation system should be able to cover this phase well
18
Production chain: Analysing
- The term “output validation” gives some hint on its meaning
- The key feature of validation in this phase is macro editing
- Validation is output driven and will not consider all (micro) data, but only outliers with a significant influence on the final results
- Only the highest levels (3 and 4, partly 5) are covered
- Domain knowledge is of high importance
- This is regularly the level of interest of Eurostat
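The idea of looking only at units with a significant influence on the final result can be sketched as follows. This is a minimal illustration, not a method from the presentation: the influence score (deviation from an expected value relative to the total), the threshold and all names are assumptions.

```python
def influential_units(
    reported: dict[str, float],
    expected: dict[str, float],
    threshold: float = 0.05,
) -> list[str]:
    """Return the units whose deviation from the expected value
    shifts the aggregate total by more than `threshold` (as a share)."""
    total = sum(reported.values())
    if total == 0:
        return []  # no meaningful share can be computed
    flagged = []
    for unit, value in reported.items():
        influence = abs(value - expected[unit]) / abs(total)
        if influence > threshold:
            flagged.append(unit)
    return flagged
```

With such a score, a small deviation in a small unit is left alone, while a large deviation in a dominant unit is passed on to a domain expert for review.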
19
Production chain: Dissemination
- In the survey, this phase was also connected to validation processes
- Three aspects might be responsible for this assessment
  - Getting user feedback on statistical data
  - Providing quality indicators together with the data, and particularly metadata on validation
  - Connecting this phase with the data transmission to other statistical authorities
20
Production chain: Some conclusions
- Complexity: Incorporating all production phases (GSBPM 4 to 7) in one system will increase the complexity of this system
- Quality of process: Quality of data will not be something checked at the very end of the process but will be an inherent feature of the process (ideally with some process information, i.e. metadata, stored with the data itself)
- “Validation strategy”: It will allow validation rules to be tuned to each other and make the whole validation process more efficient
22
Scope: Specification and evaluation
Having covered the production chain (GSBPM 4 to 7), what about the phases 1 to 3 and 8?
[Figure: phase sequence: Specify needs, Design, Build, PRODUCTION, Evaluate]
23
Scope: Specification and evaluation
[Figure: Validation Life Cycle (Simon et al. 2014)]
24
Scope: Specification and evaluation
- The validation system will probably not have much to contribute to the needs side
- All other non-production phases can be supported by the validation system
- The validation rules graphical user interface (VR-GUI) will help to design validation rules
  - An optional metrics module connected to the VR-GUI can graphically represent the coverage and dependencies of rules
  - A simple runtime environment as part of the VR-GUI can help to check rules in an intuitive way
  - The status of a rule set (trial, productive, out of production) can be handled by the VR-GUI
25
Scope: Specification and evaluation
VR-GUI (continued)
- The design process should be collaborative, i.e. the VR-GUI should be able to handle several editors in parallel
- Validation rules can be organized hierarchically and used where appropriate
- Validation rules can be organized in sequential sets and for different purposes (e.g. in the different production phases)
- Rules can be versioned
- Rules and rule sets will be automatically transferred to the registry and from there to the different productive systems
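The properties listed for rules and rule sets (status, versions, hierarchy, ordered sets per production phase) could be captured in a small data model like the one below. It is only a sketch under assumptions: the class and field names are hypothetical, and the rule text is kept as an opaque string rather than parsed.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    TRIAL = "trial"
    PRODUCTIVE = "productive"
    OUT_OF_PRODUCTION = "out of production"


@dataclass(frozen=True)
class Rule:
    rule_id: str
    version: int
    expression: str  # rule text, e.g. written in VTL; not interpreted here


@dataclass
class RuleSet:
    name: str
    phase: str  # production phase the set is meant for (hypothetical label)
    status: Status = Status.TRIAL
    rules: list[Rule] = field(default_factory=list)         # applied in list order
    children: list["RuleSet"] = field(default_factory=list)  # hierarchical organisation

    def all_rules(self) -> list[Rule]:
        """Own rules first, then the rules of all nested sets, in order."""
        collected = list(self.rules)
        for child in self.children:
            collected.extend(child.all_rules())
        return collected
```

Keeping the status on the rule set rather than on each rule matches the wording on the previous slide, where the status is a property of a rule set.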
26
Scope: Specification and evaluation
- The registry can help with the review of the validation rules (and rule sets) by storing error reports and performance indicators as well as paradata from the productive systems
- The productive systems (services for structural and content-based validation as well as others) provide
  - data for metrics
  - comparison with observed data
  - comparison with “real data”
  - process metadata
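One simple metric that could be derived from stored error reports is the failure rate per rule. The sketch below assumes a hypothetical report format (one dictionary per failed check with a `rule_id` key); the presentation does not specify how reports are stored.

```python
from collections import Counter


def failure_rates(reports: list[dict], records_checked: int) -> dict[str, float]:
    """Share of checked records that failed each rule.

    `reports` holds one entry per failed check; the `rule_id` key is an
    assumed field name, not a documented format.
    """
    if records_checked <= 0:
        raise ValueError("records_checked must be positive")
    failures = Counter(report["rule_id"] for report in reports)
    return {rule_id: count / records_checked for rule_id, count in failures.items()}
```

A rule that fails for almost every record, or never fails at all, is a natural candidate for review in the evaluation phase.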
28
Language: A new Sta(nda)r(d) is born
VTL: Validation and Transformation Language as standard
29
Language: A new Sta(nda)r(d) is born
VTL, Version 1.1
- Training required
- Simplifications of language constructs needed
- Tool support and supporting tools:
  - Real-world interpreters and compilers for VTL are in development
  - VR-GUI should ease the usage by
    - Syntax highlighting
    - Code completion
    - Providing known variables
    - ...
30
Standardisation
Besides the question of a common validation language, some other technical aspects have to be standardized as well
- Information models: Which information model should be used?
- Formats: How should data and metadata (e.g. validation rules) be stored physically? (SDMX?)
- Protocols: How will the technical services and tools communicate with each other? (CSPA?)
32
Other functional requirements
- Roles
  - The whole system needs an elaborate authorisation management with different degrees of visibility and editing of rules
- Metadata
  - Decoupling the logical from the physical data model is important for matching different systems within the ESS
  - The information model should provide a means for storing process data with the statistical data
33
Other functional requirements
- Versioning and status-dependent workflows
  - Validation is a process of continuous improvement (see “life cycle”)
  - The system has to be able to store different rule sets simultaneously and route data through different services depending on status
- Metrics
  - Different parts of the validation system have to be adapted to provide metrics on the quality and costs of the validation process (see above)
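Storing several versions of a rule set side by side and choosing one by status might look like the sketch below. The registry class, its method names and the use of plain strings for status values and rules are all assumptions made for illustration; no such interface is described in the presentation.

```python
class RuleRegistry:
    """Keeps every stored version of a rule set, each with its own status."""

    def __init__(self):
        # name -> {version: (status, rules)}
        self._sets = {}

    def store(self, name, version, status, rules):
        """Add a version without replacing the versions already stored."""
        self._sets.setdefault(name, {})[version] = (status, list(rules))

    def select(self, name, status):
        """Return (version, rules) of the newest version with the given
        status, or None if no stored version has that status."""
        versions = self._sets.get(name, {})
        matching = [v for v, (s, _) in versions.items() if s == status]
        if not matching:
            return None
        newest = max(matching)
        return newest, versions[newest][1]
```

A productive validation service would then ask for the newest "productive" version, while a test service could run the same data against the newest "trial" version in parallel.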
34
Requirements in detail
- Non-functional requirements
  - Adaptability (to national systems and ...)
  - Usability (for different user groups)
  - Performance (working with big datasets and complex rules)
  - Stable and error free (as central part of statistical production)
  - IT security, data protection acts and statistical confidentiality
- Organisational issues
  - Training, support and documentation have to be secured
  - Maintenance has to be secured
  - Costs (development, modification, production)
35
Types
The survey demonstrates 28 different solutions to validation in the ESS (n = 28)
Differences in
- Organisation
- Methodology
- Tools and services
36
Types
However, on an abstract level four major types occur
- Type I: Decentralized organisation, no common methodology, general-purpose tools (e.g. Excel, SAS, SQL)
- Type II: Decentralized organisation, no or limited common methodology, specialized and domain-specific applications (applications for population, agriculture, prices, ...)
- Type III: Centralized organisation, common methodology, generic tools and services for validation (and other statistical processes) (e.g. EDIT, CANCEIS)
- Type IV: Mixed approach
37
Type 1
38
Type 2
39
Type 3
40
Types and solution(s)
Not just one solution!
- Type 1: Use common methodology, replace general tools by generic validation service
- Type 2: Modify applications with a plug-in for interpreting centrally stored validation rules, or use the generic validation service
- Type 3: Transform validation rules into the local validation language and keep the national system intact
- Type 4: Change gradually to Type 3 or use the generic validation service directly
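For the Type 3 route, the transformation step can be pictured as rendering a centrally stored rule into an expression that the national system understands. The sketch below is deliberately simplified and hypothetical: the rule is a small dictionary rather than real VTL, and the target is an R-style comparison string.

```python
# Hypothetical neutral operator codes mapped to the local (R-style) syntax.
OPERATORS = {"ge": ">=", "le": "<=", "eq": "==", "ne": "!="}


def to_local_expression(rule: dict[str, str]) -> str:
    """Render a neutral rule description as a local comparison expression.

    Expects the keys "left", "operator" and "right" (assumed format).
    """
    operator = rule["operator"]
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return f'{rule["left"]} {OPERATORS[operator]} {rule["right"]}'
```

A real transformation service would have to parse the common validation language itself and cover far more constructs; the point here is only that the national system stays untouched and receives rules in its own syntax.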
41
The PoC*!
The following presentation will show some of the more practical problems of connecting national Type 3 environments to a European validation system
- From the real world to ...
- ... VTL to ...
- ... national systems
  - CBS (flexible solution in R)
  - Destatis (within eStatistik)
* Proof of Concept
42
Next steps (from a Member State perspective)
- The ESS.VIP on Validation, the ESSnet ValiDat Foundation and this workshop all provide valuable insights into the current state of validation in the ESS
- Euphemistically speaking, the state can still be improved
- An active involvement of the Member States is important, because validation
  - should be a process from end to end
  - has to be understood as a joint endeavour of national and international statistical organisations
  - needs a degree of common understanding (and acceptance)
  - requires some investment
43
Next steps (from a Member State perspective)
Some foundations and baselines have been developed during the last years:
- A common methodology usable for the practitioner in the NSIs has been developed. Now it is time to refine this methodology and train it across the ESS
- A language has appeared that might become the lingua franca in the global statistical community. It needs to be further developed and implemented in tools, services and brains
- Eurostat is far advanced with some preliminary tools and services. Now it is time to evaluate their usability and improve them along the lines of my presentation
44
Next steps (from a Member State perspective)
- Open questions
  - The governance has to be worked on, decided and implemented
  - Further PoCs, pilots and new services should be developed in the next years
  - A second ESS.VIP, a task force, a new ESSnet and your further involvement are of paramount importance to make it a success
- Is it worth it? We believe: Yes!
45
Thank you very much for your attention!