Validation at Insee
1 – Validation prior to transmission to the EU
2 – French uses of Eurostat validation tools
3 – Additional support needs in the area of validation services
4 – Lessons learned from our efforts in modernising national validation systems
1 – Validation prior to transmission to the EU
- Local solutions, the same as for national dissemination: validation at each level of management
- French specificity: studies
  - Data feed a specific publication
  - Figures are checked at each stage of the writing process
- Various tools
  - Excel / LibreOffice spreadsheets and macros
  - SAS programs
  - R programs
  - They differ from one statistics producer to another
- Not yet a single, homogeneous and shared solution
  - Is one really possible for such a wide variety of statistical production?
  - A challenge: an upcoming project?
2 – Uses of Eurostat validation tools
- EDIT tool
  - Standalone versions (on each PC)
  - Due to the nature of the data: confidential for Business demography, Ifats, EGR, TOURISM, EHISS (…)
  - Input in CSV
- STRUVAL and CONVAL
  - For SDMX files
  - Short Term Statistics, NA, Asylum, Air
  - Encapsulated in eDamis
EDIT: standalone solution
- Installation on each PC (15 + 5)
- Upload of the data (CSV)
- Checking operations (called jobs)
- Download of the feedback
- Corrections and new EDIT checking operations until successful feedback is received
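The upload, check, feedback and correction cycle described above can be sketched as a simple loop. This is only an illustration, not the EDIT tool itself: `run_job`, `validate_until_clean`, the `value` column and the two rules are hypothetical names and checks invented for the example.

```python
import csv
import io


def run_job(rows):
    """Hypothetical checking job: returns (line number, message) pairs."""
    feedback = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        value = row["value"].strip()
        if not value:
            feedback.append((line_no, "value is empty"))
        elif not value.lstrip("-").isdigit():
            feedback.append((line_no, "value is not an integer"))
    return feedback


def validate_until_clean(csv_text, correct, max_rounds=5):
    """Run checking jobs and apply corrections until the feedback is empty.

    `correct` is supplied by the caller and stands for the manual correction
    step: it receives the rows and the feedback and returns corrected rows.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for round_no in range(1, max_rounds + 1):
        feedback = run_job(rows)
        if not feedback:
            return rows, round_no  # successful feedback: the file can be sent
        rows = correct(rows, feedback)
    raise RuntimeError("feedback still reports errors after max_rounds")
```

In practice the correction step is done by hand by the statistics producer; passing it in as a function only keeps the sketch self-contained.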
EDIT
- Feedback: information on the kind of errors and where they are in the file
  - « Not so easy to understand at the beginning »
- Difference between critical and non-critical errors → send an explanatory comment
- Limits of the standalone version: no way to share the work between several people or to optimise the process. For example, one person cannot upload all the data while another follows the checks.
  → For the LC, no visibility on the checking
STRUVAL and CONVAL
- Used through eDamis
- No specific installation needed
- Use of the V dataset (sandbox, not in production)
- Feedback (as with the EDIT tool) available in eDamis (in the « Received files » menu)
- Only the correct files are sent to production
STRUVAL and CONVAL
- STRUVAL: structural validation
- Implemented since the second half of 2016
- Structure of the header → if it is not correct, the validation process stops
- Correct dimensions and attributes
- Codes (whether or not they respect the code lists)
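The three kinds of structural check listed above (header, dimensions and attributes, codes) can be illustrated with a minimal sketch. It is not STRUVAL's implementation: the `Structure` class, the `validate_structure` function and the plain dictionaries standing in for an SDMX message are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Structure:
    """Hypothetical, simplified stand-in for a data structure definition."""
    header_fields: set
    dimensions: set
    attributes: set
    codelists: dict = field(default_factory=dict)


def validate_structure(header, observations, structure):
    """Return a list of error messages; an empty list means the file passed."""
    missing = structure.header_fields - set(header)
    if missing:
        # An incorrect header stops the process: observations are not examined.
        return [f"header: missing field(s) {sorted(missing)}"]

    errors = []
    allowed = structure.dimensions | structure.attributes | {"OBS_VALUE"}
    for i, obs in enumerate(observations):
        # Every dimension must be present; attributes are optional here.
        for name in sorted(structure.dimensions - set(obs)):
            errors.append(f"obs {i}: missing dimension {name}")
        # Nothing outside the declared dimensions and attributes is accepted.
        for name in sorted(set(obs) - allowed):
            errors.append(f"obs {i}: unknown component {name}")
        # Coded components must take a value from their code list.
        for name, codes in structure.codelists.items():
            if name in obs and obs[name] not in codes:
                errors.append(f"obs {i}: code {obs[name]!r} not in code list for {name}")
    return errors
```

Returning early on a header error mirrors the behaviour noted on the slide, where the validation process stops before the content is examined.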
STRUVAL and CONVAL
- CONVAL: content validation
- Recently implemented for NA; a bit early to give an opinion
- Example of an unsuccessful CONVAL feedback: incorrect combination of OBS_VALUE and OBS_STATUS (inter-dimension / attribute checks)
- For NA we passed both validation systems but there were still errors :-)
- Upcoming?
  - Inter-series checks (Y / Y-1)
  - Thresholds for some values and variations
  - As in the EDIT tool? Possible with SDMX?
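The two kinds of content check mentioned above can be sketched as follows. The actual CONVAL rules are not given here, so both rules are assumptions: the first supposes that a missing OBS_VALUE must carry the OBS_STATUS code "M" and vice versa, the second flags a year whose value moves by more than a chosen threshold relative to Y-1. The function names and the 50 % default are hypothetical.

```python
def check_value_status(obs):
    """Return an error message for an inconsistent OBS_VALUE / OBS_STATUS pair.

    Assumed rule: "M" marks a missing value, so it must appear exactly when
    OBS_VALUE is absent. Returns None when the pair is consistent.
    """
    value, status = obs.get("OBS_VALUE"), obs.get("OBS_STATUS")
    if value is None and status != "M":
        return "missing OBS_VALUE without OBS_STATUS 'M'"
    if value is not None and status == "M":
        return "OBS_STATUS 'M' with a non-missing OBS_VALUE"
    return None


def check_year_on_year(series, max_change=0.5):
    """Return the years whose value changed by more than max_change versus Y-1.

    `series` maps a year (int) to a value. Years without a usable Y-1 value
    (absent, None or zero) are skipped rather than flagged.
    """
    flagged = []
    for year in sorted(series):
        previous = series.get(year - 1)
        current = series[year]
        if previous in (None, 0) or current is None:
            continue
        if abs(current - previous) / abs(previous) > max_change:
            flagged.append(year)
    return flagged
```

For instance, `check_year_on_year({2014: 100, 2015: 104, 2016: 210})` flags 2016 only, since the value roughly doubles compared with 2015.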
3 – Additional support needed in the area of validation services
EDIT tool
- First of all, a question: what is the future of EDIT alongside STRUVAL and CONVAL?
  - An intermediate solution until SDMX becomes the format of all datasets?
  - Is EDIT still being extended to new domains?
- For now: essentially improvement of the existing tool
  - To reduce the entry cost
  - A more user-friendly GUI
  - A more visible support team
  - Guidelines and wiki need to be promoted
  - Webinars, maybe more often
3 – Additional support needed in the area of validation services
EDIT tool
- About validation rules
  - Share the rules more widely
  - Some rules are too strict; sometimes more flexibility is needed (but we understand this is difficult to apply)
- For confidential data: a secured application instead of the standalone version, which is difficult to maintain and time-consuming
3 – Additional support needed in the area of validation services
For STRUVAL / CONVAL:
- Good idea to encapsulate the validation tool in eDamis
- Easy to use even with a basic knowledge of SDMX
- Warning: load on eDamis on peak days (29/09/17) => delays in feedback
- Some errors can still exist even though successful feedback has been received
4 – Lessons learned from our efforts in modernising national validation systems
- Reinforced our confidence in our national checking operations
- Pointed out missing controls
- Reduced the number of transmissions
- And the « ping-pong » exchanges with Eurostat
4 – Lessons learned from our efforts in modernising national validation systems
- Validation by the NSIs is a crucial need for Eurostat
  - To reduce the work of its teams
  - To homogenise the data received (so that they can be used)
- But the final users of Eurostat are also the NSIs themselves
  - Virtuous circle
  - Useful for data sharing
4 – Lessons learned from our efforts in modernising national validation systems
- The validation tools from Eurostat are already a business process, a level not yet reached in France
- We are currently redesigning our dissemination model
  → a centralised dissemination warehouse fed by various databases
  → Validation process: a challenging issue
- The EU has raised awareness of it
- A bit early yet, but we will surely benefit from Eurostat's experience in implementing CONVAL and STRUVAL
Thank you for your attention! Any questions?