On Implementing CSPA Specifications for Editing and Imputation Services
Donato Summa, Monica Scannapieco, Diego Zardetto
Istituto Nazionale di Statistica – ISTAT, Italy
The CSPA concept
National Statistical Institutes (NSIs) produce Official Statistics with very similar goals. Common activities are nevertheless carried out independently, almost without relying on shared solutions. Statistical organizations have attempted many times to share their processes, methodologies and software solutions, but integrating them has required significant work.
Donato Summa, Work Session on Statistical Data Editing, Paris, 28-30/04/2014
The CSPA concept
As part of the modernization effort in the Official Statistics field, the High Level Group for the Modernization of Statistical Production and Services (HLG) has taken action to address these issues by promoting the development and implementation of the CSPA (Common Statistical Production Architecture).
The CSPA concept
CSPA provides a template architecture for official statistics, describing:
– what the official statistical industry wants to achieve
– how the industry can achieve this, i.e. the principles that guide how statistics are produced
– what the industry will have to do, i.e. comply with the CSPA
The CSPA concept
[Figure: diagram labelled editrules, CANCEIS, SCS; Tools; Services; CSPA compliant; Platforms]
The Error Localization service
In the proof-of-concept (POC) initiative of the 2013 CSPA project, Istat took responsibility for developing the CSPA Error Localization service, acting as designer, builder and assembler. It was decided to wrap the “localizeErrors” function of the “editrules” R package, developed at Statistics Netherlands.
The Error Localization service
The data used for the test cases come from Istat’s Structure of Earnings Survey. The input unit data sets involve 20 variables. The rule set consists of 44 edits involving 17 numeric variables that appear in the unit data sets. Three different test cases share the same rule set:
– a data set of erroneous records
– a data set of exact records
– a data set of mixed records
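To make the notion of an edit concrete, the following is a minimal Java sketch of checking one record against a small rule set. The class name, the variable names (hours, wage, bonus) and the three edits are hypothetical illustrations, not the 44 edits or the 17 variables of the survey. The sketch only reports which edits a record violates; error localization goes a step further and identifies which variables should be changed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical illustration of edits on numeric variables; not the service's code. */
final class EditCheckSketch {

    /** Returns the names of the edits that the record fails, in rule-set order. */
    static List<String> violatedEdits(Map<String, Double> record,
                                      Map<String, Predicate<Map<String, Double>>> edits) {
        List<String> violated = new ArrayList<>();
        for (Map.Entry<String, Predicate<Map<String, Double>>> edit : edits.entrySet()) {
            // An edit is a condition that a valid record must satisfy.
            if (!edit.getValue().test(record)) {
                violated.add(edit.getKey());
            }
        }
        return violated;
    }

    /** A tiny, made-up rule set (assumption: these variables exist in the record). */
    static Map<String, Predicate<Map<String, Double>>> exampleEdits() {
        Map<String, Predicate<Map<String, Double>>> edits = new LinkedHashMap<>();
        edits.put("hours >= 0", r -> r.get("hours") >= 0);
        edits.put("wage >= 0", r -> r.get("wage") >= 0);
        edits.put("bonus <= wage", r -> r.get("bonus") <= r.get("wage"));
        return edits;
    }
}
```

A record with hours = 160, wage = 2000 and bonus = 2500 would fail only the third edit, which corresponds to an entry in the "erroneous records" test case; a record that fails none would belong to the "exact records" case.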
The Error Localization service
The service was implemented as a standalone Java application (an executable jar file) that wraps the “localizeErrors” function of the “editrules” R package. The jar can be called from a GUI or from the command line and is responsible for:
– taking the input parameters from the user (or application)
– invoking the execution of the R script in the R environment with the provided input parameters
– returning the output parameters (output file generation)
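The three responsibilities above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the actual Istat implementation: the class name, the script name, the order of the arguments and the idea that the R script writes its own output file are all hypothetical. The only external command assumed is Rscript, the standard command-line front end of R.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical wrapper sketch; names and argument order are assumptions. */
final class ErrorLocalizationWrapper {

    /** Builds the command line that runs the R script with the given input and output files. */
    static List<String> buildCommand(String rScript, String dataFile,
                                     String rulesFile, String outputFile) {
        return List.of(
                "Rscript",     // R's command-line front end
                "--vanilla",   // do not load or save any R workspace or profile
                rScript,       // script that would call localizeErrors (assumed to exist)
                dataFile,      // input: unit data set
                rulesFile,     // input: rule set (edits)
                outputFile);   // output: file the script is assumed to write
    }

    /** Runs the command, forwarding R's console output, and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 4) {
            System.err.println("usage: <script.R> <data file> <rules file> <output file>");
            System.exit(2);
        }
        System.exit(run(buildCommand(args[0], args[1], args[2], args[3])));
    }
}
```

Keeping the command construction separate from the process execution means a GUI and a command-line entry point could share the same logic, and the command can be checked without R being installed.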
The Error Localization service
The Error Localization service wrapped by the Java program was then deployed on CORE, thus proving the full compatibility of CSPA services with a specific NSI’s internal platform. CORE (COmmon Reference Environment) is Istat’s internal platform for executing statistical processes.
[Figure: diagram labelled Tool, CSPA, Platform, Service]
Conclusion
Istat is currently involved in the 2014 CSPA Implementation project, with the role of developing the Error Correction service. The following activities are ongoing:
– studying how to extend such a service in order to perform a full editing and imputation process
– designing a CSPA specification, to be shared and agreed among the CSPA implementation project participants
– implementing the provided specifications by means of concrete CSPA services that wrap existing tools
Thank you for your attention!