The Web-Based Information System of the Italian Population Census

Maura Giacummo, Leonardo Tininini, Marina Venturi, Antonino Virgillito

Antonino Virgillito, May 25th

Introduction

The 15th Italian Population Census
  – 24,700,000 households with 60,000,000 individuals
  – 100,000 operators
Several important innovations were introduced
Supported by a web-based information system

15th Italian Population Census: Main Innovations

Use and analysis of local population registers
  – Information basis of the census
  – Comparison and re-alignment of the data on citizens collected by the Census with the data stored in the registers
Involvement of local municipality departments in the entire census process
Use of multi-channel data collection techniques
  – Traditional paper-based data collection and online electronic questionnaire

The Census Web-Based Information System

Integrated web architecture for census management and online data collection
Designed and developed at Istat
  – Based on open-source software and platforms
Web hosting service was outsourced

The Census Web-based Information System

SGR: Management System
QPOP: Online questionnaire
RETE: Documentation portal
Common database

SGR Management System
  – Accessible to both Census office operators and Istat personnel
  – Allows operators to manage and follow all the phases of data collection
  – Distribution of workload between Istat and municipalities
  – 70 functions implemented, organized in 8 areas

SGR Monitoring of Questionnaires

SGR allowed operators to follow the evolution of every single questionnaire, in every phase
  – “Census tract agenda” control panel
Questionnaires were initially sent to families using population register data
Under-coverage was handled through enumerator fieldwork
Questionnaires could be collected in different ways
  – Web
  – Manual return to municipality
  – Return by post
  – Collected by enumerator

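The per-questionnaire monitoring described above can be illustrated with a minimal sketch. It assumes each questionnaire is recorded with the channel through which it was returned, or with no channel while it is still outstanding. The names `Channel` and `returns_by_channel` and the sample identifiers are hypothetical and are not taken from SGR.

```python
from collections import Counter
from enum import Enum


class Channel(Enum):
    """The four return channels listed on the slide."""
    WEB = "web"
    MUNICIPALITY = "manual return to municipality"
    POST = "return by post"
    ENUMERATOR = "collected by enumerator"


def returns_by_channel(questionnaires):
    """Count returned questionnaires per channel.

    `questionnaires` maps a questionnaire id to a Channel,
    or to None when the questionnaire has not been returned yet.
    """
    return Counter(ch for ch in questionnaires.values() if ch is not None)


# Hypothetical sample data for one census tract.
tract = {
    "Q001": Channel.WEB,
    "Q002": Channel.POST,
    "Q003": None,  # still outstanding
    "Q004": Channel.WEB,
}
summary = returns_by_channel(tract)
outstanding = [qid for qid, ch in tract.items() if ch is None]
```

A control panel such as the “Census tract agenda” could plausibly be built on top of counts like these, with the outstanding list driving enumerator follow-up.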
SGR Census-Register Comparison

Operators could monitor the differences between the population registers and the census results, producing a final balance sheet of families and individuals at municipal level
All inconsistencies were shown in the SGR dashboard and could be processed by operators

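As a rough illustration of the comparison, the sketch below treats the register and the census result as two sets of person identifiers and derives a small balance sheet from their differences. This is an assumption made for the example: the actual SGR matching rules are not described in the slides, and the function name and identifiers are hypothetical.

```python
def compare_census_to_register(register_ids, census_ids):
    """Build a balance sheet from two collections of person identifiers."""
    register_ids = set(register_ids)
    census_ids = set(census_ids)
    return {
        # Present in both sources: no action needed.
        "matched": sorted(register_ids & census_ids),
        # Registered but not enumerated: candidates for removal or follow-up.
        "in_register_only": sorted(register_ids - census_ids),
        # Enumerated but not registered: candidates for registration.
        "in_census_only": sorted(census_ids - register_ids),
    }


# Hypothetical identifiers for one municipality.
balance = compare_census_to_register(
    register_ids=["P001", "P002", "P003"],
    census_ids=["P002", "P003", "P004"],
)
```

The two mismatch lists correspond to the inconsistencies that operators would then process one by one.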
The Census Web-based Information System

SGR: Management System
QPOP: Online questionnaire
RETE: Documentation portal
Common database

QPOP Online questionnaire
  – Web application that citizens could use to fill in their questionnaire
  – Assisted compilation through rule checking
  – Completed questionnaires immediately available in SGR
  – Could also be used by Census office operators for data entry

QPOP Main Features

All three types of paper questionnaire were reproduced online
  – Questionnaires for households, in long and short versions
  – Questionnaire for cohabitations
Access credentials
  – User ID is the tax code of the household's reference person
  – Password printed on the paper questionnaire
Assisted input with real-time error checking
  – Only pre-checked data were inserted in the database
  – High quality of collected data
Multi-language support
State-of-the-art software design

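The idea that only pre-checked data reach the database can be sketched as follows. The three rules, the field names and the `CENSUS_YEAR` constant are hypothetical examples chosen for illustration; the real QPOP rule set is not given in the slides.

```python
CENSUS_YEAR = 2011  # assumed reference year, used only by the example rule


def check_answers(answers):
    """Return a list of error messages; an empty list means the answers pass."""
    errors = []
    year = answers.get("birth_year")
    if not isinstance(year, int) or not 1900 <= year <= CENSUS_YEAR:
        errors.append("birth_year must be a year between 1900 and 2011")
    if answers.get("employed") not in ("yes", "no"):
        errors.append("employed must be 'yes' or 'no'")
    # Cross-field rule: an occupation makes no sense for a non-employed person.
    if answers.get("employed") == "no" and answers.get("occupation"):
        errors.append("occupation must be empty when employed is 'no'")
    return errors


def save_if_valid(answers, database):
    """Append the answers to `database` only when every rule passes."""
    errors = check_answers(answers)
    if not errors:
        database.append(dict(answers))
    return errors


database = []  # stand-in for the real database table
ok = save_if_valid(
    {"birth_year": 1980, "employed": "yes", "occupation": "teacher"}, database
)
rejected = save_if_valid(
    {"birth_year": 2050, "employed": "no", "occupation": "teacher"}, database
)
```

Returning the error list instead of raising lets the user interface show every problem at once, which fits the assisted-input style described on the slide.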
QPOP Software Design Qualities

Scalability
  – No performance bottlenecks
  – Sophisticated technical solutions were adopted to control the database load even with a high number of concurrent users
Robustness
  – Rigorous test phase
  – No application errors
Security
  – Redundant access controls to prevent data corruption and intrusions
  – Redundant data controls to guarantee consistency

QPOP Metadata-based Design

All the text for the questions, response modalities and application messages is stored in dedicated metadata tables
  – Avoids redundant source code for the three questionnaires
  – Allows rapid adaptation to changes in the specifications
The question flow (“questionnaire graph”) is also stored as metadata
  – Preserves the consistency of saved data
All the metadata is cached in memory to ensure fast access and save database resources

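A minimal sketch of a metadata-driven question flow is given below. It assumes the questionnaire graph can be represented as a mapping from each question to its text and its answer-dependent successors, with `"*"` standing for "any answer", and it uses `functools.lru_cache` as a stand-in for the in-memory metadata cache. The question ids, texts and function names are hypothetical and do not reflect the actual QPOP tables.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def load_metadata():
    """Stand-in for a database read; thanks to the cache it runs once per process."""
    return {
        "q1": {"text": "Are you employed?", "next": {"yes": "q2", "no": "q3"}},
        "q2": {"text": "What is your occupation?", "next": {"*": "q3"}},
        "q3": {"text": "Where do you live?", "next": {}},
    }


def question_path(answers, start="q1"):
    """Walk the questionnaire graph and return the questions the user must see."""
    graph = load_metadata()
    path, current = [], start
    while current is not None:
        path.append(current)
        transitions = graph[current]["next"]
        # Prefer the answer-specific edge, fall back to the wildcard edge.
        current = transitions.get(answers.get(current), transitions.get("*"))
    return path


employed_path = question_path({"q1": "yes", "q2": "teacher"})
unemployed_path = question_path({"q1": "no"})
```

Because the path is derived from the stored graph, answers to questions that lie off the path (here `q2` for a non-employed respondent) can be detected and discarded, which is one way the graph could help preserve the consistency of saved data.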
QPOP Text Search Engine

QPOP includes a sophisticated mechanism for automatic encoding of textual responses
Used for the question on highest educational qualification
  – Dictionary with 6,000 distinct items
  – The user enters a list of words describing the full title of their educational qualification
  – The system replies with a list of titles taken from the dictionary, ranked by similarity to the search keywords
  – The user must select one of the proposed items
The dictionary is pre-processed to optimize response times

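The ranking step can be sketched with a simple word-overlap score. The slides do not say which similarity measure QPOP uses, so the Jaccard index below is an assumption chosen for illustration, and the three dictionary entries are hypothetical. The dictionary is tokenized once up front, mirroring the pre-processing mentioned on the slide.

```python
def tokenize(text):
    """Lower-case the text and split it into a set of words."""
    return set(text.lower().split())


# Hypothetical dictionary entries; the real dictionary has about 6,000 items.
DICTIONARY = [
    "Bachelor degree in computer science",
    "Master degree in computer engineering",
    "Diploma in accounting",
]

# Pre-processing: tokenize every title once instead of on every query.
INDEX = [(title, tokenize(title)) for title in DICTIONARY]


def search(query, limit=5):
    """Return up to `limit` titles ranked by Jaccard similarity to the query."""
    query_tokens = tokenize(query)
    scored = []
    for title, tokens in INDEX:
        overlap = len(query_tokens & tokens)
        if overlap:  # skip titles sharing no word with the query
            score = overlap / len(query_tokens | tokens)
            scored.append((score, title))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in scored[:limit]]


proposals = search("computer science degree")
```

The user would then pick one entry from `proposals`, so the stored value is always a dictionary code rather than free text.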
Results

8,250,000 questionnaires were filled in on the web
  – 33% of total
  – Peaks of 300 questionnaires saved per minute
  – Severe service outage on the first day
      Due to inaccurate server tuning by the hosting company
      Recovered the day after
Census-Register comparison completed for 96% of the population (as of March 31st)

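The 33% figure is consistent with the household count quoted in the introduction, assuming roughly one questionnaire per household (an approximation, since questionnaires for cohabitations are counted separately). A quick check, which also converts the peak rate to questionnaires per second:

```python
web_questionnaires = 8_250_000
households = 24_700_000  # from the introduction slide

# Share of questionnaires returned through the web channel.
web_share = web_questionnaires / households

# Peak save rate, converted from per minute to per second.
peak_per_second = 300 / 60

print(f"web share: {web_share:.1%}, peak: {peak_per_second:.0f} saves per second")
```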
Advantages of the Integrated Web Platform

Decentralized management of the network
Real-time monitoring of multi-channel collection
Efficient and effective census-register reconciliation
Both implemented systems will be reused for the forthcoming Industry and Services Census

Thank you