Development of metadata in the National Statistical Institute of Spain
Work Session on Statistical Metadata, Genève, 6-8 May 2013
Ana Isabel Sánchez-Luengo Murcia
Background
- INE has been working on metadata for 10 years, closely following the developments of the METIS and Eurostat working groups
- During these 10 years, prototypes have been developed for repositories of concepts, questions and classifications related to different surveys carried out by the INE
Background
- Before: metadata were not considered a part of the process.
- Now: the process is defined in terms of GSBPM; metadata will be considered by the Regulations; metadata are integrated in the process.
Current project: Metadata Integrated System (SIM)
[Diagram labels: Statistical Operation Database; Process; Reference; Structural; Process; Efficiency and re-use; Governance; Exchange of information]
Reference metadata. Methodological sheets for dissemination on the web (ESMS)
Reference metadata. Methodological sheets for dissemination on the web (ESMS)
Origin of the project: the Board of Directors approved it and enhanced it.
3. Improvement and standardisation of the methodological information on the INE’s website corresponding to the statistical operations carried out.
Reference metadata
[Diagram labels: Board of Directors; Approved by a standard; Part of the process; Standardisation, Coordination, Quality, Dissemination; IT; Implementation January-December]
Some examples of reference metadata at the INE
- Standardised Methodological Report
The structural metadata. The databases
Link to the microdata
Legal marital status
Legal marital status is defined as the (legal) conjugal status of each individual in relation to the marriage laws (or customs) of the country (i.e. de jure status). Source: Core Social Variables
[Screenshot label: Microdata]
Link to the macrodata
The structural metadata. The databases
[Diagram labels: SIM; Statistical Operations; Questionnaires; Questions; Classifications; Variables; Subjects; Microdata; Macrodata]
Searching tools
- What surveys include the variable ‘Nationality’?
- How is it done?
- Is there any standard in INE for collecting this variable?
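The three questions above amount to lookups from a variable to the surveys, question wordings and classifications attached to it. A minimal sketch in Python, assuming a hypothetical in-memory list of questions: the survey names, wordings, classification names and all three helper functions are illustrative placeholders, not the contents or interface of the INE repositories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    survey: str          # hypothetical survey name
    variable: str        # variable the question collects
    wording: str         # hypothetical question text
    classification: str  # hypothetical code list used for the answers


# Illustrative placeholder records only (not real INE metadata).
QUESTIONS = [
    Question("Survey A", "Nationality", "What is your nationality?", "Country list X"),
    Question("Survey B", "Nationality", "Country of nationality", "Country list X"),
    Question("Survey B", "Legal marital status",
             "What is your legal marital status?", "Marital status list"),
]


def _matching(variable, questions):
    """Questions that collect `variable`, compared case-insensitively."""
    wanted = variable.casefold()
    return [q for q in questions if q.variable.casefold() == wanted]


def surveys_including(variable, questions=QUESTIONS):
    """What surveys include the variable? Returns sorted survey names."""
    return sorted({q.survey for q in _matching(variable, questions)})


def wordings_for(variable, questions=QUESTIONS):
    """How is it done? Maps each survey to the wording it uses."""
    return {q.survey: q.wording for q in _matching(variable, questions)}


def is_standardised(variable, questions=QUESTIONS):
    """Is there a standard? True when every survey collecting the
    variable uses one and the same classification."""
    return len({q.classification for q in _matching(variable, questions)}) == 1
```

With these placeholder records, `surveys_including("Nationality")` lists both surveys and `is_standardised("Nationality")` holds because they share one classification; a variable no survey collects yields an empty list and is not reported as standardised.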
The structural metadata: implementation plan
- April 2012: submitted to the High Council on Statistics
- Beginning of 2013: concept repository available on INE’s website
- End of 2013: classification repository
- Beginning of 2014: access to the structural metadata base would be possible
The process metadata
Process metadata
At its meeting of 8 March 2012, the Board of Directors approved a standard promoting the use of the GSBPM as the language for describing the production model of the different statistical operations.
Process metadata
Pilot tests have already been carried out:
- Description of the process for the production of Retail Trade Indices
- Survey on Equipment and Use of Information and Communication Technologies in Households
General remarks: different kinds of tasks have to be distinguished in the production process:
A. Monthly tasks: basically data collection, editing, imputation and dissemination.
B. Yearly tasks:
   B1. Yearly tasks related to the sample rotation.
   B2. Yearly tasks related to other potential changes in the survey, such as data collection improvements.
C. Other sporadic tasks, such as a change of questionnaire.
Process metadata
Reflections:
1. The unit in charge of producing the index described the different tasks in terms of GSBPM. A lack of information on data, people involved, software used, time spent… was noticed.
2. The metadata unit proposed some items (as a minimum) to cover this lack of information. GSIM could solve this problem.
Process metadata: Retail Trade Indices
Process metadata: example
Subprocess: 2.4. Sample frame & Design Methodology
Actions:
Action periodicity (e.g. monthly, yearly, ...):
Starting date:
Final date:
Required input:
Initial file:
Final file:
Software (standard and/or tailor-made):
Documentation, manual, handbook:
Unit in charge:
Collaborating units:
Process metadata: annotations shown on the template, one per slide
- Where in GSBPM is it located?
- Periodicity of the tasks:
  - Monthly and yearly tasks: aimed at the production of information (short-term and structural)
  - Yearly tasks: related to samples, improvement of the software tools, questionnaires, ...
  - Non-periodical tasks: new base year, change of classification, methodological change
- Timetables
- Work flows and their documentation
- Databases produced. Which ones are part of the corporate system?
- Makes it easy to reuse
Process metadata: example
Task: M1
Subprocess: 4.3 Run collection
Actions: At the end of the reference time (day t), questionnaires are sent to the respondent units, which send them back before day t+7
Action periodicity (e.g. monthly, yearly, ...): Monthly
Starting date: t+1
Final date: t+7
Required input: Questionnaires with updated postal addresses pre-printed on them
Initial file: Not applicable (N/A)
Final file: N/A
Software (standard and/or tailor-made): Software tool for collection via web (ARCE), software tool for paper collection
Documentation, manual, handbook: Questionnaires, passwords, data collection handbook, validation rules, labels
Unit in charge: Collection units located in the regional delegations
Collaborating units: Data collection unit and Retail Trade Index unit
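The filled-in template above can be held as a structured record, which also makes it straightforward to check that none of the minimum items has been left blank. A minimal sketch in Python, under stated assumptions: the class name `SubprocessDescription`, its field names and the `missing_items` helper are hypothetical, chosen here to mirror the template, and treating `task` as optional is an assumption rather than something the slides specify.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class SubprocessDescription:
    """One row of the process metadata template (field names are assumptions)."""
    subprocess: str
    actions: str
    periodicity: str
    starting_date: str
    final_date: str
    required_input: str
    initial_file: str
    final_file: str
    software: str
    documentation: str
    unit_in_charge: str
    collaborating_units: str
    task: str = ""  # assumed optional: task code such as "M1"

    def missing_items(self):
        """Names of template items left blank ("N/A" counts as filled)."""
        return [
            f.name
            for f in fields(self)
            if f.name != "task" and not getattr(self, f.name).strip()
        ]


# Values copied from the Run collection example in the slides.
run_collection = SubprocessDescription(
    subprocess="4.3 Run collection",
    actions=("At the end of the reference time (day t) questionnaires are sent "
             "to respondent units, which send them back before day t+7"),
    periodicity="Monthly",
    starting_date="t+1",
    final_date="t+7",
    required_input="Questionnaires with updated postal addresses pre-printed on them",
    initial_file="N/A",
    final_file="N/A",
    software=("Software tool for collection via web (ARCE), "
              "software tool for paper collection"),
    documentation=("Questionnaires, passwords, Data collection handbook, "
                   "Validation rules, labels"),
    unit_in_charge="Collection units located in the regional delegations",
    collaborating_units="Data collection unit and Retail Trade Index unit",
    task="M1",
)

# A copy with the software item blanked, to show what the check reports.
draft = replace(run_collection, software="")
```

Here `run_collection.missing_items()` returns an empty list, while `draft.missing_items()` reports `software`, the kind of gap (software used, people involved, time) that the pilot descriptions initially left open.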
Process metadata
Adjusted series:
- Series published
- Method used
- Software used
- Aggregation
- Reviews
- Quality indicators
Conclusion
MODEL
- Consideration of the whole process
- Relevance and need of linking metadata to data
WORK METHOD
- Access to software tools by means of user name and password
- Collaboration of the different units
- Dissemination inside and outside the INE
SOFTWARE TOOLS
- Improvement of searching tools
- More friendly and flexible
- Aiming at promoting the reuse
Conclusion
Positive aspects:
- It eases the change from vertical to thematic information
- It was good documentation
- Well organised
Future
- Production of SIM
- Implementation of the reference metadata in the institution
- Dissemination of the metadata
- Inclusion of administrative sources
- Increased standardisation and efficiency
Thank you for your attention
Any questions?