United Nations Regional Seminar on Census Data Archiving for Africa, Addis Ababa, Ethiopia, 20-23 September, 2011 Documentation and Cataloguing in Data.


United Nations Regional Seminar on Census Data Archiving for Africa, Addis Ababa, Ethiopia, September 2011
Documentation and Cataloguing in Data Archiving
Session 6

What is documentation?
• Documentation: comprehensive information on the processes and methods used to produce, archive and disseminate micro-data
  o Documentation includes metadata and other information related to the dataset

Role of documentation
• Documentation explains how the data were collected, their content and structure, any manipulation that may have taken place, how to access the data, the terms for their use, etc.
• Documentation is required to understand and interpret the data because it provides context: without proper documentation, data are useless
• The further data get from their source, the greater the importance of the documentation (metadata)
• Documentation also allows documents to be reused for future surveys

When to undertake documentation
• Documentation is an incremental process that should be a shared responsibility among various parts of an institution
• Different types of documentation can be added by different people at various stages of an information object's life cycle
• A common documentation framework should be used by the different actors: the actor who is closest to the information to be used as documentation/metadata adds that information to the framework

Types of material for documentation
• Three broad categories of documentation:
  o Explanatory material
  o Contextual information
  o Cataloguing material

Types of material for documentation
• Explanatory material: required to ensure the long-term viability and functionality of a dataset; without it, full understanding of the dataset and its contents cannot be achieved
  o Data collection methods (data collection process, including instruments used, methods employed, and how these were developed)
  o Structure of the dataset (information about relationships between individual files or records within the study, e.g., the number of cases and variables in each file and the number of files in the dataset)
  o Technical information (computer system used to generate the files; software packages with which the files were created; medium on which the data were stored; and a complete list of all data files present in the dataset)

Types of material for documentation (contd.)
• Explanatory material (contd.)
  o Variables and values, coding and classification schemes (descriptions of all variables (or fields) in the dataset, with explanations of the coding and classifications used, including those for blank and missing fields)
  o Derived variables (how they were derived)
  o Weighting and grossing (procedures should be explained)
  o Data source (sources from which the data are derived, e.g., questions used)
  o Confidentiality and anonymization (whether the data contain any confidential information or anonymization has been implemented, and the implications of both for data usage)
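As an illustration of variable-level documentation, the following is a minimal sketch of a codebook entry that records the coding scheme and the missing-value codes for one variable. The class name `VariableDoc`, the variable `P03_SEX` and all code values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VariableDoc:
    """Documentation for one variable (field) in a dataset."""
    name: str
    label: str
    codes: dict                                   # code value -> category label
    missing: dict = field(default_factory=dict)   # code value -> reason it is missing

    def decode(self, value):
        """Return the documented meaning of a raw value."""
        if value in self.missing:
            return f"missing ({self.missing[value]})"
        # Undocumented codes are flagged rather than silently passed through.
        return self.codes.get(value, f"undocumented code {value!r}")


# Hypothetical variable; the codes below are invented for illustration.
sex = VariableDoc(
    name="P03_SEX",
    label="Sex of household member",
    codes={1: "Male", 2: "Female"},
    missing={9: "not stated"},
)

raw_values = [1, 2, 9, 7]
decoded = [sex.decode(v) for v in raw_values]
print(decoded)
```

Keeping missing-value codes separate from substantive codes is one possible design; it lets a secondary user tell "not stated" apart from a code that was never documented.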

Types of material for documentation (contd.)
• Contextual information: the context in which the data were collected, and how they were put to use
  o Description of the originating project (why the data collection was felt necessary; who or what was being studied; the geographic and temporal coverage)
  o Provenance of the dataset (history of the data collection process, changes and developments that occurred in the data themselves and the methodology, or any adjustments made)
  o Serial and time-series datasets, new editions (e.g., descriptions of changes in question text, variable labelling or sampling procedures for repeated cross-section or time-series datasets)

Types of material for documentation (contd.)
• Catalogue metadata:
  o A sub-set of core data documentation providing standardized structured information explaining the purpose, origin, time reference, geographic location, creator, access conditions and terms of use of data

Metadata standards
• Traditionally, data producers wrote text-based codebooks. To take advantage of web technology, these have been replaced by XML-based codebooks
• Use of metadata standards brings key data documentation together into a single document, creating detailed and structured content about the data. This enhances:
  − Quality of statistical documentation provided to data users
  − Access to the data and semantic interoperability of data sets
• The Data Documentation Initiative (DDI)
• Dublin Core Metadata Standard

Metadata standards (contd.)
• On XML (eXtensible Markup Language)
  − A way of tagging text for meaning instead of appearance (i.e., XML can be used to organize text by tagging it with meaningful information)
  − Unlike text stored in a database, XML text files can be viewed and edited using any standard text editor
  − With appropriate tools, XML files can be searched and queried like a regular database
  − XML documents can be read and transformed by other software applications into user-friendly formats, e.g., spreadsheets, PDF files or web pages
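The points about XML can be sketched with Python's standard library. The fragment below is an invented, simplified codebook (its element names are assumptions, not a real DDI document); it shows text tagged for meaning, a query against it, and a transformation into a spreadsheet-friendly CSV.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented, simplified codebook: tags describe what the text means,
# not how it should look.
XML_TEXT = """
<codebook>
  <variable name="AGE">
    <label>Age in completed years</label>
    <question>How old was this person at their last birthday?</question>
  </variable>
  <variable name="SEX">
    <label>Sex of household member</label>
    <question>Is this person male or female?</question>
  </variable>
</codebook>
""".strip()

root = ET.fromstring(XML_TEXT)

# Query the document much like a database: find the label of one variable.
sex_label = root.find("variable[@name='SEX']/label").text


def to_csv(codebook):
    """Transform the codebook into CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "label"])
    for var in codebook.iter("variable"):
        writer.writerow([var.get("name"), var.findtext("label")])
    return buffer.getvalue()


csv_text = to_csv(root)
print(sex_label)
print(csv_text)
```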

Metadata standards (contd.)
• The Data Documentation Initiative (DDI):
  o Is based around the data lifecycle model and provides specifications for a structured framework for organizing the content, presentation, transfer and preservation of metadata in the social and behavioural sciences
  o Provides comprehensive metadata on the entire survey process and usage
  o Facilitates point-of-origin capture of metadata
  o Includes machine-actionable elements to facilitate processing, discovery and analysis

Metadata standards (contd.)
• The Data Documentation Initiative (DDI):
  o Facilitates reuse of common metadata items because DDI is designed around schemes (lists of items) for commonly reused information within a study, e.g., categories, code schemes, concepts, universe, etc.
    − Items are entered once and used in multiple locations in a DDI document by referencing the item in the list
  o Reuse of items supports:
    − Consistency and accuracy of metadata content, thereby minimizing redundancy and discrepancies
    − Internal and external implicit comparisons
    − External registries of concepts, questions, variables, etc.
    − Metadata-driven processing
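The "enter once, reference many times" idea can be illustrated with plain Python dictionaries. This is only a sketch of the principle, not DDI syntax; the scheme identifier `yes_no` and the variable names are hypothetical.

```python
# One shared list ("scheme") of reusable items, keyed by identifier.
category_schemes = {
    "yes_no": {1: "Yes", 2: "No"},
}

# Variables hold a reference to the scheme, not a copy of it.
variables = {
    "OWNS_RADIO": {"label": "Household owns a radio", "scheme_ref": "yes_no"},
    "OWNS_PHONE": {"label": "Household owns a mobile phone", "scheme_ref": "yes_no"},
}


def categories_for(variable_name):
    """Resolve a variable's scheme reference to the shared category list."""
    ref = variables[variable_name]["scheme_ref"]
    return category_schemes[ref]


# A correction made once in the scheme is visible through every reference,
# which is what keeps the metadata consistent.
category_schemes["yes_no"][9] = "Not stated"

print(categories_for("OWNS_RADIO"))
print(categories_for("OWNS_PHONE"))
```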

Metadata standards (contd.)
• The Data Documentation Initiative (DDI):
  o Information in DDI schemes can be stored in external registries and used by multiple studies to support:
    − Comparisons within and between studies
    − Organizational consistency through use of agreed content managed in registries
  o Designed to support easy interaction with other major standards (Dublin Core, SDMX, ISO/IEC 11179, ISO 19115)
    − Ensures that metadata can be connected to other domains or stages of the lifecycle

Metadata standards (contd.)
• Dublin Core Metadata Standard:
  o A general-purpose metadata standard for describing digital resources related to micro-data
    − Questionnaires
    − Reports
    − Manuals
    − Data processing scripts
    − Programs
    − etc.
  o Makes it easy and inexpensive to create descriptive records for information resources while providing for effective retrieval of these resources on the web or other similar networked environment

Metadata standards (contd.)
• Dublin Core Metadata Standard:
  o Consists of 15 metadata elements:

    Title          Relation       Rights
    Subject        Coverage       Date
    Description    Creator        Format
    Type           Publisher      Identifier
    Source         Contributor    Language
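A minimal sketch of a Dublin Core record for one such resource, built with Python's standard library. The namespace URI is the one used for the Dublin Core element set; the wrapper element `record`, the function name and the sample values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_record(fields):
    """Build an XML record from a dict of Dublin Core element names to values."""
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")       # hypothetical wrapper element
    for name in DC_ELEMENTS:            # keep a stable element order
        if name in fields:
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = fields[name]
    return record


# Hypothetical questionnaire attached to a census dataset.
record = dublin_core_record({
    "title": "Household questionnaire",
    "type": "Text",
    "format": "application/pdf",
    "language": "en",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

All 15 elements are optional here, so a record can start with a handful of fields and be completed later by whoever is closest to the information.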

What is cataloguing?
• Cataloguing: creation of documentation for a dataset providing standardized structured information so that searchers can easily identify and access datasets according to their needs (title of study, source, year of collection, etc.)

Cataloguing
• Sharing survey micro-data with legitimate users offers many benefits, e.g., for the diversity of research work, the acceptability of data, the quality of data, etc. Therefore, users should be informed about the existence and characteristics of datasets
• Cataloguing material serves as:
  − A bibliographic record of the dataset, allowing it to be properly acknowledged and cited in publications
  − A formal record for long-term preservation purposes
  − The basic instrument for resource discovery, allowing datasets to be uniquely identified within the collection by providing appropriate information to help secondary users identify the study as useful to their purpose

Cataloguing (contd.)
• Searchable catalogues facilitate finding datasets and related metadata and increase access to datasets
• Use of XML-based metadata standards facilitates the creation of catalogues, because their structure makes the metadata searchable
• Catalogues hold information on the title of the dataset, data collector(s), dates of data collection, temporal and geographic coverage, methods of data collection, sampling design and frames (if undertaken), and other documentation, as well as variable names, abstracts and keywords
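How structured catalogue records make datasets searchable can be sketched as follows. The records, field names and the `search` function are hypothetical; a production catalogue would normally sit on a database or search engine rather than an in-memory list.

```python
# Hypothetical catalogue records with a few of the fields listed above.
CATALOGUE = [
    {
        "title": "Example Population and Housing Census",
        "year": 2009,
        "abstract": "Complete enumeration of households and persons.",
        "keywords": ["census", "population", "housing"],
    },
    {
        "title": "Example Labour Force Survey",
        "year": 2010,
        "abstract": "Sample survey of employment and unemployment.",
        "keywords": ["labour", "employment"],
    },
    {
        "title": "Example Household Budget Survey",
        "year": 2009,
        "abstract": "Sample survey of household income and expenditure.",
        "keywords": ["income", "expenditure"],
    },
]


def search(catalogue, text=None, year=None):
    """Return titles of records matching a free-text term and/or a collection year."""
    results = []
    for rec in catalogue:
        # Case-insensitive substring match over title, abstract and keywords.
        haystack = " ".join([rec["title"], rec["abstract"], *rec["keywords"]]).lower()
        if text is not None and text.lower() not in haystack:
            continue
        if year is not None and rec["year"] != year:
            continue
        results.append(rec["title"])
    return results


print(search(CATALOGUE, text="household"))
print(search(CATALOGUE, text="survey", year=2009))
```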

Cataloguing (contd.)
• Characteristics of a good survey catalogue, from the user's point of view:
  o Compliant with international metadata standards, particularly XML standards
  o Provides detailed metadata, including at the variable level
  o Provides user-friendly search functionalities (full-text search)
  o Provides clear information on the policy and procedure for accessing the data
  o Provides a list of, and direct access to, reference materials (questionnaires, manuals, reports)
  o Includes a "search by topic" compliant with an international thesaurus

Cataloguing (contd.)
• Characteristics of a good survey catalogue, from the catalogue administrator's point of view:
  o Provides a secure environment for storing and sharing data and metadata
  o Provides a "users' requests" and "user's management" tool to receive and respond to data requests and information queries
  o Provides a solution for sharing public use files and licensed files
  o Generates admin reports on access requests received/processed; most popular surveys/documents; keywords used for searching data; etc.
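The kind of admin report mentioned in the last point can be sketched from a request log. The log layout, survey identifiers and the `admin_report` function are assumptions made for the example, not features of any particular catalogue software.

```python
from collections import Counter

# Hypothetical access-request log kept by the catalogue.
REQUEST_LOG = [
    {"survey": "census-2009", "status": "processed", "keyword": "housing"},
    {"survey": "lfs-2010", "status": "received", "keyword": "employment"},
    {"survey": "census-2009", "status": "processed", "keyword": "population"},
    {"survey": "census-2009", "status": "received", "keyword": "housing"},
]


def admin_report(log, top_n=3):
    """Summarize requests by status, most requested surveys and top search keywords."""
    return {
        "requests_by_status": dict(Counter(r["status"] for r in log)),
        "most_requested_surveys": Counter(r["survey"] for r in log).most_common(top_n),
        "top_keywords": Counter(r["keyword"] for r in log).most_common(top_n),
    }


report = admin_report(REQUEST_LOG)
for section, values in report.items():
    print(section, values)
```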

Thank You!