Introduction to Metadata, the DDI and the Metadata Editor Presentation to the SERPent project team by Margaret Ward 3 March 2010.


Overview Good practice in data documentation The DDI The Metadata Editor

A ‘good’ dataset “From the archivist’s and the end user’s perspective a ‘good’ dataset is one that is easy to use. Its documentation is clear and easy to understand, the data contain no surprises, and users are able to access the dataset with relatively little start-up time.” Extracted from the ‘Guide to Social Science Data Preparation and Archiving’ (ICPSR)

Why document data? The data documentation, or metadata, helps the researcher to: find the data they are interested in; understand how the data have been created; assess the quality of the data (e.g. standards used). It also: enables users to understand and interpret the data; ensures informed and correct use of the data; reduces the chance of incorrect use or misinterpretation.

5 What should be provided? Explanatory material – information essential to the informed use of the dataset Contextual information – material about the context in which the data were collected and information about the uses to which the data were put Cataloguing information – used to create a formal catalogue record or study description for the study

6 Explanatory information Information about the data collection process and methods, e.g. instruments used, methods used and how developed, sampling design Information about the structure of the dataset, e.g. files, cases, relationships between files or records within a study Technical information, e.g. computer system used, software packages used to create files Variables and values, coding and classification schemes, e.g. full details of the variables and coding frames used Information about derived variables, e.g. full details on how these were created Cont…

7 Explanatory information (cont.) Weighting and grossing; Data source, e.g. details about the source the data were derived from; Confidentiality and anonymisation, e.g. do the data contain confidential information on individuals?; Validation and other checks

8 Contextual information Description of the originating project, e.g. the aims and objectives of the project, who or what were being studied, geographical and temporal coverage etc. Provenance of the dataset, e.g. the history of the data collection process, details of data errors, bibliographic references to reports or publications based on the study Serial and time-series datasets - useful to have details of changes in question text, variable labels etc. over time

9 Using Data Documentation

10 Using data documentation Example: the UK Data Archive uses data documentation to create: catalogue records for datasets; user guides for datasets; data listings; Nesstar datasets

11 UK Data Archive catalogue records Information taken from: study documentation; series information; data deposit forms - fields include title, principal investigator, sponsors, data collectors, dates of data collection, temporal and geographic coverage

12 Creating Survey Catalogue records

13 Survey catalogue records Used for retrieval purposes: use of controlled vocabularies provides a means for consistent retrieval; information can also be searched using a free-text search. Catalogue records should provide users with enough information to decide whether the data are suitable for their needs. Used for administrative purposes, e.g. provides information on the provenance of a dataset

14 Catalogue records contain… A description of the data – abstract, geographical and temporal coverage, population, variable labels and values A list of subject keywords Bibliographic information – principal investigator, sponsor Information on how the data were collected – methodology How to reference the data – citation Who owns the data – copyright Who can use the data – access conditions Where to get the data – distributor cont….

15 Catalogue records also contain.. Information on how to use the data, e.g. weighting details Lists of publications by the principal investigators and resulting from secondary analysis Links to related datasets, publications, related web sites, documentation When the data are available – new editions, frequency of release

16 Catalogue records The catalogue record should adhere to standards and rules to: Ensure consistency, accuracy, continuity Allow for consistent retrieval Enable interoperability between systems

17 Example: UK Data Archive Controlled vocabularies (dynamic) Names authority lists: AACR2 (Anglo-American Cataloguing Rules, Second Edition, 1978); NCA (National Council on Archives) Rules for Construction of Personal, Place and Corporate Names (1997) Subject keywords – HASSET (Humanities and Social Sciences Electronic Thesaurus) (British Standard Guide to Establishment and Development of Monolingual Thesauri – BS 5723, ISO 2788)

18 HASSET thesaurus HASSET thesaurus contains approximately: 4,500 subject terms 3,270 synonyms 28,00 relationships (BT,NT,TT,RT) (Broader, Narrower, Top, Related Terms) 2,730 geographic terms
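
The relationship types listed on this slide (BT, NT, TT, RT, plus synonyms) can be pictured with a small data structure. This is only a sketch: the three terms below are invented for illustration and are not taken from HASSET, and the `Term`, `preferred`, `top_term` and `expand` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    label: str
    broader: List[str] = field(default_factory=list)   # BT
    narrower: List[str] = field(default_factory=list)  # NT
    related: List[str] = field(default_factory=list)   # RT
    synonyms: List[str] = field(default_factory=list)  # non-preferred entry terms


# Illustrative terms only; not real HASSET content.
THESAURUS = {
    "EMPLOYMENT": Term(
        "EMPLOYMENT",
        narrower=["PART-TIME EMPLOYMENT"],
        related=["UNEMPLOYMENT"],
        synonyms=["JOBS", "WORK"],
    ),
    "PART-TIME EMPLOYMENT": Term("PART-TIME EMPLOYMENT", broader=["EMPLOYMENT"]),
    "UNEMPLOYMENT": Term("UNEMPLOYMENT", related=["EMPLOYMENT"]),
}


def preferred(word: str) -> Optional[str]:
    """Map a search word (preferred term or synonym) to its preferred term."""
    wanted = word.upper()
    for term in THESAURUS.values():
        if wanted == term.label or wanted in term.synonyms:
            return term.label
    return None


def top_term(label: str) -> str:
    """Follow BT links upwards until a term with no broader term is reached (TT)."""
    while THESAURUS[label].broader:
        label = THESAURUS[label].broader[0]
    return label


def expand(word: str) -> List[str]:
    """Return the preferred term plus every narrower term beneath it."""
    start = preferred(word)
    if start is None:
        return []
    found, queue = [], [start]
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(THESAURUS[current].narrower)
    return found


print(expand("jobs"))
```

Searching on the synonym "jobs" is resolved to the preferred term and then widened to its narrower terms, which is the kind of consistent retrieval a controlled vocabulary is meant to support.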

19 HASSET terms

20 Controlled vocabularies (fixed) Subject categories – UK Data Archive - in-house schema Elements describing the methodology e.g. method of data collection, sampling, etc

21 International considerations Standardisation at an international level: Controlled vocabularies for methodology fields – work in progress within the DDI group and CESSDA Subject categories – UKDA scheme is mapped to the CESSDA Top Classification Thesaurus – ELSST (European Language Social Science Thesaurus) (3,209 terms)

22 What can we use to organise all the information we have? DDI and the Metadata Editor

23 The DDI

24 Introduction to the DDI Development of the Data Documentation Initiative (DDI) was initially supported by ICPSR and then by a grant from the National Science Foundation (NSF). An international committee was set up, which produced a Document Type Definition (DTD) for the ‘mark-up’ of what were originally known as ‘social science codebooks’. This DTD employs the eXtensible Mark-up Language (XML) and is used within the Nesstar system and Metadata Editor.

25 The DDI (versions 1 & 2) There are five main sections of the DDI which are: 1. Document Description: containing items describing the marked-up document itself as well as its source documents 2. Study Description: contains items describing the overall data collection (e.g. title, citation, methodology, study scope, data access etc.) 3. Data Files Description: contains items relating to the format, size and structure of the data files

26 DDI 4. Variables description: contains items relating to variables in the data collection 5. Other Study-Related Materials: contains other study-related material not included in other sections (e.g. bibliography, separate questionnaire files, etc) Further information can be found at:
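
The five sections can be sketched as an empty XML skeleton. The element names used here (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, `otherMat`) are the ones commonly used in DDI versions 1 and 2, but they are not given on the slides, so treat them as an assumption to check against the DDI specification.

```python
import xml.etree.ElementTree as ET

# (assumed element name, section title from the slides)
SECTIONS = [
    ("docDscr", "Document Description"),
    ("stdyDscr", "Study Description"),
    ("fileDscr", "Data Files Description"),
    ("dataDscr", "Variables Description"),
    ("otherMat", "Other Study-Related Materials"),
]


def empty_codebook() -> ET.Element:
    """Build a codebook root with one empty child per main section."""
    root = ET.Element("codeBook")
    for tag, _title in SECTIONS:
        ET.SubElement(root, tag)
    return root


codebook = empty_codebook()
print(ET.tostring(codebook, encoding="unicode"))
```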

27 DDI XML Example – StdyDscr Demo: Demonstration dataset demo Ward, M. Eastaugh, K.
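
Only the values of this example are shown above (a title, an identifier and two authors). As a sketch of how they might be marked up, the code below places them in a study description. The nesting and the element names (`citation`, `titlStmt`, `titl`, `IDNo`, `rspStmt`, `AuthEnty`) are assumptions based on DDI 2 conventions, not a reproduction of the original slide.

```python
import xml.etree.ElementTree as ET
from typing import List


def study_description(title: str, study_id: str, authors: List[str]) -> ET.Element:
    """Build a minimal study description holding a citation."""
    stdy = ET.Element("stdyDscr")
    citation = ET.SubElement(stdy, "citation")

    # Title statement: the study title and its identifier.
    titl_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = title
    ET.SubElement(titl_stmt, "IDNo").text = study_id

    # Responsibility statement: one entry per author.
    rsp_stmt = ET.SubElement(citation, "rspStmt")
    for author in authors:
        ET.SubElement(rsp_stmt, "AuthEnty").text = author
    return stdy


stdy = study_description(
    "Demo: Demonstration dataset", "demo", ["Ward, M.", "Eastaugh, K."]
)
print(ET.tostring(stdy, encoding="unicode"))
```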

28 DDI XML Example – variable Gender Sex of respondent? Record respondent’s sex
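
The three values in this example could be attached to a variable as follows. This is a guess at the mapping: "Gender" is assumed to be the variable name, "Sex of respondent?" the literal question text, and "Record respondent’s sex" the interviewer instruction. The element names (`var`, `qstn`, `qstnLit`, `ivuInstr`) follow DDI 2 conventions and should be checked against the specification.

```python
import xml.etree.ElementTree as ET


def variable(name: str, question: str, instruction: str) -> ET.Element:
    """Build a variable entry carrying its question text and interviewer instruction."""
    var = ET.Element("var", name=name)
    qstn = ET.SubElement(var, "qstn")
    ET.SubElement(qstn, "qstnLit").text = question      # literal question wording
    ET.SubElement(qstn, "ivuInstr").text = instruction  # interviewer instruction
    return var


gender = variable("Gender", "Sex of respondent?", "Record respondent’s sex")
print(ET.tostring(gender, encoding="unicode"))
```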

29 DDI users Australian Social Science Data Archive; Canadian Research Data Centres (CRDCs); CESSDA Data Portal; The Dataverse Network; European Social Survey (ESS); Gallup Europe; ICPSR data catalogue; MIDUS II – Midlife in the US: A national study of health and well-being; The Tromsø Study – to determine the reasons for the high mortality rate in Norway; International Household Survey Network; Nesstar. Links available from:

30 The Metadata Editor

31 Metadata Editor Standards DDI – “Enables the effective, efficient and accurate use” of data resources Dublin Core (fifteen elements) – “A standard for cross-domain information resource description”

32 Metadata Editor templates Metadata added by using templates Use templates to create individual sets of DDI fields Can add controlled vocabulary lists and default text Can rename template fields, i.e. use familiar terms.

33 Advantages of using templates Create to suit individual needs of an organisation or a data series Use of standard templates ensures consistent use of metadata fields Can add helpful information about each field to assist the data publisher
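
The idea of a template (a chosen subset of DDI fields, each with a familiar label, optional default text, help text and a controlled vocabulary list) can be sketched in a few lines. This is not the Metadata Editor's own template format: the `TemplateField` class, the `apply_template` function, the default value and the vocabulary entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TemplateField:
    ddi_element: str                              # underlying DDI field
    label: str                                    # familiar name shown to the publisher
    help_text: str = ""                           # guidance for the data publisher
    default: Optional[str] = None                 # default text
    vocabulary: Optional[Tuple[str, ...]] = None  # controlled vocabulary list


# Hypothetical template for one organisation or data series.
TEMPLATE = [
    TemplateField("titl", "Study title", help_text="Full title of the study"),
    TemplateField("distrbtr", "Distributor", default="UK Data Archive"),
    TemplateField(
        "collMode",
        "Method of data collection",
        vocabulary=("Face-to-face interview", "Postal survey", "Telephone interview"),
    ),
]


def apply_template(
    template: List[TemplateField], entered: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Fill defaults, check controlled lists, and key the result by DDI element."""
    record, problems = {}, []
    for fld in template:
        value = entered.get(fld.label, fld.default)
        if value is None:
            problems.append(f"{fld.label}: no value entered")
        elif fld.vocabulary and value not in fld.vocabulary:
            problems.append(f"{fld.label}: '{value}' is not in the controlled list")
        else:
            record[fld.ddi_element] = value
    return record, problems


record, problems = apply_template(
    TEMPLATE, {"Study title": "Demo", "Method of data collection": "Web"}
)
print(record)
print(problems)
```

Because every record passes through the same template, the same fields are filled in the same way each time, which is the consistency benefit described above.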

34 Import/Export Metadata Metadata can be imported and exported using the Metadata Editor – ‘Documentation’ Menu Options: Import from Study: import the metadata from an existing ‘Nesstar’ file selecting the fields to import. Import from DDI: import from an existing XML file Export DDI: Export metadata to a new XML file
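
Outside the Metadata Editor, importing from and exporting to a DDI XML file amounts to reading and writing a few known elements. The sketch below assumes a simplified DDI 2 style document with no XML namespace (real files often declare one) and uses hypothetical `import_ddi` and `export_ddi` helpers; it is not how the Metadata Editor itself works.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Simplified, namespace-free example document (an assumption for illustration).
DDI_XML = """<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Demo: Demonstration dataset</titl>
        <IDNo>demo</IDNo>
      </titlStmt>
    </citation>
  </stdyDscr>
  <dataDscr>
    <var name="Gender"><labl>Sex of respondent</labl></var>
  </dataDscr>
</codeBook>"""


def import_ddi(xml_text: str) -> Dict[str, Any]:
    """Read selected study-level and variable-level fields from DDI XML."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("stdyDscr/citation/titlStmt/titl"),
        "id": root.findtext("stdyDscr/citation/titlStmt/IDNo"),
        "variables": {v.get("name"): v.findtext("labl") for v in root.iter("var")},
    }


def export_ddi(metadata: Dict[str, Any]) -> str:
    """Write the same fields back out as a new XML string."""
    root = ET.Element("codeBook")
    citation = ET.SubElement(ET.SubElement(root, "stdyDscr"), "citation")
    titl_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = metadata["title"]
    ET.SubElement(titl_stmt, "IDNo").text = metadata["id"]
    data_dscr = ET.SubElement(root, "dataDscr")
    for name, label in metadata["variables"].items():
        var = ET.SubElement(data_dscr, "var", name=name)
        ET.SubElement(var, "labl").text = label
    return ET.tostring(root, encoding="unicode")


meta = import_ddi(DDI_XML)
print(meta["title"], meta["variables"])
```

Importing the exported text again returns the same dictionary, which is a quick way to confirm that nothing is lost in a round trip for the fields handled here.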

35 Import/Export data Various formats available for both import and export including: SPSS portable, sav STATA Delimited text, e.g. csv, tab Nesstar/NSDstat

36 Study level metadata Information about the study Basic information needed, e.g. Title, unique ID, Abstract Other information could include: Primary investigator, Distributor, Version, copyright details Consider use of: Keywords, Topic classification Related information – related studies, related publications etc. Other Materials – links to useful resources

37 Variable level metadata Variable labels can easily be added/edited Category labels can easily be added/edited Identify ‘Weight’ variables Add question text and variable notes: – to each variable separately – to a block of variables Variable notes, e.g. how the variable was derived etc.

38 Data manipulation View the data as a matrix allowing direct data entry or editing Cut and paste data Add, insert and copy variables of different types, e.g. numeric, Fixed string, Dynamic string, Date Insert/replace data – insert data matrix from dataset, or fixed format text Delete variables Sort/Delete cases Conversion between variable types

39 Variable groups Used to organise data into specific categories, e.g. variables that relate to the same topic or theme A hierarchy of groups can be created, e.g. topics within a ‘Self-completion’ section Variables can belong to more than one group Groups are ‘virtual’ – variables are not moved within the file Groups can be arranged in any order Information about that group can be added, e.g. a group definition Advantages: Makes it easier for end-users to navigate the dataset Reduces the load time of a dataset when published
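
The properties listed above (groups are virtual, can be nested, and a variable can sit in several groups) can be illustrated with a small structure that stores variable names only. The `VariableGroup` class, the `groups_containing` helper, and the group and variable names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VariableGroup:
    name: str
    definition: str = ""
    # Names only: the variables themselves stay where they are in the data file.
    variables: List[str] = field(default_factory=list)
    subgroups: List["VariableGroup"] = field(default_factory=list)

    def all_variables(self) -> List[str]:
        """Variables in this group and all groups beneath it, without duplicates."""
        names = list(self.variables)
        for group in self.subgroups:
            for name in group.all_variables():
                if name not in names:
                    names.append(name)
        return names


def groups_containing(
    group: VariableGroup, variable: str, path: Tuple[str, ...] = ()
) -> List[str]:
    """List the full path of every group that holds the given variable."""
    here = path + (group.name,)
    found = [" / ".join(here)] if variable in group.variables else []
    for sub in group.subgroups:
        found.extend(groups_containing(sub, variable, here))
    return found


# Hypothetical hierarchy: two topics within a 'Self-completion' section.
self_completion = VariableGroup(
    "Self-completion",
    definition="Questions answered by the respondent without an interviewer",
    subgroups=[
        VariableGroup("Health", variables=["smoker", "gh_score"]),
        VariableGroup("Wellbeing", variables=["gh_score", "life_sat"]),
    ],
)

print(self_completion.all_variables())
print(groups_containing(self_completion, "gh_score"))
```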

40 Support for relational datasets Related, hierarchical, datasets are supported Use the ‘Key Variables & Relations’ section within a dataset to describe the relationship between files Add the related dataset names Add the key variables – used to link the files
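
To show what a key variable does, the sketch below links a person-level file to a household-level file through a shared identifier. The file contents, the `hhid` key and the `link` function are invented for illustration and do not reflect how the ‘Key Variables & Relations’ section stores its information.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical hierarchical dataset: households contain persons.
households: List[Row] = [
    {"hhid": 1, "region": "North"},
    {"hhid": 2, "region": "South"},
]
persons: List[Row] = [
    {"hhid": 1, "person": 1, "age": 34},
    {"hhid": 1, "person": 2, "age": 36},
    {"hhid": 2, "person": 1, "age": 51},
]

# Description of the relationship: related dataset names plus the key variables.
RELATION = {"parent": "households", "child": "persons", "key": ["hhid"]}


def link(parent_rows: List[Row], child_rows: List[Row], key: List[str]) -> List[Row]:
    """Attach the matching parent record to each child record via the key variables."""
    index = {tuple(row[k] for k in key): row for row in parent_rows}
    linked = []
    for child in child_rows:
        parent = index.get(tuple(child[k] for k in key))
        if parent is not None:
            linked.append({**parent, **child})
    return linked


linked = link(households, persons, RELATION["key"])
for row in linked:
    print(row)
```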

41 External resources External resources include PDF files, ‘Word’ files, or the URL of an associated resource Within the Metadata Editor they can be described and published as ‘external’ resources Uses Dublin Core fields for metadata Enables these ‘external’ resources to be viewed alongside survey data
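
A description of an external resource using the fifteen Dublin Core elements might look like the sketch below. The element names are those of the Dublin Core Metadata Element Set; the `describe_resource` helper, the example resource and its URL are hypothetical.

```python
from typing import Dict

# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def describe_resource(**fields: str) -> Dict[str, str]:
    """Return a record restricted to Dublin Core elements, in their standard order."""
    unknown = sorted(set(fields) - set(DUBLIN_CORE_ELEMENTS))
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {', '.join(unknown)}")
    return {name: fields[name] for name in DUBLIN_CORE_ELEMENTS if name in fields}


# Hypothetical external resource: a questionnaire published alongside the data.
questionnaire = describe_resource(
    identifier="https://example.org/demo/questionnaire.pdf",
    format="application/pdf",
    title="Demo questionnaire",
    type="Text",
)
print(questionnaire)
```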

42 Using the Metadata Editor Creating a survey catalogue record: Import data file Add study level metadata Add variable level metadata Check data/labels Create variable groups Save file

43 Review Good metadata enables easy discovery of data Good data documentation leads to informed re-use of data Provide meaningful information (titles, descriptions, abstract, keywords) in catalogue record

44 Metadata Editor Demonstration Importing data Adding study metadata Adding variable metadata Creating variable groups Using the template editor – metadata fields

45 Further information (Follow link to Microdata Management toolkit – Tools and guidelines) - DDI - UK Data Archive