

• Name and organization
• Have you worked with DDI before? (2 or 3)
• If not, are you familiar with XML?
• What kind of CAI systems do you use?
• Goals for today

Introduction
– DDI 3 Background
– XML Background
– How DDI 3 documents survey instruments
Creating DDI3 and Documentation
– Manual Markup
– Using functionality from CAI systems
– Custom development
– Colectica
Discussion
– Questions and discussion
– Additional documentation activities

Data Documentation Initiative DDI3 Background

Background
• Concept of DDI and definition of needs grew out of the data archival community
• Established in 1995 as a grant-funded project initiated and organized by ICPSR
• Members:
  – Social Science Data Archives (US, Canada, Europe)
  – Statistical data producers (including US Bureau of the Census, the US Bureau of Labor Statistics, Statistics Canada and Health Canada)
• February 2003: Formation of DDI Alliance
  – Membership-based alliance
  – Formalized development procedures
Copyright © 2008 GESIS

Origins of the DDI Alliance
• Versions 1.* and 2.* were developed by an informal network of individuals from the social science community and official statistics
  – Funding was through grants
• It was decided that a more formal organization would help to drive the development of the standard forward
  – Many new features were requested
  – The DDI Alliance was born to facilitate the development in a consistent and ongoing fashion

Requirements for 3.0
• Improve and expand the machine-actionable aspects of the DDI to support programming and software systems
• Support CAI instruments through expanded description of the questionnaire (content and question flow)
• Support the description of data series (longitudinal surveys, panel studies, recurring waves, etc.)
• Support comparison, in particular comparison by design but also comparison after the fact (harmonization)
• Improve support for describing complex data files (record and file linkages)
• Provide improved support for geographic content to facilitate linking to geographic files (shape files, boundary files, etc.)

DDI 3.0 and the Data Life Cycle
• A survey is not a static process: it evolves over time and involves many agencies and individuals
• DDI 2.x is about archiving; DDI 3.0 covers the entire "life cycle"
• 3.0 focuses on metadata reuse (minimizes redundancies and discrepancies, supports comparison)
• Also supports multilingual content, grouping, geography, and more
• 3.0 is extensible

Development of DDI
• Acceptance of a new DDI paradigm
  – Lifecycle model
  – Shift from the codebook-centric / variable-centric model to capturing the lifecycle of data
  – Agreement on expanded areas of coverage
• 2005
  – Presentation of schema structure
  – Focus on points of metadata creation and reuse
• 2006
  – Presentation of first complete 3.0 model
  – Internal and public review
• 2007
  – Vote to move to Candidate Version
  – Establishment of a set of use cases to test application and implementation
• 2008
  – April: DDI 3.0 published

• XML: Extensible Markup Language
• Designed to transport and store data
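As a minimal illustration of XML storing and transporting data, the sketch below parses a small document with Python's standard library. The `survey` and `question` element names are made up for this example and are not DDI elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical survey fragment: the element and attribute names are
# illustrative only and do not come from the DDI schemas.
doc = """<survey id="demo">
  <question id="Q1">How old are you?</question>
  <question id="Q2">Are you employed?</question>
</survey>"""

root = ET.fromstring(doc)

# Collect each question's id attribute and its text content.
questions = {q.get("id"): q.text for q in root.iter("question")}

print(root.tag, questions)
```

The same tags that make the document readable to a person also let a program pull out exactly the pieces it needs.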

XML Schemas, DDI Modules, and DDI Schemes
Modules shown in the diagram: Instance, Study Unit, Physical Instance, DDI Profile, Comparative, Data Collection, Logical Product, Physical Data Structure, Archive, Conceptual Component, Reusable, Ncube, Inline ncube, Tabular ncube, Proprietary, Dataset


XML Schemas, DDI Modules, and DDI Schemes
Modules and the schemes they contain:
• Instance
• Study Unit
• Physical Instance
• DDI Profile
• Comparative
• Data Collection
  – Question Scheme
  – Control Construct Scheme
  – Interviewer Instruction Scheme
• Logical Product
  – Category Scheme
  – Code Scheme
  – Variable Scheme
  – NCube Scheme
• Physical Data Structure
  – Physical Structure Scheme
  – Record Layout Scheme
• Archive
  – Organization Scheme
• Conceptual Component
  – Concept Scheme
  – Universe Scheme
  – Geographic Structure Scheme
  – Geographic Location Scheme
• Reusable
• Ncube
• Inline ncube
• Tabular ncube
• Proprietary
• Dataset

Maintainable Schemes
Packages of reusable metadata maintained by a single agency:
• Category Scheme
• Code Scheme
• Concept Scheme
• Control Construct Scheme
• Geographic Structure Scheme
• Geographic Location Scheme
• Interviewer Instruction Scheme
• Question Scheme
• NCube Scheme
• Organization Scheme
• Physical Structure Scheme
• Record Layout Scheme
• Universe Scheme
• Variable Scheme

Designed to Support Registries
A "Registry" is a catalog of metadata resources
• Resource package
  – Structure to publish non-study-specific materials for reuse
• Extracting specified types of information into schemes
  – Universe, Concept, Category, Code, Question, Instrument, Variable, etc.
• Allowing for either internal or external references
  – Can include other schemes by reference and select only desired items
• Providing Comparison Mapping
  – Target can be external harmonized structure

Data Collection
• Methodology
• Question Scheme
  – Question
  – Response domain
• Instrument
  – using Control Construct Scheme
• Coding Instructions
  – question to raw data
  – raw data to public file
• Interviewer Instructions
Notes:
• Question and Response Domain designed to support question banks
  – Question Scheme is a maintainable object
• Organization and flow of questions into Instrument
  – Used to drive systems like CASES and Blaise
• Coding Instructions
  – Reuse by Questions, Variables, and comparison

QuestionItem in DDI

QuestionItem

• Opening tag & identification
• QuestionText
• NumericDomain
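The three parts named above can be sketched in Python as follows. This is a simplified, namespace-free approximation rather than schema-valid DDI 3: real documents use XML namespaces and fuller identification, and the `id`, `low`, and `high` attributes here are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def make_question_item(item_id, text, low, high):
    """Build a simplified QuestionItem with a numeric response domain."""
    # Opening tag and identification (the 'id' attribute is a simplification).
    item = ET.Element("QuestionItem", {"id": item_id})
    # The wording of the question.
    ET.SubElement(item, "QuestionText").text = text
    # The response domain; 'low' and 'high' are hypothetical attributes
    # standing in for however the valid range is actually expressed.
    domain = ET.SubElement(item, "NumericDomain")
    domain.set("low", str(low))
    domain.set("high", str(high))
    return item


age = make_question_item("Q_AGE", "How old are you?", 0, 120)
print(ET.tostring(age, encoding="unicode"))
```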

In a QuestionScheme

ControlConstructScheme with QuestionConstructs

An Instrument

Those all go in a DataCollection element

The DataCollection element goes in a StudyUnit, which goes in a DDIInstance or ResourcePackage
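A small sketch of that containment, using the element names from the slides but leaving out namespaces, identifiers, and all other required content (so it is not schema-valid DDI):

```python
import xml.etree.ElementTree as ET

# Outermost wrapper, then the study, then the data collection module.
instance = ET.Element("DDIInstance")
study = ET.SubElement(instance, "StudyUnit")
collection = ET.SubElement(study, "DataCollection")

# The three pieces built so far all live inside DataCollection.
for name in ("QuestionScheme", "ControlConstructScheme", "Instrument"):
    ET.SubElement(collection, name)


def outline(elem, depth=0):
    """Return the element tree as a list of indented tag names."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(instance)))
```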

• Create QuestionScheme and QuestionItems

• Create ControlConstructScheme
• Add QuestionReferences

• Add control flow items to ControlConstructScheme
• Include a main Sequence element

• Create the Instrument element
• Add the main ControlConstructReference

• Create the DDIInstance element
• Create the StudyUnit element
• Create the DataCollection element
• Add the QuestionScheme, ControlConstructScheme, and Instrument to the DataCollection element
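The manual markup steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not schema-valid DDI 3: namespaces are omitted, and the `id` and `ref` attributes are a hypothetical shorthand for DDI's actual identification and reference structures. The helper names `build_instrument` and `dangling_references` are invented for this example.

```python
import xml.etree.ElementTree as ET


def build_instrument(questions):
    """Build a simplified DDIInstance from a list of (id, text) pairs."""
    # Step 1: QuestionScheme and QuestionItems.
    q_scheme = ET.Element("QuestionScheme", {"id": "QS1"})
    for qid, text in questions:
        item = ET.SubElement(q_scheme, "QuestionItem", {"id": qid})
        ET.SubElement(item, "QuestionText").text = text

    # Step 2: ControlConstructScheme with a QuestionReference per question.
    cc_scheme = ET.Element("ControlConstructScheme", {"id": "CCS1"})
    for qid, _ in questions:
        qc = ET.SubElement(cc_scheme, "QuestionConstruct", {"id": "QC_" + qid})
        ET.SubElement(qc, "QuestionReference", {"ref": qid})

    # Step 3: a main Sequence listing the constructs in order.
    seq = ET.SubElement(cc_scheme, "Sequence", {"id": "SEQ_MAIN"})
    for qid, _ in questions:
        ET.SubElement(seq, "ControlConstructReference", {"ref": "QC_" + qid})

    # Step 4: the Instrument points at the main Sequence.
    instrument = ET.Element("Instrument", {"id": "INSTR1"})
    ET.SubElement(instrument, "ControlConstructReference", {"ref": "SEQ_MAIN"})

    # Step 5: wrap everything in DDIInstance > StudyUnit > DataCollection.
    instance = ET.Element("DDIInstance")
    study = ET.SubElement(instance, "StudyUnit")
    collection = ET.SubElement(study, "DataCollection")
    collection.extend([q_scheme, cc_scheme, instrument])
    return instance


def dangling_references(root):
    """List every 'ref' value that matches no 'id' in the document."""
    ids = {e.get("id") for e in root.iter() if e.get("id")}
    return [e.get("ref") for e in root.iter()
            if e.get("ref") and e.get("ref") not in ids]


demo = build_instrument([("Q1", "How old are you?"), ("Q2", "Are you employed?")])
print(dangling_references(demo))
```

The reference check is only a quick sanity test of the hand-built links; it does not replace validating against the DDI schemas.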

• Check the XML document against the DDI schemas to see if we got it right.

• We have DDI, now we need documentation

• Custom Development
• MQDS
• Colectica

Michigan Questionnaire Documentation System (MQDS)
Sue Ellen Hansen
Nicole Kirgis

What Does MQDS Do?
• Facilitates automated documentation and harmonization of Blaise survey instruments and datasets
  – Extracts survey question metadata
  – Standardized format

Survey Question Metadata
• Question universe
• Variable name and label
• Question text
• Question variable text (fills)
• Data type
• Code values and code text
• Skip instructions
• etc.

MQDS Version 1
• Extracted metadata from Blaise data model as XML-tagged data
• Provided user interface for selection of
  – Blaise files
  – Instrument questions and sections
  – Types of metadata to extract
  – Languages to display
  – Style sheet for generation of instrument documentation or codebook

Using MQDS V1 XML: Codebook in Five Languages
National Latino and Asian American Study

MQDS Version 1 Limitations
– XML not DDI-compliant
  • DDI Version 2 did not have XML tags for all metadata provided by Blaise
  • Did not provide easy means of adding XML tags without becoming noncompliant
– XML files for complex surveys can be very large (text files)
  • Entire files had to be processed in computer memory
  • Limited ability to fully automate documentation

DDI Version 3
• Released April 2008
• Focus on complete data lifecycle
  – going beyond the codebook

DDI Version 3
Included extensions proposed by DDI working group on instrument design

Persistent Content of Question
• Question text
  – Static
  – Dynamic or variable
• Multiple-part question
• Response domain
  – Open
  – Set categories
  – Special types (date, time, etc.)
• Definitional text

Use of Question in Instrument
• Order and routing
  – Sequence / skip patterns
  – Loops
• Universe
• Analysis unit
• Instructions

MQDS Version 3
• Joint SRC and ICPSR venture
• Goals:
  – Address version 2 limitations
    • Process Blaise instrument of any size
  – Exploit new elements and validate to the recently released DDI version 3 standard
  – Move from processing XML metadata in memory to streaming metadata to a relational database
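To illustrate the difference between loading a whole XML file into memory and streaming it into a relational database, here is a minimal sketch. It is not the MQDS implementation: it uses Python's built-in SQLite as a stand-in for the database, a made-up one-table layout, simplified namespace-free element names, and a hypothetical `stream_questions` helper.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def stream_questions(xml_source, conn):
    """Insert QuestionItem elements into a table as they are parsed."""
    # Hypothetical table layout, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS question (id TEXT PRIMARY KEY, text TEXT)"
    )
    count = 0
    # iterparse yields each element when its closing tag is read, so the
    # full document never has to be held as one complete tree.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "QuestionItem":
            conn.execute(
                "INSERT INTO question VALUES (?, ?)",
                (elem.get("id"), elem.findtext("QuestionText")),
            )
            # Drop the element's children and text once they are stored.
            elem.clear()
            count += 1
    conn.commit()
    return count


sample = io.BytesIO(
    b"<QuestionScheme>"
    b"<QuestionItem id='Q1'><QuestionText>How old are you?</QuestionText></QuestionItem>"
    b"<QuestionItem id='Q2'><QuestionText>Are you employed?</QuestionText></QuestionItem>"
    b"</QuestionScheme>"
)
conn = sqlite3.connect(":memory:")
n = stream_questions(sample, conn)
rows = conn.execute("SELECT id, text FROM question ORDER BY id").fetchall()
print(n, rows)
```

Once the metadata is in tables, later steps can query only the rows they need instead of re-reading the full XML file.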

MQDS Version 3 Relational Database: Import, Export, Transform
1. Import
  – User specifies input files (location, file type, etc.)
  – Inputs: Blaise Datamodel (BMI), Blaise Database (BDB), other file types (e.g. SAS, SPSS, etc.), DDI 3 elements not in *.bmi
  – Target: relational database (SQL Server / SQL Server Express), with database connection settings
2. Export
  – XML (DDI 3)
  – User specifies output files (location, language/locale, XML output options, etc.)
3. Transform
  – Outputs: Codebook, Questionnaire
  – User specifies stylesheet selection criteria, type of output desired (html, rtf, pdf), etc.

MQDS Version 3
• Relational database
  – DDI-compliant standardized tables
  – Flexibility for SRC and ICPSR to add extensions that meet their specific organizational needs
  – Allows
    • Automated documentation of any Blaise survey instrument
    • Importing and documenting data produced by other software
    • Lower-cost development of other tools that facilitate editing and disseminating data

MQDS V3 Prototype: Exporting Language XML

MQDS Development
• Expect to release Summer 2009
• Working out a distribution plan for Blaise users