Evolution of Data Documentation Providing Social Science Data Services Jim Jacobs, 2008.

Presentation transcript:

Evolution of Data Documentation

In the beginning… …was the codebook.

…early digital codebooks… Codebook listed to tape

…early digital codebooks… OSIRIS Dictionaries

…early digital codebooks… SPSS (and SAS) code

…early digital codebooks… PDFs

What do early digital codebooks have in common?
1. Tied to a particular physical layout of a data file.
   VARIABLE 6 OPINION OF COUNTRY OVERALL DECK 1/35

What do early digital codebooks have in common?
1. Tied to a particular physical layout of a data file.
2. Each uses its own special syntax.
   VARIABLE 6 OPINION OF COUNTRY OVERALL DECK 1/35
   D HUFAMINC 2 39
   CITY $ 77-94
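The dependence on physical layout can be sketched in a few lines of Python. This is only an illustration: it assumes `CITY $ 77-94` means a character field in columns 77 through 94, and that `D HUFAMINC 2 39` means a 2-character field starting at column 39. The record contents, the `ColumnSpec` type, and the `read_field` helper are hypothetical.

```python
from typing import NamedTuple


class ColumnSpec(NamedTuple):
    """Where one variable lives in a fixed-width record (1-based, inclusive)."""
    name: str
    start: int
    end: int


def read_field(record: str, spec: ColumnSpec) -> str:
    """Cut a field out of a record purely by column position."""
    return record[spec.start - 1:spec.end].strip()


# Assumed readings of the two layout statements quoted on the slide.
CITY = ColumnSpec("CITY", 77, 94)
HUFAMINC = ColumnSpec("HUFAMINC", 39, 40)

# A made-up 94-column record: "07" in columns 39-40, a city name in 77-94.
record = " " * 38 + "07" + " " * 36 + "ANN ARBOR".ljust(18)

city = read_field(record, CITY)
income_code = read_field(record, HUFAMINC)

# If the file layout shifts by even one column, the same spec reads the
# wrong characters: the documentation is only valid for one physical layout.
shifted_income_code = read_field(" " + record, HUFAMINC)
```

The column specification carries no meaning of its own. It is correct only for one physical arrangement of the file, which is the first problem the slide names.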

What do early digital codebooks have in common?
3. Some included information intended for human consumption.
   Q1. THINKING ABOUT THE COUNTRY OVERALL, DO YOU THINK THINGS IN THE U.S. ARE GENERALLY GOING IN THE RIGHT DIRECTION, OR DO YOU FEEL THINGS ARE SERIOUSLY OFF ON THE WRONG TRACK?
   VALUE LABEL        VALUE   N OF CASES
   RIGHT DIRECTION
   WRONG TRACK
   NO OPINION         8       48
   NOT APPLICABLE*
   TOTAL                      1008
   *NOT FORM A

Problems of early digital codebooks (part 1)
[Diagram labels: PDF, Osiris dictionary, SPSS cards, CBLT, Book, Osiris, SPSS]

[Diagram labels: PDF, Osiris dictionary, SPSS cards, CBLT, Book, Osiris, SPSS]
(user has to re-create information in order to re-use information)
Machine “readable” but not machine “actionable”

XML helps solve the problem
- XML is not tied to any single piece of software.
- XML is designed to be easily parsed by computer.
- XML is (to some extent) self-documenting or self-descriptive.
- XML can include information intended both for humans and machines.
- XML is non-proprietary, open, flexible.

XML helps solve the problem
- Many tools exist to read/convert XML (Java, JavaScript, Perl, PHP, etc.).
- XSL and XSLT were created explicitly for converting XML. With them XML can be converted to HTML, PDF, other XML, etc.
- XML is highly structured so it can be predictably converted.
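As a rough illustration of “easily parsed” and “predictably converted”, the sketch below reads a small codebook fragment with Python's standard library and turns it into HTML. It uses plain Python rather than XSLT. The XML fragment is made up for this example: its element names are loosely modeled on DDI codebook markup and are not validated against any schema, and the numeric codes are illustrative.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical codebook fragment; element names and codes are illustrative only.
CODEBOOK_XML = """
<codeBook>
  <var name="V6">
    <labl>OPINION OF COUNTRY OVERALL</labl>
    <catgry><catValu>1</catValu><labl>RIGHT DIRECTION</labl></catgry>
    <catgry><catValu>2</catValu><labl>WRONG TRACK</labl></catgry>
    <catgry><catValu>8</catValu><labl>NO OPINION</labl></catgry>
  </var>
</codeBook>
"""


def find_var(xml_text: str, var_name: str) -> ET.Element:
    """Locate one variable description by its name attribute."""
    root = ET.fromstring(xml_text)
    var = root.find(f"./var[@name='{var_name}']")
    if var is None:
        raise KeyError(var_name)
    return var


def value_labels(xml_text: str, var_name: str) -> dict:
    """Machine-actionable use: map each category value to its label."""
    var = find_var(xml_text, var_name)
    return {c.findtext("catValu"): c.findtext("labl") for c in var.findall("catgry")}


def to_html(xml_text: str, var_name: str) -> str:
    """Human-readable use: render the same description as an HTML snippet."""
    var = find_var(xml_text, var_name)
    title = escape(var.findtext("labl", default=""))
    items = "".join(
        f"<li>{escape(value)} = {escape(label)}</li>"
        for value, label in value_labels(xml_text, var_name).items()
    )
    return f"<h2>{escape(var_name)}: {title}</h2><ul>{items}</ul>"


labels = value_labels(CODEBOOK_XML, "V6")
snippet = to_html(CODEBOOK_XML, "V6")
```

The same source serves both a program (`labels`) and a person (`snippet`) without anyone re-keying the documentation.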

DDI 1 and 2
1.0 DOCUMENT DESCRIPTION
2.0 STUDY DESCRIPTION
3.0 DATA FILES DESCRIPTION
4.0 VARIABLE DESCRIPTION
5.0 OTHER STUDY-RELATED MATERIALS
Built to emulate early code BOOKS and digital codebooks…

Problems of early digital codebooks (part 2)
- Static, inflexible.
- Meant to document the end point of research; views research as linear.
- Hard to re-use the information for new research.

Problems of DDI 1 and 2
- Emulated the code book
- Not flexible enough
- We could do so much more…

Three Stages of Technological Change
Type of Change: Characterized by
Modernization: Doing what we’ve always done, but using technology to do more and to increase efficiency.
Innovation: Doing things we’ve wanted to do, but could not do without the technology.
Transformation: Doing things that we didn’t imagine until technology made it possible.

Three Stages of Technological Change
Type of Change: Characterized by
Early digital codebooks: Doing what we’ve always done, but using technology to do more and to increase efficiency.
DDI 1 and 2: Doing things we’ve wanted to do, but could not do without the technology.
DDI 3: Doing things that we didn’t imagine until technology made it possible.

Three Stages of Technological Change
Type of Change: Characterized by
Early digital codebooks: Making codebooks machine readable.
DDI 1 and 2: Making codebooks re-usable, even machine actionable…
DDI 3: Re-thinking “documentation”; re-thinking the research process.

DDI 1 and 2
- Document Description
- Study Description
- Data Files Description
- Variable Description
- Other Study-Related Materials

DDI 1 and 2
- Document Description
- Study Description
- Data Files Description
- Variable Description
- Other Study-Related Materials
DDI 3
- Study Concept
- Data Collection
- Data Processing
- Data Distribution
- Data Archiving
- Data Discovery
- Data Analysis
- Repurposing

Life Cycle of Research, Data, Documentation

A modular approach
Study Unit
- Research question
- Funding
- Concepts
- Background research

A modular approach
Study Unit
Data Collection
- Instrument
- Data collection process
- Questionnaire

A modular approach
Study Unit
Data Collection
Logical Product
- Intellectual content of data
- Relationship to questions and concepts
- Relationship to processing (recodes, weighting, derivations, imputations)

A modular approach
Study Unit
Data Collection
Logical Product
Physical Data Product
- Describes the structure (microdata, tabular, aggregate, Ncube…) (e.g., STF 1A)

A modular approach
Study Unit
Data Collection
Logical Product
Physical Data Product
Physical instance
- Each describes a single data file (e.g., STF 1A by state: each state is an instance)

A modular approach
Study Unit
Data Collection
Logical Product
Physical Data Product
Physical instance
“Instance”
- An instance module “wraps” the other modules. Like a table of contents for a group of studies, files, and modules, it brings everything together.

A modular approach
Study Unit
Data Collection
Logical Product
Physical Data Product
Physical instance
“Instance”
Archive
- Each archive can add its own local information with an archive module.

A modular approach
Study Unit
Data Collection
Logical Product
Physical Data Product
Physical instance
“Instance”
Archive

A modular approach (but wait… there’s more!)
Group module
- Describe concepts, questions, and variables that occur in several studies.
- Describe a series (e.g., CBP, CPS, Eurobarometer).
- Describe a collection of studies (not a series) and identify the common comparable concepts, questions, and variables.

A modular approach (but wait… there’s more!)
Group module
Comparative module
- The Comparative module contains information for comparing concepts, questions, and variables between or among Study Units that have been housed in a Group.

A modular approach (but wait… there’s more!)
Group module
Comparative module
Conceptual components module
- Describe concepts and their relationships as concept groups.
- Use known vocabularies, and indicate the level of similarity between two concepts by describing the extent of difference.

A modular approach
Study Unit
Data Collection
Logical Product
Physical Data Product
Physical instance
“Instance”
Archive
Group module
Comparative module
Conceptual components module
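One way to picture how these modules nest is as a set of small record types, with the instance module wrapping the rest. The sketch below is not the DDI 3 schema. The class names follow the module names on the slides, but every field, the file names, and the `data_files` helper are hypothetical and chosen only to show the containment.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PhysicalInstance:
    """Describes a single data file (e.g., one state's STF 1A file)."""
    file_name: str


@dataclass
class PhysicalDataProduct:
    """Describes the structure shared by every file of a product."""
    name: str
    structure: str  # e.g. "microdata", "tabular", "aggregate"
    instances: List[PhysicalInstance] = field(default_factory=list)


@dataclass
class LogicalProduct:
    """Intellectual content of the data, independent of file layout."""
    variables: List[str] = field(default_factory=list)


@dataclass
class DataCollection:
    """Instrument, questionnaire, and collection process."""
    instrument: str = ""


@dataclass
class StudyUnit:
    """Research question and the modules that document one study."""
    research_question: str
    data_collection: DataCollection
    logical_product: LogicalProduct
    physical_products: List[PhysicalDataProduct] = field(default_factory=list)


@dataclass
class Archive:
    """Local information added by a particular archive."""
    local_notes: str = ""


@dataclass
class Instance:
    """Wraps the other modules, like a table of contents."""
    study_units: List[StudyUnit]
    archive: Optional[Archive] = None

    def data_files(self) -> List[str]:
        """List every data file reachable from this instance."""
        return [
            inst.file_name
            for unit in self.study_units
            for product in unit.physical_products
            for inst in product.instances
        ]


# Hypothetical example: one product structure, one physical instance per state.
stf1a = PhysicalDataProduct(
    name="STF 1A",
    structure="tabular",
    instances=[PhysicalInstance("stf1a_state_01.dat"), PhysicalInstance("stf1a_state_02.dat")],
)
study = StudyUnit(
    research_question="(placeholder)",
    data_collection=DataCollection(instrument="(placeholder)"),
    logical_product=LogicalProduct(variables=["(placeholder)"]),
    physical_products=[stf1a],
)
wrapper = Instance(study_units=[study], archive=Archive(local_notes="(placeholder)"))
files = wrapper.data_files()
```

Here the structure is described once in `PhysicalDataProduct` and each file is a separate `PhysicalInstance`, mirroring the “each state is an instance” example.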