Data Documentation Initiative (DDI): Goals and Benefits
  Mary Vardigan
  Director, DDI Alliance
Today’s Presentation
  What is DDI?
  History of the effort
  The DDI community
  DDI features, benefits, examples
  Relationship to SDMX and GSBPM
  Future directions
What Is DDI?
  An international specification for structured metadata describing social, behavioral, and economic data
  A standardized framework to maintain and exchange documentation/metadata
  A basis on which to build software tools
  Currently expressed in XML (eXtensible Markup Language)
History
  First international committee established
  First DDI version published (aligned with codebooks, XML DTD-based)
  2003 – DDI 2 published (support for aggregate/tabular data and geography added)
  Formation of the DDI Alliance, a self-sustaining membership organization
  2008 – DDI 3 published (aligned with data lifecycle, XML Schema-based)
  2010 – DDI rebranding into two development lines: DDI Codebook (DDI 2 branch) and DDI Lifecycle (DDI 3 branch)
DDI Alliance
  30 members from around the world
  Modest institutional membership fee, which entitles members to help shape the specification
  Yearly meetings
  Active working groups: Survey Design and Implementation, Qualitative Data
  New groups: Paradata, Administrative Data, Disclosure
The DDI Community
Projects and Organizations Using DDI
  Australian Bureau of Statistics
  Canadian Research Data Centres (RDC) Program
  CESSDA Data Portal
  DataFirst at University of Cape Town
  The Dataverse Network
  European Social Survey (ESS) and ESS-Edu-net
  Gallup Europe
Projects and Organizations Using DDI (continued)
  General Social Survey
  Institute for the Study of Labor (IZA), Germany
  LISS (Longitudinal Internet Studies for the Social Sciences), Netherlands
  National Survey of Family Growth, US
  UNICEF, Child Info: Monitoring the Situation of Children and Women
  World Bank: International Household Survey Network (IHSN) and Microdata Management Toolkit
Products of the DDI Alliance
  Specifications for DDI Codebook and Lifecycle
  Controlled vocabularies
  Tools catalog
  Training
  Papers and presentations
DDI Tools
  Several DDI authoring tools now available
    – Nesstar
    – IHSN Microdata Toolkit
    – Colectica
  Tools for DDI in Research Data Centers
DDI Training and Exploration
  Training available on demand in fundamentals of DDI
  Advanced workshops in Germany in September 2011
    – DDI and Longitudinal Data
    – DDI and Semantic Statistics
DDI Development Lines
  DDI Codebook (DDI 2 branch)
    – Reflects components of social science codebooks
    – Includes descriptions at the study, file, and variable level
  DDI Lifecycle (DDI 3 branch)
    – Reflects research data lifecycle
    – Optimized for reuse of metadata
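To make the study, file, and variable levels of DDI Codebook concrete, here is a minimal sketch in Python that builds a codebook-style XML document. The element names (codeBook, stdyDscr, fileDscr, dataDscr, var, labl) follow DDI Codebook conventions, but the helper `build_codebook`, the sample title, file name, and variables are hypothetical, and the output is deliberately simplified: it omits namespaces and required elements, so it is not schema-valid DDI.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, file_name, variables):
    """Build a simplified, codebook-style XML tree (illustrative only).

    `variables` is a list of (name, label) pairs. This is a hypothetical
    helper, not part of any DDI tool, and the result is not schema-valid.
    """
    root = ET.Element("codeBook")

    # Study level: what the study is.
    study = ET.SubElement(root, "stdyDscr")
    citation = ET.SubElement(study, "citation")
    title_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_stmt, "titl").text = title

    # File level: which physical data file is described.
    file_dscr = ET.SubElement(root, "fileDscr")
    file_txt = ET.SubElement(file_dscr, "fileTxt")
    ET.SubElement(file_txt, "fileName").text = file_name

    # Variable level: one entry per variable, with a human-readable label.
    data_dscr = ET.SubElement(root, "dataDscr")
    for name, label in variables:
        var = ET.SubElement(data_dscr, "var", name=name)
        ET.SubElement(var, "labl").text = label

    return root


# Made-up sample values, for illustration only.
codebook = build_codebook(
    "Example Household Survey",
    "survey.csv",
    [("AGE", "Age of respondent"), ("SEX", "Sex of respondent")],
)
print(ET.tostring(codebook, encoding="unicode"))
```

Because the metadata is structured rather than free text, software can query it directly, for example with `codebook.find("dataDscr/var[@name='AGE']/labl").text` to look up a variable label.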
Research Data Life Cycle
[Figure: metadata produced across the research data life cycle. Labels: Initial concepts; Questions and answers; Grant info; Questionnaire; Coded instrument; CAI metadata; Paradata; Data specs; Recodes; Summary descriptive info; Terms of use; Citation; Packaging info; Catalog record; Indexing; Related publications; Replication code; Publications; Post-hoc harmonization; Data transformations; Preservation metadata; Confidentiality; Add’l processing]
DDI as backbone for structured metadata
[Figure: DDI at the center of the data lifecycle. Lifecycle stages: Collection, Concept, Processing, Distribution, Discovery, Analysis, Repurposing. OAIS archive packages: SIP, AIP, DIP. Connected tools and sources: CAI tools (MQDS etc.); information extracted from SPSS etc.; custom tools (e.g. forms-based); statistical packages; online analysis; search engines; distribution packages; Web information system; data/documents outside of DDI]
DDI Lifecycle Features
  Machine-actionable
  Modular and extensible
  Multi-lingual
  Aligned with other metadata standards
  Can carry data in-line
  Focused on metadata reuse
DDI Lifecycle Features
  Support for CAI instruments
  Support for longitudinal surveys
  Focus on comparison, both by design and after-the-fact (harmonization)
  Robust record and file linkages for complex data files
  Support for geographic content (shape and boundary files)
  Capability for registries and question banks
Example – Comparative Collaborative Psychiatric Surveys
Explore a Variable
Compare Questions in Proximity
DDI and SDMX (Statistical Data and Metadata eXchange)
  Not competing but complementary
  Two standards bodies talking with each other
    – Recent meetings in Utrecht, Lisbon, and Washington, DC; next one in Luxembourg
  Goals and use cases
    – Document survey process end-to-end
    – Drill back from macrodata to microdata
    – Data discovery
Generic Statistical Business Process Model -- GSBPM
GSBPM, SDMX, and DDI
Future Directions
  Data model development
  Additional controlled vocabularies
  Training in DDI and SDMX
  New tools – DDI from Blaise
  New versions of both Codebook and Lifecycle this year
  Work starting on DDI 4.0
Questions? Mary Vardigan –