DDI for the Uninitiated

1 DDI for the Uninitiated
Ernie Boyko, Statistics Canada
Chuck Humphrey, University of Alberta
Data Management Issues
ACCOLEDS/DLI Training: December 2003
Greetings: Good afternoon everyone, glad to be with you for this discussion.

2 Cataloguing Experiences
How many have catalogued using MARC? Dublin Core?

3 Cataloguing Experiences
Objectives of cataloguing: inventory control; location tool; access; distribution

4 Enter DDI Documentation in a standardized mark-up language
Data Documentation Initiative (DDI)
The Data Documentation Initiative is a standard coming out of the IASSIST community and housed with the ICPSR. The DDI committee has produced what is known as a Document Type Definition (DTD) for the "markup" of data documentation, which some have called codebooks in the past. The DTD employs the eXtensible Markup Language (XML), which is a simplified subset of a more general markup language, SGML.

5 The DDI project is housed at the ICPSR, whose web site contains the detailed description of the DDI.

6 The DDI DTD is composed of five sections, each having its own set of tags and tag attributes. The five sections are the document description, study description, data files description, variable description and other related materials.
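
As an illustration of the five-part structure, here is a minimal Python sketch that lists the top-level sections of a DDI-style document. The element names (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, `otherMat`) follow the DDI codebook DTD as it is commonly published; the sample document and the `list_sections` helper are invented for this example and should be checked against the DTD version actually in use.

```python
import xml.etree.ElementTree as ET

# Invented, heavily abbreviated sample; a real DDI file carries far more detail.
SAMPLE = """<codeBook>
  <docDscr><citation><titlStmt><titl>Sample codebook</titl></titlStmt></citation></docDscr>
  <stdyDscr><citation><titlStmt><titl>Sample study</titl></titlStmt></citation></stdyDscr>
  <fileDscr><fileTxt><fileName>sample.dat</fileName></fileTxt></fileDscr>
  <dataDscr><var name="AGE"><labl>Age of respondent</labl></var></dataDscr>
  <otherMat><labl>Questionnaire</labl></otherMat>
</codeBook>"""

# Assumed mapping from DTD element names to the five sections named on the slide.
SECTION_NAMES = {
    "docDscr": "document description",
    "stdyDscr": "study description",
    "fileDscr": "data files description",
    "dataDscr": "variable description",
    "otherMat": "other related materials",
}


def list_sections(xml_text):
    """Return (tag, plain-language name) for each top-level section, in document order."""
    root = ET.fromstring(xml_text)
    return [(child.tag, SECTION_NAMES.get(child.tag, "unknown")) for child in root]


if __name__ == "__main__":
    for tag, name in list_sections(SAMPLE):
        print(f"{tag}: {name}")
```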

7 An Example: American Public Opinion and U.S. Foreign Policy, 1994

8 XML-DDI Benefits
The display of data documentation through a variety of style sheets; input for further processing, such as creating statistical package command files, conducting advanced searches, comparing variables across data files, driving data extraction engines, etc.
The benefit of a standard like XML DDI is that it structures the content of data documentation both for display and as input for a variety of processing, including creating statistical package command files, conducting advanced searches, comparing variables across data files, and driving data extraction engines. The possibilities become quite large when the structured data documentation can subsequently be processed on the basis of the tags of the DTD. I was just thinking about the possibility of visualizing skip patterns in the original questionnaire based on the tag structure in a DDI-compliant document.
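
To make "creating statistical package command files" concrete, the following Python sketch turns variable and category labels from a DDI-style fragment into SPSS-style `VARIABLE LABELS` and `VALUE LABELS` commands. It is a sketch under assumptions: the `var`, `labl`, `catgry` and `catValu` element names are taken from the DDI codebook DTD as commonly published, while the sample data and the `spss_labels` and `quote` helpers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample fragment with two variables, one of them categorical.
SAMPLE = """<codeBook><dataDscr>
  <var name="SEX"><labl>Sex of respondent</labl>
    <catgry><catValu>1</catValu><labl>Male</labl></catgry>
    <catgry><catValu>2</catValu><labl>Female</labl></catgry>
  </var>
  <var name="AGE"><labl>Age in years</labl></var>
</dataDscr></codeBook>"""


def quote(text):
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


def spss_labels(xml_text):
    """Build SPSS-style label commands from the var elements of a DDI-style document."""
    root = ET.fromstring(xml_text)
    lines = []
    for var in root.iter("var"):
        name = var.get("name")
        label = var.findtext("labl")  # direct child only, so category labels are skipped
        if label:
            lines.append(f"VARIABLE LABELS {name} {quote(label.strip())}.")
        pairs = []
        for cat in var.findall("catgry"):
            value = cat.findtext("catValu")
            cat_label = cat.findtext("labl")
            if value is not None and cat_label is not None:
                pairs.append(f"{value.strip()} {quote(cat_label.strip())}")
        if pairs:
            lines.append(f"VALUE LABELS {name} " + " ".join(pairs) + ".")
    return lines


if __name__ == "__main__":
    print("\n".join(spss_labels(SAMPLE)))
```

The same loop could emit SAS or Stata commands instead; only the output templates would change, which is the point of keeping the documentation in a tagged, package-neutral form.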

9 Data Documentation
There is a need for comprehensive data documentation that allows easily finding variables: by subject groupings; by keywords, phrases or terms; by response categories (value labels); through linkages from the questionnaire.
Secondly, the better and more comprehensive the data documentation, the easier it is to discover the variables needed by researchers both for their analysis and for extracting case subsets. We need simplified ways both to find variables and to trace variables to their origin. It has been interesting watching the reactions of users to SLIDRET, the utility distributed with the Survey of Labour and Income Dynamics (SLID). This small dbase application allows finding and extracting variables from the huge number that exist. Simply having the variables organized by subject groupings has been helpful. While this utility has been helpful with SLID, I see another approach as providing a more versatile platform for organizing data documentation, which I’ll mention shortly.
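
The kind of variable discovery described here can be sketched in a few lines of Python once the documentation is tagged. The example below searches variable labels, question text and response category labels for a term. The `var`, `labl`, `qstn`, `qstnLit` and `catgry` element names are assumed from the DDI codebook DTD; the sample variables and the `find_variables` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample: three variables with labels, question text and categories.
SAMPLE = """<codeBook><dataDscr>
  <var name="INC"><labl>Total household income</labl>
    <qstn><qstnLit>What was your total household income last year?</qstnLit></qstn>
  </var>
  <var name="MARST"><labl>Marital status</labl>
    <catgry><catValu>1</catValu><labl>Married</labl></catgry>
    <catgry><catValu>2</catValu><labl>Divorced</labl></catgry>
  </var>
  <var name="JOB"><labl>Employment</labl>
    <qstn><qstnLit>Did you work for pay last week?</qstnLit></qstn>
  </var>
</dataDscr></codeBook>"""


def find_variables(xml_text, term):
    """Return names of variables whose label, question text or category labels contain term."""
    term = term.lower()
    root = ET.fromstring(xml_text)
    hits = []
    for var in root.iter("var"):
        texts = [var.findtext("labl") or "", var.findtext("qstn/qstnLit") or ""]
        texts += [cat.findtext("labl") or "" for cat in var.findall("catgry")]
        if any(term in text.lower() for text in texts):
            hits.append(var.get("name"))
    return hits


if __name__ == "__main__":
    for term in ("income", "married", "work for pay"):
        print(term, "->", find_variables(SAMPLE, term))
```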

10 Data Documentation
There is a need for comprehensive data documentation that allows easily tracing variables back to their origins: to a question; to a response category for a multiple response item; to the variables from which it was computed, for a derived variable.
Discovery of variables can flow from the list of variables back to the survey instrument just as easily as flowing from the questionnaire to the variable list. Knowing the context from which a variable originates is important information. Does a variable come from a question that occurs within a skip pattern? Is the variable part of a multiple response question? If the variable was derived, from which variables was it computed?
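
Tracing a variable back to its origin might look like the following Python sketch. It assumes that a derived variable carries a `derivation` element whose `var` attribute lists the IDs of its source variables and whose `drvdesc` child describes the computation; that is how I understand the DDI codebook DTD, but it should be verified against the DTD in use. The sample data and the `trace_origin` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample: two questionnaire variables and one derived from them.
SAMPLE = """<codeBook><dataDscr>
  <var ID="V1" name="HEIGHT"><labl>Height in metres</labl>
    <qstn><qstnLit>How tall are you?</qstnLit></qstn>
  </var>
  <var ID="V2" name="WEIGHT"><labl>Weight in kilograms</labl>
    <qstn><qstnLit>How much do you weigh?</qstnLit></qstn>
  </var>
  <var ID="V3" name="BMI"><labl>Body mass index</labl>
    <derivation var="V1 V2"><drvdesc>Weight divided by height squared</drvdesc></derivation>
  </var>
</dataDscr></codeBook>"""


def trace_origin(xml_text, name):
    """Report whether a variable is derived (and from what) or comes from a question."""
    root = ET.fromstring(xml_text)
    by_id = {var.get("ID"): var for var in root.iter("var")}
    target = next((v for v in by_id.values() if v.get("name") == name), None)
    if target is None:
        raise KeyError(f"no variable named {name!r}")
    derivation = target.find("derivation")
    if derivation is not None:
        # Resolve the space-separated ID references to variable names.
        sources = [by_id[ref].get("name") for ref in derivation.get("var", "").split()]
        return {
            "kind": "derived",
            "sources": sources,
            "description": derivation.findtext("drvdesc"),
        }
    return {"kind": "question", "text": target.findtext("qstn/qstnLit")}


if __name__ == "__main__":
    print(trace_origin(SAMPLE, "BMI"))
    print(trace_origin(SAMPLE, "HEIGHT"))
```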

11 Data Documentation
There is a need for comprehensive data documentation that allows easily understanding the corrections that must be made because of the sampling methodology.
Also, clear instructions about the steps that must be taken to correct for the sampling procedure are critical.

12 What’s next? Let’s assume we have <ddi> compliant files … so what’s next? What are the choices?

13 General Choices
Feed your own system (input from a structured file)
Look at systems using <ddi> files directly
Wait for SAS, SPSS, etc. to become XML enabled
Wait and see

14 Projects Using DDI
NESSTAR
Health Canada -- DAIS
SDA, Berkeley
ICPSR’s metadata
University of Minnesota
US Census Bureau
Harvard Virtual Data Center

15 Global Access, Local Support
Data users NESSTAR Central Server Data Producers

16 Data Observatory Workbench
Text: Journal articles, User guides, Methodology instructions
Tools: Finding and sorting, Browsing, Analysing, Publishing, Hyperlinks
Data: Survey, Indicators, Administrative, Geographical
People: Conferences, Experts, Discussion lists

17 Data Sharing - The NESSTAR Way (in 3 Steps)
1. Prepare your data using the Nesstar Publisher
Microdata in SPSS, SAS, Stata, Statistica, ASCII or other formats
Table or aggregated data in Excel, ASCII or other formats
Documentation/metadata in various text formats, including XML
Data or metadata sitting in relational databases
[Diagram label: Import]
Import data and metadata from a variety of formats
Cut and paste additional metadata from external sources
Use templates to enforce structure and local "best practice"
Organize your variables in groups and sub-groups
Add local controlled vocabularies or thesauri
Validate your data/metadata against the DDI and your local "best practice"
Output DDI instances and/or publish to a Nesstar server

18 Data Sharing - The NESSTAR Way (in 3 Steps) – (cont’d)
2. Publish your data to a Nesstar server
Publish over the Web or a local area network (LAN)
Organize your data in folders and sub-folders
Define the access conditions of your data
Customize the user interface to your data
[Diagram labels: Publish, Data Store]

19 Data Sharing - The NESSTAR Way (in 3 Steps) – (cont’d)
3. Share and explore your data through a variety of interfaces
Nesstar Explorer – a feature rich data browser (Java application)
Nesstar light – the standard web-browser interface to Nesstar resources and services
Choose between a variety of customized interfaces
Develop your own customized interface or integrate Nesstar services in an existing web application
[Diagram labels: Access, Data Store]

20 Demo URL:

21 Where do we go from here?
Need to start producing <ddi> files
Need to create incentives for survey managers to create <ddi> files
Need to work cooperatively to convert legacy files

22 What’s ACCOLEDS’ role?

