Creating DDI Compliant Codebooks Wendy L. Thomas William C. Block Robert P. Wozniak Joshua J. Buysse A workshop presented at IASSIST 2001 Amsterdam NL.

Slides:



Advertisements
Similar presentations
DDI for the Uninitiated ACCOLEDS /DLI Training: December 2003 Ernie Boyko Statistics Canada Chuck Humphrey University of Alberta.
Advertisements

DLI Training Nesstar Workshop
Data Documentation Initiative (DDI) Workshop Carol Perry Ernie Boyko April 2005 Kingston Ontario.
Foundational Objects. Areas of coverage Technical objects Foundational objects Lessons learned from review of Use Case content Simple Study Simple Questionnaire.
10. NLTS2 Documentation Overview. 1 Prerequisites Recommended modules to complete before viewing this module  1. Introduction to the NLTS2 Training Modules.
Pengolahan dan Analisa Data Indra Budi Fasilkom UI.
XHTML & CSS 2 By Trevor Adams. Last week XHTML eXtensible HyperText Mark-up Language The beginning – HTML Web Standards Concept and syntax Elements (tags)
An Introduction to the Data Documentation Initiative (DDI) ICPSR OR Meeting 2001 Wendy L. Thomas Data Access Core Director William C. Block Information.
Content and Systems Week 3. Today’s goals Obtaining, describing, indexing content –XML –Metadata Preparing for the installation of Dspace –Computers available.
Introduction to SPSS Opening the program Type in data Open an existing data set For now, click “Cancel”
March 2013 ESSnet DWH - Workshop IV DATA LINKING ASPECTS OF COMBINING DATA INCLUDING OPTIONS FOR VARIOUS HIERARCHIES (S-DWH CONTEXT)
Creating DDI Compliant Codebooks William C. Block Wendy L. (Treadwell) Thomas Robert P. Wozniak Joshua J. Buysse IASSIST 2001 Amsterdam, The Netherlands.
Meta Dater Metadata Management and Production System for surveys in Empirical Socio-economic Research A Project funded by EU under the 5 th Framework Programme.
Fitting a survey life cycle in the DDI Irene Wong Chuck Humphrey IASSIST Edinburgh May 2005.
Locating the Geographic Center of DDI 3.0 IASSIST May 2006 Wendy Thomas University of Minnesota.
Is Your Data Facility ISO Compliant? Progress Towards Harmonizing the DDI and ISO/IEC Dan Gillman Information Scientist US Bureau of Labor Statistics.
QBM117 Business Statistics Statistical Inference Sampling 1.
StatCat Building a Statistical Data Finder ssrs.yale.edu/statcat Steven Citron-Pousty Ann Green Julie Linden Yale University.
Demonstration of a Blaise Instrument Documentation System “BlaiseDoc” Gina-Qian Cheung May 25, 2005 Institution for Social Research University of Michigan.
TC3 Meeting in Montreal (Montreal/Secretariat)6 page 1 of 10 Structure and purpose of IEC ISO - IEC Specifications for Document Management.
INTERPRET MARKETING INFORMATION TO TEST HYPOTHESES AND/OR TO RESOLVE ISSUES. INDICATOR 3.05.
Reusable!? Or why DDI 3.0 contains a recycling bin.
Codebook Centric to Life-Cycle Centric In the beginning….
Data Management: Quantifying Data & Planning Your Analysis
Multiple Indicator Cluster Surveys Survey Design Workshop Data Analysis and Reporting MICS Survey Design Workshop.
Tutorial 3: Adding and Formatting Text. 2 Objectives Session 3.1 Type text into a page Copy text from a document and paste it into a page Check for spelling.
IPUMS to IHSN: Leveraging structured metadata for discovering multi-national census and survey data Wendy L. Thomas 4 th Conference of the European Survey.
ISO as the metadata standard for Statistics South Africa
WORKING WITH NAMESPACES
McGraw-Hill/Irwin © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 9 Processing the Data.
XP New Perspectives on XML Tutorial 4 1 XML Schema Tutorial – Carey ISBN Working with Namespaces and Schemas.
Chapter 1 Getting Started
Organizing Your Data for Statistical Analysis in SPSS
Chapter Sixteen Starting the Data Analysis Angel Gillis & Winston Jackson Research for Nurses: Research for Nurses: Methods and Interpretation.
8/28/97Organization of Information in Collections Introduction to Description: Dublin Core and History University of California, Berkeley School of Information.
Survey Data Management and Combined use of DDI and SDMX DDI and SDMX use case Labor Force Statistics.
XP 1 CREATING AN XML DOCUMENT. XP 2 INTRODUCING XML XML stands for Extensible Markup Language. A markup language specifies the structure and content of.
Use of survey (LFS) to evaluate the quality of census final data Expert Group Meeting on Censuses Using Registers Geneva, May 2012 Jari Nieminen.
Multiple Indicator Cluster Surveys Survey Design Workshop Sampling: Overview MICS Survey Design Workshop.
Chapter 1 Understanding the Web Design Environment Principles of Web Design, 4 th Edition.
DLI Training April 2004 Kingston Ontario. DDI What, Why, How?
Role of Statistics in Geography
Assessing Quality for Integration Based Data M. Denk, W. Grossmann Institute for Scientific Computing.
Content and Computer Platforms Week 3. Today’s goals Obtaining, describing, indexing content –XML –Metadata Preparing for the installation of Dspace –Computers.
XML A web enabled data description language 4/22/2001 By Mark Lawson & Edward Ryan L’Herault.
Current and Future Applications of the Generic Statistical Business Process Model at Statistics Canada Laurie Reedman and Claude Julien May 5, 2010.
Quantifying Data Advanced Social Research (soci5013)
Introduction to Metadata, the DDI and the Metadata Editor Presentation to the SERPent project team by Margaret Ward 3 March 2010.
Panel Study of Entrepreneurial Dynamics Richard Curtin University of Michigan.
United Nations Regional Seminar on Census Data Archiving for Africa, Addis Ababa, Ethiopia, September, 2011 Documentation and Cataloguing in Data.
Overview of EAD Jenn Riley Metadata Librarian Digital Library Program.
Planning how to create the variables you need from the variables you have Jane E. Miller, PhD The Chicago Guide to Writing about Numbers, 2 nd edition.
XP New Perspectives on XML, 2nd Edition Tutorial 2 1 TUTORIAL 2 WORKING WITH NAMESPACES.
Use of Administrative Data Seminar on Developing a Programme on Integrated Statistics in support of the Implementation of the SNA for CARICOM countries.
Teacher Gradebook Assignment Module Post course outline/syllabus as well as all assignments and inclusions Input marks for assignments, tests etc. and.
Tutorial 13 Validating Documents with Schemas
1 Tutorial 12 Working with Namespaces Combining XML Vocabularies in a Compound Document.
1 FUNCTIONS - I Chapter 5 Functions help us write more complex programs.
Content and Systems Week 3. Today’s goals Obtaining, describing, indexing content –XML –Metadata Preparing for the installation of Dspace –Computers available.
1 HTML Frames
Microsoft Expression Web 3 – Illustrated Unit D: Structuring and Styling Text.
Moodle Quizes Staff Guide. Creating Quizzes Click Add an Activity or Resource With the course in editing mode...
BUS 308 Entire Course (Ash Course) For more course tutorials visit BUS 308 Week 1 Assignment Problems 1.2, 1.17, 3.3 & 3.22 BUS 308.
COMPREHENSIVE Excel Tutorial 12 Expanding Excel with Visual Basic for Applications.
Data Entry, Coding & Cleaning SPSS Training Thomas Joshua, MS July, 2008.
Chapter 1: Statistics.
Technical Coordination Group, Zagreb, Croatia, 26 January 2018
STEPS Site Report.
New Perspectives on XML
Presentation transcript:

Creating DDI Compliant Codebooks Wendy L. Thomas William C. Block Robert P. Wozniak Joshua J. Buysse A workshop presented at IASSIST 2001 Amsterdam NL -15 May 2001

Structure for the Workshop 9:15 – 10:15 DDI compliant codebooks Contents Points and perspectives to keep in mind Best practices 10:15 – 11:30 MADDIE (break in here somewhere) Walk-through of functions Practice entries 11:30 – 12:15 Playtime and questions

Workshop Materials CD-ROM Copy of Maddie Quick Reference Guide Tag Library Awe-inspiring reference tool for every element and attribute in the DDI Codebook Stripped down model of an ICPSR codebook to use as a source for the workshop Do NOT try to use this with the data set described under this study number (its been edited beyond recognition)

Overall DDI Structure Document Description (1.0) Describes the XML document itself and the source materials Study Description (2.0) Describes the overall study Data File Descripton (3.0) Describes the physical data files Variables Description (4.0) Describes the variables themselves Other Materials (5.0)

Basic Concepts to Remember: It is NOT just your basic codebook Machine readable vs. Machine processable Human understandable vs. Machine understandable Information needs to be entered in discrete bits

Principles to follow: Use attributes Use ID attribute so you can use IDRefs Make implicit information explicit source, XML:Lang, level Follow ISO standards where available Inheritance

ID Attribute Provides a unique name for each specific element Must start with an alpha character and contain no spaces Must be unique within the XML document Create your own scheme for easy application and reference

Example of an ID scheme: Persons living on farms Farms over 100 acres

Using ID references Persons living on farms Farms over 100 acres

Best Practices: Multi-country data sets Example: EuroBarometer Questions vary by country Response category value varies by country Identify countries under and use sdatref attribute to identify variants The Netherlands France

Use of sdatRefs, methRefs and pubRefs: Under verison 1.01 these attributes have been made broadly available Their use varies only in the sections of the dtd to which they refer Each can contain references to one or more element IDs Examples of use: When two or more universe statements are used these can be stated in the study description and then variables can be associated to the correct universe by sdatRefs Changes in response category labels by country. The appropriate label is linked to the country by sdatRefs

sdatRefs Summary data description references that record the ID values of all elements within the summary data description section of the Study Description that might apply. These elements include: time period covered, date of collection, nation or country, geographic coverage, geographic unit, unit of analysis, universe, and kind of data.

methRefs methodology and processing references which record the ID values of all elements within the study methodology and processing section of the Study Description which might apply. These elements include: information on data collection and data appraisal (e.g., sampling, sources, weighting, data cleaning, response rates, and sampling error estimates).

pubRefs Provides a link to publication/citation references and records by listing the ID values of all citations elements within Section 2.5 or Section 5.0 that pertain to the element.

source, XML:lang, level Source attribute provides the source of the information in the element Remember that not all elements may be passed to another person/system and it is always good to know who to blame XML:lang provides language identifier The `default´ language to you may not be the `default´ language of the user Level indicates nesting patterns Some elements such as and occur in many locations in the dtd. This lets you identify the level of label (var, file, etc)

Using ISO standards (Generic element A.6.3.3) Description: Date the marked-up document was produced (not distributed or archived). The ISO standard for dates (YYYY- MM-DD) is recommended for use with the date attribute. Equivalent to Dublin Core Date. Example: January 25, 1999

Inheritance Lower levels in hierarchies inherit information from higher levels If a piece of information is true for the entire subset of elements, move it up to the next level This means consciously looking for common pieces of information and entering them appropriately

Referencing standard catagory lists Description: Standard category group used in a variable, like industry codes, employment codes, or social class codes. The attribute of "date" is provided to indicate the version of the code in place at the time of the study. The attribute of "URI" is provided to indicate a URN or URL that can be used to obtain the electronic form of the category group.

Example: Census of Population, Classified Index of Industries and Occupations Attributes: ID, xml:lang, source, date, URI

Recording or creating variable groups Variable groups can contain both variables and other variable groups. Variable groups are created this way in order to permit variables to belong to multiple groups. Variables that are linked by use of the same question need not be identified by a Variable Group element because they are linked by a common unique question identifier in the Variable element. All Variable Groups must be marked up before the Variable element is opened.

Types of Variable Groups: Section: Questions from the same section of the questionnaire, e.g., all variables located in Section C. Multiple response: respondent can select more than one answer from a variety of choices, e.g., what newspapers have you read in the past month. Grid: Sub-questions of an introductory or main question but which do not constitute a multiple response group, e.g., I‘m going to read a list of candidates and I would like you to tell me whether you have heard of them.

Type of groups continued Display: Questions which appear on the same interview screen (CAI) together or are presented to the interviewer or respondent as a group. Repetition: The same variable (or group of variables) which are repeated for different groups of respondents or for the same respondent at a different time. Subject: Questions which address a common topic or subject, e.g., income, poverty, children.

Type of groups continued Version: Variables, often appearing in pairs, which represent different aspects of the same question, e.g., pairs of variables (or groups) which are adjusted/unadjusted for inflation or season or whatever, pairs of variables with/without missing data imputed, and versions of the same basic question. Iteration: Questions that appear in different sections of the data file measuring a common subject in different ways, e.g., a set of variables which report the progression of respondent income over the life course.

Type of groups continued Analysis: Variables combined into the same index, e.g., the components of a calculation, such as the numerator and the denominator of an economic statistic. Pragmatic: A variable group without shared properties. Record: Variables from a single record in a hierarchical file. File: Variables from a single file in a multifile study.

Type of groups continued Randomized: Variables generated by CAI surveys produced by one or more random number variables together with a response variable, e.g, random variable X which could equal 1 or 2 (at random) which in turn would control whether Q.23 is worded "men" or "women", e.g., would you favor helping [men/women] laid off from a factory obtain training for a new job?

Type of groups continued And finally.... Other: Variables which do not fit easily into any of the categories listed above, e.g., a group of variables whose documentation is in another language.