Elements of Data Documentation
What are the most important elements to document? Who will be using the documentation? How should these elements be documented?

Data-Level Documentation
Elements to document:
- Variables: name, labels, question text, length and type in data set
- Values: list of valid values, coding
- Derived data: algorithm used to create
- Missing data: how was it handled?
- Question routing (skip patterns)
- Error checking/validation
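
As an illustration of the data-level elements listed above, here is a minimal Python sketch of what one variable's documentation record might hold. The class name, field names, variable name (SMOKE1), and the codes -9 and -8 are hypothetical and chosen only for this example; they are not taken from any particular study or standard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VariableDoc:
    """Documentation for one variable (all field names are hypothetical)."""
    name: str
    label: str
    question_text: str
    var_type: str                      # e.g. "numeric" or "string"
    length: int                        # storage width in the data set
    valid_values: dict = field(default_factory=dict)   # code -> meaning
    missing_codes: dict = field(default_factory=dict)  # code -> reason missing
    derivation: Optional[str] = None   # algorithm, if the variable is derived
    skip_rule: Optional[str] = None    # routing condition, if any

    def check(self, value):
        """Classify a raw value as 'valid', 'missing', or 'invalid'."""
        if value in self.missing_codes:
            return "missing"
        if value in self.valid_values:
            return "valid"
        return "invalid"


# Hypothetical example variable with assumed codes.
smoker = VariableDoc(
    name="SMOKE1",
    label="Currently smokes",
    question_text="Do you currently smoke cigarettes?",
    var_type="numeric",
    length=2,
    valid_values={1: "Yes", 2: "No"},
    missing_codes={-9: "Refused", -8: "Skipped (not applicable)"},
    skip_rule="Asked only if respondent has ever smoked",
)

# Simple error checking: classify a few raw values against the record.
results = [smoker.check(v) for v in (1, 2, -9, 7)]
```

Keeping valid codes and missing codes in separate fields is one possible design; it lets the same record drive both the codebook text and a basic validation pass.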

Study-Level Documentation
- Context of project
- Details of data collection
- Information about data files
  - File name, date, version, number of cases
- Summary of measures
  - Scaling/scoring
  - Validation/modification
- Longitudinal information
- Naming conventions
- Version information

Users of Data Documentation
Know your potential audiences:
- Data managers
- Statisticians
- Researchers
- Outside users

Types of Data Documentation
- Tabular codebook (Excel)
  - Good for organizing a large amount of information concisely
  - Sortable/filterable
- Annotated measure
  - Contains basic variable and value information in context
- Data dictionary/Data narrative
  - Good for measure/study-level information
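
To make the tabular codebook idea concrete, the following sketch builds a small codebook as rows and writes it as CSV, a format Excel can open. The column names and the three example variables are assumptions made for illustration; a real codebook would use whatever columns the project needs.

```python
import csv
import io

# Hypothetical codebook rows: one row per variable.
rows = [
    {"variable": "SMOKE1", "label": "Currently smokes",
     "type": "numeric", "measure": "Health behaviors"},
    {"variable": "AGE", "label": "Age in years",
     "type": "numeric", "measure": "Demographics"},
    {"variable": "CITY", "label": "City of residence",
     "type": "string", "measure": "Demographics"},
]

FIELDS = ["variable", "label", "type", "measure"]


def write_codebook(rows, handle):
    """Write codebook rows as CSV, sorted by variable name."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(sorted(rows, key=lambda r: r["variable"]))


# Write to an in-memory buffer; a real script would open a file instead.
buffer = io.StringIO()
write_codebook(rows, buffer)
codebook_text = buffer.getvalue()

# Filtering works the same way a spreadsheet filter would.
numeric_vars = [r["variable"] for r in rows if r["type"] == "numeric"]
```

Because each variable is one row with fixed columns, the same table can be sorted by name or filtered by type or measure, which is the main appeal of the tabular form.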