Data Collection and Management for Clinical Research Michael A. Kohn, MD, MPP 31 August 2010
Topics: Data Tables (rows = records, columns = fields); Data Dictionary (a table of data about your data table(s)); On-screen data collection forms
Data Tables: Study data are stored in one or more data tables in which Rows = Records = Entities and Columns = Fields = Attributes.
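To make the rows/columns vocabulary concrete, here is a minimal Python sketch of a data table. The field names and values are hypothetical and chosen only for illustration; they do not come from any study in this talk.

```python
# Hypothetical study table: each dict is one row (record = entity),
# and each key is one column (field = attribute).
subjects = [
    {"subject_id": 1, "sex": "F", "sbp_mmhg": 128},
    {"subject_id": 2, "sex": "M", "sbp_mmhg": 141},
    {"subject_id": 3, "sex": "F", "sbp_mmhg": 117},
]

# Column headings (field names), taken from the first record.
fields = list(subjects[0].keys())

# Number of rows (records).
n_records = len(subjects)

# Reading "down" one column means collecting the same field from every record.
sbp_column = [row["sbp_mmhg"] for row in subjects]

print(fields, n_records, sbp_column)
```

Every record carries the same set of fields, which is what makes the collection a table rather than a pile of notes.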
Interview Data
Exam Data
Questionnaire Data
Outcomes on same form as predictors
Demonstration: Creating a Data Table. Label columns and enter rows of data in datasheet view.
Demonstration: Data Dictionary. Table design view: field (= column) names, data types, definitions, validation rules.
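Outside of Access, the same idea can be sketched in a few lines of Python: the data dictionary is itself a small table with one row per field. The field names, definitions, and allowed ranges below are assumptions made up for illustration, and `check_record` is a hypothetical helper, not part of any library.

```python
# A data dictionary is a table of data about the data table:
# one entry per field, giving its name, data type, definition, and validation rule.
# All entries here are hypothetical.
data_dictionary = [
    {"field": "subject_id", "type": int,
     "definition": "Unique study ID (primary key)",
     "rule": lambda v: v > 0},
    {"field": "sex", "type": str,
     "definition": "Sex recorded at enrollment, F or M",
     "rule": lambda v: v in {"F", "M"}},
    {"field": "sbp_mmhg", "type": int,
     "definition": "Systolic blood pressure in mmHg",
     "rule": lambda v: 50 <= v <= 300},
]


def check_record(record, dictionary):
    """Return a list of problems found when checking one record against the dictionary."""
    problems = []
    for entry in dictionary:
        name = entry["field"]
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, entry["type"]):
            problems.append(f"{name}: expected {entry['type'].__name__}")
        elif not entry["rule"](value):
            problems.append(f"{name}: fails validation rule")
    return problems


good = check_record({"subject_id": 1, "sex": "F", "sbp_mmhg": 128}, data_dictionary)
bad = check_record({"subject_id": 2, "sex": "X", "sbp_mmhg": 128}, data_dictionary)
print(good, bad)
```

Keeping the definitions and rules in one place means the people entering data and the people analyzing it read the same description of each field.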
Demonstration: Data Validation Disallowed values Duplicate primary keys (SubjectDemo)
(Subject)
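The two validation failures demonstrated here, disallowed values and duplicate primary keys, can be reproduced with Python's built-in `sqlite3` module. This is only a sketch: the demonstration itself used Access, and the column names and allowed ranges below are assumptions. Only the table name Subject comes from the slide.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical columns. The PRIMARY KEY rejects duplicate subject IDs,
# and the CHECK constraints reject disallowed values.
conn.execute("""
    CREATE TABLE Subject (
        SubjectID INTEGER PRIMARY KEY,
        Sex       TEXT NOT NULL CHECK (Sex IN ('F', 'M')),
        SBP       INTEGER CHECK (SBP BETWEEN 50 AND 300)
    )
""")


def try_insert(row):
    """Attempt to insert one row; report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Subject (SubjectID, Sex, SBP) VALUES (?, ?, ?)", row
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert((1, "F", 128)),  # valid row
    try_insert((1, "M", 140)),  # duplicate primary key
    try_insert((2, "X", 120)),  # disallowed value for Sex
]
row_count = conn.execute("SELECT COUNT(*) FROM Subject").fetchone()[0]
print(results, row_count)
```

Because the rules live in the table definition, bad rows are refused at entry time instead of being discovered during analysis.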
Demonstration: Same Table in Excel, Stata, etc. Rows = Records = Entities; Columns = Fields = Attributes. Access and Stata have a special row at the top for column headings (= field names); Excel just uses the first row.
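A common way to move the same table between programs is a CSV file whose first row holds the field names. The sketch below uses Python's standard `csv` module with hypothetical field names and values; it writes to an in-memory buffer rather than a real file.

```python
import csv
import io

# Hypothetical records; each dict is one row of the data table.
rows = [
    {"subject_id": 1, "sex": "F", "sbp_mmhg": 128},
    {"subject_id": 2, "sex": "M", "sbp_mmhg": 141},
]

# Write the table as CSV text: the header row carries the field names.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["subject_id", "sex", "sbp_mmhg"])
writer.writeheader()
writer.writerows(rows)
text = buffer.getvalue()

header_line = text.splitlines()[0]

# Read it back: DictReader treats the first row as column headings.
# Note that every value comes back as a string, so data types must be restored.
reloaded = list(csv.DictReader(io.StringIO(text)))
print(header_line, reloaded)
```

The loss of data types on the round trip is one reason a data dictionary should travel with the exported file.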
Data Collection Forms: Nothing focuses the study like the need to develop a data collection form. It forces you to get concrete about your predictors, outcomes, and other measurements.
Planning Data Collection and Management: Search the internet and ask other researchers for already developed data collection forms. Draft your data collection form. Test your data collection form with dummy subjects and, even better, with real (de-identified) study subjects. Enter your test data into a data table with rows corresponding to subjects and columns corresponding to data elements. (Use Excel, Access, Stata, or even Word.) Create a data dictionary. Decide who will collect the data, and when/how the data will be collected.
NHAMCS nhamcs100-ED_2007.pdf
From Paper Data Forms to Data Table(s)*: transcription directly into the table(s); transcription via an online (screen) form; scanning using optical mark recognition (OMR) software. *Best option: Don’t use paper data collection forms at all.
Creating On-Screen Forms: Microsoft Access, FileMaker Pro, REDCap, QuesGen (today), SurveyMonkey, others
On-Screen Data Collection Forms: Demonstrate the QuesGen web form. This is easy to do in SurveyMonkey and REDCap too. Access and FileMaker Pro can both create nice on-screen forms, but those forms are not as easy to make available and secure via the Internet.
Clinical research studies consistently under-plan and under-budget for data management. One FTE research assistant, including benefits, will cost your study ~$100k/year; you should be willing to spend one tenth as much (~$10k/year) for consulting on your database system. Don’t leave data checking, validation, and cleaning for the biostatistician: a) it will be too late; b) biostatisticians do not view data cleaning as part of their job.
Assignment? Write a one-page data management section for your research study protocol. Draft your data collection forms!