Identifying Problem Sources at Data Entry and Collection
National Center for Immunization & Respiratory Diseases, Influenza Division
Nishan Ahmed
Regional Training Workshop on Influenza Data Management
Phnom Penh, Cambodia
July 27 – August 2, 2013
Methods to Identify Data Problems
  Data collection
    Review of paper form for completeness
    Review of key fields for validation
    Sign off by data collector and reviewer
  Data entry
    Double data entry
    Built-in checks at the data entry level
      Field validation rules
      Keeping data consistent across the record
Data Collection
  Review of paper form (onsite)
    Data collected on form
    Form is reviewed by a second person for completeness
    Form sent to central location for entry into data management system
  Review of key indicators (onsite)
    Data collected on form
    Form reviewed by a second person for accuracy and completeness of key indicators
      Date of birth vs. date of admission
      Gender vs. pregnancy
      Temperature falls in pre-determined acceptable range
  Sign off by data collector and reviewer
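
The onsite review above could also be mirrored in software once a form has been keyed. The following is a minimal sketch, not part of the original slides: the field names, the sign-off fields, and the sample form are all hypothetical, and only the completeness check and the date of birth vs. date of admission check are shown.

```python
from datetime import date

# Hypothetical field names; a real form would define its own.
REQUIRED_FIELDS = ["patient_id", "date_of_birth", "date_of_admission",
                   "gender", "temperature_c"]
SIGN_OFF_FIELDS = ["collector_initials", "reviewer_initials"]


def review_form(form):
    """Return a list of problems a second reviewer would flag on one form."""
    problems = []

    # Completeness: every required field and both sign-offs must be filled in.
    for field in REQUIRED_FIELDS + SIGN_OFF_FIELDS:
        value = form.get(field)
        if value is None or value == "":
            problems.append(f"missing: {field}")

    # Key indicator: date of birth cannot be later than date of admission.
    dob = form.get("date_of_birth")
    admitted = form.get("date_of_admission")
    if isinstance(dob, date) and isinstance(admitted, date) and dob > admitted:
        problems.append("date_of_birth is after date_of_admission")

    return problems


# Made-up example form: the reviewer has not signed and the dates are swapped.
form = {
    "patient_id": "EXAMPLE-1",
    "date_of_birth": date(2013, 8, 1),
    "date_of_admission": date(2013, 7, 28),
    "gender": "F",
    "temperature_c": 38.2,
    "collector_initials": "AB",
    "reviewer_initials": "",
}
problems = review_form(form)
for problem in problems:
    print(problem)
```

A form that returns an empty list would be ready to send on for entry; anything else goes back to the data collector.
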
Data Entry
Double Data Entry
  Pros
    All data are entered twice for ease of comparison
    ACCESS: programmed, computer-run check for inconsistencies between the two entries
    Useful in picking out keystroke mistakes
  Cons
    EXCEL: requires several steps to review and validate
    Not useful when dealing with systematic errors or incorrect measurements
    Time-consuming, and therefore costly, procedure
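
The comparison step of double data entry can be sketched as follows. This is an illustration only, not the ACCESS check referred to above: it assumes each keying pass is held as a dictionary of records keyed by a record ID, and the field names and values are made up.

```python
def compare_entries(first, second):
    """Compare two keyings of the same forms.

    Both arguments map record ID -> {field name: value}.
    Returns a list of (record_id, field, first_value, second_value) tuples,
    one per disagreement between the two passes.
    """
    mismatches = []
    for record_id in sorted(set(first) | set(second)):
        a = first.get(record_id)
        b = second.get(record_id)
        if a is None or b is None:
            # The record was keyed in only one of the two passes.
            mismatches.append((record_id, "<record>", a is not None, b is not None))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches


# Made-up example: the second pass transposed two digits of the temperature.
first_pass = {
    "001": {"gender": "F", "temperature_c": "38.5"},
    "002": {"gender": "M", "temperature_c": "37.0"},
}
second_pass = {
    "001": {"gender": "F", "temperature_c": "35.8"},
    "002": {"gender": "M", "temperature_c": "37.0"},
}
mismatches = compare_entries(first_pass, second_pass)
print(mismatches)
```

Each mismatch is resolved by going back to the paper form. If both passes agree on a value that was recorded wrongly on the form, no mismatch is reported, which is why the method does not help with systematic errors or incorrect measurements.
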
Data Entry
Built-in checks may include:
  Field validation rules
    Date validations
      Date of onset should be on or before the date of specimen collection, date of consultation, or date of admission
      Date of sample collection should be before date received at laboratory
    Other validity checks
      Temperature should be a valid measured temperature (i.e., between 35°C and 41°C)
      Pregnancy status should only be "yes" if patient is female and of childbearing age
      Test results should be consistent with the type of test performed (e.g., a rapid test will not yield Influenza A subtyping results)
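
The rules listed above can be written down as a single record-level check. The sketch below is one possible rendering under stated assumptions: the field names are hypothetical, the age range used for "childbearing age" is a placeholder that each program would set for itself, and a specimen received on the same day it was collected is allowed.

```python
from datetime import date

# Temperature limits come from the rule above; the age range is an assumed
# placeholder, not a value given in the slides.
TEMP_MIN_C, TEMP_MAX_C = 35.0, 41.0
CHILDBEARING_AGE = range(15, 50)  # whole years, assumption


def validate_record(rec):
    """Return a list of rule violations for one record (empty if none)."""
    errors = []

    # Date of onset must be on or before each of these dates, when present.
    onset = rec.get("date_of_onset")
    for other in ("date_of_specimen_collection", "date_of_consultation",
                  "date_of_admission"):
        value = rec.get(other)
        if onset is not None and value is not None and onset > value:
            errors.append(f"date_of_onset is after {other}")

    # Specimen cannot arrive at the laboratory before it was collected.
    collected = rec.get("date_of_specimen_collection")
    received = rec.get("date_received_at_lab")
    if collected is not None and received is not None and received < collected:
        errors.append("date_received_at_lab is before date_of_specimen_collection")

    temp = rec.get("temperature_c")
    if temp is not None and not TEMP_MIN_C <= temp <= TEMP_MAX_C:
        errors.append("temperature_c is outside the accepted range")

    if rec.get("pregnant") == "yes":
        if rec.get("gender") != "F" or rec.get("age_years") not in CHILDBEARING_AGE:
            errors.append("pregnant is 'yes' but patient is not a female of childbearing age")

    # A rapid test should not carry an Influenza A subtype result.
    if rec.get("test_type") == "rapid" and rec.get("influenza_a_subtype"):
        errors.append("rapid test cannot report an Influenza A subtype")

    return errors


# Made-up record that breaks four of the rules.
record = {
    "date_of_onset": date(2013, 7, 30),
    "date_of_consultation": date(2013, 7, 29),
    "date_of_specimen_collection": date(2013, 7, 31),
    "date_received_at_lab": date(2013, 7, 31),
    "temperature_c": 42.0,
    "gender": "M",
    "age_years": 30,
    "pregnant": "yes",
    "test_type": "rapid",
    "influenza_a_subtype": "H3N2",
}
errors = validate_record(record)
for error in errors:
    print(error)
```
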
Data Entry
Built-in checks may include:
  Forced consistency across fields
    Forced: the data entry screen will not let you proceed with incorrect data
    Voluntary: gives a warning that the value entered may be wrong but will let you continue
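
The difference between a forced and a voluntary check can be sketched as two kinds of issue with different consequences. The example below is illustrative only: which rule is treated as forced and which as voluntary is an assumption made here, and the field names are hypothetical.

```python
from datetime import date


def check_entry(rec):
    """Return a list of (kind, message) pairs; kind is 'forced' or 'voluntary'."""
    issues = []

    # Assumed forced rule: onset cannot be after consultation.
    onset = rec.get("date_of_onset")
    consult = rec.get("date_of_consultation")
    if onset is not None and consult is not None and onset > consult:
        issues.append(("forced", "date_of_onset is after date_of_consultation"))

    # Assumed voluntary rule: an unusual temperature is possible, so only warn.
    temp = rec.get("temperature_c")
    if temp is not None and not 35.0 <= temp <= 41.0:
        issues.append(("voluntary", "temperature_c is outside 35-41; please confirm"))

    return issues


def can_proceed(issues, confirmed=False):
    """Forced issues always block; voluntary ones block until the user confirms."""
    if any(kind == "forced" for kind, _ in issues):
        return False
    if any(kind == "voluntary" for kind, _ in issues):
        return confirmed
    return True


# Made-up entry with a plausible date order but an unusual temperature.
entry = {
    "date_of_onset": date(2013, 7, 28),
    "date_of_consultation": date(2013, 7, 29),
    "temperature_c": 41.5,
}
issues = check_entry(entry)
print(issues)
print(can_proceed(issues))                  # blocked until confirmed
print(can_proceed(issues, confirmed=True))  # user accepted the warning
```
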
Data Entry Setting Up Field Validation Rules On Tables - ACCESS
Data Entry Macros to Ensure Data Consistency Across Fields on Forms - ACCESS
Data Entry
Built-in checks may include:
  System queries at site to aid in fixing data errors immediately
    Contain additional data validation criteria
    Additional validations that might be important
    For example: a query might check that dates seem reasonable for the time period, or pull records whose dates are too far apart (e.g., date of onset is more than a week from the date of consultation)
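
A system query of the kind described above can be sketched with SQL. The example below uses Python's built-in sqlite3 module and an in-memory database so that it runs anywhere; it is not ACCESS syntax, and the table name, column names, and rows are made up for illustration. Dates are assumed to be stored as ISO text (YYYY-MM-DD).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases ("
    " case_id TEXT PRIMARY KEY,"
    " date_of_onset TEXT,"
    " date_of_consultation TEXT)"
)
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?)",
    [
        ("A01", "2013-07-01", "2013-07-03"),  # 2 days apart: fine
        ("A02", "2013-06-01", "2013-07-03"),  # onset a month before consultation
        ("A03", "2013-07-10", "2013-07-02"),  # onset 8 days after consultation
    ],
)

# Pull every case where onset and consultation are more than 7 days apart,
# in either direction, so the site can check these against the paper forms.
QUERY = """
SELECT case_id,
       date_of_onset,
       date_of_consultation,
       julianday(date_of_consultation) - julianday(date_of_onset) AS days_apart
FROM cases
WHERE ABS(julianday(date_of_consultation) - julianday(date_of_onset)) > 7
ORDER BY case_id
"""
flagged = conn.execute(QUERY).fetchall()
for row in flagged:
    print(row)
```

Unlike a validation rule, a query like this does not stop data entry. It produces a list for follow-up, so the threshold can be loose without getting in the way of the person keying the data.
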
Data Entry Setting up System Queries - ACCESS
Final Thoughts
There are many different ways to ensure data quality at the data entry and data collection level
  Double data entry
    Good for finding mis-keyed values
  Built-in checks at data entry
    Can include both single-field validations and controls to keep data consistent across fields
  System queries
    For systematic data checks and potential error identification
Use what works for you and your data process
Exercise
Objective: Create data validation controls/data checks for your system
  1) Create built-in data checks in your database (e.g., date validations)
  2) Create system queries so that sites can assist in the data cleaning process
For more information please contact Centers for Disease Control and Prevention, 1600 Clifton Road NE, Atlanta, GA
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
THANK YOU!!!