Data Preparation Dr. Michael R. Hyman, NMSU
File, Record, and Field
Data Matrix
Data Entry: the process of transferring data from research projects to computers.
Five Steps for Data Preparation: (1) validation; (2) editing; (3) coding; (4) data entry/transcription; (5) machine cleaning of data.
Validation: check that interviews were conducted as specified. Ensure that: the respondent was qualified; the interviewer looked and acted professionally; the interview was conducted in a proper environment; all appropriate questions were asked.
Editing: Personal Interviews. Check for: omissions; ambiguities; inconsistencies; proper skip patterns; properly recorded answers, especially to open-ended questions.
Editing: Self-Administered Questionnaires. Check that: all questionnaire sections and key questions were answered; respondents understood the instructions and took the task seriously; no pages are missing; the questionnaire was returned before the cutoff date.
Solutions for Editing Problems: re-contact the respondent; discard the questionnaire; use only the good items. Data analysis implications are beyond the scope of this class.
Coding: the process of grouping the different responses to a question and assigning numeric codes to them. Closed-ended questions are easier to code because they are pre-coded.
Pre-coding Example
Coding an Open-Ended Question: (1) generate a list of responses; (2) consolidate the responses (requires subjective judgment); (3) set response category codes; (4) assign each individual response to a category and record the associated numeric code.
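The four steps for coding an open-ended question can be sketched in a few lines of Python. This is only an illustration: the category names, numeric codes, and keyword list are hypothetical, and simple keyword matching stands in for the subjective judgment a human coder would apply when consolidating responses.

```python
# Hypothetical code book: consolidated response categories and their numeric codes.
CODE_BOOK = {"price": 1, "quality": 2, "service": 3, "other": 9}

# Hypothetical consolidation rules: phrases found in raw answers -> category.
# In practice a human coder makes this judgment.
KEYWORDS = {
    "cheap": "price",
    "expensive": "price",
    "durable": "quality",
    "friendly staff": "service",
}


def code_response(text):
    """Return the numeric code for one verbatim open-ended answer."""
    lowered = text.lower()
    for keyword, category in KEYWORDS.items():
        if keyword in lowered:
            return CODE_BOOK[category]
    # Anything that matches no category is coded as "other".
    return CODE_BOOK["other"]


responses = ["Too expensive", "Very durable", "Friendly staff", "No opinion"]
codes = [code_response(r) for r in responses]
print(codes)
```

Only the numeric codes are entered into the data matrix; the code book records what each number means.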
Portion of Travel Study Code Book
Data Entry Process: validated, edited, and coded questionnaires are given to a data entry operator. It is more accurate and efficient to go directly from the questionnaire to the data entry device and storage medium, skipping coding sheets.
Data Transcription
Intelligent Data Entry: checking entered data for internal logic, done by either the data entry device or another connected device. Excel/Quattro and SPSS rely on dumb data entry, so data entered with them require cleaning afterward.
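A rough sketch of what an internal-logic check at entry time might look like follows. The field names (age, owns_car, car_brand), the valid ranges, and the skip rule are all hypothetical and serve only to show the kinds of checks involved: a range check, a valid-code check, and a skip-pattern check.

```python
def check_record(record):
    """Return a list of problems found in one entered record (empty if none)."""
    problems = []

    # Range check: hypothetical rule that respondents are aged 18 to 99.
    if not 18 <= record["age"] <= 99:
        problems.append("age out of range")

    # Valid-code check: hypothetical coding 1 = yes, 2 = no.
    if record["owns_car"] not in (1, 2):
        problems.append("owns_car has invalid code")
    # Skip-pattern check: non-owners should have skipped the brand question.
    elif record["owns_car"] == 2 and record["car_brand"] is not None:
        problems.append("car_brand answered despite skip")

    return problems


good = {"age": 34, "owns_car": 1, "car_brand": 3}
bad = {"age": 340, "owns_car": 2, "car_brand": 3}
print(check_record(good))
print(check_record(bad))
```

An intelligent entry device would run checks like these as each record is keyed and reject or flag the record immediately, rather than leaving the errors for a later cleaning pass.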
Machine Cleaning of Data: a computerized error check that identifies and suggests fixes for logical errors. Marginal report: a computer-generated table of response frequencies for each question, used to monitor entry of valid codes and skip patterns.
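A marginal report is essentially a frequency count per question. The sketch below builds one for a single hypothetical question (q5) whose valid codes are assumed to be 1 through 5, and flags any entered code outside that set. The data values are made up for illustration.

```python
from collections import Counter

# Hypothetical valid codes for question q5, and hypothetical entered data.
VALID_CODES = {1, 2, 3, 4, 5}
q5 = [1, 2, 2, 5, 7, 3, 2, None, 5]  # None marks a blank (missing) entry


def marginal_report(values):
    """Return a frequency table (code -> count) for one question."""
    return Counter(values)


report = marginal_report(q5)

# Codes that appear in the data but are not in the code book.
invalid = sorted(
    code for code in report if code is not None and code not in VALID_CODES
)

# Print the table with blanks listed last and invalid codes flagged.
for code, count in sorted(report.items(), key=lambda kv: (kv[0] is None, kv[0] or 0)):
    flag = "  <-- invalid code" if code in invalid else ""
    print(f"{code!s:>5}: {count}{flag}")
```

Here the code 7 stands out in the table, which tells the analyst to go back to the original questionnaire for that record.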
Machine Cleaning Instructions
Recoding Data
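Recoding Data: using computers to convert the original codes used for raw data into codes that are more suitable for analysis. Example: Var1 = 8 - Var1 (on a scale coded 1 to 7, this reverses the direction, so 1 becomes 7 and 7 becomes 1).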
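The recode Var1 = 8 - Var1 can be tried out directly. The sketch below assumes Var1 is measured on a seven-point scale coded 1 to 7 (the slide does not say so explicitly); the function name and the scale_max parameter are illustrative.

```python
def reverse_code(value, scale_max=7):
    """Reverse a code on a 1..scale_max scale: new = (scale_max + 1) - old."""
    return (scale_max + 1) - value


original = [1, 2, 3, 4, 5, 6, 7]
recoded = [reverse_code(v) for v in original]
print(recoded)
```

Reverse coding of this kind is typically applied to negatively worded items so that a high number means the same thing on every item in a scale.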
Collapsing a Five-Point Likert Scale
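Collapsing is another form of recoding. As a minimal sketch, assume the five-point scale is coded 1 (strongly disagree) to 5 (strongly agree) and is collapsed into three categories; this particular mapping is an assumption for illustration, not necessarily the one used in the original example.

```python
# Assumed mapping from five original codes to three collapsed codes:
# 1, 2 -> 1 (disagree); 3 -> 2 (neutral); 4, 5 -> 3 (agree).
COLLAPSE_MAP = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def collapse(value):
    """Map a five-point Likert code to its three-category code."""
    return COLLAPSE_MAP[value]


answers = [1, 2, 3, 4, 5, 5, 2]
collapsed = [collapse(a) for a in answers]
print(collapsed)
```

Using a lookup table rather than arithmetic means an out-of-range code raises a KeyError instead of being silently recoded.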
Coping with Missing Data
Item Non-response to Questions of Fact
Ways to Handle Missing Responses: leave blank; case-wise deletion; pair-wise deletion; mean response; imputed response.
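Three of these options can be compared on a small made-up data set. In the sketch below, None marks a missing response; the variable names (q1, q2, q3), the data values, and the function names are hypothetical.

```python
from statistics import mean

# Hypothetical data matrix: one dict per respondent, None = missing response.
rows = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": 2, "q2": None, "q3": 4},
    {"q1": None, "q2": 3, "q3": 5},
    {"q1": 5, "q2": 4, "q3": 2},
]


def casewise(rows):
    """Case-wise deletion: drop any respondent with at least one missing item."""
    return [r for r in rows if all(v is not None for v in r.values())]


def pairwise(rows, a, b):
    """Pair-wise deletion: for an analysis of a and b, keep respondents who answered both."""
    return [(r[a], r[b]) for r in rows if r[a] is not None and r[b] is not None]


def mean_substitute(rows, var):
    """Mean response: replace missing values of var with the mean of observed values."""
    fill = mean(r[var] for r in rows if r[var] is not None)
    return [r[var] if r[var] is not None else fill for r in rows]


print(len(casewise(rows)))          # respondents left after case-wise deletion
print(pairwise(rows, "q1", "q3"))   # pairs usable for a q1-q3 analysis
print(mean_substitute(rows, "q2"))  # q2 with the gap filled by the mean
```

Case-wise deletion keeps only two of the four respondents, whereas pair-wise deletion keeps three for the q1-q3 analysis, which shows how the choice changes the usable sample size.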