Survey Methodology Survey data entry/cleaning EPID 626 Lecture 10.

To do or not to do: Contracting the work
During study planning, you should decide whether to do the data entry, management, and analysis yourself or to contract with someone else to do it.
What are the advantages and disadvantages?
When might you want to?
When might you not want to?

Contracting
Advantages
  – Specialized expertise
  – Potential ability to access a national network of personnel
  – Reduction of load on study personnel
  – A third party (without a financial or professional stake in the results) increases the legitimacy of the results

Contracting
Disadvantages
  – Generally more expensive
      Is this true?
      Discuss profits vs. expertise and efficiency
  – Lose direct control over quality of data and study conduct
  – May be more difficult to interpret data without having done the analysis

DIY: Now what?
Data analysis plan
Data entry
Data diagnostics
Data cleaning
Data setup

Data analysis plan (DAP)
Design from the protocol and the survey instrument
  – Note: they may be discrepant
Aim:
  – Resolve discrepancies before you start working with the data
  – Establish a clear plan for data management and analysis

DAP elements
Summarize methods
For each survey objective, identify and describe the relevant variables
Identify the analysis methods
  – Software
  – Statistical methods, tests, significance levels, definitions

DAP elements (2)
Describe plan for handling:
  – missing values
  – out-of-range values
  – zeros if doing log transformations
  – data collapsing
Describe subgroup or by-group analyses
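
As an illustration of how such handling rules might be written down unambiguously, here is a minimal Python sketch. The missing-value codes, the valid range for `age`, and the choice of adding an offset before taking logs are all assumptions made for this example; a real DAP would state its own rules per variable.

```python
import math

# Hypothetical codes and limits; a real DAP would list these per variable.
MISSING_CODES = {-9, 999}
VALID_RANGE = {"age": (18, 99)}

def classify(variable, value):
    """Return 'missing', 'out_of_range', or 'ok' for one raw value."""
    if value is None or value in MISSING_CODES:
        return "missing"
    low, high = VALID_RANGE[variable]
    if not (low <= value <= high):
        return "out_of_range"
    return "ok"

def log_transform(value, offset=1.0):
    """Natural log of (value + offset).

    Adding an offset is only one possible way to deal with zeros;
    whichever rule is used should be written into the DAP.
    """
    if value is None:
        return None
    if value + offset <= 0:
        raise ValueError("value + offset must be positive")
    return math.log(value + offset)

ages = [34, -9, 150, None, 18]
statuses = [classify("age", a) for a in ages]
print(statuses)
```

Writing the rules as code before the data arrive forces decisions (which codes mean missing, which limits apply) that are otherwise easy to postpone.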

DAP elements (3)
Set up dummy tables and graphs
Review this DAP carefully and pass it around

Data entry
Design a database that resembles the survey instrument in layout and format
Pretest it extensively
The designer should be present at the beginning of data entry to fix bugs
Double data entry?
Avoid the need for interpretation by entry personnel
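
If double data entry is used, the two passes have to be compared field by field. The following Python sketch shows one way this could be done; the record layout (dictionaries keyed by a record ID) and the function name `compare_entries` are assumptions for illustration only.

```python
def compare_entries(first_pass, second_pass):
    """Return (record_id, field, value_1, value_2) for every disagreement."""
    discrepancies = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        rec1 = first_pass.get(record_id)
        rec2 = second_pass.get(record_id)
        if rec1 is None or rec2 is None:
            # The record was keyed in only one of the two passes.
            discrepancies.append(
                (record_id, "<record>", rec1 is not None, rec2 is not None)
            )
            continue
        for field in sorted(set(rec1) | set(rec2)):
            if rec1.get(field) != rec2.get(field):
                discrepancies.append(
                    (record_id, field, rec1.get(field), rec2.get(field))
                )
    return discrepancies

# Hypothetical example: the same two forms keyed by two operators.
entry_a = {"001": {"sex": "F", "age": "34"}, "002": {"sex": "M", "age": "51"}}
entry_b = {"001": {"sex": "F", "age": "43"}, "002": {"sex": "M", "age": "51"}}
diffs = compare_entries(entry_a, entry_b)
print(diffs)
```

Each discrepancy would then be resolved against the paper form rather than by guessing which operator was right.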

You and Your Data
Your first eight hours together

First things first
Virus-check the files
Write-protect the original data
Back up files and CRFs
  – On-site: hard drives, diskettes, safes
  – Off-site: safe deposit box
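
The write-protect and back-up steps can be scripted so that they are done the same way every time. This is a minimal sketch using only the Python standard library; the file name `survey_raw.csv` and the backup folder are hypothetical, and the checksum comparison is an added safeguard that the slides do not mention.

```python
import hashlib
import os
import shutil
import stat
import tempfile

def sha256_of(path):
    """Checksum used to confirm a backup matches the original byte for byte."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def protect_and_back_up(original, backup_dir):
    """Copy the file, verify the copy, then mark the original read-only."""
    os.makedirs(backup_dir, exist_ok=True)
    backup = os.path.join(backup_dir, os.path.basename(original))
    shutil.copy2(original, backup)
    if sha256_of(original) != sha256_of(backup):
        raise IOError("backup does not match original")
    os.chmod(original, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return backup

# Demonstration on a throwaway file in a temporary directory.
workdir = tempfile.mkdtemp()
raw_path = os.path.join(workdir, "survey_raw.csv")
with open(raw_path, "w") as handle:
    handle.write("id,sex,age\n001,F,34\n")
backup_path = protect_and_back_up(raw_path, os.path.join(workdir, "backup"))
print(backup_path)
```

An off-site copy would still need to be made separately; the script only covers the on-site step.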

First things first (2)
Import data
  – Error prone; be very careful here
Validate and verify the data

Validating and verifying data
Run frequencies for categorical variables
Run univariate statistics for continuous variables
Examine key variables (those used in the evaluation of primary objectives)
Look at variables by group (sex, age, etc.)
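
These first checks are normally run in a statistics package, but the idea can be sketched in a few lines of plain Python. The records and variable names below are made up for illustration.

```python
from collections import Counter, defaultdict
import statistics

# Hypothetical records; the last age is a likely keying error.
records = [
    {"sex": "F", "age": 34},
    {"sex": "M", "age": 51},
    {"sex": "F", "age": 29},
    {"sex": "F", "age": 430},
]

def frequencies(rows, variable):
    """Count how often each value of a categorical variable occurs."""
    return Counter(row[variable] for row in rows)

def univariate(values):
    """Basic summary statistics for a continuous variable."""
    values = list(values)
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

def by_group(rows, group_var, value_var):
    """Summarize a continuous variable within each level of a grouping variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_var]].append(row[value_var])
    return {group: univariate(vals) for group, vals in groups.items()}

sex_counts = frequencies(records, "sex")
age_summary = univariate(r["age"] for r in records)
age_by_sex = by_group(records, "sex", "age")
print(sex_counts, age_summary, age_by_sex)
```

The implausible maximum age stands out immediately in the summary, which is the point of running these statistics before any analysis.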

Validating and verifying data (2)
Recode missing values
Calculate checks for error-prone variables
  – Ex. Check dates against time-to variables
  – Check anything that the interviewer had to calculate, such as a total score
Derive any key variables that need to be calculated from other variables, and verify them too
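
Both example checks amount to recomputing a value and comparing it with what was recorded. A minimal Python sketch, with hypothetical field names and values:

```python
from datetime import date

def check_time_to(start, event, recorded_days):
    """True when the recorded interval equals the difference between the dates."""
    return (event - start).days == recorded_days

def check_total(item_scores, recorded_total):
    """True when the interviewer's total equals the sum of the item scores."""
    return sum(item_scores) == recorded_total

# Hypothetical record: the dates agree, the hand-calculated total does not.
record = {
    "enrolled": date(2004, 3, 1),
    "follow_up": date(2004, 3, 31),
    "days_to_follow_up": 30,
    "items": [2, 3, 1, 4],
    "total_score": 11,
}

date_ok = check_time_to(
    record["enrolled"], record["follow_up"], record["days_to_follow_up"]
)
total_ok = check_total(record["items"], record["total_score"])
print(date_ok, total_ok)
```

A failed check flags the record for review; it does not say which of the two values is wrong.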

Validating and verifying data (3)
Rearrange, combine, or separate datasets as needed for analysis
  – Ex. Split demographic data, primary outcome data, and secondary outcome data
Annotate a survey instrument with variable names
Create a data dictionary
  – Include variable name, type, length, and description or label
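
A first draft of the data dictionary can be generated from the data, leaving only the labels to be written by hand. The sketch below assumes records held as Python dictionaries; the variables and labels are hypothetical.

```python
def build_data_dictionary(rows, labels):
    """List name, type, length, and label for every variable in the records."""
    dictionary = []
    for name in rows[0]:
        values = [row[name] for row in rows if row[name] is not None]
        types = sorted({type(v).__name__ for v in values})
        # Length is the widest value when written out as text.
        length = max((len(str(v)) for v in values), default=0)
        dictionary.append({
            "name": name,
            "type": "/".join(types) or "unknown",
            "length": length,
            "label": labels.get(name, ""),
        })
    return dictionary

# Hypothetical data and labels.
rows = [
    {"id": "001", "sex": "F", "age": 34},
    {"id": "002", "sex": "M", "age": None},
]
labels = {
    "id": "Respondent identifier",
    "sex": "Sex (F/M)",
    "age": "Age in years at interview",
}
data_dictionary = build_data_dictionary(rows, labels)
for entry in data_dictionary:
    print(entry)
```

A variable that shows more than one type (for example `int/str`) is itself a sign that the column needs cleaning.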

Validating and verifying data (4)
Look for obvious errors
  – Ex. Spelling of medication or medical condition
  – Be very careful about correcting them
  – Document any changes
  – Think about a query system
  – May need interviewer to resolve errors
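
One way to make sure every correction is documented is to route all changes through a single function that writes to a change log. This Python sketch is an assumption about how that could look; the field names, the reason text, and the user name are hypothetical.

```python
import copy
from datetime import date

def apply_correction(rows, change_log, record_id, field, new_value,
                     reason, changed_by):
    """Change one field on one record and append an entry to the change log."""
    for row in rows:
        if row["id"] == record_id:
            change_log.append({
                "date": date.today().isoformat(),
                "id": record_id,
                "field": field,
                "old": row[field],
                "new": new_value,
                "reason": reason,
                "changed_by": changed_by,
            })
            row[field] = new_value
            return
    raise KeyError(f"no record with id {record_id!r}")

original = [{"id": "001", "medication": "ibuprofin"}]
working = copy.deepcopy(original)  # corrections go to a copy, not the original
change_log = []
apply_correction(working, change_log, "001", "medication", "ibuprofen",
                 "spelling confirmed with interviewer", "hypothetical_user")
print(change_log)
```

Because the old value is kept in the log, any correction can be reviewed or reversed later.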

Validating and verifying data (5)
Run rough crosstabs for reference
  – Ex. Number by sex, group, and age
  – Use to track observations
Create data listings
  – Very useful for reference and to identify problems in the data
Check data coming from different sources
  – Be very careful with merging
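
Merging is where observations are most easily lost or duplicated, so it helps to have the merge report what did not match. The sketch below is one possible approach in plain Python; the datasets, the `id` key, and the function names are assumptions for illustration.

```python
from collections import Counter

def merge_by_id(left, right, key="id"):
    """Join two lists of records on key and report IDs found on one side only."""
    left_ids = Counter(row[key] for row in left)
    right_ids = Counter(row[key] for row in right)
    duplicates = sorted(
        i for counts in (left_ids, right_ids)
        for i, n in counts.items() if n > 1
    )
    if duplicates:
        raise ValueError(f"duplicate IDs must be resolved first: {duplicates}")
    right_index = {row[key]: row for row in right}
    merged = [
        {**row, **right_index[row[key]]}
        for row in left if row[key] in right_index
    ]
    left_only = sorted(set(left_ids) - set(right_ids))
    right_only = sorted(set(right_ids) - set(left_ids))
    return merged, left_only, right_only

def crosstab(rows, row_var, col_var):
    """Rough two-way table of counts, keyed by (row value, column value)."""
    return Counter((row[row_var], row[col_var]) for row in rows)

# Hypothetical demographic and outcome files from different sources.
demographics = [
    {"id": "001", "sex": "F"},
    {"id": "002", "sex": "M"},
    {"id": "003", "sex": "F"},
]
outcomes = [
    {"id": "001", "group": "A"},
    {"id": "002", "group": "B"},
    {"id": "004", "group": "A"},
]
merged, left_only, right_only = merge_by_id(demographics, outcomes)
table = crosstab(merged, "sex", "group")
print(len(merged), left_only, right_only, dict(table))
```

Comparing the crosstab counts before and after the merge is a quick way to track whether any observations went missing.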

Validating and verifying data (6)
Aside: Variable naming
  – Should be meaningful and descriptive
  – But be careful about overly descriptive names
      Long variable names are difficult to manipulate
      If the meaning appears obvious, people won’t look it up
Back all of this up in the same way you backed up the original data