Quantitative Data Preparation Louise Corti ESDS/ UKDA Social Science Data Archives for Social Historians: creating, depositing and using qualitative data.


Quantitative Data Preparation Louise Corti ESDS/ UKDA Social Science Data Archives for Social Historians: creating, depositing and using qualitative data 25 November 2003

What characterises a good quantitative dataset?
- accurate data
- well labelled data
- well documented data
- data that can be stored in user-friendly dissemination formats, but can also be archived in a future-proof preservation format

Accuracy of data: validation checks
Computer aided surveys (CAPI, CATI or CAWI)
- these are the most accurate way of gathering survey data, but the software (e.g. Blaise) and hardware (e.g. a laptop for every interviewer) may be beyond project resources
- computer aided surveys allow one to build in as many logical checks (on question routing and on responses) as possible at the point of data creation
Non computer aided surveys
- less control over initial responses, but checks can be performed:
  – at the point of data entry/transcription, if data entry software is used. However, there are few cheap data entry packages around
  – after data entry: the only feasible option may be to enter data without checks directly into a spreadsheet-style interface (e.g. Excel worksheet, SPSS data view) and perform validation checks afterwards, via command files in statistical packages or Visual Basic code in Excel or Access
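The post-entry checks described above could also be scripted outside a statistical package. The following is a minimal sketch in Python rather than an SPSS command file or Visual Basic; the routing question `q11a`, the not-applicable code `-1`, the 0 to 168 hour range and the sample records are all assumptions made for illustration, not part of any real questionnaire.

```python
# Hypothetical code lists; only the p1sex codes come from the slides.
VALID_SEX_CODES = {1, 2, -8, -9}
NOT_APPLICABLE = -1                      # assumed code for "question skipped"
MISSING_CODES = {NOT_APPLICABLE, -8, -9}


def validate_record(rec):
    """Return a list of problems found in one keyed-in record."""
    problems = []
    # Response check: categorical variable must use a defined code.
    if rec["p1sex"] not in VALID_SEX_CODES:
        problems.append("p1sex: undefined code")
    # Response check: a week has 168 hours, so larger values are errors.
    hours = rec["q11bhexc"]
    if hours not in MISSING_CODES and not 0 <= hours <= 168:
        problems.append("q11bhexc: outside 0-168 hours")
    # Routing check: assume q11a == 2 ("no exercise") skips q11b.
    if rec["q11a"] == 2 and hours != NOT_APPLICABLE:
        problems.append("q11bhexc: expected not-applicable code after q11a = 2")
    return problems


# Illustrative records, as they might be typed into a spreadsheet.
records = [
    {"id": 1, "p1sex": 1, "q11a": 1, "q11bhexc": 5},
    {"id": 2, "p1sex": 3, "q11a": 1, "q11bhexc": 200},
    {"id": 3, "p1sex": 2, "q11a": 2, "q11bhexc": 4},
]

report = {rec["id"]: validate_record(rec) for rec in records}
for rec_id, problems in report.items():
    for problem in problems:
        print(f"record {rec_id}: {problem}")
```

Keeping such a script alongside the data also documents which checks were actually run.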

An example of data seemingly untouched by the human eye:
Originating error in text variables:
- Occupation: sole trader
- Description of Occupation: purveyor of seafood
Propagated error in derived numeric variables:
- Respondent was coded under the Standard Industrial Classification (SIC) code relating to food retailers: 52.2 Retail sale of food, beverages and tobacco in specialised stores

Labelling of data
- all variables should be named. Variable names should not exceed 8 characters where possible, as the most common format for disseminating data is SPSS
- all variables should be labelled. Labels should be brief (preferably < 80 characters) but precise, and should always make explicit the unit of measurement for continuous (interval) variables. Where possible, all variable labels should reference the question number (and, if necessary, the questionnaire). For example, the variable q11bhexc might have the label "q11b: hours spent taking physical exercise in a typical week". This gives the unit of measurement and a reference to the question number (q11b), so the user can quickly and easily cross-reference to it
- for categorical variables, all codes (values) should be given a brief label (preferably < 60 characters). For example, p1sex (gender of person 1) might have these value labels: 1 = male, 2 = female, -8 = don't know, -9 = not answered
Where possible, all such labelling should be created and supplied to the UKDA as part of the data file itself. This is the expectation with data supplied in one of the three major statistical packages: SPSS, STATA or SAS.
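The labelling guidelines above lend themselves to an automated check before deposit. The sketch below assumes the variable metadata has already been pulled into a plain Python dictionary; the dictionary layout, the function name `check_labelling` and the `householdincome` variable are hypothetical, while the limits (8, 80 and 60 characters) are the ones given in the slide.

```python
# Hypothetical metadata layout: variable name -> label and value labels.
variables = {
    "q11bhexc": {
        "label": "q11b: hours spent taking physical exercise in a typical week",
        "values": {},
    },
    "p1sex": {
        "label": "gender of person 1",
        "values": {1: "male", 2: "female", -8: "don't know", -9: "not answered"},
    },
    "householdincome": {"label": "", "values": {}},  # deliberately bad example
}


def check_labelling(variables, max_name=8, max_label=80, max_value_label=60):
    """Return (variable, issue) pairs for names and labels outside the guidelines."""
    issues = []
    for name, meta in variables.items():
        if len(name) > max_name:
            issues.append((name, "name exceeds 8 characters"))
        label = meta.get("label", "")
        if not label:
            issues.append((name, "variable label missing"))
        elif len(label) >= max_label:
            issues.append((name, "variable label is 80 characters or longer"))
        for code, value_label in meta.get("values", {}).items():
            if len(value_label) >= max_value_label:
                issues.append((name, f"label for code {code} is 60 characters or longer"))
    return issues


issues = check_labelling(variables)
for name, issue in issues:
    print(f"{name}: {issue}")
```

The check only covers length and presence; whether a label states the unit of measurement still needs a human reader.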

Documentation
Core documentation:
- Questionnaire.
- Methodology: details of sample design, response rate, etc.
- Codebook, i.e. a comprehensive list of variable names, variable descriptions, code names and variable formatting information. This is essential if the package being used for data management does not allow this sort of variable and code labelling to be stored within the data file.
- Technical report describing the research project.
Other useful documentation that is seldom supplied:
- Code used to create derived variables or check data (e.g. SPSS, STATA or SAS command files).
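Where the data management package cannot store labels in the data file, a codebook of the kind listed above can be generated from whatever metadata is held separately. This is a rough sketch only: the metadata dictionary, the column headings and the `write_codebook` function are illustrative assumptions, and comma-separated text is used simply because it is a delimited format.

```python
import csv
import io

# Hypothetical metadata for two variables; "format" is a free-text description.
metadata = {
    "p1sex": {
        "label": "gender of person 1",
        "format": "numeric",
        "values": {1: "male", 2: "female", -8: "don't know", -9: "not answered"},
    },
    "q11bhexc": {
        "label": "q11b: hours spent taking physical exercise in a typical week",
        "format": "numeric",
        "values": {},
    },
}


def write_codebook(metadata, out):
    """Write one row per code, or a single row for variables without codes."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["variable", "label", "format", "code", "code_label"])
    for name, meta in metadata.items():
        if meta["values"]:
            for code, code_label in meta["values"].items():
                writer.writerow([name, meta["label"], meta["format"], code, code_label])
        else:
            writer.writerow([name, meta["label"], meta["format"], "", ""])


buffer = io.StringIO()
write_codebook(metadata, buffer)
codebook_lines = buffer.getvalue().splitlines()
print("\n".join(codebook_lines))
```

In practice the output would be written to a file and deposited together with the data.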

Good and bad data documentation formats
For full details for all types of data see:

Data held in a statistical package
- Preferred format(s): SPSS portable (.por) or system (.sav) file
- Acceptable format(s): STATA; SAS (with formats information); delimited text
- Problematic format(s): Fixed-width (undelimited) text format

Data held in a spreadsheet
- Preferred format(s): Delimited text (tab delimited or comma separated), Excel, Lotus
- Acceptable format(s): Quattro Pro
- Problematic format(s): (none listed)

Data held in a database
- Preferred format(s): Delimited text with SQL data definition statements, MS ACCESS, dBase, FoxPro, SIR export, XML
- Acceptable format(s): Filemaker Pro, Paradox
- Problematic format(s): Fixed-width (undelimited) text format

Documentation (e.g. questionnaires, codebooks, interviewers' instructions, project description, etc.)
- Preferred format(s): Microsoft Word, Adobe PDF, Rich text format (RTF)
- Acceptable format(s): SGML, HTML, XML, WordPerfect
- Problematic format(s): Hard copy (paper)
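For database-held data, the preferred combination of delimited text plus SQL data definition statements can be illustrated as follows. This sketch uses SQLite, via Python's standard library, purely as a stand-in for the database products named in the slide; the table `respondent`, its columns and the `export_table` helper are hypothetical.

```python
import csv
import io
import sqlite3

# Build a small in-memory database to export (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE respondent (id INTEGER PRIMARY KEY, p1sex INTEGER, q11bhexc INTEGER)"
)
conn.executemany(
    "INSERT INTO respondent VALUES (?, ?, ?)", [(1, 1, 5), (2, 2, -9)]
)


def export_table(conn, table):
    """Return (data definition statement, tab-delimited text) for one table."""
    # SQLite keeps the original CREATE TABLE text in its sqlite_master catalogue.
    ddl = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()[0]
    cursor = conn.execute(f'SELECT * FROM "{table}" ORDER BY 1')
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])  # header row
    writer.writerows(cursor)
    return ddl, out.getvalue()


ddl, delimited_text = export_table(conn, "respondent")
print(ddl)
print(delimited_text)
```

The data definition statement preserves column names and types, which the delimited text alone would lose.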