Good Data Management Practices Patty Glynn 10/31/05


Four Statistical Packages: SPSS, Stata, R, SAS

Three Ways to Work: Point and Click; Command Line; Programs (the best way)

Outline: Sermon on SYNTAX; Cleaning data and creating variables; Never overwrite original data; Practices that will help you keep track of your work; Safeguarding your work

A Sermon on SYNTAX
Command line and Point and Click
–Advantages: Quick, may require less learning
–Disadvantages: Takes longer the second time, because you must wade through the point and click menus rather than just change a word. You do not have a record of what you have done.

SPSS The King of Point and Click

You can point and click to get files, create variables, change variable values, and do analysis, and end up without a record of what you have done. You will be sorry.

Or, you can use Point and Click as an aid as you write programs. You can copy syntax created by Point and Click into your program. In SPSS, programs are written in a Syntax Window and are saved with the .sps extension.

You can modify SPSS defaults so that commands will be reflected in the log. This allows you to copy commands from your log into your program file. These changes also make debugging easier.

You will find information about how to modify SPSS at the following URL.

STATA

You can point and click, issue commands on the command line, or create .do files. ".do" files store your programs.

R

With R you can point and click, issue commands on the command line, or create .R files. ".R" files store your programs. The commands generated by point and click are displayed, so you can copy them into your program.

SAS

SAS allows some point and click, but immediately offers an editor where you can write your programs. SAS programs are text files that end with the .sas extension. SAS features an enhanced editor with color coding that makes it easier to write and debug programs.

Outline: Sermon on SYNTAX; Cleaning data and creating variables; Never overwrite original data; Practices that will help you keep track of your work; Safeguarding your work

Never clean data in the data view

Scenario 1: You get a data set and find errors in it. You change the values in the data window. You save it with point and click, over-writing your original data. Later you try to recall what changes you made, when and why. Of course you can’t. You can’t even be sure that you made the “corrections” for the proper cases. You can’t look back at older data sets to confirm what you did. You sit there sweating.

Scenario 2: Same as Scenario 1, except that you save it with point and click, over-writing your original data, and while you are saving the file either 1) your computer goes down because of a power outage or 2) there is a brief interruption in the network. HALF OF YOUR DATA SET IS LOST. You cry.

Scenario 3: You get a data set and find errors in it. You write a program that:
1) gets the original data
2) makes changes in values with SYNTAX
3) includes comments about the changes
4) saves the new file under a different name
Science marches forward.
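The four steps of Scenario 3 can be sketched in any language that reads and writes data files. The version below is a minimal illustration in Python, not the presenter's method: it assumes the data are in a CSV file with an `id` column, and the names `CORRECTIONS` and `clean_file` are hypothetical. The single correction mirrors the educ example used later in the sample SPSS program.

```python
import csv
from pathlib import Path

# Each correction records the case, the variable, the new value, and why.
# This entry mirrors the slides' example (educ for ID 1 should be 16).
CORRECTIONS = [
    {"id": "1", "variable": "educ", "value": "16",
     "note": "PJG, looked at survey form, 10/9/05"},
]


def clean_file(original: Path, cleaned: Path) -> int:
    """Read the original CSV, apply CORRECTIONS, and write a new file.

    Returns the number of values changed. The original is never modified.
    """
    # Step 4 of the scenario: the output must have a different name.
    if original.resolve() == cleaned.resolve():
        raise ValueError("refusing to overwrite the original data file")

    # Step 1: get the original data.
    with original.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    # Steps 2 and 3: change values in code, with the reason kept in CORRECTIONS.
    changed = 0
    for fix in CORRECTIONS:
        for row in rows:
            if row["id"] == fix["id"] and row[fix["variable"]] != fix["value"]:
                row[fix["variable"]] = fix["value"]
                changed += 1

    # Step 4: save the cleaned data under the new name.
    with cleaned.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return changed
```

Calling `clean_file(Path("dirty.csv"), Path("CleanNew.csv"))` leaves `dirty.csv` untouched, so the run can be repeated or audited later.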

Creating Variables and Recoding is not the same as Cleaning Data
You always want clean data
You may not always want the recoded or created variables
Make new variables, but keep the old ones (don't over-write)
Use the original to check the new

Examples of Recoding/Creating
Creating a series of dummies from a categorical variable
Creating an index from a series of scale variables
Creating a dichotomous or categorical variable from a continuous variable
Always consider MISSING VALUES
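The three kinds of recoding listed above, each with explicit handling of missing values, might look like the following Python sketch. Everything here is an assumption made for illustration: the function names, the `is_` prefix for dummies, the choice of a simple mean for the index, and the rule that any missing item makes the whole index missing. The missing-value code of -9 follows the sample SPSS program in these slides.

```python
MISSING = -9  # missing-value code, as in the slides' SPSS example


def dummies(value, categories):
    """One 0/1 indicator per category; all indicators are MISSING if value is."""
    if value is None or value == "":
        return {f"is_{c}": MISSING for c in categories}
    return {f"is_{c}": int(value == c) for c in categories}


def index_score(items):
    """Mean of a series of scale items; MISSING if any item is missing."""
    if any(x is None for x in items):
        return MISSING
    return sum(items) / len(items)


def dichotomize(value, cutoff):
    """1 if value is at or above cutoff, 0 if below, MISSING if value is missing."""
    if value is None:
        return MISSING
    return int(value >= cutoff)
```

Each function returns new values rather than changing its input, so the results can be stored in new variables next to the originals and checked against them.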

Sample SPSS Program

* CleanNew.sps.
* 10/10/05 created dummy for male.
Get file = 'dirty.sav'.

* Cleaning data, PJG, looked at survey form, educ for ID=1 should be 16, 10/9/05.
If (id = 1) educ = 16.

* Create a dummy variable from "gender".
If (gender = 'm') male = 1.
If (gender = 'f') male = 0.
If (gender = '') male = -9.
Missing values male (-9).
Variable labels male 'Male'.
Value labels male 1 'Male' 0 'Female'.
Save outfile = 'CleanNew.sav' / drop = gender.

Summary for Cleaning and Creating Variables
Use syntax (programs) to create and clean variables
Document when and why in your programs
Save a new file; do not over-write the old

Outline: Sermon on SYNTAX; Cleaning data and creating variables; Never overwrite original data; Practices that will help you keep track of your work; Safeguarding your work

It may be months between the time that you finish a paper, submit it, and get to revise it for publication.

What you will need to know:
The origin of your variables:
–What is the source for each variable?
–How were they created?
What programs created your final tables?
What program files created the file you used for your final tables?

Create a Directory for the Project
For example, c:\MA_Thesis
Store all of the programs and data in that directory and its subdirectories

Naming Conventions
For every data file you have, you should have a program file with a corresponding name.
When you have finished your paper, create a program file for each table. For example: table1.sas, table2.sas
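One way to check the naming convention is to list every data file that has no program file with the same base name. The Python sketch below is only an illustration: the function name `data_without_program` and the two extension lists are assumptions and should be adjusted to the packages actually in use.

```python
from pathlib import Path

# Assumed extensions for data files and program files; edit to suit the project.
DATA_EXTS = {".sav", ".dta", ".sas7bdat", ".rds", ".csv"}
PROGRAM_EXTS = {".sps", ".do", ".sas", ".r", ".py"}


def data_without_program(project_dir):
    """Return data files (relative paths) with no same-named program file."""
    root = Path(project_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    # Base names of all program files, compared case-insensitively.
    program_stems = {
        p.stem.lower() for p in files if p.suffix.lower() in PROGRAM_EXTS
    }
    return sorted(
        str(p.relative_to(root))
        for p in files
        if p.suffix.lower() in DATA_EXTS and p.stem.lower() not in program_stems
    )
```

With the file names from the sample SPSS program, `CleanNew.sav` is matched by `CleanNew.sps`. An original file such as `dirty.sav` will be listed, which is expected, since no program of yours created it.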

Document your work
Write comments in your program.
Put a file in your directory called a_note, readme, or something similar that includes a brief description of the project and important information.
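Setting up the project directory and its note file can itself be scripted, so that every project starts the same way. This is a minimal Python sketch under stated assumptions: the subdirectory names and the function name `start_project` are hypothetical and are not prescribed by the slides, which only ask for one project directory and a readme-style file.

```python
from datetime import date
from pathlib import Path

# Hypothetical layout; the slides only require one directory per project.
SUBDIRS = ("data/original", "data/derived", "programs", "tables")


def start_project(root, description):
    """Create the project directories and a readme; never overwrite a readme."""
    root = Path(root)
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    readme = root / "readme.txt"
    if not readme.exists():
        readme.write_text(
            f"Project started {date.today().isoformat()}\n{description}\n"
        )
    return readme
```

Running the function a second time is harmless: existing directories are left alone and an existing readme is kept as it is.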

Outline: Sermon on SYNTAX; Cleaning data and creating variables; Never overwrite original data; Practices that will help you keep track of your work; Safeguarding your work

Multiple backups, not all stored in the same basket
Worry about the future
–Keep up with formats (cards, tapes, floppy disks, CDs, what next?)
–Store in portable formats
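Making several backups is easier to keep up when one command does it and confirms that each copy is intact. The Python sketch below copies a file to several destinations and compares SHA-256 checksums. The function names are hypothetical, and the destinations are whatever separate drives or network locations you have available.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path):
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def back_up(source, destinations):
    """Copy source into each destination directory and verify every copy."""
    source = Path(source)
    expected = sha256(source)
    copies = []
    for dest in destinations:
        dest = Path(dest)
        dest.mkdir(parents=True, exist_ok=True)
        target = dest / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256(target) != expected:
            raise IOError(f"backup at {target} does not match the source")
        copies.append(target)
    return copies
```

Verifying the checksum after each copy guards against the interrupted-save problem described in Scenario 2, where a file is silently left incomplete.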

Documents that may be helpful

Computer Environments for the Social Sciences CSSS 506 Winter Quarter 2006 See contents from 2005:

The End Patty Glynn 10/31/05