CISER Setup Files Creator and Its DDI 2.5 Codebook


Florio Arguillas, CISER

ABSTRACT

Most, if not all, Data Archives still possess in their collections ASCII and/or Column Binary data files that have codebooks, but no accompanying setup (or program) files to read them. This lack of setup files deters researchers from using these datasets, and the cumulative cost is high when multiple researchers independently write their own code to read the same data. It also threatens the ability to use these datasets in the future, especially those that are still in column binary format.

The Data Archives of the Cornell Institute for Social and Economic Research (CISER) and the Roper Center for Public Opinion Research have many of these ASCII and column binary datasets that have codebooks, but no setup files. Recognising the threat that this poses to the usability and preservation of these datasets, and given the volume, we developed software, written in Python, that automatically generates SAS, SPSS, Stata and R setup files and datasets, along with a DDI 2.5 Codebook. The only requirement is to enter variable metadata (such as variable name, record number, starting and ending column location, variable labels, and value labels) in a specific format in Excel, which the Setup File Creator uses as input.

This poster showcases the Setup File Creator for fixed-width and column binary data and how we made the software friendly to data archives and archivists by simplifying the input and maximizing the volume and usefulness of its output.

INSTALLATION INSTRUCTIONS

Install the software in the desired folder location.
- Setup File Creator for Fixed Width Data (SFC_FW)
- Setup File Creator for Column Binary Data (SFC_CB)

Required software packages:
- Python 2.7
- Excel
- At least one of these: SAS, SPSS, Stata

Download and install the latest version of Python 2.7.
- Required pre-loaded Python 2.7 packages: sys, os.path, datetime, hashlib, Tkinter
- Required external Python package: xlrd
  xlrd is a library for reading data and formatting information from Excel files, whether they are .xls or .xlsx files.

Edit the SFC_FW.vbs file.
- Change the path of the SFC_FW1_START.bat file. Direct it to the folder where you installed the SFC_FW or SFC_CB.

    Set WshShell = CreateObject("WScript.Shell")
    WshShell.Run Chr(34) & "<path>\SFC_FW1_START.BAT" & Chr(34), 0
    Set WshShell = Nothing

Edit the SFC_FW1.bat file and specify the correct locations of the SPSS (stats.exe), SAS (sas.exe), and Stata (Stata-MP-64.exe) executables and of the SFC_FW folder. Alternatively, register these paths manually in the system environment variables.

    set path=%path%;c:\python27
    set path=%path%;C:\Program Files\IBM\SPSS\Statistics\24
    set path=%path%;C:\Program Files\SASHome\SASFoundation\9.4
    set path=%path%;C:\Program Files (x86)\Stata14
    rem point the next line to the SFC_FW folder path
    set path=%path%;U:\SFC_FW
    python SFC_FW_MAIN.py

LIMITATIONS

SETUP FILE CREATOR FOR FIXED WIDTH DATA
- Currently processes only datasets with one record per case.

SETUP FILE CREATOR FOR COLUMN BINARY DATA
- Currently processes only datasets with one record per case and columns with only one punch per variable.
- Currently cannot process datasets with multiple punches or variables in one column, or those with variable values wrapping into other column(s). The researcher can program these cases manually in the setup file that SFC_CB creates.

[Screenshot: Welcome Screen]
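
To make the idea in the abstract concrete, the following is a minimal sketch of how variable metadata (name, starting and ending column, variable label, value labels) can drive both the reading of a fixed-width record and the generation of SPSS-style setup syntax. It is an illustration only, not the Setup File Creator's code: the metadata layout, the names VARIABLES, parse_record and spss_setup, and the sample variables are all hypothetical, the metadata is hard-coded rather than read from Excel, and every variable is treated as numeric with one record per case.

```python
# Illustrative sketch only. The metadata layout and all names below are
# hypothetical; they are not the Setup File Creator's actual format or code.

# One entry per variable: name, start/end column (1-based, inclusive),
# variable label, and value labels (code -> label).
VARIABLES = [
    {"name": "CASEID", "start": 1, "end": 4,
     "label": "Case identifier", "values": {}},
    {"name": "SEX", "start": 5, "end": 5,
     "label": "Sex of respondent", "values": {"1": "Male", "2": "Female"}},
    {"name": "AGE", "start": 6, "end": 7,
     "label": "Age in years", "values": {}},
]


def parse_record(line, variables):
    """Slice one fixed-width record into a dict keyed by variable name."""
    return {v["name"]: line[v["start"] - 1:v["end"]].strip()
            for v in variables}


def spss_setup(data_path, variables):
    """Build SPSS-style syntax text that reads the fixed-width file.

    Assumes one record per case, numeric variables, and labels that
    contain no apostrophes.
    """
    out = ["DATA LIST FILE='%s' FIXED RECORDS=1" % data_path, "  /1"]
    for v in variables:
        out.append("    %s %d-%d" % (v["name"], v["start"], v["end"]))
    out[-1] += "."

    out.append("VARIABLE LABELS")
    for i, v in enumerate(variables):
        prefix = "   " if i == 0 else "  /"
        out.append("%s%s '%s'" % (prefix, v["name"], v["label"]))
    out[-1] += "."

    for v in variables:
        if v["values"]:
            pairs = " ".join("%s '%s'" % (code, text)
                             for code, text in sorted(v["values"].items()))
            out.append("VALUE LABELS %s %s." % (v["name"], pairs))
    return "\n".join(out)


if __name__ == "__main__":
    print(parse_record("0012134", VARIABLES))
    print(spss_setup("data.txt", VARIABLES))
```

Keeping the metadata in one table and deriving everything else from it is what lets a single input produce setup files for several statistical packages; a SAS or Stata generator would walk the same list and emit that package's syntax instead.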