SDA: a tool for teaching and research with microdata Laine Ruus University of Toronto. Data Library Service.

Presentation transcript:

SDA: a tool for teaching and research with microdata
Laine Ruus
University of Toronto. Data Library Service
2007/05/17

What this poster covers:
Introduction
Demo of main SDA capabilities
Advantages and disadvantages for teaching and research
Common questions about SDA

SDA is brought to you by:
University of California, Berkeley. Computer-assisted Survey Methods Program (CSM): writes and supports the server-side software
University of Toronto. Centre for Computing in the Humanities and Social Sciences (CHASS): provides the hardware, buys the software, and provides system support wetware
University of Toronto. Libraries: provides the budget to purchase the data, and the care, feeding and user support wetware

Our experience with SDA:
CHASS installed SDA in the fall of 2004
At last count, we have 600+ data files in SDA
Some have only the metadata that was generated from the original syntax files (SAS/SPSS/Stata), but a number also have full question text
Most are microdata, but a few are aggregate statistics (census files)
A number of voracious data users now expect to find the latest microdata released by Statistics Canada in SDA

Review of main SDA utilities:
Frequencies, weighted & unweighted
Crosstabulations
Comparison of means (ANOVA)
Correlations
Regressions
Logit/probit regressions
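To make the weighted/unweighted distinction in the first utility concrete, here is a minimal sketch in plain Python. It is not SDA code and does not reflect how SDA computes its tables; the function name `frequencies`, the sample answers, and the weights are all hypothetical.

```python
from collections import defaultdict


def frequencies(values, weights=None):
    """Return {category: percent}. Unweighted when weights is None."""
    if weights is None:
        weights = [1.0] * len(values)
    totals = defaultdict(float)
    for value, weight in zip(values, weights):
        totals[value] += weight
    grand_total = sum(totals.values())
    return {cat: 100.0 * total / grand_total for cat, total in totals.items()}


# Hypothetical respondents: an answer category and a survey weight each.
answers = ["yes", "no", "yes", "no"]
weights = [10.0, 30.0, 10.0, 50.0]

unweighted = frequencies(answers)        # every respondent counts once
weighted = frequencies(answers, weights)  # respondents count by their weight

print(unweighted)
print(weighted)
```

In this made-up sample the two "yes" respondents are half of the cases but carry only 20 of the 100 weight units, so the weighted share of "yes" is much smaller than the unweighted one.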

Tips & tricks:
Have we not gotten around to coding the missing values?
Want to include missing values in your cross-tabulation, or other analysis?
Collapsing uniform categories of continuous variables on the fly
Recoding variables on the fly
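The idea behind collapsing a continuous variable and handling uncoded missing values can be sketched as follows. This is an illustration in plain Python, not SDA's own recode syntax; the helper `collapse`, the age bins, and the missing-value code 999 are assumptions made for the example.

```python
def collapse(value, bins, missing_codes=()):
    """Map a value to the label of the first (low, high, label) bin that
    contains it (bounds inclusive). Missing codes get their own label."""
    if value in missing_codes:
        return "missing"
    for low, high, label in bins:
        if low <= value <= high:
            return label
    return "other"


# Hypothetical age bins and a hypothetical missing-value code of 999.
AGE_BINS = [(0, 24, "under 25"), (25, 64, "25-64"), (65, 120, "65+")]
ages = [19, 42, 70, 999]

recoded = [collapse(age, AGE_BINS, missing_codes={999}) for age in ages]
print(recoded)
```

Keeping "missing" as an explicit category, rather than dropping those cases, is what lets them appear as their own row or column in a cross-tabulation.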

Tips & tricks (2):
Computing percentages in aggregate data?
Dummy coding variables in regressions
Defining an interaction on the fly
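For readers new to the terms, dummy coding and interaction terms amount to the following. The sketch is plain Python with made-up data, not SDA syntax; `dummy_code`, the region values, and the income figures are hypothetical.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator list per category, omitting the reference
    category so the indicators are not collinear in a regression."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Hypothetical cases: a categorical region and a continuous income.
regions = ["east", "west", "north", "west"]
income = [40.0, 55.0, 30.0, 62.0]

dummies = dummy_code(regions, reference="east")

# An interaction term is the product of two predictors, here the
# "west" indicator times income.
west_x_income = [d * x for d, x in zip(dummies["west"], income)]

print(dummies)
print(west_x_income)
```

The interaction column is zero for every case outside "west", so its regression coefficient measures how the income slope differs for that region relative to the reference category.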

Advantages for teaching:
Stable environment, 24x7 access
Very easy to explain to novice users
Reduces/eliminates need for computer labs or statistical software
Teach statistics rather than software
Students get their hands on data quickly
Switch easily between weighted and unweighted distributions

Advantages for teaching (2):
Measures of association and tests of significance comparable to SAS
Design effects, where cluster/sample variables are available
Interactive demonstration of statistical concepts
Share recoded variables
Can quickly mount additional data to fulfill your teaching needs

Advantages for research:
Stable environment, 24x7 access
Access to latest available version of the data
Basic exploratory data analysis, e.g. are there enough cases for my subset?
Download data and import to SAS/SPSS/Stata on own workstation
Share recoded variables
Integrated variable descriptions (selected data files)

Advantages for data management:
Creates metadata from SAS/SPSS/Stata syntax or DDI-format XML files
Very easy and fast to import files with good syntax files
Control over what users can and cannot do
Outputs include SAS/SPSS/Stata syntax or DDI-format XML files
Overhead: size of uncompressed data + about 50%

Disadvantages of SDA:
Can't search for variables/values within/between data files (yet), at least not at UT/CHASS
Can't download created/recoded variables (coming in spring 2009)
No random sampling function. See
Graphics minimal, e.g. no stem-and-leaf plots, box plots, etc.
Can only output to Word/Excel from IE, not from Netscape/Mozilla/Firefox
Doesn't output SAS/SPSS/Stata system/export files
Little support for study/file-level metadata (DDI)
No support for nCubes (DDI 2)

Common questions from researchers & students:
When to weight versus not to weight
Does it only do cross-tabs?
But I want the raw data, not a cross-tabulation!
Why can't I get a cross-tab of this [e.g. continuous income] variable?
Differences between syntax, data, and system files

An application we wouldn't have tackled without SDA:
Q: I need the average expenditure on eye care in Canada by age group of household head for as long a time period as possible.
A: Once we explained SDA, the student generated these statistics from each of the FAMEX/SHS files in under 30 minutes. (He knew only Stata.)
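The statistic in this example is a weighted mean within groups. A minimal sketch of that calculation in plain Python follows; it is not what SDA runs internally, and the function `weighted_group_means`, the age groups, the expenditure values, and the weights are invented for illustration rather than taken from FAMEX/SHS.

```python
from collections import defaultdict


def weighted_group_means(records):
    """records: iterable of (group, value, weight).
    Returns {group: sum(value * weight) / sum(weight)}."""
    weighted_sums = defaultdict(float)
    weight_totals = defaultdict(float)
    for group, value, weight in records:
        weighted_sums[group] += value * weight
        weight_totals[group] += weight
    return {g: weighted_sums[g] / weight_totals[g] for g in weighted_sums}


# Hypothetical households: (age group of head, eye-care spending, weight).
records = [
    ("under 35", 100.0, 2.0),
    ("under 35", 200.0, 2.0),
    ("65+", 300.0, 1.0),
    ("65+", 500.0, 3.0),
]

means = weighted_group_means(records)
print(means)
```

Repeating the same calculation on each survey year's file and lining up the results by age group gives the time series the student asked for.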

Functions we know to be coming in SDA:
Within- and between-file variable searching
Will allow users to load their own data files (Archiver in SDA 3.1); we have not played with this yet

Questions: Question 1: Where will I find the SDA server at University of Toronto? Answer 1: The URL is: Select ‘Microdata analysis and extraction’

Questions (cont’d):
Question 2: How are files chosen to be mounted on the SDA server at UT?
Answer 2: All significant Canadian microdata files, e.g. those from Statistics Canada as released through DLI. Other files are mounted based on faculty/student requests.

Questions (cont’d):
Question 3: My research is being done collaboratively with a colleague at another Canadian university. Can my colleague get access to SDA?
Answer 3: SDA is available as a subscription service to other Canadian DLI-member universities and colleges. Current subscribers include: U of Victoria, Ryerson U, and Memorial U.