Data Collection, Harmonisation and Storage (An international perspective) Jon Johnson (CLS, Senior Database Manager)

Data Collection, Harmonisation and Storage (An international perspective)
Jon Johnson (CLS, Senior Database Manager)
CLS is an ESRC Resource Centre based at the Institute of Education

2 Contents
1. Introduction
2. Survey Data ‘production line’
3. Data Management Compared
4. National Longitudinal Surveys
5. PSID and HRS (USA)
6. MCS, NCDS and BCS70 (UK)
7. LISS Panel (Netherlands)
8. Management strategies compared
9. Storage, maintenance and output
10. Meta Data Standards
11. New Requirements

3 Introduction
In November 2008 CLS (MCS, NCDS, BCS70) and ULSC (BHPS, Understanding Society) were commissioned by the ESRC, as part of Objective 5 of the Survey Resources Network, to:
- Examine potential efficiencies in data management processes, particularly in relation to data management software;
- Examine the use of cutting-edge data collection methods for longitudinal surveys carried out at CLS/ULSC.
A wide-ranging review of the Survey Data Process was completed and submitted to the ESRC in November

4 Survey Data ‘production line’

5 Data Management Compared
Various strategies to cope with the complex data flows of survey collection, management and dissemination:
- Highly Integrated: National Longitudinal Surveys (USA)
- Partnership: PSID and HRS (USA)
- Contracted: MCS, NCDS and BCS70, BHPS, USoc (UK)
- Loosely Integrated: LISS Panel (Netherlands)
Final report will be available from

6 National Longitudinal Surveys (USA)
Over more than two decades the NLS has developed in-house software to capture the survey. More recently this has been integrated into a turnkey solution in which the storage of the survey is itself a mirror of the data collection instrument. The system is based on a highly normalised Oracle database: a snapshot of the data is auto-processed and made available to researchers on a “create your own dataset” basis, and then turned into standard flat datasets for use by researchers. Ref:
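
The step from a normalised store to a flat, researcher-ready dataset can be sketched as a simple pivot. This is an illustration only, not the NLS software: the row layout (respondent, wave, question, answer), the sample values and the `flatten` helper are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical normalised rows: (respondent_id, wave, question, answer).
# In a normalised store each answer is one row; nothing here is real NLS data.
rows = [
    (1, 1979, "sex", "F"),
    (1, 1979, "income", 12000),
    (2, 1979, "sex", "M"),
    (1, 1980, "income", 12500),
]


def flatten(rows, wanted):
    """Pivot long rows into one record per respondent,
    with one column per (question, wave) pair the researcher asked for."""
    flat = defaultdict(dict)
    for rid, wave, question, answer in rows:
        if question in wanted:
            flat[rid][f"{question}_{wave}"] = answer
    # Every record gets the same columns; unanswered items become None.
    columns = sorted({c for rec in flat.values() for c in rec})
    return [
        {"id": rid, **{c: flat[rid].get(c) for c in columns}}
        for rid in sorted(flat)
    ]


# A "create your own dataset" request for two questions.
dataset = flatten(rows, wanted={"sex", "income"})
```

Respondents who did not answer an item still get the column, filled with `None`, so every record in the flat output has the same shape.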

7 PSID and HRS (USA)
Both the Panel Study of Income Dynamics (PSID) and the Health and Retirement Study (HRS) utilise the in-house resources of the Survey Research Center, which provides survey data collection resources primarily to studies based at the University of Michigan. Survey instrument design, using Blaise for data collection, is closely linked to both the PI and data management teams. Data is prepared internally using SAS and processed for download as packaged datasets from PSID and also from ICPSR. Ref: and

8 MCS, NCDS and BCS70 (UK)
CLS is responsible for the specification of the instruments and data output, which is implemented by a third-party survey organisation. Data is further processed within CLS using SIR and provided to researchers as packaged datasets for download from the ESDS Data Archive. Meta-data is harvested from the CAI instrumentation and held in an SQL database for generation of HTML web pages directly from DDI 2.0 XML. Ref: and
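
The last step, generating HTML pages from DDI 2.0 XML, might look roughly like the sketch below. It is not the CLS implementation. The XML fragment is invented for illustration; it uses the DDI Codebook element names `codeBook`, `dataDscr`, `var`, `labl`, `catgry` and `catValu` and assumes no XML namespace. The `var_to_html` helper is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented DDI-Codebook-style fragment (assumption: no XML namespace).
DDI = """<codeBook>
  <dataDscr>
    <var name="SEX">
      <labl>Sex of cohort member</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
  </dataDscr>
</codeBook>"""


def var_to_html(var):
    """Render one <var> element as a small HTML codebook entry."""
    name = escape(var.get("name", ""))
    label = escape(var.findtext("labl", default="").strip())
    items = "".join(
        f"<li>{escape(c.findtext('catValu', default='').strip())}: "
        f"{escape(c.findtext('labl', default='').strip())}</li>"
        for c in var.findall("catgry")
    )
    return f"<h2>{name}</h2><p>{label}</p><ul>{items}</ul>"


root = ET.fromstring(DDI)
page = "".join(var_to_html(v) for v in root.iter("var"))
```

All text is passed through `html.escape`, so question wording containing `<` or `&` cannot break the generated page.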

9 LISS Panel (Netherlands)
The LISS Panel is primarily a web-based survey, which uses a layer over Blaise, with a dedicated survey instrument programming section closely linked to the survey design team. Data is produced from Blaise, managed in SPSS and provided as prepared datasets for researchers to download from LISS. A separate SQL metadata database, based on DDI 3.0, is used to provide navigation and generate the codebook etc. Ref:

10 Management strategies compared
All studies face the same challenges
1. Complex data
2. Data description handling
3. Management of meta-data
4. Myriad audiences
5. Longitudinal consistency
6. Resource constraints
7. Re-purposing of data

11 All in one basket approach
NLS, NHANES

12 Data and Meta-data separated
LISS / PSID / HRS
MCS / NCDS / BCS / BHPS / USoc

13 Storage, maintenance, output
Cleaning your data
- Cohort data continually evolves
- 2-3% of people mis-report sex
- Interviewers mis-key data
- Data entry clerks mis-key data
- Respondents misunderstand questions
Outputting and deriving data
- Synchronizing changes, derivations and internal consistency (e.g. geographical identifiers) and outputting in the best format for research is a function best done by DB staff
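
One kind of cleaning check implied here is comparing a value that should never change, such as sex, across sweeps and flagging respondents whose records disagree. The sketch below is illustrative only: the identifiers, the recorded values and the `check_invariant` helper are assumptions, not part of any CLS system.

```python
from collections import Counter

# Hypothetical records: respondent id -> sex as recorded at each sweep.
recorded_sex = {
    "A001": ["F", "F", "F"],
    "A002": ["M", "F", "M"],  # one sweep disagrees: mis-report or mis-key
    "A003": ["M", None, "M"],  # missing at one sweep, otherwise consistent
}


def check_invariant(values_by_id):
    """Return respondents whose time-invariant value differs between sweeps,
    with the observed values ordered from most to least frequent."""
    flagged = {}
    for rid, values in values_by_id.items():
        counts = Counter(v for v in values if v is not None)
        if len(counts) > 1:
            flagged[rid] = counts.most_common()
    return flagged


flagged = check_invariant(recorded_sex)
```

The check only flags cases for review. Deciding which value is correct still needs the judgement of staff who know the study.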

14 Meta Data Standards
The Data Documentation Initiative (DDI) has emerged as the front runner as the basis for an international standard
1. Existing foothold is limited
2. Lacks sufficient support for longitudinal studies
3. Provides at least a minimum of data which would enable international cross-cohort data discovery
Can we establish a ‘Dublin Core’ for longitudinal / birth cohort surveys?
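
To make the ‘Dublin Core’ question concrete, a minimal study-level record for cross-cohort discovery could be sketched as follows. The fields `title`, `publisher`, `coverage` and `subjects` borrow Dublin Core element names. `cohort_birth_year` is a hypothetical extension for birth cohorts, and the subject keywords and the `find` helper are invented for the example. This is not a proposed standard.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class StudyRecord:
    title: str       # Dublin Core: title
    publisher: str   # Dublin Core: publisher
    coverage: str    # Dublin Core: coverage (here, the country)
    subjects: Set[str] = field(default_factory=set)  # Dublin Core: subject
    cohort_birth_year: Optional[int] = None  # hypothetical cohort extension


# Subject keywords are illustrative, not taken from any catalogue.
catalogue = [
    StudyRecord("NCDS", "CLS", "Great Britain", {"health", "education"}, 1958),
    StudyRecord("BCS70", "CLS", "Great Britain", {"health", "employment"}, 1970),
]


def find(records, subject, born_before=None):
    """Return titles of studies covering a subject, optionally restricted
    to cohorts born before a given year."""
    return [
        r.title
        for r in records
        if subject in r.subjects
        and (born_before is None
             or (r.cohort_birth_year is not None
                 and r.cohort_birth_year < born_before))
    ]


health_studies = find(catalogue, "health")
early_health = find(catalogue, "health", born_before=1965)
```

Even a handful of shared fields like these would let a researcher ask which cohorts in which countries cover a topic, before turning to each study's full documentation.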

15 New Requirements
- Video / audio
- Genetics
- Web capture e.g. social networks
- Paper Archives
- Record Linkage
- Biological measures
- Data security (ISO 27001)
- Disclosure control

16 Any questions?
Institute of Education, University of London
20 Bedford Way, London WC1H 0AL
Tel +44 (0) Fax +44 (0) Web