Senior Data and Support Services Officer

Presentation transcript:

Opening up access to birth cohort study data: A UK Medical Research Council pilot project
Jack Kneeshaw
Senior Data and Support Services Officer, UK Data Archive
May 17 2007

The context: From principle to action

“MRC policy on data sharing recognises the value of making scientific data more widely available across the research community, a recognition that is agreed to be a next necessary step by other researchers, research funders, national and international government bodies. After a period of raising awareness in the research community about the value of timely and responsible sharing of data, MRC needs now to move from principle to action.”

The subject: The NSHD, aka the 1946 British Birth Cohort Study
- established in 1946, the National Survey of Health and Development (NSHD) is one of the longest-running large-scale longitudinal studies in existence
- since 1962, the study has been funded continuously by the MRC
- data include a wide spectrum of risk exposures and of clinically validated measures of mental and physical health, and biological and cognitive function; survey has data on periods of the life-course that cannot be reliably accessed in retrospect or in GP records
- 12,000+ variables across the various component datasets

The state of play (1): Access/dissemination
- the study team receive slightly upwards of 30 data access requests p.a., a figure increasing year-on-year
- process of request through to supply can be drawn-out and episodic: specifying and retrieving data and documentation to be sent is time consuming and involves a high level of manual intervention
- the ‘ship is now creaking’

The state of play (2): Finding/using the data
- the data collection is not well publicised: besides the NSHD web site itself, there are few finding aids that may guide potential users to the data
- potential users of the data have no means of searching for the survey instruments, topics, questions and variables that they might be interested in
- aside from the variable names, the data files supplied by the study team do not include any metadata

What might be?
Is restricted access placing a ceiling on the level of scientific output that results from the use of the NSHD data?

Cohort               Annual usage    Annual usage, academic    Publications since
                     (all users)     (exc. students)           start of study
1946 cohort (NSHD)   c. 30                                     354
1958 cohort (NCDS)   172             117                       950
1970 cohort (BCS70)  169             119                       329

How do we get from here to there?
Four criteria identified in order to make the data resource widely usable in the scientific community:
(1) data have to be indexed to an international standard;
(2) searching content of data and metadata has to be possible via the web for both in-house and remote users;
(3) once identified through a search, data have to be easily accessible, along with all the information needed for informed research use;
(4) technical and procedural (governance) arrangements need to respect data subject confidentiality and take account of statutory and other regulatory requirements.

The recurring theme: Wider access vs. risk of disclosure
- special and increased risk of disclosure for longitudinal studies (more data points, vast range of information collected) rightly concerns the study team and sponsor
- important to start from position that disclosure risk can never be eliminated but can only be managed
- balance between attracting wider use of the data and retaining an appropriate level of disclosure risk becomes key

“It should always be borne in mind that, once data have been collected, the risk of disclosure can never be eliminated entirely; and, indeed, elimination of risk cannot be the aim if there is a policy to share the data. Instead, the aim must be to limit or control the risk. That is to say, in the context of sharing data so as to increase the scientific output, the aim of a disclosure risk strategy ought to be to define a level of risk that is acceptable: a policy aimed at reducing the risk of disclosure to a point as close as possible to elimination is not likely to be optimal.”

Solutions?: From managing to sharing data
- 2-year pilot project initiated
- project board convened; membership includes presenter
- specific aims of project:
(1) prepare a subset of NSHD data, along with data descriptions and documentation (metadata) in a digital format suitable for entry into the Nesstar software;
(2) define, document and implement governance arrangements for access to NSHD data through Nesstar;
(3) implement management tools for data security and integrity, such as logging and access controls, where appropriate;
(4) evaluate the benefits of implementation from the perspectives of the NSHD research team and other data users;
(5) determine the financial costs, time and effort required, and other implications for extending the approach to the whole NSHD data library;
(6) document ‘lessons learned’ to inform similar activities undertaken in the future.

Key outcomes expected
- generate a baseline measure of current activities (NSHD research team time, cost, etc.) required for provision of data and documentation
- help define access arrangements for whole study, including digitised genetic/phenotypic data
- widen (inc. geographers, social scientists, inc. non-UK) and deepen (current users find it more accessible, Nesstar facility to share derived variables) the user base, leading to increased scientific output

Nesstar as the data sharing tool: Why? What? How?

Why Nesstar?
- Nesstar is primarily a tool for data discovery with a strong focus on metadata that allows users to browse study information down to the level of variable
- software allows users, via a standard web browser, to view frequencies, conduct simple tabulations, produce graphs, sub-set and weight data
- ‘user defined variables’ function of specific interest to study team
- Nesstar is not the only package of its type, but study team’s view is that, for searching, browsing, locating and exploratory analysis purposes, Nesstar is almost certainly the package best suited for the project’s needs

What’s to do?
- estimated 2,000 variables (of the total of 12,000+) will be described in terms of their origins, distribution and response in the cohort, derivation methods from other variables if appropriate, and code book sources
- each variable will be assigned keyword(s)
- datasets published on web via Nesstar
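A minimal sketch of what a variable-level description with keywords might look like, and how keyword search over it could work. This is an illustration only: the record fields, the variable names and the `find_by_keyword` helper are all invented for the example and are not the Nesstar data model or the study team's actual documentation format.

```python
from dataclasses import dataclass, field


@dataclass
class VariableRecord:
    """One described variable (hypothetical field names)."""
    name: str
    label: str
    origin: str                 # where the variable came from
    derivation: str = ""        # how it was derived, if it was
    codebook_source: str = ""   # pointer to the code book entry
    keywords: frozenset = field(default_factory=frozenset)


def find_by_keyword(catalogue, keyword):
    """Return names of variables tagged with the keyword (case-insensitive)."""
    wanted = keyword.strip().lower()
    return [v.name for v in catalogue
            if wanted in {k.lower() for k in v.keywords}]


# Invented example entries, not real NSHD variables.
catalogue = [
    VariableRecord(
        name="height_36",
        label="Height at age 36 (cm)",
        origin="example questionnaire",
        keywords=frozenset({"height", "anthropometry"}),
    ),
    VariableRecord(
        name="bmi_36",
        label="Body mass index at age 36",
        origin="derived",
        derivation="weight_36 / (height_36 / 100) ** 2",
        keywords=frozenset({"BMI", "anthropometry"}),
    ),
]

anthropometry_vars = find_by_keyword(catalogue, "Anthropometry")
bmi_vars = find_by_keyword(catalogue, "bmi")
```

Keeping the derivation alongside the keywords means a user who finds a derived variable through search can also see which source variables it depends on.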

How to do it?
- NSHD ‘in-house’ solutions (e.g. scrambling ID for issued files)
- Nesstar modifications, especially to download function, to protect against inappropriate use
- publication of new derived variables via ‘user defined variables’
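The slides do not say how IDs are scrambled for issued files. One common approach, shown here purely as a sketch under that assumption, is a keyed hash (HMAC-SHA256) with a different secret key per issued file, so the same cohort member gets unrelated IDs in different releases and two recipients cannot link their files. The function names, key values and the `id` field are hypothetical.

```python
import hashlib
import hmac


def scramble_id(respondent_id: str, release_key: bytes, length: int = 12) -> str:
    """Map an ID to a pseudonym that is stable within one release only."""
    digest = hmac.new(release_key, respondent_id.encode("utf-8"),
                      hashlib.sha256).hexdigest()
    # Truncated for readability; a shorter pseudonym raises collision risk.
    return digest[:length]


def scramble_file(rows, release_key, id_field="id"):
    """Return copies of the rows with the ID field replaced by its pseudonym."""
    return [{**row, id_field: scramble_id(str(row[id_field]), release_key)}
            for row in rows]


# Invented example records and keys.
rows = [{"id": "0001", "height_cm": 172}, {"id": "0002", "height_cm": 165}]
issue_a = scramble_file(rows, b"secret-key-for-request-A")
issue_b = scramble_file(rows, b"secret-key-for-request-B")
```

Only whoever holds the per-release key (here, assumed to be the study team) can regenerate the mapping, for example to merge a user's derived variables back into the master data.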

Where next?
- project proper begins in July, though data/metadata prep. work already underway
- aim to make 2,000 variables available to selected user test group by end of year 1
- testing year 2: evaluation inevitably limited in scope but user feedback very important
- findings/recommendations published at end of year 2, and a successful report may see more MRC data rolled out via modified Nesstar product

Further details: Jack Kneeshaw – kneejw@essex.ac.uk