Data selection & licensing

Slides:



Advertisements
Similar presentations
Depositing Data for Archiving Libby Bishop ESDS Qualidata, University of Essex Changing Families, Changing Food Meeting University of Sheffield 15 March.
Advertisements

A centre of expertise in data curation and preservation DCC Workshop: Curating sApril 24 – 25, 2006 Funded by: This work is licensed under the Creative.
Electronic theses and copyright Janet Aucock Head of Repository services March 2014.
JISC Managing Research Data 2 Programme Launch Nottingham 1-2 December,2011 Robin Rice Data Librarian University of Edinburgh.
Data Management Planning Kerry Miller Digital Curation Centre University of Edinburgh DIY Research Data Management Training Kit for.
December 2008 MRC Data Support Services (DSS) Chris Morris 13 th February 2009 Sharing Research Data: Pioneers, Policies and Protocols The seventh cat.
Good practice in Research Data Management Module 2: RDM Introduction.
IS Audit Function Knowledge
JRC's Open Access (OA) Policy G. P. Tartaglia, A. Annoni, G. Merlo, F
Open Data & overview of data management in H February 2015.
Data Management Development and Implementation: an example from the UK SLA Conference, Boston, June 2015 Geraldine Clement-Stoneham Knowledge and Information.
EPSRC expectations on research data: What researchers need to know 12/03/2015 Masud Khokhar and Hardy Schwamm.
DATA MANAGEMENT SUPPORT FOR RESEARCHERS …………………………………………
Research Data Management at University of Aberdeen & RGU 7 th October 2014 This work is licensed under a Creative Commons Attribution 2.5 UK: Scotland.
Data Management Planning and DMPonline
Research Data Management Services Katherine McNeill Social Sciences Librarians Boot Camp June 1, 2012.
Text Mining: Opportunities and Barriers John McNaught Deputy Director National Centre for Text Mining
R ESEARCH D ATA M ANAGEMENT : AN I NTRODUCTION TO THE B ASICS Open Access and Data Curation Team.
DIY Research Data Management Training Kit for Librarians Data sharing Anne Donnelly Liaison Librarian College of Medicine & Veterinary Medicine College.
1 Why should “WE” CARE about data?. International initiatives OECD principles and guidelines for access to research data from public funding 2007 “Access.
Because good research needs good data The DCC lifecycle model, Exeter Uni, 19 May 2012 Funded by: The Digital Curation Lifecycle Model Joy Davidson and.
Research Data Management for Research Support staff 30 th June 2015 Isabel Chadwick, Research Data Management Librarian
It’s the data that makes a paper Joerg Heber Executive Editor Nature Communications.
11 Researcher practice in data management Margaret Henty.
Filling institutional repositories: considering copyright issues Susan Veldsman eIFL Content Manager
Preserving your research data for future use This work is licensed under a Creative Commons Attribution 3.0 Unported License.Creative Commons Attribution.
Introduction to Research Data Management Joy Davidson and Sarah Jones Digital Curation Centre
RCUK Policy on Open Access Name Job title Research Councils UK.
DATUM for Health – Healthy research needs healthy data I’ve collected my data, so what do I do with it now? Research data management Session 2 Data Curation.
Writing a data management plan (DMP) Stephen Grace and David McElroy Writing a DMP workshop, UEL 5 March 2015.
Because good research needs good data The DCC lifecycle model, Exeter Uni, May 2011 Funded by: The Digital Curation Lifecycle Model Joy Davidson.
Publish your Data on the Tropical Data Hub Seeding the Commons Project Australian National Data Service e-Research Centre James Cook University This work.
General Data Protection Regulation (EU 2016/679)
Publishing research in a peer review journal: Strategies for success
Jeff Moon Data Librarian &
Selecting what to keep and what to bin
CESSDA SaW Training on Trust, Identifying Demand & Networking
Open Exeter Project Team
Frameworks for Sensitive Data in the Research Lifecycle
Slides Template for Module 5
ELIXIR Core Data Resources and Deposition Databases
Copyright and Open Licensing
M25 Group Open Library Data A British Library Perspective
Copyright and Open Licensing
EPSRC research data expectations and research software management
Making Data Providers’ Contribution Count
An Introduction to Open Data
Institutional role in supporting open access, open science, open data
General Finnish DMP Guidance
Digital Curation Centre
Open Access to your Research Papers and Data
EOSCpilot Skills Landscape & Framework
Introduction to Research Data Management
Funding body requirements
Creating a Culture of Open Data in Academia
DATA LIFECYCLE & DATA MANAGEMENT PLANNING
Copyright, Fair Use, and Creative Commons Licensing
Copyright, Fair Use, and Creative Commons Licensing
Research Data Management
Recording Clinical Data
Recording Clinical Data
Jisc Research Data Shared Service (RDSS)
Research Data Management
Heidi Imker and Dan Tracy Faculty Meeting Lightning Talk February 2019
Internal Control Internal control is the process designed and affected by owners, management, and other personnel. It is implemented to address business.
CARL Guide to Author Rights
Copyright and Open Licensing
Copyright and Higher Degree Students
Research Data Dr Aoife Coffey, Research Data Coordinator
Copyright and Higher Degree Students
Presentation transcript:

Data selection & licensing Digital Curation Centre This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.

Digital Curation Centre - CC-BY Outline Why selection is important How to select data Licensing data – options and implications 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY Why not keep it all? Globally, data volumes are doubling every two years John Gantz and David Reinsel 2011 Extracting Value from Chaos www.emc.com/digital_universe. 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY Data volumes escalate Volumes rising faster in data-intensive research domains e.g. DNA sequence data is doubling every 6-8 months The cost of data production is dropping more rapidly than the cost of data retention – sensors are cheap and getting cheaper “ELIXIR and Open Data” View from an ELIXIR Node” Barend Mons, ELIXIR Launch event, 18th Dec 2013 2017-03-08 Digital Curation Centre - CC-BY

Storage mgmt costs rise long-term Hardware costs decline, but power and staff costs keep rising David Rosenthal blog.dshr.org/2012/05/lets-just-keep-everything-forever-in.html 2017-03-08 Digital Curation Centre - CC-BY

The storage is cheap fallacy Decreasing hardware costs offset by data volume growth Backup and mirroring multiplies cost of preserved data Discovery becomes harder as chaff outweighs the wheat Curation of unused data is a waste of resources But discovery techniques are getting better – this argument is not clear-cut This is a significant factor – curation is often manual 2017-03-08 Digital Curation Centre - CC-BY

When should selection begin? NTU workshops When should selection begin? 2017-03-08 Appraisal should begin as early as possible! Periodically for longitudinal and reference datasets The appraisal process should begin during the planning phase when researchers should be able to provide a general overview of the expected outputs of the project and their potential for reuse. Refining this picture during the project will enable detailed metadata and processing to be carried out on the most relevant and valuable parts of the dataset. The ‘Appraisal’ part of the lifecycle relates to all the data produced by the project which may include some auxiliary material. 2017-03-08 Digital Curation Centre - CC-BY Kevin Ashley, Digital Curation Centre - CC-BY

What questions must be answered? NTU workshops What questions must be answered? 2017-03-08 What must be kept to manage compliance risk? (and what must NOT be kept) What data has value and should be kept? What data could be re-used? Given costs what will or won’t be kept? How will it be kept and shared, on what terms? This final point is an important one – what data is to be shared, what to be preserved and what to be disposed of. There is a good chance that a variety of strategies may need to be applied to data produced by a single project. 2017-03-08 Digital Curation Centre - CC-BY Kevin Ashley, Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY NTU workshops What ‘must’ be kept? 2017-03-08 Some data may be part of research record, evidence for e.g. … Supporting patent applications or IP Evidence of investigations or inquiries Health & Safety (Lab book) Jisc Infonet Guidance on Managing Research Records tools.jiscinfonet.ac.uk/downloads/bcs-rrs/managing-research-records.pdf What can’t be kept: Research Ethics, Duty of Confidentiality, Data Protection Act, Human Rights Act, Statistics & Registration Services Act. UK Data Archive: http://www.data-archive.ac.uk/create-manage/consent-ethics/legal Compliance also about data that won’t be kept, or may only be shared with approved researchers… 2017-03-08 Digital Curation Centre - CC-BY Kevin Ashley, Digital Curation Centre - CC-BY

Other drivers for what ‘must’ be kept NTU workshops Other drivers for what ‘must’ be kept 2017-03-08 Funder policies: “Data with acknowledged long-term value ” RCUK Common Principles on Data Policy Journal policies: … a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols promptly available to readers… (http://www.nature.com/authors/policies/availability.html) The Law What counts depends on data’s value for purposes it has served or may serve, so consider these as first step. 2017-03-08 Digital Curation Centre - CC-BY Kevin Ashley, Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY Appraisal and deposit Relevance to Mission – including any legal/funder requirement to retain the data beyond its immediate use. Scientific or Historical Value – significance and relationship to publications etc. Uniqueness – can it be found elsewhere / if we don’t preserve it, who will? Potential for Redistribution – quality / IP / ethical concerns are addressed. Non-Replicability – either impossible to replicate (e.g. atmospheric or social science data) or not financially viable. Economic Case – costs of managing and preserving the resource stack up well against potential future benefits. Full Documentation – surrounding / contextual information necessary to facilitate future discovery, access, and reuse is adequate. 2. Prioritise based on relationship with publications, e.g. underpins scientific record (c.f. Sarah Callaghan, Preparde) 5. Privilege irreproducible data… How to Appraise & Select Research Data for Curation Angus Whyte, Digital Curation Centre, and Andrew Wilson, Australian National Data Service (2010) 2017-03-08 Digital Curation Centre - CC-BY

Step 2 What data should have value Indicators that data have value Quality of the data and its description complete, accurate, reliable, valid, representative etc Demand high known users, integration potential, reputation, recommendation, appeal Replication difficulty difficult, costly, or impossible to reproduce Low barriers legal/ ethical, copyright non-restrictive terms and conditions Rarity unique copy or other copies at risk Which related material does data depend on for its value? 2017-03-08 Digital Curation Centre - CC-BY

Step 3 What could it be reused for? Step back and reflect – typical reuse purposes Verification Further analysis Reputation building Resource development Further publications inc. data articles Learning and teaching materials Private reference Then relative to these, which data must be kept and which data and related materials will have significant value? 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY NTU workshops Step 4 Cost factors 2017-03-08 Consider these when deciding what to keep Costs incurred during project may add to the data’s value Cost of curation – to make data reusable Cost of long-term retention can be calculated Cost factors: Creation, collection & cleaning Short-term storage & backup Short-term access & security Team communication & development Preservation & long-term access What action needs to be taken to ensure preservation is costed? 2017-03-08 Digital Curation Centre - CC-BY Kevin Ashley, Digital Curation Centre - CC-BY

Who should help appraise? RLUK ‘skills gaps’ survey of Subject Librarians & Managers “  …nine   key   areas   where   future  involvement  by  Subject Librarians  is  considered  to  be  important  now  and  is  also expected  to  grow  sharply… Ability  to  advise  on  preserving  research  outputs (49%  see as essential  in  2-5  years;  10%  now) Knowledge  to  advise  on  data  management  and  curation,  (4 8%  essential  in  2-5  years;  16%  now)…” Mary  Auckland 2012 Reskilling Libraries for Research 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY What data to keep Roles and Responsibilities 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY Who else? Others who may be involved in appraising research data… Domain specialists Archives Research Office- Business development IT Support/ Research Computing Research Ethics Committee Records Management/ FOI Compliance Facilities Managers (if physical samples involved) 2017-03-08 Digital Curation Centre - CC-BY

LEGAL ISSUES & LICENCING 2017-03-08 Digital Curation Centre - CC-BY

Two types of legal issue Things that the law requires you to consider Things that the law allows you to do Consequences for researchers - requirements Your organisation must know what data it possesses It must know whether exceptions to access may apply It must know if some of the data belongs to others It must know what data once existed, but has now been deleted – and why These are difficult questions for most of us! So document those appraisal decisions! 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY Requirements Data protection If human subjects are involved Common European framework – Singapore has similar requirements Informed consent essential Make consent broad to allow reuse Protect data Provide subject access Right of correction 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY FOI & EIR FOI = Freedom of Information EIR = Environmental Information Regulations First is nation-state specific; second from European regulation Both have similar effects, but differ in detail 2017-03-08 Digital Curation Centre - CC-BY

What the law allows – licensing & trust agreements Licences allow you to constrain how others use your data They range from very open to very restrictive You MUST own the data in order to be able to licence it Licences unlikely to help with sensitive data – seek help, use trusted repositories 2017-03-08 Digital Curation Centre - CC-BY

License your data for reuse Outlines pros and cons of each approach and gives practical advice on how to implement your licence CREATIVE COMMONS LIMITATIONS NC Non-Commercial What counts as commercial? SA Share Alike Reduces interoperability ND No Derivatives Severely restricts use www.dcc.ac.uk/resources/ how-guides/license-research-data 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY Data and copyright Ability to copyright data varies throughout the world Europe also offers ‘database right’ – applies even if data cannot be copyrighted. International licences help avoid this legal minefield Standard licences strongly recommended – we are not all legal experts 2017-03-08 Digital Curation Centre - CC-BY

Nature, Wednesday 3rd August 2016 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY Types of data licence Creative Commons V4.0 CC-BY or CC0 strongly recommended Also in existence: Open Data Commons Open Government Licence (UK) If your data includes/derived from other data – what licence applied to it? Licence ‘stacking’ can be a problem See also CODATA/RDA study on legal interoperability 2017-03-08 Digital Curation Centre - CC-BY

Digital Curation Centre - CC-BY Any questions? Images Alphabetti spaghetti from Ursus Wehrli: https://www.kunstaufraeumen.ch/en Closed data from: http://www.gsma.com 2017-03-08 Digital Curation Centre - CC-BY