Keeping your Research Alive: Preserving Research Data
Overview
- Introduction
- What is research data?
- Why manage research data?
- How to manage research data
  - Types of research data
  - Ethics & IP
  - Access, Sharing & Re-use
  - Storage & preservation
- The Lifecycle of Research Data
- Data Management Planning & Approaches to managing data
- Help & Advice
By the end of the session you will be better able to:
- Describe the forms research data takes and the role of contextual documentation and metadata in enabling data re-use
- Describe how managing research data effectively will improve your research, save you time, decrease the risks of data loss and increase your professional impact, and identify tools to help
- Describe University of Leeds and research funder data management expectations
- Identify sources of information and guidance on managing research data effectively, including additional training courses
What are research data?
What research data do you have?
What is research data? Data The lowest level of abstraction from which information and knowledge are derived. Research Data Recorded, factual material commonly retained by and accepted in the [research] community as necessary to validate research findings; although the majority of such data is created in digital format, all research data is included irrespective of the format in which it is created. Engineering and Physical Sciences Research Council (EPSRC)
Why manage research data?
Society Government Research Funders University Individual
Why manage research data? Society “Publicly funded research data are a public good, produced in the public interest, which should be made openly available.” (RCUK) The Organisation for Economic Co-operation and Development (OECD) The OECD is a unique forum where the governments of 30 democracies work together to address the economic, social and environmental challenges of globalisation. “.. promote a culture of openness and sharing of research data”
Why manage research data? Government The government in its ‘Innovation and Research Strategy for Growth’ has committed to the principle that publicly funded academic research is a public good produced in the public interest and that, while intellectual property must be protected and commercial interests considered, it should be made openly available with as few restrictions as possible. In this way, we will more effectively realise the social and economic benefits of spreading knowledge, raising the prestige of UK research and encouraging technology transfer. Open Data White Paper ‘Unleashing the Potential’
Why manage research data? Funders
- Publicly funded research data are a public good […] which should be made openly available with as few restrictions as possible.
- Data with acknowledged long term value should be preserved and remain accessible and usable for future research.
- […] sufficient metadata should be recorded and made openly available.
- RCUK recognise that there are legal, ethical and commercial constraints on the release of research data.
- [Researchers] may be entitled to a limited period of privileged use of the data […] to enable them to publish the results of their research.
- […] all users of research data should acknowledge the sources of their data and abide by the T&Cs under which they are accessed.
- It is appropriate to use public funds to support the management and sharing of publicly-funded research data. The mechanisms should be efficient and cost-effective.
Why manage research data? University
- Research data will be managed to agreed standards […] and in accordance with funder requirements.
- All research data should be offered and assessed for deposit and preservation in an appropriate University, national or international data service or domain repository […]
- Data should not be deposited with any organisation that does not commit to its access and availability for re-use.
- At the completion of each research project, the PI should ensure that all relevant research data are made available, subject to meeting appropriate requirements, in the location specified in the data management plan.
- The University is responsible for the provision of training, support and advice.
- Responsibility for research data management during any research project or programme lies with responsible owners such as Principal Investigators (PIs).
The management of Research Data reflects our:
- commitment to research excellence
- recognition of our duty to our funders
- appreciation of the value of our data - to us and to others
Why manage research data? Individual
- Increase in citation rates when researchers share the data underlying publications (Piwowar, 2007, PLoS)
- Data sharing builds communities
- Stronger networks
- More collaboration
- Better research
- Increase research efficiency
- Save time and resources
- Enhance data security
- You may be the first re-user of your own data!
How to manage research data
-Types of research data -Ethics & Intellectual Property -Access, Sharing & re-use -Storage & Preservation
How to manage research data Types of research data
Categories of data:
- Text documents (including those with images), spreadsheets
- Field notebooks, diaries
- Questionnaires, transcripts, codebooks
- Audiotapes, videotapes
- Photographs, films
- Test responses
- Slides, artefacts, specimens, samples
- Collection of digital objects acquired and generated during the process of research
- Data files
- Database contents (video, audio, text, images)
Data can be analogue, born digital or digitised.
How to manage research data What does this mean?
How to manage research data What is metadata? In the context of data management, metadata are a subset of core standardised and structured data documentation that explains the origin, purpose, time reference, geographic location, creator, access conditions and terms of use of a data collection. Metadata are typically used: for resource discovery, providing searchable information that helps users to easily find existing data as a bibliographic record for citation UK Data Archive, Managing & Sharing Data Guide, May
How to manage research data What is metadata?
Metadata can be stored at different levels:
- Individual file level (e.g. image format, resolution, colour depth, etc.)
- Data object level (e.g. multiple files make up a single panoramic data object)
- Object group level (e.g. multiple photos taken at the same location during a single session)
- Project level (e.g. relevant research records, consent forms, etc.)
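As an illustration of project-level metadata, the sketch below records the elements named in the UK Data Archive definition (origin, purpose, time reference, geographic location, creator, access conditions, terms of use) in a small structured record and writes it out as JSON. The class name, field names and example values are assumptions made for illustration only; a real project would normally follow a metadata standard recommended by its repository or discipline.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CollectionMetadata:
    """Descriptive metadata for one data collection (illustrative field names)."""
    title: str
    creator: str
    origin: str
    purpose: str
    time_reference: str
    geographic_location: str
    access_conditions: str
    terms_of_use: str


def to_json(record: CollectionMetadata) -> str:
    """Serialise the record as indented JSON so it can be stored beside the data files."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Hypothetical values, invented purely to show the shape of a record.
example = CollectionMetadata(
    title="Example interview study",
    creator="A. Researcher",
    origin="Semi-structured interviews, transcribed by the project team",
    purpose="Doctoral research project",
    time_reference="2012-2013",
    geographic_location="Leeds, UK",
    access_conditions="Available to registered users after embargo",
    terms_of_use="End User Licence",
)

print(to_json(example))
```

Keeping the record in a plain, structured text format means it can be read by people and by search tools, which supports both uses of metadata listed above: resource discovery and citation.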
How to manage research data Storing Research Data File formats
To ensure your data can still be read in the future, it should be stored in an open format.
Format to avoid: open alternative
- MS Word / PowerPoint with macros: PDF/A
- MS Excel: CSV / tab-delimited
How to manage research data Storing Research Data (2)
Format to avoid: open alternative
- Bitmap (.bmp), Photoshop (.psd): JPEG-2000 (no compression), TIFF (v6 uncompressed)
- Real Audio (.rm), Windows Media Audio (.wma): FLAC (.flac), WAV (.wav)
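The format lists on these two slides can be turned into a simple check. The sketch below reads them as pairs of formats to avoid and open alternatives (an interpretation of the slides, not an official list) and suggests an alternative for a given file name. The file extensions chosen for Word, PowerPoint and Excel and the function name are assumptions.

```python
from pathlib import Path
from typing import Optional

# Pairs taken from the slides; the extensions for the Office formats are assumed.
OPEN_ALTERNATIVES = {
    ".doc": "PDF/A",
    ".docx": "PDF/A",
    ".ppt": "PDF/A",
    ".pptx": "PDF/A",
    ".xls": "CSV or tab-delimited text",
    ".xlsx": "CSV or tab-delimited text",
    ".bmp": "JPEG-2000 (no compression) or TIFF (v6 uncompressed)",
    ".psd": "JPEG-2000 (no compression) or TIFF (v6 uncompressed)",
    ".rm": "FLAC or WAV",
    ".wma": "FLAC or WAV",
}


def suggest_open_format(filename: str) -> Optional[str]:
    """Return a suggested open format, or None if the extension is not in the table."""
    return OPEN_ALTERNATIVES.get(Path(filename).suffix.lower())


for name in ["Results.XLSX", "interview01.wma", "notes.txt"]:
    print(name, "->", suggest_open_format(name))
```

A return value of None only means the extension is not in this small table; it does not mean the format is safe for long-term storage.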
How to manage research data Storage, Preservation and Curation
Storage: Storing a raw copy of the data such that it can be retrieved at a later date with 100% accuracy.
Preservation: The active management of digital information over time to ensure its accessibility.
Curation: Maintaining, preserving and adding value to digital research data throughout its lifecycle.
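One common way to check that a stored copy can be retrieved "with 100% accuracy" is to record a checksum when the file is stored and compare it when the file is retrieved. The sketch below uses SHA-256 from the Python standard library; the function names are illustrative, and the choice of SHA-256 is an assumption rather than a requirement stated in this session.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum):
    """Return True if the file's current checksum matches the one recorded at storage time."""
    return sha256_of(path) == recorded_checksum
```

In practice the recorded checksum would be kept with the file's documentation, so that any later copy or migration can be verified against it.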
How to manage research data Access, sharing and re-use Brainstorm – constraints on data sharing Possible solutions / approaches
How to manage research data Access, sharing and re-use
Ethics
- Ethical review – research with people – balancing data protection with data sharing
- Informed consent – current and future use
- Confidentiality – is anonymisation appropriate?
- Access control – who, what, when?
IPR
- Copyright – clarify before research starts
- Licensing options – CC, ODC, End User Licence
How to manage research data Access, sharing and re-use
Ways of sharing data – repositories
- subject data repository – Archaeology Data Service
- national data repository – UK Data Archive
- interdisciplinary (including negative data) – FigShare
- institutional data repositories – Edinburgh DataShare – Leeds
Advantages
- permanent / stable
- findable
- citable
- safe and controlled environment
How to manage research data Access, sharing and re-use Further information: Secretariat / RIS - actice.html IPR Policy Code of Practice on Data Protection – checklist Research ethics policy (2013) UK Data Archive - Digital Curation Centre - Funders – e.g.
How to manage research data Access, Sharing and re-use Emerging scholarly practice Credit for re-use: University of London. Institute of Education. Centre for Longitudinal Studies (2010): National Child Development Study: Sample of Essays (Sweep 2, Age 11), UK Data Archive. Higher profile for research data: Data Citation Index
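The citation on this slide follows a simple pattern: creator, year, title, then the archive that holds the data. A minimal sketch of that pattern is shown below; the function and parameter names are hypothetical, and individual repositories may ask for extra elements (such as a persistent identifier) that are not shown here.

```python
def format_data_citation(creator, year, title, publisher):
    """Build a data citation in the 'Creator (Year): Title, Publisher.' pattern."""
    return f"{creator} ({year}): {title}, {publisher}."


citation = format_data_citation(
    creator="University of London. Institute of Education. Centre for Longitudinal Studies",
    year=2010,
    title="National Child Development Study: Sample of Essays (Sweep 2, Age 11)",
    publisher="UK Data Archive",
)
print(citation)
```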
Example of data sharing
The lifecycle of research data
Research Data Lifecycle
CREATING DATA
- design research
- plan data management (formats, storage etc.)
- plan consent for sharing
- locate existing data
- collect data (experiment, observe, measure, simulate)
- capture and create metadata
PROCESSING DATA
- enter data, digitise, transcribe, translate
- check, validate, clean data
- anonymise data where necessary
- describe data
- manage and store data
ANALYSING DATA
- interpret data
- derive data
- produce research outputs
- author publications
- prepare data for preservation
PRESERVING DATA
- migrate data to best format
- migrate data to suitable medium
- back-up and store data
- create metadata and documentation
- archive data
GIVING ACCESS TO DATA
- distribute data
- share data
- control access
- establish copyright
- promote data
RE-USING DATA
- follow-up research
- new research
- undertake research reviews
- scrutinise findings
- teach and learn
Source: UK Data Archive Research Data Lifecycle
Data Management Planning
4. A data management plan that explicitly addresses the capture, management, integrity, confidentiality, preservation, sharing and publication of research data must be created for each proposed research project or funding application. Sufficient metadata shall also be created and stored to aid discovery and re-use. Data management plans should take account of and ensure compliance with relevant legislative frameworks which may limit public access to the data (for example, in the areas of data protection, intellectual property and human rights). University of Leeds Research Data Management Policy
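The policy clause lists seven aspects a data management plan must address: capture, management, integrity, confidentiality, preservation, sharing and publication. As a rough self-check when drafting a plan, the sketch below reports which of these aspects have no text yet. The idea of representing a draft plan as a dictionary of section texts, and the function name, are assumptions for illustration; this is not a University of Leeds tool.

```python
# The seven aspects named in the policy clause above.
REQUIRED_TOPICS = (
    "capture",
    "management",
    "integrity",
    "confidentiality",
    "preservation",
    "sharing",
    "publication",
)


def missing_topics(plan_sections):
    """Return the required topics that are absent or blank in a draft plan.

    plan_sections maps a topic name to the text written for it (hypothetical layout).
    """
    return [
        topic
        for topic in REQUIRED_TOPICS
        if not plan_sections.get(topic, "").strip()
    ]


draft = {
    "capture": "Interviews recorded as WAV and transcribed.",
    "preservation": "Deposit with a national data service at project end.",
    "sharing": "   ",  # blank sections count as missing
}
print(missing_topics(draft))
```

A check like this only confirms that each aspect has been written about; it says nothing about whether the plan is adequate or compliant with the relevant legislation.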
DMP Example Exercise Look at two examples of DMPs (page 23 and page 25). Identify strengths and weaknesses of each
Data Management Planning
Approaches to Preserving and Managing Research Data
Levels of managing data Use metadata tools Structure files Document the research
Consider
- Data format
  - File formats
- Organising files
  - Structure
  - File names
- Data storage
  - Volume
  - Format (optical, magnetic)
  - Security
  - Backup
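Two of the points to consider, file names and folder structure, are easy to make consistent with a small script. The sketch below builds file names from a date, project, description and version number, and creates a standard set of project folders. The naming pattern and the folder names are assumptions chosen for illustration; what matters is agreeing one convention and applying it consistently.

```python
import re
from datetime import date
from pathlib import Path


def build_file_name(project, description, when, version, extension):
    """Build a name like 2013-05-14_soil-survey_site-a-ph-readings_v02.csv."""
    def slug(text):
        # Lower-case the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{when.isoformat()}_{slug(project)}_{slug(description)}"
        f"_v{version:02d}.{extension.lstrip('.')}"
    )


def create_project_folders(root, subfolders=("raw", "processed", "documentation", "outputs")):
    """Create the project folders under root (if missing) and return their paths."""
    created = []
    for name in subfolders:
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


print(build_file_name("Soil Survey", "Site A pH readings", date(2013, 5, 14), 2, ".csv"))
# prints 2013-05-14_soil-survey_site-a-ph-readings_v02.csv
```

Putting the date first in ISO format means files sort chronologically, and a zero-padded version number keeps versions in order beyond v09.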
What is your experience of finding files? Challenges? Solutions?
Summary
- Publicly funded research should be for the common good
- Managing research data is part of good research practice
- The data allows you to justify your research findings
- Enables you to more easily find and re-use your research data
- Managed data can be shared with others
- A research data management plan helps in achieving this
- Be clear about who is responsible for research data and its management
References Cited papers Digital Curation Centre MIT Libraries on Data Management and Publishing Research Councils UK on Governance of Good Research UK Data Archive University of Leeds on Safeguarding data Research Data Management Data Protection MANTRA research data training, University of Edinburgh