“Filling the digital preservation gap” an update from the Jisc Research Data Spring project at York and Hull Jenny Mitcham Digital Archivist Borthwick.

Presentation transcript:

“Filling the digital preservation gap” an update from the Jisc Research Data Spring project at York and Hull Jenny Mitcham Digital Archivist Borthwick Institute for Archives University of York 13 August 2015

Project aim “…to investigate Archivematica and explore how it might be used to provide digital preservation functionality within a wider infrastructure for Research Data Management.”

What about Hydra?
Hydra is not mentioned much in our project report... this is deliberate!
We wanted to keep our findings generic so that they are useful to a wide range of institutions who may be interested in digital preservation...
...this means we are more likely to get further funding.
However...

Project team
University of Hull:
Chris Awre – Head of Information Services, Library and Learning Innovation
Richard Green – Independent Consultant
Simon Wilson – University Archivist
University of York:
Julie Allinson – Manager, Digital York
Jen Mitcham – Digital Archivist

About the project
Funded as part of Jisc Research Data Spring
Started 30th March 2015
Phase 1 is complete
Phase 2 has just started and will run until November
…and we hope phase 3 will be funded

Project structure
Phase 1 – explore: testing, research, thinking; produce a report (3 months)
Phase 2 – develop: make Archivematica better for RDM, plan implementation (4 months)
Phase 3 – implement: set up proofs of concept at York and Hull (6 months)

Phase 1 – The key questions
Why?
Why are we bothering to 'preserve' research data? What are the drivers here and what are the risks if we don't?
Why are we looking at Archivematica?
What?
What are the characteristics of research data and how might it differ from other born-digital data that memory institutions are establishing digital archives to manage and preserve?
What types of files are our researchers producing and how would Archivematica handle these?
What does Archivematica offer us and what benefits does it bring?
How?
How would we incorporate Archivematica into a wider technical infrastructure for research data management and what workflows would we put in place?
Where would it sit and what other systems would it need to talk to?
How can we improve Archivematica for RDM?
Who?
Who else is using Archivematica (or other digital preservation systems) to do similar things and what can we learn from them?
What staff resource is needed to preserve research data with Archivematica?

Why Archivematica? “The goal of the Archivematica project is to give archivists and librarians with limited technical and financial capacity the tools, methodology and confidence to begin preserving digital information today.”

Why Archivematica?
Standards-based
Open Source
Flexible and customisable
Compatible with hundreds of file formats
Advanced search and storage management
Integrated with third-party systems
From

What does research data look like? York RDM questionnaire 2013: Please select the main types of electronic research data you generate

Top research data applications at York

The importance of identification
How well are these formats identified by digital preservation tools?
Better than expected!
Sometimes partial
Sometimes quite generic (without a version number)
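To illustrate why identification can come back "partial" or "generic", here is a toy sketch of signature-based identification in Python. It is not how any particular preservation tool works; the function name `identify` and the tiny signature table are assumptions made for this example. A PDF header carries its own version number, whereas a ZIP signature only says "some kind of ZIP container", which is why many office formats are reported generically.

```python
import re

# A deliberately tiny, illustrative signature table (not a real registry).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP container (generic: could be .zip, .docx, .xlsx, ...)"),
]

# PDF files start with a header such as "%PDF-1.4", so the version is recoverable.
PDF_HEADER = re.compile(rb"%PDF-(\d\.\d)")


def identify(header: bytes) -> str:
    """Guess a format from the first bytes of a file."""
    match = PDF_HEADER.match(header)
    if match:
        return f"PDF version {match.group(1).decode('ascii')}"
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unidentified"
```

For example, `identify(b"%PDF-1.4\n")` gives a versioned answer, while the first bytes of a .docx file only match the generic ZIP entry.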

What does research data look like?
Potentially quite big
Wide range of file formats (some well understood but a long tail of more specialist/obscure formats)
Sometimes sensitive and/or confidential
Ever changing (new software and techniques are used for dynamic and cutting-edge research)
There may be different versions of the data (as new publications are released)
Value not well understood at the point of deposit

What does Archivematica do? The short answer: “It packages data up in a standards compliant way and prepares it to be stored for the long term”

What does Archivematica do?
The longer answer:
Assigns unique identifiers
Creates a checksum for each object
Creates a text file with a directory tree of the transfer
Option to quarantine data for a specified period
Runs virus checks
Cleans up file and directory names (removing characters that may cause problems)
Runs identification tools so you can find out what file formats you have
Extracts data from zip files (or not if you would rather not)
Extracts metadata embedded in the files (if you want)
Normalises files (if a migration path exists)...
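A few of the steps above (unique identifiers, checksums, the directory tree and name clean-up) can be sketched in a handful of lines of Python. This is only an illustration of the ideas, not Archivematica's own code: the function names, the choice of SHA-256 and the set of "safe" characters are all assumptions made for this example.

```python
import hashlib
import re
import uuid
from pathlib import Path


def sanitise_name(name: str) -> str:
    """Replace characters outside a conservative safe set with underscores."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def describe_file(path: Path) -> dict:
    """Return an identifier, a checksum and a cleaned-up name for one file."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large research data files do not fill memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "uuid": str(uuid.uuid4()),
        "sha256": digest.hexdigest(),
        "clean_name": sanitise_name(path.name),
    }


def directory_tree(root: Path) -> str:
    """List every path under root, one per line, relative to root."""
    return "\n".join(sorted(str(p.relative_to(root)) for p in root.rglob("*")))
```

Recomputing the checksum later and comparing it with the stored value is what lets an archive show that a file has not changed since it was deposited.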

What does Archivematica do? The really really long answer (if you have time): Read the manual

What does Archivematica do? One final answer (honest): It gives us a greater level of confidence that we will be able to continue to provide access to usable copies of research data over the longer term

What are the downsides?
It isn’t a magic bullet
There is no guarantee your data will be readable in the future
It can only be as good as current digital preservation practice
It can be fiddly to install correctly
The GUI isn’t that intuitive
You need staff who understand it

Phase 2: ‘develop’
1. Enable better workflows for RDM (producing a DIP on request)
2. Allow the DIP (access copy of data) to be usable by different repository systems
3. Help reduce bottlenecks for big data
4. Workflows for unidentified files
5. Enable easier querying of data within Archivematica by third-party applications
6. Better documentation

Phase 2: RDM Workflows at York
We get a copy of the data from the researcher
We transfer it to Archivematica
Archivematica packages it up for storage and creates the Archival Information Package (AIP)
Archivematica sends the AIP to archival storage
Metadata is published in the data catalogue
If someone requests the data, Archivematica will create a Dissemination Information Package (DIP)
The DIP will be uploaded to the Digital Library for access
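The key design point in this workflow is that the AIP is always created, but the DIP is only created when someone asks for the data. A minimal Python sketch of that split follows. The class and function names (`Deposit`, `ingest`, `request_access`) are hypothetical and only model the order of the steps; they do not call Archivematica or any real repository API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Deposit:
    """One dataset moving through the workflow (hypothetical model)."""
    title: str
    aip_stored: bool = False
    catalogued: bool = False
    dip_published: bool = False
    events: List[str] = field(default_factory=list)


def ingest(deposit: Deposit) -> Deposit:
    """Steps that always happen when a researcher hands over data."""
    deposit.events.append("transferred to Archivematica")
    deposit.events.append("AIP created")
    deposit.events.append("AIP sent to archival storage")
    deposit.aip_stored = True
    deposit.events.append("metadata published in data catalogue")
    deposit.catalogued = True
    return deposit


def request_access(deposit: Deposit) -> Deposit:
    """Steps that happen only when someone requests the data."""
    if not deposit.aip_stored:
        raise ValueError("no AIP in storage; ingest the deposit first")
    if not deposit.dip_published:
        # The DIP is derived from the stored AIP, and only once.
        deposit.events.append("DIP created from AIP")
        deposit.events.append("DIP uploaded to Digital Library")
        deposit.dip_published = True
    return deposit
```

Deferring the DIP in this way means no access copies are made or stored for datasets that nobody ever requests.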

How do York plan to use Archivematica?

Where to find out more


Thanks for listening
You can contact me on: –