

Services for Object Storage and Preservation. March 2008. All content in these slides is work in progress. It does not represent an absolute view of any final end product and, at this stage, should be considered purely a set of realistic ideas.

Outline: The StorageTek 5800 (the Honeycomb) provides high-resilience data storage with a built-in metadata layer. EPrints is repository software for managing large collections of digital objects and their related metadata.

EPrints: Open source repository software that provides open access to institutional output. It is a powerful plugin-based package that can easily be extended at any layer to suit a user's requirements. There are two types of archive: those used to manage publications and small objects, and those used to deposit large objects. The latter tend to be more heavily customised.

Preserv2: Preserv2 is the second iteration of a project looking at preservation services for repositories. It goes beyond simple backup to cover format renderers, format translation, risk assessment, interoperability and long-term storage.

Why use a Honeycomb? A Honeycomb is not just a “Big Disk”. It offers a service-based architecture: big objects, big storage, more powerful plugins/services. Smaller repositories can jointly use a single Honeycomb as a “Preservation Service”. Preservation service providers can combine several servers into a “Honeycomb Cloud”.

EPrints Architecture [diagram]: EPrints (Repository) Layer; Object Storage; Metadata Storage.

EPrints and Honeycomb [diagram]: EPrints (Repository) Layer; STK5800 Honeycomb.

Services for Repositories [diagram]: EPrints (Repository) Layer; Metadata Services; Storage Beans; Automated Wide Area Backup.

Metadata Services: Metadata has the same resilience as the data. This averts the need to store a file ID or URL somewhere in order to find an object. It enables collections to be constructed by independent parties. Objects can be exported accurately into many formats.
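The idea of finding objects through the metadata layer, rather than through a stored file ID or URL, can be sketched as follows. This is a minimal in-memory illustration, not the Honeycomb API: the `MetadataStore` class, its `store` and `query` methods, and the metadata keys are all hypothetical names chosen for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Hypothetical in-memory stand-in for storage with a built-in metadata layer."""
    _objects: dict = field(default_factory=dict)   # object id -> bytes
    _metadata: dict = field(default_factory=dict)  # object id -> metadata dict

    def store(self, data: bytes, metadata: dict) -> str:
        # The id is derived from the content; callers do not need to keep it.
        oid = hashlib.sha256(data).hexdigest()
        self._objects[oid] = data
        self._metadata[oid] = dict(metadata)
        return oid

    def query(self, **criteria) -> list:
        """Return the data of every object whose metadata matches all criteria."""
        return [
            self._objects[oid]
            for oid, meta in self._metadata.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


store = MetadataStore()
store.store(b"thesis pdf bytes", {"creator": "A. Author", "type": "thesis"})
store.store(b"article pdf bytes", {"creator": "A. Author", "type": "article"})

# An independent party builds a collection purely from metadata.
theses = store.query(type="thesis")
by_author = store.query(creator="A. Author")
```

Because the query runs against metadata held alongside the objects, the repository layer does not have to maintain its own table of object locations.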

Storage Beans: Can perform operations on the objects in the system without relying on the repository to manage these processes (e.g. object translation). Preservation services can give repository administrators feedback on potential risks to their objects (e.g. object classification, age). Can be used to extend the metadata layer to provide more powerful access to objects and their parts/pages (e.g. “retrieve page 10 of volume 6 of X”).
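One way to picture a storage bean is as a small function that runs next to the stored objects and reports back, independently of the repository. The sketch below is an assumption about how such a service could be shaped, not the Honeycomb programming model: `StoredObject`, `format_risk_bean`, `age_bean`, `run_beans` and the format name `legacy-doc` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StoredObject:
    name: str
    fmt: str
    age_years: float


# Hypothetical list of at-risk formats, for illustration only.
AT_RISK_FORMATS = {"legacy-doc"}


def format_risk_bean(obj: StoredObject) -> Optional[str]:
    """Flag objects whose format may need translation."""
    if obj.fmt in AT_RISK_FORMATS:
        return f"{obj.name}: format '{obj.fmt}' may need translation"
    return None


def age_bean(obj: StoredObject, max_age_years: float = 10.0) -> Optional[str]:
    """Flag objects older than an assumed age threshold."""
    if obj.age_years > max_age_years:
        return f"{obj.name}: older than {max_age_years:g} years"
    return None


def run_beans(objects: List[StoredObject],
              beans: List[Callable[[StoredObject], Optional[str]]]) -> List[str]:
    """Run every bean over every object and collect feedback for administrators."""
    report = []
    for obj in objects:
        for bean in beans:
            finding = bean(obj)
            if finding is not None:
                report.append(finding)
    return report


objects = [
    StoredObject("report-1", "pdf", 2.0),
    StoredObject("memo-7", "legacy-doc", 12.0),
]
report = run_beans(objects, [format_risk_bean, age_bean])
```

The repository only consumes the resulting report; the classification and age checks themselves run in the storage layer.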

Wide Area Replication (Backup): Two or more Honeycombs can be linked over a wide area to provide mirrored backup. This can be implemented by the archive, which can store its objects in a “Honeycomb Cloud”.
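Mirrored backup between two stores can be illustrated with a checksum comparison: copy whatever is missing or different on the replica. This is a toy sketch under the assumption that each site can be modelled as a dictionary of object name to bytes; the `mirror` function is hypothetical, and real wide-area replication would also have to handle transport, failures and deletions.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Content fingerprint used to detect missing or diverged copies."""
    return hashlib.sha256(data).hexdigest()


def mirror(primary: dict, replica: dict) -> list:
    """Copy objects that are missing or differ on the replica; return names copied."""
    copied = []
    for name, data in primary.items():
        if name not in replica or checksum(replica[name]) != checksum(data):
            replica[name] = data
            copied.append(name)
    return copied


site_a = {"obj-1": b"alpha", "obj-2": b"beta"}
site_b = {"obj-1": b"alpha"}

first_pass = mirror(site_a, site_b)   # copies only what site_b lacks
second_pass = mirror(site_a, site_b)  # nothing left to copy
```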

Possible Architectures (2) Repository

Possible Architectures (3) Repository

Possible Architectures (4) Repository

Preservation Services: A “Honeycomb Cloud” provides the basis for a preservation service that can be offered to many small-scale (<200 GB) repositories. Options for object storage: (1) store objects locally, with the Honeycomb acting purely as a preservation service; (2) hand all object storage and retrieval to the Honeycomb Cloud; (3) a half-and-half solution, in which either small objects are served locally and large objects from the Honeycomb, or recent and popular objects are served locally and older objects are considered preserved.
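The half-and-half option amounts to a routing policy that decides, per object, whether it is served locally or from the Honeycomb. The sketch below combines both variants from the slide into one rule. All thresholds (`LARGE_OBJECT_BYTES`, `RECENT_DAYS`, `POPULAR_DOWNLOADS`) and the names `ObjectInfo` and `serve_from` are illustrative assumptions; the slides give no concrete values.

```python
from dataclasses import dataclass


@dataclass
class ObjectInfo:
    size_bytes: int
    age_days: int
    recent_downloads: int


# Illustrative thresholds only; a real deployment would tune these.
LARGE_OBJECT_BYTES = 100 * 1024**2   # treat 100 MiB and above as "large"
RECENT_DAYS = 180                    # treat the last ~6 months as "recent"
POPULAR_DOWNLOADS = 10               # treat 10+ recent downloads as "popular"


def serve_from(obj: ObjectInfo) -> str:
    """Return 'local' or 'honeycomb' for a single object."""
    # Large objects always come from the Honeycomb.
    if obj.size_bytes >= LARGE_OBJECT_BYTES:
        return "honeycomb"
    # Small objects stay local while they are recent or popular.
    if obj.age_days <= RECENT_DAYS or obj.recent_downloads >= POPULAR_DOWNLOADS:
        return "local"
    # Older, unpopular objects are considered preserved.
    return "honeycomb"


small_recent = serve_from(ObjectInfo(size_bytes=2_000_000, age_days=30, recent_downloads=0))
small_old = serve_from(ObjectInfo(size_bytes=2_000_000, age_days=900, recent_downloads=1))
large_recent = serve_from(ObjectInfo(size_bytes=5 * 1024**3, age_days=1, recent_downloads=50))
```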

The out-of-the-box repository solution for large repositories.

Thumpers (“Big Disk”): The Thumper system (STK 4500) is essentially a “Big Disk” server and an “out of the box” solution. Expansions: services to enable replication between two Thumpers; preservation services using a Honeycomb. Aimed at repositories where tape backup is not ideal.

ECrystals (Possible Use Case): A large chemistry repository which currently stores only processed result objects (small). These result files are generated from raw datasets of more than 1 GB each. 8+ datasets are generated a day. After 6 months, result sets are of less worth. This represents roughly 1 TB of raw data in a 6-month period.
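The 1 TB figure can be sanity-checked from the numbers on the slide. The calculation below takes the stated minimums (8 datasets a day, 1 GB each) and two assumed day counts for a 6-month period, since the slide does not say whether datasets are generated every calendar day or only on working days.

```python
def raw_volume_gb(datasets_per_day: int, dataset_size_gb: int, days: int) -> int:
    """Lower-bound raw data volume in GB over the given number of days."""
    return datasets_per_day * dataset_size_gb * days


DATASETS_PER_DAY = 8    # "8+" on the slide, so this is a minimum
DATASET_SIZE_GB = 1     # ">1 GB" on the slide, so this is a minimum

CALENDAR_DAYS = 182     # assumption: every day for 6 months
WORKING_DAYS = 130      # assumption: 5 days a week for 26 weeks

calendar_gb = raw_volume_gb(DATASETS_PER_DAY, DATASET_SIZE_GB, CALENDAR_DAYS)
working_gb = raw_volume_gb(DATASETS_PER_DAY, DATASET_SIZE_GB, WORKING_DAYS)

print(f"calendar days: {calendar_gb} GB, working days: {working_gb} GB")
```

Under either assumption the lower bound lands between about 1 TB and 1.5 TB, which agrees in order of magnitude with the figure quoted on the slide.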

ECrystals – Single Honeycomb Architecture: The current repository remains; all result sets are stored on the Honeycomb. Pros: simple architecture; sole use of a Honeycomb; a year of “on-site” storage. Cons: cost; backup procedure? [Diagram: EPrints (Repository) Layer]

ECrystals – Thumper with “Honeycomb Cloud”: Pros: single local machine; 6+ months of data locally accessible; automated preservation; preservation services managed by the Honeycomb Cloud; Storage Beans on the Honeycomb Cloud compress older/less popular objects. Cons: ? [Diagram: EPrints (Repository) Layer; Thumper System]

Summary: Honeycomb provides better separation of the repository layer from the storage layer; repository interoperability; and a new approach to storing and preserving data from institutional repositories based on EPrints and other software.