FITS and C3PO enhancements Paul Wheatley SPRUCE Project Manager University of

Practitioner needs
What are the main practitioner needs (in terms of supporting tools)?

SPRUCE Mashups – the impact

Theme 1: Quality Assurance
The problem:
– Some have broken data
– Some have suspected broken data
– Some intend to process data in some way, but are concerned that they cannot check whether the process breaks the data
The solution:
– A cross section of automated QA approaches is required
  How do we spot the flaws automatically? How do we fix them automatically?
  Often involves cross checking (e.g. data to metadata)
  Sometimes explorative. What actually caused the problem, and how do we prevent it?
  Every case feels unique, but often strikes a chord more widely
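
The "cross checking (data to metadata)" idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that the metadata is a mapping from relative file path to expected size in bytes; the function name `cross_check` and the report keys are hypothetical and do not come from the slides or from any SPRUCE tool.

```python
from pathlib import Path


def cross_check(data_dir: Path, metadata: dict[str, int]) -> dict[str, list[str]]:
    """Compare files on disk against metadata records.

    `metadata` maps a relative path (POSIX style) to the expected size in
    bytes. This record shape is an assumption made for the example.
    """
    # Actual state of the data: relative path -> size on disk.
    on_disk = {
        p.relative_to(data_dir).as_posix(): p.stat().st_size
        for p in data_dir.rglob("*")
        if p.is_file()
    }
    return {
        # Described in the metadata but not present on disk.
        "missing_files": sorted(set(metadata) - set(on_disk)),
        # Present on disk but not described in the metadata.
        "unlisted_files": sorted(set(on_disk) - set(metadata)),
        # Present in both, but the size differs from what was recorded.
        "size_mismatch": sorted(
            name for name in metadata
            if name in on_disk and on_disk[name] != metadata[name]
        ),
        # Zero-byte files are a common symptom of a failed copy or export.
        "empty_files": sorted(name for name, size in on_disk.items() if size == 0),
    }
```

A report like this only spots flaws; deciding what caused them, and how to fix them, remains the explorative part described above.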

Theme 2: Appraisal + Ingest preparation
The problem:
– We have digital stuff, what is it, what should I worry about, what do I do next?
– We know roughly what we've got (we've had some before) but we have a largely manual appraisal process that doesn't scale well
– How do we turn this blob of content into something we can ingest into our repository?
The solution:
– Characterisation capability needs to vastly improve
– Automatic extraction of properties / flavour of content to aid appraisal/selection
– Inform processing of data prior to ingest
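
As a rough illustration of getting the "flavour" of a collection before appraisal, the sketch below tallies file counts and total bytes per file extension. This is a deliberately crude stand-in: extensions are a weak signal, and real characterisation tools identify formats from file content. The function name `summarise` and the `(none)` label are hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarise(root: Path) -> dict[str, tuple[int, int]]:
    """Return {extension: (file_count, total_bytes)} for everything under root.

    Extensions are lower-cased; files without one are grouped as "(none)".
    """
    counts: Counter[str] = Counter()
    sizes: Counter[str] = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}
```

Even a summary this simple can show whether a "blob of content" is dominated by a handful of formats or is a long tail that needs closer manual appraisal.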

Theme 3: Identify/locate preservation worthy data
The problem:
– Institution has preservation worthy data scattered across shared server space
– Data is unmanaged, not checksummed, and often doesn't have a responsible owner
– Sorting this data from non-preservation worthy data is a challenge
The solution:
– Find it
  Tools/approaches to "smell" preservation worthy data
– Make it safe
  Checksumming, creating manifests, registering basic details with a central authority with preservation responsibility, periodically recalculating checksums. All the components are there, but not in a usable package
– Get it ready to ingest
  De-duplication, curation, management, adding metadata, other ingest preparation
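
The "make it safe" steps (checksum, build a manifest, periodically recalculate) and the de-duplication step can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions: SHA-256 is chosen for the example, the manifest is a plain in-memory dictionary, and the function names are hypothetical. Registering the manifest with a central authority is out of scope here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's relative path (POSIX style) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Recalculate checksums and report differences from a stored manifest."""
    current = build_manifest(root)
    return {
        "changed": sorted(p for p in manifest if p in current and current[p] != manifest[p]),
        "missing": sorted(p for p in manifest if p not in current),
        "new": sorted(p for p in current if p not in manifest),
    }


def find_duplicates(manifest: dict[str, str]) -> list[list[str]]:
    """Group paths with identical digests; candidates for de-duplication."""
    by_digest: dict[str, list[str]] = {}
    for path, digest in manifest.items():
        by_digest.setdefault(digest, []).append(path)
    return sorted(sorted(paths) for paths in by_digest.values() if len(paths) > 1)
```

In practice the manifest would be written to disk and `verify_manifest` run on a schedule, so that silent changes and losses are detected while a good copy may still exist.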

Theme 4: Conformance to institutional profile/policy
The problem:
– Institution has policy driven requirements for the shape of its content, defined by specific profiles
– Does data conform to these profiles?
– If not (in some cases), can it be made to conform?
The solution:
– Conformance checking focused characterisation and validation
– Modification of content + associated QA
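
Conformance checking can be pictured as comparing the properties extracted for each file against the set of values a profile allows. The sketch below assumes the properties are already available as a flat dictionary (for example, from a characterisation tool's output); the property names and the sample profile are invented for illustration and do not represent any institution's actual policy.

```python
def check_conformance(record: dict, profile: dict) -> list[str]:
    """Return a list of human-readable failures; an empty list means conformant.

    `record`  : extracted properties for one file, e.g. {"format": "TIFF"}.
    `profile` : property name -> set of allowed values (hypothetical shape).
    """
    failures = []
    for prop, allowed in profile.items():
        value = record.get(prop)
        if value is None:
            # The property could not be extracted, so conformance is unknown.
            failures.append(f"{prop}: missing")
        elif value not in allowed:
            failures.append(f"{prop}: {value!r} not in {sorted(allowed, key=str)}")
    return failures


# Hypothetical profile for master images.
MASTER_IMAGE_PROFILE = {
    "format": {"TIFF"},
    "compression": {"none", "LZW"},
    "bits_per_sample": {8, 16},
}
```

Files that fail such a check are the candidates for modification, and the same check can then serve as the associated QA step by being re-run on the modified content.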

Theme 5: Identify preservation risks
The problem:
– Data is in the repository, what risks does it face?
– Some worry about whether they should be migrating their content
– Some specifically want to format migrate and want help doing it
– Root of problem is: what are the risks?
– Risks themselves not well understood
– Woeful tool provision to assist in automated risk assessment
The solution:
– Tools/approaches for identifying specific preservation risks in digital data
– Logical progression is then for planning, action and QA

Overall challenge - characterisation
Summary of 5 main challenges:
– Quality Assurance
– Appraisal and ingest preparation
– Identify/locate preservation worthy data
– Conformance to profiles/policy
– Identify preservation risks
Conclusion:
– Practitioners need better characterisation capability
– In other words they need better automated ways to understand their data

FITS and C3PO
FITS
– Assesses content, identifies characteristics and extracts metadata
C3PO
– Provides a visual interface to navigate and understand the data extracted by FITS
What did we do?
– Updated FITS functionality: better coverage, uses the latest tools
– Addressed tool maintenance by providing the infrastructure to make the tools community maintainable