Dependency Management

Presentation transcript:

Dependency Management. APARSEN Training “Access & Usability”, Florence, 17 – 18 September 2014. rene.van.horik@dans.knaw.nl

Outline: Digital Preservation Strategies; Interoperability; Automatic Reasoning (for digital preservation). Deliverable 25.1 “Interoperability Objectives and Approaches” (145 pages) and Deliverable 25.2 “Interoperability Strategies” (79 pages) are available on the project website www.aparsen.eu. Outline of presentation: What is quality? Attention for “Context”. Models that provide context. Data management is important for digital preservation and helps to measure and assess quality. Data management is the main topic of the online course “Essentials 4 Data Support”.

Digital Preservation: What is it? Threats to future access to digital information: format obsolescence; not possible to render object; operating system obsolescence; hardware failure.

Digital Preservation Strategies: technology preservation strategy; technology emulation strategy; digital information migration strategy.

Interoperability. What is interoperability? (exercise 1) Why is it important for digital preservation? DP = interoperability with the future. It enables use and exchange of information and knowledge. It avoids “vendor lock-in” (open standards). It implies standardization and “trust”.

Automatic Reasoning. Digital preservation has a lot of dependencies. Are the assumptions still valid in the future? E.g.: documentation is understandable; file format is still usable; digital archive still has funding; …

Can the OAIS Reference Model help us? YES! The OAIS reference model will help us to be precise and unambiguous. More specific please…

Designated Community: An identified group of potential Consumers who should be able to understand a particular set of information. The Designated Community may be composed of multiple user communities. A Designated Community is defined by the Archive and this definition may change over time.

Representation Information: The information that maps a Data Object into more meaningful concepts. An example of Representation Information for a bit sequence which is a FITS file might consist of the FITS standard, which defines the format, plus a dictionary which defines the meaning in the file of keywords which are not part of the standard. Another example is JPEG software which is used to render a JPEG file; rendering the JPEG file as bits is not very meaningful to humans, but the software, which embodies an understanding of the JPEG standard, maps the bits into pixels which can then be rendered as an image for human viewing.

Example. Digital object: digitized mediaeval charter. Designated community: historians (mediaevalists). Representation information: Digital image in JPEG -> JPEG standard; JPEG software. Transcription in PDF -> PDF standard; PDF viewer. Content in Latin -> English-Latin dictionary. Annotations in XML -> ASCII standard. …
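
The charter example can be written down as a small data structure. The following is a minimal Python sketch, not part of the original slides: the class name `Component`, the function `missing_info` and the set `archive_holdings` are hypothetical, and the holdings are invented purely to show a gap being detected.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Component:
    """One part of the digital object and the Representation Information it needs."""
    name: str
    representation_info: List[str] = field(default_factory=list)


# The four components listed on the slide.
charter = [
    Component("Digital image in JPEG", ["JPEG standard", "JPEG software"]),
    Component("Transcription in PDF", ["PDF standard", "PDF viewer"]),
    Component("Content in Latin", ["English-Latin dictionary"]),
    Component("Annotations in XML", ["ASCII standard"]),
]


def missing_info(components: List[Component], available: Set[str]) -> Dict[str, List[str]]:
    """Return, per component, the Representation Information that is not available."""
    gaps = {}
    for component in components:
        lacking = [r for r in component.representation_info if r not in available]
        if lacking:
            gaps[component.name] = lacking
    return gaps


# Assumption for illustration: what the archive (or the community) already has.
archive_holdings = {"JPEG standard", "JPEG software", "PDF standard", "ASCII standard"}
gaps = missing_info(charter, archive_holdings)
print(gaps)
```

Which items count as "available" depends on the assumptions made about the Designated Community; mediaevalists who read Latin, for instance, might not need the dictionary at all.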

Exercise. Take a digital object as an example. This object has to be preserved. First give a description of the digital object. Describe the Designated Community (which assumptions do you make?). Which Representation Information is required in order for the Designated Community to be able to understand the digital object?

Automatic Reasoning: EPIMENIDES. System developed by FORTH. Use cases provided by DANS. Demonstrator: http://www.ics.forth.gr/isl/epimenides

Example. 2014: We consider PDF a durable format (facts and rules). 2014: We create a knowledge base expressing our knowledge of the PDF format (e.g. software to convert to PDF; software to check whether PDF files are not corrupt and can still perform their task). 2015 – 2023: We maintain and apply the knowledge base. 2024: The knowledge base changes drastically, e.g. because the PDF format is obsolete. 2024: As we have stored the dependencies in a system, we know which risks threaten the usability of PDF files and we can take measures (e.g. migration or emulation). (The knowledge base is created and maintained in the Epimenides system.)
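
The timeline can be illustrated with a toy knowledge base. This is a sketch under stated assumptions, not how Epimenides stores its facts: the file names, the sets `kb_2014` and `kb_2024`, and the function `at_risk` are hypothetical, and the only rule modelled is "a file is at risk when its format is no longer recorded as durable".

```python
# Hypothetical holdings: file name -> format.
FILES = {"report.pdf": "PDF", "notes.txt": "TXT"}


def at_risk(files, durable_formats):
    """Apply the rule: a file is at risk if its format is not recorded as durable."""
    return sorted(name for name, fmt in files.items() if fmt not in durable_formats)


# 2014: the knowledge base records PDF (and plain text) as durable formats.
kb_2014 = {"PDF", "TXT"}

# 2024 (hypothetical scenario from the slide): PDF is declared obsolete.
kb_2024 = kb_2014 - {"PDF"}

print(at_risk(FILES, kb_2014))
print(at_risk(FILES, kb_2024))
```

Because the rule is applied to the stored facts, changing one fact is enough to find every affected file; nobody has to inspect the holdings by hand.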

Use Cases (slides by FORTH): for plain users; for archivists.

For plain users: upload your own digital objects. The user uploads a file or a zipped bundle of files.

The system finds the tasks that usually make sense to apply to the uploaded digital objects: rendering for this .txt file; runnability for this .exe file. Requesting performability checking.

Getting the results of the dependency analysis (the results of the automatic reasoning). Reds: inability to perform this task on this file. Greens: ability to perform these tasks over these objects.
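
A performability check of this kind can be sketched as a walk over a dependency graph. The code below is an illustration only: the graph `DEPENDS_ON`, the set `AVAILABLE` and the functions `closure` and `check` are hypothetical and do not reproduce the rule-based reasoning of Epimenides.

```python
# Hypothetical dependency graph: item -> the items it directly depends on.
DEPENDS_ON = {
    "render:notes.txt": ["text editor"],
    "run:tool.exe": ["Windows"],
    "text editor": ["operating system"],
}

# Hypothetical profile: the modules the user actually has.
AVAILABLE = {"text editor", "operating system"}


def closure(item, graph):
    """Collect all direct and indirect dependencies of an item."""
    seen = set()
    stack = list(graph.get(item, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def check(task):
    """Return ('green', set()) if the task is performable, else ('red', missing)."""
    missing = closure(task, DEPENDS_ON) - AVAILABLE
    return ("green", set()) if not missing else ("red", missing)


print(check("render:notes.txt"))
print(check("run:tool.exe"))
```

A task is reported green only when every dependency, including the indirect ones, is available; a red result also names what is missing.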

Ability to explore the dependencies related to one task. Direct dependencies of the Rendering task.

Use Case for Archivists: Aiding the Definition of New Tasks. Name of the new task. Define the dependencies of this task.

Use Case for Archivists: Consequences of a Hypothetical Loss
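
The "hypothetical loss" question asks which tasks stop being performable if one module disappears. A minimal sketch, assuming an acyclic dependency graph; the graph contents and the names `needs` and `affected_tasks` are hypothetical:

```python
# Hypothetical dependency graph: item -> the items it directly depends on.
DEPENDS_ON = {
    "render JPEG": ["JPEG viewer"],
    "render PDF": ["PDF viewer"],
    "JPEG viewer": ["operating system"],
    "PDF viewer": ["operating system", "font library"],
}
TASKS = ["render JPEG", "render PDF"]


def needs(item, lost, graph):
    """True if item is the lost module or depends on it, directly or indirectly.

    Assumes the graph has no cycles; a cyclic graph would need a visited set.
    """
    if item == lost:
        return True
    return any(needs(dep, lost, graph) for dep in graph.get(item, []))


def affected_tasks(lost):
    """List the tasks that would no longer be performable if 'lost' disappeared."""
    return [task for task in TASKS if needs(task, lost, DEPENDS_ON)]


print(affected_tasks("font library"))
print(affected_tasks("operating system"))
```

Running such a query before retiring a piece of software shows the archivist the consequences in advance.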

Exploring the contents of the Knowledge Base. Explore the contents of the underlying RDF/S triple store.

Concluding Remarks (1/2). Each interoperability objective or challenge can be considered as a kind of demand for the performability of a particular task (or tasks). However, each task has various prerequisites for being performed (e.g. operating system, tools, software libraries, parameters). We call all of these dependencies. The definition and adoption of standards (for data and services) aids interoperability, because systems and tools that support these standards are more likely to exist, now and in the future, than systems and tools that support proprietary formats. From a dependency point of view, standardization reduces the dependencies and makes them more easily resolvable; it does not eliminate them, and sometimes standards cannot be adopted. The ultimate objective, however, is the ability to perform a task, not compliance with a standard. We therefore need an alternative approach that reduces the human effort. For these reasons we have proposed a dependency management approach that can reduce the human effort for DP services.

Concluding Remarks (2/2). An advantage of the proposed approach is that it can be used for capturing converters and emulators, the tools behind the basic preservation strategies of migration and emulation, and that it can be applied to concrete use cases. Based on this approach we have designed and implemented a proof-of-concept prototype (Epimenides) for testing whether the proposed reasoning approach behaves as expected. Since the implementation is based on W3C standards, it can be straightforwardly enriched with information coming from other external sources (i.e. SPARQL endpoints).
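
How converters enter the reasoning can be sketched as a search for a migration path. The example below is an assumption-laden illustration, not the Epimenides implementation (which the slides describe as built on W3C standards): the table `CONVERTERS`, the set `RENDERABLE` and the function `migration_path` are hypothetical.

```python
from collections import deque

# Hypothetical converters: source format -> formats it can be converted to.
CONVERTERS = {
    "WordPerfect": ["PDF"],
    "PDF": ["PNG"],
}

# Hypothetical: formats for which a working renderer is still available.
RENDERABLE = {"PNG"}


def migration_path(fmt):
    """Breadth-first search for the shortest chain of conversions to a renderable format.

    Returns the list of formats along the chain, or None if no chain exists.
    """
    queue = deque([[fmt]])
    seen = {fmt}
    while queue:
        path = queue.popleft()
        if path[-1] in RENDERABLE:
            return path
        for target in CONVERTERS.get(path[-1], []):
            if target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return None


print(migration_path("WordPerfect"))
print(migration_path("TIFF"))
```

An emulator could be modelled in the same spirit, as a module that makes an otherwise missing platform available again.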

Value of the work done. The proposed approach offers a flexible strategy for achieving interoperability by combining existing software, closing a gap that prevents the performability of a task. The automated reasoning it offers could also greatly reduce the human effort required for checking (or periodically monitoring) whether a task on a digital object is performable.

http://datasupport.researchdata.nl/en Proper data management obviously influences the quality of data, so proper training is an important issue. “Essentials 4 Data Support” aims to improve the skills of data supporters.

Network of Excellence