The Role of Trustworthy Digital Repositories in Sustainability

1 The Role of Trustworthy Digital Repositories in Sustainability
David Giaretta and Big Data to Knowledge AHM & Open Data Science Symposium 29 Nov – 1 Dec 2016

2 Interoperability, Re-use, Preservation and Sustainability
Exploitation / Re-use. Replication of results. VALUE. Usability. Interoperability.
What do the bits mean? Need “metadata”. What kinds? How much of each kind?
Preservation “metadata”. Sustainability.
Usability underpins re-use, interoperability, replication of results and also PRESERVATION!
The EU Commissioner for the Digital Agenda said: “Data is the new Gold”, but:
Gold is precious because it is rare, and does not combine.
Data is precious because there is so much, and it becomes more valuable when it is combined.

3 Digitally encoded information – 1’s and 0’s
Let us look at some bits …. What could they mean?
BITS:
HEX: 4e 4d 51 4d 50 4a 20 20
Assuming “big-endian”, these could be two IEEE real numbers, or two 32 bit integers.
Actually ASCII characters: NMQMPJ, which was my flight reference.
Example: “ca fe ba be” at start indicates a Java class file.
What does this mean?
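The point of this slide can be sketched in a few lines of Python: the same eight bytes give three different answers depending on the interpretation chosen. This is only an illustration. The choice of unsigned integers and of 32 bit IEEE 754 floats is an assumption (the slide's exact wording is not preserved), and `looks_like_java_class` is a hypothetical helper name.

```python
import struct

# The eight bytes shown on the slide, written as hex.
raw = bytes.fromhex("4e4d514d504a2020")

# Reading 1: two big-endian 32 bit integers (unsigned is an assumption).
as_ints = struct.unpack(">2I", raw)

# Reading 2: two big-endian IEEE 754 real numbers (32 bit width is an assumption).
as_floats = struct.unpack(">2f", raw)

# Reading 3: eight ASCII characters.
as_text = raw.decode("ascii")


def looks_like_java_class(data: bytes) -> bool:
    """Hypothetical helper: check for the 'ca fe ba be' marker at the start."""
    return data[:4] == bytes.fromhex("cafebabe")


print(as_ints)
print(as_floats)
print(repr(as_text))
print(looks_like_java_class(raw))
```

Nothing in the bytes themselves says which reading is right; that knowledge has to be recorded somewhere alongside the data.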

4 …semantics …
Could be Findable and Accessible: encoded as a Comma Separated Value (CSV) file in ASCII or Unicode, or encoded with XML markup.
Can anyone guess what this table means?
Longitude Latitude Ozone Date
132 50 34.9 12/03/1999T17:20:43.1
178 45 12/03/1999T19:37:52.7
180 78 12/03/1999T21:16:23.9
This could be encoded as bits, for example in ASCII. One might guess from the columns that this is a table. The right hand symbols look like a date and time, but in what time zone, etc.? What about the rest?
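The ambiguity in the Date column can be made concrete. The sketch below parses the first value from the table under two layouts, day first and month first; both layouts are assumptions, since the slide does not say which one the data producer intended.

```python
from datetime import datetime

# First Date value from the table on the slide.
stamp = "12/03/1999T17:20:43.1"

# Two equally plausible layouts; which one was meant is an assumption.
day_first = datetime.strptime(stamp, "%d/%m/%YT%H:%M:%S.%f")
month_first = datetime.strptime(stamp, "%m/%d/%YT%H:%M:%S.%f")

print(day_first.date())    # read as 12 March 1999
print(month_first.date())  # read as 3 December 1999

# Neither reading carries a time zone: the string does not contain one.
print(day_first.tzinfo)
```

Both calls succeed, so the format alone cannot settle the question. The intended layout, the time zone, and the units of the Ozone column all have to come from additional information kept with the data.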

5 OAIS (ISO 14721) and digital preservation
The Reference Model for an Open Archival Information System (OAIS) provides a very general approach.
The OAIS approach to digital preservation:
  covers all types of digitally encoded information
  provides a way to test whether preservation is successful
  does not require seeing into the future
  does require transparency (be clear what is being promised) but does not require “open access”
Very widely accepted; provides the basis for pretty well all work in digital preservation.
OAIS provides a good basis for certification.
Available free from
Here is a very brief overview of OAIS.
It was important from the start to define a vocabulary that could be used across disciplines, to allow everyone to understand each other when they described or intercompared their archives.
Of course there are other collections of terms. The relations between the OAIS terms and these others can be expressed using SKOS (Simple Knowledge Organization System), which allows one to say whether one term is broader or narrower than another, or “related” to another.
It is important to realise that OAIS covers the fundamental concepts of digital preservation BUT does not include such things as funding. It is NOT a DESIGN for a repository, but could be used as a checklist.
The other fundamental aims were:
  to be applicable to ANY digital object
  it must be TESTABLE
  no NEED to FORESEE the future

6 Preserving digitally encoded information
Using/understanding the bits requires what OAIS calls “Representation Information”: anything needed to allow the data to be interpreted by software or people. This certainly requires semantics and many other things.
Additional things such as software which are readily available now may not be available in future.
If the bits are unchanged we can keep hashes and be pretty sure of authenticity.
If we have to change the bits, e.g. transform to another format, then:
  Evidence of Authenticity needs care
  it probably needs other software etc.
It may be that the information must be handed over to a different system and/or a different organisation.
Need to take care of the details, which tend to be ignored.
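The remark about hashes can be sketched as a simple fixity check. This is a minimal illustration, not a description of any particular repository's procedure: SHA-256 is an assumed choice (the slide says only "hashes"), and `fixity` and `bits_unchanged` are hypothetical helper names.

```python
import hashlib


def fixity(data: bytes) -> str:
    """Return a SHA-256 digest of the bits (algorithm choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def bits_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current bits against a digest recorded at ingest."""
    return fixity(data) == recorded_digest


# One row of the example table, stored as CSV at ingest time.
original = b"132,50,34.9,12/03/1999T17:20:43.1\n"
recorded = fixity(original)

# Later: a transformation to another format (here, tab separated) changes the bits.
transformed = original.replace(b",", b"\t")

print(bits_unchanged(original, recorded))     # the stored bits still match
print(bits_unchanged(transformed, recorded))  # the transformed bits do not
```

The second check fails even though the information content is the same, which is why a hash alone is not sufficient Evidence of Authenticity once the bits have to be transformed.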

7 Partial Representation Information Network for MERIS Level 2 data

8 Role of people (and automated systems)
Creation of data and capture/creation of the metadata required for use/exploitation now and into the future
Follow “Active” Data Management Plans (RDA and CCSDS/ISO)
Funding, Management and Operation of the repository
Defines the “Designated Community”, e.g. people who understand a particular sub-discipline
Undertakes preservation activities for the data, ensuring that the data will be usable by members of the Designated Community despite changes in h/w, s/w, environment etc.
Use the data (including by the Designated Community)
Exploit and create value from the data
Judge the value of the data

9 Many types of Audit and Certification
ISO focuses on keeping the Information understandable / usable
  based on OAIS concepts, including usability
  100+ metrics covering all aspects of the repository, to ensure the auditor looks at the details
  uses the ISO certification process on which our lives depend in so many areas, e.g. medical equipment, food safety, airlines, automobiles etc.; 3rd party visits and evaluation
ISO type audits focus on keeping the bits safe in the context of the needs of the organisation
  the information is an asset of the business; what happens after the organisation ceases to exist is of no concern
  Security certification may be needed for any information that can be used to identify an individual
DIN 31644
  audit and certification process not clear
ISO – Records Management
  No formal audit process
World Data System and Data Seal of Approval
  Small set (16) of metrics, not detailed
  Recognised as much “lower” than ISO (DSA as “bronze” and ISO as “gold”)
Other standards are concerned with information, but OAIS takes a different point of view, which is expressed in a rather extreme way here.

10 ISO Standards for certification
ISO 16363: Audit and Certification of Trustworthy Digital Repositories
  Available free from
ISO 16919: Requirements for Bodies Providing Audit and Certification of Trustworthy Digital Repositories
  Available free from
  Used for accreditation of auditors by National Accreditation Bodies
  Auditors available early next year
These standards are written so that they can be used to audit and certify digital repositories within the ISO process on which all our lives depend.

11 Sustainability and Trustworthiness
Requires resources ($ / £ / …)
Are the resources being well spent? Will the data be usable?
Is the Value (or the potential value likely to be derived) worth the Cost?
  An important factor in appraisal: cannot preserve everything
There are economies of scale
There are limits to the availability of expertise
Competition between repositories?
Trustworthiness is a way to choose between repositories
ISO certification requires detailed evidence and is fundamentally linked to usability, from which value, and hence sustainability, is derived

12 Useful Links
OAIS WEB pages:
Site to gather proposals for OAIS updates in 2017:
ISO 16363:
Integrated GLOSSARY of digital preservation: SKOS ontology to show relationship between terms from different glossaries (OAIS, APARSEN, DPC, ANZ, SNIA, INTERPARES, ISO16363)
Active Data Management Plans:
  CCSDS/ISO
  Research Data Alliance:
Me:

13 END
