Co-ordinated by aparsen.eu #APARSEN Co-funded by the European Union under FP7-ICT-2009-6 Services and Sustainability David Giaretta,


1 Co-ordinated by aparsen.eu #APARSEN Co-funded by the European Union under FP7-ICT-2009-6 Services and Sustainability David Giaretta, director@alliancepermanentaccess.org APA APARSEN workshop, December 2013

2 Certification David Giaretta, APA Webinar, December 2013 aparsen.eu #APARSEN Co-funded by the European Union under FP7-ICT-2009-6
Outline
What digital objects do you have?
What do you mean by preservation?
- Preservation for whom?
- Why preserve?
- Drivers
What are the threats to your digital objects?
- What changes?
- What do preservation organisations find most difficult?
What can your organisation do by itself?
- What services would be useful?
- Sharing the effort
APA/APARSEN/SCIDIP-ES services
- Preservation and Value
  - Increasing value
  - Improving preservation

3 What digital objects do you have?

4 What do you mean by preservation?
- Preservation for whom?
- Why preserve?
- Drivers

5 OAIS
The Open Archival Information System (OAIS) reference model provides:
- fundamental concepts for preservation
- fundamental definitions so people can speak without confusion
- “now adopted as the de facto standard for building digital archives”
  - In Cyberinfrastructure Vision for 21st Century Discovery: http://www.nsf.gov/pubs/2007/nsf0728/nsf0728.pdf
TESTABLE

6 Key OAIS Concepts
Claiming “This is being preserved” is untestable
- Essentially meaningless
  - Except “BIT PRESERVATION”
How can we make it testable?
- Claim to be able to continue to “do something” with it
  - Understand/use
    - Need Representation Information
Still meaningless…
- Things are too interrelated
  - Representation Information potentially unlimited
- Need to define a Designated Community – those we guarantee can understand – so we can test
Many other concepts identified
Finer grained taxonomy than simply saying “metadata”
- Allows one to ask if one has all the required types
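
The point that bit preservation is the one claim that can be tested directly can be illustrated with a fixity check. The following is a minimal sketch, assuming SHA-256 as the digest (the talk does not name an algorithm); the function names sha256_digest and bits_preserved are hypothetical.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte sequence."""
    return hashlib.sha256(data).hexdigest()


def bits_preserved(data: bytes, recorded_digest: str) -> bool:
    """True if the bytes still match the digest recorded at ingest."""
    return sha256_digest(data) == recorded_digest


# Illustrative payload only; any byte sequence would do.
original = b"SIMPLE  =                    T"
recorded = sha256_digest(original)  # would be stored as fixity information

intact = bits_preserved(original, recorded)              # unchanged bytes
damaged = bits_preserved(original + b"\x00", recorded)   # one extra byte
```

A check like this says nothing about whether anyone can still understand or use the data, which is why the slide goes on to Representation Information and the Designated Community.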

7 Rep Info Network
[Figure: network of Representation Information nodes: FITS FILE, FITS DICTIONARY, FITS STANDARD, PDF SOFTWARE, JAVA VM, PDF STANDARD, FITS JAVA SOFTWARE, DICTIONARY SPECIFICATION, XML SPECIFICATION, UNICODE SPECIFICATION]
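
A Representation Information network like the one on this slide can be treated as a dependency graph, and the knowledge of a Designated Community decides where the walk through it may stop. The sketch below is only an illustration: the node names come from the slide, but the edges between them are assumptions (the arrows were lost in the transcript), and knowledge_gap is a hypothetical helper, not a SCIDIP-ES API.

```python
# Node names are from the slide; the edges are illustrative guesses.
REP_INFO_NETWORK = {
    "FITS FILE": ["FITS STANDARD", "FITS DICTIONARY", "FITS JAVA SOFTWARE"],
    "FITS STANDARD": ["PDF STANDARD"],
    "PDF STANDARD": ["PDF SOFTWARE"],
    "FITS DICTIONARY": ["DICTIONARY SPECIFICATION"],
    "DICTIONARY SPECIFICATION": ["XML SPECIFICATION"],
    "XML SPECIFICATION": ["UNICODE SPECIFICATION"],
    "FITS JAVA SOFTWARE": ["JAVA VM"],
}


def knowledge_gap(obj, network, known):
    """Return the Representation Information still needed to use obj.

    Walks the network from obj and stops at anything the Designated
    Community already knows, so the result is finite even though the
    network as a whole is potentially unlimited.
    """
    needed = set()
    stack = list(network.get(obj, []))
    while stack:
        node = stack.pop()
        if node in known or node in needed:
            continue
        needed.add(node)
        stack.extend(network.get(node, []))
    return needed


# A community assumed to know the FITS standard, XML and the Java VM.
community_knowledge = {"FITS STANDARD", "XML SPECIFICATION", "JAVA VM"}
gap = knowledge_gap("FITS FILE", REP_INFO_NETWORK, community_knowledge)
everything = knowledge_gap("FITS FILE", REP_INFO_NETWORK, set())
```

With no Designated Community defined (an empty knowledge set) the walk has to collect every node in the network, which is the "potentially unlimited" problem raised on the previous slide.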

8 Our approach
For information preservation and re-use:
- get Representation Information, or
- Transform
Alternatively, move to another repository

9 Archival Information Package
[Figure: OAIS Archival Information Package diagram]
- Archival Information Package: contains Content Information and Preservation Description Information; described by Package Description; delimited by Packaging Information
- Content Information: Data Object (Physical Object or Digital Object, the latter made of Bits) interpreted using Representation Information
- Representation Information: Structure Information, Semantic Information, Other Representation Information
- Preservation Description Information: Reference Information, Provenance Information, Context Information, Fixity Information, Access Rights Information
- Other relationship labels in the figure: further described by, derived from, identifies, adds meaning to; multiplicities 1, *, 1...*
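
The structure in this diagram can be sketched as a set of nested record types, which also shows how the finer-grained taxonomy lets one ask whether all the required types of information are present. This is a simplified illustration, not the OAIS specification itself: the class and field names are assumptions modelled on the labels in the figure, and missing_parts is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RepresentationInformation:
    structure: str   # Structure Information
    semantics: str   # Semantic Information
    other: List["RepresentationInformation"] = field(default_factory=list)


@dataclass
class ContentInformation:
    data_object: bytes
    representation_information: List[RepresentationInformation]


@dataclass
class PreservationDescriptionInformation:
    reference: str
    provenance: str
    context: str
    fixity: str
    access_rights: str


@dataclass
class ArchivalInformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging_information: str
    package_description: str

    def missing_parts(self) -> List[str]:
        """Name every empty PDI field, plus missing Representation Information."""
        missing = [name for name, value in vars(self.pdi).items() if not value]
        if not self.content.representation_information:
            missing.append("representation_information")
        return missing


# All values below are made up for illustration.
rep_info = RepresentationInformation(structure="binary table layout",
                                     semantics="column names and units")
aip = ArchivalInformationPackage(
    content=ContentInformation(data_object=b"...",
                               representation_information=[rep_info]),
    pdi=PreservationDescriptionInformation(
        reference="hypothetical-id-0001",
        provenance="",  # deliberately left empty
        context="example archive",
        fixity="digest recorded at ingest",
        access_rights="open",
    ),
    packaging_information="tar archive layout",
    package_description="example data product",
)
print(aip.missing_parts())
```

Asking only "do we have metadata?" would not catch the empty provenance field here; asking for each named type does.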

10 [Figure: boxes labelled Representation Information and Provenance, linked by the label "has"]

11 Basic preservation activities
Libraries say: “Emulate or migrate”
- Works well with data only in special cases
  - Can repeat what was done before, but not do new things
- Does not help with building a cross-disciplinary Earth Science community

12 Data contains numbers etc. – need meaning

13 ... through complex processing schemes
[Figure labels: Algorithm, Manual]

14 ... to be combined and processed to get this
[Figure labels: Level 0, Level 1, Level 2; Processing; Processing/combining]

15 International Context
Data Intensive Science (4th Paradigm): data at the centre of the scientific process.
EC Recommendation 2012/417/EU, July 2012: “Access to and preservation of scientific information”.
Data is the new gold. “We have a huge goldmine … Let’s start mining it.” Neelie Kroes, Vice-President of the European Commission responsible for the Digital Agenda

16 But…
Gold is precious because
- it is rare
- it does not combine with other elements
- it does not perish
Data is precious because
- there is so much of it
- it is more valuable when it is combined together
- it is highly perishable
Need to ensure long term preservation, accessibility, understandability and usability of data

17 What are the threats to your digital objects?
- What changes?
- What do preservation organisations find most difficult?

18 Threats
Things change…
- Hardware
- Software
- Environment
- Tacit knowledge
Things become unfamiliar

19 PARSE.Insight survey
Researchers: 1/3 Europe, 1/3 USA, 1/3 rest of world
Responses from researchers, data managers and publishers: 44% Europe, 33% USA, 23% rest of world

20 Threats to preservation (R)
- The ones we trust to look after the digital holdings may let us down
- The current custodian of the data may cease to exist
- Loss of ability to identify the location of data
- Access and use restrictions may not be respected in the future
- Evidence may be lost
- Lack of sustainable hardware/software
- Users may be unable to understand or use the data

21 Threats to preservation (R)
Users may be unable to understand or use the data, e.g. the semantics, format or algorithms involved.

22 Threat / Requirement for solution / Solution
1. Threat: Users may be unable to understand or use the data, e.g. the semantics, format, processes or algorithms involved
   Requirement for solution: Ability to create and maintain adequate Representation Information
   Solution: RepInfo toolkit, Packager and Registry – to create and store Representation Information. In addition the Orchestration Manager and Knowledge Gap Manager help to ensure that the RepInfo is adequate.
2. Threat: Non-maintainability of essential hardware, software or support environment may make the information inaccessible
   Requirement for solution: Ability to share information about the availability of hardware and software and their replacements/substitutes
   Solution: Registry and Orchestration Manager to exchange information about the obsolescence of hardware and software, amongst other changes. The Representation Information will include such things as software source code and emulators.
3. Threat: The chain of evidence may be lost and there may be lack of certainty of provenance or authenticity
   Requirement for solution: Ability to bring together evidence from diverse sources about the Authenticity of a digital object
   Solution: Authenticity toolkit will allow one to capture evidence from many sources which may be used to judge Authenticity.
4. Threat: Access and use restrictions may make it difficult to reuse data, or alternatively may not be respected in future
   Requirement for solution: Ability to deal with Digital Rights correctly in a changing and evolving environment
   Solution: Packaging toolkit to package access rights policy into AIP
5. Threat: Loss of ability to identify the location of data
   Requirement for solution: An ID resolver which is really persistent
   Solution: Persistent Identifier system: such a system will allow objects to be located over time.
6. Threat: The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future
   Requirement for solution: Brokering of organisations to hold data and the ability to package together the information needed to transfer information between organisations ready for long term preservation
   Solution: Orchestration Manager will, amongst other things, allow the exchange of information about datasets which need to be passed from one curator to another.
7. Threat: The ones we trust to look after the digital holdings may let us down
   Requirement for solution: Certification process so that one can have confidence about whom to trust to preserve data holdings over the long term
   Solution: Certification toolkit to help repository manager capture evidence for ISO 16363 Audit and Certification

23 Hand-over
Preservation requires funding
Funding for a dataset (or a repository) may stop
Need to be ready to hand over everything needed for preservation
- OAIS (ISO 14721) defines the “Archival Information Package” (AIP).
- Issues:
  - Storage naming conventions
  - Representation Information
  - Provenance
  - ….

24 When things change
We need to:
- Know something has changed
- Identify the implications of that change
- Decide on the best course of action for preservation
- What RepInfo we need to fill the gaps
  - Created by someone else or creating a new one
- If transformed: how to maintain data authenticity
- Alternatively: hand it over to another repository
- Make sure data continues to be usable
[Figure labels: Orchestration Service, Gap Identification Service, Preservation Strategy Toolkit, RepInfo Registry Service, Authenticity Toolkit, Storage Service, Data Virtualisation Toolkit, Process Virtualisation Toolkit, RepInfo Toolkit]
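
The first two steps on this slide (know that something has changed, then identify the implications) amount to a reverse walk over recorded dependencies. The following is a minimal sketch under stated assumptions: the dependency map, its item names and the affected_by function are all hypothetical, and nothing here reflects how the Orchestration or Gap Identification services are actually implemented.

```python
from collections import deque

# Hypothetical holdings: each item lists what it depends on to stay usable.
DEPENDS_ON = {
    "dataset-A": ["reader-1.0"],
    "dataset-B": ["reader-1.0", "format-spec"],
    "dataset-C": ["format-spec"],
    "reader-1.0": ["java-vm"],
}


def affected_by(changed, depends_on):
    """Return everything that directly or indirectly depends on `changed`."""
    # Invert the map so each item points at the things that rely on it.
    dependants = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependants.setdefault(dep, set()).add(item)

    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for item in dependants.get(current, ()):
            if item not in affected:
                affected.add(item)
                queue.append(item)
    return affected


vm_impact = affected_by("java-vm", DEPENDS_ON)
spec_impact = affected_by("format-spec", DEPENDS_ON)
```

The resulting set is the list of holdings for which a decision is needed: obtain new RepInfo, transform, or hand over to another repository.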

25 What can your organisation do by itself?
- What services would be useful?
- Sharing the effort

26 APA/APARSEN/SCIDIP-ES services
Your organisation:
- Does it have guaranteed funding forever?
- Does it have enough funding to do everything it needs by itself?
- Does it do preservation perfectly?
Preservation and Value
- Increasing value
- Improving preservation
How can you know if someone is trying to sell “snake oil”?

27 OAIS

28 CASPAR inheritance
CASPAR – an FP6 project
- Completed fundamental research into digital preservation
- Produced prototypes for services and toolkits which SCIDIP-ES is building on
- Produced evidence that these services and toolkits did help in digital preservation

29 CASPAR Testing

30 The complete view
[Figure labels: Storage Service, Gap Identification Service, Orchestration Service, RepInfo Registry Service, Preservation Strategy Toolkit, Data Virtualisation Toolkit, Process Virtualisation Toolkit, Authenticity Toolkit, Packaging Toolkit, RepInfo Toolkit, Finding Aid Toolkit, Cloud Storage, External Access/Use Services, Persistent ID i/f Service, External PI services, ISO Certification Organisation, Certification Toolkit]
Services: run on remote servers
Toolkits: run on local machines
These SUPPLEMENT what repositories do (customised for repositories)
Make it easier for repositories to do preservation – share the effort

31 SCIDIP-ES in brief
SCIence Data Infrastructure for Preservation – with focus on Earth Science
Led by ESA. Currently in negotiation with EU. For more information see http://www.scidip-es.eu
- Upgrade CASPAR prototype components into scalable, robust e-infrastructure components to support digital preservation of all types of digital objects
- decentralised, heterogeneous, asynchronous, no single point of failure
- Persistent, simple re-implementable interfaces
- critical mass of users: Earth science as initial focus
- Other disciplines via APA
DIGITAL PRESERVATION RESEARCH needed to create the tools needed to create the “metadata” used by the e-infrastructure and user applications. Tools may be domain dependent. Must include Rep. Info. Network of the metadata
[Figure labels: E-INFRASTRUCTURE; TOOLKITS; Archives; User applications; Storage Service, Gap Identification Service, Orchestration Service, RepInfo Registry Service, Preservation Strategy Toolkit, Process Virtualisation Toolkit, Finding Aid Toolkit, Cloud Storage, Persistent ID i/f Service, External PI services, ISO Certification Organisation, Certification Toolkit, External Access/Use Services]
Domain independent Infrastructure counters threats identified by PARSE.Insight, based on CASPAR prototypes
APARSEN will produce a common vision to allow a coherent approach
Will help archives with certification

32 Bulk Uploader
- Needs to be as easy as possible when dealing with people who have no knowledge of OAIS
- Quick gain

33 Preservation Planning Processes
- Scoping
- Formulation
- Impl
ESA, Rome 14/11/2013

34 Formulation
Preservation Strategies Toolkit
- Design Preservation Network Model (PNM)
- Capture PNM properties: cost, risks, objectives, decisions, actions, links to metric evidence…
- Evaluate and select preservation solution/s

35 Implementation
RepInfo Toolkit
- Design RepInfo Network
- Create RepInfo objects
- Capture RepInfo properties
- façade to various tools
- Search, re-use and share Registry objects
- Maintain registry objects

36 OAIS Overview Functional Model
[Figure: OAIS functional model with entities Preservation Planning, Data Management, Archival Storage, Access, Ingest, Administration and MANAGEMENT; flows SIP, Descriptive Information, AIP, queries, query responses, orders, DIP; overlaid labels RIT, Strategy, Orchestration, Packaging, Registry, GIS, Finding Aid, Storage, Certification, HAPPI]

37 Preservation and Value

38 Cycle of value for preservation
Positioning for SCIDIP-ES services
“WHO PAYS, AND WHY?” is now a common question about preservation when talking to decision makers.
Immediate VALUE is sought
SCIDIP-ES provides a way to enhance value.

39 aparsen.eu #APARSEN Network of Excellence

