1 ESA UNCLASSIFIED – For Official Use
Data Stewardship Interest Group, WGISS-40 Meeting
Preservation of Software & Documents at CEOS Agencies
Harwell, UK (UKSA), 28 Sep – 02 Oct 2015
R. Leone / I. Maggio, European Space Agency

2 Agenda
- Software & Documents Preservation overview
- Approaches & lessons learned in other domains
  - APARSEN (Dedicated Session)
  - Vatican Library (Dedicated Session)
  - International Cartographic Association
  - Planetary Data System
  - Other domains
- Outcomes from survey in other domains

3 Software & Documents Preservation overview

4 Digital Objects Preservation
Generic definition: Digital Preservation is the management and maintenance of digital objects so that they can be accessed and used by future users.
In the Earth Observation context, Digital Objects comprise:
- Data Records: raw data and/or Level-0 data, higher-level products, browse images, auxiliary and ancillary data, calibration and validation data sets, and descriptive metadata.
- Associated Knowledge: all the Tools used in Data Records generation, quality control, visualization and value adding, and all the Information needed to make the Data Records understandable and usable by the Designated Community.

5 Associated Knowledge Preservation

6 Associated Knowledge elements
Software/Tools:
- Software applications for: data product generation; quality control; product visualization; value adding
Information:
- Documentation
- Images
- Metadata file (information on creation, access rights, restrictions, preservation history, and rights management)
- Multimedia (video/audio)
SW-related “IT Infrastructure”: Compiler; Programming language; Storage system; Operating System; Libraries; Databases; Workflows; Bidirectional links; Schemas; Email

7 Information Preservation

8 Information Format for Digital Preservation
Text documents (often MS Word, Excel files, txt, etc.) can be preserved as: PostScript, PDF, DSSSL, RTF, ASCII, SGML, TIFF, CGM.
- PostScript, PDF and RTF are proprietary
- DSSSL and SGML are not (yet?) widely used
- CGM has multiple variants in use
Images can be preserved as:
- With loss of quality: JPEG, JPEG2000
- With lossless compression: TIFF, PBM, PNG, FITS
Metadata can be preserved as ASCII, the most durable format for metadata because it is widespread, backwards compatible when used with Unicode (a superset of ASCII), and uses human-readable characters rather than numeric codes. For higher functionality, SGML or XML should be used.
Multimedia can be preserved as: AVI, QuickTime, MPEG, WMV, MJ2.
Formats highlighted on the slide (from various standards and publications sources):
- Text documents: PDF, PDF/A, FITS
- Images: TIFF, FITS
- Metadata: ASCII, XML
- Multimedia: MJ2
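The compatibility between ASCII and Unicode mentioned above can be checked directly. The sketch below assumes UTF-8 as the Unicode encoding and uses a made-up metadata record (`record` and its keywords are illustrative only, not taken from any real product).

```python
def is_pure_ascii(data: bytes) -> bool:
    """Return True if every byte lies in the 7-bit ASCII range (0-127)."""
    return all(b < 128 for b in data)


# Hypothetical metadata record, used only for illustration.
record = "MISSION = EXAMPLE\nPRODUCT_LEVEL = 0\n"

ascii_bytes = record.encode("ascii")
utf8_bytes = record.encode("utf-8")

# ASCII text has exactly the same byte representation under UTF-8 ...
same_bytes = ascii_bytes == utf8_bytes

# ... so an ASCII file can be read back unchanged by a UTF-8 decoder.
round_trip = ascii_bytes.decode("utf-8") == record
```

Because of this property, metadata written as plain ASCII today stays readable by tools that expect UTF-8, with no conversion step.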

9 Software/Tools Preservation

10 Software Preservation Techniques: HOW?

18 Preservation of Hardware
The easiest way to ensure that there will always be hardware on which to run your software is to preserve the hardware itself (along with its operating system and any other software it relies on).
Advantages:
- Easy and clearly defined
- Can switch to hardware emulation at a later date
Disadvantages:
- Costly, especially when hardware fails
- Does not guarantee future access if the system depends on other hardware/software (e.g. networking)
- Maintenance: over time hardware components wear out and must be replaced; if the hardware is no longer manufactured, components become scarce and expensive
- Isolation: if your software only works with very specific hardware, you limit your users to those who have the right hardware

19 Emulation
Emulation addresses the original HW & SW environment of the digital object and recreates it on a current machine. The emulator gives the user access to the software on a current platform as if it were running in its original environment.
- Emulating hardware platforms: one piece of hardware emulates another, typically through a special-purpose emulation system
- Emulating applications & Operating System: the ability of one piece of software to emulate (imitate) another (e.g. an application or operating system)
Advantages:
- Emulated hardware is easier to manage
- If the emulation layer continues to be developed, the software can continue to be run indefinitely
Disadvantages:
- A suitable, possibly ad-hoc, emulator must be found; if the old hardware is rare, no emulator might exist
- All aspects of the hardware need to be emulated correctly
- The emulation software could itself become obsolete

20 Virtualization
Hardware virtualization, or platform virtualization, refers to the creation of a virtual machine that acts like a real computer with an operating system. Software executed on these virtual machines is separated from the underlying hardware resources.
Types of hardware virtualization include:
- Full virtualization: complete simulation of the actual hardware, allowing software to run unmodified
- Partial virtualization: some but not all of the target environment attributes are simulated
Hardware virtualization is not the same as hardware emulation. In hardware emulation, a piece of hardware imitates another, while in hardware virtualization a hypervisor (a piece of software) imitates a particular piece of computer hardware or the entire computer.
Advantages:
- No need to update/migrate software running on a Virtual Machine when the underlying hardware changes
Disadvantages:
- SW needs to be virtualized in order to run on virtual machines

21 Migration
Updating software as required to maintain the same functionality, porting/transferring it before platform obsolescence: a periodic transfer from one HW and/or SW configuration to another, or from one generation of computer technology to a subsequent one.
Migration approaches:
- Complete rewrite of the code, which allows the software to be used on a completely different system
- Continual migration, keeping the code up to date with the latest changes to the hardware and software it relies on
Advantages:
- Enables access on other platforms
- Retains the ability to retrieve/access/use data while exploiting technological progress
Disadvantages:
- Requires continued development effort, which is significant for complex SW
- Time-consuming, costly and error-prone

22 Cultivation
Cultivation is the process of opening up the development of one's own software. It aims to keep software “alive” by moving to a shared development model through:
- Distributing and spreading knowledge about the software to minimize single points of failure (e.g. the departure of a developer)
- Building a self-sustaining community of developers who work together and share the effort of keeping the software up to date
Advantages:
- Increases the chances of further development of the software and of possible migration to other platforms
Disadvantages:
- Long process that requires more coordination
- Possible loss of control over the software's direction
- Time must be invested in building the community, which requires work to understand what the community wants and how to appeal to it

23 Hibernation
Hibernation aims to preserve the knowledge of how to resuscitate/recreate the exact functionality of the software at a later date.
Applicable, for example, when:
- The software has come to the end of its useful life, but it might need to be resurrected to double-check an analysis or prove a result in the future
- There is currently no user community, but one might emerge in the future
Advantages:
- Allows a break in effort
Disadvantages:
- It can be difficult to check whether the hibernation process was rigorous until it is too late
- Preparing software for hibernation can be resource-heavy, and if the software is never resurrected those resources may appear to have been wasted
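The slide does not prescribe how the "knowledge of how to recreate" the software should be captured. One possible ingredient, sketched here purely as an assumption, is a manifest that records the runtime environment and a checksum for every archived file, so that a later resurrection can at least confirm that the hibernated package is intact. The function names (`build_manifest`, `verify_manifest`, `write_manifest`) and the manifest layout are hypothetical.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Record the current environment and a checksum for each file under root."""
    files = {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    return {
        "python_version": sys.version,
        "platform": platform.platform(),
        "machine": platform.machine(),
        "files": files,
    }


def verify_manifest(root: Path, manifest: dict) -> list:
    """Return the relative paths that are missing or whose checksum changed."""
    changed = []
    for rel_path, expected in manifest["files"].items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(rel_path)
    return changed


def write_manifest(manifest: dict, out_path: Path) -> None:
    """Store the manifest as human-readable JSON text."""
    out_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
```

A manifest like this addresses only the integrity check; build instructions, dependency versions and test data would still need to be documented separately.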

24 Deprecation & Procrastination
Deprecation:
- Not strictly a preservation approach; it is easy to perform but often marks the end of a software package’s life and is typically chosen only when no other option is available
- It might be expensive or impossible to resurrect the software in the future
Procrastination: “Never put off until tomorrow what can be done the day after tomorrow... or the next day”
Advantages:
- Comes naturally, very easy and very cheap
Disadvantages:
- Not a valid preservation technique!
- May require software archaeology skills in the future

25 Approaches & lessons learned in other domains

26 APARSEN & Vatican Library: Dedicated Sessions

27 International Cartographic Association
The mission of the International Cartographic Association (ICA) is to promote the disciplines and professions of cartography and GIScience in an international context.

28 International Cartographic Association Approaches
Source: the book "Preservation in Digital Cartography: Archiving Aspects".
Information Preservation Approach:
- For cartographic heritage, the format (which describes the structure of the digital document to be read/used by an application) is important because knowing it helps to build a new application should the original become inaccessible.
- Two kinds of format can be identified: binary and ASCII. A binary format is direct machine code that is not human-readable; an ASCII format can be read directly with any text editor.
- Format standards like the Extensible Markup Language (XML) are also ASCII-based. The GML format, as a specialization of XML, is therefore also gaining importance for preservation.
- Images: the TIFF standard is used for preservation.
Software Preservation Approach:
- Encapsulation in a compiled proprietary format & Migration

29 PLANETARY DATA SYSTEM
The Planetary Data System (PDS) is an archive of data products from NASA planetary missions, sponsored by NASA's Science Mission Directorate. The PDS actively manages the archive to maximize its usefulness, and it has become a basic resource for scientists around the world.

30 PLANETARY DATA SYSTEM – Formats Overview
- VICAR format: an image file format developed by NASA's Jet Propulsion Laboratory. The data format was not changed during a preservation project, so data from missions archived during the data preservation task are still in VICAR format. Public domain software exists to read the data and convert it into formats more commonly used by today’s commercially available image processing software.
- PDS format: during the 1980s, the Planetary Data System established standards governing the format of image data delivered to the PDS. Later missions conformed to that format standard, and the PDS provides online software enabling users to download data from the PDS archives and convert it into other useful data formats. PDS-formatted images usually have a .IMG file extension and detached ASCII-formatted text labels containing the image's metadata.
- FITS format: FITS stands for "Flexible Image Transport System" and is the standard format endorsed by both NASA and the IAU for astronomical data sets. It is also increasingly used for space image data, particularly for small-body missions such as Rosetta and Deep Impact.
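The preservation value of detached ASCII labels is that they remain readable with very little code. The sketch below assumes a simplified label made of flat `KEYWORD = VALUE` lines terminated by `END`; the sample label is invented for illustration and is not a complete or valid PDS label, and `parse_simple_label` is a hypothetical helper, not part of any PDS toolkit.

```python
def parse_simple_label(text: str) -> dict:
    """Parse flat 'KEYWORD = VALUE' lines into a dict, stopping at END.

    Simplification: nested groups, multi-line values and comments
    are not handled.
    """
    fields = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "END":
            break
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional double quotes.
        fields[key.strip()] = value.strip().strip('"')
    return fields


# Invented sample label, for illustration only.
SAMPLE_LABEL = """\
^IMAGE = "EXAMPLE.IMG"
LINES = 1024
LINE_SAMPLES = 1024
SAMPLE_BITS = 8
END
"""

label = parse_simple_label(SAMPLE_LABEL)
image_file = label["^IMAGE"]
pixel_count = int(label["LINES"]) * int(label["LINE_SAMPLES"])
```

Even without the original software, a reader of such a label can recover where the image data lives and how large it is, which is the kind of information needed to write a new reader.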

31 PLANETARY DATA SYSTEM Approaches
Sources: PDS4 Concept document (https://pds.nasa.gov/pds4/doc/concepts/Concepts_150909.pdf); Planetary Data System Standards Reference document.
Information Preservation Approach:
- Documents: all documents in PDS archives must appear in PDF/A (an encoded byte stream), UTF-8 (a parsable byte stream), or both. PDF, Microsoft Word and PostScript are acceptable secondary formats once a PDF/A or UTF-8 version has been included.
- Images: GIF, JPEG, PNG and TIFF are among the formats allowed for supplementary encoded images, such as in a browse collection.
Software Preservation Approach:
- No approach highlighted
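A first-pass check of the document rule above could look like the following sketch. It is an assumption about how one might screen files, not a PDS tool: `classify_document` is a hypothetical name, it recognises PDF only by the `%PDF-` signature at the start of the file, and it does not verify PDF/A conformance, which requires a dedicated validator.

```python
def classify_document(data: bytes) -> str:
    """Roughly classify a document byte stream.

    Returns "pdf" for files starting with the PDF signature (PDF/A
    conformance is NOT checked), "utf-8" for byte streams that decode
    cleanly as UTF-8, and "other" for everything else.
    """
    if data.startswith(b"%PDF-"):
        return "pdf"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "other"
    return "utf-8"
```

For example, `classify_document("héllo".encode("utf-8"))` yields `"utf-8"`, while a byte stream that is neither PDF nor valid UTF-8 would be flagged as `"other"` and would need a primary-format version before archiving.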

32 Other Domains (1)
“Selecting File Formats for Long-Term Preservation” was published by The National Archives. It gives general advice on issues relating to the preservation and management of electronic records. It is intended for anyone involved in the creation of electronic records that may need to be preserved over the long term, as well as for those responsible for preservation.
The National Archives is a UK government department and an executive agency of the Ministry of Justice. It incorporates the Office of Public Sector Information and Her Majesty's Stationery Office, and also performs the Historical Manuscripts Commission's functions in relation to private records.
“ELECTRONIC RECORDS - Recommendations for Preservation Formats” was published by the Smithsonian Institution Archives and gives guidelines on file formats for the long-term preservation of electronic records.
The Smithsonian Institution Archives serves as the institutional memory of a unique cultural organization. The history of the Smithsonian is a vital part of American history, of scientific exploration, and of international cultural understanding.

33 Other Domains (2)
“The Significant Properties of Software: A Study” gives information on software components and their digital preservation.
The STFC is one of the UK’s seven publicly funded Research Councils responsible for supporting, coordinating and promoting research, innovation and skills development in seven distinct fields.
“Principles and Good Practice for Preserving Data” provides basic guidance for managers in statistical agencies who are responsible for preserving data, using the principles and good practice defined by the digital preservation community. It defines the rationale for preserving data and the principles and standards of good practice as applied to data preservation, documents the development of a digital preservation policy, and uses digital archive audit principles to suggest good practice for data.
The Interuniversity Consortium for Political and Social Research (ICPSR) is an international consortium of about 700 academic institutions and research organizations. ICPSR is a unit within the Institute for Social Research at the University of Michigan and maintains its office in Ann Arbor.

34 Other Domains (3)
REFERENCE MODEL FOR AN OPEN ARCHIVAL INFORMATION SYSTEM
This document is a technical Recommended Practice for use in developing a broader consensus on what is required for an archive to provide permanent, or indefinite long-term, preservation of digital information. Its recommendations on emulation approaches are considered here.
The Consultative Committee for Space Data Systems (CCSDS) was formed in 1982 by the major space agencies of the world to provide a forum for discussion of common problems in the development and operation of space data systems.

35 Outcomes from survey in other domains
- No defined best practices on software and document preservation in the space domain
- Recommendations on document format standards exist in the libraries and archives domain
- Only a few references to software preservation approaches, but no recommendations
- Need for a harmonized approach to software preservation best practices

36 Thank you for your attention! Questions?

