ESA UNCLASSIFIED – For Official Use Data Stewardship Interest Group EO Data Associated Knowledge Preservation WGISS-41 Meeting - Canberra, (AUS) 14–18.

Presentation transcript:

ESA UNCLASSIFIED – For Official Use Data Stewardship Interest Group EO Data Associated Knowledge Preservation WGISS-41 Meeting - Canberra, (AUS) 14–18 March, 2016 Mirko Albani, European Space Agency

Outline
- Background and summary information
- “EO Data Associated Knowledge Preservation” Best Practice table of contents
- Examples of recommendations and use cases
- Way forward and next steps

Associated Knowledge elements
- Software/Tools: software applications for data product generation, quality control, product visualization, and value adding
- Information: documentation; images; metadata files (information on creation, access rights, restrictions, preservation history, and rights management); multimedia (video/audio)
- Software-related “IT infrastructure”: compiler, programming language, storage system, operating system, libraries, databases
- Workflows, bidirectional links, schemas

Background and summary
CEOS WGISS #39 and #40:
- Collection of information and discussion on software and document preservation, with a focus on formats and software preservation techniques
- Experiences and lessons learned presented by WGISS and LTDP WG agencies
- Experiences from other domains presented (e.g. Vatican Library, astronomy)
- Need for a harmonized approach to the preservation of EO data associated knowledge (e.g. documents, software/tools)
- Drafting of a dedicated Best Practice

Best Practice in CEOS format

Best Practice ToC (for discussion)

Information Preservation Recommendation (example)
Recommended formats:
- Text documents should be preserved as: PDF/A, FITS
- Images should be preserved as: TIFF, FITS
- Metadata should be preserved as: XML, ASCII
- Multimedia should be preserved as: MPEG-4, MJ2

Information Format Recommendations (example)
Documents: two categories to be preserved:
- Mission and data documentation (mandatory), for which two preservation formats are recommended for implementation: PDF/A and FITS
- Other documentation (e.g. papers, presentations), for which one preservation file format is considered sufficient: PDF/A
Images related to EO missions (e.g. quick-looks from old missions, paper images):
- The recommended preservation format is TIFF or FITS, depending on the complexity of the metadata schema to be preserved (FITS is more flexible for metadata editing and can therefore handle more complex metadata schemas)
Metadata files: the recommended preservation format is XML.
Multimedia files (e.g. videos): the recommended format is MPEG-4.
Rationale: to ensure reliable preservation of information, two formats are identified, and both should be used when the relevant digital object is identified as mandatory.
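The format recommendations on this slide can be read as a small lookup rule. The following is only a sketch of that reading, not part of the Best Practice: the function name, the `kind` labels, and the `complex_metadata` flag are assumptions introduced for illustration, and the choice of FITS for complex metadata schemas is one interpretation of "TIFF or FITS depending on the complexity".

```python
from typing import List


def recommended_formats(kind: str, mandatory: bool = False,
                        complex_metadata: bool = False) -> List[str]:
    """Return the preservation formats suggested on the slide for one object.

    `kind`, `mandatory` and `complex_metadata` are hypothetical parameters.
    """
    if kind == "document":
        # Mandatory mission and data documentation: two formats.
        # Other documentation (papers, presentations): one format.
        return ["PDF/A", "FITS"] if mandatory else ["PDF/A"]
    if kind == "image":
        # Assumption: FITS for complex metadata schemas, TIFF otherwise.
        return ["FITS"] if complex_metadata else ["TIFF"]
    if kind == "metadata":
        return ["XML"]
    if kind == "multimedia":
        return ["MPEG-4"]
    raise ValueError(f"unknown information kind: {kind!r}")
```

For example, `recommended_formats("document", mandatory=True)` yields both PDF/A and FITS, while a presentation (`mandatory=False`) yields PDF/A only.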

Software/tools Preservation
Software preservation is complex:
- Software has many different components and dependencies, which are often not easy to define, especially for older software
- Software operates in a complex environment
During the initial dataset appraisal, a decision has to be taken on the approach to be followed to handle and preserve the mission data, the information and the related software.
- The decision depends on mission relevance, temporal and geographical coverage, size, storage media and archiving format, technical aspects and cost
- The software preservation approach can be based on different strategies (e.g. virtualization, periodic migrations, hibernation), each with different pros/cons and costs/benefits
- Several attributes have to be considered when selecting the strategy

Software Preservation steps: Preserve, Retrieve, Reconstruct, Replay
- Preserve: a copy of a software “product” needs to be stored for long-term preservation. There should be a strategy to ensure that the storage is secure and that the product maintains its authenticity over time, with appropriate strategies for storage replication, media refresh, format migration, etc.
- Retrieve: in order to retrieve the product at a future date, it needs to be clearly labelled and identified in a suitable catalogue. The catalogue should support searching by function and origin (provenance information).
- Reconstruct: the preserved product can be reinstalled or rebuilt within an environment sufficiently close to the original that it will execute satisfactorily. For software, this is a particularly complex operation, as a large number of contextual dependencies on the software execution environment must be satisfied before the software will execute at all.
- Replay: in order to be useful at a later date, software needs to be replayed, or executed, and perform in a manner that is sufficiently close in its behavior to the original. As with reconstruction, environmental factors may influence whether the software delivers a satisfactory level of performance.
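The Preserve and Retrieve steps can be illustrated with a checksum-based catalogue. This is a minimal sketch under stated assumptions: the slides do not prescribe SHA-256 or any catalogue layout, and the class, field and method names below are hypothetical. It only shows how a stored fixity value supports an authenticity check and how function and origin (provenance) can be made searchable.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CatalogueEntry:
    identifier: str  # clear label used for later retrieval
    function: str    # what the software does
    origin: str      # provenance information
    sha256: str      # fixity value recorded when the copy was stored


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the preserved bytes."""
    return hashlib.sha256(content).hexdigest()


class Catalogue:
    """In-memory stand-in for a preservation catalogue (hypothetical)."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogueEntry] = {}

    def preserve(self, identifier: str, function: str, origin: str,
                 content: bytes) -> CatalogueEntry:
        """Register a software product together with its fixity value."""
        entry = CatalogueEntry(identifier, function, origin,
                               fingerprint(content))
        self._entries[identifier] = entry
        return entry

    def search(self, text: str) -> List[CatalogueEntry]:
        """Case-insensitive search on function and origin."""
        needle = text.lower()
        return [e for e in self._entries.values()
                if needle in e.function.lower() or needle in e.origin.lower()]

    def is_authentic(self, identifier: str, content: bytes) -> bool:
        """Check that the retrieved bytes still match the stored digest."""
        return self._entries[identifier].sha256 == fingerprint(content)
```

A periodic run of `is_authentic` over each stored copy would flag silent corruption after a media refresh or storage replication; Reconstruct and Replay depend on the execution environment and are not covered by this sketch.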

Software Preservation Attributes
- Functionality: what it does and what data it depends on
- Environment: platform, operating system, programming language; versions
- Dependencies: compilation dependency graph; standard libraries; other software products; specialised hardware
- Software is a composite digital object: collection of modules; specifications, configuration scripts, test suites, documentation
- Architecture: client/server, storage system, input/output
- User interaction: command line, user interface; user model
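One way to make these attributes actionable is to record them in a structured manifest stored alongside the software. The sketch below is an assumption about how such a record could look; the field names are hypothetical and simply mirror the attribute groups listed on the slide.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SoftwareRecord:
    """Hypothetical manifest mirroring the attribute groups on the slide."""
    functionality: str = ""          # what it does, what data it depends on
    platform: str = ""               # environment
    operating_system: str = ""
    programming_language: str = ""
    version: str = ""
    dependencies: List[str] = field(default_factory=list)  # libraries, hardware
    components: List[str] = field(default_factory=list)    # modules, scripts, docs
    architecture: str = ""           # e.g. client/server, storage, input/output
    user_interaction: str = ""       # e.g. command line, user interface


def missing_attributes(record: SoftwareRecord) -> List[str]:
    """Return the names of attributes that are still empty, in field order."""
    return [name for name, value in asdict(record).items() if not value]


def to_json(record: SoftwareRecord) -> str:
    """Serialize the record so it can be stored with the preserved copy."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Calling `missing_attributes` during appraisal gives a quick list of what still has to be documented before a preservation strategy is selected.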

Software Preservation Recommendations: approaches
Three specific cases to be considered:
1. Future missions: recommendations about software engineering best practice, such as clear licensing; clear documentation; use of commonly adopted and modern programming languages; modular design; revision management and change control; an established software testing regime and validated results; separation between data and code; and a clear understanding of dependencies. All of these make the work of software preservation easier. If the existing software engineering best practices already cover all preservation requirements, a statement with references to the relevant standards may be sufficient.
2. Historical missions: recommendations take into consideration different scenarios based on data uniqueness, available funding, etc. Example of a recommendation: if only reduced funding is available and a similar instrument dataset is already preserved, then hibernation and procrastination are recommended (i.e. keep the software in its original format without any further effort).
3. Current missions: mixed approach. For any new development, the future missions recommendations should be implemented; otherwise the historical missions approach should be followed.
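The three cases can be summarised as a small decision rule. The sketch below encodes only what the slide states; the function name, parameter names and return strings are hypothetical, and historical scenarios other than the single example given are deliberately left undecided rather than guessed.

```python
def recommend_approach(mission: str, new_development: bool = False,
                       reduced_funding: bool = False,
                       similar_dataset_preserved: bool = False) -> str:
    """Map a mission case to the approach named on the slide.

    `mission` is one of "future", "historical" or "current" (labels assumed
    for this sketch).
    """
    if mission not in ("future", "historical", "current"):
        raise ValueError(f"unknown mission case: {mission!r}")

    # Future missions, and new developments in current missions, follow
    # software engineering best practice.
    if mission == "future" or (mission == "current" and new_development):
        return "software engineering best practice"

    # Historical missions, and existing software of current missions.
    if reduced_funding and similar_dataset_preserved:
        # The one example given on the slide: keep in original format.
        return "hibernation"

    # Other scenarios depend on data uniqueness, funding, etc., and are
    # not spelled out on the slide.
    return "scenario-dependent"
```

For instance, `recommend_approach("current", new_development=True)` follows the future missions branch, while `recommend_approach("current")` falls through to the historical missions logic.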

Next steps
- March-September 2016: Best Practice drafting
- WGISS#42: assessment of drafting status and discussion
- End of October 2016: issue to DSIG for review
- End of January 2017: WGISS comments and feedback
- End of February 2017: final issue to WGISS
- WGISS#43: final presentation and formal approval

Thanks for your attention

Software Preservation Techniques

Thank you for your attention! Questions?

Approaches and Lessons Learned at ESA

ESA Approach – Preservation Life cycle (diagram)
- Mission Preservation Requirements
- Preservation Workflow: appraisal, preservation requirements, objectives and cost assessment
- Application/Infrastructure/Information Schema
- PDSC Tailoring Checklist: what to preserve, information format and software preservation approach
- Implementation: lessons learned

ESA Approach – Preservation Life cycle
- Mission Preservation Requirements: designated community requirements.
- Preservation Workflow: during the initial phase, a dataset appraisal provides a first assessment of whether the dataset should be preserved and kept accessible and usable for the long term. Topics to be considered include mission relevance, economic considerations, temporal and geographical coverage, size, storage media and archiving format. This phase results in the main decisions on preservation requirements, and also on file formats and on how to preserve the software.
- Application/Infrastructure/Information Schema: the result of the dataset appraisal. It allows the definition of a preservation network used by the curator to take decisions on risk and cost management.
- PDSC Tailoring Checklist: the Preserved Dataset Content addresses what to preserve. It should be used as a “checklist” and suitably tailored for each EO mission/instrument, according to its specific characteristics, to generate one or more inventory documents. For each document to be preserved, the checklist shows the relevant file format fields, including a secondary file format where one is foreseen to ensure preservation. For software preservation, the checklist indicates which strategy needs to be pursued.
- Implementation: during the preservation process, all lessons learned are collected in order to improve the mission preservation cycle.
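The checklist tailoring described above can be pictured as filtering a generic template down to a mission inventory and then listing what is still undecided. This is only an illustrative sketch: the actual PDSC checklist structure is not shown in the slides, so every name and field below is a hypothetical stand-in.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ChecklistItem:
    """One row of a hypothetical checklist template."""
    name: str
    kind: str                                # "document" or "software"
    primary_format: Optional[str] = None     # e.g. "PDF/A"
    secondary_format: Optional[str] = None   # e.g. "FITS", where foreseen
    software_strategy: Optional[str] = None  # e.g. "virtualization"


def tailor(template: Iterable[ChecklistItem],
           applicable: Iterable[str]) -> List[ChecklistItem]:
    """Keep only the template items that apply to a given mission."""
    wanted = set(applicable)
    return [item for item in template if item.name in wanted]


def open_points(inventory: Iterable[ChecklistItem]) -> List[str]:
    """List inventory items that still lack a format or a strategy."""
    problems = []
    for item in inventory:
        if item.kind == "document" and item.primary_format is None:
            problems.append(f"{item.name}: no preservation format")
        if item.kind == "software" and item.software_strategy is None:
            problems.append(f"{item.name}: no preservation strategy")
    return problems
```

The result of `tailor` plays the role of an inventory document, and `open_points` lists the decisions that the appraisal phase still has to make.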

Approaches at ESA: Information and SW Tools preservation
Two categories of documents to be preserved:
- Mission documentation (mandatory), for which two preservation formats are considered for implementation: PDF/A and FITS
- Other documentation (e.g. papers, presentations), for which one preservation file format is considered sufficient: PDF/A
- Currently many documents exist only on paper, and several formats are in use for other information
Images related to EO missions (e.g. quick-looks from old missions):
- The preservation format considered is TIFF or FITS, depending on the complexity of the metadata schema to be preserved (FITS is more flexible for metadata editing and can therefore handle more complex metadata schemas)
Metadata files: the selected preservation format is XML.
Multimedia files (e.g. videos): format under analysis.

Approaches at ESA: Information and SW Tools preservation
During the initial dataset appraisal, a decision is taken on the approach to be followed to handle and preserve the mission data, the information and the related software.
- The decision depends on mission relevance, temporal and geographical coverage, size, storage media and archiving format, technical aspects and cost
The software preservation approach can be based on different strategies for different missions (e.g. virtualization, periodic migrations, hibernation).

Approach at ESA: Virtualization Experiences
Dependency on legacy hardware that is difficult to maintain was a critical issue for long-term exploitation of the EO products archive, and a typical situation for the Instrument Processing Facilities (IPFs) of historical missions.
- Instrument Processing Facilities virtualization: a series of mission data processors was virtualised: AVHRR, SeaWiFS, Landsat TM, Envisat ASAR, Envisat MERIS, Envisat AATSR, Envisat GOMOS, Envisat MIPAS, Envisat SCIAMACHY, Envisat MRA2/MWR, ERS SAR, ERS Scatterometer, ERS GOME, ERS Altimeter
- Multi-Mission Facility Infrastructure (MMFI) virtualization: migration of MMFI building elements, including IPF processors, to a virtual-machine hosted environment, identifying and eliminating as far as possible dependencies on specific hardware and related software libraries
IPFs and MMFI components can now be executed in virtual environments and have been installed in the cloud for many pilot and operational activities.

Approach at ESA: Preserved Missions examples
SEASAT and JERS-1: mission documentation will be preserved in PDF/A and FITS (both copies to be generated).
JERS-1:
- After an initial dataset appraisal, the JERS-1 processing software was improved, aligned to the ALOS mission, and virtualized
- A reprocessing campaign is being performed in a cloud environment
SEASAT:
- After an initial dataset appraisal, the decision was taken to deprecate the original processing software
- New SAR IPF derived from the ALOS-JERS processors, virtualized to run in the cloud
- Full reprocessing done and processor hibernated
Other historical missions:
- Processing software hibernation (e.g. after reprocessing campaigns)

Discussion & Possible way forward

Summary
1. What formats would be most convenient for preserving text documents, metadata files, and videos?
- Text documents (word, txt, ppt) -> PDF, PDF/A, FITS, other
- Images (bmp, tif, jpg, gif) -> TIFF, FITS, other
- Metadata file -> XML, ASCII
- Video (avi, vob, m4v, mov, mpx, etc.) -> MJ2, other
2. Drafting of a Best Practice on the recommended approach(es) for information and software tools preservation, on the basis of CEOS agencies' experience?