1
EOSDIS Data Preservation Archive (EDPA)
Bradley Hazuka, LP DAAC Senior System Engineer
Innovate, Inc., Contractor to the USGS EROS Center
2
Overview
EDPA Definition and Goals
EDPA Team Members
EDPA Prototyping
Roadmap
3
EDPA Definition and Goals
EOSDIS Data Preservation Archive (EDPA) will preserve Earth data so that the data remain available for future societal needs.
Ensure data remain accessible and usable despite the continuing obsolescence of hardware, software, and storage formats.
Compile a complete record of Earth data that can be handed off to a long-term permanent archive.
Earth Science Data Record (ESDR)
Preservation Content Specification (PCS)
Low Level Data, Ancillary Data, Algorithms, Documentation
4
EDPA Definition and Goals
Follow archiving best practices by creating a near-line and an off-line copy of each EOSDIS-approved data collection (3-2-1 Backup).
Provide tools to recover and restore Cumulus data that has been corrupted or removed by accidental or malicious means.
Provide a means for migrating entire datasets to alternate cloud vendors or processing centers.
Ensure archive integrity through reconciliation and data verification.
Provide tools to easily prepare, send, and verify data from Cumulus to a long-term archive location.
Guarding against accidental deletion and avoiding vendor lock-in are two big selling points.
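The 3-2-1 rule referenced above is commonly stated as: keep at least three copies of the data, on at least two different kinds of storage media, with at least one copy offsite. The sketch below shows one way such a check could be expressed. The `ArchiveCopy` type, its field names, and the example labels are hypothetical and are not part of EDPA or Cumulus.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ArchiveCopy:
    """One stored copy of a data collection (hypothetical model)."""
    label: str      # free-form name, e.g. "near-line"
    media: str      # kind of storage, e.g. "object storage", "tape"
    offsite: bool   # True if held away from the primary site


def satisfies_3_2_1(copies: Iterable[ArchiveCopy]) -> bool:
    """Return True if the copies meet the 3-2-1 rule:
    at least 3 copies, on at least 2 media types, at least 1 offsite."""
    copies = list(copies)
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    # Illustrative layout only: a working copy plus near-line and off-line copies.
    holdings = [
        ArchiveCopy("working copy", "object storage", offsite=False),
        ArchiveCopy("near-line", "disk", offsite=False),
        ArchiveCopy("off-line", "tape", offsite=True),
    ]
    print(satisfies_3_2_1(holdings))
```

A check like this could run per collection as part of a periodic audit, flagging any collection whose copies fall short of the rule.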
5
EDPA Team Members
ESDIS: Katie Baynes
LP DAAC: Bradley Hazuka (Lead), Chris Torbert (USGS), Jacob Campbell, Matt Martens, Darla Werner, Jason Werpy
PO DAAC: Michael Gangl, Michael McAuley
ASF DAAC: Scott Arko, Chris Stoner
6
Prototype Goals
Interface with Cumulus as a service and provide a simple backup and restore implementation for data recovery.
Implement a prototype by the end of FY 17 that demonstrates the selected Preservation use cases.
Refine Preservation capabilities and add Restoration capabilities in FY 18 based on selected use cases.
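As a rough illustration of what a "simple backup and restore" step can involve, the sketch below copies a file into an archive directory, records its SHA-256 checksum, and verifies that checksum again before restoring. It works on local directories only and makes no Cumulus or AWS calls. The function names and the directory layout are assumptions for illustration, not the prototype's design.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup(source: Path, archive_dir: Path) -> str:
    """Copy `source` into `archive_dir`, verify the copy, return its checksum."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    shutil.copy2(source, target)
    checksum = sha256_of(source)
    if sha256_of(target) != checksum:
        raise IOError(f"archived copy of {source.name} failed verification")
    return checksum


def restore(name: str, archive_dir: Path, dest_dir: Path, expected: str) -> Path:
    """Verify the archived file against `expected`, then copy it to `dest_dir`."""
    archived = archive_dir / name
    if sha256_of(archived) != expected:
        raise ValueError(f"archived copy of {name} does not match its checksum")
    dest_dir.mkdir(parents=True, exist_ok=True)
    restored = dest_dir / name
    shutil.copy2(archived, restored)
    return restored


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        granule = root / "granule.dat"          # hypothetical file name
        granule.write_bytes(b"example granule bytes")
        recorded = backup(granule, root / "archive")
        granule.unlink()                        # simulate accidental deletion
        recovered = restore("granule.dat", root / "archive", root / "restored", recorded)
        print(recovered.read_bytes() == b"example granule bytes")
```

Verifying the checksum on both the backup and the restore path means a corrupted archive copy is caught before it is handed back for ingest.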
7
Prototype Use Cases
(Table columns: Category, Use Case, Description, Cumulus Phase)
Preservation
Ingest: Archive data directly from Cumulus (FY 17). Automatic verification of archived data (checksums).
Data Management: Maintain a catalog and references to all holdings (FY 18).
Restoration
Distribution: Stage data for recovery using Cumulus Ingest.
Reconcile and verify EDPA archived data with data provider holdings.
Ensure ingest and receipt of ongoing data in addition to one-time ingest from data provider.
Ensure recoverability through automated recoverability audits of products and processes.
Storage/Sustainability: Utilize frozen storage that is durable and offline or near-line (cold).
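The reconciliation use case above amounts to comparing two inventories: what the data provider holds and what the archive holds. A minimal sketch follows, assuming each inventory is available as a mapping from a granule identifier to a checksum string. That manifest shape, the `Reconciliation` type, and the sample identifiers are assumptions for illustration; the slides do not specify a catalog format.

```python
from typing import Dict, NamedTuple, Set


class Reconciliation(NamedTuple):
    """Result of comparing provider holdings with archive holdings."""
    missing_from_archive: Set[str]    # provider has it, archive does not
    unexpected_in_archive: Set[str]   # archive has it, provider does not
    checksum_mismatch: Set[str]       # both have it, checksums differ


def reconcile(provider: Dict[str, str], archive: Dict[str, str]) -> Reconciliation:
    """Compare two {granule_id: checksum} manifests."""
    provider_ids = set(provider)
    archive_ids = set(archive)
    shared = provider_ids & archive_ids
    return Reconciliation(
        missing_from_archive=provider_ids - archive_ids,
        unexpected_in_archive=archive_ids - provider_ids,
        checksum_mismatch={g for g in shared if provider[g] != archive[g]},
    )


if __name__ == "__main__":
    # Made-up identifiers and checksums, purely for demonstration.
    provider_manifest = {"g1": "aaa", "g2": "bbb", "g3": "ccc"}
    archive_manifest = {"g1": "aaa", "g2": "xxx", "g4": "ddd"}
    report = reconcile(provider_manifest, archive_manifest)
    print(sorted(report.missing_from_archive))
    print(sorted(report.unexpected_in_archive))
    print(sorted(report.checksum_mismatch))
```

An empty report in all three fields would indicate that the archive and the provider agree. Any non-empty field identifies granules to re-ingest or investigate.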
8
Prototype Preservation Ingest Use Case
9
Prototype Preservation Ingest Use Case (Continued)
10
Prototype Assumptions
The implementation will be a service-oriented architecture.
The EDPA prototype does not address AWS egress cost and is not associated with customer distribution of data.
The EDPA prototype is a proof of concept that will utilize AWS and on-premises resources.
The EDPA prototype will utilize Cumulus data sets from LP DAAC, PO DAAC and NSIDC DAAC.
11
Prototype Assumptions (Continued)
The EDPA prototype will utilize Cumulus data recipes requiring little or no change to Cumulus.
Data restoration through the EDPA prototype will be done via Cumulus Ingest.
The EDPA prototype will maintain a simple design with limited functionality and scale.
The EDPA prototype will utilize available hardware.
The EDPA prototype will not provide metrics to EMS.
12
Where does EDPA live?
13
Prototype Architecture
14
Prototype Preservation Flow
15
Prototype Restoration Flow
16
Prototype Status
Received concurrence for initial use cases and design for prototype from ESDIS (Katie Baynes), LP DAAC, PO DAAC and ASF DAAC.
Standing up development environments to perform prototyping work.
Cloud environment created using SGT contractor cloud provisioned space.
Local hardware being set up for prototype environment.
Working on using NGAP Sandbox Cumulus dev area.
Working with virtual team and local staff to refine architecture, flows, and design.
17
Roadmap
Delivered EDPA Long Term Plan Quad Chart (July 15, 2017)
Delivered EDPA Long Term Strategic Approach (July 28, 2017)
Finalized initial use case ideas (August 11, 2017)
Finalized use cases for FY 17 prototype (August 15, 2017)
SE TIM Presentation on EDPA (August 30, 2017)
Prototype Preservation Ingest functionality (Q4 FY 17)
Prototype Restoration Distribution functionality (Q2 FY 18)
Onboard another Cumulus-enabled DAAC (TBD)
18
Questions
19
Contact Information
Bradley Hazuka (605)
Darla Werner (605)