EOSDIS Data Preservation Archive (EDPA)

Presentation transcript:

EOSDIS Data Preservation Archive (EDPA)
Bradley Hazuka, LP DAAC Senior System Engineer
Innovate, Inc., Contractor to the USGS EROS Center

Overview
- EDPA Definition and Goals
- EDPA Team Members
- EDPA Prototyping
- Roadmap

EDPA Definition and Goals
- The EOSDIS Data Preservation Archive (EDPA) will preserve Earth data so that the data remain available for future societal needs.
- Ensure data remain accessible and usable despite the continuing obsolescence of hardware, software, and storage formats.
- Compile a complete record of Earth data that can be handed off to a long-term permanent archive.
- Earth Science Data Record (ESDR) Preservation Content Specification (PCS): Low Level Data, Ancillary Data, Algorithms, Documentation

EDPA Definition and Goals
- Follow archiving best practices by creating a near-line and an off-line copy of each EOSDIS-approved data collection (3-2-1 backup).
- Provide tools to recover and restore Cumulus data that has been corrupted or removed by accidental or malicious means.
- Provide a means for migrating entire datasets to alternate cloud vendors or processing centers.
- Ensure archive integrity through reconciliation and data verification.
- Provide tools to easily prepare, send, and verify data from Cumulus to a long-term archive location.
- Guarding against accidental deletion and avoiding vendor lock-in are the two big selling points.
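
The slides do not describe how reconciliation would be implemented. As a minimal sketch only, assuming each side can export an inventory that maps a granule identifier to a recorded checksum, a reconciliation pass could compare the two inventories as below. The `reconcile` function, the `ReconciliationReport` type, and the inventory shape are all hypothetical names introduced for illustration; none of them comes from EDPA or Cumulus.

```python
from dataclasses import dataclass, field


@dataclass
class ReconciliationReport:
    """Hypothetical result type: differences between two inventories."""
    missing_from_archive: list = field(default_factory=list)
    unexpected_in_archive: list = field(default_factory=list)
    checksum_mismatches: list = field(default_factory=list)

    @property
    def is_clean(self):
        # True only when the two inventories agree completely.
        return not (
            self.missing_from_archive
            or self.unexpected_in_archive
            or self.checksum_mismatches
        )


def reconcile(provider_holdings, archive_holdings):
    """Compare two {granule_id: checksum} mappings (an assumed shape)."""
    provider_ids = set(provider_holdings)
    archive_ids = set(archive_holdings)
    return ReconciliationReport(
        # Held by the data provider but never archived.
        missing_from_archive=sorted(provider_ids - archive_ids),
        # Archived but no longer listed by the data provider.
        unexpected_in_archive=sorted(archive_ids - provider_ids),
        # Present on both sides with different recorded checksums.
        checksum_mismatches=sorted(
            gid
            for gid in provider_ids & archive_ids
            if provider_holdings[gid] != archive_holdings[gid]
        ),
    )
```

A report with `is_clean` equal to `False` would flag granules to re-archive or investigate. Whether a deleted provider granule should also be removed from the archive is a policy decision the slides leave open.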

EDPA Team Members
- ESDIS: Katie Baynes
- LP DAAC: Bradley Hazuka (Lead), Chris Torbert (USGS), Jacob Campbell, Matt Martens, Darla Werner, Jason Werpy
- PO DAAC: Michael Gangl, Michael McAuley
- ASF DAAC: Scott Arko, Chris Stoner

Prototype Goals
- Interface with Cumulus as a service and provide a simple backup and restore implementation for data recovery.
- Implement a prototype by the end of FY 17 that demonstrates the selected Preservation use cases.
- Refine Preservation capabilities and add Restoration capabilities in FY 18 based on selected use cases.

Prototype Use Cases
(Table columns: Category | Use Case | Description | Cumulus Phase)
- Preservation | Ingest | Archive data directly from Cumulus. | FY 17
- Automatic verification of archived data (checksums).
- Data Management | Maintain a catalog and references to all holdings. | FY 18
- Restoration | Distribution | Stage data for recovery using Cumulus Ingest.
- Reconcile and verify EDPA archived data with data provider holdings.
- Ensure ingest and receipt of ongoing data in addition to one-time ingest from data provider.
- Ensure recoverability through automated recoverability audits of products and processes.
- Storage/Sustainability | Utilize frozen storage that is durable and offline or near-line (cold).
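
The use cases name checksums as the mechanism for automatic verification but do not specify an algorithm or interface. The following is a sketch under stated assumptions: SHA-256 is chosen here purely for illustration, and `file_checksum` and `verify_archived_copy` are hypothetical helper names, not part of EDPA or Cumulus. It shows how an archived file could be re-hashed in chunks and compared with the checksum recorded at ingest.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large granules fit in memory.

    The default algorithm is an assumption; the slides only say "checksums".
    """
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archived_copy(path, expected_checksum, algorithm="sha256"):
    """Return True when the archived file still matches its recorded checksum."""
    # hexdigest() is lowercase, so normalize the recorded value before comparing.
    return file_checksum(path, algorithm) == expected_checksum.lower()
```

Running `verify_archived_copy` on a schedule against every cataloged file is one possible way to approach the "automated recoverability audits" use case, although a full audit would presumably also exercise the restore path.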

Prototype Preservation Ingest Use Case

Prototype Preservation Ingest Use Case (Continued)

Prototype Assumptions
- The implementation will be a service-oriented architecture.
- The EDPA prototype does not address AWS egress cost and is not associated with customer distribution of data.
- The EDPA prototype is a proof of concept that will utilize AWS and on-premises resources.
- The EDPA prototype will utilize Cumulus data sets from LP DAAC, PO DAAC, and NSIDC DAAC.

Prototype Assumptions (Continued)
- The EDPA prototype will utilize Cumulus data recipes, requiring little or no change to Cumulus.
- Data restoration through the EDPA prototype will be done via Cumulus Ingest.
- The EDPA prototype will maintain a simple design with limited functionality and scale.
- The EDPA prototype will utilize available hardware.
- The EDPA prototype will not provide metrics to EMS.

Where does EDPA live?

Prototype Architecture

Prototype Preservation Flow

Prototype Restoration Flow

Prototype Status
- Received concurrence on the initial use cases and prototype design from ESDIS (Katie Baynes), LP DAAC, PO DAAC, and ASF DAAC.
- Standing up development environments to perform prototyping work.
- Cloud environment created using SGT contractor cloud provisioned space.
- Local hardware being set up for the prototype environment.
- Working on using the NGAP Sandbox Cumulus dev area.
- Working with the virtual team and local staff to refine architecture, flows, and design.

Roadmap
- Delivered EDPA Long Term Plan Quad Chart (July 15, 2017)
- Delivered EDPA Long Term Strategic Approach (July 28, 2017)
- Finalized initial use case ideas (August 11, 2017)
- Finalized use cases for FY 17 prototype (August 15, 2017)
- SE TIM Presentation on EDPA (August 30, 2017)
- Prototype Preservation Ingest functionality (Q4 FY 17)
- Prototype Restoration Distribution functionality (Q2 FY 18)
- On-board another Cumulus-enabled DAAC (TBD)

Questions

Contact Information
- Bradley Hazuka, brad.hazuka.ctr@usgs.gov, (605) 594-2667
- Darla Werner, darla.werner.ctr@usgs.gov, (605) 594-6178