NCEI’s Long Term Archive: Infrastructure, Processes, Volume and Trend


NCEI’s Long Term Archive: Infrastructure, Processes, Volume and Trend
Nancy Ritchey, Archive Branch Chief, NOAA’s National Centers for Environmental Information
Thursday, September 28, 2017
NOAA Satellite and Information Service | National Centers for Environmental Information

Outline
Who we are
The Numbers
Trusted Archive
Data Management Processes
Infrastructure
Be the Nation’s Trusted Authority for Environmental Information

NOAA’s National Centers for Environmental Information (NCEI)
Responsible for preserving and providing access to one of the most significant archives on Earth, with comprehensive oceanic, atmospheric, and geophysical data.
From the depths of the ocean to the surface of the sun, and from million-year-old sediment records to near real-time satellite images.
Nation’s leading authority for environmental information.

NCEI has a Nationwide Presence

NCEI Environmental Data Archive Volume
Increasing data volumes from station, model, radar, and satellite sources, due to the increase in satellite and model data.
2016 total: 28.6 petabytes (PB)
August 2017: 32.8 PB*
28 PB = storage in a stack of smartphones 17 Eiffel Towers high (1,063 feet each)
*Includes primary and secure data
[Chart: archive volume in petabytes]
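The two totals quoted on this slide can be turned into a growth figure with a few lines of arithmetic. The sketch below is only illustrative: the constant and function names are made up here, and it assumes the "2016 total" and "Aug 2017" numbers are directly comparable snapshots, which the slide does not state explicitly.

```python
# Figures quoted on the slide, in petabytes (PB).
VOLUME_2016_PB = 28.6
VOLUME_AUG_2017_PB = 32.8

# Height comparison quoted on the slide: 17 Eiffel Towers at 1,063 feet each.
EIFFEL_TOWER_FT = 1063
TOWER_COUNT = 17


def growth(start_pb: float, end_pb: float) -> tuple[float, float]:
    """Return the (absolute, relative) increase between two volumes."""
    absolute = end_pb - start_pb
    relative = absolute / start_pb
    return absolute, relative


absolute_pb, relative = growth(VOLUME_2016_PB, VOLUME_AUG_2017_PB)
stack_height_ft = TOWER_COUNT * EIFFEL_TOWER_FT

print(f"Increase: {absolute_pb:.1f} PB ({relative:.1%})")
print(f"Stack height: {stack_height_ft:,} feet")
```

Under those assumptions the archive grew by about 4.2 PB, or roughly 14.7%, between the two figures, and the smartphone stack comes to 18,071 feet.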

NCEI Environmental Data User-Requested Volume
Increasing data requests from station, model, radar, and satellite sources.
[Chart: user-requested volume in petabytes]

NCEI Cumulative Satellite Data Archival Volume
In situ adds 10 PB by 2025 and 18 PB by 2030.
Notes:
* 2005-2016 volumes are based on data archived at NCEI, some of which is compressed.
* 2017-2030 are estimated volumes, based on historical and projected values.
* Unless otherwise noted, units are in petabytes.
* JPSS includes legacy POES, S-NPP and planned JPSS missions.
* JPSS volumes include Retained Intermediate Products, which are used to generate other products. These are roughly equal in volume to the combined volume of JPSS Raw Data Record (RDR, L0) and Sensor Data Record (SDR, L1b).
* GOES-R / GOES includes both legacy GOES and GOES-R/S/T projections.
* "Other Satellite Products" includes a variety of smaller-volume satellite data and derived products.

Trusted Archive
Strengthen the trust of user communities, partners and the public by quantifiably demonstrating capabilities for ensuring data integrity and reliability over the long term.
NCEI leads four World Data Systems: Geophysics, Meteorology, Oceanography, and Paleoclimatology.
Examine processes and capabilities which will ultimately validate the trustworthiness of the archive.
Recommended option is the Core Trustworthy Data Repository, jointly offered by the Data Archiving and Networked Services archive in the Netherlands (creator of the Data Seal of Approval) and the International Council for Science World Data System (WDS).

Data Stewardship Maturity Matrix (DSMM)
A Unified Framework for Measuring Stewardship Practices Applied to Individual Digital Earth Sciences Data Products
http://tinyurl.com/DSMMintro
NCEI scope policy

DSMM Defines Measurable, Five-Level Progressive Practices in Nine Quasi-Independent Key Components

Processes: Data Management
Location-based processes
Piloted an End-to-End Data Stewardship process
Initiating Field Teams to support data stewardship throughout the organization
Developing an approach for assigning resources and prioritizing NCEI Product Areas
NCEI scope policy

Processes: Data Management
Testing a Cost Estimation Tool for Data Stewardship Tiers 1 and 2
Developing a business approach to using the Cost Model
Stewardship tiers (tier numbers paired with labels as they appear to be laid out on the slide):
6 National Services and International Leadership
5 Authoritative Records
4 Derived Products
3 Scientific Improvements
2 Enhanced Access and Basic Quality Assurance
1 Long-Term Preservation and Basic Access

Infrastructure: NCEI “Tomorrow”
[Architecture diagram. Recoverable labels: PRODUCERS; CONSUMERS; SIP; AIP; DIP; Queries; Results; External Catalogs (Data.gov, Google, WIS, WDS, CEOS, DataONE, etc.); Community-Specific “Thin” Portals; OneStop UI + data.noaa.gov; OneStop API (Geoportal + ElasticSearch services); Metadata WAFs; Metadata Docs; Metadata Database + Services; Metadata CRUD Tools; Enterprise Services; One-Off’s; S2N+ATRAC; Automations; Common Ingest System; Ingest Processes; Collection and Granule Metadata; Analytics Engine; MSN/OneStop Disk Storage + Public Cloud; Hyrax; ERDDAP; FTPS/HTTPS; TDS; WxS; LAS; CLASS; Common Submission (CS); CS Agg. Tools; Ingest; Access; Offsite Tape: Disaster Recovery* + System Backups (provided by NESDIS Ground Enterprise, NGE)]
* M2M for access in limited cases
Data Ingest: Common Ingest is a modular, configurable, scalable, and extensible system that includes GUI-based configuration and management, complete provenance tracking, and integrated security scanning. Deployed August 2017. Initiated migration of data streams across NCEI.
Archive: Multiple systems currently support NCEI archival. Initiating an effort to determine a scalable, extensible enterprise system approach.
Access: OneStop provides search and discovery. Common Access is a microservices-based platform that will consolidate historical and disparate access systems, enabling order fulfillment and certification.

Infrastructure: Ingest
Common Ingest: Architecture
[Architecture diagram. Recoverable labels: Data Provider; Remote IO; Aggregate; Submission; Move; Rename; Common Ingest Manager; Message Oriented Middleware (Queuing System); Archive; User Interface; Tracking DB; Tape Archive]
Common Ingest is a modular, configurable, scalable, and extensible system that includes GUI-based configuration and management, complete provenance tracking, and integrated security scanning. Deployed August 2017. Initiated migration of data streams across NCEI.
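The slide names a queuing system, a manager, and steps such as Move and Rename, but does not describe how Common Ingest is implemented. The sketch below is therefore not NCEI's design. It is a minimal, in-memory illustration of the general pattern (a queue feeding a manager that applies modular steps and records provenance), and every class, function, and file name in it is hypothetical.

```python
import queue
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Submission:
    """Hypothetical stand-in for one file handed over by a data provider."""
    filename: str
    location: str = "incoming"
    provenance: List[str] = field(default_factory=list)


def move(sub: Submission) -> None:
    # Illustrative "Move" step: relocate the file to a staging area.
    sub.location = "archive-staging"


def rename(sub: Submission) -> None:
    # Illustrative "Rename" step: normalise the file name to lower case.
    sub.filename = sub.filename.lower()


# Steps are configured as data, so they can be reordered or extended.
STEPS: List[Tuple[str, Callable[[Submission], None]]] = [
    ("move", move),
    ("rename", rename),
]


def run_manager(inbox: queue.Queue) -> List[Submission]:
    """Drain the queue, apply each step, and log provenance after every step."""
    done = []
    while True:
        try:
            sub = inbox.get_nowait()
        except queue.Empty:
            break
        for step_name, step in STEPS:
            step(sub)
            sub.provenance.append(f"{step_name}: {sub.location}/{sub.filename}")
        done.append(sub)
    return done


inbox = queue.Queue()
inbox.put(Submission("GOES16_Sample.NC"))
archived = run_manager(inbox)
print(archived[0].provenance)
```

Keeping the steps in a list rather than hard-coding them in the manager is one simple way to get the "modular, configurable" behaviour the slide describes. A real deployment would use a message broker and a tracking database rather than an in-process queue and a Python list.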

Infrastructure Services
MSN Service Model
NESDIS initiative to develop an enterprise infrastructure to support Satellite Application and Research and NCEI. Led by NCEI.
Leverage existing capabilities (e.g., OneStop)
Develop required, non-existent capabilities or replace antiquated technological solutions
Application Services: Data Acquisition (e.g., Common Ingest); Product Generation (e.g., Reprocessing); Data Discovery (e.g., OneStop); Data Dissemination (e.g., OneStop, THREDDS)
Platform Services: Database Services (e.g., Postgres, NoSQL); Web Services (e.g., Drupal, Apache); Catalog Services (e.g., OneStop APIs)
Infrastructure Services: Shared Compute Services (oVirt, Condor); Shared Storage Services (SCDR, Gluster); Processing Services (e.g., Cloud/NOAA HPC); Archive Services (CLASS Partition/Cloud)
OneStop architecture of tiered services aligns directly within the MSN service model.

Questions? Nancy.Ritchey@noaa.gov

www.ncei.noaa.gov
www.climate.gov
NCEI Climate Facebook: http://www.facebook.com/NOAANCEIclimate
NCEI Ocean & Geophysics Facebook: http://www.facebook.com/NOAANCEIoceangeo
NCEI Climate Twitter (@NOAANCEIclimate): http://www.twitter.com/NOAANCEIclimate
NCEI Ocean & Geophysics Twitter (@NOAANCEIocngeo): http://www.twitter.com/NOAANCEIocngeo