Hierarchical Data Format (HDF) Status Update

Slides:



Advertisements
Similar presentations
A PLFS Plugin for HDF5 for Improved I/O Performance and Analysis Kshitij Mehta 1, John Bent 2, Aaron Torres 3, Gary Grider 3, Edgar Gabriel 1 1 University.
Advertisements

The HDF Group November 3-5, 2009HDF/HDF-EOS Workshop XIII1 HDF5 Advanced Topics Elena Pourmal The HDF Group The 13 th HDF and HDF-EOS.
Different Streaming Technologies. Three major streaming technologies include:
1 Input/Output. 2 Principles of I/O Hardware Some typical device, network, and data base rates.
DM_PPT_NP_v01 SESIP_0715_JP Indexing HDF5: A Survey Joel Plutchak The HDF Group Champaign Illinois USA This work was supported by NASA/GSFC under Raytheon.
Support for NPP/NPOESS by The HDF Group Mike Folk, Elena Pourmal, Peter Cao The HDF Group June 30, NPOESS Data Formats Working Group.
Developing a NetCDF-4 Interface to HDF5 Data
HDF5 Tools Update Peter Cao - The HDF Group November 6, 2007 This report is based upon work supported in part by a Cooperative Agreement.
Parallel HDF5 Introductory Tutorial May 19, 2008 Kent Yang The HDF Group 5/19/20081SCICOMP 14 Tutorial.
HDF 1 HDF5 Advanced Topics Object’s Properties Storage Methods and Filters Datatypes HDF and HDF-EOS Workshop VIII October 26, 2004.
The HDF Group April 17-19, 2012HDF/HDF-EOS Workshop XV1 Introduction to HDF5 Barbara Jones The HDF Group The 15 th HDF and HDF-EOS Workshop.
1 HDF-EOS and Related Tools Status Update. 2 Overview.
Developing a NetCDF-4 Interface to HDF5 Data Russ Rew (PI), UCAR Unidata Mike Folk (Co-PI), NCSA/UIUC Ed Hartnett, UCAR Unidata Quincey Kozial, NCSA/UIUC.
1 High level view of HDF5 Data structures and library HDF Summit Boeing Seattle September 19, 2006.
HDF Project Update Mike Folk, Kent Yang, Elena Pourmal The HDF Group April 5, 2010 April 5, 2011Annual HDF Briefing to ESDIS1.
HDF5 A new file format & software for high performance scientific data management.
DM_PPT_NP_v01 SESIP_0715_AJ HDF Product Designer Aleksandar Jelenak, H. Joe Lee, Ted Habermann Gerd Heber, John Readey, Joel Plutchak The HDF Group HDF.
SCI-BUS is supported by the FP7 Capacities Programme under contract nr RI CloudBroker Platform integration into WS-PGRADE/gUSE Zoltán Farkas MTA.
February 2-3, 2006SRB Workshop, San Diego P eter Cao, NCSA Mike Wan, SDSC Sponsored by NLADR, NFS PACI Project in Support of NCSA-SDSC Collaboration Object-level.
HDF Mike Folk National Center for Supercomputing Applications Science Data Processing Workshop February 26-28, 2002 HDF Update HDF.
The future of MINC Robert D. Vincent
May 30-31, 2012HDF5 Workshop at PSI1 HDF5 at Glance Quick overview of known topics.
The HDF Group HDF5 Datasets and I/O Dataset storage and its effect on performance May 30-31, 2012HDF5 Workshop at PSI 1.
The HDF Group ESIP Summer Meeting HDF Studio John Readey The HDF Group 1 July 8 – 11, 2014.
Page 1 Status of HDF-EOS, Related Software, and Tools Abe Taaheri, Raytheon IIS HDF & HDF-EOS Workshp XIII Riverdale, MD November 4, 2009.
HDF 1 New Features in HDF Group Revisions HDF and HDF-EOS Workshop IX November 30, 2005.
The netCDF-4 data model and format Russ Rew, UCAR Unidata NetCDF Workshop 25 October 2012.
April 28, 2008LCI Tutorial1 Introduction to HDF5 Tools Tutorial Part II.
The HDF Group HDF5 Tools Updates Peter Cao, The HDF Group September 28-30, 20101HDF and HDF-EOS Workshop XIV.
Support for NPP/NPOESS by The HDF Group Mike Folk The HDF Group HDF and HDF-EOS Workshop XII October 17, 2008 Oct HDF and HDF-EOS Workshop XII1.
1 HDF-EOS Development Current Status and Schedule Larry Klein, Shen Zhao, Abe Taaheri and Ray Milburn L-3 Communications Government Services, Inc. September.
October 15, 2008HDF and HDF-EOS Workshop XII1 What will be new in HDF5?
Deutscher Wetterdienst
1 HDF-EOS Status, Related Tools and Issues. 2 Overview.
1 HDF5 Life cycle of data Boeing September 19, 2006.
A High performance I/O Module: the HDF5 WRF I/O module Muqun Yang, Robert E. McGrath, Mike Folk National Center for Supercomputing Applications University.
The HDF Group Milestone 5.1: Initial POSIX Function Shipping Demonstration Jerome Soumagne, Quincey Koziol 09/24/2013 © 2013 The HDF Group.
Page 1 TOOLKIT / HDF-EOS Status and Development Abe Taaheri, Raytheon IIS Aura DSWG meeting October 2007.
The HDF Group Support for NPP/NPOESS by The HDF Group Mike Folk, Elena Pourmal, Peter Cao The HDF Group November 5, 2009 November 3-5,
September 9, 2008SPEEDUP Workshop - HDF5 Tutorial1 Introduction to HDF5 Command-line Tools.
The HDF Group Introduction to netCDF-4 Elena Pourmal The HDF Group 110/17/2015.
July 20, Update on the HDF5 standardization effort Elena Pourmal, Mike Folk The HDF Group July 20, 2006 SPG meeting, Palisades, NY.
Comments from User Services C. Boquist/Code 423 The HDF Group Meeting 1 April 2009.
The HDF Group HDF5 Chunking and Compression Performance tuning 10/17/15 1 ICALEPCS 2015.
The HDF Group HDF Group Support for NPP/JPSS Mike Folk, Elena Pourmal, Larry Knox, Albert Cheng The HDF Group DEWG Meeting June 19, 2012.
The HDF Group January 8, ESIP Winter Meeting Data Container Study: HDF5 in a POSIX File System or HDF5 C 3 : Compression, Chunking,
Support for NPP/NPOESS by The HDF Group Mike Folk, Elena Pourmal The HDF Group Annual HDF Briefing to ESDIS March 31, 2009 March Annual HDF Briefing.
ITMT Windows 7 Configuration Chapter 4 – Working with Disks and Devices ITMT 1371 – Windows 7 Configuration 1.
Copyright © 2010 The HDF Group. All Rights Reserved1 Data Storage and I/O in HDF5.
The HDF Group Introduction to HDF5 Session Three HDF5 Software Overview 1 Copyright © 2010 The HDF Group. All Rights Reserved.
HDF and HDF-EOS Workshop XII
The Post Windows Operating System
HDF Product Designer: Using Templates to Achieve Interoperability
Elena Pourmal The HDF Group
Hierarchical Data Formats (HDF) Update
Moving from HDF4 to HDF5/netCDF-4
HDF5 New Features October 8, 2017
Introduction to HDF5 Session Five Reading & Writing Raw Data Values
NetCDF 3.6: What’s New Russ Rew
Plans for an Enhanced NetCDF-4 Interface to HDF5 Data
Introduction to HDF5 Tutorial.
Efficiently serving HDF5 via OPeNDAP
Chapter 2: The Linux System Part 1
Peter Cao The HDF Group November 28, 2006
Introduction to HDF5 Mike McGreevy The HDF Group
Moving applications to HDF
Status for Endeavor 6: Improved Scientific Data Access Infrastructure
HDF-EOS Workshop XXI / The 2018 ESIP Summer Meeting
Elena Pourmal The HDF Group HDF Workshop July 17, 2018
Adapting an existing web server to S3
Presentation transcript:

Hierarchical Data Format (HDF) Status Update ESIP Summer 2018 Elena Pourmal EED2 Technical Lead epourmal@hdfgroup.org This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C. This document does not contain technology or Technical Data controlled under either the U.S. International Traffic in Arms Regulations or the U.S. Export Administration Regulations.

Outline Update on current HDF releases Moving to HDF5 1.10 series New features Moving to HDF5 1.10 series Controlling HDF5 file versioning Taking advantage of HDF compression What is coming in HDF5 1.12? Non-POSIX I/O and new defaults Getting help with HDF software and data

Current HDF releases HDF5 1.8.21 (June 2018) HDFView 3.0 (June 2018) Vulnerability patches Tools fixes Support for Intel Fortran v 18 compiler on Windows There will be one more maintenance release of HDF5 1.8 version. It is time to move to HDF5 1.10 series! HDFView 3.0 (June 2018) For more info see https://hdfgroup.org

Current HDF releases HDF5 1.10.2 ( March 2018) Vulnerability patches Enabling control over the HDF5 file versioning Enabling compression for parallel (MPI I/O) writes HDF5 1.10.3 is coming later this year Parallel compression enhancements See https://hdfgroup.org for details

Moving to HDF5 1.10 series Controlling HDF5 file versioning HDF5 library is ALWAYS backward compatible New version of the library will always read files created by the earlier versions HDF5 library is forward compatible By default the library will create objects in a file that can be read by the earlier versions of the library HDF5 file does not have a version Versioning is done on an object level

Moving to HDF5 1.10 series Q: How one can assure that HDF5 files created by HDF5 library version 1.10 and later will be read by the applications based on HDF5 1.8 and earlier? A: Use H5Pset_libver_bounds( hid_t fapl_id, H5F_libver_t low, H5F_libver_t high ) in applications Uses file access property to specify the features that can be created by the library specified by the “high” parameter and latest versions of the objects available in the library specified by the “low” parameter. H5F_LIBVER_EARLIEST, H5F_LIBVER_V18, H5F_LIBVER_V110 , H5F_LIBVER_LATEST

Moving to HDF5 1.10 series Taking advantage of HDF5 compression Compression works for both sequential and parallel (MPI I/O) writes/reads HDF5 supports GZIP and SZIP compressions Open Source and free SZIP from German Climate Computing Center https://www.dkrz.de/redmine/projects/aec/wiki/Downloads Fully compatible with SZIP provided by The HDF Group (encoder is not free for commercial data usage) Multiple third-party compressions available as plugins; see https://portal.hdfgroup.org/display/support/Contributions One compression doesn’t fit all data!

HDF5 Compression Using compression with Sentinel Data HDF5 file that was created by converting Sentinel 1 GeoTiff file. File contains one 32-bit integer array with dimensions 20256x25478; dimensions correspond to the number of image strips stored in the original Sentinel 1 GeoTiff file. Compression Compression ratio File size in bytes No compression 1 2065283096 (2GB) SZIP 1.062 1944126897 GZIP 1.966 1049969129 SHUFFLE + GZIP 2.192 941879752 (< 1GB)

HDF5 Compression Using compression with SeaSat Data HDF5 file contained 3 datasets Table below shows CR for each dataset when using GZIP, SZIP and combinations of SHUFFLE and GZIP Different compressions (highlighted) can be applied to get compression ratio (CR) of 1.9 Compression CR HH CR latitude CR longitude Total file size in bytes TCR Original file (GZIP) 1.167 2.693 2.747 407848072 1 SZIP 1.337 3.789 4.423 317040127 1.29 SHUFFLE + GZIP 1.329 20.049 24.003 216176244 1.89

HDF5 Compression Using compression with SeaSat Data Compression and decompression will differ depending on the method Table below shows elapsed times for the h5repack to encode data and h5dump to display data for SeaSat file. Compression TCR Time to compress using h5repack   Time to decompress with h5dump SZIP 1.29 0:11.34 6:21.98 BLOSC 1.59 0:13.87 6:15.92 SHUFFLE + GZIP 1.89 0:20.91 6:31.29

What is coming in HDF5 1.12? New defaults and file format changes UTF-8 encoding for strings (vs. current ASCII encoding) Setting “low” to H5F_LIBVER_V18 (vs. H5F_LIBVER_EARLIEST in H5Pset_libver_bounds( hid_t fapl_id, H5F_libver_t low, H5F_libver_t high ) Better performance for groups and attributes traversals No limitation on the attribute sizes File format extensions to address misc. file format issues (e.g., 64-bit dataspaces encoding)

What is coming in HDF5 1.12? Virtual Object Layer to perform I/O to any storage including Object Storage Plugin architecture for VOL plugins REST VOL plugin VOL plugins in progress: RADOS: Reliable Autonomic Distributed Object Store is part of CEPH distributed storage system. DAOS: Distributed Asynchronous Object Storage (DAOS) is an open-source software-defined object store.

Virtual Object Layer HDF5 APIs VOL Layer DAOS plugin HDF5 plugin REST ADIOS plugin HDF5 library internals Virtual File Driver ADIOS File on POSIX File System MPI I/O SEC2 S3 DAOS Object Store Amazon Cloud HDF5 File on POSIX File System

Questions?

This work was supported by NASA/GSFC under Raytheon Co This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C. in partnership with