HDF5 New Features October 8, 2017


1 HDF5 New Features October 8, 2017
Elena Pourmal Copyright 2017, The HDF Group.

2 Outline: Latest HDF releases; Upcoming HDF releases; Future work (HDF5 1.8, 1.10 and 1.12 releases); Features backlog

3 Latest HDF Releases HDF5 1.8.19 (June 2017)
Functions for managing compression plugin locations: several H5PL (C) APIs were added to manipulate the entries of the plugin path table: H5PLappend, H5PLget, H5PLinsert, H5PLprepend, H5PLremove, H5PLreplace, and H5PLsize. Optimized functions for reading compressed raw data chunks directly from the file: H5Dget_chunk_storage_size, H5DOread_chunk. C++ wrappers for handling links and HDF5 objects; improved class hierarchy. Support for OpenMPI. Improvements for the h5repack and h5diff tools when used with external compression libraries. Supported on Linux (CentOS, Debian, Fedora, Ubuntu), Windows 7 and 10, Cygwin, Mac OS X, AIX, SunOS 5.11. Do you have a system we don't support? We would love to hear from you!
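The H5PL calls above all operate on an ordered table of directories that the library searches for compression plugins. As a rough illustration of their semantics only (this is a hypothetical pure-Python model, not the real C API; method names merely mirror the H5PL function names):

```python
class PluginPathTable:
    """Hypothetical model of the HDF5 plugin path table semantics."""

    def __init__(self, paths=None):
        self._paths = list(paths or [])

    def append(self, path):          # H5PLappend: add to the end
        self._paths.append(path)

    def prepend(self, path):         # H5PLprepend: add to the front
        self._paths.insert(0, path)

    def insert(self, index, path):   # H5PLinsert: add at a given slot
        self._paths.insert(index, path)

    def replace(self, index, path):  # H5PLreplace: overwrite a slot
        self._paths[index] = path

    def remove(self, index):         # H5PLremove: delete a slot
        del self._paths[index]

    def get(self, index):            # H5PLget: read a slot
        return self._paths[index]

    def size(self):                  # H5PLsize: number of entries
        return len(self._paths)
```

For example, prepending a directory makes it the first location searched, which is how an application would give a locally built plugin priority over a system-wide one.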

4 Latest HDF Releases HDF5 1.10.1 (May 2017)
Performance improvements: metadata cache image, metadata cache evict on close, paged aggregation, page buffering. Addressed the issue with the SWMR file locking scheme. C++ wrappers and classes (in support of the features above, plus missing H5L and H5O wrappers and more). Added support for the NAG compiler. Note: H5DOread_chunk is not available in this release.

5 HDF5 1.10.1 Features Metadata (MD) Cache Image
Improves performance for MD-intensive applications that use restart files. Writes the content of the MD cache in a single block on file close, then populates the MD cache with the content of this block on file open, avoiding many small I/O operations. Use with caution with parallel HDF5 applications: the open call has to be collective to avoid deadlocks. This issue will be resolved in the future. There is a forward compatibility issue when HDF5 1.8 reads a file with a cache image; the h5clear tool with the -m or --image flag removes the cache image so the file can be opened by HDF5 1.8.

6 HDF5 1.10.1 Features Metadata Evict on Close
Reduces application memory footprint. The HDF5 library is conservative about holding on to HDF5 object metadata (object headers, chunk index structures, etc.), so the MD cache size may grow, resulting in memory pressure on the application's system. A new access property causes all metadata for an object to be evicted from the cache when the object is closed, as long as that metadata is not referenced from any other open object.

7 HDF5 1.10.1 Features Paged Aggregation Page Buffering
Paged Aggregation: improves I/O by performing I/O operations on aligned pages. HDF5 file space allocation accumulates small pieces of metadata and raw data in aggregator blocks (pages). The feature provides control over the block sizes and block alignment in the file via a file creation property. It works along with File Space Management and enables free-space tracking in the file to reduce wasted space. The feature is disabled when SWMR access is enabled (i.e., no tracking of free space in the file). Page Buffering: improves performance on parallel file systems by avoiding small and random I/O accesses. Works with the Paged Aggregation feature.
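The core idea of paged aggregation, keeping small allocations from straddling page boundaries so that I/O can always be done in whole aligned pages, can be sketched in a few lines. This is a simplified illustration, not the library's actual allocator (real HDF5 additionally keeps metadata and raw data in separate page types and tracks free space):

```python
def place_allocation(offset, size, page_size):
    """Choose the file offset for a new allocation of `size` bytes,
    given the current end-of-data `offset`.

    If a small allocation would cross the next page boundary, it is
    bumped to start on that boundary, so every page contains only
    whole small allocations and I/O can happen on aligned pages.
    """
    page_end = ((offset // page_size) + 1) * page_size
    if size <= page_size and offset + size > page_end:
        return page_end  # start on the next page boundary instead
    return offset
```

With a 4096-byte page, a 200-byte allocation requested at offset 4000 would be moved to offset 4096 rather than spanning two pages, at the cost of a little wasted space (which is why the feature pairs with free-space tracking).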

8 Data and Metadata Aggregators
The new aggregators pack small raw data and metadata allocations into aligned blocks that work with the page buffer. Two points: when these features are not enabled, metadata and raw data are scattered in the file; when they are enabled, small pieces of metadata and raw data are aggregated into pages, and the library does I/O on a page instead of on individual metadata or raw data items, as shown on the next two slides. (Diagram: small data and metadata allocations scattered through an HDF5 file.)

9 HDF5 Page Buffering Page buffer contains MD pages (L2 cache)
(Diagram: metadata blocks in the HDF5 file are aligned and are multiples of 64K.)

10 Page Buffering HDF5 APIs
(Diagram: HDF5 internals (MD cache, preparation for I/O, L2 cache) sit above the HDF5 VFD, which performs I/O on pages against the HDF5 file.)

11 HDF5 1.10.1 Features SWMR file locking issue
The file locking feature was introduced to guard against "unauthorized access" to a file under construction: it prevents multiple writers from modifying a file, and prevents readers from accessing a file under construction in non-SWMR mode. The file locking calls used in HDF5 1.10.0 (including patch1) will fail when the underlying file system does not support file locking or where locks have been disabled. An environment variable named HDF5_USE_FILE_LOCKING can be set to 'FALSE' to disable locking; it then becomes the user's responsibility to avoid problematic access patterns (e.g., multiple writers accessing the same file). The error message was improved to identify the file locking problem.
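The library reads HDF5_USE_FILE_LOCKING itself; an application or wrapper script that wants to mirror that decision (for example, to warn the user before opening a file on NFS) could check the variable the same way. A minimal sketch, assuming the documented behavior that only the value 'FALSE' disables locking:

```python
import os

def use_file_locking(default=True):
    """Report whether file locking would be in effect, honoring the
    HDF5_USE_FILE_LOCKING environment variable ('FALSE' disables it).

    This mirrors the documented variable for illustration; the HDF5
    library itself performs this check internally.
    """
    val = os.environ.get("HDF5_USE_FILE_LOCKING")
    if val is None:
        return default       # variable unset: locking stays enabled
    return val.upper() != "FALSE"
```

A wrapper could use this to print a reminder that, with locking disabled, avoiding concurrent writers is the user's responsibility.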

12 What is coming next? HDF5 1.8 releases (Fall 2017, Spring and Fall 2018, Spring 2019). The HDF Group will drop HDF5 1.8.* in mid-2019; we encourage you to move to HDF5 1.10.* ASAP, and please report all problems you encounter to The HDF Group (end of 2017 or Q1 of 2018). VDS fixes: opening VDS in parallel mode; resolving the location of source files relative to the master file. Compatibility issues with HDF5 1.8: H5Pset_libver_bounds will set lower and upper bounds for a specific minor release (1.6, 1.8, 1.10; e.g., H5F_LIBVER_V18), and will create files fully compatible with HDF5 1.8 when used with H5F_LIBVER_LATEST. Optimizations for file open/close operations for parallel HDF5 applications.

13 What is coming next? HDF5 1.10.*
Addressing performance degradation before we drop HDF5 1.8. We observe 2x or sometimes more performance degradation when applications are moved from 1.6 to 1.8 to 1.10. We identified a major issue in the HDF5 property lists code; there could be other places in the library that contribute too. If you have examples that show a performance drop from release to release, please send them to us. Performance benchmark suite: a regression test suite (parallel and sequential I/O kernels) with results published on CDash; open source, contributions are welcome.

14 What is coming next? POSIX is going away (like Fortran); data is moving to object stores. HDF5 has to be portable between traditional file systems and the storage systems of the future: the HDF5 Virtual Object Layer (VOL) and a VFD for object stores.

15 HDF5 Virtual Object Layer (VOL)
Goal: provide an application with the HDF5 data model and API, but allow different underlying storage mechanisms. A new layer below the HDF5 API layer intercepts all API calls that could potentially touch data on disk and routes them to a Virtual Object Driver. Potential object drivers (or plugins): native HDF5 driver (writes to an HDF5 file); raw driver (maps groups to file system directories and datasets to files in those directories); remote driver (the file exists on a remote machine); drivers to other file formats (netCDF, HDF4, ADIOS, FITS, etc.). VOL properties: stackable, dynamically loaded.
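The two VOL properties named above, interception of API calls and stackability, can be illustrated with a toy dispatcher. This is a hypothetical sketch of the idea only (class and method names are invented; the real VOL is a C callback interface):

```python
class VOLConnector:
    """Interface every connector implements: one method per
    intercepted HDF5 operation (only one shown here)."""
    def dataset_write(self, name, data):
        raise NotImplementedError

class NativeVOL(VOLConnector):
    """Stand-in for the native driver: 'stores' data in a dict
    where the real connector would write to an HDF5 file."""
    def __init__(self):
        self.store = {}
    def dataset_write(self, name, data):
        self.store[name] = data

class LoggingVOL(VOLConnector):
    """A stackable pass-through connector: records each call,
    then forwards it to the connector underneath."""
    def __init__(self, under):
        self.under = under
        self.log = []
    def dataset_write(self, name, data):
        self.log.append(("write", name))
        self.under.dataset_write(name, data)
```

Stacking LoggingVOL on NativeVOL shows why stackability matters: the application calls one API, while tracing, caching, or remoting layers compose underneath it transparently.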

16 HDF5 VOL - an interface to non-HDF5 storage

17 HDF5 VOL - an interface to non-HDF5 storage
Different file format plugins:

18 HDF5 S3 VFD Goal: serve HDF5 files from an object store, using the existing HDF5 library and new VFD drivers to access the HDF5 file (work in progress). The VFD uses range gets to read the desired data from the HDF5 file. R/O case: optimization is performed to avoid small metadata accesses; metadata is extracted and stored along with the original file, and the VFD loads this metadata information into a buffer on the file open operation. An ingestion tool is required to create the object with metadata information. R/W case: takes advantage of the paged allocation feature introduced in HDF5 1.10.1; the VFD tracks allocations for the pages containing metadata and raw data for an object.
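A "range get" is just an HTTP GET with a Range header asking for a byte span of the object, which is how a VFD can read one chunk or one metadata page without downloading the whole file. A minimal sketch of the header such a request would carry (the byte range format comes from HTTP; how the real S3 VFD batches requests is an implementation detail not shown here):

```python
def range_header(offset, nbytes):
    """Build the HTTP Range header value for reading `nbytes`
    starting at byte `offset`; the end position is inclusive,
    per the HTTP byte-range convention."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    return f"bytes={offset}-{offset + nbytes - 1}"
```

For example, reading a 4 KiB metadata page at offset 0 sends `Range: bytes=0-4095`; the avoidance of many such small requests is exactly the motivation for the metadata extraction step described above.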

19 HDF5 SWMR VFD
Goal: implement real-time full SWMR with minimal modifications to the HDF5 library. Approach: select a time t that is the maximum acceptable delay between a data write and a data read, and define one tick to be t/3. Writer: use MD aggregation to allocate all metadata in pages. Modify the page buffer to track pages changed during the current tick or still being modified. Modify the APIs' FUNC_LEAVE (TBD) macro to check if the current tick has expired and invoke writer_end_of_tick to do the following: flush the MD cache to the page buffer; write modified pages to the backing store (POSIX, NFS, object store); construct/update an index mapping the base addresses of all MD pages to their locations in the backing store, replacing the old version of the index with the new one; release space on the backing store holding pages and indices that are more than x (TBD) ticks old; make a note of the start time of the new tick; resume normal processing.
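The tick arithmetic in the approach above is simple but worth pinning down: with a maximum acceptable delay t and a tick of t/3, the writer checks on leaving each API call whether the current tick has run out. A small sketch of that bookkeeping (function names are illustrative, not from the HDF5 source):

```python
def tick_length(max_delay):
    """One tick is a third of the maximum acceptable
    write-to-read delay t, as defined in the approach."""
    return max_delay / 3.0

def tick_expired(tick_start, now, max_delay):
    """Would the FUNC_LEAVE-style check fire?  True once the
    current tick's time budget has been used up."""
    return (now - tick_start) >= tick_length(max_delay)
```

So with t = 3 s, the writer publishes a consistent set of pages and a fresh index at least every second, bounding how stale a reader's view can get.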

20 HDF5 SWMR VFD (cont'd) Reader:
Must use the page buffer, configured with sufficient metadata pages. Modify the VFD to intercept page reads and satisfy them from the index discussed under the writer. Modify the MD cache to invalidate entries with base addresses within a specific range of addresses. Modify the FUNC_ENTER macros to see if the current tick has expired and call reader_start_of_tick, which: directs the VFD to reload the index; determines which pages have been modified since the last time the index was resolved; for each modified page, evicts it from the page buffer and instructs the MD cache to invalidate all entries located in the modified page; notes the start time of the new tick, so its end can be detected; resumes normal processing.
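The reader's "determine which pages have been modified" step amounts to diffing the freshly reloaded index against the one it last resolved: any page whose backing-store location changed, or which is new, must be evicted and invalidated. A sketch under the assumption that the index maps a page's base address to its backing-store location:

```python
def modified_pages(old_index, new_index):
    """Return the base addresses of pages whose backing-store
    location changed (or which appeared) since the reader last
    resolved the index.  These are the pages the reader must
    evict from the page buffer and invalidate in the MD cache."""
    return sorted(addr for addr, loc in new_index.items()
                  if old_index.get(addr) != loc)
```

Because the writer replaces the whole index atomically at end of tick, this comparison gives the reader a consistent set of pages to refresh before resuming.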

21 What is coming next? HDF5 1.12.0 (Q4 2018) VOL architecture
VOL plugins: HDF5 REST VOL plugin. The REST VOL substitutes REST API requests for file I/O actions; C/Fortran applications should be able to run as is. In development; a beta is expected by the end of 2017. HDF5 JSON plugin. Parallel compression: available in the HDF5 develop branch. Parallel performance improvements.

22 What is coming next? HDF5 “Enterprise Edition”
HDF5 library add-ons and cloud solutions: S3 VFD; Spark connector for HDF5; library of HDF5 compression filters; HDF Cloud solution (HSDS server). Thinking about future HDF5 (ideas): a community edition and an enterprise version of the library.

23 What is coming next? Library of HDF5 compression filters (work in progress): a library of HDF5 filter plugins registered with The HDF Group. Filters have a BSD-style license. Distributed as source code and/or pre-built binaries for Windows, Mac, Linux, and a system of the user's choice. Tested with the HDF5 releases. Version 1 includes filters to assure interoperability with h5py and PyTables, and popular compression methods for floating-point data: BLOSC, BZIP2, LZ4, LZF, MAFISC, ZFP, and SZIP from the German Climate Computing Center. Includes test scripts to run h5repack with the user's data, and user-level documentation.

24 What is coming next? Highly Scalable Data Service (HSDS)
RESTful interface to HDF5 using object storage. Storage using AWS S3: built-in redundancy, cost effective, scalable throughput. Runs as a cluster of Docker containers and elastically scales compute with usage. Feature compatible with the HDF5 library. Implemented in Python using asyncio, with task-oriented parallelism.

25 Object Storage Challenges for HDF
Not POSIX! High latency (>0.1 s) per request. Not write/read consistent. High throughput needs some tricks (use many async requests). Request charges can add up (public cloud). So for HDF5, using the HDF5 library directly on an object storage system is a challenge; an alternative solution is needed.

26 HSDS S3 Schema Big Idea: Map individual HDF5 objects (datasets, groups, chunks) as Object Storage Objects How to store HDF5 content in S3? Limit maximum storage object size Support parallelism for read/write Only data that is modified needs to be updated (Potentially) Multiple clients can be reading/updating the same “file” Each chunk (heavy outlines) get persisted as a separate object Legend: Dataset is partitioned into chunks Each chunk stored as an S3 object Dataset meta data (type, shape, attributes, etc.) stored in a separate object (as JSON text)

27 Client/Server Architecture

28 Architecture for HSDS
Each node is implemented as a Docker container. Legend: Client: any user of the service. Load balancer: distributes requests to service nodes. Service nodes: process requests from clients (with help from data nodes). Data nodes: each responsible for a partition of the object store. Object store: base storage service (e.g., AWS S3).

29 What is coming next? HDF5 backlog
SWMR reimplementation based on page buffering; journaling; sub-filing; AIO; query and indexing of HDF5 user metadata and raw data; full SWMR implementation; MWMR (multiple writer/multiple reader) implementation; data streaming; multi-threading.

30 Fighting Technical Debt
Poor or non-existent developer-level documentation slows software development, and training new developers is glacial; we are developing procedures to repair this, but it will take years. Recent features were merged quickly, before they were ready, resulting in bugs, poor performance, and hacks to paper over design flaws; last-minute design changes complicated matters further. Solution: better design procedures and sufficient resources for sustaining engineering. Many incomplete or prototype features sit in development branches; in the future, contract separately for prototype development vs. full implementation and integration into the main development branch, and deal with the backlog according to priorities and available resources. Excessive complexity of some modules in the library (e.g., the metadata cache): where practical, refactor/redesign to abstract out the complexity.

31 Thank you! Questions?

