1
Using Compression filters in HDF5
HDF5's new external filter interface in action Euge Wintersberger ICALEPCS 2017,
2
Motivation
Applying different compression algorithms to individual datasets is one of the key features of HDF5.
    Apply compression only where feasible
    Other data can be read and written without any performance penalty
    We can pick the optimum algorithm for each dataset
Performance key figures for a compression algorithm:
    Throughput (MB/s)
    Compression ratio
Both depend on the nature of the data passed to the algorithm!
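The dependence of both key figures on the data can be illustrated with a small sketch. It uses zlib from the Python standard library as a stand-in for an HDF5 filter, and the two "frames" are hypothetical test buffers, not real detector data.

```python
import os
import time
import zlib


def measure(data, level=6):
    """Return (compression ratio, throughput in MB/s) for zlib on `data`."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(compressed)
    throughput = len(data) / 1e6 / max(elapsed, 1e-9)
    return ratio, throughput


# Two hypothetical buffers of equal size but very different nature.
sparse_frame = bytes(1_000_000)       # all zeros, highly compressible
noisy_frame = os.urandom(1_000_000)   # random bytes, essentially incompressible

for name, frame in (("sparse", sparse_frame), ("noisy", noisy_frame)):
    ratio, mb_per_s = measure(frame)
    print(f"{name:>6}: ratio {ratio:8.1f}, throughput {mb_per_s:8.1f} MB/s")
```

The absolute numbers depend on the machine and the algorithm. What the sketch shows is that the same algorithm yields very different ratios on different inputs, which is why choosing the filter per dataset pays off.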
3
The situation before HDF5 1.8.11
Could use custom filter algorithms for reading and writing

    #include <hdf5.h>

    #define H5Z_FILTER_BZIP2 305

    /* declare a filter function */
    size_t H5Z_filter_bzip2(unsigned flags, size_t cd_nelmts,
                            const unsigned cd_values[], size_t nbytes,
                            size_t *buf_size, void **buf);

    const H5Z_class2_t H5Z_BZIP2[1] = {{
        H5Z_CLASS_T_VERS,                /* H5Z_class_t version */
        (H5Z_filter_t)H5Z_FILTER_BZIP2,  /* Filter id number */
        1,                               /* encoder_present flag (set to true) */
        1,                               /* decoder_present flag (set to true) */
        "bzip2",                         /* Filter name for debugging */
        NULL,                            /* The "can apply" callback */
        NULL,                            /* The "set local" callback */
        (H5Z_func_t)H5Z_filter_bzip2,    /* The actual filter function */
    }};

    /* somewhere in the code */
    status = H5Zregister(H5Z_BZIP2);

Two issues
    Need to change the source code
    Not possible for commercial applications!
Currently used
    Eiger detector
    PyTables
    h5py
4
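New approach since HDF5 1.8.11
[Diagram: the application calls the HDF5 library, which uses the filter ID to pick the matching plugin (libLZ4.so, libbitshuffle.so, libBZ2.so) from the directories given by HDF5_PLUGIN_PATH=...]
The library looks for the appropriate filter by itself using the ID of the filter!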
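To check what the library will be able to find, the search locations can be listed from Python. This is a minimal sketch, not the library's own lookup code: it only mirrors the directory search, assuming the HDF5_PLUGIN_PATH variable (entries separated by the platform's path separator) and the Unix default directory /usr/local/hdf5/lib/plugin. The function names `plugin_dirs` and `visible_plugins` are hypothetical helpers.

```python
import os
from pathlib import Path

# Default used by HDF5 on Unix-like systems when HDF5_PLUGIN_PATH is not set.
DEFAULT_PLUGIN_DIR = "/usr/local/hdf5/lib/plugin"


def plugin_dirs(env=None):
    """Return the directories HDF5 would search for filter plugins."""
    env = os.environ if env is None else env
    raw = env.get("HDF5_PLUGIN_PATH", DEFAULT_PLUGIN_DIR)
    return [Path(p) for p in raw.split(os.pathsep) if p]


def visible_plugins(env=None):
    """Return the shared libraries found in the plugin directories."""
    suffixes = {".so", ".dylib", ".dll"}
    found = []
    for directory in plugin_dirs(env):
        if directory.is_dir():
            found.extend(
                sorted(p for p in directory.iterdir() if p.suffix in suffixes)
            )
    return found


for lib in visible_plugins():
    print(lib)
```

The real library goes one step further: it opens each candidate and asks it for its filter ID, so a file listed here is only a candidate, not a guarantee that the filter loads.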
5
Where to get the filter plugins?
Supported platforms: Windows, Linux, macOS
6
Installing the filters – on Windows
7
Install the filters – on Linux (Debian)
Add repository key and sources list

    $ wget -q -O - | apt-key add -
    $ cd /etc/apt/sources.list.d
    $ wget

Install the package

    $ apt-get update
    $ apt-get install hdf5-plugin-lz4
8
Install the filters – on Linux (Ubuntu)
Supported versions
    Ubuntu 14.04 (Trusty Tahr)
    Ubuntu 16.04 (Xenial Xerus)
9
Install the filters – on macOS
Installing the dependencies

    $ brew install cmake
    $ brew install git
    $ brew install hdf5
    $ brew install lz4

Build the code

    $ git clone
    $ cd HDF5-External-Filter-Plugins
    $ git checkout new_build
    $ cmake -DENABLE_LZ4_PLUGIN=ON -DENABLE_BITSHUFFLE_PLUGIN=ON \
        -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local/opt/hdf5
    $ make
    $ make test
    $ make install

Make installation available
10
Using the filter plugins (from Python)
Reading – there is nothing you have to do
Writing

    import h5py

    f = h5py.File("bitshuffle_file.h5", "w")
    filter_id = 32008
    d1 = f.create_dataset("with_lz4", (100, 100), compression=filter_id,
                          compression_opts=(0, 2))
    d2 = f.create_dataset("without_lz4", (100, 100), compression=filter_id)

No additional packages need to be imported
You need to know
    The filter's ID
    The compression options accepted by the filter
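Writing fails if the plugin cannot be found, so it can help to check for the filter before creating datasets. The following is a sketch under a few assumptions: the helper names `bitshuffle_lz4_opts` and `write_demo` are hypothetical, the option tuple simply reproduces the `(0, 2)` used above (taken here to mean default block size plus LZ4), and `h5py.h5z.filter_avail` is assumed to trigger the plugin search in the underlying HDF5 library.

```python
BITSHUFFLE_ID = 32008   # filter ID used in the writing example
BSHUF_USE_LZ4 = 2       # second option value used in the writing example


def bitshuffle_lz4_opts(block_size=0):
    """Build compression_opts for bitshuffle followed by LZ4.

    Assumption: a block size of 0 leaves the choice to the filter.
    """
    return (block_size, BSHUF_USE_LZ4)


def write_demo(path):
    """Write one bitshuffle+LZ4 dataset, failing early if the plugin is missing."""
    # Imported here so that the helper above can be used without h5py.
    import h5py
    import numpy as np

    if not h5py.h5z.filter_avail(BITSHUFFLE_ID):
        raise RuntimeError(
            "bitshuffle plugin not found; check HDF5_PLUGIN_PATH"
        )

    data = np.arange(100 * 100, dtype="uint32").reshape(100, 100)
    with h5py.File(path, "w") as f:
        f.create_dataset(
            "with_lz4",
            data=data,
            compression=BITSHUFFLE_ID,
            compression_opts=bitshuffle_lz4_opts(),
        )
```

Calling `write_demo("bitshuffle_file.h5")` on a machine with the plugin installed should produce the same kind of file as the writing example. Without the plugin it raises a readable error instead of failing inside the HDF5 library.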
11
Current status
Included filters: BZIP2, LZ4, LZ4+bitshuffle
Installation packages for: Windows (VS2015), Linux (Debian, Ubuntu)
Simplified build for Windows using Conan
12
Todos
Create GitHub pages
Update the documentation
Review of the LZ4 API calls for the new LZ4 1.4 version
BLOSC filter is still missing
Installation packages for
    macOS
    RPM-based Linux distributions (Red Hat, CentOS, …)
Update Debian packages
13
Thank you for your attention!
Questions?