Using Compression filters in HDF5


HDF5's new external filter interface in action
Euge Wintersberger
ICALEPCS 2017, 8.10.2017

Motivation
Applying different compression algorithms to individual datasets is one of the key features of HDF5.
  Apply compression only where feasible
  Other data can be read and written without any performance penalty
  We can pick the optimum algorithm for each dataset
Performance key figures for a compression algorithm:
  Throughput (Mbyte/sec)
  Compression ratio
Both depend on the nature of the data passed to the algorithm!
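To make the last point concrete, here is a minimal sketch that measures both key figures for two inputs of equal size but very different nature. It uses only the Python standard library: zlib and bz2 stand in for HDF5 filters here as an assumption for illustration, and the input size and the helper name measure are hypothetical choices, not part of the talk.

```python
import bz2
import os
import time
import zlib


def measure(compress, data):
    """Return (compression ratio, throughput in MB/s) for one compressor."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(packed)
    # Guard against a zero reading from a coarse timer.
    throughput = len(data) / 1e6 / max(elapsed, 1e-9)
    return ratio, throughput


# Two inputs of equal size but very different nature.
size = 1_000_000
samples = {
    "constant": bytes(size),     # highly redundant
    "random": os.urandom(size),  # essentially incompressible
}
# Stand-ins for HDF5 filters (assumption for illustration only).
codecs = {"zlib": zlib.compress, "bz2": bz2.compress}

results = {}
for data_name, data in samples.items():
    for codec_name, compress in codecs.items():
        results[(data_name, codec_name)] = measure(compress, data)

for (data_name, codec_name), (ratio, mb_per_s) in results.items():
    print(f"{data_name:8s} {codec_name:4s} ratio={ratio:8.1f} {mb_per_s:8.1f} MB/s")
```

The constant input should compress by orders of magnitude, while the random input should not shrink at all. The throughput figures depend on the machine and are only meaningful relative to each other.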

The situation before HDF5 1.8.11
Could use custom filter algorithms for reading and writing

    #include <hdf5.h>

    #define H5Z_FILTER_BZIP2 305

    /* declare a filter function */
    size_t H5Z_filter_bzip2(unsigned flags, size_t cd_nelmts,
                            const unsigned cd_values[], size_t nbytes,
                            size_t *buf_size, void **buf);

    const H5Z_class2_t H5Z_BZIP2[1] = {{
        H5Z_CLASS_T_VERS,               /* H5Z_class_t version */
        (H5Z_filter_t)H5Z_FILTER_BZIP2, /* Filter id number */
        1,                              /* encoder_present flag (set to true) */
        1,                              /* decoder_present flag (set to true) */
        "bzip2",                        /* Filter name for debugging */
        NULL,                           /* The "can apply" callback */
        NULL,                           /* The "set local" callback */
        (H5Z_func_t)H5Z_filter_bzip2,   /* The actual filter function */
    }};

    /* somewhere in the code */
    status = H5Zregister(H5Z_BZIP2);

Two issues
  Need to change source code
  Not possible for commercial applications!
Currently used
  Eiger detector
  PyTables
  h5py

New approach since HDF5 1.8.11
[Diagram: the Application calls the HDF5 library, which uses the filter ID to load the matching plugin (libLZ4.so, libbitshuffle.so, libBZ2.so) from the directories in HDF5_PLUGIN_PATH=...]
The library looks for the appropriate filter by itself using the ID of the filter!
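When a filter is not found, a useful first check is which shared libraries sit in the directories named by HDF5_PLUGIN_PATH. The sketch below does only that and uses only the standard library. It does not reproduce the library's lookup by filter ID, and it assumes the variable is split with the platform's usual path-list separator. The helper name visible_plugins and the file name libh5lz4.so are hypothetical.

```python
import os
import tempfile

# File suffixes of shared libraries on Linux, macOS and Windows.
PLUGIN_SUFFIXES = (".so", ".dylib", ".dll")


def visible_plugins(environ=os.environ):
    """List shared-library files in the directories named by HDF5_PLUGIN_PATH."""
    raw = environ.get("HDF5_PLUGIN_PATH", "")
    found = []
    # Assumption: entries are separated by os.pathsep (":" or ";").
    for directory in filter(None, raw.split(os.pathsep)):
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if name.endswith(PLUGIN_SUFFIXES):
                found.append(os.path.join(directory, name))
    return found


# Demonstration with a throwaway directory and empty stand-in files.
with tempfile.TemporaryDirectory() as plugin_dir:
    open(os.path.join(plugin_dir, "libh5lz4.so"), "wb").close()  # hypothetical name
    open(os.path.join(plugin_dir, "README.txt"), "wb").close()
    demo = visible_plugins({"HDF5_PLUGIN_PATH": plugin_dir})
    demo_names = [os.path.basename(path) for path in demo]
    print(demo_names)
```

Calling visible_plugins() without arguments inspects the real environment of the current process.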

Where to get the filter plugins?
Supported platforms
  Windows
  Linux
  macOS

Installing the filters – on Windows

Install the filters – on Linux (Debian)
Add repository key and sources list
    $ wget -q -O - http://repos.pni-hdri.de/debian_repo.pub.gpg | apt-key add -
    $ cd /etc/apt/sources.list.d
    $ wget http://repos.pni-hdri.de/jessie-pni-hdri.list
Install the package
    $ apt-get update
    $ apt-get install hdf5-plugin-lz4

Install the filters – on Linux (Ubuntu)
Supported versions
  Ubuntu 14.04 (Trusty Tahr)
  Ubuntu 16.04 (Xenial Xerus)

Install the filters – on macOS
Installing the dependencies
    $ brew install cmake
    $ brew install git
    $ brew install hdf5
    $ brew install lz4
Build the code
    $ git clone https://github.com/nexusformat/HDF5-External-Filter-Plugins.git
    $ cd HDF5-External-Filter-Plugins
    $ git checkout new_build
    $ cmake -DENABLE_LZ4_PLUGIN=ON -DENABLE_BITSHUFFLE_PLUGIN=ON \
        -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local/opt/hdf5 .
    $ make
    $ make test
    $ make install
Make installation available

Using the filter plugins (from Python)
Reading – there is nothing you have to do
Writing

    import h5py

    f = h5py.File("bitshuffle_file.h5", "w")
    filter_id = 32008
    d1 = f.create_dataset("with_lz4", (100, 100), compression=filter_id,
                          compression_opts=(0, 2))
    d2 = f.create_dataset("without_lz4", (100, 100), compression=filter_id)

No additional packages need to be imported
You need to know
  The filter's ID
  The compression options accepted by the filter
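The two things you need to know can be kept in one place. The sketch below builds the keyword arguments for h5py's create_dataset from a filter name. It is plain Python, so it runs without h5py or any plugin installed. The helper compression_kwargs and the table FILTER_IDS are hypothetical names. The IDs are the publicly registered ones for LZ4 and bitshuffle, and the reading of the bitshuffle options (block size, then compressor) is an assumption based on that filter's documentation.

```python
# Registered HDF5 filter IDs (32008 is the one used on the slide).
FILTER_IDS = {"lz4": 32004, "bitshuffle": 32008}

# Assumed meaning of the second bitshuffle option: 2 selects LZ4 after the shuffle.
BITSHUFFLE_LZ4 = 2


def compression_kwargs(filter_name, options=None):
    """Build the keyword arguments for h5py's create_dataset."""
    kwargs = {"compression": FILTER_IDS[filter_name]}
    if options is not None:
        kwargs["compression_opts"] = tuple(options)
    return kwargs


# Block size 0 is assumed to let the filter choose its own block size.
with_lz4 = compression_kwargs("bitshuffle", (0, BITSHUFFLE_LZ4))
without_lz4 = compression_kwargs("bitshuffle")
print(with_lz4)
print(without_lz4)
```

With h5py and the plugin available, the result would be passed on as f.create_dataset("with_lz4", (100, 100), **with_lz4), which matches the call on the slide.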

Current status
Included filters:
  BZIP2
  LZ4
  LZ4+bitshuffle
Installation packages for:
  Windows (VS2015)
  Linux (Debian, Ubuntu)
Simplified build for Windows using Conan

Todos
Create GitHub pages
Update the documentation
Review of the LZ4 API calls for the new LZ4 1.4 version
BLOSC filter is still missing
Installation packages for
  macOS
  RPM based Linux distributions (RedHat, CentOS, …)
Update Debian packages

Thank you for your attention! Questions?