The HDF Group
Multi-threading in HDF5: Paths Forward
Current implementation - Future directions
May 30-31, 2012, HDF5 Workshop at PSI
Outline
- Introduction
- Current implementation
- Paths forward:
  - Improve concurrency
  - Reduce latency
- Conclusions and recommendations
INTRODUCTION
Introduction
- HDF5 design principles:
  - Flexibility
  - Adaptability to new computational environments
- Current challenge: multi-threaded applications run on multi-core systems, but the thread-safe HDF5 library cannot support the concurrency built into such applications.
CURRENT IMPLEMENTATION
Current Implementation
- HDF5 uses a single global semaphore that controls modification of memory and file data structures.
- One thread at a time enters the library:
  - An application thread enters an HDF5 API routine and acquires the semaphore.
  - Other threads are blocked until that thread completes the API call and releases the semaphore.
- No simultaneous modifications of data structures that could cause file corruption.
- No race conditions when several threads try to modify a memory data structure.
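The enter-acquire-release pattern above can be sketched with a single process-wide pthread mutex. All names here (`H5_api_call_sketch`, `run_threads_demo`, the `file_metadata` counter) are illustrative stand-ins, not HDF5 internals:

```c
/* Sketch of the single-global-lock pattern: every "API call" takes one
 * process-wide mutex, so only one thread is inside the library at a time.
 * Names are illustrative; this is not HDF5's actual code. */
#include <pthread.h>

static pthread_mutex_t h5_global_lock = PTHREAD_MUTEX_INITIALIZER;
static int file_metadata = 0;          /* stand-in for shared library state */

/* hypothetical API wrapper: serializes all callers */
int H5_api_call_sketch(int delta)
{
    pthread_mutex_lock(&h5_global_lock);   /* block until library is free */
    file_metadata += delta;                /* modify shared state safely */
    int result = file_metadata;
    pthread_mutex_unlock(&h5_global_lock); /* let the next thread in */
    return result;
}

#define NTHREADS 8

static void *worker(void *arg)
{
    (void)arg;
    H5_api_call_sketch(1);                 /* each thread makes one call */
    return NULL;
}

/* spawn NTHREADS threads, wait for all, return the final shared value */
int run_threads_demo(void)
{
    pthread_t tid[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return H5_api_call_sketch(0);          /* read without modifying */
}
```

The mutex guarantees the counter reaches exactly NTHREADS with no lost updates, which is precisely the corruption-free guarantee the global semaphore provides, at the cost of full serialization.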
Current Implementation
Pros:
- Provides the thread-safety needed to avoid corruption of data structures.
Cons:
- No concurrent use of the HDF5 library by multi-threaded applications.
PATHS FORWARD
Improving Concurrency
- Replace the single global semaphore with semaphores that guard individual data structures.
Pros:
- Greater level of concurrency with no corruption of internal data structures.
- Each thread waits only when it needs to modify a data structure locked by another thread, reducing the time spent waiting for a resource to become available.
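A minimal sketch of the fine-grained alternative: one mutex per internal structure, so threads touching different structures proceed in parallel. The structure names (`chunk_cache`, `metadata_tree`) are hypothetical, and the fixed lock order in `update_both` hints at the deadlock-avoidance analysis that makes this path expensive:

```c
/* Sketch of fine-grained locking: one mutex per internal structure
 * (illustrative names, not HDF5 internals). Threads touching different
 * structures run concurrently; operations needing both structures must
 * follow a fixed lock order to avoid deadlock. */
#include <pthread.h>

typedef struct { pthread_mutex_t lock; int data; } guarded_t;

static guarded_t chunk_cache   = { PTHREAD_MUTEX_INITIALIZER, 0 };
static guarded_t metadata_tree = { PTHREAD_MUTEX_INITIALIZER, 0 };

void update_cache(int v)
{
    pthread_mutex_lock(&chunk_cache.lock);
    chunk_cache.data += v;
    pthread_mutex_unlock(&chunk_cache.lock);
}

void update_metadata(int v)
{
    pthread_mutex_lock(&metadata_tree.lock);
    metadata_tree.data += v;
    pthread_mutex_unlock(&metadata_tree.lock);
}

/* operations needing both structures always lock cache before metadata */
void update_both(int v)
{
    pthread_mutex_lock(&chunk_cache.lock);
    pthread_mutex_lock(&metadata_tree.lock);
    chunk_cache.data += v;
    metadata_tree.data += v;
    pthread_mutex_unlock(&metadata_tree.lock);
    pthread_mutex_unlock(&chunk_cache.lock);
}

int cache_value(void)
{
    pthread_mutex_lock(&chunk_cache.lock);
    int v = chunk_cache.data;
    pthread_mutex_unlock(&chunk_cache.lock);
    return v;
}

int metadata_value(void)
{
    pthread_mutex_lock(&metadata_tree.lock);
    int v = metadata_tree.data;
    pthread_mutex_unlock(&metadata_tree.lock);
    return v;
}
```

Two such mutexes are easy; doing this consistently across every interacting structure in a 300K-line library is what drives the 4-6 FTE-year estimate on the next slide.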
Improving Concurrency
Cons:
- Replacing the global semaphore with individual semaphores, locks, etc. requires careful analysis of HDF5 data structures and their interactions.
- The library's 300K lines of C code would require 4-6 FTE-years of knowledgeable staff.
- Significant future maintenance effort.
- Testing challenges.
Reducing Latency
- Reduce the time each thread waits to acquire the global semaphore.
- Reduce time spent inside the library by removing known HDF5 bottlenecks:
  - I/O performance
  - “Compute bound” (CB) operations:
    - Datatype conversions
    - Compression and other filters
  - General overhead, e.g., structures for storing and accessing chunked datasets and metadata
Reducing Latency: I/O Performance
- Support asynchronous I/O (AIO) access to data in an HDF5 file:
  - AIO is initiated within the library in response to an API call.
  - The I/O completes in the background after the API call has returned.
  - The global semaphore is released when the API call returns, so threads spend less time waiting.
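The AIO idea can be sketched with a background writer thread: the "API call" hands off the write and returns immediately, so a conceptual global lock could be released while the I/O proceeds. The `async_write_*` names and the temp-file path are assumptions for illustration; a real implementation would use platform AIO facilities rather than a thread per request:

```c
/* Sketch of the asynchronous-I/O idea: start the write, return at once,
 * let the caller wait for completion later. Illustrative names; a real
 * implementation would use platform AIO facilities. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    pthread_t   thread;
    const char *path;
    const char *data;
} async_write_t;

static void *writer(void *arg)
{
    async_write_t *w = arg;
    FILE *f = fopen(w->path, "w");       /* I/O runs in the background */
    if (f) { fputs(w->data, f); fclose(f); }
    return NULL;
}

/* returns immediately; I/O proceeds in the background */
int async_write_start(async_write_t *w, const char *path, const char *data)
{
    w->path = path;
    w->data = data;
    return pthread_create(&w->thread, NULL, writer, w);
}

/* block until the background I/O is done */
int async_write_finish(async_write_t *w)
{
    return pthread_join(w->thread, NULL);
}

/* demo: start a background write, do other work, then wait and verify */
int async_write_demo(void)
{
    async_write_t w;
    const char *path = "/tmp/async_write_sketch.txt"; /* assumed writable */
    if (async_write_start(&w, path, "chunk-data") != 0) return -1;
    /* ...caller computes here while the write proceeds... */
    if (async_write_finish(&w) != 0) return -1;
    char buf[32] = {0};
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    fgets(buf, sizeof buf, f);
    fclose(f);
    return strcmp(buf, "chunk-data") == 0 ? 0 : -1;
}
```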
Reducing Latency: CB Operations
- Use multiple threads within the HDF5 library to:
  - Perform datatype conversion
  - Perform compression on one chunk (multiple threads work on one chunk)
  - Perform compression on many chunks (each thread works on one chunk)
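The many-chunks case above maps naturally onto one worker thread per chunk. In this sketch the byte-flip "filter" is a stand-in for a real filter such as deflate; the threading pattern, not the filter, is the point, and all names are illustrative:

```c
/* Sketch of per-chunk parallelism: each worker thread "compresses" one
 * chunk. The XOR byte-flip is a stand-in for a real filter (e.g. deflate);
 * the fan-out/join threading pattern is what the slide describes. */
#include <pthread.h>

#define NCHUNKS    4
#define CHUNK_SIZE 256

typedef struct {
    unsigned char in[CHUNK_SIZE];
    unsigned char out[CHUNK_SIZE];
} chunk_t;

static void *compress_chunk(void *arg)
{
    chunk_t *c = arg;
    for (int i = 0; i < CHUNK_SIZE; i++)
        c->out[i] = (unsigned char)(c->in[i] ^ 0xFF);  /* stand-in filter */
    return NULL;
}

/* launch one thread per chunk, then wait for all of them */
void compress_all(chunk_t chunks[NCHUNKS])
{
    pthread_t tid[NCHUNKS];
    for (int i = 0; i < NCHUNKS; i++)
        pthread_create(&tid[i], NULL, compress_chunk, &chunks[i]);
    for (int i = 0; i < NCHUNKS; i++)
        pthread_join(tid[i], NULL);
}

/* demo: fill chunks with known bytes, compress in parallel, verify */
int compress_all_demo(void)
{
    static chunk_t chunks[NCHUNKS];
    for (int c = 0; c < NCHUNKS; c++)
        for (int i = 0; i < CHUNK_SIZE; i++)
            chunks[c].in[i] = (unsigned char)(c + i);
    compress_all(chunks);
    for (int c = 0; c < NCHUNKS; c++)
        for (int i = 0; i < CHUNK_SIZE; i++)
            if (chunks[c].out[i] != (unsigned char)((c + i) ^ 0xFF))
                return -1;
    return 0;
}
```

Because chunks are independent, this needs no locking at all, which is why per-chunk parallelism is much cheaper to add than library-wide concurrency.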
Reducing Latency: General Optimizations
- Traditional optimizations; some examples:
  - Algorithm improvements for handling:
    - Chunk cache
    - Hyperslab selections
    - Memory usage
  - Data structure improvements:
    - Chunk indices with O(1) lookup speed
    - Advanced B-tree implementations
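As one illustration of an O(1) chunk index, a hash table can map a chunk's logical offset to its address in the file, giving constant average lookup versus O(log n) for a B-tree. This is a toy sketch with assumed names (`index_insert`, `index_lookup`); HDF5's actual chunk indices differ:

```c
/* Sketch of an O(1) chunk index: a fixed-size open-addressing hash table
 * mapping a chunk's logical offset to its file address. Illustrative only;
 * not HDF5's actual index structure. Assumes the table never fills up. */
#include <stdint.h>

#define NBUCKETS 1024   /* power of two for cheap masking */

typedef struct {
    uint64_t offset;    /* chunk's logical offset in the dataset */
    uint64_t addr;      /* chunk's address in the file */
    int      used;
} slot_t;

static slot_t table[NBUCKETS];

/* Fibonacci-style multiplicative hash; take high bits so chunk-aligned
 * offsets (low bits all zero) still spread across buckets */
static unsigned hash(uint64_t offset)
{
    return (unsigned)(((offset * 2654435761u) >> 16) & (NBUCKETS - 1));
}

void index_insert(uint64_t offset, uint64_t addr)
{
    unsigned h = hash(offset);
    while (table[h].used && table[h].offset != offset)  /* linear probing */
        h = (h + 1) & (NBUCKETS - 1);
    table[h].offset = offset;
    table[h].addr   = addr;
    table[h].used   = 1;
}

/* returns the chunk's file address, or -1 if the chunk is absent */
int64_t index_lookup(uint64_t offset)
{
    unsigned h = hash(offset);
    while (table[h].used) {
        if (table[h].offset == offset)
            return (int64_t)table[h].addr;
        h = (h + 1) & (NBUCKETS - 1);
    }
    return -1;
}
```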
Reducing Latency
Pros:
- Smaller development effort, ~1.5 FTE-years
- Localized changes to the library, easier to maintain
- Incremental improvements
Cons:
- Still uses the global semaphore
RECOMMENDATIONS
Decision: Reduce Latency
Decision factors:
- Available expertise
- Cost
- Already-funded features:
  - AIO
  - Using multiple threads to compress a chunk
- Future maintainability
Other Considerations
- The two approaches are not mutually exclusive.
- Both can be implemented in the future if funding is available.
The HDF Group
Thank You!
Questions?