Locality-driven High-level I/O Aggregation for Processing Scientific Datasets
  Jialin Liu, Bradly Crysler, Yin Lu, Yong Chen
  Data-Intensive Scalable Computing Laboratory (DISCL)
  Seminar, Oct. 15
Introduction
  - Scientific simulations now generate a few terabytes (TB) of data in a single run, and data sizes are expected to reach petabytes (PB) in the near future.
  - Example: VPIC (Vector Particle in Cell), plasma physics, 26 bytes per particle, 30 TB.
  - Accessing and analyzing the data reveals poor I/O performance, caused by the mismatch between logical access and physical layout.
Introduction
  - Scientific datasets and scientific I/O libraries: PnetCDF, HDF5, ADIOS
  - I/O stack (figure): PnetCDF, MPI-IO, parallel file systems
  - Scientific I/O libraries allow users to specify array-based logical input.
  - Logical-physical mismatch
Motivation
  I/O methods in scientific I/O libraries (PnetCDF, ADIOS, HDF5):
    Method             Process collaboration   Call collaboration
    Independent I/O    No                      No
    Collective I/O     Yes                     No
    Nonblocking I/O    Yes                     Yes
Motivation
  - Contention on storage servers without locality awareness
  - Figure: calls 0, 1, ..., i each go through two-phase collective I/O with their own aggregators (ag 00 to ag 03 for call 0, ag 10 to ag 13 for call 1, ag i0 to ag i3 for call i).
Performance with Overlapping Calls
  - Conclusion: overlapping should be removed.
Idea: High-level I/O Aggregation
  Figure: logical input, decomposition, and physical layout
  - Call 0: start{0,0,0} length{100,200,200}, decomposed into
      start{0,0,0} length{100,200,100}
      start{0,0,100} length{100,200,100}
  - Call 1: start{10,20,100} length{10,300,400}, decomposed into
      start{10,20,100} length{10,150,400}
      start{10,170,100} length{10,150,400}
  - The resulting sub-arrays appear as sub 0 to sub 3 on the physical layout.
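The decomposition on this slide can be reproduced with a short Python sketch. The helper name `split_request` is hypothetical, and the split axis and number of parts are picked by hand here to match the figure. In the slides the decomposition is driven by the physical layout, which this sketch does not model.

```python
from typing import List, Tuple

# A request is a (start, length) pair, one entry per array dimension.
Request = Tuple[Tuple[int, ...], Tuple[int, ...]]


def split_request(start, length, axis, parts) -> List[Request]:
    """Split one array request into `parts` equal sub-arrays along `axis`."""
    if length[axis] % parts != 0:
        raise ValueError("this sketch only handles evenly divisible lengths")
    step = length[axis] // parts
    subs = []
    for k in range(parts):
        sub_start = list(start)
        sub_length = list(length)
        sub_start[axis] = start[axis] + k * step  # shift along the split axis
        sub_length[axis] = step                   # shrink along the split axis
        subs.append((tuple(sub_start), tuple(sub_length)))
    return subs


# Call 0 is split along the third dimension, Call 1 along the second.
call0_subs = split_request((0, 0, 0), (100, 200, 200), axis=2, parts=2)
call1_subs = split_request((10, 20, 100), (10, 300, 400), axis=1, parts=2)
```

With these inputs the four sub-arrays have the same start and length values as the ones listed on the slide.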
Idea: High-level I/O Aggregation
  Basic idea
  - Figure out the overlapping among requests
  - Eliminate the overlapping before doing I/O
  Challenges
  - How to decompose the requests
  - How to aggregate the sub-arrays at a high level
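As an illustration of "figuring out the overlapping", the sketch below intersects two array requests given in start/length form. It works purely on logical coordinates; the function name `overlap` is an assumption for this example and is not taken from the slides.

```python
def overlap(a, b):
    """Return the (start, length) region shared by two requests, or None."""
    (a_start, a_len), (b_start, b_len) = a, b
    start, length = [], []
    for s1, l1, s2, l2 in zip(a_start, a_len, b_start, b_len):
        lo = max(s1, s2)            # first index covered by both requests
        hi = min(s1 + l1, s2 + l2)  # one past the last shared index
        if hi <= lo:
            return None             # disjoint along this dimension
        start.append(lo)
        length.append(hi - lo)
    return tuple(start), tuple(length)


# Two example calls: start{0,0,0} length{100,200,200}
# and start{10,20,100} length{10,300,400}.
call0 = ((0, 0, 0), (100, 200, 200))
call1 = ((10, 20, 100), (10, 300, 400))
shared = overlap(call0, call1)
```

Here `shared` is the block start{10,20,100} length{10,180,100}. If the two calls are served separately, those elements are read twice.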
HiLa: High-level I/O Aggregation
  - Way to figure out the physical layout:
      sub-correlation function
      sub-correlation set
  - Lustre striping: stripe size t, stripe count l
  - Dataset: dimension d, subset size m
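To make the role of the striping parameters concrete, the following sketch maps an array element to a storage target under plain round-robin striping with stripe size t and stripe count l. It assumes a row-major array stored contiguously from byte 0 (no file header) and striping that starts at target 0. It is not the sub-correlation function itself, which the slide text does not spell out, and the function names are hypothetical.

```python
def element_offset(index, dims, elem_size):
    """Byte offset of an element in a row-major array stored from byte 0."""
    linear = 0
    for i, n in zip(index, dims):
        linear = linear * n + i
    return linear * elem_size


def ost_index(offset, stripe_size, stripe_count):
    """Storage target holding `offset` under round-robin striping."""
    return (offset // stripe_size) % stripe_count


# Toy layout: 4 x 8 array of 8-byte elements, t = 64 bytes, l = 2.
# Each row is exactly one stripe, so rows alternate between the two targets.
dims, elem_size, t, l = (4, 8), 8, 64, 2
row_targets = [ost_index(element_offset((r, 0), dims, elem_size), t, l)
               for r in range(dims[0])]
```

In this toy layout a request covering one column touches both targets, while a request covering one row touches a single target. This is the kind of locality information the aggregation step can exploit.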
HiLa Algorithm: Prior Step
  - Calculate the sub-correlation set (one-time analysis)
HiLa Algorithm: Decomposition
  - Main steps: request decomposition and aggregation
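The two main steps can be illustrated on byte extents. The sketch below is a simplified stand-in rather than the HiLa algorithm from the slides: it cuts each request at stripe boundaries (decomposition), groups the pieces by storage target, and merges overlapping pieces per target (aggregation). Round-robin striping starting at target 0 is assumed, and all names are hypothetical.

```python
from collections import defaultdict


def split_by_stripe(offset, length, stripe_size):
    """Yield (offset, length) pieces that never cross a stripe boundary."""
    end = offset + length
    while offset < end:
        stripe_end = (offset // stripe_size + 1) * stripe_size
        piece_end = min(end, stripe_end)
        yield offset, piece_end - offset
        offset = piece_end


def aggregate(extents, stripe_size, stripe_count):
    """Group request pieces by storage target and merge overlaps per target."""
    per_target = defaultdict(list)
    for off, length in extents:
        for p_off, p_len in split_by_stripe(off, length, stripe_size):
            target = (p_off // stripe_size) % stripe_count
            per_target[target].append((p_off, p_off + p_len))
    plan = {}
    for target, spans in per_target.items():
        spans.sort()
        merged = [list(spans[0])]
        for lo, hi in spans[1:]:
            if lo <= merged[-1][1]:          # overlaps or touches the last span
                merged[-1][1] = max(merged[-1][1], hi)
            else:
                merged.append([lo, hi])
        plan[target] = [(lo, hi - lo) for lo, hi in merged]
    return plan


# Two overlapping requests: bytes [0, 150) and [100, 200).
plan = aggregate([(0, 150), (100, 100)], stripe_size=64, stripe_count=2)
total_bytes = sum(n for pieces in plan.values() for _, n in pieces)
```

The two requests ask for 250 bytes in total, but the merged plan reads only the 200 distinct bytes, and each target receives one sorted list of extents.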
Improvement with HiLa
  - Performance improved with HiLa
Improvement with HiLa
  - FASM improved with HiLa
Conclusion and Future Work
  Conclusion
  - The mismatch between logical access and physical layout can lead to poor performance.
  - We propose a locality-driven high-level aggregation approach (HiLa) that supports the existing I/O methods by eliminating the overlapping among sub-array requests.
  Future work
  - Apply HiLa to write operations
  - Integrate with file systems
Thanks
  Q&A