1
Parallel HDF5 Introductory Tutorial
May 19, 2008
Kent Yang, The HDF Group
help@hdfgroup.org
2
Outline
Overview of basic HDF5 concepts
Overview of the Parallel HDF5 design
MPI-IO vs. Parallel HDF5
Overview of the Parallel HDF5 programming model
The benefits of using Parallel HDF5
Situations where Parallel HDF5 may not work well
3
Overview of Basic HDF5 Concepts
4
What is HDF5?
A file format for managing any kind of data
Software (library and tools) for accessing data in that format
Especially suited for large and/or complex data collections
Platform independent
C, F90, C++, and Java APIs
5
Example HDF5 file
[Figure: an example HDF5 file shown as a tree. The root group "/" and a group "/foo" hold raster images, a palette, a 3-D array, a 2-D array, and a table such as:]
lat | lon | temp
----|-----|-----
 12 |  23 | 3.1
 15 |  24 | 4.2
 17 |  21 | 3.6
6
Special Storage Options
Chunked: better subsetting access time; extendable
Compressed: improves storage efficiency and transmission speed
Extendable: arrays can be extended in any direction
Split file: metadata in one file, raw data in another (e.g., metadata for dataset "Fred" in File A, data for "Fred" in File B)
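To make these storage options concrete, here is a minimal sketch (not taken from the tutorial) of how they are selected through a dataset creation property list in the C API; file_id, the dataset name "Fred", and all dimensions are illustrative assumptions, and the fragment is meant to sit inside a function where the file is already open.

    /* Assumes hdf5.h is included and file_id is an already-open HDF5 file. */
    hsize_t dims[2]    = {100, 64};
    hsize_t maxdims[2] = {H5S_UNLIMITED, 64};  /* extendable in the first dimension */
    hsize_t chunk[2]   = {20, 64};

    hid_t space = H5Screate_simple(2, dims, maxdims);

    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);              /* chunked layout (required for compression/extension) */
    H5Pset_deflate(dcpl, 6);                   /* gzip compression, level 6 */

    hid_t dset = H5Dcreate2(file_id, "Fred", H5T_NATIVE_FLOAT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);

Note that at the time of this tutorial, compressed (filtered) datasets could not be written in parallel; parallel filtered writes only appeared in much later HDF5 releases.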
7
Virtual File I/O Layer
Allows the HDF5 format address space to map to disk, the network, memory, or a user-defined device
[Figure: virtual file I/O drivers (stdio, file family, MPI I/O, memory, network, ...) map the HDF5 address space onto different kinds of "storage": a single file, a family of files, memory, the network, ...]
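The slides do not show how a driver in this layer is chosen; as a hedged illustration, a file access property list selects it. The sketch below uses the in-memory "core" driver, with an assumed 1 MiB allocation increment and file name, inside a function with hdf5.h included.

    /* Open an HDF5 file through the in-memory "core" driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_core(fapl, 1024 * 1024, /*backing_store=*/1);

    hid_t file_id = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    /* ... create and write objects as usual; the file format is the same for every driver ... */
    H5Fclose(file_id);
    H5Pclose(fapl);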
8
Overview of Parallel HDF5 Design
9
PHDF5 Requirements
MPI programming
PHDF5 files compatible with serial HDF5 files
  Shareable between different serial or parallel platforms
Single file image to all processes
  A one-file-per-process design is undesirable: expensive post-processing, and not usable by a different number of processes
Standard parallel I/O interface
  Must be portable to different platforms
10
PHDF5 Implementation Layers
[Figure: on the compute nodes of a parallel computing system (IBM AIX), the application sits on top of the HDF5 I/O library, which sits on top of the MPI-I/O parallel I/O library; below that, the switch network and I/O servers connect to the parallel file system (GPFS) and the disk architecture and layout of data on disk.]
PHDF5 is built on top of the standard MPI-IO API
11
Parallel Environment Requirements
MPI with MPI-IO, e.g.:
  MPICH2 ROMIO
  Vendor MPI-IO: IBM, SGI, etc.
A parallel file system, e.g.:
  GPFS
  Lustre
12
MPI-IO vs. HDF5
MPI-IO is an input/output API. It treats the data file as a "linear byte stream," and each MPI application needs to provide its own file view and data representation to interpret those bytes.
All stored data is machine-dependent except for the "external32" representation:
  external32 is defined as big-endian, so little-endian machines have to convert the data on both reads and writes
  64-bit data types may lose information
13
MPI-IO vs. HDF5 (cont.)
HDF5 is self-describing data management software. It stores data and metadata according to the HDF5 data format definition.
Each machine can store data in its own native representation for efficient I/O; any necessary data representation conversion is done automatically by the HDF5 library.
64-bit data types do not lose information.
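A brief sketch of the automatic-conversion point, not from the tutorial code: the dataset is declared with a big-endian file type, while the application always works with the machine's native int type, and the library performs any byte swapping. file_id and the dataset name are assumptions; the fragment belongs inside a function with hdf5.h included.

    int data[4] = {1, 2, 3, 4};
    hsize_t dims[1] = {4};

    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t dset  = H5Dcreate2(file_id, "counts", H5T_STD_I32BE, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Memory type is native; file type is big-endian; HDF5 converts as needed. */
    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset);
    H5Sclose(space);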
14
Programming Restrictions
Most PHDF5 APIs are collective
PHDF5 opens a parallel file with a communicator:
  Returns a file handle
  Future access to the file is via that file handle
  All processes must participate in collective PHDF5 calls
  Different files can be opened via different communicators (a hedged sketch follows)
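The sketch below illustrates the last point under stated assumptions: each half of MPI_COMM_WORLD opens its own file through its own communicator, and every rank in a given sub-communicator participates in the collective H5Fcreate. The file names and the even/odd split rule are made up for illustration; MPI_Init is assumed to have been called, with hdf5.h and mpi.h included.

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Comm subcomm;
    MPI_Comm_split(MPI_COMM_WORLD, rank % 2, rank, &subcomm);

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, subcomm, MPI_INFO_NULL);

    /* Even ranks collectively create even.h5; odd ranks collectively create odd.h5. */
    const char *name = (rank % 2 == 0) ? "even.h5" : "odd.h5";
    hid_t file_id = H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    H5Fclose(file_id);
    H5Pclose(fapl);
    MPI_Comm_free(&subcomm);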
15
Examples of PHDF5 API
Examples of collective PHDF5 calls:
  File operations: H5Fcreate, H5Fopen, H5Fclose
  Object creation: H5Dcreate, H5Dopen, H5Dclose
  Object structure: H5Dextend (increase dimension sizes)
Array data transfer can be collective or independent:
  Dataset operations: H5Dwrite, H5Dread
16
PHDF5 API Languages
C and F90 language interfaces
Platforms supported: most platforms with MPI-IO support
  IBM SP, Linux clusters, Cray XT3, SGI Altix
17
How to Compile PHDF5 Applications
h5pcc – HDF5 C compiler command (similar to mpicc)
h5pfc – HDF5 F90 compiler command (similar to mpif90)
To compile:
  % h5pcc h5prog.c
  % h5pfc h5prog.f90
18
Overview of Parallel HDF5 Programming Model
19
Creating and Accessing a File
Programming model:
  HDF5 uses an access template object (property list) to control the file access mechanism
  General model for accessing an HDF5 file in parallel:
    Set up the MPI-IO access template (file access property list)
    Open the file
    Access the data
    Close the file
20
Parallel File Create

    /* Set up a file access property list with parallel (MPI-IO) access. */
    plist_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(plist_id, comm, info);

    /* Create the file collectively. */
    file_id = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id);
    H5Pclose(plist_id);

    /*
     * Close the file.
     */
    H5Fclose(file_id);

    MPI_Finalize();
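For context, the fragment above fits into a complete program roughly as follows. This is a minimal hedged sketch, not the tutorial's full example: error checking is omitted, and the file name and use of MPI_COMM_WORLD are assumptions.

    #include "hdf5.h"
    #include <mpi.h>

    #define H5FILE_NAME "ph5_example.h5"

    int main(int argc, char **argv)
    {
        MPI_Comm comm = MPI_COMM_WORLD;
        MPI_Info info = MPI_INFO_NULL;

        MPI_Init(&argc, &argv);

        /* 1. Set up the MPI-IO file access property list. */
        hid_t plist_id = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(plist_id, comm, info);

        /* 2. Create the file collectively (every rank in comm participates). */
        hid_t file_id = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id);
        H5Pclose(plist_id);

        /* 3. ... create dataspaces/datasets and read or write data here ... */

        /* 4. Close the file collectively. */
        H5Fclose(file_id);

        MPI_Finalize();
        return 0;
    }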
21
Writing and Reading Hyperslabs
Programming model:
  Distributed memory model: data is split among processes
  PHDF5 uses the hyperslab model
  Each process defines its memory and file hyperslabs
  Each process executes a partial write/read call, using either
    Collective calls
    Independent calls
22
Hyperslab Example: writing a dataset by columns
[Figure: the dataset in the file is divided into columns written by processes P0 and P1.]
23
Writing Dataset by Column
[Figure: memory and file layouts for P0 and P1. Each process's memory buffer is dimsm[0] x dimsm[1]; in the file, each process writes blocks of height block[0] and width block[1], starting at its own offset[1], with consecutive blocks separated by stride[1].]
24
Writing Dataset by Column

    /*
     * Each process defines a hyperslab in the file.
     */
    count[0]  = 1;
    count[1]  = dimsm[1];
    offset[0] = 0;
    offset[1] = mpi_rank;
    stride[0] = 1;
    stride[1] = 2;
    block[0]  = dimsm[0];
    block[1]  = 1;

    /*
     * Each process selects its hyperslab.
     */
    filespace = H5Dget_space(dset_id);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, stride, count, block);
25
Writing Dataset by Column
[Figure repeated from slide 23: memory and file hyperslab layouts for P0 and P1.]
26
Dataset Collective Write

    /*
     * Create a property list for a collective dataset write.
     */
    plist_id = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);

    status = H5Dwrite(dset_id, H5T_NATIVE_INT,
                      memspace, filespace, plist_id, data);

    H5Dclose(dset_id);
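The slides skip the setup that the selection and write above rely on. As a hedged sketch, assuming exactly two processes, illustrative dimensions, and the names file_id, mpi_rank, memspace, and dset_id from the surrounding example, the missing pieces might look like this (inside a function, hdf5.h and mpi.h included):

    hsize_t dimsm[2] = {8, 4};                    /* each process's in-memory block          */
    hsize_t dimsf[2] = {dimsm[0], 2 * dimsm[1]};  /* file dataset: columns from both ranks   */

    /* Create the dataset collectively (same call on every process). */
    hid_t filespace_all = H5Screate_simple(2, dimsf, NULL);
    hid_t dset_id = H5Dcreate2(file_id, "dataset1", H5T_NATIVE_INT, filespace_all,
                               H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Sclose(filespace_all);

    /* Memory dataspace: one contiguous dimsm[0] x dimsm[1] block per process. */
    hid_t memspace = H5Screate_simple(2, dimsm, NULL);

    /* Fill the local buffer with a value that identifies the writing rank. */
    int data[8][4];
    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 4; j++)
            data[i][j] = mpi_rank + 1;

    /* These feed into the hyperslab selection and collective H5Dwrite shown above. */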
27
My PHDF5 Application I/O Is Slow
If my application's I/O performance is slow, what can I do?
  Use larger I/O data sizes
  Independent vs. collective I/O
  Specific I/O system hints (see the sketch below)
  Increase I/O bandwidth
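One concrete way to pass file-system-specific hints, not spelled out in the slides, is through the MPI_Info object handed to H5Pset_fapl_mpio. The hint names below are common ROMIO hints and are purely illustrative; whether they are honored, and whether they help, depends on the MPI implementation and file system. The fragment assumes hdf5.h and mpi.h are included and MPI is initialized.

    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "cb_buffer_size", "16777216");  /* 16 MB collective buffer       */
    MPI_Info_set(info, "romio_cb_write", "enable");    /* force collective buffering on writes */

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);

    hid_t file_id = H5Fcreate("hinted.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    /* ... */
    H5Fclose(file_id);
    H5Pclose(fapl);
    MPI_Info_free(&info);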
28
Independent vs. Collective Access
A user reported that the independent data transfer mode was much slower than the collective data transfer mode.
The data array was tall and thin: 230,000 rows by 6 columns.
29
Independent vs. Collective Write (6 processes, IBM p-690, AIX, GPFS)

# of Rows | Data Size (MB) | Independent (sec.) | Collective (sec.)
----------|----------------|--------------------|------------------
   16384  |      0.25      |        8.26        |       1.72
   32768  |      0.50      |       65.12        |       1.80
   65536  |      1.00      |      108.20        |       2.68
  122918  |      1.88      |      276.57        |       3.11
  150000  |      2.29      |      528.15        |       3.63
  180300  |      2.75      |      881.39        |       4.12
30
Independent vs. Collective Write (cont.)
31
Parallel Tools
ph5diff: parallel version of the h5diff tool
h5perf: performance measurement tool showing I/O performance for different I/O APIs
32
ph5diff
A parallel version of the h5diff tool; supports all features of h5diff
An MPI parallel tool: the manager process (proc 0) coordinates the remaining processes (workers) to "diff" one dataset at a time, collects any output from each worker, and prints it out
Works best if there are many datasets in the files with few differences
Available in v1.8
33
h5perf
An I/O performance measurement tool
Tests three file I/O APIs:
  POSIX I/O (open/write/read/close, ...)
  MPI-IO (MPI_File_{open,write,read,close})
  PHDF5
    H5Pset_fapl_mpio (using MPI-IO)
    H5Pset_fapl_mpiposix (using POSIX I/O)
34
APIs That Applications Can Use to Achieve Better Performance
H5Pset_dxpl_mpio_chunk_opt
H5Pset_dxpl_mpio_chunk_opt_num
H5Pset_dxpl_mpio_chunk_opt_ratio
H5Pset_dxpl_mpio_collective_opt
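A hedged sketch of how these tuning calls attach to a dataset transfer property list. The particular option values and thresholds below are illustrative assumptions, not recommendations; the right settings depend on the chunk layout, access pattern, and platform.

    /* Assumes hdf5.h is included and a collective write is being set up. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    /* Force linked-chunk ("one I/O") collective chunk optimization ... */
    H5Pset_dxpl_mpio_chunk_opt(dxpl, H5FD_MPIO_CHUNK_ONE_IO);

    /* ... or let the library decide, but tune the thresholds it uses. */
    /* H5Pset_dxpl_mpio_chunk_opt_num(dxpl, 30);    threshold: chunks per process       */
    /* H5Pset_dxpl_mpio_chunk_opt_ratio(dxpl, 60);  threshold: % of processes per chunk */

    /* Keep collective semantics but perform the low-level I/O independently. */
    /* H5Pset_dxpl_mpio_collective_opt(dxpl, H5FD_MPIO_INDIVIDUAL_IO); */

    /* status = H5Dwrite(dset_id, H5T_NATIVE_INT, memspace, filespace, dxpl, data); */
    H5Pclose(dxpl);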
35
The Benefits of Using HDF5
Self-describing: allows tools to access the file without knowledge of the application that produced it
Flexible design: files move between system architectures and between serial and parallel applications
Flexible application control: an application can choose how to store data in HDF5 to achieve better performance
Advanced features: support for complex selections, more user control for performance tuning
36
Situations Where Parallel HDF5 May Not Work Well
BlueGene is not supported
Misusing HDF5 can cause bad performance, for example with:
  Chunked storage (see http://www.hdfgroup.uiuc.edu/papers/papers/ParallelIO/HDF5-CollectiveChunkIO.pdf)
  Collective I/O
37
Questions?
38
Questions for the Audience
Any suggestions on general improvements or new features for parallel HDF5 support?
Any suggestions on other tools for parallel HDF5?
39
Useful Parallel HDF Links
Parallel HDF information site: http://hdfgroup.org/HDF5/PHDF5/
Parallel HDF5 tutorial: http://hdfgroup.org/HDF5/Tutor/
HDF help email address: help@hdfgroup.org