Presentation is loading. Please wait.

Presentation is loading. Please wait.

April 28, 2008LCI Tutorial1 Parallel HDF5 Tutorial Tutorial Part IV.

Similar presentations


Presentation on theme: "April 28, 2008LCI Tutorial1 Parallel HDF5 Tutorial Tutorial Part IV."— Presentation transcript:

1 April 28, 2008LCI Tutorial1 Parallel HDF5 Tutorial Tutorial Part IV

2 April 28, 2008LCI Tutorial2 Parallel HDF5 Introductory Tutorial

3 April 28, 2008LCI Tutorial3 Outline Overview of Parallel HDF5 design Setting up parallel environment Programming model for Creating and accessing a File Creating and accessing a Dataset Writing and reading Hyperslabs Parallel tutorial available at http://www.hdfgroup.org/HDF5/Tutor/

4 April 28, 2008LCI Tutorial4 Overview of Parallel HDF5 Design

5 April 28, 2008LCI Tutorial5 PHDF5 Requirements Support MPI programming PHDF5 files compatible with serial HDF5 files Shareable between different serial or parallel platforms Single file image to all processes One file per process design is undesirable Expensive post processing Not useable by different number of processes Standard parallel I/O interface Must be portable to different platforms

6 April 28, 2008LCI Tutorial6 PHDF5 Implementation Layers Application Parallel computing system (Linux cluster) Compute node I/O library (HDF5) Parallel I/O library (MPI-I/O) Parallel file system (GPFS) Switch network/I/O servers Compute node Disk architecture & layout of data on disk PHDF5 built on top of standard MPI-IO API

7 April 28, 2008LCI Tutorial7 Parallel Environment Requirements MPI with MPI-IO. E.g., MPICH2 ROMIO Vendor’s MPI-IO Parallel file system. E.g., GPFS Lustre Specially configured NFS

8 April 28, 2008LCI Tutorial8 How to Compile PHDF5 Applications h5pcc – HDF5 C compiler command Similar to mpicc h5pfc – HDF5 F90 compiler command Similar to mpif90 To compile: % h5pcc h5prog.c % h5pfc h5prog.f90

9 April 28, 2008LCI Tutorial9 h5pcc/h5pfc -show option show displays the compiler commands and options without executing them, i.e., dry run % h5pcc –show Sample_mpio.c mpicc -I/home/packages/phdf5/include - D_LARGEFILE_SOURCE - D_LARGEFILE64_SOURCE - D_FILE_OFFSET_BITS=64 -D_POSIX_SOURCE - D_BSD_SOURCE -std=c99 -c Sample_mpio.c mpicc -std=c99 Sample_mpio.o - L/home/packages/phdf5/lib /home/packages/phdf5/lib/libhdf5_hl.a /home/packages/phdf5/lib/libhdf5.a -lz -lm -Wl,-rpath - Wl,/home/packages/phdf5/lib

10 April 28, 2008LCI Tutorial10 Collective vs. Independent Calls MPI definition of collective call All processes of the communicator must participate in the right order Independent means not collective Collective is not necessarily synchronous

11 April 28, 2008LCI Tutorial11 Programming Restrictions Most PHDF5 APIs are collective PHDF5 opens a parallel file with a communicator Returns a file-handle Future access to the file via the file-handle All processes must participate in collective PHDF5 APIs Different files can be opened via different communicators

12 April 28, 2008LCI Tutorial12 Examples of PHDF5 API Examples of PHDF5 collective API File operations: H5Fcreate, H5Fopen, H5Fclose Objects creation: H5Dcreate, H5Dopen, H5Dclose Objects structure: H5Dextend (increase dimension sizes) Array data transfer can be collective or independent Dataset operations: H5Dwrite, H5Dread

13 April 28, 2008LCI Tutorial13 What Does PHDF5 Support ? After a file is opened by the processes of a communicator All parts of file are accessible by all processes All objects in the file are accessible by all processes Multiple processes write to the same data array Each process writes to individual data array

14 April 28, 2008LCI Tutorial14 PHDF5 API Languages C and F90 language interfaces Platforms supported: Most platforms with MPI-IO supported IBM SP, Linux clusters, HP Alpha Clusters, SGI IRIX64/Altrix, Red Storm (Cray XT3), …

15 April 28, 2008LCI Tutorial15 Creating and Accessing a File Programming model HDF5 uses access template object (property list) to control the file access mechanism General model to access HDF5 file in parallel: Setup MPI-IO access template (access property list) Open File Access Data Close File

16 April 28, 2008LCI Tutorial16 Setup access template Each process of the MPI communicator creates an access template and sets it up with MPI parallel access information C: herr_t H5Pset_fapl_mpio(hid_t plist_id, MPI_Comm comm, MPI_Info info); F90: h5pset_fapl_mpio_f(plist_id, comm, info) integer(hid_t) :: plist_id integer :: comm, info plist_id is a file access property list identifier

17 April 28, 2008LCI Tutorial17 C Example Parallel File Create 23 comm = MPI_COMM_WORLD; 24 info = MPI_INFO_NULL; 26 /* 27 * Initialize MPI 28 */ 29 MPI_Init(&argc, &argv); 33 /* 34 * Set up file access property list for MPI-IO access 35 */ ->36 plist_id = H5Pcreate(H5P_FILE_ACCESS); ->37 H5Pset_fapl_mpio(plist_id, comm, info); 38 ->42 file_id = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id); 49 /* 50 * Close the file. 51 */ 52 H5Fclose(file_id); 54 MPI_Finalize();

18 April 28, 2008LCI Tutorial18 F90 Example Parallel File Create 23 comm = MPI_COMM_WORLD 24info = MPI_INFO_NULL 26CALL MPI_INIT(mpierror) 29! 30! Initialize FORTRAN predefined datatypes 32CALL h5open_f(error) 34! 35! Setup file access property list for MPI-IO access. ->37CALL h5pcreate_f(H5P_FILE_ACCESS_F, plist_id, error) ->38CALL h5pset_fapl_mpio_f(plist_id, comm, info, error) 40 ! 41 ! Create the file collectively. ->43 CALL h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error, access_prp = plist_id) 45 ! 46 ! Close the file. 49 CALL h5fclose_f(file_id, error) 51 ! 52 ! Close FORTRAN interface 54 CALL h5close_f(error) 56 CALL MPI_FINALIZE(mpierror)

19 April 28, 2008LCI Tutorial19 Creating and Opening Dataset All processes of the communicator open/close a dataset by a collective call C: H5Dcreate or H5Dopen; H5Dclose F90: h5dcreate_f or h5dopen_f; h5dclose_f All processes of the communicator must extend an unlimited dimension dataset before writing to it C: H5Dextend F90: h5dextend_f

20 April 28, 2008LCI Tutorial20 C Example: Create Dataset 56 file_id = H5Fcreate(…); 57 /* 58 * Create the dataspace for the dataset. 59 */ 60 dimsf[0] = NX; 61 dimsf[1] = NY; 62 filespace = H5Screate_simple(RANK, dimsf, NULL); 63 64 /* 65 * Create the dataset with default properties collective. 66 */ ->67 dset_id = H5Dcreate(file_id, “dataset1”, H5T_NATIVE_INT, 68 filespace, H5P_DEFAULT); 70 H5Dclose(dset_id); 71 /* 72 * Close the file. 73 */ 74 H5Fclose(file_id);

21 April 28, 2008LCI Tutorial21 F90 Example: Create Dataset 43 CALL h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error, access_prp = plist_id) 73 CALL h5screate_simple_f(rank, dimsf, filespace, error) 76 ! 77 ! Create the dataset with default properties. 78 ! ->79 CALL h5dcreate_f(file_id, “dataset1”, H5T_NATIVE_INTEGER, filespace, dset_id, error) 90 ! 91 ! Close the dataset. 92 CALL h5dclose_f(dset_id, error) 93 ! 94 ! Close the file. 95 CALL h5fclose_f(file_id, error)

22 April 28, 2008LCI Tutorial22 Accessing a Dataset All processes that have opened dataset may do collective I/O Each process may do independent and arbitrary number of data I/O access calls C: H5Dwrite and H5Dread F90: h5dwrite_f and h5dread_f

23 April 28, 2008LCI Tutorial23 Programming model for dataset access Create and set dataset transfer property C: H5Pset_dxpl_mpio H5FD_MPIO_COLLECTIVE H5FD_MPIO_INDEPENDENT (default) F90: h5pset_dxpl_mpio_f H5FD_MPIO_COLLECTIVE_F H5FD_MPIO_INDEPENDENT_F (default) Access dataset with the defined transfer property

24 April 28, 2008LCI Tutorial24 C Example: Collective write 95 /* 96 * Create property list for collective dataset write. 97 */ 98 plist_id = H5Pcreate(H5P_DATASET_XFER); ->99 H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE); 100 101 status = H5Dwrite(dset_id, H5T_NATIVE_INT, 102 memspace, filespace, plist_id, data);

25 April 28, 2008LCI Tutorial25 F90 Example: Collective write 88 ! Create property list for collective dataset write 89 ! 90 CALL h5pcreate_f(H5P_DATASET_XFER_F, plist_id, error) ->91 CALL h5pset_dxpl_mpio_f(plist_id, & H5FD_MPIO_COLLECTIVE_F, error) 92 93 ! 94 ! Write the dataset collectively. 95 ! 96 CALL h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, data, & error, & file_space_id = filespace, & mem_space_id = memspace, & xfer_prp = plist_id)

26 April 28, 2008LCI Tutorial26 Writing and Reading Hyperslabs Distributed memory model: data is split among processes PHDF5 uses HDF5 hyperslab model Each process defines memory and file hyperslabs Each process executes partial write/read call Collective calls Independent calls

27 April 28, 2008LCI Tutorial27 P0 P1 File Example 1: Writing dataset by rows P2 P3

28 April 28, 2008LCI Tutorial28 Writing by rows: Output of h5dump HDF5 "SDS_row.h5" { GROUP "/" { DATASET "IntArray" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 8, 5 ) / ( 8, 5 ) } DATA { 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13 }

29 April 28, 2008LCI Tutorial29 Memory File Example 1: Writing dataset by rows count[0] = dimsf[0]/mpi_size count[1] = dimsf[1]; offset[0] = mpi_rank * count[0]; /* = 2 */ offset[1] = 0; count[0] count[1] offset[0] offset[1] Process 1

30 April 28, 2008LCI Tutorial30 Example 1: Writing dataset by rows 71 /* 72 * Each process defines dataset in memory and * writes it to the hyperslab 73 * in the file. 74 */ 75 count[0] = dimsf[0]/mpi_size; 76 count[1] = dimsf[1]; 77 offset[0] = mpi_rank * count[0]; 78 offset[1] = 0; 79 memspace = H5Screate_simple(RANK,count,NULL); 80 81 /* 82 * Select hyperslab in the file. 83 */ 84 filespace = H5Dget_space(dset_id); 85 H5Sselect_hyperslab(filespace, H5S_SELECT_SET,offset,NULL,count,NULL);

31 April 28, 2008LCI Tutorial31 P0 P1 File Example 2: Writing dataset by columns

32 April 28, 2008LCI Tutorial32 Writing by columns: Output of h5dump HDF5 "SDS_col.h5" { GROUP "/" { DATASET "IntArray" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 8, 6 ) / ( 8, 6 ) } DATA { 1, 2, 10, 20, 100, 200, 1, 2, 10, 20, 100, 200 }

33 April 28, 2008LCI Tutorial33 Example 2: Writing dataset by column Process 1 Process 0 File Memory block[1] block[0] P0 offset[1] P1 offset[1] stride[1] dimsm[0] dimsm[1]

34 April 28, 2008LCI Tutorial34 Example 2: Writing dataset by column 85 /* 86 * Each process defines hyperslab in * the file 88 */ 89 count[0] = 1; 90 count[1] = dimsm[1]; 91 offset[0] = 0; 92 offset[1] = mpi_rank; 93 stride[0] = 1; 94 stride[1] = 2; 95 block[0] = dimsf[0]; 96 block[1] = 1; 97 98 /* 99 * Each process selects hyperslab. 100 */ 101 filespace = H5Dget_space(dset_id); 102 H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, stride, count, block);

35 April 28, 2008LCI Tutorial35 Example 3: Writing dataset by pattern Process 0 Process 2 File Process 3 Process 1 Memory

36 April 28, 2008LCI Tutorial36 Writing by Pattern: Output of h5dump HDF5 "SDS_pat.h5" { GROUP "/" { DATASET "IntArray" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 8, 4 ) / ( 8, 4 ) } DATA { 1, 3, 1, 3, 2, 4, 2, 4, 1, 3, 1, 3, 2, 4, 2, 4, 1, 3, 1, 3, 2, 4, 2, 4, 1, 3, 1, 3, 2, 4, 2, 4 }

37 April 28, 2008LCI Tutorial37 Process 2 File Example 3: Writing dataset by pattern offset[0] = 0; offset[1] = 1; count[0] = 4; count[1] = 2; stride[0] = 2; stride[1] = 2; Memory stride[0] stride[1] offset[1] count[1]

38 April 28, 2008LCI Tutorial38 Example 3: Writing by pattern 90 /* Each process defines dataset in memory and 91 * writes it to the hyperslab in the file. 92 */ 93 count[0] = 4; 94 count[1] = 2; 95 stride[0] = 2; 96 stride[1] = 2; 97 if(mpi_rank == 0) { 98 offset[0] = 0; 99 offset[1] = 0; 100 } 101 if(mpi_rank == 1) { 102 offset[0] = 1; 103 offset[1] = 0; 104 } 105 if(mpi_rank == 2) { 106 offset[0] = 0; 107 offset[1] = 1; 108 } 109 if(mpi_rank == 3) { 110 offset[0] = 1; 111 offset[1] = 1; 112 }

39 April 28, 2008LCI Tutorial39 P0P2 File Example 4: Writing dataset by chunks P1P3

40 April 28, 2008LCI Tutorial40 Writing by Chunks: Output of h5dump utility HDF5 "SDS_chnk.h5" { GROUP "/" { DATASET "IntArray" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 8, 4 ) / ( 8, 4 ) } DATA { 1, 1, 2, 2, 3, 3, 4, 4, 3, 3, 4, 4 }

41 April 28, 2008LCI Tutorial41 Example 4: Writing dataset by chunks File Process 2: Memory block[0] = chunk_dims[0]; block[1] = chunk_dims[1]; offset[0] = chunk_dims[0]; offset[1] = 0; chunk_dims[0] chunk_dims[1] block[0] block[1] offset[0] offset[1]

42 April 28, 2008LCI Tutorial42 Example 4: Writing by chunks 97 count[0] = 1; 98 count[1] = 1 ; 99 stride[0] = 1; 100 stride[1] = 1; 101 block[0] = chunk_dims[0]; 102 block[1] = chunk_dims[1]; 103 if(mpi_rank == 0) { 104 offset[0] = 0; 105 offset[1] = 0; 106 } 107 if(mpi_rank == 1) { 108 offset[0] = 0; 109 offset[1] = chunk_dims[1]; 110 } 111 if(mpi_rank == 2) { 112 offset[0] = chunk_dims[0]; 113 offset[1] = 0; 114 } 115 if(mpi_rank == 3) { 116 offset[0] = chunk_dims[0]; 117 offset[1] = chunk_dims[1]; 118 }

43 April 28, 2008LCI Tutorial43 Parallel HDF5 Intermediate Tutorial

44 April 28, 2008LCI Tutorial44 Outline Performance Parallel tools

45 April 28, 2008LCI Tutorial45 My PHDF5 Application I/O “inhales” If my application I/O performance is bad, what can I do? Use larger I/O data sizes Independent vs Collective I/O Specific I/O system hints Parallel File System limits

46 April 28, 2008LCI Tutorial46 Independent Vs Collective Access User reported Independent data transfer mode was much slower than the Collective data transfer mode Data array was tall and thin: 230,000 rows by 6 columns : 230,000 rows :

47 April 28, 2008LCI Tutorial47 # of RowsData Size (MB) Independent (Sec.) Collective (Sec.) 163840.258.261.72 327680.5065.121.80 655361.00108.202.68 1229181.88276.573.11 1500002.29528.153.63 1803002.75881.394.12 Independent vs Collective write 6 processes, IBM p-690, AIX, GPFS

48 April 28, 2008LCI Tutorial48 Independent vs Collective write (cont.)

49 April 28, 2008LCI Tutorial49 Effects of I/O Hints: IBM_largeblock_io GPFS at LLNL Blue 4 nodes, 16 tasks Total data size 1024MB I/O buffer size 1MB

50 April 28, 2008LCI Tutorial50 GPFS at LLNL ASCI Blue machine 4 nodes, 16 tasks Total data size 1024MB I/O buffer size 1MB Effects of I/O Hints: IBM_largeblock_io

51 April 28, 2008LCI Tutorial51 Parallel Tools ph5diff Parallel version of the h5diff tool h5perf Performance measuring tools showing I/O performance for different I/O API

52 April 28, 2008LCI Tutorial52 ph5diff An parallel version of the h5diff tool Supports all features of h5diff An MPI parallel tool Manager process (proc 0) coordinates each the remaining processes (workers) to “diff” one dataset at a time; collects any output from each worker and prints them out. Works best if there are many datasets in the files with few differences. Available in v1.8.

53 April 28, 2008LCI Tutorial53 h5perf An I/O performance measurement tool Test 3 File I/O API Posix I/O (open/write/read/close…) MPIO (MPI_File_{open,write,read.close}) PHDF5 H5Pset_fapl_mpio (using MPI-IO) H5Pset_fapl_mpiposix (using Posix I/O)

54 April 28, 2008LCI Tutorial54 h5perf: Some features Check (-c) verify data correctness Added 2-D chunk patterns in v1.8

55 April 28, 2008LCI Tutorial55 Useful Parallel HDF Links Parallel HDF information site http://www.hdfgroup.org/HDF5/PHDF5/ Parallel HDF5 tutorial available at http://www.hdfgroup.org/HDF5/Tutor/ HDF Help email address help@hdfgroup.org

56 April 28, 2008LCI Tutorial56 Questions? End of Part IV

57 April 28, 2008LCI Tutorial57 Caching and Buffering in HDF5 Tutorial Part V

58 April 28, 2008LCI Tutorial58 Software stack and the “magic box” Life cycle: What happens to data when it is transferred from application buffer to HDF5 file? File or other “storage” Virtual file I/O Library internals Object API Application Data buffer H5Dwrite Magic box Unbuffered I/O Data in a file

59 April 28, 2008LCI Tutorial59 Inside the magic box Understanding of what is happening to data inside the magic box will help to write efficient applications HDF5 library has mechanisms to control behavior inside the magic box Goals of this talk: Describe some basic operations and data structures and explain how they affect performance and storage sizes Give some “recipes” for how to improve performance

60 April 28, 2008LCI Tutorial60 Topics Dataset metadata and array data storage layouts Types of dataset storage layouts Factors affecting I/O performance I/O with compact datasets I/O with contiguous datasets I/O with chunked datasets Variable length data and I/O

61 April 28, 2008LCI Tutorial61 HDF5 dataset metadata and array data storage layouts

62 April 28, 2008LCI Tutorial62 HDF5 Dataset Data array Ordered collection of identically typed data items distinguished by their indices Metadata Dataspace: Rank, dimensions of dataset array Datatype: Information on how to interpret data Storage Properties: How array is organized on disk Attributes: User-defined metadata (optional)

63 April 28, 2008LCI Tutorial63 HDF5 Dataset DataMetadata Dataspace 3 Rank Dim_2 = 5 Dim_1 = 4 Dimensions Time = 32.4 Pressure = 987 Temp = 56 Attributes Chunked Compressed Dim_3 = 7 Storage info IEEE 32-bit float Datatype

64 April 28, 2008LCI Tutorial64 Metadata cache and array data Dataset array data typically kept in application memory Dataset header in separate space – metadata cache Application memory Metadata cache File Dataset array data HDF5 metadata Dataset array data Dataset header …………. Datatype Dataspace …………. Attributes …

65 April 28, 2008LCI Tutorial65 Metadata and metadata cache HDF5 metadata Information about HDF5 objects used by the library Examples: object headers, B-tree nodes for group, B-Tree nodes for chunks, heaps, super-block, etc. Usually small compared to raw data sizes (KB vs. MB-GB) Metadata cache Space allocated to handle pieces of the HDF5 metadata Allocated by the HDF5 library in application’s memory space Cache behavior affects overall performance Metadata cache implementation prior to HDF5 1.6.5 could cause performance degradation for some applications

66 April 28, 2008LCI Tutorial66 Types of data storage layouts

67 April 28, 2008LCI Tutorial67 HDF5 datasets storage layouts Contiguous Chunked Compact

68 April 28, 2008LCI Tutorial68 Contiguous storage layout Metadata header separate from raw data Raw data stored in one contiguous block on disk Application memory Metadata cache Dataset header …………. Datatype Dataspace …………. Attributes … File Dataset array data

69 April 28, 2008LCI Tutorial69 Chunked storage Chunking – storage layout where a dataset is partitioned in fixed-size multi-dimensional tiles or chunks Used for extendible datasets and datasets with filters applied (checksum, compression) HDF5 library treats each chunk as atomic object Greatly affects performance and file sizes

70 April 28, 2008LCI Tutorial70 Chunked storage layout Raw data divided into equal sized blocks (chunks). Each chunk stored separately as a contiguous block on disk Application memory Metadata cache Dataset header …………. Datatype Dataspace …………. Attributes … header File Dataset array data A BC ADCB D Chunk index

71 April 28, 2008LCI Tutorial71 Compact storage layout Data array and metadata stored together in the header File* * “File” may in fact be a collection of files, memory, or other storage destination. Application memory Dataset header …………. Datatype Dataspace …………. Attributes … Data Metadata cache Array data

72 April 28, 2008LCI Tutorial72 Factors affecting I/O performance

73 April 28, 2008LCI Tutorial73 What goes on inside the magic box? Operations on data inside the magic box Copying to/from internal buffers Datatype conversion Scattering - gathering Data transformation (filters, compression) Data structures used B-trees (groups, dataset chunks) Hash tables Local and Global heaps (variable length data: link names, strings, etc.) Other concepts HDF5 metadata, metadata cache Chunking, chunk cache

74 April 28, 2008LCI Tutorial74 Operations on data inside the magic box Copying to/from internal buffers Datatype conversion, such as float  integer LE  BE 64-bit integer to 16-bit integer Scattering - gathering Data is scattered/gathered from/to application buffers into internal buffers for datatype conversion and partial I/O Data transformation (filters, compression) Checksum on raw data and metadata (in 1.8.0) Algebraic transform GZIP and SZIP compressions User-defined filters

75 April 28, 2008LCI Tutorial75 I/O performance depends on Storage layouts Dataset storage properties Chunking strategy Metadata cache performance Datatype conversion performance Other filters, such as compression Access patterns

76 April 28, 2008LCI Tutorial76 I/O with different storage layouts

77 April 28, 2008LCI Tutorial77 Writing compact dataset Application memory Dataset header …………. Datatype Dataspace …………. Attributes … Data File Metadata cache Array data One write to store header and data array

78 April 28, 2008LCI Tutorial78 Writing contiguous dataset – no conversion Application memory Metadata cache Dataset header …………. Datatype Dataspace …………. Attributes … File Dataset array data

79 April 28, 2008LCI Tutorial79 Writing a contiguous dataset with datatype conversion Dataset header …………. Datatype Dataspace …………. Attribute 1 Attribute 2 ………… Application memory Metadata cache File Conversion buffer 1MB Dataset array data

80 April 28, 2008LCI Tutorial80 Partial I/O with contiguous datasets

81 April 28, 2008LCI Tutorial81 Writing whole dataset – contiguous rows File Application data in memory Data is contiguous in a file One I/O operation M rows M N

82 April 28, 2008LCI Tutorial82 Sub-setting of contiguous dataset Series of adjacent rows File N Application data in memory Subset – contiguous in file One I/O operation M rows M Entire dataset – contiguous in file

83 April 28, 2008LCI Tutorial83 Sub-setting of contiguous dataset Adjacent, partial rows File N M … Application data in memory Data is scattered in a file in M contiguous blocks Several small I/O operation N elements

84 April 28, 2008LCI Tutorial84 Sub-setting of contiguous dataset Extreme case: writing a column N M Application data in memory Subset data is scattered in a file in M different locations Several small I/O operation … 1 element

85 April 28, 2008LCI Tutorial85 Sub-setting of contiguous dataset Data sieve buffer File N M … Application data in memory Data is scattered in a file in M contiguous blocks 1 element Data is gathered in a sieve buffer in memory 64K memcopy

86 April 28, 2008LCI Tutorial86 Performance tuning for contiguous dataset Datatype conversion Avoid for better performance Use H5Pset_buffer function to customize conversion buffer size Partial I/O Write/read in big contiguous blocks Use H5Pset_sieve_buf_size to improve performance for complex subsetting

87 April 28, 2008LCI Tutorial87 I/O with Chunking

88 April 28, 2008LCI Tutorial88 Reminder – chunked storage layout Application memory Metadata cache Dataset header …………. Datatype Dataspace …………. Attributes … header File Dataset array data A BC ADCB D Chunk index

89 April 28, 2008LCI Tutorial89 Information about chunking HDF5 library treats each chunk as atomic object Compression is applied to each chunk Datatype conversion, other filters applied per chunk Chunk size greatly affects performance Chunk overhead adds to file size Chunk processing involves many steps Chunk cache Caches chunks for better performance Created for each chunked dataset Size of chunk cache is set for file (default size 1MB) Each chunked dataset has its own chunk cache Chunk may be too big to fit into cache Memory may grow if application keeps opening datasets

90 April 28, 2008LCI Tutorial90 Chunk cache Dataset_1 header ………… Application memory Metadata cache Chunking B-tree nodes Chunk cache Default size is 1MB Dataset_N header ………… ………

91 April 28, 2008LCI Tutorial91 Writing chunked dataset CB A ………….. Compression performed when chunk evicted from the chunk cache Other filters applied as data goes through filter pipeline ABC C File Chunk cacheChunked dataset Filter pipeline

92 April 28, 2008LCI Tutorial92 Partial I/O with Chunking

93 April 28, 2008LCI Tutorial93 Partial I/O for chunked dataset Example: write the green subset from the dataset, converting the data Dataset is stored as six chunks in the file. The subset spans four chunks, numbered 1-4 in the figure. Hence four chunks must be written to the file. But first, the four chunks must be read from the file, to preserve those parts of each chunk that are not to be overwritten. 12 34

94 April 28, 2008LCI Tutorial94 Partial I/O for chunked dataset For each of the four chunks: Read chunk from file into chunk cache, unless it’s already there. Determine which part of the chunk will be replaced by the selection. Replace that part of the chunk in the cache with the corresponding elements from the application’s array. Move those elements to conversion buffer and perform conversion Move those elements back from conversion buffer to chunk cache. Apply filters (compression) when chunk is flushed from chunk cache For each element 3 memcopy performed

95 April 28, 2008LCI Tutorial95 Partial I/O for chunked dataset 3 Application memory memcopy Application buffer Chunk Elements participating in I/O are gathered into corresponding chunk Chunk cache 3

96 April 28, 2008LCI Tutorial96 Partial I/O for chunked dataset 3 Conversion buffer Memcopy Application memory Chunk cache File Chunk Compress and write to file

97 April 28, 2008LCI Tutorial97 Variable length data and I/O

98 April 28, 2008LCI Tutorial98 Examples of variable length data String A[0] “the first string we want to write” ………………………………… A[N-1] “the N-th string we want to write” Each element is a record of variable-length A[0] (1,1,0,0,0,5,6,7,8,9) [length = 10] A[1] (0,0,110,2005) [length = 4] ……………………….. A[N] (1,2,3,4,5,6,7,8,9,10,11,12,….,M) [length = M]

99 April 28, 2008LCI Tutorial99 Variable length data in HDF5 Variable length description in HDF5 application typedef struct { size_t length; void *p; }hvl_t; Base type can be any HDF5 type H5Tvlen_create(base_type) ~ 20 bytes overhead for each element Data cannot be compressed

100 April 28, 2008LCI Tutorial100 How variable length data is stored in HDF5 Global heap Actual variable length data Dataset with variable length elements Pointer into global heap File Dataset header

101 April 28, 2008LCI Tutorial101 Variable length datasets and I/O When writing variable length data, elements in application buffer point to global heaps in the metadata cache where actual data is stored. Global heap Application buffer Raw data

102 April 28, 2008LCI Tutorial102 There may be more than one global heap Global heap Application buffer Raw data Global heap

103 April 28, 2008LCI Tutorial103 Variable length datasets and I/O Raw data Global heap File

104 April 28, 2008LCI Tutorial104 VL chunked dataset in a file File Dataset header Chunk B-tree Dataset chunksHeaps with VL data

105 April 28, 2008LCI Tutorial105 Writing chunked VL datasets Dataset header ………… Application memory Metadata cache B-tree nodes Chunk cache ……… Conversion buffer Raw data Global heap Chunk cache VL chunked dataset with selected region File Filter pipeline

106 April 28, 2008LCI Tutorial106 Hints for variable length data I/O Avoid closing/opening a file while writing VL datasets Global heap information is lost Global heaps may have unused space Avoid alternately writing different VL datasets Data from different datasets will go into to the same heap If maximum length of the record is known, consider using fixed-length records and compression

107 April 28, 2008LCI Tutorial107 Questions? End of Part V


Download ppt "April 28, 2008LCI Tutorial1 Parallel HDF5 Tutorial Tutorial Part IV."

Similar presentations


Ads by Google