2 Tutorial II: HDF5 and NetCDF-4. 10th International LCI Conference, March 9, 2009. Albert Cheng, Neil Fortner (The HDF Group); Ed Hartnett (Unidata/UCAR)

3 Outline. 8:30 – 9:30 Introduction to HDF5 data, programming models and tools; 9:30 – 10:00 Advanced features of the HDF5 library; 10:30 – 11:30 Advanced features of the HDF5 library (continued); 11:30 – 12:00 Introduction to Parallel HDF5; 1:00 – 2:30 Introduction to Parallel HDF5 (continued) and Parallel I/O Performance Study; 3:00 – 4:30 NetCDF-4

4 Introduction to HDF5 Data, Programming Models and Tools

5 What is HDF?

6 HDF is… HDF stands for Hierarchical Data Format: a file format for managing any kind of data, and a software system to manage data in that format. Designed for high-volume or complex data and for every size and type of system. Open format and software library, plus tools. There are two HDFs, HDF4 and HDF5; today we focus on HDF5.

7 Brief History of HDF. 1987: at NCSA (University of Illinois), a task force formed to create an architecture-independent format and library, AEHOO (All Encompassing Hierarchical Object Oriented format), which became HDF. Early 1990s: NASA adopted HDF for the Earth Observing System project. 1996: DOE's ASC (Advanced Simulation and Computing) project began collaborating with the HDF group (NCSA) to create "Big HDF" (the increase in computing power of DOE systems at LLNL, LANL and Sandia National Labs required bigger, more complex data files); "Big HDF" became HDF5. 1998: HDF5 was released with support from National Labs, NASA, NCSA. 2006: The HDF Group spun off from the University of Illinois as a non-profit corporation.

8 Why HDF5? In one sentence...

9 Answering big questions … matter and the universe; weather and climate (total column ozone maps in Dobson units, August 24, 2001 vs. August 24, 2002); life and nature

10 … involves big data …

11 … varied data … (Thanks to Mark Miller, LLNL)

12 … and complex relationships … (genome-assembly example: contig summaries, discrepancies, contig qualities, coverage depth, read quality, aligned bases, contigs, reads, percent match, traces, SNP scores)

13 … on big computers … and small computers …

14 How do we… Describe our data? Read it? Store it? Find it? Share it? Mine it? Move it into, out of, and between computers and repositories? Achieve storage and I/O efficiency? Give applications and tools easy access to our data?

15 Solution: HDF5! Can store all kinds of data in a variety of ways. Runs on most systems. Lots of tools to access data. Emphasis on standards (HDF-EOS, CGNS). Library and format emphasis on I/O efficiency and storage.

16 HDF5 Philosophy: a single platform with multiple uses. One general format; one library, with options to adapt I/O and storage to data needs, layers on top and below, the ability to interact well with other technologies, and attention to past, present, and future compatibility.

17 Who uses HDF5?

18 Who uses HDF5? Applications that deal with big or complex data. Over 200 different types of apps. 2+ million product users world-wide. Academia, government agencies, industry.

19 NASA EOS remote-sensing data. HDF is the standard file format for storing data from NASA's Earth Observing System (EOS) mission. Petabytes of data are stored in HDF and HDF5 to support the Global Climate Change Research Program.

20 Structure of HDF5 Library (layered, top to bottom): applications; Object API (C, F90, C++, Java); library internals; virtual file I/O; file or other "storage"

21 HDF Tools: HDFView and Java products; command-line utilities (h5dump, h5ls, h5cc, h5diff, h5repack)

22 HDF5 Applications & Domains. Communities: simulation, visualization, remote sensing… Examples: thermonuclear simulations, product modeling, data mining tools, visualization tools, climate models, HDF-EOS, CGNS, ASC. These sit on the HDF5 data model & API, over the virtual file layer (I/O drivers: stdio, custom, split files, MPI I/O), which maps the HDF5 format onto storage: a file on a parallel file system, a single file, split metadata and raw data files, or a user-defined device.

23 HDF5: The Format

24 An HDF5 "file" is a container… into which you can put your data objects: tables (e.g., lat | lon | temp records such as 12 | 23 | 3.1, 15 | 24 | 4.2, 17 | 21 | 3.6), palettes, arrays, images

25 Structures to organize objects: "Groups" and "Datasets". Example: the root group "/" holds a table, a 2-D array, and a group "/foo", which in turn holds a palette, raster images, and a 3-D array.

26 HDF5 model. Groups – provide structure among objects. Datasets – where the primary data goes: data arrays, a rich set of datatype options, flexible, efficient storage and I/O. Attributes – for metadata. Everything else is built essentially from these parts.

27 HDF5: The Software

28 HDF5 Software (layered): tools, applications, libraries; HDF5 I/O library; HDF5 file

29 Users of HDF5 Software. Tools and applications use the HDF5 Application Programming Interface to create, read, write, query, etc.; most data consumers are here (scientific/engineering applications, domain-specific libraries/APIs, tools). Power users work at the "virtual file layer" (VFL): modules that adapt I/O to specific features of a system or do I/O in some special way. Below that, the "HDF5 file" sits on the file system, MPI-IO, SAN, or other layers; the "file" could be on a parallel system, in memory, a collection of files, etc.

30 HDF5 Data Model

31 HDF5 model (recap). Groups – provide structure among objects. Datasets – where the primary data goes: data arrays, a rich set of datatype options, flexible, efficient storage and I/O. Attributes – for metadata. Other objects: links (point to data in a file or in another HDF5 file) and datatypes (can be stored for complex structures and reused by multiple datasets).

32 HDF5 Dataset. Data plus metadata: a dataspace (rank 3; dimensions Dim_1 = 4, Dim_2 = 5, Dim_3 = 7), a datatype (IEEE 32-bit float), attributes (Time = 32.4, Pressure = 987, Temp = 56), and storage info (chunked, compressed).

33 HDF5 Dataspace. Two roles. (1) A dataspace contains spatial info about a dataset stored in a file – rank and dimensions – and is a permanent part of the dataset definition. (2) A dataspace describes the application's data buffer and the data elements participating in I/O. Examples: rank = 2, dimensions = 4x6; rank = 1, dimensions = 12.

34 HDF5 Datatype. Datatype – how to interpret a data element. A permanent part of the dataset definition. Two classes: atomic and compound. Can be stored in a file as an HDF5 object (HDF5 committed datatype). Can be shared among different datasets.

35 HDF5 Datatype. HDF5 atomic types include normal integer & float, user-definable types (e.g., 13-bit integer), variable-length types (e.g., strings), references to objects/dataset regions, enumerations (names mapped to integers), and arrays. HDF5 compound types are comparable to C structs ("records"); members can be atomic or compound types.
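
As an illustration of the enumeration type mentioned above, the sketch below builds an enum datatype with H5Tenum_create and H5Tenum_insert; the color names and values are made-up example data, not from the slides.

    /* Sketch: create an enumeration datatype mapping names to integers.
       The names and values are hypothetical example data. */
    int   v;
    hid_t etype = H5Tenum_create(H5T_NATIVE_INT);
    v = 0; H5Tenum_insert(etype, "RED",   &v);
    v = 1; H5Tenum_insert(etype, "GREEN", &v);
    v = 2; H5Tenum_insert(etype, "BLUE",  &v);
    /* use etype when creating a dataset, then close it with H5Tclose(etype) */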

36 HDF5 dataset: array of records. Dimensionality: 5 x 3. Datatype (record): int8, int4, int16, and a 2x3x2 array of float32.

37 Special storage options for a dataset. Chunked: better subsetting access time; compressible; extendable. Compressed: improves storage efficiency and transmission speed. Extendable: arrays can be extended in any direction. External: metadata for dataset "Fred" stays in the HDF5 file (file A) while the raw data for Fred lives in a separate binary file (file B).

38 HDF5 Attribute. Attribute – data of the form "name = value", attached to an object by the application. Operations are similar to dataset operations, but attributes are not extendible and support no compression or partial I/O. Attributes can be overwritten, deleted, or added during the "life" of a dataset.
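
A minimal sketch of attaching an attribute to an already open dataset, assuming the HDF5 1.8 H5Acreate2 call; the attribute name and value are illustrative only.

    /* Sketch: attach a scalar attribute "Temp" to an open dataset (1.8 API).
       The name and value are example choices, not from the slides. */
    hid_t  aspace_id, attr_id;
    double temp = 56.0;

    aspace_id = H5Screate(H5S_SCALAR);
    attr_id   = H5Acreate2(dataset_id, "Temp", H5T_NATIVE_DOUBLE,
                           aspace_id, H5P_DEFAULT, H5P_DEFAULT);
    H5Awrite(attr_id, H5T_NATIVE_DOUBLE, &temp);
    H5Aclose(attr_id);
    H5Sclose(aspace_id);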

39 HDF5 Group. A mechanism for organizing collections of related objects. Every file starts with a root group "/". Similar to UNIX directories. Groups can have attributes.

40 Path to an HDF5 object in a file. Example: the root group "/" contains X and Y; Y contains temp and a group bar, which contains its own temp. Paths: / (root), /X, /Y, /Y/temp, /Y/bar/temp.

41 Shared HDF5 objects. The root group "/" contains groups A, B, and C; P is reached as /A/P, and a single shared object R is reached as both /B/R and /C/R.

42 HDF5 Data Model Example: ENSIGHT automotive crash simulation

43 (figure)

44 Automotive crash simulation (figure)

45 Automotive crash simulation (figure)

46 Solid modeling (figure)

47 Solid modeling (figure)

48 HDF5 mesh (figure)

49 Mesh Example, in HDFView (screenshot)

50 HDF5 Software

51 HDF5 software stack (layered): tools & applications; HDF I/O library; HDF file

52 Structure of HDF5 Library. Object API (C, Fortran 90, Java, C++): specify objects and transformation properties; invoke data movement operations and data transformations. Library internals: perform data transformations and other preparation for I/O; configurable transformations (compression, etc.). Virtual file I/O (C only): performs byte-stream I/O operations (open/close, read/write, seek); user-implementable I/O (stdio, network, memory, etc.).

53 Write – from memory to disk (figure)

54 Partial I/O: move just part of a dataset. (a) A hyperslab from a 2-D array in the file to the corner of a smaller 2-D array in memory. (b) A regular series of blocks from a 2-D array in the file to a contiguous sequence at a certain offset in a 1-D array in memory.

55 Partial I/O: move just part of a dataset (continued). (c) A sequence of points from a 2-D array in the file to a sequence of points in a 3-D array in memory. (d) A union of hyperslabs in the file to a union of hyperslabs in memory.

56 Layers – parallel example. I/O flows through many layers from application to disk: the application calls the I/O library (HDF5) on the compute nodes of a parallel computing system (Linux cluster); HDF5 calls the parallel I/O library (MPI-I/O); data passes through the parallel file system (GPFS) and the switch network/I/O servers down to the disk architecture and the layout of data on disk.

57 Virtual I/O layer, within the structure of the HDF5 library: Object API (C, Fortran 90, Java, C++); library internals; virtual file I/O (C only).

58 Virtual file I/O layer. A public API for writing I/O drivers. Allows HDF5 to interface to disk, memory, or a user-defined device. Example virtual file I/O drivers: stdio, core (memory), family, MPI I/O, custom, and others; the "storage" can be a file, a file family, memory, etc.

59 Applications & Domains. Common domain-specific data models: simulation, visualization, remote sensing… Examples: thermonuclear simulations, product modeling, data mining tools, visualization tools, climate models. Domain-specific APIs such as UDM, SAF, H5Part, HDF-EOS, and IDL (from LANL; LLNL and SNL; grid communities; NASA; and COTS vendors) sit on the HDF5 data model & API, over HDF5 serial & parallel I/O and the HDF5 virtual file layer (I/O drivers: MPI I/O, multi, stdio, custom, core), which maps the HDF5 format onto storage: a file on a parallel file system, a single file, split metadata and raw data files, system memory, or a user-defined device.

60 Portability & Robustness. Runs almost anywhere: Linux and UNIX workstations; Windows, Mac OS X; big ASC machines, Crays, VMS systems; TeraGrid and other clusters. Source and binaries are available from http://www.hdfgroup.org/HDF5/release/index.html. QA: daily regression tests on key platforms; meets NASA's highest technology readiness level.

61 Other Software. From The HDF Group: HDFView, Java tools, command-line utilities, a web browser plug-in, regression and performance testing software, parallel h5diff. 3rd party: IDL, MATLAB, Mathematica, PyTables, HDF Explorer, LabView. Communities: EOS, ASC, CGNS. Integration with other software: iRODS, OPeNDAP.

62 Creating an HDF5 File with HDFView

63 Example: create this HDF5 file – a root group "/" containing a 4x6 array of integers "A", a group "B", and "Storm"

64 Demo. Demonstrate the use of HDFView to create the HDF5 file. Use h5dump to see the contents of the HDF5 file. Use h5import to add data to the HDF5 file. Use h5repack to change properties of the stored objects. Use h5diff to compare two files.

65 Introduction to HDF5 Programming Model and APIs

66 Structure of HDF5 Library (recap). Object API (C, Fortran 90, Java, C++): specify objects and transformation properties; invoke data movement operations and data transformations. Library internals: perform data transformations and other preparation for I/O; configurable transformations (compression, etc.). Virtual file I/O API (C only): performs byte-stream I/O operations (open/close, read/write, seek); user-implementable I/O (stdio, mpi-io, memory, etc.).

67 Goals of HDF5 Library. Provide a flexible API to support a wide range of operations on data. Support high-performance access in serial and parallel computing environments. Be compatible with common data models and programming languages. Because of these goals, the HDF5 API is rich and large.

68 Operations Supported by the API. Create groups, datasets, attributes, linkages. Create complex data types. Assign storage and I/O properties to objects. Perform complex subsetting during read/write. Use a variety of I/O "devices" (parallel, remote, etc.). Transform data during I/O. Query file structure and properties. Query object structure, content, and properties.

69 Characteristics of the HDF5 API. For flexibility, the API is extensive: 300+ functions. This can be daunting… but there is hope: a few functions can do a lot; start simple; build up knowledge as more features are needed. Library functions are categorized by object type. The "H5Lite" API supports basic capabilities. (Pictured: a Victorinox Swiss Army CyberTool 34.)

70 The General HDF5 API. Currently C, Fortran 90, Java, and C++ bindings. C routines begin with the prefix H5?, where ? is a character corresponding to the type of object the function acts on. Example APIs: H5D – dataset interface (e.g., H5Dread); H5F – file interface (e.g., H5Fopen); H5S – dataspace interface (e.g., H5Sclose).

71 Compiling HDF5 Applications. h5cc – HDF5 C compiler command (similar to mpicc). h5fc – HDF5 F90 compiler command (similar to mpif90). h5c++ – HDF5 C++ compiler command. To compile: % h5cc h5prog.c  and  % h5fc h5prog.f90
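
To have something concrete to feed h5cc, here is a minimal h5prog.c one might compile with the command above; it only creates an HDF5 file and closes it (the file name is an arbitrary choice).

    /* h5prog.c - minimal program to try "h5cc h5prog.c".
       It creates an HDF5 file and closes it; the file name is illustrative. */
    #include "hdf5.h"

    int main(void)
    {
        hid_t file_id = H5Fcreate("h5prog.h5", H5F_ACC_TRUNC,
                                  H5P_DEFAULT, H5P_DEFAULT);
        if (file_id < 0)
            return 1;
        return (H5Fclose(file_id) < 0) ? 1 : 0;
    }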

72 Compile option: -show. The -show option displays the compiler commands and options without executing them.
% h5cc -show Sample_c.c
gcc -I/home/packages/hdf5_1.6.6/Linux_2.6/include -UH5_DEBUG_API -DNDEBUG -I/home/packages/szip/static/encoder/Linux2.6-gcc/include -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_POSIX_SOURCE -D_BSD_SOURCE -std=c99 -Wno-long-long -O -fomit-frame-pointer -finline-functions -c Sample_c.c
gcc -std=c99 -Wno-long-long -O -fomit-frame-pointer -finline-functions -L/home/packages/szip/static/encoder/Linux2.6-gcc/lib Sample_c.o -L/home/packages/hdf5_1.6.6/Linux_2.6/lib /home/packages/hdf5_1.6.6/Linux_2.6/lib/libhdf5_hl.a /home/packages/hdf5_1.6.6/Linux_2.6/lib/libhdf5.a -lsz -lz -lm -Wl,-rpath -Wl,/home/packages/hdf5_1.6.6/Linux_2.6/lib

73 General Programming Paradigm. Properties of an object are optionally defined (creation properties, access property lists); default values are used if none are defined. The object is opened or created. The object is accessed, possibly many times. The object is closed.

74 Order of Operations. An order is imposed on operations by argument dependencies. For example, a file must be opened before a dataset, because the dataset open call requires a file handle as an argument. Objects can be closed in any order.

75 HDF5 Defined Types. For portability, the HDF5 library has its own defined types: hid_t – object identifiers (native integer); hsize_t – size used for dimensions (unsigned long or unsigned long long); hssize_t – for specifying coordinates and sometimes for dimensions (signed long or signed long long); herr_t – function return value; hvl_t – variable-length datatype. For C, include hdf5.h in your HDF5 application.
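
The snippet below simply shows how these defined types typically appear in application code; the variable names are arbitrary.

    #include "hdf5.h"

    hid_t   file_id, dset_id, space_id;   /* object identifiers */
    hsize_t dims[2] = {4, 6};             /* dimension sizes */
    herr_t  status;                       /* return value of most HDF5 calls */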

76 Example: Create this HDF5 File. The root group "/" contains a 4x6 array of integers "A" and a group "B".

77 Example: Step by Step. Root group "/"; dataset "A" (a 4x6 array of integers); group "B".

78 Example: Create a File (the root group "/" only).

79 Steps to Create a File. 1. Decide any special properties the file should have: creation properties, like the size of the user block; access properties, such as the metadata cache size. 2. Create property lists, if necessary. 3. Create the file. 4. Close the file and the property lists, as needed.

80 Code: Create a File.
hid_t file_id;
file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
The H5F_ACC_TRUNC flag removes an existing file; the H5P_DEFAULT flags create a regular UNIX file and access it with the HDF5 SEC2 I/O file driver.
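
If non-default properties are wanted (step 1 above), the hedged sketch below shows one way to pass a file creation property list; the 512-byte user block is just an example value.

    /* Sketch: create a file with a 512-byte user block via a file
       creation property list (the size is an arbitrary example). */
    hid_t fcpl_id, file_id;

    fcpl_id = H5Pcreate(H5P_FILE_CREATE);
    H5Pset_userblock(fcpl_id, (hsize_t)512);
    file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, fcpl_id, H5P_DEFAULT);
    H5Pclose(fcpl_id);
    /* ... use file_id ..., then H5Fclose(file_id); */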

81 Example: Add a Dataset. Dataset "A" (a 4x6 array of integers) under the root group "/".

82 Dataset Components (recap). Data plus metadata: a dataspace (rank 3; dimensions Dim_1 = 4, Dim_2 = 5, Dim_3 = 7), a datatype (IEEE 32-bit float), attributes (Time = 32.4, Pressure = 987, Temp = 56), and storage info (chunked, compressed).

83 Dataset Creation Property List. A dataset creation property list holds information on how to store data in a file, e.g., chunked, or chunked & compressed.

84 Steps to Create a Dataset. 1. Define the dataset characteristics: dataspace (4x6), datatype (integer), properties (if needed). 2. Decide where to put it – the root group – and obtain its location identifier. 3. Decide the link or path – "A". 4. Create the link and dataset in the file. 5. (Eventually) close everything.

85 Code: Create a Dataset.
hid_t file_id, dataset_id, dataspace_id;
hsize_t dims[2];
herr_t status;
file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
dims[0] = 4;
dims[1] = 6;
dataspace_id = H5Screate_simple(2, dims, NULL);   /* rank, current dims */
dataset_id = H5Dcreate(file_id, "A", H5T_STD_I32BE, dataspace_id, H5P_DEFAULT);   /* pathname, datatype, dataspace, property list (default) */
status = H5Dclose(dataset_id);   /* terminate access to dataset, dataspace, file */
status = H5Sclose(dataspace_id);
status = H5Fclose(file_id);

86 Example: Create a Group. file.h5: the root group "/" contains the dataset "A" (a 4x6 array of integers) and the group "B".

87 Steps to Create a Group. 1. Decide where to put it – the root group – and obtain its location identifier. 2. Decide the link or path – "B". 3. Create the link and group in the file; specify the number of bytes to store names of objects to be added to the group (as a hint), or use the default. 4. (Eventually) close the group.

88 Code: Create a Group.
hid_t file_id, group_id;
...
/* Open "file.h5" */
file_id = H5Fopen("file.h5", H5F_ACC_RDWR, H5P_DEFAULT);
/* Create group "/B" in the file (HDF5 1.8 signature). */
group_id = H5Gcreate(file_id, "/B", H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
/* Close group and file. */
status = H5Gclose(group_id);
status = H5Fclose(file_id);

89 HDF5 Information. HDF Information Center: http://www.hdfgroup.org. HDF help email address: help@hdfgroup.org. HDF users mailing lists: news@hdfgroup.org and hdf-forum@hdfgroup.org.

90 Questions?

91 Introduction to HDF5 Command-line Tools

92 HDF5 Command-line Tools. Readers: h5dump, h5diff, h5ls, h5stat, h5check (new in release 1.8). Writers: h5import, h5repack, h5repart, h5jam/h5unjam, h5copy, h5mkgrp (new in release 1.8). Converters: h4toh5, h5toh4, gif2h5, h52gif.

93 h5dump. h5dump exports (dumps) the contents of an HDF5 file. Multiple output types: ASCII, binary, XML. Complete or selected file content: object header information (the structure); attributes (the metadata); datasets (the data) – all dataset values or subsets of dataset values; properties (filters, storage layout, fill value); specific objects (groups, datasets, attributes, named datatypes, soft links). h5dump --help lists all option flags.

94 Example: h5dump. With no options, "all" contents go to standard output.
% h5dump Sample.h5
HDF5 "Sample.h5" {
GROUP "/" {
   GROUP "Floats" {
      DATASET "FloatArray" {
         DATATYPE H5T_IEEE_F32LE
         DATASPACE SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
         DATA {
            (0,0): 0.01, 0.02, 0.03,
            (1,0): 0.1, 0.2, 0.3,
            (2,0): 1, 2, 3,
            (3,0): 10, 20, 30
         }
   DATASET "IntArray" {
      DATATYPE H5T_STD_I32LE
      DATASPACE SIMPLE { ( 5, 6 ) / ( 5, 6 ) }
      DATA {
         (0,0): 0, 1, 2, 3, 4, 5,
         (1,0): 10, 11, 12, 13, 14, 15,
         (2,0): 20, 21, 22, 23, 24, 25,
         (3,0): 30, 31, 32, 33, 34, 35,
         (4,0): 40, 41, 42, 43, 44, 45
      }

95 h5dump – object header information. The -H option prints object header information only.
% h5dump -H Sample.h5
HDF5 "Sample.h5" {
GROUP "/" {
   GROUP "Floats" {
      DATASET "FloatArray" {
         DATATYPE H5T_IEEE_F32LE
         DATASPACE SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
      }
   }
   DATASET "IntArray" {
      DATATYPE H5T_STD_I32LE
      DATASPACE SIMPLE { ( 5, 6 ) / ( 5, 6 ) }
   }

96 h5dump – specific dataset. The -d dataset option dumps a specific dataset.
% h5dump -d /Floats/FloatArray Sample.h5
HDF5 "Sample.h5" {
DATASET "/Floats/FloatArray" {
   DATATYPE H5T_IEEE_F32LE
   DATASPACE SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
   DATA {
      (0,0): 0.01, 0.02, 0.03,
      (1,0): 0.1, 0.2, 0.3,
      (2,0): 1, 2, 3,
      (3,0): 10, 20, 30
   }

97 h5dump – dataset values to file. The -o file option sends dataset values to a file; the -y option omits array indices from the data values.
% h5dump -o Ofile -d /IntArray Sample.h5
HDF5 "Sample.h5" {
DATASET "/IntArray" {
   DATATYPE H5T_STD_I32LE
   DATASPACE SIMPLE { ( 5, 6 ) / ( 5, 6 ) }
   DATA { }
% cat Ofile
(0,0): 0, 1, 2, 3, 4, 5,
(1,0): 10, 11, 12, 13, 14, 15,
(2,0): 20, 21, 22, 23, 24, 25,
(3,0): 30, 31, 32, 33, 34, 35,
(4,0): 40, 41, 42, 43, 44, 45

98 h5dump – binary output. The -b FORMAT option produces binary output; FORMAT can be: MEMORY – data exported with datatypes matching memory on the system where h5dump is run; FILE – data exported with datatypes matching those in the HDF5 file being dumped; LE – data exported with a pre-defined little-endian datatype; BE – data exported with a pre-defined big-endian datatype. Typically used with the -d dataset and -o outputFile options, which allows data values to be exported for use with other applications. When -b and -d are used together, array indices are not output.

99 h5dump – binary output (examples).
% h5dump -b BE -d /IntArray -o OBE Sample.h5
% od -b OBE | head -2
0000000 000 000 000 000 000 000 000 001 000 000 000 002 000 000 000 003
0000020 000 000 000 004 000 000 000 005 000 000 000 012 000 000 000 013
% h5dump -b LE -d /IntArray -o OLE Sample.h5
% od -b OLE | head -2
0000000 000 000 000 000 001 000 000 000 002 000 000 000 003 000 000 000
0000020 004 000 000 000 005 000 000 000 012 000 000 000 013 000 000 000
% h5dump -b MEMORY -d /IntArray -o OME Sample.h5
% od -b OME | head -2
0000000 000 000 000 000 001 000 000 000 002 000 000 000 003 000 000 000
0000020 004 000 000 000 005 000 000 000 012 000 000 000 013 000 000 000

100 h5dump – properties information. The -p option prints dataset filters, storage layout, and fill value.
% h5dump -p -H Sample.h5
HDF5 "Sample.h5" {
GROUP "/" {
   GROUP "Floats" {
      DATASET "FloatArray" {
         DATATYPE H5T_IEEE_F32LE
         DATASPACE SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
         STORAGE_LAYOUT { CONTIGUOUS SIZE 48 OFFSET 3696 }
         FILTERS { NONE }
         FILLVALUE { FILL_TIME H5D_FILL_TIME_IFSET VALUE 0 }
         ALLOCATION_TIME { H5D_ALLOC_TIME_LATE }
…

101 h5import. h5import loads data into an existing or new HDF5 file. Data is loaded from ASCII or binary files; each input file corresponds to the data values for one dataset. Integer (signed or unsigned) and float data can be loaded. Per-dataset settable properties include the datatype (int or float; size; architecture; byte order) and storage (compression, chunking, external file, maximum dimensions). Properties are set via the command line, % h5import in in_opts [in2 in2_opts] -o out, or via a configuration file, % h5import in -c conf1 [in2 -c conf2] -o out.

102 Example: h5import. Create Sample2.h5 based on Sample.h5.
% cat config.FloatArray
PATH /Floats/FloatArray
INPUT-CLASS TEXTFP
RANK 2
DIMENSION-SIZES 4 3
% cat in.FloatArray
0.01 0.02 0.03
0.1 0.2 0.3
1 2 3
10 20 30
For reference, % h5dump -d /Floats/FloatArray -y Sample.h5 shows the corresponding dataset in Sample.h5: DATATYPE H5T_IEEE_F32LE; DATASPACE SIMPLE { ( 4, 3 ) / ( 4, 3 ) }; DATA { 0.01, 0.02, 0.03, 0.1, 0.2, 0.3, 1, 2, 3, 10, 20, 30 }.

103 Example: h5import (continued).
% cat config.IntArray
PATH /IntArray
INPUT-CLASS TEXTIN
RANK 2
DIMENSION-SIZES 5 6
% cat in.IntArray
0 1 2 3 4 5
10 11 12 13 14 15
20 21 22 23 24 25
30 31 32 38 34 35
40 41 42 43 44 45
With the input and configuration files ready, issue the command:
% h5import in.FloatArray -c config.FloatArray \
           in.IntArray -c config.IntArray -o Sample2.h5

104 h5mkgrp. h5mkgrp makes groups in an HDF5 file. Usage: h5mkgrp [OPTIONS] FILE GROUP... Options: -h, --help – print a usage message and exit; -l, --latest – use the latest version of the file format to create groups; -p, --parents – no error if existing, make parent groups as needed; -v, --verbose – print information about OBJECTS and OPTIONS; -V, --version – print version number and exit. Example: % h5mkgrp Sample2.h5 /EmptyGroup. Introduced in HDF5 release 1.8.0.

105 h5diff. h5diff compares HDF5 files and reports differences. Compare two HDF5 files: % h5diff file1 file2. Compare the same object in two files: % h5diff file1 file2 object. Compare different objects in two files: % h5diff file1 file2 object1 object2. Option flags: none – report the number of differences found in objects and where they occurred; -r – in addition, report the differences; -v – in addition, print the list of objects and warnings (typically used when comparing two files without specifying objects).

106 Example: h5diff.
% h5diff -v Sample.h5 Sample2.h5
file1 file2
---------------------------------------
 x  x  /
    x  /EmptyGroup
 x  x  /Floats
 x  x  /Floats/FloatArray
 x  x  /IntArray
group : / and /  0 differences found
group : /Floats and /Floats  0 differences found
dataset: /Floats/FloatArray and /Floats/FloatArray  0 differences found
dataset: /IntArray and /IntArray
size: [5x6] [5x6]
position  IntArray  IntArray  difference
-------------------------------------------------------------------
[ 3 3 ]   33        38        5

107 h5repack. h5repack copies an HDF5 file to a new file with a specified filter and storage layout. It removes unused space introduced when objects were deleted, when compressed datasets were updated and no longer fit in their original space, or when the full space allocated for variable-length data was not used. It can optionally apply a filter to datasets (gzip, szip, shuffle, checksum) and optionally apply a storage layout to datasets (contiguous, chunked, compact).

108 h5repack: filters. The -f FILTER option applies a filter; FILTER can be: GZIP – apply GZIP compression; SZIP – apply SZIP compression; SHUF – apply the HDF5 shuffle filter; FLET – apply the HDF5 checksum filter; NBIT – apply NBIT compression; SOFF – apply the HDF5 scale/offset filter; NONE – remove all filters. Compression will not be performed if the data is smaller than 1 KB unless the -m flag is used.

109 h5repack: storage layout. The -l LAYOUT option applies a layout; LAYOUT can be: CHUNK – apply chunked layout; COMPA – apply compact layout; CONTI – apply contiguous layout.

110 Example: h5repack (filter).
% h5repack -f SHUF -f GZIP=1 TES-Aura.he5 TES-rp.he5
% ls -sk TES-Aura.he5 TES-rp.he5
75608 TES-Aura.he5
56808 TES-rp.he5
About a 25% reduction in file size (the original is roughly 33% larger than the repacked file). (TES is the Tropospheric Emission Spectrometer on Aura, the third of NASA's Earth Observing System spacecraft; it makes global 3-D measurements of ozone and other chemical species involved in its formation and destruction.)

111 Example: h5repack (layout).
% h5repack -m 1 -l Floats/FloatArray:CHUNK=4x1 Sample.h5 Sample-rp.h5
% h5dump -p -H Sample-rp.h5
HDF5 "Sample-rp.h5" {
GROUP "/" {
   GROUP "Floats" {
      DATASET "FloatArray" {
         DATATYPE H5T_IEEE_F32LE
         DATASPACE SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
         STORAGE_LAYOUT { CHUNKED ( 4, 1 ) SIZE 48 }
         FILTERS { NONE }
         FILLVALUE { FILL_TIME H5D_FILL_TIME_IFSET VALUE 0 }
         ALLOCATION_TIME { H5D_ALLOC_TIME_INCR }
…

112 Performance Tuning & Troubleshooting. HDF5 tools can assist with performance tuning and troubleshooting: discover objects and their properties in HDF5 files (h5dump -p); get file size overhead information (h5stat); find locations of objects in a file (h5ls); discover differences (h5diff, h5ls); find the location of raw data (h5ls -var); check whether a file conforms to the HDF5 File Format Specification (h5check).

113 h5stat. h5stat prints statistics about HDF5 files. It reports two types of statistics. High-level information about objects: the number of different objects (groups, datasets, datatypes); the number of unique datatypes; the size of raw data. Information about objects' structural metadata: the size of structural metadata (total/free) – object headers, local and global heaps, size of B-trees – and object header fragmentation.

114 h5stat (continued). Helps troubleshoot size overhead in HDF5 files and choose appropriate properties and storage strategies. Usage: % h5stat --help, % h5stat file.h5. Full specification at http://www.hdfgroup.uiuc.edu/RFC/HDF5/h5stat/. Introduced in HDF5 release 1.8.0.

115 h5check. Verifies that a file is encoded according to the HDF5 File Format Specification (http://www.hdfgroup.org/HDF5/doc/H5.format.html). Does not use the HDF5 library. Used to confirm that files written by the HDF5 library are compliant with the specification. The tool is not part of the HDF5 source code distribution; it is available from ftp://ftp.hdfgroup.org/HDF5/special_tools/h5check/.

116 Questions?

117 HDF5 Advanced Topics

118 Outline. Part I: Overview of HDF5 datatypes. Part II: Partial I/O in HDF5 – hyperslab selection, dataset region references, chunking and compression. Part III: Performance issues (how to do it right). Part IV: Performance benefits of HDF5 version 1.8.

119 Part I: HDF5 Datatypes – a quick overview of the most difficult topics

120 HDF5 Datatypes. HDF5 has a rich set of pre-defined datatypes and supports the creation of an unlimited variety of complex user-defined datatypes. Datatype definitions are stored in the HDF5 file with the data. Datatype definitions include information such as byte order (endianness), size, and floating-point representation to fully describe how the data is stored and to ensure portability across platforms. Datatype definitions can be shared among objects in an HDF5 file, providing a powerful and efficient mechanism for describing data.

121 Example. The file stores a little-endian 4-byte integer dataset (H5T_STD_I32LE): an array of integers on an IA32 platform (native integer: little-endian, 4 bytes) writes it with H5Dwrite; an array of integers on a SPARC64 platform (native integer: big-endian, 8 bytes) reads it with H5Dread as H5T_NATIVE_INT; a platform using VAX G-floating could likewise write to the file with H5Dwrite. The library converts between representations.

122 Storing Variable Length Data in HDF5

123 HDF5 Fixed and Variable Length Array Storage (figure contrasting fixed-length and variable-length storage of per-time-step data records)

124 Storing Strings in HDF5. As an array of characters (array datatype or an extra dimension in the dataset): quick access to each character, but extra work to access and interpret each string. As fixed-length strings: string_id = H5Tcopy(H5T_C_S1); H5Tset_size(string_id, size); wasted space in shorter strings; can be compressed. As variable-length strings: string_id = H5Tcopy(H5T_C_S1); H5Tset_size(string_id, H5T_VARIABLE); overhead as for all VL datatypes; compression will not be applied to the actual data.
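
Building on the fixed-length case above, a minimal sketch that writes a few fixed-length strings to a dataset; the dataset name, string length, and values are illustrative, and the H5Dcreate call mirrors the five-argument form used earlier in these slides.

    /* Sketch: write three fixed-length (10-byte) strings.
       Names, sizes, and values are example choices. */
    char    data[3][10] = {"red", "green", "blue"};
    hsize_t dims[1]     = {3};
    hid_t   string_id, space_id, dset_id;

    string_id = H5Tcopy(H5T_C_S1);
    H5Tset_size(string_id, 10);
    space_id = H5Screate_simple(1, dims, NULL);
    dset_id  = H5Dcreate(file_id, "Colors", string_id, space_id, H5P_DEFAULT);
    H5Dwrite(dset_id, string_id, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);
    H5Dclose(dset_id); H5Sclose(space_id); H5Tclose(string_id);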

125 Storing Variable Length Data in HDF5. Each element is represented by a C structure:
typedef struct {
    size_t len;
    void   *p;
} hvl_t;
The base type can be any HDF5 type: H5Tvlen_create(base_type).

126 Data Example.
hvl_t data[LENGTH];
for(i=0; i<LENGTH; i++) {
    data[i].p = malloc((i+1)*sizeof(unsigned int));
    data[i].len = i+1;
}
tvl = H5Tvlen_create(H5T_NATIVE_UINT);
(The slide's figure points out data[0].p and data[4].len.)

127 Reading an HDF5 Variable Length Array.
hvl_t rdata[LENGTH];
/* Create the memory vlen type */
tvl = H5Tvlen_create(H5T_NATIVE_UINT);
ret = H5Dread(dataset, tvl, H5S_ALL, H5S_ALL, H5P_DEFAULT, rdata);
/* Reclaim the read VL data */
H5Dvlen_reclaim(tvl, H5S_ALL, H5P_DEFAULT, rdata);
On read, the HDF5 library allocates the memory to read the data into; the application only needs to allocate the array of hvl_t elements (pointers and lengths).

128 Storing Tables in an HDF5 File

129 Example. A table with fields a_name (integer), b_name (float), c_name (double):
a_name  b_name  c_name
0       0.      1.0000
1       1.      0.5000
2       4.      0.3333
3       9.      0.2500
4       16.     0.2000
5       25.     0.1667
6       36.     0.1429
7       49.     0.1250
8       64.     0.1111
9       81.     0.1000
Multiple ways to store a table: a dataset for each field; a dataset with a compound datatype; if all fields have the same type, a 2-dim array or a 1-dim array of an array datatype (continued…). Choose to achieve your goal: How much overhead will each type of storage create? Do I always read all fields? Do I need to read some fields more often? Do I want to use compression? Do I want to access some records?

130 HDF5 Compound Datatypes. Compound types are comparable to C structs. Members can be atomic or compound types. Members can be multidimensional. They can be written/read by a field or a set of fields. Not all data filters can be applied (shuffling, SZIP).

131 HDF5 Compound Datatypes: which APIs to use? H5TB APIs: create, read, get info on, and merge tables; add, delete, and append records; insert and delete fields; limited control over the table's properties (i.e. only GZIP compression at level 6, default allocation time for the table, extendible, etc.). PyTables (http://www.pytables.org): based on H5TB; Python interface; indexing capabilities. HDF5 APIs: H5Tcreate(H5T_COMPOUND) and H5Tinsert calls to create a compound datatype; H5Dcreate, etc.; see the H5Tget_member* functions for discovering properties of an HDF5 compound datatype.

132 Creating and Writing a Compound Dataset (h5_compound.c example).
typedef struct s1_t {
    int    a;
    float  b;
    double c;
} s1_t;
s1_t s1[LENGTH];

133 Creating and Writing a Compound Dataset.
/* Create datatype in memory. */
s1_tid = H5Tcreate(H5T_COMPOUND, sizeof(s1_t));
H5Tinsert(s1_tid, "a_name", HOFFSET(s1_t, a), H5T_NATIVE_INT);
H5Tinsert(s1_tid, "c_name", HOFFSET(s1_t, c), H5T_NATIVE_DOUBLE);
H5Tinsert(s1_tid, "b_name", HOFFSET(s1_t, b), H5T_NATIVE_FLOAT);
Note: use the HOFFSET macro instead of calculating offsets by hand. The order of the H5Tinsert calls is not important if HOFFSET is used.

134 Creating and Writing a Compound Dataset.
/* Create dataset and write data */
dataset = H5Dcreate(file, DATASETNAME, s1_tid, space, H5P_DEFAULT);
status = H5Dwrite(dataset, s1_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);
Note: in this example the memory and file datatypes are the same, and the type is not packed. Use H5Tpack to save space in the file:
status = H5Tpack(s1_tid);
dataset = H5Dcreate(file, DATASETNAME, s1_tid, space, H5P_DEFAULT);

135 File Content with h5dump.
HDF5 "SDScompound.h5" {
GROUP "/" {
   DATASET "ArrayOfStructures" {
      DATATYPE {
         H5T_STD_I32BE "a_name";
         H5T_IEEE_F32BE "b_name";
         H5T_IEEE_F64BE "c_name";
      }
      DATASPACE { SIMPLE ( 10 ) / ( 10 ) }
      DATA {
         { [ 0 ], [ 1 ] },
         { [ 1 ], …

136 Reading a Compound Dataset.
/* Discover the datatype in the file, build the memory type, and read. */
dataset = H5Dopen(file, DATASETNAME, H5P_DEFAULT);
s2_tid = H5Dget_type(dataset);
mem_tid = H5Tget_native_type(s2_tid, H5T_DIR_ASCEND);
s1 = malloc(H5Tget_size(mem_tid)*number_of_elements);
status = H5Dread(dataset, mem_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);
Note: we could construct the memory type as we did in the writing example. For general applications we need to discover the type in the file, find the corresponding memory type, allocate space, and do the read.

137 Reading a Compound Dataset by Fields.
typedef struct s2_t {
    double c;
    int    a;
} s2_t;
s2_t s2[LENGTH];
…
s2_tid = H5Tcreate(H5T_COMPOUND, sizeof(s2_t));
H5Tinsert(s2_tid, "c_name", HOFFSET(s2_t, c), H5T_NATIVE_DOUBLE);
H5Tinsert(s2_tid, "a_name", HOFFSET(s2_t, a), H5T_NATIVE_INT);
…
status = H5Dread(dataset, s2_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s2);

138 New Way of Creating Datatypes. Another way to create a compound datatype:
#include "H5LTpublic.h"
…
s2_tid = H5LTtext_to_dtype(
    "H5T_COMPOUND {H5T_NATIVE_DOUBLE \"c_name\"; H5T_NATIVE_INT \"a_name\"; }",
    H5LT_DDL);

139 Need Help with Datatypes? Check our support web pages: http://www.hdfgroup.uiuc.edu/UserSupport/examples-by-api/api18-c.html and http://www.hdfgroup.uiuc.edu/UserSupport/examples-by-api/api16-c.html

140 Part II: Working with Subsets

141 Collect data one way … an array of images (3-D)

142 Display data another way … a stitched image (2-D array)

143 Data is too big to read…

144 Refer to a region… Need to select and access the same elements of a dataset.

145 HDF5 Library Features. The HDF5 library provides capabilities to: describe subsets of data and perform write/read operations on subsets (hyperslab selections and partial I/O); store descriptions of data subsets in a file (object references, region references); use efficient storage mechanisms to achieve good performance while writing/reading subsets of data (chunking, compression).

146 Partial I/O in HDF5

147 How to Describe a Subset in HDF5? Before writing and reading a subset of data, one has to describe it to the HDF5 library. HDF5 APIs and documentation refer to a subset as a "selection" or "hyperslab selection". If specified, the HDF5 library will perform I/O on the selection only and not on all elements of a dataset.

148 Types of Selections in HDF5. Two types of selections: hyperslab selection (regular hyperslab; simple hyperslab; the result of set operations on hyperslabs – union, difference, …) and point selection. Hyperslab selection is especially important for doing parallel I/O in HDF5 (see the Parallel HDF5 tutorial).

149 Regular Hyperslab. A collection of regularly spaced, equal-size blocks.

150 Simple Hyperslab. A contiguous subset or sub-array.

151 Hyperslab Selection. The result of a union operation on three simple hyperslabs.

152 Hyperslab Description. Start – starting location of a hyperslab (1,1). Stride – number of elements that separate each block (3,2). Count – number of blocks (2,6). Block – block size (2,1). Everything is "measured" in number of elements.

153 Simple Hyperslab Description. Two ways to describe a simple hyperslab: as several blocks (stride (1,1), count (2,6), block (2,1)) or as one block (stride (1,1), count (1,1), block (4,6)). There is no performance penalty for one way or the other.

154 H5Sselect_hyperslab Function. space_id – identifier of the dataspace. op – selection operator, H5S_SELECT_SET or H5S_SELECT_OR. start – array with the starting coordinates of the hyperslab. stride – array specifying which positions along a dimension to select. count – array specifying how many blocks to select from the dataspace, in each dimension. block – array specifying the size of the element block (NULL indicates a block size of a single element in a dimension).
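
To tie these parameters to the hyperslab description two slides back (start (1,1), stride (3,2), count (2,6), block (2,1)), a minimal call sketch; space_id is assumed to be an existing dataspace.

    /* Sketch: select the regular hyperslab described on the
       "Hyperslab Description" slide. */
    hsize_t start[2]  = {1, 1};
    hsize_t stride[2] = {3, 2};
    hsize_t count[2]  = {2, 6};
    hsize_t block[2]  = {2, 1};
    herr_t  status;

    status = H5Sselect_hyperslab(space_id, H5S_SELECT_SET,
                                 start, stride, count, block);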

155 Reading/Writing Selections. Programming model for reading from a dataset in a file: 1. Open the dataset. 2. Get the file dataspace handle of the dataset and specify the subset to read from: H5Dget_space returns the file dataspace handle (the file dataspace describes the array stored in the file – its number of dimensions and their sizes); H5Sselect_hyperslab selects the elements of the array that participate in the I/O operation. 3. Allocate a data buffer of an appropriate shape and size.

156 Reading/Writing Selections. Programming model (continued): 4. Create a memory dataspace and specify the subset to write to: the memory dataspace describes the data buffer (its rank and dimension sizes); use H5Screate_simple to create the memory dataspace; use H5Sselect_hyperslab to select the elements of the data buffer that participate in the I/O operation. 5. Issue H5Dread or H5Dwrite to move the data between the file and the memory buffer. 6. Close the file dataspace and memory dataspace when done.

157 Example: Reading Two Rows. Data in the file is a 4x6 matrix:
 1  2  3  4  5  6
 7  8  9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
The buffer in memory is a 1-dim array of length 14.

158 Example: Reading Two Rows (file selection).
start = {1,0}, stride = {1,1}, count = {2,6}, block = {1,1}
filespace = H5Dget_space(dataset);
H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);

159 Example: Reading Two Rows (memory selection).
start[1] = {1}, count[1] = {12}, dim[1] = {14}
memspace = H5Screate_simple(1, dim, NULL);
H5Sselect_hyperslab(memspace, H5S_SELECT_SET, start, NULL, count, NULL);

160 Example: Reading Two Rows (the read).
H5Dread(…, …, memspace, filespace, …, …);
The two selected rows (7 8 9 10 11 12 13 14 15 16 17 18) land at offset 1 in the 14-element memory buffer.
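
Putting the three fragments above together, a hedged end-to-end sketch of the read; the dataset handle and sizes follow the slides, and error checking is omitted.

    /* Sketch: read rows 2-3 of the 4x6 dataset into a 14-element buffer
       at offset 1, combining the file and memory selections above. */
    int     data_out[14];
    hsize_t fstart[2] = {1, 0}, fcount[2] = {2, 6};
    hsize_t mstart[1] = {1},    mcount[1] = {12}, mdim[1] = {14};
    hid_t   filespace, memspace;

    filespace = H5Dget_space(dataset);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, fstart, NULL, fcount, NULL);

    memspace = H5Screate_simple(1, mdim, NULL);
    H5Sselect_hyperslab(memspace, H5S_SELECT_SET, mstart, NULL, mcount, NULL);

    H5Dread(dataset, H5T_NATIVE_INT, memspace, filespace, H5P_DEFAULT, data_out);

    H5Sclose(memspace);
    H5Sclose(filespace);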

161 Things to Remember. The number of elements selected in the file and in the memory buffer must be the same; H5Sget_select_npoints returns the number of selected elements in a hyperslab selection. HDF5 partial I/O is tuned to move data between selections that have the same dimensionality; avoid choosing subsets that have different ranks (as in the example above). Allocate a buffer of an appropriate size when reading data; use H5Tget_native_type and H5Tget_size to get the correct size of a data element in memory.
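
A small sketch of the buffer-sizing advice above; the direction argument and variable names are illustrative.

    /* Sketch: size a read buffer from the selection and the native element size. */
    hid_t    ftype   = H5Dget_type(dataset);
    hid_t    mtype   = H5Tget_native_type(ftype, H5T_DIR_ASCEND);
    size_t   elsize  = H5Tget_size(mtype);
    hssize_t npoints = H5Sget_select_npoints(memspace);
    void    *buf     = malloc((size_t)npoints * elsize);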

162 HDF5 Region References and Selections

163 Saving a Selected Region in a File. Need to select and access the same elements of a dataset.

164 Reference Datatype. A reference to an HDF5 object is a pointer to a group or a dataset in a file; the predefined datatype H5T_STD_REF_OBJ describes object references. A reference to a dataset region (or to a selection) is a pointer to a dataspace selection; the predefined datatype H5T_STD_REF_DSETREG describes region references.

165 Reference to Dataset Region. The file REF_REG.h5 holds, under its root group, a dataset "Matrix" (values 1 1 2 3 3 4 5 5 6 / 1 2 2 3 4 4 5 6 6) and a dataset "Region References" that points into it.

166 Reference to Dataset Region: Example.
dsetr_id = H5Dcreate(file_id, "REGION REFERENCES", H5T_STD_REF_DSETREG, …);
H5Sselect_hyperslab(space_id, H5S_SELECT_SET, start, NULL, …);
H5Rcreate(&ref[0], file_id, "MATRIX", H5R_DATASET_REGION, space_id);
H5Dwrite(dsetr_id, H5T_STD_REF_DSETREG, H5S_ALL, H5S_ALL, H5P_DEFAULT, ref);

167 Reference to Dataset Region (h5dump output).
HDF5 "REF_REG.h5" {
GROUP "/" {
   DATASET "MATRIX" { …… }
   DATASET "REGION_REFERENCES" {
      DATATYPE H5T_REFERENCE
      DATASPACE SIMPLE { ( 2 ) / ( 2 ) }
      DATA {
         (0): DATASET /MATRIX {(0,3)-(1,5)},
         (1): DATASET /MATRIX {(0,0), (1,6), (0,8)}
      }
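
For completeness, a hedged sketch of reading such a region reference back, using the HDF5 1.8-era H5Rdereference and H5Rget_region calls; the variable names are illustrative.

    /* Sketch: read a region reference back and recover the referenced
       dataset and the dataspace selection it points to. */
    hdset_reg_ref_t ref_out[2];
    hid_t           dset2_id, space2_id;

    H5Dread(dsetr_id, H5T_STD_REF_DSETREG, H5S_ALL, H5S_ALL,
            H5P_DEFAULT, ref_out);
    dset2_id  = H5Rdereference(dsetr_id, H5R_DATASET_REGION, &ref_out[0]);
    space2_id = H5Rget_region(dsetr_id, H5R_DATASET_REGION, &ref_out[0]);
    /* space2_id now holds the selection {(0,3)-(1,5)} on /MATRIX */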

168 Chunking in HDF5

169 HDF5 Chunking. Dataset data is divided into equally sized blocks (chunks). Each chunk is stored separately as a contiguous block in the HDF5 file. (Figure: the dataset header – datatype, dataspace, attributes, … – and a chunk index live in the metadata cache; chunks A, B, C, D are stored as separate blocks in the file, not necessarily in order.)

170 HDF5 Chunking. Chunking is needed for enabling compression and other filters, and for extendible datasets.

171 HDF5 Chunking. If used appropriately, chunking improves partial I/O for big datasets; in the figure, only two chunks are involved in the I/O.

172 HDF5 Chunking. A chunk has the same rank as the dataset. A chunk's dimensions do not need to be factors of the dataset's dimensions.

173 Creating a Chunked Dataset. 1. Create a dataset creation property list. 2. Set the property list to use chunked storage layout. 3. Create the dataset with the above property list.
dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
rank = 2;
ch_dims[0] = 100;
ch_dims[1] = 100;
H5Pset_chunk(dcpl_id, rank, ch_dims);
dset_id = H5Dcreate (…, dcpl_id);
H5Pclose(dcpl_id);

174 Writing or Reading a Chunked Dataset. 1. The chunking mechanism is transparent to the application. 2. Use the same set of operations as for a contiguous dataset, for example: H5Dopen(…); H5Sselect_hyperslab(…); H5Dread(…); 3. Selections do not need to coincide precisely with chunk boundaries.

175 HDF5 Filters. HDF5 filters modify data during I/O operations. Available filters: 1. Checksum (H5Pset_fletcher32). 2. Shuffling filter (H5Pset_shuffle). 3. Data transformation (in 1.8.*). 4. Compression: scale + offset (in 1.8.*); N-bit (in 1.8.*); GZIP (deflate) and SZIP (H5Pset_deflate, H5Pset_szip); user-defined filters (e.g., BZIP2). An example of a user-defined compression filter can be found at http://www.hdfgroup.uiuc.edu/papers/papers/bzip2/

176 Creating a Compressed Dataset. 1. Create a dataset creation property list. 2. Set the property list to use chunked storage layout. 3. Set the property list to use filters. 4. Create the dataset with the above property list.
crp_id = H5Pcreate(H5P_DATASET_CREATE);
rank = 2;
ch_dims[0] = 100;
ch_dims[1] = 100;
H5Pset_chunk(crp_id, rank, ch_dims);
H5Pset_deflate(crp_id, 9);
dset_id = H5Dcreate (…, crp_id);
H5Pclose(crp_id);

177 Writing a Compressed Dataset. The default chunk cache size is 1 MB. Filters, including compression, are applied when a chunk is evicted from the cache. Chunks in the file may have different sizes. (Figure: data flows from the chunked dataset through the per-dataset chunk cache and the filter pipeline to the file.)

178 Chunking Basics to Remember. Chunking creates storage overhead in the file. Performance is affected by the chunking and compression parameters and by the chunk cache size (H5Pset_cache call). Some hints for getting better performance: use a chunk size not smaller than the block size (4 KB) of the file system; use a compression method appropriate for your data; avoid using selections that do not coincide with chunk boundaries.
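
A hedged sketch of adjusting the chunk cache through a file access property list with H5Pset_cache, as mentioned above; the 16 MB size, slot count, and preemption policy are example values only, not recommendations.

    /* Sketch: enlarge the raw data chunk cache to ~16 MB via the file
       access property list (values are illustrative).
       Arguments: fapl, mdc_nelmts (ignored in 1.8), rdcc_nslots,
       rdcc_nbytes, rdcc_w0. */
    hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_cache(fapl_id, 0, 521, 16 * 1024 * 1024, 0.75);
    file_id = H5Fopen("file.h5", H5F_ACC_RDWR, fapl_id);
    H5Pclose(fapl_id);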

179 March 9, 200910th International LCI Conference - HDF5 Tutorial178 Example Creates a compressed 1000x20 integer dataset in a file %h5dump -p -H zip.h5 HDF5 "zip.h5" { GROUP "/" { GROUP "Data" { DATASET "Compressed_Data" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 1000, 20 )……… STORAGE_LAYOUT { CHUNKED ( 20, 20 ) SIZE 5316 }

180 March 9, 200910th International LCI Conference - HDF5 Tutorial179 Example (continued) FILTERS { COMPRESSION DEFLATE { LEVEL 6 } } FILLVALUE { FILL_TIME H5D_FILL_TIME_IFSET VALUE 0 } ALLOCATION_TIME { H5D_ALLOC_TIME_INCR }

181 March 9, 200910th International LCI Conference - HDF5 Tutorial180 Example (bigger chunk) Creates a compressed 1000x20 integer dataset in a file; a better compression ratio is achieved. h5dump -p -H zip.h5 HDF5 "zip.h5" { GROUP "/" { GROUP "Data" { DATASET "Compressed_Data" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 1000, 20 )……… STORAGE_LAYOUT { CHUNKED ( 200, 20 ) SIZE 2936 }

182 March 9, 200910th International LCI Conference - HDF5 Tutorial181 Part III Performance Issues (How to Do it Right)

183 March 9, 200910th International LCI Conference - HDF5 Tutorial182 Performance of Serial I/O Operations Next slides show the performance effects of using different access patterns and storage layouts. We use three test cases which consist of writing a selection to an array of characters. Data is stored in a row-major order. Tests were executed on THG Linux x86_64 box using h5perf_serial and HDF5 version 1.8.0

184 March 9, 200910th International LCI Conference - HDF5 Tutorial183 Serial Benchmarking Tool Benchmarking tool, h5perf_serial, publicly released with HDF5 1.8.1 Features include: Support for POSIX and HDF5 I/O calls. Support for datasets and buffers with multiple dimensions. Entire dataset access using a single or several I/O operations. Selection of contiguous and chunked storage for HDF5 operations.

185 March 9, 200910th International LCI Conference - HDF5 Tutorial184 Contiguous Storage (Case 1) Rectangular dataset of size 48K x 48K, with write selections of 512 x 48K. HDF5 storage layout is contiguous. Good I/O pattern for POSIX and HDF5 because each selection is contiguous. POSIX: 5.19 MB/s HDF5: 5.36 MB/s 1 2 3 4 1234

186 March 9, 200910th International LCI Conference - HDF5 Tutorial185 Contiguous Storage (Case 2) Rectangular dataset of 48K x 48K, with write selections of 48K x 512. HDF5 storage layout is contiguous. Bad I/O pattern for POSIX and HDF5 because each selection is noncontiguous. POSIX: 1.24 MB/s HDF5: 0.05 MB/s 1234 12341234 …….

187 March 9, 200910th International LCI Conference - HDF5 Tutorial186 Chunked Storage Rectangular dataset of 48K x 48K, with write selections of 48K x 512. HDF5 storage layout is chunked. Chunk and selection sizes are equal. Bad I/O case for POSIX because selections are noncontiguous. Good I/O case for HDF5 since selections are contiguous due to the chunked layout. POSIX: 1.51 MB/s HDF5: 5.58 MB/s [Diagram: POSIX vs. HDF5 access patterns.]

188 March 9, 200910th International LCI Conference - HDF5 Tutorial187 Conclusions Access patterns with many small I/O operations incur latency and overhead costs many times over. Chunked storage may improve I/O performance by affecting the contiguity of the data selection.

189 Writing Chunked Dataset 1000x100x100 dataset 4 byte integers Random values 0-99 50x100x100 chunks (20 total) Chunk size: 2 MB Write the entire dataset using 1x100x100 slices Slices are written sequentially March 9, 200918810th International LCI Conference - HDF5 Tutorial
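A rough sketch of that write pattern is below: a 1000 x 100 x 100 integer dataset with 50 x 100 x 100 chunks, written one 1 x 100 x 100 slice at a time. The file and dataset names and the constant slice contents (instead of the random values in the test) are illustrative; compile with -DH5_USE_16_API against HDF5 1.8+ for the 1.6-style H5Dcreate.

#include "hdf5.h"

int main(void)
{
    hsize_t dims[3]   = {1000, 100, 100};
    hsize_t chunk[3]  = {50, 100, 100};
    hsize_t count[3]  = {1, 100, 100};
    hsize_t offset[3] = {0, 0, 0};
    static int slice[100][100];          /* one 1 x 100 x 100 slice */

    hid_t file  = H5Fcreate("slices.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(3, dims, NULL);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 3, chunk);
    hid_t dset  = H5Dcreate(file, "data", H5T_NATIVE_INT, space, dcpl);

    hid_t memspace  = H5Screate_simple(3, count, NULL);
    hid_t filespace = H5Dget_space(dset);

    /* Write the 1000 slices sequentially along the first dimension */
    for (hsize_t i = 0; i < dims[0]; i++) {
        offset[0] = i;
        H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, count, NULL);
        H5Dwrite(dset, H5T_NATIVE_INT, memspace, filespace, H5P_DEFAULT, slice);
    }

    H5Sclose(filespace);
    H5Sclose(memspace);
    H5Pclose(dcpl);
    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}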

190 Test Setup 20 Chunks 1000 slices Chunk size is 2MB March 9, 200918910th International LCI Conference - HDF5 Tutorial

191 Test Setup (continued) Tests performed with 1 MB and 5MB chunk cache size Cache size set with H5Pset_cache function H5Pget_cache (fapl, NULL, &rdcc_nelmts, &rdcc_nbytes, &rdcc_w0); H5Pset_cache (fapl, 0, rdcc_nelmts, 5*1024*1024, rdcc_w0); Tests performed with no compression and with gzip (deflate) compression March 9, 200919010th International LCI Conference - HDF5 Tutorial
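Spelled out with the HDF5 1.8 signatures, the cache adjustment from this slide looks roughly like the sketch below; only the byte size is changed, from the 1 MB default to 5 MB. The file name refers to the illustrative sketch above.

#include "hdf5.h"

int main(void)
{
    int    mdc_nelmts;                 /* ignored by HDF5 1.8 but still part of the signature */
    size_t rdcc_nelmts, rdcc_nbytes;
    double rdcc_w0;

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* Keep the other chunk cache parameters, change only the byte size */
    H5Pget_cache(fapl, &mdc_nelmts, &rdcc_nelmts, &rdcc_nbytes, &rdcc_w0);
    H5Pset_cache(fapl, mdc_nelmts, rdcc_nelmts, 5 * 1024 * 1024, rdcc_w0);

    hid_t file_id = H5Fopen("slices.h5", H5F_ACC_RDWR, fapl);

    /* ... datasets opened through this file now share a 5 MB chunk cache ... */

    H5Fclose(file_id);
    H5Pclose(fapl);
    return 0;
}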

192 Effect of Chunk Cache Size on Write
No compression:
Cache size | I/O operations | Total data written | File size
1 MB (default) | 1002 | 75.54 MB | 38.15 MB
5 MB | 22 | 38.16 MB | 38.15 MB
Gzip compression:
Cache size | I/O operations | Total data written | File size
1 MB (default) | 1982 | 335.42 MB (322.34 MB read) | 13.08 MB
5 MB | 22 | 13.08 MB | 13.08 MB
March 9, 200919110th International LCI Conference - HDF5 Tutorial

193 Effect of Chunk Cache Size on Write With the 1 MB cache size, a chunk will not fit into the cache All writes to the dataset must be immediately written to disk With compression, the entire chunk must be read and rewritten every time a part of the chunk is written to Data must also be decompressed and recompressed each time Non sequential writes could result in a larger file Without compression, the entire chunk must be written when it is first written to the file If the selection were not contiguous on disk, it could require as much as 1 I/O operation for each element March 9, 200919210th International LCI Conference - HDF5 Tutorial

194 Effect of Chunk Cache Size on Write With the 5 MB cache size, the chunk is written only after it is full Drastically reduces the number of I/O operations Reduces the amount of data that must be written (and read) Reduces processing time, especially with the compression filter March 9, 200919310th International LCI Conference - HDF5 Tutorial

195 Conclusion It is important to make sure that a chunk will fit into the raw data chunk cache If you will be writing to multiple chunks at once, you should increase the cache size even more Try to design chunk dimensions to minimize the number you will be writing to at once March 9, 200919410th International LCI Conference - HDF5 Tutorial

196 Reading Chunked Dataset Read the same dataset, again by slices, but the slices cross through all the chunks 2 orientations for read plane Plane includes fastest changing dimension Plane does not include fastest changing dimension Measure total read operations, and total size read Chunk sizes of 50x100x100, and 10x100x100 1 MB cache March 9, 200919510th International LCI Conference - HDF5 Tutorial

197 Chunks Read slices Vertical and horizontal Test Setup March 9, 200919610th International LCI Conference - HDF5 Tutorial

198 Results Read slice includes fastest changing dimension
Chunk size | Compression | I/O operations | Total data read
50 | Yes | 2010 | 1307 MB
10 | Yes | 10012 | 1308 MB
50 | No | 100010 | 38 MB
10 | No | 10012 | 3814 MB
March 9, 200919710th International LCI Conference - HDF5 Tutorial

199 Results (continued) Read slice does not include fastest changing dimension
Chunk size | Compression | I/O operations | Total data read
50 | Yes | 2010 | 1307 MB
10 | Yes | 10012 | 1308 MB
50 | No | 10000010 | 38 MB
10 | No | 10012 | 3814 MB
March 9, 200919810th International LCI Conference - HDF5 Tutorial

200 Effect of Cache Size on Read When compression is enabled, the library must always read each entire chunk once for each call to H5Dread. When compression is disabled, the library’s behavior depends on the cache size relative to the chunk size. If the chunk fits in cache, the library reads each entire chunk once for each call to H5Dread If the chunk does not fit in cache, the library reads only the data that is selected More read operations, especially if the read plane does not include the fastest changing dimension Less total data read March 9, 200919910th International LCI Conference - HDF5 Tutorial

201 Conclusion In this case cache size does not matter when reading if compression is enabled. Without compression, a larger cache may not be beneficial, unless the cache is large enough to hold all of the chunks. The optimum cache size depends on the exact shape of the data, as well as the hardware. March 9, 200920010th International LCI Conference - HDF5 Tutorial

202 Hints for Chunk Settings Chunk dimensions should align as closely as possible with hyperslab dimensions for read/write Chunk cache size ( rdcc_nbytes ) should be large enough to hold all the chunks in the selection If this is not possible, it may be best to disable chunk caching altogether (set rdcc_nbytes to 0) rdcc_nelmts should be a prime number that is at least 10 to 100 times the number of chunks that can fit into rdcc_nbytes rdcc_w0 should be set to 1 if chunks that have been fully read/written will never be read/written again March 9, 200910th International LCI Conference - HDF5 Tutorial201
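A small sketch of these hints applied to a file access property list follows; all numbers are purely illustrative, not recommendations for a particular dataset: rdcc_nbytes sized for the chunks in a selection, rdcc_nelmts a prime far larger than the number of cached chunks, and rdcc_w0 set to 1.0 for chunks that are never revisited.

#include "hdf5.h"

int main(void)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* 32 MB cache, 12421 (prime) chunk slots, evict fully read/written chunks first */
    H5Pset_cache(fapl, 0, 12421, 32 * 1024 * 1024, 1.0);

    /* If the chunks cannot fit anyway, disabling the cache may be better:  */
    /* H5Pset_cache(fapl, 0, 521, 0, 0.75);                                 */

    hid_t file_id = H5Fopen("slices.h5", H5F_ACC_RDWR, fapl);
    /* ... work with the chunked datasets ... */
    H5Fclose(file_id);
    H5Pclose(fapl);
    return 0;
}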

203 March 9, 200910th International LCI Conference - HDF5 Tutorial202 Part IV Performance Benefits of HDF5 version 1.8

204 What Did We Do in HDF5 1.8? Extended File Format Specification Reviewed group implementations Introduced new link object Revamped metadata cache implementation Improved handling of datasets and datatypes Introduced shared object header message Extended error handling Enhanced backward/forward APIs and file format compatibility March 9, 200910th International LCI Conference - HDF5 Tutorial203

205 What Did We Do in HDF5 1.8? And much more good stuff to make HDF5 March 9, 200910th International LCI Conference - HDF5 Tutorial204 Better and Faster

206 March 9, 200910th International LCI Conference - HDF5 Tutorial205 HDF5 File Format Extension

207 March 9, 200910th International LCI Conference - HDF5 Tutorial206 HDF5 File Format Extension Why: Address deficiencies of the original file format Address space overhead in an HDF5 file Enable new features What: New routine that instructs the HDF5 library to create all objects using the latest version of the HDF5 file format (compared with the default of using the earliest version in which an object type became available, for example, the array datatype)

208 March 9, 200910th International LCI Conference - HDF5 Tutorial207 HDF5 File Format Extension Example /* Use the latest version of a file format for each object created in a file */ fapl_id = H5Pcreate(H5P_FILE_ACCESS); H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST); fid = H5Fcreate(…,…,…,fapl_id); or fid = H5Fopen(…,…,fapl_id);

209 March 9, 200910th International LCI Conference - HDF5 Tutorial208 Group Revisions

210 March 9, 200910th International LCI Conference - HDF5 Tutorial209 Better Large Group Storage Why: Faster, more scalable storage and access for large groups What: New format and method for storing groups with many links

211 March 9, 200910th International LCI Conference - HDF5 Tutorial210 Informal Benchmark Create a file and a group in a file Create up to 10^6 groups with one dataset in each group Compare file sizes and performance of HDF5 1.8.1 using the latest group format with the performance of HDF5 1.8.1 (default, old format) and 1.6.7 Note: Default 1.8.1 and 1.6.7 became very slow after 700,000 groups

212 Time to Open and Read a Dataset March 9, 200910th International LCI Conference - HDF5 Tutorial211

213 File Size March 9, 200910th International LCI Conference - HDF5 Tutorial212

214 March 9, 200910th International LCI Conference - HDF5 Tutorial213 Questions?

215 March 9, 200910th International LCI Conference - HDF5 Tutorial214 Data Storage and I/O in HDF5

216 March 9, 200910th International LCI Conference - HDF5 Tutorial215 Software stack Life cycle: What happens to data when it is transferred from the application buffer to an HDF5 file and from an HDF5 file to the application buffer? [Diagram of the software stack: application (data buffer, H5Dwrite) → object API → library internals → virtual file I/O → unbuffered I/O → data in a file or other "storage".]

217 March 9, 200910th International LCI Conference - HDF5 Tutorial216 Goals Understanding of what is happening to data inside the HDF5 library will help to write efficient applications Goals of this talk: Describe some basic operations and data structures, and explain how they affect performance and storage sizes Give some “recipes” for how to improve performance

218 March 9, 200910th International LCI Conference - HDF5 Tutorial217 Topics Dataset metadata and array data storage layouts Types of dataset storage layouts Factors affecting I/O performance I/O with compact datasets I/O with contiguous datasets I/O with chunked datasets Variable length data and I/O

219 March 9, 200910th International LCI Conference - HDF5 Tutorial218 HDF5 dataset metadata and array data storage layouts

220 March 9, 200910th International LCI Conference - HDF5 Tutorial219 HDF5 Dataset Data array Ordered collection of identically typed data items distinguished by their indices Metadata Dataspace: Rank, dimensions of dataset array Datatype: Information on how to interpret data Storage Properties: How array is organized on disk Attributes: User-defined metadata (optional)

221 March 9, 200910th International LCI Conference - HDF5 Tutorial220 HDF5 Dataset [Diagram: dataset data plus metadata. Dataspace: rank 3, dimensions Dim_1 = 4, Dim_2 = 5, Dim_3 = 7. Datatype: IEEE 32-bit float. Storage info: chunked, compressed. Attributes: Time = 32.4, Pressure = 987, Temp = 56.]

222 March 9, 200910th International LCI Conference - HDF5 Tutorial221 Metadata cache and dataset data Dataset data typically kept in application memory Dataset header kept in a separate space – the metadata cache [Diagram: dataset data in application memory, dataset header (datatype, dataspace, attributes) in the metadata cache, both written to the file.]

223 March 9, 200910th International LCI Conference - HDF5 Tutorial222 Metadata and metadata cache HDF5 metadata Information about HDF5 objects used by the library Examples: object headers, B-tree nodes for group, B-Tree nodes for chunks, heaps, super-block, etc. Usually small compared to raw data sizes (KB vs. MB-GB)

224 March 9, 200910th International LCI Conference - HDF5 Tutorial223 Metadata and metadata cache Metadata cache Space allocated to handle pieces of the HDF5 metadata Allocated by the HDF5 library in application’s memory space Cache behavior affects overall performance Metadata cache implementation prior to HDF5 1.6.5 could cause performance degradation for some applications

225 March 9, 200910th International LCI Conference - HDF5 Tutorial224 Types of data storage layouts

226 March 9, 200910th International LCI Conference - HDF5 Tutorial225 HDF5 datasets storage layouts Contiguous Chunked Compact

227 March 9, 200910th International LCI Conference - HDF5 Tutorial226 Contiguous storage layout Metadata header separate from dataset data Data stored in one contiguous block in the HDF5 file [Diagram: dataset data in application memory, dataset header (datatype, dataspace, attributes) in the metadata cache, data written to one contiguous block in the file.]

228 March 9, 200910th International LCI Conference - HDF5 Tutorial227 Chunked storage Chunking – storage layout where a dataset is partitioned in fixed-size multi-dimensional tiles or chunks Used for extendible datasets and datasets with filters applied (checksum, compression) HDF5 library treats each chunk as atomic object Greatly affects performance and file sizes

229 March 9, 200910th International LCI Conference - HDF5 Tutorial228 Chunked storage layout Dataset data divided into equal sized blocks (chunks) Each chunk stored separately as a contiguous block in the HDF5 file [Diagram: dataset header and chunk index in the metadata cache; chunks A, B, C, D stored as separate contiguous blocks in the file.]

230 March 9, 200910th International LCI Conference - HDF5 Tutorial229 Compact storage layout Dataset data and metadata stored together in the object header [Diagram: the dataset header (datatype, dataspace, attributes) and the dataset data travel together from the metadata cache to the file.]
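A minimal sketch of requesting compact storage for a small dataset follows: a 16 x 16 table of doubles (about 2 KB; compact layout is limited to small raw data, well under 64 KB). The names are illustrative; the 1.6-style H5Dcreate assumes -DH5_USE_16_API on HDF5 1.8+.

#include "hdf5.h"

int main(void)
{
    hsize_t dims[2] = {16, 16};
    static double table[16][16];

    hid_t file  = H5Fcreate("compact.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);

    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_layout(dcpl, H5D_COMPACT);   /* raw data goes into the dataset object header */

    hid_t dset = H5Dcreate(file, "lookup", H5T_NATIVE_DOUBLE, space, dcpl);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, table);

    H5Pclose(dcpl);
    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}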

231 March 9, 200910th International LCI Conference - HDF5 Tutorial230 Factors affecting I/O performance

232 March 9, 200910th International LCI Conference - HDF5 Tutorial231 What goes on inside the library? Operations on data inside the library Copying to/from internal buffers Datatype conversion Scattering - gathering Data transformation (filters, compression) Data structures used B-trees (groups, dataset chunks) Hash tables Local and Global heaps (variable length data: link names, strings, etc.) Other concepts HDF5 metadata, metadata cache Chunking, chunk cache

233 March 9, 200910th International LCI Conference - HDF5 Tutorial232 Operations on data inside the library Copying to/from internal buffers Datatype conversion, such as float ↔ integer, little-endian ↔ big-endian, 64-bit integer to 16-bit integer Scattering - gathering Data is scattered/gathered from/to application buffers into internal buffers for datatype conversion and partial I/O Data transformation (filters, compression) Checksum on raw data and metadata (in 1.8.0) Algebraic transform GZIP and SZIP compressions User-defined filters
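As a minimal example of an implicit datatype conversion, the sketch below writes a buffer of native doubles into a dataset whose file datatype is 32-bit big-endian IEEE float; the library converts during H5Dwrite. Names are illustrative; the 1.6-style H5Dcreate assumes -DH5_USE_16_API on HDF5 1.8+.

#include "hdf5.h"

int main(void)
{
    hsize_t dims[1] = {1000};
    static double buf[1000];

    hid_t file  = H5Fcreate("convert.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(1, dims, NULL);

    /* File datatype: IEEE 32-bit float, big-endian */
    hid_t dset = H5Dcreate(file, "measurements", H5T_IEEE_F32BE, space, H5P_DEFAULT);

    /* Memory datatype: native double -- the mismatch triggers conversion inside the library */
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);

    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}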

234 March 9, 200910th International LCI Conference - HDF5 Tutorial233 I/O performance I/O performance depends on Storage layouts Dataset storage properties Chunking strategy Metadata cache performance Datatype conversion performance Other filters, such as compression Access patterns

235 March 9, 200910th International LCI Conference - HDF5 Tutorial234 I/O with different storage layouts

236 March 9, 200910th International LCI Conference - HDF5 Tutorial235 Writing a compact dataset Application memory Dataset headerDataset header …………. Datatype Dataspace …………. Attributes … File Metadata cache Dataset data One write to store header and dataset data Dataset data

237 March 9, 200910th International LCI Conference - HDF5 Tutorial236 Writing contiguous dataset – no conversion Application memory Metadata cache Dataset headerDataset header …………. Datatype Dataspace …………. Attributes … File Dataset data No sub-setting in memory or a file is performed

238 March 9, 200910th International LCI Conference - HDF5 Tutorial237 Writing a contiguous dataset with datatype conversion Dataset header …………. Datatype Dataspace …………. Attribute 1 Attribute 2 ………… Application memory Metadata cache File Conversion buffer 1MB Dataset data No sub-setting in memory or a file is performed

239 March 9, 200910th International LCI Conference - HDF5 Tutorial238 Partial I/O with contiguous datasets

240 March 9, 200910th International LCI Conference - HDF5 Tutorial239 Writing whole dataset – contiguous rows File Application data in memory Data is contiguous in a file One I/O operation M rows M N

241 March 9, 200910th International LCI Conference - HDF5 Tutorial240 Sub-setting of contiguous dataset Series of adjacent rows File N Application data in memory Subset – contiguous in a file One I/O operation M rows M Entire dataset – contiguous in a file

242 March 9, 200910th International LCI Conference - HDF5 Tutorial241 Sub-setting of contiguous dataset Adjacent, partial rows File N M … Application data in memory Data is scattered in a file in M contiguous blocks Several small I/O operation N elements

243 March 9, 200910th International LCI Conference - HDF5 Tutorial242 Sub-setting of contiguous dataset Extreme case: writing a column N M Application data in memory Subset data is scattered in a file in M different locations Several small I/O operation … 1 element

244 March 9, 200910th International LCI Conference - HDF5 Tutorial243 Sub-setting of contiguous dataset Data sieve buffer File N M … Application data in memory Data is scattered in a file 1 element Data in a sieve buffer (64K) in memory memcopy

245 March 9, 200910th International LCI Conference - HDF5 Tutorial244 Performance tuning for contiguous dataset Datatype conversion Avoid for better performance Use H5Pset_buffer function to customize conversion buffer size Partial I/O Write/read in big contiguous blocks Use H5Pset_sieve_buf_size to improve performance for complex subsetting
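A short sketch of both knobs follows; the 1 MB sieve buffer and 4 MB conversion buffer sizes are only examples, and the file name is illustrative. H5Pset_sieve_buf_size goes on the file access property list, while H5Pset_buffer goes on the dataset transfer property list passed to H5Dread/H5Dwrite.

#include "hdf5.h"

int main(void)
{
    /* Bigger data sieve buffer (default 64 KB) for complex subsetting */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_sieve_buf_size(fapl, 1024 * 1024);
    hid_t file = H5Fopen("data.h5", H5F_ACC_RDWR, fapl);

    /* Bigger datatype conversion buffer (default 1 MB) for transfers */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_buffer(dxpl, 4 * 1024 * 1024, NULL, NULL);

    /* ... pass dxpl to H5Dwrite / H5Dread calls on datasets in this file ... */

    H5Pclose(dxpl);
    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}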

246 March 9, 200910th International LCI Conference - HDF5 Tutorial245 I/O with Chunking

247 March 9, 200910th International LCI Conference - HDF5 Tutorial246 Chunked storage layout Raw data divided into equal sized blocks (chunks) Each chunk stored separately as a contiguous block in a file Application memory Metadata cache Dataset headerDataset header …………. Datatype Dataspace …………. Attributes … File Dataset data AD CB header Chunk index A B CD

248 March 9, 200910th International LCI Conference - HDF5 Tutorial247 Information about chunking HDF5 library treats each chunk as atomic object Compression and other filters are applied to each chunk Datatype conversion is performed on each chunk Chunk size greatly affects performance Chunk overhead adds to file size Chunk processing involves many steps Chunk cache Caches chunks for better performance Size of chunk cache is set for file (default size 1MB) Each chunked dataset has its own chunk cache Chunk may be too big to fit into cache Memory may grow if application keeps opening datasets

249 March 9, 200910th International LCI Conference - HDF5 Tutorial248 Chunk cache Dataset_1 header ………… Application memory Metadata cache Chunking B-tree nodes Chunk cache Default size is 1MB Dataset_N header ………… ………

250 March 9, 200910th International LCI Conference - HDF5 Tutorial249 Writing chunked dataset Filters including compression are applied when a chunk is evicted from the cache [Diagram: chunks move from the chunked dataset through the chunk cache and the filter pipeline into the file.]

251 March 9, 200910th International LCI Conference - HDF5 Tutorial250 Partial I/O with Chunking

252 March 9, 200910th International LCI Conference - HDF5 Tutorial251 Partial I/O for chunked dataset Example: write the green subset from the dataset, converting the data Dataset is stored as six chunks in the file. The subset spans four chunks, numbered 1-4 in the figure. Hence four chunks must be written to the file. But first, the four chunks must be read from the file, to preserve those parts of each chunk that are not to be overwritten. 12 34

253 March 9, 200910th International LCI Conference - HDF5 Tutorial252 Partial I/O for chunked dataset For each of four chunks on writing: Read chunk from file into chunk cache, unless it’s already there Determine which part of the chunk will be replaced by the selection Move those elements from application buffer to conversion buffer Perform conversion Replace that part of the chunk in the cache with the corresponding elements from the conversion buffer Apply filters (compression) when chunk is flushed from chunk cache For each element 3 (or more) memcopy operations are performed 12 34

254 March 9, 200910th International LCI Conference - HDF5 Tutorial253 Partial I/O for chunked dataset 3 Application memory conversion buffer Application buffer Chunk Elements participating in I/O are gathered into corresponding chunk after going through conversion buffer Chunk cache 3

255 March 9, 200910th International LCI Conference - HDF5 Tutorial254 Partial I/O for chunked dataset 3 Conversion buffer Application memory Chunk cache File Chunk Apply filters and write to file Application buffer

256 March 9, 200910th International LCI Conference - HDF5 Tutorial255 Variable length data and I/O

257 March 9, 200910th International LCI Conference - HDF5 Tutorial256 Examples of variable length data String A[0] “the first string we want to write” ………………………………… A[N-1] “the N-th string we want to write” Each element is a record of variable-length A[0] (1,1,0,0,0,5,6,7,8,9) [length = 10] A[1] (0,0,110,2005) [length = 4] ……………………….. A[N] (1,2,3,4,5,6,7,8,9,10,11,12,….,M) [length = M]

258 March 9, 200910th International LCI Conference - HDF5 Tutorial257 Variable length data in HDF5 Variable length description in an HDF5 application typedef struct { size_t len; void *p; } hvl_t; Base type can be any HDF5 type H5Tvlen_create(base_type) ~ 20 bytes overhead for each element Data cannot be compressed
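A minimal sketch of writing two variable-length integer records follows (names illustrative; the 1.6-style H5Dcreate assumes -DH5_USE_16_API on HDF5 1.8+). When such data is read back, buffers allocated by the library should be released with H5Dvlen_reclaim.

#include "hdf5.h"

int main(void)
{
    int a[3] = {1, 2, 3};
    int b[5] = {10, 20, 30, 40, 50};

    hvl_t data[2];                       /* one hvl_t per variable-length record */
    data[0].len = 3;  data[0].p = a;
    data[1].len = 5;  data[1].p = b;

    hsize_t dims[1] = {2};
    hid_t file  = H5Fcreate("vlen.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t vtype = H5Tvlen_create(H5T_NATIVE_INT);   /* VL type over any base type */

    hid_t dset = H5Dcreate(file, "records", vtype, space, H5P_DEFAULT);
    H5Dwrite(dset, vtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset);
    H5Tclose(vtype);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}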

259 March 9, 200910th International LCI Conference - HDF5 Tutorial258 Variable length data storage in HDF5 [Diagram: in the file, the dataset with variable length elements stores pointers into a global heap; the global heap holds the actual variable length data, alongside the dataset header.]

260 March 9, 200910th International LCI Conference - HDF5 Tutorial259 Variable length datasets and I/O When writing variable length data, elements in application buffer always go through conversion and are copied to the global heaps in a metadata cache before ending in a file Global heap Application buffer Metadata cache conversion buffer

261 March 9, 200910th International LCI Conference - HDF5 Tutorial260 There may be more than one global heap On a write request, VL data goes through conversion and is written to a global heap; elements of the same dataset may be written to different heaps. [Diagram: raw VL data moves from the application buffer through the conversion buffer into global heaps held in the metadata cache.]

262 March 9, 200910th International LCI Conference - HDF5 Tutorial261 Variable length datasets and I/O [Diagram: raw VL data in the application buffer passes through the conversion buffer into global heaps in the metadata cache, and the heaps are then written to the file.]

263 March 9, 200910th International LCI Conference - HDF5 Tutorial262 VL chunked dataset in a file File Dataset header Chunk B-tree Dataset chunksHeaps with VL data

264 March 9, 200910th International LCI Conference - HDF5 Tutorial263 Writing chunked VL datasets [Diagram: hvl_t pointers in the application buffers go through the conversion buffer; the raw VL data lands in a global heap in the metadata cache (along with the dataset header and B-tree nodes), while the chunks pass through the chunk cache and filter pipeline into the file.]

265 March 9, 200910th International LCI Conference - HDF5 Tutorial264 Hints for variable length data I/O Avoid closing/opening a file while writing VL datasets Global heap information is lost Global heaps may have unused space Avoid alternately writing different VL datasets Data from different datasets will go into the same heap If the maximum length of the record is known, consider using fixed-length records and compression

266 March 9, 200910th International LCI Conference - HDF5 Tutorial265 Questions?

267 March 9, 200910th International LCI Conference - HDF5 Tutorial266 Parallel HDF5 Tutorial Albert Cheng The HDF Group

268 March 9, 200910th International LCI Conference - HDF5 Tutorial267 Parallel HDF5 Introductory Tutorial

269 March 9, 200910th International LCI Conference - HDF5 Tutorial268 Outline Overview of Parallel HDF5 design Setting up parallel environment Programming model for Creating and accessing a File Creating and accessing a Dataset Writing and reading Hyperslabs Parallel tutorial available at http://www.hdfgroup.org/HDF5/Tutor/

270 March 9, 200910th International LCI Conference - HDF5 Tutorial269 Overview of Parallel HDF5 Design

271 March 9, 200910th International LCI Conference - HDF5 Tutorial270 PHDF5 Requirements Support MPI programming PHDF5 files compatible with serial HDF5 files Shareable between different serial or parallel platforms Single file image to all processes One file per process design is undesirable Expensive post processing Not usable by different number of processes Standard parallel I/O interface Must be portable to different platforms

272 March 9, 200910th International LCI Conference - HDF5 Tutorial271 PHDF5 Implementation Layers Application Parallel computing system (Linux cluster) Compute node I/O library (HDF5) Parallel I/O library (MPI-I/O) Parallel file system (GPFS) Switch network/I/O servers Compute node Disk architecture & layout of data on disk PHDF5 built on top of standard MPI-IO API

273 March 9, 200910th International LCI Conference - HDF5 Tutorial272 Parallel Environment Requirements MPI with MPI-IO. E.g., MPICH2 ROMIO Vendor’s MPI-IO POSIX compliant parallel file system. E.g., GPFS Lustre

274 March 9, 200910th International LCI Conference - HDF5 Tutorial273 MPI-IO vs. HDF5 MPI-IO is an Input/Output API. It treats the data file as a “linear byte stream” and each MPI application needs to provide its own file view and data representations to interpret those bytes. All data stored are machine dependent except the “external32” representation. External32 is defined in big-endianness; little-endian machines have to do the data conversion in both read and write operations. 64-bit sized data types may lose information.

275 March 9, 200910th International LCI Conference - HDF5 Tutorial274 MPI-IO vs. HDF5 Cont. HDF5 is data management software. It stores the data and metadata according to the HDF5 data format definition. An HDF5 file is self-describing. Each machine can store the data in its own native representation for efficient I/O without loss of data precision. Any necessary data representation conversion is done by the HDF5 library automatically.

276 March 9, 200910th International LCI Conference - HDF5 Tutorial275 How to Compile PHDF5 Applications h5pcc – HDF5 C compiler command Similar to mpicc h5pfc – HDF5 F90 compiler command Similar to mpif90 To compile: % h5pcc h5prog.c % h5pfc h5prog.f90

277 March 9, 200910th International LCI Conference - HDF5 Tutorial276 h5pcc/h5pfc -show option -show displays the compiler commands and options without executing them, i.e., dry run % h5pcc -show Sample_mpio.c mpicc -I/home/packages/phdf5/include \ -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE \ -D_FILE_OFFSET_BITS=64 -D_POSIX_SOURCE \ -D_BSD_SOURCE -std=c99 -c Sample_mpio.c mpicc -std=c99 Sample_mpio.o \ -L/home/packages/phdf5/lib \ home/packages/phdf5/lib/libhdf5_hl.a \ /home/packages/phdf5/lib/libhdf5.a -lz -lm -Wl,-rpath \ -Wl,/home/packages/phdf5/lib

278 March 9, 200910th International LCI Conference - HDF5 Tutorial277 Collective vs. Independent Calls MPI definition of collective call All processes of the communicator must participate in the right order. E.g., Process 1: call A(); call B(); Process 2: call A(); call B(); **right** Process 1: call A(); call B(); Process 2: call B(); call A(); **wrong** Independent means not collective Collective is not necessarily synchronous

279 March 9, 200910th International LCI Conference - HDF5 Tutorial278 Programming Restrictions Most PHDF5 APIs are collective PHDF5 opens a parallel file with a communicator Returns a file-handle Future access to the file via the file-handle All processes must participate in collective PHDF5 APIs Different files can be opened via different communicators

280 March 9, 200910th International LCI Conference - HDF5 Tutorial279 Examples of PHDF5 API Examples of PHDF5 collective API File operations: H5Fcreate, H5Fopen, H5Fclose Objects creation: H5Dcreate, H5Dopen, H5Dclose Objects structure: H5Dextend (increase dimension sizes) Array data transfer can be collective or independent Dataset operations: H5Dwrite, H5Dread Collectiveness is indicated by function parameters, not by function names as in MPI API

281 March 9, 200910th International LCI Conference - HDF5 Tutorial280 What Does PHDF5 Support ? After a file is opened by the processes of a communicator All parts of file are accessible by all processes All objects in the file are accessible by all processes Multiple processes may write to the same data array Each process may write to individual data array

282 March 9, 200910th International LCI Conference - HDF5 Tutorial281 PHDF5 API Languages C and F90 language interfaces Platforms supported: Most platforms with MPI-IO supported. E.g., IBM SP, Linux clusters, SGI Altix, Cray XT3, …

283 March 9, 200910th International LCI Conference - HDF5 Tutorial282 Programming model for creating and accessing a file HDF5 uses access template object (property list) to control the file access mechanism General model to access HDF5 file in parallel: Setup MPI-IO access template (access property list) Open File Access Data Close File

284 March 9, 200910th International LCI Conference - HDF5 Tutorial283 Setup MPI-IO access template Each process of the MPI communicator creates an access template and sets it up with MPI parallel access information C: herr_t H5Pset_fapl_mpio(hid_t plist_id, MPI_Comm comm, MPI_Info info); F90: h5pset_fapl_mpio_f(plist_id, comm, info) integer(hid_t) :: plist_id integer :: comm, info plist_id is a file access property list identifier

285 March 9, 200910th International LCI Conference - HDF5 Tutorial284 C Example Parallel File Create 23 comm = MPI_COMM_WORLD; 24 info = MPI_INFO_NULL; 26 /* 27 * Initialize MPI 28 */ 29 MPI_Init(&argc, &argv); 30 /* 34 * Set up file access property list for MPI-IO access 35 */ ->36 plist_id = H5Pcreate(H5P_FILE_ACCESS); ->37 H5Pset_fapl_mpio(plist_id, comm, info); 38 ->42 file_id = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id); 49 /* 50 * Close the file. 51 */ 52 H5Fclose(file_id); 54 MPI_Finalize();

286 March 9, 200910th International LCI Conference - HDF5 Tutorial285 F90 Example Parallel File Create 23 comm = MPI_COMM_WORLD 24 info = MPI_INFO_NULL 26 CALL MPI_INIT(mpierror) 29 ! 30 !Initialize FORTRAN predefined datatypes 32 CALL h5open_f(error) 34 ! 35 !Setup file access property list for MPI-IO access. ->37 CALL h5pcreate_f(H5P_FILE_ACCESS_F, plist_id, error) ->38 CALL h5pset_fapl_mpio_f(plist_id, comm, info, error) 40 ! 41 !Create the file collectively. ->43 CALL h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error, access_prp = plist_id) 45 ! 46 !Close the file. 49 CALL h5fclose_f(file_id, error) 51 ! 52 !Close FORTRAN interface 54 CALL h5close_f(error) 56 CALL MPI_FINALIZE(mpierror)

287 March 9, 200910th International LCI Conference - HDF5 Tutorial286 Creating and Opening Dataset All processes of the communicator open/close a dataset by a collective call C: H5Dcreate or H5Dopen; H5Dclose F90: h5dcreate_f or h5dopen_f; h5dclose_f All processes of the communicator must extend an unlimited dimension dataset before writing to it C: H5Dextend F90: h5dextend_f

288 March 9, 200910th International LCI Conference - HDF5 Tutorial287 C Example: Create Dataset 56 file_id = H5Fcreate(…); 57 /* 58 * Create the dataspace for the dataset. 59 */ 60 dimsf[0] = NX; 61 dimsf[1] = NY; 62 filespace = H5Screate_simple(RANK, dimsf, NULL); 63 64 /* 65 * Create the dataset with default properties collective. 66 */ ->67 dset_id = H5Dcreate(file_id, “dataset1”, H5T_NATIVE_INT, 68 filespace, H5P_DEFAULT); 70 H5Dclose(dset_id); 71 /* 72 * Close the file. 73 */ 74 H5Fclose(file_id);

289 March 9, 200910th International LCI Conference - HDF5 Tutorial288 F90 Example: Create Dataset 43 CALL h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error, access_prp = plist_id) 73 CALL h5screate_simple_f(rank, dimsf, filespace, error) 76 ! 77 ! Create the dataset with default properties. 78 ! ->79 CALL h5dcreate_f(file_id, “dataset1”, H5T_NATIVE_INTEGER, filespace, dset_id, error) 90 ! 91 ! Close the dataset. 92 CALL h5dclose_f(dset_id, error) 93 ! 94 ! Close the file. 95 CALL h5fclose_f(file_id, error)

290 March 9, 200910th International LCI Conference - HDF5 Tutorial289 Accessing a Dataset All processes that have opened dataset may do collective I/O Each process may do independent and arbitrary number of data I/O access calls C: H5Dwrite and H5Dread F90: h5dwrite_f and h5dread_f

291 March 9, 200910th International LCI Conference - HDF5 Tutorial290 Programming model for dataset access Create and set dataset transfer property C: H5Pset_dxpl_mpio H5FD_MPIO_COLLECTIVE H5FD_MPIO_INDEPENDENT (default) F90: h5pset_dxpl_mpio_f H5FD_MPIO_COLLECTIVE_F H5FD_MPIO_INDEPENDENT_F (default) Access dataset with the defined transfer property

292 March 9, 200910th International LCI Conference - HDF5 Tutorial291 C Example: Collective write 95 /* 96 * Create property list for collective dataset write. 97 */ 98 plist_id = H5Pcreate(H5P_DATASET_XFER); ->99 H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE); 100 101 status = H5Dwrite(dset_id, H5T_NATIVE_INT, 102 memspace, filespace, plist_id, data);

293 March 9, 200910th International LCI Conference - HDF5 Tutorial292 F90 Example: Collective write 88 ! Create property list for collective dataset write 89 ! 90 CALL h5pcreate_f(H5P_DATASET_XFER_F, plist_id, error) ->91 CALL h5pset_dxpl_mpio_f(plist_id, & H5FD_MPIO_COLLECTIVE_F, error) 92 93 ! 94 ! Write the dataset collectively. 95 ! 96 CALL h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, data, & error, & file_space_id = filespace, & mem_space_id = memspace, & xfer_prp = plist_id)

294 March 9, 200910th International LCI Conference - HDF5 Tutorial293 Writing and Reading Hyperslabs Distributed memory model: data is split among processes PHDF5 uses HDF5 hyperslab model Each process defines memory and file hyperslabs Each process executes partial write/read call Collective calls Independent calls

295 March 9, 200910th International LCI Conference - HDF5 Tutorial294 Set up the Hyperslab for Read/Write H5Sselect_hyperslab( filespace,H5S_SELECT_SET, offset, stride, count, block )

296 March 9, 200910th International LCI Conference - HDF5 Tutorial295 P0 P1 File Example 1: Writing dataset by rows P2 P3

297 March 9, 200910th International LCI Conference - HDF5 Tutorial296 Writing by rows: Output of h5dump HDF5 "SDS_row.h5" { GROUP "/" { DATASET "IntArray" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 8, 5 ) / ( 8, 5 ) } DATA { 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13 }

298 March 9, 200910th International LCI Conference - HDF5 Tutorial297 Memory File Example 1: Writing dataset by rows count[0] = dimsf[0]/mpi_size count[1] = dimsf[1]; offset[0] = mpi_rank * count[0]; /* = 2 */ offset[1] = 0; count[0] count[1] offset[0] offset[1] Process 1

299 March 9, 200910th International LCI Conference - HDF5 Tutorial298 Example 1: Writing dataset by rows 71 /* 72 * Each process defines dataset in memory and * writes it to the hyperslab 73 * in the file. 74 */ 75 count[0] = dimsf[0]/mpi_size; 76 count[1] = dimsf[1]; 77 offset[0] = mpi_rank * count[0]; 78 offset[1] = 0; 79 memspace = H5Screate_simple(RANK,count,NULL); 80 81 /* 82 * Select hyperslab in the file. 83 */ 84 filespace = H5Dget_space(dset_id); 85 H5Sselect_hyperslab(filespace, H5S_SELECT_SET,offset,NULL,count,NULL);

300 March 9, 200910th International LCI Conference - HDF5 Tutorial299 P0 P1 File Example 2: Writing dataset by columns

301 March 9, 200910th International LCI Conference - HDF5 Tutorial300 Writing by columns: Output of h5dump HDF5 "SDS_col.h5" { GROUP "/" { DATASET "IntArray" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 8, 6 ) / ( 8, 6 ) } DATA { 1, 2, 10, 20, 100, 200, 1, 2, 10, 20, 100, 200 }

302 March 9, 200910th International LCI Conference - HDF5 Tutorial301 Example 2: Writing dataset by column Process 1 Process 0 File Memory block[1] block[0] P0 offset[1] P1 offset[1] stride[1] dimsm[0] dimsm[1]

303 March 9, 200910th International LCI Conference - HDF5 Tutorial302 Example 2: Writing dataset by column 85 /* 86 * Each process defines a hyperslab in * the file 88 */ 89 count[0] = 1; 90 count[1] = dimsm[1]; 91 offset[0] = 0; 92 offset[1] = mpi_rank; 93 stride[0] = 1; 94 stride[1] = 2; 95 block[0] = dimsf[0]; 96 block[1] = 1; 97 98 /* 99 * Each process selects a hyperslab. 100 */ 101 filespace = H5Dget_space(dset_id); 102 H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, stride, count, block);

304 March 9, 200910th International LCI Conference - HDF5 Tutorial303 Example 3: Writing dataset by pattern Process 0 Process 2 File Process 3 Process 1 Memory

305 March 9, 200910th International LCI Conference - HDF5 Tutorial304 Writing by Pattern: Output of h5dump HDF5 "SDS_pat.h5" { GROUP "/" { DATASET "IntArray" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 8, 4 ) / ( 8, 4 ) } DATA { 1, 3, 1, 3, 2, 4, 2, 4, 1, 3, 1, 3, 2, 4, 2, 4, 1, 3, 1, 3, 2, 4, 2, 4, 1, 3, 1, 3, 2, 4, 2, 4 }

306 March 9, 200910th International LCI Conference - HDF5 Tutorial305 Process 2 File Example 3: Writing dataset by pattern offset[0] = 0; offset[1] = 1; count[0] = 4; count[1] = 2; stride[0] = 2; stride[1] = 2; Memory stride[0] stride[1] offset[1] count[1]

307 March 9, 200910th International LCI Conference - HDF5 Tutorial306 Example 3: Writing by pattern 90 /* Each process defines dataset in memory and 91 * writes it to the hyperslab in the file. 92 */ 93 count[0] = 4; 94 count[1] = 2; 95 stride[0] = 2; 96 stride[1] = 2; 97 if(mpi_rank == 0) { 98 offset[0] = 0; 99 offset[1] = 0; 100 } 101 if(mpi_rank == 1) { 102 offset[0] = 1; 103 offset[1] = 0; 104 } 105 if(mpi_rank == 2) { 106 offset[0] = 0; 107 offset[1] = 1; 108 } 109 if(mpi_rank == 3) { 110 offset[0] = 1; 111 offset[1] = 1; 112 }

308 March 9, 200910th International LCI Conference - HDF5 Tutorial307 P0P2 File Example 4: Writing dataset by chunks P1P3

309 March 9, 200910th International LCI Conference - HDF5 Tutorial308 Writing by Chunks: Output of h5dump HDF5 "SDS_chnk.h5" { GROUP "/" { DATASET "IntArray" { DATATYPE H5T_STD_I32BE DATASPACE SIMPLE { ( 8, 4 ) / ( 8, 4 ) } DATA { 1, 1, 2, 2, 3, 3, 4, 4, 3, 3, 4, 4 }

310 March 9, 200910th International LCI Conference - HDF5 Tutorial309 Example 4: Writing dataset by chunks File Process 2: Memory block[0] = chunk_dims[0]; block[1] = chunk_dims[1]; offset[0] = chunk_dims[0]; offset[1] = 0; chunk_dims[0] chunk_dims[1] block[0] block[1] offset[0] offset[1]

311 March 9, 200910th International LCI Conference - HDF5 Tutorial310 Example 4: Writing by chunks 97 count[0] = 1; 98 count[1] = 1 ; 99 stride[0] = 1; 100 stride[1] = 1; 101 block[0] = chunk_dims[0]; 102 block[1] = chunk_dims[1]; 103 if(mpi_rank == 0) { 104 offset[0] = 0; 105 offset[1] = 0; 106 } 107 if(mpi_rank == 1) { 108 offset[0] = 0; 109 offset[1] = chunk_dims[1]; 110 } 111 if(mpi_rank == 2) { 112 offset[0] = chunk_dims[0]; 113 offset[1] = 0; 114 } 115 if(mpi_rank == 3) { 116 offset[0] = chunk_dims[0]; 117 offset[1] = chunk_dims[1]; 118 }

312 March 9, 200910th International LCI Conference - HDF5 Tutorial311 Parallel HDF5 Intermediate Tutorial

313 March 9, 200910th International LCI Conference - HDF5 Tutorial312 Outline Performance Parallel tools

314 March 9, 200910th International LCI Conference - HDF5 Tutorial313 My PHDF5 Application I/O is slow If my application I/O performance is slow, what can I do? Use larger I/O data sizes Independent vs. Collective I/O Specific I/O system hints Increase Parallel File System capacity

315 March 9, 200910th International LCI Conference - HDF5 Tutorial314 Write Speed vs. Block Size

316 March 9, 200910th International LCI Conference - HDF5 Tutorial315 Independent vs. Collective Access User reported Independent data transfer mode was much slower than the Collective data transfer mode Data array was tall and thin: 230,000 rows by 6 columns : 230,000 rows :

317 Debug Slow Parallel I/O Speed(1) Writing to one dataset Using 4 processes == 4 columns data type is 8-byte doubles 4 processes, 1000 rows == 4x1000x8 = 32,000 bytes % mpirun -np 4 ./a.out i t 1000 Execution time: 1.783798 s. % mpirun -np 4 ./a.out i t 2000 Execution time: 3.838858 s. # Difference of 2 seconds for 1000 more rows = 32,000 Bytes. # A speed of 16KB/Sec!!! Way too slow. March 9, 200918810th International LCI Conference - HDF5 Tutorial316

318 Debug Slow Parallel I/O Speed(2) Build a version of PHDF5 with./configure --enable-debug --enable-parallel … This allows the tracing of MPIO I/O calls in the HDF5 library. E.g., to trace MPI_File_read_xx and MPI_File_write_xx calls % setenv H5FD_mpio_Debug “rw” March 9, 200910th International LCI Conference - HDF5 Tutorial317

319 Debug Slow Parallel I/O Speed(3) % setenv H5FD_mpio_Debug ’rw’ % mpirun -np 4 ./a.out i t 1000 # Indep.; contiguous. in H5FD_mpio_write mpi_off=0 size_i=96 in H5FD_mpio_write mpi_off=2056 size_i=8 in H5FD_mpio_write mpi_off=2048 size_i=8 in H5FD_mpio_write mpi_off=2072 size_i=8 in H5FD_mpio_write mpi_off=2064 size_i=8 in H5FD_mpio_write mpi_off=2088 size_i=8 in H5FD_mpio_write mpi_off=2080 size_i=8 … # total of 4000 of this little 8 bytes writes == 32,000 bytes. March 9, 200910th International LCI Conference - HDF5 Tutorial318

320 March 9, 200910th International LCI Conference - HDF5 Tutorial319 Independent calls are many and small Each process writes one element of one row, skips to next row, write one element, so on. Each process issues 230,000 writes of 8 bytes each. Not good==just like many independent cars driving to work, waste gas, time, total traffic jam. : 230,000 rows :

321 Debug Slow Parallel I/O Speed (4) % setenv H5FD_mpio_Debug ’rw’ % mpirun -np 4 ./a.out i h 1000 # Indep., Chunked. in H5FD_mpio_write mpi_off=0 size_i=96 in H5FD_mpio_write mpi_off=3688 size_i=8000 in H5FD_mpio_write mpi_off=11688 size_i=8000 in H5FD_mpio_write mpi_off=27688 size_i=8000 in H5FD_mpio_write mpi_off=19688 size_i=8000 in H5FD_mpio_write mpi_off=96 size_i=40 in H5FD_mpio_write mpi_off=136 size_i=544 in H5FD_mpio_write mpi_off=680 size_i=120 in H5FD_mpio_write mpi_off=800 size_i=272 … Execution time: 0.011599 s. March 9, 200910th International LCI Conference - HDF5 Tutorial320

322 March 9, 200910th International LCI Conference - HDF5 Tutorial321 Use Collective Mode or Chunked Storage Collective mode will combine many small independent calls into few but bigger calls==like people going to work by trains collectively. Chunks of columns speeds up too==like people live and work in suburbs to reduce overlapping traffics. : 230,000 rows :

323 March 9, 200910th International LCI Conference - HDF5 Tutorial322 Independent vs. Collective write 6 processes, IBM p-690, AIX, GPFS
# of Rows | Data Size (MB) | Independent (Sec.) | Collective (Sec.)
16384 | 0.25 | 8.26 | 1.72
32768 | 0.50 | 65.12 | 1.80
65536 | 1.00 | 108.20 | 2.68
122918 | 1.88 | 276.57 | 3.11
150000 | 2.29 | 528.15 | 3.63
180300 | 2.75 | 881.39 | 4.12

324 March 9, 200910th International LCI Conference - HDF5 Tutorial323 Independent vs. Collective write (cont.)

325 March 9, 200910th International LCI Conference - HDF5 Tutorial324 Effects of I/O Hints: IBM_largeblock_io GPFS at LLNL Blue 4 nodes, 16 tasks Total data size 1024MB I/O buffer size 1MB

326 March 9, 200910th International LCI Conference - HDF5 Tutorial325 GPFS at LLNL ASCI Blue machine 4 nodes, 16 tasks Total data size 1024MB I/O buffer size 1MB Effects of I/O Hints: IBM_largeblock_io

327 March 9, 200910th International LCI Conference - HDF5 Tutorial326 Parallel Tools ph5diff Parallel version of the h5diff tool h5perf Performance measuring tools showing I/O performance for different I/O API

328 March 9, 200910th International LCI Conference - HDF5 Tutorial327 ph5diff A parallel version of the h5diff tool Supports all features of h5diff An MPI parallel tool The manager process (proc 0) coordinates the remaining processes (workers) to “diff” one dataset at a time; it collects any output from each worker and prints it out. Works best if there are many datasets in the two files with few differences. Available in v1.8.

329 March 9, 200910th International LCI Conference - HDF5 Tutorial328 h5perf An I/O performance measurement tool Tests 3 file I/O APIs: POSIX I/O (open/write/read/close…) MPIO (MPI_File_{open,write,read,close}) PHDF5 H5Pset_fapl_mpio (using MPI-IO) H5Pset_fapl_mpiposix (using POSIX I/O) An indication of I/O speed upper limits

330 March 9, 200910th International LCI Conference - HDF5 Tutorial329 h5perf: Some features Check (-c) verify data correctness Added 2-D chunk patterns in v1.8 -h shows the help page.

331 March 9, 200910th International LCI Conference - HDF5 Tutorial330 h5perf: example output 1/3 %mpirun -np 4 h5perf# Ran in a Linux system Number of processors = 4 Transfer Buffer Size: 131072 bytes, File size: 1.00 MBs # of files: 1, # of datasets: 1, dataset size: 1.00 MBs IO API = POSIX Write (1 iteration(s)): Maximum Throughput: 18.75 MB/s Average Throughput: 18.75 MB/s Minimum Throughput: 18.75 MB/s Write Open-Close (1 iteration(s)): Maximum Throughput: 10.79 MB/s Average Throughput: 10.79 MB/s Minimum Throughput: 10.79 MB/s Read (1 iteration(s)): Maximum Throughput: 2241.74 MB/s Average Throughput: 2241.74 MB/s Minimum Throughput: 2241.74 MB/s Read Open-Close (1 iteration(s)): Maximum Throughput: 756.41 MB/s Average Throughput: 756.41 MB/s Minimum Throughput: 756.41 MB/s

332 March 9, 200910th International LCI Conference - HDF5 Tutorial331 h5perf: example output 2/3 %mpirun -np 4 h5perf … IO API = MPIO Write (1 iteration(s)): Maximum Throughput: 611.95 MB/s Average Throughput: 611.95 MB/s Minimum Throughput: 611.95 MB/s Write Open-Close (1 iteration(s)): Maximum Throughput: 16.89 MB/s Average Throughput: 16.89 MB/s Minimum Throughput: 16.89 MB/s Read (1 iteration(s)): Maximum Throughput: 421.75 MB/s Average Throughput: 421.75 MB/s Minimum Throughput: 421.75 MB/s Read Open-Close (1 iteration(s)): Maximum Throughput: 109.22 MB/s Average Throughput: 109.22 MB/s Minimum Throughput: 109.22 MB/s

333 March 9, 200910th International LCI Conference - HDF5 Tutorial332 h5perf: example output 3/3 %mpirun -np 4 h5perf … IO API = PHDF5 (w/MPI-I/O driver) Write (1 iteration(s)): Maximum Throughput: 304.40 MB/s Average Throughput: 304.40 MB/s Minimum Throughput: 304.40 MB/s Write Open-Close (1 iteration(s)): Maximum Throughput: 15.14 MB/s Average Throughput: 15.14 MB/s Minimum Throughput: 15.14 MB/s Read (1 iteration(s)): Maximum Throughput: 1718.27 MB/s Average Throughput: 1718.27 MB/s Minimum Throughput: 1718.27 MB/s Read Open-Close (1 iteration(s)): Maximum Throughput: 78.06 MB/s Average Throughput: 78.06 MB/s Minimum Throughput: 78.06 MB/s Transfer Buffer Size: 262144 bytes, File size: 1.00 MBs # of files: 1, # of datasets: 1, dataset size: 1.00 MBs

334 March 9, 200910th International LCI Conference - HDF5 Tutorial333 Useful Parallel HDF Links Parallel HDF information site http://www.hdfgroup.org/HDF5/PHDF5/ Parallel HDF5 tutorial available at http://www.hdfgroup.org/HDF5/Tutor/ HDF Help email address help@hdfgroup.org

335 March 9, 200910th International LCI Conference - HDF5 Tutorial334 Questions?

336 Parallel I/O Performance Study (preliminary results) Albert Cheng The HDF Group March 9, 200910th International LCI Conference - HDF5 Tutorial335

337 Introduction Parallel performance affected by the I/O access pattern, file system, and MPI communication modes. Determination of interaction of these elements provides hints for improving performance. Study presents four test cases using h5perf and h5perf_serial. h5perf has been extended to support parallel testing of 2D datasets. h5perf_serial, based on h5perf, allows serial testing of n-dimensional datasets and various file drivers. Testing includes various combinations of MPI communication modes and HDF5 storage layouts. Finally, we make recommendations that can improve the I/O performance for specific patterns. March 9, 200910th International LCI Conference - HDF5 Tutorial336

338 Testing Systems and Configuration
System | Architecture | File System | MPI Implementation
abe | Linux Cluster with Intel 64 | Lustre | MVAPICH2 1.0.2p1 Message Passing with Intel compiler
cobalt | ccNUMA with Itanium 2 | CXFS | SGI Message Passing Toolkit 1.16
mercury | Linux Cluster with Itanium 2 | GPFS | MPICH Myrinet 1.2.5..10, GM 2.0.8, Intel 8.0
Processors: 4
Dataset Size: 64K×64K (4GB)
I/O Selection: 64MB per processor (shape depends on test case)
API: HDF5 v1.8.1 (default build options)
Iterations: 3
MPI/IO Type: Collective / Independent
Storage Layout: Contiguous / Chunked (chunk size depends on test case)
March 9, 200910th International LCI Conference - HDF5 Tutorial337

339 HDF5 Storage Layouts Contiguous HDF5 assigns a static contiguous region of storage for raw data. Dataset File storage March 9, 200910th International LCI Conference - HDF5 Tutorial338

340 HDF5 Storage Layouts Chunked HDF5 defines separate regions of storage for raw data named chunks, which are pre-allocated in row-major order when a file is created in parallel. This layout is only valid when a file is created and the chunks are pre-allocated. Further modification of the file may cause the chunks to be arranged differently. March 9, 200910th International LCI Conference - HDF5 Tutorial339 [Diagram: chunks C0–C3 of the dataset mapped to consecutive blocks in file storage.]

341 Test Cases Case A The transfer selections extend over the entire columns with a size of 64K×1K. If the storage is chunked, the size of the chunks is 1K×1K. The selections are interleaved horizontally with respect to the processors. March 9, 200910th International LCI Conference - HDF5 Tutorial340 P0P1P2P3P0P1P2P3P0P1P2P3 64K 1K …

342 Test Cases Case B The transfer selection only spans half the columns with a size of 32K×2K. If the storage is chunked, the size of the chunks is 2K×2K. The selections are interleaved horizontally with respect to the processors. March 9, 200910th International LCI Conference - HDF5 Tutorial341 P0P1P2P3P0P1P2P3P0P1P2P3 32K … 2K P0P1P2P3P0P1P2P3P0P1P2P3… 64K

343 Test Cases Case C The transfer selections only span half the rows with a size of 2K×32K. If the storage is chunked, the size of the chunks is 2K×2K. The lower dimension (column) is evenly divided among the processors. March 9, 200910th International LCI Conference - HDF5 Tutorial342 P0 … P1 … P2 … P3 … P0 … P1 … P2 … P3 … 64K 32K 2K

344 Test Cases Case D The transfer selection extends over the entire rows with a size of 1K×64K. If the storage is chunked, the size of the chunks is 1K×1K. The lower dimension (column) is evenly divided among the processors. March 9, 200910th International LCI Conference - HDF5 Tutorial343 P0 … P1 … P2 … P3 … 64K 1K 64K P3

345 Access Patterns Contiguous Each processor retrieves a separate region of contiguous storage. An example of this pattern is case D using contiguous storage. Non-contiguous Separate regions are still assigned to each processor but such regions contain gaps. Examples of this pattern include case C using contiguous storage, and collective cases C-D using chunked storage. P0P1P2P3 P0 … P1 … P2 … P3... P0 March 9, 200910th International LCI Conference - HDF5 Tutorial344

346 Access Patterns Interleaved (or overlapped) Each processor writes into many portions that are interleaved with respect to the other processors. For example, using contiguous storage along with cases A-B generates Another instance results from using chunked storage with collective cases A-B P0P1P2P3P0P1P2P3 … P0P1P2P3P0P1P2P3 … March 9, 200910th International LCI Conference - HDF5 Tutorial345

347 Performance Results and Analysis The results correspond to maximum throughput values of Write Open-Close operations during 3 iterations. Serial throughput is the performance baseline since our objective is to determine how parallel access can improve performance. Unlike GPFS and CXFS, Lustre does not stripe files by default. To enable parallel access, the directory / file must be striped using the command lfs. March 9, 200910th International LCI Conference - HDF5 Tutorial346

348 I/O Performance in Lustre March 9, 200910th International LCI Conference - HDF5 Tutorial347
COLLECTIVE (write throughput)
Layout | Non-striped A | B | C | D | Striped A | B | C | D
Contiguous | 11.66 | 23.68 | 46.12 | 36.67 | 25.35 | 50.26 | 42.67 | 119.26
Chunked | 179.85 | 117.31 | 124.88 | 106.95 | 180.33 | 224.28 | 86.88 | 93.45
INDEPENDENT (write throughput)
Layout | Non-striped A | B | C | D | Striped A | B | C | D
Contiguous | 5.92 | 8.17 | 20.98 | 304.06 | 6.7 | 10.81 | 73.45 | 298.09
Chunked | 219.15 | 328.04 | 12.15 | 8.16 | 158.91 | 33.27 | 12.94 | 10.51

349 I/O Performance in Lustre Striping partitions the file space into stripes and assigns them to several Object Storage Targets (OSTs) in round-robin fashion. Since each OST stores portions of the file that are different from the other OSTs, they all can access the file in parallel. The default configuration on abe uses a stripe size of 4MB and a stripe count of 16. Striping improves performance when the I/O request of each processor spans several stripes (and OSTs) after MPI aggregations, if any. When the processors make small independent I/O requests that are practically contiguous, as in cases A-B using chunked storage, a single OST can provide better performance due to asynchronous operations. March 9, 200910th International LCI Conference - HDF5 Tutorial348

350 I/O Performance March 9, 200910th International LCI Conference - HDF5 Tutorial349

351 I/O Performance March 9, 200910th International LCI Conference - HDF5 Tutorial350

352 I/O Performance March 9, 200910th International LCI Conference - HDF5 Tutorial351

353 Performance of Serial I/O Access using contiguous storage has the steepest performance trend as the cases change from A to D. When using chunked storage, the throughput remains almost constant at the upper bound. The allocation of chunks at the time they are written causes the access pattern to be virtually contiguous regardless of the test cases. March 9, 200910th International LCI Conference - HDF5 Tutorial352

354 Performance of Independent I/O Processors perform their I/O requests independently from each other. For contiguous storage, performance improves as the tests move from A to D. For chunked storage, throughput is high for interleaved cases A-B since writing blocks (chunks) become larger and caching is exploited. For cases C-D, the many writing requests (one per chunk) multiply the overhead due to unnecessary locking and caching in Lustre and CXFS. Unlike these file systems, GPFS has shown better scalability [1,2]. March 9, 200910th International LCI Conference - HDF5 Tutorial353

355 Performance of Collective I/O The participating processors coordinate and combine their many requests into fewer I/O operations reducing latency. Since the file space is evenly divided among the processors, no need for locking which reduces overhead [3]. For contiguous storage, performance is overall high but there is still an increasing trend as the cases change from A to D. For chunked storage, the performance is even higher with minor variations among the tests cases because several chunks can be written with a single I/O operation. March 9, 200910th International LCI Conference - HDF5 Tutorial354

356 Conclusion Important to determine the access pattern by analyzing the I/O requirements of the application and the storage implementation. For contiguous access patterns, independent access is preferable because it omits unnecessary overhead of collective calls. For non-contiguous patterns, there is little difference between independent and collective access. However, writing many chunks in independent mode may be expensive in Lustre and CXFS if caching is not exploited. For interleaved access pattern, collective mode is usually faster. For all the access patterns, collective mode and chunk storage provide the combination that yields the highest average performance. March 9, 200910th International LCI Conference - HDF5 Tutorial355

357 References 1.J. Borrill, L. Oliker, J. Shalf, and H. Shan. Investigation of Leading HPC I/O Performance Using A Scientific-Application Derived Benchmark. In Proceedings of SC’07: High Performance Networking and Computing, Reno, NV, November 2007. 2.W. Liao, A. Ching, K. Coloma, A. Choudhary, and L. Ward. An Implementation and Evaluation of Client-Side File Caching for MPI-IO. In Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2007, IEEE International Volume, Issue 26-30, pages 1-10, March 2007. 3.R. Thakur, W. Gropp, and E. Lusk. Data Sieving and Collective I/O in ROMIO. In Proceedings of the 7th Symposium on the Frontiers of Massively Parallel Computation. IEEE Computer Society Press, February 1999. March 9, 200910th International LCI Conference - HDF5 Tutorial356

358 March 9, 200910th International LCI Conference - HDF5 Tutorial357 Questions?

