Published by Sophie Cooper. Modified over 9 years ago.
1
April 28, 2008, LCI Tutorial
Introduction to HDF5 Tools
Tutorial Part II
2
Outline
  Overview of HDF5 tools
  Using the tools to troubleshoot problems
3
HDF5 command-line tools
  Readers: h5dump, h5diff, h5ls
    1.8 tools: h5check, h5stat
  Writers: h5repack, h5repart, h5import, h5jam/h5unjam
    1.8 tools: h5copy, h5mkgrp
  Converters: h4toh5, h5toh4, gif2h5, h52gif
4
h5dump
  Dumps the content of an HDF5 file to standard output and, optionally, to one of the following types of files:
    1. ASCII text file
    2. XML file
    3. Binary file
  Flags to remember:
    -H  print header information
    -p  print objects' properties
    -b  export data in binary form
    -o  export data to a file (text by default)
    -y  skip printing indices
    -w  specify line width
5
h5dump -H SDS.h5

  HDF5 "SDS.h5" {
  GROUP "/" {
     GROUP "Floats" {
        DATASET "FloatArray" {
           DATATYPE  H5T_IEEE_F32LE
           DATASPACE  SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
        }
     }
     DATASET "IntArray" {
        DATATYPE  H5T_STD_I32LE
        DATASPACE  SIMPLE { ( 5, 6 ) / ( 5, 6 ) }
     }
  }
  }
6
h5dump -d /Floats/FloatArray SDS.h5

  HDF5 "SDS.h5" {
  DATASET "/Floats/FloatArray" {
     DATATYPE  H5T_IEEE_F32LE
     DATASPACE  SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
     DATA {
     (0,0): 0.01, 0.02, 0.03,
     (1,0): 0.1, 0.2, 0.3,
     (2,0): 1, 2, 3,
     (3,0): 10, 20, 30
     }
  }
  }
7
h5dump -x SDS.h5
  (XML output)
8
h5dump binary output
  -b F, --binary=F
  The form of the binary output (F):
    MEMORY  memory type: data in the output file has the same datatype as in memory
    FILE    disk file type: data in the output file has the same datatype as the corresponding dataset in the HDF5 file
    LE      predefined little-endian type, for example H5T_IEEE_F64LE
    BE      predefined big-endian type, for example H5T_STD_I32BE
9
h5dump -d /IntArray -o out_le.bin -b LE SDS.h5
od --width=24 -t x4 out_le.bin

  0000000 00000000 00000001 00000002 00000003 00000004 00000005
  0000030 0000000a 0000000b 0000000c 0000000d 0000000e 0000000f
  0000060 00000014 00000015 00000016 00000017 00000018 00000019
  0000110 0000001e 0000001f 00000020 00000021 00000022 00000023
  0000140 00000028 00000029 0000002a 0000002b 0000002c 0000002d

Dumps the 32-bit integer dataset IntArray from SDS.h5 to the little-endian binary file out_le.bin.
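A file written with -b holds only the raw element values, so whoever reads it back has to supply the element type and the dimensions. The following is a minimal sketch of that step, assuming 32-bit signed integers and the 5 x 6 shape reported by h5dump -H. SDS.h5 is not available here, so the sketch first writes a stand-in out_le.bin containing the values visible in the od listing; the helper name read_le_int32 is hypothetical.

```python
import os
import struct
import tempfile

ROWS, COLS = 5, 6  # shape reported by h5dump -H for IntArray


def read_le_int32(path, rows, cols):
    """Read rows*cols little-endian 32-bit integers and return them as a list of rows."""
    with open(path, "rb") as f:
        raw = f.read()
    flat = struct.unpack(f"<{rows * cols}i", raw)
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


# Stand-in for the file produced by h5dump (assumption: element [i][j] = 10*i + j,
# which matches the hexadecimal values in the od listing).
values = [10 * i + j for i in range(ROWS) for j in range(COLS)]
path = os.path.join(tempfile.mkdtemp(), "out_le.bin")
with open(path, "wb") as f:
    f.write(struct.pack(f"<{len(values)}i", *values))

table = read_le_int32(path, ROWS, COLS)
for row in table:
    print(row)
```

The "<" in the format string forces little-endian decoding regardless of the machine running the script, which mirrors the choice made with -b LE.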
10
h5diff
  Using h5diff, you can:
    compare two objects in the same file
    compare two objects between two files
    compare all objects between two files
11
h5diff SDS.h5 SDS2.h5

  Dataset: </IntArray> and </IntArray>
  5 differences found
12
h5diff SDS.h5 SDS2.h5 -r /IntArray

  Dataset: </IntArray> and </IntArray>
  position        IntArray        IntArray        difference
  ------------------------------------------------------------
  [ 0 0 ]         0               10              10
  [ 1 0 ]         10              100             90
  [ 2 0 ]         20              200             180
  [ 3 0 ]         30              300             270
  [ 4 0 ]         40              400             360
  5 differences found
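The report format (position, value in each file, absolute difference) can be reproduced on plain nested lists. This is only an illustrative sketch, not how h5diff is implemented: the function name diff_2d is hypothetical, and the two arrays are stand-ins built from the values shown in the report.

```python
def diff_2d(a, b):
    """Return (position, a_value, b_value, difference) for every differing element."""
    found = []
    for i, (row_a, row_b) in enumerate(zip(a, b)):
        for j, (x, y) in enumerate(zip(row_a, row_b)):
            if x != y:
                found.append(((i, j), x, y, abs(x - y)))
    return found


# Stand-in data (assumption): IntArray[i][j] = 10*i + j in the first file, with the
# first column replaced in the second file by the values shown in the report.
first = [[10 * i + j for j in range(6)] for i in range(5)]
second = [row[:] for row in first]
for i, v in enumerate([10, 100, 200, 300, 400]):
    second[i][0] = v

report = diff_2d(first, second)
for (i, j), x, y, d in report:
    print(f"[ {i} {j} ]  {x:<8} {y:<8} {d}")
print(f"{len(report)} differences found")
```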
13
h5repack
  Copies an HDF5 file to a new file, with or without compression and chunking. It can:
    Remove unused space
    Apply a compression filter
    Apply a layout
14
h5repack: Applying filters
  -f FILTER
    GZIP  apply GZIP compression
    SZIP  apply SZIP compression
    SHUF  apply the HDF5 shuffle filter
    FLET  apply the HDF5 checksum filter
    NBIT  apply NBIT compression
    SOFF  apply the HDF5 Scale/Offset filter
    NONE  remove all filters
  For example:
    h5repack -i SDS2.h5 -o SDS2_compressed.h5 -f /IntArray:GZIP=9
  Remember that if your data is smaller than 1K, compression will not be applied; see the -m flag.
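The shuffle filter is normally paired with a compressor such as GZIP: it regroups the bytes of the elements so that all first bytes come together, then all second bytes, and so on, which tends to make slowly varying numeric data more repetitive. The sketch below imitates that idea with Python's zlib module on made-up sample data. It is an illustration under those assumptions, not the HDF5 library's implementation, and the function name shuffle is chosen here for the example.

```python
import struct
import zlib


def shuffle(raw, elem_size):
    """Regroup bytes so that byte 0 of every element comes first, then byte 1, and so on."""
    return b"".join(raw[k::elem_size] for k in range(elem_size))


values = list(range(10000))                      # smoothly varying sample data
raw = struct.pack(f"<{len(values)}i", *values)   # 32-bit little-endian integers

plain_size = len(zlib.compress(raw, 9))
shuffled_size = len(zlib.compress(shuffle(raw, 4), 9))
print("raw bytes:", len(raw))
print("compressed without shuffle:", plain_size)
print("compressed with shuffle:", shuffled_size)
```

How much shuffling helps depends heavily on the data, so it is worth comparing file sizes after repacking with and without SHUF.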
15
h5repack: Data layout
  -l LAYOUT
    CHUNK  apply chunked layout
    COMPA  apply compact layout
    CONTI  apply contiguous layout
  For example:
    h5repack -i SDS.h5 -o SDS_chunk.h5 -l /Floats/FloatArray,/IntArray:CHUNK=2x3
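To see what CHUNK=2x3 means for the two datasets in the example, the number of chunks along each dimension is the dimension size divided by the chunk size, rounded up. A small sketch of that arithmetic, using the dataset shapes shown earlier by h5dump -H (the function name chunk_grid is hypothetical):

```python
import math


def chunk_grid(dims, chunk):
    """Number of chunks along each dimension; edge chunks may be only partly filled."""
    return [math.ceil(d / c) for d, c in zip(dims, chunk)]


shapes = {"/Floats/FloatArray": (4, 3), "/IntArray": (5, 6)}
for name, dims in shapes.items():
    grid = chunk_grid(dims, (2, 3))
    print(name, dims, "->", grid, "=", math.prod(grid), "chunks")
```

The 5 x 6 dataset does not divide evenly by 2 along its first dimension, so its last row of chunks is only half used.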
16
h5repart
  Repartitions a file or a family of files.
  For example:
    h5repart -m 200m int16kx16k.h5 part200m%d.h5

  The 977 MB input file is split into:
    200 MB  part200m0.h5
    200 MB  part200m1.h5
    200 MB  part200m2.h5
    200 MB  part200m3.h5
    177 MB  part200m4.h5
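The member sizes follow from simple division: full members of the size given with -m, plus one member for the remainder, numbered from 0 through the %d placeholder in the output name. A sketch of that arithmetic, with sizes in MB as on the slide (the function name member_sizes is hypothetical):

```python
def member_sizes(total, limit):
    """Split `total` units into family members of at most `limit` units each."""
    full, rest = divmod(total, limit)
    return [limit] * full + ([rest] if rest else [])


sizes = member_sizes(977, 200)  # sizes in MB, as on the slide
for n, size in enumerate(sizes):
    print("part200m%d.h5" % n, size, "MB")
```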
17
h5import
  Imports binary/ASCII data into an HDF5 file.
    h5import infile -c config_file [infile -c config_file2...] -outfile outfile
  Example:
    h5import float5x4x2.txt -c First_set.conf -o First_set.h5

  Configuration file (First_set.conf):
    PATH work/First-set
    INPUT-CLASS TEXTFP
    RANK 3
    DIMENSION-SIZES 5 2 4
    OUTPUT-CLASS FP
    OUTPUT-SIZE 64
    OUTPUT-ARCHITECTURE IEEE
    OUTPUT-BYTE-ORDER LE
    CHUNKED-DIMENSION-SIZES 2 2 2
    MAXIMUM-DIMENSIONS 8 8 -1

  Resulting file:
    GROUP "/" {
       GROUP "work" {
          DATASET "First-set" {
             DATATYPE  H5T_IEEE_F64LE
             DATASPACE  SIMPLE { ( 5, 2, 4 ) / ( 8, 8, H5S_UNLIMITED ) }
             DATA {
             (0,0,0): 1.01, 1.02, 1.03, 1.04,
             (0,1,0): 1.11, 1.12, 1.13, 1.14,
             (1,0,0): 1.21, 1.22, 1.23, 1.24,
             (1,1,0): 1.31, 1.32, 1.33, 1.34,
             (2,0,0): 1.41, 1.42, 1.43, 1.44,
             (2,1,0): 1.51, 1.52, 1.53, 1.54,
             (3,0,0): 2.01, 2.02, 2.03, 2.04,
             (3,1,0): 2.11, 2.12, 2.13, 2.14,
             (4,0,0): 2.21, 2.22, 2.23, 2.24,
             (4,1,0): 2.31, 2.32, 2.33, 2.34
             }
          }
       }
    }
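The link between the flat text input and the (i,j,k) indices in the dump comes from RANK 3 and DIMENSION-SIZES 5 2 4: values are laid out in row-major order, with the last index varying fastest. The sketch below walks through that mapping in plain Python. The contents of float5x4x2.txt are not shown in the source, so the text here is a stand-in reconstructed from the dump, assumed to be whitespace-separated.

```python
import itertools

DIMS = (5, 2, 4)  # DIMENSION-SIZES from the configuration file

# Stand-in for float5x4x2.txt (assumption: whitespace-separated, row-major order).
text = """
1.01 1.02 1.03 1.04   1.11 1.12 1.13 1.14
1.21 1.22 1.23 1.24   1.31 1.32 1.33 1.34
1.41 1.42 1.43 1.44   1.51 1.52 1.53 1.54
2.01 2.02 2.03 2.04   2.11 2.12 2.13 2.14
2.21 2.22 2.23 2.24   2.31 2.32 2.33 2.34
"""

flat = [float(tok) for tok in text.split()]
if len(flat) != DIMS[0] * DIMS[1] * DIMS[2]:
    raise ValueError("number of values does not match DIMENSION-SIZES")

# Row-major (C order): itertools.product varies the last index fastest.
indices = itertools.product(*(range(d) for d in DIMS))
element = dict(zip(indices, flat))

for i, j in itertools.product(range(DIMS[0]), range(DIMS[1])):
    row = [element[(i, j, k)] for k in range(DIMS[2])]
    print(f"({i},{j},0):", ", ".join(str(v) for v in row))
```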
18
h5jam/h5unjam
  Adds or removes a file at the beginning of an HDF5 file (the user block).
  Example:
    h5jam adds text to the user block:
      h5jam -u test_ub.txt -i test_ub.h5
    h5unjam removes text from the user block:
      h5unjam -i test_ub.h5 -u out_ub.txt -o out_ub.h5
19
h5ls
  Lists selected information about file objects in the specified format.
  Example:
    h5ls -r SDS2.h5

    /Floats                  Group
    /Floats/DoubleArray      Dataset {10, 5}
    /Floats/FloatArray       Dataset {4, 3}
    /Floats/subs             Group
    /IntArray                Dataset {5, 6}
20
gif2h5 / h52gif
  gif2h5: converts a GIF file into HDF5
    gif2h5 apollo17_earth.gif apollo17_earth.h5
  h52gif: converts an HDF5 file into GIF
    h52gif apollo17_earth.h5 apollo17_earth2.gif -i /apollo17_earth.gif/Image0 -p "/apollo17_earth.gif/Global Palette"
21
h5copy
  Copies an object from one location to another, within a file or across files.
  Available in 1.8.0 and later.
  [Diagram: two file trees; node labels /, Floats, FloatArray, IntArray in the first and /, FloatArray in the second]
22
h5copy
  usage: h5copy [OPTIONS] [OBJECTS...]
    -i, --input        input file name
    -o, --output       output file name
    -s, --source       source object name
    -d, --destination  destination object name
    -f, --flag
      shallow  Copy only immediate members for groups
      soft     Expand soft links into new objects
      ext      Expand external links into new objects
      ref      Copy objects that are pointed to by references
      noattr   Copy object without copying attributes
23
h5copy
  Example:
    h5copy -i SDS.h5 -o SDS_cp.h5 -s /Floats/FloatArray -d /FloatArray
  [Diagram: SDS.h5 with /, Floats, FloatArray, IntArray; SDS_cp.h5 with /, FloatArray]
24
h5copy -f shallow
  [Diagram: a source tree and two copies of the group floats, one of them made with -f shallow; node labels: /, floats, integers, 64-bit, f32, i1, i2, f1, f2]
25
h5copy -f soft
  [Diagram: copying with and without -f soft; node labels: /, f1, dset_SL, /f1]
26
h5copy -f ref
  [Diagram: copying dset_ref with and without -f ref; node labels: /, d1, d2, dset_ref; values shown: 1895 763, 679 1287, 0 0]
27
h5stat
  Prints various statistics about an HDF5 file.
  Helps:
    To troubleshoot size overhead in HDF5 files
    To choose specific object properties and storage strategies
  Available in 1.8.0 and later.
28
h5check
  Verifies that an HDF5 file is encoded according to the HDF5 File Format Specification.
  Does not use the HDF5 library.
  Serves as a watchdog, checking that the HDF5 library implementation complies with the HDF5 File Format Specification.
  The tool is NOT part of the HDF5 source code distribution.
29
How to use it?
  h5check [-vn] file
    -vn  verboseness mode
      n=0  Terse: only prints whether the file is compliant or not
      n=1  Default: prints its progress and all errors found
      n=2  Verbose: prints everything it knows, usually for debugging
30
Example: a compliant file
  % h5check example1.h5

  VALIDATING example1.h5
  FOUND super block signature
  VALIDATING the super block at 0...
  VALIDATING the object header at 928...
  VALIDATING the btree at 384...
  FOUND btree signature.
  VALIDATING the local heap at 96...
  FOUND local heap signature.
  …
  Result: File is in compliance.
31
Example: a non-compliant file
  h5check invalid2.h5

  FOUND super block signature
  VALIDATING the super block at 0...
  VALIDATING the object header at 928...
  VALIDATING the btree at 384...
  FOUND btree signature.
  VALIDATING the SNOD at 1248...
  FOUND SNOD signature.
  VALIDATING the object header at 976...
  check_sym(at 1248): Errors from check_obj_header()
  decode_validate_messages(): Failure in type->decode().
  H5O_sdspace_decode(): Bad version number in simple dataspace message.
  VALIDATING the local heap at 96...
  FOUND local heap signature.
  Main(): Errors from check_obj_header().
  decode_validate_messages(): Failure in type->decode().
  H5O_attr_decode(): Can't decode attribute dataspace.
  H5O_sdspace_decode(): Bad version number in simple dataspace message.
  …
  Result: File is not in compliance.
32
Using HDF5 Tools for Performance Tuning and Troubleshooting
33
Introduction
  HDF5 tools can be very useful for performance tuning and troubleshooting:
    Discover objects and their properties in HDF5 files: h5dump -p
    Get file size overhead information: h5stat
    Get locations of the objects in a file: h5ls
    Discover differences: h5diff, h5ls
    Find the location of raw data: h5ls -var
34
h5stat
  Prints various statistics about an HDF5 file.
  Helps:
    To troubleshoot size overhead in HDF5 files
    To choose specific object properties and storage strategies
  To use:
    h5stat --help
    h5stat file.h5
  The full specification can be found at http://www.hdfgroup.uiuc.edu/RFC/HDF5/h5stat/
  Let us know if you need some "special" type of statistics.
35
h5stat
  Reports two types of statistics:
    High-level information about objects (examples):
      Number of different objects (groups, datasets, datatypes) in a file
      Number of unique datatypes
      Size of raw data in a file
    Information about objects' structural metadata:
      Sizes of structural metadata (total/free)
        Object headers, local and global heaps
        Sizes of B-trees
      Object header fragmentation
36
h5stat
  Examples of high-level information:

    File information
      # of unique groups: 10008
      # of unique datasets: 30
      # of unique named datatypes: 0
    ...
      Max. # of links to object: 1
      Max. depth of hierarchy: 4
      Max. # of objects in group: 19
    ...
    Group bins:
      # of groups of size 0: 10000
      # of groups of size 1 - 9: 7
      # of groups of size 10 - 99: 1
    ...
      Max. dimension size of 1-D datasets: 1643
    ...
    Dataset filters information:
      Number of datasets with
      ...
        SZIP filter: 2
      ...
        NBIT filter: 10
        USER-DEFINED filter: 1
37
h5stat
  Conclusions:
    There are a lot of empty groups in the file, which makes it a good candidate for the compact group feature (h5repack -l ….)
    Some datasets use "user-defined" filters and may not be readable by the HDF5 library.
    SZIP compression is needed to read some datasets.
  Oh… my application uses buffers of size 1024 to read data… No wonder it crashes on reading…
  Do I have all the filters needed to read the data?
38
h5stat
  Examples of structural metadata information:

    Object header size: (total/unused)
      Groups: 1808/72
      Datasets: 15792/832
    ………
    Dataset storage information:
      Total raw data size: 6140688
    ………
    Dataset datatype #3:
      Count (total/named) = (2/0)
      Size (desc./elmt) = (10/65535)
    Dataset datatype #4:
      Count (total/named) = (1/0)
      Size (desc./elmt) = (10/32000)
39
h5stat
  Conclusions:
    File size: 6228197
    About 1.5% overhead (not bad at all!)
    There are some elements of size 65535 and 32000.
  Oh… Is this really what I want? Should I use another datatype and take advantage of compression?
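The overhead figure can be checked from the two numbers reported above, assuming that overhead means everything in the file that is not raw data (file size minus the "Total raw data size" from h5stat). Under that assumption it comes out at about 1.4% of the file, in line with the rounded figure on the slide.

```python
file_size = 6228197       # bytes, file size quoted on the slide
raw_data_size = 6140688   # "Total raw data size" reported by h5stat

# Assumption: overhead = everything in the file that is not raw data.
overhead = file_size - raw_data_size
percent = 100.0 * overhead / file_size
print(f"{overhead} bytes of overhead, {percent:.1f}% of the file")
```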
40
Case study: Using HDF5 tools to debug a problem
  My application creates files on Windows with VS2005 and VS2003. I can read the VS2003 file but not the VS2005 one. h5dump reads both files OK and shows no differences. What am I doing wrong?

    h5diff good.h5 bad.h5
      Datatype: </Definitions/timespec> and </Definitions/timespec>
      1 differences found

    h5ls -var good.h5
      /Definitions/timespec Type
          Location: 0:1:0:900

    h5debug good.h5 900
      Message Information:
          Type class: compound
          Size: 8 bytes

    h5debug bad.h5 900
      Message Information:
          Type class: compound
          Size: 16 bytes
41
Case study: Using HDF5 tools to debug a problem
  Conclusions:
    The compound datatype "timespec" requires a different number of bytes on VS2005 (16 bytes; 2 x 8 bytes) and on VS2003 (8 bytes; 2 x 4 bytes).
  Oh… How do I read my data back? I assumed that my struct would need only 8 bytes for each element, but it needs 16 bytes on VS2005.
  I need the H5Tget_native_type function to find the type of my data in memory.
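The size mismatch can be illustrated without HDF5. The sketch below uses Python's struct module to model the two layouts described above, assuming a record of two integers that are 4 bytes each in one build and 8 bytes each in the other. It does not call H5Tget_native_type; it only shows why a reader that hard-codes the 8-byte layout fails on 16-byte records.

```python
import struct

# Standard ("=") sizes are used so the result does not depend on the machine running this.
layout_2x4 = struct.Struct("=ii")   # two 4-byte integers (the 8-byte record)
layout_2x8 = struct.Struct("=qq")   # two 8-byte integers (the 16-byte record)

print(layout_2x4.size, layout_2x8.size)

# A record written with the 16-byte layout cannot be decoded with the 8-byte one.
record = layout_2x8.pack(1, 2)
try:
    layout_2x4.unpack(record)
except struct.error as exc:
    print("mismatch:", exc)

# Decoding with the layout that matches the stored record works.
print(layout_2x8.unpack(record))
```

Asking the file for the stored type and deriving the in-memory type from it, rather than assuming a fixed struct size, avoids this class of error.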
42
Questions?
End of Part II