www.hdfgroup.org
The HDF Group: Introduction to HDF5, Session 7: Datatypes
Copyright © 2010 The HDF Group. All Rights Reserved.

Presentation transcript:

Introduction to HDF5, Session 7: Datatypes
The HDF Group

An HDF5 Datatype is…
• A description of the dataset element type
• Grouped into "classes":
  ◦ Atomic: integers, floating-point values
  ◦ Enumerated
  ◦ Compound: like C structs
  ◦ Array
  ◦ Opaque
  ◦ References
    ▪ Object: similar to a soft link
    ▪ Region: similar to a soft link to a dataset plus a selection
  ◦ Variable-length
    ▪ Strings: fixed and variable-length
    ▪ Sequences: similar to the Standard C++ vector class

HDF5 Datatypes
• HDF5 has a rich set of predefined datatypes and supports the creation of an unlimited variety of complex user-defined datatypes.
• Self-describing: datatype definitions are stored in the HDF5 file with the data.
• Datatype definitions include information such as byte order (endianness), size, and floating-point representation, so they fully describe how the data is stored and ensure portability across platforms.

Datatype Conversion
• Datatypes that are compatible but not identical are converted automatically when I/O is performed
• Compatible datatypes:
  ◦ All atomic datatypes are compatible
  ◦ Identically structured array, variable-length, and compound datatypes whose base type or fields are compatible
  ◦ Enumerated datatype values, on a "by name" basis
• Make datatypes identical for best performance
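The "by name" rule for enumerated datatypes can be sketched in plain C. This is a conceptual illustration only, not HDF5 code: the `enum_member_t` type, the two colour tables, and `enum_convert_by_name` are hypothetical names made up for this sketch. A source value is mapped to its name, and the name is then looked up in the destination type, so the two types may assign different numbers to the same name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One (name, value) pair of an enumerated type (hypothetical helper type). */
typedef struct {
    const char *name;
    int         value;
} enum_member_t;

/* Hypothetical source and destination enums: same names, different values. */
static const enum_member_t src_colors[] = { {"RED", 0}, {"GREEN", 1}, {"BLUE", 2} };
static const enum_member_t dst_colors[] = { {"BLUE", 10}, {"RED", 20}, {"GREEN", 30} };

/* Convert a value by name: source value -> name -> destination value.
 * Returns 0 on success, -1 if the value or its name has no match. */
int enum_convert_by_name(const enum_member_t *src, size_t nsrc,
                         const enum_member_t *dst, size_t ndst,
                         int src_value, int *dst_value)
{
    assert(src != NULL && dst != NULL && dst_value != NULL);
    for (size_t i = 0; i < nsrc; i++) {
        if (src[i].value != src_value)
            continue;
        for (size_t j = 0; j < ndst; j++) {
            if (strcmp(src[i].name, dst[j].name) == 0) {
                *dst_value = dst[j].value;
                return 0;
            }
        }
        return -1; /* the name exists only in the source type */
    }
    return -1; /* the value is not defined in the source type */
}
```

With these tables, converting the source value 0 ("RED") yields 20, and a value that the source type does not define is reported as an error rather than copied through.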

Datatype Conversion Example
[Figure: data written to and read from one HDF5 dataset on different platforms]
• IA32 platform: array of integers; native integer is little-endian, 4 bytes. Written with H5Dwrite; file datatype H5T_STD_I32LE (little-endian 4-byte integer).
• SPARC64 platform: array of integers; native integer is big-endian, 8 bytes. Read with H5Dread using H5T_NATIVE_INT.
• VAX platform: G-floating data, written with H5Dwrite.
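To make the kind of work hidden behind such a read concrete, here is a minimal plain-C sketch of one conversion from the example: a little-endian 4-byte signed integer widened to a big-endian 8-byte signed integer. It is not the library's implementation, and `convert_i32le_to_i64be` is a hypothetical name; the sketch only shows that both byte order and size change, with the sign preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one signed 32-bit little-endian integer (4 bytes in `in`)
 * to a signed 64-bit big-endian integer (8 bytes in `out`). */
void convert_i32le_to_i64be(const unsigned char in[4], unsigned char out[8])
{
    assert(in != NULL && out != NULL);

    /* Assemble the 32-bit pattern, least significant byte first. */
    uint32_t u = (uint32_t)in[0]
               | ((uint32_t)in[1] << 8)
               | ((uint32_t)in[2] << 16)
               | ((uint32_t)in[3] << 24);

    /* Sign-extend to 64 bits if the 32-bit sign bit is set. */
    uint64_t wide = u;
    if (u & 0x80000000u)
        wide |= 0xFFFFFFFF00000000ULL;

    /* Emit the 64-bit pattern, most significant byte first. */
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)((wide >> (8 * (7 - i))) & 0xFFu);
}
```

Doing this for every element is the cost that the advice "make datatypes identical for best performance" avoids.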

Storing Records with HDF5

HDF5 Compound Datatypes
• Compound types
  ◦ Comparable to C structs
  ◦ Members can be any datatype
  ◦ Can write/read by a single field or a set of fields
  ◦ Not all data filters can be applied (shuffling, SZIP)

HDF5 Compound Datatypes
• Which APIs to use?
  ◦ H5TB APIs
    ▪ Create, read, get info and merge tables
    ▪ Add, delete, and append records
    ▪ Insert and delete fields
    ▪ Limited control over the table's properties (only GZIP compression at level 6, default allocation time for the table, extendible, etc.)
  ◦ PyTables
    ▪ Based on H5TB
    ▪ Python interface
    ▪ Indexing capabilities
  ◦ HDF5 APIs
    ▪ H5Tcreate(H5T_COMPOUND) and H5Tinsert calls to create a compound datatype
    ▪ H5Dcreate, etc.
    ▪ See the H5Tget_member* functions for discovering properties of an HDF5 compound datatype

Creating and Writing Compound Dataset
h5_compound.c example:
    typedef struct s1_t {
        int    a;
        float  b;
        double c;
    } s1_t;
    s1_t s1[LENGTH];

Creating and Writing Compound Dataset
    /* Create datatype in memory. */
    s1_tid = H5Tcreate(H5T_COMPOUND, sizeof(s1_t));
    H5Tinsert(s1_tid, "a_name", HOFFSET(s1_t, a), H5T_NATIVE_INT);
    H5Tinsert(s1_tid, "c_name", HOFFSET(s1_t, c), H5T_NATIVE_DOUBLE);
    H5Tinsert(s1_tid, "b_name", HOFFSET(s1_t, b), H5T_NATIVE_FLOAT);
Note:
• Use the HOFFSET macro instead of calculating offsets by hand.
• The order of the H5Tinsert calls is not important if HOFFSET is used.

Creating and Writing Compound Dataset
    /* Create dataset and write data */
    dataset = H5Dcreate(file, DATASETNAME, s1_tid, space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    status = H5Dwrite(dataset, s1_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);
Note:
• In this example the memory and file datatypes are the same.
• The type is not packed. To save space in the file, pack a copy of the type for the dataset and keep the unpacked type for memory buffers:
    file_tid = H5Tcopy(s1_tid);   /* file_tid: illustrative name for the packed copy */
    status = H5Tpack(file_tid);
    dataset = H5Dcreate(file, DATASETNAME, file_tid, space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
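Why offsets should not be calculated by hand, and what packing removes, can be seen in a small plain-C sketch. It assumes a hypothetical record type `padded_t` (not from the slides) whose member order makes most compilers insert padding, and a hypothetical helper `packed_layout` that lays members end to end, which is the idea behind a packed compound type. No HDF5 calls are used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: a 1-byte member followed by an 8-byte member
 * usually forces the compiler to insert padding after `tag`. */
typedef struct padded_t {
    char   tag;
    double value;
    int    count;
} padded_t;

/* Compute a packed layout: each member starts where the previous one ends.
 * Writes the packed offsets and returns the total packed size in bytes. */
size_t packed_layout(const size_t *sizes, size_t n, size_t *offsets)
{
    assert(sizes != NULL && offsets != NULL);
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        offsets[i] = total;
        total += sizes[i];
    }
    return total;
}
```

Comparing `offsetof(padded_t, value)` with the packed offset of the same member (1) shows the gap the compiler added; on common 64-bit platforms `sizeof(padded_t)` is typically 24 while the packed size is 13. The struct `s1_t` from the slides, by contrast, usually has no padding at all, so packing it would save nothing.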

Reading Compound Dataset
    /* Create datatype in memory and read data. */
    dataset = H5Dopen(file, DATASETNAME, H5P_DEFAULT);
    s2_tid = H5Dget_type(dataset);
    mem_tid = H5Tget_native_type(s2_tid, H5T_DIR_DEFAULT);
    buf = malloc(H5Tget_size(mem_tid) * number_of_elements);
    status = H5Dread(dataset, mem_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
Note:
• We could construct the memory type as we did in the writing example.
• General applications need to discover the type in the file, find the corresponding memory type, allocate space, and then read.

Reading Compound Dataset by Fields
    typedef struct s2_t {
        double c;
        int    a;
    } s2_t;
    s2_t s2[LENGTH];
    …
    s2_tid = H5Tcreate(H5T_COMPOUND, sizeof(s2_t));
    H5Tinsert(s2_tid, "c_name", HOFFSET(s2_t, c), H5T_NATIVE_DOUBLE);
    H5Tinsert(s2_tid, "a_name", HOFFSET(s2_t, a), H5T_NATIVE_INT);
    …
    status = H5Dread(dataset, s2_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s2);
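A read by fields matches members by name, not by position. The plain-C sketch below imitates that idea for a single record; it is a conceptual model rather than the library's code, and `member_t`, the two member tables, and `copy_fields_by_name` are hypothetical names. The structs `s1_t` and `s2_t` are the ones from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct s1_t { int a; float b; double c; } s1_t; /* layout as written */
typedef struct s2_t { double c; int a; } s2_t;          /* subset to read    */

/* Name, byte offset and size of one compound member (hypothetical helper). */
typedef struct {
    const char *name;
    size_t      offset;
    size_t      size;
} member_t;

static const member_t s1_members[] = {
    { "a_name", offsetof(s1_t, a), sizeof(int)    },
    { "b_name", offsetof(s1_t, b), sizeof(float)  },
    { "c_name", offsetof(s1_t, c), sizeof(double) },
};
static const member_t s2_members[] = {
    { "c_name", offsetof(s2_t, c), sizeof(double) },
    { "a_name", offsetof(s2_t, a), sizeof(int)    },
};

/* For every destination member, copy the bytes of the source member with
 * the same name and size. Returns the number of members copied. */
size_t copy_fields_by_name(const void *src, const member_t *sm, size_t ns,
                           void *dst, const member_t *dm, size_t nd)
{
    assert(src != NULL && dst != NULL && sm != NULL && dm != NULL);
    size_t copied = 0;
    for (size_t j = 0; j < nd; j++) {
        for (size_t i = 0; i < ns; i++) {
            if (strcmp(dm[j].name, sm[i].name) == 0 && dm[j].size == sm[i].size) {
                memcpy((char *)dst + dm[j].offset,
                       (const char *)src + sm[i].offset, dm[j].size);
                copied++;
                break;
            }
        }
    }
    return copied;
}
```

Member `b_name` is simply skipped because the destination type does not name it, and the differing member order in `s2_t` does not matter because each member carries its own offset.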

Table Example
[Table with columns: a_name (integer), b_name (float), c_name (double)]
• Multiple ways to store a table
  ◦ Dataset for each field
  ◦ Dataset with compound datatype
  ◦ If all fields have the same type:
    ▪ 2-dim array
    ▪ 1-dim array of array datatype
• Choose to achieve your goal!
  ◦ Storage overhead?
  ◦ Do I always read all fields?
  ◦ Do I read some fields more often?
  ◦ Do I want to use compression?
  ◦ Do I want to access some records?

Storing Variable Length Data with HDF5

HDF5 Fixed and Variable Length Array Storage
[Figure: fixed-length and variable-length array storage, each panel labeled Data and Time]

Storing Strings in HDF5
• Array of characters (Array datatype or extra dimension in dataset)
  ◦ Quick access to each character
  ◦ Extra work to access and interpret each string
• Fixed length
      string_id = H5Tcopy(H5T_C_S1);
      H5Tset_size(string_id, size);
  ◦ Wasted space in shorter strings
  ◦ Can be compressed
• Variable length
      string_id = H5Tcopy(H5T_C_S1);
      H5Tset_size(string_id, H5T_VARIABLE);
  ◦ Overhead as for all VL datatypes
  ◦ Compression will not be applied to actual data
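The "wasted space in shorter strings" trade-off can be estimated before choosing a string type. The following plain-C sketch makes no HDF5 calls; the three helper functions are hypothetical names, and it assumes null-terminated strings stored in equally sized slots, with one extra byte reserved for the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Total number of characters actually used by the strings. */
size_t payload_bytes(const char *const *strings, size_t n)
{
    assert(strings != NULL);
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += strlen(strings[i]);
    return total;
}

/* Smallest fixed slot that fits the longest string plus its terminator. */
size_t min_slot_size(const char *const *strings, size_t n)
{
    assert(strings != NULL);
    size_t longest = 0;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(strings[i]);
        if (len > longest)
            longest = len;
    }
    return longest + 1;
}

/* Bytes of padding when every string occupies a slot of slot_size bytes. */
size_t fixed_padding_bytes(const char *const *strings, size_t n, size_t slot_size)
{
    assert(slot_size >= min_slot_size(strings, n));
    return n * slot_size - payload_bytes(strings, n);
}
```

For the strings "a", "bcd" and "efghijk" the smallest slot is 8 bytes, so 24 bytes are stored for 11 bytes of text. Whether that padding matters depends on how well it compresses and on the per-element overhead of the variable-length alternative.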

Storing Variable Length Data in HDF5
• Each element is represented by a C structure:
      typedef struct {
          size_t len;
          void  *p;
      } hvl_t;
• Base type can be any HDF5 type:
      H5Tvlen_create(base_type)

Example
    hvl_t data[LENGTH];
    for (i = 0; i < LENGTH; i++) {
        data[i].p   = malloc((i + 1) * sizeof(unsigned int));
        data[i].len = i + 1;
    }
    tvl = H5Tvlen_create(H5T_NATIVE_UINT);
[Figure: the data array, with labels data[0].p and data[4].len]
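The ragged layout built in this example can be tried without the library. The sketch below uses a stand-in struct, `vl_elem_t`, with the same two fields as `hvl_t` so that it compiles without `hdf5.h`; that name, `vl_fill`, `vl_free`, and the fill values are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in with the same fields as hvl_t (hypothetical name). */
typedef struct {
    size_t len; /* number of base-type elements */
    void  *p;   /* pointer to the element's data */
} vl_elem_t;

/* Release every element's buffer and reset the descriptors. */
void vl_free(vl_elem_t *data, size_t n)
{
    assert(data != NULL);
    for (size_t i = 0; i < n; i++) {
        free(data[i].p);
        data[i].p = NULL;
        data[i].len = 0;
    }
}

/* Give element i a buffer of i+1 unsigned ints holding 10*i + j.
 * Returns 0 on success, -1 if an allocation fails (nothing is leaked). */
int vl_fill(vl_elem_t *data, size_t n)
{
    assert(data != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned int *buf = malloc((i + 1) * sizeof *buf);
        if (buf == NULL) {
            vl_free(data, i); /* free the elements filled so far */
            return -1;
        }
        for (size_t j = 0; j <= i; j++)
            buf[j] = (unsigned int)(10 * i + j);
        data[i].p = buf;
        data[i].len = i + 1;
    }
    return 0;
}
```

Each element owns a separately allocated buffer, which is why buffers of variable-length data have to be released element by element once they are no longer needed.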

Reading HDF5 Variable Length Array
• The HDF5 library allocates memory to read data in
• The application only needs to allocate the array of hvl_t elements (pointers and lengths)
• The application must reclaim the memory for the data read in
    hvl_t rdata[LENGTH];
    /* Create the memory vlen type */
    tvl = H5Tvlen_create(H5T_NATIVE_INT);
    ret = H5Dread(dataset, tvl, H5S_ALL, H5S_ALL, H5P_DEFAULT, rdata);
    /* Reclaim the read VL data, passing a dataspace that describes rdata */
    space = H5Dget_space(dataset);
    H5Dvlen_reclaim(tvl, space, H5P_DEFAULT, rdata);

HDF5 Reference Datatypes

Reference Datatypes
• Object Reference
  ◦ Pointer to an object in a file
  ◦ Predefined datatype H5T_STD_REF_OBJ
• Dataset Region Reference
  ◦ Pointer to a dataset plus a dataspace selection
  ◦ Predefined datatype H5T_STD_REF_DSETREG

Saving Selected Region in a File
• Need to select and access the same elements of a dataset

Reference to Dataset Region
[Figure: file REF_REG.h5; the root group contains the datasets Region References and Matrix]

Reference to Dataset Region
Example:
    dsetr_id = H5Dcreate(file_id, "REGION REFERENCES", H5T_STD_REF_DSETREG, …);
    H5Sselect_hyperslab(space_id, H5S_SELECT_SET, start, NULL, …);
    H5Rcreate(&ref[0], file_id, "MATRIX", H5R_DATASET_REGION, space_id);
    H5Dwrite(dsetr_id, H5T_STD_REF_DSETREG, H5S_ALL, H5S_ALL, H5P_DEFAULT, ref);

Stretch Break