The HDF Group
Introduction to HDF5
Session Two
Data Model Comparison
HDF5 File Format
Copyright © 2010 The HDF Group. All Rights Reserved.
Our Purpose Today
1) Familiarize you with HDF5 and its capabilities.
2) Help you understand how HDF5 might be applied to your data management challenges.
HDF5 Data Model
HDF5 Objects: File, Dataset, Link, Group, Attribute, Dataspace, Datatype
Developing a Project Data Model
Project Domain Concepts -> Logical Data Model -> Physical Instantiation
- Logical Data Model: Relational, HDF5 Data Model
- Physical Instantiation: A Relational Database, HDF5 File
Logical Data Models
HDF5 / Directories and Files

HDF5      | Directories (Folders) and Files
file      | filesystem
dataset   | file
datatype  | ~ file type or extension
dataspace | ~ file size
attribute | ~ properties (Windows)
group     | directory (Unix) or folder (Windows)
link      | hard links & symbolic links (Unix); ~ shortcuts (Windows)

Both support hierarchies for organizing information (and, to some degree, directed graphs).
HDF5 / XML
- Both support rich metadata and allow new types to be defined
- HDF5 objects designed for numeric data; XML objects designed for text

HDF5      | XML
file      | document
dataset   | element
datatype  | simple or complex type definitions in XML Schema
dataspace | ~ minOccurs, maxOccurs in XML Schema
attribute |
group     | ~ element with sub-elements
link      | ~ IDREF
HDF5 / Relational Databases

HDF5      | Relational Database
file      | database
dataset   | data table
datatype  | char, varchar, number, blob, raw, date, …
dataspace | ~ records
attribute | ?
group     | ?
link      | ?

- HDF5 supports multi-dimensional arrays with a common datatype in the cells; elements are located by offset
- RDBs support rows with different data types in the fields; records are located by primary key
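The "locate by offset" versus "locate by primary key" contrast can be sketched with the Python standard library alone. This is an illustration under assumptions, not HDF5 or any particular database product: the flat list stands in for a 2-D array of one common datatype, and the `sensor` table, its columns, and the helper names `at` and `by_key` are hypothetical.

```python
import sqlite3

# Hypothetical 3 x 4 grid of readings, stored flat in row-major order,
# the way a 2-D array of one common datatype can be laid out.
ROWS, COLS = 3, 4
flat = [float(r * COLS + c) for r in range(ROWS) for c in range(COLS)]

def at(row, col):
    """Locate a cell by computing its offset from the array indices."""
    return flat[row * COLS + col]

# Relational style: each row mixes data types and is found by primary key.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sensor (id INTEGER PRIMARY KEY, room TEXT, reading REAL)")
con.executemany(
    "INSERT INTO sensor VALUES (?, ?, ?)",
    [(1, "lobby", 20.5), (2, "lab", 22.0), (3, "roof", 15.25)],
)

def by_key(sensor_id):
    """Locate a record by its primary key; returns None if absent."""
    return con.execute(
        "SELECT room, reading FROM sensor WHERE id = ?", (sensor_id,)
    ).fetchone()

cell = at(2, 1)       # offset 2 * 4 + 1 = 9
record = by_key(2)    # the row whose id is 2
```

The array lookup needs only arithmetic on the indices, while the table lookup goes through a key and returns a row of mixed types.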
HDF5 Technology Platform
Recall…
- HDF5 data model: the “building blocks” for data organization and specification
- HDF5 software: library, language interfaces, tools
Let’s look at…
- HDF5 file format: bit-level organization of an HDF5 file
HDF5 File Format
- Defined by the HDF5 File Format Specification
  - Specifies the bit-level organization of an HDF5 file on storage media
  - Maps the data model objects to a linear address space
  - Other representations of the data model objects are also possible, but those are not the HDF5 format
- Self-describing
  - All the information necessary to read and reconstruct the data model objects is specified by the format
- Designed to work well with other technologies
- Designed for speed and storage efficiency
- Binary format
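One small, concrete piece of the bit-level organization is the 8-byte format signature that opens the superblock. The superblock sits at byte offset 0 or, when a user block precedes it, at 512, 1024, 2048, and so on. The sketch below only looks for that signature; the function name `find_hdf5_signature` and the `max_offset` search cap are assumptions made for this illustration, and the byte streams are stand-ins, not complete HDF5 files.

```python
import io

# The 8-byte signature that begins the HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

def find_hdf5_signature(stream, max_offset=1 << 20):
    """Return the byte offset of the format signature, or None.

    Candidate offsets are 0, 512, 1024, 2048, ... (doubling each time).
    max_offset is an arbitrary cap chosen for this sketch.
    """
    offset = 0
    while offset <= max_offset:
        stream.seek(offset)
        if stream.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
            return offset
        offset = 512 if offset == 0 else offset * 2
    return None

# Stand-in byte streams for illustration only.
plain = io.BytesIO(b"\x00" * 4096)
at_start = io.BytesIO(HDF5_SIGNATURE + b"\x00" * 64)
with_user_block = io.BytesIO(b"#" * 512 + HDF5_SIGNATURE + b"\x00" * 64)
```

A check like this tells a reader where the format begins. Everything after the signature is what the specification describes and what the HDF5 library interprets on the user's behalf.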
HDF5 File Format Specification
Introduction
You can have the power of the format without worrying about the details of the specification.
Developing a Project Data Model
Project Domain Concepts -> Logical Data Model -> Physical Instantiation
- Logical Data Model: Relational, HDF5 Data Model
- Physical Instantiation: A Relational Database, HDF5 File
Physical Instantiations
Format
HDF5 / Filesystem
- Both allow traversal of objects in the hierarchy
- Both include internal metadata for fast access to subsets of the data
- Both can handle a variety of data
- An HDF5 file can be easily migrated or shared
HDF5 / “Binary Flat File”
- “Binary Flat File” = a sequence of bytes representing (primarily) numeric data
- Often written by scientific and engineering applications to save results from simulations or experiments
- A binary flat file is usually the fastest way to write numeric data
- Read performance varies depending on access patterns
- Unlike HDF5, binary flat files are not self-describing or portable across architectures
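What "not self-describing or portable" means in practice can be shown with Python's `struct` module. This is a minimal sketch: the sample values and the choice of little-endian 64-bit floats are assumptions for illustration. The same 24 bytes decode correctly only when the reader already knows the element count, the type, and the byte order, because nothing in the bytes records them.

```python
import struct

values = [1.5, -2.0, 300.25]

# A writer on a little-endian machine dumps three 64-bit floats, no header.
raw = struct.pack("<3d", *values)

# A reader must already know the count, the type, and the byte order.
right = list(struct.unpack("<3d", raw))

# Same bytes, but the reader guesses big-endian: the numbers come out wrong.
wrong_order = list(struct.unpack(">3d", raw))

# Same bytes, but read as six 32-bit floats: wrong count and wrong values.
wrong_type = list(struct.unpack("<6f", raw))
```

None of the wrong readings raises an error, which is the hazard: a flat file gives no signal that it has been misinterpreted.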
HDF5 / XML
- Both HDF5 and XML are self-describing and portable
- XML is text-based and requires contents to be accessed sequentially
- HDF5 is binary and supports random access and subsetting
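The sequential-versus-random-access point can be sketched as follows. A plain buffer of fixed-width doubles stands in for a binary layout here. It is not HDF5 and does not use the HDF5 library. The element names `temps` and `t` and both helper functions are hypothetical.

```python
import io
import struct
import xml.etree.ElementTree as ET

readings = [20.0 + 0.5 * i for i in range(1000)]

# Text form: one <t> element per reading (hypothetical element names).
xml_bytes = (
    "<temps>" + "".join(f"<t>{v}</t>" for v in readings) + "</temps>"
).encode()

def nth_from_xml(data, n):
    """Walk the elements in document order until the n-th <t> is reached."""
    seen = 0
    for _event, elem in ET.iterparse(io.BytesIO(data)):
        if elem.tag == "t":
            if seen == n:
                return float(elem.text)
            seen += 1
    raise IndexError(n)

# Binary form: fixed-width little-endian 64-bit floats.
binary = io.BytesIO(struct.pack(f"<{len(readings)}d", *readings))

def nth_from_binary(stream, n):
    """Seek straight to the n-th value and read its 8 bytes."""
    stream.seek(8 * n)
    return struct.unpack("<d", stream.read(8))[0]
```

Both helpers return the same value for a given index, but the XML version has to parse every element that comes before it, while the binary version reads only the 8 bytes it needs.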
HDF5 / PDF
- Both HDF5 and PDF formats are published and open
- Both can include heterogeneous types of information
  - PDF focused on documents
  - HDF5 focused on collections of different types, with strong support for multi-dimensional arrays of numeric data
- Both are portable across architectures
HDF5 / Relational Databases
- RDB provides access control features; HDF5 does not
- RDB is transaction-based; HDF5 is not
  - Transactions and logging introduce overhead that may not be needed
  - HDF5 is not designed for many writers to ‘random’ locations
- RDB provides built-in indices to values
- HDF5 provides navigation to datasets and to subsets within datasets
- HDF5 files are portable across platforms
Discussion
- How could daily temperature measurements made at various locations throughout a building be modeled in different formats?
  - Filesystem, Binary Flat File, XML, PDF, Relational Database
- What are some pros/cons of each?
Review
HDF5 consists of
- file format
  - self-describing
  - many internal structures to support high performance
- software
- data model
  - file, dataset, datatype, dataspace, attribute, group, link
HDF5 is designed to support
- management of high-volume, complex data
- data sharing and preservation
The HDF Group
HDF5 Data Model Example
ENSIGHT Automotive Crash Simulation
Automotive Crash Simulation
Solid Modeling
Modeled in HDF5
Mesh Example in HDFView
Stretch Break