Object-level Access to Remote Files: Integrating HDF5 with SRB
Peter Cao, NCSA
Mike Wan, SDSC
SRB Workshop, San Diego, February 2-3, 2006
Sponsored by NLADR, NSF PACI Project in Support of NCSA-SDSC Collaboration
Outline
- Introduction to HDF5
- The HDF-SRB model
- SRB Support in HDFView
Overview of HDF5: Answering big questions …
- Matter & universe
- Weather & climate
- Life & nature
[Figure: Total Column Ozone (Dobson), August 24, 2001 and August 24, 2002]
Overview of HDF5: Involves big data …
Overview of HDF5: On big computers …
Overview of HDF5: HDF solution …
- Software & tools: open source & multiple platform
- Common models: extensions
- Standard APIs: conventions & easy use
- File format: for all kinds of data
- Efficiency: storage & IO
Overview of HDF5: Example HDF5
Overview of HDF5: HDF Software
- HDF I/O Library
- Tools & Applications
- HDF File
Overview of HDF5: Object model
- Primary Objects
  - Groups
  - Datasets
- Additional ways to organize data
  - Attributes
  - Sharable objects
  - Storage and access properties
Overview of HDF5: Groups
- A mechanism for collections of related objects
- Every file starts with a root group
- Similar to UNIX directories
- Can have attributes
[Diagram labels: “/”, tom, dick, harry, temp]
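The comparison with UNIX directories can be made concrete with a small sketch. The `Group` class below is a hypothetical stand-in, not the HDF5 API, and the layout (tom, dick, and harry under the root, temp under tom, a `units` attribute) is assumed purely for illustration.

```python
class Group:
    """Hypothetical stand-in for an HDF5 group: named children plus attributes."""

    def __init__(self, name):
        self.name = name
        self.children = {}   # child name -> Group
        self.attrs = {}      # groups can carry attributes

    def create_group(self, name):
        child = Group(name)
        self.children[name] = child
        return child

    def lookup(self, path):
        """Resolve a UNIX-style path such as "/tom/temp" starting at this group."""
        node = self
        for part in path.strip("/").split("/"):
            if part:  # an empty part means the path was just "/"
                node = node.children[part]
        return node


# Assumed layout, for illustration only.
root = Group("/")
tom = root.create_group("tom")
root.create_group("dick")
root.create_group("harry")
tom.create_group("temp").attrs["units"] = "K"

print(sorted(root.children))           # ['dick', 'harry', 'tom']
print(root.lookup("/tom/temp").name)   # temp
```

Every lookup starts from the root group, which mirrors the rule that every file starts with a root group.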
Overview of HDF5: Datasets
- Data
- Metadata
  - Dataspace: Rank = 3; Dimensions: Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
  - Datatype: IEEE 32-bit float
  - Attributes: time = 32.4, pressure = 987, temp = 56
  - Storage info: chunked, compressed
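The metadata on this slide is enough to work out the size of the raw data. The sketch below uses a hypothetical `DatasetInfo` record (not an HDF5 type) filled with the values from the slide; the byte count is the uncompressed size, since the slide does not give a compression ratio.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class DatasetInfo:
    """Hypothetical record mirroring the metadata shown on the slide."""
    dims: tuple                 # dataspace dimensions
    dtype: str                  # datatype description
    itemsize: int               # bytes per element
    attrs: dict = field(default_factory=dict)
    storage: tuple = ()

    @property
    def rank(self):
        return len(self.dims)

    @property
    def num_elements(self):
        return prod(self.dims)

    @property
    def nbytes(self):
        # Uncompressed size: element count times element size.
        return self.num_elements * self.itemsize


info = DatasetInfo(
    dims=(4, 5, 7),
    dtype="IEEE 32-bit float",
    itemsize=4,  # 32 bits = 4 bytes
    attrs={"time": 32.4, "pressure": 987, "temp": 56},
    storage=("chunked", "compressed"),
)

print(info.rank, info.num_elements, info.nbytes)  # 3 140 560
```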
Overview of HDF5: Data subsetting
(a) Hyperslab from a 2D array to the corner of a smaller 2D array.
(b) Regular series of blocks from a 2D array to a contiguous sequence at a certain offset in a 1D array.
(c) A sequence of points from a 2D array to a sequence of points in a 3D array.
(d) Union of hyperslabs in file to union of hyperslabs in memory.
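Cases (a) and (b) can be sketched in plain Python. This is not HDF5 code: the helper names, array sizes, and offsets are assumptions chosen for illustration, and only the start/stride/count/block vocabulary is borrowed from HDF5 hyperslab selection.

```python
def hyperslab_indices(start, stride, count, block):
    """Indices selected along one dimension: `count` blocks of `block`
    consecutive elements, with block starts `stride` apart."""
    return [start + i * stride + j for i in range(count) for j in range(block)]


def select_2d(data, rows, cols):
    """Gather the selected rows and columns of a 2D list in row-major order."""
    return [[data[r][c] for c in cols] for r in rows]


# 6 x 8 source array; each value encodes its position as row*10 + col.
source = [[r * 10 + c for c in range(8)] for r in range(6)]

# (a) A contiguous 2 x 3 hyperslab starting at (1, 2), copied into the
#     corner of a smaller 3 x 4 array.
rows_a = hyperslab_indices(start=1, stride=1, count=2, block=1)
cols_a = hyperslab_indices(start=2, stride=1, count=3, block=1)
target = [[0] * 4 for _ in range(3)]
for i, row in enumerate(select_2d(source, rows_a, cols_a)):
    target[i][:len(row)] = row
print(target)  # [[12, 13, 14, 0], [22, 23, 24, 0], [0, 0, 0, 0]]

# (b) A regular series of 2 x 2 blocks, written as a contiguous sequence
#     starting at offset 3 of a 1D buffer.
rows_b = hyperslab_indices(start=0, stride=3, count=2, block=2)
cols_b = hyperslab_indices(start=1, stride=4, count=2, block=2)
flat = [v for row in select_2d(source, rows_b, cols_b) for v in row]
buffer = [0] * (3 + len(flat))
buffer[3:] = flat
print(buffer[:7])  # [0, 0, 0, 1, 2, 5, 6]
```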
Project Description: Motivation
- SRB
  - Indexing and searching
  - Distributed data system
  - Access control
- HDF5
  - Large and diverse data
  - High performance access
  - Interactive and subsetting
- SRB + HDF5: high performance distributed data system
Project Description: Goals
- Working prototype of a client/server system for object-level access to HDF5 files stored in the SRB
- Use SRB as middleware to transfer data between the server and client
- Use object-level access for interactive and efficient access to part of the file
Remote Data Access on SRB: Methods
- Normal ways to access SRB
  - Get the whole file: large files (100TB SCEC)
  - Use POSIX low level calls: low performance
- New way
  - Implement proxy operations to access objects or parts of objects in one request
Normal SRB File Access: Architecture
[Diagram: client, SRB Server, MCAT, and HDF5 file; the HDF5 file is transferred as a whole file or a sequence of bytes]
Object-level File Access: Architecture
- Client
  - Client Application
  - HDF5 Object (File, Group, Dataset, Subset, Attribute)
  - HDF5-SRB Module (pack/unpack messages)
- Server
  - SRB Server, MCAT
  - HDF5-SRB Module (pack/unpack messages)
  - HDF5 Object (File, Group, Dataset, Subset, Attribute)
  - HDF5 Library
  - HDF5 file
Examples of File Access
“I need to see the eye of Hurricane Bob!”
[Diagram: HDF5 file]
Examples of File Access: Whole file transfer
- Client: “Get the file”
- Transfer large image: slow!
[Diagram: client and HDF5 file]
Examples of File Access: SRB POSIX API
[Diagram: client and HDF5 file exchange a series of messages: open file, file’s open, find image, image found, open image, image open]
Many small messages: slow and complex!
Examples of File Access: Object level
- Client: “Get me the eye of hurricane Bob”
- 1 request, small transfer: fast!
[Diagram: client and HDF5 file]
HDF5-SRB Model: New objects/APIs
- A new set of data objects
  - H5File, H5Group, H5Dataset, H5Datatype, etc.
  - Encapsulate client requests and server results
- Enhanced SRB APIs
  - Pack/unpack routines (exchange data between byte stream and structure) to handle complicated structs: strings, pointers, pointers to arrays, arrays of pointers, etc.
  - New srbGenProxyFunct (general proxy function) handles other types of requests besides HDF5
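The idea behind the pack/unpack routines, turning a structure with variable-length members into a byte stream and back, can be sketched with Python's `struct` module. The message layout and the `pack_msg`/`unpack_msg` helpers below are assumptions made for illustration; they are not the SRB wire format or API.

```python
import struct

# Assumed header: object ID, name length, number of dimensions (network byte order).
HEADER = "!iII"


def pack_msg(obj_id, name, dims):
    """Flatten (obj_id, name, dims) into one byte string."""
    name_bytes = name.encode("utf-8")
    header = struct.pack(HEADER, obj_id, len(name_bytes), len(dims))
    body = name_bytes + struct.pack(f"!{len(dims)}Q", *dims)
    return header + body


def unpack_msg(buf):
    """Rebuild (obj_id, name, dims) from the byte string made by pack_msg."""
    obj_id, name_len, ndims = struct.unpack_from(HEADER, buf, 0)
    offset = struct.calcsize(HEADER)
    name = buf[offset:offset + name_len].decode("utf-8")
    offset += name_len
    dims = list(struct.unpack_from(f"!{ndims}Q", buf, offset))
    return obj_id, name, dims


message = pack_msg(3, "/hurricane/bob", [4, 5, 7])
print(len(message))         # 50
print(unpack_msg(message))  # (3, '/hurricane/bob', [4, 5, 7])
```

Length prefixes stand in for the pointers of the original structure: the receiver learns how many bytes each variable-length member occupies before reading it.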
HDF5-SRB Model: Data Flow
- Client API: srbObjRequest(void *obj, int objID)
- Server API: srbObjProcess(void *obj, int objID)
- srbGenProxyFunct connects client and SRB Server
- Steps
  1. packMsg()
  2. unpackMsg()
  3. H5Obj::op()
  4. Access file (HDF5 Library, HDF5 file)
  5. H5Object
  6. packMsg()
  7. unpackMsg()
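The seven steps can be walked through in a toy, in-process simulation. It assumes steps 1 and 7 run on the client and steps 2 through 6 on the server. The Python functions only borrow names from the slide; their bodies, the JSON encoding, the object ID, and the in-memory "file" are all hypothetical stand-ins, not the SRB or HDF5 implementation.

```python
import json

# Toy "HDF5 file" held by the server: dataset path -> 2D data (illustrative only).
SERVER_FILE = {"/hurricane/bob": [[r * 10 + c for c in range(6)] for r in range(6)]}

H5DATASET = 1  # hypothetical object ID


def pack_msg(obj):
    return json.dumps(obj).encode("utf-8")


def unpack_msg(buf):
    return json.loads(buf.decode("utf-8"))


def h5dataset_read(obj):
    """Steps 3-5: run the operation against the file and fill in the object."""
    data = SERVER_FILE[obj["path"]]
    (r0, c0), (nr, nc) = obj["start"], obj["count"]
    obj["data"] = [row[c0:c0 + nc] for row in data[r0:r0 + nr]]
    return obj


def srb_obj_process(request_bytes, obj_id):
    """Server side: unpack the request, run it, pack the result."""
    obj = unpack_msg(request_bytes)            # step 2
    if obj_id != H5DATASET:
        raise ValueError(f"unsupported object ID: {obj_id}")
    return pack_msg(h5dataset_read(obj))       # steps 3-5, then step 6


def srb_gen_proxy_funct(request_bytes, obj_id):
    """Stands in for the single network round trip through the SRB server."""
    return srb_obj_process(request_bytes, obj_id)


def srb_obj_request(obj, obj_id):
    """Client side: pack the request, send it once, unpack the reply."""
    reply = srb_gen_proxy_funct(pack_msg(obj), obj_id)  # step 1
    return unpack_msg(reply)                            # step 7


request = {"path": "/hurricane/bob", "start": [2, 2], "count": [2, 2]}
reply = srb_obj_request(request, H5DATASET)
print(reply["data"])  # [[22, 23], [32, 33]]
```

The client makes one call and receives only the requested 2 x 2 subset, which is the point of the object-level design: one request and a small transfer instead of a whole-file copy or many low-level calls.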
Running Server/Client
- An SRB server that supports HDF5
  - HDF5 library and other external libraries (SZIP, ZLIB)
  - SRB version 3.4 or later from
  - Follow the instructions on how to run an SRB server in the UG packed with the SRB source release or online at
- Any client application that implements HDF5-SRB objects
  - No HDF5 library is required on the client
  - Example client application: HDFView 2.3 or above
Short Demo
- HDFView
- Supports Windows and Linux
Questions / Comments?