HDF Update Mike Folk National Center for Supercomputing Applications

HDF Update
Mike Folk, National Center for Supercomputing Applications
HDF and HDF-EOS Workshop VI, December 4-5, 2002

Topics
  - Who is supporting HDF
  - HDF software in 2002
  - Other activities of interest

Who is supporting HDF?
  - NASA/ESDIS
      - Earth science applications, instrument data
  - DOE/ASCI (Accelerated Strategic Computing Initiative)
      - Simulations on massively parallel machines
  - NCSA/NSF/State of Illinois
      - HPC and Grid data intensive apps, visualization, user support
      - Atmospheric and ocean modeling environments
  - DOE Scientific Data Analysis & Computation Program
      - High performance I/O R&D
  - National Archives and Records Administration
      - Small grant to consider HDF5 as an archive format

HDF software in 2002
  - Library releases
  - Java Products
  - Tools
  - Compression
  - Investigations of Web technologies

HDF4 library
  - No releases in 2002
  - Release 1.6 planned for May 2003
      - Bug fixes
      - New compilers: Intel, Portland Group
      - New OS: Mac OS X, AIX 5.1 64-bit

HDF5 software milestones in 2002 (timeline from Q1 ‘02 to Q4 ‘02, milestones listed in order)
  - Base library: 1.4.3, 1.4.4, 1.4.5
  - High level library: HDF5 tables, High level APIs
  - Java products: Java products 1.0, Java products 1.1, Java products 1.2
  - Other tools: H5import, H4-H5 conversion library

HDF5 library in 2002
  - Compilers, configuration, etc.
      - “h5cc” script to simplify compilation of HDF5 programs
      - F90 shared library and C++ supported on Windows
      - Intel C, F90 and C++ on Linux, IA32/64 and Windows
      - Support for zlib 1.1.4
  - Performance
      - Added library performance tests
      - Performance improvements: hyperslabs, data conversions, chunking
      - Fewer and larger I/O requests when accessing a file
      - Parallel I/O performance improvements

Parallel HDF5
  - Parallel I/O performance benchmark suite
      - Compares raw I/O, MPI-I/O, and HDF5 I/O
      - Distributed with HDF5
      - http://hdf/RFC/PIO_Perf/PHDF5_performance.html
  - Parallel HDF5 tutorial
      - http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor/
  - “Flexible parallel HDF5” programming model
      - More flexible model for parallel HDF5
  - Performance studies and tuning activities

Next major release -- HDF5 1.6
  - Release date: Spring 2003
  - New format and library features include
      - Compression enhancements, including szip
      - Generic Properties
      - Checksum
      - Dimension scale support (tentative)
  - Performance improvements include
      - Chunking & compression
      - Parallel I/O performance benchmark suite

Next major release -- HDF5 1.6
  - Flexible parallel HDF5
  - Special platforms
      - Large Compaq cluster (Pittsburgh SC)
      - Crays
      - Windows XP
      - Mac
  - Several new compilers (e.g. Intel, Portland Group)
  - Documentation
      - New User’s Guide: good draft, first version

High level APIs
  - Make HDF5 easier to use
      - More operations per call than the normal HDF5 API
  - Encourage standard ways to store objects
      - Enforce standard representation of objects in HDF5

High level APIs
  - Lite – done
      - Same as HDF5, but simpler
  - Image – done
      - Interprets dataset as image/palette
      - 2-D raster data like HDF4 raster images
  - Table – partly done
      - Interprets dataset as “tables” – collections of records
      - Insert, delete records or fields
      - Future: sort and search
  - Dimension scale – in the works
  - Unstructured grids – in the works
  - http://hdf.ncsa.uiuc.edu/HDF5/hdf5_hl/doc/

HDF5 tools activities

HDF Java Products – 2002
  - Goal: replace older tools with single viewer/editor
  - HDF Java Products
      - Java HDF Interface (JHI) – to access the HDF4 library
      - Java HDF5 Interface (JHI5) – to access the HDF5 library
      - New hdf-object package – understands HDF4 and HDF5
      - HDFView – tool for browsing/editing HDF4 and HDF5
  - See demo, brochure, CD, web page
      - http://hdf.ncsa.uiuc.edu/hdf-java-html/

HDFView releases in 2002
  - Q2: Version 1.0
      - Browser for both HDF4 and HDF5
  - Q3: Version 1.1
      - Editor for both HDF4 and HDF5
  - Q4: Version 1.2
      - All features of old Java tools
      - Some new features
  - HDFView can do as much as JHV and H5View and also includes many new editing features
  - http://hdf.ncsa.uiuc.edu/hdf-java-html/hdfview/

H4toH5 Conversion Toolkit
  - Goal: support transition from HDF4 to HDF5
  - Version 1.0 released in July 2002
  - Includes
      - h4toh5 converter
      - h5toh4 converter
      - library of functions for converting HDF4 objects into HDF5 objects
  - Download from: http://hdf.ncsa.uiuc.edu/h4toh5/libh4toh5.html
  - Mapping specification and FAQ: http://hdf.ncsa.uiuc.edu/HDF5/doc/ADGuide/H4toH5Mapping.pdf

Other tools work
  - H5import - convert flat files to HDF5 datasets
      - ASCII text file with numeric data (float or integer)
      - Binary file with native floating point data
      - Binary file with native integer data
  - hdf4import – souped up version of the old fptohdf
      - Available in hdf4r1.6
  - HDF5-to-GIF and GIF-to-HDF5 converters
  - H5dump improvements
      - Subsetting
      - Support variable length datatypes including strings

Other tools work
  - H5diff
      - Compare the structure and contents of two HDF5 files, and report differences
      - Command line utility like Unix ‘diff’ and older ‘hdiff’
      - Report missing objects, inconsistent size, datatype, etc.
      - Compare values of numeric datasets
      - First beta available January 2003
      - RFC: http://hdf.ncsa.uiuc.edu/RFC/H5diff/h5diff.html
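The kind of comparison described for H5diff can be illustrated with a small model. The sketch below is not H5diff and does not read HDF5 files. It assumes each file is represented as a plain dictionary that maps object paths to a hypothetical `Dataset` record (datatype name, shape, values), and it reports the categories of difference named on the slide: missing objects, inconsistent datatype or size, and differing numeric values.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Dataset:
    """Hypothetical stand-in for an HDF5 dataset (not a real HDF5 type)."""
    dtype: str
    shape: Tuple[int, ...]
    values: List[float]


def diff_files(a: Dict[str, Dataset], b: Dict[str, Dataset]) -> List[str]:
    """Return human-readable differences between two modelled files."""
    report = []
    for path in sorted(set(a) | set(b)):
        if path not in a:
            report.append(f"{path}: missing in first file")
        elif path not in b:
            report.append(f"{path}: missing in second file")
        else:
            x, y = a[path], b[path]
            if x.dtype != y.dtype:
                report.append(f"{path}: datatype {x.dtype} vs {y.dtype}")
            if x.shape != y.shape:
                report.append(f"{path}: shape {x.shape} vs {y.shape}")
            elif x.dtype == y.dtype:
                # Values are compared only when datatype and shape agree.
                n = sum(1 for p, q in zip(x.values, y.values) if p != q)
                if n:
                    report.append(f"{path}: {n} differing value(s)")
    return report


# Two made-up "files" used purely for illustration.
file1 = {
    "/temp": Dataset("float32", (3,), [1.0, 2.0, 3.0]),
    "/mask": Dataset("int8", (2,), [0, 1]),
}
file2 = {
    "/temp": Dataset("float32", (3,), [1.0, 2.5, 3.0]),
    "/flags": Dataset("int8", (2,), [0, 1]),
}

for line in diff_files(file1, file2):
    print(line)
```

Walking the union of object paths in sorted order keeps the report stable between runs, which is convenient when the output is itself compared or logged.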

Compression
  - Szip - fast compression method for EOS data
      - Expect to include in next releases of HDF4 and HDF5
  - Shuffling – reorder bytes before compressing
      - Can improve compression ratio
  - Performance study – BZIP2 vs gzip compression
      - Study: whether or not to support bzip2 compression
      - Result: BZIP2 not significantly better than gzip
      - So not currently supported in the release
      - But BZIP2 can be used with HDF5
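The shuffling idea can be sketched in a few lines. This is an illustration of the general technique, not the HDF5 library's filter code: the `shuffle` and `unshuffle` helpers are hypothetical names, the input data is made up, and Python's `zlib` module stands in for gzip-style (deflate) compression. Shuffling groups the first byte of every element together, then the second byte, and so on, so that slowly varying numeric data produces long runs of similar bytes.

```python
import struct
import zlib


def shuffle(data: bytes, elem_size: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    assert len(data) % elem_size == 0
    return b"".join(data[i::elem_size] for i in range(elem_size))


def unshuffle(data: bytes, elem_size: int) -> bytes:
    """Invert shuffle(): interleave the byte planes back into elements."""
    n = len(data) // elem_size
    out = bytearray(len(data))
    for i in range(elem_size):
        out[i::elem_size] = data[i * n:(i + 1) * n]
    return bytes(out)


# Illustrative data: 10,000 slowly increasing 32-bit little-endian integers.
raw = struct.pack("<10000I", *range(10000))

plain = zlib.compress(raw)               # compress the bytes as stored
shuffled = zlib.compress(shuffle(raw, 4))  # shuffle first, then compress

# Shuffling is lossless: the original bytes are recovered exactly.
assert unshuffle(shuffle(raw, 4), 4) == raw
print(len(raw), len(plain), len(shuffled))
```

For this particular input the shuffled stream compresses to fewer bytes than the unshuffled one. How much shuffling helps, if at all, depends on the data.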

Investigations of Web technologies

HDF5 XML
  - Great interest in XML, interoperation of XML and binary formats
  - Results
      - HDF5 DTD
      - h5dump –XML
      - H5View reads XML and writes HDF5
  - Studies, design notes, other info
      - http://hdf.ncsa.uiuc.edu/HDF5/XML/
  - Possible future activity
      - XML schema
      - Update tools
      - HDF4 schema, tools
      - Format translation via XSLT

XML, Java Server Pages, etc.
  - How to use HDF5 data in Web environment
  - Experiments with XML, Java Server Pages (JSP), etc.
  - JSP server
      - Access HDF5 files on Web server using Web browser, or Java applet, or Java application
  - Several variations demonstrated
  - Is not a product!
  - http://hdf.ncsa.uiuc.edu/HDF5/XML/

CORBA
  - Experiments: HDF5 with CORBA on distributed systems
  - Prototype CORBA server to wrap HDF5 library and datasets (C++)
  - Remote access via C++, Java, Web
  - Might be valuable as replacement for Java Native Interface
  - Successful demonstration, but many open issues
  - Is not a product!
  - http://hdf.ncsa.uiuc.edu/HDF5/XML/JSPExperiments/index.html

Other Activities of Interest

NPOESS
  - National Polar-orbiting Operational Environmental Satellite System
  - Combine satellite systems of civil and defense programs
  - HDF5 to be used to distribute data to users
  - First implementation in 2006
      - Support the NPOESS Preparatory Program
  - Later full implementation by 2013
      - Converged system provides global coverage
  - http://www.ipo.noaa.gov

Neutron Research Community
  - Worldwide research community
      - England, France, Germany, Japan, Italy, Switzerland, Russia
      - US centers at Argonne, NIST, Los Alamos
  - Neutron and X-ray scattering experiments and simulations
  - Common software and formats to gather, share, archive, post-process data
  - NeXus data format
      - Enforces standardization of metadata and data structures
      - Based on HDF4 for many years
      - Now switching to HDF5
  - http://www.neutron.anl.gov/nexus/

National Archives and Records Administration
  - Pilot project for HDF5
  - Explore scientific data format requirements for long term archiving of electronic records
  - Identify record types for which HDF5 is suited

Atmospheric and Ocean Models
  - Modeling Environment for Atmospheric Discovery (MEAD)
  - HDF5 for high performance I/O for atmospheric and ocean modeling
      - Weather Research and Forecasting (WRF) model
      - Regional Ocean Modeling System (ROMS)
      - Coupling of WRF and ROMS
  - UAH ESML & data mining also involved

HDF5 Mesh API prototype
  - Support for structured and unstructured “mesh” data
      - For applications such as computational fluid dynamics, finite element analysis, and visualization
  - A higher-level API
  - Format: HDF5 groups and datasets to organize the data
  - Collaboration involving NCSA, CEI and others
  - Documentation still pretty sketchy, but see ftp://ftp.ensight.com/pub/HDF_RW/hdf_rw.tgz
  - Discussion list in the works

HDF5 Wins 2002 R&D Magazine Award
  - “The 100 products and processes that are the most ‘technologically significant’ and can change people's lives for the better”
  - http://www.ncsa.uiuc.edu/News/Access/Releases/020722.HDF5.html

Thank you!
Information Sources
  - HDF website: http://hdf.ncsa.uiuc.edu/
  - HDF5 Information Center: http://hdf.ncsa.uiuc.edu/HDF5/
  - HDF Helpdesk: hdfhelp@ncsa.uiuc.edu
  - HDF users mailing list: hdfnews@ncsa.uiuc.edu

Backup slides

HDF5 funding sources

HDF5 User Community
  - Worldwide use in government, academia, industry
  - How many users?
      - 450 organizations or individuals have filled in “user” form in the past year
      - There are many times this many anonymous users
      - And some organizations have thousands of users (e.g. the Earth Observing System)
  - Public applications
      - More than 25 publicly available applications
  - Four vendors so far
      - LabVIEW
      - IDL
      - EarthScan Network
      - HDF Explorer
      - Others in the works (e.g. Matlab)

Technical fields that use HDF5
Aerospace; Agricultural research; Air traffic control; Aircraft emissions database; Applied mathematics; Astrophysics; Astrophysics / supernovae; Atmospheric chemistry; Atmospheric physics; Bioengineering; CEM Simulation; Climatology / hydrology; Computational fluid dynamics; Computational physics; Computational physics / education; Computational physics and computational astrophysics; Computer modeling; Computer science; Data processing; Earth observation / atmospheric science; Earth science; Environmental science; Fast searching, sorting and retrieval; Film making special effects; Fluid mechanics; GIS; Geodetic Science; Geology; Gravitational physics; Hydrology; Information technology; Magnetic mass spectrometer development; Marine biology / ecology; Materials science; Meteorological data products; Meteorology; Microscopy; Molecular biology; Nano device simulation; Neutron scattering; Ocean color; Ocean remote sensing; Optics / optoelectronics; Petroleum engineering; Photonic band gap studies; Photonic crystals; Photonics; Post-fire erosion analysis; Protein crystallography, molecular modeling; Protostellar accretion discs; Remote sensing; SAR processing; Satellite / weather radar remote sensing; Satellite oceanography; Semiconductor process simulation; Software engineering, distributed systems; Space geodesy; Space physics; Surface water flow and sediment transport; Theoretical chemistry; Visualization; Volcanology; Water resources management; X-ray physics

Users of HDF5 – 66 countries

Next major release -- HDF5 1.6

Next major release -- HDF5 1.6
  - Performance improvements
      - Chunking
      - Compression (several)
      - Parallel I/O
      - Metadata I/O
      - Compact dataset storage
  - Other parallel
      - Parallel I/O performance benchmark suite
      - Flexible parallel HDF5
  - Portland group C, Fortran 90 and C++ compilers
  - Quite a bit of Fortran work

Next major release -- HDF5 1.6
  - Testing (several)
  - Special platforms
      - PSC cluster
      - Cray
      - Windows XP
      - Mac
  - Several new compilers (e.g. Intel, Portland Group)
  - Documentation
      - New User’s Guide: good draft, first version

HDF5 High Level APIs – HDF5 Image
  - For datasets to be interpreted as images/palettes
  - 2-D raster data like HDF4 raster images
  - Image operations
      - Create, write, read, query
  - Based on “HDF5 Image & Palette Specification”

HDF5 High Level APIs – HDF5 Table
  - For datasets to be interpreted as “tables”
      - A collection of records
      - All records have the same structure
      - Like Vdatas in HDF4, but more operations
  - Table operations
      - Create, write, read, query
      - Insert, delete records or fields
      - Future: sort and search
  - Includes the following new Table functions:

HDF5 High Level API – Future
  - Dimension scales
      - Similar to HDF4
      - In progress
  - More table operations
      - Sort and search
  - Unstructured grids
      - E.g. triangle mesh

Szip Compression Software
  - Implements CCSDS lossless compression algorithm
  - Fast compression method for EOS data
  - Expect to include in next releases of HDF4 and HDF5
      - HDF4: compress SDS and image
      - HDF5: compress datasets
  - Intellectual property issues
      - Owned by U of Idaho (formerly U of New Mexico)
      - Open source
      - No commercial use of encoder without license
      - Decoder free for everyone

Performance study – BZIP2 compression
  - Goal: decide whether or not to support bzip2 compression
  - Compared bzip2 and gzip
  - Observations
      - Bzip2 always better than gzip in compression ratio
      - But the difference was just a few percentage points
      - And bzip2 always takes more processing time, especially for decoding
  - Result
      - Not currently supported in the release
      - But BZIP2 can be used with HDF5 (checked with HDF5-1.4.4)
  - http://hdf.ncsa.uiuc.edu/HDF5/papers/bzip2/
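A comparison of this kind can be reproduced informally with Python's standard `zlib` (deflate, the algorithm behind gzip) and `bz2` modules. The sketch below is not the method or the data used in the study: the `measure` helper and the sample input are assumptions made for illustration, and both the compression ratios and the timings will vary with the data and the machine.

```python
import bz2
import time
import zlib


def measure(name, compress, decompress, data):
    """Return (name, compression ratio, compress seconds, decompress seconds)."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    assert restored == data  # both codecs are lossless
    return name, len(data) / len(packed), t1 - t0, t2 - t1


# Illustrative input only: a text column of slowly increasing numbers.
sample = b"".join(f"{i * 0.25:.2f}\n".encode() for i in range(50000))

results = [
    measure("gzip (zlib)", zlib.compress, zlib.decompress, sample),
    measure("bzip2", bz2.compress, bz2.decompress, sample),
]
for name, ratio, c_time, d_time in results:
    print(f"{name:12s} ratio {ratio:6.2f}  "
          f"compress {c_time:.4f}s  decompress {d_time:.4f}s")
```

Measuring decompression separately from compression matters here because the slide singles out decoding as the step where bzip2 costs the most time.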

New HDFView features
  - Display palette in graph as separate RGB lines
  - Open file as read-only option
  - Create new array from old array
  - Import data from text file
  - Save to HDF4, HDF5 or binary
  - Create new image from subset of existing image
  - Modify string-type dataset content
  - Convert jpeg to HDF image
  - Convert HDF to jpeg image
  - More user options and well organized GUI
  - Select vdata or compound datatype by field
  - Select subset from preview image and using mouse
  - Support unlimited dimension when creating new HDF4 dataset
  - Enable application of simple math calculations to data
  - Support multiple palettes/image
  - Create new image with default attributes
  - Modify image palette or select predefined palette

CORBA, XML etc. permutations
[Diagram: client/remote, server/local and distributed access paths to the HDF library and file across the Net. Components shown: Web browser (HTML, XML), Java applet, other applications (C++, any language), Java Server Platform, CORBA server, Java Native Interface, H5view, etc. (C, Java). Legend: product; demonstrated in research; should work, but not demonstrated.]
Backup: animation of all the permutations

National Polar-orbiting Operational Environmental Satellite System (NPOESS)
  - U.S. civil and defense programs to combine weather data collection, expanding to global coverage and long-term continuity of observations at less cost!
  - [Figure: satellite constellations by local equatorial crossing time (0530, 0730, 0830, 0930, 1330) for DMSP, POES, METOP, NPOESS and NPOESS Lite; additional label: Specialized Satellites]
  - Today: 4-orbit system; 2 US Military, 2 US Civilian
  - Tomorrow (2005): 2 US Military, 1 US Civilian, 1 EUMETSAT/METOP
  - Future (2013): 2 US Converged, 1 US “Lite”
  - Distribute in HDF5
Speaker notes: NPOESS is evolving the United States’ 4 spacecraft polar-orbiting satellite system into a two satellite system based on U.S. civil and national security requirements. Consistent with the PDD, the NPOESS program is implementing the converged system in a manner that encourages cooperation with foreign governments and international organizations, specifically leveraging European developed payloads and relying on EUMETSAT to provide the satellite for the third plane of the 3-satellite Joint Polar System constellation that will ensure global coverage for key environmental data.