Parallel netCDF: Enabling High Performance Application I/O
Project 4, SciDAC All Hands Meeting, September 11-13, 2002
A. Choudhary, W. Liao (Northwestern University)
W. Gropp, R. Ross, R. Thakur (Argonne National Laboratory)

Outline
NetCDF overview
Parallel netCDF and MPI-IO
Progress on API implementation
Preliminary performance evaluation using LBNL test suite

NetCDF Overview
NetCDF (network Common Data Form) is an API for reading/writing multi-dimensional data arrays
Self-describing file format
– A netCDF file includes information about the data it contains
Machine independent
– Portable file format
Popular in both the fusion and climate communities

netCDF example:   // CDL notation for netCDF dataset
{
dimensions:       // dimension names and lengths
    lat = 5, lon = 10, level = 4, time = unlimited;
variables:        // var types, names, shapes, attributes
    float temp(time,level,lat,lon);
        temp:long_name = "temperature";
        temp:units = "celsius";
    float rh(time,lat,lon);
        rh:long_name = "relative humidity";
        rh:valid_range = 0.0, 1.0;   // min and max
    int lat(lat), lon(lon), level(level), time(time);
        lat:units = "degrees_north";
        lon:units = "degrees_east";
        level:units = "millibars";
        time:units = "hours since ";
    // global attributes:
    :source = "Fictional Model Output";
data:             // optional data assignments
    level = 1000, 850, 700, 500;
    lat = 20, 30, 40, 50, 60;
    lon = -160, -140, -118, -96, -84, -52, -45, -35, -25, -15;
    time = 12;
    rh = .5, .2, .4, .2, .3, .2, .4, .5, .6, .7,
         .1, .3, .1, .1, .1, .1, .5, .7, .8, .8,
         .1, .2, .2, .2, .2, .5, .7, .8, .9, .9,
         .1, .2, .3, .3, .3, .3, .7, .8, .9, .9,
          0, .1, .2, .4, .4, .4, .4, .7, .9, .9;   // 1 record allocated
}
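For concreteness, the following is a minimal sketch of reading the rh variable from a file like the one above using the serial netCDF C API; the file name "sample.nc" is assumed, and error handling is reduced to a single check.

    #include <stdio.h>
    #include <netcdf.h>

    int main(void)
    {
        int ncid, varid;
        float rh[1][5][10];   /* one record of rh(time,lat,lon) */

        /* open the self-describing dataset read-only */
        if (nc_open("sample.nc", NC_NOWRITE, &ncid) != NC_NOERR)
            return 1;

        /* look up the variable by name and read the whole (one-record) array */
        nc_inq_varid(ncid, "rh", &varid);
        nc_get_var_float(ncid, varid, &rh[0][0][0]);

        printf("rh[0][0][0] = %g\n", rh[0][0][0]);
        nc_close(ncid);
        return 0;
    }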

NetCDF File Format
File header
– Stores metadata for fixed-size arrays: number of arrays, dimension lists, global attribute list, etc.
Array data
– Fixed-size arrays: stored contiguously in the file
– Variable-size arrays: records from all variable-sized arrays are stored interleaved
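As an illustration of this layout (not library code), the file offset of one element of a fixed-size array follows from the variable's starting offset recorded in the header and a row-major flattening of its indices; the function and parameter names below are invented for the example.

    #include <stddef.h>

    /* shape[] holds the dimension lengths, index[] one coordinate per dimension */
    static long long element_offset(long long var_begin, int ndims,
                                    const int shape[], const int index[],
                                    size_t elem_size)
    {
        long long linear = 0;
        for (int d = 0; d < ndims; d++)
            linear = linear * shape[d] + index[d];   /* row-major flattening */
        return var_begin + linear * (long long)elem_size;
    }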

NetCDF APIs
Dataset APIs
– Create/open/close a dataset, set the dataset to define/data mode, and synchronize dataset changes to disk
Define mode APIs
– Define a dataset: add dimensions and variables
Attribute APIs
– Add, change, and read attributes of datasets
Inquiry APIs
– Inquire dataset metadata: dim(id, name, len), var(name, ndims, shape, id)
Data mode APIs
– Read/write a variable (access methods: single value, whole array, subarray, strided subarray, sampled subarray)
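A minimal serial sketch tying these API groups together; the file name, dimension sizes, and attribute are made up for illustration, and error checking is omitted.

    #include <string.h>
    #include <netcdf.h>

    int main(void)
    {
        int ncid, dim_lat, dim_lon, dimids[2], varid;
        float rh[5][10] = {{0}};

        nc_create("example.nc", NC_CLOBBER, &ncid);        /* dataset API */

        nc_def_dim(ncid, "lat", 5, &dim_lat);              /* define mode APIs */
        nc_def_dim(ncid, "lon", 10, &dim_lon);
        dimids[0] = dim_lat;
        dimids[1] = dim_lon;
        nc_def_var(ncid, "rh", NC_FLOAT, 2, dimids, &varid);

        nc_put_att_text(ncid, varid, "long_name",          /* attribute API */
                        strlen("relative humidity"), "relative humidity");

        nc_enddef(ncid);                                   /* switch to data mode */
        nc_put_var_float(ncid, varid, &rh[0][0]);          /* data mode API */
        nc_close(ncid);
        return 0;
    }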

Serial vs. Parallel netCDF
Serial netCDF
– Parallel read: implemented by simply having all processes read the file independently; does NOT utilize the native I/O provided by the parallel file system, so parallel optimizations are missed
– Sequential write: parallel writes are carried out by shipping data to a single process, which can overwhelm that process's memory capacity (a sketch of this pattern follows below)
Parallel netCDF
– Parallel read/write to a shared netCDF file
– Built on top of MPI-IO, which utilizes the optimal I/O facilities provided by the parallel file system
– Can pass high-level access hints down to the file system for further optimization
[Figure: processes P0-P3 accessing a parallel file system through serial netCDF on one process vs. directly through parallel netCDF.]
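The sequential-write bottleneck above can be sketched as follows (function and variable names are illustrative): every process ships its block to rank 0, which alone calls the serial netCDF write.

    #include <stdlib.h>
    #include <mpi.h>
    #include <netcdf.h>

    void write_via_rank0(MPI_Comm comm, int ncid, int varid,
                         float *local, int local_count)
    {
        int rank, nprocs;
        float *all = NULL;

        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &nprocs);

        if (rank == 0)   /* rank 0 must hold every process's data at once */
            all = malloc((size_t)nprocs * local_count * sizeof(float));

        MPI_Gather(local, local_count, MPI_FLOAT,
                   all, local_count, MPI_FLOAT, 0, comm);

        if (rank == 0) {
            nc_put_var_float(ncid, varid, all);   /* single-process I/O */
            free(all);
        }
    }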

Design of the Parallel netCDF APIs
Goals
– Retain the original file format: applications using the original netCDF library can still access the same files
– A new set of parallel APIs, prefixed "ncmpi_" (C) and "nfmpi_" (Fortran)
– Similar APIs: minimal changes from the original APIs for easy migration (see the side-by-side sketch below)
– Portable across machines
– High performance: tune the APIs to provide better performance in today's computing environments
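As a sketch of the "similar APIs" goal, the parallel call below mirrors the serial call shown in its comment; the subarray shape is made up, and only the prefix, the "_all" collective suffix, and the MPI_Offset index arrays differ.

    #include <mpi.h>
    #include <pnetcdf.h>

    /* Parallel counterpart of the serial call
     *     nc_put_vara_float(ncid, varid, start, count, buf);
     * same argument style, "ncmpi_" prefix, "_all" suffix for a collective
     * call, and MPI_Offset (instead of size_t) index arrays.
     */
    void write_block(int ncid, int varid, const float *buf)
    {
        MPI_Offset start[3] = {0, 0, 0};
        MPI_Offset count[3] = {4, 5, 10};   /* made-up subarray shape */

        ncmpi_put_vara_float_all(ncid, varid, start, count, buf);
    }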

Parallel File System
A parallel file system consists of multiple I/O nodes
– Increases bandwidth between compute nodes and I/O nodes
Each I/O node may contain more than one disk
– Increases bandwidth between disks and I/O nodes
A file is striped across all disks in a round-robin fashion
– Maximizes the opportunity for parallel access (see the striping sketch below)
[Figure: compute nodes connected through a switch network to I/O servers, with a file striped across the servers' disks.]
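A small arithmetic illustration of round-robin striping; the stripe size and server count are assumed values, not properties of any particular file system.

    #include <stdio.h>

    int main(void)
    {
        const long long stripe_size = 64 * 1024;   /* assumed 64 KB stripe unit */
        const int num_servers = 4;                 /* assumed number of I/O servers */

        /* consecutive stripe-sized chunks rotate across the I/O servers,
           so a large contiguous access touches all servers in parallel */
        for (long long offset = 0; offset < 6 * stripe_size; offset += stripe_size) {
            int server = (int)((offset / stripe_size) % num_servers);
            printf("file offset %lld -> I/O server %d\n", offset, server);
        }
        return 0;
    }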

Parallel netCDF and MPI-IO
The parallel netCDF APIs are the application's interface to the parallel file system
Parallel netCDF is implemented on top of MPI-IO
ROMIO is an implementation of the MPI-IO standard
ROMIO is built on top of ADIO
ADIO has implementations for various file systems, using optimal native I/O calls
[Figure: software stack on the compute nodes (parallel netCDF over ROMIO over ADIO in user space), with I/O servers across the switch network in file-system space.]
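For reference, this is roughly what the MPI-IO layer underneath looks like to a program: each process writes its own block of a shared file with a collective call at an explicit offset. The file name and block size are illustrative only.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        double block[1024];                     /* this process's data */
        MPI_File fh;
        MPI_Offset offset;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < 1024; i++)
            block[i] = rank;

        MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* rank r owns bytes [r*8192, (r+1)*8192) of the shared file */
        offset = (MPI_Offset)rank * (MPI_Offset)sizeof(block);
        MPI_File_write_at_all(fh, offset, block, 1024, MPI_DOUBLE,
                              MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }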

Parallel API Implementations
Dataset APIs
– Collective calls
– Add an MPI communicator to define the scope of the I/O processes
– Add MPI_Info to pass access hints for further optimization
Define mode APIs
– Collective calls
Attribute APIs
– Collective calls
Inquiry APIs
– Collective calls
Data mode APIs
– Collective mode (default): ensures file consistency
– Independent mode

File create/open:
    ncmpi_create/open(MPI_Comm comm,
                      const char *path,
                      int cmode,
                      MPI_Info info,
                      int *ncidp);

Switching in and out of independent data mode:
    ncmpi_begin_indep_data(int ncid);
    ncmpi_end_indep_data(int ncid);
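A hedged end-to-end sketch of the collective dataset calls above: the communicator defines which processes participate, MPI_Info carries hints (the hint key shown is a common ROMIO key, used here only as an example), and the begin/end calls bracket independent data mode.

    #include <mpi.h>
    #include <pnetcdf.h>

    int main(int argc, char **argv)
    {
        int ncid;
        MPI_Info info;

        MPI_Init(&argc, &argv);

        MPI_Info_create(&info);
        MPI_Info_set(info, "cb_buffer_size", "8388608");   /* example access hint */

        /* collective create over all processes in the communicator */
        ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER, info, &ncid);

        /* ... collective define-mode calls would go here ... */
        ncmpi_enddef(ncid);

        ncmpi_begin_indep_data(ncid);    /* switch into independent data mode */
        /* ... independent reads/writes ... */
        ncmpi_end_indep_data(ncid);      /* back to collective (default) mode */

        ncmpi_close(ncid);
        MPI_Info_free(&info);
        MPI_Finalize();
        return 0;
    }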

Data Mode APIs
Collective and independent calls
– Distinguished by the presence or absence of the "_all" suffix
High-level APIs
– Mimic the original netCDF APIs
– Easy migration path to the parallel interface
– Map netCDF access types to MPI derived datatypes
Flexible APIs
– Better handling of internal data representations
– More fully expose the capabilities of MPI-IO to the programmer

High-level APIs:
    ncmpi_put/get_vars_<type>_all(int ncid,
                                  int varid,
                                  const MPI_Offset start[],
                                  const MPI_Offset count[],
                                  const MPI_Offset stride[],
                                  <type> *buf);

Flexible APIs:
    ncmpi_put/get_vars(int ncid,
                       int varid,
                       const MPI_Offset start[],
                       const MPI_Offset count[],
                       const MPI_Offset stride[],
                       void *buf,
                       MPI_Offset bufcount,
                       MPI_Datatype datatype);
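A short sketch contrasting the two families with a made-up 4 x 10 subarray: the same region is written once through a high-level call, whose element type is fixed by the function name, and once through a flexible call, whose buffer is described by an MPI datatype. Writing it twice is only for comparison.

    #include <mpi.h>
    #include <pnetcdf.h>

    void write_subarray(int ncid, int varid, float *buf)
    {
        MPI_Offset start[2] = {0, 0};
        MPI_Offset count[2] = {4, 10};

        /* high-level collective API: element type is part of the name */
        ncmpi_put_vara_float_all(ncid, varid, start, count, buf);

        /* flexible collective API: buffer layout given by count + MPI datatype */
        ncmpi_put_vara_all(ncid, varid, start, count, buf, 40, MPI_FLOAT);
    }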

LBNL Benchmark
Test suite
– Developed by Chris Ding et al. at LBNL
– Written in Fortran
– Simple block partition patterns: access to a 3D array stored in a single netCDF file (a sketch of one pattern follows below)
Running on the IBM SP2 at NERSC, LBNL
– Each compute node is an SMP with 16 processors
– I/O is performed using all processors
[Figure: the seven block-partitioning patterns of the 3D array among eight processors: X, Y, Z, XY, XZ, YZ, and XYZ partitions.]
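As one concrete example of these block patterns, the sketch below computes the start/count arrays each process would use for the 1-D Z partition of the 256 x 256 x 256 array, assuming the partitioned dimension is the first (slowest-varying) one and divides evenly among the processes.

    #include <mpi.h>

    /* illustrative helper: 1-D block ("Z") partition of a 256x256x256 array */
    void z_partition(int rank, int nprocs,
                     MPI_Offset start[3], MPI_Offset count[3])
    {
        const MPI_Offset N = 256;

        start[0] = (MPI_Offset)rank * (N / nprocs);   /* split the first dimension */
        start[1] = 0;
        start[2] = 0;

        count[0] = N / nprocs;   /* assumes nprocs divides 256 evenly */
        count[1] = N;
        count[2] = N;
    }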

LBNL Results – 64 MB
Array size: 256 x 256 x 256, real*4
Read
– In some cases, performance improves over the single-processor case
– An 8-processor parallel read is 2-3 times faster than serial netCDF
Write
– Performance is worse than serial netCDF, by a factor of 7-8

Our Results – 64 MB
Array size: 256 x 256 x 256, real*4
Run on the IBM SP2 at SDSC
I/O is performed using one processor per node

LBNL Results – 1 GB
Array size: 512 x 512 x 512, real*8
Read
– No performance improvement is observed
Write
– Writing with 4-8 processors yields 2-3 times higher bandwidth than using a single processor

Our Results – 1 GB
Array size: 512 x 512 x 512, real*8
Run on the IBM SP2 at SDSC
I/O is performed using one processor per node

Summary
Complete the parallel C APIs
Identify friendly users
– ORNL, LBNL
User reference manual
Preliminary performance results
– Using LBNL test suite: typical access patterns
– Obtained scalable results