SDM Center High-Performance Parallel I/O Libraries
(PI) Alok Choudhary, (Co-I) Wei-Keng Liao, Northwestern University
In collaboration with the SEA Group (Group Leader: Rob Ross, ANL)

SDM Center: Parallel NetCDF
NetCDF defines:
- A portable file format
- A set of APIs for file access
Parallel netCDF:
- New APIs for parallel access
- Maintains the same file format
Tasks:
- Built on top of MPI for portability and high performance
- Supports C and Fortran interfaces
[Figure: software stack. Applications on the compute nodes call Parallel netCDF, which is layered on MPI-IO and the client-side file system and communicates over the network with the I/O servers.]
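To make the API style concrete, the following is a minimal sketch (not taken from the slides) of a collective write through the PnetCDF high-level C interface; the file name, variable name, and sizes are hypothetical, and error checking is omitted.

```c
/* Minimal PnetCDF write sketch (illustrative; names and sizes are hypothetical).
 * Error checking of return codes is omitted for brevity. */
#include <mpi.h>
#include <pnetcdf.h>

int main(int argc, char **argv) {
    int rank, nprocs, ncid, dimid, varid;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Collective create: every process opens the same file through MPI-IO. */
    ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER | NC_64BIT_OFFSET,
                 MPI_INFO_NULL, &ncid);

    /* Define a 1-D variable of length nprocs * 100 (define mode is collective). */
    MPI_Offset len = (MPI_Offset)nprocs * 100;
    ncmpi_def_dim(ncid, "x", len, &dimid);
    ncmpi_def_var(ncid, "var", NC_FLOAT, 1, &dimid, &varid);
    ncmpi_enddef(ncid);

    /* Each process writes its own 100-element block collectively. */
    float buf[100];
    for (int i = 0; i < 100; i++) buf[i] = (float)rank;
    MPI_Offset start = (MPI_Offset)rank * 100, count = 100;
    ncmpi_put_vara_float_all(ncid, varid, &start, &count, buf);

    ncmpi_close(ncid);
    MPI_Finalize();
    return 0;
}
```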

SDM Center: Parallel NetCDF - Status
- Latest version released on Dec. 7, 2005
- Web page receives 200 page views a day
- Supported platforms: Linux clusters, IBM SP, BG/L, SGI Origin, Cray X, NEC SX
- Two sets of parallel APIs:
  - High-level APIs (mimicking the serial netCDF APIs)
  - Flexible APIs (utilizing MPI derived datatypes)
- Support for large files (> 2 GB)
- Test suites: self-test codes ported from the Unidata netCDF package to validate against single-process results
- New data analysis APIs: basic statistical functions (min, max, mean, median, variance, deviation)
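The difference between the two API sets can be illustrated with the flexible interface, which accepts an MPI derived datatype describing a non-contiguous memory buffer. The sketch below is illustrative only; the ghost-cell layout, sizes, and file offsets are hypothetical.

```c
/* Sketch of the PnetCDF "flexible" API: the in-memory layout is described by
 * an MPI derived datatype, here a 2-D local array with one ghost cell on each
 * side. Sizes and file offsets are hypothetical; error checking is omitted. */
#include <mpi.h>
#include <pnetcdf.h>

void write_interior(int ncid, int varid, int rank)
{
    /* Local buffer is 12x12 including ghost cells; only the interior 10x10 is written. */
    float local[12][12];
    for (int i = 0; i < 12; i++)
        for (int j = 0; j < 12; j++)
            local[i][j] = (float)rank;

    int sizes[2]    = {12, 12};   /* full in-memory array  */
    int subsizes[2] = {10, 10};   /* interior region       */
    int starts[2]   = {1, 1};     /* skip the ghost layer  */
    MPI_Datatype interior;
    MPI_Type_create_subarray(2, sizes, subsizes, starts,
                             MPI_ORDER_C, MPI_FLOAT, &interior);
    MPI_Type_commit(&interior);

    /* File-space selection: a 10x10 tile per process. */
    MPI_Offset start[2] = {0, (MPI_Offset)rank * 10};
    MPI_Offset count[2] = {10, 10};

    /* One instance of the derived datatype describes the whole
     * non-contiguous memory buffer. */
    ncmpi_put_vara_all(ncid, varid, start, count, local, 1, interior);

    MPI_Type_free(&interior);
}
```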

SDM Center: Illustrative PnetCDF Users
- FLASH - astrophysical thermonuclear application from the ASCI/Alliances Center at the University of Chicago
- ACTM - atmospheric chemical transport model, LLNL
- WRF - Weather Research and Forecast modeling system, NCAR
- WRF-ROMS - Regional Ocean Modeling System I/O module from the Scientific Data Technologies group, NCSA
- ASPECT - data understanding infrastructure, ORNL
- pVTK - parallel visualization toolkit, ORNL
- PETSc - Portable, Extensible Toolkit for Scientific Computation, ANL
- PRISM - PRogram for Integrated Earth System Modeling; users from C&C Research Laboratories, NEC Europe Ltd.
- ESMF - Earth System Modeling Framework, National Center for Atmospheric Research
- CMAQ - Community Multiscale Air Quality code I/O module, SNL
- More ...

SDM Center: PnetCDF Future Work
- Non-blocking I/O
  - Built on top of non-blocking MPI-IO (see the sketch below)
- Improve data type conversion
  - Type conversion while packing non-contiguous buffers
- Data analysis APIs
  - Statistical functions
  - Histogram functions
  - Range queries: regional sum, min, max, mean, ...
  - Data transformation: DFT, FFT
- Collaboration with application users
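Since the planned non-blocking PnetCDF calls are to be layered on non-blocking MPI-IO, the sketch below shows only the underlying MPI mechanism (posting a write, overlapping computation, completing it later), not the eventual PnetCDF interface; the file name and offset are hypothetical.

```c
/* Sketch of the non-blocking MPI-IO calls that a non-blocking PnetCDF layer
 * could build on: the write is posted, computation overlaps it, and the
 * request is completed later. File name and offset are hypothetical. */
#include <mpi.h>

void overlap_write(MPI_Comm comm, const double *buf, int count, MPI_Offset offset)
{
    MPI_File fh;
    MPI_Request req;
    MPI_Status status;

    MPI_File_open(comm, "scratch.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Post the write without waiting for it to finish. */
    MPI_File_iwrite_at(fh, offset, (void *)buf, count, MPI_DOUBLE, &req);

    /* ... computation that does not touch buf can proceed here ... */

    /* Complete the outstanding I/O before reusing the buffer or closing. */
    MPI_Wait(&req, &status);
    MPI_File_close(&fh);
}
```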

SDM Center: MPI-IO Caching
Client-side file caching:
- Reduces client-server communication costs
- Enables write-behind to better utilize network bandwidth
- Avoids file system locking overhead by aligning I/O with the file block size (or stripe size); see the hint sketch below
Prototype in ROMIO:
- Collaborative caching by the group of MPI processes
- A complete caching subsystem in the MPI library:
  - Data consistency and cache coherence control
  - Distributed file locking
  - Memory management for data caching, eviction, and migration
- Applicable to both MPI collective and independent I/O
Two implementations:
- Creating an I/O thread in each MPI process
- Using the MPI RMA (remote memory access) facility
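The caching subsystem itself lives inside ROMIO, but the alignment idea can be seen from the application side: ROMIO accepts striping and collective-buffering hints through an MPI_Info object, as in the sketch below. This is not the caching prototype; the hint values are hypothetical and depend on the file system.

```c
/* Not the caching prototype itself: a sketch of how an application passes
 * alignment and collective-buffering hints to ROMIO so that collective I/O
 * lines up with the file system stripe size. Values are hypothetical. */
#include <mpi.h>

MPI_File open_aligned(MPI_Comm comm, const char *path)
{
    MPI_Info info;
    MPI_File fh;

    MPI_Info_create(&info);
    /* Standard ROMIO hints: stripe size and collective buffer size. */
    MPI_Info_set(info, "striping_unit",  "1048576");   /* 1 MB stripes   */
    MPI_Info_set(info, "cb_buffer_size", "16777216");  /* 16 MB buffers  */
    MPI_Info_set(info, "romio_cb_write", "enable");    /* force collective buffering */

    MPI_File_open(comm, (char *)path,
                  MPI_MODE_CREATE | MPI_MODE_RDWR, info, &fh);
    MPI_Info_free(&info);
    return fh;
}
```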

SDM Center: FLASH I/O Benchmark
- The I/O kernel of the FLASH application, a block-structured adaptive mesh hydrodynamics code
- Each process writes 80 cubes
- I/O through HDF5; write-only operations
- The improvement is due to write-behind
- Aggregate output size by block size:

  Block size      16x16x16    32x32x32
  np = ...        ... GB       9.13 GB
  np = ...        ... GB      18.26 GB
  np = ...        ... GB      36.53 GB
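FLASH I/O performs its writes through HDF5's MPI-IO driver with collective transfers; the sketch below shows that general pattern rather than the benchmark's actual datasets, and the dataset name and sizes are hypothetical.

```c
/* Minimal parallel HDF5 write in the style used by FLASH I/O: the file is
 * created with the MPI-IO driver and each process writes its block with a
 * collective transfer. Dataset name and sizes are hypothetical. */
#include <mpi.h>
#include <hdf5.h>

void write_block(MPI_Comm comm, int rank, int nprocs, const double *block)
{
    /* File access property list: use the MPI-IO virtual file driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
    hid_t file = H5Fcreate("flash_like.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One row of 1000 doubles per process. */
    hsize_t dims[2]  = {(hsize_t)nprocs, 1000};
    hsize_t start[2] = {(hsize_t)rank, 0};
    hsize_t count[2] = {1, 1000};

    hid_t filespace = H5Screate_simple(2, dims, NULL);
    hid_t dset = H5Dcreate2(file, "unk", H5T_NATIVE_DOUBLE, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t memspace = H5Screate_simple(2, count, NULL);

    /* Dataset transfer property list: request collective MPI-IO. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, block);

    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Pclose(fapl); H5Fclose(file);
}
```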

SDM Center: BTIO Benchmark
- Block tri-diagonal array partitioning
- 40 MPI collective writes followed by 40 collective reads
[Figure: a 3x3 process grid, P0,0 through P2,2; each local array is 4-D; the file view of process P2,0 is highlighted.]
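The full MPI-IO version of BTIO issues its collective writes against a file view built from a derived datatype. The sketch below shows that generic pattern with a simple 2-D decomposition rather than BTIO's diagonal block-tridiagonal partitioning; all sizes are hypothetical.

```c
/* Generic sketch of the collective-write pattern BTIO exercises: each process
 * sets a file view built from a subarray datatype and then calls a collective
 * write. The 2-D decomposition here is a simplification of BTIO's diagonal
 * block-tridiagonal partitioning; all sizes are hypothetical. */
#include <mpi.h>

void collective_step(MPI_Comm comm, const double *local, int rank, int nprocs)
{
    int gsizes[2] = {nprocs * 64, 64};   /* global array           */
    int lsizes[2] = {64, 64};            /* local tile             */
    int starts[2] = {rank * 64, 0};      /* this process's offset  */

    MPI_Datatype filetype;
    MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File fh;
    MPI_File_open(comm, "btio_like.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* The file view makes each process see only its own tile of the file. */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);

    /* One of the 40 collective writes in the benchmark loop. */
    MPI_File_write_all(fh, (void *)local, 64 * 64, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
}
```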