SciDAC All Hands Meeting, March 2-3, 2005
Northwestern University PIs: Alok Choudhary, Wei-keng Liao
Graduate Students: Avery Ching, Kenin Coloma, Jianwei Li
ANL Collaborators: Bill Gropp, Rob Ross, Rajeev Thakur, Rob Latham
Progress in Storage Efficient Access: PnetCDF and MPI-I/O

Outline
Parallel netCDF
– Building blocks
– Status report
– Users and applications
– Future work
MPI I/O file caching sub-system
– Enhanced client-side file caching for parallel applications
– A scalable approach for enforcing file consistency and atomicity

Parallel netCDF
Goals
– Design parallel APIs
– Keep the same file format: backward compatible and easy to migrate from serial netCDF, with similar API names and argument lists but parallel semantics (a usage sketch follows this slide)
Tasks
– Build on top of MPI for portability and high performance, taking advantage of existing MPI-IO optimizations (collective I/O, etc.)
– Add functionality for sophisticated I/O patterns: a new set of flexible APIs incorporates MPI derived datatypes to describe the mapping between memory and file data layouts
– Support C and Fortran interfaces
– Support external data representations across platforms
[Architecture diagram: compute nodes connect through a switch network to I/O servers; Parallel netCDF sits in user space on top of ROMIO and ADIO, above the file system space.]
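A minimal sketch of the high-level parallel API mirroring serial netCDF: every process collectively writes its own slab of a shared variable. The file, dimension, and variable names are illustrative, and error checking is omitted.

```c
#include <mpi.h>
#include <pnetcdf.h>

int main(int argc, char **argv) {
    int rank, nprocs, ncid, dimids[2], varid;
    MPI_Offset start[2], count[2];
    float data[10];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* All processes collectively create the file; the call mirrors nc_create
     * but adds a communicator and an MPI info object. */
    ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER, MPI_INFO_NULL, &ncid);
    ncmpi_def_dim(ncid, "y", nprocs, &dimids[0]);
    ncmpi_def_dim(ncid, "x", 10, &dimids[1]);
    ncmpi_def_var(ncid, "var", NC_FLOAT, 2, dimids, &varid);
    ncmpi_enddef(ncid);

    for (int i = 0; i < 10; i++) data[i] = (float)rank;

    /* Each process writes one row of the 2-D variable with a collective call. */
    start[0] = rank;  start[1] = 0;
    count[0] = 1;     count[1] = 10;
    ncmpi_put_vara_float_all(ncid, varid, start, count, data);

    ncmpi_close(ncid);
    MPI_Finalize();
    return 0;
}
```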

PnetCDF Current Status
High-level APIs (mimicking the serial netCDF API)
– Fully supported in both C and Fortran
Flexible APIs (extended to utilize MPI derived datatypes; see the sketch below)
– Allow complex memory layouts for mapping between the I/O buffer and file space
– Support varm routines (strided memory layout) ported from serial netCDF
– Support array shuffles, e.g. transposition
Test suites
– C and Fortran self-test codes ported from the Unidata netCDF package to validate against single-process results
– Parallel test codes for both sets of APIs
Latest release is v0.9.4
– Pre-release v1.0: sync with netCDF v3.6.0 (newest release from Unidata); parallel API user manual
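A hedged sketch of the flexible API: an MPI derived datatype describes a strided, non-contiguous memory layout, and one flexible call maps it to a rectangular region of the file. It assumes ncid and varid refer to an open 2-D NC_FLOAT variable with at least nprocs*NROWS rows and NCOLS columns, created much as in the previous sketch; the ghost-cell sizes are illustrative.

```c
#include <mpi.h>
#include <pnetcdf.h>

#define NROWS 4
#define NCOLS 10

void write_interior(int ncid, int varid, int rank)
{
    float buf[NROWS][NCOLS + 2];      /* local rows with one ghost column on each side */
    MPI_Datatype interior;
    MPI_Offset start[2], count[2];

    for (int i = 0; i < NROWS; i++)
        for (int j = 0; j < NCOLS + 2; j++)
            buf[i][j] = (float)rank;

    /* NROWS blocks of NCOLS floats, separated by a stride of NCOLS+2 floats:
     * the interior of every row, skipping the ghost columns. */
    MPI_Type_vector(NROWS, NCOLS, NCOLS + 2, MPI_FLOAT, &interior);
    MPI_Type_commit(&interior);

    start[0] = (MPI_Offset)rank * NROWS;  start[1] = 0;
    count[0] = NROWS;                     count[1] = NCOLS;

    /* Flexible API: the memory layout is given by (bufcount, buftype)
     * instead of being assumed contiguous. */
    ncmpi_put_vara_all(ncid, varid, start, count, &buf[0][1], 1, interior);

    MPI_Type_free(&interior);
}
```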

PnetCDF Users and Applications
FLASH – astrophysical thermonuclear application from the ASCI/Alliances Center at the University of Chicago
ACTM – Atmospheric Chemical Transport Model from LLNL
ROMS – Regional Ocean Model System, from the NCSA HDF group
ASPECT – data understanding infrastructure from ORNL
pVTK – parallel visualization toolkit from ORNL
PETSc – Portable, Extensible Toolkit for Scientific Computation from ANL
PRISM – PRogram for Integrated Earth System Modeling

PnetCDF Future Work
Data type conversion for external data representation
– Reduce intermediate memory-copy operations (int64 ↔ int32, little-endian ↔ big-endian, int ↔ double)
– Data type caching at the PnetCDF level (without repeated decoding)
I/O hints (see the sketch below)
– Currently, only MPI hints (MPI file info) are supported
– Need netCDF-level hints, e.g. the access-sequence patterns for multiple arrays
Non-blocking I/O
Large array support (dimensionality > )
More flexible and extensible file format
– Allow adding new objects dynamically
– Store arrays of structured data types, such as C structs
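Since hints currently reach PnetCDF only through the MPI file info object, the sketch below shows how an application might pass them at file creation. The hint names are standard MPI-IO collective-buffering hints; the values are purely illustrative.

```c
MPI_Info info;
int ncid;

MPI_Info_create(&info);
/* Standard MPI-IO hints; values are examples only. */
MPI_Info_set(info, "cb_buffer_size", "16777216");  /* 16 MB collective buffer */
MPI_Info_set(info, "cb_nodes", "4");               /* number of I/O aggregators */

/* PnetCDF passes the info object through to MPI-IO underneath. */
ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER, info, &ncid);
MPI_Info_free(&info);
```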

Client-side File Caching for MPI I/O
Traditional client-side file caching
– Treats each client independently, targeting distributed environments
– Inadequate for parallel environments, where clients are usually related to one another (e.g. they read/write shared files)
Collective caching
– Application processes cooperate to perform data caching and coherence control, leaving the I/O servers out of the task (the sketch below shows the shared-file access pattern this targets)
[Diagram: client processors pool their local cache buffers into a global cache pool, connected by a network interconnect to the I/O servers.]
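For context, a minimal MPI-IO sketch (not part of the caching subsystem itself) of the shared-file pattern described above: every process opens the same file and collectively writes its own block of a shared region. The file name and block size are illustrative.

```c
#include <mpi.h>
#include <string.h>

#define BLOCK 1048576                       /* 1 MB per process, illustrative */

int main(int argc, char **argv) {
    int rank;
    static char buf[BLOCK];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    memset(buf, rank & 0xff, BLOCK);

    /* All processes share one handle on the same file. */
    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);

    /* Each process writes its own block of the shared file collectively;
     * with client-side caching, neighboring blocks may be cached on different
     * processes, which is exactly what coherence control must handle. */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * BLOCK, buf, BLOCK,
                          MPI_CHAR, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```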

Design of Collective Caching
The caching sub-system is implemented in user space
– Built at the MPI I/O level → portable across different file systems
Distributed management
– For cache metadata and lock control (vs. a centralized manager)
Two designs (an RMA sketch follows this slide):
– Using an I/O thread
– Using the MPI remote-memory-access (RMA) facility
[Diagrams: the caching layer sits inside the MPI library between the application process and the client-side file system in user space, with the server-side file system across the network in system space. The file is logically partitioned into blocks whose status metadata is distributed round-robin across processes P0-P3 (blocks 0, 4, 8 on P0; 1, 5, 9 on P1; ...), and each process contributes local memory pages to a global cache pool.]
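A hedged sketch, not the actual implementation, of the RMA-based design: each process exposes its slice of the round-robin-distributed block-status metadata through an MPI window, and a client reads or updates a remote entry with passive-target RMA. The BlockStatus layout and the owner computation are assumptions for illustration; a real implementation also needs atomic read-modify-write for lock control, which takes more than the two separate epochs shown here.

```c
#include <mpi.h>

/* Illustrative per-block metadata entry; the real layout is not given in the slides. */
typedef struct {
    int owner_rank;     /* which process caches this block, -1 if uncached */
    int dirty;          /* has the cached copy been modified?              */
} BlockStatus;

#define LOCAL_ENTRIES 1024          /* metadata slots this process manages */

static BlockStatus local_meta[LOCAL_ENTRIES];
static MPI_Win meta_win;

void meta_init(MPI_Comm comm) {
    /* Each process exposes its slice of the distributed metadata table. */
    MPI_Win_create(local_meta, LOCAL_ENTRIES * sizeof(BlockStatus),
                   sizeof(BlockStatus), MPI_INFO_NULL, comm, &meta_win);
}

/* Fetch the status of a file block; metadata for block b lives on
 * process b % nprocs at displacement b / nprocs (round-robin).      */
BlockStatus meta_lookup(int block, int nprocs) {
    BlockStatus st;
    int target = block % nprocs;
    MPI_Aint disp = block / nprocs;

    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, meta_win);
    MPI_Get(&st, sizeof(BlockStatus), MPI_BYTE,
            target, disp, sizeof(BlockStatus), MPI_BYTE, meta_win);
    MPI_Win_unlock(target, meta_win);   /* the Get completes here */
    return st;
}

/* Overwrite the status of a file block on its owner process. */
void meta_update(int block, int nprocs, BlockStatus st) {
    int target = block % nprocs;
    MPI_Aint disp = block / nprocs;

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, target, 0, meta_win);
    MPI_Put(&st, sizeof(BlockStatus), MPI_BYTE,
            target, disp, sizeof(BlockStatus), MPI_BYTE, meta_win);
    MPI_Win_unlock(target, meta_win);
}
```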

Performance Results 1
IBM SP at SDSC using GPFS
– System peak performance: 2.1 GB/s for reads, 1 GB/s for writes
Sliding-window benchmark
– I/O requests are overlapped
– Overlap can cause cache coherence problems

Performance Results 2
[Charts: I/O bandwidth (MB/s) vs. number of nodes for FLASH I/O (8 x 8 x ... and 16 x 16 x ... block sizes) and the BTIO benchmark (class A and class B), comparing the original implementation with collective caching.]
BTIO benchmark
– From the NAS Parallel Benchmarks version 2.4 (NASA Ames Research Center)
– Block tri-diagonal array partitioning pattern
– Uses MPI collective I/O calls
– I/O requests are not overlapped
FLASH I/O benchmark
– From the ASCI/Alliances Center at the University of Chicago
– Access pattern is non-contiguous both in memory and in file
– Uses HDF5
– I/O requests are not overlapped