I/O on Clusters
Rajeev Thakur, Argonne National Laboratory

State of Affairs
- I/O has been recognized as a problem for several years
- Still no good answer to the question: what should I use for I/O on my cluster?
- Partial solutions exist for parts of the problem
- There is no mainstream solution that you can adopt blindly

What are the Requirements?
Two distinct requirements:
- Home directories, globally visible across the cluster
  - For executables, parameter files, and small output
  - Reliable; need to be backed up
  - Less need for concurrent writes to the same file
- Parallel I/O
  - For large inputs and large outputs
  - High-bandwidth concurrent writes and reads to the same file
  - Performance critical
Even some vendors are not clear about these two requirements

What are the Current Solutions?
Can one file system be used for both? In theory yes; in practice no. The two need to be physically separated for performance.
- Home directories
  - Traditionally NFS, which works at small scale
  - An open question at large scale; perhaps some NFS-GFS combination?

What are the Current Solutions? (contd.)
Parallel I/O:
- IBM GPFS
  - A really good paper on it at FAST
  - I haven't used it myself, and I haven't heard how well it works on large Linux clusters
- PVFS (Argonne, Clemson)
  - Fast: measured bandwidth of more than 2 GB/s
  - Needs more work in the areas of reliability and management tools
- PVFS2 (Argonne, Clemson)
  - Under development
  - Improved performance; adds reliability and management tools
- Lustre
  - Under development

Using Parallel I/O
Given a good parallel file system and sufficient I/O hardware, what can users or libraries do to get good performance?
- Wherever possible, make large concurrent I/O requests
- If that is not possible, make a single collective request for the noncontiguous data instead of many small requests (using MPI-IO, for example)

Example: Distributed Array Access
A 2D array is distributed in blocks among four processes (P0, P1, P2, P3); the file contains the global array in row-major order.
[Figure: the global array divided into one block per process, alongside its row-major layout in the file]

Don't Do This
Each process makes one independent read request for each row in its local array (as with Unix I/O):

    MPI_File_open(..., file, ..., &fh);
    for (i = 0; i < n_local_rows; i++) {
        MPI_File_seek(fh, ...);
        MPI_File_read(fh, &(A[i][0]), ...);
    }
    MPI_File_close(&fh);
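For concreteness, a filled-in version of this anti-pattern might look like the sketch below. The file name, global array size, 2x2 process grid, and use of doubles are illustrative assumptions, not details from the original slides.

    /* Hypothetical sketch: independent row-by-row reads of a block of a
     * GROWS x GCOLS global array of doubles stored in row-major order.
     * Assumes 4 processes arranged as a 2x2 grid with equal-sized blocks. */
    #include <mpi.h>
    #include <stdlib.h>

    #define GROWS 1024
    #define GCOLS 1024

    int main(int argc, char **argv)
    {
        int rank;
        MPI_File fh;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int prows = 2, pcols = 2;                  /* assumed 2x2 process grid */
        int lrows = GROWS / prows, lcols = GCOLS / pcols;
        int prow = rank / pcols, pcol = rank % pcols;

        double *A = malloc((size_t)lrows * lcols * sizeof(double));

        MPI_File_open(MPI_COMM_WORLD, "datafile", MPI_MODE_RDONLY,
                      MPI_INFO_NULL, &fh);

        /* One small independent request per local row: with 4 processes this
         * is 4 * lrows separate reads, each of only lcols doubles. */
        for (int i = 0; i < lrows; i++) {
            MPI_Offset offset =
                ((MPI_Offset)(prow * lrows + i) * GCOLS + pcol * lcols)
                * sizeof(double);
            MPI_File_seek(fh, offset, MPI_SEEK_SET);
            MPI_File_read(fh, &A[i * lcols], lcols, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
        }

        MPI_File_close(&fh);
        free(A);
        MPI_Finalize();
        return 0;
    }

Note that each process has to compute the file offset of every row itself; that bookkeeping disappears in the collective version on the next slide.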

Do This!
Each process defines a noncontiguous file view and calls a collective I/O function:

    MPI_Type_create_subarray(..., &subarray);
    MPI_Type_commit(&subarray);
    MPI_File_open(MPI_COMM_WORLD, file, ..., &fh);
    MPI_File_set_view(fh, ..., subarray, ...);
    MPI_File_read_all(fh, A, ...);
    MPI_File_close(&fh);
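A complete version of this collective approach, under the same illustrative assumptions as the previous sketch (a GROWS x GCOLS array of doubles, a 2x2 process grid, equal-sized blocks, a hypothetical file name), might look like this:

    /* Hypothetical sketch: each process reads its block of the global array
     * through a subarray file view and a single collective read. */
    #include <mpi.h>
    #include <stdlib.h>

    #define GROWS 1024
    #define GCOLS 1024

    int main(int argc, char **argv)
    {
        int rank;
        MPI_File fh;
        MPI_Datatype subarray;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int prows = 2, pcols = 2;
        int lrows = GROWS / prows, lcols = GCOLS / pcols;
        int prow = rank / pcols, pcol = rank % pcols;

        int gsizes[2] = {GROWS, GCOLS};               /* global array shape */
        int lsizes[2] = {lrows, lcols};               /* local block shape  */
        int starts[2] = {prow * lrows, pcol * lcols}; /* block origin       */

        /* Describe this process's block of the global array. */
        MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                                 MPI_ORDER_C, MPI_DOUBLE, &subarray);
        MPI_Type_commit(&subarray);

        double *A = malloc((size_t)lrows * lcols * sizeof(double));

        MPI_File_open(MPI_COMM_WORLD, "datafile", MPI_MODE_RDONLY,
                      MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, 0, MPI_DOUBLE, subarray, "native",
                          MPI_INFO_NULL);

        /* One collective call describes the entire access pattern. */
        MPI_File_read_all(fh, A, lrows * lcols, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Type_free(&subarray);
        free(A);
        MPI_Finalize();
        return 0;
    }

Because the collective read sees every process's request at once, the MPI-IO implementation can merge the many small per-row accesses into a few large, well-aligned ones (for example, via collective buffering).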

High-Level I/O Libraries
Libraries have an even greater responsibility to do it right!
- Need the right API in the first place: collective, noncontiguous
- Use MPI-IO the right way
- Minimize small metadata accesses and updates
Example: the Parallel netCDF library being developed at Argonne and Northwestern (see the sketch below)
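As a rough illustration of what such a library looks like to the application, here is a sketch of a collective write of a block-distributed 2D variable using the Parallel netCDF (PnetCDF) C API. The file name, dimension and variable names, array sizes, and 2x2 decomposition are assumptions made for this sketch, not details from the slides.

    /* Hypothetical sketch: collective PnetCDF write of a block-distributed
     * 2D variable; error checking omitted for brevity. */
    #include <mpi.h>
    #include <pnetcdf.h>
    #include <stdlib.h>

    #define GROWS 1024
    #define GCOLS 1024

    int main(int argc, char **argv)
    {
        int rank, ncid, dimids[2], varid;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int prows = 2, pcols = 2;
        int lrows = GROWS / prows, lcols = GCOLS / pcols;
        int prow = rank / pcols, pcol = rank % pcols;

        /* Define the global array once, collectively; the library handles
         * the header (metadata) I/O. */
        ncmpi_create(MPI_COMM_WORLD, "data.nc", NC_CLOBBER,
                     MPI_INFO_NULL, &ncid);
        ncmpi_def_dim(ncid, "rows", GROWS, &dimids[0]);
        ncmpi_def_dim(ncid, "cols", GCOLS, &dimids[1]);
        ncmpi_def_var(ncid, "temperature", NC_DOUBLE, 2, dimids, &varid);
        ncmpi_enddef(ncid);

        double *A = calloc((size_t)lrows * lcols, sizeof(double));

        /* Each process writes its block with one collective call; the
         * library maps this onto MPI-IO underneath. */
        MPI_Offset start[2] = {prow * lrows, pcol * lcols};
        MPI_Offset count[2] = {lrows, lcols};
        ncmpi_put_vara_double_all(ncid, varid, start, count, A);

        ncmpi_close(ncid);
        free(A);
        MPI_Finalize();
        return 0;
    }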

What is Needed…
In addition to performance, we need software that is:
- Self-recovering
- Self-optimizing
It is time we wrote software that is “smart”