
Slide 1: Scaling Up Parallel I/O on the SP
David Skinner, NERSC Division, Berkeley Lab

Slide 2: Motivation
- NERSC uses GPFS for $HOME and $SCRATCH.
- Local disk filesystems on seaborg (/tmp) are tiny.
- Growing data sizes and concurrencies often outpace I/O methodologies.

Slide 3: seaborg.nersc.gov

Slide 4: Case Study: Data Intensive Computing at NERSC
- Binary black hole collisions.
- Finite differencing on a 1024x768x768x200 grid.
- Run on 64 NH2 nodes with 32 GB RAM each (2 TB total).
- Need to save regular snapshots of the full grid.
- The first full 3D calculation of inspiraling black holes, done at NERSC by Ed Seidel, Gabrielle Allen, Denis Pollney, and Peter Diener (Scientific American, April 2002).

Slide 5: Problems
- The binary black hole collision run uses a modified version of the Cactus code to solve Einstein's equations. Its choices for I/O are serial and MPI-I/O.
- CPU utilization suffers as time is lost to I/O.
- Variation in write times can be severe.

Slide 6: Finding solutions
- The data pattern is a common one.
- Survey the available I/O strategies to determine the write rate and the variation in that rate.
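
To make that survey concrete, a minimal timing harness along these lines could measure each task's write time and the spread across tasks. This is a sketch, not from the slides: write_snapshot() is a placeholder for whichever strategy (multiple file, single file, MPI-I/O) is under test, and nbyte is the per-task data size used throughout the talk.

    #include <mpi.h>
    #include <stdio.h>

    /* Hypothetical hook: each strategy under test provides its own write_snapshot(). */
    void write_snapshot(const char *fname, const void *data, long nbyte, int rank);

    void time_one_strategy(const char *fname, const void *data, long nbyte)
    {
        int rank, ntasks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &ntasks);

        MPI_Barrier(MPI_COMM_WORLD);            /* start all tasks together     */
        double t0 = MPI_Wtime();
        write_snapshot(fname, data, nbyte, rank);
        double dt = MPI_Wtime() - t0;           /* this task's write time       */

        double tmin, tmax, tsum;                /* spread across tasks          */
        MPI_Reduce(&dt, &tmin, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
        MPI_Reduce(&dt, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
        MPI_Reduce(&dt, &tsum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            double mb = (double)nbyte * ntasks / (1024.0 * 1024.0);
            printf("aggregate %.1f MB: min %.3fs mean %.3fs max %.3fs (%.1f MB/s overall)\n",
                   mb, tmin, tsum / ntasks, tmax, mb / tmax);
        }
    }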

Slide 7: (figure only; no transcribed text)

Slide 8: Parallel I/O Strategies

Slide 9: Multiple File I/O
    /* Each task writes its own file; rank_dir() (helper not shown on the slide)
       presumably switches into (1) and out of (0) a private per-rank directory. */
    if (private_dir) rank_dir(1, rank);
    fp = fopen(fname_r, "w");               /* per-rank file name                */
    fwrite(data, nbyte, 1, fp);
    fclose(fp);
    if (private_dir) rank_dir(0, rank);
    MPI_Barrier(MPI_COMM_WORLD);            /* wait until all tasks have written */
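
rank_dir() is not defined on the slide. A minimal sketch of what such a helper might look like, assuming the first argument selects entering (1) or leaving (0) a per-rank subdirectory so that file creates are spread over many directories:

    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Hypothetical helper: enter (flag=1) or leave (flag=0) a per-rank directory. */
    void rank_dir(int flag, int rank)
    {
        char dname[64];
        if (flag) {
            snprintf(dname, sizeof(dname), "rank_%06d", rank);
            mkdir(dname, 0700);   /* ignore EEXIST on reruns */
            chdir(dname);
        } else {
            chdir("..");          /* back to the shared working directory */
        }
    }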

Slide 10: Single File I/O
    /* All tasks write disjoint nbyte regions of one shared file. */
    fd = open(fname, O_CREAT | O_RDWR, S_IRUSR);
    lseek(fd, (off_t)rank * (off_t)nbyte, SEEK_SET);   /* this task's offset */
    write(fd, data, nbyte);
    close(fd);
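
As a usage note (not from the slides): the same pattern can be expressed with pwrite(), which combines the seek and the write in one call. The function below is a sketch reusing the fname, data, nbyte, and rank names from the slide.

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    /* Sketch: each task writes its nbyte block at offset rank*nbyte in one call. */
    int write_block(const char *fname, const void *data, size_t nbyte, int rank)
    {
        int fd = open(fname, O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR);
        if (fd < 0) return -1;
        ssize_t n = pwrite(fd, data, nbyte, (off_t)rank * (off_t)nbyte);
        close(fd);
        return (n == (ssize_t)nbyte) ? 0 : -1;
    }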

Slide 11: MPI-I/O
    /* Each task writes its block to a shared file with a collective MPI-I/O call. */
    MPI_Info_set(mpiio_file_hints, MPIIO_FILE_HINT0);  /* macro from the slide;
                                                          presumably a key/value hint pair */
    MPI_File_open(MPI_COMM_WORLD, fname,
                  MPI_MODE_CREATE | MPI_MODE_RDWR, mpiio_file_hints, &fh);
    MPI_File_set_view(fh, (off_t)rank*(off_t)nbyte,    /* this task's offset    */
                      MPI_DOUBLE, MPI_DOUBLE, "native", mpiio_file_hints);
    MPI_File_write_all(fh, data, ndata, MPI_DOUBLE, &status);
    MPI_File_close(&fh);
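
The slide omits the declarations behind these names. A sketch of how they plausibly fit together, tying the byte count used by the POSIX versions to the element count used here (the setup is an assumption, not from the original):

    #include <mpi.h>
    #include <stdlib.h>

    /* Names follow the slides: fh, status, mpiio_file_hints, nbyte, ndata. */
    static MPI_File   fh;
    static MPI_Status status;
    static MPI_Info   mpiio_file_hints;

    static double *setup_io(long nbyte, int *ndata)
    {
        *ndata = (int)(nbyte / sizeof(double));   /* element count for MPI_File_write_all */
        MPI_Info_create(&mpiio_file_hints);       /* hints are added later with MPI_Info_set */
        return malloc(nbyte);                     /* this task's block of the snapshot */
    }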

Slide 12: Results

Slide 13: Scaling of single file I/O

Slide 14: Scaling of multiple file and MPI I/O

Slide 15: Large block I/O
- MPI-I/O on the SP includes the file hint IBM_largeblock_io.
- IBM_largeblock_io=true was used throughout; the default setting shows large variation.
- IBM_largeblock_io=true also turns off data shipping.
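
A minimal sketch of passing this hint through the mpiio_file_hints object from slide 11. The key name is the one given on this slide; the wrapper function itself is an assumption.

    #include <mpi.h>

    /* Attach the IBM_largeblock_io hint before MPI_File_open() sees the info object. */
    static void set_largeblock_hint(MPI_Info hints, int enable)
    {
        MPI_Info_set(hints, "IBM_largeblock_io", enable ? "true" : "false");
    }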

Slide 16: Large block I/O = false
- MPI-I/O on the SP includes the file hint IBM_largeblock_io.
- Except in the figure above, IBM_largeblock_io=true was used throughout.
- IBM_largeblock_io=true also turns off data shipping.

Slide 17: Bottlenecks to scaling
- Single file I/O has a tendency to serialize.
- Scaling up with multiple files creates filesystem problems.
- Akin to data shipping, consider the intermediate case: aggregate I/O within each SMP node, as sketched below.
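
The slides show results for this aggregated case but no code. A sketch of the idea, assuming tasks are placed on nodes in contiguous blocks and tasks_per_node is supplied by the caller; the nbyte and data names follow the earlier slides. One task per node gathers its neighbors' blocks and issues a single large write.

    #include <mpi.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/stat.h>

    /* Sketch: aggregate writes within each SMP node, one writer per node. */
    void write_with_node_aggregation(const char *fname, const void *data,
                                     long nbyte, int tasks_per_node)
    {
        int rank, color, node_rank, node_size;
        MPI_Comm node_comm;

        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        color = rank / tasks_per_node;               /* assumes block task placement */
        MPI_Comm_split(MPI_COMM_WORLD, color, rank, &node_comm);
        MPI_Comm_rank(node_comm, &node_rank);
        MPI_Comm_size(node_comm, &node_size);

        char *agg = NULL;
        if (node_rank == 0)
            agg = malloc((size_t)nbyte * node_size); /* aggregation buffer on node root */

        /* Collect this node's blocks in task order on the node root. */
        MPI_Gather((void *)data, (int)nbyte, MPI_BYTE,
                   agg, (int)nbyte, MPI_BYTE, 0, node_comm);

        if (node_rank == 0) {
            int fd = open(fname, O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR);
            off_t off = (off_t)color * (off_t)node_size * (off_t)nbyte;
            pwrite(fd, agg, (size_t)nbyte * node_size, off);  /* one large write per node */
            close(fd);
            free(agg);
        }
        MPI_Comm_free(&node_comm);
    }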

Slide 18: Parallel I/O with SMP aggregation (32 tasks)

Slide 19: Parallel I/O with SMP aggregation (512 tasks)

Slide 20: Summary
- Summary chart mapping aggregate data size (from the MB scale through 10 MB, 100 MB, 1 GB, 10 GB, up to 100 GB) to I/O strategy: serial, multiple file, mod n, MPI-IO, and MPI-IO collective. (Chart details not transcribed.)

Slide 21: Future Work
- Testing the NERSC port of NetCDF to MPI-I/O.
- Comparison with GPFS on Linux/Intel: the NERSC/LBL Alvarez cluster, 84 2-way SMP Pentium nodes with a Myrinet 2000 fiber optic interconnect.
- Testing GUPFS technologies as they become available.