CS 591x: I/O in MPI

MPI exists as many different implementations.
MPI implementations are based on the MPI standards.
The MPI standards are developed and maintained by the MPI Forum.

I/O in MPI
MPI implementations conform well to the MPI standards.
The MPI-1 standard avoids the issue of I/O.
This is a problem, since it is rare that a useful program does no I/O.
How to handle I/O is left to the individual implementations.

I/O in MPI
To use the C I/O functions: which processes have access to stdin, stdout, and stderr?
This is undefined in MPI.
Sometimes all processes have access to stdout; in some implementations only one process has access to stdout.

I/O in MPI
Sometimes stdout is only available to rank 0 in MPI_COMM_WORLD.
The same is true of stdin.
Some implementations provide no access to stdin at all.

I/O in MPI
So how do you create portable programs?
Make some assumptions.
Do some checking.

I/O in MPI
Recall that in our MPI implementation, MPI running under PBS puts stdout in a file (*.oxxxxx).
There is no direct access to stdin.

stdin in PBS/Torque
-I means interactive; it can appear on the qsub command line or in the script.
The job still starts under the control of the scheduler.
When the job starts, PBS/MPI will provide you with an interactive shell.
This is not terribly obvious.

I/O in MPI
There are two ways to deal with I/O in MPI:
define a specific approach in your program, or
use a specialized parallel I/O system.
I/O in parallel systems is a hot topic in high-performance computing research.

I/O in MPI
Discover, or designate, a single process that can do input (stdin) and output (stdout).
Usually this will be rank 0 in MPI_COMM_WORLD.
Write the program so that the IO process manages all user IO (user input, reports, prompts, etc.).
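
The single-IO-process convention above can be sketched as follows. This is a minimal sketch, not the course's reference code; it assumes rank 0 of MPI_COMM_WORLD is the designated IO process, and it needs an MPI environment (mpicc/mpirun) to build and run:

```c
#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int rank, size, local, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    local = rank + 1;   /* stand-in for each process's computed result */

    /* collect every process's result onto the IO process */
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)      /* only the IO process touches stdout */
        printf("sum over %d processes = %d\n", size, total);

    MPI_Finalize();
    return 0;
}
```

Every compute process sends its result inward; only rank 0 ever calls printf, so the program behaves the same whether or not the other ranks happen to have stdout.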

I/O in MPI
Attribute caching: recall that topologies are attributes associated with (attached to) communicators.
There are other attributes attached to communicators...
...and you can assign your own, for example, to designate a process to handle IO.

Attribute Caching
Duplicate the communicator:
MPI_Comm_dup(old_comm, &new_comm);
Define a key value (index) for the new attribute:
MPI_Keyval_create(MPI_DUP_FN, MPI_NULL_DELETE_FN, &IO_KEY, extra_arg);

Attribute Caching
Define a value for the attribute, i.e., the rank of the designated IO process:
*io_rank = 0;
Assign the attribute to the communicator:
MPI_Attr_put(io_comm, IO_KEY, io_rank);
To retrieve an attribute:
MPI_Attr_get(io_comm, IO_KEY, &io_rank_att, &flag);
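
The calls above combine into one sequence like the following sketch. It uses the MPI-1 names shown on the slides (MPI_Keyval_create, MPI_Attr_put, MPI_Attr_get; MPI-2 renamed these to MPI_Comm_create_keyval and friends), passes NULL for the unspecified extra_state argument, and needs an MPI environment to run:

```c
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    MPI_Comm io_comm;
    int IO_KEY, flag;
    int *io_rank, *io_rank_att;

    MPI_Init(&argc, &argv);

    /* duplicate the communicator so the attribute is private to io_comm */
    MPI_Comm_dup(MPI_COMM_WORLD, &io_comm);

    /* create a key value (index) for the new attribute */
    MPI_Keyval_create(MPI_DUP_FN, MPI_NULL_DELETE_FN, &IO_KEY, NULL);

    /* designate rank 0 as the IO process; the value must outlive this
       scope because MPI caches the pointer, not a copy */
    io_rank = (int *) malloc(sizeof(int));
    *io_rank = 0;
    MPI_Attr_put(io_comm, IO_KEY, io_rank);

    /* any later code can look the attribute up again */
    MPI_Attr_get(io_comm, IO_KEY, &io_rank_att, &flag);
    if (flag)
        printf("IO process for io_comm is rank %d\n", *io_rank_att);

    MPI_Finalize();
    return 0;
}
```

Note that MPI_Attr_put stores the pointer io_rank itself, which is why it is heap-allocated rather than a stack variable.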

Attribute Caching
Attribute caching functions are local; you may need to share attribute values with other processes in the comm.

I/O Process
Even though no IO mechanism is defined in MPI...
MPI implementations should have several predefined attributes for MPI_COMM_WORLD.
One of these is MPI_IO, which defines which process in the comm is supposed to be able to do IO.

I/O Process
If no process can do IO: MPI_IO = MPI_PROC_NULL.
If every process in the comm can do IO: MPI_IO = MPI_ANY_SOURCE.
If some can and some cannot: on a process that can, MPI_IO = myrank; on a process that cannot, MPI_IO = the rank of one that can.
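
A process can check those three cases by querying the predefined attribute with MPI_Attr_get on MPI_COMM_WORLD. A sketch (MPI-1 names as on the slides; needs an MPI environment to run):

```c
#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int rank, flag;
    int *io_p;   /* predefined attributes are returned as a pointer to int */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* MPI_IO is a predefined attribute key on MPI_COMM_WORLD */
    MPI_Attr_get(MPI_COMM_WORLD, MPI_IO, &io_p, &flag);

    if (!flag)
        printf("rank %d: MPI_IO attribute not set\n", rank);
    else if (*io_p == MPI_PROC_NULL)
        printf("rank %d: no process can do IO\n", rank);
    else if (*io_p == MPI_ANY_SOURCE)
        printf("rank %d: every process can do IO\n", rank);
    else
        printf("rank %d: rank %d can do IO\n", rank, *io_p);

    MPI_Finalize();
    return 0;
}
```

Because the lookup is local, different ranks may legitimately print different answers, which is exactly the "some can and some cannot" case above.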

I/O Process
MPI_IO really means which process can do output; that process still may not have access to stdin.

MPI-IO – stdin, stdout, stderr
For stdout:
create an IO communicator,
identify an IO process in the communicator, or create an IO process in the communicator.
The IO process gathers results from the compute processes.
The IO process outputs the results.
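
The gather-then-output step can be sketched with MPI_Gather. This is an illustrative sketch (the per-rank result is a stand-in value) and needs an MPI environment to run:

```c
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int rank, size, result;
    int *all = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    result = rank * rank;   /* stand-in for each compute process's result */

    /* only the IO process needs the receive buffer */
    if (rank == 0)
        all = (int *) malloc(size * sizeof(int));

    /* compute processes send; the IO process (rank 0) gathers */
    MPI_Gather(&result, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {        /* only the IO process writes the output */
        int i;
        for (i = 0; i < size; i++)
            printf("result from rank %d: %d\n", i, all[i]);
        free(all);
    }

    MPI_Finalize();
    return 0;
}
```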

MPI-IO – stdin
Recall that all processes may have access to stdin, only one process may have access to stdin, or no process may have access to stdin.
How will we know?

Testing stdin in MPI

#include <stdio.h>
#include "mpi.h"

int main(int argc, char** argv)
{
    int size, rank, numb;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("enter an integer ");
    scanf(" %d", &numb);
    printf("Hello world! I'm %d of %d - numb = %d\n", rank, size, numb);
    MPI_Finalize();
    return 0;
}

Testing stdin in MPI

#include <stdio.h>
#include "mpi.h"

int main(int argc, char** argv)
{
    int size, rank, numb;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        printf("enter an integer ");
        scanf(" %d", &numb);
    }
    MPI_Bcast(&numb, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Hello world! I'm %d of %d - numb = %d\n", rank, size, numb);
    MPI_Finalize();
    return 0;
}

stdin – what to do?
If all processes have access to stdin:
designate one process as the IO process,
have that process read from stdin,
and distribute the input to the other processes.
If only one process has access to stdin:
identify which process has access to stdin,
have that IO process read from stdin,
and distribute the data to the other processes.

stdin – what to do?
If no process has access to stdin:
pass data as command-line arguments,
read input data from files, or
create include files with data values (a nuisance).
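
The command-line-argument route can be sketched as follows. The MPI standard does not guarantee that argv is propagated to every process, so a portable program parses it on rank 0 and broadcasts the result (needs an MPI environment to run):

```c
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int rank, n = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* argv may only be meaningful on rank 0: parse there, then broadcast */
    if (rank == 0 && argc > 1)
        n = atoi(argv[1]);
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("rank %d sees n = %d\n", rank, n);

    MPI_Finalize();
    return 0;
}
```

This is the same read-and-broadcast shape as the stdin test above, just with argv in place of scanf.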

File IO in MPI
File IO can be a major bottleneck in the performance of a parallel application.
Parallel applications can have large (enormous) data sets.
We often think of file IO as a side effect, at least in terms of performance; this is not true in parallel applications.
"One half hour of IO for every 2 hours of computation"

MPI File IO – types of applications
Large grids and meshes:
storing grid point results for postprocessing,
distributing data for input.
Checkpointing:
periodically saving the state of a job;
how much work can you afford to lose?
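
Checkpointing is simplest when each rank writes its own state to a rank-named file. The file handling is plain C; in a real job the rank would come from MPI_Comm_rank, and the filename scheme here is purely illustrative:

```c
#include <stdio.h>

/* write this rank's state (an array of doubles) to its own checkpoint file */
int checkpoint_write(int rank, int step, const double *state, int n)
{
    char fname[64];
    FILE *f;
    snprintf(fname, sizeof fname, "ckpt_r%d_s%d.dat", rank, step);
    f = fopen(fname, "wb");
    if (!f) return -1;
    fwrite(&n, sizeof(int), 1, f);          /* record the element count */
    fwrite(state, sizeof(double), n, f);    /* then the raw state */
    fclose(f);
    return 0;
}

/* read the state back after a restart; returns the number of values read */
int checkpoint_read(int rank, int step, double *state, int max_n)
{
    char fname[64];
    FILE *f;
    int n = 0;
    snprintf(fname, sizeof fname, "ckpt_r%d_s%d.dat", rank, step);
    f = fopen(fname, "rb");
    if (!f) return -1;
    if (fread(&n, sizeof(int), 1, f) != 1 || n > max_n) { fclose(f); return -1; }
    if ((int) fread(state, sizeof(double), n, f) != n)  { fclose(f); return -1; }
    fclose(f);
    return n;
}
```

Keeping the step number in the filename means the previous checkpoint is never overwritten mid-write, which bounds the work lost to a crash to one checkpoint interval.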

MPI File IO – types of applications
Disk caching:
data too large for local memories.
Data mining:
small compute load but a lot of file IO,
combing through large datasets (e.g., CFD).

File IO in MPI
Recall that the use of stdin, stdout, and stderr generally assumes a single channel for each of these.
This is not true with respect to file IO – sort of.
Gathering to an IO node may not be the most efficient strategy.

File IO in MPI
In parallel systems you have multiple processors running concurrently, and each may have the ability to do file IO – concurrently.
Know your architecture:
network-shared disk storage,
diskless compute nodes,
directories shared across nodes.

Directories on Energy
/home/user – shared and the same on all nodes (r/w)
/usr/local/packages/ – shared and the same on all nodes (ro)
All other directories on any node are local to that node.
Implications?

IO example – staging data for input
Divide the data before input to the job.
Distribute the data pieces to the local compute nodes' disk drives.
Each compute node reads local files to get its piece of the data (as opposed to "read and scatter").
Uses standard file IO calls.
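
Dividing the data before staging it out is simple block arithmetic. A helper like this (the name and interface are hypothetical, not from the course) decides which slice of an n-item dataset goes into rank r's local file:

```c
/* split n items into p nearly equal blocks;
   block r covers [*offset, *offset + *count) */
void block_partition(int n, int p, int r, int *offset, int *count)
{
    int base = n / p;
    int rem  = n % p;
    /* the first `rem` blocks are one item longer than the rest */
    *count  = base + (r < rem ? 1 : 0);
    *offset = r * base + (r < rem ? r : rem);
}
```

The staging step writes file r from that slice; at job start, rank r just reads its own file with ordinary fopen/fread, with no scatter traffic over the network.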

IO example – dump and collect
In some cases large results datasets do not need to be gathered to an IO node.
Each compute node writes its data to a file on its local disk drive.
A postprocessing program "visits" the compute nodes and collects the locally stored data.
The postprocessor stores the integrated data set.
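
The dump-and-collect pattern can be sketched in plain C: each "rank" dumps its piece to its own local file, and a postprocessor later visits the files and integrates them. The filenames and the sum-as-integration step are illustrative only; on a real cluster the visit would walk the nodes' local disks:

```c
#include <stdio.h>

/* each compute rank dumps its partial result to its own local file */
int dump_local(int rank, int value)
{
    char fname[32];
    FILE *f;
    snprintf(fname, sizeof fname, "out_r%d.dat", rank);
    f = fopen(fname, "w");
    if (!f) return -1;
    fprintf(f, "%d\n", value);
    fclose(f);
    return 0;
}

/* the postprocessor visits every rank's file and integrates the results */
int collect(int nranks, int *sum)
{
    char fname[32];
    FILE *f;
    int r, v;
    *sum = 0;
    for (r = 0; r < nranks; r++) {
        snprintf(fname, sizeof fname, "out_r%d.dat", r);
        f = fopen(fname, "r");
        if (!f || fscanf(f, "%d", &v) != 1) { if (f) fclose(f); return -1; }
        *sum += v;
        fclose(f);
    }
    return 0;
}
```

The point of the design is that the expensive writes happen in parallel on local disks during the job; the sequential collection is deferred to cheap postprocessing time.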

File IO strategy
IO process/scatter-gather vs. local IO/distribute-collect.
The choice depends on:
the use of the input/output,
the size of the dataset,
the file IO capacity of the compute nodes (available disk space, disk IO performance).