Parallel Computing Through MPI Technologies
Author: Nyameko Lisa
Supervisors: Prof. Elena Zemlyanaya, Prof. Alexandr P. Sapozhnikov and Tatiana F. Sapozhnikov

Outline – Parallel Computing through MPI Technologies
– Introduction
– Overview of MPI
– General Implementation
– Examples
– Application to Physics Problems
– Concluding Remarks

Introduction – Need for Parallelism
There are more stars in the sky than there are grains of sand on all the beaches of the world.

Introduction – Need for Parallelism
It requires approximately 204 billion atoms to encode the human genome sequence.
A vast number of problems from a wide range of fields have significant computational requirements.

Introduction – Aim of Parallelism
– Attempt to divide a single problem into multiple parts
– Distribute the segments of said problem amongst various processes or nodes
– Provide a platform layer to manage data exchange between multiple processes that solve a common problem simultaneously

Introduction – Serial Computation
The problem is divided into a discrete, serial sequence of instructions, each executed individually on a single CPU.

Introduction – Parallel Computation
The same problem is distributed amongst several processes (each a program with its allocated data).

Introduction – Implementation
The main goal is to save time and hence money:
– Furthermore, one can solve larger problems that would deplete the resources of a single machine
– Overcome the intrinsic limitations of serial computation
– Distributed systems provide redundancy, concurrency and access to non-local resources, e.g. SETI, Facebook, etc.
Three methodologies for implementing parallelism:
– Physical architecture
– Framework
– Algorithm
In practice it will almost always be a combination of the above.
The greatest hurdle is managing the distribution of information and data exchange, i.e. overhead.

Introduction – Top 500
Japan's K Computer (kei = 10 quadrillion) is currently the fastest supercomputer cluster in the world, at over 8 petaflops (~8 x 10^15 calculations per second).

Overview – What is MPI?
Message Passing Interface: one of many frameworks and technologies for implementing parallelization.
A library of subroutines (FORTRAN), classes (C/C++) and bindings for Python packages that mediate communication (via messages) between single-threaded processes executing independently and in parallel.

Overview – What is needed?
– Common user accounts with the same password
– Administrator / root privileges for all accounts
– Common directory structure and paths
– MPICH2 installed on all machines
MPICH2 combines the MPI-1 and MPI-2 standards; the CH (Chameleon) portability layer provides backward compatibility with existing MPI frameworks.

Overview – What is needed?
– MPICC & MPIF77 – provide the options and special libraries needed to compile and link MPI programs
– MPIEXEC – initializes parallel jobs and spawns copies of the executable onto all of the processes
Each process executes its own copy of the code. By convention the root process (rank 0) is chosen to serve as the master process.

General Implementation Hello World - C++
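The slide's code appears only as an image in the original and is not reproduced in the transcript. The following is a minimal C++ sketch of an MPI "Hello World" of the kind the slide describes (the file name, printed message and process count in the comments are illustrative assumptions, not the slide's exact code):

// Minimal MPI "Hello World" sketch (illustrative, not the slide's exact code).
// With MPICH2 installed, a typical build and run might be:
//   mpicxx hello.cpp -o hello
//   mpiexec -n 4 ./hello
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);                   // start the MPI runtime

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);     // this process's id (0 .. size-1)
    MPI_Comm_size(MPI_COMM_WORLD, &size);     // total number of processes

    std::printf("Hello World from process %d of %d\n", rank, size);

    MPI_Finalize();                           // shut the MPI runtime down
    return 0;
}

Each of the processes spawned by MPIEXEC executes this same program and prints one line; the order in which the lines appear is not deterministic.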

General Implementation Hello World - FORTRAN

General Implementation Hello World - Output

Example – Broadcast Routine
Point-to-point (send & recv) and collective (bcast) routines are provided by the MPI library.
The source (root) node mediates the distribution of data to/from all other nodes.
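For reference, a minimal sketch (my own, not taken from the slides) of the built-in collective: every rank makes the same MPI_Bcast call, and the buffer contents of the chosen root are copied to all other ranks in the communicator.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int data = (rank == 0) ? 42 : 0;                  // only the root holds the value at first
    MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);  // root = rank 0

    std::printf("Process %d now has value %d\n", rank, data);
    MPI_Finalize();
    return 0;
}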

Example – Broadcast Routine: Linear Case
Apart from the root and last nodes, each node receives from the previous node and sends to the next node.
Point-to-point library routines are used to build a custom collective routine:
MPI_RECV(myProc - 1)
MPI_SEND(myProc + 1)
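A sketch of this linear hand-off (my own illustration, assuming an integer payload and rank 0 as root): every rank except the root first receives from its predecessor, and every rank except the last then forwards to its successor.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int myProc, numProcs;
    MPI_Comm_rank(MPI_COMM_WORLD, &myProc);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);

    int data = 0;
    if (myProc == 0) data = 42;        // root owns the original value
    if (myProc > 0)                    // everyone else waits for it from the previous node
        MPI_Recv(&data, 1, MPI_INT, myProc - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (myProc < numProcs - 1)         // all but the last node pass it along the chain
        MPI_Send(&data, 1, MPI_INT, myProc + 1, 0, MPI_COMM_WORLD);

    std::printf("Process %d has value %d\n", myProc, data);
    MPI_Finalize();
    return 0;
}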

Example – Broadcast Routine: Binary Tree
Each parent node sends the message to two child nodes:
MPI_SEND(2 * myProc)
MPI_SEND(2 * myProc + 1)
IF( MOD(myProc,2) == 0 ) MPI_RECV( myProc/2 )
ELSE MPI_RECV( (myProc-1)/2 )
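The slide's pseudocode reads most naturally with node numbers starting at 1. The sketch below (my own adaptation, not the slide's code) uses the equivalent 0-based MPI ranks, where rank r receives from its parent (r-1)/2 and forwards to children 2r+1 and 2r+2, reaching all P processes in O(log P) steps.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int data = (rank == 0) ? 42 : 0;   // root owns the original value
    if (rank > 0)                      // receive from the parent node
        MPI_Recv(&data, 1, MPI_INT, (rank - 1) / 2, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    int left = 2 * rank + 1, right = 2 * rank + 2;
    if (left < size)                   // forward to the left child, if it exists
        MPI_Send(&data, 1, MPI_INT, left, 0, MPI_COMM_WORLD);
    if (right < size)                  // forward to the right child, if it exists
        MPI_Send(&data, 1, MPI_INT, right, 0, MPI_COMM_WORLD);

    std::printf("Process %d has value %d\n", rank, data);
    MPI_Finalize();
    return 0;
}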

Example – Broadcast Routine Output

Applications to Physics Problems
Quadrature:
– Discretize the interval [a,b] into N steps and divide them amongst the processes:
– FOR loop from 1 + myProc to N, in increments of numProcs
– E.g. with N = 10 and numProcs = 3:
  Process 0: iterations 1, 4, 7, 10
  Process 1: iterations 2, 5, 8
  Process 2: iterations 3, 6, 9
Finite difference problems:
– Similarly divide the mesh/grid amongst the processes
Many applications, limited only by our ingenuity.
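A sketch of this round-robin quadrature (my own illustration; the integrand f(x) = exp(-x^2), the interval [0,1], the midpoint rule and the value of N are assumptions): each process handles steps 1 + myProc, 1 + myProc + numProcs, ... and MPI_Reduce sums the partial integrals on the root.

#include <mpi.h>
#include <cstdio>
#include <cmath>

double f(double x) { return std::exp(-x * x); }   // illustrative integrand

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int myProc, numProcs;
    MPI_Comm_rank(MPI_COMM_WORLD, &myProc);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);

    const double a = 0.0, b = 1.0;                // integration interval
    const int N = 1000000;                        // number of steps
    const double h = (b - a) / N;

    double partial = 0.0;
    for (int i = 1 + myProc; i <= N; i += numProcs)   // this process's share of the steps
        partial += f(a + (i - 0.5) * h) * h;          // midpoint rule contribution

    double total = 0.0;
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (myProc == 0) std::printf("Integral ~ %.10f\n", total);

    MPI_Finalize();
    return 0;
}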

Closing Remarks
In the 1970s, Intel co-founder Gordon Moore correctly predicted that the "number of transistors that can be inexpensively placed on an integrated circuit doubles approximately every 2 years".
10-core Xeon E7 processor family chips are currently commercially available.
MPI is easy to implement and well suited to many independent operations that can be executed simultaneously.
The only limitations are the overhead incurred by inter-process communication, our ingenuity and the strictly sequential segments of a program.

Acknowledgements and Thanks
– NRF and the South African Department of Science and Technology
– JINR, University Center
– Dr. Jacobs and Prof. Lekala
– Prof. Elena Zemlyanaya, Prof. Alexandr P. Sapozhnikov and Tatiana F. Sapozhnikov
– Last but not least, my fellow colleagues