HPC across Heterogeneous Resources Sathish Vadhiyar

Motivation
 MPI assumes a global communicator through which all processes can communicate with each other.
 Hence, every node on which the MPI application is started must be reachable from every other node.
 This is not always possible:
 Shortage of IP address space.
 Security concerns – large MPP sites and Beowulf clusters typically have only one master node in the public IP address space.
 Grand challenge applications require the use of multiple MPPs.

PACX-MPI
 PArallel Computer eXtension
 MPI on a cluster of MPPs
 Initially developed to connect a Cray Y-MP with an Intel Paragon

PACX-MPI
 CFD or automobile crash simulations – one MPP is not enough
 Initial application – a flow solver run across the Pittsburgh Supercomputing Center, Sandia National Laboratories, and the High Performance Computing Center Stuttgart
 PACX sits as a layer between the application and MPI

PACX-MPI
 On each MPP, two extra nodes run daemons that handle communication between MPPs, compression and decompression of data, and communication with the local nodes.
 The daemon nodes are implemented as additional local MPI processes.
 Communication among processes internal to an MPP goes through the vendor MPI over the fast local network.
 Communication between MPPs goes through the daemons via the Internet or a specialized network, as sketched below.
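Below is a minimal sketch in C of how such a layered send can be routed. This is an illustration, not the actual PACX-MPI source: PACX_Send, cmd_t, global_to_local, OUT_DAEMON_RANK, and the rank layout are all assumed names and values. A destination on the same MPP goes straight through the vendor MPI, while a remote destination is handed, header first, to the out-daemon, which relays it over the wide-area link.

    #include <mpi.h>

    #define OUT_DAEMON_RANK 0        /* out-daemon runs as an extra local MPI process (assumed rank) */

    typedef struct {                 /* header telling the daemon where the payload must go */
        int global_dest;             /* rank in the global (metacomputer) numbering */
        int tag;
        int count;
    } cmd_t;

    static int local_lo = 0, local_hi = 3;   /* global ranks hosted on this MPP (example values) */

    /* Map a global rank to a local rank; -1 means the process is on another MPP.
     * Local ranks 0 and 1 are the two daemons, so application processes start at 2. */
    static int global_to_local(int g)
    {
        return (g >= local_lo && g <= local_hi) ? (g - local_lo) + 2 : -1;
    }

    int PACX_Send(void *buf, int count, MPI_Datatype dtype,
                  int global_dest, int tag, MPI_Comm local_comm)
    {
        int local_dest = global_to_local(global_dest);

        if (local_dest >= 0) {
            /* destination is on the same MPP: fast vendor MPI on the local network */
            return MPI_Send(buf, count, dtype, local_dest, tag, local_comm);
        }
        /* destination is on another MPP: hand header and payload to the out-daemon */
        cmd_t cmd = { global_dest, tag, count };
        MPI_Send(&cmd, (int)sizeof cmd, MPI_BYTE, OUT_DAEMON_RANK, tag, local_comm);
        return MPI_Send(buf, count, dtype, OUT_DAEMON_RANK, tag, local_comm);
    }

The receiving MPP's in-daemon performs the mirror image: it accepts the header and payload from the wide-area link and re-sends them with the vendor MPI to the local destination, which is exactly the path shown in the point-to-point figure below.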

PACX-MPI Architecture [figure]

PACX-MPI – Point-to-point communication (node 6 -> node 2) [figure]

PACX-MPI – Broadcast communication (root: node 6) [figure]
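The broadcast in the figure can be built from the same pieces. Here is a minimal sketch continuing the assumed names from the send sketch above (forward_to_remote_mpps and IN_DAEMON_RANK are likewise illustrative, not PACX-MPI's real interface): the root's MPP broadcasts locally and ships exactly one copy per remote MPP through its out-daemon; each remote MPP's in-daemon then acts as the root of a cheap local broadcast.

    #include <mpi.h>

    #define IN_DAEMON_RANK 1         /* second daemon process (assumed rank) */

    int global_to_local(int g);      /* as defined in the send sketch above */

    /* daemon-side wide-area sends, one per remote MPP; omitted in this sketch */
    static void forward_to_remote_mpps(void *buf, int count, MPI_Datatype dtype)
    {
        (void)buf; (void)count; (void)dtype;
    }

    int PACX_Bcast(void *buf, int count, MPI_Datatype dtype,
                   int global_root, MPI_Comm local_comm)
    {
        int local_root = global_to_local(global_root);

        if (local_root >= 0) {
            /* root's MPP: broadcast over the fast local network,
               then send a single copy to each remote MPP */
            MPI_Bcast(buf, count, dtype, local_root, local_comm);
            forward_to_remote_mpps(buf, count, dtype);
        } else {
            /* remote MPP: the in-daemon received the one wide-area copy
               and serves as the local broadcast root */
            MPI_Bcast(buf, count, dtype, IN_DAEMON_RANK, local_comm);
        }
        return MPI_SUCCESS;
    }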

Data Conversion
 If the sender and receiver are on two different, heterogeneous MPPs, the sender converts its data to XDR (eXternal Data Representation) format.
 The receiver converts the data from XDR format back to its own native representation.
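A minimal sketch of such a conversion with the classic Sun RPC XDR library follows; it shows generic XDR usage, not PACX-MPI's own conversion layer, and assumes <rpc/xdr.h> is available (on recent Linux systems it may be provided by libtirpc). Note that encoding and decoding call the same filter routine, xdr_double; only the direction flag changes, so each machine converts between its native format and the common external representation.

    #include <rpc/xdr.h>

    enum { N = 4 };

    int main(void)
    {
        double native[N] = { 1.0, 2.5, -3.0, 4.25 };
        char   wire[N * 8];          /* XDR encodes a double in 8 bytes */
        XDR    xdrs;

        /* sender: native representation -> XDR wire format */
        xdrmem_create(&xdrs, wire, sizeof wire, XDR_ENCODE);
        for (int i = 0; i < N; i++)
            xdr_double(&xdrs, &native[i]);
        xdr_destroy(&xdrs);

        /* receiver: XDR wire format -> its own native representation */
        double decoded[N];
        xdrmem_create(&xdrs, wire, sizeof wire, XDR_DECODE);
        for (int i = 0; i < N; i++)
            xdr_double(&xdrs, &decoded[i]);
        xdr_destroy(&xdrs);

        return 0;
    }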

PACX-MPI: Results
 Between the T3Es at PSC and SDSC
 URANUS application
 A Navier-Stokes solver – an iterative application driven by convergence
 Simulation of a reentry vehicle
 Between PSC and the University of Stuttgart
 Tightly coupled application because of frequent communication for convergence checks.
 Had to be modified for the metacomputing setting – the application was made more asynchronous, compromising on convergence.
 P3T-DSMC – Monte Carlo particle tracking
 More amenable to metacomputing because of its high computation-to-communication ratio.
 Latency effects are hidden when larger numbers of particles are considered.

Other Related Projects
 PLUS
 MPICH-G
 PVMPI
 MPI-Connect

References
 Edgar Gabriel, Michael Resch, Thomas Beisel, Rainer Keller: "Distributed Computing in a Heterogeneous Computing Environment", EuroPVM/MPI'98, Liverpool, UK, 1998.
 Thomas Beisel, Edgar Gabriel, Michael Resch: "An Extension to MPI for Distributed Computing on MPPs", in Marian Bubak, Jack Dongarra, Jerzy Wasniewski (Eds.), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, Springer, 1997.
 Matthias A. Brune, Graham E. Fagg, Michael M. Resch: "Message-Passing Environments for Metacomputing", Future Generation Computer Systems (FGCS), Volume 15, 1999.
 "PVMPI: An Integration of PVM and MPI Systems".
 I. Foster, N. Karonis: "A Grid-Enabled MPI: Message Passing in Heterogeneous Distributed Computing Systems", Proc. SC Conference, November 1998.