1
Open MPI - A High-Performance, Fault-Tolerant MPI Library
Richard L. Graham
Advanced Computing Laboratory, Group Leader (acting)
2
Overview
Open MPI Collaboration
MPI
Run-time
Future directions
3
Collaborators
Los Alamos National Laboratory (LA-MPI)
Sandia National Laboratories
Indiana University (LAM/MPI)
The University of Tennessee (FT-MPI)
High Performance Computing Center, Stuttgart (PACX-MPI)
University of Houston
Cisco Systems
Mellanox
Voltaire
Sun
Myricom
IBM
QLogic
URL: www.open-mpi.org
4
A Convergence of Ideas
[Diagram] Open MPI draws on: Robustness (CSU), PACX-MPI (HLRS), LAM/MPI (IU), LA-MPI (LANL), FT-MPI (U of TN)
Resilient computing systems draw on: Fault Detection (LANL, Industry), Grid (many), Autonomous Computing (many), FDDP (Semi. Mfg. Industry)
OpenRTE
5
Components
Formalized interfaces
Specifies "black box" implementation
Different implementations available at run-time
Can compose different systems on the fly
[Diagram: a caller invoking Interface 1, Interface 2, Interface 3]
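To make the component idea concrete, here is a minimal, hypothetical sketch in C: a formalized interface expressed as a struct of function pointers, two "black box" implementations, and a caller that selects one by name at run time. The names (transport_component_t, tcp, shmem) are illustrative and are not the actual Open MPI MCA API.

/* Hypothetical sketch of a component framework: a fixed interface with
 * interchangeable implementations selected at run time. */
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *name;                          /* component name used for selection */
    int (*init)(void);                         /* formalized interface ... */
    int (*send)(const void *buf, size_t len);  /* ... implemented as a "black box" */
} transport_component_t;

static int tcp_init(void) { puts("tcp: init"); return 0; }
static int tcp_send(const void *b, size_t n) { (void)b; printf("tcp: send %zu bytes\n", n); return 0; }
static int shmem_init(void) { puts("shmem: init"); return 0; }
static int shmem_send(const void *b, size_t n) { (void)b; printf("shmem: send %zu bytes\n", n); return 0; }

static transport_component_t components[] = {
    { "tcp",   tcp_init,   tcp_send   },
    { "shmem", shmem_init, shmem_send },
};

/* The caller programs against the interface only; the concrete
 * implementation is picked by name at run time. */
static transport_component_t *select_component(const char *name)
{
    for (size_t i = 0; i < sizeof(components) / sizeof(components[0]); i++)
        if (strcmp(components[i].name, name) == 0)
            return &components[i];
    return NULL;
}

int main(void)
{
    transport_component_t *c = select_component("shmem");
    if (c) { c->init(); c->send("hello", 5); }
    return 0;
}

Because the caller depends only on the interface, implementations can be swapped or composed on the fly without changing the caller.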
6
Performance Impact
7
MPI
8
Two-Sided Communications
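Latency numbers like those on the following slides are typically obtained with a two-sided ping-pong microbenchmark: rank 0 sends a message to rank 1, rank 1 echoes it back, and half the averaged round-trip time is reported. A minimal sketch using only standard MPI calls follows; it is illustrative, not the exact benchmark used for these results.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int iters = 1000;     /* number of round trips to average over */
    char buf[1];
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2)               /* needs at least two ranks (e.g. mpirun -np 2) */
        MPI_Abort(MPI_COMM_WORLD, 1);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)  /* one-way 0-byte latency, reported in microseconds */
        printf("latency: %.2f usec\n", (t1 - t0) * 1e6 / (2.0 * iters));

    MPI_Finalize();
    return 0;
}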
9
P2P Component Frameworks
10
Shared Memory - Bandwidth
11
Shared Memory - Latency
12
IB Performance Latency
Message Size (bytes)   Open MPI (usec)   MVAPICH (usec)
0                      3.09              9.6 (anomaly?)
1                      3.48              3.09
32                     3.60              3.30
128                    4.48              4.16
2048                   7.93              8.67
8192                   15.72             22.86
16384                  27.14             29.37
13
IB Performance Bandwidth
14
GM Performance Data: Ping-Pong Latency (usec)
Data Size   Open MPI   MPICH-GM
0 Byte      8.13       8.07
8 Byte      8.32       8.22
64 Byte     8.68       8.65
256 Byte    12.52      12.11
15
GM Performance Data: Ping-Pong Latency (usec), Data FT
Data Size   Open MPI - OB1   Open MPI - FT   LA-MPI - FT
0 Byte      5.24             8.65            9.2
8 Byte      5.50             8.67            9.26
64 Byte     6.00             9.07            9.45
256 Byte    8.52             13.01           13.54
16
GM Performance Data Ping-Pong Bandwidth
17
MX Ping-Pong Latency (usec)
Message Size (bytes)   Open MPI - MTL   MPICH-MX
0                      3.14             2.87
8                      3.22             2.89
64                     3.91             3.6
256                    5.76             5.25
18
MX Performance Data Ping-Pong Bandwidth (MB/sec)
19
XT3 Performance Latency
Implementation    1-Byte Latency
Native Portals    5.30 us
MPICH-2           7.14 us
Open MPI          8.50 us
20
XT3 Performance Bandwidth
21
Collective Operations
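For reference, the collectives whose performance is charted on the next slides are invoked as in this minimal sketch: an MPI_Reduce of one integer per rank to root 0, followed by an MPI_Bcast of the result to all ranks.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    value = rank;

    /* Sum every rank's value at root 0. */
    MPI_Reduce(&value, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    /* Broadcast the result from root 0 back to all ranks. */
    MPI_Bcast(&sum, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("rank %d: sum = %d\n", rank, sum);
    MPI_Finalize();
    return 0;
}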
22
MPI Reduce - Performance
23
MPI Broadcast - Performance
24
MPI Reduction - II
25
Open RTE
26
Open RTE - Design Overview
Seamless, transparent environment for high-performance applications
Inter-process communications within and across cells
Distributed publish/subscribe registry
Supports event-driven logic across applications and cells
Persistent, fault tolerant
Dynamic "spawn" of processes and applications, both within and across cells
[Diagram: cells spanning a single computer, a cluster, and the grid]
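At the MPI level, the dynamic spawn capability Open RTE provides is exposed through MPI_Comm_spawn. A minimal sketch follows, assuming a separate "./worker" executable (the executable name and process count are illustrative).

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm children;
    int errcodes[4];

    MPI_Init(&argc, &argv);

    /* Launch 4 instances of ./worker and obtain an intercommunicator
     * to the newly spawned processes. */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL, 0,
                   MPI_COMM_WORLD, &children, errcodes);

    /* ... communicate with the children over the intercommunicator ... */

    MPI_Comm_disconnect(&children);
    MPI_Finalize();
    return 0;
}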
27
Open RTE - Components
[Diagram: component architecture spanning a single computer, a cluster, and the grid]
28
General Purpose Registry
Cached, distributed storage/retrieval system
  All common data types plus user-defined
  Heterogeneity between storing process and recipient automatically resolved
Publish/subscribe
  Supports event-driven coordination and notification
  Subscribe to individual data elements, groups of elements, wildcard collections
  Specify actions that trigger notifications
29
Subscription Services
Subscribe to a container and/or keyval entry
  Can be entered before the data arrives
Specifies data elements to be monitored
  Container tokens and/or data keys
  Wildcards supported
Specifies the action that generates an event
  Data entered, modified, or deleted
  Number of matching elements equals, exceeds, or falls below a specified level
  Number of matching elements transitions (increases/decreases) through a specified level
Events generate a message to the subscriber
  Includes the specified data elements
  Asynchronously delivered to a specified callback function on the subscribing process
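A minimal, in-process sketch of this subscription model, assuming a hypothetical API (this is not the actual ORTE registry interface): subscriptions can be entered before the data arrives, wildcards are supported, and entering data triggers callback notifications to subscribers.

#include <stdio.h>
#include <string.h>

#define MAX_ITEMS 16

typedef struct { const char *key; int value; } entry_t;
typedef void (*notify_fn)(const entry_t *entry, void *user_data);

typedef struct {
    const char *key_pattern;   /* exact key, or "*" as a wildcard */
    notify_fn   callback;      /* delivered asynchronously in the real system */
    void       *user_data;
} subscription_t;

static entry_t        entries[MAX_ITEMS];
static int            num_entries;
static subscription_t subs[MAX_ITEMS];
static int            num_subs;

static int matches(const char *pattern, const char *key)
{
    return strcmp(pattern, "*") == 0 || strcmp(pattern, key) == 0;
}

/* Subscriptions may be entered before the data exists. */
static void subscribe(const char *pattern, notify_fn cb, void *user_data)
{
    subs[num_subs++] = (subscription_t){ pattern, cb, user_data };
}

/* Entering data is the trigger action: fire every matching subscription. */
static void put(const char *key, int value)
{
    entries[num_entries++] = (entry_t){ key, value };
    for (int i = 0; i < num_subs; i++)
        if (matches(subs[i].key_pattern, key))
            subs[i].callback(&entries[num_entries - 1], subs[i].user_data);
}

static void on_change(const entry_t *e, void *user)
{
    (void)user;
    printf("notified: %s = %d\n", e->key, e->value);
}

int main(void)
{
    subscribe("num_procs", on_change, NULL);  /* entered before the data arrives */
    subscribe("*",         on_change, NULL);  /* wildcard subscription */
    put("num_procs", 64);                     /* generates two notifications */
    return 0;
}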
30
Future Directions
31
Revise MPI Standard
Clarify the standard
Standardize the interface
Simplify the standard
Make the standard more "H/W friendly"
32
Beyond Simple Performance Measures
Performance and scalability are important, but what about future HPC systems?
Heterogeneity: multi-core, mix of processors, mix of networks
Fault tolerance
33
Focus on Programmability
Performance and scalability are important, but what about programmability?