Open MPI on the Cray XT

Presented by Richard L. Graham
Tech Integration, National Center for Computational Sciences
Graham_OpenMPI_SC07

Why does Open MPI exist?
- Maximize all MPI expertise: research/academia, industry, and elsewhere.
- Capitalize on (literally) years of MPI research and implementation experience.
- The sum is greater than the parts.
Current membership
- 14 members, 6 contributors
- 4 US DOE labs, 7 vendors
- 8 universities, 1 individual
Key design feature: components
- Formalized interfaces specify a "black box" implementation.
- Different implementations are available at run time.
- Different systems can be composed on the fly.
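Run-time composition of components is driven by MCA parameters passed to mpirun. A minimal sketch, assuming an Open MPI build with the named components and a placeholder application `./my_app`:

```shell
# Restrict point-to-point transport to the TCP and self BTLs
# for this run -- no recompile needed.
mpirun --mca btl tcp,self -np 4 ./my_app

# Compose a different system on the fly: exclude the "tuned"
# collective component so another implementation is selected.
mpirun --mca coll ^tuned -np 4 ./my_app
```

Because selection happens at launch, the same binary can be steered onto different interconnects or algorithm sets per job.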
Point-to-point architecture
- MPI sits on one of two PMLs.
- PML OB1/DR -> BML R2 -> BTLs: GM (with MPool-GM and Rcache), OpenIB (with MPool-OpenIB and Rcache)
- PML CM -> MTLs: MX (Myrinet), Portals, PSM (QLogic)
Portals port: OB1 vs. CM

OB1:
- Matching in main memory
- Short message: eager; buffered on the receive side
- Long message: rendezvous; the rendezvous packet carries a 0-byte payload, and the data is fetched with a get after the match

CM:
- Matching may be on the NIC
- Short message: eager; buffered on the receive side
- Long message: eager; all data is sent. On a match, it is delivered directly into the user buffer; on no match, the payload is discarded and the user data is fetched with a get() after the match
MPI Component Architecture (MCA)
- User application -> MPI API -> MCA frameworks
- PML: OB1, CM, DR, CRCPW
- BTL: TCP, shared memory, InfiniBand
- MTL: Myrinet MX, Portals, PSM
- Collective: basic, tuned, hierarchical, intercommunicator, shared memory, non-blocking, Portals
- Allocator: basic, bucket
- Topology: basic; plus utility and I/O frameworks
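The frameworks and components above can be inspected on an installed build with the `ompi_info` tool (exact flag spelling varies somewhat across Open MPI versions):

```shell
# List every MCA component this build can load, grouped by framework
# (pml, btl, mtl, coll, allocator, topo, io, ...).
ompi_info | grep MCA

# Show the run-time parameters exposed by one component,
# here the "tuned" collective implementation.
ompi_info --param coll tuned
```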
Benchmark results
NetPipe bandwidth data
[Figure: bandwidth (MBytes/sec, 0 to 2000) vs. data size (KBytes, 0.0001 to 10000) for Open MPI CM, Open MPI OB1, and Cray MPI]
Zero-byte ping-pong latency
- Open MPI CM: 4.91 µsec
- Open MPI OB1: 6.16 µsec
- Cray MPI: 4.78 µsec
VH-1: total runtime
[Figure: VH-1 wall-clock time (sec, 200 to 250) vs. log2 processor count (3.5 to 8.5) for Open MPI CM, Open MPI OB1, and Cray MPI]
GTC: total runtime
[Figure: GTC wall-clock time (sec, 800 to 1150) vs. log2 processor count for Open MPI CM, Open MPI OB1, and Cray MPI]
POP: step runtime
[Figure: POP time-step wall-clock time (sec, 128 to 2048, log scale) vs. log2 processor count (3 to 11) for Open MPI CM, Open MPI OB1, and Cray MPI]
Summary and future directions
- Support for the Cray XT (Catamount and Compute Node Linux) is in the standard distribution.
- Performance, on both applications and micro-benchmarks, is comparable to that of Cray MPI.
- Support for recovery from process failure is being added.
Contact
Richard L. Graham
Tech Integration, National Center for Computational Sciences
(865) 356-3469
rlgraham@ornl.gov
www.open-mpi.org