Open MPI OpenFabrics Update April 2008 Jeff Squyres.


Sidenote: MPI Forum
MPI Forum re-convening:
- 2.1: bug fixes, consolidate into one document
- 2.2: "bigger" bug fixes
- 3.0: addition of entirely new material
Strongly encourage all to participate:
- Hardware / MPI vendors
- ISVs who use MPI
- MPI end users
Next meeting: April 28-30, Chicago

OMPI Current Membership
15 members, 9 contributors, 1 partner:
- 6 research labs
- 8 universities
- 10 vendors
- 1 individual

Current Status
Stable release series: 1.2
- Current community release: v1.2.6 (released yesterday; bug fix release)
- OFED v1.3 includes v1.2.5; OFED v1.3.1 will include v1.2.6
Working towards next major series: 1.3
- Exact release date difficult to predict ("herding the cats")

v1.3: OpenFabrics Features
- Connection Manager support: IB CM (many thanks to Sean Hefty), RDMA CM
- XRC support
- APM support
- BSRQ (including XRC integration)
  - Multiple receive queues with different buffer sizes (i.e., send on the queue with the closest size)
  - More efficient use of registered memory
- No use of UD [yet?]

New iWARP Support
Open Grid Computing / Chelsio
- Adding RDMA CM support, auditing verbs usage
More difficult than initially expected
- Adding CM support to OMPI was "hard"
- Many firmware, driver, and OMPI bugs
Work mostly complete; still to do:
- Init parameter file extensions
- Multiple ports / devices striping setup

iWARP Challenges
- Chelsio T3 does not support SRQ
- ibv_post_recv() race condition
  - OMPI uses multiple QPs per peer pair (BSRQ)
  - But all OMPI flow control goes over one QP
- Registered memory utilization will be poor
- Both issues fixed in "T4"
- Connection "initiator-must-send-first" issue
  - Solved by hiding a 0-byte RDMA read in the Chelsio firmware / driver (NetEffect has a similar workaround)
  - More general / NIC-independent solution coming in OFED 1.4

iWARP Lessons Learned
- No "huge" surprises; verbs worked as expected
- Open MPI and MVAPICH use the verbs stack very differently
  - Brought out many, many latent vendor bugs
  - Strongly encourage other iWARP vendors to start testing / participating ASAP
- The MPI Testing Tool (MTT) can help!

Other v1.3 Features
- Dropping VAPI support
- Major job-launch scalability improvements
  - LANL RoadRunner (LANL, IBM)
  - TACC Ranger (Sun)
  - Jaguar (ORNL)
- Tighter integration with parallel tools
  - DDT parallel debugger "understands" opaque MPI handles
  - VampirTrace integration (tracefile / post-mortem analysis)

Other v1.3 Features
"Manycore" issues
- Use newest Portable Linux Processor Affinity (PLPA) release
- Allow binding to a specific socket/core
- "Better" integration with resource managers to allow them to handle affinity (post-1.3?)
First cut of "Carto"[graphy] framework
- Discover and use the topology of the host and fabric
- Port selection, collective algorithms

Roadmap
- 1.3 release is taking too long
  - Group decided 1.3 would be feature-driven, not time-driven
  - About 1.5 years since the initial 1.2 release
- May move to a shorter planned release cycle
  - At least once a year?
  - Still under debate
- A variety of features planned for "post-1.3" releases

Come Join Us!