MPI Progress-Independent Communicators

Pavan Balaji, Argonne National Laboratory

Shared Object Semantics in MPI

| P0 (Thread 0)              | P0 (Thread 1)              | P1                    |
|----------------------------|----------------------------|-----------------------|
| MPI_Irecv(…, comm1, &req1); | MPI_Irecv(…, comm2, &req2); | MPI_Ssend(…, comm1); |
| pthread_barrier();          | pthread_barrier();          | MPI_Ssend(…, comm2); |
| MPI_Wait(&req2, …);         | pthread_barrier();          |                      |
| MPI_Wait(&req1, …);         |                             |                      |

- The current semantics allow any thread to access any MPI object
  - A request can be created by one thread and used by another thread
  - The MPI implementation requires appropriate locking and memory consistency to make sure this is allowed
- Some applications might not require such semantics
  - Each thread uses only the objects it created itself
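The cross-thread request hand-off described above can be sketched in C. This is a minimal illustration, not part of the original slides: it assumes MPI_THREAD_MULTIPLE support and two ranks; the buffer, tag, and thread roles are illustrative.

```c
#include <mpi.h>
#include <pthread.h>

static MPI_Request req;            /* created by one thread, completed by another */
static pthread_barrier_t barrier;

static void *poster(void *arg) {
    static int buf;
    /* Thread 0 posts the receive, then publishes the request via the barrier. */
    MPI_Irecv(&buf, 1, MPI_INT, 1, /*tag=*/0, MPI_COMM_WORLD, &req);
    pthread_barrier_wait(&barrier);
    return NULL;
}

static void *waiter(void *arg) {
    pthread_barrier_wait(&barrier);
    /* Thread 1 completes a request it did not create; the current MPI
       semantics allow this, so the implementation must synchronize
       access to the request internally. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);
    return NULL;
}

int main(int argc, char **argv) {
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        pthread_t t0, t1;
        pthread_barrier_init(&barrier, NULL, 2);
        pthread_create(&t0, NULL, poster, NULL);
        pthread_create(&t1, NULL, waiter, NULL);
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);
        pthread_barrier_destroy(&barrier);
    } else if (rank == 1) {
        int val = 42;
        MPI_Ssend(&val, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}
```

Run with two ranks (e.g., `mpiexec -n 2 ./a.out`). The point is that correctness here depends on the implementation's internal locking, which is exactly the cost the proposed hint lets applications opt out of.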

Communicator Hints

- Predefined info argument for MPI_Comm_dup_with_info: independent_comm
  - "This info argument allows a high-quality MPI implementation to assign independent communication resources to this communicator. A thread waiting on an operation issued on a communicator with this info argument set will not be required to make progress on pending operations on any other communicator, window, or file."
  - The MPI implementation can create independent communication resources for this communicator (out-of-band communication, less lock contention)
- The following program, which is valid in MPI-3, would not be valid with this info hint:

| P0 (Thread 0)              | P0 (Thread 1)              | P1                    |
|----------------------------|----------------------------|-----------------------|
| MPI_Irecv(…, comm1, &req1); | MPI_Irecv(…, comm2, &req2); | MPI_Ssend(…, comm1); |
| pthread_barrier();          | pthread_barrier();          | MPI_Ssend(…, comm2); |
| MPI_Wait(&req2, …);         | pthread_barrier();          |                      |
| MPI_Wait(&req1, …);         |                             |                      |
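Attaching the proposed hint uses the standard MPI-3 info machinery. A minimal sketch follows; note that `independent_comm` is the proposed key from these slides, and the value `"true"` is an assumption for illustration (the proposal text does not fix a value syntax):

```c
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    /* Build an info object carrying the proposed hint. The key
       "independent_comm" is from the proposal; "true" is assumed. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "independent_comm", "true");

    /* comm1 may now receive its own communication resources; a thread
       waiting on comm1 need not progress operations on other
       communicators, windows, or files. */
    MPI_Comm comm1;
    MPI_Comm_dup_with_info(MPI_COMM_WORLD, info, &comm1);
    MPI_Info_free(&info);

    /* ... use comm1 exclusively for this thread's communication ... */

    MPI_Comm_free(&comm1);
    MPI_Finalize();
    return 0;
}
```

An implementation that does not recognize the key simply ignores it, which is the usual compatibility story for info hints.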

Other notes (1/2)

- How the semantics are inherited by other objects needs to be discussed:
  - What happens when I dup an "independent" communicator? The standard defines that dup inherits the info, so the new communicator will be independent as well.
  - What happens when I split an "independent" communicator? The standard defines that split, etc., do not inherit the info, so the new communicator will not be independent.
  - What happens to files, windows, etc.? No additional propagation of the info; we can define the info key for those objects if needed.
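The dup-inherits / split-does-not distinction can be checked with MPI_Comm_get_info, which reports the hints the implementation is actually using. A sketch, under the same assumptions as before (proposed key `independent_comm`, illustrative value `"true"`; what an implementation reports back is implementation-defined):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Create a communicator carrying the proposed hint. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "independent_comm", "true");
    MPI_Comm hinted;
    MPI_Comm_dup_with_info(MPI_COMM_WORLD, info, &hinted);
    MPI_Info_free(&info);

    /* Dup inherits the info, so dup_comm should be independent too... */
    MPI_Comm dup_comm, split_comm;
    MPI_Comm_dup(hinted, &dup_comm);
    /* ...while split and friends do not inherit it. */
    MPI_Comm_split(hinted, /*color=*/0, /*key=*/rank, &split_comm);

    /* Inspect what the implementation reports for the dup'ed communicator. */
    MPI_Info out;
    char value[MPI_MAX_INFO_VAL];
    int flag;
    MPI_Comm_get_info(dup_comm, &out);
    MPI_Info_get(out, "independent_comm", MPI_MAX_INFO_VAL - 1, value, &flag);
    if (rank == 0)
        printf("dup_comm independent_comm: %s\n", flag ? value : "(unset)");
    MPI_Info_free(&out);

    MPI_Comm_free(&split_comm);
    MPI_Comm_free(&dup_comm);
    MPI_Comm_free(&hinted);
    MPI_Finalize();
    return 0;
}
```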

Other notes (2/2)

- Additional hints could improve performance further
  - E.g., "this communicator will be used by only one thread"
- Implementation details
  - It would be useful if the MPI implementation could create communicator-specific objects
  - This has been demonstrated in research papers, but is not in production implementations today (?)

What are we losing?

- If an application expects that waiting on one communicator makes progress on everything, that may no longer happen
  - Is that breaking backward compatibility? It might be OK, since this is a new info key
  - E.g., asynchronous progress threads