Illustration of ISP for each bug class
MPI happens-before: how MPI supports out-of-order execution
The POE algorithm of ISP explained using MPI happens-before

About 30 minutes – by Ganesh

Bug Classes Caught by ISP

Deadlocks
– Show how a collective deadlock is caught (a sketch follows this list)

Resource Leaks
– Demonstrate MPI_hb_different_comm, which shows that coloration in the Java GUI depends on communicators
– The relaxed HB edges: none between different sends from the same process
  » They target the same destination
  » but use different communicators

Buffer-Sensitive Deadlocks
– Run MPI_BufferingSensitiveDeadlock
  » With Windows->Preferences->ISP->Use Blocking Sends = true
  » With the above = false

Assertion Violations
– Run MPI_AssertTest (red_blue_assert.c) with 4 procs
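To make the first bullet concrete, the sketch below is a hypothetical two-process program (invented here, not the demo's shipped source) with the classic collective-order mismatch that ISP reports as a deadlock: rank 0 enters MPI_Barrier while rank 1 enters MPI_Bcast, so the two processes invoke the collectives in different orders.

/* Hypothetical sketch of a collective deadlock of the kind ISP catches.
 * The two processes call the collectives in different orders, which the
 * MPI standard forbids: rank 0 waits in the barrier for rank 1, while
 * rank 1 may block in the broadcast waiting for the root. */
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, buf = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Barrier(MPI_COMM_WORLD);                     /* waits for all */
        MPI_Bcast(&buf, 1, MPI_INT, 0, MPI_COMM_WORLD);
    } else {
        MPI_Bcast(&buf, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* mismatched order */
        MPI_Barrier(MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}

ISP's scheduler flags the mismatched collective order even on platforms where a native run happens to complete.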

Intuitive GUI Display Capabilities

We have already seen the Internal Issue Order; witness the experimental Time Order stepping feature.

Iprobes are interesting!
– We can probe one send / receive
– But we can actually communicate with a different send / receive (a sketch follows this list)
– The ISP GUI shows this really vividly!
– Try out MPI_Iprobeillustration, file i-probe-illustration.c

ISP's execution is directly guided by MPI-hb
– MPI_HappensBeforeIllustration

ISP is very mindful of wildcard dependencies that may arise due to continued execution
– Try MPI_CrossCoupledWildcardDependency
– Discuss our "lazy algorithm" for handling this (not in ISP now)
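The Iprobe point can be made concrete with a small program, invented here for illustration (it is not the shipped i-probe-illustration.c): with several senders, the message MPI_Iprobe observes need not be the one a later wildcard receive matches.

/* Hypothetical sketch of the Iprobe pitfall: the probe may observe a
 * message from one sender, yet the subsequent wildcard receive can
 * match a different sender that raced in. Run with 3+ processes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, flag = 0, data;
    MPI_Status st;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        /* Probe until some message is visible... */
        while (!flag)
            MPI_Iprobe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &flag, &st);
        printf("probed a message from rank %d\n", st.MPI_SOURCE);
        /* ...but this wildcard receive may match a different send. */
        MPI_Recv(&data, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
        printf("received %d from rank %d\n", data, st.MPI_SOURCE);
    } else {
        data = rank;
        MPI_Send(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}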

Formalization of MPI Happens-before

It really is matches-before + completes-before.

MPI guarantees point-to-point non-overtaking

P0 --- S(to:1, msg1, h1); … S(to:1, msg2, h2); … W(h1); … W(h2);
P1 --- R(from:0, buf1, h3); … R(from:0, buf2, h4); … W(h3); … W(h4);

Require S(..msg1..) to match a potential receive from P1 BEFORE S(..msg2..) is allowed to match.
Require R(..buf1..) to match a potential send from P0 BEFORE R(..buf2..) is allowed to match.
Achieved by ensuring the above matches-before relations during scheduling.
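In concrete MPI terms (a minimal sketch; the slide's S, R, and W map to MPI_Isend, MPI_Irecv, and MPI_Wait), non-overtaking guarantees that msg1 lands in buf1 and msg2 in buf2, because both sends target the same rank on the same communicator and tag:

/* Hedged C rendering of the S/R/W notation above. Run with 2 processes.
 * Because both sends go P0 -> P1 on the same communicator and tag,
 * non-overtaking forces msg1 to match the first posted receive. */
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, msg1 = 1, msg2 = 2, buf1, buf2;
    MPI_Request h1, h2, h3, h4;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Isend(&msg1, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &h1);
        MPI_Isend(&msg2, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &h2);
        MPI_Wait(&h1, MPI_STATUS_IGNORE);
        MPI_Wait(&h2, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        MPI_Irecv(&buf1, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &h3);
        MPI_Irecv(&buf2, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &h4);
        MPI_Wait(&h3, MPI_STATUS_IGNORE);
        MPI_Wait(&h4, MPI_STATUS_IGNORE);   /* buf1 == 1, buf2 == 2 */
    }
    MPI_Finalize();
    return 0;
}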

But, do not enforce matches-before needlessly

Must we force sends (S) to finish in order in all cases? Certainly not! (unless you love poor performance)

P0 --- S(to:1, big-message, h1); … S(to:2, small-message, h2); … W(h2); … W(h1);
P1 --- R(from:0, buf1, h3); … W(h3);
P2 --- R(from:0, buf2, h4); … W(h4);

Non-blocking sends (S) allow later actions of the same process (including "unrelated" non-blocking sends) to be concurrent.
Achieved by NOT having matches-before between the S within P0.
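A hedged C sketch of the same point (buffer sizes are illustrative; run with 3 processes): the small send to P2 may complete before the big send to P1, so P0 legitimately waits on h2 first. No matches-before edge between the two sends is needed.

/* Sketch: two unrelated non-blocking sends from P0 may complete out
 * of order; P0 waits on the small one first. */
#include <mpi.h>
#include <stdlib.h>

#define BIG (1 << 20)

int main(int argc, char **argv) {
    int rank, small = 42, buf2;
    int *big  = malloc(BIG * sizeof(int));
    int *buf1 = malloc(BIG * sizeof(int));
    MPI_Request h1, h2, h3, h4;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Isend(big, BIG, MPI_INT, 1, 0, MPI_COMM_WORLD, &h1);
        MPI_Isend(&small, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &h2);
        MPI_Wait(&h2, MPI_STATUS_IGNORE);   /* may finish before h1 */
        MPI_Wait(&h1, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        MPI_Irecv(buf1, BIG, MPI_INT, 0, 0, MPI_COMM_WORLD, &h3);
        MPI_Wait(&h3, MPI_STATUS_IGNORE);
    } else if (rank == 2) {
        MPI_Irecv(&buf2, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &h4);
        MPI_Wait(&h4, MPI_STATUS_IGNORE);
    }
    free(big); free(buf1);
    MPI_Finalize();
    return 0;
}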

MPI is tricky… till you see how it really works!

Will this single-process example, called "Auto-send", deadlock?

P0 : R(from:0, h1); B; S(to:0, h2); W(h1); W(h2);

Again, no! R (non-blocking receive) only initiates receipt; likewise, non-blocking send (S) only initiates sending. These activities go on concurrently within the same process.
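A hedged C transcription of Auto-send (run with a single process; variable names are invented here): it completes without deadlock because the receive is merely initiated before the barrier and matches the send afterwards.

/* Auto-send in C: R = MPI_Irecv, B = MPI_Barrier, S = MPI_Isend,
 * W = MPI_Wait. The posted receive and the send to self proceed
 * concurrently within the one process. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int in = -1, out = 7;
    MPI_Request h1, h2;
    MPI_Init(&argc, &argv);
    MPI_Irecv(&in, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &h1);  /* R(from:0, h1) */
    MPI_Barrier(MPI_COMM_WORLD);                            /* B             */
    MPI_Isend(&out, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &h2); /* S(to:0, h2)   */
    MPI_Wait(&h1, MPI_STATUS_IGNORE);                       /* W(h1)         */
    MPI_Wait(&h2, MPI_STATUS_IGNORE);                       /* W(h2)         */
    printf("received %d\n", in);                            /* prints 7      */
    MPI_Finalize();
    return 0;
}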

How Example Auto-send works

P0 : R(from:0, h1); B; S(to:0, h2); W(h1); W(h2);

(Figure on slide: the MPI-hb edges drawn over these five operations.)

1. Issue R(from:0, h1), because prior to issuing R, P0 is not at a fence.
2. Issue B, because after issuing R, P0 is not at a fence.
3. Form match sets; the match-enabled set is { B }.
4. Fire the match-enabled set { B }.
5. Issue S(to:0, h2), because with B gone, P0 is no longer at a fence.
6. Issue W(h1), because after S(to:0, h2), P0 is not at a fence.
7. A { W(h1) } match set cannot be formed yet, because W(h1) has an unmatched ancestor (namely R(from:0, h1)).
8. Form the match set { R(from:0, h1), S(to:0, h2) } and fire it.
9. Now form and fire the match set { W(h1) }.
10. Now issue W(h2).
11. Form the match set { W(h2) } and fire it. Done.
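To make the issue/match loop above concrete, here is a toy, self-contained replay of it on the Auto-send trace. Everything in it (the op table, the fence/partner/request fields) is invented for illustration; it is not ISP's implementation, which handles many processes and builds match sets over the full MPI-hb.

/* Toy replay of POE's fence-driven issue/match loop on Auto-send. */
#include <stdio.h>
#include <stdbool.h>

typedef struct {
    const char *name;
    bool fence;     /* B and W block later issues until they complete */
    int partner;    /* index of the send/receive it matches, or -1    */
    int request;    /* for W: index of the R/S it waits on, or -1     */
    bool issued, matched;
} Op;

int main(void) {
    /* P0: R(from:0,h1); B; S(to:0,h2); W(h1); W(h2); */
    Op t[] = {
        { "R(from:0,h1)", false,  2, -1, false, false },
        { "B",            true,  -1, -1, false, false },
        { "S(to:0,h2)",   false,  0, -1, false, false },
        { "W(h1)",        true,  -1,  0, false, false },
        { "W(h2)",        true,  -1,  2, false, false },
    };
    int n = 5, next = 0;
    for (;;) {
        /* Issue phase: issue ops in program order until an
           issued-but-uncompleted fence blocks the process. */
        while (next < n) {
            bool blocked = false;
            for (int j = 0; j < next; j++)
                if (t[j].fence && !t[j].matched) blocked = true;
            if (blocked) break;
            t[next].issued = true;
            printf("Issue %s\n", t[next].name);
            next++;
        }
        /* Match phase: pick an issued op whose ancestors have matched
           and whose partner (if any) is issued, then fire it. */
        int m = -1;
        for (int i = 0; i < n && m < 0; i++) {
            if (!t[i].issued || t[i].matched) continue;
            if (t[i].request >= 0 && !t[t[i].request].matched) continue;
            if (t[i].partner >= 0 && !t[t[i].partner].issued) continue;
            m = i;
        }
        if (m < 0) break;  /* no match set: done (or a deadlock if ops remain) */
        t[m].matched = true;
        if (t[m].partner >= 0) {
            t[t[m].partner].matched = true;
            printf("Fire { %s, %s }\n", t[m].name, t[t[m].partner].name);
        } else {
            printf("Fire { %s }\n", t[m].name);
        }
    }
    return 0;
}

Running it prints the same Issue/Fire sequence as the walkthrough; when the match phase finds no candidate while unmatched operations remain, POE reports a deadlock.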

End of D