Correcting Error in Message Passing Systems


Correcting Error in Message Passing Systems
A Tool in Millipede Interactive Parallel Debugger
By Jan Bækgaard Pedersen

Overview
  Multi Level Interactive Parallel Debugging
  Millipede
  The problem – Deadlocks
  Deadlock detection module
  The idea
  The Algorithm
  Theoretical justification

Multi Level Interactive Parallel Debugging
Use a tool that is tailored to the specific debugging task:
  Debugging straight line code
  Message debugging
  Protocol debugging
  Visualization
Use a sequential tool to debug sequential code.
Use other tools for:
  Message passing errors
  Message contents
  Protocol errors
  Protocol verification
  Deadlock correction

Millipede
  Communication Visualization Module: graphical view of the message passing / protocol.
  Deadlock Detection & Correction Module: detects and analyzes deadlocks, and reports the cause and a fix.
  Comm. Protocol Verification Module: online verification of the communication protocol while the program is running.
  Message Debugging Module: inspect, control and change the contents of messages.
  Sequential Debugging Module: debugging of the sequential code of the parallel program.

The Problem
Message passing programs can deadlock.

The Idea
S = {s_0, s_1, ..., s_{n-1}} is the list of deadlocked* senders.
R = {r_0, r_1, ..., r_{n-1}} is the list of deadlocked* receivers.
s_i = (a, b), where a and b are process identifiers, with a fixed by the sender.
r_i = (a, b), where a and b are process identifiers, with b fixed by the receiver.
s_i = (a_i, b_i) matches r_j = (a_j, b_j) if a_i = a_j and b_i = b_j.
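The matching rule above can be written down directly. The following is a minimal sketch, assuming each send or receive is stored as a tuple (a, b) of integer process identifiers; the names `Op`, `matches` and `unmatched` are illustrative and do not come from Millipede.

```python
from typing import List, Tuple

# An operation is (a, b): a is the sending process, b the receiving process.
# For a send, a is fixed; for a receive, b is fixed (as on the slide).
Op = Tuple[int, int]

def matches(send: Op, recv: Op) -> bool:
    """A send matches a receive when both fields agree."""
    return send[0] == recv[0] and send[1] == recv[1]

def unmatched(sends: List[Op], recvs: List[Op]) -> Tuple[List[Op], List[Op]]:
    """Pair off matching operations and return what is left over."""
    remaining = list(recvs)
    stuck = []
    for s in sends:
        partner = next((r for r in remaining if matches(s, r)), None)
        if partner is None:
            stuck.append(s)           # no receive agrees with this send
        else:
            remaining.remove(partner)  # consumed by a matching send
    return stuck, remaining

# Hypothetical example: process 2 sends to 0, but process 3 waits for 2.
sends = [(0, 1), (2, 0)]
recvs = [(0, 1), (2, 3)]
stuck_sends, stuck_recvs = unmatched(sends, recvs)
```

In this example the pair (0, 1) matches, while the send (2, 0) and the receive (2, 3) are left without partners.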

The Idea
Find permutations S' of S and R' of R such that the number of mismatches is minimal, i.e. a minimal number of fields must be changed for the deadlock to disappear.
Report the needed changes to the user.
Compute the Hamming distances between all possible permutations of S and R, and pick the ones with the minimal distance.
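A brute-force version of this search might look like the sketch below. It assumes the same (a, b) tuple representation, keeps the order of the sends fixed and permutes only the receives (which already enumerates every possible pairing), and is only practical for small lists. The function names and the wording of the reported changes are illustrative, not taken from the tool.

```python
from itertools import permutations

def hamming(send, recv):
    """Number of fields (0, 1 or 2) in which a send and a receive differ."""
    return sum(x != y for x, y in zip(send, recv))

def best_pairing(sends, recvs):
    """Try every pairing and keep the one with the smallest total distance."""
    best_cost, best = None, None
    for perm in permutations(recvs):
        cost = sum(hamming(s, r) for s, r in zip(sends, perm))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, list(zip(sends, perm))
    return best_cost, best

def suggested_changes(pairing):
    """Only the non-fixed fields may change: a send's b, a receive's a."""
    changes = []
    for (sa, sb), (ra, rb) in pairing:
        if sb != rb:
            changes.append(f"send by process {sa}: change destination {sb} -> {rb}")
        if ra != sa:
            changes.append(f"receive by process {rb}: change source {ra} -> {sa}")
    return changes

# Hypothetical deadlocked operations.
sends = [(0, 1), (2, 0)]
recvs = [(2, 1), (2, 3)]
cost, pairing = best_pairing(sends, recvs)
changes = suggested_changes(pairing)
```

Here the minimal total distance is 2: the receive in process 1 should name process 0 as its source, and the send in process 2 should name process 3 as its destination.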

The Algorithm
Let G = (V, E) be a directed graph with V = Vs ∪ Vr (Vs senders, Vr receivers).
E is constructed in the following way:
  For all messages m left in message queues:
    If m = (s, r) is an outstanding send (s ∈ Vs, r ∈ Vr), add edge (s, r) to E with capacity 2.
    If m = (r, s) is an outstanding receive (s ∈ Vs, r ∈ Vr), add edge (r, s) to E with capacity 2.
  Iterate backwards through all delivered messages and add edges (u, v) and (v, u) to E with capacity 2 if no other edge exists in E with u or v as source or destination.
  Add edges with capacity 1 to E to make G complete.
Run maximum bipartite graph matching to get a matching.
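One way to read the construction is sketched below. It rests on several assumptions that the slide does not state: the capacities are treated as edge weights, the two edge directions between a sender and a receiver are collapsed into one weight, the step over delivered messages is left out, and the matching is found by an exhaustive memoised search rather than by whatever matching routine the tool uses. All names are hypothetical.

```python
from functools import lru_cache

def build_weights(senders, receivers, outstanding_sends, outstanding_recvs):
    """Weight 2 for pairs named by an outstanding operation, 1 otherwise."""
    weight = {(s, r): 1 for s in senders for r in receivers}  # completion edges
    for s, r in outstanding_sends:   # m = (s, r): s is waiting to send to r
        weight[(s, r)] = 2
    for r, s in outstanding_recvs:   # m = (r, s): r is waiting to receive from s
        weight[(s, r)] = 2
    return weight

def max_weight_matching(senders, receivers, weight):
    """Exhaustive search for a perfect matching of maximum total weight."""
    n = len(senders)
    assert n == len(receivers), "sketch assumes equally many senders and receivers"

    @lru_cache(maxsize=None)
    def solve(i, used):
        # used is a bitmask of receivers already matched to senders[0..i-1]
        if i == n:
            return 0, ()
        best = (-1, ())
        for j in range(n):
            if not used & (1 << j):
                sub_total, sub_pairs = solve(i + 1, used | (1 << j))
                total = weight[(senders[i], receivers[j])] + sub_total
                if total > best[0]:
                    best = (total, ((senders[i], receivers[j]),) + sub_pairs)
        return best

    return solve(0, 0)

# Hypothetical state: A has an outstanding send to C, D an outstanding receive from B.
senders, receivers = ["A", "B"], ["C", "D"]
weight = build_weights(senders, receivers, [("A", "C")], [("D", "B")])
total, pairs = max_weight_matching(senders, receivers, weight)
```

With these inputs the heaviest matching pairs A with C and B with D, since each of those pairs is named by an outstanding operation.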

Theoretical Justification
How accurate is this algorithm? How is accuracy defined?
  Given a working system without deadlocks.
  Introduce a number of errors.
  Run the algorithm.
  If the algorithm often suggests the original working program as a fix, the accuracy is good.
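The experiment described here could be simulated roughly as follows. This is only a sketch under assumptions of its own: the working system is a ring in which process i sends to process i+1, an error overwrites one non-fixed field with a random process identifier (which may happen to be the correct one), and the repair is the brute-force minimal-Hamming-distance pairing rather than the graph-based algorithm. None of these choices comes from the presentation.

```python
import random
from itertools import permutations

def hamming(send, recv):
    return sum(x != y for x, y in zip(send, recv))

def repair(sends, recvs):
    """Pair sends and receives with minimal total distance and return the
    repaired (sender, receiver) pairs, built from the fixed fields only."""
    best = min(permutations(recvs),
               key=lambda perm: sum(hamming(s, r) for s, r in zip(sends, perm)))
    return sorted((s[0], r[1]) for s, r in zip(sends, best))

def accuracy(n_procs, n_errors, trials, seed=0):
    """Fraction of trials in which the repair restores the original system."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        original = [(i, (i + 1) % n_procs) for i in range(n_procs)]
        sends, recvs = list(original), list(original)
        for _ in range(n_errors):
            k = rng.randrange(n_procs)
            wrong = rng.randrange(n_procs)
            if rng.random() < 0.5:
                sends[k] = (sends[k][0], wrong)  # corrupt a send's destination
            else:
                recvs[k] = (wrong, recvs[k][1])  # corrupt a receive's source
        hits += repair(sends, recvs) == sorted(original)
    return hits / trials

baseline = accuracy(n_procs=4, n_errors=0, trials=5)
with_errors = accuracy(n_procs=4, n_errors=2, trials=20)
```

With no errors introduced the repair leaves the system unchanged, so `baseline` is 1.0; `with_errors` gives the fraction of corrupted systems for which the original is recovered under this toy model.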

[Figure slides: examples annotated with 4 moves, 3 moves, and 7 moves; two further example slides; a final example annotated with 3 moves.]