1
On the Complexity of Buffer Allocation in Message Passing Systems
Alex Brodsky, University of British Columbia
Joint work with Jan B. Pedersen & Alan Wagner
2
Outline
  Motivation
  Definitions
  Buffer Allocation Problem
  Buffer Sufficiency Problem
  Nonblocking Buffer Allocation Problem
  Other Models and Related Problems
  Conclusion
3
Motivation
7
What is the Problem?
[Figure: processes issuing send(p2,...), send(p1,...), recv(p2,...)]
Unless there is somewhere to put the messages, the senders will deadlock. So, buffers are used.
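The deadlock described above can be reproduced with a small simulation. This is a minimal sketch under assumptions that are not in the slides: the names `run` and `exchange` are hypothetical, programs are lists of ("send", peer) / ("recv", peer) operations, and buffers live on the receiver. It follows one greedy schedule only, which is enough to show this example but is not a safety proof.

```python
def run(programs, buffers):
    """Greedily execute; True if every process finishes, False if all are stuck."""
    pc = {p: 0 for p in programs}            # index of each process's next operation
    free = {p: buffers.get(p, 0) for p in programs}  # free receiver-side buffers
    stored = {p: [] for p in programs}       # senders whose messages sit in p's buffers

    def op(p):
        return programs[p][pc[p]] if pc[p] < len(programs[p]) else None

    progress = True
    while progress:
        progress = False
        for p in programs:
            cur = op(p)
            if cur is None:
                continue
            kind, peer = cur
            if kind == "send":
                if op(peer) == ("recv", p) and p not in stored[peer]:
                    # Receiver is ready: hand the message over directly.
                    pc[p] += 1
                    pc[peer] += 1
                    progress = True
                elif free[peer] > 0:
                    # Receiver is busy but has a free buffer.
                    free[peer] -= 1
                    stored[peer].append(p)
                    pc[p] += 1
                    progress = True
            elif peer in stored[p]:
                # Receive a message that was buffered earlier.
                stored[p].remove(peer)
                free[p] += 1
                pc[p] += 1
                progress = True
    return all(pc[p] == len(programs[p]) for p in programs)


# Both processes send first and receive second.
exchange = {
    "p1": [("send", "p2"), ("recv", "p2")],
    "p2": [("send", "p1"), ("recv", "p1")],
}
print(run(exchange, {}))          # no buffers anywhere
print(run(exchange, {"p2": 1}))   # one buffer on p2
```

With no buffers neither send can complete, so the run gets stuck; a single buffer on either process lets the exchange finish.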
8
What is the Problem?
[Figure: processes issuing send(p2,...), send(p1,...), send(p2,...), recv(p2,...), recv(p1,...)]
9
Problem Statement
Not all systems have an unrestricted number of buffers, e.g., clusters that offload message passing functionality to the network interface card (NIC).
Hence, we must determine the number of buffers needed for a safe program execution. This is the Buffer Allocation Problem (BAP).
Question: What is the complexity of BAP?
15
Assumptions
  Processes are asynchronous.
  The communication pattern is static, i.e., it doesn't change from execution to execution.
  Send/recv calls are explicitly matched, e.g., send(p2,...) with recv(p1,...).
  Buffers are allocated on the receiver.
  Sends block if no buffers are available and the receiver is not ready.
19
Problem Input
What is the invariant across executions of a program? The static communication pattern.
Use communication graphs to represent communication patterns.
The communication graph becomes the problem input.
20
Communication Graph
[Figure: processes P0 to P5; labels: process, component, time]
Processes are denoted by vertical process arcs (time runs top to bottom).
21
Communication Graph
[Figure: processes P0 and P1 with start, send, recv, and end events]
Events (start, end, send, receive) are denoted by vertices.
22
Communication Graph
[Figure: a send between P0 and P1]
Communication arcs denote sends from one process to another.
23
Communication Graph
[Figure: processes P0 to P5]
Examples of communication graphs.
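One possible in-memory form of a communication graph is sketched below. The representation is an assumption made for illustration, not the authors' data structure: `build_graph` is a hypothetical name, an event is identified by (process, position), and the k-th send from p to q is matched with the k-th receive in q from p.

```python
from collections import defaultdict


def build_graph(programs):
    """Return (events, process_arcs, comm_arcs) for a static communication pattern."""
    events = []          # vertices: (process, position, kind)
    process_arcs = []    # arcs between consecutive events of one process
    sends = defaultdict(list)
    recvs = defaultdict(list)

    for p, ops in programs.items():
        kinds = ["start"] + [kind for kind, _ in ops] + ["end"]
        for i, kind in enumerate(kinds):
            events.append((p, i, kind))
            if i > 0:
                process_arcs.append(((p, i - 1), (p, i)))
        for i, (kind, peer) in enumerate(ops, start=1):
            if kind == "send":
                sends[(p, peer)].append((p, i))
            else:
                recvs[(peer, p)].append((p, i))

    # Sends and receives must pair up exactly (explicit matching).
    for pair in set(sends) | set(recvs):
        if len(sends.get(pair, [])) != len(recvs.get(pair, [])):
            raise ValueError(f"unmatched send/recv between {pair}")

    comm_arcs = []       # arcs from a send event to its matching receive event
    for pair, send_events in sends.items():
        comm_arcs.extend(zip(send_events, recvs[pair]))
    return events, process_arcs, comm_arcs


events, process_arcs, comm_arcs = build_graph({
    "P0": [("send", "P1"), ("recv", "P1")],
    "P1": [("send", "P0"), ("recv", "P0")],
})
print(len(events), len(process_arcs), sorted(comm_arcs))
```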
24
Arrival != Receipt
[Figure: processes P0 to P3 with an arrival interval marked]
The receive event occurs when the message is received.
25
Arrival != Receipt
[Figure: processes P0 to P3 with an arrival interval marked]
Messages can arrive before their receive events.
26
Dependencies
[Figure: processes P0 and P1]
All events depend on start events.
27
Dependencies
[Figure: processes P0 and P1]
Receive events depend on send events.
28
Dependencies
[Figure: processes P0 and P1]
If there are NO buffers, send events depend on receive events.
29
Dependencies
[Figure: processes P2 and P3]
A send event depends on the preceding event.
30
Dependencies
[Figure: processes P2 and P3]
The arrival interval is defined by a receive event and its dependency within the same process.
31
Circular Dependency & Deadlock
With no buffers, the send and receive events depend on each other.
32
Circular Dependency & Deadlock
A circular dependency (with no buffers) represents deadlock.
33
The t-ring
[Figure: processes P0 to P5]
We call such a circular dependency a t-ring, e.g., t = 6.
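A zero-buffer circular dependency can be detected mechanically. The sketch below rests on one modelling assumption that the slides do not spell out: with no buffers, a send can complete only after the receiver has finished the event just before the matching receive. The names `ring` and `has_circular_dependency` are hypothetical.

```python
from collections import defaultdict


def ring(t):
    """A t-ring: process i sends to i+1 and then receives from i-1 (mod t)."""
    return {i: [("send", (i + 1) % t), ("recv", (i - 1) % t)] for i in range(t)}


def has_circular_dependency(programs):
    """True if the zero-buffer dependency relation between events has a cycle."""
    sends, recvs = defaultdict(list), defaultdict(list)
    for p, ops in programs.items():
        for i, (kind, peer) in enumerate(ops, start=1):
            if kind == "send":
                sends[(p, peer)].append((p, i))
            else:
                recvs[(peer, p)].append((p, i))

    deps = {}  # event -> set of events that must complete first
    for p, ops in programs.items():
        deps[(p, 0)] = set()                      # start event
        for i in range(1, len(ops) + 1):
            deps[(p, i)] = {(p, i - 1)}           # preceding event in the process
    for pair, send_events in sends.items():
        for s, r in zip(send_events, recvs[pair]):
            deps[r].add(s)                        # receive depends on its send
            deps[s].add((r[0], r[1] - 1))         # assumed zero-buffer rule

    # Repeatedly complete events whose dependencies are all complete.
    done = set()
    changed = True
    while changed:
        changed = False
        for event, needed in deps.items():
            if event not in done and needed <= done:
                done.add(event)
                changed = True
    return len(done) < len(deps)   # leftover events lie on or behind a cycle


print(has_circular_dependency(ring(6)))
print(has_circular_dependency({"a": [("send", "b")], "b": [("recv", "a")]}))
```

Under this rule every t-ring has a cycle through its t send events, while a plain one-way send does not.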
34
Solving Deadlock
[Figure: processes P0 to P5 with buffer counts 1 and 2]
To solve deadlock, we use buffers.
35
Buffer Assignment
[Figure: processes P0 to P5 with buffer counts 1 and 2]
Each process is assigned 0 or more buffers.
36
Solving Deadlock
[Figure: processes P0 and P1 with 1 buffer]
Initially, neither process can complete a send.
37
Solving Deadlock
[Figure: processes P0 and P1 with 1 buffer]
The message from process 0 is buffered by process 1.
38
Solving Deadlock
[Figure: processes P0 and P1 with 1 buffer]
Process 0 can proceed to receive from process 1.
39
Solving Deadlock
[Figure: processes P0 and P1 with 1 buffer]
Finally, process 1 receives from process 0.
40
Solving Deadlock
[Figure: processes P2 to P5 with 2 buffers]
Initially, none of the sends can complete.
41
Solving Deadlock
[Figure: processes P2 to P5 with 2 buffers]
Since message arrival is nondeterministic, 2 buffers are needed.
42
Solving Deadlock
[Figure: processes P2 to P5 with 2 buffers]
Process P4 can complete its receives.
43
Solving Deadlock
[Figure: processes P2 to P5 with 2 buffers]
Process P3 can complete its receives.
44
Safety
A program is k-safe if k buffers are sufficient to guarantee deadlock-free execution.
45
Buffer Allocation Problem (BAP)
Informal Question: How many buffers does a program need to avoid deadlock?
Formal Question: Given a communication graph, how many buffers are needed to avoid deadlock in the corresponding program?
Decision Question (BAP): Given a communication graph and an integer k, is the corresponding program k-safe?
46
Thm: BAP is NP-hard
Proof by reduction from 3SAT.
3SAT Decision Problem: Does a formula of the form $\bigwedge_i (a_i \lor b_i \lor c_i)$ over n variables have a satisfying assignment, where each $a_i$, $b_i$, $c_i$ is either a variable $x_j$ or its negation?
Idea: For any 3SAT formula, we show how to construct a corresponding communication graph and test it for n-safety (requires n buffers).
Two widgets: one fixes the assignment, the other checks the clauses.
47
The Construction
[Figure: processes x0, ~x0, x1, ~x1, x2, ~x2, x3, ~x3]
For a formula over n variables, create a graph with 2n processes.
48
Fixing the assignment
[Figure: processes x0, ~x0, x1, ~x1, x2, ~x2, x3, ~x3]
The 2-rings are used to fix a variable assignment.
49
Fixing the assignment
[Figure: the 2n processes with four buffers assigned]
A buffer assignment fixes the variables, e.g., ~x0, x1, x2, ~x3. No more than n buffers may be selected (testing for n-safety).
50
Use a 3-ring for each Clause
[Figure: the 2n processes with four buffers assigned; clauses (x0+x1+x3) and (x0+~x2+x3)]
Each clause is represented by a 3-ring, which will not deadlock only if one of its processes has a buffer.
51
Unsatisfied Clauses and 3-rings
[Figure: the 2n processes with four buffers assigned; clauses (x0+x1+x3) and (x0+~x2+x3)]
The first 3-ring does not deadlock. It corresponds to a satisfied clause.
52
Unsatisfied Clauses and 3-rings
[Figure: the 2n processes with four buffers assigned; clauses (x0+x1+x3) and (x0+~x2+x3)]
For this buffer assignment, the second 3-ring will deadlock. The program is n-safe if none of the 3-rings deadlock.
53
A Better Buffer Assignment
[Figure: the 2n processes with four buffers assigned; clauses (x0+x1+x3) and (x0+~x2+x3)]
Buffer assignments of size n that prevent deadlock correspond to satisfying assignments for the formula.
54
Thus, BAP is NP-Hard!
[Figure: the 2n processes with four buffers assigned; clauses (x0+x1+x3) and (x0+~x2+x3)]
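The correspondence between buffer assignments and truth assignments can be checked on the example clauses with a small sketch. It works at the level of rings only and takes the slides' statement at face value that a ring avoids deadlock exactly when at least one of its processes has a buffer; it does not simulate message passing. The names `neg`, `build_rings`, `deadlock_free`, and `safe_assignments` are hypothetical.

```python
from itertools import combinations


def neg(lit):
    """Negate a literal written as 'x0' or '~x0'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def build_rings(variables, clauses):
    """One 2-ring per variable (xj, ~xj) and one 3-ring per clause."""
    two_rings = [{v, neg(v)} for v in variables]
    three_rings = [set(clause) for clause in clauses]
    return two_rings + three_rings


def deadlock_free(rings, buffered):
    """Ring-level abstraction: every ring needs at least one buffered process."""
    return all(ring & buffered for ring in rings)


def safe_assignments(variables, clauses):
    """All ways to place n buffers (one per chosen process) that avoid deadlock."""
    processes = [p for v in variables for p in (v, neg(v))]
    rings = build_rings(variables, clauses)
    return [set(choice)
            for choice in combinations(processes, len(variables))
            if deadlock_free(rings, set(choice))]


variables = ["x0", "x1", "x2", "x3"]
clauses = [("x0", "x1", "x3"), ("x0", "~x2", "x3")]
rings = build_rings(variables, clauses)

print(deadlock_free(rings, {"~x0", "x1", "x2", "~x3"}))  # second 3-ring has no buffer
print(deadlock_free(rings, {"x0", "x1", "x2", "~x3"}))   # a better assignment
print(len(safe_assignments(variables, clauses)))
```

Because the n 2-rings are disjoint, any n buffers that cover all of them pick exactly one of xj and ~xj per variable, so each surviving choice reads directly as a truth assignment. The enumeration is exponential in n, which is consistent with the hardness result rather than a way around it.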
55
Buffer Sufficiency Problem (BSP)
How about something easier? To solve BAP, we need to verify that a buffer assignment is safe. This is also hard!
Decision Problem: Given a communication graph and a buffer assignment, does the buffer assignment yield a safe execution?
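For small programs, the question can be answered by brute force: explore every reachable state and look for one where some process is unfinished and nothing can move. The sketch below is an illustration under assumed semantics (receiver-side buffers, a send completes if the receiver has a free buffer or is waiting at the matching receive); `can_deadlock` and the example program are hypothetical. The search is exponential in general.

```python
def can_deadlock(programs, buffers):
    """True if some execution under the given buffer assignment gets stuck."""
    procs = sorted(programs)
    idx = {p: i for i, p in enumerate(procs)}

    def op(pcs, p):
        i = pcs[idx[p]]
        return programs[p][i] if i < len(programs[p]) else None

    def bump(pcs, i):
        return pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]

    # A state is (program counters, buffered senders per process).
    start = (tuple(0 for _ in procs), tuple(() for _ in procs))
    seen, stack = {start}, [start]
    while stack:
        pcs, stored = stack.pop()
        successors = []
        for p in procs:
            cur = op(pcs, p)
            if cur is None:
                continue
            kind, peer = cur
            a, b = idx[p], idx[peer]
            if kind == "send":
                if len(stored[b]) < buffers.get(peer, 0):
                    # The message occupies one of the receiver's buffers.
                    box = stored[b] + (p,)
                    successors.append((bump(pcs, a), stored[:b] + (box,) + stored[b + 1:]))
                if op(pcs, peer) == ("recv", p) and p not in stored[b]:
                    # The receiver is waiting: direct hand-over, no buffer used.
                    successors.append((bump(bump(pcs, a), b), stored))
            elif peer in stored[a]:
                # Receive a buffered message and release its buffer.
                box = list(stored[a])
                box.remove(peer)
                successors.append((bump(pcs, a), stored[:a] + (tuple(box),) + stored[a + 1:]))
        if not successors and any(op(pcs, p) is not None for p in procs):
            return True
        for state in successors:
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return False


# A 2-ring between P0 and P1, plus P2 whose message may take P1's only buffer.
program = {
    "P0": [("send", "P1"), ("recv", "P1")],
    "P1": [("send", "P0"), ("recv", "P0"), ("recv", "P2")],
    "P2": [("send", "P1")],
}
print(can_deadlock(program, {"P1": 1}))
print(can_deadlock(program, {"P1": 2}))
```

With one buffer on P1, the execution in which P2's message arrives first leaves P0 and P1 blocked on their sends, even though other executions complete. This is why a single successful run says nothing about safety.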
56
Thm: BSP is coNP-complete
Proof by reduction from Tautology: Given a DNF formula, is it true for all assignments?
Construct a communication graph that corresponds to a given formula.
Play a colouring game on the graph which simulates the execution of a program.
There exists a simulation that deadlocks iff the formula is not a tautology.
57
Buffer Stealing
[Figure: processes P2, P3, P4 with 1 buffer]
P2 is blocked on a receive. Which process blocks, P3 or P4? Either execution is possible.
58
Buffer Stealing
[Figure: processes P2, P3, P4 with 1 buffer]
P2 is blocked on a receive. The process whose message arrives first steals the buffer.
59
Fixing an Assignment
[Figure: processes P2, xi, ~xi with 1 buffer]
A buffer stealing widget is used to fix an assignment.
60
Fixing an Assignment
[Figure: processes P2, xi, ~xi with 1 buffer]
Fixing an assignment corresponds to an execution.
61
Terms of a DNF
[Figure: processes T0 to T5, each with 1 buffer]
When does this system deadlock?
62
Terms of a DNF
[Figure: processes T0 to T5, each with 1 buffer]
Message arrival corresponds to a term being false.
63
Terms of a DNF
[Figure: processes T0 to T5, each with 1 buffer]
If message arrival steals all buffers, the t-ring will deadlock.
64
Sketch of proof
A buffer stealing widget forces an execution which corresponds to a variable assignment.
The simulation deadlocks on the sum widget only if the formula is false on the assignment. The sum widget also uses the buffer stealing mechanism.
Hence, a simulation can deadlock iff the corresponding formula is not a tautology.
67
How does this help us?
While these results are interesting, they don't help us solve our problem: how to determine the number of buffers our program needs.
Suppose we add an additional restriction: the program should not block (or deadlock) due to lack of buffers!
Amazingly, this makes our problem tractable!
68
The Nonblocking Buffer Allocation Problem (NBAP)
Informal Question: How many buffers does a program need to execute without blocking on a send?
Decision Question: Given a communication graph and an integer k, are all executions of the corresponding program free of blocked sends when k buffers are available?
Upper bound: the number of receives in each process.
69
When is a buffer needed?
[Figure: processes P0 to P5 with an arrival interval marked]
A buffer is needed only during the arrival interval.
70
How Long is the Interval?
[Figure: processes P0 to P5 with an arrival interval marked]
The interval extends back to the preceding dependency in the same process.
71
Each interval requires a buffer
[Figure: processes P0 to P5]
2 buffers, no overlap
72
Each interval requires a buffer
[Figure: processes P0 to P5]
3 buffers, 1 overlap
73
Each interval requires a buffer
[Figure: processes P0 to P5]
5 buffers, 3 overlaps of size 2, 1 overlap of size 3
74
The Algorithm
[Figure: processes P0 to P5 with maximum overlaps of 3 and 2 marked]
Compute the maximum per-process overlap.
75
The Algorithm
[Figure: processes P0 to P5]
5 = 0 + 0 + 3 + 2 + 0 + 0
Sum the per-process buffers.
78
Implementation
First: detect dependencies to minimize arrival intervals.
  Use depth-first search and dynamic programming.
Second: compute the maximum overlap for each process.
  Sort the arrival intervals of each process.
  Find the maximum overlap.
Total time: O(|V|n + |V| log |V|)
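The second phase (sort the intervals, find the maximum overlap, sum over processes) can be sketched as a sweep over sorted endpoints. The first phase is not shown: arrival intervals are assumed to be already computed and given as half-open (start, end) pairs, so an interval that ends at time t does not overlap one that starts at t. The function names and the interval values are hypothetical, chosen only to reproduce a 3 + 2 = 5 total.

```python
def max_overlap(intervals):
    """Largest number of intervals that are open at the same moment."""
    endpoints = []
    for start, end in intervals:
        endpoints.append((start, 1))    # interval opens
        endpoints.append((end, -1))     # interval closes
    # Sort by time; at equal times, closes (-1) come before opens (+1).
    endpoints.sort()
    best = current = 0
    for _, delta in endpoints:
        current += delta
        best = max(best, current)
    return best


def buffers_needed(intervals_by_process):
    """Per-process maximum overlap and the total number of buffers."""
    per_process = {p: max_overlap(iv) for p, iv in intervals_by_process.items()}
    return per_process, sum(per_process.values())


arrival_intervals = {
    "P0": [],
    "P1": [],
    "P2": [(0, 4), (1, 5), (2, 6)],
    "P3": [(0, 3), (2, 5), (5, 7)],
    "P4": [],
    "P5": [],
}
per_process, total = buffers_needed(arrival_intervals)
print(per_process, total)
```

Sorting dominates the cost of this phase, which matches the |V| log |V| term in the stated running time.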
83
Results
We considered a model where the buffer is allocated on the receive side. Other models include:
  Send side buffers
  Send / recv side buffers
For these we have the following results:

Problem   Recv Side   Send Side   Send/Recv
BAP       NP-hard     NP-hard     NP-hard
BSP       coNP-C      P           coNP-C
NBAP      P           P           NP-hard
84
Conclusions
Solving the buffer allocation problem for programs with static communication patterns and simple communication primitives is NP-hard.
Even verifying a solution to the buffer allocation problem is hard (coNP-complete).
Fortunately, if programs are required to be block free as well as deadlock free, then the problem becomes tractable!
85
Thank you