
1 Implementing Parallel Graph Algorithms, Spring 2015. Lecture 2: Introduction. Roman Manevich, Ben-Gurion University

2 Graph Algorithms are Ubiquitous. Computational biology, social networks, computer graphics.

3 Agenda. Operator formulation of graph algorithms; implementation considerations for sequential graph programs; optimistic parallelization of graph algorithms; introduction to the Galois system.

4 Operator formulation of graph algorithms

5 Main Idea. Define a high-level abstraction of graph algorithms in terms of an operator, a schedule, and a delta. Given a new algorithm, describe it as a composition of these elements. This enables many implementations, from which one can pick the implementation suited to the typical input and architecture.

6 Example: Single-Source Shortest-Path. Problem formulation: compute the shortest distance from source node S to every other node. Many algorithms solve it: Bellman-Ford (1957), Dijkstra (1959), chaotic relaxation (Miranker 1969), delta-stepping (Meyer et al. 1998). Common structure: each node has a label dist holding the currently known shortest distance from S. Key operation: relax-edge(u,v): if dist(u) + w(u,v) < dist(v) then dist(v) := dist(u) + w(u,v). [Figure: example graph; relaxing edge (A,C) with weight 3.]
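As a concrete sketch, relax-edge can be written as a small function (Python; the dict-based dist map is an illustrative assumption, not from the slides):

```python
INF = float("inf")

# relax-edge(u, v): if going through u improves v's distance, update it.
# dist maps node -> currently known shortest distance; w is the weight of (u, v).
def relax_edge(dist, u, v, w):
    if dist[u] + w < dist[v]:
        dist[v] = dist[u] + w
        return True   # v's label changed, so edges out of v may now be active
    return False

dist = {"S": 0, "A": INF}
relax_edge(dist, "S", "A", 2)   # dist["A"] becomes 0 + 2 = 2
```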

7 Dijkstra's Algorithm. Scheduling of relaxations: use a priority queue of nodes, ordered by label dist. Iterate over nodes u in priority order; on each step, relax all neighbors v of u, i.e. apply relax-edge to every edge (u,v). [Figure: example graph.]
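A runnable sketch of this Dijkstra-style schedule, assuming an adjacency-list dict (the representation is an assumption, not from the slides):

```python
import heapq

# Dijkstra-style scheduling: a priority queue orders nodes by dist,
# and popping a node relaxes all of its outgoing edges.
def dijkstra(graph, src):
    dist = {v: float("inf") for v in graph}
    dist[src] = 0
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue            # stale queue entry, skip it
        for v, w in graph[u]:   # apply relax-edge(u, v) to every neighbor v
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

graph = {"S": [("A", 2), ("B", 5)], "A": [("B", 1)], "B": []}
# dist: S=0, A=2, B=3
```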

8 Chaotic Relaxation. Scheduling of relaxations: use an unordered set of edges. Iterate over edges (u,v) in any order; on each step, apply relax-edge to the edge (u,v). [Figure: example graph with worklist (S,A), (B,C), (C,D), (C,E).]
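The same relaxation under a chaotic schedule can be sketched as follows (adjacency-list dict assumed, as before): the worklist is an unordered set, so the pop order is arbitrary, yet the result is the same shortest distances.

```python
# Chaotic relaxation: an unordered worklist of edges; any processing
# order converges to the same fixpoint.
def chaotic_sssp(graph, src):
    dist = {v: float("inf") for v in graph}
    dist[src] = 0
    work = set((src, v, w) for v, w in graph[src])
    while work:
        u, v, w = work.pop()          # arbitrary order
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w
            for y, wy in graph[v]:    # v's outgoing edges may now be active
                work.add((v, y, wy))
    return dist

graph = {"S": [("A", 2), ("B", 5)], "A": [("B", 1)], "B": []}
```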

9 Algorithms as Scheduled Operators. Graph Algorithm = Operator(s) + Schedule.

Dijkstra-style:

    Q = PQueue[Node]
    Q.enqueue(S)
    while Q ≠ ∅ {
      u = Q.pop
      foreach edge (u,v,w) {
        if d(u) + w < d(v) {
          d(v) := d(u) + w
          Q.enqueue(v)
        }
      }
    }

Chaotic relaxation:

    W = Set[Edge]
    W ∪= { (S,y) : y ∈ Nbrs(S) }
    while W ≠ ∅ {
      (u,v,w) = W.get
      if d(u) + w < d(v) {
        d(v) := d(u) + w
        foreach y ∈ Nbrs(v)
          W.add(v,y)
      }
    }

10 Deconstructing Schedules. A graph algorithm = operators + schedule. The operator specifies what should be done; the schedule specifies how. The schedule has a static part (code structure: loops) and a dynamic part: the order of activity processing (e.g. priority in a work queue) and the identification of new activities (the delta). Algorithms are unordered or ordered ("TAO of parallelism", PLDI'11).

11 Example. Graph Algorithm = Operators + Schedule. For SSSP the operator is relax-edge; the Dijkstra-style and chaotic-relaxation pseudocode on the previous slide differ only in the schedule: how activity processing is ordered and how new activities are identified.

12 SSSP in Elixir

    Graph [ nodes(node : Node, dist : int)
            edges(src : Node, dst : Node, wt : int) ]

    relax = [ nodes(node a, dist ad)
              nodes(node b, dist bd)
              edges(src a, dst b, wt w)
              bd > ad + w ] ➔ [ bd = ad + w ]

    sssp = iterate relax ≫ schedule

The first declaration is the graph type, relax is the operator, and sssp is the fixpoint statement.

13 Operators. In relax, the redex pattern is the subgraph the operator reads (nodes a and b and the edge from a to b), the guard is bd > ad + w, and the update is bd = ad + w. [Figure: if bd > ad + w, node b's dist label is rewritten to ad + w.]

14 Fixpoint Statement. sssp = iterate relax ≫ schedule applies the operator until fixpoint; the expression to the right of ≫ is the scheduling expression.

15 Scheduling Examples. Dijkstra-style: metric ad ≫ group b. Locality-enhanced label-correcting: group b ≫ unroll 2 ≫ approx metric ad. The Dijkstra-style schedule corresponds to:

    q = new PrQueue
    q.enqueue(SRC)
    while (!q.empty) {
      a = q.dequeue
      for each e = (a,b,w) {
        if dist(a) + w < dist(b) {
          dist(b) = dist(a) + w
          q.enqueue(b)
        }
      }
    }

16 Implementation considerations for sequential graph programs

17 A parallel graph algorithm = operators + schedule (static schedule plus dynamic schedule: order activity processing, identify new activities). The operator's delta, which identifies new activities, can be inferred automatically.

18 Finding the Operator delta

19 Problem Statement. Many graph programs have the form: until no change do { apply operator }. A naive implementation keeps searching the whole graph for places where the operator can make a change; this is too slow. An incremental implementation, after applying the operator, finds the smallest set of future active elements and schedules them (adds them to a worklist).
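The contrast can be sketched in Python (the graph representation and the check counter are illustrative assumptions): the naive version rescans every edge each round, while the incremental version revisits only edges whose source label changed.

```python
# Naive fixpoint vs. incremental worklist for SSSP relaxation.
# Returns (distances, number of guard checks performed).
def sssp(graph, src, incremental):
    dist = {v: float("inf") for v in graph}
    dist[src] = 0
    checks = 0
    if incremental:
        work = [(src, v, w) for v, w in graph[src]]
        while work:
            u, v, w = work.pop()
            checks += 1
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                work += [(v, y, wy) for y, wy in graph[v]]
    else:
        changed = True
        while changed:           # rescan all edges until a full pass is quiet
            changed = False
            for u in graph:
                for v, w in graph[u]:
                    checks += 1
                    if dist[u] + w < dist[v]:
                        dist[v] = dist[u] + w
                        changed = True
    return dist, checks
```

On a simple chain graph the incremental version checks each edge once, while the naive version needs a final full pass just to detect quiescence.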

20 Identifying the Delta of an Operator. [Figure: after relax is applied to edge (a,b), which other edges may become active?]

21 Delta Inference Example. Consider a second edge (c,b), with weight w2, incoming into b alongside the relaxed edge (a,b) with weight w1. Pose an SMT query:

    assume (da + w1 < db)
    assume ¬(dc + w2 < db)
    db_post = da + w1
    assert ¬(dc + w2 < db_post)

The solver proves the assertion: the edge (c,b) does not become active.

22 Delta Inference Example – Active. Now consider an outgoing edge (b,c) with weight w2:

    assume (da + w1 < db)
    assume ¬(db + w2 < dc)
    db_post = da + w1
    assert ¬(db_post + w2 < dc)

Here the assertion can fail, since lowering db may activate b's outgoing edges. Therefore apply relax on all outgoing edges (b,c) such that dc > db + w2 and c ≠ a.
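Both queries can be sanity-checked without an SMT solver by brute force over small integer values (an illustrative sketch; an actual implementation would discharge the queries with a solver such as Z3):

```python
from itertools import product

# Brute-force check of the two delta-inference queries over a small integer grid.
def holds_for_all(query):
    for da, db, dc, w1, w2 in product(range(6), repeat=5):
        if not query(da, db, dc, w1, w2):
            return False
    return True

# Slide 21: incoming edge (c,b). Valid: (c,b) never becomes active.
def incoming(da, db, dc, w1, w2):
    if da + w1 < db and not (dc + w2 < db):
        db_post = da + w1
        return not (dc + w2 < db_post)
    return True   # assumptions not met: query vacuously holds

# Slide 22: outgoing edge (b,c). Invalid: (b,c) can become active.
def outgoing(da, db, dc, w1, w2):
    if da + w1 < db and not (db + w2 < dc):
        db_post = da + w1
        return not (db_post + w2 < dc)
    return True
```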

23 Influence Patterns. [Figure: the ways an applied rewrite over edge (a,b) can overlap a potential redex (c,d): b=c, a=c, a=d, b=d, and their combinations.]

24 Implementing the operator

25 Example: Triangle Counting. Count how many triangles exist in a graph (or count them per node). Useful for estimating the community structure of a network.

26 Triangles Pseudo-code

    for a : nodes do
      for b : nodes do
        for c : nodes do
          if edges(a,b)
            if edges(b,c)
              if edges(c,a)
                if a < b
                  if b < c
                    if a < c
                      triangles++
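A direct, runnable transcription of the pseudocode (assuming, for illustration, that the graph is given as a node range and a set of edge pairs containing both orientations of each undirected edge):

```python
# Naive triangle counting: triple loop over all node combinations.
def count_triangles(nodes, edges):
    triangles = 0
    for a in nodes:
        for b in nodes:
            for c in nodes:
                if (a, b) in edges and (b, c) in edges and (c, a) in edges:
                    if a < b and b < c and a < c:  # count each triangle once
                        triangles += 1
    return triangles

# Helper: make an undirected edge set from a list of pairs.
def undirected(pairs):
    return {(u, v) for u, v in pairs} | {(v, u) for u, v in pairs}
```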

27 Example: Triangles. The pseudocode decomposes into iterators (the three for loops), graph conditions (the edges tests), and scalar conditions (the comparisons).

28 Triangles: Reordering. The iterators and conditions can be reordered subject to their dependences: each condition can be hoisted to just below the last iterator that binds one of its variables, e.g. edges(a,b) and a < b right after the loop over b.

29 Triangles: Reordering + Implementation Selection. An implementation-selection tile

    for x : nodes do if edges(x,y)  ⇩  for x : Succ(y) do

turns the node iterators into successor iterators:

    for a : nodes do
      for b : Succ(a) do
        for c : Succ(b) do
          if edges(c,a)
            if a < b
              if b < c
                if a < c
                  triangles++
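The selected implementation can be sketched with successor sets (an adjacency dict is assumed); on the same graph it computes the same count as the naive triple loop:

```python
# Triangle counting over successor sets: the two inner loops shrink
# from all nodes to the adjacency lists of a and b.
def count_triangles_fast(succ):
    triangles = 0
    for a in succ:
        for b in succ[a]:
            for c in succ[b]:
                if a in succ[c] and a < b and b < c and a < c:
                    triangles += 1
    return triangles
```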

30 Optimistic parallelization of graph programs

31 Parallelism is Everywhere. Supercomputers (Texas Advanced Computing Center), laptops, cell-phones.

32 Example: Boruvka's Algorithm for MST

33 Minimum Spanning Tree Problem. [Figure: weighted undirected graph over nodes a–g.]

34 Minimum Spanning Tree Problem. [Figure: the same graph with its minimum spanning tree highlighted.]

35 Boruvka's Minimum Spanning Tree Algorithm. Build the MST bottom-up:

    repeat {
      pick arbitrary node 'a'
      merge with lightest neighbor 'lt'
      add edge 'a-lt' to MST
    } until graph is a single node

[Figure: merging a with its lightest neighbor c into node a,c.]
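A runnable sketch of Boruvka's algorithm (using a union-find over components rather than explicit node merging, which is an implementation choice, not from the slides; distinct edge weights are assumed for simplicity):

```python
# Boruvka's MST: each round, every component picks its lightest outgoing
# edge, and all picked edges are contracted (via union-find).
def boruvka_mst(num_nodes, edges):
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    mst_weight, mst_edges, components = 0, [], num_nodes
    while components > 1:
        cheapest = {}   # component root -> lightest edge leaving it
        for u, v, w in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][2]:
                    cheapest[r] = (u, v, w)
        if not cheapest:
            break       # graph is disconnected
        for u, v, w in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                mst_weight += w
                mst_edges.append((u, v, w))
                components -= 1
    return mst_weight, mst_edges
```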

36 Parallelism in Boruvka. [Figure: several nodes can be merged with their lightest neighbors at the same time.]

37 Non-conflicting Iterations. [Figure: two merges operating on disjoint regions of the graph (around a–d and around e–g) do not conflict.]

38 Non-conflicting Iterations. [Figure: the resulting graph after both merges, with contracted nodes a,c and f,g.]

39 Conflicting Iterations. [Figure: two merges that share part of the graph conflict and cannot run in parallel.]

40 Optimistic parallelization of graph algorithms

41 How to Parallelize Graph Algorithms. Optimistic parallelization ("The TAO of Parallelism in Graph Algorithms", PLDI 2011), implemented by the Galois system.

42 Operator Formulation of Algorithms
– Active element: site where computation is needed
– Operator: the computation at an active element; an activity is the application of the operator to an active element
– Neighborhood: the set of nodes/edges read or written by an activity (usually distinct from the node's neighbors in the graph)
– Ordering: scheduling constraints on the execution order of activities; unordered algorithms have no semantic constraints, though performance may depend on the schedule, while ordered algorithms have a problem-dependent order
– Amorphous data-parallelism: multiple active elements can be processed in parallel, subject to neighborhood and ordering constraints
Parallel program = Operator + Schedule + Parallel data structure

43 Optimistic Parallelization in Galois
– Programming model: client code has sequential semantics; a library provides concurrent data structures
– Parallel execution model: activities are executed speculatively
– Runtime conflict detection: each node/edge has an associated exclusive lock; graph operations acquire locks on the nodes/edges they read or write; a lock owned by another thread means a conflict, and the iteration is rolled back; all locks are released at the end
– Runtime book-keeping (the source of overhead): locking and undo actions

44 Avoiding rollbacks

45 Cautious Operators. When an iteration aborts before completing its work, we need to undo all of its changes: log each change to the graph and, upon abort, apply the reverse actions in reverse order. This is expensive to maintain, and it is not supported by the Galois system for C++. How can we avoid maintaining rollback data? An operator is cautious if it never performs changes before acquiring all of its locks; in that case, upon abort there are no changes to undo. We can ensure an operator is cautious by adding code that acquires all locks before making any changes.
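The cautious discipline can be sketched as follows (hypothetical helper names, not the Galois API; Galois performs its locking inside the graph operations themselves):

```python
import threading

# A cautious operator acquires every lock in its neighborhood before
# performing any change, so an aborted activity has nothing to undo.
def cautious_apply(locks, neighborhood, change):
    acquired = []
    for n in sorted(neighborhood):   # fixed order avoids deadlock in this sketch
        if not locks[n].acquire(timeout=1):
            for l in acquired:
                l.release()
            return False             # abort: no changes were made, nothing to undo
        acquired.append(locks[n])
    try:
        change()                     # all locks held: safe to mutate
        return True
    finally:
        for l in acquired:
            l.release()

locks = {0: threading.Lock(), 1: threading.Lock()}
data = {0: 0, 1: 0}
ok = cautious_apply(locks, {0, 1},
                    lambda: data.update({0: data[0] + 1, 1: data[1] + 1}))
```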

46 Failsafe Points. A program point P is failsafe if for every future program point Q, the locks needed at Q are already held at P: ∀Q : Reaches(P,Q) ⇒ Locks(Q) ⊆ ACQ(P). In the Boruvka operator below, the lockset grows until the call g.neighbors(lt); from then on it is stable, and all later points are failsafe.

    foreach (Node a : wl) {
      Set aNghbrs = g.neighbors(a);
      Node lt = null;
      for (Node n : aNghbrs) {
        minW,lt = minWeightEdge((a,lt), (a,n));
      }
      g.removeEdge(a, lt);
      Set ltNghbrs = g.neighbors(lt);
      for (Node n : ltNghbrs) {
        Edge e = g.getEdge(lt, n);
        Weight w = g.getEdgeData(e);
        Edge an = g.getEdge(a, n);
        if (an != null) {
          Weight wan = g.getEdgeData(an);
          if (wan.compareTo(w) < 0) w = wan;
          g.setEdgeData(an, w);
        } else {
          g.addEdge(a, n, w);
        }
      }
      g.removeNode(lt);
      mst.add(minW);
      wl.add(a);
    }

47 Is this Code Cautious? No. In the code above, g.removeEdge(a, lt) modifies the graph while the lockset is still growing: the neighbors of lt are only locked afterwards, by the call g.neighbors(lt).

48 Rewrite as Cautious Operator. Move the lock acquisition before the first mutation by calling g.neighbors(lt) before g.removeEdge(a, lt):

      ...
      g.neighbors(lt);        // acquires locks on lt's neighborhood
      g.removeEdge(a, lt);
      ...

After this call the lockset is stable, so every later point is failsafe and no rollback data is needed.

49 So far: operator formulation of graph algorithms; implementation considerations for sequential graph programs; optimistic parallelization of graph algorithms; introduction to the Galois system.

50 Next Steps. Divide into groups. Algorithm proposal (due 15/4): phrase your algorithm in terms of the operator formulation, define the delta if necessary, and submit a proposal with a description of the algorithm plus pseudo-code; a LaTeX template will be on the website soon. The lecture on 15/4 covers implementing your algorithm via Galois.

