Roman Manevich Rashid Kaleem Keshav Pingali University of Texas at Austin Synthesizing Concurrent Graph Data Structures: a Case Study.


1 Synthesizing Concurrent Graph Data Structures: a Case Study
Roman Manevich, Rashid Kaleem, Keshav Pingali
University of Texas at Austin

2 vision
Problem: how to utilize parallel hardware
Programming model for parallel applications
High-level language for parallelism
  Program in terms of sequential semantics
  Choose tuning parameters for better performance
  Decouple semantics from implementation
Compiler synthesizes parallel code
Correctness guarantees
  Avoids usual pitfalls: deadlocks, data races, etc.
  For any value of tuning parameters

3 this talk
The slide repeats the vision outline from slide 2 and annotates it with this talk's instances:
  Parallelizing graph algorithms
  Implementing concurrent graph data structures
  Relational algebra
  Relation decomposition and tiling
  Autograph generates Java code
  Linearizability
  Speculation support: abstract locks + undos

4 context
Graph algorithms are ubiquitous
  Computational biology
  Social networks
  Computer graphics

5 organization
Speculative parallelism background
  Speculative parallelization via Galois
  Data structures for speculative parallelism
Autograph
  Specifying relational data structures
  Optimizations
Empirical evaluation
  Outperform library data structures by up to 2x

6 minimum spanning tree problem
[Figure: weighted undirected graph on nodes a to g; edge weights 2, 4, 6, 5, 3, 7, 4, 1]

7 (untitled)
[Figure: the same graph on nodes a to g]

8 Boruvka’s algorithm
Build MST bottom-up:
  repeat {
    pick arbitrary node ‘a’
    merge with lightest neighbor ‘lt’
    add edge ‘a-lt’ to MST
  } until graph is a single node
[Figure: the example graph before and after one step, in which a and c are merged into a single node "a,c"]
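The contraction loop on this slide can be sketched sequentially in Java. This is only an illustration under stated assumptions: the class name BoruvkaSketch, the adjacency-map representation, and the example graph (taken from the edge table on slide 22, not from this slide's figure) are not from the talk, and parallel edges created by a merge are resolved by keeping the lighter one.

```java
import java.util.HashMap;
import java.util.Map;

class BoruvkaSketch {
    /** Adds an undirected edge u-v with weight w to a symmetric adjacency map. */
    static void addEdge(Map<String, Map<String, Integer>> g, String u, String v, int w) {
        g.computeIfAbsent(u, k -> new HashMap<>()).put(v, w);
        g.computeIfAbsent(v, k -> new HashMap<>()).put(u, w);
    }

    /** Example graph with the edges listed on slide 22 (an assumption for this sketch). */
    static Map<String, Map<String, Integer>> exampleGraph() {
        Map<String, Map<String, Integer>> g = new HashMap<>();
        addEdge(g, "a", "b", 5);
        addEdge(g, "a", "c", 2);
        addEdge(g, "b", "d", 4);
        addEdge(g, "c", "d", 7);
        addEdge(g, "d", "e", 1);
        addEdge(g, "e", "f", 6);
        return g;
    }

    /** Contracts a connected graph to a single node; returns the MST weight. Mutates g. */
    static int mstWeight(Map<String, Map<String, Integer>> g) {
        int total = 0;
        while (g.size() > 1) {
            String a = g.keySet().iterator().next();       // pick arbitrary node 'a'
            Map<String, Integer> nbrs = g.get(a);
            if (nbrs.isEmpty()) throw new IllegalArgumentException("graph is disconnected");
            String lt = null;
            int min = Integer.MAX_VALUE;
            for (Map.Entry<String, Integer> e : nbrs.entrySet()) {
                if (e.getValue() < min) { min = e.getValue(); lt = e.getKey(); }
            }
            total += min;                                   // add edge 'a-lt' to MST
            Map<String, Integer> ltNbrs = g.remove(lt);     // merge 'lt' into 'a'
            nbrs.remove(lt);
            for (Map.Entry<String, Integer> e : ltNbrs.entrySet()) {
                String n = e.getKey();
                if (n.equals(a)) continue;
                int w = e.getValue();
                g.get(n).remove(lt);
                // Keep only the lighter of two parallel edges between 'a' and n.
                if (w < nbrs.getOrDefault(n, Integer.MAX_VALUE)) {
                    nbrs.put(n, w);
                    g.get(n).put(a, w);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(mstWeight(exampleGraph())); // prints 18
    }
}
```

On the slide-22 graph the only cycle is a-b-d-c, so the tree drops the weight-7 edge c-d and keeps the other five edges, whichever node is picked first.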

9 parallelism in Boruvka
[Figure: the example graph on nodes a to g, next to the pseudocode from slide 8]

10 non-conflicting iterations
[Figure: the graph drawn as two regions, nodes a, b, c, d (weights 2, 5, 3, 7, 4, 1) and nodes e, f, g (weights 4, 6), next to the pseudocode from slide 8]

11 non-conflicting iterations
[Figure: the graph after two merges, with merged nodes "a,c" and "f,g", next to the pseudocode from slide 8]

12 conflicting iterations
[Figure: the example graph on nodes a to g, next to the pseudocode from slide 8]

13 Amorphous data-parallelism
Algorithm = repeated application of operator to graph
  Active node: node where computation is needed
  Activity: application of operator to active node
  Neighborhood: sub-graph read/written to perform activity
Unordered algorithms: active nodes can be processed in any order
Parallel execution of activities, subject to neighborhood constraints
Neighborhoods unknown at compile time
  Use speculation
[Figure: graph with activities i1, i2, i3]

14 optimistic parallelization in Galois
Programming model
  Client code has sequential semantics
  Library of concurrent data structures
Parallel execution model
  Thread-level speculation (TLS)
  Activities executed speculatively
Conflict detection
  Each node has associated exclusive lock
  Graph operations acquire locks on accessed nodes
  Lock owned by another thread → conflict → iteration rollback
[Figure: graph with activities i1, i2, i3]
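The conflict-detection rule on this slide (one exclusive lock per node, conflict when the lock is owned by someone else) might look roughly like the following. This is a minimal sketch, not the Galois runtime's code: the names NodeLockSketch, Node, Iteration, acquire and releaseAll are hypothetical, and owners are identified by an integer iteration id instead of a thread so the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class NodeLockSketch {
    static final int FREE = -1;

    /** A graph node whose exclusive lock stores the id of the owning iteration. */
    static final class Node {
        final String name;
        final AtomicInteger owner = new AtomicInteger(FREE);
        Node(String name) { this.name = name; }
    }

    /** One speculative iteration (activity); ids are assumed non-negative and unique. */
    static final class Iteration {
        final int id;
        final List<Node> held = new ArrayList<>();
        Iteration(int id) { this.id = id; }

        /** Returns true if this iteration owns the node afterwards, false on conflict. */
        boolean acquire(Node n) {
            if (n.owner.get() == id) return true;     // already held by this iteration
            if (n.owner.compareAndSet(FREE, id)) {    // lock was free: take it
                held.add(n);
                return true;
            }
            return false;                             // owned by another iteration
        }

        /** Called on commit or after rollback: release every lock held. */
        void releaseAll() {
            for (Node n : held) n.owner.set(FREE);
            held.clear();
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a"), c = new Node("c"), e = new Node("e");
        Iteration i1 = new Iteration(1), i2 = new Iteration(2);
        System.out.println(i1.acquire(a) && i1.acquire(c)); // true
        System.out.println(i2.acquire(e));                  // true
        System.out.println(i2.acquire(c));                  // false: conflict, i2 must roll back
        i2.releaseAll();
        i1.releaseAll();
    }
}
```

A false return from acquire is the point at which a runtime would trigger the iteration rollback named on the slide; how the rollback restores state is outside this sketch.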

15 concurrent data structure contract
Linearizability [Herlihy & Wing TOPLAS’90]
  Method calls should appear to execute atomically
  Synchronization w.r.t. concrete data structure
Support speculation [Pingali et al. PLDI’07] [Herlihy & Koskinen PPoPP’08]
  Methods acquire abstract locks
  Synchronization w.r.t. abstract data type
  Methods should register undo actions for rollback
(Data-race freedom)
(Deadlock freedom)
(Non-blocking methods)
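The "register undo actions for rollback" part of the contract can be illustrated with a small undo log. The sketch below is an assumption-laden illustration, not the library's mechanism: the class UndoLogSketch and its methods are hypothetical, it models a single iteration working on a plain set, and it leaves out abstract locks and thread safety.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class UndoLogSketch {
    private final Set<String> nodes = new HashSet<>();
    // Inverse operations, most recent first.
    private final Deque<Runnable> undoLog = new ArrayDeque<>();

    boolean add(String n) {
        boolean changed = nodes.add(n);
        if (changed) undoLog.push(() -> nodes.remove(n)); // undo of add is remove
        return changed;
    }

    boolean remove(String n) {
        boolean changed = nodes.remove(n);
        if (changed) undoLog.push(() -> nodes.add(n));    // undo of remove is add
        return changed;
    }

    boolean contains(String n) { return nodes.contains(n); }

    /** The iteration succeeded: its effects become permanent. */
    void commit() { undoLog.clear(); }

    /** The iteration conflicted: replay the inverses in reverse order. */
    void rollback() {
        while (!undoLog.isEmpty()) undoLog.pop().run();
    }

    public static void main(String[] args) {
        UndoLogSketch s = new UndoLogSketch();
        s.add("a");
        s.commit();
        s.remove("a");
        s.add("b");
        s.rollback();
        System.out.println(s.contains("a") + " " + s.contains("b")); // prints "true false"
    }
}
```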

16 library graph data structure
Set of nodes:
[Figure: the nodes a to f stored in linked lists indexed by thread id (0 to 3), with dummy cells and next pointers; in_flag=1]
Boruvka only removes nodes

17 customized graph data structure
Set of nodes:
[Figure: the nodes a to f in a single linked list via next pointers; in_flag=1; remove(d) is applied]

18 customized graph data structure
Set of nodes:
[Figure: the same list after remove(d); one node is shown with in_flag=0, the others with in_flag=1]
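One way to read slides 17 and 18 is that, because Boruvka only removes nodes, the customized set can keep its list structure fixed and implement remove by clearing a per-node flag. The following sketch assumes that reading; FlaggedNodeSet, Cell and inFlag are hypothetical names, and the generated Autograph code may well differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class FlaggedNodeSet {
    /** A list cell; inFlag == true corresponds to in_flag=1 on the slides. */
    static final class Cell {
        final String node;
        final AtomicBoolean inFlag = new AtomicBoolean(true);
        final Cell next;
        Cell(String node, Cell next) { this.node = node; this.next = next; }
    }

    private final Cell head;

    /** The list is built once; afterwards cells are never unlinked. */
    FlaggedNodeSet(String... nodes) {
        Cell h = null;
        for (int i = nodes.length - 1; i >= 0; i--) h = new Cell(nodes[i], h);
        head = h;
    }

    /** Logical removal: flips the flag; returns false if absent or already removed. */
    boolean remove(String node) {
        for (Cell c = head; c != null; c = c.next) {
            if (c.node.equals(node)) return c.inFlag.compareAndSet(true, false);
        }
        return false;
    }

    boolean contains(String node) {
        for (Cell c = head; c != null; c = c.next) {
            if (c.node.equals(node)) return c.inFlag.get();
        }
        return false;
    }

    /** Nodes whose flag is still set, in list order. */
    List<String> members() {
        List<String> out = new ArrayList<>();
        for (Cell c = head; c != null; c = c.next) {
            if (c.inFlag.get()) out.add(c.node);
        }
        return out;
    }

    public static void main(String[] args) {
        FlaggedNodeSet s = new FlaggedNodeSet("a", "b", "c", "d", "e", "f");
        s.remove("d");
        System.out.println(s.members()); // prints [a, b, c, e, f]
    }
}
```

Since next pointers never change after construction, concurrent removals only contend on the flag of the node being removed, and an undo action only needs to set that flag back.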

19 organization
Speculative parallelism background
  Speculative parallelization via Galois
  Data structures for speculative parallelism
Autograph
  Specifying relational data structures
  Optimizations
Empirical evaluation
  Outperform library data structures by up to 2x

20 high-level spec at a glance
Structure
  nodes : rel(node)
  edges : rel(src, dst, wt)
  FD {src, dst} → {wt}
  FK src → node
  FK dst → node
Methods
  edgeExists : contains(src, dst)
  removeNode : remove(node)
  findMin : map(src, out dst, out wt) {
    if (wt < minWeight) { lt = dst; minWeight = wt; }
  }
  ... other methods ...
Decomposition
  nodes : Set(node)
  edges : List(
    edgesOut : Map(src, succ : Map(dst, wt))
    edgesIn : Map(dst, pred : Set(src))
  )
Tiling
  edges tile ListTile
  nodes tile AttArrLinkedSet
  edgesOut tile AttMap
  edgesIn tile AttMap
  succ tile DualArrayMap
  pred tile ArraySet
Slide labels: semantics, implementation

21 specifying a graph for Boruvka
Structure
  nodes : rel(node)
  edges : rel(src, dst, wt)
  FD {src, dst} → {wt}
  FK src → node
  FK dst → node

22 relational representation of graph
Structure: as on slide 21

nodes
  node
  a
  b
  c
  d
  e
  f

edges
  src  dst  wt
  a    b    5
  a    c    2
  b    d    4
  c    d    7
  d    e    1
  e    f    6
  b    a    5
  c    a    2
  d    b    4
  d    c    7
  e    d    1
  f    e    6

[Figure: the corresponding graph on nodes a to f, with edge weights 2, 5, 7, 4, 1, 6]
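The functional dependency and the two foreign keys in the Structure section can be checked mechanically against the tuples on this slide. The sketch below is only meant to make the constraints concrete; RelationSpecSketch, Edge, satisfiesFD and satisfiesFK are hypothetical names and not part of Autograph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RelationSpecSketch {
    /** One tuple of edges : rel(src, dst, wt). */
    static final class Edge {
        final String src, dst;
        final int wt;
        Edge(String src, String dst, int wt) { this.src = src; this.dst = dst; this.wt = wt; }
    }

    /** FD {src, dst} -> {wt}: no two tuples agree on (src, dst) but differ on wt. */
    static boolean satisfiesFD(List<Edge> edges) {
        Map<List<String>, Integer> wtOf = new HashMap<>();
        for (Edge e : edges) {
            Integer prev = wtOf.putIfAbsent(Arrays.asList(e.src, e.dst), e.wt);
            if (prev != null && prev != e.wt) return false;
        }
        return true;
    }

    /** FK src -> node and FK dst -> node: every endpoint appears in the nodes relation. */
    static boolean satisfiesFK(Set<String> nodes, List<Edge> edges) {
        for (Edge e : edges) {
            if (!nodes.contains(e.src) || !nodes.contains(e.dst)) return false;
        }
        return true;
    }

    /** The twelve tuples of the edges table on slide 22. */
    static List<Edge> exampleEdges() {
        String[][] undirected = {
            {"a", "b", "5"}, {"a", "c", "2"}, {"b", "d", "4"},
            {"c", "d", "7"}, {"d", "e", "1"}, {"e", "f", "6"}};
        List<Edge> edges = new ArrayList<>();
        for (String[] t : undirected) {
            int wt = Integer.parseInt(t[2]);
            edges.add(new Edge(t[0], t[1], wt));
            edges.add(new Edge(t[1], t[0], wt)); // the table stores both directions
        }
        return edges;
    }
}
```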

23 specifying methods
Structure and relation tables: as on slides 21 and 22
Methods
  edgeExists : contains(src, dst)
  removeNode : remove(node)
  findMin : map(src, out dst, out wt) {
    if (wt < minWeight) { lt = dst; minWeight = wt; }
  }
  ... other methods ...
can we implement efficiently?

24 decomposing relations
Structure and Methods: as on slide 23
Decomposition
  nodes : Set(node)
  edges : List(
    edgesOut : Map(src, succ : Map(dst, wt))
    edgesIn : Map(dst, pred : Set(src))
  )

25 decomposed representation
Decomposition
  nodes : Set(node)
  edges : List(
    edgesOut : Map(src, succ : Map(dst, wt))
    edgesIn : Map(dst, pred : Set(src))
  )
Methods: as on slide 23

nodes: a, b, c, d, e, f

edgesOut
  src  succ (dst wt)
  a    b 5, c 2
  b    d 4, a 5
  c    d 7, a 2
  d    e 1, b 4, c 7
  e    f 6, d 1
  f    e 6

edgesIn
  dst  pred (src)
  a    b, c
  b    a, d
  c    a, d
  d    b, c, e
  e    d, f
  f    e
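The decomposition on this slide maps naturally onto nested Java collections. The sketch below uses java.util maps and sets in place of Autograph's tiles, which is an assumption made for brevity; DecomposedGraphSketch and insertEdge are hypothetical names, while edgeExists and removeNode follow the Methods section.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DecomposedGraphSketch {
    final Set<String> nodes = new HashSet<>();                             // nodes : Set(node)
    final Map<String, Map<String, Integer>> edgesOut = new HashMap<>();    // src -> (dst -> wt)
    final Map<String, Set<String>> edgesIn = new HashMap<>();              // dst -> set of src

    /** Inserts one directed tuple (src, dst, wt) into both sub-relations. */
    void insertEdge(String src, String dst, int wt) {
        nodes.add(src);
        nodes.add(dst);
        edgesOut.computeIfAbsent(src, k -> new HashMap<>()).put(dst, wt);
        edgesIn.computeIfAbsent(dst, k -> new HashSet<>()).add(src);
    }

    /** edgeExists : contains(src, dst), answered from edgesOut alone. */
    boolean edgeExists(String src, String dst) {
        Map<String, Integer> succ = edgesOut.get(src);
        return succ != null && succ.containsKey(dst);
    }

    /** removeNode : remove(node); edgesIn locates incoming edges without a full scan. */
    void removeNode(String n) {
        nodes.remove(n);
        Map<String, Integer> succ = edgesOut.remove(n);
        if (succ != null) {
            for (String dst : succ.keySet()) {
                Set<String> predsOfDst = edgesIn.get(dst);
                if (predsOfDst != null) predsOfDst.remove(n);
            }
        }
        Set<String> preds = edgesIn.remove(n);
        if (preds != null) {
            for (String src : preds) {
                Map<String, Integer> succOfSrc = edgesOut.get(src);
                if (succOfSrc != null) succOfSrc.remove(n);
            }
        }
    }

    /** The graph from the slide's tables, stored in both directions. */
    static DecomposedGraphSketch example() {
        DecomposedGraphSketch g = new DecomposedGraphSketch();
        String[][] undirected = {
            {"a", "b", "5"}, {"a", "c", "2"}, {"b", "d", "4"},
            {"c", "d", "7"}, {"d", "e", "1"}, {"e", "f", "6"}};
        for (String[] t : undirected) {
            int wt = Integer.parseInt(t[2]);
            g.insertEdge(t[0], t[1], wt);
            g.insertEdge(t[1], t[0], wt);
        }
        return g;
    }
}
```

Keeping edgesIn alongside edgesOut costs extra space and extra work on insertion, but it lets removeNode touch only the neighbors of the removed node.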

26-30 findMin(a)
[Figure, animated over slides 26 to 30: findMin(a) on the decomposed representation of slide 25; the tables and the Decomposition and Methods listings are repeated unchanged]

31 findMin(a): abstract locks
[Figure: abstract locks for findMin(a), shown on the decomposed representation of slide 25; the tables and the Decomposition and Methods listings are repeated unchanged]
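The findMin method from the spec, a map over the successors of src, can be sketched as follows. FindMinSketch, EdgeVisitor and mapOut are hypothetical names. The sketch also records which nodes it would lock; locking src and every visited dst is an assumption made for illustration, since the slide's lock annotations are not available in the transcript.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class FindMinSketch {
    /** Body of a map(src, out dst, out wt) operation. */
    interface EdgeVisitor { void visit(String dst, int wt); }

    final Map<String, Map<String, Integer>> edgesOut = new HashMap<>();
    // Log of nodes whose abstract lock would be requested (assumed locking policy).
    final Set<String> abstractLocks = new LinkedHashSet<>();

    void insertEdge(String src, String dst, int wt) {
        edgesOut.computeIfAbsent(src, k -> new HashMap<>()).put(dst, wt);
    }

    /** Applies body to every (dst, wt) in succ(src), logging src and each dst. */
    void mapOut(String src, EdgeVisitor body) {
        abstractLocks.add(src);
        Map<String, Integer> succ = edgesOut.getOrDefault(src, Collections.emptyMap());
        for (Map.Entry<String, Integer> e : succ.entrySet()) {
            abstractLocks.add(e.getKey());
            body.visit(e.getKey(), e.getValue());
        }
    }

    /** Returns the lightest neighbor of src, or null if src has no outgoing edges. */
    String findMin(String src) {
        String[] lt = {null};
        int[] minWeight = {Integer.MAX_VALUE};
        mapOut(src, (dst, wt) -> {
            if (wt < minWeight[0]) { lt[0] = dst; minWeight[0] = wt; }
        });
        return lt[0];
    }

    public static void main(String[] args) {
        FindMinSketch g = new FindMinSketch();
        g.insertEdge("a", "b", 5);
        g.insertEdge("a", "c", 2);
        g.insertEdge("b", "d", 4);
        System.out.println(g.findMin("a")); // prints c
    }
}
```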

32 “tiles”: concretizing sub-relations
Structure, Methods and Decomposition: as on slides 23 to 25
Tiling
  edges tile ListTile
  nodes tile AttArrLinkedSet
  edgesOut tile AttMap
  edgesIn tile AttMap
  succ tile DualArrayMap
  pred tile ArraySet

33 nodes tile AttArrLinkedSet
[Figure: the nodes relation (a to f) stored in linked lists indexed by thread id (0 to 3), with dummy cells and next pointers; in_flag=1]

34 nodes tile AttLinkedSet
[Figure: the nodes relation (a to f) stored in a single linked list via next pointers; in_flag=1]

35 optimizations
Customizing tiles
  Customize nodes set for concurrent deletions
  Customize successor/predecessor maps for primitive types
Customize map operations
  Inlining
  Selecting relevant attributes
  Handling auxiliary state
  Loop fusion for read-only operations
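To give a feel for what customizing successor maps for primitive types could buy, here is a small map from int keys to int values backed by two parallel arrays, which avoids boxing and per-entry objects. It is a hypothetical sketch: IntIntArrayMap is not an Autograph class, and whether the DualArrayMap tile named in the talk works this way is an assumption.

```java
import java.util.Arrays;

class IntIntArrayMap {
    private int[] keys = new int[4];
    private int[] vals = new int[4];
    private int size;

    /** Linear scan; intended for short adjacency lists. */
    private int indexOf(int key) {
        for (int i = 0; i < size; i++) {
            if (keys[i] == key) return i;
        }
        return -1;
    }

    void put(int key, int val) {
        int i = indexOf(key);
        if (i >= 0) { vals[i] = val; return; }       // overwrite existing entry
        if (size == keys.length) {                   // grow both arrays together
            keys = Arrays.copyOf(keys, size * 2);
            vals = Arrays.copyOf(vals, size * 2);
        }
        keys[size] = key;
        vals[size] = val;
        size++;
    }

    int getOrDefault(int key, int dflt) {
        int i = indexOf(key);
        return i >= 0 ? vals[i] : dflt;
    }

    /** Removes by moving the last entry into the freed slot. */
    boolean remove(int key) {
        int i = indexOf(key);
        if (i < 0) return false;
        size--;
        keys[i] = keys[size];
        vals[i] = vals[size];
        return true;
    }

    int size() { return size; }
}
```

The linear scan is a deliberate trade-off for low-degree nodes; a map over many keys would need hashing or sorting instead.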

36 organization
Speculative parallelism background
  Speculative parallelization of graph algorithms
  Data structures for speculative parallelism
Autograph
  Specifying relational data structures
  Optimizations
Empirical evaluation
Related work + conclusion

37 experiments
Specified graph data structures
Used Autograph to generate Java code
Compared
  Generated data structures
  Library data structures (from Galois)
  Hand-written parallel benchmarks
Show relative effect of different optimizations

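38 Boruvka: running times comparison
[Chart not reproduced in the transcript]

39 Boruvka: running times comparison
[Chart not reproduced in the transcript]

40 Boruvka: effect of optimizations
[Chart not reproduced in the transcript]

41 Delaunay mesh refinement: times
[Chart not reproduced in the transcript]

42 Single-source shortest path: times
[Chart not reproduced in the transcript]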

43 writing graph applications yesterday
[Diagram]
Components: Galois Runtime; Graph Application; Concurrent Data Structure Library (Morph Graph, LC Graph, Set, …, Map)
Roles: Expert programmer; Concurrency expert; Joe programmer
+ Correct
? Efficient (non-customizable)

44 writing graph applications today
[Diagram]
Components: Galois Runtime; Data structure specification; Autograph; Data structure implementation; Graph Application
Roles: Joe programmer; Joe++ programmer
+ Correct
+ Customizable
+ Speedup over library data structures

45 Grazie!
Download Galois from http://iss.ices.utexas.edu/
Expect Autograph in next Galois release

