Download presentation
Published byRandolf George Modified over 9 years ago
1
Ligra: A Lightweight Graph Processing Framework for Shared Memory
Julian Shun Miller Institute, UC Berkeley (work done while at Carnegie Mellon University) Joint work with Laxman Dhulipala and Guy Blelloch (Principles and Practice of Parallel Programming, 2013 and Data Compression Conference, 2015)
2
Large Graphs Need to efficiently analyze these graphs
HPGM 2015 Need to efficiently analyze these graphs Computational efficiency Space efficiency Programming efficiency
3
Breadth-first Search (BFS)
HPGM 2015 Compute a BFS tree rooted at source r containing all vertices reachable from r Frontier r r Can process each frontier in parallel Race conditions, load balancing
4
BFS Abstractly All graph traversal algorithms do this!
HPGM 2015 All graph traversal algorithms do this! Operate on a subset of vertices Map computation over subset of edges in parallel Return new subset of vertices (Map computation over subset of vertices in parallel) Breadth-first search Betweenness centrality Connected components Bellman-Ford shortest paths Graph eccentricity estimation PageRank Can we build an abstraction for these types of algorithms?
5
Graph Processing Systems
HPGM 2015 Existing: Pregel/Giraph, GraphLab, Pegasus, Knowledge Discovery Toolbox, GraphChi, Parallel BGL, and many others… Our system: Ligra - Lightweight graph processing system for shared memory Efficient for “frontier-based” algorithms
6
} Ligra Operate on a subset of vertices
HPGM 2015 Graph Operate on a subset of vertices Map computation over subset of edges in parallel Return new subset of vertices (Map computation over subset of vertices in parallel) VertexSubset } EdgeMap VertexMap Shared memory Cilk Plus/OpenMP
7
Ligra Framework VertexMap VertexSubset VertexSubset bool f(v){
HPGM 2015 2 VertexSubset 8 4 6 5 4 4 8 8 bool f(v){ data[v] = data[v] + 1; return (data[v] == 1); } VertexMap 7 1 6 6 3 VertexSubset 8 6
8
Ligra Framework EdgeMap T T VertexSubset T T T F VertexSubset
HPGM 2015 T 2 2 T VertexSubset 8 4 6 5 5 T 4 4 4 8 8 EdgeMap T bool update(u,v){…} bool cond(v){…} 7 7 1 1 T 6 6 3 F VertexSubset 1 4 5 2 7
9
Breadth-first Search in Ligra
HPGM 2015 parents = {-1, …, -1}; //-1 indicates “unvisited” procedure UPDATE(s, d): return compare_and_swap(parents[d], -1, s); procedure COND(i): return parents[i] == -1; //checks if “unvisited” procedure BFS(G, r): parents[r] = r; frontier = {r}; //VertexSubset while (size(frontier) > 0): frontier = EDGEMAP(G, frontier, UPDATE, COND); frontier
10
Actual BFS code in Ligra
HPGM 2015
11
Sparse or Dense EdgeMap?
HPGM 2015 Frontier Dense method better when frontier is large and many vertices have been visited Sparse (traditional) method better for small frontiers Switch between the two methods based on frontier size [Beamer et al. SC ’12] 1 2 9 9 3 13 10 10 4 14 11 11 5 15 12 12 6 Limited to BFS? 7 8
12
EdgeMap Not limited to BFS!
HPGM 2015 procedure EDGEMAP(G, frontier, Update, Cond): if (|frontier| + sum of out-degrees > threshold) then: return EDGEMAP_DENSE(G, frontier, Update, Cond); else: return EDGEMAP_SPARSE(G, frontier, Update, Cond); Loop through incoming edges of “unexplored” vertices (in parallel), breaking early if possible Loop through outgoing edges of frontier vertices in parallel Not limited to BFS! Generalization to other problems: Bellman-Ford shortest paths, betweenness centrality, graph eccentricity estimation, connected components, PageRank Users need not worry about this
13
PageRank HPGM 2015
14
PageRank in Ligra HPGM 2015 p_curr = {1/|V|, …, 1/|V|}; p_next = {0, …, 0}; diff = {}; procedure UPDATE(s, d): return atomic_increment(p_next[d], p_curr[s] / degree(s)); procedure COMPUTE(i): p_next[i] = α ∙ p_next[i] + (1- α) ∙ (1/|V|); diff[i] = abs(p_next[i] – p_curr[i]); p_curr[i] = 0; return 1; procedure PageRank(G, α, ε): frontier = {0, …, |V|-1}; error = ∞ while (error > ε): frontier = EDGEMAP(G, frontier, UPDATE, CONDtrue); frontier = VERTEXMAP(frontier, COMPUTE); error = sum of diff entries; swap(p_curr, p_next) return p_curr;
15
PageRank Sparse version?
HPGM 2015 Sparse version? PageRank-Delta: Only update vertices whose PageRank value has changed by more than some Δ-fraction (discussed in GraphLab and McSherry WWW ‘05)
16
PageRank-Delta in Ligra
HPGM 2015 PR[i] = {1/|V|, …, 1/|V|}; nghSum = {0, …, 0}; Change = {}; //store changes in PageRank values procedure UPDATE(s, d): //passed to EdgeMap return atomic_increment(nghSum[d], Change[s] / degree(s)); procedure COMPUTE(i): //passed to VertexMap Change[i] = α ∙ nghSum[i]; PR[i] = PR[i] + Change[i]; return (abs(Change[i]) > Δ); //check if absolute value of change is big enough
17
Ligra Performance Ligra performance close to hand-written code
HPGM 2015 244 seconds (64 x 32-cores) (16 x 8-cores) (64 x 8-cores) Ligra performance close to hand-written code Faster than distributed-memory on per-core basis Several shared-memory graph processing systems subsequently developed: Galois [SOSP ‘13], X-stream [SOSP ‘13], PRISM [SPAA ‘14], Polymer [PPoPP ‘15], Ringo [SIGMOD ‘15]
18
Large Graphs HPGM 2015 Available RAM 6.6B edges ~20B edges (latest version) ~150B edges All fit in a Terabyte of memory; can fit on commodity shared memory machine What if you don’t have that much RAM?
19
Ligra+: Adding Graph Compression to Ligra
HPGM 2015 Ligra+: Adding Graph Compression to Ligra Difference encoding (using variable-length codes) for sorted edges per vertex Modify EdgeMap: parallel edge decoding on-the-fly All hidden from the user!
20
Ligra+: Adding Graph Compression to Ligra
HPGM 2015 Ligra+: Adding Graph Compression to Ligra Cost of decoding on-the-fly? Memory bottleneck a bigger issue as graph algorithms are memory-bound
21
HPGM 2015 Conclusion Ligra: lightweight graph processing framework for shared-memory “frontier-based” algorithms Switches computation based on frontier size Ligra+: extension which incorporates graph compression Reduces space usage and improves parallel performance Code: An Evaluation of Parallel Eccentricity Estimation Algorithms on Undirected Real-World Graphs [KDD 2015] RT06 Tue 13:00-15:00 Social and Graphs 2 Thank you!
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.