QSX: Querying Social Graphs
Parallel models for querying graphs beyond MapReduce: vertex-centric models (Pregel/BSP, GraphLab) and GRAPE


1 QSX: Querying Social Graphs
Parallel models for querying graphs beyond MapReduce
Vertex-centric models
–Pregel (BSP)
–GraphLab
GRAPE

2 Inefficiency of MapReduce
Blocking: Reduce does not start until all Map tasks are completed
Intermediate result shipping: all to all
Write to disk and read from disk in each step, although the data does not change in loops
Other reasons?

3 The need for parallel models beyond MapReduce
Can we do better for graph algorithms? MapReduce:
Inefficiency: blocking, intermediate result shipping (all to all); write to disk and read from disk in each step, even for invariant data in a loop
Does not support iterative graph computations: an external driver is needed; no mechanism to support global data structures that can be accessed and updated by all mappers and reducers
Support for incremental computation? Algorithms have to be re-cast in MapReduce; it is hard to reuse existing (incremental) algorithms
A general model, not limited to graphs

4 Vertex-centric models

5 Bulk Synchronous Parallel Model (BSP)
Vertex-centric, message passing; supersteps are analogous to MapReduce rounds
Processing: a series of supersteps
Vertex: computation is defined to run on each vertex
Superstep S: all vertices compute in parallel; each vertex v may
–receive messages sent to v in superstep S – 1;
–perform some computation: modify its state and the states of its outgoing edges;
–send messages to other vertices (to be received in the next superstep)
Leslie G. Valiant: A Bridging Model for Parallel Computation. Commun. ACM 33(8), 1990

6 Pregel: think like a vertex
Input: a directed graph G
–each vertex v: a node id, and a value
–edges: contain values (associated with vertices)
Supersteps: within each superstep, all vertices compute in parallel (asynchronously within a superstep; synchronization at superstep boundaries)
Vertex: may modify its state, the states of its edges, and the edge set (topology)
Termination:
–each vertex votes to halt
–terminate when all vertices are inactive and no messages are in transit

7 Example: maximum value
[Figure: supersteps 0–3 of maximum-value propagation via message passing; shaded vertices have voted to halt]
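The maximum-value example above can be simulated in a few lines of Python. This is a minimal sketch of the superstep semantics, not Pregel's actual API; the function name `pregel_max` and the dict-based message queues are illustrative.

```python
def pregel_max(out_edges, value):
    """out_edges: vertex -> list of out-neighbours; value: vertex -> number.
    Simulates supersteps: each vertex adopts the max of its messages and,
    only if its value grew, propagates it; otherwise it votes to halt."""
    # Superstep 0: every vertex sends its value to its out-neighbours.
    inbox = {v: [] for v in out_edges}
    for v, val in value.items():
        for w in out_edges[v]:
            inbox[w].append(val)
    # Later supersteps: run until no messages are in transit.
    while any(inbox.values()):
        outbox = {v: [] for v in out_edges}
        for v, msgs in inbox.items():
            if msgs and max(msgs) > value[v]:
                value[v] = max(msgs)          # state change: stay active
                for w in out_edges[v]:
                    outbox[w].append(value[v])
        inbox = outbox                        # superstep barrier
    return value
```

On a directed cycle with values 3, 6, 2, 1, every vertex converges to 6 and the computation halts once no messages remain in transit.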

8 Vertex API
Template over (VertexValue, EdgeValue, MessageValue); the user defines Compute, which runs at each vertex ("think like a vertex": local computation over all messages received)

Class Vertex {
  void Compute(MessageIterator: msgs);   // user defined; msgs: all messages received
  const vertex_id;
  const superstep();                     // iteration control
  const VertexValue& GetValue();         // vertex value: mutable via MutableValue
  VertexValue* MutableValue();
  OutEdgeIterator GetOutEdgeIterator();  // outgoing edges
  void SendMessageTo(dest_vertex, MessageValue& message);
  void VoteToHalt();
}

Message passing: messages can be sent to any vertex whose id is known

9 PageRank
The likelihood that page v is visited by a random walk:
P(v) = β · (1/|V|) + (1 − β) · Σ_{u ∈ L(v)} P(u)/C(u)
The first term is a random jump; the second follows a link from another page. L(v): the pages with a link to v; C(u): the out-degree of u; β: the random-jump probability.
Recursive computation: for each page v in G, compute P(v) by using P(u) for all u ∈ L(v), until convergence: no changes to any P(v), or after a fixed number of iterations
A BSP algorithm?

10 PageRank in Pregel
VertexValue: the current rank. Assume 30 iterations; each vertex passes its revised rank to its neighbors, so each incoming message carries P(u)/C(u).

PageRankVertex {
  Compute(MessageIterator: msgs) {
    if (superstep() >= 1) then
      sum := 0;
      for all messages m in msgs do sum := sum + m.Value();
      *MutableValue() := β / NumVertices() + (1 − β) * sum;
    if (superstep() < 30) then
      n := GetOutEdgeIterator().size();
      SendMessageToAllNeighbors(GetValue() / n);
    else
      VoteToHalt();
  }
}

Computes P(v) = β (1/|V|) + (1 − β) Σ_{u ∈ L(v)} P(u)/C(u)
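The vertex program above can be checked with a small Python simulation. A sketch, assuming every vertex has at least one outgoing edge (as on the slide); `pregel_pagerank` and its parameter names are illustrative, and the 30-superstep cutoff mirrors the slide.

```python
def pregel_pagerank(out_edges, beta=0.15, supersteps=30):
    """out_edges: vertex -> list of out-neighbours (each assumed non-empty).
    Superstep 0 only sends; supersteps >= 1 first update the rank from the
    incoming messages, then re-send rank/out-degree to all neighbours."""
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    inbox = {v: [] for v in out_edges}
    for step in range(supersteps):
        if step >= 1:
            for v in out_edges:
                rank[v] = beta / n + (1 - beta) * sum(inbox[v])
        outbox = {v: [] for v in out_edges}
        for v in out_edges:
            share = rank[v] / len(out_edges[v])  # P(v)/C(v) to each neighbour
            for w in out_edges[v]:
                outbox[w].append(share)
        inbox = outbox
    return rank
```

Because every vertex redistributes its full rank each superstep, the ranks sum to 1 throughout the computation.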

11 Dijkstra's algorithm for distance queries
Distance: the single-source shortest-path problem
Input: a directed weighted graph G, and a node s in G
Output: the lengths of shortest paths from s to all nodes in G

Dijkstra(G, s, w):
1. for all nodes v in V do d[v] := ∞;
2. d[s] := 0; Que := V;            /* priority queue keyed by d */
3. while Que is nonempty do
   a. u := ExtractMin(Que);        /* extract the node with minimum d(u) */
   b. for all nodes v in adj(u) do
      if d[v] > d[u] + w(u, v) then d[v] := d[u] + w(u, v);

w(u, v): the weight of edge (u, v); d(u): the distance from s to u
An algorithm in Pregel?
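The pseudocode above can be made runnable with Python's binary heap; instead of decreasing keys inside the queue, this standard variant pushes duplicates and skips stale entries.

```python
import heapq

def dijkstra(adj, s):
    """adj: node -> list of (neighbour, weight) pairs; returns d[v] for all v."""
    dist = {v: float('inf') for v in adj}
    dist[s] = 0
    heap = [(0, s)]                      # Que, keyed by tentative distance
    while heap:
        d, u = heapq.heappop(heap)       # ExtractMin
        if d > dist[u]:
            continue                     # stale entry: u was already settled
        for v, w in adj[u]:
            if dist[v] > d + w:          # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```

Unreachable nodes keep distance ∞, matching step 1 of the pseudocode.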

12 Distance queries in Pregel
Think like a vertex: MutableValue holds the current distance; messages carry candidate distances to the current node u; each vertex passes its revised distance to its neighbors.

ShortestPathVertex {
  Compute (MessageIterator: msgs) {
    if isSource(vertex_id()) then minDist := 0
    else minDist := ∞;
    for all messages m in msgs do
      minDist := min(minDist, m.Value());   /* aggregation */
    if minDist < GetValue() then {
      *MutableValue() := minDist;
      for all nodes v linked to from the current node u do
        SendMessageTo(v, minDist + w(u, v));
    }
    VoteToHalt();
  }
}
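The vertex program above can also be simulated superstep by superstep. A minimal sketch under the same assumptions as before; `pregel_sssp` is an illustrative name, and initial vertex values are ∞ as in the program.

```python
INF = float('inf')

def pregel_sssp(out_edges, source):
    """out_edges: vertex -> list of (neighbour, weight).
    A vertex is reactivated only by an incoming message; it updates its
    distance if some message improves it, then forwards new candidates."""
    dist = {v: INF for v in out_edges}
    inbox = {v: ([0] if v == source else []) for v in out_edges}
    while any(inbox.values()):               # messages still in transit
        outbox = {v: [] for v in out_edges}
        for v, msgs in inbox.items():
            if msgs and min(msgs) < dist[v]:
                dist[v] = min(msgs)          # *MutableValue() := minDist
                for w, wt in out_edges[v]:
                    outbox[w].append(dist[v] + wt)
        inbox = outbox                       # all vertices then vote to halt
    return dist
```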

13 Combiners and aggregators
Combiner (an optimization): combine several messages intended for a vertex
–provided that the messages can be aggregated ("reduced") by some associative and commutative function
–reduces the number of messages
Aggregator (global data structures): each vertex can provide a value to an aggregator in any superstep S; the system aggregates these values ("reduce"); the aggregated values are made available to all vertices in superstep S + 1
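For the shortest-path program, min is associative and commutative, so a combiner can collapse all messages bound for one vertex into a single message before shipment. A toy sketch (`combine_min` is an illustrative name):

```python
def combine_min(messages):
    """Combiner for the shortest-path program: the receiving vertex only
    ever uses the minimum of its messages, so ship just that one value."""
    return [min(messages)] if messages else []
```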

14 Topology mutation
The Compute function can add or remove vertices and edges: extra power, yet increased complication
Possible conflicts:
–vertex 1 adds an edge to vertex 100, while vertex 2 deletes vertex 100
–vertex 1 creates a vertex 10 with value 10, while vertex 2 also creates a vertex 10 with value 12
Handling conflicts:
–partial order on operations: edge removal < vertex removal < vertex addition < edge addition
–remaining conflicts: resolved by the system (random action) or as user-specified

15 Pregel implementation
Master/worker architecture (open-source counterpart: Giraph)
Vertices are assigned to machines: hash(vertex.id) mod N
Partitions can be user-specified, e.g., to co-locate all Web pages from the same site
Cross edges: minimize the number of edges across partitions (the sparsest-cut problem)
Master: coordinates a set of workers (partitions, assignments)
Worker: processes one or more partitions, performing local computation
–knows the partition function and the partitions assigned to it
–all vertices in a partition are initially active
–notifies the master of the number of active vertices at the end of a superstep

16 Fault tolerance and recovery
Checkpoints: the master instructs workers to save their state to HDFS
–vertex values, edge values, incoming messages
The master saves aggregated values to disk
Worker failure:
–detected by regular "ping" messages issued by the master (a worker is marked failed after a specified interval)
–recovery: create a new worker, with the state restored from the previous checkpoint

17 The vertex-centric model of GraphLab
Targets machine learning and data mining
Vertex: computation is defined to run on each vertex; all vertices compute in parallel
–each vertex reads and writes data on adjacent nodes or edges
Asynchronous: no supersteps
Consistency (serializability):
–full consistency: no overlap for concurrent updates
–edge consistency: exclusive read-write on its vertex and adjacent edges; read-only on adjacent vertices
–vertex consistency: all updates in parallel (sync operations)

18 Vertex-centric models vs. MapReduce
Vertex-centric: think like a vertex; maximize parallelism (asynchronous); minimize data shipment via message passing; support iterations
MapReduce: inefficiency caused by blocking; distributing intermediate results (all to all); unnecessary writes/reads; no mechanism to support iteration
Vertex-centric: limited to graphs; MapReduce: general
Shared pitfalls: lack of global control (ordering for processing vertices in recursive computation, incremental computation, etc.); new programming models, so algorithms have to be re-cast and existing (incremental) algorithms are hard to reuse
Can we do better?

19 GRAPE: a parallel model based on partial evaluation

20 Querying distributed graphs
Given a big graph G, and n processors S1, …, Sn:
–G is partitioned into fragments (G1, …, Gn) of manageable size
–G is distributed to the n processors: Gi is stored at Si
Each processor Si processes its local fragment Gi in parallel
Parallel query answering:
–Input: G = (G1, …, Gn), distributed to (S1, …, Sn), and a query Q
–Output: Q(G), the answer to Q in the entire G
How does it work?

21 GRAPE (GRAPh Engine)
Divide and conquer, with data-partitioned parallelism:
–partition G into fragments (G1, …, Gn) of manageable size, distributed to various sites
–upon receiving a query Q, evaluate Q(Gi) in parallel (evaluate Q on the smaller Gi)
–collect partial answers at a coordinator site, and assemble them to find the answer Q(G) in the entire G
Each machine (site) Si processes the same query Q, using only the data stored in its local fragment Gi

22 Partial evaluation
Given a function f(s, d), where s is the known part of the input and d is the yet unavailable input: conduct the part of the computation that depends only on s, generating a partial answer, i.e., a residual function of d
The connection between partial evaluation and distributed query processing:
–at each site Si, the local fragment Gi is the known input, and the other fragments Gj are the yet unavailable input; evaluating Q(Gi) in parallel yields residual functions
–partial matches are collected at a coordinator site and assembled to find the answer Q(G) in the entire G
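The idea above is essentially partial application: fix the known input s and obtain a residual function awaiting d. A toy illustration (the function `total_dist` and its interpretation are hypothetical, chosen only to mirror f(s, d)):

```python
from functools import partial

def total_dist(local_part, remote_part):
    """Toy f(s, d): s = cost computed inside the local fragment (known),
    d = cost contributed by other fragments (yet unavailable)."""
    return local_part + remote_part

# Conduct the part of the computation that depends only on s; the result
# is a residual function that waits for the unavailable input d.
residual = partial(total_dist, 5)
```

When the remote input arrives, applying the residual function completes the computation: `residual(3)` yields 8.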

23 Coordinator
Each machine (site) Si is either a coordinator, or a worker that conducts local computation and produces partial answers
The coordinator receives and posts queries, controls termination, and assembles answers. Upon receiving a query Q, it
–posts Q to all workers
–initializes a status flag for each worker, mutable by the worker
–terminates the computation when all flags are true
–assembles the partial answers from the workers, and produces the final answer Q(G)

24 Workers
A worker conducts local computation (partial evaluation) and produces partial answers
Upon receiving a query Q:
–evaluate Q(Gi) in parallel, using local data Gi only
–send messages to request data for "border nodes" (nodes with edges to other fragments)
Incremental computation: upon receiving new messages M
–evaluate Q(Gi + M) in parallel
–set its flag true if there are no more changes to the partial results, and send the partial answer to the coordinator
This step repeats until the partial answer at site Si is ready

25 Reachability and regular path queries
Reachability
–Input: a directed graph G, and a pair of nodes s and t in G
–Question: does there exist a path from s to t in G?
–O(|V| + |E|) time
Regular path
–Input: a node-labelled directed graph G, a pair of nodes s and t in G, and a regular expression R
–Question: does there exist a path p from s to t such that the labels of adjacent nodes on p form a string in R?
–O(|G| |R|) time
Costly when G is big. Parallel algorithms?
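The O(|V| + |E|) bound for centralized reachability comes from a plain breadth-first (or depth-first) search, which visits each vertex and edge at most once:

```python
from collections import deque

def reachable(adj, s, t):
    """adj: node -> list of successors. BFS from s; O(|V| + |E|) time."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in adj[u]:
            if v not in seen:       # each edge examined once
                seen.add(v)
                queue.append(v)
    return False
```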

26 Reachability queries
Boolean formulas as partial answers:
–for each node v in Gi, a Boolean variable X_v, indicating whether v reaches the destination t
–the truth value of X_v can be expressed as a Boolean formula over the variables X_vb of the border nodes vb in Gi (nodes with edges to other fragments)
Worker: upon receiving a query Q, evaluate Q(Gi) in parallel; local computation computes the value of X_v in Gi, sending messages to request data for the border nodes

27 Boolean variables
Partial evaluation by introducing Boolean variables:
–locally evaluate each qr(v, t) in Gi in parallel; for each in-node v′ of Fi, decide whether v′ reaches t, introducing a Boolean variable X_v′ for each v′
–partial answer to qr(v, t): a set of Boolean formulas; for instance, X_v′ = qr(v′, t) = X_v1′ or … or X_vn′, the disjunction of the variables of the border nodes that v′ can reach

28 Distributed reachability: assembling
The coordinator assembles the partial answers from the workers and produces the final answer Q(G):
–collect the Boolean equations at the coordinator; only V_f, the set of border nodes in all fragments, is involved
–solve the system of Boolean equations by using a dependency graph, in O(|V_f|) time
–qr(s, t) is true if and only if X_s = true in the equation system
Example equation system: X_s = X_v; X_v = X_v″ or X_v′; X_v″ = false; X_v′ = X_t; X_t = true
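The assembling step can be sketched as a least-fixpoint computation over the collected equations: start with every non-constant variable false and raise variables to true until nothing changes. The function name and the encoding (a constant bool, or a list of variables whose disjunction defines the left-hand side) are illustrative; a production solver would walk the dependency graph once for the O(|V_f|) bound.

```python
def solve_boolean_equations(equations):
    """equations: variable -> bool constant, or list of variables whose
    disjunction defines it. Returns the least fixpoint of the system."""
    value = {x: (rhs if isinstance(rhs, bool) else False)
             for x, rhs in equations.items()}
    changed = True
    while changed:                            # naive fixpoint iteration
        changed = False
        for x, rhs in equations.items():
            if isinstance(rhs, bool):
                continue
            if not value[x] and any(value[y] for y in rhs):
                value[x] = True               # raise x: disjunct became true
                changed = True
    return value
```

On the slide's example (variable names transliterated), X_s ends up true because truth propagates from X_t through X_v′ and X_v.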

29 Example
1. Dispatch Q to the fragments (at the coordinator Sc)
2. Partial evaluation: generate Boolean equations (at each Gi)
3. Assembling: solve the equation system (at Sc)
No messages between different fragments
[Figure: a graph with query nodes Ann and Mark, partitioned into fragments G1 (Fred "HR", Walt "HR", Bill "DB"), G2 (Jack "MK", Emmy "HR", Mat "HR"), G3 (Pat "SE", Tom "AI", Ross "HR"), coordinated by Sc]

30 Reachability queries in GRAPE: think like a graph
Upon receiving a query Q, evaluate Q(Gi) in parallel; collect the partial answers at a coordinator site, and assemble them to find the answer Q(G) in the entire G
Complexity analysis (G_m: the largest fragment):
–parallel computation: O(|V_f| |G_m|) time
–one round: no incremental computation is needed
–data shipment: O(|V_f|²), to send the partial answers to the coordinator; no message passing between different fragments
Speedup: |G_m| = |G|/n when the partition is balanced
Complication: minimizing |V_f| is an NP-complete problem; approximation algorithms exist, e.g., F. Rahimian, A. H. Payberah, S. Girdzijauskas, M. Jelasity, and S. Haridi. Ja-be-ja: A distributed algorithm for balanced graph partitioning. Technical report, Swedish Institute of Computer Science, 2013

31 Regular path queries in GRAPE
Regular path queries (adding a regular expression R)
–Input: a node-labelled directed graph G, a pair of nodes s and t in G, and a regular expression R
–Question: does there exist a path p from s to t such that the labels of adjacent nodes on p form a string in R?
Incorporate the states of an NFA for R:
–treat R as an NFA (with states)
–Boolean formulas as partial answers: for each node v in Gi, a Boolean variable X(v, w), indicating whether v matches state w of R and can reach the destination t
–X(v, f): f is the final state of the NFA, for the destination node t

32 Boolean variables
Partial answers as Boolean formulas (|V_q|: the number of states in R):
–for each node v in Gi, assign v.rvec: a vector of O(|V_q|) Boolean formulas, where entry v.rvec[w] denotes whether v matches state w
–introduce a Boolean variable X(v′, w) for each border node v′ of Gi and each state w in V_q, denoting whether v′ matches w
–partial answer to qrr(s, t): a set of Boolean formulas, one from each in-node of Fi

33 Regular path queries: example
[Figure: a regular path pattern from Ann to Mark (via HR and DB labels), the fragmented graph, fragment F1 with its "virtual nodes", and the cross edges]

34 Regular reachability queries
The same query is partially evaluated at each site in parallel, producing Boolean equations (with Boolean variables for the "virtual nodes" reachable from Ann):
–F1: Y(Ann, Mark) = X(Pat, DB) ∨ X(Mat, HR); X(Fred, HR) = X(Emmy, HR)
–F2: X(Emmy, HR) = X(Ross, HR); X(Mat, HR) = X(Fred, HR) ∨ X(Ross, HR)
–F3: X(Pat, DB) = false; X(Ross, HR) = true
Assemble the partial answers: solve the system of Boolean equations, yielding Y(Ann, Mark) = true
Only the query and the Boolean equations need to be shipped; each site is visited once

35 Regular path queries in GRAPE: think like a graph (process an entire fragment)
Upon receiving a query Q, evaluate Q(Gi) in parallel; collect the partial answers at a coordinator site, and assemble them to find the answer Q(G) in the entire G
Complexity analysis (G_m: the largest fragment):
–parallel computation: O((|V_f|² + |G_m|) |R|²) time
–one round: no incremental computation is needed
–data shipment: O(|R|² |V_f|²), to send the partial answers to the coordinator; no message passing between different fragments
Speedup: |G_m| = |G|/n, and R is small in practice

36 Graph pattern matching by graph simulation
Input: a directed graph G, and a graph pattern Q
Output: the maximum simulation relation R
The maximum simulation relation always exists and is unique:
–if a match relation exists, then there exists a maximum one
–otherwise, it is the empty set, which is still maximum
The output is a unique relation, possibly of size |Q||V|
Complexity: O((|V| + |V_Q|)(|E| + |E_Q|))
A parallel algorithm in GRAPE?

37 Algorithm for computing graph simulation
Input: pattern Q and graph G
Output: for each u in Q, sim(u): the matches w in G

Similarity(Q, G):
1. for all nodes u in Q do
   sim(u) := the set of candidate matches w in G
   /* w has the same label as u; moreover, if u has an outgoing edge, so does w */
2. while there exist an edge (u, v) in Q and w in sim(u) such that successor(w) ∩ sim(v) = ∅ do
   sim(u) := sim(u) \ {w};   /* refinement */
3. output sim(u) for all u in Q

The violation successor(w) ∩ sim(v) = ∅ means: there is an edge from u to v in Q, but the candidate w of u has no edge to a node w′ that matches v
Plus optimization techniques
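The refinement loop above translates directly into a naive fixpoint in Python (the optimized algorithm maintains counters to reach the stated bound; this sketch recomputes the violation test each pass, and the function name and argument layout are illustrative):

```python
def graph_simulation(pattern, graph, plabel, glabel):
    """pattern/graph: node -> list of successors; plabel/glabel: node -> label.
    Returns sim: pattern node -> set of graph nodes (the maximum simulation)."""
    sim = {}
    for u in pattern:
        cands = {w for w in graph if glabel[w] == plabel[u]}
        if pattern[u]:                      # u has an out-edge: so must w
            cands = {w for w in cands if graph[w]}
        sim[u] = cands
    changed = True
    while changed:                          # refine until fixpoint
        changed = False
        for u in pattern:
            for v in pattern[u]:
                # w stays in sim(u) only if some successor of w matches v
                bad = {w for w in sim[u] if not (set(graph[w]) & sim[v])}
                if bad:
                    sim[u] -= bad
                    changed = True
    return sim
```

Note how a 2-node pattern cycle matches a graph cycle of any length, which is exactly the locality problem discussed on the next slide.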

38 Complication
Graph simulation does not have data locality: a cycle with two nodes in the pattern matches a cycle of unbounded length in the graph
Fixpoint computation: revise the match relation until no further changes
In a parallel setting, data shipment is a must

39 Again, Boolean formulas
Given a big graph G and n processors S1, …, Sn: G is partitioned into fragments (G1, …, Gn), and Gi is stored at Si
Coordinator: upon receiving a query Q, posts Q to all workers and initializes a status flag for each worker, mutable by the worker
Boolean formulas as partial answers:
–for each node v in Gi and each pattern node u in Q, a Boolean variable X(u, v), indicating whether v matches u
–the truth value of X(u, v) can be expressed as a Boolean formula over X(u′, v′), for border nodes v′ in V_f

40 Worker: initial evaluation
Partial evaluation using an existing algorithm:
–upon receiving a query Q, evaluate Q(Gi) in parallel, using local data Gi only
–invoke an existing algorithm to compute Q(Gi), with a minor revision: incorporate the Boolean variables
Messages: for each node to which there is an edge from another fragment Gj, send the truth value of its Boolean variable to Gj

41 Worker: incremental evaluation
Incremental computation: upon receiving messages M from other fragments, evaluate Q(Gi + M) in parallel, using an existing incremental algorithm (recursive computation)
Repeat until the truth values of all Boolean variables in Gi are determined; then set the flag true and send the partial answer Q(Gi) to the coordinator
Termination and assembling: the coordinator terminates the computation when all flags are true; the union of the partial answers from all the workers is the final answer Q(G)

42 Graph simulation in GRAPE
Input: G = (G1, …, Gn), a pattern query Q
Output: the unique maximum match of Q in G
Parallel query processing with performance guarantees, where Q = (V_Q, E_Q), G_m = (V_m, E_m) is the largest fragment in G, and V_f is the set of nodes with edges across different fragments:
–response time: O((|V_Q| + |V_m|)(|E_Q| + |E_m|) |V_Q| |V_f|), in contrast to O((|V| + |V_Q|)(|E| + |E_Q|)) for centralized graph simulation
–the total amount of data shipped is O(|V_f| |V_Q|)
Speedup: |G_m| is small (about |G|/n); with 20 machines, 55 times faster than first collecting the data and then using a centralized algorithm

43 GRAPE vs. other parallel models
Reduces unnecessary computation and data shipment:
–message passing only between fragments, vs. all-to-all (MapReduce) and messages between vertices (vertex-centric)
–incremental computation, on the entire fragment
Flexibility: MapReduce and vertex-centric models as special cases
–MapReduce: a single Map (partitioning) and multiple Reduce steps, by capitalizing on incremental computation
–vertex-centric: local computation can be implemented this way
Think like a graph, via minor revisions of existing algorithms; no need to re-cast algorithms in MapReduce or BSP
Iterative computations: inherited from existing algorithms
Implement a GRAPE platform?

44 Summing up

45 Summary and review
What is the MapReduce framework? Pros? Pitfalls? Develop algorithms in MapReduce.
What are vertex-centric models for querying graphs? Why do we need them?
What is GRAPE? Why does it need incremental computation? How is computation terminated in GRAPE?
Develop algorithms in vertex-centric models and in GRAPE.
Compare the four parallel models: MapReduce, BSP, vertex-centric, and GRAPE.

46 Project (1)
A development project. Recall PageRank (Lecture 2).
–Implement two algorithms for PageRank: one in BSP, and one in GRAPE
–Develop optimization strategies
–Experimentally evaluate your algorithms, especially their scalability with the size of G
–Write a survey on parallel algorithms for PageRank, as part of the related work

47 Project (2)
A development project. Recall strongly connected components (Lecture 2).
–Implement two algorithms for computing strongly connected components: one in BSP, and one in GRAPE
–Develop optimization strategies
–Experimentally evaluate your algorithms, especially their scalability with the size of G
–Write a survey on parallel algorithms for computing strongly connected components, as part of the related work

48 Project (3)
A research and development project. Recall bounded simulation (Lecture 3).
–Implement a parallel algorithm for graph pattern matching via bounded simulation, in GRAPE
–Develop optimization strategies
–Experimentally evaluate your algorithm, especially its scalability with the size of G
–Write a survey on parallel algorithms for graph pattern matching, as part of the related work

49 Project (4)
A research and development project. Recall graph partitioning: given a directed graph G and a natural number n, partition G into n fragments of roughly even size such that the total number of border nodes in V_f is minimized.
–Read existing work on graph partitioning
–Develop an approximation algorithm for graph partitioning
–Implement your algorithm in any parallel programming model of your choice
–Develop optimization strategies
–Experimentally evaluate your algorithm, especially its scalability with the size of G and with |V_f|
–Write a survey on graph partitioning algorithms, as part of the related work

50 Papers for you to review
–G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, G. Czajkowski. Pregel: A System for Large-Scale Graph Processing. SIGMOD 2010
–D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, Y. Bu. Pregel Algorithms for Graph Connectivity Problems with Performance Guarantees. VLDB 2014
–Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, J. M. Hellerstein. Distributed GraphLab: A Framework for Machine Learning in the Cloud. VLDB 2012
–J. Gonzalez, Y. Low, H. Gu, D. Bickson, C. Guestrin. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. OSDI 2012
–W. Fan, X. Wang, and Y. Wu. Distributed Graph Simulation: Impossibility and Possibility. VLDB 2014 (parallel scalability)
–W. Fan, X. Wang, and Y. Wu. Performance Guarantees for Distributed Reachability Queries. VLDB 2012 (parallel model)