
Parallelizing Sequential Graph Computations


1 Parallelizing Sequential Graph Computations
Wenfei Fan, University of Edinburgh and Beihang University

2 Social graphs, knowledge bases, transportation networks, …
Real-life graphs: nodes represent entities (e.g., a person) and edges represent relationships. Social graphs, knowledge bases, transportation networks, …

3 A wide range of applications
Graph computations: traversal (DFS, BFS, single-source shortest path (SSSP)); connectivity (SCC, MST); graph pattern matching (graph simulation, subgraph isomorphism); keyword search; machine learning (collaborative filtering); …
Applications: pattern recognition, knowledge discovery, transportation network analysis, Web site classification, social position detection, social media marketing, …

4 Graph pattern matching
Find all matches of a pattern in a graph, e.g., identify suspects in a drug ring ("Understanding the structure of drug trafficking organizations"). Is this feasible? Facebook: 1.38 billion nodes and trillions of links.

5 The good, the bad and the ugly
Traditional computational complexity theory of the past 50 years:
The good: polynomial-time computable (PTIME)
The bad: NP-hard (intractable)
The ugly: PSPACE-hard, EXPTIME-hard, undecidable, …
What happens when it comes to big data? With an SSD of 12GB/s, a linear scan of a 15TB dataset D takes about 20 minutes, and a single join over D takes 1.4 days. O(n2) time is already beyond reach on big data in practice! Polynomial-time queries become intractable on big data!
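The 20-minute figure above is simple arithmetic; a quick sanity check in Python, taking the slide's assumptions at face value (15TB of data, a sustained sequential read rate of 12GB/s, decimal units):

```python
# Sanity check of the linear-scan claim: 15TB of data read at 12GB/s.
DATASET_BYTES = 15 * 10**12      # 15 TB
READ_BYTES_PER_SEC = 12 * 10**9  # 12 GB/s

scan_seconds = DATASET_BYTES / READ_BYTES_PER_SEC
scan_minutes = scan_seconds / 60
print(f"one linear scan: {scan_minutes:.1f} minutes")  # about 20.8 minutes
```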

6 Tractability revisited for big data
BD-tractable queries: parallel polylog time for online processing, after a PTIME offline one-time preprocessing step. BD-tractable queries are properly contained in P unless P = NC; whether all of P is BD-tractable is open, like P = NP.
W. Fan, F. Geerts, F. Neven. Making Queries Tractable on Big Data with Preprocessing. VLDB 2013.

7 Tractable graph computations?
Subgraph isomorphism is not BD-tractable; DFS runs in linear time and is BD-tractable. Is it feasible to query real-life graphs?

8 Parallel graph computations
Add more computing resources: a shared-nothing cluster of processors, memory and disks connected by an interconnection network. A number of parallel graph engines have been developed: Pregel (Google), GraphLab (CMU), Giraph (Facebook), GraphX (Spark), Blogel, Giraph++. Are we done yet?

9 Recognized problems
MapReduce: inefficiency and I/O cost (blocking, disk access in each step); lack of support for iterative graph computations.
Vertex-centric (Pregel, GraphLab): excessive message passing; "think like a vertex"; lack of global optimization.
Block-centric (Blogel, Giraph++): still vertex-style programming.
We can bear with these, but we have to recast existing graph algorithms in a new model, and it is nontrivial to learn how to program in the new parallel models, e.g., "think like a vertex". Graph computations have been studied for decades, and a number of sequential graph algorithms are already in place. Think of DOS 35 years ago: parallel graph computations are a privilege of experienced users.

10 Make parallel graph computations accessible to more people
The objective of GRAPE: plug and play! Develop a parallel graph query engine that offers:
Ease of programming: users can plug in existing sequential algorithms and get them automatically parallelized.
Assurance: the system guarantees termination and correctness, provided that the sequential algorithms are correct.
Graph-level optimization: the system inherits all optimization strategies developed for sequential algorithms.
Scale-up: automated parallelization does not imply degradation in performance or functionality.
W. Fan, J. Xu, Y. Wu, W. Yu, J. Jiang, Z. Zheng, B. Zhang, Y. Cao, C. Tian. Parallelizing sequential graph computations. SIGMOD 2017 (Best Paper Award).

11 Overview
GRAPE: a parallel graph query engine (SIGMOD 2017)
Incrementalizing batch graph algorithms (SIGMOD 2017)
Applications: social media marketing, association rules for graphs (VLDB 2015, SIGMOD 2016)
W. Fan, J. Xu, Y. Wu, W. Yu, J. Jiang, Z. Zheng, B. Zhang, Y. Cao, C. Tian. Parallelizing sequential graph computations. SIGMOD 2017.
W. Fan, C. Hu, C. Tian. Bounded incremental graph computations: Undoable and doable. SIGMOD 2017.
W. Fan, X. Wang, Y. Wu, and J. Xu. Association rules with graph patterns. VLDB 2015.
W. Fan, Y. Wu, and J. Xu. Adding counting quantifiers to graph patterns. SIGMOD 2016.
Joint work with Jingbo Xu, Wenyuan Yu, Yinghui Wu, Jiaxin Jiang, Zeyu Zheng, Bohan Zhang, Chunming Hu, Yang Cao, Chao Tian

12 GRAPE: a parallel graph query engine

13 Ease of use: comparable to MapReduce
GRAPE API: for a class Q of graph queries.
Input: a graph G and a query Q in Q. Output: Q(G), the answers to Q in G.
Three functions for Q, with no need for recasting:
PEval: a sequential algorithm for Q, for partial evaluation.
IncEval: a sequential incremental algorithm for Q.
Assemble: a sequential algorithm (typically taking a union of partial results).
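To make the three-function API concrete, here is a minimal single-process sketch with SSSP as the plugged-in sequential algorithm. All names and the data layout are illustrative, not the actual GRAPE interface: a fragment is an adjacency dict, and incoming messages are reduced to (border node, candidate distance) pairs.

```python
# Illustrative sketch of PEval / IncEval / Assemble for SSSP.
# A fragment maps each local node to a list of (neighbor, weight) edges.
import heapq

def peval_sssp(source, fragment):
    """PEval: plain sequential Dijkstra on one fragment (partial evaluation)."""
    dist = {v: float('inf') for v in fragment}
    heap = []
    if source in dist:
        dist[source] = 0.0
        heap = [(0.0, source)]
    _relax(fragment, dist, heap)
    return dist

def inceval_sssp(fragment, dist, messages):
    """IncEval: treat messages as updates; propagate only from changed nodes."""
    heap = []
    for v, d in messages:
        if v in dist and d < dist[v]:
            dist[v] = d
            heapq.heappush(heap, (d, v))
    _relax(fragment, dist, heap)
    return dist

def _relax(fragment, dist, heap):
    """Standard Dijkstra relaxation from the nodes currently on the heap."""
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for w, weight in fragment.get(v, []):
            if w in dist and d + weight < dist[w]:
                dist[w] = d + weight
                heapq.heappush(heap, (dist[w], w))

def assemble(partial_results):
    """Assemble: union of partial results, taking the minimum on shared nodes."""
    out = {}
    for dist in partial_results:
        for v, d in dist.items():
            out[v] = min(out.get(v, float('inf')), d)
    return out
```

A real deployment would run PEval and IncEval on separate workers and route messages for border nodes between fragments; everything is single-process here for clarity.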

14 The parallel model of GRAPE
Data-partitioned parallelism (shared-nothing architecture): the graph G is fragmented into (G1, …, Gn) and distributed to workers.
BSP: upon receiving a query Q, in three phases:
Partial evaluation PEval: evaluate Q(Gi) in parallel, on the smaller fragments.
Incremental IncEval: repeatedly compute Q(Gi ⊕ Mi) in parallel, treating the messages Mi exchanged between workers as "updates".
Assemble the partial results when the computation reaches a fixed point.
Messages Mi update parameters pertaining to border nodes, and can be automatically generated from variable declarations.
Parallel processing = partial + incremental evaluation.

15 PEval, IncEval and Assemble are existing sequential algorithms
GRAPE workflow: the coordinator posts query Q to the workers; each worker runs PEval to compute Q(Gi) in parallel; the workers then iterate IncEval, computing Q(Gi ⊕ Mi); once a fixed point is reached, the coordinator runs Assemble to produce Q(G). Plug and play!

16 Plug in any sequential algorithm for partial evaluation
Partial evaluation: compute f(x) as f(s, d), where s is the known part of the input and d the yet-unavailable part; conduct the part of the computation that depends only on s, and generate a partial answer together with a residual function over d.
Partial evaluation in distributed query processing: at each site, the local fragment Gi is the known input, and the fragments Gj residing at other workers are the yet-unavailable input; evaluate Q(Gi) in parallel.
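The idea of computing on the known input s and returning a residual function over the unavailable input d can be shown in a few lines (a toy illustration, not GRAPE code):

```python
# Toy partial evaluation: f(s, d) = sum(s) + sum(d).
# All work that depends only on the known input s happens up front;
# the residual closure waits for the yet-unavailable input d.
def partially_evaluate(s):
    known_part = sum(s)          # the partial answer, computed from s alone
    def residual(d):             # the residual function over d
        return known_part + sum(d)
    return residual

f_s = partially_evaluate([1, 2, 3])  # only s is available here
print(f_s([10, 20]))                 # 36, once d arrives
```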

17 Incremental query answering
Real-life graphs constantly change (∆G), and graph computations are typically iterative. Re-compute Q(G⊕∆G) from scratch? Changes ∆G are typically small, so compute Q(G) once and then incrementally maintain it.
Incremental query processing. Input: Q, G, Q(G), ∆G. Output: ∆M such that Q(G⊕∆G) = Q(G) ⊕ ∆M.
Computing graph matches in batch style is expensive. Instead, we find new matches by making maximal use of previous computation, without paying the price of the high complexity of graph pattern matching: when the changes ∆G to the data G are small, typically so are the changes ∆M to the output.
Minimizing unnecessary recomputation.
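A small self-contained example of the pattern (not from the talk): maintaining the set of nodes reachable from a source under edge insertions, where each update touches only the newly affected region rather than recomputing from scratch.

```python
# Incrementally maintain reachability from a fixed source under edge insertions.
from collections import deque

class IncrementalReachability:
    def __init__(self, source):
        self.adj = {}               # adjacency lists built up by insertions
        self.reached = {source}     # the maintained answer Q(G)

    def insert_edge(self, u, v):
        self.adj.setdefault(u, []).append(v)
        # Recompute only when the new edge crosses from reached to unreached;
        # then propagate just through the newly reached nodes (the change ∆M).
        if u in self.reached and v not in self.reached:
            queue = deque([v])
            self.reached.add(v)
            while queue:
                x = queue.popleft()
                for y in self.adj.get(x, []):
                    if y not in self.reached:
                        self.reached.add(y)
                        queue.append(y)
```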

18 Complexity of incremental problems
Incremental query answering. Input: Q, G, Q(G), ∆G. Output: ∆M such that Q(G⊕∆G) = Q(G) ⊕ ∆M.
The cost of batch query processing is a function of |G| and |Q|; the cost of incremental algorithms is measured in |CHANGED|, the size of the changes in the input (∆G) and in the output (∆M): the updating cost that is inherent to the incremental problem itself.
Bounded: is the cost expressible as f(|CHANGED|, |Q|)? Boundedness reduces both computational and communication cost.
G. Ramalingam, T. W. Reps. On the Computational Complexity of Dynamic Graph Problems. TCS 158(1&2), 1996.
W. Fan, Y. Wu, X. Wang. Incremental Graph Pattern Matching. TODS 38(3), 2013.
W. Fan, C. Hu, C. Tian. Bounded incremental graph computations: Undoable and doable. SIGMOD 2017.

19 Example: graph simulation
Input: a directed graph G and a graph pattern Q. Output: the maximum simulation relation R.
A simulation is a binary relation S on nodes such that for each (u, v) ∈ S, each edge (u, u') in Q is mapped to an edge (v, v') in G with (u', v') ∈ S.
The maximum simulation relation always exists and is unique: if a match relation exists, then there exists a maximum one; otherwise the result is the empty set, which is still maximum.
Complexity: O((|EQ| + |VQ|)(|E| + |V|)).
A "hard" one for existing models: a fixed-point computation.
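The definition translates almost directly into a fixpoint program: start from all label-compatible pairs and repeatedly delete pairs that violate the edge condition. This naive version is only illustrative; the well-known optimized algorithm attains the quadratic bound stated above.

```python
# Naive fixpoint computation of the maximum graph simulation relation.
# pattern/graph: adjacency dicts {node: [successors]}; plabel/glabel: node -> label.
def graph_simulation(pattern, graph, plabel, glabel):
    # Start from all label-compatible pairs; this over-approximates the answer.
    sim = {(u, v) for u in pattern for v in graph if plabel[u] == glabel[v]}
    changed = True
    while changed:
        changed = False
        for (u, v) in list(sim):
            # Every pattern edge (u, u2) must be matched by some graph edge
            # (v, v2) with (u2, v2) still in sim; otherwise drop (u, v).
            for u2 in pattern[u]:
                if not any((u2, v2) in sim for v2 in graph[v]):
                    sim.discard((u, v))
                    changed = True
                    break
    return sim
```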

20 Graph simulation in GRAPE
PEval: an existing sequential algorithm for graph simulation, well optimized.
IncEval: an existing sequential incremental algorithm, bounded.
Assemble: the union of the partial matches computed at each processor.
Message passing and "updates": a border node v in Gi is one with an edge to another fragment. For each border node v in Gi and each pattern node u in Q, declare a Boolean variable X(u, v) indicating whether v matches u. Initially X(u, v) is set to true; a message Mi from another fragment flips X(u, v) to false, so the updates are monotonic.
This parallelizes sequential algorithms with correctness guaranteed.

21 Parallelizing sequential graph algorithms
A simultaneous fixpoint computation (R1, …, Rn), where Ri^r is the partial result at worker i in round r and Mi is the message to worker i:
Ri^0 = PEval(Q, Gi)
Ri^{r+1} = IncEval(Q, Ri^r, Gi, Mi)
PEval provides the initialization; IncEval plays the role of the immediate consequence operator.
Assurance. Theorem: termination and correctness are guaranteed if the sequential algorithms are correct and Mi is "monotonic"! Plug in correct sequential algorithms, and play: GRAPE parallelizes the sequential algorithms automatically and guarantees termination and correctness.
A principled approach: a simultaneous fixed-point computation of partial evaluation + incremental computation.
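The fixpoint above can be sketched as a driver loop; the worker and message model here is a single-process simulation, purely illustrative. The toy plug-in computes a global maximum: PEval takes the local max and broadcasts it, and IncEval absorbs incoming values, re-broadcasting only on change (a monotonic update).

```python
# Schematic driver for the simultaneous fixpoint:
#   R_i^0 = PEval(Q, G_i);  R_i^{r+1} = IncEval(Q, R_i^r, G_i, M_i)
# Rounds continue until no worker emits a message (the fixed point).
def grape_run(query, fragments, peval, inceval, assemble, route):
    results, outboxes = [], []
    for g in fragments:                       # round 0: partial evaluation
        r, out = peval(query, g)
        results.append(r)
        outboxes.append(out)
    while any(outboxes):                      # iterate until no messages
        inboxes = route(outboxes)
        new_results, new_outboxes = [], []
        for i, g in enumerate(fragments):
            r, out = inceval(query, results[i], g, inboxes[i])
            new_results.append(r)
            new_outboxes.append(out)
        results, outboxes = new_results, new_outboxes
    return assemble(results)

# Toy plug-in: compute the global maximum across fragments.
def broadcast(outboxes):
    return [[m for j, out in enumerate(outboxes) if j != i for m in out]
            for i in range(len(outboxes))]

def peval_max(query, fragment):
    m = max(fragment)
    return m, [m]                             # send local max to the others

def inceval_max(query, current, fragment, inbox):
    m = max([current] + inbox)
    return m, ([m] if m > current else [])    # monotonic: only grows

print(grape_run(None, [[1, 3], [7, 2], [5]], peval_max, inceval_max, max, broadcast))  # prints 7
```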

22 As powerful as existing graph systems
The power of GRAPE: what about existing parallel graph algorithms?
Simulation Theorem: the MapReduce, BSP (bulk synchronous parallel) and PRAM models are optimally simulated by GRAPE, so existing parallel algorithms readily migrate to GRAPE without extra cost.
Graph-level optimization: GRAPE inherits all optimization techniques developed for sequential algorithms and graphs, e.g., indexing; this is hard for vertex-centric or block-centric systems.

23 The impact of (bounded) incremental computation
Performance of GRAPE. Experimental setting: from 4 to 24 processors; graph computations: graph simulation, subgraph isomorphism, SSSP, connected components, keyword search, and collaborative filtering; real-life graphs: social networks, knowledge bases, transportation networks; competitors: Giraph, GraphLab, Blogel.
Experimental results (without much optimization): running time on average at least 2 times faster, up to 980 times; communication cost on average 10% of the competitors', as small as 1.3%: substantially less!

24 Reasons for using GRAPE
A principled approach: partial evaluation + incremental computation, unique in minimizing unnecessary recomputation, with assurance of termination and correctness.
Ease of use: no need to recast existing algorithms in a new model; users need to provide only 3 functions (PEval, IncEval, Assemble), all existing sequential algorithms, accessible to anyone who knows textbook graph computations.
Graph-level optimization: all sequential optimization techniques carry over.
Performance: by simply plugging in existing sequential algorithms, GRAPE outperforms the state-of-the-art systems: at least 2 times faster, up to 980 times, with orders of magnitude less data shipment.
An open-source system is under development.

25 Relative bounded incremental computations

26 Challenges of incremental algorithms
Incremental query processing. Input: Q, G, Q(G), ∆G. Output: ∆M such that Q(G⊕∆G) = Q(G) ⊕ ∆M. Making big data small: speed up GRAPE.
Bounded: is the cost expressible as f(|CHANGED|, |Q|)?
Positive: shortest distance (single-source, all pairs).
Negative: reachability (single-source or all pairs), subgraph isomorphism.
However, fewer incremental algorithms are in place than batch algorithms; fewer still are known to be bounded or unbounded; and proving whether an incremental problem is bounded is hard and ad hoc.
Systematic proof methods? "Incrementalizing" popular batch algorithms? Is bounded computation within reach in practice?

27 Reductions from Q1 to Q2
Incremental problems Q1 and Q2, with instances (Qi, Gi). A reduction is a triple of functions (f, fi, fo) such that for any instance I1 = (Q1, G1) of Q1:
f(I1) is an instance (Q2, G2) of Q2;
for all updates ∆G1 to G1, fi(∆G1) computes the updates ∆G2 to G2 (updates to the input);
fo(∆O2) computes the updates ∆O1 (updates to the output);
and the functions are computable in PTIME in |∆G1| + |∆O1| and |Q1|.
A systematic method to prove boundedness.

28 New unboundedness results
Theorem: if there exists a reduction from Q1 to Q2 and the incremental problem for Q2 is bounded, then the incremental problem for Q1 is bounded.
By the contrapositive, the incremental problem is unbounded for regular path queries, strongly connected components, and keyword search, even under a unit edge insertion or a unit edge deletion.
W. Fan, C. Hu, C. Tian. Bounded incremental graph computations: Undoable and doable. SIGMOD 2017.

29 Limitations of boundedness
There are efficient incremental algorithms for regular path queries, strongly connected components, and keyword search, although none of these incremental problems is bounded.
Boundedness is too strong a criterion: it does not capture the auxiliary structures necessary for algorithms, and it does not reflect the effectiveness of real-life incremental algorithms, which often substantially outperform batch algorithms even when the incremental problem is unbounded.
Are there more practical criteria for characterizing effectiveness?

30 Localizable incremental computations
An incremental algorithm T∆ for Q is localizable if for each query Q in Q, its cost can be expressed as a function of |Q| and the size of the dQ-neighbors of the nodes in ∆G, where dQ is determined by the size of Q (e.g., its diameter), and the dQ-neighbors of a node v are the nodes within dQ hops of v.
Doable: the incremental problem is localizable for subgraph isomorphism and keyword search, although these problems are unbounded!
Effective incremental algorithms are within reach for common queries.
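The dQ-neighborhood that a localizable algorithm is allowed to inspect is just a bounded-depth BFS from the nodes touched by the update; a small sketch (an undirected adjacency view is assumed for illustration):

```python
# Collect the nodes within d hops of any node touched by an update.
from collections import deque

def d_hop_neighbors(adj, start_nodes, d):
    seen = {v: 0 for v in start_nodes}   # node -> hop distance from the update
    queue = deque(start_nodes)
    while queue:
        v = queue.popleft()
        if seen[v] == d:
            continue                     # do not expand past d hops
        for w in adj.get(v, []):
            if w not in seen:
                seen[w] = seen[v] + 1
                queue.append(w)
    return set(seen)
```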

31 Relatively bounded incrementalization
Consider a popular batch algorithm T for Q. An incremental algorithm T∆ for Q is bounded relative to T if for each query Q in Q, its cost is a polynomial in |Q|, |∆G|, and |AFF|, the difference between the data inspected by T for computing Q(G) and for computing Q(G ⊕ ∆G).
Incrementalizing popular batch algorithms. Doable: the incremental problem is relatively bounded for regular path queries and strongly connected components.
This provides GRAPE with effective incremental graph algorithms.

32 Application: Graph-pattern association rules

33 Association rules revised for graphs
Conventional association rules: X ⇒ Y, where X and Y are itemsets (attributes of a relation), e.g., milk, diaper ⇒ beer.
Social media marketing trumps conventional marketing: 90% of customers trust recommendations from their peers (friends, colleagues) vs. 14% who trust advertising, and 60% of users said that Twitter plays an important role in their shopping.
Hence the need for association rules for graphs.

34 Graph pattern association rules
Example: if x and x' are friends who live in the same city c, they both like at least 3 French restaurants in c, and x' likes a newly opened French restaurant y, then x may also like y; we can then advertise restaurant y to person x.
Association rules defined with graph patterns.

35 Adding logic quantifiers
Example: if all friends of x use Huawei phones, then x may buy a Huawei P10.
Existential quantifiers hold by default; universal quantification is expressed as 100% of the friends.

36 Counting quantifiers and FO operators
Example: if more than 80% of the friends of x use Huawei phones, and none of the friends of x gave Huawei phones a bad rating, then the chances are that x may like a Huawei P10.
Counting: more than 80%; negation: a count of 0.
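Read operationally, the quantified condition above is just two counts over x's friends; a literal check in plain code, with an invented data layout:

```python
# Check: more than 80% of x's friends use the product, and none gave it a
# bad rating. friends: dict of adjacency lists; uses / bad_rating: node sets.
def matches_quantified_pattern(friends, uses, bad_rating, x):
    fs = friends.get(x, [])
    if not fs:
        return False                      # no friends: the 80% test fails
    users = sum(1 for f in fs if f in uses)
    haters = sum(1 for f in fs if f in bad_rating)
    return users / len(fs) > 0.8 and haters == 0
```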

37 Extending graph patterns with first-order logic
Q(x, y): edges labelled with counting quantifiers, covering existential quantifiers, universal quantifiers, and negation; the semantics of subgraph isomorphism is revised with counting.
Balancing expressive power and complexity: quantified matching. Input: a pattern Q and a graph G. Question: does there exist a match of Q in G? Matching is undecidable for full first-order logic, so a bound is imposed on quantifier alternation. The quantified matching problem is DP-complete in general and NP-complete in the absence of negation.
It is practical to discover and apply GPARs: parallel scalable algorithms exist for quantified matching.
W. Fan, Y. Wu, and J. Xu. Adding counting quantifiers to graph patterns. SIGMOD 2016.

38 Graph Pattern Association Rules (GPAR)
R(x, y): Q(x, y) ⇒ p(x, y), where Q is a graph pattern, x and y are two variables for entities, and p is a predicate.
Conventional notions no longer work: the semantics is graph pattern matching via subgraph isomorphism, with support and confidence revised accordingly.
GPAR discovery: find the top-k diversified GPARs pertaining to p(x, y) with support above a given threshold, from a social graph G.
Identifying potential customers: the set of entities identified by GPARs in the social graph G with confidence above a given threshold; an application on top of GRAPE, with parallel scalable algorithms for large social networks.
W. Fan, X. Wang, Y. Wu, and J. Xu. Association rules with graph patterns. VLDB 2015.
W. Fan, Y. Wu, and J. Xu. Adding counting quantifiers to graph patterns. SIGMOD 2016.

39 Summing up

40 Invitation: Join forces with us?
GRAPE, a principled approach: partial evaluation + incremental computation.
Ease of programming: sequential algorithms, plug and play.
Assurance: convergence and correctness.
Graph-level optimization, in addition to the bounded incremental step.
Systems: performance comparable to the state-of-the-art systems.
Applications: social media marketing (association rules with graph patterns); inconsistency checking (graph functional dependencies); …
Ongoing: parallel scalability. The more processors, the faster? Most published parallel algorithms are NOT parallel scalable, and there exist computations that are NOT parallel scalable.
W. Fan, X. Wang, Y. Wu. Distributed graph simulation: Impossibility and possibility. VLDB 2014.

41 References
W. Fan, J. Xu, Y. Wu, W. Yu, J. Jiang, Z. Zheng, B. Zhang, Y. Cao, C. Tian. Parallelizing sequential graph computations. SIGMOD 2017.
W. Fan, J. Xu, Y. Wu, W. Yu, J. Jiang. GRAPE: Parallelizing sequential graph computations. VLDB 2017 (demo).
W. Fan, C. Hu, C. Tian. Bounded incremental graph computations: Undoable and doable. SIGMOD 2017.
W. Fan, X. Wang, and Y. Wu. Answering graph pattern queries using views. TKDE 2016 (invited).
W. Fan, Y. Wu, and J. Xu. Adding counting quantifiers to graph patterns. SIGMOD 2016.
W. Fan, Y. Wu, and J. Xu. Functional dependencies for graphs. SIGMOD 2016.
W. Fan, X. Wang, Y. Wu, J. Xu. Association rules with graph patterns. VLDB 2015.

42 References
Y. Cao, W. Fan and R. Huang. Making pattern queries bounded in big graphs. ICDE 2015.
W. Fan, X. Wang, and Y. Wu. Querying big graphs with bounded resources. SIGMOD 2014.
W. Fan, X. Wang, and Y. Wu. Distributed graph simulation: Impossibility and possibility. VLDB 2014.
W. Fan, X. Wang, and Y. Wu. Diversified top-k graph pattern matching. VLDB 2014.
W. Fan, X. Wang, and Y. Wu. Answering graph pattern queries using views. ICDE 2014.
W. Fan, X. Wang, and Y. Wu. Incremental graph pattern matching. TODS 38(3), 2013.
W. Fan, F. Geerts, F. Neven. Making queries tractable on big data with preprocessing. VLDB 2013.

