1
There are Trillions of Little Forks in the Road. Choose Wisely
There are Trillions of Little Forks in the Road. Choose Wisely! – Estimating the Cost and Likelihood of Success of Constrained Walks to Optimize a Graph Pruning Pipeline – Nicolas Tripoul, Hassan Halawa, Tahsin Reza, Geoffrey Sanders, Roger Pearce, Matei Ripeanu. I'll present our work on distributed pattern matching on property graphs, done in collaboration with LLNL. Pattern matching is a powerful tool for answering complex graph queries. While significant research effort has been invested in large-scale graph processing, problems such as pattern matching, which are computationally complex yet serve a rich set of applications, have been little explored in the context of high-volume, large-scale processing. Unfortunately, existing pattern matching solutions have limited capabilities: they do not scale to modern large graph datasets, and there are limits to the intricacy of the patterns they can handle. Practical solutions for robust pattern matching in large-scale graphs remain an open problem. netsyslab.ece.ubc.ca computation.llnl.gov/casc
2
- This is called an evidence wall or a crime board.
- A detective uses it during an investigation. - It links potential suspects to criminal activities.
3
- As the detective gets closer to solving the case, he gradually narrows down the list of suspects by eliminating less likely individuals. - Our solution approach is essentially based on the same idea: we eliminate the vertices and edges that do not match the template.
4
PruneJuice: Exact Pattern Matching based on Graph Pruning [SC’18]
Arbitrary Patterns. Large Graphs, 10^9 – 10^12 edges. Fast Time-to-Solution. Horizontal Scalability, 10^4 Cores. The high-level design objectives for this work are the following: support arbitrary patterns with fast time-to-solution, so the system can be integrated into a human-driven analytics pipeline. To achieve that, we need horizontal scalability. We want to leverage an existing graph processing framework such as HavoqGT, GraphLab or Giraph; these frameworks have demonstrated good scalability for graph algorithms such as BFS and PageRank. Since they expose a vertex-centric API for algorithm implementation, we also seek vertex-centric formulations of our algorithms. We want to exploit the fast interconnects of HPC platforms through asynchronous communication, and we want the design to be memory efficient with low distributed communication overhead. The solution should scale to large graphs with hundreds of billions to trillions of edges, scale to a large number of distributed nodes, and support a rich set of pattern matching scenarios. Guarantees: 100% precision, 100% recall. HavoqGT, vertex-centric. PruneJuice: Pruning Trillion-edge Graphs to a Precise Pattern-Matching Solution, Tahsin Reza, Matei Ripeanu, Nicolas Tripoul, Geoffrey Sanders, Roger Pearce, SC'18.
5
An Application of Pattern Matching in a Large Social Network Graph
U P E: Friend, Going to, Likes. Background graph. Let's look at a practical example of pattern matching on a very large graph. There exist several very large real-world graphs; recent work reports that the Facebook user graph contains over one trillion edges [Ching 2015]. Here is an example of such a graph which, in addition to the users (U), contains the events (E) users go to and the pages (P) they like.
6
An Application of Pattern Matching in a Large Social Network Graph
Link recommendation. U P E: Friend, Going to, Likes. Social network. Graph queries can be modeled as pattern matching problems. Consider the following link recommendation query on this graph: identify a user with two friends, where one friend is going to an event and the other likes a page, and recommend that event and page to the first user. U = User, E = Event, P = Page. Template. [Ching 2015]
7
An Application of Pattern Matching in a Large Social Network Graph
U P E: Friend, Going to, Likes. Template. An important thing to notice is that a match must consider both topology and labels; in fact, most real-world applications of pattern matching involve labeled graphs. In this work we also consider label-based matching, which has applications in bioinformatics, data mining and social network analysis. U = User, E = Event, P = Page. [Ching 2015]
8
An Application of Pattern Matching in a Large Social Network Graph
U P E: Friend, Going to, Likes. Template. As you can see, the graph contains a single exact match of the template. This example defines the three notions we use throughout: the background graph, the template, and a match. U = User, E = Event, P = Page. [Ching 2015]
9
Set of Matching Vertices and Edges
The Big Picture. G: background graph; G0: template. Existing techniques (explicit search / backtracking) take G and G0 and support full match enumeration, match-existence queries, retrieving the set of matching vertices and edges, and match counting, but they do not scale (combinatorial explosion). So, how does our contribution fit into the larger context of pattern matching?
10
Our Approach: Graph Pruning
The Big Picture. G: background graph; G0: template; G*: solution graph. Existing techniques operate on G and G0 directly and do not scale. Our approach is graph pruning: we iteratively eliminate vertices and edges that do not meet the requirements of the template. After pruning, the background graph is reduced to a subgraph G*, the union of all matching subgraphs in G, with G* ≪ G.
11
Our Approach: Graph Pruning
The Big Picture (continued). Furthermore, for some problems pruning alone already leads to the solution: G*, the union of all matching subgraphs in G, directly answers whether a match exists and gives the set of matching vertices and edges.
12
Our Approach: Graph Pruning
The Big Picture. G* can be up to 10^7 times smaller than G. One can then operate on G* for full match enumeration or match counting; enumeration is now far less expensive because it runs on a much smaller graph. Later we show that G* can be seven orders of magnitude smaller than G.
13
The Big Picture: G* can be 10^7 times smaller than G
The rest of the presentation is organized as follows: graph pruning for pattern matching [SC'18]; optimization opportunities [IA3'18]; our solution [IA3'18]; experiment results [IA3'18].
14
Overview of the Graph Pruning Pipeline
Overview of the graph pruning pipeline. The pipeline consists of a number of steps, which we will look at one at a time: (1) identify the local and non-local constraints of the template G0; (2) run local constraint checking on G; (3) for each non-local constraint, run non-local constraint checking followed by another local constraint checking pass; the output is the solution graph G*, the union of all matching subgraphs. G: background graph; G0: template.
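Below is a minimal Python sketch of the pipeline's control flow as described above; it is illustrative only (the names and signatures are assumptions, not the PruneJuice/HavoqGT code), and the two checking routines are passed in as callables.

```python
from typing import Callable, Iterable

def pruning_pipeline(graph,
                     local_constraints,
                     non_local_constraints: Iterable,
                     check_local: Callable,
                     check_non_local: Callable):
    """Orchestrates the pruning passes: an initial LCC pass, then one NLCC
    round per non-local constraint, each followed by another LCC pass.
    The two checkers are expected to delete non-matching vertices/edges
    from `graph` in place."""
    check_local(graph, local_constraints)        # initial LCC pass
    for constraint in non_local_constraints:     # one round per non-local constraint
        check_non_local(graph, constraint)       # NLCC: eliminates vertices
        check_local(graph, local_constraints)    # deletions cascade, so re-run LCC
    return graph                                 # what remains is G*, the union of all matches
```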
15
Constraint Generation
Constraint generation. In our graph pruning model, a vertex must meet a set of constraints to be part of a match, so we first identify these constraints from G0. There are two types, local and non-local, and two separate routines verify them; the optimizations presented later target this part of the pipeline. G: background graph; G0: template; G*: solution graph, union of all matching subgraphs.
16
Local constraints of G0
Template (U, E, P). Local constraints are the requirements of G0 that a vertex can verify by looking only at its own label and its immediate neighborhood. Their selection and ordering influence performance.
17
Non-local constraints of 𝐺0
Non-local constraints of G0. Template (U, E, P). Non-local constraints capture requirements of G0 that go beyond a vertex's immediate neighborhood, such as paths and cycles in the template. Their selection and ordering influence performance.
18
Local Constraint Checking – Eliminates vertices and edges
Local Constraint Checking (LCC) – eliminates vertices and edges. LCC verifies the template constraints in the immediate neighborhood of a vertex and eliminates both vertices and edges. G: background graph; G0: template; G*: solution graph, union of all matching subgraphs.
19
Local Constraint Checking – Eliminates vertices and edges
Local Constraint Checking – eliminates vertices and edges. Template (U, E, P). For example, the local constraint on the orange vertex is that it must have two blue neighbors.
20
Local Constraint Checking – Eliminates vertices and edges
Local Constraint Checking – eliminates vertices and edges. Template (U, E, P). The vertices that do not meet this requirement (shown with a red outline) are deleted, along with their incident edges.
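To make the LCC step concrete, here is a small single-machine Python sketch on a toy in-memory graph; the dict-based representation and the encoding of local constraints in `required` are assumptions for illustration, not the distributed HavoqGT implementation.

```python
# Single-machine sketch of local constraint checking (LCC) on a toy graph.
# `labels` maps vertex -> label, `adj` maps vertex -> set of neighbors, and
# `required[l]` maps a label l to the minimum neighbor-label counts the
# template demands (a hypothetical encoding of the local constraints).

def local_constraint_checking(labels, adj, required):
    changed = True
    while changed:                      # deletions can cascade, so iterate to a fixpoint
        changed = False
        for v in list(adj):
            need = required.get(labels[v])
            if need is None:            # label does not appear in the template at all
                ok = False
            else:
                counts = {}
                for u in adj[v]:
                    counts[labels[u]] = counts.get(labels[u], 0) + 1
                ok = all(counts.get(l, 0) >= k for l, k in need.items())
            if not ok:                  # eliminate v together with its incident edges
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                changed = True

# Toy example in the spirit of the slides (U = user, E = event, P = page):
labels = {0: 'U', 1: 'U', 2: 'U', 3: 'E', 4: 'P', 5: 'E'}
adj = {0: {1, 2}, 1: {0, 3}, 2: {0, 4}, 3: {1}, 4: {2}, 5: set()}
required = {'U': {'U': 1}, 'E': {'U': 1}, 'P': {'U': 1}}   # hypothetical local constraints
local_constraint_checking(labels, adj, required)
print(sorted(adj))    # vertex 5 (an isolated 'E') has been pruned -> [0, 1, 2, 3, 4]
```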
21
Non-local Constraint Checking – Eliminates vertices
Non-local Constraint Checking (NLCC) – eliminates vertices. NLCC verifies template requirements beyond the one-hop neighborhood of a vertex. Deleting vertices (and their edges) can trigger a cascading effect that eliminates additional vertices in the following LCC pass. G: background graph; G0: template; G*: solution graph, union of all matching subgraphs.
22
Non-local constraints of 𝐺0
Non-local constraints of G0. Template. In the constraint generation phase we break the template down into smaller substructures, the non-local constraints, that can be verified independently without introducing false negatives. This decomposition is only possible because we are pruning rather than enumerating.
23
Non-local Constraint Checking – Eliminates vertices
Non-local Constraint Checking – eliminates vertices. (Figure: a non-local constraint of the template being verified on the example graph.)
24
Non-local Constraint Checking – Eliminates vertices
Non-local Constraint Checking – eliminates vertices. We use a token passing approach: a token is initiated at candidate vertices and passed along the graph following the constraint; vertices that cannot complete the constrained walk fail the check. (Figure: tokens T propagating through the example graph.)
25
Non-local Constraint Checking – Eliminates vertices
Non-local Constraint Checking – eliminates vertices. (Figure: the token passing walk continuing on the example graph.)
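A simplified single-machine sketch of NLCC for a path constraint given as a label sequence; the real system passes tokens asynchronously between distributed vertices and also handles cycle and TDS constraints, which this sketch omits.

```python
# Simplified sketch of non-local constraint checking (NLCC) for a path
# constraint expressed as a label sequence, e.g. ['U', 'U', 'E'].
# `labels` and `adj` follow the same toy representation as the LCC sketch.

def walk_exists(labels, adj, v, constraint, visited):
    """Is there a walk from v over distinct vertices matching constraint[1:]?"""
    if len(constraint) == 1:
        return True
    for u in adj[v]:
        if u not in visited and labels[u] == constraint[1]:
            if walk_exists(labels, adj, u, constraint[1:], visited | {u}):
                return True
    return False

def non_local_constraint_checking(labels, adj, constraint):
    """Delete every vertex of the initiating label that cannot start a
    walk satisfying the constraint (its edges go with it)."""
    for v in list(adj):
        if labels[v] == constraint[0] and not walk_exists(labels, adj, v, constraint, {v}):
            for u in adj[v]:
                adj[u].discard(v)
            del adj[v]

# e.g. non_local_constraint_checking(labels, adj, ['U', 'U', 'E'])
```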
26
Solution Graph 𝐺 ∗ , union of all matching subgraphs
Solution graph G*, the union of all matching subgraphs. Once all the non-local constraints have been verified, we are left with the solution graph G*, the union of all matching subgraphs in G. G: background graph; G0: template.
27
Solution Graph 𝐺 ∗ , union of all matching subgraphs
Solution graph G*, the union of all matching subgraphs. Template (U, E, P). Here is the final pruned graph for the running example.
28
Full Match Enumeration on the Solution Graph 𝐺 ∗
Full match enumeration on the solution graph G*. If needed, full match enumeration can now be performed on the pruned graph, where it is far less expensive. Note that the selection and ordering of the non-local constraints strongly influence performance; the problem is comparable to join order optimization in relational databases.
29
Critical influence on performance: non-local constraints
Critical influence on performance: non-local constraints. Four decisions must be made: (1) which constraints to generate; (2) how to generate them; (3) in what order to check them; and (4) when to switch to full enumeration. The [SC'18] solution makes these decisions with heuristics.
30
One metric to make all these decisions: constraint effectiveness
Effectiveness = Δ(distance to objective) / cost = (number of vertices pruned) / (time to run the constraint). The number of vertices pruned is estimated from the chance of success of a constrained walk starting from a vertex with a given label in the background graph; the cost is the time to check whether such a walk exists, approximated on shared memory by the number of memory accesses (links traversed).
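A hedged sketch of how such an effectiveness estimate could be computed, assuming per-step walk-success probabilities p_step and expected branching factors n_step from the model introduced on the next slide; the exact estimator in the paper may differ.

```python
# Sketch of the effectiveness estimate used to rank constraints:
#   effectiveness = (expected number of vertices pruned) / (expected cost),
# where the cost of a constrained walk is approximated by the number of
# edges (memory accesses) it traverses.  p_step[i] is the modeled chance
# that step i of the walk succeeds; n_step[i] is its expected branching
# factor.  This is a crude estimator for illustration, not the paper's.

def estimate_effectiveness(num_start_vertices, p_step, n_step):
    # chance that a full constrained walk succeeds from one start vertex
    p_success = 1.0
    for p in p_step:
        p_success *= p
    expected_pruned = num_start_vertices * (1.0 - p_success)

    # expected cost: size of the traversal tree, i.e. the sum of the
    # partial products of the branching factors at each depth
    cost, frontier = 0.0, 1.0
    for n in n_step:
        frontier *= n
        cost += frontier
    cost *= num_start_vertices

    return expected_pruned / cost if cost > 0 else 0.0

# e.g. estimate_effectiveness(1_000_000, p_step=[0.6, 0.3], n_step=[4.0, 2.5])
```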
31
Seed the model with actual data; greedy: pick the best constraint at each iteration
We use a random graph model with two hypotheses: a vertex's degree is determined by its label, and the probability of an edge between two vertices depends only on their labels. We seed the model with actual data from the background graph and derive analytical estimates of the number of vertices pruned (the chance of success of a constrained walk starting from a vertex with a given label) and of the cost (the number of links traversed, a proxy for memory accesses and time). We then proceed greedily, picking the constraint with the best estimated effectiveness at each iteration.
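A sketch of seeding the label-based model with actual data, under the two hypotheses above; the derived quantity N[(a, b)] is the average number of b-labelled neighbors of an a-labelled vertex, and the way transition probabilities are derived from it is an assumption for illustration.

```python
# Sketch of seeding the random-graph model with actual data.  For each
# ordered label pair (a, b) we estimate N[(a, b)], the average number of
# b-labelled neighbors of an a-labelled vertex.

from collections import defaultdict

def fit_label_model(labels, adj):
    label_count = defaultdict(int)     # number of vertices per label
    pair_ends = defaultdict(int)       # edge ends per ordered label pair
    for v, nbrs in adj.items():
        label_count[labels[v]] += 1
        for u in nbrs:
            pair_ends[(labels[v], labels[u])] += 1
    return {(a, b): cnt / label_count[a] for (a, b), cnt in pair_ends.items()}

# A transition probability P(a -> b) for the walk estimates can then be
# approximated, e.g., as min(1.0, N[(a, b)]) -- the (crudely estimated)
# chance that an a-vertex has at least one b-neighbor.
```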
32
Evaluation

Dataset                     Vertices    Edges    Max. degree
Reddit                      3.9B        14B      19M
Internet Movie Database     5.0M        29M      552K
Patent                      2.7M        28M      789
Youtube                     4.6M        88M      2.5K

System 1: Intel Xeon CPU, 40 cores, 512 GB of RAM
33
One experiment: Optimized constraint ordering and early enumeration
One experiment: optimized constraint ordering and early enumeration. Decisions informed by the model are effective! [Many more experiments in the paper]
34
No false positives or negatives
Takeaways. What makes a pruning-based approach promising? Template (U, E, P). No false positives or negatives. The novelty of our technique is that substructures can be verified in isolation, so vertices can be thrown away early without introducing false negatives. Much of the computation is polynomial; edge elimination in particular is entirely polynomial time. Unlike enumeration, pruning explores a minimal number of paths. This paves the way to practical microscopic graph queries through pattern matching. In this work we demonstrated that graph pruning is an enabler for pattern matching on very large graphs, a problem that is computationally intractable today: one does not need to rely solely on high-complexity algorithms. It is possible to develop low-complexity, highly parallel and scalable pruning algorithms; we showed they can prune a large graph by seven orders of magnitude in a matter of minutes, and for some problems pruning alone leads to the solution. Smaller algorithm state prevents combinatorial explosion; search space reduction makes enumeration less expensive; constraint selection, ordering and generation can be optimized. netsyslab.ece.ubc.ca computation.llnl.gov/casc
36
Experiment 1: Topology-aware walk generation
Constraint: which walk is the "best" one to verify this constraint, starting from the given vertex? (Figure: candidate walks over the numbered template vertices.)
37
Topology-aware walk generation
Use topological information to minimize the cost of a walk. Optimal walk: compute all possible walks and keep the one with the best effectiveness; the cost grows exponentially with the size of the template. Greedy approach: iteratively construct the walk and, whenever a branch appears in the constraint, select the best branch.
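One plausible rendering of the greedy rule in Python, assuming the constraint is given as a tree rooted at the start vertex and that the most selective branch (smallest expected branching factor from the fitted model) is taken first; the encoding is illustrative, not the paper's implementation.

```python
# Greedy, topology-aware walk generation for a constraint given as a tree
# rooted at the template's start vertex: at each branching point, descend
# into the child whose expected branching factor N[(label, child_label)]
# is smallest, so unpromising walks die as early as possible.

def greedy_walk_order(tree, root, tlabels, N):
    """tree: template vertex -> list of children; returns a DFS visit order."""
    order = [root]
    children = sorted(tree.get(root, []),
                      key=lambda c: N.get((tlabels[root], tlabels[c]), float('inf')))
    for child in children:            # cheapest / most selective branch first
        order.extend(greedy_walk_order(tree, child, tlabels, N))
    return order

# e.g. greedy_walk_order({0: [1, 2], 1: [3]}, 0,
#                        {0: 'U', 1: 'U', 2: 'U', 3: 'E'}, N)
```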
38
Evaluation: topology-aware walk generation
Up to 17% performance gain in pruning and up to 50% performance gain in enumeration.
39
Pattern matching objectives
How to generate constraint walks? How to order constraints and select the best one? How to choose the enumeration walk? When can we start enumerating early? / When is pruning worthwhile? What are the limitations of this model?
40
We can order the constraints by effectiveness to select the best constraint to run
Previous algorithm: select the smallest path or cycle constraint; if no path or cycle constraints remain, select the smallest TDS constraint; then run it. Greedy algorithm: gather graph topological information, compute the effectiveness of the remaining constraints, and rank them to find the one with the best effectiveness.
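A sketch of the greedy ordering loop described above; the effectiveness estimator and the model update are passed in as placeholder callables, since running a constraint changes the label statistics the estimates depend on.

```python
# Greedy constraint ordering: at each round, rank the remaining constraints
# by their estimated effectiveness on the current model and run the best.
# `effectiveness_of` and `update_model` stand in for the model described
# on the previous slides.

def order_constraints_greedily(constraints, effectiveness_of, update_model):
    remaining, order = list(constraints), []
    while remaining:
        best = max(remaining, key=effectiveness_of)   # rank by estimated effectiveness
        order.append(best)
        remaining.remove(best)
        update_model(best)   # running a constraint shifts the label statistics,
                             # so re-estimate before choosing the next one
    return order
```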
41
Evaluation: constraint ordering
Up to 2.5× faster pruning with effectiveness-based constraint ordering compared to the intuitive selection order.
42
Evaluation: mis-ordering probability

Dataset      Correctly ordered best constraint   Performance impact of mis-ordering vs. optimal order
IMDB-1       22/24                               0.14%
IMDB-2       51/56                               4.64%
IMDB-3       89/102                              18.33%
Youtube-8    6/6                                 0.0%
Patent-8     –                                   –

Constraints are generally well ordered compared to the optimal order; mis-ordering does not happen often and does not have a strong impact on performance.
43
Pattern matching objectives
How to generate constraint walks? How to order constraints and select the best one? How to choose the enumeration walk? When can we start enumerating early? / When is pruning worthwhile? What are the limitations of this model?
44
Enumeration can be started early if we only want the matches
Early enumeration starts when the cost of running enumeration is lower than the cost of running a new constraint; this avoids running constraints that would not be effective.
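A minimal sketch of that decision rule, with the two cost estimators passed in as placeholder callables; the estimates themselves would come from the same model used for constraint ordering.

```python
# Early-enumeration rule: stop pruning as soon as enumerating on the current
# pruned graph is estimated to be cheaper than running the next constraint.
# All four callables are placeholders; the estimates come from the model.

def prune_then_enumerate(ordered_constraints, run_constraint,
                         estimate_enum_cost, estimate_constraint_cost,
                         enumerate_matches):
    for c in ordered_constraints:
        if estimate_enum_cost() <= estimate_constraint_cost(c):
            break                     # further pruning would cost more than it saves
        run_constraint(c)
    return enumerate_matches()
```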
45
Pattern matching objectives
How to generate constraint walks? How to order constraints and select the best one? How to choose the enumeration walk? When can we start enumerating early? / When is pruning worthwhile? What are the limitations of this model?
46
Can we predict when constraint ordering will fail?
Computing the variance of the metrics gives an estimate of the accuracy of the random graph model. We can limit the impact of worst-case mis-ordering through constraint selection rules: hard rules (e.g., TDS constraints are only run after several rounds of cycle and path constraints) and soft rules (the cost is biased to favor faster-running constraints). We can also run an a posteriori estimation of the accuracy of our model and fall back to another constraint ordering method.
47
Limitations. Removing one vertex can cause a whole region of the graph to be pruned; to account for this, the objective function can be biased to give more emphasis to pruning vertices with a low label frequency and a high vertex degree: Objective' = (number of vertices pruned) / (label frequency × average degree). Pruning also introduces a bias in the graph's label distribution; we can use a similarity metric to favor selecting constraints that are different from the previously run ones.
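The biased objective is a one-line formula; here is a direct transcription in Python, with a made-up numeric example showing that pruning vertices of a rare label scores higher.

```python
# Direct transcription of the slide's biased objective:
#   objective' = vertices_pruned / (label_frequency * average_degree)

def biased_objective(vertices_pruned, label_frequency, average_degree):
    return vertices_pruned / (label_frequency * average_degree)

# Pruning 1000 vertices of a rare label scores higher than pruning 1000
# vertices of a common label with the same average degree:
# biased_objective(1000, 0.01, 10.0) -> 10000.0
# biased_objective(1000, 0.30, 10.0) ->   333.3...
```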
48
Conclusion. The shared-memory implementation is faster than the distributed one thanks to additional optimizations. Constraint generation and selection is a difficult problem; we developed a solution that gives better results than the previous intuitive approach, and we can now avoid pruning when it is not necessary. Can this approach be adapted to distributed systems?
49
Example
50
The probability that a vertex will be valid can be estimated
Constraint: 1 → 2 → 3 → 4. Transition probabilities: P(1→2), P(2→3), P(3→4). Expected number of vertices reached at each step: 1, then 1 × N(1→2), then 1 × N(1→2) × N(2→3), then 1 × N(1→2) × N(2→3) × N(3→4).
51
The probability that a vertex will be valid can be estimated
Constraint: 1 → 2 → 3 → 4, with overall walk probability P(1→2→3→4). Transition probabilities: P(1→2), P(2→3), P(3→4). Expected number of vertices reached at each step: 1, 1 × N(1→2), 1 × N(1→2) × N(2→3), 1 × N(1→2) × N(2→3) × N(3→4).
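A small sketch that reproduces this table: it propagates the cumulative walk probability and the expected number of reached vertices along the constraint, using placeholder transition probabilities and branching factors.

```python
# Reconstructing the table above for constraint 1 -> 2 -> 3 -> 4.
# P and N hold per-step transition probabilities and branching factors;
# the numbers in the example are made-up placeholders.

def walk_estimates(P, N):
    prob, count = 1.0, 1.0
    rows = [(prob, count)]          # starting at template vertex 1
    for p, n in zip(P, N):
        prob *= p                   # cumulative P(1 -> 2 -> ...)
        count *= n                  # 1 x N(1->2) x N(2->3) x ...
        rows.append((prob, count))
    return rows

# e.g. walk_estimates(P=[0.6, 0.3, 0.1], N=[4.0, 2.5, 1.2]);
# the last row gives P(1->2->3->4) and the expected number of walk endpoints.
```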
52
Available optimizations
Type of walk: BFS in PruneJuice-Distributed vs. DFS in the shared-memory implementation. LCC optimization: possible in the distributed version, implemented in the shared-memory version. Work aggregation and multiple constraint validation apply to both (next slides). Early check termination is shared-memory specific and impossible in the distributed version.
53
Optimization 1: Work aggregation – distributed/shared memory
Constraint: (shown in figure)
54
Optimization 2: Multiple constraint validation – distributed/shared memory
55
Optimization 3: Early termination – shared memory specific
Constraint: (shown in figure)
56
The cost of running a constraint is proportional to the number of edges traversed
Constraint: 1 → 2 → 3 → 4. The cost of running a constraint is the size of the traversed tree; work aggregation, multiple validation and early search termination all reduce it. Transition probabilities: P(1→2), P(2→3), P(3→4). Expected number of vertices reached at each step: 1, 1 × N(1→2), 1 × N(1→2) × N(2→3), 1 × N(1→2) × N(2→3) × N(3→4).
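A sketch of that cost estimate: the size of the traversal tree implied by the expected branching factors, i.e., the sum of the partial products; the listed optimizations reduce the number of edges actually traversed below this figure.

```python
# Cost of checking a path constraint from one start vertex, approximated by
# the size of the traversal tree implied by the expected branching factors.

def traversal_tree_size(N):
    size, frontier = 0.0, 1.0
    for n in N:
        frontier *= n        # expected vertices reached at this depth
        size += frontier     # edges traversed to reach them
    return size

# e.g. traversal_tree_size([4.0, 2.5, 1.2]) -> 4 + 10 + 12 = 26 edges;
# work aggregation, multiple validation and early termination reduce the
# edges actually traversed below this estimate.
```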
57
Rest of Tahsin’s talk
58
Distributed System Implementation on top of HavoqGT
Distributed system implementation on top of HavoqGT. The core pruning and enumeration routines (LCC, NLCC, enumeration) are implemented using HavoqGT's vertex-centric API and operate on the native graph data structure (CSR). A control logic layer coordinates the pipeline and enforces consistency across the pruning steps. The stack also comprises a metadata store, the HavoqGT asynchronous visitor queue on top of the MPI runtime, the HavoqGT delegate-partitioned graph [Pearce 2014], and checkpointing and load balancing support.
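For intuition only, here is a language-agnostic Python sketch of the vertex-centric, visitor-queue pattern the slide refers to; this is not HavoqGT's actual C++/MPI API, and all the names are placeholders.

```python
# Generic vertex-centric visitor loop (placeholder names, not HavoqGT's API).
# A visitor runs at one vertex and may push new visitors to its neighbors;
# a queue drives the computation until no visitors remain.

from collections import deque

def run_vertex_centric(adj, initial_visitors, visit):
    """visit(vertex, state, adj) returns (neighbor, new_state) pairs to enqueue."""
    queue = deque(initial_visitors)                 # (vertex, visitor state) pairs
    while queue:                                    # asynchronous in the real system
        vertex, state = queue.popleft()
        if vertex not in adj:                       # vertex already pruned
            continue
        for neighbor, new_state in visit(vertex, state, adj):
            queue.append((neighbor, new_state))
```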
59
Evaluation: strong and weak scaling experiments for precise pruning
Performance metrics: search time for a single template and pruning factor. We also report full match enumeration on the pruned graph, a comparison with related work, and insights into performance. The evaluation focuses on performance and scalability, using benchmarks with pattern matching scenarios that are relevant to real-world use cases and stress the system along multiple axes. Processing time for a single template query is the primary measure of performance; the scalability evaluation includes strong and weak scaling experiments.
60
Testbed – Quartz at LLNL
System details: Intel Xeon CPUs (2.1 GHz), 36 cores per node (2 CPU sockets), 128 GB of memory per node, 2,634 nodes in total, 2.6 PFlop/s peak performance, Intel Omni-Path interconnect with ~20 GB/s inter-node bandwidth (almost the same bandwidth as between two sockets of the same machine). 63rd in the TOP500 list, June 2018. TOSS3 with kernel version 3.10, OpenMPI 2.0, GCC 4.9.
61
Workloads

Graph                  Type       |V|     2|E|    dmax    davg    dstdev   Size
Web Data Commons       Real       3.5B    257B    95M     72.25   3.6K     2.7TB
Reddit                 Real       3.9B    14B     19M     3.74    483.25   460GB
IMDb                   Real       5M      29M     552K    5.83    342.64   < 2GB
Patent                 Real       2.7M    28M     789     10.17   10.80
Youtube                Real       4.6M    88M     2.5K    19.16   21.67
R-MAT up to Scale 37   Synthetic  137B    4.4T    612M    32      4.9K     45TB

The storage and memory requirements of these graphs are why we need a supercomputer.
63
Strong Scaling – Web Data Commons (WDC) Hyperlink Graph
The Web Data Commons (WDC) hyperlink graph has 3.5 billion vertices and 128 billion directed edges (2.7 TB). Vertex labels are top-level domain names, e.g., gov, ca, and edu; there are 2,903 labels. The labels used in the templates are among the most frequent domains, covering ~22% of the vertices in the WDC graph; org covers 220M vertices and is the second most frequent after com.
64
Strong Scaling Experiments
Strong scaling experiments on the WDC graph: runtime, broken down into LCC and NLCC iterations, is reported against the number of compute nodes for each template.
67
Strong Scaling Experiments
Good strong scaling for both cyclic and acyclic templates, up to 90% parallel efficiency. LCC shows near-perfect strong scaling; NLCC is the bottleneck, affected by template topology, the distribution of matches, and load imbalance.
68
Weak Scaling – Recursive Matrix (R-MAT), Graph500 Synthetic Graphs
|V| = 2^SCALE and |E| = 16 × 2^SCALE, from Scale 28 (4.3B directed edges) to Scale 37 (2.2T directed edges, 45 TB). Vertex labels are assigned by degree-based binning, label(v) = log2(d_v + 1), giving up to 30 labels. The labels used in the templates cover ~30% of the vertices, with 2 being the most frequent label (14B instances in the Scale 37 graph).
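A small helper that reproduces these parameters; taking the integer part of log2(d + 1) for the degree bin is an assumption, since the slide does not specify the rounding.

```python
import math

def rmat_size(scale):
    """R-MAT graph dimensions from the slide: |V| = 2**SCALE, |E| = 16 * 2**SCALE."""
    return 2 ** scale, 16 * 2 ** scale

def degree_label(degree):
    """Degree-based label binning, log2(d + 1), truncated to an integer."""
    return int(math.log2(degree + 1))

# e.g. rmat_size(37) -> (137_438_953_472, 2_199_023_255_552),
# i.e. ~137B vertices and ~2.2T directed edges (the table's 2|E| = 4.4T).
```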
69
Weak Scaling Experiments
Weak scaling experiments show steady scaling: the pipeline prunes trillion-edge graphs by a factor of 10^7 in under a minute. The number of iterations depends on the topology and diameter of the template; runtime is broken down into individual iterations.
70
Match Enumeration on the Pruned Graph
Match enumeration on the pruned graph (three representative templates):
Count:  668M    2,444    1.49B
Time:   4 min   1.84 s   40 h
These results give us confidence that the approach is valuable.
71
Match Enumeration on the Pruned Graph
'To Enumerate, or Not to Enumerate': it takes less than one minute to precisely prune the 2.7 TB graph, but more than 40 hours to enumerate the 1.49+ billion matches in the pruned graph. This begs the question of whether enumeration is always needed.
72
‘To Enumerate, or Not to Enumerate’
Design Objectives Graph Pruning for Pattern Matching Evaluation Methodology Experiment Results ‘To Enumerate, or Not to Enumerate’ This shows the actual matches and how they are formed in the background graph. - This begs the question … - You do not need to enumerate to draw this. - More approachable by a human analyst than just listing the matches. 2,444 Output produced from the pruned subgraph using matplotlib
73
Comparison with Arabesque/QFrag [SOSP’15, SoCC’17]
Comparison with Arabesque/QFrag [SOSP'15, SoCC'17]: speedup over QFrag on 60 cores, shared memory (runtime for precise pruning plus enumeration). Patent: 9×, 6.4×, 10×; Youtube: 4.4×, 3.9×, 6.6×, 4.3× (templates labeled a–f in the figure). These results give us confidence that the approach is valuable.
74
Why do we not see linear scaling?
Graph mutation, the nonuniform distribution of matches in the background graph, load imbalance, and loss of parallelism all influence performance.
75
No false positives or negatives
Takeaways. What makes a pruning-based approach promising? Template (U, E, P). No false positives or negatives. The novelty of the technique is that substructures can be verified in isolation, so vertices can be thrown away early without introducing false negatives. Much of the computation is polynomial; edge elimination in particular is entirely polynomial time. Unlike enumeration, pruning explores a minimal number of paths. This paves the way to practical microscopic graph queries through pattern matching. Graph pruning is an enabler for pattern matching on very large graphs, a problem that is computationally intractable today: low-complexity, highly parallel and scalable pruning algorithms can shrink a large graph by seven orders of magnitude in a matter of minutes, and for some problems pruning alone leads to the solution. Smaller algorithm state prevents combinatorial explosion, search space reduction makes enumeration less expensive, and constraint selection, ordering and generation can be optimized. Tahsin Reza. netsyslab.ece.ubc.ca computation.llnl.gov/casc