Download presentation
Presentation is loading. Please wait.
Published byJoan Watts Modified over 9 years ago
1
Research Directions for Big Data Graph Analytics John A. Miller, Lakshmish Ramaswamy, Krys J. Kochut and Arash Fard
2
Contents Introduction What is Graph Analytics? Problems in Graph Analytics Path Problems Pattern Problems Pattern Matching Models Applications Computational Models & Frameworks Conclusion
3
What,Why and Where ? Large Data Graphs, query processing and Iterative algorithms. Big Data Graph Analytics Wide Ranging Domain Page Rank Algorithm, Relationship Analysis, Maps and Bio-chemistry. Traditional graph computation algorithms. Scalable graph analytics is still a challenge designing algorithms, performance, indexing, geographic distribution
4
Graph Analytics data items :labeled vertices, & relationships : labeled edges Labelled Multi-Digraphs - Vertex labeled Digraphs - directed graph with edge labels - Fully labelled Multi-Digraph - adds edge labels - allows multiple edges between 2 vertices if labels are distinct - G(V, E, L, l) : vertices, labeled edges, label set, vertex labeling
5
Problems in Graph Analytics Problems can broadly classified in three categories Path Problems – Reachability – Shortest Path Pattern Problems – Graph Simulation – Graph Morphisms Partition Problems
6
Path Problems Reachability – Given a graph G and two vertices, u, w ∈ G.V, – find a path (sequence of edges) connecting them path(u, w) = uv 1 l i1, v 1 v 2 l i2,..., v n w l in+1 ∈ G.E – Reachability is simply reach(u, w) = path(u, w) Shortest Path – Given k vertices, find a minimum distance path that – includes all k vertices. – Algorithms: Dijkstra, Bellman-Ford
7
Pattern Problems Pattern Matching Model – Given a query graph Q, match its labeled vertices to corresponding labeled vertices in a data graph G Φ : Q.V → 2 G.V – All the pattern matching models start with matching vertex labels – For each vertex u in Q.V, require ∀ u' ∈ Φ (u), l(u') = l(u) Graph Simulation – For each label matching pair (u, u') in Q.V × G.V, – require matching children ∀ v ∈ child Q (u), v' ∈ Φ (v) ∃ such that u’ v' ∈ G.E
8
Pattern Problems Dual Simulation – For each vertex pair (u, u') in Q.V × G.V, also require matching parents ∀ w ∈ parent Q (u), w’ ∈ Φ (w) ∃ such that w’ u' ∈ G.E Strong Simulation – Matching patterns in G must be contained in some ball B radius(B) = diameter(Q)
9
Pattern Problems Strict Simulation – only balls containing vertices in an initial dual match Tight Simulation – only balls with centers corresponding to a central vertex in Q radius(B) = diameter(Q) CAR-tight Simulation – child and parent match must satisfy
10
Pattern Problems Graph Homomorphism – Mapping function f: Q.V → G.V such that ∀ u ∈ Q, u' = f(u), l(u') = l(u) u v ∈ Q.E implies u'v’ ∈ G.E Sub-graph Isomorphism - If we further require the mapping functions to be bijections between Q.V and G’.V, where G’ is a sub-graph of G (G’ ⊆ G) Difference : correspondence between vertices, one-to-one correspondence
11
Applications Graph Databases – Neo4j, Orient DB and Titan Semantic Web – SPARQL Queries – (subject, predicate, object) subject -edge-> object e.g. SELECT ?x, ?y, ?z WHERE { x a Lawyer. y a Doctor. z a Lawyer. x friend y. x competes_with z. y friend z } Social Media - Facebook, Twitter, LinkedIn, Amazon
12
Computational Models & Frameworks MapReduce/Hadoop Apache Storm Apache Spark Bulk Synchronous Parallel (BSP) - BSP can be quite useful for implementing graph algorithms - vertex-centric, has become popular for big data graph analytics. - significant overhead in the synchronization
13
Conclusion Graphs are now becoming an important form of representation for data. Implementation of improved sequential algorithms Adding of Extensions.
14
Questions ??
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.