Presentation is loading. Please wait.

Presentation is loading. Please wait.

Research Directions for Big Data Graph Analytics John A. Miller, Lakshmish Ramaswamy, Krys J. Kochut and Arash Fard.

Similar presentations


Presentation on theme: "Research Directions for Big Data Graph Analytics John A. Miller, Lakshmish Ramaswamy, Krys J. Kochut and Arash Fard."— Presentation transcript:

1 Research Directions for Big Data Graph Analytics John A. Miller, Lakshmish Ramaswamy, Krys J. Kochut and Arash Fard

2 Contents  Introduction  What is Graph Analytics?  Problems in Graph Analytics  Path Problems  Pattern Problems  Pattern Matching Models  Applications  Computational Models & Frameworks  Conclusion

3 What,Why and Where ?  Large Data Graphs, query processing and Iterative algorithms.  Big Data Graph Analytics  Wide Ranging Domain Page Rank Algorithm, Relationship Analysis, Maps and Bio-chemistry.  Traditional graph computation algorithms.  Scalable graph analytics is still a challenge designing algorithms, performance, indexing, geographic distribution

4 Graph Analytics  data items :labeled vertices, & relationships : labeled edges  Labelled Multi-Digraphs - Vertex labeled Digraphs - directed graph with edge labels - Fully labelled Multi-Digraph - adds edge labels - allows multiple edges between 2 vertices if labels are distinct - G(V, E, L, l) : vertices, labeled edges, label set, vertex labeling

5 Problems in Graph Analytics  Problems can broadly classified in three categories  Path Problems – Reachability – Shortest Path  Pattern Problems – Graph Simulation – Graph Morphisms  Partition Problems

6 Path Problems  Reachability – Given a graph G and two vertices, u, w ∈ G.V, – find a path (sequence of edges) connecting them path(u, w) = uv 1 l i1, v 1 v 2 l i2,..., v n w l in+1 ∈ G.E – Reachability is simply reach(u, w) = path(u, w)  Shortest Path – Given k vertices, find a minimum distance path that – includes all k vertices. – Algorithms: Dijkstra, Bellman-Ford

7 Pattern Problems  Pattern Matching Model – Given a query graph Q, match its labeled vertices to corresponding labeled vertices in a data graph G Φ : Q.V → 2 G.V – All the pattern matching models start with matching vertex labels – For each vertex u in Q.V, require ∀ u' ∈ Φ (u), l(u') = l(u)  Graph Simulation – For each label matching pair (u, u') in Q.V × G.V, – require matching children ∀ v ∈ child Q (u), v' ∈ Φ (v) ∃ such that u’ v' ∈ G.E

8 Pattern Problems  Dual Simulation – For each vertex pair (u, u') in Q.V × G.V, also require matching parents ∀ w ∈ parent Q (u), w’ ∈ Φ (w) ∃ such that w’ u' ∈ G.E  Strong Simulation – Matching patterns in G must be contained in some ball B radius(B) = diameter(Q)

9 Pattern Problems  Strict Simulation – only balls containing vertices in an initial dual match  Tight Simulation – only balls with centers corresponding to a central vertex in Q radius(B) = diameter(Q)  CAR-tight Simulation – child and parent match must satisfy

10 Pattern Problems  Graph Homomorphism – Mapping function f: Q.V → G.V such that ∀ u ∈ Q, u' = f(u), l(u') = l(u) u v ∈ Q.E implies u'v’ ∈ G.E  Sub-graph Isomorphism - If we further require the mapping functions to be bijections between Q.V and G’.V, where G’ is a sub-graph of G (G’ ⊆ G) Difference : correspondence between vertices, one-to-one correspondence

11 Applications  Graph Databases – Neo4j, Orient DB and Titan  Semantic Web – SPARQL Queries – (subject, predicate, object) subject -edge-> object e.g. SELECT ?x, ?y, ?z WHERE { x a Lawyer. y a Doctor. z a Lawyer. x friend y. x competes_with z. y friend z }  Social Media - Facebook, Twitter, LinkedIn, Amazon

12 Computational Models & Frameworks  MapReduce/Hadoop  Apache Storm  Apache Spark  Bulk Synchronous Parallel (BSP) - BSP can be quite useful for implementing graph algorithms - vertex-centric, has become popular for big data graph analytics. - significant overhead in the synchronization

13 Conclusion  Graphs are now becoming an important form of representation for data.  Implementation of improved sequential algorithms  Adding of Extensions.

14 Questions ??


Download ppt "Research Directions for Big Data Graph Analytics John A. Miller, Lakshmish Ramaswamy, Krys J. Kochut and Arash Fard."

Similar presentations


Ads by Google