
1 Unified Algorithm to Solve Several Graph Problems with Relational Queries
Wellington Cabrera, Carlos Ordonez (presenter) University of Houston, USA

2 Motivation Graph problems are among the most challenging problems in big data analytics (social networks, WWW, transportation networks). Are specialized graph systems (e.g. Giraph) required to analyze big graphs? A lot of data is already stored in relational databases, and query processing has been studied for a long time.

3 Definitions Let G=(V,E); E is stored in a relational table E(i,j,v).
Table E corresponds to the adjacency matrix E, omitting zeroes; weights/distances are represented by v. If |E| = O(n), we say that E is sparse. Let S be a vector of n graph vertices, stored in table S(j,v), where v is a distance, reachability, order, or probability. We omit v values that carry no information (such as infinity for distances or 0 for probabilities).
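As a minimal sketch of the sparse storage described above, the snippet below loads a small adjacency matrix into a table E(i,j,v) and skips the zero entries. The 4-vertex matrix, the use of SQLite through Python's sqlite3 module, and the 1-based vertex ids are illustrative assumptions, not taken from the slides.

```python
import sqlite3

# Hypothetical 4-vertex weighted adjacency matrix (0 means "no edge").
adjacency = [
    [0, 2, 0, 5],
    [0, 0, 1, 0],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL, PRIMARY KEY (i, j))")

# Keep only the non-zero entries; vertex ids are 1-based (an assumption).
rows = [
    (i + 1, j + 1, float(w))
    for i, row in enumerate(adjacency)
    for j, w in enumerate(row)
    if w != 0
]
conn.executemany("INSERT INTO E VALUES (?, ?, ?)", rows)

edge_count = conn.execute("SELECT count(*) FROM E").fetchone()[0]
print(edge_count, "edges stored for", len(adjacency), "vertices")
```

Only the 4 non-zero cells of the 16-cell matrix become rows, which is the point of storing E as a table rather than as a dense array.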

4 Example: Directed Graph
[Figure: example directed graph; the vertex and edge labels are not recoverable from the transcript.]

5 Graph Algorithms
Bellman-Ford, Reachability, Topological Sort, PageRank. Main idea: these algorithms can be expressed as a sequence of vector-matrix multiplications. How can they work in a relational database?

6 Graph algorithms over a semi-ring:

7 Algorithm Pattern:

8 Example: Vector-Matrix Multiplication with SQL queries
Vector-matrix multiplication over the (+, *) semiring:
SELECT E.j, sum(S.v * E.v) FROM Sd-1 AS S JOIN E ON S.j = E.i GROUP BY E.j
Vector-matrix multiplication over the (min, +) semiring:
SELECT E.j, min(S.v + E.v) FROM Sd-1 AS S JOIN E ON S.j = E.i GROUP BY E.j
In general: SELECT E.j, g(S.v ⨂ E.v), with the same join and grouping.
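To make the two semiring queries concrete, here is a small runnable sketch using SQLite from Python. The three-edge graph, the two-entry vector, and the table name S (standing in for the previous iterate Sd-1) are made-up assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE E (i INTEGER, j INTEGER, v REAL);
CREATE TABLE S (j INTEGER, v REAL);
-- Hypothetical data: edges 1->2 (2), 1->3 (5), 2->3 (1); vector S = {1: 1, 2: 3}.
INSERT INTO E VALUES (1, 2, 2.0), (1, 3, 5.0), (2, 3, 1.0);
INSERT INTO S VALUES (1, 1.0), (2, 3.0);
""")

# (+, *) semiring: ordinary vector-matrix product.
sum_product = conn.execute("""
    SELECT E.j, sum(S.v * E.v)
    FROM S JOIN E ON S.j = E.i
    GROUP BY E.j ORDER BY E.j
""").fetchall()

# (min, +) semiring: one relaxation step for shortest paths.
min_plus = conn.execute("""
    SELECT E.j, min(S.v + E.v)
    FROM S JOIN E ON S.j = E.i
    GROUP BY E.j ORDER BY E.j
""").fetchall()

print(sum_product)  # [(2, 2.0), (3, 8.0)]
print(min_plus)     # [(2, 3.0), (3, 4.0)]
```

The join matches each vector entry with the outgoing edges of its vertex, and the GROUP BY on the destination column E.j applies the semiring's aggregation per target vertex.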

9 Bellman-Ford Input: Table E
Output: Table Sd (vector with shortest distances from a source)
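The slide's algorithm body did not survive in the transcript, so the following is only a sketch of how Bellman-Ford can be run as repeated (min, +) queries, not the authors' exact formulation. The function name bellman_ford_sql, the temporary table S_next, and the sample edges are hypothetical. It assumes there are no negative cycles.

```python
import sqlite3

def bellman_ford_sql(edges, source, n):
    """Shortest distances from `source` via iterated (min, +) queries.

    edges: list of (i, j, v) rows for table E; n: number of vertices.
    Assumes the graph has no negative cycles.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL)")
    conn.executemany("INSERT INTO E VALUES (?, ?, ?)", edges)
    conn.execute("CREATE TABLE S (j INTEGER PRIMARY KEY, v REAL)")
    conn.execute("INSERT INTO S VALUES (?, 0.0)", (source,))

    for _ in range(n - 1):
        before = conn.execute("SELECT j, v FROM S ORDER BY j").fetchall()
        # Keep the best of the old distance and the relaxed distance per vertex.
        conn.execute("""
            CREATE TEMP TABLE S_next AS
            SELECT j, min(v) AS v FROM (
                SELECT j, v FROM S
                UNION ALL
                SELECT E.j AS j, S.v + E.v AS v FROM S JOIN E ON S.j = E.i
            ) GROUP BY j
        """)
        conn.execute("DELETE FROM S")
        conn.execute("INSERT INTO S SELECT j, v FROM S_next")
        conn.execute("DROP TABLE S_next")
        after = conn.execute("SELECT j, v FROM S ORDER BY j").fetchall()
        if after == before:  # early termination: nothing improved
            break

    return dict(conn.execute("SELECT j, v FROM S").fetchall())

# Hypothetical 4-vertex graph.
edges = [(1, 2, 2.0), (1, 3, 5.0), (2, 3, 1.0), (3, 4, 3.0)]
distances = bellman_ford_sql(edges, source=1, n=4)
print(distances)
```

Unreachable vertices simply never appear in S, which matches the convention of omitting infinite distances.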

10 Reachability Input: Table E Output: Table Rd
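Since the slide body is missing here as well, the sketch below shows one common way to compute reachability with relational queries: repeatedly join the reached set with E and add the newly found vertices until nothing changes. This is an assumed formulation, not necessarily the one on the slide; the function name reachable_sql, the table name R, and the sample edges are hypothetical.

```python
import sqlite3

def reachable_sql(edges, source):
    """Return the set of vertices reachable from `source` (including itself)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL)")
    conn.executemany("INSERT INTO E VALUES (?, ?, ?)", edges)
    conn.execute("CREATE TABLE R (j INTEGER PRIMARY KEY)")
    conn.execute("INSERT INTO R VALUES (?)", (source,))

    while True:
        # Vertices one edge away from the reached set that are not in it yet.
        new_vertices = conn.execute("""
            SELECT DISTINCT E.j
            FROM R JOIN E ON R.j = E.i
            WHERE E.j NOT IN (SELECT j FROM R)
        """).fetchall()
        if not new_vertices:  # early termination: fixpoint reached
            break
        conn.executemany("INSERT INTO R VALUES (?)", new_vertices)

    return {row[0] for row in conn.execute("SELECT j FROM R")}

# Hypothetical graph: 1->2->3->1 forms a cycle, 4->1 leads into it.
edges = [(1, 2, 1.0), (2, 3, 1.0), (3, 1, 1.0), (4, 1, 1.0)]
reached = reachable_sql(edges, source=1)
print(sorted(reached))
```

The NOT IN filter keeps the loop from revisiting vertices, so it terminates even when the graph contains cycles.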

11 PageRank Input: Table E Output: Table Sd (vector with the PageRank probability of each vertex)

12 Topological Sort Input: Table E
Output: Table Sd (vector with the topological order of each vertex)

13 Comparison of 4 algorithms:

14 Unified Algorithm Input: E, S0, R0, f(), g(), ⨂, ε, unionFlag
Optional input: s
Output: Rd

15 Conclusions Graph algorithms are expressed as an iteration of SPJA queries. External algorithms: not limited by RAM. Strengths: sparse storage; early termination, when possible; lightweight relational queries. Unified algorithm: solves 4 important and diverse graph problems. Future work: more graph algorithms.


