Wellington Cabrera, Carlos Ordonez (presenter)

Presentation transcript:

Unified Algorithm to Solve Several Graph Problems with Relational Queries Wellington Cabrera, Carlos Ordonez (presenter) University of Houston, USA

Motivation: Graph problems are among the most challenging problems in big data analytics (social networks, the WWW, transportation networks). Are specialized graph systems (e.g. Giraph) required to analyze big graphs? A lot of data is already stored in relational databases, and query processing has been studied for a long time.

Definitions:
Let G = (V, E) be a graph with n vertices; E is stored in a relational table E(i, j, v).
Table E corresponds to the adjacency matrix E, omitting zeroes; weights/distances are represented by v.
If |E| = O(n), we say that E is sparse.
Let S be a vector over the n graph vertices, stored in table S(j, v), where v is a distance, reachability, order, or probability.
We omit v values that carry no information (such as infinity for distances or 0 for probabilities).
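
The definitions above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the 4-vertex graph, the column types, and the helper name make_tables are assumptions, not taken from the slides.

```python
import sqlite3

# Hypothetical 4-vertex weighted directed graph (not the graph from the slides).
EDGES = [(1, 2, 4.0), (1, 3, 1.0), (3, 2, 2.0), (2, 4, 5.0)]

def make_tables(conn, edges, source):
    """Create E(i, j, v) and S(j, v) with sparse storage (hypothetical helper)."""
    # One row per edge; zero entries of the adjacency matrix are not stored.
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL, PRIMARY KEY (i, j))")
    conn.executemany("INSERT INTO E VALUES (?, ?, ?)", edges)
    # Only the source has a known distance (0). Vertices at infinite
    # distance are simply absent from S.
    conn.execute("CREATE TABLE S (j INTEGER PRIMARY KEY, v REAL)")
    conn.execute("INSERT INTO S VALUES (?, 0.0)", (source,))

conn = sqlite3.connect(":memory:")
make_tables(conn, EDGES, source=1)
n_edges = conn.execute("SELECT count(*) FROM E").fetchone()[0]
s_rows = conn.execute("SELECT j, v FROM S").fetchall()
```

Storing only the informative entries keeps both tables proportional to the number of edges and known vertex values rather than to n squared.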

Example: Directed Graph [figure: directed graph with numeric labels; layout not recoverable from the transcript]

Graph Algorithms: Bellman-Ford, Reachability, Topological Sort, PageRank. Main idea: these algorithms can be expressed as a sequence of vector-matrix multiplications. How can they work in a relational database?

Graph algorithms over a semi-ring:  

Algorithm Pattern:  

Example: Vector-Matrix Multiplication with SQL queries
Vector-matrix multiplication, (+, *) semiring:
SELECT E.j, sum(S.v * E.v) FROM S_{d-1} AS S JOIN E ON S.j = E.i GROUP BY E.j
Vector-matrix multiplication, (min, +) semiring:
SELECT E.j, min(S.v + E.v) FROM S_{d-1} AS S JOIN E ON S.j = E.i GROUP BY E.j
In general:
SELECT E.j, g(S.v ⨂ E.v) FROM S_{d-1} AS S JOIN E ON S.j = E.i GROUP BY E.j
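
To make the two semirings concrete, here is a small runnable sketch using Python's sqlite3 module. The helper vector_matrix_step, the table contents, and the plain table name S (standing in for the previous iterate) are assumptions for illustration. The result is grouped by the destination vertex E.j, so that entry j aggregates over all incoming edges.

```python
import sqlite3

def vector_matrix_step(conn, agg, op):
    """One product S x E over a semiring (hypothetical helper).

    agg is the SQL aggregate (the role of g), op the pairwise operator.
    """
    # agg and op are spliced into the SQL text, so only accept known constants.
    assert agg in ("sum", "min", "max") and op in ("+", "*")
    query = f"""
        SELECT E.j, {agg}(S.v {op} E.v)
        FROM S JOIN E ON S.j = E.i
        GROUP BY E.j
        ORDER BY E.j
    """
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL)")
conn.execute("CREATE TABLE S (j INTEGER, v REAL)")
# Hypothetical data: edges 1->3 (2), 2->3 (1), 2->4 (5); vector S = {1: 3, 2: 4}.
conn.executemany("INSERT INTO E VALUES (?, ?, ?)", [(1, 3, 2.0), (2, 3, 1.0), (2, 4, 5.0)])
conn.executemany("INSERT INTO S VALUES (?, ?)", [(1, 3.0), (2, 4.0)])

plus_times = vector_matrix_step(conn, "sum", "*")  # [(3, 10.0), (4, 20.0)]
min_plus = vector_matrix_step(conn, "min", "+")    # [(3, 5.0), (4, 9.0)]
```

For vertex 3, the (+, *) product gives 3*2 + 4*1 = 10, while the (min, +) product gives min(3+2, 4+1) = 5.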

Bellman-Ford. Input: table E. Output: table S_d (vector of shortest distances from a source).
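
The slide's algorithm body did not survive in the transcript, so the following is a minimal sketch of how Bellman-Ford might be iterated with (min, +) queries, not the authors' exact formulation. The function name, the scratch table S_next, and the test graph are assumptions. Negative cycles are not detected.

```python
import sqlite3

def bellman_ford(edges, source):
    """Shortest distances from source via repeated (min, +) SQL products (sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL)")
    conn.executemany("INSERT INTO E VALUES (?, ?, ?)", edges)
    conn.execute("CREATE TABLE S (j INTEGER PRIMARY KEY, v REAL)")
    conn.execute("CREATE TABLE S_next (j INTEGER PRIMARY KEY, v REAL)")  # scratch table (assumption)
    conn.execute("INSERT INTO S VALUES (?, 0.0)", (source,))
    n = conn.execute(
        "SELECT count(*) FROM (SELECT i FROM E UNION SELECT j FROM E)"
    ).fetchone()[0]
    for _ in range(n - 1):  # a shortest path uses at most n - 1 edges
        conn.execute("DELETE FROM S_next")
        # Keep the best of the old distance and every one-edge extension.
        conn.execute("""
            INSERT INTO S_next
            SELECT j, min(v) FROM (
                SELECT j, v FROM S
                UNION ALL
                SELECT E.j, S.v + E.v FROM S JOIN E ON S.j = E.i
            ) GROUP BY j
        """)
        changed = conn.execute(
            "SELECT j, v FROM S_next EXCEPT SELECT j, v FROM S"
        ).fetchone()
        conn.execute("DELETE FROM S")
        conn.execute("INSERT INTO S SELECT j, v FROM S_next")
        if changed is None:  # early termination: nothing improved
            break
    return dict(conn.execute("SELECT j, v FROM S").fetchall())

# Hypothetical graph: the path 1 -> 3 -> 2 (cost 3) beats the direct edge 1 -> 2 (cost 4).
distances = bellman_ford([(1, 2, 4.0), (1, 3, 1.0), (3, 2, 2.0), (2, 4, 5.0)], source=1)
```

Unreachable vertices never appear in S, which matches the convention of omitting infinite distances.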

Reachability. Input: table E. Output: table R_d.
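
As the slide body is missing from the transcript, this is only a sketch of one way to compute reachability with join queries: repeatedly join the reached set with E and accumulate newly found vertices. The table names R and R_new, the function name, and the choice to count the source as reachable are assumptions.

```python
import sqlite3

def reachable_from(edges, source):
    """Vertices reachable from source, accumulated by repeated joins with E (sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL)")
    conn.executemany("INSERT INTO E VALUES (?, ?, ?)", edges)
    conn.execute("CREATE TABLE R (j INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE R_new (j INTEGER PRIMARY KEY)")  # scratch table (assumption)
    conn.execute("INSERT INTO R VALUES (?)", (source,))  # assumption: source counts as reached
    while True:
        conn.execute("DELETE FROM R_new")
        # Vertices one edge away from the reached set that are not reached yet.
        conn.execute("""
            INSERT INTO R_new
            SELECT DISTINCT E.j
            FROM R JOIN E ON R.j = E.i
            WHERE E.j NOT IN (SELECT j FROM R)
        """)
        found = conn.execute("SELECT count(*) FROM R_new").fetchone()[0]
        if found == 0:  # early termination: no new vertex discovered
            break
        conn.execute("INSERT INTO R SELECT j FROM R_new")
    return [row[0] for row in conn.execute("SELECT j FROM R ORDER BY j")]

# Hypothetical graph with a cycle 1 -> 2 -> 3 -> 1; vertex 4 only has an outgoing edge.
reached = reachable_from([(1, 2, 1.0), (2, 3, 1.0), (3, 1, 1.0), (4, 1, 1.0)], source=1)
```

The NOT IN filter guarantees termination on cyclic graphs, because each round must add at least one new vertex or stop.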

PageRank. Input: table E. Output: table S_d (vector with the PageRank probability of each vertex).
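
The slide body is not in the transcript, so the sketch below shows a standard damped power iteration written with (+, *) queries, which may differ from the authors' formulation. It assumes that E.v holds the transition probability 1/outdegree(i) and that every vertex has at least one outgoing edge. The damping factor beta, the threshold eps, and the table S_next are illustrative choices.

```python
import sqlite3

def pagerank(edges, beta=0.85, eps=1e-10, max_iter=200):
    """Damped PageRank by repeated (+, *) vector-matrix products in SQL (sketch)."""
    conn = sqlite3.connect(":memory:")
    outdeg = {}
    for i, _ in edges:
        outdeg[i] = outdeg.get(i, 0) + 1
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL)")
    # Assumption: v is the probability of following edge i -> j.
    conn.executemany("INSERT INTO E VALUES (?, ?, ?)",
                     [(i, j, 1.0 / outdeg[i]) for i, j in edges])
    vertices = sorted({x for edge in edges for x in edge})
    n = len(vertices)
    conn.execute("CREATE TABLE S (j INTEGER PRIMARY KEY, v REAL)")
    conn.execute("CREATE TABLE S_next (j INTEGER PRIMARY KEY, v REAL)")
    conn.executemany("INSERT INTO S VALUES (?, ?)", [(x, 1.0 / n) for x in vertices])
    for _ in range(max_iter):
        conn.execute("DELETE FROM S_next")
        # New rank = teleport term + beta * (S x E); the LEFT JOIN keeps
        # vertices that receive no rank through edges.
        conn.execute("""
            INSERT INTO S_next
            SELECT V.j, ? + ? * coalesce(P.v, 0.0)
            FROM S AS V LEFT JOIN (
                SELECT E.j AS j, sum(S.v * E.v) AS v
                FROM S JOIN E ON S.j = E.i
                GROUP BY E.j
            ) AS P ON P.j = V.j
        """, ((1.0 - beta) / n, beta))
        delta = conn.execute(
            "SELECT max(abs(N.v - S.v)) FROM S_next AS N JOIN S ON S.j = N.j"
        ).fetchone()[0]
        conn.execute("DELETE FROM S")
        conn.execute("INSERT INTO S SELECT j, v FROM S_next")
        if delta < eps:  # converged
            break
    return dict(conn.execute("SELECT j, v FROM S").fetchall())

# Hypothetical graph: vertex 3 receives links from both 1 and 2.
ranks = pagerank([(1, 2), (1, 3), (2, 3), (3, 1)])
```

On this graph vertex 3 ends up with the highest rank and vertex 2 with the lowest, and the ranks sum to 1 because no vertex is dangling.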

Topological Sort. Input: table E. Output: table S_d (vector with the order of each vertex).
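
The transcript does not include the slide's algorithm, so this sketch uses one common approach that fits the same query pattern: a (max, +) iteration with unit edge weights, which assigns each vertex the length of the longest path reaching it. That is an assumption about the method, not a statement of the authors' algorithm. The input is assumed to be a DAG, and all names are illustrative.

```python
import sqlite3

def topological_levels(edges):
    """Assign each vertex of a DAG an order value via (max, +) SQL products (sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v INTEGER)")
    conn.executemany("INSERT INTO E VALUES (?, ?, 1)", edges)  # unit weights
    conn.execute("CREATE TABLE S (j INTEGER PRIMARY KEY, v INTEGER)")
    conn.execute("CREATE TABLE S_next (j INTEGER PRIMARY KEY, v INTEGER)")
    # Vertices without incoming edges start at order 0.
    conn.execute(
        "INSERT INTO S SELECT DISTINCT i, 0 FROM E WHERE i NOT IN (SELECT j FROM E)"
    )
    n = conn.execute(
        "SELECT count(*) FROM (SELECT i FROM E UNION SELECT j FROM E)"
    ).fetchone()[0]
    for _ in range(n):
        conn.execute("DELETE FROM S_next")
        # Order of j = longest path found so far that ends in j.
        conn.execute("""
            INSERT INTO S_next
            SELECT j, max(v) FROM (
                SELECT j, v FROM S
                UNION ALL
                SELECT E.j, S.v + E.v FROM S JOIN E ON S.j = E.i
            ) GROUP BY j
        """)
        changed = conn.execute(
            "SELECT j, v FROM S_next EXCEPT SELECT j, v FROM S"
        ).fetchone()
        conn.execute("DELETE FROM S")
        conn.execute("INSERT INTO S SELECT j, v FROM S_next")
        if changed is None:  # early termination: order values are stable
            break
    return dict(conn.execute("SELECT j, v FROM S").fetchall())

# Hypothetical DAG.
dag = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (1, 5)]
levels = topological_levels(dag)
```

Sorting the vertices by the resulting value yields a valid topological order, since every edge goes from a lower value to a strictly higher one.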

Comparison of 4 algorithms:

Unified Algorithm. Input: E, S_0, R_0, f(), g(), ⨂, ε, unionFlag. Optional input: s. Output: R_d.

Conclusions:
Graph algorithms are expressed as an iteration of SPJA (select-project-join-aggregate) queries.
External algorithms: not limited by RAM.
Strengths: sparse storage; early termination, when possible; lightweight relational queries.
Unified algorithm: solves 4 important and diverse graph problems.
Future work: more graph algorithms.

References C. Ordonez, W. Cabrera, and A. Gurram. Comparing columnar, row and array DBMSs to process recursive queries on graphs, Information Systems journal, 2016 (accepted).