Evaluating “find a path” reachability queries P. Bouros 1, T. Dalamagas 2, S.Skiadopoulos 3, T. Sellis 1,2 1 National Technical University of Athens 2.

Slides:



Advertisements
Similar presentations
Chapter 5: Tree Constructions
Advertisements

I/O and Space-Efficient Path Traversal in Planar Graphs Craig Dillabaugh, Carleton University Meng He, University of Waterloo Anil Maheshwari, Carleton.
Efficient Keyword Search for Smallest LCAs in XML Database Yu Xu Department of Computer Science & Engineering University of California, San Diego Yannis.
Fast Firewall Implementation for Software and Hardware-based Routers Lili Qiu, Microsoft Research George Varghese, UCSD Subhash Suri, UCSB 9 th International.
CS 267: Automated Verification Lecture 10: Nested Depth First Search, Counter- Example Generation Revisited, Bit-State Hashing, On-The-Fly Model Checking.
Graphs - II CS 2110, Spring Where did I leave that book?
Graph A graph, G = (V, E), is a data structure where: V is a set of vertices (aka nodes) E is a set of edges We use graphs to represent relationships among.
Data Structure and Algorithms (BCS 1223) GRAPH. Introduction of Graph A graph G consists of two things: 1.A set V of elements called nodes(or points or.
#1© K.Goczyła GRAPHS Definitions and data structuresDefinitions and data structures Traversing graphsTraversing graphs Searching for paths in graphsSearching.
Dynamic Pickup and Delivery with Transfers* P. Bouros 1, D. Sacharidis 2, T. Dalamagas 2, T. Sellis 1,2 1 NTUA, 2 IMIS – RC “Athena” * To appear in SSTD’11.
Evaluating Path Queries over Route Collections Panagiotis Bouros NTUA, Greece (supervised by Y. Vassiliou)
Evaluating Reachability Queries over Path Collections* P. Bouros 1, S. Skiadopoulos 2, T. Dalamagas 3, D. Sacharidis 3, T. Sellis 1,3 1 National Technical.
Evaluating Reachability Queries over Path Collections P. Bouros 1, S. Skiadopoulos 2, T. Dalamagas 3, D. Sacharidis 3, T. Sellis 1,3 1 National Technical.
CS Lecture 9 Storeing and Querying Large Web Graphs.
Lecture 14: Graph Algorithms Shang-Hua Teng. Undirected Graphs A graph G = (V, E) –V: vertices –E : edges, unordered pairs of vertices from V  V –(u,v)
CS728 Lecture 16 Web indexes II. Last Time Indexes for answering text queries –given term produce all URLs containing –Compact representations for postings.
©Brooks/Cole, 2003 Chapter 12 Abstract Data Type.
Comparing path-based and vertically-partitioned RDF databases Preetha Lakshmi & Chris Mueller 12/10/2007 CSCI 8715 Shashi Shekhar.
Using Search in Problem Solving
Transforming Infix to Postfix
Graphs & Graph Algorithms Nelson Padua-Perez Bill Pugh Department of Computer Science University of Maryland, College Park.
Trip Planning Queries F. Li, D. Cheng, M. Hadjieleftheriou, G. Kollios, S.-H. Teng Boston University.
Graphs & Graph Algorithms 2 Fawzi Emad Chau-Wen Tseng Department of Computer Science University of Maryland, College Park.
Sensor Networks Storage Sanket Totala Sudarshan Jagannathan.
Mike 66 Sept Succinct Data Structures: Techniques and Lower Bounds Ian Munro University of Waterloo Joint work with/ work of Arash Farzan, Alex Golynski,
Chapter 9 – Graphs A graph G=(V,E) – vertices and edges
A Level Computer Science Topic 9: Data Structures T eaching L ondon C omputing William Marsh School of Electronic Engineering and Computer Science Queen.
VLDB '2006 Haibo Hu (Hong Kong Baptist University, Hong Kong) Dik Lun Lee (Hong Kong University of Science and Technology, Hong Kong) Victor.
Αποτίμηση Ερωτημάτων σε Συλλογές Διαδρομών (Evaluating Queries on Route Collections) Παναγιώτης Μπούρος Ενδιάμεση Κρίση Επιβλέπων: Ι. Βασιλείου.
Glasgow 02/02/04 NN k networks for content-based image retrieval Daniel Heesch.
Representing and Using Graphs
Graphs. Made up of vertices and arcs Digraph (directed graph) –All arcs have arrows that give direction –You can only traverse the graph in the direction.
Easiest-to-Reach Neighbor Search Fatimah Aldubaisi.
Agenda Review: –Planar Graphs Lecture Content:  Concepts of Trees  Spanning Trees  Binary Trees Exercise.
Rooted Tree a b d ef i j g h c k root parent node (self) child descendent leaf (no children) e, i, k, g, h are leaves internal node (not a leaf) sibling.
COSC 2007 Data Structures II
By: Gang Zhou Computer Science Department University of Virginia 1 Medians and Beyond: New Aggregation Techniques for Sensor Networks CS851 Seminar Presentation.
Graphs Upon completion you will be able to:
An Improved DFA for Fast Regular Expression Matching Author : Domenico Ficara 、 Stefano Giordano 、 Gregorio Procissi Fabio Vitucci 、 Gianni Antichi 、 Andrea.
NETWORK FLOWS Shruti Aggrawal Preeti Palkar. Requirements 1.Implement the Ford-Fulkerson algorithm for computing network flow in bipartite graphs. 2.For.
Efficient Semantic Web Service Discovery in Centralized and P2P Environments Dimitrios Skoutas 1,2 Dimitris Sacharidis.
Data Structures and Algorithm Analysis Graph Algorithms Lecturer: Jing Liu Homepage:
CIS 825 Lecture 9. Minimum Spanning tree construction Each node is a subtree/fragment by itself. Select the minimum outgoing edge of the fragment Send.
1 GRAPHS – Definitions A graph G = (V, E) consists of –a set of vertices, V, and –a set of edges, E, where each edge is a pair (v,w) s.t. v,w  V Vertices.
Graphs - II CS 2110, Spring Where did David leave that book? 2.
Graphs A New Data Structure
Dynamic Pickup and Delivery with Transfers
Thinking about Algorithms Abstractly
DATA STRUCTURES AND OBJECT ORIENTED PROGRAMMING IN C++
CSE 2331/5331 Topic 9: Basic Graph Alg.
Csc 2720 Instructor: Zhuojun Duan
Probabilistic Data Management
Query-Friendly Compression of Graph Streams
Spatio-temporal Pattern Queries
CS120 Graphs.
CMSC 341 Lecture 21 Graphs (Introduction)
More Graph Algorithms.
Graphs & Graph Algorithms 2
Problem Solving and Searching
Week nine-ten: Trees Trees.
Problem Solving and Searching
COMP171 Depth-First Search.
EMIS 8374 Search Algorithms Updated 9 February 2004
Graph Traversal Lecture 18 CS 2110 — Spring 2019.
Important Problem Types and Fundamental Data Structures
3.2 Graph Traversal.
Chapter 14 Graphs © 2011 Pearson Addison-Wesley. All rights reserved.
Lecture 10 Graph Algorithms
EMIS 8374 Search Algorithms Updated 12 February 2008
INTRODUCTION A graph G=(V,E) consists of a finite non empty set of vertices V , and a finite set of edges E which connect pairs of vertices .
Presentation transcript:

Evaluating “find a path” reachability queries P. Bouros 1, T. Dalamagas 2, S.Skiadopoulos 3, T. Sellis 1,2 1 National Technical University of Athens 2 Institute for Management of Information Systems – R.C. Athena 3 University of Peloponnese

Outline Introduction Related work Introduce path representation of a graph Present an index for path representations Extend depth-first search for answering “find a path” reachability queries Experimental study Conclusion and Future work

Introduction Graphs, modelling complex problems –Spatial & road networks –Social networks –Semantic Web Important query type, reachability –“find a path” reachability query Find a path between two given graph nodes

Answering “find a path” reachability queries Two extreme approaches –No precomputation Exploit graph edges Search algorithm –Full precompution Store path information in TC of the graph Single lookups Full precomputation No precomputation single lookup search algorithm answering time increases space requirement increases

Answering “find a path” reachability queries No precomputation –Tarjan, single source path expression problem Precomputation –Agrawal et al., encode each path between any graph nodes –Barton and Zezula, graph segmentation - ρ-Index Labeling schemes –Determine whether exists a path, but cannot identify it Full precomputation No precomputation single lookup search algorithm answering time increases space requirement increases Tarjan single source path

Answering “find a path” reachability queries No precomputation –Tarjan, single source path expression problem Precomputation –Agrawal et al., encode each path between any graph nodes –Barton and Zezula, graph segmentation - ρ-Index Labeling schemes –Determine whether exists a path, but cannot identify it Full precomputation No precomputation single lookup search algorithm answering time increases space requirement increases Agrawal et al. encoding Tarjan single source path

Answering “find a path” reachability queries No precomputation –Tarjan, single source path expression problem Precomputation –Agrawal et al., encode each path between any graph nodes –Barton and Zezula, graph segmentation - ρ-Index Labeling schemes –Determine whether exists a path, but cannot identify it Full precomputation No precomputation single lookup search algorithm answering time increases space requirement increases Agrawal et al. encoding Barton and Zezula Tarjan single source path

Answering “find a path” reachability queries No precomputation –Tarjan, single source path expression problem Precomputation –Agrawal et al., encode each path between any graph nodes –Barton and Zezula, graph segmentation - ρ-Index Labeling schemes –Determine whether exists a path, but cannot identify it Full precomputation No precomputation single lookup search algorithm answering time increases space requirement increases Agrawal et al. encoding Barton and Zezula Tarjan single source path

Answering “find a path” reachability queries Our idea –Represent the graph as a set of paths –Each path contains precomputed answers –Precompute and store part of path information in TC of the graph In the middle –No need to compute TC Full precomputation No precomputation single lookup search algorithm answering time increases space requirement increases Agrawal et al. encoding Barton and Zezula Tarjan single source path

Answering “find a path” reachability queries Our idea –Represent the graph as a set of paths –Each path contains precomputed answers –Precompute and store part of path information in TC of the graph In the middle –No need to compute TC Full precomputation No precomputation single lookup search algorithm answering time increases space requirement increases Agrawal et al. encoding Barton and Zezula Tarjan single source path our approach

In brief Propose a novel representation of a graph as a set of paths (path representation) Present an index for providing efficient access in representation (P-Index) Extend depth-first search to work with paths in answering “find a path” reachability queries (pdfs) Preliminary experimental evaluation

Graph – Representations - Indices G(V,E) edges E(G) paths P(G) Adjacency list P-Index graph representation index

Path representation Set of paths –Stores part of path information in TC of a graph –Combines graph edges to efficiently answer “find a path” reachability queries –Preserves reachability information Each graph edge is contained in at least one path Construct graph by merging paths Not unique

Path representation – Example A B C DF K EH G

A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) G P(G) = {p1,p2,p3,p4}

Path representation – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) G P(G) = {p1,p2,p3,p4}

P-Index Consider graph G(V,E) and its path representation P(G) –For each node v in V retain paths[v] list of paths in P(G) containing v –P-Index(G) = {paths[v i ]}, for each v i in V

P-Index – Example p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) A B C DF K EH G P(G)

P-Index – Example p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) A B C DF K EH Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) G P(G)

P-Index – Example p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) A B C DF K EH Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) G P(G)

Algorithm pdfs Answers “find a path” reachability queries Extends depth-first search to work with paths –For each node, visit Not only its children Also, its successors in paths of P(G) Input: graph G(V,E), P(G), P-Index(G) –Current path stack curPath Method: –If exists path in P(G) where source before target –While curPath not empty Read top node u of curPath Read a path p containing top u – If no path left, pop u Else for each node v in p after u –Case 1: if exists path in P(G) where v before target then FOUND path –Case 2: if visited[v]=FALSE then push it in curPath, visited[v]=TRUE –Case 3: if visited[v]=TRUE then ignore rest of nodes in p

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) Query: FindAPath(B,K) G

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) p1 contains B Current search node Visited node G

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) visit C,E no path in P(G) contains either C or E before target K Current search node Visited node G

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) E contained in p1 at the end Current search node Visited node G

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) E contained in p1 at the end pop E Current search node Visited node G

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) p1 contains C But E already visited, next path Current search node Visited node G

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) p1 contains C But E already visited, next path p2 contains C Current search node Visited node G

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) p1 contains C But E already visited, next path p2 contains C consider D, exists path in P(G) containing D before target K: p4 Current search node Visited node G

pdfs – Example A B C DF K EH p1(A,B,C,E) p2(C,D,B,F) p3(C,H) p4(D,K) Ap1 Bp1, p2 Cp1, p2, p3 Dp2, p4 Ep1 Fp2 Hp3 Kp4 P-Index(G) P(G) FOUND path from B to K (B,C,D,K) G

Experimental study Sets of random graphs Construct path representations –Traverse graph in depth-first manner starting from several nodes –Terminate when all graph edges included –Promote construction of long paths Reusing graph edges Experimental parameters –Graph nodes |V|: 10 4, 5*10 4, 10 5, 5*10 5, 10 6 –Avg degree d = |E|/|V|: 2, 3, 4, 5, 10 –Max length of paths in P(G) L max : 10, 20, 30, 40, 50

Varying max path length Graph G: |V|=100,000 & d=4, 5 different path representations 1,000 “find a path” queries As L max increases –Larger part of path information included –Fewer but longer paths –pdfs visits more node in each iteration –More possibly exists path where node u before target –Storage requirements increase

Varying max path length Graph G: |V|=100,000 & d=4, 5 different path representations 1,000 “find a path” queries As L max increases –Larger part of path information included –Fewer but longer paths –pdfs visits more nodes in each iteration –More possibly exists path where node u before target –Storage requirements increase

Varying max path length Graph G: |V|=100,000 & d=4, 5 different path representations 1,000 “find a path” queries As L max increases –Larger part of path information included –Fewer but longer paths –pdfs visits more nodes in each iteration –More possibly exists path where node u before target –Storage requirements increase

Varying max path length Graph G: |V|=100,000 & d=4, 5 different path representations 1,000 “find a path” queries As L max increases –Larger part of path information included –Fewer but longer paths –pdfs visits more nodes in each iteration –More possibly exists path where node u before target –Storage requirements increase ~2.5 more edges

Varying max path length Graph G: |V|=100,000 & d=4, 5 different path representations 1,000 “find a path” queries As L max increases –Larger part of path information included –Fewer but longer paths –pdfs visits more nodes in each iteration –More possibly exists path where node u before target –Storage requirements increase ~2.5 more edges ~3.5 more edges

Varying avg degree Initial graph G: |V|=100,000 & d=2 & L max =30 –Progressively add edges 1,000 “find a path” queries More dense graph –Larger number of long paths –Fewer short paths

Varying number of graph nodes 5 graphs: d=4 & L max =30 1,000 “find a path” queries |V| increases –Paths have fewer common nodes –Less possibly exists a path in P(G) where node u before target

Conclusions and Future work Conclusions –Propose a novel representation of a graph as a set of paths –Present P-Index –Extend depth-first search to work with paths in answering “find a path” reachability queries –Preliminary experimental evaluation Future work –Answer “find a path” with length constraint reachability queries –Updates –Introduce cost model for path representation Construction of the set of paths Answering queries cost Updating representation cost

Questions ?

Evaluating “find a path” reachability queries Additional slides

Find a path from S to T Answering queries – Basic idea

S contained in p 1,p 2,p 3 Answering queries – Basic idea

Consider p 2 part – X last node Answering queries – Basic idea

S contained in p 4,p 5 Answering queries – Basic idea

Consider p 5 part – Z last node Answering queries – Basic idea

Z only contained in p 5 – backtrack to Y Answering queries – Basic idea

Y contained in p 6 Answering queries – Basic idea

Consider p 6 part – FOUND target T Answering queries – Basic idea

Varying number of graph nodes