Evaluating Reachability Queries over Path Collections* P. Bouros 1, S. Skiadopoulos 2, T. Dalamagas 3, D. Sacharidis 3, T. Sellis 1,3 1 National Technical.

Presentation transcript:

Evaluating Reachability Queries over Path Collections*
P. Bouros 1, S. Skiadopoulos 2, T. Dalamagas 3, D. Sacharidis 3, T. Sellis 1,3
1 National Technical University of Athens
2 University of Peloponnese
3 Institute for Management of Information Systems – R.C. Athena
HDMS'09
* Long version of SSDBM’09 paper

Introduction (I)
Several applications store and query large collections of data sequences
– Recent advances in GIS and geoservices have resulted in large volumes of routes (e.g., sequences of Points of Interest (POIs))
Route collections
– Points => nodes
– Sequences => routes

Introduction (II)
Web sites retain huge collections of routes
– ShareMyRoutes.com
– TravelByGPS.com
People visiting Athens
– Track their sightseeing
– Create routes of interesting places
Frequent updates
– Users upload new routes

Problem
Route collections
1. Too large to fit in main memory
2. Frequently updated, adding new routes
Reachability queries
– Q: path from Academy to Zappeion
– A: Academy -> University of Athens (change to route p2) -> Parliament -> Zappeion

Why not a graph-based solution?
Transform route collection P into graph G_P
1) Searching: depth-first or breadth-first search
– Low storage and maintenance cost
– Slow query evaluation
2) Encoding transitive closure
– Fast query evaluation
– Expensive precomputation, not suitable for frequently updated graphs
– 2-hop [CH+02], HOPI [STW05]
– DAGs: geometric-based & partitioning 2-hop [CY+06,08], interval LB [AB+89]
– GRIPP [TL07]
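As a rough illustration of option 1 (the search-based baseline), the sketch below converts a route collection into adjacency lists and runs an iterative depth-first search. The function names and the dictionary layout are assumptions made for this example, and the five routes are the example collection used on the later slides.

```python
from collections import defaultdict

# Example route collection taken from the later slides (route id -> node sequence).
ROUTES = {
    "p1": ["A", "B", "C", "D", "J"],
    "p2": ["A", "F", "D", "N", "B", "T"],
    "p3": ["N", "L", "M"],
    "p4": ["D", "N", "B", "F", "K"],
    "p5": ["A", "F", "K"],
}

def routes_to_graph(routes):
    """Hypothetical helper: one directed edge per pair of consecutive route nodes."""
    graph = defaultdict(set)
    for nodes in routes.values():
        for u, v in zip(nodes, nodes[1:]):
            graph[u].add(v)
    return graph

def dfs_reachable(graph, source, target):
    """Plain iterative depth-first search over the adjacency lists."""
    stack, visited = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in visited:
                visited.add(nxt)
                stack.append(nxt)
    return False

GRAPH = routes_to_graph(ROUTES)
print(dfs_reachable(GRAPH, "F", "C"))
```

Storage here is just the edge set, which is cheap to update when routes are added, but every query may traverse a large part of the graph one node at a time.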

Outline
The pfs algorithm
– Indexing route collections
– Indexing route transitions
Index maintenance
Experimental evaluation
Conclusions and Further work

The pfs algorithm (I)
Path-first search, basic idea:
– Examine part of routes at once, not single nodes
Extend depth-first search
– Work with routes instead of graph edges
For each route p containing current node v
– Visit each node after v (successor) in p
– Push the set of successors to the dfs stack at once

The pfs algorithm (II)
Find a path from node F to C
p1 (A, B, C, D, J)
p2 (A, F, D, N, B, T)
p3 (N, L, M)
p4 (D, N, B, F, K)
p5 (A, F, K)
Answer: (F, D, N, B, C)
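The path-first idea can be sketched in a few lines. This is a minimal in-memory sketch, not the authors' implementation: the name `pfs`, the parent-pointer bookkeeping, and the on-the-fly lookup table of node occurrences are all assumptions of this example.

```python
ROUTES = {
    "p1": ["A", "B", "C", "D", "J"],
    "p2": ["A", "F", "D", "N", "B", "T"],
    "p3": ["N", "L", "M"],
    "p4": ["D", "N", "B", "F", "K"],
    "p5": ["A", "F", "K"],
}

def pfs(routes, source, target):
    """Path-first search sketch: returns a list of nodes, or None if unreachable."""
    # node -> list of (route id, position); built here only for brevity.
    occurrences = {}
    for rid, nodes in routes.items():
        for pos, node in enumerate(nodes):
            occurrences.setdefault(node, []).append((rid, pos))

    parent = {source: None}  # doubles as the visited set
    stack = [source]
    while stack:
        v = stack.pop()
        if v == target:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for rid, pos in occurrences.get(v, []):
            prev = v
            # Push all successors of v on this route at once.
            for succ in routes[rid][pos + 1:]:
                if succ not in parent:
                    parent[succ] = prev
                    stack.append(succ)
                prev = succ
    return None

print(pfs(ROUTES, "F", "C"))  # ['F', 'D', 'N', 'B', 'C']
```

With routes visited in the order p1 to p5, the sketch returns the same path as the slide's answer; a different visiting order could return another valid path.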

P-Index
Inverted index on route collections
– For each node, store the routes containing it
Access routes containing current node
Better termination condition => pfsP
– Identify a route containing current node before target
[Figure: P-Index table with columns "node" and "routes list", showing nodes A, B, C, D, ... over the example routes p1 to p5; list entries lost in extraction]
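A small in-memory stand-in for the P-Index can be derived directly from the example routes. Storing the position of the node inside each route is an assumption of this sketch (it makes the "before target" test cheap); the slides only state that each node maps to the routes containing it.

```python
ROUTES = {
    "p1": ["A", "B", "C", "D", "J"],
    "p2": ["A", "F", "D", "N", "B", "T"],
    "p3": ["N", "L", "M"],
    "p4": ["D", "N", "B", "F", "K"],
    "p5": ["A", "F", "K"],
}

def build_p_index(routes):
    """Inverted index sketch: node -> list of (route id, position in that route)."""
    index = {}
    for rid, nodes in routes.items():
        for pos, node in enumerate(nodes):
            index.setdefault(node, []).append((rid, pos))
    return index

P_INDEX = build_p_index(ROUTES)
for node in "ABCD":
    print(node, P_INDEX[node])
```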

The pfsP algorithm
Find a path from F to T
[Figure: P-Index routes lists of F and T combined by a JOIN; list entries lost in extraction]
Answer: (F, D, N, B, T)
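The termination test suggested by the JOIN can be sketched as a join of two routes lists on the route id. The helper names and the (route id, position) entry layout are assumptions of this example; a full pfsP would run this check for each node it pops during the search.

```python
ROUTES = {
    "p1": ["A", "B", "C", "D", "J"],
    "p2": ["A", "F", "D", "N", "B", "T"],
    "p3": ["N", "L", "M"],
    "p4": ["D", "N", "B", "F", "K"],
    "p5": ["A", "F", "K"],
}

def build_p_index(routes):
    """node -> list of (route id, position), as an in-memory stand-in for the P-Index."""
    index = {}
    for rid, nodes in routes.items():
        for pos, node in enumerate(nodes):
            index.setdefault(node, []).append((rid, pos))
    return index

def join_check(p_index, routes, current, target):
    """Return the route segment current..target if one route holds both, in order."""
    target_pos = {rid: pos for rid, pos in p_index.get(target, [])}
    for rid, pos in p_index.get(current, []):
        if rid in target_pos and pos < target_pos[rid]:
            return routes[rid][pos:target_pos[rid] + 1]
    return None

P_INDEX = build_p_index(ROUTES)
print(join_check(P_INDEX, ROUTES, "F", "T"))  # ['F', 'D', 'N', 'B', 'T']
```

For F and T the join succeeds immediately on route p2, so no successors need to be expanded at all.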

H-graph (I)
Graph representation of collection
– Nodes: routes of the collection
– Edges (p_i, p_j, v): all possible transitions among routes
– Edge label v => shared node, link
Better termination condition => pfsH
– Identify an edge on H-graph
[Figure: example with routes p1 (A, B, C, D, J) and p4 (D, N, B, F, K)]
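One way to picture the H-graph is to compute, for every ordered pair of routes, the set of nodes they share. The sketch below assumes the simplest possible rule (any shared node yields an edge in both directions); the actual definition may prune some links, since the H-Index slide that follows seems to label the p1, p2 pair with only B and D.

```python
from collections import defaultdict

ROUTES = {
    "p1": ["A", "B", "C", "D", "J"],
    "p2": ["A", "F", "D", "N", "B", "T"],
    "p3": ["N", "L", "M"],
    "p4": ["D", "N", "B", "F", "K"],
    "p5": ["A", "F", "K"],
}

def build_h_graph(routes):
    """edges[pi][pj] = set of link nodes shared by routes pi and pj (assumed rule)."""
    node_routes = defaultdict(set)
    for rid, nodes in routes.items():
        for node in nodes:
            node_routes[node].add(rid)

    edges = defaultdict(lambda: defaultdict(set))
    for node, rids in node_routes.items():
        for pi in rids:
            for pj in rids:
                if pi != pj:
                    edges[pi][pj].add(node)
    return edges

H_GRAPH = build_h_graph(ROUTES)
print(sorted(H_GRAPH["p1"]["p4"]))  # ['B', 'D']
```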

H-graph (II)
Find a path from node F to J
p1 (A, B, C, D, J)
p2 (A, F, D, N, B, T)
p3 (N, L, M)
p4 (D, N, B, F, K)
p5 (A, F, K)
Answer: (F, D, J)

H-Index
In practice, the H-Index stores the adjacency lists of the H-graph
[Figure: H-Index table with columns "route" and "edges list" for p1, p2, ... over the example routes p1 to p5; list entries lost in extraction. The final build labels the p1, p2 pair with B, D]

The pfsH algorithm
Find a path from F to J
routes[F] = {p2, p4, p5}, routes[J] = {p1}
[Figure: H-Index edges lists for p2, p4, p5 with a JOIN step; list entries lost in extraction]
Answer: (F, D, J)
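The H-graph termination test can be sketched as follows: look for one transition from a route containing the current node to a route containing the target, through a link that lies at or after the current node on the first route and at or before the target on the second. All names, the adjacency-list layout, and the choice to return the shortest candidate are assumptions of this example, and the edge rule (any shared node) may be broader than the real definition.

```python
ROUTES = {
    "p1": ["A", "B", "C", "D", "J"],
    "p2": ["A", "F", "D", "N", "B", "T"],
    "p3": ["N", "L", "M"],
    "p4": ["D", "N", "B", "F", "K"],
    "p5": ["A", "F", "K"],
}

def build_h_index(routes):
    """Adjacency lists: route id -> list of (neighbour route id, link node)."""
    h_index = {rid: [] for rid in routes}
    for pi, nodes_i in routes.items():
        for pj, nodes_j in routes.items():
            if pi != pj:
                for link in nodes_i:
                    if link in nodes_j:
                        h_index[pi].append((pj, link))
    return h_index

def pfs_h_check(routes, h_index, source, target):
    """Shortest path that uses exactly one route transition, or None."""
    best = None
    for pi, nodes_i in routes.items():
        if source not in nodes_i:
            continue
        s = nodes_i.index(source)
        for pj, link in h_index[pi]:
            nodes_j = routes[pj]
            if target not in nodes_j:
                continue
            l_i, l_j, t = nodes_i.index(link), nodes_j.index(link), nodes_j.index(target)
            if s <= l_i and l_j <= t:
                # Follow pi from source to the link, then pj from the link to target.
                path = nodes_i[s:l_i + 1] + nodes_j[l_j + 1:t + 1]
                if best is None or len(path) < len(best):
                    best = path
    return best

H_INDEX = build_h_index(ROUTES)
print(pfs_h_check(ROUTES, H_INDEX, "F", "J"))  # ['F', 'D', 'J']
```

On the example, the edge from p2 to p1 through D yields the slide's answer without visiting any successor of F.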

Index maintenance
P-Index, H-Index as inverted files on disk
– Updates -> adding new routes
– Do not consider each new route separately
– Batch updates: consider a set of new routes
Basic idea:
– Build memory-resident P-Index, H-Index for new routes
– Merge disk-based indices with memory-resident ones
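The batch-update idea can be illustrated for the P-Index with plain dictionaries. In this sketch the "disk-based" index is simulated by an in-memory dict, and the new route p6 with node Z is a made-up example; a real implementation would stream the on-disk inverted file instead.

```python
def build_p_index(routes):
    """node -> list of (route id, position)."""
    index = {}
    for rid, nodes in routes.items():
        for pos, node in enumerate(nodes):
            index.setdefault(node, []).append((rid, pos))
    return index

def merge_p_index(disk_index, new_routes):
    """Index the whole batch in memory, then merge it in one pass over the node keys."""
    mem_index = build_p_index(new_routes)
    merged = {}
    for node in sorted(disk_index.keys() | mem_index.keys()):
        merged[node] = disk_index.get(node, []) + mem_index.get(node, [])
    return merged

# Simulated existing index (a fragment) and a hypothetical batch with one new route.
DISK_INDEX = {"A": [("p5", 0)], "K": [("p4", 4), ("p5", 2)]}
NEW_ROUTES = {"p6": ["K", "Z"]}

MERGED = merge_p_index(DISK_INDEX, NEW_ROUTES)
print(MERGED)
```

Merging once per batch means each list of the existing index is rewritten at most once, rather than once per uploaded route.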


Setup
Synthetic route collections
– |P|, l_avg, |V|, zipf, U
Compare
– Convert collection to graph, dfs & adjacency lists
– pfsP & P-Index
– pfsH & P-Index, H-Index
Construction cost, query evaluation: vary one of |P|, l_avg, |V|, zipf
Maintenance cost: vary U

Index construction
[Figure: two plots. Plot 1: x-axis |P| (x 10^3), with l_avg = 10, |V| = (value lost), zipf = 0.8. Plot 2: x-axis |V| (x 10^3), with |P| = (value lost), l_avg = 10, zipf = 0.8]

Query evaluation (I)
[Figure: two plots. Plot 1: x-axis |P| (x 10^3), with l_avg = 10, |V| = (value lost), zipf = 0.8. Plot 2: x-axis l_avg, with |P| = (value lost), |V| = (value lost), zipf = 0.8]

Query evaluation (II)
[Figure: two plots. Plot 1: x-axis |V| (x 10^3), with |P| = (value lost), l_avg = 10, zipf = 0.8. Plot 2: x-axis zipf, with |P| = (value lost), l_avg = 10, |V| = (value lost)]

Index maintenance
[Figure: one plot. x-axis U (%), with |P| = (value lost), l_avg = 10, |V| = (value lost), zipf = 0.8]

Conclusions
Reachability queries over frequently updated route collections
The path-first search (pfs) algorithm
– Indexing route collections: P-Index & pfsP
– Indexing route transitions: H-Index & pfsH
Handling frequent updates, adding new routes
Experimental evaluation
– P-Index & pfsP: low construction & maintenance cost
– H-Index, P-Index & pfsH: fast query evaluation

Further work
Ongoing
– New index that combines P-Index & H-Index advantages
  Low construction and maintenance cost
  Fast query evaluation
Future work
– Other types of queries
  Considering constraints

Thank you!