By: Sibo Wang, Xiaokui Xiao, Yin Yang, Wenqing Lin

Slides:



Advertisements
Similar presentations
Chapter 4 Partition I. Covering and Dominating.
Advertisements

Lecture 24 MAS 714 Hartmut Klauck
Lecture 5 Graph Theory. Graphs Graphs are the most useful model with computer science such as logical design, formal languages, communication network,
1 EE5900 Advanced Embedded System For Smart Infrastructure Static Scheduling.
Graphs Chapter 20 Data Structures and Problem Solving with C++: Walls and Mirrors, Carrano and Henry, © 2013.
GOLOMB RULERS AND GRACEFUL GRAPHS
Combinatorial Algorithms
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Computability and Complexity 23-1 Computability and Complexity Andrei Bulatov Search and Optimization.
Approximation Algorithms: Combinatorial Approaches Lecture 13: March 2.
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Branch and Bound Similar to backtracking in generating a search tree and looking for one or more solutions Different in that the “objective” is constrained.
3 -1 Chapter 3 The Greedy Method 3 -2 The greedy method Suppose that a problem can be solved by a sequence of decisions. The greedy method has that each.
Chapter 9 Graph algorithms Lec 21 Dec 1, Sample Graph Problems Path problems. Connectedness problems. Spanning tree problems.
Tirgul 13. Unweighted Graphs Wishful Thinking – you decide to go to work on your sun-tan in ‘ Hatzuk ’ beach in Tel-Aviv. Therefore, you take your swimming.
Graphs Chapter 20 Data Structures and Problem Solving with C++: Walls and Mirrors, Frank Carrano, © 2012.
1 Shortest Path Calculations in Graphs Prof. S. M. Lee Department of Computer Science.
V. V. Vazirani. Approximation Algorithms Chapters 3 & 22
Copyright © Cengage Learning. All rights reserved.
© The McGraw-Hill Companies, Inc., Chapter 3 The Greedy Method.
Design Techniques for Approximation Algorithms and Approximation Classes.
MCS 312: NP Completeness and Approximation algorithms Instructor Neelima Gupta
COSC 2007 Data Structures II Chapter 14 Graphs III.
May 1, 2002Applied Discrete Mathematics Week 13: Graphs and Trees 1News CSEMS Scholarships for CS and Math students (US citizens only) $3,125 per year.
UNC Chapel Hill Lin/Foskey/Manocha Minimum Spanning Trees Problem: Connect a set of nodes by a network of minimal total length Some applications: –Communication.
Approximation Algorithms Department of Mathematics and Computer Science Drexel University.
A Clustering Algorithm based on Graph Connectivity Balakrishna Thiagarajan Computer Science and Engineering State University of New York at Buffalo.
Suppose G = (V, E) is a directed network. Each edge (i,j) in E has an associated ‘length’ c ij (cost, time, distance, …). Determine a path of shortest.
Network Partition –Finding modules of the network. Graph Clustering –Partition graphs according to the connectivity. –Nodes within a cluster is highly.
NOTE: To change the image on this slide, select the picture and delete it. Then click the Pictures icon in the placeholder to insert your own image. Fast.
Graphs. Representations of graphs : undirected graph An undirected graph G have five vertices and seven edges An adjacency-list representation of G The.
The Theory of NP-Completeness
More NP-Complete and NP-hard Problems
CS 201 Compiler Construction
Applied Discrete Mathematics Week 13: Graphs
Greedy Technique.
FORA: Simple and Effective Approximate Single­-Source Personalized PageRank Sibo Wang, Renchi Yang, Xiaokui Xiao, Zhewei Wei, Yin Yang School of Information.
Probabilistic Data Management
CS 201 Compiler Construction
First-Cut Techniques Lecturer: Jing Liu
Peer-to-Peer and Social Networks
CS223 Advanced Data Structures and Algorithms
1.3 Modeling with exponentially many constr.
ICS 353: Design and Analysis of Algorithms
Graph.
Randomized Algorithms CS648
Graphs Chapter 13.
CSE 373: Data Structures and Algorithms
Graphs.
Graph Operations And Representation
Discrete Math II Howon Kim
Applied Combinatorics, 4th Ed. Alan Tucker
András Sebő and Anke van Zuylen
Diversified Top-k Subgraph Querying in a Large Graph
Algorithms for Budget-Constrained Survivable Topology Design
1.3 Modeling with exponentially many constr.
Algorithms (2IL15) – Lecture 7
Chapter 6 Network Flow Models.
CSE 373: Data Structures and Algorithms
The Full Steiner tree problem Part Two
The Theory of NP-Completeness
CS154, Lecture 16: More NP-Complete Problems; PCPs
On the Graph Decomposition
Warm Up – Tuesday Find the critical times for each vertex.
The Complexity of Approximation
Agenda Review Lecture Content: Shortest Path Algorithm
Chapter 9 Graph algorithms
Constructing a m-connected k-Dominating Set in Unit Disc Graphs
Minimum Spanning Trees
Presentation transcript:

By: Sibo Wang, Xiaokui Xiao, Yin Yang, Wenqing Lin Effective Indexing for Approximate Constrained Shortest Path Queries on Large Road Networks By: Sibo Wang, Xiaokui Xiao, Yin Yang, Wenqing Lin

Motivation Constrained Shortest Path(CSP) query: Constraint?-+ finds the shortest path from an origin to destination based on the given constraint Constraint?-+ Time Distance Toll-Payment Safety What about+ the Large Networks? Solutions available. But: Mostly approximate or Prohibitively Expensive Expensive ,Why? Fail to utilize the special properties of road networks Most of them process queries without INDEXES

Formal Definitions α-CSP QUERY: Given an origin s, a destination t, a cost constraint θ, and an approximation ratio α, an α-CSP QUERY returns a path P, such that c(P) ≤ θ, and , and ℓ(P) ≤ α.ℓ(Popt), where Popt is the optimal answer to the exact CSP query with origin s, destination t, and cost constraint θ. Path v1,v3,v5 = CSP Path v1,v2,v5= α-CSP

(α-DOMINANCE): Let P1 and P2 be two paths connecting the same origin and destination vertices. P1 α-dominates P2 iff c(P1) ≤ c(P2) and ℓ(P1) ≤ α · ℓ(P2). For instance, consider paths P1 = {(v1, v2), (v2, v5)} and P2 = {(v1, v3), (v3,v5)} in the above example. When α = 1.2, P1 α-dominates P2, since the cost of P1 is c(P1) = 5 ≤ c(P2) = 6,and the length of P1 is ℓ(P1) = 6 ≤ α · ℓ(P2) = 1.2 × 5 = 6. P1 P2

Skyline set: a set S of paths is called a skyline set, iff no path in S is α-dominated by another in the same set S. We say that a path P is a skyline path if P is in a skyline set. Note that if two paths P1 and P2 have the same cost, it is possible that they α- dominate each other, in which case we put the path with smaller length in a skyline set. For exact CSP, we define dominance relationship and skyline in the same way, by simply fixing α = 1.

Problem Statement Available solutions for CSP on larger networks. Exact CSP without Index α -CSP without Index Exact CSP with Index α -CSP with index But all of the available solutions are either EXPENSIVE or IMPRACTICAL.

Proposed Solution COLA (COnstrained LAbeling) What is COLA? index-based approximate CSP processing for large road networks Mainly exploits two important properties of road network: real road networks are often(roughly) planar, thus, can be effectively split into partitions relatively small number of landmark vertices in the road network that commonly appear in CSP results. How does it work? COLA PARTITIONS the road network and constructs an OVERLAY GRAPH on top of the partitions. The INDEX STRUCTURE is then built on the overlay graph, whose size is much SMALLER than the original graph Then, α-CSP query processing is done with a pair of origin and destination vertices s and t ϵ G and a cost constraint θ, using overlay graph and the COLA index to get the Shortest Path

OVERLAY GRAPH : Given an input graph G, a partitioning T of G, and the query approximation ratio α, a graph Go = ( Vo, Eo) is an overlay graph of G with respect to T, if it satisfies the following three conditions : Vo consists of all boundary vertices with respect to T; For each edge eo ε Eo that starts at a vertex v and ends at vertex v’, there exists a path P in G that goes from v to v’ such that c(P) = c( eo ) and l(P)= l( eo ); For any pair of origin and destination vertices s, t ε Go, and any path P in G from s to t, there exists a path Po in G​o that goes from s to t that α-dominates P, i.e. , c(Po) ) ≤ c(P) and l(Po) ) ≤ α.l(P). How does it work? It compresses the input graph by: Including only the boundary vertices of the partitions and removing all other vertices Using edges to represent paths in G Reducing the number of edges by removing paths that are α-dominated by others

COLA-Index What is a COLA-index? Definition: Given an overlay graph Go, a COLA index contains label sets Bin (vo) and Bout(vo) for each vertex vo E G satisfying the following conditions: Each entry in Bin (vo) corresponds to a path from another vertex in Go to vo. Each entry in Bout (vo) corresponds to a path from vo to another vertex in Go. For any path P between any two vertices so, to, ε Vo ( P may contain vertices in V/Vo) , the COLA index contains both an out-label in Bout(so) with cost co and length lo and an in-label in Bin(to) with cost ci and length li such that co+ci ≤ c(P) and lo+li ≤ α.l(P) Example:

Let a =1.1 and P1o =(v2,v3), P2o=(v3,v5) and P3o=(v2,v5). Let P1’ , P2’ and P3’ be three trivial paths from v2 to v2, v3 to v3 and v5 to v5, with zero cost/length, respectively. Bout(v2) ={ P1o,P1’} , Bin(v2) = {P1’} Bout(v3) ={ P2’} , Bin(v3) = {P2’} Bout(v5) ={ P3’} , Bin(v5) = {P2o,P3o,P3’} To satisfy condition 3: Five paths concerning nodes in Go: P1=((v2,v4),(v4,v5)), P2=(v2,v3), P3=(v3,v5), P4=(v2,v5), P5=((v2,v3),(v3,v5)) P1 has cost(c1) = 6 and length(l1) = 5 But P1o in Bout(v2) and P2o in Bin(v5) has combined cost less than path P1 i.e. c(P1o) + c(P2o) ≤ c(P1) and l(P1o)+l(P2o) ≤ α .l(p1) P3o P1o P2o

Query Processing The main idea of COLA query processing is to build Bout(s) and Bin(t) during query time, using the COLA index as well as subgraphs Gs,Gt ∈ T containing s and t, respectively. Bout(s) and Bin(t) must satisfy that the α-CSP result can be obtained by concatenating a path from Bout(s) and another from Bin(t). The main challenge thus lies in the computation of Bout(s) and Bin(t).

Given an α-CSP query from v1 to v5 with cost threshold θ = 7 and α=1.1 Check v1 and v5 are boundary vertices? Since v5 is a boundary vertex, Bin(v5) = {P◦2 , P◦3 , P′3 } from COLA-index. But v1 is not a boundary vertex, its out-label set Bout(v1) needs to be computed on the fly COLA initiates a Dijkstra-like search from v1 to other vertices in v1’s subgraph it retrieves the skyline set from v1 to v2, which contains only P1 = (v1, v2) Then, it joins the skyline set with Bout(v2), and adds the joined paths into Bout(v1) if they are not α-dominated by any path in Bout(v1) After that, we have Bout(v1) = {P1, P1 · P◦1 }, where P1 · P◦1 denotes the concatenation of P1 and P◦1 . COLA retrieves the skyline set from v1 to v3, and joins paths in the skyline set with Bout(v3). These paths are P2 = (v1, v3) and P3 = [(v1, v2), (v2, v3)]. P3(identical to P1·P◦1) is not added to Bout(v1), which ends up with Bout(v1) = {P1, P1・ P◦1 , P2}. By joining Bout(v1) and Bin(v5), COLA retrieves a skyline set for all paths from v1 to v5 i.e. P=[(v1,v2),(v2,v3),(v3,v5)]

Optimizations Observation 1 First optimization of the join between Bout(s) and Bin(t), : If c(Po)+c(Pi) > θ, then joining Po (resp. Pi) with any path in Bin(t) (resp. Bout(s)) with cost higher than Pi (resp. Po) cannot lead to an α-CSP result; If c(Po)+c(Pi) ≤ θ , then joining Po (resp. Pi) with any path in Bin(t) (resp. Bout(s)) with length longer than Pi (resp. Po) can be discarded Cost constraint(θ)=13 Bout(s) Path End Vertex Cost Length Po1 V1 4 7 Po2 V2 Po3 V5 10 Po4 V4 12 6 Bin(t) Path Start Vertex Cost Length Pi1 V5 7 5 Pi2 V4 6 Pi3 V2 12 Pi4 V6 4 13 LabelJoin(P*)=Po2.Pi3

OBSERVATION 2. Let P be a path from s to w◦ from the join result of skyline set L(v◦) and Bout(v◦). P cannot possibly lead to an α-CSP result, if any of the following is true. w◦ cannot possibly reach t on the input graph G The minimum cost for any path from w◦ to t exceeds θ−c(P); There exists a path P′ from s to t satisfying the cost constrain such that the minimum length of any path from w◦ to t exceeds ℓ(P′)/α − ℓ(P). According to the above observation, before performing any join operation, we first select a set of end vertices W for the join results that can possibly lead to an α-CSP result, which is incrementally pruned using Observation 2. Then, we filter the COLA Index, and use only the labels that reach a vertex in W.

α-DIJK Index free solution used for: Computing skyline paths Intra-partition search during query processing in COLA For building the COLA index As a standalone index free solution for α-CSP CP-Dijk α-Dijk Distributes the “budget” for pruning equally to each vertex on the path In case of many vertex on the path, each receives only little pruning power Distributes more pruning power to “Landmark vertexes” and lesser to other Results in reduction of path to be examined and acceleration of query