Distance Indexing on Road Networks A summary Andrew Chiang CS 4440.

Introduction. Geodatabases store geographic data that can be represented on a map. Roads can be stored in a geodatabase or spatial database as polylines. At the very base of MapQuest and Google Maps/Earth is a road network.

Road Networks. A road network is a network of roads represented by polylines. A point/vertex is placed at each intersection of two roads. Each segment between two vertices on the road network has properties used in calculations (length of the segment, time for traveling the segment, etc.).

Road Networks vs. Normal Space. Normal Euclidean space doesn't have paths between points, just empty space. With road networks, we connect certain points using edges (roads). Roads can be given weights (distance, time) that factor into optimization algorithms.

Location-Based Services Using Road Networks. Location-based services use continuous NN and kNN queries to provide users with information. Shortest path algorithms (commonly Dijkstra's Algorithm) are used to find the distance between two points on the network. We can find shortest paths on the fly, or precompute and store distances and paths in a table.
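The on-the-fly option can be illustrated with a minimal sketch of Dijkstra's Algorithm. The graph `roads`, its node names, and its edge weights are hypothetical and are not taken from the slides; edge weights stand for segment lengths in miles.

```python
import heapq

def dijkstra(graph, source):
    """Return the shortest network distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy road network (undirected, so each edge is listed both ways).
roads = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"A": 1.0, "C": 2.0, "D": 5.0},
    "C": {"A": 4.0, "B": 2.0, "D": 1.0},
    "D": {"B": 5.0, "C": 1.0},
}

print(dijkstra(roads, "A"))
```

Storing the returned dictionary for every node is the "precompute" alternative; it trades query time for storage.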

Drawbacks of Current Practices. Dijkstra's Algorithm is all fine and dandy for short distances, but for longer distances it is very inefficient. We don't want to have to calculate long distances continuously (terribly inefficient!). So what do we do?

Distance Signature. To make queries more efficient, one can use the proposed "distance signature". Instead of storing specific distances to objects, we store an approximate distance (a distance range). We create a signature for each node in the network.

What’s in a Distance Signature? (1) The approximate distance between that node and each other object of interest in the network. (2) The index of the node to go to when traversing the shortest path from this node to the destination node.

Some Notation. In a road network N, each node n has a distance signature S(n). S(n) is composed of components S(n)[0…i], where S(n)[i] contains the approximate distance range between node n and node i. In addition, we store a backtracking link S(n)[i].link, which gives the index, in the adjacency matrix of n, of the node to hop to when following the shortest path from n to i.
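As a rough illustration of this notation, the sketch below builds S(n) for one node by running a shortest-path search from n and recording, for each object, a distance category and the first hop on the shortest path. The function names, the graph, and the choice to store the neighbor's name rather than an index into the adjacency matrix are assumptions made for readability, not the slides' implementation. The three edges out of p6 follow the example slide; all other edges are hypothetical.

```python
import heapq

def categorize(d, bounds):
    """Return the index of the first category whose upper bound exceeds d."""
    for i, b in enumerate(bounds):
        if d < b:
            return i
    return len(bounds)  # last, open-ended category

def build_signature(graph, n, objects, bounds):
    """Map each object to (distance category, first hop from n)."""
    dist = {n: 0.0}
    first_hop = {n: n}
    heap = [(0.0, n)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                # A neighbor of n is its own first hop; otherwise inherit it.
                first_hop[v] = v if u == n else first_hop[u]
                heapq.heappush(heap, (nd, v))
    return {o: (categorize(dist[o], bounds), first_hop[o]) for o in objects}

# Edges out of p6 follow the example slide; the rest are hypothetical.
roads = {
    "p6": {"p4": 0.9, "p5": 1.6, "p7": 0.5},
    "p4": {"p6": 0.9, "p3": 1.5},
    "p5": {"p6": 1.6},
    "p7": {"p6": 0.5},
    "p3": {"p4": 1.5, "p2": 1.0},
    "p2": {"p3": 1.0},
}
BOUNDS = [1.0, 2.0, 3.0]  # categories 0: <1, 1: [1,2), 2: [2,3), 3: >=3 (miles)

sig = build_signature(roads, "p6", ["p2", "p3", "p4", "p5", "p7"], BOUNDS)
print(sig)
```

Each entry needs only a few bits for the category plus a small link value, which is where the storage savings over exact distances come from.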

Example of a Distance Signature (units in miles). Distance categories: 0: D < 1 mi; 1: 1 mi <= D < 2 mi; 2: 2 mi <= D < 3 mi; 3: D >= 3 mi. Adjacency matrix for P6: P4 0.9, P5 1.6, P7 0.5. [Table: S(p6) and S(p6).link listed per object p1, p2, p3, ...; the cell values are not preserved in this transcript.]

Operations on S(n). Find the approximate and exact distance between two nodes in the network. Exact distance computation uses the backtracking link values to follow the shortest path from A to B. Approximate distance comparison answers: about how far away are points A and B from N?
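Exact distance computation by backtracking can be sketched as follows, assuming every node already stores its next-hop link toward each destination. The table `link` and the graph `roads` are hypothetical, hand-filled values used only for illustration.

```python
def exact_distance(graph, link, src, dst):
    """Follow stored next-hop links from src to dst, summing edge weights."""
    total, node = 0.0, src
    while node != dst:
        nxt = link[node][dst]       # next node on the shortest path to dst
        total += graph[node][nxt]   # weight of the road segment just traversed
        node = nxt
    return total

# Hypothetical toy network and precomputed links toward destination "D".
roads = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"A": 1.0, "C": 2.0, "D": 5.0},
    "C": {"A": 4.0, "B": 2.0, "D": 1.0},
    "D": {"B": 5.0, "C": 1.0},
}
link = {
    "A": {"D": "B"},
    "B": {"D": "C"},
    "C": {"D": "D"},
}

print(exact_distance(roads, link, "A", "D"))
```

The walk touches only the nodes on the shortest path, so no graph search is needed at query time.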

More Operations on S(n). Distance sorting (ordering of features from closest to farthest or vice versa, kNN queries).

Using S(n) for Range Queries. For range queries, we use distance categories to include or exclude features quickly. If a category is entirely within the query range, we automatically include all features in the category. If a category is entirely outside the query range, we automatically exclude all features in the category. If a category includes the query range distance, we must do distance calculations.
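The three cases above can be sketched as a small filter-then-refine routine. The signature contents, the exact distances, and the helper names are hypothetical; `exact` stands in for whatever exact distance computation the system provides.

```python
import math

def range_query(signature, bounds, r, exact):
    """Return objects within network distance r.

    signature maps object -> distance category; exact(obj) gives the
    exact distance and is called only for the boundary category.
    """
    edges = [0.0] + list(bounds) + [math.inf]
    hits = []
    for obj, cat in signature.items():
        lo, hi = edges[cat], edges[cat + 1]
        if hi <= r:
            hits.append(obj)        # category entirely inside the range
        elif lo > r:
            continue                # category entirely outside the range
        elif exact(obj) <= r:
            hits.append(obj)        # category straddles r: refine
    return hits

BOUNDS = [1.0, 2.0, 3.0]  # categories 0: <1, 1: [1,2), 2: [2,3), 3: >=3
signature = {"p7": 0, "p4": 0, "p5": 1, "p3": 2, "p2": 3}
true_dist = {"p7": 0.5, "p4": 0.9, "p5": 1.6, "p3": 2.4, "p2": 3.4}
refined = []  # records which objects needed an exact distance

def exact(obj):
    refined.append(obj)
    return true_dist[obj]

hits = range_query(signature, BOUNDS, 1.8, exact)
print(sorted(hits), refined)
```

In this toy run only the object in the category that straddles the query distance requires an exact computation.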

Using S(n) for kNN Queries. Find the number of features in each distance category. Keep only the categories that will cover the closest k features. Do a distance sort on the features in the categories kept. Keep only the top k features.
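These steps can be sketched as below. The data and function names are hypothetical, and sorting the kept candidates by exact distance is a simplification of the distance sorting operation described in the slides.

```python
def knn_query(signature, k, exact):
    """Return the k nearest objects using categories to prune candidates."""
    # Group objects by distance category.
    by_cat = {}
    for obj, cat in signature.items():
        by_cat.setdefault(cat, []).append(obj)
    # Keep the nearest categories until they cover at least k objects.
    candidates = []
    for cat in sorted(by_cat):
        candidates.extend(by_cat[cat])
        if len(candidates) >= k:
            break
    # Distance sort on the kept candidates only, then take the top k.
    candidates.sort(key=exact)
    return candidates[:k]

signature = {"p7": 0, "p4": 0, "p5": 1, "p3": 2, "p2": 3}
true_dist = {"p7": 0.5, "p4": 0.9, "p5": 1.6, "p3": 2.4, "p2": 3.4}

print(knn_query(signature, 3, true_dist.__getitem__))
```

Objects in the farther categories are never sorted or refined, which is the pruning effect the slide describes.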

Notice anything? Some operations return approximate distances, others exact distances. By using the distance signature, we are able to trim down a set of features into a smaller set. This way, we can perform more specific operations on fewer features, rather than on every feature in the network.

Other Cool Features of S(n). S(n) can be compressed, mainly in the backtracking link: nodes that share the same link from n, and the commutative property of S(n) (adding two signatures together). Updates to S(n) are easy when a road on the network is changed.

Optimization. For best performance, we want to make just the right number of distance categories for a signature. Things to think about: density of distance data points; query load (how many operations will we need to perform a query?); storage space (bits used for storing the signature for each node in the network).

Optimization (ctd.). Since most range and kNN queries are local to the user's location, we determine our distance categories exponentially. Distance ranges are represented as T, cT, c^2 T, …, where c and T are constants.
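A minimal sketch of exponential categorization, assuming T, cT, c^2 T, … are the upper bounds of successive categories and that c > 1 and T > 0. The function name and the sample values of T and c are hypothetical.

```python
def exp_category(d, T, c):
    """Return the exponential distance category of d.

    Category 0 covers [0, T), category 1 covers [T, cT),
    category 2 covers [cT, c^2 T), and so on.
    """
    cat, bound = 0, T
    while d >= bound:
        bound *= c
        cat += 1
    return cat

# Hypothetical parameters: T = 1 mile, c = 2.
for d in (0.5, 1.0, 3.0, 4.0):
    print(d, exp_category(d, T=1.0, c=2.0))
```

Nearby distances fall into narrow categories and far distances into wide ones, which matches the observation that most queries are local.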

Optimization (ctd.). After some really ugly math, we determine that the optimal values are c = e and T = √(SP / e), where SP is the distance of a typical range query that will be performed on this system. This is usually defined by the creator of the system. For a full derivation, refer to the paper.
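To make the formula concrete, the sketch below evaluates c = e and T = √(SP / e) for a hypothetical SP of 10 and lists the first few category boundaries T, cT, c^2 T. The function name and the value of SP are assumptions chosen for illustration.

```python
import math

def optimal_parameters(sp):
    """Return (c, T) with c = e and T = sqrt(SP / e), as stated on the slide."""
    c = math.e
    T = math.sqrt(sp / math.e)
    return c, T

c, T = optimal_parameters(10.0)          # hypothetical typical query distance
boundaries = [T * c**i for i in range(3)]  # T, cT, c^2 T
print(round(T, 2), [round(b, 2) for b in boundaries])
```

With SP = 10, T comes out at roughly 1.92, so the first category is already narrower than a typical query distance.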

A Look at Performance. For purposes of performance comparison, we compare the distance signature against full indexing (storing the hard distances) and NVD (Network Voronoi Diagram), a commonly used kNN query algorithm.

A Look at Performance (ctd.). The signature has a consistently smaller index size than full indexing. The signature's disk size is nearly 10% that of full indexing.

A Look at Performance (ctd.). For range queries, distance affects the performance of the signature, but it still outperforms NVD. When the threshold for the query is low, the signature is as good as full indexing.

A Look at Performance (ctd.). For kNN queries with a higher k value, the signature outperforms NVD. The signature's performance doesn't increase linearly as k increases.

Performance Summary. Although full indexing still provides faster query processing time, the disk space used by the distance signature is far less. The distance signature performs kNN queries faster than a proven indexing method for kNN queries. Overall performance on all aspects is still reasonable for use on both range and kNN queries.

Summary. The distance signature is a new indexing method optimized for road networks that can efficiently perform both range and kNN queries. Distances are categorized into exponential ranges, and operations use a general-to-specific approach. The signature itself is smaller in size and is compressible.