Progressive Computation of The Min-Dist Optimal-Location Query Donghui Zhang, Yang Du, Tian Xia, Yufei Tao* Northeastern University * Chinese University.


Progressive Computation of the Min-Dist Optimal-Location Query
Donghui Zhang, Yang Du, Tian Xia, Yufei Tao*
Northeastern University; * Chinese University of Hong Kong
VLDB ’06, Seoul, Korea

Donghui Zhang et al. Optimal Location Query 2
Motivation
“What is the optimal location in the Boston area to build a new McDonald’s store?”
Suppose each customer drives to the closest McDonald’s.
Optimality: minimize the average (AVG) driving distance.

Who will be interested?
Corporations
– Chain restaurants (e.g. McDonald’s, Burger King, Starbucks)
– Supermarkets (e.g. Wal-Mart, Costco, Stop & Shop)
– Location-based service providers (e.g. Verizon, AT&T)
Computer scientists, especially in
– Databases
– Computational geometry
– Algorithms

min-dist OL
[Figure: four objects with their distances to the nearest site; the individual distances are not preserved.]
Without any new site: AD = ( … )/4 = 400.
With new site l1: AD(l1) = ( … )/4 = 315.
With new site l2: AD(l2) = ( … )/4 = … (value not preserved).

Formal Definition
Given a set S of sites, a set O of objects, and a query range Q, the min-dist OL is a location l ∈ Q which minimizes
AD(l) = (1/|O|) Σ_{o ∈ O} dNN(o, S ∪ {l}),
where dNN(o, S ∪ {l}) is the distance between o and its nearest site, with l counted as a site.
“Solution”: compute all AD(l). But …

Challenges
1. There are infinitely many locations in Q! How to produce a finite set of candidates while preserving optimality?
2. How to avoid computing AD(l) for all candidates?

Solution Highlights
1. Algorithm to compute AD(l).
2. Theorems to limit #candidates.
3. Lower bound of AD(l) for all locations l in a cell C.
4. Progressive algorithm.

L1 Distance
d(o, s) = |o.x – s.x| + |o.y – s.y|
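The definition of AD(l) under the L1 distance can be evaluated directly on a toy instance. The following is a minimal sketch, not the authors' implementation; the function names and the sample coordinates are hypothetical and chosen only for illustration.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two points given as (x, y) tuples."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def average_distance(objects, sites):
    """AD: average distance from each object to its nearest site."""
    return sum(min(l1(o, s) for s in sites) for o in objects) / len(objects)


def average_distance_with(objects, sites, l):
    """AD(l): the same average after adding the new site l to the sites."""
    return average_distance(objects, sites + [l])


# Hypothetical toy data: four objects and one existing site.
objects = [(0, 0), (4, 0), (0, 6), (10, 10)]
sites = [(2, 0)]
print(average_distance(objects, sites))               # AD without a new site
print(average_distance_with(objects, sites, (0, 6)))  # AD(l) for l = (0, 6)
```

This brute-force form scans every site for every object on each call, which is what the later slides set out to avoid.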

Compute AD(l)
Remember: AD(l) = (1/|O|) Σ_{o ∈ O} dNN(o, S ∪ {l}).
Define: AD = (1/|O|) Σ_{o ∈ O} dNN(o, S), the average distance without any new site.
Let RNN(l) be the objects “attracted” by l, i.e. the objects closer to l than to any site in S.
– If RNN(l) = ∅, then AD(l) = AD.
– If RNN(l) ≠ ∅, e.g. RNN(l) = {o7, o8}, then AD(l) < AD.
In general, AD(l) = AD minus the savings of the customers in RNN(l), averaged over all objects in O.

Compute AD(l)
Theorem: AD(l) = AD − (1/|O|) Σ_{o ∈ RNN(l)} (dNN(o, S) − d(o, l)).
S and O are “static” versus l.
– AD can be pre-computed.
– So can dNN(o, S).
To compute AD(l):
– Find RNN(l).
– ∀o ∈ RNN(l), compute d(o, l).
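The steps above can be sketched as follows, assuming the savings form AD(l) = AD − (1/|O|) Σ_{o ∈ RNN(l)} (dNN(o, S) − d(o, l)), which follows from the definition of AD(l). The helper names and sample data are hypothetical, and RNN(l) is found here by a plain scan rather than by a spatial index.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def precompute(objects, sites):
    """Pre-compute dNN(o, S) for every object, and AD, once for all locations l."""
    dnn = [min(l1(o, s) for s in sites) for o in objects]
    return dnn, sum(dnn) / len(objects)


def rnn(objects, dnn, l):
    """Indices of objects strictly closer to l than to their nearest existing site."""
    return [i for i, o in enumerate(objects) if l1(o, l) < dnn[i]]


def ad_with_new_site(objects, dnn, ad, l):
    """AD(l) = AD minus the total savings of RNN(l) divided by |O|."""
    savings = sum(dnn[i] - l1(objects[i], l) for i in rnn(objects, dnn, l))
    return ad - savings / len(objects)


# Hypothetical toy data.
objects = [(0, 0), (4, 0), (0, 6), (10, 10)]
sites = [(2, 0)]
dnn, ad = precompute(objects, sites)
print(rnn(objects, dnn, (0, 6)))
print(ad_with_new_site(objects, dnn, ad, (0, 6)))
```

Only the objects in RNN(l) are touched per location, so the cost of evaluating a candidate depends on how many objects it attracts rather than on the number of sites.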

How to compute RNN(l)?
This is an implementation detail, dealing with computational geometry and spatial databases.
Naïve solution: ∀o ∈ O, compare with all sites and l.
More efficient:
1. Compute the Voronoi cell of l.
2. Retrieve the objects inside the Voronoi cell using a range search on an R-tree.

How to compute RNN(l)? (1) Compute Voronoi cell
Remember: RNN(l) is the set of objects closer to l than to any existing site in S.
Consider all sites. Draw the spatial region that is closer to l than to any site.

How to compute RNN(l)? (2) Retrieve objects
Standard range search, using any spatial access method, e.g. an R-tree.

Range query: find the objects in a given range, e.g. find all hotels in Boston.
No index: scan through all objects. NOT EFFICIENT!
[Figure: points a to m in the x/y plane, indexed by an R-tree. The root holds entries E1 and E2; E1 covers leaf nodes E3 (a, b, c), E4 (d, e) and E5 (f, g, h); E2 covers leaf nodes E6 (i, j, k) and E7 (l, m).]
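A range search over a tree of minimum bounding rectangles (MBRs) can be sketched as below. This is a hand-built, in-memory illustration of the pruning idea only, not a real R-tree (no insertion, splitting or paging); the `Node` class, the helper names and all coordinates are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    mbr: tuple                                     # (xlo, ylo, xhi, yhi)
    children: list = field(default_factory=list)   # child nodes (internal node)
    points: dict = field(default_factory=dict)     # name -> (x, y) (leaf node)


def leaf(points):
    """Build a leaf whose MBR tightly encloses its points."""
    xs = [p[0] for p in points.values()]
    ys = [p[1] for p in points.values()]
    return Node((min(xs), min(ys), max(xs), max(ys)), points=points)


def internal(children):
    """Build an internal node whose MBR encloses all child MBRs."""
    return Node((min(c.mbr[0] for c in children), min(c.mbr[1] for c in children),
                 max(c.mbr[2] for c in children), max(c.mbr[3] for c in children)),
                children=children)


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(r, p):
    return r[0] <= p[0] <= r[2] and r[1] <= p[1] <= r[3]


def range_search(node, q):
    """Return names of points inside q, skipping subtrees whose MBR misses q."""
    if not intersects(node.mbr, q):
        return []
    hits = [name for name, p in node.points.items() if contains(q, p)]
    for child in node.children:
        hits += range_search(child, q)
    return hits


# Hypothetical coordinates for a few of the points in the figure.
e3 = leaf({"a": (1, 1), "b": (2, 3), "c": (3, 2)})
e4 = leaf({"d": (5, 1), "e": (6, 2)})
e6 = leaf({"i": (1, 8), "j": (2, 9), "k": (3, 7)})
root = internal([internal([e3, e4]), internal([e6])])
print(sorted(range_search(root, (2, 1, 5, 3))))
```

In this example the subtree holding i, j and k is never opened, because its MBR does not intersect the query range.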

Limit #candidates
Theorem: within the X/Y range of Q, draw grid lines crossing objects. Only the intersections need to be considered!
In the example: 5×6 = 30 candidates.

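Limit #candidates
Proof idea: suppose the OL is not at an intersection; then moving it will produce a better (or equal) result.
Consider RNN(l). Until l reaches a grid line, moving it by δ toward the side holding at least as many objects of RNN(l) changes each object's distance by ±δ and so does not increase the total distance. In the example, moving l to the right by δ saves total distance.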
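The candidate set described by the theorem can be enumerated as in the sketch below. It assumes Q is an axis-aligned rectangle given as (xlo, ylo, xhi, yhi); the function name and the sample data are hypothetical.

```python
from itertools import product


def candidates(objects, q):
    """Intersections of the vertical/horizontal lines through objects,
    restricted to the X range and Y range of the query rectangle q."""
    xlo, ylo, xhi, yhi = q
    # An object contributes a vertical line if its x lies in Q's X range,
    # and a horizontal line if its y lies in Q's Y range (independently).
    xs = sorted({x for x, _ in objects if xlo <= x <= xhi})
    ys = sorted({y for _, y in objects if ylo <= y <= yhi})
    return list(product(xs, ys))


# Hypothetical toy data: note that (3, 9) lies outside Q but still
# contributes the vertical line x = 3.
objects = [(1, 1), (2, 5), (6, 2), (7, 7), (3, 9)]
q = (2, 1, 7, 6)
print(len(candidates(objects, q)))
```

The number of candidates is the product of the number of distinct vertical and horizontal lines, which matches the way the slides count 5×6 = 30.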

VCU(Q)
A spatial region enclosing the objects closer to Q than to the sites in S. It is the Voronoi cell of Q versus the sites in S.

Further Limit #candidates
Only consider objects in VCU(Q).
In the example, the candidates drop from 5×6 = 30 to 4×4 = 16.
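Filtering objects by VCU(Q) can be sketched with a point-to-rectangle distance test, as below. This assumes the L1 distance, an axis-aligned Q given as (xlo, ylo, xhi, yhi), and a strict "closer than" comparison; the names and data are hypothetical, and the sketch tests each object directly instead of constructing the region geometrically.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mindist_l1(p, q):
    """Smallest L1 distance from point p to any location in rectangle q."""
    xlo, ylo, xhi, yhi = q
    dx = max(xlo - p[0], 0, p[0] - xhi)
    dy = max(ylo - p[1], 0, p[1] - yhi)
    return dx + dy


def objects_in_vcu(objects, sites, q):
    """Objects that some location in q could attract away from their nearest site."""
    return [o for o in objects
            if mindist_l1(o, q) < min(l1(o, s) for s in sites)]


# Hypothetical toy data.
objects = [(5, 5), (1, 1), (10, 5), (2, 2)]
sites = [(0, 0)]
q = (4, 4, 6, 6)
print(objects_in_vcu(objects, sites, q))
```

Objects that fail the test cannot be in RNN(l) for any l in Q, so they contribute neither grid lines nor savings.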

Naïve Algorithm
Derive the candidates. Compute AD(l) for each. Pick the smallest.
Not efficient! There are too many candidates, and computing AD(l) for each one requires:
– computing RNN(l);
– retrieving all these objects.

Progressive Idea
– Treat Q as a cell and consider its corners.
– Divide the cell.
– Recursively divide a sub-cell.
– In this way, all candidates can eventually be checked.
Q: What do you save?
A: Cell pruning. A cell C can be pruned if its lower bound is ≥ AD(l0) for some candidate l0. Example: AD(l0) = 50, and 60 is a lower bound for AD(l), ∀l ∈ C; then no location in C can beat l0.

LB(C): lower bound for AD(l), ∀l ∈ C
[Figure: a cell C with corners c1 (top left), c2 (top right), c3 (bottom left), c4 (bottom right), where AD(c1) = 1000, AD(c2) = 3000, AD(c3) = 4000, AD(c4) = 2500.]
Theorem: max{ (AD(c1) + AD(c4))/2, (AD(c2) + AD(c3))/2 } − p/4 is a lower bound, where p is the perimeter of C.
E.g. LB(C) = 3500 − p/4.
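The example LB(C) = 3500 − p/4 is consistent with taking the larger of the two diagonal corner averages and subtracting p/4. The sketch below assumes that reading, with c1/c4 and c2/c3 as the diagonal pairs; the function name and the perimeter value are hypothetical. Under the L1 distance the bound is sound because AD(l) changes by at most δ when l moves by δ, and the distances from any l in C to two opposite corners sum to p/2.

```python
def lower_bound(ad_c1, ad_c2, ad_c3, ad_c4, perimeter):
    """Lower bound of AD(l) for every location l in a rectangular cell.

    Assumes (c1, c4) and (c2, c3) are the two pairs of diagonally
    opposite corners, and that distances are L1.
    """
    diag_a = (ad_c1 + ad_c4) / 2
    diag_b = (ad_c2 + ad_c3) / 2
    return max(diag_a, diag_b) - perimeter / 4


# Corner values from the slide; the perimeter of 400 is a made-up example.
print(lower_bound(1000, 3000, 4000, 2500, 400))
```

With the slide's corner values the larger diagonal average is (3000 + 4000)/2 = 3500, which reproduces the 3500 − p/4 form.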

LB(C): lower bound for AD(l), ∀l ∈ C
A better lower bound
Theorem: (formula not preserved).
Compared with the previous lower bound:
– Higher quality, since the lower bound is larger.
– More computation.

The Progressive Algorithm
1. Maintain a heap of cells ordered by LB(). Initially there is one cell: Q.
2. Maintain the best candidate l_opt.
3. Pick the cell with minimum LB() and partition it.
4. Compute AD() for the corners of the sub-cells.
5. Compute LB() for the sub-cells.
6. Insert sub-cell c_i into the heap if LB(c_i) < AD(l_opt).
7. Go to 3.
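The loop above can be sketched as a best-first search over grid cells. This is a simplified illustration under several assumptions, not the authors' implementation: cells are index ranges into the sorted candidate coordinates `xs` and `ys`, each cell is split 2×2 rather than in larger batches, LB() is the diagonal-corner bound under the L1 distance, and AD() is computed by a plain scan instead of an R*-tree. All names are hypothetical.

```python
import heapq
from itertools import product


def l1(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def progressive_ol(objects, sites, xs, ys):
    """Return (best location, its AD, trace of (min LB, best AD) per step)."""
    dnn = [min(l1(o, s) for s in sites) for o in objects]
    cache = {}

    def ad(l):  # AD(l), memoised because corners are shared between cells
        if l not in cache:
            cache[l] = sum(min(d, l1(o, l)) for o, d in zip(objects, dnn)) / len(objects)
        return cache[l]

    def corners(cell):  # ordered so that [0],[1] and [2],[3] are the diagonals
        i0, i1, j0, j1 = cell
        return [(xs[i0], ys[j0]), (xs[i1], ys[j1]), (xs[i0], ys[j1]), (xs[i1], ys[j0])]

    def lb(cell):
        i0, i1, j0, j1 = cell
        a, b, c, d = corners(cell)
        p = 2 * ((xs[i1] - xs[i0]) + (ys[j1] - ys[j0]))
        return max((ad(a) + ad(b)) / 2, (ad(c) + ad(d)) / 2) - p / 4

    root = (0, len(xs) - 1, 0, len(ys) - 1)
    best = min(corners(root), key=ad)
    heap, trace = [(lb(root), root)], []
    while heap:
        bound, (i0, i1, j0, j1) = heapq.heappop(heap)
        if bound >= ad(best):        # no remaining cell can beat the best candidate
            break
        trace.append((bound, ad(best)))  # current interval around the optimal AD
        if i1 - i0 <= 1 and j1 - j0 <= 1:
            continue                 # every candidate in this cell is a corner
        im, jm = (i0 + i1) // 2, (j0 + j1) // 2
        xr = [(i0, im), (im, i1)] if i1 - i0 > 1 else [(i0, i1)]
        yr = [(j0, jm), (jm, j1)] if j1 - j0 > 1 else [(j0, j1)]
        subs = [(a, b, c, d) for (a, b), (c, d) in product(xr, yr)]
        for sub in subs:             # step 4: evaluate the new corners
            best = min(corners(sub) + [best], key=ad)
        for sub in subs:             # steps 5 and 6: bound and enqueue
            if lb(sub) < ad(best):
                heapq.heappush(heap, (lb(sub), sub))
    return best, ad(best), trace


# Hypothetical toy data; xs and ys are the grid-line coordinates inside Q.
objects = [(1, 1), (2, 5), (6, 2), (7, 7), (3, 3), (8, 4)]
sites = [(0, 0), (9, 9)]
xs, ys = [1, 2, 3, 6, 7, 8], [1, 2, 3, 4, 5, 7]
print(progressive_ol(objects, sites, xs, ys)[:2])
```

Each entry of the returned trace is an interval that brackets the optimal AD at that step, so a caller could stop early and still report how far the current best candidate can be from optimal.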

Progressiveness
The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining.
[Figure: the confidence interval over time. Initially the interval is [LB(Q), AD(best corner of Q)]; later it is [min{ LB(C) | C in heap }, AD(best candidate)].]
AD(real OL) is always inside the interval.
The user may choose to terminate at any time.

Batch Partitioning
When partitioning a cell, partition it into multiple sub-cells at once.
Reason: computing AD(l) requires accessing the R*-tree of objects, so each access to the R*-tree should serve multiple AD(l) computations.
Tradeoff: partitioning too finely is wasteful, since some of the candidates could have been pruned.

Performance Setup
– O: 123,593 postal addresses in the northeastern part of the US, stored using an R*-tree.
– S: 100 sites randomly selected from O.
– Buffer: 128 pages.
– Machine: Dell Pentium IV 3.2GHz.
– Query size: 1% in each dimension.

[Figure: Effect of VCU Computation]

[Figure: Comparison of Lower Bounds]

[Figure: Effect of Batch Partitioning]

Progressiveness
Each step partitions a cell into 40 sub-cells.
After 20 steps, the answer is within 1% of the optimal.
After 200 steps, the answer is accurate.

Conclusions
– Introduced the min-dist optimal-location query.
– Proved theorems to limit the number of candidates.
– Presented lower-bound estimators.
– Proposed a progressive algorithm.