Progressive Computation of The Min-Dist Optimal-Location Query


1 Progressive Computation of The Min-Dist Optimal-Location Query
Donghui Zhang, Yang Du, Tian Xia (Northeastern University); Yufei Tao (Chinese University of Hong Kong). VLDB’06, Seoul, Korea

2 Optimal Location Query
Motivation: “What is the optimal location in the Boston area to build a new McDonald’s store?” Suppose a customer drives to the closest McDonald’s. Optimality: minimize the average driving distance. Donghui Zhang et al. Optimal Location Query

3 Optimal Location Query
min-dist OL. Distances from the four customers to their nearest site: 600, 200, 200, 600. Without any new site: AD = (600 + 200 + 200 + 600)/4 = 400.

4 Optimal Location Query
min-dist OL. With new site l1 the distances become 600, 30, 30, 600. Without any new site: AD = (600 + 200 + 200 + 600)/4 = 400. With new site l1: AD(l1) = (600 + 30 + 30 + 600)/4 = 315.

5 Optimal Location Query
min-dist OL. With new site l2 the distances become 200, 30, 30, 200. Without any new site: AD = (600 + 200 + 200 + 600)/4 = 400. With new site l1: AD(l1) = (600 + 30 + 30 + 600)/4 = 315. With new site l2: AD(l2) = (200 + 30 + 30 + 200)/4 = 115.

6 Optimal Location Query
Formal Definition: Given a set S of sites, a set O of objects, and a query range Q, the min-dist OL is a location l ∈ Q which minimizes AD(l) = (1/|O|) Σ_{o ∈ O} dNN(o, S ∪ {l}), the average distance between each object o and its nearest site.

7 Optimal Location Query
L1 Distance: d(o, s) = |o.x - s.x| + |o.y - s.y|
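The definitions above can be sketched in a few lines of Python. This is a minimal brute-force illustration, not the authors' implementation: the names `l1` and `average_distance` and the toy coordinates are hypothetical, and points are assumed to be plain `(x, y)` tuples.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance: |a.x - b.x| + |a.y - b.y|."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def average_distance(objects: Sequence[Point],
                     sites: Sequence[Point],
                     new_site: Optional[Point] = None) -> float:
    """Average distance from each object to its nearest site.

    With new_site=None this is AD; otherwise it is AD(l) for l = new_site.
    """
    all_sites = list(sites)
    if new_site is not None:
        all_sites.append(new_site)
    total = sum(min(l1(o, s) for s in all_sites) for o in objects)
    return total / len(objects)


# Hypothetical toy instance (not taken from the slides).
objects = [(0, 0), (4, 0), (0, 6), (10, 10)]
sites = [(0, 0), (10, 10)]
print(average_distance(objects, sites))                   # AD
print(average_distance(objects, sites, new_site=(2, 3)))  # AD(l)
```

Evaluating `average_distance` from scratch touches every object for every location, which is what the later slides try to avoid.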

8 Optimal Location Query
Challenges: There are infinitely many locations in Q. How can we produce a finite set of candidates while preserving optimality? How can we avoid computing AD(l) for all candidates?

9 Optimal Location Query
Solution Highlights: (1) An algorithm to compute AD(l). (2) Theorems to limit the number of candidates. (3) A lower bound on AD(l) for all locations l in a cell C. (4) A progressive algorithm.

10 Optimal Location Query
1. Compute AD(l). Remember: AD(l) = (1/|O|) Σ_{o ∈ O} dNN(o, S ∪ {l}). Define: AD = (1/|O|) Σ_{o ∈ O} dNN(o, S). Let RNN(l) be the objects “attracted” by l. AD(l) = AD if RNN(l) = ∅. Example: RNN(l) = ∅, so AD(l) = AD.

11 Optimal Location Query
1. Compute AD(l). Remember: AD(l) = (1/|O|) Σ_{o ∈ O} dNN(o, S ∪ {l}). Define: AD = (1/|O|) Σ_{o ∈ O} dNN(o, S). Let RNN(l) be the objects “attracted” by l. AD(l) = AD if RNN(l) = ∅. Example: RNN(l) = {o7, o8}, so AD(l) < AD.

12 Optimal Location Query
1. Compute AD(l). Remember: AD(l) = (1/|O|) Σ_{o ∈ O} dNN(o, S ∪ {l}). Define: AD = (1/|O|) Σ_{o ∈ O} dNN(o, S). Let RNN(l) be the objects “attracted” by l. AD(l) = AD if RNN(l) = ∅. Otherwise AD(l) = AD - ?, where the missing term is the average savings for customers in RNN(l).

13 Optimal Location Query
1. Compute AD(l). Theorem: AD(l) = AD - (1/|O|) Σ_{o ∈ RNN(l)} (dNN(o, S) - d(o, l)). S and O are “static” versus l, so AD can be pre-computed, and so can dNN(o, S). To compute AD(l): find RNN(l); ∀ o ∈ RNN(l), compute d(o, l).
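The incremental computation can be sketched as follows, assuming the theorem has the form AD(l) = AD - (1/|O|) Σ over o in RNN(l) of (dNN(o, S) - d(o, l)). The function names are hypothetical, and a linear scan over the objects stands in for the RNN query that the talk runs against an R*-tree.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def precompute(objects: Sequence[Point],
               sites: Sequence[Point]) -> Tuple[float, List[float]]:
    """Return (AD, d_nn), where d_nn[i] = dNN(o_i, S). Neither depends on l."""
    d_nn = [min(l1(o, s) for s in sites) for o in objects]
    return sum(d_nn) / len(objects), d_nn


def ad_with_new_site(l: Point, objects: Sequence[Point],
                     d_nn: Sequence[float], ad: float) -> float:
    """AD(l) = AD minus the average savings of the objects attracted by l."""
    savings = 0.0
    for o, d in zip(objects, d_nn):
        d_new = l1(o, l)
        if d_new < d:            # o belongs to RNN(l)
            savings += d - d_new
    return ad - savings / len(objects)


# Hypothetical toy instance (not taken from the slides).
objects = [(0, 0), (4, 0), (0, 6), (10, 10)]
sites = [(0, 0), (10, 10)]
ad, d_nn = precompute(objects, sites)
print(ad_with_new_site((2, 3), objects, d_nn, ad))
```

Only objects in RNN(l) contribute to the correction term, so an index that returns RNN(l) directly avoids touching the rest of O.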

14 Optimal Location Query
2. Limit #candidates. Theorem: within the X/Y range of Q, draw grid lines crossing objects. Only the intersections need to be considered! Figure: the query range Q with grid lines through the objects.

15 Optimal Location Query
2. Limit #candidates. Theorem: within the X/Y range of Q, draw grid lines crossing objects. Only the intersections need to be considered! Figure: the query range Q with grid lines through the objects, giving 5x6=30 candidates.
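A small sketch of the candidate construction follows. The function name and the `(xlo, ylo, xhi, yhi)` rectangle encoding are hypothetical. Adding the borders of Q as grid lines is an assumption of this sketch, made so that the candidate set is never empty; the slide itself only mentions lines through objects.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]   # (xlo, ylo, xhi, yhi)


def candidate_locations(objects: Sequence[Point], q: Rect) -> List[Point]:
    """Intersections of the grid lines drawn through objects inside Q's X/Y ranges."""
    xlo, ylo, xhi, yhi = q
    # Vertical lines: objects whose x falls in Q's X range (any y), plus Q's borders.
    xs = sorted({xlo, xhi} | {x for x, _ in objects if xlo <= x <= xhi})
    # Horizontal lines: objects whose y falls in Q's Y range (any x), plus Q's borders.
    ys = sorted({ylo, yhi} | {y for _, y in objects if ylo <= y <= yhi})
    return [(x, y) for x in xs for y in ys]


# Hypothetical toy instance: 5 vertical lines x 4 horizontal lines.
cands = candidate_locations([(1, 1), (2, 5), (7, 3), (9, 9)], (0, 0, 8, 4))
print(len(cands))
```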

16 Optimal Location Query
2. Limit #candidates. Proof idea: suppose the OL is not at an intersection; then moving it produces a better (or equal) result. Consider RNN(l): moving l to the right by δ ⇒ saves total distance.

17 Optimal Location Query
2. VCU(Q): a spatial region enclosing the objects that are closer to Q than to the sites in S. It is the Voronoi cell of Q versus the sites in S.
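Membership in VCU(Q) can be tested per object, as in the sketch below. It assumes that "closer to Q" means the minimum L1 distance from the object to the rectangle Q, and it uses a brute-force scan where the talk would use spatial indexes. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]   # (xlo, ylo, xhi, yhi)


def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def l1_point_to_rect(p: Point, q: Rect) -> float:
    """Minimum L1 distance from point p to any location inside rectangle q."""
    xlo, ylo, xhi, yhi = q
    dx = max(xlo - p[0], 0, p[0] - xhi)
    dy = max(ylo - p[1], 0, p[1] - yhi)
    return dx + dy


def objects_in_vcu(objects: Sequence[Point], sites: Sequence[Point],
                   q: Rect) -> List[Point]:
    """Objects strictly closer to Q than to their nearest existing site."""
    return [o for o in objects
            if l1_point_to_rect(o, q) < min(l1(o, s) for s in sites)]


# Hypothetical toy instance.
print(objects_in_vcu([(1, 1), (5, 5), (3, 4), (10, 9)],
                     [(0, 0), (10, 10)], (4, 4, 6, 6)))
```

An object outside VCU(Q) cannot be attracted by any location in Q, so it can be ignored both when drawing grid lines and when computing savings.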

18 2. Further Limit #candidates
Only consider objects in VCU(Q). 5x6=30 candidates

20 2. Further Limit #candidates
Only consider objects in VCU(Q). 4x4=16 candidates

21 Optimal Location Query
Naïve Algorithm: Derive the candidates, compute AD(l) for each, and pick the smallest. Not efficient: there are too many candidates! Computing AD(l) for each one requires computing RNN(l) and retrieving all these objects…
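For reference, the naïve algorithm can be written as an exhaustive scan over the grid intersections. This is a brute-force sketch with hypothetical names; it treats the borders of Q as extra grid lines (an assumption) and recomputes AD(l) by scanning all objects, which is the cost the progressive algorithm is meant to reduce.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]   # (xlo, ylo, xhi, yhi)


def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def naive_optimal_location(objects: Sequence[Point], sites: Sequence[Point],
                           q: Rect) -> Tuple[float, Point]:
    """Return (AD(l), l) for the candidate l in Q with the smallest AD(l)."""
    xlo, ylo, xhi, yhi = q
    xs = sorted({xlo, xhi} | {x for x, _ in objects if xlo <= x <= xhi})
    ys = sorted({ylo, yhi} | {y for _, y in objects if ylo <= y <= yhi})
    d_nn = [min(l1(o, s) for s in sites) for o in objects]   # dNN(o, S)

    def ad(l: Point) -> float:
        # Each object goes to the new site only if it is closer than before.
        return sum(min(d, l1(o, l)) for o, d in zip(objects, d_nn)) / len(objects)

    return min((ad((x, y)), (x, y)) for x in xs for y in ys)


# Hypothetical toy instance.
best_ad, best_l = naive_optimal_location([(4, 4), (6, 4), (5, 6)], [(0, 0)],
                                         (3, 3, 7, 7))
print(best_l, best_ad)
```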

22 Optimal Location Query
Progressive Idea: Treat Q as a cell and consider its corners.

23 Optimal Location Query
Progressive Idea: Divide the cell.

25 Optimal Location Query
Progressive Idea: Recursively divide a sub-cell.

26 Optimal Location Query
Progressive Idea: Recursively divide a sub-cell. Able to check all candidates.

27 Optimal Location Query
Progressive Idea. Q: What do you save? A: Cell pruning, if its lower bound ≥ AD(l0) of some candidate l0. Example: AD(l0) = 50, and suppose 60 is a lower bound for AD(l), ∀ l ∈ C. Then no location in C can beat l0, so C is pruned.

28 3. LB(C): lower bound for AD(l), ∀ l ∈ C
Corners of cell C: AD(c1) = 1000, AD(c2) = 3000, AD(c3) = 4000, AD(c4) = 2500.

29 3. LB(C): lower bound for AD(l), ∀ l ∈ C
Corners of cell C: AD(c1) = 1000, AD(c2) = 3000, AD(c3) = 4000, AD(c4) = 2500. Theorem: LB(C) = max{ (AD(c1) + AD(c4))/2, (AD(c2) + AD(c3))/2 } - p/4 is a lower bound, where p is the perimeter of C and c1c4, c2c3 are its diagonals. E.g. LB(C) = 3500 - p/4.
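One way to see why a bound of this shape holds is sketched below. It assumes the theorem averages AD over diagonally opposite corners, which is consistent with the example value 3500 = (3000 + 4000)/2 if c2 and c3 are taken to be opposite corners. The symbols c, c', w and h are introduced here for illustration only.

```latex
% Step 1: AD is 1-Lipschitz under the L1 distance.
% For every object o, the triangle inequality gives d(o,l) \ge d(o,c) - d(l,c), hence
\min\{d_{NN}(o,S),\, d(o,l)\} \;\ge\; \min\{d_{NN}(o,S),\, d(o,c)\} - d(l,c).
% Averaging over all o \in O:
AD(l) \;\ge\; AD(c) - d(l,c).

% Step 2: for a cell C of width w and height h (perimeter p = 2(w+h)),
% any l \in C and diagonally opposite corners c, c' satisfy
d(l,c) + d(l,c') \;=\; w + h \;=\; \tfrac{p}{2}.

% Step 3: add the two Lipschitz inequalities for c and c' and divide by 2:
AD(l) \;\ge\; \frac{AD(c) + AD(c')}{2} - \frac{p}{4}.

% Taking the larger of the two diagonals, with the slide's corner values:
LB(C) \;=\; \max\!\left\{\frac{1000 + 2500}{2},\; \frac{3000 + 4000}{2}\right\} - \frac{p}{4}
      \;=\; 3500 - \frac{p}{4}.
```

The bound needs only the four corner values, which the progressive algorithm has already computed when it evaluates a cell.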

30 3. LB(C): lower bound for AD(l), ∀ l ∈ C
A better lower bound. Theorem: [formula not preserved]. Compared with the previous lower bound: higher quality, since the lower bound is larger, but more computation.

31 4. The Progressive Algorithm
1. Maintain a heap of cells ordered by LB(). Initially there is one cell: Q.
2. Maintain the best candidate lopt.
3. Pick the cell with minimum LB() and partition it.
4. Compute AD() for the corners of the sub-cells.
5. Compute LB() for the sub-cells.
6. Insert sub-cell ci into the heap if LB(ci) < AD(lopt).
7. Go to 3.
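The loop above can be sketched in Python as a generator that reports an interval after every refinement step. This is an illustration under several assumptions, not the authors' implementation. It uses the diagonal-average form of LB(C), splits each cell into at most four sub-cells along the candidate grid lines (the talk partitions in larger batches), adds the borders of Q as grid lines, and replaces the R*-tree RNN query with a linear scan. All names are hypothetical.

```python
import heapq
from functools import lru_cache


def l1(a, b):
    """L1 (Manhattan) distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def progressive_ol(objects, sites, q):
    """Yield (l_opt, AD(l_opt), lower bound on the optimal AD) after each step."""
    xlo, ylo, xhi, yhi = q
    xs = sorted({xlo, xhi} | {x for x, _ in objects if xlo <= x <= xhi})
    ys = sorted({ylo, yhi} | {y for _, y in objects if ylo <= y <= yhi})
    d_nn = [min(l1(o, s) for s in sites) for o in objects]  # static, precomputed

    @lru_cache(maxsize=None)
    def ad(i, j):
        """AD of the candidate at grid indices (i, j), computed at most once."""
        l = (xs[i], ys[j])
        return sum(min(d, l1(o, l)) for o, d in zip(objects, d_nn)) / len(objects)

    def lb(cell):
        """Assumed bound: larger diagonal average of corner ADs, minus p/4."""
        i0, i1, j0, j1 = cell
        half_perimeter = (xs[i1] - xs[i0]) + (ys[j1] - ys[j0])
        diag_a = (ad(i0, j0) + ad(i1, j1)) / 2
        diag_b = (ad(i0, j1) + ad(i1, j0)) / 2
        return max(diag_a, diag_b) - half_perimeter / 2      # p/4

    def split(lo, hi):
        """Halve an index range along a grid line, if one lies strictly inside."""
        if hi - lo <= 1:
            return [(lo, hi)]
        mid = (lo + hi) // 2
        return [(lo, mid), (mid, hi)]

    root = (0, len(xs) - 1, 0, len(ys) - 1)
    # Step 2: the best candidate so far, initialised from the corners of Q.
    best_ad, best = min((ad(i, j), (i, j))
                        for i in (root[0], root[1]) for j in (root[2], root[3]))
    heap = [(lb(root), root)]                                # step 1
    while heap:
        bound, cell = heapq.heappop(heap)                    # step 3
        if bound >= best_ad:
            break              # every remaining cell has LB >= AD(l_opt)
        i0, i1, j0, j1 = cell
        for a0, a1 in split(i0, i1):
            for b0, b1 in split(j0, j1):
                for i in (a0, a1):                           # step 4
                    for j in (b0, b1):
                        if ad(i, j) < best_ad:
                            best_ad, best = ad(i, j), (i, j)
                sub = (a0, a1, b0, b1)
                # A cell with no grid line inside has no unexplored candidates.
                has_interior = a1 - a0 > 1 or b1 - b0 > 1
                if has_interior and lb(sub) < best_ad:       # steps 5 and 6
                    heapq.heappush(heap, (lb(sub), sub))
        lower = min(heap[0][0], best_ad) if heap else best_ad
        yield (xs[best[0]], ys[best[1]]), best_ad, lower
    yield (xs[best[0]], ys[best[1]]), best_ad, best_ad       # interval closed


# Hypothetical toy instance: print the interval after every step.
for loc, upper, lower in progressive_ol([(4, 4), (6, 4), (5, 6)], [(0, 0)],
                                        (3, 3, 7, 7)):
    print(loc, lower, upper)
```

A caller can stop consuming the generator at any point and keep the last reported location, whose AD is within `upper - lower` of the optimum.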

32 Optimal Location Query
Progressiveness: The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining. Figure (interval over time): upper end AD(best corner of Q), lower end LB(Q). AD(real OL) is inside the interval.

33 Optimal Location Query
Progressiveness: The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining. Figure (interval over time): upper end AD(best candidate), lower end LB(Q). AD(real OL) is inside the interval.

34 Optimal Location Query
Progressiveness: The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining. Figure (interval over time): upper end AD(best candidate), lower end Min{ LB(C) | C in heap }. AD(real OL) is inside the interval. The user may choose to terminate at any time.

35 Optimal Location Query
Batch Partitioning: When partitioning a cell, split it into multiple sub-cells at once. Reason: computing AD(l) requires accessing the R*-tree of objects, so each access should serve multiple AD(l) computations. Tradeoff: partitioning too finely is wasteful, since some of the candidates could have been pruned.

36 Optimal Location Query
Performance Setup. O: 123,593 postal addresses in the northeastern part of the US, stored in an R*-tree. S: 100 sites randomly selected from O. Buffer: 128 pages. Machine: Dell Pentium IV 3.2GHz. Query size: 1% in each dimension.

37 2. Further Limit #candidates
Review slide. Only consider objects in VCU(Q). 4x4=16 candidates

38 Effect of VCU Computation

39 3. LB(C): lower bound for AD(l), ∀ l ∈ C
Review slide. Corners of cell C: AD(c1) = 1000, AD(c2) = 3000, AD(c3) = 4000, AD(c4) = 2500. Theorem: LB(C) = max{ (AD(c1) + AD(c4))/2, (AD(c2) + AD(c3))/2 } - p/4 is a lower bound, where p is the perimeter of C and c1c4, c2c3 are its diagonals. E.g. LB(C) = 3500 - p/4.

40 3. LB(C): lower bound for AD(l), ∀ l ∈ C
Review slide. A better lower bound. Theorem: [formula not preserved]. Compared with the previous lower bound: higher quality, since the lower bound is larger, but more computation.

41 Comparison of Lower Bounds

42 Effect of Batch Partitioning

43 Optimal Location Query
Review slide. Progressiveness: The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining. Figure (interval over time): upper end AD(best candidate), lower end Min{ LB(C) | C in heap }. AD(real OL) is inside the interval. The user may choose to terminate at any time.

44 Optimal Location Query
Progressiveness: Each step partitions a cell into 40 sub-cells. After 200 steps, the answer is accurate. After 20 steps, the answer is 1% away from optimal.

45 Optimal Location Query
Conclusions: Introduced the min-dist optimal-location query. Proved theorems to limit the number of candidates. Presented lower-bound estimators. Proposed a progressive algorithm. Q & A...

