Bin Yao, Feifei Li, Piyush Kumar Presenter: Lian Liu
Introduction Related Work Algorithms - PFC (Progressive Furthest Cell) - CHFC (Convex Hull Furthest Cell) Experiment Discussion
Assume you live at p1 (or p2, or p3). Where among q1~q3 would you prefer the chemical factory to be built?
Let P={p1, p2, p3} Q={q1, q2, q3} fn(p1, Q)=q3 fn(p2, Q)=q1 fn(p3, Q)=q1 BRFN(q1,Q,P)={p2, p3} BRFN(q2,Q,P)={} BRFN(q3,Q,P)={p1} Build the chemical factory here
Problem: Given query point q, data set P (and Q), Compute MRFN(q, P) and BRFN(q, Q, P).
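The definitions above can be checked with a brute-force sketch. The coordinates below are hypothetical (the slides give none); they were chosen only so that the furthest-neighbor relations match the example (fn(p1, Q)=q3, fn(p2, Q)=q1, fn(p3, Q)=q1). The function names are illustrative, and MRFN is taken here to mean the points of P for which q is at least as far away as every point of P.

```python
from math import dist

def fn(p, S):
    """Furthest neighbor of p in S (first one wins on ties)."""
    return max(S, key=lambda s: dist(p, s))

def mrfn(q, P):
    """Monochromatic RFN: points of P whose furthest neighbor in P ∪ {q} is q."""
    return [p for p in P if all(dist(p, q) >= dist(p, o) for o in P)]

def brfn(q, Q, P):
    """Bichromatic RFN: points of P whose furthest neighbor in Q is q (q ∈ Q)."""
    return [p for p in P if fn(p, Q) == q]

# Hypothetical coordinates reproducing the factory example.
p1, p2, p3 = (1, 1), (9, 1), (8, -2)
q1, q2, q3 = (0, 0), (5, 1), (10, 0)
P, Q = [p1, p2, p3], [q1, q2, q3]

print(brfn(q1, Q, P))  # [(9, 1), (8, -2)]  i.e. {p2, p3}
print(brfn(q2, Q, P))  # []
print(brfn(q3, Q, P))  # [(1, 1)]           i.e. {p1}
```

Each call scans all of P (and Q), which is the brute-force baseline that the later algorithms try to avoid.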
MBR An MBR (Minimum Bounding Rectangle) has three important distances to a point: the minimum distance (mindist), the maximum distance (maxdist), and the minmax distance (minmaxdist).
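The three distances can be sketched as follows. The slides do not give the formulas, so this assumes the definitions commonly used in the R-tree nearest-neighbor literature: mindist is the distance to the closest point of the rectangle, maxdist the distance to its furthest corner, and minmaxdist the smallest, over all faces, of the largest distance to a point on that face. The rectangle is represented by its lower and upper corners `lo` and `hi`, which is an illustrative choice.

```python
from math import sqrt

def mindist(p, lo, hi):
    """Distance from p to the closest point of the rectangle [lo, hi]."""
    return sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(p, lo, hi)))

def maxdist(p, lo, hi):
    """Distance from p to the furthest corner of the rectangle."""
    return sqrt(sum(max(abs(x - l), abs(x - h)) ** 2 for x, l, h in zip(p, lo, hi)))

def minmaxdist(p, lo, hi):
    """Assumed definition: min over dimensions k of the distance to the
    furthest point on the nearer face perpendicular to dimension k."""
    near = [min(abs(x - l), abs(x - h)) ** 2 for x, l, h in zip(p, lo, hi)]
    far = [max(abs(x - l), abs(x - h)) ** 2 for x, l, h in zip(p, lo, hi)]
    total_far = sum(far)
    return sqrt(min(total_far - far[k] + near[k] for k in range(len(near))))

# Rectangle [0,2] x [0,2] and a point to its right.
p, lo, hi = (3.0, 1.0), (0.0, 0.0), (2.0, 2.0)
print(mindist(p, lo, hi), minmaxdist(p, lo, hi), maxdist(p, lo, hi))
```

For any point and rectangle, mindist ≤ minmaxdist ≤ maxdist, which is what makes these values usable as pruning bounds.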
R-tree R-tree is an index data structure. In R-trees, points are grouped into MBRs, which are recursively grouped into MBRs in higher levels of the tree.
Range query A range query retrieves all points that lie within the query window. R-tree based algorithms have proved efficient for range queries.
How to compute the MRFN of a given query point? BFS (Brute-Force Search) PFC (Progressive Furthest Cell) Main Idea: 1. Find the cell (region) in which all reverse furthest neighbors of the query point are located 2. Perform a range query with that cell How is this cell computed?
FVC (Furthest Voronoi Cell)
FVC Example: query point = q1 fvc(q1, P)
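One way to picture fvc(q, P) is as an intersection of half-planes: for each p in P, keep the side of the perpendicular bisector of q and p on which points are at least as far from q as from p. The sketch below clips a bounding box by each such half-plane. It is an illustration under assumptions, not the paper's implementation: the bounding box `box`, the function names, and the simple polygon-clipping loop are all hypothetical choices.

```python
def clip_to_farther_side(poly, q, p):
    """Keep the part of convex polygon `poly` whose points x satisfy
    dist(x, q) >= dist(x, p), i.e. x . (p - q) >= (|p|^2 - |q|^2) / 2."""
    ax, ay = p[0] - q[0], p[1] - q[1]
    c = (p[0] ** 2 + p[1] ** 2 - q[0] ** 2 - q[1] ** 2) / 2.0

    def side(x):
        return ax * x[0] + ay * x[1] - c  # >= 0 on the kept side

    out = []
    for i, u in enumerate(poly):
        v = poly[(i + 1) % len(poly)]
        su, sv = side(u), side(v)
        if su >= 0:
            out.append(u)
        if su * sv < 0:  # this edge crosses the bisector
            t = su / (su - sv)
            out.append((u[0] + t * (v[0] - u[0]), u[1] + t * (v[1] - u[1])))
    return out

def fvc(q, P, box):
    """Furthest Voronoi cell of q w.r.t. P, restricted to box = (xmin, ymin, xmax, ymax).
    Returns the cell's vertices, or [] if the cell is empty."""
    xmin, ymin, xmax, ymax = box
    poly = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    for p in P:
        poly = clip_to_farther_side(poly, q, p)
        if not poly:
            break  # the cell is already empty; no point can have q as furthest
    return poly

box = (-10, -10, 10, 10)
print(fvc((0, 0), [(4, 0)], box))                             # the strip x >= 2 of the box
print(fvc((0, 0), [(4, 0), (-4, 0), (0, 4), (0, -4)], box))   # []
```

In the second call q is surrounded by the points of P, so the cell is empty and q has no reverse furthest neighbors.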
PFC (Progressive Furthest Cell) Algorithm Points and MBRs are stored in a priority queue L, ordered by their minmaxdist in decreasing order. Two vectors, Vc and Vp, are also maintained: Vc: reverse furthest neighbor candidates Vp: disqualifying points
PFC – mechanism e is a point e is an MBR fvc(q)={} e ∈ fvc(q) e ∩ fvc(q)={} e ∩ fvc(q)≠{} c ∩ fvc(q)≠{} c ∩ fvc(q)={} Finally, we update fvc(q) using the points in Vp and then filter the points in Vc using fvc(q)
Example: L={p1, R1} Vc={} Vp={} L={R1} Vc={p1} Vp={} fvc(q)
Example: L={p3} Vc={p1} Vp={p2} L={} Vc={p1, p3} Vp={p2} fvc(q)
Example: MRFN(q)={p3} fvc(q) Finally, we use all points in Vp (i.e., p2) to update fvc(q). Then we perform a range query using the updated fvc(q). The result is {p3}.
Efficiency of PFC PFC shrinks fvc(q) quickly. If the query point has no reverse furthest neighbors, the empty set is reported quickly. However, PFC is still not efficient enough. Improvement: the CHFC algorithm.
Convex Hull The convex hull of a set of points P is the smallest convex polygon that fully contains P. It is denoted C_P.
Lemma: Given a point set P and its convex hull C_P, for any point q, let p* = fn(q, P); then p* ∈ C_P. Consequently, fvc(p, P) = fvc(p, C_P).
CHFC (Convex Hull Furthest Cell) Given a set of points P and a query point p: 1. Compute C_{P ∪ {p}} 2. Compute fvc(p, P) using C_{P ∪ {p}} 3. Perform a range query with fvc(p, P)
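The steps above can be sketched in memory as follows. This is a simplified illustration, not the disk-based algorithm: the hull is computed with Andrew's monotone chain (a swapped-in, standard method), the "range query" is a plain scan of P, and integer coordinates with squared distances are used so that comparisons are exact. All function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def d2(a, b):
    """Squared Euclidean distance (exact for integer coordinates)."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def mrfn_brute(q, P):
    return [p for p in P if all(d2(p, q) >= d2(p, o) for o in P)]

def mrfn_chfc(q, P):
    hull = convex_hull(list(P) + [q])      # step 1: hull of P ∪ {q}
    if q not in hull:                      # q is not a hull vertex: the cell is empty
        return []
    # steps 2-3: only hull vertices constrain the cell; scan P against them
    return [p for p in P if all(d2(p, q) >= d2(p, h) for h in hull)]

P = [(x, y) for x in range(5) for y in range(5)]   # 25 points, 4 hull vertices
for q in [(10, 2), (2, 2), (5, 5), (-1, 2)]:
    assert mrfn_chfc(q, P) == mrfn_brute(q, P)
print(convex_hull(P))  # [(0, 0), (4, 0), (4, 4), (0, 4)]
```

On this grid each candidate is compared against at most 5 hull vertices instead of all 25 points, which is the saving CHFC aims for.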
BRFN BRFN (Bichromatic Reverse Furthest Neighbor) can be found in the same way as MRFN. The only difference is that we compute fvc(q, Q, P) using Q and then perform the range query on P.
Efficiency of CHFC: In most (but not all) cases, |C_P| << |P|. That is, the number of points considered is likely to be greatly reduced. Difficulty: how to compute and update C_P when |P| is very large and even C_P cannot fit into memory.
Computing Convex Hull Convex hulls can be computed in either a distance-first or a depth-first manner. The distance-first approach is optimal in the number of page accesses, with complexity O(n log n). Depth-first algorithms can run in O(n) time in the worst case, but are not optimal in disk accesses.
Updating Convex Hull Inserting new points: Lemma: Let P be a point set. If point q is contained in C_P, then C_{P ∪ {q}} = C_P. Otherwise, C_{P ∪ {q}} = C_{C_P ∪ {q}}.
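The insertion lemma translates into a short update routine: test whether q lies inside the current hull, and only if it does not, rebuild the hull from the old hull vertices plus q. The sketch below is an in-memory illustration with hypothetical names; it uses Andrew's monotone chain for the rebuild and assumes the hull is stored in counter-clockwise order.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def contains(hull, q):
    """True if q is inside or on the boundary of a CCW hull with >= 3 vertices."""
    n = len(hull)
    return n >= 3 and all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))

def insert_point(hull, q):
    """Hull of P ∪ {q}, computed from the hull of P only."""
    if contains(hull, q):
        return hull                      # q is inside: the hull is unchanged
    return convex_hull(hull + [q])       # otherwise: hull of (old hull ∪ {q})

P = [(x, y) for x in range(5) for y in range(5)]
hull = convex_hull(P)
print(insert_point(hull, (1, 1)))   # unchanged: [(0, 0), (4, 0), (4, 4), (0, 4)]
print(insert_point(hull, (5, 5)))   # [(0, 0), (4, 0), (5, 5), (0, 4)]
```

The update never touches the interior points of P, so its cost depends on the hull size rather than on |P|.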
Updating Convex Hull Deleting points: Points or MBRs with the largest perpendicular distance to p_l p_r are added into C_P first, until there are no points outside the current convex hull.
External Convex Hull Computing Existing algorithms can find 2-dimensional convex hulls with I/Os. However, when the convex hull is still too large to fit into memory, we use Dudley’s approximate convex hull.
CPU time & number of IOs
Thank You! Questions?