1
Presented by: Duong, Huu Kinh Luan, March 14th, 2011
2
Outline: Introduction; Background information; Problem Definition; Handling Technique; Algorithm; Experimental Results
Topics: Authors of paper; What is the problem?; Why is there a problem?; Rank-based K-NN; Related papers on the same topic; Top-k properties; Problem definition; Notations used; Exact Algorithm; Randomized Algorithm
3
Authors of the paper:
Xuemin Lin: Professor, The University of New South Wales; PhD in C.S. from the U. Queensland (Australia) in 1992
Ying Zhang: Research Fellow; PhD, 01.2008
Wenjie Zhang: Post-doc research fellow; PhD, 2010
Gaoping Zhu: PhD Candidate
Qianlu Lin: PhD Candidate
4
What is the problem? (Pictured: GPS tracking device, sensor network)
5
What is the problem?
7
Related papers:
G. Cormode, F. Li, and K. Yi, "Semantics of ranking queries for probabilistic data and expected ranks"
R. Cheng, L. Chen, J. Chen, and X. Xie, "Evaluating probability threshold k-nn queries over uncertain data"
V. Ljosa and A. K. Singh, "APLA: Indexing arbitrary probability distributions"
8
Introduction Background information Problem Definition Handling Technique Algorithm Experimental Results
9
(Figure: uncertain objects U_1, U_2, U_3, U_4 around the query q)
Set of objects: U = {U_1, U_2, …, U_n}, e.g. U = {U_1, U_2, U_3, U_4}
Possible World: W = {u_1, u_2, u_3, …, u_n}; W_1 = {U1, U2}
Definition 1: Rank (rank of an object U in one possible world W)
10
Definition 2: Expected Rank
Definition 3: Median Rank
11
Example (shown on board):
Possible worlds?
Rank for A? i.e. r(a_1), r(a_2), r(a_3)
Expected rank for A? i.e. er(A)
Median rank for A? i.e. mr(A)
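The board example can be mimicked with a small brute-force sketch. It is not taken from the slides: the data, the function names, and the exact definitions are assumptions. Here a possible world picks one instance per object, the rank of an object in a world is the number of other objects strictly closer to q, the expected rank is the probability-weighted average of that rank, and the median rank is the smallest rank at which the cumulative probability reaches 0.5.

```python
from itertools import product

def possible_worlds(objects):
    """Yield (world, probability); a world picks one instance per object."""
    names = list(objects)
    for combo in product(*(objects[n] for n in names)):
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield {n: d for n, (d, _) in zip(names, combo)}, prob

def rank_distribution(objects, target):
    """Map each possible rank of `target` to its total probability."""
    dist = {}
    for world, prob in possible_worlds(objects):
        # Assumed rank definition: number of other objects strictly closer to q.
        rank = sum(1 for n, d in world.items()
                   if n != target and d < world[target])
        dist[rank] = dist.get(rank, 0.0) + prob
    return dist

def expected_rank(objects, target):
    return sum(r * p for r, p in rank_distribution(objects, target).items())

def median_rank(objects, target):
    """Smallest rank whose cumulative probability reaches 0.5 (assumed definition)."""
    dist = rank_distribution(objects, target)
    acc = 0.0
    for r, p in sorted(dist.items()):
        acc += p
        if acc >= 0.5:
            return r
    return max(dist)

# Hypothetical data: each instance is (distance to q, probability).
objects = {
    "A": [(1.0, 0.4), (4.0, 0.6)],
    "B": [(2.0, 0.5), (3.0, 0.5)],
    "C": [(5.0, 1.0)],
}

er_a = expected_rank(objects, "A")
mr_a = median_rank(objects, "A")
```

Enumerating all worlds is exponential in the number of objects, so this only serves to check small examples by hand.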
12
Top-K Query: Find the k nearest neighbors of a given query q, ranking the n objects by their expected (or median) ranks.
13
Top-K Properties:
Exact-k: A KNN query answer should return exactly k objects
Containment: (K+1)-NN should contain all objects in KNN
Unique Ranking: The same object should not be listed multiple times in KNN
Value invariance: The distance only determines the relative behavior of the object
Stability: Making an item in the top-k list more likely or more important should not remove it from the list
14
Top-K Properties: The proof that expected rank satisfies all five top-k properties is not this paper's major concern. It is given in the paper "Semantics of ranking queries for probabilistic data and expected ranks" by G. Cormode, F. Li, and K. Yi.
15
Overcoming the previous paper's difficulties:
Pre-computed expected scores of objects
Reduce the number of objects accessed
Expected score might change upon different queries
Approximation of the KNN query answer
16
Introduction Background information Problem Definition Handling Technique Algorithm Experimental Results
17
Lemma 2: Let u_i and u_j be the instances that determine the median rank and the median distance of U, respectively. Then r(u_i) = r(u_j).
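A small numeric check of the lemma, as a sketch under assumed definitions that the slide does not spell out: the rank r(u) of an instance is taken to be the expected number of other objects closer to q than u, and a weighted median is the first value at which the running probability mass reaches 0.5. Under these assumptions r(u) never decreases as the distance grows, which is why the two medians coincide. All names and data below are hypothetical.

```python
def instance_rank(dist, others):
    """Expected number of other objects closer to q than an instance at `dist`."""
    return sum(p for obj in others for d, p in obj if d < dist)

def weighted_median(pairs):
    """pairs: (value, probability). First value where the running mass reaches 0.5."""
    acc = 0.0
    for value, p in sorted(pairs):
        acc += p
        if acc >= 0.5:
            return value
    return max(v for v, _ in pairs)

# Hypothetical object U and two other objects; instances are (distance, probability).
U = [(1.0, 0.2), (3.0, 0.5), (6.0, 0.3)]
others = [[(2.0, 0.5), (5.0, 0.5)], [(4.0, 1.0)]]

median_distance = weighted_median(U)
rank_at_median_distance = instance_rank(median_distance, others)
median_rank = weighted_median([(instance_rank(d, others), p) for d, p in U])
```

The practical reading is that the median rank can be obtained by locating the instance at the median distance and ranking only that instance.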
18
Motivation for the Algorithm: Finding Minimal Set for Selection Problem (Using Bound-Based Approach)
19
Introduction Background information Problem Definition Handling Technique Algorithm Experimental Results
20
Notation | Meaning
U | Uncertain object
q | Query
d^-(U), d^+(U) | Minimal/maximal possible distance between instances of U and q
I | Interval within which each instance has a possible distance from q between [d^-(I), d^+(I)]
r^-(I), r^+(I) | Minimal/maximal rank of an interval I
er^-(U), er^+(U) | Minimal/maximal expected rank of U
mr^-(U), mr^+(U) | Minimal/maximal median rank of U
U_rmin | mr^-(U) or er^-(U)
U_rmax | mr^+(U) or er^+(U)
21
Uncertain objects are stored in an R-Tree; the query q is also represented in an R-Tree. e(I) extends from d^-(I) to d^+(I).
22
Example of calculating r^-(I) and r^+(I) (figure labels: "smaller than"; sum up for r^-(I))
23
Example of calculating r^-(I) and r^+(I) (figure labels: "smaller than"; sum up for r^+(I))
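The two examples can be illustrated with a minimal sketch. The exact comparison rules are assumptions, since the figures are not part of the transcript: for r^-(I) the sketch sums the probabilities of other objects' intervals J that are certainly no farther than I (d^+(J) <= d^-(I)), and for r^+(I) it sums those that could be closer (d^-(J) < d^+(I)). The function name and data are hypothetical.

```python
def rank_bounds(interval, other_intervals):
    """Return (r_minus, r_plus) for `interval` = (d_minus, d_plus).

    other_intervals: (d_minus, d_plus, probability) triples that belong
    to the other objects.
    """
    lo, hi = interval
    # Lower bound: mass that is certainly not farther than this interval (assumed rule).
    r_minus = sum(p for d_lo, d_hi, p in other_intervals if d_hi <= lo)
    # Upper bound: mass that could possibly be closer than this interval (assumed rule).
    r_plus = sum(p for d_lo, d_hi, p in other_intervals if d_lo < hi)
    return r_minus, r_plus

# Hypothetical intervals of two other objects (probabilities sum to 1 per object).
other_intervals = [
    (1.0, 2.0, 0.5), (2.5, 4.0, 0.5),   # object V
    (4.5, 6.0, 0.7), (6.0, 8.0, 0.3),   # object W
]

r_minus, r_plus = rank_bounds((3.0, 5.0), other_intervals)
```

Here only the first interval of V lies entirely before I, so it alone contributes to the lower bound, while the three intervals that start before d^+(I) contribute to the upper bound.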
24
Exact Algorithm:
acc_rmin(d): accumulation of the probability values of the intervals {I ∈ 𝓘} with d^+(I) <= d
U.a_rmin(d): accumulation of the probability values of the intervals {I ∈ 𝓘_U} with d^+(I) <= d
25
Exact Algorithm: Cost
Initial procedure: O(n log n + n_p0 × c_io)
One round: O(n × m log(n × m))
Total time cost: T = O(h × n × m log(n × m)) + Σ_{i=0..h} n_pi × c_io
where
n: number of objects
m: number of intervals in one object
h: max height of local R-Tree
n_p0: number of I/Os in the initial procedure
n_pi: number of I/Os in the i-th round
c_io: cost of each I/O
26
Randomized Algorithm: Sample possible worlds so that the expected rank and the median rank can be approximated efficiently.
27
Randomized Algorithm: Estimate the expected rank of an object U, where r_i(U) is the rank of U in sample S_i. Recall:
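A minimal sampling sketch, assuming the estimate is the plain average of r_i(U) over N sampled possible worlds, and that the rank of U in a sample is the number of other objects whose drawn instance is strictly closer to q. The data, function names, sample count, and seed are hypothetical.

```python
import random

def sample_rank(objects, target, rng):
    """Draw one instance per object and return the rank of `target` in that sample."""
    drawn = {}
    for name, instances in objects.items():
        dists = [d for d, _ in instances]
        probs = [p for _, p in instances]
        drawn[name] = rng.choices(dists, weights=probs, k=1)[0]
    return sum(1 for n, d in drawn.items()
               if n != target and d < drawn[target])

def estimate_expected_rank(objects, target, num_samples, seed=0):
    """Average the sampled ranks r_1(U), ..., r_N(U)."""
    rng = random.Random(seed)
    total = sum(sample_rank(objects, target, rng) for _ in range(num_samples))
    return total / num_samples

# Hypothetical data: each instance is (distance to q, probability).
objects = {
    "A": [(1.0, 0.4), (4.0, 0.6)],
    "B": [(2.0, 0.5), (3.0, 0.5)],
    "C": [(5.0, 1.0)],
}

estimate = estimate_expected_rank(objects, "A", num_samples=20000)
```

For this data the exact expected rank of A is 0.6 (A has rank 1 exactly when its farther instance is drawn), so the estimate should land close to that value, with the error shrinking roughly as one over the square root of the number of samples.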
28
Randomized Algorithm:
Find candidate objects C for the KNN query based on the global R-Tree
Compute the minimal/maximal expected rank for each object using a sweepline algorithm
l and r: values used to prune or validate objects for the KNN query
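The prune-or-validate idea can be sketched generically from rank bounds. This is not the paper's sweepline procedure and does not reproduce its l and r values; it is a simple quadratic illustration in which an object is pruned once at least k others certainly rank before it, and validated once fewer than k others could possibly rank before or level with it. All names and numbers are hypothetical.

```python
def prune_and_validate(bounds, k):
    """bounds: name -> (rmin, rmax). Returns (validated, pruned, undecided)."""
    validated, pruned, undecided = [], [], []
    for name, (rmin, rmax) in bounds.items():
        others = [b for o, b in bounds.items() if o != name]
        # Objects whose largest possible rank is still below this object's smallest.
        surely_better = sum(1 for _, o_max in others if o_max < rmin)
        # Objects whose smallest possible rank does not exceed this object's largest.
        possibly_better = sum(1 for o_min, _ in others if o_min <= rmax)
        if surely_better >= k:
            pruned.append(name)        # cannot be in the top-k
        elif possibly_better < k:
            validated.append(name)     # must be in the top-k
        else:
            undecided.append(name)     # bounds need tightening
    return validated, pruned, undecided

# Hypothetical expected-rank bounds for four candidate objects.
bounds = {
    "A": (0.1, 0.4),
    "B": (0.5, 0.9),
    "C": (0.8, 1.5),
    "D": (2.0, 2.5),
}

validated, pruned, undecided = prune_and_validate(bounds, k=2)
```

With k = 2, A is validated, D is pruned, and B and C stay undecided because their bounds overlap, so only those two would need tighter bounds.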
29
Randomized Algorithm - Cost:
O(n log n)
O(log n)
O(n’ log n + n_1 × c_io)
T = O(n log n + n’ log n + n_1 × c_io)
30
What is n’?
31
Introduction Background information Problem Definition Handling Technique Algorithm Experimental Results
36
Comparison with the other paper: This paper | The other paper