Junqi Zhang+ Xiangdong Zhou+ Wei Wang+ Baile Shi+ Jian Pei*

Using High Dimensional Indexes to Support Relevance Feedback Based Interactive Images Retrieval
Junqi Zhang+ Xiangdong Zhou+ Wei Wang+ Baile Shi+ Jian Pei*
+Fudan University, China *Simon Fraser University, Canada

Motivation
The K-means clustering approach has been widely used to improve the performance of high dimensional indexes. However, some problems still need further discussion, such as how to preset the query radius and the number K of K-means clusters. In this demo system, we present a new cluster splitting based B+-tree index to deal with these problems, and the index has been applied to support relevance feedback for content-based image retrieval.

Index Structure

Background
The central idea of iDistance is to cluster the objects and choose a reference point for each cluster. The distance between an object and the reference point of the cluster to which the object belongs can then be indexed in a B+-tree. It has been widely observed that in high dimensional real data spaces, the majority of clusters intersect each other. Therefore, a query region often covers many clusters, which lowers query efficiency. To improve query performance, the iDistance search algorithm starts with a small preset search radius and enlarges it gradually if necessary.

Experiment results

Challenges
However, in prior work, the initial query radius and the enlarging step must be preset by experiment or from the user’s experience. There is no theory to guide the estimation of these parameters.

Demo system
1. Based on the query cost model of metric space, we developed formulas to compute the “optimal” cluster splitting number M.
2. During interactive relevance feedback, the query distance is updated using the users’ feedback, and the index distance is guaranteed to be a lower bound of the query distance. Thus, the index structure does not need to be changed.
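The iDistance mapping described in the background can be illustrated with a minimal Python sketch. It is not the authors' implementation: a sorted list searched with `bisect` stands in for the B+-tree, the key layout `cluster_id * c + distance` follows the usual iDistance convention, and the names `IDistanceIndex`, `range_query`, and the constant `c` (assumed larger than any cluster radius) are hypothetical.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class IDistanceIndex:
    """One-dimensional index over distances to cluster reference points."""

    def __init__(self, points, refs, c=1000.0):
        # Assumption: c exceeds every cluster radius, so key ranges of
        # different clusters never overlap.
        self.points, self.refs, self.c = points, refs, c
        self.radii = [0.0] * len(refs)
        entries = []
        for pid, p in enumerate(points):
            # Assign each point to its nearest reference point.
            i = min(range(len(refs)), key=lambda j: dist(p, refs[j]))
            d = dist(p, refs[i])
            self.radii[i] = max(self.radii[i], d)
            entries.append((i * c + d, pid))
        entries.sort()  # sorted keys play the role of B+-tree leaves
        self.keys = [k for k, _ in entries]
        self.ids = [pid for _, pid in entries]

    def range_query(self, q, r):
        """Return ids of all points within distance r of q."""
        result = []
        for i, ref in enumerate(self.refs):
            dq = dist(q, ref)
            if dq - r > self.radii[i]:
                continue  # query sphere does not reach this cluster
            # Triangle inequality: candidates lie in this key interval.
            lo = i * self.c + max(0.0, dq - r)
            hi = i * self.c + min(self.radii[i], dq + r)
            start = bisect.bisect_left(self.keys, lo)
            end = bisect.bisect_right(self.keys, hi)
            for pid in self.ids[start:end]:
                if dist(q, self.points[pid]) <= r:  # refine candidates
                    result.append(pid)
        return sorted(result)
```

Each cluster that the query sphere touches costs one key-range scan, which is why heavily intersecting clusters, as noted in the background, make queries expensive.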
Notation: Nc: the number K of K-means clusters; N: size of the dataset; H: height of internal node; u: fanout of a node

Approach
We present a new cluster splitting based B+-tree index to deal with the above problems:
1. The optimal KNN search algorithm is adopted to avoid selecting an initial query radius.
2. Through cluster splitting, the data space is partitioned more finely to reduce the intersection between the query region and the data clusters.
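How a KNN search can run without a preset radius is sketched below in Python. This is an assumption about the general technique (a best-first search that visits clusters in order of a lower-bound distance), not the authors' algorithm; the function name `knn_best_first` and the cluster representation are hypothetical.

```python
import heapq
import math


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_best_first(points, clusters, q, k):
    """Return (ids of the k nearest points, number of clusters visited).

    clusters is a list of (reference_point, member_ids) pairs.
    """
    queue = []
    for ci, (ref, members) in enumerate(clusters):
        if not members:
            continue
        radius = max(dist(points[m], ref) for m in members)
        # No point in the cluster can be closer to q than this bound.
        lower_bound = max(0.0, dist(q, ref) - radius)
        queue.append((lower_bound, ci))
    heapq.heapify(queue)

    best = []  # max-heap of (-distance, id) holding the k best so far
    visited = 0
    while queue:
        lower_bound, ci = heapq.heappop(queue)
        # Stop once no remaining cluster can improve the k-th neighbor.
        if len(best) == k and lower_bound >= -best[0][0]:
            break
        visited += 1
        for pid in clusters[ci][1]:
            d = dist(q, points[pid])
            if len(best) < k:
                heapq.heappush(best, (-d, pid))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, pid))
    ordered = sorted((-neg_d, pid) for neg_d, pid in best)
    return [pid for _, pid in ordered], visited
```

In this sketch, smaller cluster radii give tighter lower bounds, so more clusters are pruned before they are scanned; this is one way to read the benefit of partitioning the data space more finely.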

