A Statistical Approach to Speed Up Ranking/Re-Ranking Hong-Ming Chen Advisor: Professor Shih-Fu Chang.


Outline
– Flow chart of the overall work
– The idea of using a statistical approach to do re-ranking
  – By the feature location relationship: O(n^2) time complexity
  – By the orientation relationship: O(n) time complexity
– The re-ranking accuracy is as good as RANSAC
– Experimental result evaluation

Flow Chart 1 – ranking component construction
Dataset: Ukbench [1]
Code book: hierarchical k-means [1][2]
Bag-of-Words histograms of the database images
Query image → Bag-of-Words histogram of the query image → respond with the top-N results
[1] D. Nistér and H. Stewénius. Scalable recognition with a vocabulary tree. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, June.

Flow Chart 2 – re-ranking component construction
Top-N result → re-rank by RANSAC [3] / re-rank by the proposed statistical approach → result evaluation
[3] Peter Kovesi, Centre for Exploration Targeting, School of Earth and Environment, The University of Western Australia

1. Feature Location Relationship
SIFT features [4] are:
– Invariant to translation, rotation and scaling
– Partially invariant to local geometric distortion
For an ideal similar image pair:
– Only translation, rotation and scaling
– The ratio of corresponding distance pairs should be constant
[4] David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004)
[Figure: matched points P1a, P2a in image A and P1b, P2b in image B, with dist1 connecting P1a–P2a and dist2 connecting P1b–P2b]

1. Feature Location Relationship
SIFT features [4] are:
– Invariant to translation, rotation and scaling
– Partially invariant to local geometric distortion
For a similar image pair with a view-angle difference:
– Translation, rotation and scaling
– Local geometric distortion, plus some wrong feature-point matches
– The ratio of corresponding distance pairs is nearly constant
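The constant-ratio test can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the point coordinates and the perfect 0.85 scaling are made-up illustration data, and real inputs would come from a SIFT matcher.

```python
import itertools
import statistics

def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def scaling_ratios(pts_a, pts_b):
    """Ratio of corresponding distances for every pair of match points.

    pts_a[i] in image A matches pts_b[i] in image B; iterating over all
    C(n, 2) pairs of matches is what makes this criterion O(n^2)."""
    ratios = []
    for (p1a, p1b), (p2a, p2b) in itertools.combinations(zip(pts_a, pts_b), 2):
        d1 = distance(p1a, p2a)  # distance inside image A
        d2 = distance(p1b, p2b)  # corresponding distance inside image B
        if d1 > 0:
            ratios.append(d2 / d1)
    return ratios

# Ideal similar pair: image B is image A scaled by 0.85.
pts_a = [(0.0, 0.0), (10.0, 0.0), (0.0, 20.0), (7.0, 5.0)]
pts_b = [(0.85 * x, 0.85 * y) for x, y in pts_a]
ratios = scaling_ratios(pts_a, pts_b)
print(round(statistics.mean(ratios), 2))    # mean ~ the scaling factor (0.85)
print(statistics.pvariance(ratios) < 1e-9)  # near-zero variance: similar pair
```

For a dissimilar pair the ratios scatter, so the variance of this distribution is the statistic that separates the two cases.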

Example: ukbench00000 and ukbench00001
Total number of match points: 554
Mean of the scaling distribution = 0.85 – the mean reflects the scaling factor
Variance – reflects the matching error; the smaller, the better

1. Feature Location Relationship
Assumption after observation:
– A similar image pair yields a distribution with small variance
– A dissimilar image pair yields a distribution with large variance

Analysis of the feature location relationship
Relationship between the number of match pairs and the average variance for similar vs. dissimilar image pairs
[Plot: red = dissimilar image pairs, blue = similar image pairs]

2. Feature Orientation Relationship
SIFT features [4] are:
– Invariant to translation, rotation and scaling
– Partially invariant to local geometric distortion
For similar image pairs:
– The rotation degree of P1a → P1b should be EQUAL to the rotation degree of P2a → P2b

Example: ukbench00000 and ukbench00001
The rotation degree is about 50°; the orientation histograms are shifted by about π/4
Distance measured by histogram intersection

2. Feature Orientation Relationship
Assumption after observation:
– A similar image pair yields a small orientation-histogram distance
– A dissimilar image pair yields a large orientation-histogram distance
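One way to turn this criterion into code is sketched below. Assumptions to flag: the angles are per-match rotation differences, the jittered 50° example data is invented, and the "ideal" reference histogram for a similar pair (all mass in one bin, so intersection against it reduces to one minus the peak fraction) is one concrete reading of the histogram-intersection distance named in the slides.

```python
import math
import random

def rotation_histogram(theta_a, theta_b, bins=36):
    """Histogram of per-match rotation angles (theta_b - theta_a);
    built in O(n) for n matches."""
    hist = [0] * bins
    for a, b in zip(theta_a, theta_b):
        d = (b - a) % (2 * math.pi)
        hist[min(int(d / (2 * math.pi) * bins), bins - 1)] += 1
    return hist

def peak_intersection_distance(hist):
    """Histogram intersection against the ideal histogram of a purely
    rotated pair (all mass in one bin) reduces to 1 - peak fraction."""
    return 1.0 - max(hist) / sum(hist)

random.seed(0)
# Similar pair: every match rotated by about 50 degrees (small jitter).
theta_a = [random.uniform(0, 2 * math.pi) for _ in range(200)]
theta_b_sim = [t + math.radians(50) + random.gauss(0, 0.01) for t in theta_a]
# Dissimilar pair: the rotation angles are unrelated.
theta_b_dis = [random.uniform(0, 2 * math.pi) for _ in theta_a]

d_sim = peak_intersection_distance(rotation_histogram(theta_a, theta_b_sim))
d_dis = peak_intersection_distance(rotation_histogram(theta_a, theta_b_dis))
print(d_sim < d_dis)  # the similar pair has the smaller distance
```

Because each match contributes one angle and one histogram increment, the whole statistic is O(n), matching the complexity claimed in the outline.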

Analysis of the feature orientation relationship
Relationship between the number of match pairs and the average orientation-intersection difference for similar vs. dissimilar image pairs
[Plot: red = dissimilar image pairs, blue = similar image pairs]

Why zoom in on the small-match-number portion of the diagrams?

Dataset and features discussion
Ukbench dataset analysis:
– 2550 classes, 4 images/class
– Similar image pair combinations: C(4, 2) × 2550 = 15300 pairs
A high percentage of similar image pairs have only a small number of match points (with the default ratio value = 0.6).
The re-ranking criterion should therefore perform well especially when only a small number of match points is available.
[Table: number of match points vs. accumulated count and percentage of similar image pairs]

Comparison of the two re-ranking approaches
[Table: for each match-point count range (and overall), the mean and variance of (a) the variance of the scaling distribution and (b) the orientation-histogram difference, for similar and dissimilar image pairs]

Comparison of the two re-ranking approaches
The variance of the scaling-distribution variance is high, even though its mean is quite distinctive.

Comparison of the two re-ranking approaches
The variance of the orientation-histogram difference is very small (relative to its mean value) and stable.

Comparison of the two re-ranking approaches
Overall, the orientation-histogram difference can clearly separate similar from dissimilar image pairs, because of the large gap between its mean values and its quite small variance.

Comparison of the two re-ranking approaches
When there are more than 5 match points, the orientation-histogram difference can roughly separate similar and dissimilar image pairs.

Comparison of the two re-ranking approaches
When there are more than 10 match points, the orientation-histogram difference can clearly separate similar and dissimilar image pairs.

Experimental results discussion 1 – the impact of K values (number of cluster centers)
[Table and plot: Recall = 1 (33%), Recall = 2 (66%), Recall = 3 (100%) for K = 1000, 4096, 10000, 50625, 100000]

Experimental results discussion 2 – the impact of looking up the code book in different ways:
– A. By tracing the vocabulary tree [1]: efficient, but the result is not optimal
– B. By scanning the whole code book: very slow, but guarantees an optimal BoW assignment with respect to the K centers
[Table and plot: Recall = 1 (33%), 2 (66%), 3 (100%) for K = 1000 and K = 10000, decoded by tree vs. decoded directly]
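The trade-off between the two lookup strategies can be illustrated with a toy one-dimensional "vocabulary tree". All numbers here are hypothetical; a real code book holds 128-dimensional SIFT centers, but the failure mode is the same: greedy descent can commit to the wrong branch near a boundary.

```python
def nearest(centers, x):
    """Index of the center closest to x (1-D for simplicity)."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - x))

# Toy 2-level tree: two branch centers, each owning two leaf code words.
branch_centers = [0.0, 10.0]
leaves = [[-2.0, 6.0], [8.0, 14.0]]   # leaves[b] belongs to branch b
all_leaves = leaves[0] + leaves[1]    # the flat code book of K = 4 words

x = 6.5  # a feature near the boundary between the two branches

# A. Trace the tree: pick the best branch, then the best leaf under it.
b = nearest(branch_centers, x)               # picks branch 1 (center 10.0)
by_tree = leaves[b][nearest(leaves[b], x)]

# B. Scan the whole code book: guaranteed optimal, but O(K) per feature.
by_scan = all_leaves[nearest(all_leaves, x)]

print(by_tree)  # 8.0 -- greedy descent committed to the wrong branch
print(by_scan)  # 6.0 -- the true nearest code word
```

The tree touches only one branch's leaves (logarithmic work), which is why it scales to large K; the flat scan pays for optimality with a full pass over all K centers.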

K = 1000, re-rank depth = 20
[Table and plot: ranking results for Ground truth, Rotation, Scale var + rotation, RANSAC, Scale var, and Original]

K = 50625, re-rank depth = 20
[Table and plot: ranking results for Ground truth, Rotation, Scale var + rotation, RANSAC, Scale var, and Original]

Experimental result – all methods, re-rank depth = 20
[Tables: Recall = 1 (33%), 2 (66%), 3 (100%) for the distribution criterion and the rotation criterion at K = 1000, 4096, 10000, 50625, and for the original ranking at K = 1000 through 100000]

[Tables: Recall = 1 (33%), 2 (66%), 3 (100%) at K = 1000, 4096, 10000, 50625 for RANSAC, Ground Truth, and Dist+Ro re-ranking]

Time Complexity Analysis
RANSAC: O(Kn)
– K: number of random subsets tried
– n: input data size
– No upper bound on the time it takes to compute the parameters
Distribution of the feature-location distance relationship:
– O(n^2): the distribution consists of all pairwise distance relationships
– O(n): when n (the number of match points) is large enough, we can subsample a "reliable enough" number of samples to form the distribution
Distance of orientation histograms of matched SIFT features:
– O(n) to generate the rotation-angle histograms of the matched SIFT features
– Constant time to compute each rotation angle
– Only a little overhead on top of the match-point search
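The subsampled O(n) variant of the location criterion could look like the sketch below. The fixed sample budget of 500 pairs is an assumption standing in for "reliable enough", and the 0.85-scaled point set is invented illustration data.

```python
import math
import random

def sampled_scaling_ratios(pts_a, pts_b, n_samples=500, seed=0):
    """Estimate the scaling distribution from a fixed budget of random
    match pairs instead of all C(n, 2) pairs, bringing the O(n^2)
    criterion down to effectively linear cost in the match count."""
    rng = random.Random(seed)
    n = len(pts_a)
    ratios = []
    while len(ratios) < n_samples:
        i, j = rng.randrange(n), rng.randrange(n)
        if i == j:
            continue  # need two distinct match points
        d_a = math.dist(pts_a[i], pts_a[j])
        d_b = math.dist(pts_b[i], pts_b[j])
        if d_a > 0:
            ratios.append(d_b / d_a)
    return ratios

# An ideal 0.85-scaled image pair with many match points.
rng = random.Random(1)
pts_a = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(1000)]
pts_b = [(0.85 * x, 0.85 * y) for x, y in pts_a]
ratios = sampled_scaling_ratios(pts_a, pts_b)
mean = sum(ratios) / len(ratios)
print(round(mean, 2))  # ~0.85 from only 500 sampled pairs, not all C(1000, 2)
```

With 1000 matches, the full distribution would need 499500 distance pairs; the sample recovers the same mean from 500.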

Future work
We have:
– 1. Scale information
– 2. Orientation information
– 3. Translation, which is trivial to find
– Together, a good initial guess for precise homography-matrix estimation?
Apply the current approach to quantized SIFT features:
– Use a code word to represent an interest point, rather than the full 128-dimensional vector
– This moves from an exact 1-1 mapping to a many-to-many mapping
– I have tried to solve this problem, but there are no satisfying results at this stage.
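The "good initial guess" idea can be sketched as assembling a 2-D similarity transform from the three pieces the criteria already estimate: scale from the mean distance ratio, rotation from the orientation-histogram peak, and translation from the matched points. This is a hypothetical sketch of the future-work direction, not part of the presented system, and the 0.85 / 50° values are illustration data.

```python
import math

def similarity_transform(scale, theta, tx, ty):
    """2x3 similarity matrix built from scale, rotation and translation --
    a cheap initial guess to seed precise homography estimation."""
    c = scale * math.cos(theta)
    s = scale * math.sin(theta)
    return [[c, -s, tx],
            [s,  c, ty]]

def apply(T, p):
    """Map a 2-D point through the 2x3 transform."""
    x, y = p
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])

# Scale 0.85 (mean distance ratio) and rotation 50 degrees (histogram
# peak); translation left at zero for the illustration.
T = similarity_transform(0.85, math.radians(50), 0.0, 0.0)
p_b = apply(T, (10.0, 0.0))  # predicted location in image B of a point in A
print(round(p_b[0], 3), round(p_b[1], 3))
```

A full homography has 8 degrees of freedom; seeding an iterative estimator with this 4-parameter guess restricts the search compared with starting from random correspondences as RANSAC does.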

Reference
[1] D. Nistér and H. Stewénius. Scalable recognition with a vocabulary tree. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, June.
[2]
[3] Peter Kovesi, Centre for Exploration Targeting, School of Earth and Environment, The University of Western Australia.
[4] David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004).