Fast Similarity Search in Image Databases
CSE 6367 – Computer Vision
Vassilis Athitsos, University of Texas at Arlington

2 A Database of Hand Images
Images are generated for each hand shape. Total: 107,328 images.

3 Efficiency of the Chamfer Distance
Computing chamfer distances is slow:
–For images with d edge pixels, O(d log d) time.
–Comparing the input to the entire database takes over 4 minutes: 107,328 distances must be measured.
[Figure: an input edge image and a model edge image.]
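
As a concrete illustration of the distance being computed, here is a minimal Python sketch of the directed and symmetric chamfer distance between binary edge images; the use of scipy's distance transform and all function names are my own choices, not the course implementation.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def directed_chamfer(edges_a, edges_b):
    """Mean distance from each edge pixel of edges_a to the nearest
    edge pixel of edges_b (inputs are boolean edge masks)."""
    # distance_transform_edt gives, at every pixel, the distance to the
    # nearest zero pixel, so we pass the complement of the edge mask.
    dt_b = distance_transform_edt(~edges_b)
    return dt_b[edges_a].mean()

def chamfer(edges_a, edges_b):
    """Symmetric chamfer distance: sum of the two directed distances."""
    return directed_chamfer(edges_a, edges_b) + directed_chamfer(edges_b, edges_a)
```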

4 The Nearest Neighbor Problem
[Figure: a database of objects.]

5 The Nearest Neighbor Problem
Goal:
–find the k nearest neighbors of query q.
[Figure: a query object and the database.]

6–7 The Nearest Neighbor Problem
Goal:
–find the k nearest neighbors of query q.
Brute-force search time is linear in:
–n (the size of the database).
–the time it takes to measure a single distance.
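
To make that cost concrete, here is a minimal brute-force k-NN sketch under an arbitrary distance function (all names are illustrative): one distance call per database object, so the total time is n times the cost of a single distance computation.

```python
import heapq

def knn_brute_force(query, database, distance, k):
    # One distance evaluation per database object: O(n) distance calls.
    scored = ((distance(query, x), i) for i, x in enumerate(database))
    return heapq.nsmallest(k, scored)  # list of (distance, index) pairs
```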

8 Examples of Expensive Measures
DNA and protein sequences: Smith-Waterman.
Dynamic gestures and time series: Dynamic Time Warping.
Edge images: chamfer distance, shape context distance.
These measures are non-Euclidean, sometimes non-metric.

9 Embeddings
A database of objects x_1, x_2, …, x_n in the original space.

10 Embeddings
An embedding F maps each database object x_i to a vector F(x_i) in R^d.

11 Embeddings
A query q arrives in the original space.

12 Embeddings
The query is also mapped to a vector F(q) in R^d.

13 Embeddings
Measure distances between vectors (typically much faster).
Caveat: the embedding must preserve similarity structure.

14 Reference Object Embeddings
[Figure: the original space X.]

15 Reference Object Embeddings
r: a reference object in the original space X.

16 Reference Object Embeddings
r: reference object. Embedding: F(x) = D(x, r), where D is the distance measure in X.

17 Reference Object Embeddings
F maps the original space X to the real line:
F(x) = D(x, r), where D is the distance measure in X.

18 Reference Object Embeddings
F(x) = D(x, r) maps X to the real line.
Note that F(r) = D(r, r) = 0.

19–20 Reference Object Embeddings
F(x) = D(x, r) maps X to the real line.
If a and b are similar, their distances to r are also similar (usually).
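
A minimal sketch of this construction, assuming D is the (expensive) distance measure of the original space X:

```python
def make_1d_embedding(r, D):
    """F(x) = D(x, r): objects that are similar in X usually map to
    nearby values on the real line (though this can occasionally fail)."""
    return lambda x: D(x, r)
```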

21 F(x) = D(x, Lincoln)
F(Sacramento)....= 1543
F(Las Vegas).....= 1232
F(Oklahoma City).= 437
F(Washington DC).= 1207
F(Jacksonville)..= 1344

22–23 F(x) = (D(x, LA), D(x, Lincoln), D(x, Orlando))
F(Sacramento)....= ( 386, 1543, 2920)
F(Las Vegas).....= ( 262, 1232, 2405)
F(Oklahoma City).= (1345, 437, 1291)
F(Washington DC).= (2657, 1207, 853)
F(Jacksonville)..= (2422, 1344, 141)
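
To make the example executable, here is the same three-reference-object embedding as a small script, with the distance values copied from the slide; the L1 comparison is one reasonable choice of vector distance.

```python
# F(x) = (D(x, LA), D(x, Lincoln), D(x, Orlando)), values from the slide.
F = {
    "Sacramento":    ( 386, 1543, 2920),
    "Las Vegas":     ( 262, 1232, 2405),
    "Oklahoma City": (1345,  437, 1291),
    "Washington DC": (2657, 1207,  853),
    "Jacksonville":  (2422, 1344,  141),
}

def l1(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

# Nearest city to Las Vegas in the embedded space:
q = F["Las Vegas"]
print(min((l1(q, v), city) for city, v in F.items() if city != "Las Vegas"))
# -> (950, 'Sacramento'): consistent with geographic intuition.
```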

24 Embedding Hand Images
F(x) = (C(x, R_1), C(x, R_2), C(x, R_3))
x: hand image. R_1, R_2, R_3: reference hand images. C: chamfer distance.

25 Basic Questions
F(x) = (C(x, R_1), C(x, R_2), C(x, R_3))
x: hand image. C: chamfer distance.
How many prototypes?
Which prototypes?
What distance should we use to compare vectors?

26 Some Easy Answers
F(x) = (C(x, R_1), C(x, R_2), C(x, R_3))
x: hand image. C: chamfer distance.
How many prototypes? Pick a number manually.
Which prototypes? Randomly chosen.
What distance should we use to compare vectors? L_1, or Euclidean.

27 Filter-and-refine Retrieval
Embedding step: compute distances from the query to the reference objects, giving F(q).
Filter step: find the top p matches of F(q) in the vector space.
Refine step: measure the exact distance from q to each of the top p matches.
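
A minimal sketch of the three steps, assuming the database embeddings have been precomputed as an n-by-d array F_db, embed computes F(q) (d exact distances), and exact_distance is the expensive measure; all names are mine.

```python
import numpy as np

def filter_and_refine(q, database, F_db, embed, exact_distance, p, k=1):
    Fq = embed(q)                           # embedding step: d exact distances
    approx = np.abs(F_db - Fq).sum(axis=1)  # L1 distances in the vector space
    candidates = np.argsort(approx)[:p]     # filter step: keep top p matches
    refined = sorted((exact_distance(q, database[i]), i) for i in candidates)
    return refined[:k]                      # refine step: k best exact matches
```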

28–32 Evaluating Embedding Quality
Embedding step: compute distances from the query to the reference objects, giving F(q).
Filter step: find the top p matches of F(q) in the vector space.
Refine step: measure the exact distance from q to each of the top p matches.
Two questions measure embedding quality:
–How often do we find the true nearest neighbor?
–How many exact distance computations do we need?
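
Both quantities can be estimated on a held-out query set; a sketch reusing the filter_and_refine function above (with d reference objects, each query costs d + p exact distance computations):

```python
def evaluate(queries, database, F_db, embed, exact_distance, p, d):
    correct = 0
    for q in queries:
        # True nearest neighbor by exhaustive search (evaluation only).
        true_nn = min(range(len(database)),
                      key=lambda i: exact_distance(q, database[i]))
        _, found = filter_and_refine(q, database, F_db, embed,
                                     exact_distance, p, k=1)[0]
        correct += (found == true_nn)
    # (fraction of queries whose true NN is found, exact distances per query)
    return correct / len(queries), d + p
```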

33 Results: Chamfer Distance on Hand Images
Database: 107,328 images. Brute-force retrieval time: 260 seconds.
[Figure: a query hand image and its nearest neighbor in the database.]

34 Results: Chamfer Distance on Hand Images
Query set: 710 real images of hands. Database: 80,640 synthetic images of hands.
                   Brute Force   Embeddings   Embeddings
Accuracy           100%          95%          100%
# of distances
Sec. per query
Speed-up factor

35 Ideal Embedding Behavior
F maps the original space X to R^d.
Notation: NN(q) is the nearest neighbor of q.
For any q: if a = NN(q), we want F(a) = NN(F(q)).

36 A Quantitative Measure
If b is not the nearest neighbor of q, F(q) should be closer to F(NN(q)) than to F(b).
For how many triples (q, NN(q), b) does F fail?

37 A Quantitative Measure
[Figure: an example embedding F from X to R^d; here F fails on five triples.]

38 Embeddings Seen As Classifiers
Classification task: is q closer to a or to b?

39 Embeddings Seen As Classifiers
Classification task: is q closer to a or to b?
Any embedding F defines a classifier F'(q, a, b): F' checks whether F(q) is closer to F(a) or to F(b).

40 Classifier Definition
Classification task: is q closer to a or to b?
Given an embedding F: X → R^d, define:
F'(q, a, b) = ||F(q) - F(b)|| - ||F(q) - F(a)||.
F'(q, a, b) > 0 means "q is closer to a."
F'(q, a, b) < 0 means "q is closer to b."

41 Classifier Definition
Given an embedding F: X → R^d, define:
F'(q, a, b) = ||F(q) - F(b)|| - ||F(q) - F(a)||.
F'(q, a, b) > 0 means "q is closer to a."
F'(q, a, b) < 0 means "q is closer to b."
Goal: build an F such that F' has a low error rate on triples of the form (q, NN(q), b).
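
In code, the induced classifier is just a difference of two vector distances; a minimal sketch, using the Euclidean norm for the ||.|| of the slide:

```python
import numpy as np

def triple_classifier(F, q, a, b):
    """Positive output: q judged closer to a; negative: closer to b."""
    Fq, Fa, Fb = F(q), F(a), F(b)
    return np.linalg.norm(Fq - Fb) - np.linalg.norm(Fq - Fa)
```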

42–44 1D Embeddings as Weak Classifiers
1D embeddings define weak classifiers: better than a random classifier (50% error rate).
We can define lots of different classifiers; every object in the database can be a reference object.
Question: how do we combine many such classifiers into a single strong classifier?
Answer: use AdaBoost, a machine learning method designed for exactly this problem.

45 Using AdaBoost
AdaBoost chooses 1D embeddings F_1, F_2, …, F_n (each mapping the original space X to the real line) and weighs them.
Output: H = w_1 F'_1 + w_2 F'_2 + … + w_d F'_d.
Goal: achieve low classification error.
AdaBoost trains on triples chosen from the database.
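
Below is a highly simplified sketch of this training loop, not the full BoostMap algorithm. Each candidate weak classifier is a 1D reference-object embedding F_j(x) = D(x, r_j); each training triple (q, a, b) has a = NN(q), so the correct label is always "q is closer to a"; AdaBoost greedily picks the lowest-error classifier and reweights the triples.

```python
import numpy as np

def train_boostmap_sketch(triples, refs, D, rounds):
    """triples: list of (q, a, b) with a = NN(q); refs: candidate
    reference objects. Returns chosen reference indices and weights."""
    w = np.ones(len(triples)) / len(triples)   # AdaBoost weights on triples
    chosen, alphas = [], []
    for _ in range(rounds):
        best = None
        for j, r in enumerate(refs):
            # Margin of the weak classifier F'_j on each triple, where
            # F_j(x) = D(x, r). (In practice these are precomputed.)
            margins = np.array([abs(D(q, r) - D(b, r)) - abs(D(q, r) - D(a, r))
                                for q, a, b in triples])
            err = w[margins <= 0].sum()        # weighted error rate
            if best is None or err < best[0]:
                best = (err, j, np.sign(margins))
        err, j, preds = best
        err = float(np.clip(err, 1e-10, 1 - 1e-10))
        alpha = 0.5 * np.log((1 - err) / err)  # standard AdaBoost weight
        w = w * np.exp(-alpha * preds)         # upweight misclassified triples
        w /= w.sum()
        chosen.append(j)
        alphas.append(alpha)
    return chosen, alphas
```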

46–49 From Classifier to Embedding
AdaBoost output: H = w_1 F'_1 + w_2 F'_2 + … + w_d F'_d.
What embedding should we use? What distance measure should we use?
BoostMap embedding: F(x) = (F_1(x), …, F_d(x)).
Distance measure: D((u_1, …, u_d), (v_1, …, v_d)) = Σ_{i=1}^{d} w_i |u_i - v_i|.
Claim: Let q be closer to a than to b. H misclassifies the triple (q, a, b) if and only if, under distance measure D, F maps q closer to b than to a.
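
Continuing the sketch above, the trained classifier converts directly into an embedding and a weighted L1 distance:

```python
import numpy as np

def boostmap_embed(x, refs, chosen, D):
    """One coordinate per chosen reference: F(x) = (F_1(x), ..., F_d(x))."""
    return np.array([D(x, refs[j]) for j in chosen])

def weighted_l1(u, v, alphas):
    """D((u_1,...,u_d), (v_1,...,v_d)) = sum_i w_i |u_i - v_i|."""
    return float(np.sum(np.asarray(alphas) * np.abs(u - v)))
```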

50–55 Proof
H(q, a, b) = Σ_{i=1}^{d} w_i F'_i(q, a, b)
           = Σ_{i=1}^{d} w_i (|F_i(q) - F_i(b)| - |F_i(q) - F_i(a)|)
           = Σ_{i=1}^{d} (w_i |F_i(q) - F_i(b)| - w_i |F_i(q) - F_i(a)|)
           = D(F(q), F(b)) - D(F(q), F(a))
           = F'(q, a, b).
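
The identity is easy to check numerically; a quick sketch with random 1D embedding values:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
w = rng.random(d)                              # weights w_i
Fq, Fa, Fb = rng.random(d), rng.random(d), rng.random(d)

# Left side: sum of weighted weak-classifier outputs H(q, a, b).
H = np.sum(w * (np.abs(Fq - Fb) - np.abs(Fq - Fa)))
# Right side: D(F(q), F(b)) - D(F(q), F(a)) under the weighted L1 distance.
Fp = np.sum(w * np.abs(Fq - Fb)) - np.sum(w * np.abs(Fq - Fa))
assert np.isclose(H, Fp)
```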

56 Significance of Proof
AdaBoost optimizes a direct measure of embedding quality.
We have converted a database indexing problem into a machine learning problem.

57 Results: Chamfer Distance on Hand Images
Database: 80,640 images. Brute-force retrieval time: 112 seconds.
[Figure: a query hand image and its nearest neighbor in the database.]

58 Results: Chamfer Distance on Hand Images
Query set: 710 real images of hands. Database: 80,640 synthetic images of hands.
                   Brute Force   Random Reference Objects   BoostMap
Accuracy           100%          95%                        95%
# of distances
Sec. per query
Speed-up factor                  143                        179

59 Results: Chamfer Distance on Hand Images
Query set: 710 real images of hands. Database: 80,640 synthetic images of hands.
                   Brute Force   Random Reference Objects   BoostMap
Accuracy           100%          100%                       100%
# of distances
Sec. per query
Speed-up factor