Recent Advances in Compact Hashing for Large-Scale Visual Search. Shih-Fu Chang, Columbia University, October 2012. Joint work with Junfeng He (Facebook), Sanjiv Kumar (Google), Wei Liu (IBM Research), and Jun Wang (IBM Research).
Outline
– Lessons learned in designing hashing functions
– The importance of balancing hash bucket size
– How to incorporate supervised information
– Prediction of NN search difficulty & hashing performance
– Demo: Bag of Hash Bits for mobile visual search
Fast Nearest Neighbor Search
– Applications: image search, texture synthesis, denoising, …
– Avoid exhaustive search (O(n) time complexity)
– Example applications: dense matching / coherence-sensitive hashing (Korman & Avidan '11), Photo Tourism patch search, image search
Locality-Sensitive Hashing [Indyk & Motwani 1998] [Datar et al. 2004]
– Hash code collision probability is proportional to the original similarity
– l: # hash tables, K: hash bits per table
– Hash functions: random projections; index by compact code
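For concreteness, a minimal sketch of the random-projection (hyperplane) LSH family; the dimensions, bit count, and helper names are illustrative, not from the talk:

```python
# Minimal random-projection (hyperplane) LSH sketch. The dimensionality,
# bit count, and names are illustrative, not from the talk.
import numpy as np

def make_hyperplanes(dim, n_bits, seed=0):
    """One random hyperplane per hash bit."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, n_bits))

def hash_codes(X, W):
    """Each point's code = sign pattern of its random projections."""
    return (X @ W > 0).astype(np.uint8)

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 128))   # toy database of SIFT-like vectors
W = make_hyperplanes(128, 32)
codes = hash_codes(X, W)               # (1000, 32): one 32-bit code per point
# Nearby points (small angle) agree on most bits with high probability,
# so collision probability tracks the original (cosine) similarity.
```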
Hash Table based Search
– O(1) search time by table lookup: the code serves as the hash bucket address
– Bucket size is important (affects accuracy & post-processing cost)
Different Approaches
– Unsupervised hashing: LSH '98, SH '08, KLSH '09, AGH '10, PCAH, ITQ '11
– Semi-supervised hashing: SSH '10, WeaklySH '10
– Supervised hashing: RBM '09, BRE '10, MLH '11, LDAH '11, ITQ '11, KSH '12
PCA + Minimize Quantization Errors (ITQ method, Gong & Lazebnik, CVPR '11)
– PCA to maximize variance in each hash dimension
– Find the optimal rotation in the subspace to minimize quantization error
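A compact sketch of the ITQ alternation, assuming the data has already been zero-centered and projected onto the top PCA directions; the iteration count and random-rotation initialization follow the common recipe, but details here are illustrative:

```python
# A compact ITQ sketch: alternate between binarizing the rotated PCA
# projections and solving an orthogonal Procrustes problem for the rotation.
import numpy as np

def itq(V, n_iter=50, seed=0):
    """V: n x c zero-centered PCA projections; returns codes and rotation."""
    rng = np.random.default_rng(seed)
    R, _ = np.linalg.qr(rng.standard_normal((V.shape[1], V.shape[1])))  # random rotation
    for _ in range(n_iter):
        B = np.sign(V @ R)                 # fix R: quantize to +/-1 codes
        U, _, Wt = np.linalg.svd(V.T @ B)  # fix B: Procrustes for best rotation
        R = U @ Wt
    return np.sign(V @ R), R
```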
Effects of Minimizing Quantization Errors
Figure: code visualizations on 580K tiny images, PCA with random rotation vs. PCA-ITQ optimal alignment (Gong & Lazebnik, CVPR '11).
Utilize Supervised Labels
Two forms of supervision: semantic category supervision and metric supervision (similar / dissimilar pairs).
Design Hash Codes to Match Supervised Information
The preferred hashing function assigns similar pairs the same bit (0/1) and dissimilar pairs different bits.
Adding Supervised Labels to PCA Hash (Wang, Kumar & Chang, CVPR '10, ICML '10)
– Relaxation: a label-fitting term plus the PCA covariance term gives an "adjusted" covariance matrix M = X S Xᵀ + η X Xᵀ, where S encodes similar (+1) and dissimilar (−1) pairs
– Solution W: top eigenvectors of the adjusted covariance matrix
– If there is no supervision (S = 0), it reduces to PCA hash
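A hedged sketch of this solution, assuming the adjusted covariance takes the form M = X S Xᵀ + η X Xᵀ as described above; η and the dense S are illustrative (in practice S is sparse and built only over labeled pairs):

```python
# Hedged sketch of the SSH solution via the adjusted covariance
# M = X S X^T + eta * X X^T. eta and the dense S are illustrative.
import numpy as np

def ssh_projections(X, S, n_bits, eta=1.0):
    """X: d x n zero-centered data; S: n x n with +1 (similar) / -1 (dissimilar)."""
    M = X @ S @ X.T + eta * (X @ X.T)     # label-fitting term + variance term
    vals, vecs = np.linalg.eigh(M)        # M is symmetric when S is
    W = vecs[:, np.argsort(vals)[::-1][:n_bits]]   # top eigenvectors
    return W                              # hash codes: sign(W.T @ X)
```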
Semi-Supervised Hashing (SSH)
Figure: precision of the top-1K results on 1 million GIST images (1% labeled, 99% unlabeled), comparing SSH against supervised RBM, random LSH, and unsupervised SH.
Problem of Orthogonal Projections
– Many buckets become empty as # bits increases
– Need to search many neighboring buckets (e.g., within Hamming radius 2) at query time
SPICA Hash (He et al., CVPR '11): explicitly optimize two terms
– Preserve similarity (search accuracy)
– Balance bucket sizes (search time): maximize the entropy of each bit, minimize the mutual information I between bits
ICA-type hashing: uses Fast ICA to find non-orthogonal projections
The Importance of Balanced Size
Figure: bucket size vs. bucket index for LSH and SPICA Hash, simulated over 1M tiny-image samples; the largest LSH bucket contains 10% of all 1M samples, while SPICA Hash keeps bucket sizes balanced. A toy version of this measurement is sketched below.
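A quick way to reproduce this kind of diagnostic on any dataset is to hash the data and histogram the bucket occupancy; a minimal sketch with random-projection LSH on synthetic correlated data (sizes and bit counts are illustrative, not the talk's setup):

```python
# Toy reproduction of the bucket-balance diagnostic: hash correlated data
# with random-projection LSH and histogram bucket occupancy.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
X = rng.standard_normal((100_000, 64)) @ rng.standard_normal((64, 64))  # correlated data
W = rng.standard_normal((64, 16))
bits = (X @ W > 0).astype(np.int64)
bucket_ids = bits @ (1 << np.arange(16))          # pack 16 bits into a bucket id

sizes = sorted(Counter(bucket_ids.tolist()).values(), reverse=True)
print(f"buckets used: {len(sizes)} of {2**16}")
print(f"largest bucket: {100 * sizes[0] / len(X):.1f}% of the data")
```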
Different Approaches (recap): turning now to supervised hashing — RBM '09, BRE '10, MLH '11, LDAH '11, ITQ '11, KSH '12.
Better Ways to Handle Supervised Information?
– BRE [Kulis & Darrell '10] and MLH [Norouzi & Fleet '11] fit the Hamming distance between H(x_i) and H(x_j) to the labels (MLH via a hinge loss)
– But optimizing Hamming distance (D_H, an XOR-based quantity) is not easy!
A New Supervision Form: Code Inner Products (Liu, Wang, Ji, Jiang & Chang, CVPR '12)
– Supervised hashing from labeled data: fit the code inner products of the r-bit code matrix to the pairwise label matrix S (similar / dissimilar), i.e., B Bᵀ ≈ r S for codes in {−1, +1}^r
– Proof sketch: code inner product ≡ Hamming distance, since ⟨H(x_i), H(x_j)⟩ = r − 2 D_H(H(x_i), H(x_j))
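The identity is easy to verify numerically; a self-contained check (sizes illustrative):

```python
# Numeric check of the identity: for codes in {-1,+1}^r,
#   <b_i, b_j> = r - 2 * D_H(b_i, b_j).
import numpy as np

rng = np.random.default_rng(0)
r = 48
B = rng.choice([-1, 1], size=(5, r))               # five random 48-bit codes

inner = B @ B.T                                    # code inner products
hamming = (B[:, None, :] != B[None, :, :]).sum(-1) # pairwise Hamming distances
assert np.array_equal(inner, r - 2 * hamming)      # identity holds exactly
```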
Code Inner Product Enables Efficient Optimization (Liu, Wang, Ji, Jiang & Chang, CVPR 2012)
– Much easier / faster to optimize than Hamming distance, and extends to kernels
– Hashing goal restated: design hash codes whose inner products match the supervised information
Extend Code Inner Product to Kernels
Following KLSH, construct each hash function from a kernel function and m anchor samples:
h(x) = sgn( Σ_{j=1..m} a_j k̄(x_j, x) )
where a holds the hash coefficients and k̄ is the kernel with zero-mean normalization applied to k(x) over the l training samples (an l × m kernel matrix).
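A sketch of this kernel hash function, assuming an RBF kernel; the anchors, bandwidth, and the coefficient vector `a` (which KSH learns by fitting code inner products) are placeholders here:

```python
# Sketch of the kernel hash function h(x) = sgn(a . kbar(x)) with an RBF
# kernel. Anchors, bandwidth, and the coefficients `a` are placeholders.
import numpy as np

def kernel_features(X, anchors, gamma, mean=None):
    """n x m RBF kernel map to the anchors, optionally zero-mean normalized."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)
    return K if mean is None else K - mean

rng = np.random.default_rng(0)
Xtrain = rng.standard_normal((1000, 32))                  # l = 1000 samples
anchors = Xtrain[rng.choice(1000, 300, replace=False)]    # m = 300 anchors
mu = kernel_features(Xtrain, anchors, 0.1).mean(axis=0)   # zero-mean normalization

a = rng.standard_normal(300)                              # one bit's coefficients
bit = np.sign(kernel_features(Xtrain[:5], anchors, 0.1, mu) @ a)  # hash bit for 5 samples
```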
Benefits of Code Inner Product
Figure: retrieval results on CIFAR-10 (60K object images from 10 classes, 1K query images, 1K supervised labels), comparing supervised methods; KSH_0 uses spectral relaxation, KSH uses a sigmoid hashing function. Open issue: empty buckets and bucket balance are not addressed.
Speedup by Code Inner Product (CVPR 2012)
Table: training and test times at 48 bits for SSH, LDAH, BRE, MLH, KSH_0, and KSH; per-query test times are on the order of ×10⁻⁵ seconds. Code inner product yields a significant speedup.
Effect of Training Data Size (figure).
Tiny-1M (CVPR 2012)
– 1M tiny images from the web, 2K query images
– Pseudo labels: top 5% L2 neighbors; the top-5K NNs are used as supervised labels
Figure: comparison of supervised methods.
Tiny-1M: Visual Search Results (CVPR 2012)
Figure: example retrievals; the results are more visually relevant.
Comparison: KD-Tree
– O(log n) search time (e.g., tree depth ≈ 20 for 1 million nodes)
– E.g., VLFeat / FLANN tools, best-bin-first search strategy
– Curse of dimensionality: needs backtracking
– Tree indexes may be hard to store on small devices
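For reference, an exact KD-tree query via SciPy's cKDTree (sizes illustrative); in high dimensions the backtracking cost grows toward that of exhaustive search, which is part of the motivation for hashing:

```python
# Exact KD-tree query with SciPy's cKDTree. At low dimension this is fast;
# in high dimensions backtracking pushes the cost toward exhaustive search.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X = rng.random((100_000, 16))                 # 100K points, 16-D
tree = cKDTree(X)                             # depth ~ log2(n)
dist, idx = tree.query(rng.random((5, 16)), k=10)   # exact 10-NN for 5 queries
```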
Comparison of Hashing vs. KD-Tree
Figure: supervised hashing vs. Anchor Graph Hashing vs. KD-tree on the Photo Tourism patch set (Notre Dame subset, 103K samples), 512-D GIST features.
Understanding the Difficulty of Approximate Nearest Neighbor Search (He, Kumar & Chang, ICML 2012)
– How difficult is approximate NN search in a given dataset?
– x is an ε-approximate NN of query q if d(q, x) ≤ (1 + ε) d(q, x_nn)
– Toy example: if all points lie nearly the same distance from q, search is not meaningful!
– Goal: a concrete measure of the difficulty of search in a dataset
Relative Contrast (He, Kumar & Chang, ICML 2012)
– A naïve search approach: randomly pick a point and compare it to the true NN
– Relative contrast C_r = D_mean / D_min: the expected distance from q to a random database point over the distance to its nearest neighbor
– High relative contrast ⇒ easier search; if C_r ≈ 1, search is not meaningful
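Relative contrast is straightforward to estimate empirically; a minimal sketch, with data, dimensionality, and the L2 metric chosen for illustration:

```python
# Direct empirical estimate of relative contrast C_r = D_mean / D_min,
# averaged over queries.
import numpy as np

def relative_contrast(X, queries):
    ratios = []
    for q in queries:
        d = np.linalg.norm(X - q, axis=1)     # distances to all database points
        ratios.append(d.mean() / d.min())     # random-point vs. true-NN distance
    return float(np.mean(ratios))

rng = np.random.default_rng(0)
print(relative_contrast(rng.random((10_000, 10)), rng.random((20, 10))))     # low-D: C_r well above 1
print(relative_contrast(rng.random((10_000, 1000)), rng.random((20, 1000)))) # high-D: C_r near 1
```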
Estimation of Relative Contrast
With the CLT and a binomial approximation, C_r can be estimated in closed form from: φ, the standard Gaussian CDF; σ′, a function of data properties (dimensionality and sparsity); n, the data size; and p, the L_p distance exponent.
Synthetic Data (sampled randomly from U[0,1])
Figure: relative contrast vs. dimensionality and sparsity; higher dimensionality is bad, sparser vectors are good (s: probability of a non-zero element in each dimension; d: feature dimension).
Synthetic Data (sampled randomly from U[0,1])
Figure: relative contrast vs. p and database size; lower p is good, a larger database is good.
Predict Hashing Performance on Real-World Data
Table: dimensionality (d), sparsity (s), and relative contrast (C_r, for p = 1) of SIFT, GIST, Color Histogram, and ImageNet BoW features; higher relative contrast predicts better 16-bit LSH performance.
Mobile Search System by Hashing
Requirements: light computing, low bit rate, big-data indexing.
He, Feng, Liu, Cheng, Lin, Chung, Chang. Mobile Product Search with Bag of Hash Bits and Boundary Reranking. CVPR 2012.
Estimate the Complexity
– 500 local features per image: feature size ~128 KB, i.e., more than 10 seconds for transmission over 3G
– Database indexing: 1 million images yield 0.5 billion local features, so finding matched features becomes challenging
– Idea: directly compute compact hash codes on mobile devices
Approach: Hashing
– Each local feature is coded as hash bits: locality-sensitive, efficient in high dimensions
– Each image is represented as a Bag of Hash Bits
Bit Reuse for Multi-Table Hashing
– To reduce transmission size, reuse a single hash-bit pool by random subsampling
– Compute one optimal hash-bit pool (e.g., 80 bits, PCA Hash or SPICA Hash); each table (Table 1, Table 2, …, Table 11) indexes a random subset of the pool's bits, and search results are the union over tables (sketched below)
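A minimal sketch of the bit-reuse idea, using the 80-bit pool and 11 tables mentioned on the slide; the subset size (24 bits per table here) is an assumption for illustration:

```python
# Sketch of bit reuse: the phone sends one 80-bit pool code per feature;
# the server builds several tables, each keyed on a random subset of pool
# bits, and unions the candidates. 24 bits per table is an assumption.
import numpy as np

rng = np.random.default_rng(0)
POOL_BITS, N_TABLES, TABLE_BITS = 80, 11, 24
subsets = [rng.choice(POOL_BITS, TABLE_BITS, replace=False) for _ in range(N_TABLES)]

def table_keys(pool_code):
    """pool_code: length-80 0/1 array (the only thing transmitted)."""
    return [tuple(pool_code[s]) for s in subsets]   # one lookup key per table

feature_code = rng.integers(0, 2, POOL_BITS)        # one local feature's pool code
keys = table_keys(feature_code)   # probe all 11 tables, union the matched candidates
```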
Rerank Results with Boundary Features
– Use automatic salient-object segmentation for every image in the DB [Cheng et al., CVPR 2011]
– Compute boundary features: normalized central distance, Fourier magnitude
– Invariance: translation, scaling, rotation
Boundary Feature – Central Distance
Figure: sample the contour, compute the distance to center D(n), and take its FFT F(n); the normalized magnitudes form the descriptor.
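A minimal sketch of the central-distance descriptor, assuming an ordered contour from the segmentation step; the sample and coefficient counts are illustrative:

```python
# Sketch of the central-distance boundary descriptor.
import numpy as np

def boundary_descriptor(contour, n_samples=128, n_coeffs=32):
    """contour: (m, 2) boundary points ordered along the object outline."""
    idx = np.linspace(0, len(contour) - 1, n_samples).astype(int)
    pts = contour[idx]
    center = pts.mean(axis=0)                  # subtracting center: translation invariance
    D = np.linalg.norm(pts - center, axis=1)   # central-distance signal D(n)
    F = np.abs(np.fft.fft(D))                  # magnitude only: rotation / start-point invariance
    return F[1:n_coeffs + 1] / F[0]            # divide by F(0): scale invariance
```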
Reranking with Boundary Features (figure).
Mobile Product Search System: Bag of Hash Bits and Boundary Features (video demo, 52″)
He, Feng, Liu, Cheng, Lin, Chung, Chang. Mobile Product Search with Bag of Hash Bits and Boundary Reranking. CVPR 2012.
– Server: 1 million product images crawled from Amazon, eBay, and Zappos; hundreds of categories: shoes, clothes, electrical devices, groceries, kitchen supplies, movies, etc.
– Speed: feature extraction ~1 s; transmission 80 bits/feature, ~1 KB/image; server search ~0.4 s; download / display 1–2 s
Performance
– Baseline [Chandrasekhar et al., CVPR '10]: client compresses local features with CHoG; server uses BoW with a vocabulary tree (1M codes)
– The hash-based system achieves 30% higher recall and a 6×–30× search speedup over this baseline
Summary
Some ideas discussed:
– Bucket balancing is important
– Code inner product: an efficient form of supervised hashing
– Insights on predicting search difficulty
– Large-scale mobile search: a good test case for hashing
Open issues:
– Supervised hashing vs. attribute discovery
– Hashing beyond point-to-point search
– Hashing that incorporates structured relations (spatio-temporal)
References
– (Supervised Kernel Hash) W. Liu, J. Wang, R. Ji, Y. Jiang, and S.-F. Chang. Supervised Hashing with Kernels. CVPR 2012.
– (Difficulty of Nearest Neighbor Search) J. He, S. Kumar, and S.-F. Chang. On the Difficulty of Nearest Neighbor Search. ICML 2012.
– (Hash-Based Mobile Product Search) J. He, T. Lin, J. Feng, X. Liu, and S.-F. Chang. Mobile Product Search with Bag of Hash Bits and Boundary Reranking. CVPR 2012.
– (Hashing with Graphs) W. Liu, J. Wang, S. Kumar, and S.-F. Chang. Hashing with Graphs. ICML 2011.
– (Iterative Quantization) Y. Gong and S. Lazebnik. Iterative Quantization: A Procrustean Approach to Learning Binary Codes. CVPR 2011.
– (Semi-Supervised Hash) J. Wang, S. Kumar, and S.-F. Chang. Semi-Supervised Hashing for Scalable Image Retrieval. CVPR 2010.
– (ICA Hashing) J. He, R. Radhakrishnan, S.-F. Chang, and C. Bauer. Compact Hashing with Joint Optimization of Search Accuracy and Time. CVPR 2011.