Spectral Hashing
Y. Weiss (Hebrew U.), A. Torralba (MIT), Rob Fergus (NYU)

Motivation
- What does the world look like? High-level image statistics
- Object recognition for large-scale search
- And also the relationships between objects and the scene in general

Semantic Hashing [Salakhutdinov & Hinton, 2007]
A semantic hash function maps the query image to a binary code that is used directly as an address; semantically similar images in the database lie at nearby addresses. Quite different from a (conventional) randomizing hash.

1. Locality Sensitive Hashing [Gionis, Indyk & Motwani, 1999]
- Take random projections of the data (e.g. the Gist descriptor)
- Quantize each projection with a few bits
- No learning involved
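A minimal NumPy sketch of this scheme (the descriptor dimensionality and the sign quantization to one bit per projection are illustrative choices):

```python
import numpy as np

def lsh_codes(X, n_bits, rng=np.random.default_rng(0)):
    """Locality sensitive hashing via random projections.

    X : (n, d) array of descriptors (e.g. 512-D Gist vectors).
    Each bit is the sign of one random projection -- no learning.
    """
    d = X.shape[1]
    R = rng.standard_normal((d, n_bits))   # random projection directions
    return (X @ R) > 0                     # quantize each projection to 1 bit
```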

Toy Example: 2D uniform distribution

2. Boosting
- Modified form of BoostSSC [Shakhnarovich, Viola & Darrell, 2003]
- Learn a threshold & dimension for each bit (weak classifier)
- Positive examples are pairs of similar images; negative examples are pairs of unrelated images
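A simplified sketch of the idea, assuming an AdaBoost-style pair reweighting (BoostSSC's actual update rules differ in the details):

```python
import numpy as np

def learn_boosted_bits(X, pos, neg, n_bits, n_thresh=16):
    """Sketch: learn one (dimension, threshold) stump per bit from pairs.

    X        : (n, d) descriptors.
    pos, neg : (m, 2) index arrays of similar / dissimilar image pairs.
    Each round picks the stump whose bit best separates the weighted
    pairs, then upweights the pairs it got wrong (simplified: a fixed
    step instead of the usual error-dependent alpha).
    """
    pairs = np.vstack([pos, neg])
    y = np.r_[np.ones(len(pos)), -np.ones(len(neg))]   # +1 similar, -1 not
    w = np.full(len(pairs), 1.0 / len(pairs))
    bits = []
    for _ in range(n_bits):
        best = None
        for dim in range(X.shape[1]):
            for t in np.quantile(X[:, dim], np.linspace(0.1, 0.9, n_thresh)):
                b = X[:, dim] > t
                # a pair "agrees" if both images fall on the same side
                agree = np.where(b[pairs[:, 0]] == b[pairs[:, 1]], 1, -1)
                err = w[agree != y].sum()
                if best is None or err < best[0]:
                    best = (err, dim, t, agree)
        err, dim, t, agree = best
        w *= np.exp((agree != y).astype(float))        # upweight mistakes
        w /= w.sum()
        bits.append((dim, t))
    return bits  # list of (dimension, threshold) weak classifiers
```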

Toy Example: 2D uniform distribution

3. Restricted Boltzmann Machine (RBM)
- Building block of a Deep Belief Network [Hinton & Salakhutdinov, Science 2006]
- A single RBM layer: visible units connected to hidden units through symmetric weights W
- Units are binary & stochastic
- Attempts to reconstruct the input at the visible layer from the activation of the hidden layer

Multi-Layer RBM: non-linear dimensionality reduction
Input: Gist vector (512 dimensions; linear units at the first layer)
→ Layer 1 (w1: 512 → 512) → Layer 2 (w2: 512 → 256) → Layer 3 (w3: 256 → N)
→ Output: binary code (N-dimensional)
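A sketch of the test-time encoder implied by this architecture; the weights are assumed to come from RBM pretraining plus fine-tuning, which is not shown:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(gist, weights, biases):
    """Forward pass through the trained 512-512-256-N encoder stack.

    gist    : (n, 512) array of Gist descriptors.
    weights : [w1 (512,512), w2 (512,256), w3 (256,N)], pretrained.
    The final layer's activations are thresholded to give binary codes.
    """
    h = gist                       # linear units feed the first layer
    for W, b in zip(weights, biases):
        h = sigmoid(h @ W + b)     # binary stochastic units, used
                                   # deterministically (mean-field) at test time
    return h > 0.5                 # threshold to get the N-bit code
```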

Toy Example: 2D uniform distribution

2-D toy example: codes with 3, 7, and 15 bits. Color shows Hamming distance from the query point: red = 0 bits, green = 1 bit, blue = 2 bits, black = >2 bits.

Toy results. Color shows Hamming distance: red = 0 bits, green = 1 bit, blue = 2 bits.

Spectral Hashing
Non-linear dimensionality reduction maps the query image's real-valued vector to a binary code that is used directly as an address; semantically similar images in the database lie at nearby addresses. Quite different from a (conventional) randomizing hash.

Spectral Hashing (NIPS '08)
- Assume points are embedded in Euclidean space
- How to binarize so that Hamming distance approximates Euclidean distance?
- Ham_Dist(10001010, 11101110) = 3
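For concreteness, Hamming distance is just XOR plus popcount, which is what makes retrieval with compact binary codes so fast:

```python
import numpy as np

# The slide's example: three bits differ between the two codes.
a = np.array([1, 0, 0, 0, 1, 0, 1, 0], dtype=bool)   # 10001010
b = np.array([1, 1, 1, 0, 1, 1, 1, 0], dtype=bool)   # 11101110
assert np.count_nonzero(a ^ b) == 3

# On packed integer codes the same distance is XOR + popcount:
dist = bin(0b10001010 ^ 0b11101110).count("1")        # also 3
```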

Spectral Hashing theory
Want to minimize $\mathrm{tr}\big(Y^\top (D - W)\,Y\big)$ subject to:
- $Y \in \{-1, 1\}^{n \times k}$ (binary codes)
- $Y^\top \mathbf{1} = 0$ (each bit is on 50% of the time)
- $Y^\top Y = I$ (the bits are independent)
Sadly, this is NP-complete. Relax the problem by letting $Y$ be continuous: it now becomes an eigenvector problem.
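To make the relaxation concrete, here is a sketch that solves the relaxed problem on the training points alone (it says nothing about novel points, which is the issue addressed next):

```python
import numpy as np

def relaxed_codes(W, k):
    """The continuous Y minimizing tr(Y^T (D - W) Y) s.t. Y^T Y = I is
    given by the eigenvectors of the graph Laplacian with the k smallest
    nonzero eigenvalues.

    W : (n, n) symmetric affinity matrix over the training points.
    """
    D = np.diag(W.sum(axis=1))
    L = D - W                                # graph Laplacian
    evals, evecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    Y = evecs[:, 1:k + 1]                    # skip the trivial constant vector
    return Y > 0                             # threshold to binarize
```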

Nyström Approximation
- Method for approximating eigenfunctions: interpolate between existing data points
- Requires evaluating the distance to existing data → cost grows linearly with the number of points
- Also overfits badly in practice
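For reference, a sketch of the Nyström extension of a training-set eigenvector to a novel point (normalization constants omitted), which makes the linear per-query cost visible:

```python
import numpy as np

def nystrom_extend(x_new, X_train, evecs, evals, kernel):
    """Nystrom extension: psi_j(x) = (1 / lambda_j) * sum_i K(x, x_i) * v_j(i).

    Requires kernel evaluations against *all* training points, hence the
    linear cost per query noted on the slide.
    """
    k = np.array([kernel(x_new, xi) for xi in X_train])   # (n,) affinities
    return (k @ evecs) / evals
```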

What about a novel data point?
- Need a function to map new points into the code space
- Take the limit of the eigenvectors as n → ∞; need to carefully normalize the graph Laplacian
- An analytical form of the eigenfunctions exists for certain distributions (uniform, Gaussian)
- Constant time to compute/evaluate for a new point
- For the uniform case: depends only on the extent of the distribution (b − a)

Eigenfunctions for uniform distribution
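For reference, the closed form given in the NIPS '08 paper (up to normalization) for a 1-D uniform distribution on $[a, b]$, with affinity kernel bandwidth $\varepsilon$:

$$\Phi_f(x) = \sin\!\left(\frac{\pi}{2} + \frac{f\pi}{b-a}\,x\right), \qquad \lambda_f = 1 - e^{-\frac{\varepsilon^2}{2}\left(\frac{f\pi}{b-a}\right)^2}, \qquad f = 1, 2, \ldots$$

Lower frequencies $f$ and longer extents $b-a$ give smaller eigenvalues, so the longest PCA directions are cut first.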

The Algorithm
Input: data {x_i} of dimensionality d; desired number of bits k
1. Fit a multidimensional rectangle to the data: run PCA to align the axes, then bound a uniform distribution
2. For each dimension, calculate the k smallest eigenfunctions, giving dk eigenfunctions in total
3. Pick the k eigenfunctions with the smallest eigenvalues
4. Threshold the chosen eigenfunctions at zero to give the binary codes
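A minimal NumPy sketch of these four steps (eps, the affinity kernel bandwidth, is an assumed hyperparameter):

```python
import numpy as np

def train_spectral_hash(X, k, eps=1.0):
    """Sketch of spectral hashing training.

    X : (n, d) array of descriptors;  k : desired number of bits.
    """
    # 1. Fit a multidimensional rectangle: PCA to align the axes, then
    #    bound a uniform distribution on each principal axis.
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    Z = (X - mu) @ Vt.T
    a, b = Z.min(axis=0), Z.max(axis=0)

    # 2. Closed-form eigenvalues for each (dimension j, frequency f) pair:
    #    lambda = 1 - exp(-eps^2/2 * (f*pi / (b_j - a_j))^2)
    f = np.arange(1, k + 1)
    omega = f[None, :] * np.pi / (b - a)[:, None]          # (d, k)
    lam = 1.0 - np.exp(-0.5 * (eps * omega) ** 2)

    # 3. Keep the k (dimension, frequency) pairs with smallest eigenvalues.
    dims, fs = np.unravel_index(np.argsort(lam, axis=None)[:k], lam.shape)

    # 4. Evaluate the chosen eigenfunctions and threshold at zero.
    Y = np.sin(np.pi / 2 + omega[dims, fs] * (Z[:, dims] - a[dims]))
    return Y > 0, (mu, Vt, a, dims, fs, omega[dims, fs])
```

A novel point is encoded by replaying the stored model: project with (mu, Vt), evaluate the same k eigenfunctions, threshold at zero.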

1. Fit Multidimensional Rectangle: run PCA to align the axes, then bound a uniform distribution

2. Calculate Eigenfunctions

3. Pick the k Eigenfunctions with Smallest Eigenvalues (e.g. k = 3)

4. Threshold chosen Eigenfunctions

Back to the 2-D toy example: codes with 3, 7, and 15 bits. Color shows Hamming distance: red = 0 bits, green = 1 bit, blue = 2 bits.

2-D Toy Example Comparison

10-D Toy Example

Experiments on Real Data

Input Image Representation: Gist Vectors
- Pixels are not a convenient representation
- Use the Gist descriptor instead (Oliva & Torralba, IJCV 2001)
- 512 dimensions/image (real-valued → 16,384 bits)
- L2 distance between Gist vectors is not a bad substitute for human perceptual distance
- No color information

LabelMe images
- 22,000 images (20,000 train | 2,000 test)
- Ground-truth segmentations for all
- Assume L2 Gist distance is the true distance

LabelMe data

Extensions

How to handle non-uniform distributions?

Bit allocation between dimensions: compare the value of cuts in the original space, i.e. before the point-wise non-linearity.

Summary: Spectral Hashing
- Simple way of computing good binary codes
- Forced to make a big assumption about the data distribution
- Use point-wise non-linearities to map the distribution to uniform
- Need more experiments on real data
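One illustrative way to realize that point-wise non-linearity (an assumption; the slides do not specify the authors' exact choice) is the empirical CDF of each PCA dimension:

```python
import numpy as np

def to_uniform(z):
    """Map a 1-D sample to ~uniform [0, 1] via its empirical CDF.

    Applying each dimension's own CDF is a point-wise non-linearity that
    makes that dimension approximately uniform, after which the uniform
    eigenfunctions above apply.
    """
    ranks = np.argsort(np.argsort(z))      # rank of each value
    return (ranks + 0.5) / len(z)
```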

Overview
- Assume points are embedded in Euclidean space (e.g. the output from an RBM)
- How to binarize the space so that the Hamming distance between points approximates L2 distance?

Semantic Hashing beyond 30 bits

Strategies for Binarization: deliberately add noise during backprop; this forces the activations to extreme values to overcome the noise.
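A sketch of this noise trick, with an assumed noise scale:

```python
import numpy as np

def noisy_binarizing_layer(h_pre, noise_std=4.0, rng=np.random.default_rng(0)):
    """Inject noise before the sigmoid during training.

    To keep the signal above the noise, training pushes the
    pre-activations to large magnitudes, which drives the sigmoid
    outputs toward crisp 0/1 values. noise_std is an assumed
    hyperparameter; at test time the noise is simply omitted.
    """
    noise = rng.standard_normal(h_pre.shape) * noise_std
    return 1.0 / (1.0 + np.exp(-(h_pre + noise)))
```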