Unsupervised Learning of Visual Object Categories Michael Pfeiffer

References: M. Weber¹, M. Welling¹, P. Perona¹ (2000): Unsupervised Learning of Models for Recognition. L. Fei-Fei¹, R. Fergus², P. Perona¹ (2003): A Bayesian Approach to Unsupervised One-Shot Learning of Object Categories. (¹ Caltech, ² Oxford)

Question Which image shows a face / car / …?

Visual Object Category Recognition: Easy for humans, very difficult for machines. Large variability within one category.

To make it even more difficult: Unsupervised. No help from a human supervisor, no labelling, no segmentation, no alignment.

Topics Constellation Model Feature Selection Model Learning (EM Algorithm) Results, Comparison One-shot Category Learning

Constellation Model (Burl et al.): An object is a random constellation of parts, described by its shape/geometry and the appearance of its parts. An object class is a joint pdf over shape and appearance.

Strength of the Constellation Model: Can model classes with strict geometric rules (e.g. faces), and also classes where appearance is the main criterion (e.g. spotted cats). (Slide contrasts likely vs. unlikely example constellations.)

Recognition: Detect parts in the image, form likely hypotheses, calculate the category likelihood. Training: Decide on the key parts of the object, select those parts in the training images, estimate the joint pdf.

Object Model Object is a collection of parts Parts in an image come from –Foreground (target object) –Background (clutter or false detections) Information about parts: –Location –Part type

Probabilistic Model. X^o: "matrix" of positions of detected parts in one image (observable). x^m: positions of unobserved parts (hidden). h: hypothesis of which parts of X^o belong to the foreground (hidden). n: number of background candidates (dependent on h). b: which parts were detected (dependent on h). Since n and b are determined by h: p(X^o, x^m, h) = p(X^o, x^m, h, n, b).

Bayesian Decomposition: p(X^o, x^m, h, n, b) = p(X^o, x^m | h, n) · p(h | n, b) · p(n) · p(b). We assume independence between foreground and background (hence the separate factors p(n) and p(b)).

Models of PDF factors (1): p(n): number of background part-detections. M_t: average number of background (bg) detections of type t per image. Ideas: independence between bg parts; bg parts can arise at every position with the same probability ⇒ Poisson distribution.
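
Spelled out, the Poisson assumption reads as follows (standard Poisson pmf with mean M_t; the product over part types is my reading of the independence assumption, with T introduced here as notation for the number of types):

```latex
p(n_t) = \frac{M_t^{\,n_t}}{n_t!}\, e^{-M_t},
\qquad
p(n) = \prod_{t=1}^{T} p(n_t)
```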

Models of PDF factors (2): p(b): 2^F values for F features (b encodes which parts have been detected). Explicit table of the 2^F joint probabilities. If F is large: F independent probabilities instead. Drawback: no modelling of simultaneous occlusions.
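
In the independent-feature case this presumably reduces to a product of Bernoulli terms (my expansion, not shown on the slide; p_f denotes the detection probability of feature f):

```latex
p(b) = \prod_{f=1}^{F} p_f^{\,b_f} \, (1 - p_f)^{1 - b_f}
```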

Models of PDF factors (3): p(h | n, b): how likely is a hypothesis h for given n and b? Since n and b are dependent on h ⇒ uniform distribution over all consistent hypotheses, 0 for inconsistent ones.

Models of PDF factors (4): p(X^o, x^m | h, n) = p_fg(z) · p_bg(x^bg). z = (x^o, x^m): coordinates of the observed and missing foreground detections. x^bg: coordinates of all background detections. Assumption: foreground detections are independent of the background.

Models of PDF factors (5): p_fg(z): foreground positions, a joint Gaussian with mean μ and covariance matrix Σ. Translation invariant: part positions are described relative to one reference part.
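
For reference, the multivariate Gaussian density the slide names (this is the standard form, not reproduced on the slide; d is the dimension of z):

```latex
p_{\mathrm{fg}}(z) = \mathcal{N}(z \mid \mu, \Sigma)
 = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}
   \exp\!\left( -\tfrac{1}{2}\,(z-\mu)^{\top} \Sigma^{-1} (z-\mu) \right)
```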

Models of PDF factors (6): p_bg: positions of all background detections, a uniform distribution over the whole image of area A.
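
With n_bg background detections, each independent and uniform over the image area A, this presumably amounts to (my expansion of the slide's statement):

```latex
p_{\mathrm{bg}}(x^{\mathrm{bg}}) = \left(\frac{1}{A}\right)^{n_{\mathrm{bg}}}
```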

Recognition: Decide between object present (class C_1) and object absent (class C_2). Choose the class with the highest a posteriori probability given the observed X^o. h_0: null hypothesis, everything is bg noise. Localization is also possible!
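
A sketch of the implied decision rule (my reconstruction from the slide's ingredients; the original formula was lost in transcription): marginalize over hypotheses and declare the object present when the posterior ratio exceeds 1,

```latex
R = \frac{p(C_1 \mid X^o)}{p(C_2 \mid X^o)}
  \approx \frac{\sum_{h} p(X^o, h \mid C_1)\, p(C_1)}
               {p(X^o, h_0 \mid C_2)\, p(C_2)} \; > \; 1
```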

Topics Constellation Model Feature Selection Model Learning (EM Algorithm) Results, Comparison One-shot Category Learning

Part selection: Selecting the parts that make up the model is closely related to finding parts for recognition. Step 1: finding points of interest. Step 2: vector quantization.

Interest Operators: Förstner operator, Kadir-Brady operator. Well-known results from computer vision. Detect corner points, intersections of lines, centers of circular patterns. Return ~150 parts per image, some of which may come from the background.

Vector Quantization (1): Very many candidate parts from the 100 training images (~150 per image, see above). k-means clustering of the image patches ⇒ ~100 patterns. A pattern is the average of all patches in its cluster.

Vector Quantization (2): Remove clusters with < 10 patches (the pattern does not appear in a significant number of training images). Remove patterns that are similar to others after a 1-2 pixel shift. Calculate the PCA of each image patch (using a precalculated PCA basis).
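
A minimal Python sketch of this vector-quantization step (my illustration, not the authors' code: the cluster count and the <10 filter follow the slides, scikit-learn's KMeans stands in for whatever implementation was actually used, and the input file name is hypothetical):

```python
import numpy as np
from sklearn.cluster import KMeans

# patches: one flattened grayscale patch per detected interest point,
# collected over all training images (shape: n_patches x patch_dim).
patches = np.load("patches.npy")  # hypothetical input file

# Cluster the ~15,000 patches (~150 per image x 100 images) into
# ~100 patterns, as on the slide.
kmeans = KMeans(n_clusters=100, n_init=10, random_state=0).fit(patches)

# Each pattern is the average of all patches in its cluster,
# i.e. exactly the k-means centroid.
patterns = kmeans.cluster_centers_

# Filter: drop clusters with fewer than 10 patches -- such a pattern
# does not appear in a significant number of training images.
counts = np.bincount(kmeans.labels_, minlength=100)
patterns = patterns[counts >= 10]
print(f"{len(patterns)} patterns kept")
```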

Result of Vector Quantization: Faces: eyes, hairline, and mouth can be recognized. Cars: high-pass filtered; corners and lines result from huge clusters.

Topics Constellation Model Feature Selection Model Learning (EM Algorithm) Results, Comparison One-shot Category Learning

Two steps of Model Learning: Model configuration: how many parts make up the model? Greedy search: add one part and check whether it improves the model. Estimate the hidden model parameters: EM algorithm.

The EM Algorithm (1): Expectation Maximization. Finds a maximum-likelihood hypothesis for incomplete-data problems. Likelihood: L(θ | X) = p(X | θ). Find the most likely parameter vector θ for the (complete) observation X. What if X = (O, H) and only O is known?

The EM Algorithm (2): p(O, H | θ) = p(H | O, θ) · p(O | θ). The likelihood L(θ | O, H) = p(O, H | θ) is a function of the random variable H. Define Q(θ | θ^(i-1)) = E_H[ ln p(O, H | θ) | O, θ^(i-1) ]: the conditional expectation of the log-likelihood, with O and θ^(i-1) treated as constants.

The EM Algorithm (3): E-step: calculate Q(θ | θ^(i-1)) using the current hypothesis θ^(i-1) and the observation O to model the distribution of H. M-step: find the parameter vector θ^i that maximizes Q(θ | θ^(i-1)). Repeat until convergence. Guaranteed to converge to a local maximum of the likelihood.
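
To make the E-step/M-step alternation concrete, here is a self-contained toy run of EM on a 1-D two-component Gaussian mixture. This is my illustration of the algorithmic skeleton only, not the constellation model's EM (whose hidden variables are the hypotheses h and whose parameters are listed on the next slide):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data from two Gaussians (means -2 and 3).
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 300)])

mu = np.array([-1.0, 1.0])   # initial means
var = np.array([1.0, 1.0])   # initial variances
pi = np.array([0.5, 0.5])    # initial mixing weights

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

for it in range(100):
    # E-step: posterior responsibility of each component for each point,
    # i.e. the expected value of the hidden assignment variable H.
    r = pi * gauss(x[:, None], mu, var)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: closed-form maximizers of Q, the expected log-likelihood.
    n_k = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k
    pi = n_k / len(x)

print(mu, var, pi)  # means should approach -2 and 3
```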

Hidden Parameters for This Model: μ: mean of the foreground part coordinates. Σ: covariance matrix of the foreground detection coordinates. p(b): occlusion statistics (table). M: number of background detections. Observation: X^o_i, the coordinates of the detections in image i.

Log-Likelihood Maximization: Use the earlier decomposition of the probabilistic model into 4 parts, and decompose Q into 4 corresponding parts. For every hidden parameter, only one part depends on it: maximize only that one! Easy derivation of the update rules (M-step): set the derivative w.r.t. the hidden parameter to zero and solve for the maximum; the needed statistics are calculated in the E-step. Not shown here in detail.
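
As one hedged example of what such an update rule looks like (my reconstruction in the earlier notation, not shown on the slides): the new mean is the average of the expected foreground coordinates over the I training images, with expectations taken under the E-step posterior over hypotheses,

```latex
\mu^{\,i} = \frac{1}{I} \sum_{j=1}^{I}
  E\!\left[ z_j \;\middle|\; X^o_j,\, \theta^{\,i-1} \right]
```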

Topics Constellation Model Feature Selection Model Learning (EM Algorithm) Results, Comparison One-shot Category Learning

Experiments (1): Two test sets: faces, and rear views of cars. 200 images showing the target, 200 background images. Random split into training and test sets.

Experiments (2): Measure of success: ROC (Receiver Operating Characteristic). X-axis: false positives / total negatives. Y-axis: true positives / total positives. Area under the curve: a larger area means a smaller classification error (good recall, good precision).
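
A hedged sketch (my illustration, not from the talk) of computing the ROC curve and its area from classifier scores, using exactly the axes defined on the slide:

```python
import numpy as np

def roc_curve(scores, labels):
    """labels: 1 = target present, 0 = background."""
    order = np.argsort(-scores)        # sort by descending score
    labels = labels[order]
    tps = np.cumsum(labels == 1)       # true positives at each threshold
    fps = np.cumsum(labels == 0)       # false positives at each threshold
    tpr = np.concatenate([[0.0], tps / tps[-1]])  # / total positives
    fpr = np.concatenate([[0.0], fps / fps[-1]])  # / total negatives
    return fpr, tpr

def auc(fpr, tpr):
    return np.trapz(tpr, fpr)          # area under the ROC curve

# Tiny made-up example: higher score = more face-like.
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.3])
labels = np.array([1, 1, 0, 1, 0, 0])
fpr, tpr = roc_curve(scores, labels)
print(f"AUC = {auc(fpr, tpr):.3f}")
```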

Experiments (3): Number of parts: 2-5, with multiple learning runs for each configuration. Complexity: EM converges in ~100 iterations (10 s for 2 parts, 2 min for 5 parts; several hours in total). Detection: less than 1 second in Matlab.

Results (1): 93.5% of all faces and 86.5% of all cars correctly classified. Ideal number of parts: 4 for faces, 5 or more for cars.

Results (2): Appearance of the parts in the best-performing models. Intuition is not always correct: e.g. the hairline is more important than the nose; for cars, the shadow below the car is often important, not the tyres.

Results (3) Examples of correctly and incorrectly classified images

Related Work R. Fergus, P. Perona, A. Zisserman (2003) Object Class Recognition by Unsupervised Scale-Invariant Learning Straightforward extension of this paper Even better results through scale invariance More sophisticated feature detector (Kadir and Brady)

Characteristics of Classes

Topics Constellation Model Feature Selection Model Learning (EM Algorithm) Results, Comparison One-shot Category Learning

One-shot learning Introducing the OCELOT

Can you spot the ocelot? Jaguar Lynx Puma Leopard Tiger OCELOT Serval

Biological Interpretation: Humans can recognize between 5,000 and 30,000 object categories. Humans are very quick at learning new object categories. We take advantage of prior knowledge about other object categories.

Bayesian Framework: Prior information about objects is modelled by a prior pdf. Through a new observation, learn a posterior pdf for object recognition. Priors can be learned from unrelated object categories.

Basic idea: Learn a new object class from 1-5 new training images (unsupervised). Builds upon the same framework as before. Train the prior on three categories with hundreds of training images; learn the new category from 1-5 images (leave-one-out).

Results: Face Class. General information alone is not enough. The algorithm performs slightly worse than other methods, but still well: 85-92% recognition. Similar results for the other categories. Huge speed advantage over other methods: 3-5 s per category.

Summary: Using a Bayesian learning framework, it is possible to learn new object categories from very few training examples. Prior information comes from previously learned categories. Suitable for real-time training.

Future Work: Learn a larger number of categories. How does prior knowledge improve with the number of known categories? Use a more advanced stochastic model.

Thank you!