Building local part models for category-level recognition
C. Schmid, INRIA Grenoble
Joint work with G. Dorko, S. Lazebnik, J. Ponce
Introduction
- Invariant local descriptors => robust recognition of specific objects or scenes
- Recognition of textures and object classes => description of intra-class variation, selection of discriminant features, spatial relations
(Figures: texture recognition, car detection)
Overview
1. Affine-invariant texture recognition (CVPR'03)
2. A two-layer architecture for texture segmentation and recognition (ICCV'03)
3. Feature selection for object class recognition (ICCV'03)
4. Building affine-invariant part models for recognition
Affine-invariant texture recognition
- Texture recognition under viewpoint changes and non-rigid transformations
- Use of affine-invariant regions:
  - invariance to viewpoint changes
  - spatial selection => more compact representation, reduced redundancy in the texton dictionary
[A sparse texture representation using affine-invariant regions, S. Lazebnik, C. Schmid and J. Ponce, CVPR 2003]
Spatial selection: clustering each pixel vs. clustering selected pixels
Overview of the approach
Region extraction: Harris detector and Laplace detector
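The Harris response underlying the region detector can be sketched as follows. This is a minimal single-scale sketch in numpy/scipy (the actual detector additionally performs scale selection with the Laplacian and affine adaptation of the regions); the function name and parameter values are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(img, sigma=1.0, k=0.05):
    """Harris corner response R = det(M) - k * trace(M)^2, where M is the
    second-moment (structure) matrix of image gradients, Gaussian-smoothed
    over a neighbourhood of each pixel."""
    img = img.astype(float)
    Iy, Ix = np.gradient(img)                 # image gradients
    # structure-tensor entries, Gaussian-weighted
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2
```

Large positive responses occur where both eigenvalues of M are large, i.e. at corner-like structures; edges give negative responses and flat regions give responses near zero.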
Descriptors – Spin images
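The intensity-domain spin image can be sketched as a 2D histogram over (distance from the patch centre, normalised intensity); discarding the angular coordinate makes it rotation-invariant. The sketch below assumes a square grayscale patch; the bin counts and normalisation are illustrative choices, not the paper's exact parameters:

```python
import numpy as np

def spin_image(patch, d_bins=10, i_bins=10):
    """Intensity-domain spin image: a 2D histogram of (distance from the
    patch centre, normalised intensity) over all pixels of the patch."""
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    dist = np.hypot(yy - cy, xx - cx).ravel()
    # normalise intensities to [0, 1] for binning
    inten = patch.astype(float).ravel()
    inten = (inten - inten.min()) / (inten.max() - inten.min() + 1e-12)
    hist, _, _ = np.histogram2d(dist, inten, bins=(d_bins, i_bins),
                                range=[[0, dist.max()], [0, 1]])
    return hist / hist.sum()        # normalise to a distribution
```

Because only the distance from the centre is kept, rotating the patch leaves the descriptor unchanged.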
Signature and EMD
- Hierarchical clustering => signature S = { (m_1, w_1), …, (m_k, w_k) }
- Earth Mover's Distance between signatures S and S′:
  D(S, S′) = [ Σ_{i,j} f_ij d(m_i, m′_j) ] / [ Σ_{i,j} f_ij ]
  - robust distance, optimizes the flow between distributions
  - can match signatures of different sizes
  - not sensitive to the number of clusters
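The EMD between two signatures is the solution of a small transportation problem. A sketch using scipy's LP solver, assuming Euclidean ground distance between cluster centres (function and variable names are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def emd(means1, w1, means2, w2):
    """Earth Mover's Distance between signatures {(m_i, w_i)} and {(m'_j, w'_j)}.

    Solves: minimise sum_ij f_ij d(m_i, m'_j)  subject to
    row sums <= w_i, column sums <= w'_j, total flow = min(sum w, sum w').
    Returns sum_ij f_ij d_ij / sum_ij f_ij, matching the formula above."""
    m, n = len(w1), len(w2)
    # ground distance between cluster centres
    d = np.linalg.norm(means1[:, None, :] - means2[None, :, :], axis=2).ravel()
    # inequality constraints over the flattened flow f (m*n variables)
    A_ub = np.zeros((m + n, m * n))
    for i in range(m):
        A_ub[i, i * n:(i + 1) * n] = 1     # sum_j f_ij <= w1[i]
    for j in range(n):
        A_ub[m + j, j::n] = 1              # sum_i f_ij <= w2[j]
    b_ub = np.concatenate([w1, w2])
    A_eq = np.ones((1, m * n))             # total flow is fixed
    b_eq = [min(w1.sum(), w2.sum())]
    res = linprog(d, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    flow = res.x
    return float(d @ flow / flow.sum())
```

Because the total-weight constraint uses the smaller of the two masses, signatures of different sizes (and different numbers of clusters) can be compared directly, which is the property exploited above.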
Database with viewpoint changes: 20 samples each of 10 different textures
Results: spin images vs. Gabor-like filters
Overview
1. Affine-invariant texture recognition (CVPR'03)
2. A two-layer architecture for texture segmentation and recognition (ICCV'03)
3. Feature selection for object class recognition (ICCV'03)
4. Building affine-invariant part models for recognition
A two-layer architecture
- Texture recognition + segmentation
- Classification of individual regions + spatial layout
[A generative architecture for semi-supervised texture recognition, S. Lazebnik, C. Schmid, J. Ponce, ICCV 2003]
A two-layer architecture
Modeling:
1. Distribution of the local descriptors (affine invariants): Gaussian mixture model estimated with EM, which allows incorporating unsegmented images
2. Co-occurrence statistics of sub-class labels over affinely adapted neighborhoods
Segmentation + recognition:
1. Generative model for initial class probabilities
2. Co-occurrence statistics + relaxation to improve the labels
Texture dataset – training images: T1 (brick), T2 (carpet), T3 (chair), T4 (floor 1), T5 (floor 2), T6 (marble), T7 (wood)
Effect of relaxation + co-occurrence
Original image. Top: before relaxation (individual regions); bottom: after relaxation (co-occurrence).
Recognition + Segmentation Examples
Animal dataset – training images
- no manual segmentation, weakly supervised
- 10 training images per animal (with background)
- no purely negative images
Recognition + Segmentation Examples
Overview
1. Affine-invariant texture recognition (CVPR'03)
2. A two-layer architecture for texture segmentation and recognition (ICCV'03)
3. Feature selection for object class recognition (ICCV'03)
4. Building affine-invariant part models for recognition
Object class detection/classification
- Description of intra-class variations of object parts
[Selection of scale-invariant regions for object class recognition, G. Dorko and C. Schmid, ICCV'03]
Object class detection/classification
- Description of intra-class variations of object parts
- Selection of discriminant features (weakly supervised)
Training the model
Training phase 1:
- Input: images of the object with background (positive images); no normalization or alignment of the image/object
- Extraction of local descriptors: Harris-Laplace, Kadir-Brady, SIFT
- Clustering: estimation of a Gaussian mixture with EM
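The clustering step can be sketched with a standard EM-fitted Gaussian mixture. The descriptors below are synthetic stand-ins; in practice each row would be a 128-D SIFT vector computed on a Harris-Laplace or Kadir-Brady region (sklearn usage is a choice of this sketch, not the paper's implementation):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for real local descriptors: two well-separated
# appearance clusters in an 8-D descriptor space.
rng = np.random.default_rng(0)
descriptors = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(200, 8)),
    rng.normal(loc=2.0, scale=0.3, size=(200, 8)),
])

# EM estimation of a Gaussian mixture over the descriptor space;
# each mixture component plays the role of one appearance cluster.
gmm = GaussianMixture(n_components=2, covariance_type="diag",
                      random_state=0).fit(descriptors)
labels = gmm.predict(descriptors)
```

The fitted components (means, covariances, weights) are the clusters that phase 2 then ranks for discriminative power.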
Training the model
Training phase 2 (selection):
- Input: verification set of positive and negative images
- Rank each cluster by likelihood (or mutual information)
- MAP classifier built from the n top-ranked clusters
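The mutual-information ranking can be sketched as follows: for each cluster k, measure the mutual information between the binary events "descriptor falls in cluster k" and "descriptor comes from a positive image". This is a sketch of the selection criterion with illustrative names, not the paper's exact code:

```python
import numpy as np

def mutual_information(assign, positive):
    """Score each cluster by MI(descriptor in cluster k ; descriptor from
    a positive image), both treated as binary variables.

    assign:   (n,) cluster index of each verification-set descriptor
    positive: (n,) bool, True if the descriptor is from a positive image
    """
    scores = {}
    for k in np.unique(assign):
        in_k = assign == k
        mi = 0.0
        for a in (in_k, ~in_k):
            for b in (positive, ~positive):
                p_ab = np.mean(a & b)
                if p_ab > 0:          # zero-probability cells contribute 0
                    mi += p_ab * np.log(p_ab / (a.mean() * b.mean()))
        scores[k] = mi
    return scores
```

Clusters whose membership is independent of the image label score near zero; clusters that fire almost exclusively on positive images score high and are kept for the MAP classifier.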
Likelihood – mutual information
- likelihood: more discriminant but very specific
- mutual information: discriminant but not too specific
(Figures: clusters selected by likelihood and by mutual information.)
Results for test images
Detection with Harris-Laplace regions:
- Image 1 (354 points): 25 likelihood clusters: 49 correct + 37 incorrect; 10 mutual-information clusters: 31 correct + 20 incorrect
- Image 2 (277 points): 43 correct + 36 incorrect; 26 correct + 20 incorrect
Relaxation – propagation of probabilities
Classification
- Assign each test descriptor to the most probable cluster (MAP)
- Each descriptor assigned to one of the top n clusters counts as positive
- If the number of positive descriptors is above a threshold p, classify the image as positive
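The rule above can be sketched in a few lines. For brevity this sketch assigns each descriptor to its nearest cluster centre, a simplification of the full MAP assignment under the Gaussian mixture; names and parameters are illustrative:

```python
import numpy as np

def classify_image(descriptors, cluster_means, top_clusters, p_threshold):
    """Assign each descriptor to its nearest cluster centre, count the
    descriptors landing in the selected top-n clusters, and declare the
    image positive if the count exceeds the threshold p."""
    dists = np.linalg.norm(
        descriptors[:, None, :] - cluster_means[None, :, :], axis=2)
    assigned = dists.argmin(axis=1)
    n_positive = np.isin(assigned, top_clusters).sum()
    return bool(n_positive > p_threshold)
```

Varying p trades off precision against recall, which is why the results below report equal error rates as a function of p.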
Classification experiments
- Training phase 1: 200 positive images (airplanes, motorbikes), 25 (wild cats)
- Training phase 2 (verification): 200 / 25 positive images, 450 negative images
- Testing: 400 / 50 positive images, 450 negative images
- Airplane and motorbike images: http://www.robots.ox.ac.uk/~vgg/data; wild cat images: Corel Image Library
Results: motorbikes. Equal error rates as a function of p; receiver operating characteristic for p = 6.
Classification results: ROC equal error rates

                        Best        Estimated p    p = 6    Fergus
                        p    %      p    %         %        %
  Airplanes   Harris    8    97.5   5    97        97.25    -
              Kadir     18   97     30   96.5      96       94
  Motorbikes  Harris    9    99     5    98        98.25    -
              Kadir     19   98.75  32   98.25     98       96
  Wild cats   Harris    31   94     34   92        72       -
              Kadir     17   86     45   82        84       90
Overview
1. Affine-invariant texture recognition (CVPR'03)
2. A two-layer architecture for texture segmentation and recognition (ICCV'03)
3. Feature selection for object class recognition (ICCV'03)
4. Building affine-invariant part models for recognition
Affine-invariant part models
- Matching collections of local affine-invariant regions that map with an affine transformation => part
- Matching works for unsegmented images
- Model = a collection of parts
Matching: faces. Note the spurious match.
Matching: 3D objects (with closeup)
Matching: Finding Repeated Patterns
Matching: Finding Symmetry
Modeling for recognition
- Match multiple pairs of training images to produce several candidate parts.
- Use additional validation images to evaluate the repeatability of parts and individual patches.
- Retain a fixed number of parts with the best repeatability score as the class model.
- No background model.
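The selection step above can be sketched as a simple top-k ranking. Here "score" stands for a part's repeatability on the validation set (e.g. the fraction of validation images in which it is re-detected); the function and argument names are illustrative:

```python
def select_parts(parts, validation_scores, n_keep=10):
    """Keep the n parts with the best repeatability score on the
    validation set; the retained parts form the class model."""
    ranked = sorted(zip(parts, validation_scores),
                    key=lambda t: t[1], reverse=True)
    return [part for part, _ in ranked[:n_keep]]
```

Because there is no background model, selection relies entirely on this repeatability criterion rather than on discriminating against negatives.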
The butterfly dataset
- 16 training images (8 pairs) per class
- 10 validation images per class
- 437 test images
- 619 images total
Butterfly Models Top two rows: pairs of images used for modeling. Bottom two rows: closeup views of some of the parts making up the models of the seven butterfly classes.
Recognition
- Top 10 models per class used for recognition
- Multi-class classification results: total model size (smallest/largest)
Classification rate vs. number of parts (plot)
Successful detection examples
- Model part: yellow = detected in the test image, blue = occluded in the test image
- Test image: all ellipses / matched ellipses
- Note: only one of the two training images is shown
Successful Detection Examples (cont.)
Detection of Multiple Instances
Detection Failures
Future work
- Spatial relations: non-rigid models; relations between clusters and affine-invariant parts
- Feature selection: dimensionality reduction
- Shape information: appropriate descriptors
- Rapid search: structuring of the data