Global spatial layout: spatial pyramid matching Spatial weighting the features Beyond bags of features: Adding spatial information.

Slides:

Advertisements

Similar presentations

Max-Margin Additive Classifiers for Detection

Advertisements

Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

Foreground Focus: Finding Meaningful Features in Unlabeled Images Yong Jae Lee and Kristen Grauman University of Texas at Austin.

MIT CSAIL Vision interfaces Approximate Correspondences in High Dimensions Kristen Grauman* Trevor Darrell MIT CSAIL (*) UT Austin…

The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features Kristen Grauman Trevor Darrell MIT.

CS4670 / 5670: Computer Vision Bag-of-words models Noah Snavely Object

Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,

Groups of Adjacent Contour Segments for Object Detection Vittorio Ferrari Loic Fevrier Frederic Jurie Cordelia Schmid.

Ghunhui Gu, Joseph J. Lim, Pablo Arbeláez, Jitendra Malik University of California at Berkeley Berkeley, CA

Contour Based Approaches for Visual Object Recognition Jamie Shotton University of Cambridge Joint work with Roberto Cipolla, Andrew Blake.

Fast intersection kernel SVMs for Realtime Object Detection

Discriminative and generative methods for bags of features

Bag-of-features models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.

Beyond bags of features: Part-based models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.

Recognition using Regions CVPR Outline Introduction Overview of the Approach Experimental Results Conclusion.

Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.

1 Image Recognition - I. Global appearance patterns Slides by K. Grauman, B. Leibe.

Lecture 28: Bag-of-words models

Generic Object Detection using Feature Maps Oscar Danielsson Stefan Carlsson

Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.

Bag-of-features models

Local Features and Kernels for Classification of Object Categories J. Zhang --- QMUL UK (INRIA till July 2005) with M. Marszalek and C. Schmid --- INRIA.

Spatial Pyramid Pooling in Deep Convolutional

What, Where & How Many? Combining Object Detectors and CRFs

Discriminative and generative methods for bags of features

Large Scale Recognition and Retrieval. What does the world look like? High level image statistics Object Recognition for large-scale search Focus on scaling.

The Beauty of Local Invariant Features

Review: Intro to recognition Recognition tasks Machine learning approach: training, testing, generalization Example classifiers Nearest neighbor Linear.

Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,

Real-time Action Recognition by Spatiotemporal Semantic and Structural Forest Tsz-Ho Yu, Tae-Kyun Kim and Roberto Cipolla Machine Intelligence Laboratory,

Unsupervised Learning of Categories from Sets of Partially Matching Image Features Kristen Grauman and Trevor Darrel CVPR 2006 Presented By Sovan Biswas.

Step 3: Classification Learn a decision rule (classifier) assigning bag-of-features representations of images to different classes Decision boundary Zebra.

Classification 2: discriminative models

Problem Statement A pair of images or videos in which one is close to the exact duplicate of the other, but different in conditions related to capture,

Marcin Marszałek, Ivan Laptev, Cordelia Schmid Computer Vision and Pattern Recognition, CVPR Actions in Context.

“Secret” of Object Detection Zheng Wu (Summer intern in MSRNE) Sep. 3, 2010 Joint work with Ce Liu (MSRNE) William T. Freeman (MIT) Adam Kalai (MSRNE)

CSE 473/573 Computer Vision and Image Processing (CVIP) Ifeoma Nwogu Lecture 24 – Classifiers 1.

Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce.

Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,

Window-based models for generic object detection Mei-Chen Yeh 04/24/2012.

Svetlana Lazebnik, Cordelia Schmid, Jean Ponce

Face detection Slides adapted Grauman & Liebe’s tutorial

SVM-KNN Discriminative Nearest Neighbor Classification for Visual Category Recognition Hao Zhang, Alex Berg, Michael Maire, Jitendra Malik.

Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao.

Classifiers Given a feature representation for images, how do we learn a model for distinguishing features from different classes? Zebra Non-zebra Decision.

Beyond Sliding Windows: Object Localization by Efficient Subwindow Search The best paper prize at CVPR 2008.

Efficient Subwindow Search: A Branch and Bound Framework for Object Localization ‘PAMI09 Beyond Sliding Windows: Object Localization by Efficient Subwindow.

Methods for classification and image representation

Hierarchical Matching with Side Information for Image Classification

First-Person Activity Recognition: What Are They Doing to Me? M. S. Ryoo and Larry Matthies Jet Propulsion Laboratory, California Institute of Technology,

CS 1699: Intro to Computer Vision Detection II: Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 12, 2015.

Category Independent Region Proposals Ian Endres and Derek Hoiem University of Illinois at Urbana-Champaign.

Object Recognition as Ranking Holistic Figure-Ground Hypotheses Fuxin Li and Joao Carreira and Cristian Sminchisescu 1.

Object Recognition by Integrating Multiple Image Segmentations Caroline Pantofaru, Cordelia Schmid, Martial Hebert ECCV 2008 E.

Goggle Gist on the Google Phone A Content-based image retrieval system for the Google phone Manu Viswanathan Chin-Kai Chang Ji Hyun Moon.

Lecture IX: Object Recognition (2)

Object Recognition by Parts

Object detection with deformable part-based models

Nonparametric Semantic Segmentation

Paper Presentation: Shape and Matching

Digit Recognition using SVMS

Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT

By Suren Manvelyan, Crocodile (nile crocodile?) By Suren Manvelyan,

Object detection as supervised classification

CS 1674: Intro to Computer Vision Scene Recognition

Object Recognition by Parts

Object Recognition by Parts

Brief Review of Recognition + Context

Object Recognition with Interest Operators

Presentation transcript:

Global spatial layout: spatial pyramid matching Spatial weighting the features Beyond bags of features: Adding spatial information

Spatial pyramid matching Add spatial information to the bag-of-features Perform matching in 2D image space [Lazebnik, Schmid & Ponce, CVPR 2006]

Related work Szummer & Picard (1997) Lowe (1999, 2004)Torralba et al. (2003) GistSIFT Similar approaches: Subblock description [Szummer & Picard, 1997] SIFT [Lowe, 1999] GIST [Torralba et al., 2003]

Locally orderless representation at several levels of spatial resolution level 0 Spatial pyramid representation

level 0 level 1 Locally orderless representation at several levels of spatial resolution

Spatial pyramid representation level 0 level 1 level 2 Locally orderless representation at several levels of spatial resolution

Spatial pyramid matching Combination of spatial levels with pyramid match kernel [Grauman & Darell’05]

Pyramid match kernel [Grauman & Darell’05] optimal partial matching between sets of features

Scene classification LSingle-levelPyramid 0(1x1)72.2±0.6 1(2x2)77.9± ±0.5 2(4x4)79.4± ±0.3 3(8x8)77.2± ±0.3

Retrieval examples

Category classification – CalTech101 LSingle-levelPyramid 0(1x1)41.2±1.2 1(2x2)55.9± ±0.8 2(4x4)63.6± ±0.8 3(8x8)60.3± ±0.7 Bag-of-features approach by Zhang et al.’07: 54 %

CalTech101 Easiest and hardest classes Sources of difficulty: –Lack of texture –Camouflage –Thin, articulated limbs –Highly deformable shape

Discussion Summary –Spatial pyramid representation: appearance of local image patches + coarse global position information –Substantial improvement over bag of features –Depends on the similarity of image layout Extensions –Integrating different types of features, learning weights, use of different grids [Zhang’07, Bosch & Zisserman’07, Varma et al.’07, Marszalek et al.’07] –Flexible, object-centered grid

Overview Global spatial layout: spatial pyramid matching Spatial weighting the features

Motivation Evaluating the influence of background features [J. Zhang, M. Marszalek, S. Lazebnik & C. Schmid, IJCV’07] –Train and test on different combinations of foreground and background by separating features based on bounding boxes Training: different combinations foreground + background features Testing: original test set Best results when training with “harder” dataset (with background)

Motivation Evaluating the influence of background features [J. Zhang, M. Marszalek, S. Lazebnik & C. Schmid, IJCV’07] –Train and test on different combinations of foreground and background by separating features based on bounding boxes Training: original training set Testing: different combinations foreground + background features Best results when testing with foreground features only

Approach Better to train on a “harder” dataset with background clutter and test on an easier one without background clutter Spatial weighting for bag-of-features [Marszalek & Schmid, CVPR’06] –weight features by the likelihood of belonging to the object –determine likelihood based on shape masks

Masks for spatial weighting For each test feature: - Select closest training features + corresponding masks (training requires segmented images or bounding boxes) - Align mask based on local co-ordinates system (transformation between training and test co-ordinate systems) Sum masks weighted by matching distance three features agree on object localization, the object has higher weights Weight histogram features with the strength of the final mask

Example masks for spatial weighting

Classification for PASCAL dataset Zhang et al. Spatial weightingGain bikes cars motorbikes people Equal error rates for PASCAL test set 2

Extension to localization Cast hypothesis –Aligning the mask based on matching features Evaluate each hypothesis –SVM for local features Merge hypothesis to produce localization decisions –Online clustering of similar hypothesis, rejection of weak ones [Marszalek & Schmid, CVPR 2007]

Illustration of hypothesis evaluation False hypotheses due to the ambiguities of the wheels Eliminated after the evaluation

Illustration of hypotheses merging Weak classifier response due to occlusion Merging of evidence based on consistent object features

Localization results

Localization result Illustration of subsequent hypotheses Confidence value

Comparison to state-of-the-art Comparison with [Shotton et al. ICCV’05] - use their images, search at a single scale - improved performance over them, and: - no use of shape-based features - can detect objects at multiple scales

Aspect clusters

Discussion Including spatial information improves results Importance of flexible modeling of spatial information –coarse global position information –object based models Extensions –Hierarchical organization of the objects/aspects