Multi-Local Feature Manifolds for Object Detection
Oscar Danielsson, Stefan Carlsson, Josephine Sullivan


Multi-Local Feature Manifolds for Object Detection
Oscar Danielsson, Stefan Carlsson, Josephine Sullivan
DICTA08

The Problem
- Object categories are often modeled by collections (bag-of-features) or constellations (pictorial structures) of local features
- Many simple, shape-based objects don't have any discriminative local appearance features

The Multi-Local Feature
- A specific spatial constellation of oriented edgels (or other local content)
- Captures global shape properties
- "Weak" detector of shape-based object categories
- Described by coordinate vector: (x_1, ..., x_12)

Modeling Intra-Class Variation
1. Generate coordinate vectors by clicking corresponding edgels in a (small) number of training images
2. Align the coordinate vectors with respect to a similarity transform
3. Extend the coordinate vectors into their convex hull
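Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: edgel positions are stored as complex numbers, edgel orientations are ignored, and every exemplar is aligned to one reference exemplar by least squares. The slides do not say how the alignment is computed. The names `align_to_reference` and `convex_combination` are hypothetical.

```python
def align_to_reference(points, reference):
    """Least-squares similarity alignment of 2-D points onto a reference.

    Points are complex numbers (x + 1j*y). A 2-D similarity transform is
    z -> a*z + b with complex a (scale and rotation) and b (translation).
    Assumption: both lists hold corresponding edgels in the same order.
    """
    n = len(points)
    p_mean = sum(points) / n
    r_mean = sum(reference) / n
    p_c = [p - p_mean for p in points]      # centred exemplar
    r_c = [r - r_mean for r in reference]   # centred reference
    # Closed-form minimiser of sum |a*p + b - r|^2 over complex a
    a = sum(p.conjugate() * r for p, r in zip(p_c, r_c)) / sum(abs(p) ** 2 for p in p_c)
    return [a * p + r_mean for p in p_c]


def convex_combination(exemplars, weights):
    """Weighted mean of aligned exemplars; weights must be >= 0 and sum to 1."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(exemplars[0])
    return [sum(w * ex[i] for w, ex in zip(weights, exemplars)) for i in range(n)]


# Usage: a rotated, scaled and shifted copy is mapped back onto the reference.
reference = [0j, 2 + 0j, 2 + 1j, 1j]
moved = [2j * r + (3 - 4j) for r in reference]
aligned = align_to_reference(moved, reference)
blend = convex_combination([reference, aligned], [0.3, 0.7])
```

Any point produced by `convex_combination` lies in the convex hull of the aligned exemplars, which is the set that step 3 treats as valid instances of the feature.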

Detection
For each occurrence x_1 of c_1
    For each consistent occurrence x_2 of c_2
        Sample from p(x_4, x_3 | x_2, x_1) to hypothesize image locations of c_3 and c_4
        Sample image edgels
        Compute normalized distance to convex hull of training features
        If distance is below threshold, an instance was detected
    End For
End For
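The innermost test of the loop, the distance from a candidate coordinate vector to the convex hull of the training features, can be sketched as below. The slides do not say how this distance is computed or normalized, so the solver here is a stand-in: a plain Frank-Wolfe iteration with exact line search, returning the unnormalized Euclidean distance. The names `distance_to_convex_hull` and `is_detection` and the parameters `max_iter` and `tol` are hypothetical.

```python
import math


def distance_to_convex_hull(x, exemplars, max_iter=1000, tol=1e-12):
    """Euclidean distance from x to the convex hull of the exemplars.

    Assumption: Frank-Wolfe is used here for illustration only; the
    slides do not name a solver or a normalisation.
    """
    dim = len(x)
    y = list(exemplars[0])  # start at one hull vertex
    for _ in range(max_iter):
        r = [y[i] - x[i] for i in range(dim)]  # residual; gradient is 2*r
        # Linear minimisation step: vertex with the smallest inner product with r
        best = min(exemplars, key=lambda e: sum(e[i] * r[i] for i in range(dim)))
        d = [best[i] - y[i] for i in range(dim)]
        gap = -sum(r[i] * d[i] for i in range(dim))  # half the Frank-Wolfe gap
        if gap <= tol:
            break  # no vertex improves the current point
        step = min(1.0, gap / sum(di * di for di in d))  # exact line search
        y = [y[i] + step * d[i] for i in range(dim)]
    return math.sqrt(sum((y[i] - x[i]) ** 2 for i in range(dim)))


def is_detection(x, exemplars, threshold):
    """An instance is reported when the hull distance is below the threshold."""
    return distance_to_convex_hull(x, exemplars) < threshold


# Usage with toy 2-D "coordinate vectors" (real ones would be much longer).
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(is_detection((1.0, 1.0), training, threshold=0.5))
```

In a full detector, `x` would be the aligned coordinate vector assembled from the sampled edgels, and the threshold would be tuned on validation data.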


Experiments
Detection performance was evaluated on a standard database (ETHZ Shape Classes). We want to investigate:
- Is a multi-local feature a good weak detector?
- How many local features should be used?

Mugs - Training
- Training images were downloaded from Google Images
- 14 edgels constituting a multi-local feature were marked in each training image

Mugs - Results

Performance decreases when more than 9 local features are added

Bottles - Training
- Training images were downloaded from Google Images
- 12 edgels constituting a multi-local feature were marked in each training image

Bottles - Results


Apple logos - Training
- 20 training images were downloaded from Google Images
- 12 edgels constituting a multi-local feature were marked in each training image

Apple logos - Results

Performance decreases when more than 11 local features are added

Conclusions
- A multi-local feature is a good weak detector of shape-based object categories
- The best performance is achieved with multi-local features with a moderate number of local features
- Convex combinations of valid exemplars are in general also valid exemplars (we can extend a few training examples into their convex hull)

Future Work
- Automatic learning of multi-local features
- Building combinations of multi-local feature detectors into an object detection system

Related Work
- Pictorial Structures
  - E.g. Felzenszwalb, Huttenlocher. Pictorial Structures for Object Recognition, IJCV No. 1, January
- Constellation Models
  - E.g. Fergus, Perona, Zisserman. Object class recognition by unsupervised scale-invariant learning, CVPR03.
Differences
- Different detection methods
- Use rich local features

Thanks!

Representation
The multi-local feature manifold consists of all convex combinations of the training examples
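Written out, the set described on this slide is the convex hull of the aligned training vectors. The notation is assumed for illustration and does not come from the slides: K training examples x^(1), ..., x^(K), weights w_k, and M for the manifold.

```latex
\mathcal{M} \;=\; \Bigl\{\, \sum_{k=1}^{K} w_k \, \mathbf{x}^{(k)} \;:\; w_k \ge 0,\ \ \sum_{k=1}^{K} w_k = 1 \Bigr\}
```

Under this reading, a candidate vector is accepted at detection time when its distance to the set M falls below the threshold.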