Representations for object class recognition
David Lowe
Department of Computer Science, University of British Columbia, Vancouver, Canada
Sept. 21, 2006

2 Object-class recognition: What are we missing?
Are current feature types sufficient?
Biology suggests that local features are sufficient
– Patches, but with arbitrary shapes
– Contours, texture, color, motion, …
– Features can incorporate localization
Next step: learn feature types instead of hand-crafting them
What else are we missing?
Hypothesis: one key is more and better training data
Hierarchical categories: recognize "mammal", then "dog"

3 Simple biologically-motivated system
Image Layer
– Pyramid of rescaled images.
– Model is multiscale from the outset.
S1 Layer
– Compute responses to Gabor filters (4 orientations).
– At each position/scale, we now have 4 types of units (one per orientation).
C1 Layer
– Compute local maxima for each feature type (orientation); subsample.
– Units now have some position/scale invariance.
(Slide figure: local max pooling between layers.)
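
A minimal sketch in Python/NumPy of the image pyramid and the S1/C1 stages outlined above. The Gabor frequency, pooling neighbourhood, subsampling stride, and pyramid step are illustrative assumptions, not the model's exact parameters.

import numpy as np
from scipy.ndimage import convolve, maximum_filter
from skimage.filters import gabor_kernel
from skimage.transform import rescale

def image_pyramid(img, n_scales=8, step=2 ** 0.25):
    # Image layer: pyramid of rescaled images, so the model is multiscale from the outset.
    return [rescale(img, 1.0 / step ** i) for i in range(n_scales)]

def s1_layer(img, n_orient=4, frequency=0.25):
    # S1: responses to Gabor filters at n_orient orientations.
    # At each position we now have n_orient types of units.
    resp = []
    for k in range(n_orient):
        kern = np.real(gabor_kernel(frequency, theta=k * np.pi / n_orient))
        resp.append(np.abs(convolve(img, kern)))
    return np.stack(resp)                      # shape: (n_orient, H, W)

def c1_layer(s1, pool=8, stride=4):
    # C1: local max over position for each orientation, then subsample,
    # giving the units some position/scale invariance.
    pooled = maximum_filter(s1, size=(1, pool, pool))
    return pooled[:, ::stride, ::stride]

# Usage: c1_maps = [c1_layer(s1_layer(im)) for im in image_pyramid(image)]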

4 Computing Features, cont'd
S2 Layer
– Prototypes are patches of C1 units sampled from training images (4,075 of them).
– Note: no clustering is needed.
C2 Layer
– Global maximum response to each S2 feature.
– 4,075 features per image, fed to an SVM classifier.
(Slide figure: the resulting feature vector [r1 r2 … r4075].)
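
A rough sketch of the S2/C2 stages above, continuing the hypothetical c1_layer code. The slide does not give the exact S2 response function, so a Gaussian radial-basis response is assumed here; the patch size is likewise an illustrative assumption.

import numpy as np

def sample_prototypes(train_c1_maps, n_protos=4075, patch=4, rng=np.random):
    # S2 prototypes: patches of C1 units sampled from training images (no clustering needed).
    protos = []
    for _ in range(n_protos):
        c1 = train_c1_maps[rng.randint(len(train_c1_maps))]   # (n_orient, H, W)
        y = rng.randint(c1.shape[1] - patch + 1)
        x = rng.randint(c1.shape[2] - patch + 1)
        protos.append(c1[:, y:y + patch, x:x + patch].ravel())
    return np.stack(protos)                                   # (n_protos, n_orient * patch * patch)

def c2_features(c1, protos, patch=4, sigma=1.0):
    # S2: response of every C1 patch to every prototype (assumed Gaussian RBF).
    # C2: max per prototype, giving one n_protos-dimensional vector per image.
    # In the full model the max also runs over all pyramid scales.
    n_orient, H, W = c1.shape
    best = np.full(len(protos), -np.inf)
    for y in range(H - patch + 1):
        for x in range(W - patch + 1):
            v = c1[:, y:y + patch, x:x + patch].ravel()
            d2 = np.sum((protos - v) ** 2, axis=1)
            best = np.maximum(best, np.exp(-d2 / (2 * sigma ** 2)))
    return best

# The resulting [r1 ... r4075] vectors are then passed to an SVM
# (e.g. sklearn.svm.LinearSVC) for classification.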

5 Location Information in Features
Should we be using a "bag of features"?
The alternative: sacrifice position/scale invariance; objects must be centered.
– Assume centered objects for Caltech 101.
– Other datasets use a sliding window.
Our approach: an S2 feature is not computed over the entire image, but only "near" its originally sampled location.
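
A sketch of that localization idea, again continuing the hypothetical c2_features code: each prototype remembers the position at which it was sampled, and the max is pooled only within a window around that position rather than over the whole image. The window size is an assumption.

import numpy as np

def localized_c2(c1, protos, proto_xy, patch=4, window=8, sigma=1.0):
    # proto_xy[i] = (y, x) position at which prototype i was originally sampled.
    # Pool the max only over C1 patches within +/- window of that position.
    n_orient, H, W = c1.shape
    feats = np.empty(len(protos))
    for i, (p, (py, px)) in enumerate(zip(protos, proto_xy)):
        best = -np.inf
        for y in range(max(0, py - window), min(H - patch, py + window) + 1):
            for x in range(max(0, px - window), min(W - patch, px + window) + 1):
                v = c1[:, y:y + patch, x:x + patch].ravel()
                d2 = np.sum((p - v) ** 2)
                best = max(best, np.exp(-d2 / (2 * sigma ** 2)))
        feats[i] = best
    return feats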

6 Classification Results (Caltech 101)
Classification rates for the full dataset (30 training images per category; scores are the average of the per-category classification rates):
Serre et al. [05]: 42
Base: 41
+ sparse S2, more orientations, inhibited S1/C1: 49 (+8)
+ localized features: 54 (+5)
+ feature selection: 56 (+2)

7 Localization Results (UIUC Cars)
(Slide figures: correct detection examples on the single-scale test, and the only errors in 8 runs over 1,600 cars.)

8 Interest points vs. dense features
Disadvantages of interest points
– Selection of interest points is error prone, so matching is less reliable than with dense features.
– Feature size and shape are predetermined.
Disadvantages of dense features
– Efficiency. However, the cost is reduced by increasing invariance at the lower levels.
My conclusion:
– Interest points are useful for near-term efficiency.
– Dense features are likely to win in the long term.

9 Hypothesis: object class recognition would already be solved if training sets were large enough
– Performance increases strongly with training set size.
– Good success is achieved on difficult categories (faces, pedestrians) with large training sets.
– However, large training sets require more efficient algorithms.
Objection: how can human vision recognize objects after just a single example?
– First the category (e.g., mammal) is recognized, then the instance.
– Features for categorization are borrowed from the most similar previous category.