Visual Object Recognition Rob Fergus Courant Institute, New York University

Visual Object Recognition Rob Fergus Courant Institute, New York University http://cs.nyu.edu/~fergus/icml_tutorial/

Agenda Introduction Bag-of-words models Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based image retrieval Datasets & Conclusions

Li Fei-Fei, Princeton Rob Fergus, NYU Antonio Torralba, MIT Recognizing and Learning Object Categories: Year 2007 http://people.csail.mit.edu/torralba/shortCourseRLOC

Agenda Introduction Bag-of-words models Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based image retrieval Datasets & Conclusions

So what does object recognition involve?

Classification: are there street-lights in the image?

Detection: localize the street-lights in the image

Object categorization mountain building tree banner vendor people street lamp

Scene and context categorization outdoor city …

Application: Assisted driving meters Ped Car Lane detection Pedestrian and car detection Collision warning systems with adaptive cruise control, Lane departure warning systems, Rear object detection systems,

Application: Computational photography

Application: Improving online search Query: STREET Organizing photo collections

Challenges 1: view point variation Michelangelo 1475-1564

Challenges 2: scale

Challenges 3: illumination slide credit: S. Ullman

Challenges 4: background clutter Bruegel, 1564

Challenges 5: occlusion http://lh5.ggpht.com/_wJc6t2hDl2M/RrL7Gh6sS7I/AAAAAAAAAYY/n3xaHc2opls/DSC00633.JPG

Challenges 6: deformation Xu, Beihong 1943 http://img.timeinc.net/time/asia/magazine/2007/1112/racehorse_1112.jpg

History: single object recognition Object 1 Object 2 Object 3

David Lowe [1985] Single object recognition history: Geometric methods Rothwell et al. [1992]

Single object recognition history: Appearance-based methods Murase & Nayer 1995 Schmid & Mohr 1997 Lowe, et al. 1999, 2003 Mahamud and Herbert, 2000 Ferrari et al. 2004 Rothganger et al. 2004 Moreels and Perona, 2005 …

Instance 1 Instance 2 Instance 3 Challenges 7: intra-class variation Shoe class

History: early object categorization

Fischler, Elschlager, 1973 Turk and Pentland, 1991 Belhumeur, Hespanha, & Kriegman, 1997 Rowley & Kanade, 1998 Schneiderman & Kanade 2004 Viola and Jones, 2000 Heisele et al., 2001 Amit and Geman, 1999 LeCun et al. 1998 Belongie and Malik, 2002 DeCoste and Scholkopf, 2002 Simard et al. 2003 Poggio et al. 1993 Argawal and Roth, 2002 Schneiderman & Kanade, 2004 …..

Three main issues Representation –How to represent an object category Learning –How to form the classifier, given training data Recognition –How the classifier is to be used on novel data

Representation –Generative / discriminative / hybrid

Representation –Generative / discriminative / hybrid –Appearance only or location and appearance

Representation –Generative / discriminative / hybrid –Appearance only or location and appearance –Invariances View point Illumination Occlusion Scale Deformation Clutter etc.

Representation –Generative / discriminative / hybrid –Appearance only or location and appearance –Invariances –Part-based or global with sub-window

Representation –Generative / discriminative / hybrid –Appearance only or location and appearance –Invariances –Parts or global w/sub- window –Use set of features or each pixel in image

–Unclear how to model categories, so learn rather than manually specify Learning

–Unclear how to model categories, so learn rather than manually specify –Methods of training: generative vs. discriminative Learning

–Unclear how to model categories, so learn rather than manually specify –Methods of training: generative vs. discriminative –Level of supervision Manual segmentation; bounding box; image labels; noisy labels Learning Contains a motorbike

Learning –Unclear how to model categories, so learn rather than manually specify –Methods of training: generative vs. discriminative –Level of supervision Manual segmentation; bounding box; image labels; noisy labels -- Training images: Issue of over-fitting (typically limited training data) Negative images for discriminative methods

Learning –Unclear how to model categories, so learn rather than manually specify –Methods of training: generative vs. discriminative –Level of supervision Manual segmentation; bounding box; image labels; noisy labels -- Training images: Issue of over-fitting (typically limited training data) Negative images for discriminative methods --Priors

–Scale / orientation range to search over –Speed –Context Recognition

Hoiem, Efros, Herbert, 2006 –Context enables pruning of detector output Recognition

Visual Object Recognition Rob Fergus Courant Institute, New York University

Similar presentations

Presentation on theme: "Visual Object Recognition Rob Fergus Courant Institute, New York University"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Visual Object Recognition Rob Fergus Courant Institute, New York University

Similar presentations

Presentation on theme: "Visual Object Recognition Rob Fergus Courant Institute, New York University"— Presentation transcript:

Similar presentations

About project

Feedback