ICCV 2005 Beijing
Recognizing and Learning Object Categories
Li Fei-Fei, UIUC; Rob Fergus, MIT; Antonio Torralba, MIT

For the complete version of the slides and the source code from the demos, visit the course web site:

Program
Introduction
Session 1: bag of words models
Session 2: parts-based models
Session 3: discriminative methods
Session 4: concurrent segmentation and recognition
Summary

Bag of words models
An image is represented by a collection of “visual words” and their corresponding counts given a universal dictionary. Object categories are modeled by the distributions of these visual words. Although “bag of words” models can use both generative and discriminative approaches, here we will focus on generative models.
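The representation described above can be sketched in a few lines. This is a minimal illustration, not the course's demo code: it assumes the dictionary (codebook) is already given, assigns each local descriptor to its nearest codeword, and scores the resulting count histogram under a per-category multinomial word distribution (a naive Bayes style generative model). The function names, the 2-D toy descriptors, and the category probabilities are all hypothetical.

```python
import math
from collections import Counter

def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest (squared Euclidean) to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))

def bag_of_words(descriptors, codebook):
    """Histogram of visual-word counts for one image."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]

def log_likelihood(histogram, word_probs):
    """Multinomial log-likelihood of the counts under one category (up to a constant)."""
    return sum(n * math.log(p) for n, p in zip(histogram, word_probs))

def classify(histogram, category_models):
    """Pick the category whose word distribution best explains the histogram."""
    return max(category_models,
               key=lambda c: log_likelihood(histogram, category_models[c]))

# Toy data (hypothetical): a 3-word dictionary over 2-D descriptors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]
category_models = {
    "zebra": [0.2, 0.6, 0.2],        # assumed word probabilities per category
    "background": [0.6, 0.2, 0.2],
}

histogram = bag_of_words(descriptors, codebook)
label = classify(histogram, category_models)
print(histogram, label)
```

In practice the dictionary would be learnt from training descriptors (for example by clustering) and the word probabilities estimated from labelled images; both are fixed by hand here to keep the sketch short.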

Part-based models
An object in an image is represented by a collection of parts, characterized by both their visual appearances and locations. Object categories are modeled by the appearance and spatial distributions of these characteristic parts. Issues for such models include efficient methods for finding correspondences between the object and the scene.
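To make the correspondence issue concrete, here is a small sketch under simplifying assumptions: each candidate detection carries a hypothetical per-part appearance log-likelihood, the shape term is an isotropic Gaussian on each part's offset from part 0, and the best assignment is found by brute-force enumeration. The dictionary layout (`xy`, `app_loglik`, `offsets`, `sigma`) is invented for illustration and is not the model used in the course.

```python
import itertools

def configuration_score(assignment, candidates, model):
    """Log-score of assigning one candidate detection to each model part.

    Appearance: per-part log-likelihood stored with each candidate.
    Shape: isotropic Gaussian penalty on each part's offset from part 0.
    """
    ref_x, ref_y = candidates[assignment[0]]["xy"]
    score = 0.0
    for part, cand_idx in enumerate(assignment):
        cand = candidates[cand_idx]
        score += cand["app_loglik"][part]
        dx = cand["xy"][0] - ref_x - model["offsets"][part][0]
        dy = cand["xy"][1] - ref_y - model["offsets"][part][1]
        score -= (dx * dx + dy * dy) / (2.0 * model["sigma"] ** 2)
    return score

def best_configuration(candidates, model):
    """Exhaustive search over all assignments of distinct candidates to parts."""
    n_parts = len(model["offsets"])
    return max(itertools.permutations(range(len(candidates)), n_parts),
               key=lambda a: configuration_score(a, candidates, model))

# Hypothetical 3-part model: expected offsets of each part relative to part 0.
model = {"offsets": [(0, 0), (10, 0), (0, 10)], "sigma": 2.0}

# Four candidate detections; the last one is background clutter.
candidates = [
    {"xy": (5, 5),   "app_loglik": [-0.1, -3.0, -3.0]},
    {"xy": (15, 5),  "app_loglik": [-3.0, -0.1, -3.0]},
    {"xy": (5, 15),  "app_loglik": [-3.0, -3.0, -0.1]},
    {"xy": (40, 40), "app_loglik": [-1.0, -1.0, -1.0]},
]

best = best_configuration(candidates, model)
print(best)
```

With N candidates and P parts the search visits N!/(N-P)! assignments, which is why efficient correspondence methods matter once images contain many detections.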

Discriminative methods
Object detection and recognition is formulated as a classification problem. The image is partitioned into a set of overlapping windows, and a decision is taken at each window on whether it contains a target object or not. Each window is represented by extracting a large number of features that encode information such as boundaries, textures, color, and spatial structure. The classification function, which maps an image window to a binary decision, is learnt using methods such as SVMs, boosting or neural networks.
Figure labels: Zebra / Non-zebra
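The sliding-window formulation can be sketched as follows. This is an illustrative toy, not a trained detector: the two features (mean intensity and mean absolute horizontal gradient) and the hand-set linear weights are assumptions standing in for the large feature sets and learnt classifiers (SVMs, boosting, neural networks) mentioned above.

```python
def sliding_windows(width, height, win, stride):
    """Yield (x, y) top-left corners of overlapping win x win windows."""
    for y in range(0, height - win + 1, stride):
        for x in range(0, width - win + 1, stride):
            yield x, y

def window_features(image, x, y, win):
    """Toy features: mean intensity and mean absolute horizontal gradient."""
    rows = [image[r][x:x + win] for r in range(y, y + win)]
    mean = sum(sum(row) for row in rows) / (win * win)
    grad = sum(abs(row[c + 1] - row[c])
               for row in rows for c in range(win - 1)) / (win * (win - 1))
    return [mean, grad]

def classify_window(features, weights, bias):
    """Linear decision function: True means 'target object present'."""
    return sum(w * f for w, f in zip(weights, features)) + bias > 0

def detect(image, win, stride, weights, bias):
    """Run the binary classifier on every window and keep the positives."""
    height, width = len(image), len(image[0])
    return [(x, y) for x, y in sliding_windows(width, height, win, stride)
            if classify_window(window_features(image, x, y, win), weights, bias)]

# Toy 4 x 8 image: flat on the left, striped (zebra-like) on the right.
image = [[0, 0, 0, 0, 1, 0, 1, 0] for _ in range(4)]

# Hypothetical hand-set classifier that fires on strongly striped windows.
weights, bias = [0.0, 1.0], -0.8

detections = detect(image, win=4, stride=2, weights=weights, bias=bias)
print(detections)
```

A stride smaller than the window size is what makes the windows overlap; a real system would also scan over several scales and merge neighbouring detections.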

Segmentation and Recognition
The goal is to segment the image, at the pixel level, into foreground object and background clutter. To assist the segmentation, probabilistic models of the object category may be learnt. The problem may be formulated as one of graphical model inference, or graph partitioning.
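As one possible illustration of segmentation as graphical model inference, the sketch below labels each pixel foreground or background by minimising a simple grid energy: a unary cost taken from a per-pixel foreground probability (assumed to come from some learnt object model) plus a penalty for neighbouring pixels with different labels. The optimiser used here is iterated conditional modes (ICM), chosen only because it is short; it reaches a local minimum, whereas a min-cut formulation of the same binary energy would give the global one. All names and numbers are hypothetical.

```python
import math

def icm_segment(fg_prob, smoothness=1.0, sweeps=5):
    """Binary foreground (1) / background (0) labelling of a 2-D grid.

    Energy = sum of unary costs (-log probability of the label at each pixel)
           + smoothness * (number of 4-neighbour pairs with different labels).
    ICM greedily relabels one pixel at a time; it finds a local minimum only.
    """
    h, w = len(fg_prob), len(fg_prob[0])
    # Initialise by thresholding the per-pixel foreground probability.
    labels = [[1 if p > 0.5 else 0 for p in row] for row in fg_prob]

    def unary(r, c, label):
        p = fg_prob[r][c] if label == 1 else 1.0 - fg_prob[r][c]
        return -math.log(max(p, 1e-9))

    def pairwise(r, c, label):
        cost = 0.0
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and labels[rr][cc] != label:
                cost += smoothness
        return cost

    for _ in range(sweeps):
        changed = False
        for r in range(h):
            for c in range(w):
                best = min((0, 1),
                           key=lambda lab: unary(r, c, lab) + pairwise(r, c, lab))
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels

# Toy foreground probabilities: a 3 x 3 object with a weak centre pixel (0.4)
# and one spurious background pixel (0.6) in the bottom-left corner.
fg_prob = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.4, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.6, 0.1, 0.1, 0.1, 0.1],
]

segmentation = icm_segment(fg_prob)
for row in segmentation:
    print(row)
```

The smoothness term is what repairs the two noisy pixels: thresholding alone would leave a hole in the object and a stray foreground pixel in the corner.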

Some chairs
Related by function, not form

Some challenges

Links to datasets
The next tables summarize some of the available datasets for training and testing object detection and recognition algorithms. These lists are far from exhaustive.

Databases for object localization
Dataset | Link | Annotation | Contents
CMU/MIT frontal faces | vasc.ri.cmu.edu/idb/html/face/frontal_images, cbcl.mit.edu/software-datasets/FaceData2.html | Patches | Frontal faces
Graz-02 Database | | Segmentation masks | Bikes, cars, people
UIUC Image Database | l2r.cs.uiuc.edu/~cogcomp/Data/Car/ | Bounding boxes | Cars
TU Darmstadt Database | | | Motorbikes, cars, cows
LabelMe dataset | people.csail.mit.edu/brussell/research/LabelMe/intro.html | Polygonal boundary | >500 Categories

Databases for object recognition
Dataset | Link | Annotation | Contents
Caltech 101 | | Segmentation masks | 101 categories
COIL-100 | www1.cs.columbia.edu/CAVE/research/softlib/coil-100.html | Patches | 100 instances
NORB | | Bounding box | 50 toys

On-line annotation tools
Tool | Link | Annotation | Contents
ESP game | | Global image descriptions | Web images
LabelMe | people.csail.mit.edu/brussell/research/LabelMe/intro.html | Polygonal boundary | High resolution images
