In Search of Objects: 50 years of wondering 16-721: Learning-Based Methods in Vision A. Efros, CMU, Spring 2009.



Object recognition: is it really so hard? This is a chair. Find the chair in this image. Output of normalized correlation. Slide by Antonio Torralba

Object recognition: is it really so hard? Antonio’s biggest concern: how do I justify 50 years of research if this experiment did work? Find the chair in this image. Pretty much garbage. Simple template matching is not going to make it. Slide by Antonio Torralba
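The "output of normalized correlation" experiment can be sketched in a few lines. This is a minimal illustration, not the code behind the slide: it assumes grayscale images stored as nested Python lists, and the function names `normalized_correlation` and `match_template` are made up for the example.

```python
import math

def normalized_correlation(patch, template):
    """Zero-mean normalized cross-correlation of two equal-sized 2D lists, in [-1, 1]."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mp = sum(p) / len(p)
    mt = sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mt) ** 2 for b in t))
    # A constant patch or template has no contrast; report no correlation.
    return num / den if den > 0 else 0.0

def match_template(image, template):
    """Slide the template over every placement; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = normalized_correlation(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best
```

The score is invariant only to brightness and contrast changes of the patch. Any change in viewpoint, shape, or background clutter lowers it, which is consistent with the slide's verdict on simple template matching.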

The Religious Wars Geometry vs. Appearance Parts vs. The Whole …and the standard answer: probably both or neither

Geometry First

Roberts and the Blocks World (1960s). Object Recognition in the Geometric Era: a Retrospective. Joseph L. Mundy. If you don’t like the world – get a new one!

Binford and generalized cylinders (1970s). Object Recognition in the Geometric Era: a Retrospective. Joseph L. Mundy. I am a cylinder, you are a cylinder

Biederman and Recognition-by-components. Irving Biederman, Recognition-by-Components: A Theory of Human Image Understanding. Psychological Review. 1) We know that this object is nothing we know. 2) We can split this object into parts that everybody will agree on. 3) We can see how it resembles something familiar: “a hot dog cart”.

Objects and their geons Hypothesis: there is a small number of geometric components that constitute the primitive elements of the object recognition system (like letters to form words).

Aspect Graphs and their demise

Appearance Makes an Appearance

Eigenfaces: NN in low-dim subspace (1990s). Sirovich & Kirby (1987), Turk & Pentland (1991). It later turns out that simple NN works just as well…

Columbia Object Image Library (COIL), 1996 Squash 3D pose variation with data!

Object not cropped? No problem!

The Age of Sliding Window Craziness: Rowley et al., 1998; Schneiderman & Kanade, 1999; Viola & Jones, 2001; etc.

What is a Sliding Window Approach? Search over space and scale. Detection as subwindow classification problem. “In the absence of a more intelligent strategy, any global image classification approach can be converted into a localization approach by using a sliding-window search.”... Slide by Bastian Leibe
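The conversion described in the quote, from a whole-image classifier to a detector, can be sketched as a search over position and scale. This is an illustrative sketch only: the window sizes, the stride, and the toy `mean_brightness` scorer are assumptions standing in for a real trained classifier.

```python
def sliding_windows(height, width, base=4, scales=(1, 2), stride=2):
    """Yield (row, col, size) for square windows over position and scale."""
    for s in scales:
        size = base * s
        step = stride * s  # coarser steps for larger windows
        for r in range(0, height - size + 1, step):
            for c in range(0, width - size + 1, step):
                yield r, c, size

def detect(image, classify, threshold, **window_args):
    """Score every subwindow with a global classifier; keep those above threshold."""
    h, w = len(image), len(image[0])
    hits = []
    for r, c, size in sliding_windows(h, w, **window_args):
        window = [row[c:c + size] for row in image[r:r + size]]
        score = classify(window)
        if score >= threshold:
            hits.append((score, r, c, size))
    return sorted(hits, reverse=True)

def mean_brightness(window):
    """Toy stand-in for a trained classifier: average pixel value."""
    return sum(map(sum, window)) / (len(window) * len(window[0]))
```

The number of windows grows with image area times the number of scales, which is why detectors of that era put so much effort into making the per-window classifier cheap. A practical detector would also apply non-maximum suppression to the overlapping hits, which is omitted here.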

What features to match? SSD is too strict. Need a bit of invariance to appearance, focus, and contours. Edges (Chamfer/Hausdorff/…); Wavelets / Filters / Jets …; Blur (Geometric Blur, …); Spatial Histograms (SIFT, HOG, gist, Shape Context, …). Slide inspired by Deva Ramanan

Edge Matching. Edge template (hand-drawn from footage, or automatically generated from CAD models). Image scene: real-world, real-time video footage. Template sliding.

Edge Map, Distance Transform. Chamfer / Hausdorff Distance. The Chamfer distance is the average distance from each template point to the nearest image feature. The Hausdorff distance is the distance from the worst-matching object pixel to its closest image pixel.
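Both distances can be read off a distance transform of the image edge map. The sketch below is an assumption-laden illustration: it uses a brute-force Euclidean distance transform (real systems use fast approximations), expects at least one edge pixel, and computes only the directed template-to-image Hausdorff distance.

```python
import math

def distance_transform(edge_map):
    """Brute-force Euclidean distance from every pixel to the nearest edge pixel."""
    edges = [(r, c) for r, row in enumerate(edge_map)
             for c, v in enumerate(row) if v]
    return [[min(math.hypot(r - er, c - ec) for er, ec in edges)
             for c in range(len(edge_map[0]))]
            for r in range(len(edge_map))]

def chamfer_and_hausdorff(template_points, dt, offset=(0, 0)):
    """Place the template at `offset` and look up each point in the distance transform.

    Returns (average distance, worst distance): the chamfer score and the
    directed Hausdorff score for this placement.
    """
    dr, dc = offset
    d = [dt[r + dr][c + dc] for r, c in template_points]
    return sum(d) / len(d), max(d)
```

Because the distance transform is computed once per image, scoring each template placement costs only one lookup per template point. The average is tolerant of a few badly matching points, while the maximum is dominated by the single worst one.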

Wavelets / Filters / Jets: Schneiderman & Kanade, 1999; Viola & Jones, 2001

[Figure: blurring gradients. Panels: half-wave rectification, blur, blurred result.]

Histograms (of gradients): Freeman and Roth, IAFGR 1995; Lowe, ICCV 1999; Oliva & Torralba, 2001; Belongie et al., 2001; Dalal & Triggs, CVPR 2005. Gradients within 8×8 patch. Bin into local (4×4) neighborhoods & 8 orientations. Binning achieves invariance to small patch offsets. Shape Context. Gist.
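The binning step can be sketched as follows, using the numbers on the slide (8×8 patch, 4×4 neighborhoods, 8 orientations). This is a simplified illustration and not SIFT or HOG as published: it uses hard binning with no interpolation or normalization, and the function name `gradient_histograms` is hypothetical.

```python
import math

def gradient_histograms(patch, cell=4, bins=8):
    """Per-cell histograms of gradient orientation, weighted by gradient magnitude."""
    h, w = len(patch), len(patch[0])
    cells_r, cells_c = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(cells_c)] for _ in range(cells_r)]
    for r in range(h):
        for c in range(w):
            # Central differences, clamped at the patch border.
            gx = patch[r][min(c + 1, w - 1)] - patch[r][max(c - 1, 0)]
            gy = patch[min(r + 1, h - 1)][c] - patch[max(r - 1, 0)][c]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            # Leftover border pixels fall into the last full cell.
            cr = min(r // cell, cells_r - 1)
            cc = min(c // cell, cells_c - 1)
            hist[cr][cc][b] += mag
    return hist
```

For an 8×8 patch this yields 2×2 cells of 8 bins each. Shifting an edge by one pixel leaves the histograms unchanged as long as its gradients stay inside the same cell, which is the invariance to small offsets that the slide refers to.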

Matching Parts

Why Matching?
Old idea:
– Statistical Pattern Theory (Ulf Grenander)
– Deformable Templates
– Fischler & Elschlager
– Etc., at least by the early 1970s
“Transform” and “appearance” parameters
Matching to estimate transform
MODEL, TRANSFORM, IMAGE
Slide by Alex Berg


Why Matching?
Old idea:
– Statistical Pattern Theory (Ulf Grenander)
– Deformable Templates
– Fischler & Elschlager
– Etc., at least by the early 1970s
“Transform” and “appearance” parameters
Matching to estimate transform:
– Searching over diffeomorphisms difficult
– Searching over discrete assignments easier?
MODEL, TRANSFORM, IMAGE
Slide by Alex Berg

Why parts? Model of Car Image ? Slide by Alex Berg

Why Parts? Model of Car Image Slide by Alex Berg


Huttenlocher & Ullman and Alignment

Lowe and the birth of SIFT (1999)

On to object classes! Slide by Alex Berg

Quadratic Assignment (Adding Geometric Constraints) Slide by Alex Berg

Model: Parts and Structure Slide by Rob Fergus

Representation
Object as set of parts:
– Generative representation
Model:
– Relative locations between parts
– Appearance of part
Issues:
– How to model location
– How to represent appearance
– Sparse or dense (pixels or regions)
– How to handle occlusion/clutter
Figure from [Fischler & Elschlager 73]

History of Parts and Structure approaches: Fischler & Elschlager 1973; Yuille ’91; Brunelli & Poggio ’93; Lades, v.d. Malsburg et al. ’93; Cootes, Lanitis, Taylor et al. ’95; Amit & Geman ’95, ’99; Perona et al. ’95, ’96, ’98, ’00, ’03, ’04, ’05; Felzenszwalb & Huttenlocher ’00, ’04; Crandall & Huttenlocher ’05, ’06; Leibe & Schiele ’03, ’04. Many papers since 2000. Slide by Rob Fergus

Constellation Models
+ Sparse representation
+ Computationally tractable (10^5 pixels → parts)
+ Avoid modeling global variability
- Throw away most image information
- Parts need to be distinctive to separate from other classes
Slide by Rob Fergus

Different connectivity structures (from Sparse Flexible Models of Local Features, Gustavo Carneiro and David Lowe, ECCV 2006). Complexities shown: O(N^6), O(N^2), O(N^3), O(N^2). Fergus et al. ’03; Fei-Fei et al. ’03; Crandall et al. ’05; Fergus et al. ’05; Crandall et al. ’05; Felzenszwalb & Huttenlocher ’00; Bouchard & Triggs ’05; Carneiro & Lowe ’06; Csurka ’04; Vasconcelos ’00

Trouble with trees Limbs attracted to regions of high likelihood (local image evidence is double-counted) Lan & Huttenlocher, ICCV05 Slide by Deva Ramanan

Pictorial Structure Models
• Parts have match quality at each location
– Location in a configuration space
– No feature detection
• Maps for parts combined together into overall quality map
– According to underlying graph structure
Slide by Pedro

Matching Pictorial Structures
• Cost map for each part
• Distance transform (soft max) using spatial model
• Shift and combine
– Localize root then recursively other parts
Slide by Pedro
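The three matching steps (per-part cost maps, distance transform, shift and combine) can be sketched in one dimension for a star-shaped model. This is a simplified sketch under stated assumptions: it minimizes costs rather than maximizing quality, it uses a quadratic deformation penalty, and it computes the generalized distance transform by brute force in O(n²) rather than with a linear-time algorithm. The function names are hypothetical.

```python
def distance_transform_1d(cost, weight=1.0):
    """Brute-force generalized distance transform.

    out[p] = min over q of cost[q] + weight * (p - q)^2, and arg[p] is the
    minimizing q, i.e. the best actual part location for ideal location p.
    """
    n = len(cost)
    out, arg = [], []
    for p in range(n):
        q = min(range(n), key=lambda q: cost[q] + weight * (p - q) ** 2)
        out.append(cost[q] + weight * (p - q) ** 2)
        arg.append(q)
    return out, arg

def match_star_model(root_cost, part_costs, offsets, weight=1.0):
    """Return (total_cost, root_location, part_locations) of the best configuration."""
    n = len(root_cost)
    total = list(root_cost)
    args = []
    for cost, off in zip(part_costs, offsets):
        dt, arg = distance_transform_1d(cost, weight)
        args.append(arg)
        for x in range(n):
            # Shift: a root at x expects this part at x + off (clamped to the map).
            ideal = min(max(x + off, 0), n - 1)
            total[x] += dt[ideal]
    # Localize the root first, then read back each part's best location.
    root = min(range(n), key=total.__getitem__)
    parts = [arg[min(max(root + off, 0), n - 1)] for arg, off in zip(args, offsets)]
    return total[root], root, parts
```

In the usage below, the part's best appearance match lies one pixel away from its ideal position, so the model pays a deformation cost of 1 instead of accepting a poor appearance match at the ideal position.

```python
root_cost = [5, 5, 5, 0, 5, 5, 5]
part_cost = [5, 5, 5, 5, 5, 5, 0]
print(match_star_model(root_cost, [part_cost], offsets=[2]))
```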

Sparse Part Voting. Part based: we create weak detectors by using parts and voting for the object center location. Car model. Screen model. Slide by Antonio Torralba

Implicit shape model
Learning:
– Learn appearance codebook: cluster over interest points on training images
– Learn spatial distributions: match codebook to training images, record matching positions on object (centroid is given)
Recognition:
– Use Hough space voting to find object
[Figure labels: Interest Points; Matched Codebook Entries; Probabilistic Voting; Spatial occurrence distributions over (x, y, s).]
Leibe and Schiele ’03, ’05
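The recognition stage can be sketched as voting into a binned accumulator. This is a toy illustration rather than the authors' implementation: it votes over (x, y) only and omits the scale dimension s, it assumes codebook matching has already been done, and the names `hough_vote` and `codebook_offsets` are hypothetical.

```python
from collections import Counter

def hough_vote(matches, codebook_offsets, bin_size=4):
    """Accumulate votes for the object center; return (best_bin_center, votes).

    matches: list of (x, y, codebook_id) for interest points matched to the codebook.
    codebook_offsets: codebook_id -> list of (dx, dy) offsets to the object centroid
    recorded during training. Expects at least one match with a known codebook id.
    """
    acc = Counter()
    for x, y, entry in matches:
        offsets = codebook_offsets.get(entry, [])
        for dx, dy in offsets:
            cx, cy = x + dx, y + dy
            # Each matched feature spreads one unit of vote over its stored offsets.
            acc[(cx // bin_size, cy // bin_size)] += 1.0 / len(offsets)
    (bx, by), votes = max(acc.items(), key=lambda kv: kv[1])
    return ((bx + 0.5) * bin_size, (by + 0.5) * bin_size), votes
```

A wheel-like entry is ambiguous on its own, since it could be the front or the rear wheel, so it casts two half-votes. Consistent evidence from several parts still accumulates in a single bin, and that bin becomes the object hypothesis.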

Perceptual and Sensory Augmented Computing. Discussion Session: Sliding Windows. Duality to Sliding Window Approaches… How to find maxima in the Hough space efficiently? Maxima search = coarse-to-fine sliding window stage! [Figure panels: Hough votes; binned accumulator array; candidate maxima; refinement (MSME); axes x, y, s.] Slide by Bastian Leibe