Perceptual and Sensory Augmented Computing, Discussion Session: Sliding Windows. Sliding Windows – Silver Bullet or Evolutionary Deadend? Alyosha Efros, Bastian Leibe, Krystian Mikolajczyk

Presentation transcript:

Perceptual and Sensory Augmented Computing
Discussion Session: Sliding Windows
Sliding Windows – Silver Bullet or Evolutionary Deadend?
Alyosha Efros, Bastian Leibe, Krystian Mikolajczyk
Sicily Workshop, Syracusa,

What is a Sliding Window Approach?
• Search over space and scale
• Detection as subwindow classification problem
• “In the absence of a more intelligent strategy, any global image classification approach can be converted into a localization approach by using a sliding-window search.”...
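
The search over space and scale described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `sliding_windows` and `detect`, the stride, and the scale step are hypothetical and not taken from the slides, and `classify` stands in for any subwindow classifier that returns a confidence score. Real detectors usually rescale the image instead of the window, which is equivalent for the purpose of enumerating candidates.

```python
from typing import Callable, Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def sliding_windows(img_w: int, img_h: int, win_w: int, win_h: int,
                    stride: int = 8, scale_step: float = 1.25) -> Iterator[Box]:
    """Yield every candidate subwindow over position and scale."""
    if scale_step <= 1.0:
        raise ValueError("scale_step must be > 1 so the search terminates")
    scale = 1.0
    while True:
        w, h = round(win_w * scale), round(win_h * scale)
        if w > img_w or h > img_h:
            break  # window no longer fits into the image
        step = max(1, round(stride * scale))  # stride grows with the window
        for y in range(0, img_h - h + 1, step):
            for x in range(0, img_w - w + 1, step):
                yield (x, y, w, h)
        scale *= scale_step


def detect(img_w: int, img_h: int, win_w: int, win_h: int,
           classify: Callable[[Box], float], threshold: float = 0.0,
           **search) -> List[Tuple[Box, float]]:
    """Detection as subwindow classification: keep windows scoring above threshold."""
    detections = []
    for box in sliding_windows(img_w, img_h, win_w, win_h, **search):
        score = classify(box)  # hypothetical classifier, one call per subwindow
        if score > threshold:
            detections.append((box, score))
    return detections
```

Every window is scored independently, which is what makes the process easy to parallelize and also what makes the number of classifier calls grow so quickly.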

Task: Object Localization in Still Images
What options do we have to choose from?
• Sliding window approaches
  – Classification problem
  – [Papageorgiou & Poggio,’00], [Schneiderman & Kanade,’00], [Viola & Jones,’01], [Mikolajczyk et al.,’04], [Torralba et al.,’04], [Dalal & Triggs,’05], [Wu & Nevatia,’05], [Laptev,’06],…
• Feature-transform based approaches
  – Part-based generative models, typically with a star topology
  – [Fergus et al.,’03], [Leibe & Schiele,’04], [Fei-Fei et al.,’04], [Felzenszwalb & Huttenlocher,’05], [Winn & Criminisi,’06], [Opelt et al.,’06], [Mikolajczyk et al.,’06],…
• Massively parallel NN architectures
  – e.g. convolutional NNs
  – [LeCun et al.,’98], [Osadchy et al.,’04], [Garcia et al.,??],…
• “Smart segmentation” based approaches
  – Localization based on robustified bottom-up segmentation
  – [Todorovic & Ahuja,’06], [Roth & Ommer,’06]

Sliding-Window Approaches
Pros:
• Can draw from vast stock of ML methods.
• Independence assumption between subwindows.
  – Makes classification easier.
  – Process can be parallelized.
• Simple technique, can be tried out very easily.
  – No translation/scale invariance required in model.
• There are methods to do it very fast.
  – Cascades with AdaBoost/SVMs
• Good detection performance on many benchmark datasets.
  – e.g. face detection, VOC challenges
• Direct control over search range (e.g. on ground plane).
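
The cascade idea behind "methods to do it very fast" can be sketched as follows. This is a minimal illustration, not the implementation of any cited detector: `cascade_accepts` and the `(score_fn, threshold)` stage representation are assumptions made for the example. The point is that a window is rejected at the first stage it fails, so most windows only pay for the cheap early stages.

```python
from typing import Callable, Sequence, Tuple

# A stage is a scoring function plus the minimum score needed to pass it.
Stage = Tuple[Callable[[object], float], float]


def cascade_accepts(window: object, stages: Sequence[Stage]) -> Tuple[bool, int]:
    """Return (accepted, number of stages evaluated) for one subwindow."""
    for i, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, i  # early rejection: later stages are never run
    return True, len(stages)


# Toy usage: "windows" are plain numbers, stages are cheap-to-expensive tests.
stages = [(lambda w: w, 1.0), (lambda w: 2 * w, 20.0)]
print(cascade_accepts(0, stages))   # rejected by the first stage
print(cascade_accepts(10, stages))  # passes both stages
```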

Sliding-Window Approaches
Cons:
• Can draw from vast stock of ML methods… …as long as they can be evaluated in a few ms.
• Need to evaluate many subwindows (100’000s).
  ⇒ Needs very fast & accurate classification
• Many training examples required, often limited to low training resolution.
• Can only deal with relatively small occlusions.
• Still need to fuse resulting detections
  ⇒ Hard/suboptimal from binary classification output
• Classification task often ill-defined
  – How to label half a car?
• Difficult to deal with changing aspect ratios
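
To make the fusion step concrete, here is a sketch of greedy non-maximum suppression, one common way to merge overlapping detections. It is an assumption that this is the kind of fusion the slide has in mind; the function names and the 0.5 overlap threshold are illustrative. Note that the procedure ranks detections by score, which is exactly what a purely binary classifier output does not provide.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def nms(detections: List[Tuple[Box, float]],
        iou_thresh: float = 0.5) -> List[Tuple[Box, float]]:
    """Keep the best-scoring box of each group of strongly overlapping boxes."""
    kept: List[Tuple[Box, float]] = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, k) <= iou_thresh for k, _ in kept):
            kept.append((box, score))
    return kept
```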

Duality to Feature-Based Approaches…
How to find maxima in the Hough space efficiently?
Maxima search = coarse-to-fine sliding window stage!
Main differences:
• All features evaluated upfront (instead of in cascade).
• Generative model instead of discriminative classifier.
• Maxima search already performs detection fusion.
[Figure: four panels over axes x, y, s: Hough votes, binned accumulator array, candidate maxima, refinement (MSME)]
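
The coarse stage of the maxima search can be sketched as a binned accumulator over (x, y, s). This is a simplified illustration under stated assumptions: votes are given directly as `(x, y, s, weight)` tuples, the bin sizes and `min_weight` are arbitrary, and the refinement step shown in the figure is left out. Scanning the bins for local maxima is what makes this stage resemble a coarse sliding-window search.

```python
from itertools import product
from typing import Dict, Iterable, List, Tuple

Vote = Tuple[float, float, float, float]  # (x, y, scale, weight)
Bin = Tuple[int, int, int]


def accumulate(votes: Iterable[Vote], xy_bin: float = 10.0,
               s_bin: float = 0.5) -> Dict[Bin, float]:
    """Sum vote weights in a binned (x, y, s) accumulator array."""
    acc: Dict[Bin, float] = {}
    for x, y, s, w in votes:
        key = (int(x // xy_bin), int(y // xy_bin), int(s // s_bin))
        acc[key] = acc.get(key, 0.0) + w
    return acc


def candidate_maxima(acc: Dict[Bin, float],
                     min_weight: float = 2.0) -> List[Tuple[Bin, float]]:
    """Return bins that are local maxima among their 26 neighbours."""
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    maxima = []
    for (i, j, k), weight in acc.items():
        if weight < min_weight:
            continue
        neighbours = (acc.get((i + di, j + dj, k + dk), 0.0)
                      for di, dj, dk in offsets)
        if all(weight >= n for n in neighbours):
            maxima.append(((i, j, k), weight))
    return sorted(maxima, key=lambda m: m[1], reverse=True)
```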

So What is Left to Oppose?
1. Feature-based vs. Window-based?
2. (Almost) exclusive use of discriminative methods
3. Low training resolutions
4. How to deal with changing aspect ratios?

1. Feature-based vs. Window-based
May be mainly an implementation trade-off
• Few, localized features ⇒ feature-based evaluation better
• Many, dense features ⇒ window-based evaluation better
• Noticed already by e.g. [Schneiderman,’04]
• The trade-offs may change as your method develops…

2. Exclusive Use of Discriminative Methods
[Figure: detection pipeline from [Leibe & Schiele,’04]: Interest Points, Matched Codebook Entries, Probabilistic Voting, 3D Voting Space (continuous; axes x, y, s), Backprojection of Maxima, Backprojected Hypotheses, Segmentation, p(figure) Probabilities]
Gen. Model inside!

Generative Models for Sliding Windows
Continuous confidence scores
• Smoother maxima in hypothesis space
• Coarser sampling possible
Backprojection capability
• Determine a hypothesis’s support in the image
• Resolve overlapping cases
Easier to deal with partial occlusion
• Part-based models
• Reasoning about missing parts

Sliding Windows for Generative Models
Apply cascade idea to generative models
• Discriminative training
• Evaluate most promising features first
Direct control over search range
• Only need to evaluate positions in search corridor
• Only need to consider subset of features
• Easier to adapt to different geometry (e.g. curved ground surface)
⇒ Should combine discriminative and generative elements!
[Figure: search corridor in (x, y, s) space]
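
One way to picture a search corridor is a ground-plane constraint that ties the expected object size to its image position. The sketch below is an assumption-laden illustration, not the method from the slides: it assumes a level pinhole camera over flat ground, where an object of real height H whose foot point lies `y_foot - horizon_y` pixels below the horizon appears roughly `H * (y_foot - horizon_y) / camera_height` pixels tall. All names and the 25% tolerance are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), y grows downwards


def expected_height_px(y_foot: float, horizon_y: float,
                       cam_height_m: float, obj_height_m: float) -> float:
    """Expected pixel height of an object standing on flat ground (level camera assumed)."""
    return obj_height_m * (y_foot - horizon_y) / cam_height_m


def in_search_corridor(box: Box, horizon_y: float, cam_height_m: float,
                       obj_height_m: float, tol: float = 0.25) -> bool:
    """True if the window's size is plausible for its position on the ground plane."""
    _, y, _, h = box
    y_foot = y + h
    if y_foot <= horizon_y:
        return False  # foot point above the horizon cannot lie on the ground
    expected = expected_height_px(y_foot, horizon_y, cam_height_m, obj_height_m)
    return abs(h - expected) <= tol * expected
```

Filtering candidate windows with such a test before classification means only the positions and scales inside the corridor are ever evaluated.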

3. Low Training Resolutions
Many current sliding-window detectors operate on tiny images
• Viola & Jones: 24 × 24 pixels
• Torralba et al.: 32 × 32 pixels
• Dalal & Triggs: 64 × 96 pixels (notable exception)
Main reasons
• Training efficiency (exhaustive feature selection in AdaBoost)
• Evaluation speed
• Want to recognize objects at small scales
But…
• Limited information content available at those resolutions
• Not enough support to compensate for occlusions!
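
A quick count shows why exhaustive feature selection pushes training toward small windows. The sketch below counts upright rectangle features in a square window, assuming the five basic Haar-like templates (two-, three-, and four-rectangle patterns with base sizes 2×1, 1×2, 3×1, 1×3, and 2×2). That feature set is an assumption for illustration; the slides do not specify it.

```python
from typing import Iterable, Tuple


def count_rect_features(win: int = 24,
                        shapes: Iterable[Tuple[int, int]] = (
                            (2, 1), (1, 2), (3, 1), (1, 3), (2, 2))) -> int:
    """Count all positions and integer scalings of each template inside a win x win window."""
    total = 0
    for base_w, base_h in shapes:
        # Feature sizes are integer multiples of the base template size.
        for w in range(base_w, win + 1, base_w):
            for h in range(base_h, win + 1, base_h):
                total += (win - w + 1) * (win - h + 1)  # placements of a w x h feature
    return total


print(count_rect_features(24))  # 162336 candidate features for a 24 x 24 window
```

Each boosting round has to evaluate every candidate feature on every training example, so the candidate pool, which grows rapidly with window size, directly drives training cost.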

4. Changing Aspect Ratios
Sliding window requires fixed window size
• Basis for learning efficient cascade classifier
How to deal with changing aspect ratios?
• Fixed window size ⇒ Wastes training dimensions
• Adapted window size ⇒ Difficult to share features
• “Squashed” views [Dalal & Triggs] ⇒ Need to squash test image, too

What is wrong with sliding window?
Search complexity?

Is there anything that cannot be done with sliding window?

Sliding-Window Approaches
Pros:
• Can draw from vast stock of ML methods.
• Simple technique, can be tried out very easily.
• There are methods to do it very fast.
• Good detection performance on many benchmark datasets.
• Direct control over search range (e.g. on ground plane).
Cons:
• Need to evaluate many subwindows (100’000s).
  ⇒ Needs very fast & accurate classification ⇒ cascades, AdaBoost
• Many training examples, often limited to low training resolution.
• Can only deal with relatively small occlusions.
• Still need to fuse resulting detections
  ⇒ Hard/suboptimal from binary classification output
• Difficult to deal with changing aspect ratios

So What is Left to Oppose?
Feature-based vs. Window-based?
• Mainly implementation trade-off…
(Almost) exclusive use of discriminative methods
• Why not apply generative methods instead, or combinations?
• Smoother maxima in sampled 3D space.
• Ability to backproject responses (top-down segmentation).
• Easier to deal with partial occlusions.
Low training resolutions
• Only limited information content
How to deal with changing aspect ratios?
• E.g. front & side views of cars?
• Fixed/adaptive window size?
• How to share features between those?