TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation

Presentation transcript:

TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation
J. Shotton; University of Cambridge
J. Winn, C. Rother, A. Criminisi; MSR Cambridge
Presented by Derek Hoiem for Misc Reading, 02/15/06

The Ideas in TextonBoost
– Textons from Universal Visual Dictionary paper [Winn Criminisi Minka ICCV 2005]
– Color models and GC from “Foreground Extraction using Graph Cuts” [Rother Kolmogorov Blake SG 2004]
– Boosting + Integral Image from Viola-Jones
– Joint Boosting from [Torralba Murphy Freeman CVPR 2004]

What’s good about this paper
– Provides recognition + segmentation for many classes (perhaps most complete set ever)
– Combines several good ideas
– Very thorough evaluation

What’s bad about this paper
– A bit hacky
– Does not beat past work (in terms of quantitative recognition results)
– No modeling of “everything else” class

Object Recognition and Segmentation are Coupled
[Figure: images from [Leibe et al. 2005]. Panel labels: Approximate Segmentation, Good Segmentation, No Segmentation, People Present]

The Three Approaches
– Segment → Detect
– Detect → Segment
– Segment ↔ Detect

Segment first and ask questions later.
– Reduces possible locations for objects
– Allows use of shape information and makes long-range cues more effective
– But what if segmentation is wrong?
[Duygulu et al ECCV 2002]

Object recognition + data-driven smoothing
– Object recognition drives segmentation
– Segmentation gives little back
Examples: He et al.; this paper

Is there a better way?
– Integrated segmentation and recognition
– Generalized Swendsen-Wang
[Tu et al. 2003] [Barba Wu 2005]

TextonBoost Overview
– Shape-texture: localized textons
– Color: mixture of Gaussians
– Location: normalized x-y coordinates
– Edges: contrast-sensitive Potts model
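One plausible way to write the four terms listed on this slide as a single CRF log-probability is sketched below. The symbol names (ψ for shape-texture, π for color, λ for location, φ for edges, E for the set of neighboring pixel pairs, Z for the partition function) are chosen here for illustration and do not appear on the slides.

```latex
\log P(\mathbf{c} \mid \mathbf{x}, \theta) =
  \sum_i \Big[ \psi_i(c_i, \mathbf{x}; \theta_\psi)
             + \pi(c_i, x_i; \theta_\pi)
             + \lambda(c_i, i; \theta_\lambda) \Big]
  + \sum_{(i,j) \in \mathcal{E}} \phi(c_i, c_j, \mathbf{x}; \theta_\phi)
  - \log Z(\theta, \mathbf{x})
```

The three unary terms depend on one pixel label each; only the edge term couples neighboring labels, which is what makes graph-cut style inference applicable.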

Learning the CRF Params
The authors claim to be using piecewise training … [Sutton McCallum UAI 2005]

Learning the CRF Params
But it’s really just piecewise hacking
– Learn params for different potential functions independently
– Raise potentials to some exponent to reduce overcounting
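The effect of raising independently learned potentials to an exponent can be illustrated with a small sketch. This is not the authors' code: the function names and the toy numbers are hypothetical, and the sketch only shows why an exponent below 1 damps overcounting when two terms carry the same evidence.

```python
import math

def combined_log_score(potentials, exponents):
    """Sum of w_k * log(p_k): the product of potentials, each raised to w_k.

    Assumes every potential value is strictly positive.
    """
    return sum(w * math.log(p) for p, w in zip(potentials, exponents))

def label_posterior(per_class_potentials, exponents):
    """Normalize the combined scores over classes (hypothetical helper)."""
    scores = {c: combined_log_score(p, exponents)
              for c, p in per_class_potentials.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    norm = sum(math.exp(s - top) for s in scores.values())
    return {c: math.exp(s - top) / norm for c, s in scores.items()}

# Two potentials that repeat the same evidence (0.8 vs 0.2).
evidence = {"cow": [0.8, 0.8], "grass": [0.2, 0.2]}
overcounted = label_posterior(evidence, [1.0, 1.0])  # "cow" is about 0.94
damped = label_posterior(evidence, [0.5, 0.5])       # "cow" is back at 0.8
```

With exponents of 1 the duplicated evidence is counted twice and the posterior becomes overconfident; with exponents of 0.5 the two copies together count as one.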

Location Term
– Counts for each class at each normalized position, accumulated over the training images
– Exponent chosen on validation data
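A minimal sketch of such a location term follows, assuming label maps are given as nested lists of class indices and positions are binned onto a coarse normalized grid. The grid size, the smoothing constant `alpha`, and the exact ratio form are assumptions made for illustration; the slide only states that per-class counts at normalized positions are used.

```python
def location_potentials(label_maps, num_classes, grid=2, alpha=1.0, exponent=1.0):
    """Return pot[c][gy][gx]: smoothed fraction of class c at each grid cell.

    label_maps: list of 2D lists of class indices (images may differ in size).
    alpha:      additive smoothing (an assumption, not stated on the slide).
    exponent:   the power applied to the potential, to be tuned on validation.
    """
    counts = [[[0] * grid for _ in range(grid)] for _ in range(num_classes)]
    totals = [[0] * grid for _ in range(grid)]
    for labels in label_maps:
        height, width = len(labels), len(labels[0])
        for y in range(height):
            for x in range(width):
                # Normalize pixel coordinates onto the shared grid.
                gy = y * grid // height
                gx = x * grid // width
                counts[labels[y][x]][gy][gx] += 1
                totals[gy][gx] += 1
    return [[[((counts[c][gy][gx] + alpha) / (totals[gy][gx] + alpha)) ** exponent
              for gx in range(grid)]
             for gy in range(grid)]
            for c in range(num_classes)]

# Toy training set: one 2x2 image with class 0 on top and class 1 below.
pot = location_potentials([[[0, 0], [1, 1]]], num_classes=2)
```

In the toy example class 0 gets a high potential in the top row and a lower one in the bottom row, which is the kind of prior (sky above, grass below) the term is meant to capture.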

Color Term
– Mixture of Gaussians learned over image
– Mixture coefficients determined separately for each class
– Iterate between class labeling and parameter estimation
– Manual: 3
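The parameter-estimation half of that iteration might look like the sketch below. It assumes the per-pixel cluster responsibilities P(k | x_i) from the image-wide mixture are already available, and that the per-class mixture coefficients are re-estimated from the pixels currently assigned to each class. Both function names are hypothetical and the update rule is one reasonable reading of the slide, not a confirmed reproduction of the paper.

```python
def class_mixture_weights(responsibilities, labels, num_classes):
    """Re-estimate per-class mixture coefficients from the current labeling.

    responsibilities[i][k]: P(cluster k | color of pixel i), from the shared GMM.
    labels[i]:              class currently assigned to pixel i.
    """
    num_clusters = len(responsibilities[0])
    sums = [[0.0] * num_clusters for _ in range(num_classes)]
    for resp, c in zip(responsibilities, labels):
        for k in range(num_clusters):
            sums[c][k] += resp[k]
    weights = []
    for row in sums:
        total = sum(row)
        # Fall back to uniform weights for classes absent from the labeling.
        weights.append([v / total for v in row] if total > 0
                       else [1.0 / num_clusters] * num_clusters)
    return weights

def color_potential(weights, resp_i, c):
    """Score of class c at one pixel: sum over k of weight[c][k] * P(k | x_i)."""
    return sum(w * r for w, r in zip(weights[c], resp_i))

resp = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = class_mixture_weights(resp, labels=[0, 1, 0], num_classes=2)
```

A full loop would alternate this update with re-running inference to obtain a new labeling, stopping after a fixed number of rounds.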

Edge Term
– Parameters learned using validation data
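A contrast-sensitive Potts edge cost can be sketched as follows. The form used here (a constant plus a term that decays with squared color difference, with the decay rate set from the image's mean contrast) follows the common GrabCut-style convention and is an assumption; the two `theta` values are placeholders standing in for the parameters the slide says are learned on validation data.

```python
import math

def contrast_beta(image):
    """Return 1 / (2 * mean squared color difference) over 4-connected pairs."""
    height, width = len(image), len(image[0])
    total, pairs = 0.0, 0
    for y in range(height):
        for x in range(width):
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < height and nx < width:
                    total += sum((a - b) ** 2
                                 for a, b in zip(image[y][x], image[ny][nx]))
                    pairs += 1
    if pairs == 0 or total == 0.0:
        return 0.0  # flat or single-pixel image: no contrast information
    return 1.0 / (2.0 * total / pairs)

def edge_cost(label_i, label_j, color_i, color_j, beta, theta=(2.0, 1.0)):
    """Penalty for giving neighbors different labels; zero if labels agree.

    theta holds placeholder weights, not values from the paper.
    """
    if label_i == label_j:
        return 0.0
    sq_diff = sum((a - b) ** 2 for a, b in zip(color_i, color_j))
    return theta[0] * math.exp(-beta * sq_diff) + theta[1]

image = [[(0, 0, 0), (10, 0, 0)]]
beta = contrast_beta(image)  # 1 / (2 * 100) = 0.005
```

The penalty is largest where neighboring colors match and smallest across strong image edges, so label boundaries are encouraged to follow intensity boundaries.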

Texture-Shape
– 17 filters (oriented Gaussian/Laplacian + dots)
– Cluster responses to form textons
– Count textons within white box (relative to position i)
– Feature = texton + rectangle
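The "texton + rectangle" feature can be computed in constant time per pixel with an integral image, the same trick credited to Viola-Jones earlier in the deck. The sketch below assumes the texton map is a 2D list of texton indices and that a rectangle is given as offsets relative to pixel i; the function names and the clipping behavior at image borders are illustrative choices.

```python
def integral_image(texton_map, texton):
    """ii[y][x] = number of pixels equal to `texton` in rows < y, cols < x."""
    height, width = len(texton_map), len(texton_map[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += 1 if texton_map[y][x] == texton else 0
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_count(ii, top, left, bottom, right):
    """Count inside the half-open box [top, bottom) x [left, right), clipped."""
    height, width = len(ii) - 1, len(ii[0]) - 1
    top, bottom = max(0, min(top, height)), max(0, min(bottom, height))
    left, right = max(0, min(left, width)), max(0, min(right, width))
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def texton_rect_feature(ii, y, x, rect):
    """Feature value at pixel (y, x); rect = (dy0, dx0, dy1, dx1) offsets."""
    dy0, dx0, dy1, dx1 = rect
    return rect_count(ii, y + dy0, x + dx0, y + dy1, x + dx1)

texton_map = [[0, 1, 1],
              [0, 1, 0],
              [2, 1, 1]]
ii = integral_image(texton_map, texton=1)
value = texton_rect_feature(ii, 0, 0, (0, 1, 2, 3))  # rows 0-1, cols 1-2
```

One integral image is built per texton, after which every (texton, rectangle) pair costs four lookups at any pixel.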

Boosting Textons
Use “Joint Boosting” [Torralba Murphy Freeman CVPR 2004]
– Different classes share features
– Weak learners: decision stumps on texton count within rectangle
To speed training:
– Randomly select 0.3% of possible features from large set
– Downsample texton maps for training images
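One boosting round with random feature subsampling can be sketched as below. This is a deliberately simplified, single-class version: it fits a weighted regression stump of the form a * [v > theta] + b, and it omits the class-sharing search that defines Joint Boosting. The function names, the candidate thresholds, and the data layout are all hypothetical.

```python
import random

def fit_stump(values, targets, weights, thresholds):
    """Best weighted least-squares stump: returns (error, theta, a, b) or None."""
    best = None
    for theta in thresholds:
        w_hi = sum(w for v, w in zip(values, weights) if v > theta)
        w_lo = sum(w for v, w in zip(values, weights) if v <= theta)
        if w_hi == 0 or w_lo == 0:
            continue  # threshold does not split the data
        mean_hi = sum(w * z for v, z, w in zip(values, targets, weights) if v > theta) / w_hi
        mean_lo = sum(w * z for v, z, w in zip(values, targets, weights) if v <= theta) / w_lo
        error = sum(w * (z - (mean_hi if v > theta else mean_lo)) ** 2
                    for v, z, w in zip(values, targets, weights))
        if best is None or error < best[0]:
            best = (error, theta, mean_hi - mean_lo, mean_lo)
    return best

def pick_weak_learner(features, targets, weights, thresholds, fraction, rng):
    """Try only a random fraction of the features; return the best stump found."""
    names = sorted(features)
    sample_size = max(1, round(fraction * len(names)))
    best = None
    for name in rng.sample(names, sample_size):
        stump = fit_stump(features[name], targets, weights, thresholds)
        if stump is not None and (best is None or stump[0] < best[0]):
            best = (stump[0], name, stump[1], stump[2], stump[3])
    return best  # (error, feature name, theta, a, b), or None

features = {"f0": [0, 0, 0, 0], "f1": [1, 2, 8, 9]}  # toy texton counts
best = pick_weak_learner(features, [-1, -1, 1, 1], [0.25] * 4,
                         thresholds=[0, 5], fraction=1.0, rng=random.Random(0))
```

Setting `fraction` to a small value such as 0.003 mirrors the 0.3% subsampling on the slide: each round examines few features, but over many rounds most useful features are eventually tried.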

“Shape Context” Toy example

Random Feature Selection Toy example (training on ten images)

Results on Boosted Textons
Boosted shape-textons in isolation
– Training time: 42 hrs for 5000 rounds on 21-class training set of 276 images

Parameters Learned from Validation
– Number of AdaBoost rounds (when to stop)
– Number of textons
– Edge potential parameters
– Location potential exponent

Qualitative (Good) Results

Qualitative (Bad) Results
But notice good segmentation, even with bad labeling

Quantitative Results

Effect of Different Model Potentials
[Figure panels: Boosted textons only; No color modeling; Full CRF model]

Corel/Sowerby

The End.