Image Parsing: Unifying Segmentation and Detection
Z. Tu, X. Chen, A. L. Yuille and S.-C. Zhu
ICCV 2003 (Marr Prize) & IJCV 2005
Presented by Sanketh Shetty
Outline
Why Image Parsing?
Introduction to Concepts in DDMCMC
DDMCMC Applied to Image Parsing
Combining Discriminative and Generative Models for Parsing
Results
Comments
Image Parsing
Given an image I, infer the parse structure W by optimizing the posterior p(W|I).
Properties of Parse Structure
Dynamic and reconfigurable –variable number of nodes and node types
Inference defined by a Markov chain –Data-Driven Markov Chain Monte Carlo (builds on earlier work in segmentation, grouping and recognition)
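The dynamic, reconfigurable parse graph can be sketched as a small data structure; the node kinds and attribute names here are illustrative, not the paper's exact representation:

```python
from dataclasses import dataclass, field

@dataclass
class ParseNode:
    """One node of the parse graph W. Nodes carry a pattern type plus
    type-specific attributes (hypothetical field names)."""
    kind: str                                    # e.g. "scene", "face", "text", "region"
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

# A scene parsed into a face and a generic region.
root = ParseNode("scene", children=[
    ParseNode("face", {"pose": "frontal"}),
    ParseNode("region", {"model": "constant-intensity"}),
])

# Reconfiguration: a "birth" move adds a text node; "death" would remove one.
root.children.append(ParseNode("text", {"chars": "STOP"}))
```

The point is that the number and types of nodes are not fixed in advance; the Markov chain edits this structure as it runs.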
Key Concepts
Joint model for segmentation & recognition –combines different modules to obtain cues
Fully generative explanation of image formation
Uses generative and discriminative models + the DDMCMC framework –concurrent top-down & bottom-up parsing
Pattern Classes
Text (62 characters), faces, and generic regions
MCMC: A Quick Tour
Key concepts:
–Markov chains
–Markov chain Monte Carlo
Metropolis-Hastings [Metropolis 1953; Hastings 1970]
Reversible jump [Green 1995]
–Data-Driven Markov Chain Monte Carlo
Markov Chains Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
Markov Chain Monte Carlo
Metropolis-Hastings Algorithm
Metropolis-Hastings Algorithm: proposal distribution and invariant (target) distribution
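The Metropolis-Hastings step can be sketched in a few lines; the 1-D Gaussian target and the step size below are illustrative choices, not from the talk:

```python
import math
import random

def metropolis_hastings(log_target, x0, n_steps, step=1.0, rng=None):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = rng or random.Random()
    x, lp = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, step)      # symmetric proposal q(x'|x)
        lp_new = log_target(x_new)
        # Accept with probability min(1, p(x')/p(x)); q cancels by symmetry.
        if math.log(rng.random()) < lp_new - lp:
            x, lp = x_new, lp_new
        samples.append(x)
    return samples

# Toy target: standard normal, p(x) ∝ exp(-x²/2).
samples = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000,
                              rng=random.Random(0))
```

The chain's empirical mean and variance converge to those of the target, which is exactly the invariant-distribution property the slide refers to.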
Reversible Jump MCMC
Many competing models to explain the data –need to explore this complicated, variable-dimension state space
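For jumps between models of different dimension, Green's reversible-jump construction accepts a proposed move W → W' with probability of the following standard form (notation adapted to this talk; a Jacobian factor enters when auxiliary variables are used for dimension matching):

```latex
\alpha(W \to W') = \min\!\left(1,\;
  \frac{p(W' \mid I)\, q(W \mid W', I)}{p(W \mid I)\, q(W' \mid W, I)}\right)
```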
DDMCMC Motivation
DDMCMC Motivation
Generative model p(I|W)p(W) defined over the state space
DDMCMC Motivation
Generative model p(I|W)p(W) over the state space, plus a discriminative model q(w_j | I): dramatically reduces the search by focusing sampling on highly probable states.
DDMCMC Framework
Moves:
–Node creation
–Node deletion
–Change node attributes
Transition Kernel
Each sub-kernel satisfies the detailed balance equation; the full transition kernel is a mixture of the sub-kernels for the individual moves.
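The detailed balance condition and the composition of the full kernel can be written as follows (notation assumed; m indexes the move types, with selection probabilities ρ(m)):

```latex
p(W \mid I)\,\mathcal{K}_m(W' \mid W, I)
  = p(W' \mid I)\,\mathcal{K}_m(W \mid W', I),
\qquad
\mathcal{K} = \sum_m \rho(m)\,\mathcal{K}_m
```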
Convergence to p(W|I): monotone, at a geometric rate
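Geometric convergence means the marginal distribution of the chain approaches the posterior in total variation at a geometric rate; a standard statement of this (the constants C and λ and the initial distribution μ₀ are illustrative symbols, not the paper's):

```latex
\bigl\| \mu_0 \mathcal{K}^n - p(\,\cdot \mid I) \bigr\|_{\mathrm{TV}}
  \le C\,\lambda^n, \qquad 0 \le \lambda < 1
```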
Criteria for Designing Transition Kernels
Image Generation Model
Region models: constant intensity, texture, shading
State of the parse graph
Generative models for the pattern classes: text (62 characters), faces, and regions
Prior on W: uniform terms, designed to penalize high model complexity
Shape Prior: faces and regions
Shape Prior: Text
Intensity Models
Intensity Model: Faces
Discriminative Cues Used
AdaBoost-trained
–Face detector
–Text detector
Adaptive binarization cues
Edge cues –Canny at 3 scales
Shape affinity cues
Region affinity cues
Transition Kernel Design –recall the criteria for designing transition kernels
Possible Transitions
1. Birth/death of a face node
2. Birth/death of a text node
3. Boundary evolution
4. Split/merge region
5. Change node attributes
Face/Text Transitions
Region Transitions
Change Node Attributes
Basic Control Algorithm
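The control loop – repeatedly pick a move type, propose an edit to W, and accept or reject it by a Metropolis ratio – can be sketched on a toy 1-D problem. Everything below, from the data model to the move set and the complexity penalty, is a simplified stand-in for the paper's algorithm:

```python
import math
import random

rng = random.Random(42)

# Toy 1-D "image": pixels drawn from two intensity clusters.
data = [rng.gauss(0.2, 0.05) for _ in range(50)] + \
       [rng.gauss(0.8, 0.05) for _ in range(50)]

def log_posterior(means, penalty=5.0, sigma=0.05):
    """log p(W|I) up to a constant: each pixel is explained by its nearest
    region mean; the prior penalizes the number of region nodes."""
    if not means:
        return -float("inf")
    ll = sum(-0.5 * ((x - min(means, key=lambda m: abs(m - x))) / sigma) ** 2
             for x in data)
    return ll - penalty * len(means)

state = [0.5]                       # initial W: a single region node
score = log_posterior(state)
init_score = score
best_state, best_score = list(state), score

for _ in range(5000):
    move = rng.choice(["birth", "death", "diffuse"])
    proposal = list(state)
    if move == "birth":
        proposal.append(rng.random())                 # create a region node
    elif move == "death" and len(proposal) > 1:
        proposal.pop(rng.randrange(len(proposal)))    # delete a region node
    else:
        i = rng.randrange(len(proposal))
        proposal[i] += rng.gauss(0.0, 0.05)           # change node attributes
    new_score = log_posterior(proposal)
    # Metropolis acceptance (proposal asymmetries ignored in this toy).
    if math.log(rng.random() + 1e-12) < new_score - score:
        state, score = proposal, new_score
        if score > best_score:
            best_state, best_score = list(state), score
```

On this toy data the chain typically settles on two region means near 0.2 and 0.8; in actual DDMCMC the birth proposals are additionally driven by the discriminative probabilities q(w_j|I) rather than uniform sampling.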
Results
Comments
Well motivated but very complicated approach to THE HOLY GRAIL problem in vision
–Good global convergence results for inference, with very minor dependence on the initial W
–Extensible to a larger set of primitives and pattern types
Many details of the algorithm are missing, and it is hard to understand the motivation for the choices of some parameter values
Unclear whether the p(W|I) values for configurations with different class compositions are comparable
Derek's comment on AdaBoost false positives and the authors' failure to report their exact improvement
No quantitative results or comparison to other algorithms and approaches
–It should be possible to design a simple experiment to measure performance on recognition/detection/localization tasks
Thank You