Published by Shawn Casey. Modified over 9 years ago.
1
Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11
2
Goal: Detect all instances of objects (cars, faces, cats)
3
Object model: last class. Statistical Template in Bounding Box – Object is some (x,y,w,h) in image – Features defined w.r.t. bounding box coordinates. Figure: Image, Template Visualization. Images from Felzenszwalb
4
Last class: sliding window detection … …
5
Last class: statistical template. Object model = log-linear model of parts at fixed positions. The part scores are summed and the total is compared with a threshold. Example totals on the slide: -0.5 (Non-object) and 10.5 > 7.5 (Object)
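A minimal sketch of this scoring rule, assuming the window score is simply the sum of the per-part scores. The function name `classify_window` and the example scores are illustrative and not taken from the lecture.

```python
def classify_window(part_scores, threshold):
    """Sum the per-part scores of one window and compare with a threshold.

    Returns (total score, True if the window is classified as object).
    """
    total = sum(part_scores)
    return total, total > threshold


# Hypothetical scores for two windows, using the slide's threshold of 7.5.
print(classify_window([4.0, 4.0, 1.0], threshold=7.5))
print(classify_window([3.0, -2.0, -1.5], threshold=7.5))
```

In a sliding-window detector this test would be repeated for every candidate (x,y,w,h) box.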
6
When are statistical templates useful? Caltech 101 Average Object Images
7
Deformable objects Images from Caltech-256 Slide Credit: Duan Tran
8
Deformable objects Images from D. Ramanan’s dataset Slide Credit: Duan Tran
9
Object models: this class Articulated parts model – Object is configuration of parts – Each part is detectable Images from Felzenszwalb
10
Parts-based Models: Define object by collection of parts modeled by 1. Appearance 2. Spatial configuration. Slide credit: Rob Fergus
11
How to model spatial relations? Star-shaped model Root Part
12
How to model spatial relations? Star-shaped model: Root, Part
13
How to model spatial relations? Tree-shaped model
14
How to model spatial relations? Models in the figure from [Carneiro & Lowe, ECCV’06]: Fergus et al. ’03; Fei-Fei et al. ’03; Leibe et al. ’04, ’08; Crandall et al. ’05; Fergus et al. ’05; Crandall et al. ’05; Felzenszwalb & Huttenlocher ’05; Bouchard & Triggs ’05; Carneiro & Lowe ’06; Csurka ’04; Vasconcelos ’00. Complexities shown in the figure: O(N^6), O(N^2), O(N^3), O(N^2). Many others...
15
Today’s class: 1. Tree-shaped model – Example: Pictorial structures (Felzenszwalb & Huttenlocher 2005) 2. Optimization with Dynamic Programming 3. Distance Transforms
16
Pictorial Structures Model: Part = oriented rectangle; Spatial model = relative size/orientation. Felzenszwalb and Huttenlocher 2005
17
Part representation Background subtraction
18
Pictorial Structures Model Appearance likelihood Geometry likelihood
19
Modeling the Appearance: Any appearance model could be used – HOG Templates, etc. – Here: rectangles fit to background-subtracted binary map. Train a detector for each part independently. Terms: Appearance likelihood, Geometry likelihood
20
Pictorial Structures Model: Minimize Energy (-log of the likelihood) = Appearance Cost + Geometry Cost
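The energy can be written out as follows. This is a hedged reconstruction in the style of Felzenszwalb and Huttenlocher (2005), not a copy of the slide's equation: $l_i$ is the location of part $i$, $m_i$ is its appearance cost, $d_{ij}$ is the geometry (deformation) cost on edge $(i,j)$ of the tree $E$, and $K$ is the number of parts. All symbol names are assumptions.

```latex
L^{*} = \arg\min_{L = (l_1, \dots, l_K)}
        \left( \sum_{i=1}^{K} m_i(l_i)
             + \sum_{(i,j) \in E} d_{ij}(l_i, l_j) \right)
```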
21
Optimization – Brute Force H T F Appearance Cost, Deformation Cost
22
Optimization H T F
23
Optimization - Dynamic Programming H T F
24
H T F
25
H T F
26
H T F
27
Optimization - Complexity H T F: Each part can take N locations. Cost of the minimization for all l_T? Overall (for K parts)?
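A minimal sketch of the dynamic program on a tree of parts, assuming every part shares the same N candidate locations, indexed 0..N-1. The names `best_configuration`, `appearance`, `children` and `deform` are illustrative. Computed naively as below, each edge costs O(N^2), since every parent location scans every child location, so K parts cost O(K N^2) overall.

```python
def best_configuration(appearance, children, deform, root):
    """Min-sum dynamic programming over a tree of parts.

    appearance[part][l]   : appearance cost of `part` at location index l
    children[part]        : list of child parts (missing key = leaf)
    deform(child, lc, lp) : deformation cost of child at lc given parent at lp
    Returns (minimum energy, {part: location index}).
    """
    n = len(appearance[root])
    best_child_loc = {}  # best_child_loc[child][lp] = best child location

    def subtree_cost(part):
        # cost[l] = appearance of `part` at l + best cost of everything below it
        cost = list(appearance[part])
        for c in children.get(part, []):
            child_cost = subtree_cost(c)
            table = []
            for lp in range(n):
                # O(N) scan for each parent location: O(N^2) per edge
                lc = min(range(n),
                         key=lambda q: child_cost[q] + deform(c, q, lp))
                table.append(lc)
                cost[lp] += child_cost[lc] + deform(c, lc, lp)
            best_child_loc[c] = table
        return cost

    root_cost = subtree_cost(root)
    l_root = min(range(n), key=lambda l: root_cost[l])

    # Backtrack from the root to read off each part's best location.
    placement = {root: l_root}
    stack = [root]
    while stack:
        p = stack.pop()
        for c in children.get(p, []):
            placement[c] = best_child_loc[c][placement[p]]
            stack.append(c)
    return root_cost[l_root], placement


# Toy example: T is the root, H and F are its children.
appearance = {"T": [5, 0, 5, 5], "H": [0, 5, 5, 5], "F": [5, 5, 5, 0]}
children = {"T": ["H", "F"]}
energy, placement = best_configuration(
    appearance, children, lambda c, lc, lp: (lc - lp) ** 2, root="T")
print(energy, placement)
```

The inner scan over child locations is the step that the distance transforms later in the lecture are meant to speed up.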
28
Distance Transform: For each pixel p, how far away is the nearest pixel q of set G? – $f(p) = \min_{q \in G} d(p, q)$ – G is often the set of edge pixels. Figure: G: black pixels; example pairs (p1, q1), (p2, q2)
29
Computing the Distance Transform 1-Dimension
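The figure for this slide is not part of the transcript. As a hedged illustration, the sketch below uses the standard two-pass method for the 1-D distance transform under absolute distance, which may differ from what the slide showed. The function name `distance_transform_1d` is an assumption.

```python
import math


def distance_transform_1d(in_set):
    """Distance from each index to the nearest index where in_set is True.

    Uses absolute (L1) distance on a 1-D grid and runs in O(N).
    """
    n = len(in_set)
    dist = [0 if s else math.inf for s in in_set]
    # Forward pass: propagate distances from the nearest set element on the left.
    for i in range(1, n):
        dist[i] = min(dist[i], dist[i - 1] + 1)
    # Backward pass: do the same from the right and keep the smaller value.
    for i in range(n - 2, -1, -1):
        dist[i] = min(dist[i], dist[i + 1] + 1)
    return dist


print(distance_transform_1d([False, False, True, False, False, False, True]))
# prints [2, 1, 0, 1, 2, 1, 0]
```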
30
Computing the Distance Transform: Extending to N-Dimensions. Savings can be extended from 1-D to N-D cases if the deformation cost is separable.
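As a worked example of separability, assume a 2-D squared Euclidean deformation cost and an arbitrary per-location cost $m$ (both are assumptions for illustration). The 2-D minimization then splits into a 1-D transform along each column followed by a 1-D transform along each row:

```latex
f(x, y) = \min_{x', y'} \left( m(x', y') + (x - x')^2 + (y - y')^2 \right)
        = \min_{x'} \left( (x - x')^2
          + \min_{y'} \left( m(x', y') + (y - y')^2 \right) \right)
```

The inner minimum depends only on $x'$ and $y$, so it can be computed once per column and reused for every $x$.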
31
Distance Transform - Applications: Set distances – e.g. Hausdorff Distance; Image processing – e.g. Blurring; Robotics – Motion Planning; Alignment – Edge images – Motion tracks – Audio warping; Deformable Part Models (of course!)
32
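Generalized Distance Transform. Original form: $f(p) = \min_{q \in G} d(p, q)$. General form: $f(p) = \min_{q} \left( m(q) + d(p, q) \right)$ – m(q): arbitrary function sampled at discrete points – For each p, find a nearby q with small m(q) – For original DT: $m(q) = 0$ if $q \in G$, $\infty$ otherwise – For part models: m(q) is the cost of placing the part at q, and d(p, q) is the deformation cost. For some deformation costs, f(p) can be computed efficiently for all p.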
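A brute-force reference for the general form, useful mainly for checking faster implementations. The function name `generalized_dt_bruteforce` is illustrative; it evaluates the minimum directly in O(N^2) on a 1-D grid.

```python
import math


def generalized_dt_bruteforce(m, d):
    """f(p) = min over q of m[q] + d(p, q), evaluated directly in O(N^2)."""
    n = len(m)
    return [min(m[q] + d(p, q) for q in range(n)) for p in range(n)]


# General form with a squared-distance deformation cost.
print(generalized_dt_bruteforce([3, 0, 5, 5], lambda p, q: (p - q) ** 2))
# prints [1, 0, 1, 4]

# The original DT as a special case: m is 0 on the set G and infinite elsewhere.
print(generalized_dt_bruteforce([math.inf, 0, math.inf, math.inf],
                                lambda p, q: abs(p - q)))
# prints [1, 0, 1, 2]
```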
33
How do we do it? Key idea: Construct lower “envelope” of data Given envelope, compute f(p) in O(N) time Goal: Compute envelope in O(N) time
34
Computing the Lower Envelope. Key idea: Keep track of intersection points – f_i(x), f_j(x) only intersect once (let i<j). Start(1) = -Inf, End(1) = Inf
35
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = Inf
36
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = 4, End(3) = Inf
37
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = 4, End(3) = Inf
38
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = [], End(3) = []; Start(4) = 4, End(4) = Inf
39
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = [], End(3) = []; Start(4) = 4, End(4) = Inf
40
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 3; Start(2) = 3, End(2) = 4; Start(3) = [], End(3) = []; Start(4) = 4, End(4) = Inf
41
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 3; Start(2) = [], End(2) = []; Start(3) = [], End(3) = []; Start(4) = 3, End(4) = Inf
42
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 2.6; Start(2) = [], End(2) = []; Start(3) = [], End(3) = []; Start(4) = 2.6, End(4) = Inf
43
Computing the Lower Envelope: Start(1) = -Inf, End(1) = 2.6; Start(2) = [], End(2) = []; Start(3) = [], End(3) = []; Start(4) = 2.6, End(4) = 3.5; Start(5) = 3.5, End(5) = Inf
44
Is This O(N)? Yes! N parabolas are added – For each addition, may remove up to O(N) parabolas. But each parabola can only be deleted once. Thus, O(N) additions and at most O(N) deletions.
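The lower-envelope procedure can be sketched as follows for the quadratic cost d(p, q) = (p - q)^2 on a 1-D grid. This follows the general scheme of Felzenszwalb and Huttenlocher's distance transform of sampled functions. The function name `dt_squared_1d` and the array names `v` and `z` are assumptions, and the sketch requires every m[q] to be finite.

```python
import math


def dt_squared_1d(m):
    """f(p) = min over q of m[q] + (p - q)**2, via the lower envelope.

    m must contain finite values only. Runs in O(N).
    """
    n = len(m)
    v = [0] * n          # v[k]: index of the k-th parabola in the envelope
    z = [0.0] * (n + 1)  # parabola v[k] is lowest on the interval [z[k], z[k+1]]
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            r = v[k]
            # x-coordinate where the parabolas rooted at q and r intersect
            s = ((m[q] + q * q) - (m[r] + r * r)) / (2 * q - 2 * r)
            if s <= z[k]:
                k -= 1   # parabola r is hidden by q: delete it and retry
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    # Read the envelope off from left to right.
    f = [0] * n
    k = 0
    for p in range(n):
        while z[k + 1] < p:
            k += 1
        r = v[k]
        f[p] = m[r] + (p - r) ** 2
    return f


print(dt_squared_1d([3, 0, 5, 5]))
# prints [1, 0, 1, 4]
```

Each parabola is pushed once and popped at most once, which is the amortized argument the slide gives for the O(N) bound.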
45
What distances can we use? Quadratic (with linear shift); Abs. diff; Min-composition; Bounded (requires extra bookkeeping)
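A hedged sketch of how the last two cases reduce to ordinary transforms. For a min-composition d = min(d1, d2), transform under each distance and take the pointwise minimum. For a bounded distance min(d, tau), one reading of the "extra bookkeeping" is tracking the global minimum of m. The helper `gdt` is a brute-force stand-in for any faster transform, and all names are illustrative.

```python
def gdt(m, d):
    """Brute-force generalized distance transform (reference only, O(N^2))."""
    n = len(m)
    return [min(m[q] + d(p, q) for q in range(n)) for p in range(n)]


def gdt_min_composition(m, d1, d2):
    # d = min(d1, d2): transform under each distance, then take the pointwise min.
    return [min(a, b) for a, b in zip(gdt(m, d1), gdt(m, d2))]


def gdt_bounded(m, d, tau):
    # d_bounded = min(d, tau): no location can cost more than min(m) + tau.
    ceiling = min(m) + tau
    return [min(a, ceiling) for a in gdt(m, d)]


m = [6, 0, 9, 9, 9, 2]
quadratic = lambda p, q: (p - q) ** 2
absolute = lambda p, q: 3 * abs(p - q)
print(gdt_min_composition(m, quadratic, absolute))
print(gdt_bounded(m, quadratic, tau=4))
```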
46
Back to the Pictorial Structures model. Problem: May need to infer more than one configuration. a) May be more than one object: report scores for every root position, apply NMS (non-maximum suppression). b) Optimal solution may be incorrect: Sampling – sample root node, then each node given parent, until all parts are sampled
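A minimal sketch of option (a), assuming each root position has been given a score where higher is better (for example the negated minimum energy) and that suppression is greedy within a fixed radius. The function name `non_max_suppression` and the parameter `min_dist` are illustrative. The lecture does not specify the NMS variant.

```python
def non_max_suppression(detections, min_dist):
    """Greedy NMS over root positions.

    detections : list of (score, x, y), higher score = better
    min_dist   : a detection is dropped if a better kept one is closer than this
    """
    kept = []
    for score, x, y in sorted(detections, reverse=True):
        far_enough = all((x - kx) ** 2 + (y - ky) ** 2 >= min_dist ** 2
                         for _, kx, ky in kept)
        if far_enough:
            kept.append((score, x, y))
    return kept


print(non_max_suppression([(0.9, 10, 10), (0.8, 11, 10), (0.7, 50, 50)],
                          min_dist=5))
# prints [(0.9, 10, 10), (0.7, 50, 50)]
```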
47
Sample poses from likelihood and choose best match with Chamfer distance to map
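A hedged sketch of the matching step, assuming a one-directional Chamfer distance: the mean distance from each pixel of a sampled pose to the nearest foreground pixel of the map. The slide does not say which variant was used, and the names `chamfer_distance` and `best_sample` are illustrative.

```python
import math


def chamfer_distance(pose_pixels, map_pixels):
    """Mean distance from each pose pixel to its nearest map pixel."""
    return sum(
        min(math.hypot(px - qx, py - qy) for qx, qy in map_pixels)
        for px, py in pose_pixels
    ) / len(pose_pixels)


def best_sample(samples, map_pixels):
    """Pick the sampled pose (a list of pixels) closest to the map."""
    return min(samples, key=lambda s: chamfer_distance(s, map_pixels))


foreground = [(0, 0), (1, 0), (2, 0)]
samples = [[(0, 1), (1, 1)], [(0, 5), (1, 5)]]
print(best_sample(samples, foreground))
# prints [(0, 1), (1, 1)]
```

The nearest-pixel search here is brute force. With a precomputed distance transform of the map, each inner minimum becomes a single lookup.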
48
Results for person matching
49
Results for person matching - Mistakes Ambiguous Background subtracted image
50
Recently enhanced pictorial structures BMVC 2009
51
Deformable Latent Parts Model: Useful parts discovered during training. Figure: Detections, Template Visualization. Felzenszwalb et al. 2008
52
Things to Remember: Rather than searching for whole object, can locate parts that compose the object – Better encoding of spatial variation. Models can be broken down into part appearance and spatial configuration – Wide variety of models. Efficient optimization is often tricky, but many tricks available – Dynamic programming, Distance transforms