
1 Cue Integration in Figure/Ground Labeling. Xiaofeng Ren, Charless Fowlkes and Jitendra Malik. Computer Vision Group, University of California, Berkeley.

2 Abstract. We present a model of edge and region grouping using a conditional random field built over a scale-invariant representation of images to integrate multiple cues. Our model includes potentials that capture low-level similarity, mid-level curvilinear continuity and high-level object shape. Maximum likelihood parameters for the model are learned from human-labeled groundtruth on a large collection of horse images using belief propagation. Using held-out test data, we quantify the information gained by incorporating generic mid-level cues and high-level shape.

3 Introduction. Conditional Random Fields (CRFs) on triangulated images, trained to integrate low-, mid- and high-level grouping cues.

4 Inference on the CDT Graph. Contour variables {Xe}, region variables {Yt}, object variable {Z}. Integrating {Xe}, {Yt} and {Z} combines low-, mid- and high-level cues.
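As an illustration of the discrete structure above, a constrained Delaunay triangulation can be built over traced contour points with the third-party `triangle` package (a Python wrapper of Shewchuk's Triangle). This is a minimal sketch under that assumption, not the authors' code; `vertices` and `segments` are placeholders for points and contour segments traced from a Pb map.

    import numpy as np
    import triangle  # third-party wrapper of Shewchuk's Triangle (assumed available)

    def build_cdt(vertices, segments):
        # vertices: (N, 2) contour point coordinates
        # segments: (M, 2) index pairs; each pair is a contour edge that the
        # triangulation must preserve (the "constrained" part of the CDT)
        pslg = {"vertices": np.asarray(vertices, dtype=float),
                "segments": np.asarray(segments, dtype=int)}
        cdt = triangle.triangulate(pslg, "p")  # 'p': triangulate the planar straight-line graph
        return cdt["vertices"], cdt["triangles"]  # triangles index into the vertex array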

5 Grouping Cues
Low-level cues: L1(Xe|I), edge energy along edge e; L2(Ys,Yt|I), brightness/texture similarity between two regions s and t.
Mid-level cues: M1(XV|I), edge collinearity and junction frequency at vertex V; M2(Xe,Ys,Yt), consistency between edge e and the two adjoining regions s and t.
High-level cues: H1(Yt|I), texture similarity of region t to exemplars; H2(Yt,Z|I), compatibility of region shape with pose; H3(Xe,Z|I), compatibility of local edge shape with pose.
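To make the first low-level cue concrete, the edge energy along a CDT edge can be approximated by averaging a boundary-probability (Pb) map over pixels sampled along the edge. This is only a sketch, assuming a precomputed Pb map as a 2-D array; the function name and sampling scheme are illustrative, not from the paper.

    import numpy as np

    def edge_energy(pb, p0, p1, n_samples=50):
        # pb: 2-D array of boundary probabilities in [0, 1]
        # p0, p1: (x, y) endpoints of a CDT edge in image coordinates
        xs = np.linspace(p0[0], p1[0], n_samples)
        ys = np.linspace(p0[1], p1[1], n_samples)
        cols = np.clip(np.rint(xs).astype(int), 0, pb.shape[1] - 1)
        rows = np.clip(np.rint(ys).astype(int), 0, pb.shape[0] - 1)
        return float(pb[rows, cols].mean())  # average Pb along the edge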

6 Conditional Random Fields for Cue Integration. Estimate the marginal posteriors of X, Y and Z.
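The CRF equation on this slide was an image and did not survive extraction. As a hedged reconstruction consistent with the potentials listed on slide 5, the posterior has the standard log-linear form (the weight symbols are generic placeholders, not the paper's notation):

    P(X, Y, Z \mid I, \Theta) = \frac{1}{\mathcal{Z}(I, \Theta)} \exp\Big(
        \sum_{e} \alpha\, L_1(X_e \mid I)
      + \sum_{\langle s,t \rangle} \beta\, L_2(Y_s, Y_t \mid I)
      + \sum_{V} \gamma\, M_1(X_V \mid I)
      + \sum_{e} \delta\, M_2(X_e, Y_s, Y_t)
      + \sum_{t} \eta\, H_1(Y_t \mid I)
      + \sum_{t} \kappa\, H_2(Y_t, Z \mid I)
      + \sum_{e} \lambda\, H_3(X_e, Z \mid I) \Big)

Marginal posteriors of X, Y and Z are then read off from approximate inference in this model.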

7 Encoding Object Knowledge. Region-based: a support mask couples the region variables Yt to the object variable Z. Edge-based: shapemes couple the contour variables Xe to Z.
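Shapemes are prototypical local shapes; a common way to obtain them is to vector-quantize local shape descriptors (e.g. shape contexts) sampled along training contours. The sketch below assumes scikit-learn's KMeans and a hypothetical (N, D) array of descriptors; it is an illustration, not the paper's implementation.

    import numpy as np
    from sklearn.cluster import KMeans

    def learn_shapemes(descriptors, n_shapemes=64, seed=0):
        # descriptors: (N, D) array, one local shape descriptor per sampled contour point
        km = KMeans(n_clusters=n_shapemes, random_state=seed, n_init=10)
        km.fit(descriptors)
        return km  # km.cluster_centers_ are the shapemes

    # Assigning a shapeme index to a new descriptor `desc` of shape (D,):
    #   shapeme_id = km.predict(desc.reshape(1, -1))[0]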

8 H3(Xe,Z|I): local shape and pose. Each shapeme i (for example a horizontal line, or a pair of vertical edges) has an associated spatial distribution ON(x,y,i) recording where it occurs on the figure relative to the object center. Let S(x,y) be the shapeme at image location (x,y) and (xo,yo) the object location given by Z. Compute the average log likelihood S_ON(e,Z) along edge e; S_OFF(e,Z) is defined similarly, and H3 is built from them (the slide's equations are sketched below).
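The two equations on this slide were images; a hedged reconstruction that follows the definitions above (the exact normalization and weighting in the paper may differ) is:

    S_{ON}(e, Z) = \frac{1}{|e|} \sum_{(x, y) \in e} \log ON\big(x - x_o,\; y - y_o,\; S(x, y)\big)

    H_3(X_e, Z \mid I) \propto \exp\big( \mathbf{1}[X_e = 1]\, S_{ON}(e, Z) + \mathbf{1}[X_e = 0]\, S_{OFF}(e, Z) \big)

where |e| is the number of sampled locations (x, y) on edge e.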

9 Training and Testing
Trained on half (172) of the grayscale horse images from the [Borenstein & Ullman 02] horse dataset.
Human-marked segmentations are used to construct groundtruth labels on both CDT edges and triangles.
Loopy belief propagation is used for approximate inference; it takes < 1 second to converge for a typical image.
Parameters are estimated by gradient descent for maximum likelihood; this converges in 1000 iterations.
Tested on the other half of the horse images in grayscale.
Quantitative evaluation against groundtruth: precision-recall curves for both contours and regions.
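For reference, gradient descent for maximum likelihood in a log-linear CRF uses the standard gradient: the derivative of the conditional log-likelihood with respect to each weight is the potential's value on the groundtruth labeling minus its expectation under the model, with the expectations approximated here by loopy belief propagation. In generic notation (not the paper's exact symbols):

    \frac{\partial}{\partial \theta_k} \log P(X^{*}, Y^{*}, Z^{*} \mid I, \Theta)
      = f_k(X^{*}, Y^{*}, Z^{*}, I) - \mathbb{E}_{P(X, Y, Z \mid I, \Theta)}\big[ f_k(X, Y, Z, I) \big]

where f_k is the summed potential tied to weight \theta_k and the starred variables denote the groundtruth labeling.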

12 Results. Each example shows four panels: Input, Input Pb, Output Contour, Output Figure.

13 Results (continued): Input, Input Pb, Output Contour, Output Figure.

14 Results (continued): Input, Input Pb, Output Contour, Output Figure.

15 Conclusion
Constrained Delaunay Triangulation provides a scale-invariant discrete structure which enables efficient probabilistic inference.
Conditional random fields combine joint contour and region grouping and can be efficiently trained.
Mid-level cues are useful for figure/ground labeling, even when powerful object-specific cues are present.

16 Thank You
