Computer Vision Group University of California Berkeley 1 Cue Integration in Figure/Ground Labeling Xiaofeng Ren, Charless Fowlkes and Jitendra Malik.

Presentation transcript:

Computer Vision Group University of California Berkeley 1 Cue Integration in Figure/Ground Labeling Xiaofeng Ren, Charless Fowlkes and Jitendra Malik

2 Abstract
We present a model of edge and region grouping using a conditional random field built over a scale-invariant representation of images to integrate multiple cues. Our model includes potentials that capture low-level similarity, mid-level curvilinear continuity and high-level object shape. Maximum likelihood parameters for the model are learned from human-labeled groundtruth on a large collection of horse images using belief propagation. Using held-out test data, we quantify the information gained by incorporating generic mid-level cues and high-level shape.

3 Introduction
Conditional random fields (CRFs) on triangulated images, trained to integrate low-, mid- and high-level grouping cues.

4 Inference on the CDT Graph
Contour variables {X_e}
Region variables {Y_t}
Object variable {Z}
Integrating {X_e}, {Y_t} and {Z}: low/mid/high-level cues

5 Grouping Cues
Low-level Cues
– L_1(X_e | I): edge energy along edge e
– L_2(Y_s, Y_t | I): brightness/texture similarity between two regions s and t
Mid-level Cues
– M_1(X_V | I): edge collinearity and junction frequency at vertex V
– M_2(X_e, Y_s, Y_t): consistency between edge e and two adjoining regions s and t
High-level Cues
– H_1(Y_t | I): texture similarity of region t to exemplars
– H_2(Y_t, Z | I): compatibility of region shape with pose
– H_3(X_e, Z | I): compatibility of local edge shape with pose

6 Conditional Random Fields for Cue Integration
Estimate the marginal posteriors of X, Y and Z.
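To make "estimate the marginal posteriors" concrete, here is a minimal sketch on a toy model with one edge variable X_e and two region variables Y_s and Y_t. It assumes a log-linear form, with the joint probability proportional to the exponential of a weighted sum of potentials. The weights, the potential values and the function names are all hypothetical, and exact enumeration stands in for the loopy belief propagation reported in the talk, which would be needed on a real CDT graph.

```python
import itertools
import math

# Hypothetical weights for three of the cue potentials (illustration only).
WEIGHTS = {"L1": 1.0, "L2": 0.5, "M2": 2.0}


def score(x_e, y_s, y_t):
    """Weighted sum of toy potentials for one joint assignment."""
    edge_energy = 0.8  # made-up local edge evidence in [0, 1]
    similarity = 0.3   # made-up similarity of regions s and t in [0, 1]
    # L1: rewards X_e = 1 when the edge evidence exceeds 0.5.
    l1 = x_e * (edge_energy - 0.5)
    # L2: similar regions prefer to share a figure/ground label.
    l2 = similarity if y_s == y_t else 0.0
    # M2: the edge should be "on" exactly when the two region labels differ.
    m2 = 1.0 if x_e == int(y_s != y_t) else 0.0
    return WEIGHTS["L1"] * l1 + WEIGHTS["L2"] * l2 + WEIGHTS["M2"] * m2


def marginals():
    """Exact marginals P(variable = 1) by enumerating all 8 joint states."""
    states = list(itertools.product([0, 1], repeat=3))
    unnorm = {s: math.exp(score(*s)) for s in states}
    z = sum(unnorm.values())  # partition function
    names = ("X_e", "Y_s", "Y_t")
    return {
        name: sum(p for s, p in unnorm.items() if s[i] == 1) / z
        for i, name in enumerate(names)
    }


if __name__ == "__main__":
    print(marginals())
```

Enumeration is only feasible for a handful of variables; it is used here so that the normalisation and the marginalisation steps are both visible.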

7 Encoding Object Knowledge
(Region-based) Support Mask: Y_t, Z
(Edge-based) Shapemes: X_e, Z

8 H_3(X_e, Z | I): local shape and pose
Figure panels: shapeme i (horizontal line) with distribution ON(x, y, i); shapeme j (vertical pairs) with distribution ON(x, y, j).
Let S(x, y) be the shapeme at image location (x, y), and let (x_o, y_o) be the object location in Z.
Compute the average log likelihood S_ON(e, Z) as: [equation not preserved in transcript]
Then we have: [equation not preserved in transcript]
S_OFF(e, Z) is defined similarly.
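The slide's formulas did not survive in the transcript, so the following is only a sketch of what an "average log likelihood" of shapemes along an edge could look like, not the authors' definition. It assumes that the ON (or OFF) distribution is indexed by position relative to the object location (x_o, y_o) and by shapeme index, and that unseen entries get a small floor probability. The function name, the dictionary layout and the `floor` parameter are hypothetical.

```python
import math


def avg_log_likelihood(edge_pixels, shapeme_at, dist, obj_xy, floor=1e-6):
    """Mean log-probability of the shapemes observed along one edge.

    edge_pixels: list of (x, y) locations on edge e
    shapeme_at:  dict (x, y) -> shapeme index, playing the role of S(x, y)
    dist:        dict (dx, dy, shapeme) -> probability (ON or OFF),
                 assumed to be indexed relative to the object location
    obj_xy:      (x_o, y_o), the object location in Z
    floor:       assumed smoothing value for missing or zero entries
    """
    x_o, y_o = obj_xy
    total = 0.0
    for x, y in edge_pixels:
        key = (x - x_o, y - y_o, shapeme_at[(x, y)])
        total += math.log(max(dist.get(key, floor), floor))
    return total / len(edge_pixels)


if __name__ == "__main__":
    edge = [(5, 5), (6, 5)]
    shapemes = {(5, 5): 0, (6, 5): 0}
    on = {(0, 0, 0): 0.5, (1, 0, 0): 0.5}
    off = {(0, 0, 0): 0.1, (1, 0, 0): 0.1}
    s_on = avg_log_likelihood(edge, shapemes, on, (5, 5))
    s_off = avg_log_likelihood(edge, shapemes, off, (5, 5))
    print(s_on, s_off)
```

Calling the same function with the OFF table gives the counterpart score, which mirrors the slide's remark that S_OFF(e, Z) is defined similarly.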

9 Training and Testing
Trained on half (172) of the grayscale horse images from the [Borenstein & Ullman 02] Horse Dataset.
Human-marked segmentations are used to construct groundtruth labels on both CDT edges and triangles.
Loopy belief propagation is used for approximate inference; it takes < 1 second to converge for a typical image.
Parameters are estimated by gradient descent for maximum likelihood; this converges in 1000 iterations.
Tested on the other half of the horse images, also in grayscale.
Quantitative evaluation against groundtruth: precision-recall curves for both contours and regions.
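The precision-recall evaluation mentioned above can be sketched as follows, assuming the model outputs one marginal posterior per CDT edge (or triangle) and the groundtruth is a 0/1 label per element. The function name and the convention of reporting precision 1.0 when nothing is predicted positive are assumptions for illustration, not details from the talk.

```python
def precision_recall_curve(posteriors, labels, thresholds):
    """Return a list of (threshold, precision, recall) tuples.

    posteriors: estimated P(label = 1) for each edge or triangle
    labels:     groundtruth 0/1 labels in the same order
    thresholds: posterior cut-offs at which to binarise the predictions
    """
    total_pos = sum(labels)
    curve = []
    for th in thresholds:
        predicted = [p >= th for p in posteriors]
        true_pos = sum(1 for pred, lab in zip(predicted, labels) if pred and lab == 1)
        n_pred = sum(predicted)
        # Assumed convention: precision is 1.0 when no element is predicted positive.
        precision = true_pos / n_pred if n_pred else 1.0
        recall = true_pos / total_pos if total_pos else 0.0
        curve.append((th, precision, recall))
    return curve


if __name__ == "__main__":
    for row in precision_recall_curve([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0], [0.5, 0.2]):
        print(row)
```

Sweeping the threshold from high to low traces the curve from the high-precision end toward the high-recall end; the same routine applies to contour and region labels alike.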


12-14 Results
Figure panels for each example: Input, Input Pb, Output Contour, Output Figure.

15 Conclusion
Constrained Delaunay Triangulation provides a scale-invariant discrete structure which enables efficient probabilistic inference.
Conditional random fields combine joint contour and region grouping and can be efficiently trained.
Mid-level cues are useful for figure/ground labeling, even when powerful object-specific cues are present.

16 Thank You
