Contours and Junctions in Natural Images

Contours and Junctions in Natural Images Jitendra Malik University of California at Berkeley (with Jianbo Shi, Thomas Leung, Serge Belongie, Charless Fowlkes, David Martin, Xiaofeng Ren, Michael Maire, Pablo Arbelaez)

From Pixels to Perception (figure labels: Tiger, Grass, Water, Sand; outdoor, wildlife; tail, eye, legs, head, back, shadow, mouth)

I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have "327"? No. I have sky, house, and trees. ---- Max Wertheimer, 1923 I couldn’t resist the temptation to put up this famous quote. Max Wertheimer, the father of Gestalt psychology, said this in 1923: …. Wertheimer lived in an age when there were no computers, no digital cameras, and no pixels. However, he somehow knew about pixels, and he already warned us about the risk of using pixels to model perception, and the risk of ignoring structure in an image.

Perceptual Organization Grouping Figure/Ground

Key Research Questions in Perceptual Organization. Predictive power: Factors for complex, natural stimuli? How do they interact? Functional significance: Why should these be useful or confer some evolutionary advantage to a visual organism? Brain mechanisms: How are these factors implemented given what we know about V1 and higher visual areas?

Attneave’s Cat (1954) Line drawings convey most of the information

Contours and junctions are fundamental… Key to recognition, inference of 3D scene properties, visually-guided manipulation and locomotion… This goes beyond local, V1-like edge detection. Contours are the result of perceptual organization, grouping and figure/ground processing.

Some computer vision history… Local edge detection was much studied in the 1970s and early 80s (Sobel, Rosenfeld, Binford-Horn, Marr-Hildreth, Canny …). Edge linking exploiting curvilinear continuity was studied as well (Rosenfeld, Zucker, Horn, Ullman …). In the 1980s, several authors argued for perceptual organization as a precursor to recognition (Binford, Witkin and Tenenbaum, Lowe, Jacobs …).

However, in the 90s … We realized that there was more to images than edges: biologically inspired filtering approaches (Bergen & Adelson, Malik & Perona …) and pixel-based representations for recognition (Turk & Pentland, Murase & Nayar, LeCun …). We lost faith in the ability of bottom-up vision: do minimal bottom-up processing, e.g. tiled orientation histograms, and don’t even assume that linked contours or junctions can be extracted. Matching with memory of previously seen objects then becomes the primary engine for parsing an image.

At Berkeley, we took a contrary view… Collect a data set of human-segmented images. Learn a local boundary model for combining brightness, color and texture. Build a global framework to capture closure and continuity. Detect and localize junctions. Integrate low-, mid- and high-level information for grouping and figure-ground segmentation.

Berkeley Segmentation DataSet [BSDS] D. Martin, C. Fowlkes, D. Tal, J. Malik. "A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics", ICCV, 2001

Contour detection ~1970

Contour detection ~1990

Contour detection ~2004

Contour detection ~2008 (gray)

Contour detection ~2008 (color)

Outline: Collect a data set of human-segmented images. Learn a local boundary model for combining brightness, color and texture. Build a global framework to capture closure and continuity. Detect and localize junctions. Integrate low-, mid- and high-level information for grouping and figure-ground segmentation.

Contours can be defined by any of a number of cues (P. Cavanagh)

Cue-Invariant Representations: gray level photographs, objects from motion, objects from luminance, objects from disparity, line drawings, objects from texture. Grill-Spector et al., Neuron 1998

Pb [Martin, Fowlkes, Malik PAMI 04]. Pipeline: Image, Boundary Cues (Brightness, Color, Texture), Cue Combination, Model. Challenges: texture cue, cue combination. Goal: learn the posterior probability of a boundary Pb(x,y,θ) from local information only.

Individual Features. 1976 CIE L*a*b* colorspace. Brightness Gradient BG(x,y,r,θ): difference of L* distributions. Color Gradient CG(x,y,r,θ): difference of a*b* distributions. Texture Gradient TG(x,y,r,θ): difference of distributions of V1-like filter responses. Each gradient compares the two halves of a disc of radius r centered at (x,y) and split at orientation θ. These are combined using logistic regression.
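A minimal sketch of the two steps named on this slide: a half-disc histogram difference and a logistic combination. It assumes the last argument of BG(x,y,r,·) is the orientation θ of the dividing diameter, uses a chi-squared histogram distance as one common choice for "difference of distributions", and uses made-up weights rather than learned ones. The function names `half_disc_gradient` and `pb_logistic` are hypothetical.

```python
import math

def half_disc_gradient(img, x, y, r, theta, bins=8):
    """Chi-squared distance between the value histograms of the two
    half-discs of radius r at (x, y), split by a diameter at angle theta.
    Pixel values are assumed to lie in [0, 1]."""
    g = [0.0] * bins
    h = [0.0] * bins
    nx, ny = -math.sin(theta), math.cos(theta)  # normal to the diameter
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dx * dx + dy * dy > r * r:
                continue  # outside the disc
            yy, xx = y + dy, x + dx
            if not (0 <= yy < len(img) and 0 <= xx < len(img[0])):
                continue  # outside the image
            side = dx * nx + dy * ny
            if abs(side) < 1e-9:
                continue  # pixels on the diameter belong to neither half
            b = min(int(img[yy][xx] * bins), bins - 1)
            (g if side > 0 else h)[b] += 1.0
    sg, sh = sum(g), sum(h)
    if sg == 0 or sh == 0:
        return 0.0
    g = [v / sg for v in g]
    h = [v / sh for v in h]
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(g, h) if a + b > 0)

def pb_logistic(features, weights, bias):
    """Logistic combination of cue values into a boundary probability."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Toy image: dark left part, bright right part, vertical edge near column 3.
step = [[0.0 if col < 3 else 1.0 for col in range(7)] for _ in range(7)]
g_vertical = half_disc_gradient(step, 3, 3, 2, math.pi / 2)  # diameter along the edge
g_horizontal = half_disc_gradient(step, 3, 3, 2, 0.0)        # diameter across the edge

# Illustrative weights only; the learned values are not given on the slide.
pb_on_edge = pb_logistic([g_vertical], weights=[4.0], bias=-2.0)
pb_off_edge = pb_logistic([g_horizontal], weights=[4.0], bias=-2.0)
```

In a fuller version the feature list would hold the brightness, color and texture gradients together, each computed by the same half-disc comparison on a different channel.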

Various Cue Combinations

Exploiting global constraints: Image Segmentation as Graph Partitioning. Build a weighted graph G=(V,E) from the image. V: image pixels. E: connections between pairs of nearby pixels. Partition the graph so that similarity within a group is large and similarity between groups is small: Normalized Cuts [Shi & Malik 97]
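To make the partitioning criterion concrete, here is a small sketch that scores a two-way partition with the normalized cut value of Shi & Malik, Ncut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V). The toy graph and the helper name `ncut_value` are assumptions for illustration, not part of the talk.

```python
def ncut_value(W, group_a):
    """Normalized cut of the partition (A, V \\ A) for a symmetric
    weight matrix W given as a list of lists."""
    n = len(W)
    a = set(group_a)
    b = set(range(n)) - a
    cut = sum(W[i][j] for i in a for j in b)            # weight crossing the cut
    assoc_a = sum(W[i][j] for i in a for j in range(n)) # total weight touching A
    assoc_b = sum(W[i][j] for i in b for j in range(n)) # total weight touching B
    return cut / assoc_a + cut / assoc_b

# Toy graph: two tightly connected triangles joined by one weak edge (2, 3).
n = 6
W = [[0.0] * n for _ in range(n)]
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
                (2, 3, 0.01)]:
    W[i][j] = W[j][i] = w

balanced = ncut_value(W, [0, 1, 2])  # cut only the weak bridge
lopsided = ncut_value(W, [0])        # isolate a single node
```

The balanced split scores far lower than isolating one node, which is the behavior the normalization is meant to produce.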

Wij is small when the intervening contour is strong, large when it is weak. Cij = max Pb(x,y) for (x,y) on line segment ij; Wij = exp(-Cij / σ). So what are our similarity cues going to be? For a pair of points i and j, what can we measure about the image that will tell us whether they belong to the same group? We can start by looking in a small neighborhood around each point and comparing them. Do they have similar color, intensity … texture? Of course this isn’t really enough, since if I choose points from both these elephants, they will have the same local appearance.
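The intervening-contour affinity can be sketched as follows. This assumes the truncated exponent divides by a scale parameter (called `sigma` here, with an arbitrary value of 0.1), represents Pb as a plain 2D list, and samples the segment by rounding linearly interpolated coordinates. The function name is hypothetical.

```python
import math

def intervening_contour_affinity(pb, p, q, sigma=0.1):
    """W_ij = exp(-C_ij / sigma), where C_ij is the maximum Pb value
    found along the straight segment from pixel p to pixel q.
    p and q are (row, col) tuples; sigma is an assumed scale."""
    (r0, c0), (r1, c1) = p, q
    steps = max(abs(r1 - r0), abs(c1 - c0), 1)
    c_max = 0.0
    for k in range(steps + 1):
        t = k / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        c_max = max(c_max, pb[r][c])
    return math.exp(-c_max / sigma)

# Toy Pb map: a strong vertical contour in column 2, nothing elsewhere.
pb = [[0.9 if col == 2 else 0.0 for col in range(5)] for _ in range(3)]

w_same_side = intervening_contour_affinity(pb, (0, 0), (0, 1))  # no contour between
w_across = intervening_contour_affinity(pb, (0, 0), (2, 4))     # crosses the contour
```

Two pixels with identical local appearance still get a low affinity when a strong contour separates them, which addresses the two-elephants problem raised in the notes.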

Normalized Cuts as a Spring-Mass System. Each pixel is a point mass; each connection is a spring. Fundamental modes are generalized eigenvectors of (D - W) x = λ D x
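A dependency-free sketch of how the first nontrivial mode could be computed, reading the slide's equation as the generalized eigenproblem (D - W)x = λDx. It uses power iteration on the lazy random-walk matrix (I + D⁻¹W)/2 with the constant mode projected out, which is a simple stand-in for illustration, not the solver used in the work presented. All names are hypothetical.

```python
def second_generalized_eigenvector(W, iters=500):
    """Approximate the eigenvector of (D - W) x = lam * D x with the
    smallest nonzero lam, for a connected graph with symmetric weights W."""
    n = len(W)
    d = [sum(row) for row in W]          # degrees (diagonal of D)
    total = sum(d)
    x = [float(i) for i in range(n)]     # deterministic, non-constant start
    for _ in range(iters):
        # Remove the trivial constant mode (orthogonal under the D inner product).
        mean = sum(d[i] * x[i] for i in range(n)) / total
        x = [xi - mean for xi in x]
        # One step of the lazy walk (I + D^-1 W) / 2, whose eigenvalues are 1 - lam/2.
        px = [sum(W[i][j] * x[j] for j in range(n)) / d[i] for i in range(n)]
        x = [0.5 * (x[i] + px[i]) for i in range(n)]
        scale = max(abs(v) for v in x)
        x = [v / scale for v in x]
    return x

def rayleigh_quotient(W, x):
    """lam = x^T (D - W) x / x^T D x for the given vector."""
    n = len(W)
    d = [sum(row) for row in W]
    num = sum(d[i] * x[i] * x[i] for i in range(n)) - sum(
        W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    den = sum(d[i] * x[i] * x[i] for i in range(n))
    return num / den

# Toy graph: two triangles joined by one weak edge (2, 3).
n = 6
W = [[0.0] * n for _ in range(n)]
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
                (2, 3, 0.01)]:
    W[i][j] = W[j][i] = w

mode = second_generalized_eigenvector(W)
lam = rayleigh_quotient(W, mode)
```

On this toy graph the sign of the resulting vector separates the two triangles, and the sharp change in value across the weak edge is the kind of contour information the eigenvectors carry.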

Eigenvectors carry contour information

We do not try to find regions from the eigenvectors, so we avoid the “broken sky” artifacts of Ncuts.

The Benefits of Globalization Maire, Arbelaez, Fowlkes, Malik, CVPR 08

Comparison to other approaches

Detecting Junctions

Benchmarking corner detection

Better object recognition using previous version of Pb: Ferrari, Fevrier, Jurie and Schmid (PAMI 08); Shotton, Blake and Cipolla (PAMI 08)

Outline: Collect a data set of human-segmented images. Learn a local boundary model for combining brightness, color and texture. Build a global framework to capture closure and continuity. Detect and localize junctions. Integrate low-, mid- and high-level cues for grouping and figure-ground segmentation [Ren, Fowlkes, Malik, IJCV ‘08; Fowlkes, Martin, Malik, JOV ‘07; Ren, Fowlkes, Malik, ECCV ‘06].

Power laws for contour lengths

Convexity [Metzger 1953, Kanizsa and Gerbino 1976]. ConvG = percentage of straight lines that lie completely within region G. Convexity(p) = log(ConvF / ConvG). (Figure labels: p, G, F.)
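A rough sketch of the convexity cue, under simplifying assumptions: each region is given as a set of pixels inside a local window, "straight lines" are taken to be the segments between all pairs of pixels of a region, and segments are rasterized by rounding interpolated coordinates. The helper names and the toy window are hypothetical.

```python
import math
from itertools import combinations

def segment_pixels(p, q):
    """Pixels visited by the straight segment from p to q (row, col)."""
    (r0, c0), (r1, c1) = p, q
    steps = max(abs(r1 - r0), abs(c1 - c0), 1)
    return [(round(r0 + (r1 - r0) * k / steps),
             round(c0 + (c1 - c0) * k / steps)) for k in range(steps + 1)]

def conv(region):
    """Fraction of pixel pairs whose connecting segment stays inside region."""
    region = set(region)
    pairs = list(combinations(sorted(region), 2))
    inside = sum(all(px in region for px in segment_pixels(p, q))
                 for p, q in pairs)
    return inside / len(pairs)

def convexity(region_f, region_g):
    """Convexity = log(ConvF / ConvG); positive when F is the more convex side."""
    return math.log(conv(region_f) / conv(region_g))

# Toy 7x7 window: a 3x3 square figure surrounded by a ring of ground.
window = [(r, c) for r in range(7) for c in range(7)]
figure = [(r, c) for r, c in window if 2 <= r <= 4 and 2 <= c <= 4]
figure_set = set(figure)
ground = [px for px in window if px not in figure_set]

score = convexity(figure, ground)
```

The square is fully convex while the surrounding ring is not, so the score is positive, matching the tendency for figural regions to be convex.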

Figural regions tend to be convex

Lower Region [Vecera, Vogel & Woodman 2002]. LowerRegion(p) = θG. (Figure labels: θ, p, center of mass.)

Figural regions tend to lie below ground regions

Ren, Fowlkes, Malik ECCV ‘06. Object and Scene Recognition; Grouping / Segmentation; Figure/Ground Organization. Human subjects label groundtruth figure/ground assignments in natural images. Shapemes encode high-level knowledge in a generic way, capturing local figure/ground cues. A conditional random field incorporates junction cues and enforces global consistency.

Forty years of contour detection: Roberts (1965), Sobel (1968), Prewitt (1970), Marr Hildreth (1980), Canny (1986), Perona Malik (1990), Martin Fowlkes Malik (2004), Maire Arbelaez Fowlkes Malik (2008)

Forty years of contour detection: Roberts (1965), Sobel (1968), Prewitt (1970), Marr Hildreth (1980), Canny (1986), Perona Malik (1990), Martin Fowlkes Malik (2004), Maire Arbelaez Fowlkes Malik (2008), ??? (2013)

Curvilinear Grouping. Boundaries are smooth in nature! A number of associated visual phenomena: visual completion, illusory contours, good continuation. Curvilinear grouping is based on a simple but fundamental fact: boundaries of objects in nature are smooth and continuous. There are a few visual phenomena associated with it, such as good continuation, visual completion, and illusory contours. Without going into the details of these phenomena, it suffices to say that this is a well-studied problem in psychophysics, and generally considered to be an important part of perception. But, from a practical point of view, why do we care about this?