1
Heather Dunlop 16-721: Advanced Perception January 25, 2006
Learning to Detect Natural Image Boundaries Using Local Brightness, Color and Texture Cues, by David R. Martin, Charless C. Fowlkes, Jitendra Malik
2
What is a Boundary?
[Figure: Canny output vs. human-marked boundaries (Martin, 2002)]
“A boundary is a contour in the image plane that represents a change in pixel ownership from one object or surface to another”
Edges are not boundaries
3
Dataset
Instructions given to the human subjects: “You will be presented a photographic image. Divide the image into some number of segments, where the segments represent ‘things’ or ‘parts of things’ in the scene. The number of segments is up to you, as it depends on the image. Something between 2 and 30 is likely to be appropriate. It is important that all of the segments have approximately equal importance.”
4
Dataset
Database of over 1000 images, with 5-10 human segmentations of each
Thicker lines indicate boundaries chosen by more subjects (Martin, 2002)
5
[Figure: boundary and non-boundary patches; panels: Intensity, Brightness, Color, Texture (Martin, 2002)]
Brightness on its own is not sufficient
6
Method
Goal: learn the probability of a boundary, Pb(x,y,θ)
[Diagram: Image, Optimized Cues (Brightness, Color, Texture), Cue Combination Model, Boundary Strength, Benchmark, Human Segmentations (Martin, 2002)]
-goal: “use features extracted from such an image patch to estimate the posterior probability of a boundary passing through the center point”
-use cues such as intensity, brightness, color and texture to get a measure of boundary strength
-how to combine cues? It’s a supervised learning problem: learn an optimal local boundary model from labeled images
-approach: look at each pixel for local discontinuities in several feature channels, over a range of orientations and scales
7
Image Features
CIE L*a*b* color space (luminance, red-green, yellow-blue)
Oriented Energy (OE): fe: Gaussian second derivative; fo: its Hilbert transform
Brightness: L* distribution
Color: a* and b* distributions (joint or marginal)
Texture
-“In natural images, brightness edges are more than simple steps. Phenomena such as specularities, mutual illumination, and shading result in composite intensity profiles consisting of steps, peaks and roofs.”
-fe and fo are even- and odd-symmetric filters; OE has maximum response for contours at orientation θ
-minimal accuracy difference between joint and marginal distributions
-joint is much more computationally intensive, so use marginal
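For reference, the oriented energy can be written as the sum of the squared responses of the two filters. This is a sketch of the usual quadrature form as given in the paper (Martin, Fowlkes, Malik, 2004); the symbols I (image), σ (filter scale) and * (convolution) are notation assumed here, since the slide's own formula did not survive.

```latex
OE_{\theta,\sigma} = \left(I * f^{e}_{\theta,\sigma}\right)^{2} + \left(I * f^{o}_{\theta,\sigma}\right)^{2}
```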
8
Texture
Convolve with a filter bank:
-Gaussian second derivative
-its Hilbert transform
-Difference of Gaussians
Filter responses give a measure of texture
Each pixel is associated with a vector of 13 filter responses centered at that pixel
9
Other Filter Banks
Leung-Malik filter set
Schmid filter set
Maximum Response 8 (MR8) filter set: take maximum over orientations
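A minimal sketch of the "maximum over orientations" step. It assumes the bank layout described by Varma and Zisserman (edge and bar filters at 3 scales and 6 orientations, plus a Gaussian and a Laplacian of Gaussian), which is not spelled out on the slide. The function name `mr8_collapse` and all response values are hypothetical.

```python
def mr8_collapse(oriented, isotropic):
    """Collapse oriented responses to one value per (filter type, scale).

    oriented[kind][scale] is a list of responses over orientations;
    isotropic holds the responses of the rotationally symmetric filters.
    """
    collapsed = [max(per_orientation)
                 for kind in oriented
                 for per_orientation in kind]
    return collapsed + list(isotropic)

# Hypothetical responses at one pixel: 3 scales x 6 orientations per filter type.
edge = [[0.1, 0.4, 0.2, 0.0, 0.3, 0.1],
        [0.2, 0.1, 0.5, 0.1, 0.0, 0.2],
        [0.0, 0.1, 0.1, 0.6, 0.2, 0.1]]
bar = [[0.3, 0.2, 0.1, 0.0, 0.1, 0.2],
       [0.1, 0.7, 0.2, 0.1, 0.0, 0.1],
       [0.2, 0.2, 0.8, 0.1, 0.1, 0.0]]
gaussian, laplacian = 0.9, 0.05

features = mr8_collapse([edge, bar], [gaussian, laplacian])
print(len(features), features)
```

The 38 raw responses reduce to 8, which keeps the descriptor small while making it insensitive to the orientation of the texture.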
10
Textons
Convolve image with filter bank
Cluster filter responses to form textons
(Adapted from Martin, 2002 and Varma, Zisserman, 2005)
11
Texton Distribution
Assign each pixel to nearest texton
Form distribution of textons
(Adapted from Martin, 2002 and Varma, Zisserman, 2005)
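The texton pipeline (cluster, assign, histogram) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: it uses plain k-means, an arbitrary evenly spaced initialisation chosen for reproducibility, and made-up 2-D "filter responses" in place of real filter-bank output. All function names are hypothetical.

```python
def nearest(v, centres):
    """Index of the centre closest to v in squared Euclidean distance."""
    return min(range(len(centres)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(v, centres[i])))

def kmeans(vectors, k, iters=20):
    """Plain k-means; the returned cluster centres play the role of textons."""
    # Assumption: evenly spaced samples as initial centres (deterministic).
    centres = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centres)].append(v)
        # New centre = mean of its members; keep the old centre if a group is empty.
        centres = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centres)]
    return centres

def texton_histogram(labels, k):
    """Fraction of pixels assigned to each texton."""
    counts = [0] * k
    for label in labels:
        counts[label] += 1
    return [c / len(labels) for c in counts]

# Made-up per-pixel filter responses forming two obvious groups.
responses = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
             [1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]
textons = kmeans(responses, k=2)
labels = [nearest(v, textons) for v in responses]
hist = texton_histogram(labels, k=2)
print(labels, hist)
```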
12
Gradient-based Features
Brightness (BG), color (CG), texture (TG) gradients
Half-disc regions described by histograms
Compare distributions with χ² statistic
8 orientations, 3 scales
[Figure: disc of radius r centered at (x,y)]
-“At a location (x,y) in the image, draw a circle of radius r, and divide it along the diameter at orientation theta. The gradient function G(x,y,theta,r) compares the contents of the two resulting disc halves. A large difference between the disc halves indicates a discontinuity in the image along the disc’s diameter.”
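A minimal sketch of the half-disc comparison, assuming each pixel already carries a histogram bin index (a texton id, or a quantised brightness or color value). The χ² form used is 0.5 Σ (g_i - h_i)² / (g_i + h_i); function names and the toy label map are hypothetical, and the smoothing the paper applies to histograms is left out.

```python
import math

def normalise(hist):
    total = sum(hist)
    return [v / total for v in hist] if total else hist

def half_disc_histograms(labels, x, y, theta, r, n_bins):
    """Normalised label histograms for the two halves of a disc of radius r
    centred at (x, y), split along the diameter at orientation theta (radians)."""
    halves = ([0.0] * n_bins, [0.0] * n_bins)
    nx, ny = -math.sin(theta), math.cos(theta)  # unit normal to the diameter
    for j, row in enumerate(labels):
        for i, label in enumerate(row):
            dx, dy = i - x, j - y
            if dx * dx + dy * dy > r * r:
                continue  # outside the disc
            side = dx * nx + dy * ny
            if abs(side) < 1e-9:
                continue  # pixels on the diameter belong to neither half
            halves[0 if side > 0 else 1][label] += 1.0
    return tuple(normalise(h) for h in halves)

def chi_squared(g, h):
    """0.5 * sum_i (g_i - h_i)^2 / (g_i + h_i), skipping empty bins."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(g, h) if a + b > 0)

def gradient(labels, x, y, theta, r, n_bins):
    g, h = half_disc_histograms(labels, x, y, theta, r, n_bins)
    return chi_squared(g, h)

# Toy 7x7 label map: bin 0 on the left, bin 1 from column 3 rightwards.
labels = [[0 if i < 3 else 1 for i in range(7)] for _ in range(7)]
g_vertical = gradient(labels, 3, 3, math.pi / 2, 3, 2)  # diameter runs top to bottom
g_horizontal = gradient(labels, 3, 3, 0.0, 3, 2)        # diameter runs left to right
print(g_vertical, g_horizontal)
```

The gradient is large when the diameter lines up with the label change and vanishes when the diameter crosses it, which is why the feature is evaluated over several orientations.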
13
Texture Gradient
Compare the texton distributions in the two half-discs (Martin, 2002)
14
Localization
Tightly localize boundaries
Reduce noise
Coalesce double detections
Improve OE and TG features
Fit a peak (parabola)
[Figure: OE vs. OE localized; TG vs. TG localized (Martin, Fowlkes, Malik, 2004)]
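To illustrate the idea of fitting a parabola to find a peak, here is a simplified 1-D sketch that fits a parabola through three neighbouring samples. It is an assumption-laden stand-in: the paper fits over a window around each pixel, and the function name `parabolic_peak` is hypothetical.

```python
def parabolic_peak(values, i):
    """Sub-sample peak position and height from the parabola through
    samples i-1, i and i+1 of a 1-D feature profile."""
    a, b, c = values[i - 1], values[i], values[i + 1]
    curvature = a - 2 * b + c
    if curvature >= 0:
        return float(i), b  # flat or convex: no peak to refine
    offset = 0.5 * (a - c) / curvature
    height = b - 0.25 * (a - c) * offset
    return i + offset, height

# Samples of a parabola whose true peak is at x = 2.3 with height 5.
profile = [-(x - 2.3) ** 2 + 5 for x in range(5)]
position, height = parabolic_peak(profile, 2)
print(position, height)
```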
15
Optimization
Texture parameters:
-type of filter bank
-scale of filters
-number of textons
-universal or image-specific textons
Other possible distance/histogram comparison metrics: L1 norm, L2 norm, chi-square, quadratic form, Earth Mover’s distance
Number of bins for histograms
Scale parameter for all cues
It was found that a single scale is sufficient for texture
Universal vs. image-specific textons:
-computational cost approximately equal
-accuracy approximately equal
-optimal number of textons for universal textons is roughly double that for image-specific textons
-universal preferable if doing image retrieval and object recognition
-image-specific used for convenience
16
Evaluation Methodology
Posterior probability of boundary: Pb(x,y,θ)
Evaluation measure: precision-recall (PR) curve
F-measure: F = PR / (αR + (1-α)P), with precision P and recall R
-“formulate boundary-detection as a classification problem of discriminating non-boundary from boundary pixels”
-take maximum over orientations
-PR curve: “captures the tradeoff between accuracy and noise as the detector threshold varies”
-“Precision is the fraction of detections that are true positives rather than false positives, while recall is the fraction of true positives that are detected rather than missed. In probabilistic terms, precision is the probability that the detector’s signal is valid, and recall is the probability that the ground truth data was detected.”
-“Each point on the PR curve is computed from the detector’s output at a particular threshold.”
-“First, we correspond the machine boundary map separately with each human map in turn. Only those machine boundary pixels that match no human boundary are counted as false positives. The hit rate is simply averaged over the different human[s], so that to achieve perfect recall the machine boundary map must explain all of the human data.”
-F-measure: combines precision and recall using a relative cost α set by the specific application
-the location of maximum F-measure along the curve provides the optimal detector threshold given α
-Canny’s goals of boundary detection: high detection rate, single detection, good localization
(Martin, 2002)
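A small sketch of how a PR curve and the best threshold could be computed from per-pixel Pb values. It is deliberately simplified: pixels are compared to a single ground-truth map one-to-one, without the correspondence step or the averaging over several human segmentations described in the quotes above. The data and function names are hypothetical.

```python
def precision_recall(pb, truth, threshold):
    """Precision and recall of the detector output thresholded at `threshold`."""
    detected = [p >= threshold for p in pb]
    tp = sum(d and t for d, t in zip(detected, truth))
    fp = sum(d and not t for d, t in zip(detected, truth))
    fn = sum(t and not d for d, t in zip(detected, truth))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

def f_measure(precision, recall, alpha=0.5):
    """F = PR / (alpha * R + (1 - alpha) * P); alpha is the relative cost."""
    denom = alpha * recall + (1 - alpha) * precision
    return precision * recall / denom if denom else 0.0

# Hypothetical Pb values and ground-truth boundary labels for six pixels.
pb = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
truth = [True, True, False, True, False, False]

curve = []
for threshold in (0.2, 0.5, 0.7):
    p, r = precision_recall(pb, truth, threshold)
    curve.append((f_measure(p, r), threshold, p, r))

best_f, best_threshold, _, _ = max(curve)
print(best_threshold, best_f)
```

With alpha = 0.5 the measure reduces to the harmonic mean of precision and recall; other values shift the preferred operating point along the curve.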
17
Cue Combination
Which cues should be used?
OE is redundant when other cues are present
BG+CG+TG produces best results (Martin, 2002)
18
Classifiers
Until now, only logistic regression was used
Other possible classifiers:
-density estimation
-classification trees
-hierarchical mixtures of experts
-support vector machines
Logistic regression: fast convergence and reliable
SVM: prohibitively slow, and didn’t always produce meaningful results
Performance of all classifiers is approximately equal (Martin, 2002)
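As an illustration of cue combination with logistic regression, the sketch below fits weights for three gradient features by batch gradient ascent on the log-likelihood. The feature values, learning rate and epoch count are made up for the example and are not the paper's training setup; `fit_logistic` and `predict` are hypothetical names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    """Estimated boundary probability for feature vector x; w[-1] is the bias."""
    return sigmoid(sum(wk * xk for wk, xk in zip(w, x)) + w[-1])

def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Batch gradient ascent on the logistic log-likelihood."""
    w = [0.0] * (len(features[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(features, labels):
            err = y - predict(w, x)
            for k, xk in enumerate(x):
                grad[k] += err * xk
            grad[-1] += err
        w = [wk + lr * gk / len(features) for wk, gk in zip(w, grad)]
    return w

# Made-up (BG, CG, TG) values: boundary pixels first, non-boundary pixels after.
features = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.9], [0.7, 0.6, 0.8],
            [0.1, 0.2, 0.1], [0.2, 0.1, 0.3], [0.3, 0.2, 0.2]]
labels = [1, 1, 1, 0, 0, 0]

weights = fit_logistic(features, labels)
pb_strong = predict(weights, [0.9, 0.9, 0.9])
pb_weak = predict(weights, [0.1, 0.1, 0.1])
print(pb_strong, pb_weak)
```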
19
Result Comparison
Alternative methods:
-Matlab’s Canny edge detector with and without hysteresis
-Spatially-averaged second moment matrix (2MM)
Where does the human curve come from?
“The points marked by a ‘+’ on the plot show the precision and recall of each ground truth human segmentation when compared to other humans. The solid curve shows the F=0.80 curve, representing the frontier of human performance for this task.” (Martin, 2002)
20
Results
[Figure: Image, Canny, 2MM, BG+CG+TG, Human (Martin, 2002)]
-texture on man’s shirt
-no boundary on shoulder
21
Results
[Figure: Image, Canny, 2MM, BG+CG+TG, Human (Martin, 2002)]
-windows are texture not boundary
-underside of boat and on dock
22
Results
[Figure: Image, Canny, 2MM, BG+CG+TG, Human (Martin, 2002)]
23
Conclusions
Large data set used for testing
Texture gradients are a powerful cue
Simple linear model sufficient for cue combination
Outperforms existing methods
An approach that is useful for higher-level algorithms
Code is available online: