Top-Down & Bottom-Up Segmentation

Presentation transcript:

Top-Down & Bottom-Up Segmentation Reem Amara. Presented to: Prof. Hagit Hel-Or

Content of the slides 1- Present the bottom-up algorithm. 2- Present the top-down algorithm. 3- Present the combined algorithm. Tip: if you don't like horses, this isn't the right place!

Let’s get started >>>

Bottom-up segmentation The goal is to identify an object in an image and separate it from the background. The bottom-up approach is to first segment the image into regions and then identify the image regions that correspond to a single object.

What are we looking for? Pixels that have something in common. Pixels that belong together. Similar pixels.

How to identify the object regions in the image? We rely mainly on continuity principles. This means we group pixels according to grey level, texture uniformity, and the smoothness and continuity of bounding contours.

Similar colors or intensities: assign pixels to color categories. What else can we do?

Similar texture: assign pixels to texture categories.

Difficulties An object may be segmented into multiple regions. A region may merge an object with its background.

The bottom-up segmentation tree Example input image.

Bottom-up segmentation of the input image at multiple scales.

Segmentation tree The bottom-up segments are organized in a tree structure. The colors of the segments match the colors of the nodes.

Fast Multiscale Image Segmentation Cast the segmentation problem as a graph clustering problem. Given an image that contains N = n*n pixels, construct a graph in which each node represents a pixel. Every two nodes representing neighboring pixels are connected by an arc.

Planar graph G(V,E,W) V – the vertices of G, indexed i = 1, ..., N. E – the edges of G. Ii – the intensity value of pixel i. W – wij is the weight (coupling) associated with each edge, reflecting the degree to which pixels i and j tend to belong to the same segment. It is computed from the intensity difference |Ii – Ij| and decreases as the difference grows, for example wij = exp(–α|Ii – Ij|) for some α > 0.

The pixel graph (example) Low contrast – strong coupling. High contrast – weak coupling.
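
The pixel graph described on the last few slides can be sketched in code. This is a minimal sketch, assuming a 4-connected grid and an exponential coupling exp(-alpha * |Ii - Ij|); the function name build_pixel_graph and the parameter alpha are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def build_pixel_graph(image, alpha=10.0):
    """Return {(i, j): w_ij} for 4-connected neighbors of a 2D intensity image.

    Pixels are indexed row-major: i = row * width + col.
    The coupling exp(-alpha * |I_i - I_j|) is an assumed form: it is close
    to 1 for low contrast and close to 0 for high contrast.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    weights = {}
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:  # edge to the right neighbor
                diff = abs(image[r, c] - image[r, c + 1])
                weights[(i, i + 1)] = float(np.exp(-alpha * diff))
            if r + 1 < h:  # edge to the bottom neighbor
                diff = abs(image[r, c] - image[r + 1, c])
                weights[(i, i + w)] = float(np.exp(-alpha * diff))
    return weights
```

On a 2x2 image whose top row is dark and bottom row is bright, the two horizontal edges get a coupling of 1 and the two vertical edges get a much weaker one.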

How to detect a segment? Associate with the graph a state vector u = (u1, u2, ..., uN), where ui is the state variable associated with pixel i.

Energy Function In an ideal segment (with only binary state values), E(S) sums the coupling values along the segment boundary.

Energy function (2) A small segment has a short boundary and therefore tends to have a smaller E(u), so minimizing E(u) alone would favor small segments. To avoid a preference for very small or very large segments, the energy function is divided by the volume of the segment. The normalization involves a predetermined parameter.
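
The formulas on the two energy slides did not survive in the transcript. The block below is a hedged reconstruction that matches the verbal description (couplings summed along the boundary, then divided by the segment volume); the symbols N(u) for the volume and β for the predetermined parameter are assumptions made for illustration.

```latex
E(u) = \sum_{i>j} w_{ij}\,(u_i - u_j)^2,
\qquad
\Gamma(u) = \frac{E(u)}{N(u)^{\beta}}
```

With binary states u_i ∈ {0, 1}, the term (u_i - u_j)^2 is 1 only for edges that cross the segment boundary, so E(u) reduces to the sum of boundary couplings.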

Example W54 = 5, W58 = 8, W52 = 3, W56 = 1. E(5) = 5 + 8 + 3 + 1 = 17. N(5) = 5.
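
The boundary sum in this example can be checked with a few lines of code. This is a small sketch, assuming the segment consists of node 5 alone and that the energy is the sum of w_ij * (u_i - u_j)^2 over edges; the function name boundary_energy is hypothetical.

```python
def boundary_energy(weights, u):
    """Sum of w_ij * (u_i - u_j)**2 over all edges.

    weights: dict {(i, j): w_ij}; u: dict {node: state}.
    With binary states (1 inside the segment, 0 outside) only edges that
    cross the segment boundary contribute.
    """
    return sum(w * (u[i] - u[j]) ** 2 for (i, j), w in weights.items())

# Weights listed in the example: node 5 and its four neighbors.
weights = {(5, 4): 5, (5, 8): 8, (5, 2): 3, (5, 6): 1}
u = {5: 1, 4: 0, 8: 0, 2: 0, 6: 0}  # the segment is {5}
energy = boundary_energy(weights, u)
```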

Salient segment The segments that yield small values for the functional and whose volume is less than half the size of the image are considered salient.

Choosing a Coarse Grid Given a graph G = (V, E, W), choose a set of representative nodes C ⊂ V so that every node in V \ C is strongly connected to C. A coarse-level state variable is associated with each node in C. A node is considered strongly connected to C if the sum of its weights to nodes in C is a significant proportion of its weights to nodes outside C.
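
One way to pick the representative set is a single greedy scan. The sketch below is an assumption for illustration, not the presenter's procedure: a node is treated as strongly connected to C when its weights into C reach a fraction beta of its total weight, and both that rule and the name choose_representatives are hypothetical.

```python
def choose_representatives(nodes, weights, beta=0.2):
    """Greedy sketch: scan nodes in order and add a node to C unless it is
    already strongly connected to C.

    weights: dict {(i, j): w_ij}, treated as undirected.
    Assumed rule: node i is strongly connected to C when
    sum_{j in C} w_ij >= beta * sum_j w_ij.
    """
    # Symmetric adjacency map built from the edge dictionary.
    adj = {n: {} for n in nodes}
    for (i, j), w in weights.items():
        adj[i][j] = w
        adj[j][i] = w

    C = set()
    for n in nodes:
        total = sum(adj[n].values())
        to_C = sum(w for j, w in adj[n].items() if j in C)
        if total == 0 or to_C < beta * total:
            C.add(n)
    return C
```

On a four-node chain with unit weights, the scan keeps every other node, which is in line with choosing roughly half of the nodes as representatives.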

Illustration – Coarse Grid

Interpolation matrix Because the original graph is local and because every node is strongly connected to C, there exists a sparse interpolation matrix P that approximates the fine-level state vector u from the coarse-level state variables defined on C.

Interpolation matrix – illustration

Weighted aggregation Every node k in C can be thought of as representing an aggregate of pixels. For example, pixel i belongs to the k-th aggregate with a weight Pik between 0 and 1. The result is a decomposition of the image into aggregates.

This equation is used to generate the coarse graph G[s].

Influences only the internal weight of an aggregate
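
The interpolation and aggregation steps can be sketched with dense matrices. This is a minimal sketch under stated assumptions: rows of P are the couplings to C normalized to sum to 1, and the coarse couplings are taken as P^T W P with the diagonal (the internal weight of each aggregate) set aside. The function names are hypothetical, and a real implementation would use sparse matrices.

```python
import numpy as np

def interpolation_matrix(W, C):
    """Build P (N x |C|) from a symmetric weight matrix W and a list C.

    Representatives map to themselves. Every other node i gets
    P[i, k] = w_ik / sum_{k' in C} w_ik', which assumes node i has at
    least one positive coupling to C.
    """
    N = W.shape[0]
    P = np.zeros((N, len(C)))
    col = {node: k for k, node in enumerate(C)}
    for i in range(N):
        if i in col:
            P[i, col[i]] = 1.0
        else:
            w_to_C = W[i, C]
            P[i, :] = w_to_C / w_to_C.sum()
    return P

def coarsen(W, P):
    """Coarse couplings P^T W P with the diagonal zeroed.

    The diagonal entries describe weight inside an aggregate, not a
    coupling between two different aggregates.
    """
    Wc = P.T @ W @ P
    np.fill_diagonal(Wc, 0.0)
    return Wc
```

Applying the two functions repeatedly, with a new C chosen at each level, yields one graph per level of the pyramid.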

The bottom-up algorithm Segmentation by Weighted Aggregation (SWA) We treat our graph as a grid graph, starting from the finest grid, and we coarsen it at each step. First choose about half of the nodes as representatives C. Choose them so that each node in the graph is strongly connected to C.

The bottom-up algorithm (cont.) Now we aggregate all the nodes that are strongly coupled to a node in C into that node, which eliminates a large number of nodes. Each node now corresponds to an aggregate of pixels, not just a single one.

The bottom-up algorithm (cont.) Recalculate the aggregate properties. Recalculate the edge weights accordingly. Now apply the same procedure to the new nodes.

Conclusion At the end of this process a full pyramid has been constructed. Every salient segment appears as an aggregate in some level of the pyramid.

Results

Result(2)

Result(3)

Result (4) Input image, scale 11, scale 8, scale 7. The smaller the scale, the smaller the segments.

What if I told you there is a horse in the image ?

Top-down segmentation Top-down segmentation relies on acquired class-specific information and can only be applied to images from a specific class. The approach is to use known shape characteristics of objects within a given class to guide the segmentation process.

Top-down as a jigsaw puzzle The construction of an object from fragments is somewhat similar to the assembly of a jigsaw puzzle, where we try to put together a set of pieces such that their templates form an image similar to a given example.

The goal The goal is to cover as closely as possible the images of different objects from a given class, using a set of more primitive shapes. How? By identifying useful "building blocks": a collection of components that can be used to identify and delineate the boundaries of objects in the class.

What are we looking for in an image? Image fragments that are strongly correlated with images containing the desired object class.

Example Class – horse. Class fragments are stored in memory.

Fragments representation in memory Stage 1 1- Divide the set of training images into class images (C) and non-class images (NC). 2- Generate a large number of candidate fragments from the images in C. How are candidate fragments chosen? We simply extract a large number of rectangular sub-images from the images in C. 3- These sub-images vary in size, ranging from 1/50 to 1/7 of the object size.

Stage 2 Compare the distribution of each fragment in C and NC. 1- For a given fragment Fi, we measure its strength of response Si: the maximal normalized correlation of the fragment with each image I in C and NC. 2- To reach a fixed level of false alarms α in non-class images, we determine a threshold θi for Fi by the criterion p(Si > θi|NC) ≤ α.

Stage 3 1) Order the fragments by their hit rate p(Si > θi|C) and select the K best ones (K = size of the set). The hit rate, measured at the fixed false alarm level α, is stored with each fragment as its reliability value. 2) Add a figure-ground label to each fragment, marking which pixels of the grey-level template are figure and which are ground. The label can be assigned manually or learned from relative motion or grey-level variability.
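
The threshold criterion of Stage 2 can be sketched with empirical distributions. This is a minimal sketch, assuming the responses Si have already been measured on every class and non-class image; the function names fragment_threshold and hit_rate are hypothetical.

```python
import numpy as np

def fragment_threshold(nc_scores, alpha):
    """Pick an observed score theta with empirical P(S > theta | NC) <= alpha.

    nc_scores: responses of one fragment on the non-class images.
    At most floor(alpha * n) non-class scores may exceed theta.
    """
    scores = np.sort(np.asarray(nc_scores, dtype=float))
    n = len(scores)
    allowed = int(np.floor(alpha * n))
    return float(scores[n - allowed - 1])

def hit_rate(c_scores, theta):
    """Empirical P(S > theta | C): fraction of class images detected."""
    return float(np.mean(np.asarray(c_scores, dtype=float) > theta))
```

Fragments can then be ranked by hit_rate at their own threshold, which is the ordering used in the next stage.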

Segmentation by Optimal Cover The main stage in the algorithm consists of covering an image with class-based fragments.

Class-based fragments Examples for the classes human face and car.

Segmentation by Optimal Cover The main stage in the algorithm consists of covering an image with class-based fragments. How do we compute the quality of a cover? 1- Individual Match. 2- Consistency. 3- Fragment Reliability.

Individual Match Measures the similarity between fragments and the image regions that they cover. The similarity measure combines region correlation with edge detection. The figure-ground label is used to exclude background pixels from the similarity measure.

Consistency The fragments provide a consistent global cover of the shape. Cij - consistency measure between a pair of overlapping fragments Fi and Fj. The maximum term in the denominator prevents overlaps smaller than a fixed value μij from contributing a high consistency term.

Consistency – Example Consistent cover and inconsistent cover. Figure pixels are marked white, background pixels are grey. The inconsistent region is marked in black.

Fragment Reliability Similar to a jigsaw puzzle, the task of piecing together the correct cover can be simplified by starting with some more “reliable” fragments. Reliable fragments capture distinguishing features of the shapes in the class. A fragment’s reliability is evaluated by the likelihood ratio between the detection rate and the false alarm rate.

Fragment Reliability (cont.) We set the minimal threshold such that the false alarm rate does not exceed α.
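
Written out, the two statements on the reliability slides correspond to the expressions below. The symbol R_i for the reliability of fragment F_i is an assumption introduced here; the probabilities and the threshold θ_i are the ones defined in Stage 2.

```latex
\theta_i = \min \{\, \theta : p(S_i > \theta \mid NC) \le \alpha \,\},
\qquad
R_i = \frac{p(S_i > \theta_i \mid C)}{p(S_i > \theta_i \mid NC)}
```

Because the false alarm rate is held at the same level α for every fragment, ranking by R_i and ranking by the hit rate p(S_i > θ_i | C) give essentially the same order.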

Fragment Reliability Reliable fragments completed by less reliable ones

Individual Match -Example

Consistency – Example

Consistency – Example (2)

The cover algorithm We seek to maximize the cover score cs. The score rewards match quality and reliability and penalizes inconsistent overlapping fragments. A constant determines the magnitude of the penalty for insufficient consistency, and the penalty is zero for non-overlapping pairs.

The algorithm 1- At each stage, a small number M of good candidate fragments are identified. 2- A subset of these M fragments that maximally improves the current score is selected and added to the cover. 3- Existing fragments that are inconsistent with the new match are removed. 4- To initialize the process, a sub-window is selected within the image with the maximal concentration of reliable fragments.

Experiments The algorithm was tested on a database of horse images. A bank of 485 fragments was constructed from a sample library of 41 images containing horses. The distributions p(Si|C) and p(Si|NC) were estimated by measuring the similarity with 193 images of horses and 253 images of non-horses. Using these estimated distributions, the fragments were assigned their appropriate thresholds and classified into 146 reliable and 339 non-reliable fragments.

Result Row 2 – results obtained from low-level segmentation. Row 3 – class-based segmentation into figure and ground.
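
The greedy flavor of the cover algorithm can be illustrated with a deliberately simplified sketch. Everything in it is an assumption for illustration: the score form (sum of per-fragment quality minus kappa * (1 - c_ij) for each overlapping selected pair), the names cover_score and greedy_cover, and the omission of the removal step and of the sub-window initialization. It is not the scoring function from the slides, which was lost in the transcript.

```python
def cover_score(selected, quality, consistency, kappa=1.0):
    """Assumed score: reward minus inconsistency penalty.

    quality[i]: match quality combined with reliability of candidate i
    (hypothetical input). consistency: {(i, j): c_ij in [0, 1]} listing
    overlapping pairs only, so non-overlapping pairs contribute zero.
    """
    reward = sum(quality[i] for i in selected)
    penalty = sum(kappa * (1.0 - c) for (i, j), c in consistency.items()
                  if i in selected and j in selected)
    return reward - penalty

def greedy_cover(candidates, quality, consistency, kappa=1.0):
    """Repeatedly add the candidate that most improves the score."""
    selected = set()
    best = cover_score(selected, quality, consistency, kappa)
    while True:
        trials = [(cover_score(selected | {c}, quality, consistency, kappa), c)
                  for c in candidates if c not in selected]
        if not trials:
            break
        score, c = max(trials)
        if score <= best:  # no candidate improves the cover any further
            break
        selected.add(c)
        best = score
    return selected, best
```

In this toy version a fragment that overlaps inconsistently with the current cover is simply never added, whereas the algorithm on the slide also removes existing fragments that conflict with a new match.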

Result (cont.)

Top-Down Bottom-Up

Combining Top-down and Bottom-up: motivation Can you accurately determine the other foreleg of the horse?

Bottom-up Top -down

C(x,y) - Classification map The goal is to construct a classification map C(x, y), which gives +1 for figure and -1 for background. C(x, y) should make the best compromise between a top-down requirement and a bottom-up constraint. The top-down requirement is to make C as close as possible to the initial top-down classification map T. The bottom-up constraint is to penalize configurations that separate homogeneous image regions. The overall voting defines a figure-ground segmentation map T(x, y) of the image, which classifies each pixel in the image as figure or background. The map can be given in a deterministic form, in which a pixel is either figure, T(x, y) = 1, or background, T(x, y) = −1.

The combined segmentation algorithm stages 1) The bottom-up phase. 2) The top-down phase, which uses the output of the bottom-up phase, namely the segmentation tree.

Definitions 1- Each segment Si has a label si = +1 or −1. 2- A configuration vector s represents the labeling si of all the segments in the segmentation tree.

How does the algorithm choose between bottom-up and top-down if they conflict? What is the algorithm's criterion for leaving a segment as in the bottom-up segmentation or changing it according to the top-down requirements? The criterion is a cost function of the configuration vector. It is built from local cost functions, each of which depends on the label of a segment and the label of its parent node.

Local cost function The local cost function is made up of a top-down term and a bottom-up term.

The top-down term The distance between the final classification C(s) and the top-down classification T depends only on the labeling of the terminal segments of the tree, so the top-down term ti is nonzero only for terminal segments. The configuration vector s represents the labeling si of all the segments in the segmentation tree.

Saliency The salience (also called saliency) of an item – be it an object, a person, a pixel, etc. – is the state or quality by which it stands out relative to its neighbors. Saliency typically arises from contrasts between items and their neighborhood, such as a red dot surrounded by white dots, or a loud noise in an otherwise quiet environment. 

The bottom-up term A segment is expected to have the same label as its parent if it is a low-salience segment, but can have its own independent label if it is a high-salience segment. The cost function therefore penalizes a segment if its label is different from its parent's label, unless it is a salient segment. The saliency Γi is supplied as part of the bottom-up segmentation. The term depends on three quantities: the size of the segment; hi = h(Γi), with hi ∈ [0, 1) and hi → 1 as the segment Si becomes more salient; and a predetermined constant (e.g. 4) that controls the tradeoff between the top-down and the bottom-up terms.

Minimizing the cost function A lower cost means a better compromise. According to the sum-product algorithm, if the decomposition of f forms a so-called "factor tree", then the minimum can be found efficiently by a simple message passing scheme that requires only two messages between neighboring nodes in the tree.
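
The expressions for the two terms were lost from the transcript. The block below is a sketch of how they might be written, using only the quantities named on these slides (the segment size |S_i|, h_i, a constant λ, the parent label s_i^-, and the map T); the exact forms are an assumption modeled on the Borenstein, Sharon and Ullman paper listed in the references.

```latex
f(\bar{s}) = \sum_i f_i(s_i, s_i^-),
\qquad
f_i(s_i, s_i^-) = t_i(s_i) + b_i(s_i, s_i^-)

t_i(s_i) =
\begin{cases}
\sum_{(x,y) \in S_i} \bigl(s_i - T(x,y)\bigr)^2 & \text{if } S_i \text{ is a terminal segment} \\
0 & \text{otherwise}
\end{cases}

b_i(s_i, s_i^-) = \lambda \, |S_i| \, (1 - h_i) \, (s_i - s_i^-)^2
```

Under this form a highly salient segment (h_i close to 1) pays almost nothing for disagreeing with its parent, while a large low-salience segment pays a penalty proportional to its size.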

I don’t see any tree The segmentation tree

Minimizing the cost function (cont.) In our case, the local cost pi defines a weighted edge between a segment Si and its parent Si-. The global function f is the sum of all these costs over the segments of the tree.

Minimizing the cost function(2) In this tree, the local cost pi defines a weighted edge between a segment Si and its parent . The global function f is the summation of all these costs as defined by P(the cost function). The computation proceeds by sending messages between neighboring nodes in the tree.

Minimizing the cost function(3) During the bottom-up phase, each node sends a message to its parent. During the top-down phase, each node receives a message from its parent. Each message from a node containing variable s consists of two values, one for s = +1 and one for s = -1. Both messages are defined by a recursive computation.

Minimizing the cost function(4) The computation terminates when each node has sent one message to its parent and received one message from it. Once message passing is complete, the algorithm combines the two messages at each node so that each node has two values, ms(-1) and ms(+1). The label with the smaller of these two values is the selected label of node s in the configuration s.

Minimizing the Costs – Information Exchange in a Tree Bottom-up message: for each possible parent label s = x, the message compares two cases. The first is the cost of si = –1 with s = x plus the messages received for si = –1. The second is the cost of si = +1 with s = x plus the messages received for si = +1. The message value is the smaller of the two.

Minimizing the Costs – Information Exchange in a Tree (2) Top-down message and min-cost label: at each node, two values are computed, the minimal cost if the region is classified as background and the minimal cost if it is classified as figure. The label with the smaller value is the selected label of node s in the configuration s.
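
The two-pass computation can be sketched as a min-sum procedure on the segmentation tree. This is a minimal sketch under stated assumptions: the pairwise cost is taken to be b_i * (s_i - s_parent)^2, the top-down pass picks labels by backtracking from the root instead of forming both values at every node (the selected minimum is the same), and the function name minimize_tree_cost and its argument layout are hypothetical.

```python
LABELS = (-1, +1)

def minimize_tree_cost(parent, top_down, bottom_up):
    """Exact minimization of sum_i [t_i(s_i) + b_i * (s_i - s_parent(i))**2].

    parent[i]: index of the parent of node i, or None for the root.
    top_down[i][s]: cost t_i of giving node i the label s (keys -1 and +1).
    bottom_up[i]: non-negative weight b_i of disagreeing with the parent.
    Returns (labels, minimal total cost).
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    root = None
    for i, p in enumerate(parent):
        if p is None:
            root = i
        else:
            children[p].append(i)

    # Breadth-first order: every parent appears before its children.
    order = [root]
    for i in order:
        order.extend(children[i])

    # Bottom-up pass. own[i][s] is the best cost of the subtree of i when
    # s_i = s; msg[i][x] is the message sent to the parent for parent label x.
    own = [None] * n
    msg = [None] * n
    for i in reversed(order):
        own[i] = {s: top_down[i][s] + sum(msg[c][s] for c in children[i])
                  for s in LABELS}
        if parent[i] is not None:
            msg[i] = {x: min(own[i][s] + bottom_up[i] * (s - x) ** 2
                             for s in LABELS)
                      for x in LABELS}

    # Top-down pass: fix the root, then choose each label given its parent's.
    labels = [None] * n
    labels[root] = min(LABELS, key=lambda s: own[root][s])
    for i in order[1:]:
        x = labels[parent[i]]
        labels[i] = min(LABELS,
                        key=lambda s: own[i][s] + bottom_up[i] * (s - x) ** 2)
    return labels, own[root][labels[root]]
```

Each node is visited once in each pass, so the cost of the minimization is linear in the number of segments in the tree.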

Result (bottom-up corrects top-down) The top-down process may produce a figure-ground approximation that may miss or produce erroneous figure regions (especially in highly variable regions such as the horse legs and tail). Salient bottom-up segments can correct these errors and delineate precise region boundaries.

Result (1) The first row shows the bottom-up segments at four scales. (a.) The initial classification map T(x, y) (c.) Final figure-ground segmentation C(x, y) (e.) Confidence in the final classification.

Result(2) The first row shows the bottom-up segments at four scales. (a.) The initial classification map T(x, y) (c.) Final figure-ground segmentation C(x, y) (e.) Confidence in the final classification.

Result (top-down corrects bottom-up) The bottom-up segmentation may be insufficient for detecting the figure-ground contour, and the top-down process completes the missing information.

Complexity The bottom-up process provides the segmentation tree in time linear in the image size. The top-down process provides the initial classification map in time linear in the image size and the number of fragments.

Conclusion The advantage of the combined algorithm is the ability to use top down information to group together segments belonging to the object despite image-based dissimilarity. By using the entire segmentation tree, all image discontinuities at all scales are taken into account, to provide a final figure-ground segmentation. This segmentation provides an optimal compromise between the image content (bottom-up constraint) and the model (top-down requirement).

References 1) E. Borenstein, E. Sharon and S. Ullman, Combining Top-down and Bottom-up Segmentation, Proceedings IEEE Workshop on Perceptual Organization in Computer Vision, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2004. 2) E. Borenstein and S. Ullman, Class-specific, top-down segmentation, ECCV 2002. http://www.csd.uwo.ca/~olga/Courses/Fall2007/840/StudentPapers/BorensteinUllman2002.pdf

References 3) E. Borenstein and S. Ullman, Learning to Segment, Springer-Verlag LNCS 3023, European Conference on Computer Vision (ECCV), May 2004. http://www.msri.org/people/members/eranb/learning_to_segment.pdf 4) E. Sharon, A. Brandt, and R. Basri, "Segmentation and boundary detection using multiscale intensity measurements," in CVPR (1), 2001, pp. 469–476.