Adaptive Segmentation Based on a Learned Quality Metric

Presentation transcript:

Adaptive Segmentation Based on a Learned Quality Metric I. Frosio (NVIDIA, USA), E. Ratner (Lyrical Labs, USA)

Motivation: good / bad segmentation Let me start by showing some segmentation results. This is the output of SLIC superpixel segmentation for an image of the sky with some clouds. Let’s have a look at the result. We can put a like here, where the sky and the cloud are well separated, but we should also put a dislike here, where cloud and sky are merged together. SLIC (Achanta, 2012)

Motivation: good / bad segmentation Let’s move to another segmentation algorithm: graph-cut. Let us put a like for the larger sky segment, which also preserves the boundary with the cloud. But what about the segment including both cloud and sky? Dislike! GRAPH-CUT (Felzenszwalb, 2004)

Motivation: good / bad segmentation Finally, this slide shows the result obtained with the method proposed here. We still have one like here and one dislike here… Oops, no, two likes. ADAPTIVE GRAPH-CUT (ours)

Motivation: good / bad segmentation ADAPTIVE GRAPH-CUT (ours) > GRAPH-CUT (Felzenszwalb, 2004) > SLIC (Achanta, 2012)? We compared three different segmentations of the same image, and we decided that adaptive graph-cut is better than graph-cut, which in turn is better than SLIC. Why do we make this ranking?

Motivation: good / bad segmentation Achanta, 2012 (SLIC); Kaufhold, 2004: segmentation algorithms aggregate sets of perceptually similar pixels in an image. Felzenszwalb, 2004 (graph-cut): a segmentation algorithm should capture perceptually important groupings or regions, which often reflect global aspects of the image. The main issue with segmentation is that we do not have any general-purpose solution approaching human-level competence, and evaluating the quality of a segmentation algorithm remains challenging even today (I do not think things have changed a lot since 2001). This is also evident when we analyze the guiding principles of different segmentation algorithms, where the main idea is to aggregate PERCEPTUALLY similar pixels.

Motivation: segmentation & video compression Frame segmentation Segment motion estimation Encoding True block and sub-block motion vectors

Aim #1: use the human factor (aka segmentation quality metric) So the lesson we learned up to now is that segmentation quality has to be evaluated by a human observer.

Aim #2: automatic parameter tuning

Road map 1) Pick a segmentation algorithm… 2) … Learn a quality metric including the human factor (application needs)… 3) … And put them together (autotuning). The requirements of the application enter through the human evaluation of the segmentation quality.

Graph-cut Graph: nodes (pixels vi), edges (vi, vj), weights w(vi, vj). Let me start by introducing some basic concepts for graph-cut. We have nodes, edges and weights. The larger the distance in terms of color between two pixels, the larger the weight: w(vi, vj) = 0 for identical colors, w(vi, vj) >> 0 for very different colors.

Graph-cut Internal difference of a component Cm. Let’s then define a connected component in the graph, like the ones represented here.

Graph-cut Difference between components Cm and Cn.

Graph-cut Boundary predicate between components Ck and Cn (figure: edge weights 10, 15, 12).

Graph-cut Boundary predicate between components C1 and C2 (figure: edge weights 15, 8, 11). There is a last term, tau(Ck), which is defined as the ratio between a constant k and the number of pixels in a given component. When the component is small, the denominator is small and this term is large (it can dominate Int(Ck)). The effect is that the threshold to assert that there is a boundary is higher for smaller components. From a practical point of view, this term prevents the creation of small components; in other words, the constant k sets the scale of observation. It is the most significant parameter of the algorithm, and it has to be defined by the user.
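The formulas on these slides did not survive in the transcript. For reference, the definitions below are the standard ones from the graph-based segmentation paper the talk cites (Felzenszwalb, 2004), written with the symbols used in the notes; MST(C, E) denotes the minimum spanning tree of component C, and the numbering of the slides' own equations is not known.

```latex
\begin{align}
\mathrm{Int}(C) &= \max_{e \in \mathrm{MST}(C, E)} w(e) \\
\mathrm{Dif}(C_m, C_n) &= \min_{v_i \in C_m,\; v_j \in C_n,\; (v_i, v_j) \in E} w(v_i, v_j) \\
\tau(C) &= \frac{k}{|C|} \\
D(C_m, C_n) &=
\begin{cases}
\text{true} & \text{if } \mathrm{Dif}(C_m, C_n) > \min\bigl(\mathrm{Int}(C_m) + \tau(C_m),\; \mathrm{Int}(C_n) + \tau(C_n)\bigr) \\
\text{false} & \text{otherwise}
\end{cases}
\end{align}
```

A boundary is declared between two components only when the cheapest edge connecting them is more expensive than the internal difference of at least one component, relaxed by tau.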

Graph-cut Boundary predicate: observation scale ~ k
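To make the role of k concrete, here is a minimal sketch of the boundary predicate, assuming the standard form from Felzenszwalb, 2004 (a boundary exists when the difference between two components exceeds the smaller of the two relaxed internal differences). The function names and the numbers in the usage note are illustrative and are not taken from the slides.

```python
def tau(k, size):
    """Threshold term: constant k divided by the number of pixels in the component."""
    return k / size


def has_boundary(dif, int_a, size_a, int_b, size_b, k):
    """Return True if the evidence for a boundary between components A and B is strong enough.

    dif    : difference between the two components (smallest connecting edge weight)
    int_*  : internal difference of each component
    size_* : number of pixels in each component
    k      : scale-of-observation constant
    """
    min_internal = min(int_a + tau(k, size_a), int_b + tau(k, size_b))
    return dif > min_internal
```

With dif = 12, internal differences 10 and 8, and two 5-pixel components, a small k (k = 3) gives a threshold of 8.6 and the boundary is kept, whereas k = 100 raises the threshold to 28 and the two components are merged. This is the sense in which a larger k favours larger segments.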

Graph-cut Segmentation results for k = 3, k = 100 and k = 10,000.

Road map 1) Pick a segmentation algorithm… 2) … Learn a quality metric including the human factor… 3) … And put them together (autotuning).

(Weighted) symmetric uncertainty: $\frac{4\ \text{bits}}{7\ \text{bits} + 5\ \text{bits}} \approx 33\%$. Entropy-based average.

k vs. Uw vs. quality (160 x 120 image block) Let me go into more detail. Let us consider the 160x120 rectangle highlighted in red in the image, and let’s ask a human to classify this area as under-, well- or over-segmented for different values of k. For 1 ≤ k ≤ 50, over-segmentation occurs: areas that are perceptually homogeneous are divided into several segments. The segmentation looks fine for 75 ≤ k ≤ 200, whereas for 350 ≤ k ≤ 10,000 only a few segments are present (under-segmentation). Now let me also introduce the weighted symmetric uncertainty index, Uw. This index measures the fraction of information shared between the original image and the segmented one. When it is one, the two images have the same information content, i.e. noise is also represented in the segmented image. On the other hand, when this index is zero, there is no correlation at all between the pixels in the original image and those in the segmented image. Notice that this index is normalized, so that it lies between 0 and 1 and is independent of the absolute quantity of information contained in the image. Moreover, it is a weighted index: the symmetric uncertainty is computed separately for the red, green and blue channels, and the three values are then combined in a weighted average. But let’s forget the math details: we only have to remember that Uw is an index that describes how much information the original and the segmented image have in common. Not surprisingly, Uw is high when we have a lot of segments, and it decreases as k increases and the number of segments in the image decreases.
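A rough sketch of how such an index could be computed is given below. It assumes the textbook symmetric uncertainty U = 2 I(X;Y) / (H(X) + H(Y)), which equals 1 for identical images and 0 for unrelated ones, as the talk describes. Weighting each channel by the entropy of the original channel is an assumption suggested by the slide label "Entropy based average"; the exact weights used by the authors are not given in the transcript. All function names are illustrative.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """2 * I(X;Y) / (H(X) + H(Y)) for two equally long pixel sequences."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))  # I = H(X) + H(Y) - H(X,Y)
    return 2.0 * mutual_info / (hx + hy)


def weighted_symmetric_uncertainty(original, segmented):
    """original, segmented: dict mapping channel name to a flat list of pixel values.

    Assumption: each channel is weighted by the entropy of the original channel.
    """
    num = den = 0.0
    for channel in original:
        weight = entropy(original[channel])
        num += weight * symmetric_uncertainty(original[channel], segmented[channel])
        den += weight
    return num / den if den > 0 else 0.0
```

Here a "segmented image" means the image in which every pixel is replaced by a per-segment value (for example the segment mean), so a segmentation with a single segment yields constant channels and an index of zero.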

k vs. Uw vs. quality Training: 320x240 RGB images, 160 x 120 blocks, k = [1, …, 10,000], visual inspection & classification. We segmented a set of 12 images, including flowers, portraits, landscapes and sport images, using graph-cut and various values of k. We classified each 160x120 block in the images as under-, well- or over-segmented. Here I plot the results for the images whose initial resolution is 320x240 (thus these images are divided into 2x2 blocks).

k vs. Uw vs. quality Training: 640x480 RGB images, 160 x 120 blocks, k = [1, …, 10,000], visual inspection & classification. It is evident that a single value of k is not sufficient to correctly segment all the images. It is also evident that, for all the blocks, we have an S-shaped curve in the log(k) / Uw space.

Learning the metric Uw = m log(k) + b We want to identify a curve (a line in the log(k), Uw space) such that the points on the curve are associated with well-segmented blocks. Once we have defined this line, we can force the segmentation algorithm to produce results that lie close to it, so they are likely to be well segmented. To compute it, we used an SVM-like approach: we define the classification errors for under- and well-segmented blocks (delta_us and delta_we) and give more importance to these errors when the misclassified points are far from the line that separates the under-segmented and well-segmented areas. We minimize this cost function with the Nelder-Mead simplex algorithm and we get the green line in the plots, which is associated with an area where the human observer classified the blocks as well segmented.
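The exact cost function is not given in the transcript, so the following is only a sketch of one plausible reading: a misclassified block contributes its distance from the line, so that errors far from the line weigh more. It assumes that under-segmented blocks should lie below the line and well-segmented blocks on or above it, and that log means log10; the function name and the data layout are hypothetical.

```python
import math


def line_cost(m, b, points):
    """Distance-weighted misclassification cost for the line Uw = m * log10(k) + b.

    points: iterable of (k, uw, label) with label "under" or "well".
    Assumption: "under" points belong below the line, "well" points on or above it.
    """
    norm = math.hypot(m, 1.0)  # converts vertical offset to point-line distance
    cost = 0.0
    for k, uw, label in points:
        signed = (uw - (m * math.log10(k) + b)) / norm  # > 0 means above the line
        if label == "under" and signed > 0:
            cost += signed      # plays the role of delta_us for this point
        elif label == "well" and signed < 0:
            cost += -signed     # plays the role of delta_we for this point
    return cost
```

A derivative-free optimizer such as Nelder-Mead, the method named in the talk, can then search over (m, b); in SciPy this would be `scipy.optimize.minimize` with `method="Nelder-Mead"` applied to a wrapper that takes the pair as a single vector.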

Road map 1) Pick a segmentation algorithm… 2) … Learn a quality metric including the human factor… 3) … And put them together (autotuning).

Automatic k selection Given a 160x120 block, we are now able to automatically select k such that the quality of the segmented image is optimal. This is an iterative process. Let’s start by segmenting the image for k = 1 (very low) and k = 10,000 (very high). In both cases we can measure Uw after segmentation: for k = 1 we have over-segmentation, since the point is above the optimal line, whereas for k = 10,000 we have under-segmentation, since the corresponding Uw is below the optimal line. Thus the optimal value of k has to lie between k = 1 and k = 10,000. We go on with a bracketing search strategy to identify the optimal k.

Automatic k selection At iteration 1, we therefore try to segment the image with k = 100 (average value in the log space). We measure the weighted symmetric uncertainty Uw and we realize that this is still high (we are above the optimal line), so we have to increase k.

Automatic k selection Segmentation with the new value of k leads to undersegmentation. So we have to decrease k.

Automatic k selection At iteration 3 the point is still under the optimal line, so k has to be decreased again…

Automatic k selection After 5 iterations we are already at convergence.
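The search walked through on these slides can be sketched as a bisection in log space. The sketch assumes a callable `uw_of_k` (hypothetical) that segments the block with a given k and returns the measured Uw, a learned line Uw = m log10(k) + b, and a fixed number of iterations; the stopping rule actually used by the authors is not stated in the transcript.

```python
import math


def select_k(uw_of_k, m, b, k_lo=1.0, k_hi=10000.0, iterations=5):
    """Bracketing search for the k whose Uw falls on the line Uw = m*log10(k) + b."""
    for _ in range(iterations):
        k_mid = math.sqrt(k_lo * k_hi)          # midpoint of the bracket in log space
        if uw_of_k(k_mid) > m * math.log10(k_mid) + b:
            k_lo = k_mid                        # above the line: over-segmented, raise k
        else:
            k_hi = k_mid                        # below the line: under-segmented, lower k
    return math.sqrt(k_lo * k_hi)
```

With the initial bracket [1, 10,000] the first value tried is k = 100, as in the talk, and each iteration halves the bracket width in log10(k), so five iterations narrow it from 4 decades to 0.125 decades.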

… and adaptivity k = k(x,y) To get an adaptive segmentation procedure, we compute the optimal k for each (non-overlapping) 160x120 block in the image, we assign each block’s optimal k to all of its pixels, and we smooth the resulting map to avoid abrupt transitions.
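As an illustration of this step, the sketch below expands per-block values of k into a per-pixel map and smooths it. The transcript does not say which smoothing filter is used, so the box filter here is an assumption, as are the function names; the plain nested-list representation is chosen only to keep the example dependency-free.

```python
def expand_blocks(block_k, block_h, block_w):
    """Build a per-pixel map k(x, y) by repeating each block's k over its pixels."""
    rows = []
    for block_row in block_k:
        pixel_row = [k for k in block_row for _ in range(block_w)]
        rows.extend([list(pixel_row) for _ in range(block_h)])
    return rows


def box_smooth(k_map, radius):
    """Smooth the map with a (2*radius+1)^2 box filter, clipped at the image borders."""
    h, w = len(k_map), len(k_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [k_map[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(window) / len(window)
    return out
```

For a 320x240 image this would be called with a 2x2 grid of block values, `block_h=120` and `block_w=160`; pixels near a block border then receive a value of k between those of the neighbouring blocks.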

Road map

Results - Quality Adaptive graph-cut (ours); Graph-cut (Felzenszwalb, 2004) *; SLIC (Achanta, 2012) * (* same number of segments forced for each algorithm). We numerically compared the adaptive segmentation algorithm we developed with the original version of graph-cut and with SLIC. Inter-class contrast measures the contrast between different segments: it should be high if the segments are really different. Intra-class uniformity is a measure of the standard deviation of the pixels within the same segment: it should be low if each segment is uniform (with the exception of textured segments). The ratio between the two is a measure that is less dependent on texture or noise in the image.

Results

Results SLIC Graph-cut Adaptive graph-cut

Results

Results SLIC Graph-cut Adaptive graph-cut

Results: inter-class contrast (the higher the better) Sum of the contrasts among segments weighted by their areas (Chabrier, 2004)

Results: intra-class uniformity (the lower the better) Sum of the normalized standard deviation for each region (Chabrier, 2004)

Results: contrast - uniformity ratio (the higher the better)
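The slides only describe the two measures in words, so the sketch below is one possible reading rather than the formulas of Chabrier, 2004. It assumes grayscale pixel values, normalizes each region's standard deviation by the value range (255), measures the contrast between two regions as |mi - mj| / (mi + mj), and averages it over all other regions instead of only adjacent ones. All names are illustrative.

```python
from statistics import mean, pstdev


def group_by_label(pixels, labels):
    """Collect pixel values per segment label."""
    regions = {}
    for value, label in zip(pixels, labels):
        regions.setdefault(label, []).append(value)
    return regions


def intra_class_uniformity(regions, value_range=255.0):
    """Sum of per-region standard deviations, normalized by the value range (assumed)."""
    return sum(pstdev(values) / value_range for values in regions.values())


def inter_class_contrast(regions):
    """Area-weighted sum of each region's mean contrast to the other regions."""
    total = sum(len(values) for values in regions.values())
    means = {label: mean(values) for label, values in regions.items()}
    score = 0.0
    for label, values in regions.items():
        contrasts = [abs(means[label] - means[other]) / (means[label] + means[other])
                     for other in regions
                     if other != label and means[label] + means[other] > 0]
        if contrasts:
            score += (len(values) / total) * (sum(contrasts) / len(contrasts))
    return score
```

On the toy signal [10, 10, 200, 200], the labelling [0, 0, 1, 1] gives zero intra-class deviation and a high contrast, while [0, 1, 0, 1] gives two identical mixed regions with zero contrast. The ratio of the two measures is left out of the sketch because it is undefined when every region is perfectly uniform.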

Discussion LEARNED segmentation quality metric including the HUMAN FACTOR. Iterative method to AUTOMATICALLY and ADAPTIVELY compute the optimal scale parameter.

A more general approach (edge thresholding segmentation in YUV)

A more general approach (edge thresholding segmentation in YUV) Openbroadcast encoding (x264) vs. Lyrical Labs encoding (adaptive segmentation)

Open issues & improvements: resolution dependency (160x120 blocks); learning: the Berkeley Segmentation Dataset; avoid iterations (see I. Frosio, SPIE EI 2015)

Questions?