Slide 1: An Iterative Optimization Approach for Unified Image Segmentation and Matting
Jue Wang (University of Washington) and Michael F. Cohen (Microsoft Research)
Hello everyone, my name is Jue Wang, and I'm glad to be here to present our paper, "An Iterative Optimization Approach for Unified Image Segmentation and Matting." This is joint work with Michael Cohen at Microsoft Research.
Slide 2: Introduction – Image Matting
[Slide shows an observed image and its alpha matte, with the known and unknown quantities labeled.]
Image matting, or foreground matting, means extracting a soft mask for the foreground object that separates it from the background. The observed image I(z) is modeled as a linear interpolation of a foreground image F(z) and a background image B(z), where alpha(z) is the map of interpolation coefficients, which we call a matte. Since for each pixel z only its observed color I(z) is known, while F(z), B(z), and alpha(z) are all unknown, this is an under-constrained problem.
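For reference, the linear model just described is the standard compositing equation; written out in the talk's notation:

```latex
I(z) = \alpha(z)\,F(z) + \bigl(1 - \alpha(z)\bigr)\,B(z), \qquad \alpha(z) \in [0, 1]
```

Per pixel this gives three equations (one per color channel) but seven unknowns (three each for F(z) and B(z), plus alpha(z)), which is why extra constraints such as a trimap or user strokes are needed.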
Slide 3: Introduction – Trimap-Based Matting
[Slide shows an original image, its trimap (background / unknown / foreground), and the resulting matte.]
To solve this under-constrained problem, previous matting approaches require a special map called a trimap. The trimap segments the image into three regions: a definite foreground region, a definite background region, and an unknown region. The problem is then constrained to estimating alpha values for pixels in the unknown region, based on the known foreground and background regions.
Slide 4: Introduction – Matting After Segmentation
Trimap generation can be erroneous. A common way to generate a trimap is to first apply a binary image segmentation. As shown in this example, the user draws a few strokes indicating foreground and background regions, and we can use optimization-based image segmentation approaches, such as graph cut, to separate the foreground object from the background [GrabCut, Rother et al.; Lazy Snapping, Li et al.; both SIGGRAPH 2004]. Then we can dilate and erode this foreground region to create an unknown region, shown in yellow here. However, this unknown region is not sufficient to cover all the hair strands, so the final matte will not be good.
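For illustration, here is a minimal Python/OpenCV sketch of this dilate/erode trimap construction; the function name and the fixed band width are illustrative choices rather than the authors', and a fixed-width band is exactly what fails to cover long hair strands:

```python
import cv2
import numpy as np

def trimap_from_segmentation(binary_mask, band=10):
    """Build a trimap from a binary foreground mask (255 = FG, 0 = BG).

    band: half-width in pixels of the unknown region around the boundary.
    Returns a trimap with 255 = foreground, 0 = background, 128 = unknown.
    """
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * band + 1, 2 * band + 1))
    sure_fg = cv2.erode(binary_mask, kernel)   # shrink: pixels certainly foreground
    sure_bg = cv2.dilate(binary_mask, kernel)  # grow: outside this is certainly background
    trimap = np.full_like(binary_mask, 128)    # everything starts as unknown
    trimap[sure_fg == 255] = 255
    trimap[sure_bg == 0] = 0
    return trimap
```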
Slide 5: Introduction – Our Approach: Unified Segmentation and Matting
[Slide shows the initial input strokes, the iterative optimization, and the final matte.]
No explicit trimap is required. In this paper we propose a unified approach that combines image segmentation and matting in a single iterative optimization process. As shown in this example, we start from a few paint strokes specified by the user and iteratively estimate the matte until it converges to the final result. The uniqueness of our approach lies in the fact that no explicit trimap is required.
Slide 6: Related Work – Blue Screen Matting [Mishima et al. 1993; Smith et al., SIGGRAPH 1996]
Let's quickly go through some related work. Some early image matting systems simplify the problem by photographing foreground objects against a constant-colored background, which is called blue screen matting.
Slide 7: Related Work – Bayesian Matting [Chuang et al., CVPR 2001]
Recent approaches try to estimate mattes from natural images. One of the significant works in this area is the Bayesian matting approach presented at CVPR 2001, which formulates the problem in a Bayesian framework and solves it with MAP estimation. It can achieve fairly good results when the trimap provided is accurate. This work was later extended to video [Chuang et al., SIGGRAPH 2002] by using optical flow to interpolate trimaps between keyframes; however, the bottleneck of that system is obtaining a correct trimap on each frame.
Slide 8: Related Work – Poisson Matting [Sun et al., SIGGRAPH 2004]
The Poisson matting approach presented at SIGGRAPH 2004 formulates matting as solving Poisson equations, under the assumption that the intensity change in the foreground and background is smooth. The more interesting part of this work is that it provides interactive tools that allow the user to manually improve the matte.
Slide 9: Limitations of a Trimap
As we have seen, all previous approaches require a trimap for an input image, and the key requirement is that the trimap be as accurate as possible: the more accurate the trimap, the better the results that can be generated. However, an automatically generated trimap is not optimal and can be erroneous, as we demonstrated, and manually specifying a trimap can be very tedious, especially for transparent objects such as the spider web shown here.
Slide 10: Our Approach – Iterative Matte Optimization
Hidden random field: the matte. Observation: the image. Goal: maximize the conditional probability of the matte given the image.
Our system employs an iterative optimization approach to avoid the trimap problem. We model the alpha matte as a hidden random field and the image as the observation, and we solve a conditional random field, or CRF, by maximizing the conditional probability of the alpha matte given the observed image. If we set up the CRF correctly, we obtain the matte we want.
Slide 11: CRF vs. MRF
We model the matte as a conditional random field: given the observation data Y, the goal is to maximize the conditional probability P(X|Y), where the observations y_i are dependent on one another. However, this CRF is hard to solve directly. Consider another commonly used model, the Markov random field: the goal of solving an MRF is to maximize the joint probability P(X,Y), and to make it tractable we often assume the y_i are independent of one another. In our system we solve the CRF by iteratively solving MRFs: each iteration solves a new MRF, which finally leads us to the solution of the CRF.
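In symbols, a minimal way to contrast the two objectives (the notation is illustrative: X is the hidden alpha field, Y the observations, and Psi a pairwise compatibility term):

```latex
\text{CRF:}\quad \hat{X} = \arg\max_X P(X \mid Y)
\qquad\qquad
\text{MRF:}\quad \hat{X} = \arg\max_X P(X, Y), \quad
P(X, Y) \propto \prod_i P(y_i \mid x_i) \prod_{(i,j)} \Psi(x_i, x_j)
```

The MRF factorization relies on the conditional-independence assumption on the y_i mentioned above, which is what makes each inner problem tractable.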
Slide 12: Iterative Solver – Color Sampling: The “Conditional” Part
Let's look at how we solve it iteratively. We first transfer the user's input down to the matte. For a given pixel under consideration, we gather a group of foreground and background samples as its observation data. Each sample has four attributes: its R, G, and B channel values and a weight. The weight reflects the confidence we have that the sample's colors represent the true foreground or background at that pixel.
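As a small structural sketch, a sample might be represented as follows (the field names are illustrative, not the paper's):

```python
from dataclasses import dataclass

@dataclass
class ColorSample:
    """One foreground or background color sample gathered for a pixel."""
    r: float
    g: float
    b: float
    weight: float  # confidence that this color is the true F or B at the pixel
```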
Slide 13: Iterative Solver – Solve MRF
In this way we can form an MRF on part of the matte and solve it using loopy belief propagation. After the MRF is solved, we update the matte.
Slide 14: Iterative Solver – Step 1: Color Sampling; Step 2: Solve MRF by Belief Propagation
We do this iteratively: we first apply color sampling to form an MRF, then solve the MRF by belief propagation and update the matte, and we repeat until the whole matte converges. Note that in each iteration the MRF is different.
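A minimal sketch of this outer loop, under the assumption that strokes are encoded as 1 (foreground), 0 (background), and -1 (unmarked); the three callables stand in for the stages described on the surrounding slides, and none of this is the paper's actual API:

```python
from typing import Callable
import numpy as np

def unified_segmentation_and_matting(
    image: np.ndarray,
    user_strokes: np.ndarray,
    dilate_confident_regions: Callable,
    collect_color_samples: Callable,
    solve_mrf_with_bp: Callable,
    max_iters: int = 50,
) -> np.ndarray:
    """Outer CRF loop sketch: each pass builds and solves a fresh MRF."""
    # Known strokes seed the matte; everything else starts unknown (NaN).
    matte = np.where(user_strokes == 1, 1.0, np.where(user_strokes == 0, 0.0, np.nan))
    for _ in range(max_iters):
        prev = matte.copy()
        region = dilate_confident_regions(matte)                  # front propagation (slide 16)
        samples = collect_color_samples(image, matte, region)     # step 1: color sampling
        matte = solve_mrf_with_bp(image, samples, region, matte)  # step 2: loopy BP + update
        if np.nanmax(np.abs(matte - prev)) < 1e-3:                # converged: matte stops changing
            break
    return matte
```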
Slide 15: MRF Setup – Attributes of a Pixel/Node
Let's look at how we set up the MRF in the inner loop of the algorithm. Each pixel in the region of consideration corresponds to a hidden node. For a node, the observation data include a group of foreground samples, a group of background samples, and the observed color of the pixel. On the hidden side, we quantize the continuous alpha value into discrete levels and calculate a likelihood for each level. Each node also carries an estimated foreground color, an estimated background color, and an uncertainty value. The uncertainty serves three purposes: (1) it limits the size of the MRF in each iteration, (2) it guides foreground/background sampling, and (3) it sets weights for nodes.
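Collected into one structure, the per-node state might look like this (illustrative field names):

```python
from dataclasses import dataclass, field
from typing import List, Optional
import numpy as np

@dataclass
class MatteNode:
    """Per-pixel state carried across iterations of the CRF optimization."""
    observed_color: np.ndarray                      # (3,) RGB color of the pixel
    fg_samples: List = field(default_factory=list)  # (color, weight) foreground samples
    bg_samples: List = field(default_factory=list)  # (color, weight) background samples
    likelihoods: Optional[np.ndarray] = None        # one entry per quantized alpha level
    est_fg: Optional[np.ndarray] = None             # estimated foreground color
    est_bg: Optional[np.ndarray] = None             # estimated background color
    uncertainty: float = 1.0                        # 0 = user-marked, fully confident
```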
Slide 16: Iterative Solver – Region of Consideration
The matte is estimated in a front-propagation fashion. Let's first look at how we use uncertainties to limit the size of the MRF. We start from the user-marked regions, which have an uncertainty of 0, and dilate them to create the region of consideration for this iteration, shown in yellow here. We model the yellow region as an MRF and estimate a matte for the current region; note that the definite foreground and background regions have been enlarged. We then update the information for every pixel in the region of consideration, and dilate the most confident regions again to start the next iteration.
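A possible sketch of this front-propagation step using morphological dilation on the confident pixels; the uncertainty threshold and band width are illustrative values:

```python
import cv2
import numpy as np

def region_of_consideration(uncertainty, threshold=0.1, band=5):
    """Select the band of pixels to model as an MRF this iteration.

    uncertainty: (H, W) float array; user-marked pixels have uncertainty 0.
    Returns a boolean mask of newly reached, not-yet-confident pixels.
    """
    confident = (uncertainty <= threshold).astype(np.uint8) * 255
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * band + 1, 2 * band + 1))
    grown = cv2.dilate(confident, kernel)
    return (grown == 255) & (confident == 0)  # the new ring becomes the MRF region
```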
Slide 17: MRF Setup – Attributes of a Pixel/Node (continued)
Now let's look at how to get these foreground and background samples for a given node.
Slide 18: MRF Setup – Color Sampling
We use two sampling strategies: local sampling, where local samples from low-uncertainty areas are given high weights, and global sampling, where samples come from a GMM trained on all foreground samples and are given lower weights.
For local sampling, we first define a local neighborhood around a given node. Within this neighborhood we examine all nodes, and choose those whose alpha value is smaller than the current one as background samples. Local sampling has been used in previous approaches; however, in the early stages of our algorithm we may not find valid samples in the local neighborhood. For example, this node has no foreground samples in its local area, so we use global sampling to obtain foreground samples for it: we train a GMM on all foreground samples and assign each sample to its closest Gaussian. Note that the GMM lives in color space; we draw it as 2D ellipses only for visualization. We then choose the Gaussian nearest to the current node in color space, and collect the samples belonging to that Gaussian as the node's foreground samples.
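A minimal sketch of the global sampling step using scikit-learn's GaussianMixture; the component count and sample count are illustrative choices:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def global_fg_samples(fg_colors, pixel_color, n_components=5, n_samples=20, seed=0):
    """Fit a GMM to all known foreground colors, then draw samples from the
    Gaussian nearest (in color space) to the pixel needing samples.

    fg_colors: (N, 3) RGB colors from known foreground pixels.
    pixel_color: (3,) RGB color of the node.
    """
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(fg_colors)
    labels = gmm.predict(fg_colors)                  # assign each sample to one Gaussian
    nearest = np.argmin(np.linalg.norm(gmm.means_ - pixel_color, axis=1))
    candidates = fg_colors[labels == nearest]        # samples belonging to that Gaussian
    rng = np.random.default_rng(seed)
    take = min(n_samples, len(candidates))
    return candidates[rng.choice(len(candidates), size=take, replace=False)]
```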
Slide 19: MRF Setup – Data Costs
Once we have determined the foreground and background samples, the next question is how to assign likelihoods to the different alpha levels of the hidden node.
Slide 20: MRF Setup – Data Cost
For a given alpha level, we first examine one pair consisting of a foreground sample and a background sample: we compute the interpolated color and calculate its distance to the observed color of the pixel. The likelihood is then an exponential function of that distance, weighted by the weights of the foreground and background samples. We examine every possible pair of foreground and background samples, and the final likelihood for the alpha level is the weighted sum of the likelihoods computed from each pair.
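Written out (in illustrative notation; sigma is a color-variance parameter whose value the talk does not give), the likelihood of quantized level alpha_k at pixel z is:

```latex
L_z(\alpha_k) \;=\; \sum_{i,j} w^F_i\, w^B_j \,
\exp\!\left( -\, \frac{\bigl\| I_z - \bigl(\alpha_k F_i + (1 - \alpha_k) B_j\bigr) \bigr\|^2}{\sigma^2} \right)
```

Here F_i and B_j range over the node's foreground and background samples with weights w^F_i and w^B_j; normalization constants are omitted.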
Slide 21: MRF Setup – Neighborhood Cost
To allow alpha values to propagate correctly from the limited known regions to the whole image, we define a neighborhood cost under the assumption that the matte should be locally smooth. For a pair of neighboring nodes p and q, the neighborhood cost is defined as an exponential function of the difference between their alpha levels.
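A minimal form of this pairwise term, in the same illustrative notation (sigma_n is a smoothness parameter):

```latex
\Psi(\alpha_p, \alpha_q) \;=\; \exp\!\left( -\, \frac{(\alpha_p - \alpha_q)^2}{\sigma_n^2} \right)
```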
Slide 22: Solving the MRF by Belief Propagation
We use the loopy belief propagation algorithm to minimize the total energy, which combines the data and neighborhood costs. Belief propagation is itself an iterative process: it works by passing messages along the links of the constructed graph. In our case, each node p sends messages to, and receives messages from, its four neighbors L(p), R(p), T(p), and B(p).
Slide 23: Solving the MRF by Belief Propagation (continued)
For example, to compute the message from p to its right neighbor R(p), we take the previous iteration's messages from the top, left, and bottom neighbors into p, multiply them together with the likelihood vector at p, and finally multiply the result by the neighborhood cost matrix. This yields the message from p to its right neighbor at the current iteration.
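A sum-product-style sketch of that message update; the final normalization is an addition to avoid numerical underflow:

```python
import numpy as np

def message_to_right(lik_p, msg_top, msg_left, msg_bottom, psi):
    """Compute the message from node p to its right neighbor R(p).

    lik_p:  (K,) likelihood vector at p over K quantized alpha levels.
    msg_*:  (K,) incoming messages from the previous BP iteration
            (the message from R(p) itself is deliberately excluded).
    psi:    (K, K) neighborhood cost matrix, psi[alpha_p, alpha_q].
    """
    evidence = lik_p * msg_top * msg_left * msg_bottom  # all evidence except from R(p)
    msg = psi.T @ evidence                              # marginalize over p's alpha levels
    return msg / msg.sum()
```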
Slide 24: Solving the MRF by Belief Propagation (continued)
Once the belief propagation algorithm converges, the belief at node p is computed as the element-wise product of all messages coming into p with p's own likelihood vector. We then update the likelihood vector at node p, and pick the alpha level with the maximum likelihood as the final alpha level for node p in this iteration of the CRF optimization.
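The corresponding readout step, continuing the sketch above (the 11-level quantization is an illustrative choice):

```python
import numpy as np

ALPHA_LEVELS = np.linspace(0.0, 1.0, 11)  # example quantization of alpha into 11 levels

def node_alpha(lik_p, msg_top, msg_left, msg_bottom, msg_right):
    """After BP converges, combine all incoming messages with the node's own
    likelihood vector and pick the best quantized alpha level."""
    belief = lik_p * msg_top * msg_left * msg_bottom * msg_right
    belief = belief / belief.sum()        # normalize into a distribution
    return ALPHA_LEVELS[np.argmax(belief)], belief
```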
Slide 25: CRF Update
Once we have determined the alpha values for the nodes included in the current MRF, we update the corresponding nodes in the CRF. We choose the pair of foreground and background samples that best matches the current estimated alpha value and the observed color, and use them as the estimated foreground and background colors for the node. Each sample comes with a weight, and we use the two weights to update the uncertainty value of the node.
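A sketch of this update; the uncertainty formula on the last line is an illustrative stand-in, since the talk only says the two sample weights are used:

```python
import numpy as np

def update_node(alpha, observed, fg_samples, bg_samples):
    """Pick the (F, B) sample pair that best explains the observed color
    under the estimated alpha, and derive an uncertainty from its weights.

    fg_samples / bg_samples: non-empty lists of (color (3,), weight) tuples.
    """
    best, best_err = None, np.inf
    for f_color, f_w in fg_samples:
        for b_color, b_w in bg_samples:
            predicted = alpha * f_color + (1.0 - alpha) * b_color
            err = np.linalg.norm(observed - predicted)
            if err < best_err:
                best, best_err = (f_color, b_color, f_w * b_w), err
    f_color, b_color, pair_weight = best
    uncertainty = 1.0 - pair_weight  # confident samples -> low uncertainty (assumed form)
    return f_color, b_color, uncertainty
```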
Slide 26: CRF Update – Defining Uncertainties
Each pixel is associated with an uncertainty value, and pixels with low uncertainties contribute more to solving the MRF. The uncertainty is designed to measure our confidence in the alpha value at each pixel. As shown in this diagram, some pixels are closer to the known foreground and background regions than others; those pixels should be trusted more in matte estimation, and they contribute more to solving the MRF since they have lower uncertainty values.
Slide 27: Results
[Slide shows the matte after iterations 3, 6, 9, and 14.]
Now I will show some results of our system. For this example, we start from a few strokes drawn by the user, which provide an initial matte, and then show intermediate mattes from the middle of the iterative process.
Slide 28: Results
[Slide shows the original image with user input and the extracted matte.]
For this image, starting from a few strokes, our system generates a good matte, which allows the peacock to be composited onto a new background.
Slide 29: Results – Comparison with Bayesian Matting
[Slide shows the user-specified trimap and the matte extracted by Bayesian matting.]
We can also use Bayesian matting to extract a matte from this image. To do so, we provide this trimap; however, Bayesian matting does not produce the right matte, even with more user input.
Slide 30: Results
[Slide shows the original image with user input, our extracted matte, the user-specified trimap, and the matte estimated by Bayesian matting.]
For a transparent foreground object such as the spider web shown here, our system extracts the matte very conveniently. By contrast, to use Bayesian matting the user must provide a very complicated trimap, and the final matte computed by Bayesian matting still contains significant errors.
Slide 31: Results
[Same comparison layout: original image with user input, our extracted matte, the user-specified trimap, and the matte estimated by Bayesian matting.]
In another example, we again see that our system requires much less user input to extract a good matte compared with previous approaches.
Slide 32: Extension to Video
[Slide shows frame 1 and its matte, the automatic initialization on frame 2, and the mattes for frames 2, 5, 10, 15, and 20.]
Our system extends easily to video, since it does not require a strict initialization such as a trimap. We simply run our algorithm on the first frame; once it is done, we compute the pixel-wise frame difference and propagate alpha values to the next frame for those pixels whose colors change little. With this initialization on frame 2 we can estimate a good alpha matte there as well, and in the same way we continue estimating mattes for the following frames without user interaction.
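A minimal sketch of this frame-to-frame initialization (the color tolerance is an illustrative value, not one from the paper):

```python
import numpy as np

def init_next_frame(prev_frame, next_frame, prev_matte, color_tol=10.0):
    """Carry alpha values forward wherever the pixel color barely changes.

    prev_frame, next_frame: (H, W, 3) arrays; prev_matte: (H, W) alphas in [0, 1].
    Returns an initial matte for the next frame, with changed pixels marked
    unknown (NaN) so the iterative solver can re-estimate them.
    """
    diff = np.linalg.norm(next_frame.astype(float) - prev_frame.astype(float), axis=2)
    init = np.full(prev_matte.shape, np.nan)  # NaN = unknown, to be solved
    stable = diff < color_tol
    init[stable] = prev_matte[stable]
    return init
```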
Slide 33: Extension to Video
[Slide shows the original video, the extracted matte, and all user inputs.]
This video shows that, by marking only a few frames, our system is able to extract good foreground mattes for the whole sequence.
Slide 34: Failure Mode
[Slide shows the original image with user input and the extracted matte.]
As we saw in the previous examples, our system works well when the foreground and background have distinct color distributions and simple texture patterns. For more complicated examples such as the image shown here, directly applying our system cannot extract a good matte, which we think is reasonable: if you look at the image carefully, the foreground and background colors are very ambiguous, especially under the effect of shadows.
Slide 35: Solution
[Slide shows the user-specified rough trimap, the matte extracted by our system, and the matte extracted by Bayesian matting.]
An advantage of our approach is its flexibility to work with different inputs. For this difficult image we can ask the user to provide a rough trimap; given that trimap, our system extracts a much better matte than Bayesian matting does.
Slide 36: Timings
A brute-force implementation is computationally expensive: 15-20 minutes for a 640x480 input image. We have developed several speedups: we adopt ideas from the fast belief propagation algorithm [Felzenszwalb et al., CVPR 2004], and we use hierarchical methods along with gradient information for further acceleration. Our current system runs in seconds per image, roughly 2-30 seconds depending on the input.
Slide 37: Future Directions – Combining All Visual Information
One future direction for this work, and for other matting approaches, is to combine all visual information (color, texture, shape, and so on) to develop more robust algorithms for difficult images such as the squirrel image shown here. As we can see, using color information alone is clearly not enough to estimate a good matte when the colors are this ambiguous.
Slide 38: Summary
In summary, we have developed a system for unified segmentation and matting by solving a CRF. The CRF is solved iteratively, with each iteration solving an MRF using belief propagation to optimize the matte. Advantages: no accurate trimap is required, the approach is especially efficient for large semi-transparent objects, and it extends to video. Disadvantages: it is computationally expensive, so the user cannot interact in real time, and it is based only on color information.