Published by Asgeir Jenssen. Modified over 6 years ago.
1
"GrabCut": Interactive Foreground Extraction using Iterated Graph Cuts
An Weizhi
2
Overview
Background
Previous approach
Start from Graph Cut
GrabCut
Conclusion
3
Background
Foreground-background segmentation
4
Previous work
White brush: foreground
Yellow brush: boundary
Red brush: background
5
Previous work
Compared to the previous work, graph cut:
Simplifies the user interaction
Is robust when segmenting complex scenes
6
Start from Graph Cut
[Figure: example image with the foreground and background regions labelled]
7
Start from Graph Cut
Define the image as an array of grey values:
$z = (z_1, z_2, z_3, \ldots, z_N)$
Define the segmentation as an opacity value for each pixel:
$\alpha = (\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_N), \quad \alpha_n \in \{0, 1\}$
If $\alpha_n = 0$, $z_n$ is background; if $\alpha_n = 1$, $z_n$ is foreground.
8
Start from Graph Cut
Define $\theta$ to describe the grey-level distributions of the image:
$\theta = \{h(z; \alpha),\ \alpha = 0, 1\}, \quad \sum_z h(z; \alpha) = 1$
That is, create one normalised histogram for the foreground and one for the background.
9
Graph Cut
The problem: how do we infer $\alpha$ given the image array $z$ and the model $\theta$?
Energy function:
$E(\alpha, \theta, z) = U(\alpha, \theta, z) + V(\alpha, z)$
Here $U$ is the data term and $V$ is the smoothness term.
10
$E(\alpha, \theta, z) = U(\alpha, \theta, z) + V(\alpha, z)$
Data term:
$U(\alpha, \theta, z) = \sum_n -\log h(z_n; \alpha_n)$
$U(0, \theta, z)$ is estimated using the background histogram model $h(z; 0)$.
$U(1, \theta, z)$ is estimated using the foreground histogram model $h(z; 1)$.
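The data term can be illustrated with a minimal sketch, assuming 8-bit grey levels and toy sample values. The function names and the `eps` guard against `log(0)` are assumptions of this sketch and do not come from the slides.

```python
import math

def normalised_histogram(values, levels=256):
    """Return h where h[g] is the fraction of `values` equal to grey level g."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1
    total = len(values)
    return [c / total for c in counts]

def data_term(z, alpha, h_bg, h_fg, eps=1e-12):
    """U(alpha, theta, z) = sum_n -log h(z_n; alpha_n).

    eps is an assumption of this sketch: it keeps log() finite for grey
    levels that never occurred in the samples used to build a histogram.
    """
    total = 0.0
    for z_n, a_n in zip(z, alpha):
        h = h_fg if a_n == 1 else h_bg
        total += -math.log(h[z_n] + eps)
    return total

# Toy samples: dark background, bright foreground.
h_bg = normalised_histogram([10, 10, 12, 12])
h_fg = normalised_histogram([200, 200, 201, 201])

z = [10, 12, 200, 201]
good = data_term(z, [0, 0, 1, 1], h_bg, h_fg)  # labels agree with the histograms
bad = data_term(z, [1, 1, 0, 0], h_bg, h_fg)   # labels swapped
```

A labelling that agrees with the histograms gives a much smaller `U` than the swapped labelling, which is what the minimisation exploits.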
11
Personal understanding
Imagine there are two histograms, $h(z; 0)$ and $h(z; 1)$.
When an uncertain pixel is looked up in the two histograms, each returns a value that can be read as a probability.
A higher probability means the pixel is more likely to belong to the region described by that histogram.
Once the uncertain pixels have been classified correctly, $U(\alpha, \theta, z)$ converges.
12
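$E(\alpha, \theta, z) = U(\alpha, \theta, z) + V(\alpha, z)$
Smoothness term:
$V(\alpha, z) = \gamma \sum_{(m,n) \in C} \mathrm{dis}(m,n)^{-1}\, [\alpha_n \neq \alpha_m]\, \exp\left(-\beta (z_m - z_n)^2\right)$
The indicator $[\alpha_n \neq \alpha_m]$ means that $V(\alpha, z)$ only acts on the boundary, where neighbouring pixels carry different labels.
Minimizing $V(\alpha, z)$ favours a boundary that runs between pixels with a large grey-level difference.
With $\beta = \left(2 \langle (z_m - z_n)^2 \rangle\right)^{-1}$ in $V(\alpha, z)$, where $\langle \cdot \rangle$ is the average over the image, the term switches appropriately between high and low contrast.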
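A minimal sketch of the smoothness term on a small grey image stored as nested lists. The 8-connected neighbourhood, the helper names and the default `gamma=50.0` are assumptions made for illustration; the slides do not fix them.

```python
import math

def neighbour_pairs(height, width):
    """Yield each unordered pair of 8-connected pixels once, with its distance."""
    for r in range(height):
        for c in range(width):
            for dr, dc in ((0, 1), (1, -1), (1, 0), (1, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    yield (r, c), (rr, cc), math.hypot(dr, dc)

def smoothness_term(z, alpha, gamma=50.0):
    """V(alpha, z): contrast-weighted penalty summed over label boundaries."""
    pairs = list(neighbour_pairs(len(z), len(z[0])))
    # beta = 1 / (2 * mean squared difference over neighbouring pairs)
    mean_sq = sum((z[m[0]][m[1]] - z[n[0]][n[1]]) ** 2
                  for m, n, _ in pairs) / len(pairs)
    beta = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0
    v = 0.0
    for m, n, dist in pairs:
        if alpha[m[0]][m[1]] != alpha[n[0]][n[1]]:  # only boundary pairs pay
            diff = z[m[0]][m[1]] - z[n[0]][n[1]]
            v += math.exp(-beta * diff ** 2) / dist
    return gamma * v

z = [[10, 10, 200, 200],
     [10, 10, 200, 200]]
alpha_edge = [[0, 0, 1, 1], [0, 0, 1, 1]]  # boundary on the contrast edge
alpha_flat = [[0, 1, 1, 1], [0, 1, 1, 1]]  # boundary through a flat region

v_edge = smoothness_term(z, alpha_edge)
v_flat = smoothness_term(z, alpha_flat)
```

Placing the boundary on the intensity edge is cheaper than cutting through the flat region, so `v_edge` comes out smaller than `v_flat`.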
13
Min-Cut / Max-Flow Algorithm
Objective function: the segmentation can be estimated as a global minimum:
$\alpha^* = \arg\min_{\alpha} E(\alpha, \theta, z)$
How do we solve this optimization problem? With the Min-Cut / Max-Flow algorithm.
14
Image to Graph
Treat the image as a graph.
Nodes:
A background node and a foreground node (the terminals S and T)
N nodes, one for each of the N pixels
Edges:
Every pixel node connects to both S and T
Every pixel node connects to its neighbours
Treat a cut of the graph as a segmentation.
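The graph construction above can be sketched for a single row of pixels. This is an illustrative Edmonds-Karp max-flow, not the solver used in the original work; the function name, the choice of S as the foreground terminal and the toy costs are assumptions of this sketch.

```python
from collections import deque

def min_cut_labels(cost_bg, cost_fg, pair_weight):
    """Label a 1-D row of pixels by min-cut (Edmonds-Karp max-flow).

    cost_bg[i] / cost_fg[i]: data cost of labelling pixel i background / foreground.
    pair_weight[i]: smoothness cost if pixels i and i+1 get different labels.
    Returns a list of 0 (background) / 1 (foreground).
    """
    n = len(cost_bg)
    S, T = n, n + 1                      # terminal nodes
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[S][i] = cost_bg[i]           # this edge is cut when pixel i ends up background
        cap[i][T] = cost_fg[i]           # this edge is cut when pixel i ends up foreground
    for i in range(n - 1):
        cap[i][i + 1] = cap[i + 1][i] = pair_weight[i]

    def bfs_parents():
        """Breadth-first search from S over edges with residual capacity."""
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs_parents()
        if T not in parent:
            break                        # no augmenting path left: flow is maximal
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= flow            # use up capacity along the path
            cap[w][u] += flow            # and open the reverse residual edge

    reachable = bfs_parents()            # nodes still connected to S
    return [1 if i in reachable else 0 for i in range(n)]

labels = min_cut_labels(cost_bg=[1, 2, 9, 8], cost_fg=[9, 8, 1, 2],
                        pair_weight=[3, 3, 3])
```

Pixels that remain connected to S after the maximum flow has been pushed are labelled foreground; the total capacity of the cut edges equals the energy of that labelling.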
16
New Challenge
How can the method be extended to images in RGB colour space?
How can the user interaction be made simpler?
17
Motivation
Use the colour-value histogram? Too sparse.
Use GMM (Gaussian Mixture Model) estimation instead.
18
Colour data modelling
The background is exactly fixed.
Assumption: the image array $z$ follows a probability distribution.
19
Colour data modelling
There are two GMMs, one for the background and one for the foreground.
Each GMM is a full-covariance Gaussian mixture with $K$ components ($K = 5$).
Define the vector $k = (k_1, k_2, \ldots, k_n, \ldots, k_N)$, $k_n \in \{1, 2, \ldots, K\}$.
20
Colour data modelling
Energy function (data term plus smoothness term):
$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z)$
Data term:
$U(\alpha, k, \theta, z) = \sum_n D(\alpha_n, k_n, \theta, z_n)$
21
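Data term
$D(\alpha_n, k_n, \theta, z_n) = -\log\left[\pi(\alpha_n, k_n)\, p(z_n \mid \alpha_n, k_n, \theta)\right]$
Gaussian probability formula, where $d$ is the number of colour channels ($d = 3$ for RGB) and $2\pi$ is the numerical constant, not the mixture weight:
$p(z_n \mid \alpha_n, k_n, \theta) = (2\pi)^{-d/2}\, |\det \Sigma(\alpha_n, k_n)|^{-1/2} \exp\left(-\tfrac{1}{2} [z_n - \mu(\alpha_n, k_n)]^T\, \Sigma(\alpha_n, k_n)^{-1}\, [z_n - \mu(\alpha_n, k_n)]\right)$
Substituting and dropping the constant $\tfrac{d}{2} \log 2\pi$:
$D(\alpha_n, k_n, \theta, z_n) = -\log \pi(\alpha_n, k_n) + \tfrac{1}{2} \log \det \Sigma(\alpha_n, k_n) + \tfrac{1}{2} [z_n - \mu(\alpha_n, k_n)]^T\, \Sigma(\alpha_n, k_n)^{-1}\, [z_n - \mu(\alpha_n, k_n)]$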
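A minimal NumPy sketch of the per-pixel data term for one GMM component. The function name and the toy component (weight 0.5, mean colour (200, 30, 30), covariance 100·I) are assumptions chosen for illustration.

```python
import numpy as np

def gmm_data_term(z_n, weight, mean, cov):
    """D = -log(weight) + 0.5*log det(cov) + 0.5*(z-mean)^T cov^-1 (z-mean).

    The constant (d/2)*log(2*pi) is dropped, as in the slide formula.
    """
    diff = np.asarray(z_n, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)              # stable log-determinant
    mahalanobis = diff @ np.linalg.solve(cov, diff)  # diff^T cov^-1 diff
    return -np.log(weight) + 0.5 * logdet + 0.5 * mahalanobis

# Hypothetical reddish component.
weight = 0.5
mean = np.array([200.0, 30.0, 30.0])
cov = 100.0 * np.eye(3)

d_at_mean = gmm_data_term([200, 30, 30], weight, mean, cov)
d_far = gmm_data_term([20, 200, 200], weight, mean, cov)
```

A pixel at the component mean pays only the weight and determinant terms; a pixel far from the mean pays an additional Mahalanobis penalty, so `d_far` is larger than `d_at_mean`.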
22
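$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z)$
Smoothness term:
$V(\alpha, z) = \gamma \sum_{(m,n) \in C} \mathrm{dis}(m,n)^{-1}\, [\alpha_n \neq \alpha_m]\, \exp\left(-\beta \lVert z_m - z_n \rVert^2\right)$
$V$ is unchanged from the previous definition except for the pixel distance, which is now computed in colour space.
Our aim is to get $\theta$:
$\theta = \{\pi(\alpha, k),\ \mu(\alpha, k),\ \Sigma(\alpha, k);\ \alpha = 0, 1;\ k = 1 \ldots K\}$
Here $\pi$ are the weights, $\mu$ the means and $\Sigma$ the covariances; $\alpha$ is the opacity and $k$ indexes the GMM components.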
23
Method: EM algorithm
Initialisation
$T_F = \emptyset$
Background: $\alpha_n = 0$ if $n \in T_B$
Initial foreground (to be updated): $\alpha_n = 1$ if $n \in T_U$, where $T_U = \overline{T_B}$ is the complement of $T_B$
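A minimal sketch of this initialisation, assuming the user input is a rectangle given as `(top, left, bottom, right)` with exclusive bottom and right edges. That convention and the function name are hypothetical and only serve the example.

```python
def init_from_rectangle(height, width, rect):
    """Build T_B, T_U, T_F and the initial alpha from a user rectangle.

    rect = (top, left, bottom, right), with bottom and right exclusive
    (a convention assumed for this sketch).
    """
    top, left, bottom, right = rect
    T_B, T_U = set(), set()
    alpha = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if top <= r < bottom and left <= c < right:
                T_U.add((r, c))
                alpha[r][c] = 1   # provisional foreground, refined later
            else:
                T_B.add((r, c))   # fixed background, alpha stays 0
    T_F = set()                   # no definite foreground at the start
    return T_B, T_U, T_F, alpha

T_B, T_U, T_F, alpha = init_from_rectangle(4, 5, (1, 1, 3, 4))
```

Everything outside the rectangle becomes fixed background, and everything inside starts as unknown with a provisional foreground label.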
24
Initialisation
Initialize $k$ using k-means clustering, so that each pixel belongs to one GMM component.
Initialize $\pi, \mu, \Sigma$ for the GMM components.
25
$k_n := \arg\min_{k_n} D_n(\alpha_n, k_n, \theta, z_n)$ (1)
$z$ is the array of image pixels.
Each pixel is already assigned to the foreground or the background, so $\alpha_n$ is known.
$\theta$ has been initialized.
Our aim: find $k_n$.
26
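Learning GMM parameters
Learn the GMM parameters from the data $z$:
$\theta := \arg\min_{\theta} U(\alpha, k, \theta, z)$ (2)
GMM parameters (shown for the foreground, $\alpha = 1$):
$\pi(\alpha = 1, k) = \dfrac{|F(k)|}{\sum_{k'} |F(k')|}$
$\mu(\alpha = 1, k) = \mathrm{mean}\{z_n : n \in F(k)\}$
$\Sigma(\alpha = 1, k) = \mathrm{cov}\{z_n : n \in F(k)\}$
where $F(k)$ is the set of foreground pixels assigned to component $k$.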
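The parameter update can be sketched with NumPy as follows. The function name is hypothetical, components are indexed from 0 rather than 1, the covariance uses the maximum-likelihood normalisation (`bias=True`), and every component is assumed to hold at least two pixels; all of these are assumptions of the sketch.

```python
import numpy as np

def fit_components(pixels, k, K):
    """Estimate (weights, means, covariances) from hard component assignments.

    pixels: (N, 3) array of colours for one region (e.g. the foreground).
    k:      (N,) integer array, component index in 0..K-1 for each pixel.
    Assumes every component contains at least two pixels.
    """
    weights, means, covs = [], [], []
    for comp in range(K):
        members = pixels[k == comp]                 # the set F(comp)
        weights.append(len(members) / len(pixels))  # |F(k)| / sum_k' |F(k')|
        means.append(members.mean(axis=0))
        covs.append(np.cov(members, rowvar=False, bias=True))
    return np.array(weights), np.array(means), np.array(covs)

pixels = np.array([[0, 0, 0], [2, 0, 0],
                   [10, 10, 10], [10, 12, 10], [10, 10, 14], [12, 10, 10]],
                  dtype=float)
k = np.array([0, 0, 1, 1, 1, 1])
weights, means, covs = fit_components(pixels, k, K=2)
```

With hard assignments the update needs no iterative EM inner loop: each component is fitted directly from the pixels assigned to it.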
27
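Estimate segmentation
$\min_k E(\alpha, k, \theta, z) = \sum_n \min_{k_n} D(\alpha_n, k_n, \theta, z_n) + V(\alpha, z)$
Estimate the segmentation using min cut:
$\min_{\{\alpha_n : n \in T_U\}} \min_k E(\alpha, k, \theta, z) = \min_{\{\alpha_n : n \in T_U\}} \left[ \sum_n \min_{k_n} D(\alpha_n, k_n, \theta, z_n) + V(\alpha, z) \right]$ (3)
Repeat steps (1), (2) and (3) until convergence.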
28
Optimization result
29
Border Matting
Our aim: produce a continuous $\alpha_n$ along the boundary.
Define a contour $C$ (from the previous segmentation).
Recompute $T_B$, $T_U$, $T_F$ near the contour.
Calculate $\alpha_n$ for $n \in T_U$:
$\alpha_n = g(r_n; \Delta_{t(n)}, \sigma_{t(n)})$
where $r_n$ is the distance, $\Delta$ the centre and $\sigma$ the width.
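The slides do not give the functional form of $g$. As a minimal sketch, assume a logistic soft step centred at `delta` with width `sigma`; this specific choice and the function name are assumptions for illustration only.

```python
import math

def soft_step(r, delta, sigma):
    """Assumed logistic soft step: near 0 well below delta, near 1 well above it."""
    return 1.0 / (1.0 + math.exp(-(r - delta) / sigma))

# Alpha profile across the border for signed distances r = -6 .. 6.
profile = [soft_step(r, delta=0.0, sigma=1.5) for r in range(-6, 7)]
```

Moving `delta` shifts where the transition sits relative to the contour, and a larger `sigma` makes the transition from background to foreground wider.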
30
Border Matting
$E = \sum_{n \in T_U} D_n(\alpha_n) + \sum_{t=1}^{T} V(\Delta_t, \sigma_t, \Delta_{t+1}, \sigma_{t+1})$
The first sum is the data term and the second is the smoothness term.
A DP (dynamic programming) algorithm is used to minimize $E$.
31
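Border Matting
Smoothness term:
$V(\Delta, \sigma, \Delta', \sigma') = \lambda_1 (\Delta - \Delta')^2 + \lambda_2 (\sigma - \sigma')^2$
Data term:
$D_n(\alpha_n) = -\log N\left(z_n;\ \mu_{t(n)}(\alpha_n),\ \Sigma_{t(n)}(\alpha_n)\right)$
where $N$ is the Gaussian probability with the following mean and covariance:
$\mu_t(\alpha) = (1 - \alpha)\, \mu_t(0) + \alpha\, \mu_t(1)$
$\Sigma_t(\alpha) = (1 - \alpha)^2\, \Sigma_t(0) + \alpha^2\, \Sigma_t(1)$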
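A minimal sketch of the border-matting data term, simplified to a single grey channel so that the covariance reduces to a scalar variance. The function name and the toy background and foreground statistics are assumptions of this sketch.

```python
import math

def blended_gaussian_cost(z_n, a, mu_bg, var_bg, mu_fg, var_fg):
    """D_n(alpha) = -log N(z_n; mu(alpha), var(alpha)) for one grey channel."""
    mu = (1 - a) * mu_bg + a * mu_fg                # blended mean
    var = (1 - a) ** 2 * var_bg + a ** 2 * var_fg   # blended variance
    return 0.5 * math.log(2 * math.pi * var) + (z_n - mu) ** 2 / (2 * var)

# Hypothetical local statistics: dark background, bright foreground.
mu_bg, var_bg = 10.0, 25.0
mu_fg, var_fg = 200.0, 25.0

# A pixel halfway between the two means is best explained by alpha = 0.5.
alphas = [i / 10 for i in range(11)]
best_alpha = min(
    alphas,
    key=lambda a: blended_gaussian_cost(105.0, a, mu_bg, var_bg, mu_fg, var_fg),
)
```

The data term rewards opacity values whose blended colour model explains the observed pixel, which is how mixed pixels on the border obtain fractional $\alpha$.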
32
Foreground estimation
Foreground pixel colours are estimated without drawing on the background, so, unlike the Bayes matte, GrabCut shows no background colour bleeding.
[Figure: comparison of methods for border matting]
33
Result
34
Result
More difficult situations
35
Result
36
Failure situations
Regions of low contrast (reduce V penalty)
Camouflage, where the foreground and background distributions overlap
Background material inside the user rectangle that happens to be important to the overall background distribution
37
Conclusion
GrabCut copes with moderately difficult images using simple user interaction.
It obtains a hard segmentation by iteration.
It uses border matting to smooth the hard boundary.
38
Q&A
Q: What does $\theta$ mean?
A: It denotes the histograms $h(z; 0)$ and $h(z; 1)$; each $h(z; \alpha)$ outputs a probability.
Q: Why does GrabCut use a GMM instead of the original histograms?
A: A colour image has 3 channels, so its histogram is too sparse to use. GrabCut therefore proposes a more intuitive model, the GMM, to replace the histograms.
39
Q&A
Q: What are the requirements when drawing a rectangle on the image?
A: We need to ensure that the background lies outside the rectangle. As emphasized in the failure situations, the background distribution needs abundant information.
Q: In GrabCut, $\alpha$ is uncertain. How do we solve for $\alpha$?
40
Q&A
A: We set an initial $\alpha$ first and then iterate with the EM method. During the iteration we obtain an optimized $\alpha$.
41
Thank you