“GrabCut” - Interactive Foreground Extraction using Iterated Graph Cuts

An weizhi

Overview
Background
Previous Approach
Start from Graph Cut
GrabCut
Conclusion

Background
Foreground-background segmentation

Previous work
White brush: foreground
Yellow brush: boundary
Red brush: background

Previous work
Compared to the previous work, graph cut simplifies the interaction and is robust for segmentation in complex situations.

Start from Graph Cut
Figure labels: foreground, background

Start from Graph Cut
Define the image as an array of grey values:
$z = (z_1, z_2, z_3, \dots, z_N)$
Define the segmentation as an opacity value for each pixel:
$\alpha = (\alpha_1, \alpha_2, \alpha_3, \dots, \alpha_N)$, $\alpha_n \in \{0, 1\}$
If $\alpha_n = 0$, $z_n$ is background; if $\alpha_n = 1$, $z_n$ is foreground.

Start from Graph Cut
Define $\theta$ to describe the image grey-level distribution:
$\theta = \{h(z; \alpha),\ \alpha = 0, 1\}$, with $\int_z h(z; \alpha) = 1$
That is, create one normalised histogram distribution for the foreground and one for the background.
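As a minimal sketch of the model $\theta$, the two normalised histograms can be built from pixels whose labels are already known. The function names, the tiny 4-level image and the assumption that both labels have at least one pixel are illustrative, not taken from the slides.

```python
from collections import Counter

def grey_histogram(pixels, levels=256):
    """Return a normalised histogram h(z) over grey levels 0..levels-1."""
    counts = Counter(pixels)
    total = len(pixels)  # assumed non-zero in this sketch
    return [counts.get(level, 0) / total for level in range(levels)]

def build_theta(z, alpha, levels=256):
    """theta = {h(z; 0), h(z; 1)}: one histogram per label."""
    background = [zn for zn, an in zip(z, alpha) if an == 0]
    foreground = [zn for zn, an in zip(z, alpha) if an == 1]
    return {0: grey_histogram(background, levels),
            1: grey_histogram(foreground, levels)}

# Hypothetical 6-pixel image with 4 grey levels.
z = [0, 0, 1, 3, 3, 3]
alpha = [0, 0, 0, 1, 1, 1]
theta = build_theta(z, alpha, levels=4)
```

Each histogram sums to 1, which is the normalisation condition on $h(z; \alpha)$.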

Graph Cut
The problem is how to infer $\alpha$ from the given array $z$ and the model $\theta$.
Energy function:
$E(\alpha, \theta, z) = U(\alpha, \theta, z) + V(\alpha, z)$
where $U$ is the data term and $V$ is the smoothness term.

Data term
$E(\alpha, \theta, z) = U(\alpha, \theta, z) + V(\alpha, z)$
$U(\alpha, \theta, z) = \sum_n -\log h(z_n; \alpha_n)$
$U(0, \theta, z)$ is estimated using the background histogram distribution model.
$U(1, \theta, z)$ is estimated using the foreground histogram distribution model.

Personal understanding
Imagine there are two histograms, $h(z; 0)$ and $h(z; 1)$.
When an uncertain pixel is looked up in the two histograms, each returns a value that can be seen as a probability.
A higher probability means the pixel is more likely to belong to the corresponding histogram $h$.
When the uncertain pixels have been classified correctly, $U(\alpha, \theta, z)$ converges.
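The intuition above can be checked numerically. The sketch below evaluates the data term for a labelling that agrees with the histograms and for one that contradicts them. The histograms are hypothetical, and the small `eps` that avoids `log(0)` for empty bins is an assumption of this sketch, not part of the slides.

```python
import math

def data_term(z, alpha, theta, eps=1e-9):
    """U(alpha, theta, z) = sum_n -log h(z_n; alpha_n)."""
    return sum(-math.log(theta[an][zn] + eps) for zn, an in zip(z, alpha))

# Hypothetical 4-level histograms: background is dark, foreground is bright.
theta = {0: [0.7, 0.2, 0.1, 0.0],
         1: [0.0, 0.1, 0.2, 0.7]}
z = [0, 0, 3, 3]
good = data_term(z, [0, 0, 1, 1], theta)  # labels agree with the histograms
bad = data_term(z, [1, 1, 0, 0], theta)   # labels contradict the histograms
```

The correct labelling gives the lower energy, which is why minimising $U$ pushes each pixel towards the histogram that explains it best.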

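Smoothness term
$E(\alpha, \theta, z) = U(\alpha, \theta, z) + V(\alpha, z)$
$V(\alpha, z) = \gamma \sum_{(m,n) \in C} dis(m,n)^{-1} [\alpha_n \neq \alpha_m] \exp(-\beta (z_m - z_n)^2)$
The indicator $[\alpha_n \neq \alpha_m]$ means that $V(\alpha, z)$ acts only on the boundary, where neighbouring pixels have different labels.
Minimising $V(\alpha, z)$ favours a boundary that runs between neighbouring pixels with a large difference in value.
With $\beta = (2 \langle (z_m - z_n)^2 \rangle)^{-1}$ in $V(\alpha, z)$, where $\langle \cdot \rangle$ is the average over the image, the term switches appropriately between high and low contrast.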
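A minimal sketch of the smoothness term on a small grey image, assuming an 8-connected neighbourhood for the pair set $C$ and Euclidean pixel distance for $dis(m,n)$. The default `gamma=1.0` is a placeholder, and the test image is hypothetical.

```python
import math

def neighbour_pairs(height, width):
    """Yield each 8-connected pixel pair once, together with its distance."""
    for r in range(height):
        for c in range(width):
            for dr, dc in ((0, 1), (1, -1), (1, 0), (1, 1)):
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < height and 0 <= c2 < width:
                    yield (r, c), (r2, c2), math.hypot(dr, dc)

def smoothness_term(z, alpha, gamma=1.0):
    """V(alpha, z) with beta = 1 / (2 * mean squared neighbour difference)."""
    pairs = list(neighbour_pairs(len(z), len(z[0])))
    mean_sq = sum((z[a][b] - z[c][d]) ** 2 for (a, b), (c, d), _ in pairs) / len(pairs)
    beta = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0
    total = 0.0
    for (a, b), (c, d), dist in pairs:
        if alpha[a][b] != alpha[c][d]:  # only boundary pairs contribute
            total += math.exp(-beta * (z[a][b] - z[c][d]) ** 2) / dist
    return gamma * total

# Hypothetical image: dark left half, bright right half.
z = [[0, 0, 10, 10],
     [0, 0, 10, 10]]
on_edge = smoothness_term(z, [[0, 0, 1, 1], [0, 0, 1, 1]])    # cut follows the contrast edge
off_edge = smoothness_term(z, [[0, 1, 1, 1], [0, 1, 1, 1]])   # cut crosses a flat region
```

The boundary that follows the contrast edge is penalised less than the one placed in a flat region.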

Min-Cut / Max-Flow Algorithm
Objective function: the segmentation can be estimated as a global minimum:
$\alpha^* = \arg\min_\alpha E(\alpha, \theta, z)$
How to solve this optimization problem? With the min-cut / max-flow algorithm.

Image to Graph
Treat the image as a graph.
Nodes:
A background node and a foreground node (the terminals S and T)
n nodes corresponding to the n pixels
Edges:
Every pixel node connects to both S and T
Every pixel node connects to its neighbours
Treat a cut of the graph as a segmentation.
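To make the image-to-graph idea concrete, here is a small sketch on a hypothetical 1-D image of four pixels. It assumes S is the foreground terminal and T the background terminal, uses made-up label costs, and solves the cut with a plain Edmonds-Karp max-flow rather than the specialised solver a real implementation would use.

```python
from collections import defaultdict, deque

def min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow. Returns (flow value, nodes on the source side)."""
    residual = defaultdict(lambda: defaultdict(float))
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual[u][v] += cap
            residual[v][u] += 0.0  # make sure the reverse edge exists
    flow = 0.0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow, set(parent)  # nodes still reachable from the source
        # Push the bottleneck capacity along the path that was found.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

def segment(fg_cost, bg_cost, smooth):
    """1-D image: smooth[i] is the neighbour-edge weight between pixels i and i + 1."""
    capacity = defaultdict(dict)
    for n, (fg, bg) in enumerate(zip(fg_cost, bg_cost)):
        capacity["S"][n] = bg  # cut when pixel n is labelled background
        capacity[n]["T"] = fg  # cut when pixel n is labelled foreground
    for i, w in enumerate(smooth):
        capacity[i][i + 1] = w
        capacity[i + 1][i] = w
    energy, source_side = min_cut(capacity, "S", "T")
    return energy, [1 if n in source_side else 0 for n in range(len(fg_cost))]

# Hypothetical costs: pixels 0-1 look like background, pixels 2-3 like foreground.
energy, alpha = segment(fg_cost=[4.0, 3.0, 0.5, 0.2],
                        bg_cost=[0.3, 0.4, 3.5, 4.0],
                        smooth=[2.0, 0.1, 2.0])
```

The value of the minimum cut equals the energy of the best labelling, and the pixels left on the source side are the foreground.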

New Challenge
How can the method be taken into RGB colour space?
How can user interaction be made simpler?

Motivation
Use the colour-value histogram? Too sparse.
Use GMM (Gaussian Mixture Model) estimation instead.

Colour data modelling
The background is fixed exactly.
Assumption: the image array $z$ follows a probability distribution.

Colour data modelling
Define:
There are two GMMs, one for the background and one for the foreground.
The GMMs are full-covariance Gaussian mixtures with K components (K = 5).
Define the vector $k = (k_1, k_2, \dots, k_n, \dots, k_N)$, $k_n \in \{1, 2, \dots, K\}$

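Colour data modelling
Energy function:
$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z)$
where $U$ is the data term and $V$ is the smoothness term.
Data term:
$U(\alpha, k, \theta, z) = \sum_n D(\alpha_n, k_n, \theta, z_n)$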

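Data term
$D(\alpha_n, k_n, \theta, z_n) = -\log \left[ \pi(\alpha_n, k_n)\, p(z_n \mid \alpha_n, k_n, \theta) \right]$
Gaussian probability formula, for $d$ colour channels ($d = 3$ for RGB):
$p(z_n \mid \alpha_n, k_n, \theta) = (2\pi)^{-d/2} \, |\det \Sigma(\alpha_n, k_n)|^{-1/2} \exp\left( -\frac{1}{2} [z_n - \mu(\alpha_n, k_n)]^T \Sigma(\alpha_n, k_n)^{-1} [z_n - \mu(\alpha_n, k_n)] \right)$
Up to an additive constant, this gives:
$D(\alpha_n, k_n, \theta, z_n) = -\log \pi(\alpha_n, k_n) + \frac{1}{2} \log \det \Sigma(\alpha_n, k_n) + \frac{1}{2} [z_n - \mu(\alpha_n, k_n)]^T \Sigma(\alpha_n, k_n)^{-1} [z_n - \mu(\alpha_n, k_n)]$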
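A quick numerical check of the data term, simplified to the one-dimensional (grey-level) case so that the covariance reduces to a single variance. This is a sketch under that assumption; the component values are hypothetical. It confirms that $D$ differs from the exact negative log of weight times density only by the constant $\frac{1}{2}\log 2\pi$.

```python
import math

def gaussian_pdf(z, mean, var):
    """1-D Gaussian density."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def d_term(z, weight, mean, var):
    """D = -log(weight) + 1/2 log(var) + 1/2 (z - mean)^2 / var (constant dropped)."""
    return -math.log(weight) + 0.5 * math.log(var) + 0.5 * (z - mean) ** 2 / var

# Hypothetical component: weight 0.3, mean 100, variance 400.
z, weight, mean, var = 120.0, 0.3, 100.0, 400.0
exact = -math.log(weight * gaussian_pdf(z, mean, var))
constant = 0.5 * math.log(2.0 * math.pi)
difference = exact - d_term(z, weight, mean, var)
```

Because the dropped constant is the same for every pixel and component, it does not change which labelling minimises the energy.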

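Smoothness term
$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z)$
$V(\alpha, z) = \gamma \sum_{(m,n) \in C} dis(m,n)^{-1} [\alpha_n \neq \alpha_m] \exp(-\beta \|z_m - z_n\|^2)$
$V$ is unchanged from the grey-level case except for the pixel distance, which is now computed between colour vectors.
Our aim is to get $\theta$:
$\theta = \{\pi(\alpha, k), \mu(\alpha, k), \Sigma(\alpha, k);\ \alpha = 0, 1;\ k = 1 \dots K\}$
where $\pi$ are the weights, $\mu$ the means and $\Sigma$ the covariances; $\alpha$ is the opacity and $k$ indexes the GMM components.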

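Method: EM algorithm
Initialisation
The user supplies only the background region $T_B$; the foreground region is empty: $T_F = \emptyset$.
The unknown region is the complement of the background: $T_U = \overline{T_B}$.
Background: $\alpha_n = 0$ if $n \in T_B$.
Initial foreground: $\alpha_n = 1$ if $n \in T_U$; these values are updated during the iteration.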
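The initialisation can be sketched as follows, assuming the user input is a rectangle given as `(top, left, bottom, right)` with exclusive bottom and right bounds. That convention and the function name are assumptions of this sketch.

```python
def init_from_rectangle(height, width, rect):
    """Return (alpha, T_B, T_U) for a user rectangle; T_F starts empty."""
    top, left, bottom, right = rect
    alpha = [[0] * width for _ in range(height)]
    t_b, t_u = set(), set()
    for r in range(height):
        for c in range(width):
            if top <= r < bottom and left <= c < right:
                alpha[r][c] = 1   # inside the rectangle: initial foreground
                t_u.add((r, c))
            else:
                t_b.add((r, c))   # outside the rectangle: fixed background
    return alpha, t_b, t_u

# Hypothetical 4 x 5 image with a 2 x 3 rectangle.
alpha, t_b, t_u = init_from_rectangle(4, 5, (1, 1, 3, 4))
```

Only the pixels in `t_u` may change label later; the pixels in `t_b` stay background throughout.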

Initialisation
Initialise $k$ using k-means clustering, so that each pixel belongs to one GMM component.
Initialise $\pi$, $\mu$, $\Sigma$ for the GMM components.

$k_n := \arg\min_{k_n} D_n(\alpha_n, k_n, \theta, z_n)$  (1)
$z_n$ is a pixel of the image array.
Each pixel is already assigned to foreground or background, so $\alpha_n$ is known.
$\theta$ has been initialised.
Our aim: $k_n$
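Step (1) can be sketched for grey-level pixels, where each component is a hypothetical `(weight, mean, variance)` tuple and components are indexed from 0 rather than 1. The one-dimensional simplification and all numeric values are assumptions of this sketch.

```python
import math

def d_term(z, weight, mean, var):
    """1-D version of D: -log(weight) + 1/2 log(var) + 1/2 (z - mean)^2 / var."""
    return -math.log(weight) + 0.5 * math.log(var) + 0.5 * (z - mean) ** 2 / var

def assign_components(z, alpha, theta):
    """For each pixel, pick the component of its own label's GMM with the lowest D."""
    assignments = []
    for zn, an in zip(z, alpha):
        costs = [d_term(zn, *component) for component in theta[an]]
        assignments.append(costs.index(min(costs)))
    return assignments

# Hypothetical two-component GMMs for background (0) and foreground (1).
theta = {0: [(0.5, 10.0, 25.0), (0.5, 60.0, 25.0)],
         1: [(0.5, 150.0, 25.0), (0.5, 220.0, 25.0)]}
k = assign_components([12.0, 58.0, 145.0, 230.0], [0, 0, 1, 1], theta)
```

Each pixel is matched only against the components of the GMM selected by its current $\alpha_n$.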

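Learning GMM parameters
Learn the GMM parameters from the data $z$:
$\theta := \arg\min_\theta U(\alpha, k, \theta, z)$  (2)
GMM parameters (shown for the foreground, $\alpha = 1$):
$\pi(\alpha = 1, k) = |F(k)| / \sum_k |F(k)|$
$\mu(\alpha = 1, k) = \mathrm{mean}\{z_n : n \in F(k)\}$
$\Sigma(\alpha = 1, k) = \mathrm{cov}\{z_n : n \in F(k)\}$
where $F(k)$ is the set of foreground pixels assigned to component $k$.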
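A minimal sketch of step (2) for colour pixels, computing the weight, mean and covariance of each component from the pixels assigned to it. The covariance is normalised by the number of pixels, and empty components are simply skipped. Both choices are assumptions of this sketch, and a practical implementation would also have to guard against singular covariances.

```python
def mean_vector(points):
    """Per-channel mean of a list of colour vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def covariance_matrix(points, mean):
    """Full covariance matrix, normalised by the number of points."""
    n, d = len(points), len(mean)
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
             for j in range(d)] for i in range(d)]

def learn_gmm(pixels, components, num_components):
    """Return {k: (weight, mean, covariance)} for the pixels of one label."""
    params = {}
    for k in range(num_components):
        members = [p for p, kn in zip(pixels, components) if kn == k]
        if not members:  # empty component: skipped in this sketch
            continue
        mean = mean_vector(members)
        params[k] = (len(members) / len(pixels), mean, covariance_matrix(members, mean))
    return params

# Hypothetical foreground pixels (RGB) and their component assignments.
foreground = [(200, 10, 10), (210, 20, 10), (10, 200, 30), (30, 220, 30)]
params = learn_gmm(foreground, [0, 0, 1, 1], num_components=2)
```

The same function would be called a second time on the background pixels to obtain the $\alpha = 0$ parameters.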

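Estimate segmentation
$\min_k E(\alpha, k, \theta, z) = \sum_n \min_{k_n} D(\alpha_n, k_n, \theta, z_n) + V(\alpha, z)$  (3)
Estimate the segmentation using min cut:
$\min_{\{\alpha_n : n \in T_U\}} \min_k E(\alpha, k, \theta, z) = \min_{\{\alpha_n : n \in T_U\}} \left[ \sum_n D(\alpha_n, \theta, z_n) + V(\alpha, z) \right]$
where $D(\alpha_n, \theta, z_n) = \min_{k_n} D(\alpha_n, k_n, \theta, z_n)$.
Repeat the above steps (1), (2), (3) until convergence.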

Optimization result

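Border Matting
Our aim: to produce continuous $\alpha_n$ values along the boundary.
Define a contour $C$ (from the previous segmentation).
Recompute $T_B$, $T_U$, $T_F$ near the contour.
Calculate $\alpha_n$ for $n \in T_U$:
$\alpha_n = g(r_n; \Delta_{t(n)}, \sigma_{t(n)})$
where $r_n$ is the distance from the contour, $\Delta$ is the centre and $\sigma$ is the width of the transition.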
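The slides do not give the exact shape of $g$, only that it has a centre and a width. As an illustration, the sketch below assumes a logistic soft step and a signed distance $r$ that is positive towards the foreground. Both are assumptions of this sketch.

```python
import math

def soft_step(r, delta, sigma):
    """alpha = g(r; delta, sigma): near 0 far outside, near 1 far inside, 0.5 at r = delta."""
    x = (r - delta) / sigma
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # numerically stable branch for negative x
    return e / (1.0 + e)

# Hypothetical profile across the border: centre 0, width 1.
profile = [soft_step(r, delta=0.0, sigma=1.0) for r in (-5.0, -1.0, 0.0, 1.0, 5.0)]
```

A larger `sigma` spreads the transition over more pixels, and `delta` shifts it relative to the contour.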

Border Matting
$E = \sum_{n \in T_U} D_n(\alpha_n) + \sum_{t=1}^{T} V(\Delta_t, \sigma_t, \Delta_{t+1}, \sigma_{t+1})$
The first sum is the data term and the second sum is the smoothness term.
A dynamic programming (DP) algorithm is used to minimise $E$.

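Border Matting
Smoothness term:
$V(\Delta, \sigma, \Delta', \sigma') = \lambda_1 (\Delta - \Delta')^2 + \lambda_2 (\sigma - \sigma')^2$
Data term:
$D_n(\alpha_n) = -\log N(z_n; \mu_{t(n)}(\alpha_n), \Sigma_{t(n)}(\alpha_n))$
where $N$ is the Gaussian probability density with mean $\mu$ and covariance $\Sigma$:
$\mu_t(\alpha) = (1 - \alpha)\, \mu_t(0) + \alpha\, \mu_t(1)$
$\Sigma_t(\alpha) = (1 - \alpha)^2\, \Sigma_t(0) + \alpha^2\, \Sigma_t(1)$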
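The two border-matting terms can be sketched for grey-level pixels, where the covariances reduce to variances. The one-dimensional simplification, the function names and all numeric values (including the `lambda1` and `lambda2` passed in the tests) are illustrative assumptions.

```python
import math

def blended_gaussian(alpha, mean_bg, var_bg, mean_fg, var_fg):
    """Mean and variance of the mixed pixel model for opacity alpha."""
    mean = (1.0 - alpha) * mean_bg + alpha * mean_fg
    var = (1.0 - alpha) ** 2 * var_bg + alpha ** 2 * var_fg
    return mean, var

def matting_data_term(z, alpha, mean_bg, var_bg, mean_fg, var_fg):
    """D_n(alpha_n) = -log N(z; mean(alpha), var(alpha)) in one dimension."""
    mean, var = blended_gaussian(alpha, mean_bg, var_bg, mean_fg, var_fg)
    return 0.5 * math.log(2.0 * math.pi * var) + 0.5 * (z - mean) ** 2 / var

def matting_smoothness(delta, sigma, delta_next, sigma_next, lambda1, lambda2):
    """V penalises changes of centre and width between neighbouring contour positions."""
    return lambda1 * (delta - delta_next) ** 2 + lambda2 * (sigma - sigma_next) ** 2

# Hypothetical local models: background around 50, foreground around 200.
models = (50.0, 4.0, 200.0, 9.0)
half = matting_data_term(125.0, 0.5, *models)   # pixel halfway between the two means
hard = matting_data_term(125.0, 0.0, *models)   # same pixel forced to pure background
```

A pixel whose value lies between the two local means is explained far better by an intermediate opacity than by a hard label.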

Foreground estimation
Foreground pixel colours are estimated so that they are not taken from the background (Bayes matte), so GrabCut has no background colour bleeding.
Figure: comparison of methods for border matting.

Result

Result
More difficult situations

Result

Failure situations
Regions of low contrast (reduced $V$ penalty)
Camouflage, with overlap between the foreground and background distributions
Background material inside the user rectangle that is important to the overall background distribution

Conclusion
GrabCut can cope with moderately difficult images using simple user interaction.
It obtains a hard segmentation by iteration.
It uses border matting to make the hard boundary smoother.

Q&A
Q: What does $\theta$ mean?
A: It means the two histograms $h(z; 0)$ and $h(z; 1)$; each $h(z; \alpha)$ outputs a probability.
Q: Why does GrabCut use GMMs instead of the original histograms?
A: An image in colour space has 3 channels, so the histograms are too sparse to use. GrabCut therefore proposes a more intuitive model, the GMM, to replace the histograms.
Q: What are the requirements when drawing a rectangle on the image?
A: We need to ensure the background is outside the rectangle. As emphasised in the failure situations, the background distribution needs abundant information.
Q: In GrabCut, $\alpha$ is uncertain. How do we solve for $\alpha$?
A: We set an initial $\alpha$ first and then iterate with the EM method. During the iteration we obtain an optimised $\alpha$.

Thank you
