Real-time Background Cut Presented by: Alon Rubin, Shira Kritchman Weizmann Institute of Science
The Challenge Real-time bilayer segmentation of video –Fully automatic –High quality –Robustness to changing background –And yet – efficient!
Application Background substitution in video conferencing Privacy Coolness
Example
Overview General method 2 binocular (stereo) algorithms 2 monocular algorithms
How to Segment?
Information –Colour –Contrast –Disparity –Motion
How to Segment? Information Prior assumptions –Spatial coherence –Temporal coherence
How to Segment? Information Prior assumptions Notation: –a point in a 3D colour space –a segmentation label (F/B) –a set of neighbours
Combining Cues Need: mathematical formulation Prior assumptions –Spatial coherence –Temporal coherence –Priors over disparity Informative cues –Colour –Contrast –Stereo –Motion ?
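Combining Cues Probabilistic framework: –Maximize the posterior p(x | I) –Bayes’ law: p(x | I) = p(I | x) p(x) / p(I), i.e. likelihood × prior / constant –Gibbs energy: E(x) = -log p(I | x) - log p(x) –Maximizing probability ⇔ minimizing energy –Dynamic programming / Graph cut x – labels, I – data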
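The equivalence between maximizing the posterior and minimizing a Gibbs energy can be checked on a toy case. The sketch below uses made-up prior and likelihood values for a single pixel (they are illustrative assumptions, not numbers from the talk) and confirms that both criteria pick the same label.

```python
import math

def gibbs_energy(prior, likelihood):
    """E(x) = -log p(I|x) - log p(x); the constant p(I) is dropped."""
    return -math.log(likelihood) - math.log(prior)

# Hypothetical toy numbers for one pixel with two candidate labels.
candidates = {
    "F": {"prior": 0.3, "likelihood": 0.8},
    "B": {"prior": 0.7, "likelihood": 0.1},
}

# Posterior up to the constant 1 / p(I).
posterior = {k: v["prior"] * v["likelihood"] for k, v in candidates.items()}
energy = {k: gibbs_energy(v["prior"], v["likelihood"]) for k, v in candidates.items()}

map_label = max(posterior, key=posterior.get)
min_energy_label = min(energy, key=energy.get)
print(map_label, min_energy_label)
```

Because the logarithm is monotonic, the ordering of labels by posterior is exactly the reverse of their ordering by energy.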
General energy: a colour term plus a spatial coherence + contrast term
Modeling Colour Global VS Local Initial VS Dynamic Lots of blue! This is white!
Modeling Colour – Globally –Histograms: overlearning –GMMs (Gaussian Mixture Models): number of components –Learning with EM (iterations): initialization parameters, stopping condition, time consuming
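To make the EM concerns on this slide concrete (initialization, iteration count, cost), here is a minimal one-dimensional GMM fit. It is a sketch under simplifying assumptions: scalar intensities instead of 3D colours, a fixed number of iterations as the stopping condition, and a hypothetical `min_var` floor to keep variances from collapsing.

```python
import math

def gauss(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, init_means, n_iter=10, min_var=1e-3):
    """Fit a 1D GMM with EM. Returns (weights, means, variances)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [weights[j] * gauss(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(min_var, var)
    return weights, means, variances

# Two clusters of toy intensities, with deliberately rough initial means.
samples = [0.9, 1.0, 1.1, 1.0, 4.9, 5.0, 5.1, 5.0]
weights, means, variances = fit_gmm_1d(samples, init_means=[0.0, 6.0])
print(weights, means, variances)
```

Every EM iteration touches every sample once per component, which is why the slide flags the method as time consuming for per-frame use.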
Modeling Contrast and Spatial Coherence –Spatial coherence: penalty for label changes between neighbours –Contrast: inhibits the penalty [Figure: segmentation maps (black: foreground, white: background); 22 pixels, penalty 72 vs. 22 pixels, penalty 21]
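One common way to write such a contrast-sensitive pairwise term is a constant penalty for a label change, damped exponentially by the colour difference across the boundary. The sketch below assumes that form; the parameter names `gamma` and `beta` and the toy colours are illustrative and not taken from the slides.

```python
import math

def pairwise_penalty(label_p, label_q, colour_p, colour_q, gamma=1.0, beta=0.0):
    """Penalty for neighbouring pixels p and q.

    No penalty if the labels agree. Otherwise the penalty is gamma,
    reduced by exp(-beta * ||colour_p - colour_q||^2) so that label
    changes across strong edges are cheap. beta=0 disables contrast.
    """
    if label_p == label_q:
        return 0.0
    sq_dist = sum((a - b) ** 2 for a, b in zip(colour_p, colour_q))
    return gamma * math.exp(-beta * sq_dist)

# A label change inside a flat region versus across a strong edge.
flat = pairwise_penalty("F", "B", (0.5, 0.5, 0.5), (0.5, 0.5, 0.5), beta=5.0)
edge = pairwise_penalty("F", "B", (0.1, 0.1, 0.1), (0.9, 0.9, 0.9), beta=5.0)
print(flat, edge)
```

With this term a boundary of the same length costs less when it follows image edges, which matches the two equal-length boundaries with different penalties in the figure.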
Algorithms Review –Stereo Bilayer Segmentation: "Probabilistic fusion of stereo with color and contrast for bi-layer segmentation", V. Kolmogorov et al., CVPR. Represents two stereo algorithms: LDP (Layered Dynamic Programming) and LGC (Layered Graph Cut) –Background Cut: "Background Cut", J. Sun et al., ECCV 2006, to appear –Temporal Bilayer Segmentation: "Bilayer Segmentation of Live Video", A. Criminisi et al., CVPR 2006, to appear
Stereo Bilayer Segmentation Information: –Colour –Contrast –Stereo Prior: –Spatial coherence –Disparity coherence –Disparity-labeling relations Stereo Bilayer Segmentation
Notations F – foreground B – background O – Occluded – disparity vector Stereo Bilayer Segmentation
Want to find Stereo Bilayer Segmentation
Want to find This is intractable! Stereo Bilayer Segmentation
Dealing with Intractability LDP – Layered Dynamic Programming –Defining a similar problem –Separating to scanlines –Solving with dynamic programming Stereo Bilayer Segmentation
Dealing with Intractability LGC – Layered Graph Cut –Relaxing some dependencies on disparity –Marginalizing over –Solving with graph cut Stereo Bilayer Segmentation
Energy Function Define a Gibbs energy Model it as Prior: Spatial Coherence + contrast Likelihood: Matching Likelihood: Colour Stereo Bilayer Segmentation
The Prior V Sum of binary and unary potentials: F: –Spatial coherence –Contrast dependency Stereo Bilayer Segmentation
The Prior V Recall: F: –F,B,O –Sophisticated switch Stereo Bilayer Segmentation
The Prior V Recall: V*: –ε=0 same equation –ε=1 dilution –ε→∞ no use of contrast Stereo Bilayer Segmentation
The Prior V Sum of unary and binary potentials: G: –Higher disparities in foreground –Based on a threshold –Uniform penalty Stereo Bilayer Segmentation
Likelihood for Matching Distinguish Matched (F,B) from Occluded (O) Determine disparity Model as - balance between occlusion and bad matches Stereo Bilayer Segmentation
Likelihood for Matching N measures the quality of match between patches –Classical SSD –Additive + multiplicative normalization for robustness –NSSD: Stereo Bilayer Segmentation
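The NSSD formula itself did not survive on the slide, so the sketch below assumes one standard normalized form: subtract each patch's mean (additive normalization), then divide the squared difference by the summed patch energies (multiplicative normalization), scaled so that the score lies in [0, 1]. The function names are illustrative.

```python
def ssd(left, right):
    """Classical sum of squared differences between two patches."""
    return sum((l - r) ** 2 for l, r in zip(left, right))

def nssd(left, right):
    """Normalized SSD in [0, 1] (assumed form, see lead-in).

    0 means a perfect match up to a brightness offset,
    1 means perfectly anti-correlated patches.
    """
    mean_l = sum(left) / len(left)
    mean_r = sum(right) / len(right)
    lc = [l - mean_l for l in left]
    rc = [r - mean_r for r in right]
    denom = sum(l * l for l in lc) + sum(r * r for r in rc)
    if denom == 0:
        return 0.0  # two flat patches: treated as a match (assumption)
    return 0.5 * ssd(lc, rc) / denom

patch = [1.0, 2.0, 3.0, 4.0]
brighter = [v + 10.0 for v in patch]      # same pattern, offset brightness
print(ssd(patch, brighter), nssd(patch, brighter))
print(nssd(patch, list(reversed(patch))))
```

The plain SSD is large for the brightness-shifted patch even though the pattern is identical, while the normalized score stays at zero. A bounded score is also convenient for the discretization step used when the likelihood is learned from labeled data.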
Likelihood for Matching Balance between occlusion and bad matches: Preference for occlusion Stereo Bilayer Segmentation
Likelihood for Colour GMMs for Foreground and Background –20 mixture components Learn from previous frames Learn using EM –10 iterations Stereo Bilayer Segmentation
Likelihood for Colour Model as: Too strong Balancing factor Stereo Bilayer Segmentation
Fusion of Cues [Panels: Colour, Fusion, Stereo] Stereo Bilayer Segmentation
LDP Layered Dynamic Programming Want: separation to scanlines Recall: V – Sum on neighbouring pixels Use only horizontal cliques Work on scanlines Prior: Spatial Coherence + contrast Stereo Bilayer Segmentation - LDP
LDP Classical DP Diagonal: matched Vertical: occluded Horizontal: occluded Stereo Bilayer Segmentation - LDP
LDP Layered DP Alternates between matched and occluded! Stereo Bilayer Segmentation - LDP
LDP Layered DP The whole line is matched! Stereo Bilayer Segmentation - LDP
LDP Layered DP No diagonal moves Vertical: matched or occluded Horizontal: matched or occluded
LDP 4-State Space Many parameters: learn parameters from labeled data –mean width of matched region –mean width of occluded region –a: viewing geometry considerations –a0, c: normalization Stereo Bilayer Segmentation - LDP
LDP 6-State Space Stereo Bilayer Segmentation - LDP Solve with dynamic programming!
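The scanline idea can be illustrated with a much smaller state space than LDP's. The sketch below is a plain two-label (F/B) Viterbi recursion over one scanline with per-pixel costs and a constant cost for switching labels. It is a simplified stand-in, not the 4-state or 6-state model from the paper, and all numbers are made up.

```python
def segment_scanline(unary, switch_cost):
    """Minimum-energy F/B labelling of one scanline by dynamic programming.

    unary: list of (cost_F, cost_B) per pixel.
    switch_cost: penalty whenever neighbouring pixels get different labels.
    """
    labels = "FB"
    cost = [list(unary[0])]   # cost[i][k]: best energy of pixels 0..i ending in label k
    back = []                 # back[i-1][k]: best previous label for label k at pixel i
    for i in range(1, len(unary)):
        row, ptr = [], []
        for k in range(2):
            def total(j, k=k):
                return cost[-1][j] + (switch_cost if j != k else 0.0)
            best_prev = min(range(2), key=total)
            row.append(total(best_prev) + unary[i][k])
            ptr.append(best_prev)
        cost.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final state.
    k = min(range(2), key=lambda j: cost[-1][j])
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    return "".join(labels[k] for k in reversed(path))

# Pixel 2 weakly prefers B, but it is isolated inside a foreground run.
costs = [(0, 1), (0, 1), (1, 0.6), (0, 1), (1, 0), (1, 0)]
print(segment_scanline(costs, switch_cost=0.0))
print(segment_scanline(costs, switch_cost=1.0))
```

Without a switching cost every pixel takes its own cheapest label. With the cost, the isolated weak preference is smoothed away. Each scanline is solved independently, which is what the restriction to horizontal cliques buys.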
LGC – Layered Graph Cut Does not solve for disparity Minimizes Marginalize over disparities: Stereo Bilayer Segmentation - LGC X
LGC Expansion move algorithm Stereo Bilayer Segmentation - LGC
LGC Expansion move algorithm with savings: –Only 3 Labels –Only 2 iterations : Initialize with B for all pixels Run F-expansion Run O-expansion on a constrained region Stereo Bilayer Segmentation - LGC
Results [Panels: LGC, LDP] Stereo Bilayer Segmentation
Results (LGC) Stereo Bilayer Segmentation
Results – Errors Stereo Bilayer Segmentation
Quantitative Results Hand-labeled ground truth (every 5th/10th frame) Percentage of misclassified pixels Stereo Bilayer Segmentation
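The error measure is simply the share of pixels whose label disagrees with the hand-labeled mask. A minimal sketch, assuming both masks are given as equally sized 2D lists of labels (the toy masks are illustrative):

```python
def error_rate_percent(predicted, ground_truth):
    """Percentage of pixels whose predicted label differs from the ground truth."""
    total = 0
    wrong = 0
    for row_p, row_g in zip(predicted, ground_truth):
        for p, g in zip(row_p, row_g):
            total += 1
            wrong += p != g
    return 100.0 * wrong / total

truth = ["FFBB", "FFBB"]
guess = ["FFBB", "FBBB"]
print(error_rate_percent(guess, truth))
```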
Quantitative Results Stereo Bilayer Segmentation
Quantitative Results Computation times: around 10 fps at 320 × 240 resolution, on a conventional 3 GHz processor Stereo Bilayer Segmentation
Results Stereo Bilayer Segmentation
Stereo Segmentation – Summary 2 algorithms: LGC and LDP Require binocular configuration Temporal relations are implicit Stereo cues are very strong Stereo Bilayer Segmentation
Background Cut Information: –Colour –Contrast –Initialization phase Prior: –Spatial coherence Background Cut
- = Most Efficient Approach: Background Subtraction Background Cut
Problems: Foreground-Background similarity Sensitive threshold Most Efficient Approach: Background Subtraction Background Cut
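Both problems show up even on a tiny example. The sketch below does per-pixel background subtraction on grey values with a single global threshold; the frame, background and threshold values are made up for illustration.

```python
def subtract_background(frame, background, threshold):
    """Mark a pixel as foreground when it differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

background = [[10, 10, 10], [10, 10, 10]]
# 200 and 90 are foreground pixels; 12 is background with a little noise.
frame = [[10, 10, 200], [10, 90, 12]]

print(subtract_background(frame, background, threshold=50))
print(subtract_background(frame, background, threshold=100))  # misses the pixel at 90
print(subtract_background(frame, background, threshold=1))    # flags the noisy pixel
```

A high threshold drops foreground that resembles the background, and a low one lets noise through. No single value is right for every pixel, which motivates replacing the hard threshold with an energy minimized by min-cut.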
Spatial coherence Colour model Background maintenance (Minimize by min-cut) Background Cut
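For a two-label energy with per-pixel costs and non-negative pairwise penalties, the minimum can be found exactly as a minimum s-t cut. The sketch below builds such a graph and solves it with a plain Edmonds-Karp max-flow written out in full. This is a generic textbook solver chosen for self-containment, not the optimized graph-cut code a real-time system would use, and the costs are illustrative.

```python
from collections import deque

def min_cut_labels(unary, pairwise):
    """Exact minimizer of sum(unary) + sum(pairwise * [labels differ]).

    unary: list of (cost_F, cost_B) per pixel.
    pairwise: dict {(i, j): weight} over neighbouring pixel indices.
    Returns a list of 'F' / 'B' labels.
    """
    n = len(unary)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i, (cost_f, cost_b) in enumerate(unary):
        cap[s][i] = cost_b   # cut if pixel i ends on the sink side (label B)
        cap[i][t] = cost_f   # cut if pixel i stays on the source side (label F)
    for (i, j), w in pairwise.items():
        cap[i][j] += w
        cap[j][i] += w
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no path left: parent holds the source side of the cut
        flow, v = float("inf"), t
        while parent[v] is not None:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u
    return ["F" if i in parent else "B" for i in range(n)]

costs = [(0, 1), (0, 1), (1, 0.6), (0, 1), (1, 0), (1, 0)]
chain = {(i, i + 1): 1.0 for i in range(len(costs) - 1)}
print("".join(min_cut_labels(costs, chain)))
```

The example uses a 1D chain of pixels for brevity. The same construction applies to a 2D grid by listing 4- or 8-neighbour pairs in `pairwise`.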
Basic Model – Colour Term Background: global and local –Global: GMM –Local: single Gaussian –Combination: Background Cut
Basic Model – Colour Term Foreground colour model: 5-component GMM Background Cut
Basic Model – Colour Term Background Cut
Colour Term Adaptive mixture global-local colour model Background Cut
Colour Term Adaptive mixture global-local colour model How can we quantify the difference? Kullback-Leibler divergence Background Cut
Colour Term Kullback-Leibler divergence quantifies the difference between two GMMs Background Cut
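The KL divergence between two Gaussians has a closed form, but between two mixtures it does not, so some approximation is needed. The sketch below shows the closed form for univariate Gaussians and one matching-based approximation for mixtures, where each foreground component is compared with its closest background component. That approximation and the toy mixtures are assumptions for illustration; the slide's own formula was not preserved.

```python
import math

def kl_gauss(m0, v0, m1, v1):
    """KL(N(m0, v0) || N(m1, v1)) for univariate Gaussians (v = variance)."""
    return 0.5 * (math.log(v1 / v0) + (v0 + (m0 - m1) ** 2) / v1 - 1.0)

def kl_gmm_approx(fg, bg):
    """Matching-based approximation of KL(fg || bg).

    fg, bg: lists of (weight, mean, variance) components.
    """
    return sum(
        wf * min(kl_gauss(mf, vf, mb, vb) + math.log(wf / wb) for wb, mb, vb in bg)
        for wf, mf, vf in fg
    )

background = [(0.5, 0.0, 1.0), (0.5, 10.0, 1.0)]
similar_fg = [(0.5, 0.5, 1.0), (0.5, 10.0, 1.0)]
distinct_fg = [(0.5, 4.0, 1.0), (0.5, 6.0, 1.0)]

print(kl_gmm_approx(similar_fg, background))
print(kl_gmm_approx(distinct_fg, background))
```

A small divergence means foreground and background colours are hard to tell apart globally, which is the situation where leaning on the local per-pixel model helps most.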
Colour Term Only Global Equally Local and Global Background Cut
Colour Term – Summary Background Cut
Basic Model – Contrast Term Penalty term + Penalty inhibition Background Cut
Contrast Term Background contrast attenuation Background Cut
Contrast Term Foreground boundaries Background contrast Clues: Comparison to original background contrast Background Cut
Contrast Term Over attenuation of boundaries! Background Cut
Contrast Term Clues: –Comparison to original background contrast –Difference from original background Background Cut
Contrast Term – Summary +Background contrast +Background colour +Background contrast Background Cut
Contrast Term – Summary Background Cut
Background Maintenance Sudden illuminance change Auto gain control Fluorescent lamps Light switching Background Cut
Background Maintenance Minor change Histogram transformation function Major change Colour model rebuilding Background Cut
Background Maintenance Colour model rebuilding Foreground threshold increasing Background uncertainty map initialization Mixture model modification Dynamic updating of and Background Cut
Background Maintenance Problems: –Movement in the background –Sleeping and waking objects –Casual camera shaking Remedies: –Relying on the global model –Keeping the biggest connected component –Background maintenance –Applying Gaussian blurring –Relying less on the local colour model Background Cut
Background Maintenance
Background Cut Background Maintenance
Background Cut Quantitative Results
Computation times: Around fps at 320 × 240 resolution On a conventional 3.2 GHz processor Background Cut
Background Cut – Summary Adaptive mixture global-local colour model Background contrast attenuation Background Maintenance Background Cut
Information: –Colour –Contrast –Initialization phase –Motion Prior: –Spatial coherence –Temporal coherence Temporal Bilayer Segmentation
Motion – Notations Basic image features (YUV) Temporal Bilayer Segmentation
Temporal continuity Spatial continuity Colour likelihood Motion likelihood Temporal Bilayer Segmentation
Temporal Coherence 4 pixel types: Not likely: BF → B, FB → F Temporal Bilayer Segmentation
Temporal Coherence 4 pixel types: Temporal Bilayer Segmentation
Spatial Coherence The usual term: Temporal Bilayer Segmentation
Likelihood for Colour GMM (not used): –Number of mixture components –Learning: initialization, convergence, time consuming Histograms (used): –Nonparametric –Smoothed to avoid overlearning Temporal Bilayer Segmentation
Likelihood for Colour Foreground –Learned adaptively from previous frames Background –Learned from initialization phase –Static over time –Only global (Claim: local doesn’t improve much) Temporal Bilayer Segmentation
Likelihood for Motion Optical flow (not used): –Inaccuracies along boundaries –The aperture problem –Expensive Motion/Non-motion classifier (used): –Adaptive –Efficient Temporal Bilayer Segmentation
Motion Classifier Basic features: ? Temporal Bilayer Segmentation
Motion Classifier [Plot] X-axis: |Grad(I)| (spatial gradient) Y-axis: dI/dt (temporal derivative)
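The two axes of the plot can be computed per pixel from a pair of consecutive frames. The sketch below does so with unnormalized central differences for the spatial gradient and an absolute frame difference for the temporal derivative. Both discretizations, and the use of a single grey channel, are assumptions made for illustration.

```python
import math

def motion_features(prev_frame, frame):
    """Per-pixel (spatial gradient magnitude, |temporal derivative|) features.

    Frames are 2D lists of grey values; borders are handled by clamping.
    """
    h, w = len(frame), len(frame[0])
    feats = []
    for y in range(h):
        row = []
        for x in range(w):
            gx = frame[y][min(x + 1, w - 1)] - frame[y][max(x - 1, 0)]
            gy = frame[min(y + 1, h - 1)][x] - frame[max(y - 1, 0)][x]
            temporal = abs(frame[y][x] - prev_frame[y][x])
            row.append((math.hypot(gx, gy), temporal))
        feats.append(row)
    return feats

# A vertical edge that moves one pixel to the left between frames.
previous = [[0, 0, 9, 9] for _ in range(3)]
current = [[0, 9, 9, 9] for _ in range(3)]
features = motion_features(previous, current)
print(features[1])
```

Only pixels near the moving edge get a non-zero temporal derivative. Pixels inside the uniform region look static even though they belong to the moving object, so a classifier on these features leaves gaps that the energy's coherence terms have to fill.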
Testing the Motion Classifier Not sufficient: Must fill in the gaps Temporal Bilayer Segmentation
Minimizing the Energy Want: Instead: Allowing for changes in t-1 Temporal Bilayer Segmentation
Minimizing the Energy Instead: Temporal Bilayer Segmentation
Minimizing the Energy Now minimize using Graph Cut Instead: Temporal Bilayer Segmentation
Results Temporal Bilayer Segmentation
Quantitative Results Hand-labeled ground truth (every 5th/10th frame) Percentage of misclassified pixels Temporal Bilayer Segmentation
Limitations High illuminance changes: failure (2/6 sequences) Recommend: switch off Auto Gain Control Stereo: V, Monocular: X Temporal Bilayer Segmentation
Summary
                        LDP       LGC       Background Cut  Temporal Bilayer Seg.
Colour/Contrast         V         V         V               V
Colour Model            GMMs      GMMs      GMMs            Histograms
Background maintenance  V         V         V               –
Disparities             Explicit  Implicit  –               –
Background Attenuation  –         –         V               –
Motion                  –         –         –               V
Temporal Coherence      –         –         –               V
Another Approach to Background Substitution
Thank You! Special thanks to Dr. Vladimir Kolmogorov and to Eli Shechtman for their assistance. Weizmann Institute of Science
F.A.Q. Alon Rubin, Shira Kritchman Weizmann Institute of Science
Likelihood for Matching Empirical test – Is N discriminative? –Take labeled data –Compute and discretize N –Count matched pixels for each N –Count occluded pixels for each N –Divide Get: Likelihood ratio of matching as a function of N Stereo Bilayer Segmentation
Likelihood for Matching Example: Take N=0.1 –15% of matched pixels have N=0.1 –5% of occluded pixels have N=0.1 => likelihood ratio for N=0.1 is 15% / 5% = 3 Get: Likelihood ratio of matching as a function of N Stereo Bilayer Segmentation
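The counting procedure (discretize N, count matched and occluded pixels per bin, divide) can be sketched as follows. The bin count, the 'M'/'O' label encoding and the synthetic samples are illustrative assumptions; the samples are built so that the bin around N=0.1 reproduces the 15% versus 5% example.

```python
def likelihood_ratio_by_bin(samples, n_bins=10):
    """Empirical likelihood ratio p(N | matched) / p(N | occluded) per bin.

    samples: list of (N, label) with N in [0, 1] and label 'M' or 'O'.
    Bins lacking either matched or occluded samples are skipped.
    """
    matched = [0] * n_bins
    occluded = [0] * n_bins
    for n, label in samples:
        b = min(int(n * n_bins), n_bins - 1)
        if label == "M":
            matched[b] += 1
        else:
            occluded[b] += 1
    total_m, total_o = sum(matched), sum(occluded)
    ratios = {}
    for b in range(n_bins):
        if matched[b] and occluded[b]:
            ratios[b] = (matched[b] / total_m) / (occluded[b] / total_o)
    return ratios

# 15 of 100 matched pixels and 5 of 100 occluded pixels fall in bin [0.1, 0.2).
data = (
    [(0.15, "M")] * 15 + [(0.05, "M")] * 85
    + [(0.15, "O")] * 5 + [(0.95, "O")] * 95
)
ratios = likelihood_ratio_by_bin(data)
print(ratios)
```

The ratio for that bin comes out at 3 up to floating-point rounding. Taking its logarithm gives the per-bin values plotted as the log-likelihood ratio.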
Likelihood for Matching Stereo Bilayer Segmentation Empirical results –X axis: Discretized values of N –Y axis: Log-likelihood ratio
Quantitative Results Hand-labeled ground truth (every 5th/10th frame) Percentage of misclassified pixels Stereo Bilayer Segmentation
Stereo – Prior Parameters LDP: LGC: Working parameters: Stereo Bilayer Segmentation
Contrast Term Background Cut Alternative suggestion: Inter-label attenuation
Dynamic updating of and Background Cut
Kullback-Leibler Divergence Background Cut
VS Basic Model Background Cut
Background Maintenance Background Cut
Temporal Coherence Why 2 nd order? Temporal Bilayer Segmentation
Motion Classifier Why use spatial derivatives? Temporal Bilayer Segmentation
Minimizing the Energy Instead: Temporal Bilayer Segmentation
Results Temporal Bilayer Segmentation
Results With colour Without colour Temporal Bilayer Segmentation