Real-time Background Cut. Alon Rubin, Shira Kritchman. Presented: 7.5.2006, Weizmann Institute of Science.

Presentation transcript:

The Challenge
Real-time bilayer segmentation of video
–Fully automatic
–High quality
–Robustness to changing background
–And yet, efficient!

Application
Background substitution in video conferencing
–Privacy
–Coolness

Example

Overview
–General method
–2 binocular (stereo) algorithms
–2 monocular algorithms

How to Segment?
Information
–Colour
–Contrast
–Disparity
–Motion
Prior assumptions
–Spatial coherence
–Temporal coherence
Notation
–Point in a 3D colour space
–Segmentation label (F/B)
–Set of neighbours

Combining Cues
Need: a mathematical formulation
Prior assumptions
–Spatial coherence
–Temporal coherence
–Priors over disparity
Informative cues
–Colour
–Contrast
–Stereo
–Motion

Combining Cues
Probabilistic framework:
–Maximizing the posterior given by Bayes' law: posterior = likelihood × prior / constant (x: labels, I: data)
–Gibbs energy: maximizing probability ⇔ minimizing energy
–Solved with dynamic programming / graph cut
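
The slide's own formula is not preserved in the transcript. The standard correspondence it refers to can be written as below, using the slide's x for labels and I for data. The symbol E for the Gibbs energy is an assumption of this sketch.

```latex
\begin{align}
P(x \mid I) &= \frac{P(I \mid x)\,P(x)}{P(I)} \\
E(x) &= -\log P(I \mid x) - \log P(x) \\
\hat{x} &= \arg\max_x P(x \mid I) = \arg\min_x E(x)
\end{align}
```

Since P(I) does not depend on x, it only shifts the energy by a constant and can be dropped from the minimization.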

General energy
–Colour term
–Spatial coherence + contrast term

Modeling Colour
–Global vs. local
–Initial vs. dynamic
Figure callouts: "Lots of blue!", "This is white!"

Modeling Colour – Globally
–Histograms
  Overlearning
–GMMs (Gaussian Mixture Models)
  Number of components
  Learning: EM (iterations)
    Initialization parameters
    Stopping condition
    Time consuming
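
The EM learning loop mentioned above can be sketched for a 1-D GMM as follows. This is a minimal illustration, not the presenters' implementation: the function names, the quantile-based initialization, the variance floor and the fixed iteration count used as stopping condition are all assumptions made for the sketch.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, iterations=10):
    """Fit a k-component 1-D GMM with a fixed number of EM iterations."""
    n = len(data)
    ordered = sorted(data)
    # Initialization (assumed): means at evenly spaced quantiles of the data.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, 1e-6)] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances
```

The sketch makes the cost visible: every iteration touches every sample for every component, which is why the slide lists EM as time consuming.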

Modeling Contrast and Spatial Coherence
–Spatial coherence: penalty on neighbouring pixels with different labels
–Contrast: inhibits the penalty
Figure: segmentation maps (black: foreground, white: background). One boundary of 22 pixels has penalty 72, another boundary of 22 pixels has penalty 21.
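
How contrast inhibits the spatial-coherence penalty can be sketched as follows. This is an illustrative contrast-sensitive Potts term on a small grayscale image, not the slide's exact formula: the exponential form, the names `gamma` and `beta`, and the choice of `beta` from the mean squared neighbour difference are assumptions.

```python
import math


def pairwise_penalty(image, labels, gamma=1.0, beta=None):
    """Sum contrast-modulated penalties over 4-connected neighbours with different labels."""
    h, w = len(image), len(image[0])
    pairs = []
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                pairs.append(((y, x), (y, x + 1)))
            if y + 1 < h:
                pairs.append(((y, x), (y + 1, x)))
    sq_diffs = [(image[a[0]][a[1]] - image[b[0]][b[1]]) ** 2 for a, b in pairs]
    if beta is None:
        # Assumed normalization: scale by the mean squared neighbour difference.
        mean_sq = sum(sq_diffs) / len(sq_diffs)
        beta = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0
    total = 0.0
    for (a, b), d in zip(pairs, sq_diffs):
        if labels[a[0]][a[1]] != labels[b[0]][b[1]]:
            # High contrast makes exp(...) small, so the penalty is inhibited.
            total += gamma * math.exp(-beta * d)
    return total
```

Two boundaries of the same length then receive different penalties depending on whether they follow image edges, which is the effect the figure on the slide illustrates.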

Algorithms Review
–Stereo Bilayer Segmentation: "Probabilistic fusion of stereo with color and contrast for bi-layer segmentation", V. Kolmogorov et al., CVPR. Presents two stereo algorithms:
  LDP, Layered Dynamic Programming
  LGC, Layered Graph Cut
–Background Cut: "Background Cut", J. Sun et al., ECCV 2006, to appear
–Temporal Bilayer Segmentation: "Bilayer Segmentation of Live Video", A. Criminisi et al., CVPR 2006, to appear

Stereo Bilayer Segmentation
Information:
–Colour
–Contrast
–Stereo
Prior:
–Spatial coherence
–Disparity coherence
–Disparity-labeling relations

Notations
–F: foreground
–B: background
–O: occluded
–Disparity vector

Want to find: (equation on slide)
This is intractable!

Dealing with Intractability
LDP – Layered Dynamic Programming
–Defining a similar problem
–Separating into scanlines
–Solving with dynamic programming

Dealing with Intractability
LGC – Layered Graph Cut
–Relaxing some dependencies on disparity
–Marginalizing over disparities
–Solving with graph cut

Energy Function
Define a Gibbs energy and model it as a sum of:
–Prior: spatial coherence + contrast
–Likelihood: matching
–Likelihood: colour

The Prior V
Sum of binary and unary potentials
F:
–Spatial coherence
–Contrast dependency

The Prior V
Recall: F:
–F, B, O
–Sophisticated switch

The Prior V
Recall: V*:
–e=0 → same equation
–e=1 → dilution
–e=0 → no use of contrast

The Prior V
Sum of unary and binary potentials
G:
–Higher disparities in foreground
–Based on a threshold
–Uniform penalty

Likelihood for Matching
–Distinguish matched (F, B) from occluded (O)
–Determine disparity
–Model as: (equation on slide), with a parameter that balances occlusion against bad matches

Likelihood for Matching
N measures the quality of the match between patches
–Classical SSD
–Additive + multiplicative normalization → robustness
–NSSD
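
A normalized SSD between two patches can be sketched as below. The slide's exact formula is not preserved, so this is one common form and should be read as an assumption: subtracting each patch mean is the additive normalization, dividing by the summed patch energies is the multiplicative one, and the factor 0.5 keeps the score in [0, 1]. The name `nssd` and the value returned for two flat patches are choices of this sketch.

```python
def nssd(patch_a, patch_b):
    """Normalized SSD between two equally sized patches given as flat lists."""
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    # Additive normalization: remove each patch's mean intensity.
    centred_a = [v - mean_a for v in patch_a]
    centred_b = [v - mean_b for v in patch_b]
    numerator = sum((a - b) ** 2 for a, b in zip(centred_a, centred_b))
    # Multiplicative normalization: divide by the total patch energy.
    denominator = sum(a * a for a in centred_a) + sum(b * b for b in centred_b)
    if denominator == 0:
        return 0.5  # two flat patches carry no matching evidence (assumed convention)
    return 0.5 * numerator / denominator
```

A score near 0 indicates a good match, and a score near 1 indicates patches that vary in opposite directions.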

Likelihood for Matching
Balance between occlusion and bad matches
→ Preference for occlusion

Likelihood for Colour
–GMMs for foreground and background
  20 mixture components
–Learn from previous frames
–Learn using EM
  10 iterations

Likelihood for Colour
Model as: (equation on slide)
Too strong → balancing factor

Fusion of Cues
Figure panels: Colour, Fusion, Stereo

LDP – Layered Dynamic Programming
–Want: separation into scanlines
–Recall: V is a sum over neighbouring pixels
–Use only horizontal cliques → work on scanlines
–Prior: spatial coherence + contrast

LDP – Classical DP
–Diagonal: matched
–Vertical: occluded
–Horizontal: occluded

LDP – Layered DP
Alternates between matched and occluded!

LDP – Layered DP
The whole line is matched!

LDP – Layered DP
–No diagonal moves
–Vertical: matched or occluded
–Horizontal: matched or occluded

LDP – 4-State Space
Many parameters, learned from labeled data:
–Mean width of matched region
–Mean width of occluded region
–a: viewing geometry considerations
–a0, c: normalization

LDP – 6-State Space
Solve with dynamic programming!

LGC – Layered Graph Cut
–Does not solve for disparity
–Marginalizes over disparities
–Minimizes the energy after marginalizing over disparities

LGC
Expansion move algorithm with savings:
–Only 3 labels
–Only 2 iterations:
  Initialize with B for all pixels
  Run F-expansion
  Run O-expansion on a constrained region

Results
Figure panels: LGC, LDP

Results (LGC)

Results – Errors

Quantitative Results
–Hand-labeled ground truth (every 5th/10th frame)
–Percentage of misclassified pixels

Quantitative Results

Quantitative Results
Computation times:
–Around 10 fps at 320 × 240 resolution
–On a conventional 3 GHz processor

Results

Stereo Segmentation – Summary
–2 algorithms: LGC and LDP
–Require binocular configuration
–Temporal relations are implicit
–Stereo cues are very strong

Background Cut
Information:
–Colour
–Contrast
–Initialization phase
Prior:
–Spatial coherence

Most Efficient Approach: Background Subtraction
Figure: current frame − background image = difference image
Problems:
–Foreground-background similarity
–Sensitive threshold
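
Plain background subtraction, and why its threshold is sensitive, can be illustrated with a short sketch. The function name and the Euclidean colour distance are assumptions. Frames are nested lists of (R, G, B) tuples.

```python
def subtract_background(frame, background, threshold):
    """Mark a pixel as foreground (1) when its colour distance to the background exceeds threshold."""
    mask = []
    for frame_row, bg_row in zip(frame, background):
        row = []
        for pixel, bg_pixel in zip(frame_row, bg_row):
            dist_sq = sum((a - b) ** 2 for a, b in zip(pixel, bg_pixel))
            row.append(1 if dist_sq > threshold ** 2 else 0)
        mask.append(row)
    return mask
```

With a background of (100, 100, 100) and a pixel of (110, 100, 100), a threshold of 20 labels the pixel background while a threshold of 5 labels it foreground. A foreground colour close to the background is lost in the same way, which is the similarity problem named on the slide.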

Energy: colour model + spatial coherence (minimize by min-cut)
Colour model → background maintenance
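
Minimizing such an energy by min-cut can be sketched on a single row of pixels. This is a toy illustration, not the solver used in the papers: it runs a plain Edmonds-Karp max-flow on a dense capacity matrix, the pairwise term is a constant Potts weight without contrast, and all names are hypothetical.

```python
from collections import deque


def min_cut_labels(unary, pairwise_weight):
    """Label a 1-D row of pixels "F" or "B" by minimizing unary + Potts energy.

    unary[i] = (cost_background, cost_foreground) for pixel i.
    """
    n = len(unary)
    source, sink = n, n + 1
    capacity = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i, (cost_bg, cost_fg) in enumerate(unary):
        capacity[source][i] = cost_bg  # cut if pixel i ends on the sink side (B)
        capacity[i][sink] = cost_fg    # cut if pixel i stays on the source side (F)
    for i in range(n - 1):
        capacity[i][i + 1] = pairwise_weight
        capacity[i + 1][i] = pairwise_weight

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and capacity[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no path left: the flow is maximal
        flow = float("inf")
        v = sink
        while v != source:
            flow = min(flow, capacity[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            capacity[parent[v]][v] -= flow
            capacity[v][parent[v]] += flow
            v = parent[v]

    # Pixels still reachable from the source form the foreground side of the cut.
    return ["F" if parent[i] != -1 else "B" for i in range(n)]
```

For example, an isolated pixel that weakly prefers background between strongly foreground neighbours keeps its label when `pairwise_weight` is 0 and is absorbed into the foreground when the weight is large enough.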

Basic Model – Colour Term
Background: global and local
–Global: GMM model
–Local: single Gaussian
–Combination: mixture of the global and local models
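
The combination of the two background models can be sketched as a convex mixture. The slide's formula is not preserved, so the form below and the name `alpha` for the mixing factor are assumptions. For brevity the sketch uses 1-D intensities instead of 3-D colours.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def global_likelihood(x, gmm):
    """gmm is a list of (weight, mean, variance) components shared by all pixels."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm)


def mixed_background_likelihood(x, gmm, local_mean, local_var, alpha):
    """Mix the global GMM with the per-pixel Gaussian; alpha weights the global model."""
    p_global = global_likelihood(x, gmm)
    p_local = gaussian_pdf(x, local_mean, local_var)
    return alpha * p_global + (1.0 - alpha) * p_local
```

The per-pixel Gaussian is much sharper than the global model when the background is static, while the global model is less affected when a pixel's own history is unreliable.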

Basic Model – Colour Term
Foreground colour model: 5-component GMM

Colour Term
Adaptive mixture of global and local colour models
–How can we quantify the difference?
–Kullback-Leibler divergence: quantifies the difference between two GMMs
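
One way to turn a divergence between the foreground and background GMMs into a mixing factor is sketched below. The KL divergence between two GMMs has no closed form, so the sketch uses a matching-based approximation built on the closed-form KL between single Gaussians. The approximation, the mapping to `alpha`, the parameter `sigma_kl` and the clamp at zero are assumptions of this sketch, not formulas recovered from the slides.

```python
import math


def kl_gaussian(mean_a, var_a, mean_b, var_b):
    """Closed-form KL(N_a || N_b) for 1-D Gaussians."""
    return 0.5 * (math.log(var_b / var_a) + (var_a + (mean_a - mean_b) ** 2) / var_b - 1.0)


def kl_gmm_approx(fg_gmm, bg_gmm):
    """Approximate KL between GMMs given as lists of (weight, mean, variance)."""
    total = 0.0
    for w_f, m_f, v_f in fg_gmm:
        # Match each foreground component to its closest background component.
        best = min(
            kl_gaussian(m_f, v_f, m_b, v_b) + math.log(w_f / w_b)
            for w_b, m_b, v_b in bg_gmm
        )
        total += w_f * best
    return total


def mixing_factor(fg_gmm, bg_gmm, sigma_kl=1.0):
    """Weight of the global model: 0.5 for identical models, towards 1 for well-separated ones."""
    divergence = max(kl_gmm_approx(fg_gmm, bg_gmm), 0.0)  # the approximation can dip below 0
    return 1.0 - 0.5 * math.exp(-divergence / sigma_kl)
```

Under these assumptions, well-separated foreground and background colours give a factor near 1 (only global), and indistinguishable colours give 0.5 (equally local and global).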

Colour Term
Figure panels: "Only global", "Equally local and global"

Colour Term – Summary

Basic Model – Contrast Term
Penalty term + penalty inhibition

Contrast Term
Background contrast attenuation

Contrast Term
–Foreground boundaries vs. background contrast
–Clue: comparison to original background contrast

Contrast Term
Over-attenuation of boundaries!

Contrast Term
Clues:
–Comparison to original background contrast
–Difference from original background
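
The two clues can be combined in an attenuated contrast of the following kind. This is a sketch under assumptions: the functional form, the parameter names `k` and `sigma_z` and their default values are illustrative, and intensities are scalars rather than colour vectors.

```python
import math


def attenuated_contrast(i_r, i_s, bg_r, bg_s, k=5.0, sigma_z=10.0):
    """Contrast between neighbouring pixels r and s, attenuated where the background has contrast too.

    i_r, i_s: current intensities; bg_r, bg_s: stored background intensities.
    """
    contrast = (i_r - i_s) ** 2
    bg_contrast = (bg_r - bg_s) ** 2
    # Clue 2: how far the current pixels are from the original background.
    z = max(abs(i_r - bg_r), abs(i_s - bg_s))
    # Clue 1: strong background contrast attenuates, unless the pixels have changed.
    attenuation = 1.0 + (bg_contrast / k ** 2) * math.exp(-(z ** 2) / sigma_z)
    return contrast / attenuation
```

An edge that already exists in the background is strongly attenuated, an edge over a flat background is left alone, and an edge where the current pixels differ from the background is preserved, which avoids over-attenuating true foreground boundaries.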

Contrast Term – Summary
+Background contrast
+Background colour
+Background contrast

Contrast Term – Summary

Background Maintenance
Sudden illuminance change
–Auto gain control
–Fluorescent lamps
–Light switching

Background Maintenance
–Minor change: histogram transformation function
–Major change: colour model rebuilding

Background Maintenance
Colour model rebuilding
–Foreground threshold increasing
–Background uncertainty map initialization
–Mixture model modification
–Dynamic updating

Background Maintenance
–Movement in the background
–Sleeping and waking objects
–Casual camera shaking
Handled by:
–Relying on global model
–Keeping biggest connected component
–Background maintenance
–Applying Gaussian blurring
–Using less local colour model

Background Maintenance

Quantitative Results

Computation times:
–Around fps at 320 × 240 resolution
–On a conventional 3.2 GHz processor

Background Cut – Summary
–Adaptive mixture of global and local colour models
–Background contrast attenuation
–Background maintenance

Temporal Bilayer Segmentation
Information:
–Colour
–Contrast
–Initialization phase
–Motion
Prior:
–Spatial coherence
–Temporal coherence

Motion – Notations
Basic image features (YUV)

Energy: temporal continuity + spatial continuity + colour likelihood + motion likelihood

Temporal Coherence
4 pixel types (labels in the two previous frames): FF, FB, BF, BB
Not likely: BF → B, FB → F

Spatial Coherence
The usual term

Likelihood for Colour
–GMM (not used)
  Number of mixture components
  Learning: initialization, convergence, time consuming
–Histograms (used)
  Nonparametric
  Smoothed to avoid overlearning

Likelihood for Colour
–Foreground: learned adaptively from previous frames
–Background: learned from initialization phase, static over time, only global (claim: local doesn't improve much)

Likelihood for Motion
–Optical flow (not used)
  Inaccuracies along boundaries
  The aperture problem
  Expensive
–Motion/non-motion classifier (used)
  Adaptive
  Efficient

Motion Classifier
Basic features:
–X-axis: Grad(I), the spatial gradient
–Y-axis: İ, the temporal derivative of I
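
The two features on the axes can be computed per pixel as in the sketch below. The discretization (central differences with clamped borders for the gradient, absolute frame difference for the temporal derivative) and the function name are assumptions. Frames are grayscale nested lists.

```python
import math


def motion_features(prev_frame, frame):
    """Return, for every pixel, the pair (spatial gradient magnitude, temporal derivative)."""
    h, w = len(frame), len(frame[0])
    features = []
    for y in range(h):
        row = []
        for x in range(w):
            # Central differences, clamping indices at the image border.
            gx = (frame[y][min(x + 1, w - 1)] - frame[y][max(x - 1, 0)]) / 2.0
            gy = (frame[min(y + 1, h - 1)][x] - frame[max(y - 1, 0)][x]) / 2.0
            grad = math.hypot(gx, gy)
            dt = abs(frame[y][x] - prev_frame[y][x])
            row.append((grad, dt))
        features.append(row)
    return features
```

A classifier of the kind named on the slide would then be learned over these two-dimensional feature pairs from labeled moving and static pixels.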

Testing the Motion Classifier
Not sufficient: must fill in the gaps

Minimizing the Energy
–Want: the labelling at time t that minimizes the energy
–Instead: also allow for changes in the labelling at t-1
–Now minimize using graph cut

Results

Quantitative Results
–Hand-labeled ground truth (every 5th/10th frame)
–Percentage of misclassified pixels

Limitations
–High illuminance changes → failure (2/6 sequences)
–Recommend: switch off Auto Gain Control
–Stereo: ✓, monocular: ✗

Summary

                         LDP       LGC       Background Cut  Temporal Bilayer Seg.
Colour/Contrast          ✓         ✓         ✓               ✓
Colour model             GMMs      GMMs      GMMs            Histograms
Background maintenance   ✓         ✓         ✓               –
Disparities              Explicit  Implicit  –               –
Background attenuation   –         –         ✓               –
Motion                   –         –         –               ✓
Temporal coherence       –         –         –               ✓

Another Approach to Background Substitution

Thank You!
Special thanks to Dr. Vladimir Kolmogorov and to Eli Shechtman for their assistance.

F.A.Q.

Likelihood for Matching
Empirical test: is N discriminative?
–Take labeled data
–Compute and discretize N
–Count matched pixels for each N
–Count occluded pixels for each N
–Divide
Get: likelihood ratio of matching as a function of N

Likelihood for Matching
Example: take N=0.1
–15% of matched pixels have N=0.1
–5% of occluded pixels have N=0.1
⇒ likelihood ratio of 15% / 5% = 3 for N=0.1
Get: likelihood ratio of matching as a function of N
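
The counting procedure above can be sketched as follows. The function name, the nearest-bin discretization and the decision to skip bins without occluded samples are assumptions of the sketch.

```python
def likelihood_ratio_by_bin(n_values, is_matched, bin_width=0.1):
    """Likelihood ratio P(N | matched) / P(N | occluded) for each discretized value of N."""
    matched_counts, occluded_counts = {}, {}
    for n, matched in zip(n_values, is_matched):
        bin_index = round(n / bin_width)  # discretize N to the nearest bin
        counts = matched_counts if matched else occluded_counts
        counts[bin_index] = counts.get(bin_index, 0) + 1
    total_matched = sum(matched_counts.values())
    total_occluded = sum(occluded_counts.values())
    ratios = {}
    for bin_index, occluded in occluded_counts.items():
        # Bins with no occluded samples are skipped to avoid dividing by zero.
        matched = matched_counts.get(bin_index, 0)
        ratios[bin_index] = (matched / total_matched) / (occluded / total_occluded)
    return ratios
```

With 3 of 20 matched pixels (15%) and 1 of 20 occluded pixels (5%) falling in the bin for N=0.1, the ratio for that bin is 3, as in the slide's example.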

Likelihood for Matching
Empirical results
–X-axis: discretized values of N
–Y-axis: log-likelihood ratio

Quantitative Results
–Hand-labeled ground truth (every 5th/10th frame)
–Percentage of misclassified pixels

Stereo – Prior Parameters
LDP:
LGC:
Working parameters:

Contrast Term
Alternative suggestion: inter-label attenuation

Dynamic updating

Kullback-Leibler Divergence

Vs. Basic Model

Background Maintenance

Temporal Coherence
Why 2nd order?

Motion Classifier
Why use spatial derivatives?

Minimizing the Energy
Instead:

Results

Results
Figure panels: With colour, Without colour