1
Modeling Pixel Process with Scale Invariant Local Patterns for Background Subtraction in Complex Scenes (CVPR’10) Shengcai Liao, Guoying Zhao, Vili Kellokumpu, Matti Pietikainen, Stan Z. Li
2
Outline: Introduction; Scale Invariant Local Ternary Pattern (SILTP); Background modeling; Multiscale Block-based SILTP; Experimental results; Conclusion
3
Introduction Moving object detection in video sequences is one of the main tasks in many computer vision applications. Its output serves as input to a higher-level process, such as object categorization, tracking or action recognition. Background subtraction is a common approach to this task.
4
Introduction (cont.) The popular idea is to model temporal samples with a multi-modal distribution, in either a parametric or a nonparametric way (parametric: GMM; nonparametric: KDE). GMM is the most popular technique: each pixel is modeled independently using a mixture of Gaussians and updated by an online approximation. It does not handle cast shadows and dynamic scenes well. A number of existing works handle illumination effects such as moving shadows in special color spaces, but the learned parameters are not adaptable.
5
GMM Background Modeling
Initial background model: built from the first N frames of the input sequence using K-means clustering. The history of each pixel, $\{X_1, \ldots, X_t\}$, is modeled by three independent Gaussian distributions, where X is the color value, i.e. $X = (X^R, X^G, X^B)$. For computational reasons, the red, green, and blue pixel values are assumed to be independent. Notes: The initial background model is built from the first N frames. Because these N frames may not show pure background (foreground objects may be moving through them), k-means clustering is used to find the value that persists longest, and that value is taken as the background. The history of each pixel is then modeled with three independent Gaussian distributions, one per RGB color channel, under the assumption that R, G and B are independent.
6
GMM Background Modeling
The probability of observing pixel value $X_t$ is $P(X_t) = \eta(X_t, \mu_t, \Sigma_t)$, where $\mu_t$ and $\Sigma_t$ are the mean value and the covariance matrix of the Gaussian at time t, and η is the Gaussian probability density function. Update of $\mu_t$ and $\Sigma_t$: ρ is the update rate, and its value is between 0 and 1. Notes: The probability of a pixel value $X_t$ can be written in this form because R, G and B are modeled as three independent Gaussian distributions. The update uses the update rate to adjust the mean and the variance.
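The update step can be sketched as follows. This is a minimal sketch that assumes the common running-average form for one Gaussian per color channel; the slide's own update equations did not survive extraction, so the exact form and the names `update_gaussian`, `mean`, `var` and `rho` are illustrative assumptions, not the presenters' code.

```python
def update_gaussian(mean, var, x, rho):
    """Blend a new RGB observation x into per-channel Gaussian parameters.

    Assumed running-average form: the old estimate is weighted by (1 - rho)
    and the new evidence by rho. Channels are treated as independent.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    new_mean = [(1.0 - rho) * m + rho * xi for m, xi in zip(mean, x)]
    # The variance uses the squared deviation from the updated mean.
    new_var = [(1.0 - rho) * v + rho * (xi - m) ** 2
               for v, xi, m in zip(var, x, new_mean)]
    return new_mean, new_var


mean, var = update_gaussian([100.0, 100.0, 100.0],
                            [25.0, 25.0, 25.0],
                            [110.0, 100.0, 90.0],
                            rho=0.5)
```

A small ρ makes the model adapt slowly and stay stable; a large ρ lets it follow scene changes quickly at the cost of absorbing foreground objects sooner.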
7
Introduction (cont.) Besides color-based background modeling, the LBP-based method models each pixel by a group of LBP histograms. However, it does not handle moving shadows well and is sensitive to noise. In this paper, the authors extend LBP to a scale invariant local ternary pattern (SILTP) operator to handle illumination variations. They propose a Pattern KDE technique to effectively model the probability distribution of local patterns for handling complex dynamic backgrounds. In addition, they develop a multi-scale fusion scheme to take spatial scale information into account.
8
Scale Invariant Local Ternary Pattern
LBP has proven to be a powerful local image descriptor, but it is not robust to local image noise when neighboring pixels are similar. Tan and Triggs proposed the LTP (Local Ternary Pattern) operator for face recognition. It extends LBP by simply adding a small offset value to the comparison. However, it is not invariant under scale transforms of intensity values.
9
Scale Invariant Local Ternary Pattern
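The intensity scale invariance of a local comparison operator is very important. Illumination variations, either global or local, often cause sudden changes in the gray-scale intensities of neighboring pixels, which can be approximated as a scale transform. Given any pixel location $(x_c, y_c)$, SILTP encodes it as
$$\mathrm{SILTP}^{\tau}_{N,R}(x_c, y_c) = \bigoplus_{k=0}^{N-1} s_\tau(I_c, I_k) \qquad (1)$$
where $I_c$ is the gray intensity value of the center pixel, $I_k$ are those of its N neighborhood pixels equally spaced on a circle of radius R, $\bigoplus$ denotes the concatenation operator of binary strings, and τ is a scale factor. The two-bit code $s_\tau(I_c, I_k)$ is 01 if $I_k > (1+\tau) I_c$, 10 if $I_k < (1-\tau) I_c$, and 00 otherwise.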
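The encoding can be illustrated with a short sketch. It assumes the usual two-bit rule (01 when a neighbor exceeds (1+τ)·Ic, 10 when it falls below (1−τ)·Ic, 00 otherwise); the function name `siltp`, the string representation of the code, and the default τ = 0.05 are choices made here for illustration only.

```python
def siltp(center, neighbors, tau=0.05):
    """Return the SILTP code of one pixel as a bit string (2 bits per neighbor)."""
    bits = []
    for ik in neighbors:
        if ik > (1.0 + tau) * center:
            bits.append("01")   # neighbor clearly brighter than the center
        elif ik < (1.0 - tau) * center:
            bits.append("10")   # neighbor clearly darker than the center
        else:
            bits.append("00")   # within the tolerance range
    return "".join(bits)


# Same local structure before and after doubling every intensity.
code = siltp(100, [120, 102, 80, 97])
scaled = siltp(200, [240, 204, 160, 194])
```

Because both thresholds scale with the center intensity, multiplying all intensities by the same factor leaves the code unchanged, which is the scale invariance the slide refers to.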
10
The advantage of SILTP operator
It is computationally efficient. By introducing a tolerance range like LTP, the SILTP operator is robust to local image noise within that range. This matters especially in shadowed areas, where the region is darker and contains more noise: SILTP tolerates this, while the local comparison result of LBP is affected more. The scale invariance property makes SILTP robust to illumination changes, for example when the illumination suddenly changes from darker to brighter.
11
Comparison of LBP, LTP and SILTP
[Figure legend: black block, white block]
12
KDE of Local Patterns Given a gray-scale video sequence, the pixel process with the local pattern observations over time 1, 2, …, t at a pixel location $(x_0, y_0)$ is defined as $\{F(x_0, y_0, i) : 1 \le i \le t\}$, where F is the texture image sequence. All background patterns within a pixel process, whether unitary or dynamic, are distributed over just a few possible bins.
13
Pattern KDE Traditional methods based on numerical values cannot be used directly for modeling local patterns as background. They develop a pattern KDE technique with a dedicated local pattern kernel that is suitable for descriptors like LBP, LTP, and SILTP. First, they define the distance function d(p,q) as the number of differing bits between two local patterns p and q. Then they derive the local pattern kernel as g(d(p,q)), where g is a weighting function, typically a Gaussian.
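The distance and kernel described above can be sketched as follows. Patterns are assumed to be stored as integers whose bits are the pattern bits; the names `pattern_distance` and `pattern_kernel` and the bandwidth `sigma` are illustrative assumptions, and the Gaussian is left unnormalized.

```python
import math


def pattern_distance(p, q):
    """Number of differing bits between two integer-coded local patterns."""
    return bin(p ^ q).count("1")


def pattern_kernel(p, q, sigma=1.0):
    """Gaussian weighting g applied to the bit distance d(p, q)."""
    d = pattern_distance(p, q)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


d = pattern_distance(0b0101, 0b0110)   # the two lowest bits differ
k_same = pattern_kernel(0b0101, 0b0101)
k_near = pattern_kernel(0b0101, 0b0110)
```

Counting differing bits, rather than subtracting the integer codes, matters because two patterns with very different numeric values can still differ in only one neighbor comparison.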
14
Pattern KDE (cont.) The probability density function can then be estimated smoothly as a weighted sum of local pattern kernels centered on the observed patterns, where c_i are the weighting coefficients. For example, the distribution of an LTP pixel process with a unitary background is: 20 (00010100)/432 hits, 21 (00010101)/4 hits, 84 (01010100)/6 hits, and 148 (10010100)/58 hits. If no kernel technique is used, patterns 21 and 84 might be regarded as foreground and pattern 148 might be considered another mode. With their pattern kernel, all of these patterns are considered part of the same distribution.
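The effect of the kernel on this example can be checked numerically. The sketch below uses the hit counts quoted on the slide and assumes, for illustration, that the weighting coefficients are normalized hit counts and that g is an unnormalized Gaussian with bandwidth `sigma = 1`; `pattern_kde` is a hypothetical helper name, and its output is a smoothed score rather than a properly normalized density.

```python
import math


def bit_distance(p, q):
    """Number of differing bits between two integer-coded patterns."""
    return bin(p ^ q).count("1")


def pattern_kde(samples, query, sigma=1.0):
    """Kernel-smoothed score of `query`; `samples` maps pattern -> hit count."""
    total = sum(samples.values())
    score = 0.0
    for pattern, hits in samples.items():
        weight = hits / total                      # assumed coefficient c_i
        d = bit_distance(pattern, query)
        score += weight * math.exp(-(d * d) / (2.0 * sigma * sigma))
    return score


hits = {20: 432, 21: 4, 84: 6, 148: 58}
raw_share_21 = hits[21] / sum(hits.values())   # plain histogram share
smoothed_21 = pattern_kde(hits, 21)            # borrows mass from pattern 20
neighbors_of_20 = [bit_distance(20, p) for p in (21, 84, 148)]
```

Patterns 21, 84 and 148 each differ from the dominant pattern 20 by a single bit, so the kernel lets them share its support instead of being treated as rare foreground patterns.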
15
Modeling background with local patterns
Given an estimated density function at time t-1 and a newly arrived background pattern $p_t$, the density function is updated with a learning rate α. To handle the multimodality of local patterns, they estimate K density functions for each pixel process via the PKDE method, with corresponding weights $\omega_{k,t}$, k = 1, 2, …, K, normalized to sum to 1. The probability of a new pattern $p_t$ being background is then estimated from these weighted density functions.
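One way to picture this update is the sketch below. The slide's equations were lost in extraction, so this assumes an exponential-forgetting update (old coefficients scaled by 1 − α, new pattern added with weight α) and a plain weighted sum over the K models; all function names, the dictionary representation and the Gaussian bandwidth are hypothetical.

```python
import math


def kernel(p, q, sigma=1.0):
    """Unnormalized Gaussian of the bit distance between two patterns."""
    d = bin(p ^ q).count("1")
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def update_density(coeffs, new_pattern, alpha):
    """Decay stored coefficients by (1 - alpha) and add alpha at the new pattern."""
    updated = {p: (1.0 - alpha) * c for p, c in coeffs.items()}
    updated[new_pattern] = updated.get(new_pattern, 0.0) + alpha
    return updated


def evaluate(coeffs, query):
    """Kernel-smoothed score of `query` under one density model."""
    return sum(c * kernel(p, query) for p, c in coeffs.items())


def background_probability(models, weights, query):
    """Assumed combination: weighted sum over K models, weights summing to 1."""
    return sum(w * evaluate(m, query) for m, w in zip(models, weights))


model = update_density({20: 1.0}, 21, alpha=0.1)
score_seen = background_probability([model], [1.0], 20)
score_unseen = background_probability([model], [1.0], 235)
```

With this form the coefficients of each model keep summing to 1, and patterns that stop appearing fade out at a rate set by α.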
16
Multiscale Block-based SILTP(MB-SILTP)
Fuse multiscale spatial information to achieve better performance. MB-SILTP encodes in a way similar to (1), where Ic and Ik are replaced with the mean values of the corresponding blocks, whose size 𝜔 x 𝜔 indicates the scale. To implement MB-SILTP based background subtraction efficiently, the original video frame is first downsampled by 𝜔. MB-SILTP can then be calculated, and the proposed background model applied in the reduced space.
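The downsampling step can be sketched as block averaging, so that each pixel of the reduced image is the mean of one 𝜔 x 𝜔 block. This is a minimal pure-Python sketch; the function name `block_mean_downsample`, the nested-list image format and the choice to drop incomplete edge blocks are assumptions made for illustration.

```python
def block_mean_downsample(image, w):
    """Average non-overlapping w x w blocks of a 2-D list of intensities.

    Rows and columns that do not fill a complete block are dropped.
    """
    rows = len(image) // w
    cols = len(image[0]) // w
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [image[r * w + i][c * w + j]
                     for i in range(w) for j in range(w)]
            row.append(sum(block) / (w * w))
        out.append(row)
    return out


small = block_mean_downsample([[1, 3, 5, 7],
                               [1, 3, 5, 7],
                               [2, 2, 6, 6],
                               [4, 4, 8, 8]], 2)
```

Applying an ordinary per-pixel SILTP to `small` then compares block means instead of single pixels, which is the idea behind the block-based operator.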
17
Multiscale Block-based SILTP(MB-SILTP)
Finally, the resulting probabilities are upsampled bilinearly to the original space to generate the foreground/background segmentation result. In this way, the speed is much faster at larger scales, whereas the precision is generally lower. The background probabilities at each scale can be fused in the original space, adopting the geometric average of the probabilities at each scale as the fusion score.
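The fusion rule can be sketched as a per-pixel geometric mean. The sketch assumes the probability maps from all scales have already been upsampled to the same size and are given as nested lists; `fuse_geometric` is a hypothetical name.

```python
def fuse_geometric(prob_maps):
    """Per-pixel geometric mean of equally sized probability maps (2-D lists)."""
    n = len(prob_maps)
    rows, cols = len(prob_maps[0]), len(prob_maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            product = 1.0
            for m in prob_maps:
                product *= m[r][c]
            fused[r][c] = product ** (1.0 / n)
    return fused


# Two scales, one row of two pixels each.
fused = fuse_geometric([[[0.25, 1.0]], [[1.0, 0.0]]])
```

Unlike an arithmetic mean, the geometric mean drops to zero as soon as any single scale assigns zero background probability to a pixel.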
18
Experimental Results Nine datasets containing complex backgrounds, such as busy human flows and moving cast shadows, were used. The proposed approach was compared with existing state-of-the-art online background subtraction algorithms. The methods evaluated were: Mixture of Gaussian (MoG); ACMMM03; Blockwise LBP histogram based (LBP-B); Pixelwise LBP histogram based (LBP-P); PKDE based background subtraction with LTP (PKDEltp); PKDE based background subtraction with SILTP (PKDEsiltp); MB-SILTP with 𝜔 = 3; MB-SILTP with 𝜔 = 1,2,3.
19
Experimental Results All tested algorithms were implemented in C++ and run on a standard PC with a 2.4 GHz CPU and 2 GB of memory. For all algorithms, a standard OpenCV postprocessing step was used that eliminates small pieces of fewer than 15 pixels.
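The postprocessing step can be pictured with the sketch below. It is not the OpenCV routine the slide mentions; it is a pure-Python stand-in that removes connected foreground components smaller than a threshold, and it assumes 4-connectivity, which the slide does not specify. The name `remove_small_pieces` is hypothetical.

```python
from collections import deque


def remove_small_pieces(mask, min_size=15):
    """Clear 4-connected foreground components with fewer than min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [row[:] for row in mask]
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected component.
            seen[r0][c0] = True
            component, queue = [], deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(component) < min_size:
                for r, c in component:
                    out[r][c] = 0
    return out


mask = [[0] * 8 for _ in range(6)]
for r in range(4):
    for c in range(4):
        mask[r][c] = 1      # 16-pixel blob, large enough to keep
mask[5][7] = 1              # isolated pixel, should be removed
cleaned = remove_small_pieces(mask)
```

Applying the same cleanup to every algorithm's output keeps the comparison fair, since isolated noise pixels are removed uniformly.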
20
Experimental Results Quantitative evaluation
21
Conclusion They proposed an improved local image descriptor called SILTP and demonstrated its power for background subtraction. They also proposed a multiscale block-based SILTP operator to take spatial scale information into account. A Pattern Kernel Density Estimation technique was proposed, and based on it they developed a multimodal background modeling framework.