1
Vision Topics Seminar Mean Shift
Egorov Svetlana
Based on: D. Comaniciu, P. Meer: Mean Shift Analysis and Applications, IEEE Int. Conf. Computer Vision (ICCV'99), Kerkyra, Greece, 1999
2
Presentation plan
- Motivation and goal
- Intro: problem formulation, overview of previous methods
- The base paper on mean shift in detail
- Recent modifications and improvements
- Recent applications
3
Presentation goals
- Present the mean shift algorithm as a common technique for two computer vision tasks:
  - image filtering and discontinuity-preserving smoothing
  - clustering/segmentation
- Highlight the pros, cons, and tradeoffs of this method, and compare it with previous methods.
- Review recent modifications and improvements.
- Present possible applications of the method, with emphasis on one specific case.
4
Segmentation methods – overview
As presented in “Segmentation and low-level grouping” by Bill Freeman, MIT, the following methods exist for segmentation:
- Background subtraction: estimate the background using a moving average and subtract it from the current frame to extract the foreground.
- K-means clustering: the k-means algorithm clusters n objects, based on their attributes, into k partitions, k < n.
- Mean-shift algorithm (the focus of this presentation).
- Normalized cuts.
5
Mean-shift – motivation and intuitive description
Given a distribution of points, mean shift is a procedure for finding the densest region. Example for the simple 2D case (see next slide):
- Start from an arbitrary point in the distribution.
- The region of interest is a circle centered on this point.
- On each iteration, find the center of mass of the points inside the region of interest.
- Move the circle to the center of mass.
- Continue iterating until convergence.
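The iteration described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `mean_shift_mode`, the toy point set, the radius, and the stopping tolerance are all assumptions chosen for the example.

```python
import math

def mean_shift_mode(points, start, radius, tol=1e-6, max_iter=100):
    """Slide a circular window to the centre of mass of the points it covers."""
    cx, cy = start
    for _ in range(max_iter):
        # Region of interest: all points inside the circle around (cx, cy).
        inside = [(x, y) for x, y in points
                  if math.hypot(x - cx, y - cy) <= radius]
        if not inside:
            break  # empty window: there is nothing to move towards
        # Centre of mass of the region of interest (all points weigh the same).
        mx = sum(x for x, _ in inside) / len(inside)
        my = sum(y for _, y in inside) / len(inside)
        shift = math.hypot(mx - cx, my - cy)  # length of the mean shift vector
        cx, cy = mx, my                       # move the circle
        if shift < tol:
            break                             # converged
    return cx, cy

# Hypothetical data: a dense group near (5, 5), two points leading to it,
# and one far outlier that never enters the window.
points = [(5.0, 5.0), (5.2, 5.0), (4.8, 5.1), (5.1, 5.2), (4.9, 4.8),
          (4.0, 4.0), (3.2, 3.4), (9.0, 1.0)]
mode = mean_shift_mode(points, start=(3.0, 3.0), radius=2.0)
```

Starting at (3, 3), the window climbs towards the dense group and stops once the centre of mass no longer moves; the outlier at (9, 1) has no influence because it never falls inside the circle.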
6
Intuitive Description
Objective: find the densest region in a distribution of identical billiard balls.
Figure labels: region of interest, center of mass, mean shift vector.
From “Mean Shift Theory and Applications”, presentation for “Advanced Topics in Computer Vision” course, Weizmann Institute.
13
Mean-shift – algorithm formal definition.
The basic mean shift algorithm is formulated according to the following paper: D. Comaniciu, P. Meer: Mean Shift Analysis and Applications, IEEE Int. Conf. Computer Vision (ICCV'99), Kerkyra, Greece, 1999
14
Mean-shift – algorithm formal definition.
Given: a set of n points in d-dimensional space, $\{x_i\}_{i=1..n}$.
Model: we assume a non-parametric statistical model, i.e. there is a probability density function (PDF) associated with the set of points, with no assumption about its parametric form.
Goal: for any given point, find the closest local mode of the density function.
15
Non-parametric density gradient estimation
Diagram labels: density estimation, data, discrete PDF representation, non-parametric density gradient estimation (mean shift), PDF analysis.
- Non-parametric: no assumption is made about the form of the PDF (e.g. that it is a normal distribution).
- The density gradient is estimated instead of the density itself.
From “Mean Shift Theory and Applications”, presentation for “Advanced Topics in Computer Vision” course, Weizmann Institute.
16
Kernels
The notion of a kernel is used in the PDF gradient estimation method (also known in statistics as the Parzen window method).
A kernel is a non-negative, real-valued, integrable function K satisfying the following requirements.
Kernel properties:
- Normalized: $\int_{R^d} K(x)\,dx = 1$
- Symmetric: $\int_{R^d} x\,K(x)\,dx = 0$
- Exponential weight decay: $\lim_{\|x\| \to \infty} \|x\|^d K(x) = 0$
From “Mean Shift Theory and Applications”, presentation for “Advanced Topics in Computer Vision” course, Weizmann Institute.
17
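Kernel - examples
In practice one of the following forms is used, where k( ) is a kernel profile:
$$K(x) = c \prod_{j=1}^{d} k\big(x^{(j)}\big) \quad \text{or} \quad K(x) = c\,k(\|x\|)$$
where $x^{(j)}$ are the individual coordinates of x.
Examples:
- Epanechnikov kernel: $K_E(x) = c\,(1 - \|x\|^2)$ if $\|x\| \le 1$, and 0 otherwise
- Uniform kernel: $K_U(x) = c$ if $\|x\| \le 1$, and 0 otherwise
- Normal kernel: $K_N(x) = c\,\exp\!\big(-\tfrac{1}{2}\|x\|^2\big)$
From “Mean Shift Theory and Applications”, presentation for “Advanced Topics in Computer Vision” course, Weizmann Institute.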
18
Mean-shift – algorithm (cont).
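The multivariate kernel density estimate obtained with kernel K(x) and window radius h, computed at the point x, is
$$\hat f(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$
The optimum kernel, yielding the asymptotic minimum mean integrated square error (AMISE), is the Epanechnikov kernel
$$K_E(x) = \begin{cases} \frac{1}{2}\,c_d^{-1}\,(d+2)\,(1 - x^T x) & \text{if } x^T x < 1 \\ 0 & \text{otherwise} \end{cases}$$
where $c_d$ is the volume of the unit d-dimensional sphere.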
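As a rough illustration of the kernel density estimate with the Epanechnikov kernel, the sketch below evaluates it on a toy 1D data set. The helper names (`unit_ball_volume`, `epanechnikov`, `kde`) and the data are assumptions made for this example; the unit-ball volume is computed with the standard closed form $\pi^{d/2} / \Gamma(d/2 + 1)$.

```python
import math

def unit_ball_volume(d):
    """Volume c_d of the unit ball in d dimensions."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def epanechnikov(u):
    """Epanechnikov kernel: 0.5 * (d + 2) * (1 - u.u) / c_d inside the unit ball."""
    d = len(u)
    sq = sum(c * c for c in u)
    if sq >= 1.0:
        return 0.0
    return 0.5 * (d + 2) * (1.0 - sq) / unit_ball_volume(d)

def kde(x, data, h):
    """Density estimate at x: (1 / (n h^d)) * sum_i K((x - x_i) / h)."""
    d = len(x)
    total = sum(epanechnikov([(a - b) / h for a, b in zip(x, xi)])
                for xi in data)
    return total / (len(data) * h ** d)

# Hypothetical 1D sample: three points close together and one far away.
data = [(0.0,), (0.2,), (0.4,), (3.0,)]
dense = kde((0.2,), data, h=1.0)    # inside the group
sparse = kde((1.7,), data, h=1.0)   # no sample within distance h
```

In one dimension $c_1 = 2$, so the kernel peaks at 0.75; the estimate is large inside the group and exactly zero where no sample lies within the window radius.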
19
Mean-shift – algorithm (cont).
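Density gradient estimate for the Epanechnikov kernel:
$$\hat\nabla f(x) = \frac{1}{n h^d c_d}\,\frac{d+2}{h^2} \sum_{x_i \in S_h(x)} (x_i - x) = \frac{n_x}{n h^d c_d}\,\frac{d+2}{h^2} \left( \frac{1}{n_x} \sum_{x_i \in S_h(x)} (x_i - x) \right)$$
where $S_h(x)$ is a sphere of radius h centered on x and containing $n_x$ data points.
The sample mean shift is given by:
$$M_h(x) = \frac{1}{n_x} \sum_{x_i \in S_h(x)} (x_i - x) = \frac{1}{n_x} \sum_{x_i \in S_h(x)} x_i \; - \; x$$
The first term of the last expression is the center of mass of the points within the sphere, when all the points are equally weighted.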
20
Mean-shift – algorithm (cont).
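Mean shift relation to f(x) and its gradient: the factor $\frac{n_x}{n h^d c_d}$ is the density estimate $\hat f(x)$ computed with a uniform kernel over the sphere $S_h(x)$, so
$$\hat\nabla f(x) = \hat f(x)\,\frac{d+2}{h^2}\,M_h(x) \quad \Longrightarrow \quad M_h(x) = \frac{h^2}{d+2}\,\frac{\hat\nabla f(x)}{\hat f(x)}$$
The mean shift vector has the same direction as the estimated density gradient.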
21
Mean-shift properties
- An estimate of the normalized gradient can be obtained by computing the sample mean shift in a uniform kernel centered on x.
- The mean shift vector has the direction of the gradient of the density estimate at x when this estimate is obtained with the Epanechnikov kernel.
- The mean shift vector always points in the direction of the maximum increase in density, so it can define a path leading to a density mode.
- The mean shift procedure, obtained by successive computation of the mean shift vector Mh(x) and translation of the window Sh(x) by Mh(x), is guaranteed to converge.
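The first two properties can be checked numerically. The sketch below, written under the assumption that the Epanechnikov kernel is $\frac{1}{2} c_d^{-1} (d+2)(1 - x^T x)$ inside the unit ball, compares the sample mean shift in a uniform window with the finite-difference gradient of the Epanechnikov density estimate, scaled by $h^2 / (d+2)$ and divided by the uniform-kernel density $n_x / (n h^d c_d)$. All function names and the data points are illustrative assumptions.

```python
import math

def unit_ball_volume(d):
    """Volume c_d of the unit ball in d dimensions."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def epanechnikov_kde(x, data, h):
    """Epanechnikov kernel density estimate at x."""
    d = len(x)
    total = 0.0
    for xi in data:
        sq = sum(((a - b) / h) ** 2 for a, b in zip(x, xi))
        if sq < 1.0:
            total += 0.5 * (d + 2) * (1.0 - sq) / unit_ball_volume(d)
    return total / (len(data) * h ** d)

def numeric_gradient(f, x, eps=1e-5):
    """Central-difference gradient of f at x."""
    grad = []
    for j in range(len(x)):
        hi, lo = list(x), list(x)
        hi[j] += eps
        lo[j] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def mean_shift_vector(x, data, h):
    """Sample mean of the points within distance h of x, minus x."""
    inside = [xi for xi in data if math.dist(x, xi) < h]
    shift = [sum(xi[j] for xi in inside) / len(inside) - x[j]
             for j in range(len(x))]
    return shift, len(inside)

# Hypothetical 2D data; the last point lies outside the window around x.
data = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.6), (0.2, 0.4), (3.0, 3.0)]
x, h = (0.8, 0.9), 1.0
d = len(x)

m, n_x = mean_shift_vector(x, data, h)
grad = numeric_gradient(lambda p: epanechnikov_kde(p, data, h), x)
f_uniform = n_x / (len(data) * h ** d * unit_ball_volume(d))
predicted = [h * h / (d + 2) * g / f_uniform for g in grad]
```

For this data set `m` and `predicted` agree to within the finite-difference error, which is the sense in which the mean shift vector is a normalized gradient estimate.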
22
Processing in joint Spatial-Range Domain
- An image is typically represented as a 2-dimensional lattice of r-dimensional vectors (pixels); r is 1 in the gray-level case, 3 for color images, and r > 3 in the multispectral case (frequencies beyond the visible light range).
- The space of the lattice is the spatial domain.
- The gray level, color, or spectral information is represented in the range domain.
- After normalization with the global parameters σs and σr, the location and range vectors are concatenated into a joint spatial-range domain of dimension d = r + 2.
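For the gray-level case (r = 1, so d = 3), building the joint spatial-range points might look like the sketch below. The function name `joint_features`, the representation of the image as a list of rows, and the parameter values are assumptions for illustration.

```python
def joint_features(image, sigma_s, sigma_r):
    """Map each pixel to a normalized point (x / sigma_s, y / sigma_s, I / sigma_r)."""
    return [(col / sigma_s, row / sigma_s, value / sigma_r)
            for row, line in enumerate(image)
            for col, value in enumerate(line)]

# Hypothetical 2 x 3 gray-level image.
image = [[10, 20, 30],
         [40, 50, 60]]
features = joint_features(image, sigma_s=2.0, sigma_r=10.0)
```

Each pixel becomes one point of dimension d = r + 2 = 3; larger σs or σr shrink the corresponding coordinates, so more pixels fall inside a unit-radius window.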
23
Processing in joint Spatial-Range Domain (cont.)
The discussed method applies the mean shift procedure to the data points in the joint spatial-range domain. Each data point becomes associated with a point of convergence, which represents the local mode of the density in the d-dimensional space.
24
Mean shift applications - Discontinuity preserving filtering
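The output of the mean shift filter for an image pixel is the range information carried by the point of convergence.
Filtering procedure, where {xj}j=1...n is the original image (normalized with σs and σr) and {zj}j=1...n is the filtered image:
- For each j = 1...n, run the mean shift procedure starting at xj and store the point of convergence yj.
- Set zj to the spatial location of xj combined with the range part of yj.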
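A small gray-level version of this filter can be sketched as follows. It uses a uniform unit-radius window in the normalized joint domain and searches all pixels by brute force rather than exploiting the lattice; the function name `mean_shift_filter`, the tolerance, and the toy image are assumptions for illustration, not the authors' implementation.

```python
def mean_shift_filter(image, sigma_s, sigma_r, max_iter=50, tol=1e-3):
    """Replace each gray level by the range value of its point of convergence."""
    rows, cols = len(image), len(image[0])
    # Normalized joint spatial-range points (x / sigma_s, y / sigma_s, I / sigma_r).
    points = [(c / sigma_s, r / sigma_s, image[r][c] / sigma_r)
              for r in range(rows) for c in range(cols)]
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            y = (c / sigma_s, r / sigma_s, image[r][c] / sigma_r)
            for _ in range(max_iter):
                # Points inside the unit sphere centred on y (brute-force search).
                inside = [p for p in points
                          if sum((a - b) ** 2 for a, b in zip(p, y)) <= 1.0]
                if not inside:
                    break
                mean = tuple(sum(p[j] for p in inside) / len(inside)
                             for j in range(3))
                shift = max(abs(a - b) for a, b in zip(mean, y))
                y = mean
                if shift < tol:
                    break
            # Keep the pixel location, take the range value of the convergence point.
            out[r][c] = y[2] * sigma_r
    return out

# Hypothetical image: a noisy dark region next to a noisy bright region.
image = [[10, 12, 11, 100, 102, 101],
         [11, 10, 12, 101, 100, 102]]
filtered = mean_shift_filter(image, sigma_s=2.0, sigma_r=20.0)
```

On this toy image the noise within each side is smoothed away while the step between the dark and bright regions stays sharp, because pixels across the edge are far apart in the range coordinate and never enter each other's window.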
25
Computational complexity
- The lattice structure of the spatial domain is used for an efficient search of the points inside the window.
- This search can be limited to a rectangular window of size 2 x 2 in the normalized space, which corresponds to 2σs x 2σs image pixels.
- The arithmetic complexity of mean shift filtering, counted in operations per image pixel, grows with the number of pixels in this window and with kc, the mean number of iterations to convergence.
26
Filtering - example
Figure: original image and filtered image (example from Comaniciu & Meer).
27
Mean shift applications - Segmentation
Segmentation divides the image into segments, or clusters.
The arithmetic complexity of segmentation is similar to that of mean shift filtering.
28
Segmentation examples
Figure: original image, segmented image, and corresponding contours (example from Comaniciu & Meer).
29
Segmentation examples
Figure: original image and segmented image (example from Comaniciu & Meer).
30
Mean Shift - recent modifications and improvements
One of the recent modifications to the basic mean shift is PAMS, the path assigned mean shift algorithm:
Pooransingh, A.; Radix, C.-A.; Kokaram, A.: The path assigned mean shift algorithm: A new fast mean shift implementation for colour image segmentation, 15th IEEE International Conference on Image Processing, ICIP 2008.
According to this paper, the mean shift method is effective in high-density regions but proves computationally expensive for multidimensional data sets. The goal of the proposed method is a fast mean shift capable of processing multidimensional data sets easily.
31
General mean-shift (GMS) method for YUV colour space (revised algorithm).
The main computational load is the calculation of the mean shift vector, mc(U,V). The computational cost is O(n²), where n is the size of the data set.
32
Fast mean-shift methods.
A number of modifications were proposed to reduce the complexity:
- Use of a single metric to represent each data point.
- Hierarchical clustering: repeatedly apply the mean shift with increasingly large bandwidths, each step using the results of the previous one for initialization.
- Neighbourhood consistency algorithm:
  - Step 1, partition: the original data set is decomposed into a number of local subsets of similar size, and the centre of each is calculated.
  - Step 2, clustering: the mean shift is calculated for each sample rather than the whole data set, to find a single class for each sample.
33
Path Assigned Mean Shift (PAMS) – main idea
- For any random start point, the mean shift vector always points towards the mode.
- In the PAMS assignment, all points along the path toward the mode are assigned that final mode value.
- Points that have already been assigned a mode are eliminated from the mean shift process and are not traversed again.
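The path-assignment idea can be illustrated with a simplified sketch. It is not the authors' algorithm: the `path_radius` parameter (how close a point must be to the trajectory to count as "along the path") is a hypothetical choice, and this version still uses every data point when computing window means, skipping already-assigned points only as start positions.

```python
import math

def pams(points, h, path_radius, tol=1e-6, max_iter=100):
    """Simplified path-assigned mean shift; returns (mode per point, runs started)."""
    modes = [None] * len(points)
    runs = 0
    for start in range(len(points)):
        if modes[start] is not None:
            continue  # already assigned by an earlier path, no new run needed
        runs += 1
        y = points[start]
        on_path = {start}
        for _ in range(max_iter):
            # Collect unassigned points that the trajectory passes close to.
            for i, p in enumerate(points):
                if modes[i] is None and math.dist(p, y) <= path_radius:
                    on_path.add(i)
            # One mean shift step with a uniform window of radius h.
            inside = [p for p in points if math.dist(p, y) <= h]
            if not inside:
                break
            mean = tuple(sum(c) / len(inside) for c in zip(*inside))
            shift = math.dist(mean, y)
            y = mean
            if shift < tol:
                break
        for i in on_path:
            modes[i] = y  # every point on the path receives the final mode
    return modes, runs

# Hypothetical 1D data with two well-separated groups.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
modes, runs = pams(points, h=1.0, path_radius=0.3)
```

Here six points are labelled with only two mean shift runs, one per group, whereas the general procedure would start a run from every point.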
34
Path Assigned Mean Shift Algorithm in the Colour Domain
The complexity is reduced to O(φ²), where φ is the total number of unassigned points per iteration.
35
Example illustrating GMS vs. PAMS
Figure panels: general mean shift, PAMS.
36
Comparison of segmentation results between different algorithms
(a), (b) original; (c), (d) GMS; (e), (f) other fast mean shift method; (g), (h) PAMS.
37
Mean Shift - Recent Applications.
One recent mean shift application is presented in the following paper:
Vilaplana, V.; Marques, F.: Region-based mean shift tracking: Application to face tracking, 15th IEEE International Conference on Image Processing, ICIP 2008.
Refer to the Appendix for details.
Face tracking:
- Face tracking is required by applications such as video indexing, visual surveillance, human-computer interaction, and facial expression recognition.
- In these applications it is necessary to detect the faces, track them from frame to frame, and analyze the tracks, e.g. to understand the object's behavior.
- Tracking methods are organized in three groups, based on the model selected to describe the shape: point tracking, kernel tracking, and silhouette tracking.
38
Face tracking - example
Examples from
39
Conclusions
- Mean shift is a useful method for low-level tasks such as filtering and segmentation. Minor details of the background are eliminated, while object discontinuities are preserved.
- The method is non-parametric, i.e. it does not assume any model for the underlying density function.
- The method works in the joint spatial-range domain.
- The mean shift procedure is guaranteed to converge.
- The scaling factors (σs and σr) have a major impact on the algorithm's performance and should be adjusted to the nature of the objects.
- The basic mean shift is computationally expensive. Some efficient modifications, with improved complexity and the same quality, were proposed recently. One example is path assigned mean shift.
- Another possible application of mean shift is face tracking. Consistent tracking can be achieved by combining mean shift with a partition of the image into regions.
40
References
[1] D. Comaniciu, P. Meer: Mean Shift Analysis and Applications, IEEE Int. Conf. Computer Vision (ICCV'99), Kerkyra, Greece, 1999.
[2] Bill Freeman: Segmentation and low-level grouping, MIT.
[3] Pooransingh, A.; Radix, C.-A.; Kokaram, A.: The path assigned mean shift algorithm: A new fast mean shift implementation for colour image segmentation, 15th IEEE International Conference on Image Processing, ICIP 2008.
[4] Vilaplana, V.; Marques, F.: Region-based mean shift tracking: Application to face tracking, 15th IEEE International Conference on Image Processing, ICIP 2008.
[5] D. Comaniciu, P. Meer: Mean Shift: A Robust Approach toward Feature Space Analysis, IEEE Trans. Pattern Analysis Machine Intell., Vol. 24, No. 5, 2002.
[6] “Mean Shift Theory and Applications”, PowerPoint slides for “Advanced Topics in Computer Vision” course, Weizmann Institute.
41
Appendix – Region based Face Tracking
42
Mean shift (revised)
- X: n-dimensional space
- S: a finite set, the sample data
- Kernel: $K(x) = k(\|x\|^2)$, where k( ) is the kernel profile
- $w : S \to [0, \infty)$: a weight function
The sample mean with kernel K at a point x from X:
$$m(x) = \frac{\sum_{s \in S} K(s - x)\,w(s)\,s}{\sum_{s \in S} K(s - x)\,w(s)}$$
The mean shift is m(x) − x. The repeated movement of data points to the sample mean is called the mean shift algorithm.
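A weighted sample mean of this form, m(x) = Σ K(s − x) w(s) s / Σ K(s − x) w(s), can be sketched as below. The Gaussian profile, the function names, and the toy samples are assumptions chosen for the example; any profile k( ) applied to the squared distance could be passed in.

```python
import math

def gaussian_profile(r2):
    """Profile k applied to the squared distance, here a Gaussian."""
    return math.exp(-0.5 * r2)

def sample_mean(x, samples, weights, profile):
    """Kernel- and weight-averaged mean of the samples, evaluated at x."""
    num = [0.0] * len(x)
    den = 0.0
    for s, w in zip(samples, weights):
        k = profile(sum((a - b) ** 2 for a, b in zip(s, x))) * w
        den += k
        for j, a in enumerate(s):
            num[j] += k * a
    return [v / den for v in num]

# Hypothetical samples at -1 and +1; the right one is three times heavier.
samples = [(-1.0,), (1.0,)]
x = (0.0,)
m_equal = sample_mean(x, samples, [1.0, 1.0], gaussian_profile)
m_heavy = sample_mean(x, samples, [1.0, 3.0], gaussian_profile)
shift = [m - c for m, c in zip(m_heavy, x)]  # the mean shift m(x) - x
```

With equal weights the sample mean at 0 stays at 0; tripling the weight of the right-hand sample pulls it to 0.5, which is how the weight function steers the procedure. A compactly supported profile would need a guard for the case where no sample contributes and the denominator is zero.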
43
Mean shift (cont.)
Let T be a finite set, and m(T) = {m(t) : t ∈ T}.
- The full mean shift procedure iterates and evolves T until it finds a fixed point T = m(T).
- The weights w(s) can be fixed or re-evaluated after each iteration, and may also be a function of the current set T.
- Kernels define an influence zone for each point x in T and can be scaled to modify their spatial extent.
44
Mean shift for tracking
- In object tracking, the evolving set T typically consists of just one point, the object centroid.
- A sample corresponds to the spatial coordinates of a pixel x and has an associated sample weight w(x), which defines how likely it is that the pixel with color I(x) belongs to the object model.
- The mean shift seeks the mode of the kernel density computed with these weights.
Implementation requires defining:
- the kernel (scale and shape),
- an object model,
- the weight function,
- the shape of the final object.
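With a single evolving point, tracking within one frame reduces to repeatedly moving a window to the weighted centroid of the weight image. The sketch below assumes a uniform rectangular kernel and a precomputed weight image; the function name `track_centroid`, the window half-sizes, and the synthetic weight image are hypothetical.

```python
import math

def track_centroid(weight, start, half_w, half_h, max_iter=20, tol=0.01):
    """Move a rectangular window to the weighted centroid until it stops moving."""
    rows, cols = len(weight), len(weight[0])
    cx, cy = start
    for _ in range(max_iter):
        # Pixel bounds of the window, clipped to the image.
        x0 = max(0, math.ceil(cx - half_w))
        x1 = min(cols - 1, math.floor(cx + half_w))
        y0 = max(0, math.ceil(cy - half_h))
        y1 = min(rows - 1, math.floor(cy + half_h))
        total = sx = sy = 0.0
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                w = weight[y][x]
                total += w
                sx += w * x
                sy += w * y
        if total == 0.0:
            break  # no object evidence inside the window
        nx, ny = sx / total, sy / total
        moved = max(abs(nx - cx), abs(ny - cy))
        cx, cy = nx, ny
        if moved < tol:
            break
    return cx, cy

# Hypothetical 8 x 8 weight image: a 3 x 3 blob of object-like pixels.
weight = [[0.0] * 8 for _ in range(8)]
for r in range(4, 7):
    for c in range(4, 7):
        weight[r][c] = 1.0
centroid = track_centroid(weight, start=(3.0, 3.0), half_w=2, half_h=2)
```

Starting from the previous position (3, 3), the window first catches a corner of the blob, moves towards it, and settles on the blob centre at (5, 5).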
45
Kernel selection considerations
The basic mean shift requires isotropic kernels (e.g. Epanechnikov or Gaussian) and assumes constant object scale and orientation during tracking. However, objects may have complex shapes whose scale and orientation change constantly. This leads to the use of generalized kernels.
46
Kernel selection considerations
The two main parameters for kernel selection are scale and shape; both should be adjusted to the tracked object.
- Scale: the kernel scale determines the size of the window in which sample weights are examined and is a crucial parameter in the mean shift algorithm. Changes in the object scale require adjusting the kernel bandwidth to track the object consistently.
- Shape: in the basic formulation, radially symmetric kernels, which are isotropic in shape, are used. However, objects often have anisotropic structure, and therefore anisotropic symmetric kernels such as rectangles or ellipses are frequently used.
47
Object model and weight image
- The tracked object is modeled as a class-conditional color distribution P(I(x)/O) that estimates, for each pixel with color I(x), the probability of the color of the pixel, given that the pixel belongs to the tracked object O.
- The object distribution is learned off-line from training images or during initialization. The model is commonly built with histograms in a particular color space.
- The weight function measures, for each pixel, some feature related to its similarity to the object model. Example: the object histogram is compared with a histogram of the colors observed within the current mean shift target window.
- To adapt to background variation, the background model is continuously recomputed.
48
Final shape definition
The tracking output at each frame is usually the object centroid and a rectangle which has the size of the last iteration window. This rectangle is used as an estimate of the object extent.
49
Region-based mean shift for tracking
The approach of Vilaplana & Marques combines mean shift with the use of regions. Regions are useful for computing the weight image and for defining the contours of the tracked objects precisely, and they provide a natural mechanism to initialize the search in the next frame. The algorithm works with pixels that lie within a sub-image defined by a rectangular search window W and an image partition P.
50
Region based method – Kernel selection
Kernel scale: At each frame, the size of the rectangular search window is defined as the size of the bounding box of the object O found in the previous frame, scaled by a fixed factor (constant). The window size is the same for all iterations within a frame, except for occasional cases when the search window size is underestimated.
51
Region based method – Kernel selection
Shape: the image partition P is fitted to the search window W to define the kernel shape. The kernel extent is defined by all the regions R in partition P that are completely included in W, i.e. the union of all R ∈ P with R ⊆ W. At each iteration, the kernel scale changes according to the size of the tracked object, and its shape takes into account the color homogeneity observed in the image, since it is defined by the regions in the partition.
52
Region based partition and kernel
Example by Vilaplana & Marques
53
Object model and weight image
- The object is modeled as a class-conditional color distribution computed with a histogram in the YCbCr color space (YCbCr is a more efficient way of encoding RGB information).
- Given a pixel x with color I(x), the probability of the pixel color given the object is p(I(x)/O) = hO(I(x)), where hO is the object histogram.
- The histogram is generated from the object segmented in the first frame.
54
Object model and weight image (cont.)
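The weight w(x) associated with the pixel x is the probability that the pixel represents the object, given its color (the conditional written p(I(x)/O) on the other slides appears here as $p(I(x) \mid O)$):
$$w(x) = p(O \mid I(x)) = \frac{p(I(x) \mid O)\,p(O)}{p(I(x))}$$
- p(O): probability that the pixel belongs to the object
- p(I(x)): probability that the pixel has color I(x),
$$p(I(x)) = p(I(x) \mid O)\,p(O) + p(I(x) \mid B)\,p(B)$$
where p(B) is the probability that the pixel is part of the background.
Each region Ri in the fitted partition is assigned a weight value, computed as the average of the individual weights of the pixels that form that region:
$$w(R_i) = \frac{1}{|R_i|} \sum_{x \in R_i} w(x)$$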
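The pixel weights and the region averages can be sketched as follows. Histograms are represented as dictionaries of normalized frequencies, p(I(x)) is read from a hypothetical search-window histogram `h_win`, the clamp to 1.0 is an added safeguard for inconsistent histogram estimates, and the symbolic color names stand in for quantized color bins; all of these are assumptions for illustration.

```python
def pixel_weight(color, h_obj, h_win, p_obj):
    """Bayes weight w(x) = p(I|O) * p(O) / p(I) for one pixel color."""
    p_color = h_win.get(color, 0.0)      # p(I(x)), from the window histogram
    if p_color == 0.0:
        return 0.0                       # color not observed in the window
    # Clamp (assumption): separate histogram estimates may push the ratio above 1.
    return min(1.0, h_obj.get(color, 0.0) * p_obj / p_color)

def region_weights(labels, weights):
    """Average the pixel weights inside each region of the partition."""
    sums, counts = {}, {}
    for label, w in zip(labels, weights):
        sums[label] = sums.get(label, 0.0) + w
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

# Hypothetical normalized histograms over three color bins.
h_obj = {"skin": 0.8, "dark": 0.2}                  # p(I | O)
h_win = {"skin": 0.4, "dark": 0.3, "green": 0.3}    # p(I) in the search window
colors = ["skin", "dark", "green", "green"]         # one color bin per pixel
labels = [0, 0, 1, 1]                               # region label per pixel
weights = [pixel_weight(c, h_obj, h_win, p_obj=0.4) for c in colors]
per_region = region_weights(labels, weights)
```

Region 0, which contains object-like colors, receives a high average weight, while region 1, made of a color absent from the object histogram, receives zero.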
55
Object model and weight update
- The object model p(I(x)/O) (i.e. the object histogram) is recomputed at each frame, using the object segmented in the previous frame.
- p(I(x)), which depends on the background, is estimated by building a histogram hW of the pixels within the current search window W. Recomputing hW at every iteration avoids tracking failure when the background scene changes.
- The value of p(O) is estimated as the ratio between the sizes, in pixels, of the object detected in the previous frame and of the kernel.
56
Final shape definition
Once the mean shift converges, the fitted partition in the last search window is used to define the final shape of the object, in a three-step procedure:
- initial object mask
- shape matching
- final object mask
Example by Vilaplana & Marques
57
Region-based mean shift tracking- Results
Region-based mean shift tracking is compared with basic mean shift, and the new method demonstrates superior performance.