Motion Detection and Analysis
Michael Knowles
Tuesday 13th January 2004
Introduction
  Brief discussion of motion analysis and its applications
  Static scene object tracking
  Motion compensation for moving-camera sequences
Applications of Motion Tracking
  Control applications
    Object avoidance
    Automatic guidance
    Head tracking for video conferencing
  Surveillance/monitoring applications
    Security cameras
    Traffic monitoring
    People counting
Two Approaches
  Optical flow
    Compute motion within a region or the frame as a whole
  Object-based tracking
    Detect objects within a scene
    Track each object across a number of frames
My Work
  Started by tracking moving objects in a static scene
  Develop a statistical model of the background
  Mark all regions that do not conform to the model as moving objects
My Work
  Now working on object detection and classification from a moving camera
  Current focus is motion-compensated background filtering
  Determine the motion of the background and apply it to the model
Static Scene Object Detection and Tracking
  Model the background and subtract to obtain an object mask
  Filter to remove noise
  Group adjacent pixels to obtain objects
  Track objects between frames to develop trajectories
Background Modelling
Background Model
After Background Filtering…
Background Filtering
  My algorithm is based on:
    “Learning Patterns of Activity using Real-Time Tracking”, C. Stauffer and W.E.L. Grimson, IEEE Trans. on Pattern Analysis and Machine Intelligence, August 2000
  The history of each pixel is modelled by a sequence of Gaussian distributions
Multi-dimensional Gaussian Distributions
  Described mathematically as:
  More easily visualised as: (2-dimensional)
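For reference, the standard density of a d-dimensional Gaussian with mean vector μ and covariance matrix Σ can be written as follows. The symbols are the conventional ones and are an assumption here, not necessarily the notation used on the original slide.

```latex
\[
\eta(\mathbf{x};\boldsymbol{\mu},\boldsymbol{\Sigma}) =
\frac{1}{(2\pi)^{d/2}\,|\boldsymbol{\Sigma}|^{1/2}}
\exp\!\left(-\tfrac{1}{2}
(\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}}
\boldsymbol{\Sigma}^{-1}
(\mathbf{x}-\boldsymbol{\mu})\right)
\]
```

For a pixel, x would be the colour vector (d = 3), so evaluating this exactly means a matrix inverse, a determinant and an exponential per Gaussian per pixel.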
Simplifying…
  Calculating the full Gaussian for every pixel in the frame is very, very slow
  Therefore I use a linear approximation
How do we use this to represent a pixel?
  Stauffer and Grimson suggest using a static number of Gaussians for each pixel
  This was found to be inefficient – so the number of Gaussians used to represent each pixel is variable
Weights
  Each Gaussian carries a weight value
  This weight is a measure of how well the Gaussian represents the history of the pixel
  If a pixel is found to match a Gaussian then the weight is increased, and vice versa
  If the weight drops below a threshold then that Gaussian is eliminated
Matching
  Each incoming pixel value must be checked against all the Gaussians at that location
  If a match is found then the value of that Gaussian is updated
  If there is no match then a new Gaussian is created with a low weight
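The weight and matching rules can be sketched for a single pixel as follows. This is a minimal illustration, not the implementation from the talk: the `Gaussian` class, the `observe` function, the single-channel value and every constant are assumptions chosen for readability, and updating the matched Gaussian's mean is left out.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: float    # modelled pixel value (one channel, for brevity)
    var: float     # variance of the modelled value
    weight: float  # how well this Gaussian explains the pixel history


# All constants are illustrative assumptions, not values from the talk.
MATCH_SIGMAS = 2.5  # match if within this many standard deviations
WEIGHT_RATE = 0.05  # step size for weight changes
MIN_WEIGHT = 0.01   # Gaussians below this weight are eliminated
NEW_WEIGHT = 0.02   # low starting weight for a new Gaussian
NEW_VAR = 100.0     # wide starting variance for a new Gaussian


def observe(model, value):
    """Update one pixel's list of Gaussians in place; return the match or None."""
    matched = None
    for g in model:
        if matched is None and (value - g.mean) ** 2 <= MATCH_SIGMAS ** 2 * g.var:
            matched = g
    for g in model:
        target = 1.0 if g is matched else 0.0
        # Weight rises towards 1 on a match and decays towards 0 otherwise.
        g.weight += WEIGHT_RATE * (target - g.weight)
    # Eliminate Gaussians whose weight has dropped below the threshold.
    model[:] = [g for g in model if g.weight >= MIN_WEIGHT]
    if matched is None:
        # No match: start a new Gaussian at the observed value with a low weight.
        model.append(Gaussian(mean=value, var=NEW_VAR, weight=NEW_WEIGHT))
    return matched
```

Because unmatched Gaussians are pruned and new ones are appended only when needed, the list length varies per pixel, which is the variable-count behaviour described earlier.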
Updating
  If a Gaussian matches a pixel, then the value of that Gaussian is updated using the current value
  The rate of learning is greater in the early stages when the model is being formed
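One way to get a learning rate that is high early on and lower later is to use 1/n for the first n matches and then hold a fixed floor. The schedule, the function names and the `floor` value below are assumptions for illustration; the talk does not state the exact rule used.

```python
def learning_rate(n_matches, floor=0.01):
    """Return 1/n while the Gaussian is young, settling at `floor` later on."""
    return max(1.0 / n_matches, floor)


def update(mean, var, value, n_matches, floor=0.01):
    """Blend the current pixel value into a matched Gaussian.

    n_matches counts the observation that created the Gaussian as well,
    so it is assumed to be at least 2 on the first update.
    """
    rho = learning_rate(n_matches, floor)
    new_mean = (1.0 - rho) * mean + rho * value
    new_var = (1.0 - rho) * var + rho * (value - new_mean) ** 2
    return new_mean, new_var
```

With these assumed settings, the second observation moves the mean halfway towards the new value, while the thousandth moves it by only 1%.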
Colour Spaces
  If RGB is used then the background filtering is sensitive to shadows
  The use of a colour space that separates intensity information from chromatic information overcomes this
  For this reason the YUV colour space is used
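A minimal sketch of the conversion, assuming the common BT.601 analogue YUV weights (the talk does not say which YUV variant was used). The function name `rgb_to_yuv` is illustrative.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 luma weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # intensity (luma)
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v
```

For a neutral grey surface, a shadow that darkens the pixel from (120, 120, 120) to (80, 80, 80) changes Y by 40 while U and V both stay at zero, so a looser tolerance on Y than on U and V lets the shadow pass as background. On strongly coloured surfaces U and V shift too, only by less than the RGB channels do.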
Colour Spaces
  Background and Frame:
  Channel Differences:
Isolate Objects
  Adjacent object pixels must be grouped to form objects
  A connected components algorithm is used
  The result is a list of objects with their position and size
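A connected components pass over the object mask could look like the sketch below. It assumes 4-connectivity and a mask stored as a list of rows; the function name `find_objects` and the output fields are illustrative choices, not taken from the talk.

```python
from collections import deque


def find_objects(mask):
    """Group 4-connected truthy cells; return position and size of each group."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill from an unvisited object pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            top, bottom, left, right, area = r0, r0, c0, c0, 0
            while queue:
                r, c = queue.popleft()
                area += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            objects.append({
                "top": top, "left": left,
                "height": bottom - top + 1, "width": right - left + 1,
                "area": area,
            })
    return objects
```

Discarding objects whose `area` falls below a small threshold is a simple way to reject any noise that survives the earlier filtering stage.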
Track Objects
  Objects are tracked from frame to frame using:
    Location
    Direction of motion
    Size
    Colour
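One plausible way to combine these four cues is a single matching cost between each existing track and each new detection, followed by a greedy assignment. The cost weights, the `max_cost` gate, the dictionary fields and the greedy strategy are all assumptions for illustration; the talk does not describe how the cues are combined.

```python
import math


def match_cost(track, det):
    """Lower is better. Combines location, direction, size and colour."""
    # Direction of motion: predict where the track should be in this frame.
    px = track["x"] + track["vx"]
    py = track["y"] + track["vy"]
    d_pos = math.hypot(det["x"] - px, det["y"] - py)
    d_size = abs(det["area"] - track["area"]) / max(track["area"], 1)
    d_col = sum(abs(a - b) for a, b in zip(det["colour"], track["colour"])) / 255.0
    return d_pos + 20.0 * d_size + 20.0 * d_col  # weights are arbitrary


def associate(tracks, detections, max_cost=50.0):
    """Greedy one-to-one assignment, cheapest pairs first."""
    pairs = sorted(
        (match_cost(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    used_tracks, used_dets, result = set(), set(), {}
    for cost, ti, di in pairs:
        if cost <= max_cost and ti not in used_tracks and di not in used_dets:
            result[ti] = di
            used_tracks.add(ti)
            used_dets.add(di)
    return result
```

Tracks left out of the result have no acceptable detection in this frame, and unassigned detections are candidates for new tracks.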
The Story So Far…
  Basic principle of background filtering
  Stages necessary in maintaining a background model
  How it is applied to tracking
Moving Camera Sequences
  Basic idea is the same as before
  Detect and track objects moving within a scene
  BUT – this time the camera is not stationary, so everything is moving
Motion Segmentation
  Use a motion estimation algorithm on the whole frame
  Iteratively apply the same algorithm to areas that do not conform to this motion to find all motions present
  Problem – this is very, very slow
Motion Compensated Background Filtering
  Basic principle
    Develop and maintain a background model as previously
    Determine global motion and use this to update the model between frames
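The simplest case of applying global motion to the model is a whole-pixel translation, sketched below. The function name `shift_model`, the sign convention (background content moves by `dx`, `dy` in image coordinates) and the `fill` marker for newly exposed cells are assumptions; a real global motion is generally not a whole-pixel shift and needs interpolation.

```python
def shift_model(model, dx, dy, fill=None):
    """Move a 2-D background model by an integer global motion (dx, dy)."""
    rows, cols = len(model), len(model[0])
    out = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            sx, sy = x - dx, y - dy  # where this cell's content came from
            if 0 <= sx < cols and 0 <= sy < rows:
                out[y][x] = model[sy][sx]
    return out
```

Cells left at `fill` correspond to scene areas that have just entered the frame, so the model has no history for them yet.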
Advantages
  Only one motion model has to be found
  This is therefore much faster
  Estimating motion for small regions can be unreliable
  Not as easy as it sounds though…
Motion Models
  Trying to determine the exact optical flow at every point in the frame would be ridiculously slow
  Therefore we try to fit a parametric model to the motion
Affine Motion Model
  The affine model describes the motion vector at each point in the image
  Need to find values for the parameters that best fit the motion present
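A common way to write the six-parameter affine model is given below, with (u, v) the motion vector at image position (x, y). The parameter names a_1 to a_6 and their ordering are an assumed convention, not necessarily the one on the original slide.

```latex
\[
\begin{aligned}
u(x, y) &= a_1 + a_2\,x + a_3\,y \\
v(x, y) &= a_4 + a_5\,x + a_6\,y
\end{aligned}
\]
```

Here a_1 and a_4 give a pure translation, while the other four parameters cover rotation, scaling and shear.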
Minimisation of Error Function
  If we are to find the optimum parameters we need an error function to minimise:
  But this is not in a form that is easy to minimise…
Gradient-based Formulation
  Applying Taylor expansion to the error function:
  Much easier to work with
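A standard form of this step, assuming a sum-of-squares error between consecutive frames, is sketched below. I(x, y, t) is the image intensity, (u, v) is the motion predicted by the parameter vector a at (x, y), and I_x, I_y, I_t are the spatial and temporal partial derivatives of I. The notation is an assumption, not a reconstruction of the original slides.

```latex
\[
E(\mathbf{a}) = \sum_{(x,y)} \bigl[\, I(x+u,\; y+v,\; t+1) - I(x, y, t) \,\bigr]^2
\]
A first-order Taylor expansion,
$I(x+u,\, y+v,\, t+1) \approx I(x,y,t) + I_x u + I_y v + I_t$, gives
\[
E(\mathbf{a}) \approx \sum_{(x,y)} \bigl( I_x u + I_y v + I_t \bigr)^2
\]
```

The second form depends on the parameters only through u and v, which are linear in a for an affine model, so its derivatives are straightforward to compute.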
Gradient-descent Minimisation
  If we know how the error changes with respect to the parameters, we can home in on the minimum error
  Various methods built on this principle:
Applying Gradient Descent
  We need:
  Using the chain rule:
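The sketch below shows plain gradient descent on a linearised sum-of-squares error with six affine parameters, using the chain rule dE/da_k = sum of 2 r dr/da_k. It assumes each sample is a tuple `(x, y, ix, iy, it)` of position and image derivatives, a parameter order u = a[0] + a[1]x + a[2]y, v = a[3] + a[4]x + a[5]y, and a conservative fixed step size. All of these are illustrative choices, not the scheme used in the talk.

```python
def jacobian(ix, iy, x, y):
    """Derivative of the residual with respect to each of the six parameters."""
    return (ix, ix * x, ix * y, iy, iy * x, iy * y)


def residual(a, sample):
    x, y, ix, iy, it = sample
    u = a[0] + a[1] * x + a[2] * y
    v = a[3] + a[4] * x + a[5] * y
    return ix * u + iy * v + it


def error(a, samples):
    return sum(residual(a, s) ** 2 for s in samples)


def gradient(a, samples):
    grad = [0.0] * 6
    for s in samples:
        x, y, ix, iy, it = s
        r = residual(a, s)
        for k, j in enumerate(jacobian(ix, iy, x, y)):
            grad[k] += 2.0 * r * j  # chain rule: dE/da_k = sum 2 r dr/da_k
    return grad


def descend(samples, steps=200):
    """Fixed-step gradient descent starting from zero motion."""
    # 2 * sum |J|^2 is the trace of the Hessian, an upper bound on its
    # largest eigenvalue, so this step size cannot overshoot.
    bound = 2.0 * sum(j * j for x, y, ix, iy, it in samples
                      for j in jacobian(ix, iy, x, y))
    step = 1.0 / bound
    a = [0.0] * 6
    for _ in range(steps):
        g = gradient(a, samples)
        a = [ak - step * gk for ak, gk in zip(a, g)]
    return a
```

The step size here is chosen for safety rather than speed; faster schemes built on the same gradient information converge in far fewer iterations.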
Robust Estimation
  What about points that do not belong to the motion we are estimating?
  These will pull the solution away from the true one
Robust Estimators
  Robust estimators decrease the effect of outliers on estimation
Error w.r.t. Parameters
  The complete function is:
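A plausible form of the complete expression is sketched below, assuming the error is a sum of a robust function ρ over linearised residuals r and that the motion is a six-parameter affine model with u = a_1 + a_2 x + a_3 y and v = a_4 + a_5 x + a_6 y. The notation is an assumption, not a reconstruction of the original slide.

```latex
\[
r = I_x u + I_y v + I_t, \qquad
E(\mathbf{a}) = \sum_{(x,y)} \rho(r)
\]
\[
\frac{\partial E}{\partial a_k}
= \sum_{(x,y)} \psi(r)\,\frac{\partial r}{\partial a_k},
\qquad \psi(r) = \frac{d\rho}{dr}
\]
\[
\frac{\partial r}{\partial \mathbf{a}}
= \bigl( I_x,\; x I_x,\; y I_x,\; I_y,\; x I_y,\; y I_y \bigr)
\]
```

With ρ(r) = r² this reduces to the ordinary least-squares gradient, since ψ(r) = 2r.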
Aside – Influence Function
  It can be seen that the first derivative of the robust estimator is used in the minimisation:
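As a concrete example, the Geman-McClure function is one widely used robust estimator; the talk does not say which estimator was used, so its choice here and the `sigma` scale parameter are assumptions. `rho` is the estimator and `psi` is its first derivative, the influence function.

```python
def rho(r, sigma=1.0):
    """Geman-McClure robust error: close to r^2 / sigma^2 for small r, saturates at 1."""
    return r * r / (sigma * sigma + r * r)


def psi(r, sigma=1.0):
    """Influence function: the first derivative of rho with respect to r."""
    return 2.0 * r * sigma * sigma / (sigma * sigma + r * r) ** 2
```

For a squared error the influence 2r grows without limit, so a single large residual dominates the gradient. Here `psi` peaks near the scale `sigma` and then falls towards zero, so points that clearly belong to a different motion contribute almost nothing.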
Pyramid Approach
  Trying to estimate the parameters from scratch at full scale can be wasteful
  Therefore a ‘pyramid of resolutions’ or ‘Gaussian pyramid’ is used
  The principle is to estimate the parameters on a smaller scale and refine until full scale is reached
Pyramid of Resolutions
  Each level in the pyramid is half the scale of the one below – i.e. a quarter of the area
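Building such a pyramid can be sketched as follows. For brevity this version averages each 2x2 block, which is a box filter; a true Gaussian pyramid would blur with a Gaussian kernel before subsampling. The function names and the list-of-rows image format are assumptions.

```python
def downsample(image):
    """Halve width and height by averaging each 2x2 block of pixels."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [full scale, half scale, quarter scale, ...] with `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid
```

Estimation would start at the last (smallest) entry and work back towards the first. When parameters are carried from one level to the next finer one, translations measured in pixels double, because the same motion spans twice as many pixels.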
Out pops the solution…
  When combined with a suitable gradient-based minimisation scheme…
Problems with this Approach
  Resampling the background model:
    Model cannot be too complex
    Resampling will bring in errors
  Motion model is only an estimate of what is really happening
  Can lead to false object detection – particularly close to boundaries
Background Model Design
  The background model needs to be robust to these problems
  We need some way to differentiate between genuine object detections and false ones from motion model and background model errors
My Approach
  Rather than updating model values with the current, matched values, replace them
  In this way resampling errors are not allowed to accumulate
Aside – ‘Real Time’
  The ability to process a sequence in real-time is dependent on THREE key factors:
    The speed of the algorithm
    The frame rate required
    The number of pixels
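The three factors combine into a simple time budget. The sketch below uses a hypothetical 320x240 sequence at 25 frames per second purely as an example; neither figure comes from the talk.

```python
def pixel_rate(width, height, fps):
    """Pixels that must be processed per second."""
    return width * height * fps


def frame_budget_ms(fps):
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def pixel_budget_ns(width, height, fps):
    """Average time available per pixel, in nanoseconds."""
    return 1e9 / pixel_rate(width, height, fps)
```

At these assumed settings there are 1,920,000 pixels per second to process and 40 ms per frame. Halving the width and height, or halving the frame rate, relaxes the demand on algorithm speed accordingly.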
Recap
  Introduction to motion analysis
  Principles of background modelling
  Example of a static scene tracker
  Discussion of motion estimation
  Shortcomings when applied to background filtering