Tracking Multiple Occluding People by Localizing on Multiple Scene Planes Professor: 王聖智 Student: 周節
Outline Introduction Related work Localization Algorithm Tracking Algorithm Result Conclusion Reference
Introduction Surveillance Detection Tracking
Introduction If the object is isolated… Physical properties such as color, shape… are useful. Tracking is much simpler.
Introduction But if the objects are not isolated… –Occlusion –Lack of visibility –In crowded and cluttered scenes
Introduction Difficulty It is difficult to track individual people when occlusion occurs.
Related work Monocular Approach Multi-camera Approach
Monocular Approach Handling occlusions (1) Color and shape (2) Object contours and appearances (3) Blob split and merge analysis (4) ……
Monocular Approach Drawbacks (1) Cannot handle full occlusion. (2) Cannot handle long periods of occlusion. Therefore (1) A single view is limited. (2) A multi-camera approach is preferred.
Multi-camera Approach Methods (1) Construct a 3D model using voxels (2) Use calibrated cameras to obtain 3D locations (3) Camera switching (4) Stereo cameras (5) ……
Multi-camera Approach Problems (1) Features might be corrupted by occlusions. (2) People may have very similar color and shape. (3) Full occlusion may still occur. Therefore This paper…
Localization Algorithm 1. Obtain the foreground likelihood maps 2. Obtain the reference plane homographies 3. Fuse the likelihoods and compute the localization
Localization Algorithm 1. Obtain the foreground likelihood maps –Model the background using a Mixture of Gaussians. –Perform background subtraction to obtain foreground likelihood information.
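As a minimal sketch of step 1, the snippet below scores one grayscale pixel against a per-pixel mixture-of-Gaussians background model. The function name `foreground_likelihood`, the `(weight, mean, variance)` representation, the `min_weight` cut-off, and the mapping from background fit to a value in [0, 1] are all assumptions made for illustration; the slides do not specify them.

```python
import math

def foreground_likelihood(pixel, components, min_weight=0.2):
    """Score how poorly `pixel` fits a mixture-of-Gaussians background.

    components: list of (weight, mean, variance) tuples for this pixel.
    Only components with weight >= min_weight are treated as background
    modes (an assumption for this sketch).
    Returns a value in [0, 1]; higher means more likely foreground.
    """
    best_fit = 0.0
    for weight, mean, variance in components:
        if weight < min_weight:
            continue  # rarely observed mode, not treated as background
        # Squared normalized distance to this background mode.
        d2 = (pixel - mean) ** 2 / variance
        # Fit is 1 at the mode centre and decays with distance.
        best_fit = max(best_fit, math.exp(-0.5 * d2))
    # Assumed mapping: foreground likelihood is the complement of the best fit.
    return 1.0 - best_fit
```

Applying this to every pixel yields a likelihood map rather than a binary mask, so no threshold has to be chosen at this stage.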
Localization Algorithm 2. Obtain the reference plane homographies –Homography Constraint
Homography Constraint (R, T)
Homography Constraint
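The homography constraint says that image points of the same location on a scene plane are related between two views by a 3x3 matrix H. A minimal sketch of that mapping follows; the function name `apply_homography` and the nested-list representation of H are illustrative assumptions, not part of the slides.

```python
def apply_homography(H, point):
    """Map an image point (x, y) through a 3x3 homography H (nested lists)."""
    x, y = point
    # Lift to homogeneous coordinates (x, y, 1) and multiply by H.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Divide by the third coordinate to return to pixel coordinates.
    return (u / w, v / w)
```

For a pixel that images a point on the reference plane in one view, the result is the pixel that images the same plane point in the other view. H is defined only up to scale, which is why the division by `w` is needed.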
Localization Algorithm x_1, …, x_n are the observations in the images. X is the event that the pixel lies inside a foreground object. L(x_i) is the likelihood that observation x_i belongs to the foreground.
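These definitions can be combined into a fusion rule. As a sketch, assuming the observations are conditionally independent given X, that the prior P(X) is absorbed into the proportionality constant, and that P(x_i | X) is proportional to L(x_i):

```latex
P(X \mid x_1, \dots, x_n)
  \propto P(x_1, \dots, x_n \mid X)\, P(X)
  \propto \prod_{i=1}^{n} P(x_i \mid X)
  \propto \prod_{i=1}^{n} L(x_i)
```

A location therefore scores highly only if it warps to a likely foreground pixel in every view.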
Example
Localization Algorithm For robustness –Modeling Clutter and FOV Constraints
Example
Localization Algorithm For robustness –Localization at Multiple Planes
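Localization at multiple planes repeats the fusion step once per scene plane. The sketch below assumes the per-view foreground likelihood maps have already been warped into a common reference view with each plane's homographies, and that all maps have the same size. The names `fuse_views` and `localize_on_planes` and the nested-list layout are hypothetical.

```python
def fuse_views(warped_maps):
    """Multiply per-view likelihood maps (already warped, same size) pixel-wise."""
    rows, cols = len(warped_maps[0]), len(warped_maps[0][0])
    fused = [[1.0] * cols for _ in range(rows)]
    for view in warped_maps:
        for r in range(rows):
            for c in range(cols):
                # A cell stays high only if every view supports it.
                fused[r][c] *= view[r][c]
    return fused

def localize_on_planes(warped_by_plane):
    """Fuse the views separately for each plane; returns one map per plane.

    warped_by_plane[p][v] is the likelihood map of view v warped with the
    homography of plane p (hypothetical layout).
    """
    return [fuse_views(views) for views in warped_by_plane]
```

The outputs stay as likelihoods; any thresholding is left to a later stage.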
Tracking Algorithm Object trajectories are spatially and temporally coherent. T=0 T=10 T=20 T=30
Tracking Algorithm Spatially and temporally coherent
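To illustrate what spatial and temporal coherence buys, the sketch below stacks per-frame occupancy maps into a space-time volume and follows one person through it. The linking step is a simple greedy search in a small neighbourhood, used here as a stand-in; it is not the tracking method of the paper, and all names (`stack_occupancy`, `peak`, `link_track`, `radius`) are hypothetical.

```python
def stack_occupancy(maps):
    """Stack per-frame occupancy maps into a volume indexed volume[t][y][x]."""
    return [[row[:] for row in frame] for frame in maps]

def peak(frame):
    """Return (y, x) of the strongest occupancy value in one frame."""
    return max(
        ((y, x) for y in range(len(frame)) for x in range(len(frame[0]))),
        key=lambda p: frame[p[0]][p[1]],
    )

def link_track(volume, start, radius=1):
    """Follow one person through the volume from `start` = (y, x) at t = 0.

    At each frame, move to the highest-occupancy cell within `radius`
    of the previous position, so the track cannot jump to a distant person.
    """
    track = [start]
    for frame in volume[1:]:
        py, px = track[-1]
        candidates = [
            (y, x)
            for y in range(max(0, py - radius), min(len(frame), py + radius + 1))
            for x in range(max(0, px - radius), min(len(frame[0]), px + radius + 1))
        ]
        track.append(max(candidates, key=lambda p: frame[p[0]][p[1]]))
    return track
```

A typical call is `link_track(volume, peak(volume[0]))`, which starts from the strongest detection in the first frame.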
Tracking Algorithm
DEMO
Result Case 1 –Parking lot
Result Multi-camera improves performance.
Result Case 2 –Indoor
Result Multi-plane improves performance.
Result Case 3 –Basketball View limited –Error –Miss
Result Case 4 –Soccer Background limited –Error –Miss
Conclusion To resolve occlusions and localize people: Planar homography constraints are used to fuse foreground likelihood information. Detection and tracking are performed simultaneously in the space-time occupancy likelihood data.
Conclusion Advantage –Very good performance when occlusions occur. –No calibration is needed. –Only 2D constructs, purely image-based.
Conclusion Disadvantage The method may fail: –If a person’s appearance is very similar to the background. –If a person is occluded by some portion of the background itself. –If a part of the scene is occluded in all views by the foreground objects. Solutions –Add other models, such as a color model or a human motion model…
Reference (1) S. M. Khan and M. Shah, “Tracking Multiple Occluding People by Localizing on Multiple Scene Planes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 3, March 2009. (2) S. M. Khan and M. Shah, “A Multi-View Approach to Tracking People in Crowded Scenes Using a Planar Homography Constraint,” Proc. Ninth European Conf. Computer Vision, 2006.
Property 1. The method does not use color models or shape cues of individual people. 2. Detection and occlusion resolution are based on geometrical constructs and require only the distinction of foreground from background, which is obtained using standard background modeling techniques. 3. The core of the method is a planar homographic occupancy constraint that combines foreground likelihood information from different views to resolve occlusions and determine regions on scene planes that are occupied by people. 4. Occupied regions consistently warp (under homographies of the reference plane) to foreground regions in every view. 5. Foreground likelihood maps are used instead of binary foreground maps in order to delay the thresholding step to the last possible stage. 6. Multiple planes parallel to the reference plane are used to robustly localize scene objects. 7. To track, object scene occupancies are obtained for a window of time and stacked together, creating a space-time volume. 8. Tracking relies on an energy functional that combines scene occupancy information and spatio-temporal proximity. 9. Homographies induced by the reference plane between views are computed using SIFT feature matches and the RANSAC algorithm. 10. As a result, the approach is purely image-based and performs fusion in the image plane without going into 3D space, eliminating the need for fully calibrated cameras.
Monocular Approach Methods (1) Blob tracking (2) Shape and Color (3) Adaboost and Particle filter (4) ……