Online Multi-Object Tracking via Structural Constraint Event Aggregation
Ju Hong Yoon, Chang-Ryeol Lee, Ming-Hsuan Yang, Kuk-Jin Yoon
KETI; CV Lab., GIST; UC Merced
In CVPR 2016
Outline
Introduction
Structural Constraint Event Aggregation
  Structural constraint cost function
  Event aggregation
  Assignment event initialization and reduction
Two-Step Online MOT via SCEA
Experiments
Conclusion
Introduction
Multi-object tracking (MOT) relies on data association, performed detections-to-detections, detections-to-tracklets, or tracklets-to-tracklets.
Object appearances are used as an important cue for data association, but appearance alone is ambiguous when objects look similar.
Introduction
Motion model: with moving cameras, the observed motion is not always smooth or predictable.
Introduction
A new data association method:
Structural motion constraints between objects (location, velocity)
Event aggregation: reduces the assignment ambiguities caused by mis-detections
Introduction
Two-step online 2D MOT framework:
Step 1: structural constraint event aggregation
Step 2: infer and recover the missing objects
Using the structural constraints of objects between frames, the missing objects can be re-tracked from the objects tracked in the first step.
Structural Constraint Event Aggregation
The state of an object i at frame t consists of its position, velocity, and size.
The structural motion constraint between two objects describes their relative position and velocity.
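To make the idea of a structural motion constraint concrete, here is a minimal Python sketch. The `ObjectState` container, its field names, and the `shift` helper are assumptions made for illustration, not the notation of the slides. It shows that the relative location and velocity of two objects do not change when both are displaced by the same image translation, which is what a camera move roughly produces.

```python
from dataclasses import dataclass


@dataclass
class ObjectState:
    """Hypothetical object state: position, velocity, and size."""
    x: float
    y: float
    vx: float
    vy: float
    w: float
    h: float


def structural_constraint(a: ObjectState, b: ObjectState) -> tuple:
    """Relative location and velocity of object a with respect to object b."""
    return (a.x - b.x, a.y - b.y, a.vx - b.vx, a.vy - b.vy)


def shift(state: ObjectState, dx: float, dy: float) -> ObjectState:
    """Apply one image translation (e.g. a camera move) to an object."""
    return ObjectState(state.x + dx, state.y + dy,
                       state.vx, state.vy, state.w, state.h)


car = ObjectState(100.0, 50.0, 2.0, 0.0, 40.0, 20.0)
person = ObjectState(130.0, 60.0, 1.0, 0.5, 10.0, 30.0)

before = structural_constraint(person, car)
after = structural_constraint(shift(person, 25.0, -8.0),
                              shift(car, 25.0, -8.0))
print(before == after)  # the constraint is unchanged by the common shift
```

The absolute positions of both objects change by (25, -8), yet the constraint between them stays the same.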
Structural constraint cost function
The MOT task can be considered as a data association problem that finds the correct assignment event between objects i = 1, ..., N and detections k = 1, ..., M.
Structural constraint cost function
a_{i,k} = 1 if the detection k is assigned to the object i; otherwise a_{i,k} = 0.
The best assignment event is then estimated by minimizing the total assignment cost.
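Structural constraint cost function
For a detection k at frame t, we remove the time index t without loss of generality.
a_{i,0} stands for the case of a mis-detected object.
Each detection k is assigned to at most one object i (excluding k = 0).
Each object i is assigned to exactly one k (including k = 0).
The number of mis-detected objects cannot exceed the total number of objects.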
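The three constraints on an assignment event can be checked mechanically. The sketch below is illustrative only: the function name `is_valid_event` and the matrix layout (one row per object, column 0 for the mis-detection case a_{i,0}) are assumptions.

```python
def is_valid_event(a):
    """Check an assignment event a, where a[i][k] is 0 or 1.

    Row i is an object; column 0 is the mis-detection case a_{i,0};
    columns 1..M are the real detections.
    """
    n_objects = len(a)
    n_columns = len(a[0])
    # Each object i takes exactly one k, with k = 0 allowed.
    if any(sum(row) != 1 for row in a):
        return False
    # Each real detection k >= 1 goes to at most one object.
    for k in range(1, n_columns):
        if sum(a[i][k] for i in range(n_objects)) > 1:
            return False
    # Mis-detected objects cannot outnumber the objects themselves.
    return sum(row[0] for row in a) <= n_objects


one_to_one = [[0, 1, 0], [0, 0, 1]]   # both objects get their own detection
shared = [[0, 1, 0], [0, 1, 0]]       # detection 1 is used twice
both_missed = [[1, 0, 0], [1, 0, 0]]  # both objects are mis-detected
print(is_valid_event(one_to_one), is_valid_event(shared),
      is_valid_event(both_missed))
```

Column 0 is exempt from the "at most one object" rule, which is why several objects may be marked as mis-detected in the same event.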
Structural constraint cost function
Given an anchor assignment a_{i,k} = 1, the other objects are evaluated through their structural constraints with respect to the anchor.
The structural constraint cost therefore avoids the error caused by global camera motion.
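A small sketch may help show why anchoring removes the effect of camera motion. Everything here is a simplification chosen for illustration: the helper names are hypothetical, and the Euclidean distance to the closest detection stands in for whatever cost term the slides use. The point is only that, once one object is anchored to a detection, the others are located by their stored offsets rather than by absolute positions.

```python
import math


def anchored_predictions(anchor_det, offsets):
    """Place every other object relative to the detection used as anchor.

    anchor_det: (x, y) of the detection assigned to the anchor object.
    offsets: {object_id: (ox, oy)} relative locations w.r.t. the anchor.
    """
    ax, ay = anchor_det
    return {j: (ax + ox, ay + oy) for j, (ox, oy) in offsets.items()}


def structural_cost(anchor_det, offsets, detections):
    """Sum over the other objects of the distance to the closest detection."""
    total = 0.0
    for px, py in anchored_predictions(anchor_det, offsets).values():
        total += min(math.hypot(px - qx, py - qy) for qx, qy in detections)
    return total


# Previous frame: anchor object at (100, 50), two neighbours at fixed offsets.
offsets = {"j1": (30.0, 10.0), "j2": (-20.0, 5.0)}
# Current frame: the camera moved, shifting every detection by (40, -15).
detections = [(140.0, 35.0), (170.0, 45.0), (120.0, 40.0)]

good = structural_cost((140.0, 35.0), offsets, detections)  # correct anchor
bad = structural_cost((170.0, 45.0), offsets, detections)   # wrong anchor
print(good, bad)
```

With the correct anchor the cost is zero even though every object moved by (40, -15) in the image; a wrong anchor yields a clearly larger cost.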
Structural constraint cost function
Size cost and appearance cost.
The appearance cost compares the histogram of an object with p(d), the histogram of a detection; b is the bin index and B is the number of bins.
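The exact formulas are not recoverable from the slide text, so the following is only a sketch of one common way to build such terms. It assumes a Bhattacharyya-coefficient appearance cost over B histogram bins and a relative width/height difference as a size cost; both choices, and all function names, are assumptions rather than the authors' definitions.

```python
import math


def normalize(hist):
    """Scale raw bin counts so that they sum to 1."""
    total = float(sum(hist))
    return [v / total for v in hist]


def appearance_cost(p_obj, p_det, eps=1e-12):
    """-ln of the Bhattacharyya coefficient, summed over the bins b = 1..B."""
    coeff = sum(math.sqrt(po * pd) for po, pd in zip(p_obj, p_det))
    return -math.log(max(coeff, eps))


def size_cost(obj_wh, det_wh):
    """Mean relative width/height difference; 0 means identical size."""
    (wo, ho), (wd, hd) = obj_wh, det_wh
    return 0.5 * (abs(wo - wd) / (wo + wd) + abs(ho - hd) / (ho + hd))


p_object = normalize([4, 3, 2, 1])
p_similar = normalize([4, 3, 2, 1])
p_different = normalize([1, 2, 3, 4])

print(appearance_cost(p_object, p_similar))    # close to 0
print(appearance_cost(p_object, p_different))  # larger
print(size_cost((40.0, 20.0), (20.0, 20.0)))
```

Identical histograms give a cost near zero, and the cost grows as the histograms diverge.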
Structural constraint cost function τ=4
Event aggregation
Considering all possible assignment events is not computationally efficient.
Assignment event initialization and reduction
Gating with threshold τ = 0.7: if the gating conditions are satisfied, a_{i,k} = 1.
Assignment event initialization and reduction
Objects are divided into partitions, with at most 5 objects in each partition.
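To see why reducing and partitioning the events matters, it helps to count them. The sketch below counts the assignment events in which each object takes one detection or is missed and each detection is used at most once. The function name and the example sizes are illustrative; the comparison assumes that each partition only has to consider its own detections.

```python
from math import comb, factorial


def count_events(n_objects, n_detections):
    """Number of valid assignment events for N objects and M detections.

    Choose m objects to be detected, choose m detections for them,
    and match them in m! ways; all remaining objects are mis-detected.
    """
    return sum(
        comb(n_objects, m) * comb(n_detections, m) * factorial(m)
        for m in range(min(n_objects, n_detections) + 1)
    )


full = count_events(10, 10)        # one group of 10 objects
split = 2 * count_events(5, 5)     # two partitions of 5 objects each
print(full, split)
```

The count for a single group of 10 objects is several orders of magnitude larger than the combined count for two partitions of 5.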
Recovery of mis-detected objects: the objects tracked in the previous frame have also been updated with their corresponding detections.
Two-Step Online MOT via SCEA
Two-Step Online MOT via SCEA
D: the set of not-assigned detections and dummy detections d_0
Two-Step Online MOT via SCEA
For each missing object, we select the tracked object moving with the most similar direction and velocity.
Two-Step Online MOT via SCEA
The assignment between the missing objects and the detections in D is solved with the Hungarian algorithm.
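The recovery step on these slides can be sketched roughly as follows. This is a simplified illustration under several assumptions: the dictionary layout and helper names are hypothetical, dummy detections are left out, the cost is a plain Euclidean distance, and an exhaustive search over permutations stands in for the Hungarian algorithm (it finds the same minimum-cost assignment, but is only practical for a handful of objects).

```python
import math
from itertools import permutations


def most_similar_mover(missing_vel, tracked):
    """Pick the tracked object whose velocity is closest to the missing one's."""
    return min(
        tracked,
        key=lambda t: math.hypot(t["vel"][0] - missing_vel[0],
                                 t["vel"][1] - missing_vel[1]),
    )


def predict_missing(partner_pos, offset):
    """Missing object = partner's current position + stored relative offset."""
    return (partner_pos[0] + offset[0], partner_pos[1] + offset[1])


def best_assignment(cost):
    """Exhaustive minimum-cost assignment for a small square cost matrix."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))


# Objects tracked in the first step (made-up numbers).
tracked = [
    {"id": "a", "pos": (140.0, 35.0), "vel": (2.0, 0.0)},
    {"id": "b", "pos": (60.0, 80.0), "vel": (-3.0, 1.0)},
]
# Two missing objects with their last velocity and per-partner offsets.
missing = [
    {"vel": (1.8, 0.1), "offsets": {"a": (30.0, 10.0), "b": (110.0, -35.0)}},
    {"vel": (-2.9, 0.8), "offsets": {"a": (-75.0, 25.0), "b": (5.0, -20.0)}},
]
unassigned = [(66.0, 61.0), (171.0, 44.0)]  # detections left after step one

predicted = []
for m in missing:
    partner = most_similar_mover(m["vel"], tracked)
    predicted.append(predict_missing(partner["pos"], m["offsets"][partner["id"]]))

cost = [[math.hypot(px - dx, py - dy) for dx, dy in unassigned]
        for px, py in predicted]
print(predicted, best_assignment(cost))
```

Each missing object is located from the tracked partner with the most similar motion, and the remaining detections are then distributed so that the total distance is minimal.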
Two-Step Online MOT via SCEA
The final tracking result is updated with a Kalman filter for smoothing; the measurement is the location of the detection assigned to the object i.
Two-Step Online MOT via SCEA
Structural constraint update: the variations of the structural constraints are updated indirectly with a standard Kalman filter.
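As a rough illustration of a Kalman update applied to a structural constraint, the sketch below filters a single scalar, the horizontal offset between two objects, measured as the difference between their assigned detections. It is a deliberately reduced version: a scalar filter with a static state model, and the noise variances `q` and `r` are illustrative values, not parameters taken from the slides.

```python
def kalman_step(x, p, z, q=0.01, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    x, p: previous estimate and its variance
    z:    new measurement
    q, r: process and measurement noise variances (illustrative values)
    """
    p_pred = p + q                 # predict: state kept, uncertainty grows
    gain = p_pred / (p_pred + r)   # Kalman gain
    x_new = x + gain * (z - x)     # correct the estimate with the innovation
    p_new = (1.0 - gain) * p_pred  # uncertainty shrinks after the update
    return x_new, p_new


# Horizontal offset between objects i and j, initial estimate 30 pixels.
offset, var = 30.0, 1.0
# x-coordinates of the detections assigned to i and j over three frames.
for det_i_x, det_j_x in [(171.0, 140.0), (173.0, 141.0), (176.0, 143.0)]:
    offset, var = kalman_step(offset, var, det_i_x - det_j_x)
print(offset, var)
```

The estimate follows the measured offsets (31, 32, 33) smoothly instead of jumping to each new value, which is the smoothing effect a Kalman update provides.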
Two-Step Online MOT via SCEA
Object management:
Add new objects (velocity = 0) when the distances and the appearance differences between a detection in the current frame and unassociated detections in the past few frames are smaller than a certain threshold.
Delete objects if they are not associated with any detections for two frames (e.g., 4).
Experiments
Data association evaluation
Efficiency of the event reduction
Comparisons with State-of-the-Art Methods
Data association evaluation
RMN - Relative Motion Network [29]
LM - Linear Motion (baseline, without the structural constraints or event aggregation)
SCNN - Structural Constraint Nearest Neighbor (without event aggregation)
[29] J. H. Yoon, M.-H. Yang, J. Lim, and K.-J. Yoon. Bayesian multi-object tracking using motion context from multiple objects. In WACV, 2015.
Data association evaluation
ETH sequences (Bahnhof, Sunnyday, and Jelmoli) [8], including at most 10 false detections per frame.
RMN performs well when the level of false detections is low.
[8] A. Ess, B. Leibe, K. Schindler, and L. Van Gool. A mobile vision system for robust multi-person tracking. In CVPR, 2008.
Efficiency of the event reduction with the gating technique
Comparisons with State-of-the-Art Methods
Compared methods: MDP [26], TC ODAL [1], RMOT [29], NOMT-HM [5], ODAMOT [11]
[1] S.-H. Bae and K.-J. Yoon. Robust online multi-object tracking based on tracklet confidence and online discriminative appearance learning. In CVPR, 2014.
[5] W. Choi. Near-online multi-target tracking with aggregated local flow descriptor. In ICCV, 2015.
[11] A. Gaidon and E. Vig. Online domain adaptation for multi-object tracking. In BMVC, 2015.
[26] Y. Xiang, A. Alahi, and S. Savarese. Learning to track: Online multi-object tracking by decision making. In ICCV, 2015.
[29] J. H. Yoon, M.-H. Yang, J. Lim, and K.-J. Yoon. Bayesian multi-object tracking using motion context from multiple objects. In WACV, 2015.
Comparisons with State-of-the-Art Methods
Evaluation metrics (10 in total):
MOTA - Multiple Object Tracking Accuracy
MOTP - Multiple Object Tracking Precision
MT - the number of mostly tracked trajectories
ML - the number of mostly lost trajectories
FG - the number of fragments
ID - the number of identity switches
Rec - the recall
Prec - the precision
sec/Hz - the runtime
AR - the average ranking
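For reference, three of these metrics can be computed from counts accumulated over all frames, using the standard CLEAR MOT definition of MOTA. The function name and the numbers below are made up for illustration and are not results from the paper.

```python
def clear_mot(num_gt, false_neg, false_pos, id_switches):
    """MOTA, recall, and precision from counts summed over all frames.

    num_gt:      total number of ground-truth objects
    false_neg:   missed ground-truth objects
    false_pos:   tracker outputs with no matching ground truth
    id_switches: identity switches
    """
    true_pos = num_gt - false_neg
    mota = 1.0 - (false_neg + false_pos + id_switches) / num_gt
    recall = true_pos / num_gt
    precision = true_pos / (true_pos + false_pos)
    return mota, recall, precision


# Made-up counts, for illustration only.
mota, recall, precision = clear_mot(num_gt=1000, false_neg=200,
                                    false_pos=100, id_switches=10)
print(mota, recall, precision)
```

MOTA combines misses, false positives, and identity switches into one score, so it can be negative when the errors outnumber the ground-truth objects.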
Comparisons with State-of-the-Art Methods
Benchmark datasets:
KITTI dataset [12]: 29 sequences
Detections: DPM [10], regionlet [24]
MOT Challenge dataset [17]: 22 sequences
[10] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9):1627-1645, 2010.
[12] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun. Vision meets robotics: The KITTI dataset. IJRR, 2013.
[17] L. Leal-Taixé, A. Milan, I. Reid, S. Roth, and K. Schindler. MOTChallenge 2015: Towards a benchmark for multi-target tracking. arXiv:1504.01942, 2015.
[24] X. Wang, M. Yang, S. Zhu, and Y. Lin. Regionlets for generic object detection. In ICCV, 2013.
ODAMOT: uses an additional local detector to deal with missing objects caused by partial occlusions.
NOMT-HM: uses optical flow information to reduce ambiguities caused by the similar appearance of objects.
For pedestrians, the motion cue (the optical flow) becomes less discriminative when the motion of objects is small.
Comparisons with State-of-the-Art Methods
TC ODAL: uses a linear motion model to link the tracklets based on the Hungarian algorithm.
MDP: learns the target state (Active, Tracked, Lost, and Inactive) from a training dataset and its ground truth in an online manner.
SCEA does not require any training dataset, and it runs faster.
Comparisons with State-of-the-Art Methods
MDP-KITTI: MDP trained on the KITTI dataset
MDP-MOTC: the trained model provided by the authors with the original source code
Conclusion
Structural motion constraints - large camera motion
Event aggregation - assignment ambiguities
Two-step algorithm - recover missing objects
Thanks for listening!