Multiple Object Tracking Using K-Shortest Paths Optimization
PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 33, NO. 9, SEPTEMBER
OUTLINE
INTRODUCTION
ALGORITHM
RESULTS
CONCLUSION

INTRODUCTION
Multi-object tracking can be decomposed into two separate steps:
1) time-independent detection, using a generative model or a machine learning-based algorithm
2) linking detections into the most likely trajectories
While it is easy to design a statistical trajectory model with all of the necessary properties for good filtering, finding the most likely set of trajectories under such a model is NP-complete.

INTRODUCTION
This has been dealt with in the literature by:
1) sampling and particle filtering
2) linking short tracks generated using Kalman filtering
3) greedy Dynamic Programming
None of these approaches guarantees a global optimum.

INTRODUCTION
A notable exception is "A Linear Programming Approach for Multiple Object Tracking" [4], which has two limitations:
1) the number of objects being tracked must be specified a priori
2) it does not handle the case where the detector produces false negatives
Reformulating the linking step as a constrained flow optimization results in a convex problem that fits into a standard Linear Programming framework.
However, this yields a very large system that is hardly tractable.

INTRODUCTION
Due to its particular structure, the problem can be solved with the k-shortest paths algorithm, which yields real-time performance on realistically sized problems.
Our method does not present any of the limitations mentioned above, nor does it require an appearance model (one is optional).
Moreover, it is far simpler both formally and algorithmically than existing techniques.

INTRODUCTION
It performs well in two difficult real-world scenarios:
1) tracking multiple balls of similar color, which is a case where an appearance model would not help
2) tracking multiple people with multiple cameras set at shoulder level so that there are significant occlusions
In both cases, we use an object detector that produces a probabilistic occupancy map (POM).

ALGORITHM
1) Formulate multitarget tracking as an Integer Programming (IP) problem, which is NP-hard in many cases.
2) Relax it to a Linear Program, which can be solved in polynomial time. However, the large number of variables and constraints makes it tractable only for small areas and short sequences.
3) Solve it with the k-shortest paths algorithm.

Formalization
incoming flows?

Formalization
In general, some objects may appear inside the tracking area and others may leave.
Thus, the total mass of the system changes, and we must allow flows to enter and exit the area.

Formalization
Two virtual locations, a source and a sink, let flows enter and exit the area.

Formalization
Our goal: find the most likely map among the feasible ones, i.e., those that satisfy (1), (2), (3), and (4).

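The constraints referred to as (1) to (4) did not survive on these slides. As a hedged reconstruction, flow constraints of this kind are usually written as below. The notation is an assumption: m_k^t is the number of objects at location k at time t, f_{k,j}^t is the number of objects moving from k at time t to j at time t+1, N(k) is the neighborhood of k, and υ_source, υ_sink are the virtual locations. The ordering may not match the original numbering.

```latex
\begin{align}
\sum_{i \,:\, k \in \mathcal{N}(i)} f_{i,k}^{t-1} \;=\; m_k^{t} \;&=\; \sum_{j \in \mathcal{N}(k)} f_{k,j}^{t}
  && \text{(flow conservation at every location and time)} \\
\sum_{j \in \mathcal{N}(k)} f_{k,j}^{t} \;&\le\; 1
  && \text{(at most one object per location)} \\
f_{k,j}^{t} \;&\ge\; 0
  && \text{(flows are non-negative)} \\
\sum_{j \in \mathcal{N}(\upsilon_{\mathrm{source}})} f_{\upsilon_{\mathrm{source}},\,j}
  \;&=\; \sum_{k \,:\, \upsilon_{\mathrm{sink}} \in \mathcal{N}(k)} f_{k,\,\upsilon_{\mathrm{sink}}}
  && \text{(whatever enters through the source leaves through the sink)}
\end{align}
```

Read this way, a map is feasible when some set of flows satisfying all four constraints produces it.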
Linear Programming Formulation
The formulation defined above translates naturally into an Integer Program.
Why do inequality constraints suffice? If a ≥ b ≥ … ≥ c ≥ d and a ≤ d, then a = b = … = c = d.

Linear Programming Formulation
This new formulation is strictly equivalent to the original one, and no additional constraint is needed.
The inequalities are indeed sufficient to ensure that no flow can ever appear or disappear within the graph.
Under this formulation, our Integer Program can be solved by any generic LP solver.
Thanks to total unimodularity, the Integer Program (NP-complete in general) reduces to a Linear Program (polynomial).

Unimodular Matrix
A unimodular matrix is a square integer matrix whose determinant is +1 or −1.

Total Unimodularity
A totally unimodular matrix (TU matrix) is a matrix for which every square non-singular (i.e., invertible) submatrix is unimodular.
From the definition it follows that any totally unimodular matrix has only 0, +1, or −1 entries (the converse is not true).
If A is totally unimodular and b is integral, every extreme point of the feasible region is integral, and thus the feasible region is an integral polyhedron.

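The definition can be checked by brute force on small matrices. The sketch below is only an illustration (the function names and the two example matrices are not from the slides): it tests that every square submatrix has determinant −1, 0, or +1, which is equivalent to the definition above. It also shows a matrix with entries in {0, 1} that is not totally unimodular, illustrating why the converse fails.

```python
from itertools import combinations

def det(m):
    """Integer determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total

def is_totally_unimodular(a):
    """Brute force: every square submatrix must have determinant in {-1, 0, +1}."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True

# Node-arc incidence matrix of a small directed graph (rows: nodes, columns: edges).
# Each column has exactly one +1 and one -1, a classic totally unimodular structure.
incidence = [
    [ 1,  1,  0,  0],
    [-1,  0,  1,  0],
    [ 0, -1,  0,  1],
    [ 0,  0, -1, -1],
]

# All entries are 0 or 1, yet the full 3x3 determinant is 2, so this matrix is not TU.
counterexample = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]

print(is_totally_unimodular(incidence), is_totally_unimodular(counterexample))
```

The exhaustive search is exponential in the matrix size, so it is only meant for building intuition on toy examples.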
Linear Programming Formulation
Generic LP solvers: the Simplex algorithm [5] and interior-point-based methods [36].
However, they have very high worst-case time complexities, so this approach would only be tractable for moderately sized problems and does not scale to most practical applications.

K-Shortest Paths Formulation
directed acyclic graph (DAG)

K-Shortest Paths Formulation
The cost value of the edges emanating from the source node is set to zero.
Node-disjoint paths from the source to the sink correspond to feasible solutions of the original LP formulation of (11).

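To make the path formulation concrete, here is a minimal sketch on a toy 1D grid. It is not the authors' implementation: it swaps in successive shortest augmenting paths with Bellman-Ford, stops as soon as one more path would not lower the total cost, and enforces node-disjointness by splitting every node. The function name, the toy probabilities, the cost −log(ρ/(1−ρ)) per occupied location, and the simplification that the source only feeds the first frame and the sink only drains the last one are all assumptions made for illustration.

```python
import math

def k_shortest_disjoint_paths(pom):
    """pom[t][i] is the detection probability of location i at frame t (1D toy grid).
    Returns node-disjoint trajectories, each a list with one location per frame."""
    T, K = len(pom), len(pom[0])
    n = 2 * T * K + 2                  # every (t, i) is split into an in-node and an out-node
    S, D = n - 2, n - 1                # virtual source and sink
    edges = []                         # [u, v, capacity, cost]; edge e and its reverse e ^ 1 are adjacent

    def add_edge(u, v, cost):
        edges.append([u, v, 1, cost])
        edges.append([v, u, 0, -cost])

    for t in range(T):
        for i in range(K):
            node = t * K + i
            p = min(max(pom[t][i], 1e-6), 1 - 1e-6)
            # Unit capacity on the split edge allows at most one object per location.
            add_edge(2 * node, 2 * node + 1, -math.log(p / (1 - p)))
            if t == 0:
                add_edge(S, 2 * node, 0.0)          # zero-cost edges leaving the source
            if t == T - 1:
                add_edge(2 * node + 1, D, 0.0)
            else:
                for j in range(max(0, i - 1), min(K, i + 2)):   # neighbors in the next frame
                    add_edge(2 * node + 1, 2 * ((t + 1) * K + j), 0.0)

    while True:
        # Bellman-Ford on the residual graph (reverse edges carry negative costs).
        dist, prev = [math.inf] * n, [-1] * n
        dist[S] = 0.0
        for _ in range(n - 1):
            changed = False
            for e, (u, v, cap, cost) in enumerate(edges):
                if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                    dist[v], prev[v], changed = dist[u] + cost, e, True
            if not changed:
                break
        if dist[D] >= 0:               # one more path would no longer lower the total cost
            break
        v = D
        while v != S:                  # push one unit of flow along the path found
            e = prev[v]
            edges[e][2] -= 1
            edges[e ^ 1][2] += 1
            v = edges[e][0]

    # Read the trajectories back from the saturated forward edges.
    succ, starts = {}, []
    for e in range(0, len(edges), 2):
        u, v, cap, _ = edges[e]
        if cap == 0:
            if u == S:
                starts.append(v)
            else:
                succ[u] = v
    paths = []
    for v in starts:
        path = []
        while v != D:
            if v % 2 == 0:             # in-nodes have even ids
                path.append((v // 2) % K)
            v = succ[v]
        paths.append(path)
    return paths

# Two objects on a 5-cell line over 3 frames: one moves 0 -> 1 -> 2, the other 4 -> 3 -> 3.
toy_pom = [
    [0.90, 0.01, 0.01, 0.01, 0.80],
    [0.01, 0.90, 0.01, 0.85, 0.01],
    [0.01, 0.01, 0.90, 0.80, 0.01],
]
print(sorted(k_shortest_disjoint_paths(toy_pom)))
```

Note that the number of trajectories is not an input: the loop stops on its own once an additional path would have non-negative cost, which is how the number of objects gets estimated in this sketch.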
Batch Processing and Complexity Reduction
Processing a whole video sequence is possible but impractical for applications such as broadcasting, in which the result must be supplied quickly.
We split the sequence into batches of 100 frames.
This results in a constant 4-second delay between input and output, which is nevertheless compatible with many applications.
To enforce temporal consistency across batches, we add the last frame of the previously optimized batch to the current one.

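The batching scheme can be sketched as follows. The function names are hypothetical, and the 25 fps default is only inferred from the stated figures (100 frames for a 4-second delay). The sketch shows the frame ranges only; how the shared frame constrains the next optimization is not covered.

```python
def make_batches(num_frames, batch_size=100):
    """Return half-open (start, end) frame ranges.
    Every batch after the first also includes the last frame of the previous batch."""
    batches = []
    start = 0
    while start < num_frames:
        end = min(start + batch_size, num_frames)
        first = start - 1 if start > 0 else 0   # prepend the previous batch's last frame
        batches.append((first, end))
        start = end
    return batches

def batch_delay_seconds(batch_size=100, fps=25.0):
    """Delay introduced by waiting for a full batch before optimizing it (fps is an assumption)."""
    return batch_size / fps

print(make_batches(250))
print(batch_delay_seconds())
```

For a 250-frame sequence this yields three batches, the second and third each starting one frame early so that they overlap with the batch before them.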
Batch Processing and Complexity Reduction
Since most of the probabilities of presence estimated by the detector are virtually equal to zero, we can reduce the number of nodes.
In the examples presented in this paper, we have not found it necessary to do so.

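The slide does not say how the nodes would be reduced. One plausible scheme, shown here purely as an assumption, keeps a node only if some detection probability within a small spatio-temporal window around it reaches a threshold. The function name, the threshold, and the window radius are all hypothetical.

```python
def prune_nodes(pom, threshold=1e-3, radius=1):
    """Return the set of (frame, location) nodes to keep on a 1D toy grid.
    A node is kept when any probability within `radius` cells and frames is >= threshold."""
    T, K = len(pom), len(pom[0])
    keep = set()
    for t in range(T):
        for i in range(K):
            window = (
                pom[s][j]
                for s in range(max(0, t - radius), min(T, t + radius + 1))
                for j in range(max(0, i - radius), min(K, i + radius + 1))
            )
            if max(window) >= threshold:
                keep.add((t, i))
    return keep

sparse_pom = [
    [0.9, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
print(sorted(prune_nodes(sparse_pom)))
```

Keeping a margin around each detection, rather than thresholding cells individually, leaves room for trajectories to pass through cells where the detector missed the object.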
RESULTS
First, we use a multicamera setup in which the cameras are located at shoulder level to track pedestrians who may walk in front of each other.
As a result, our approach was shown to compare favorably against other state-of-the-art algorithms in the PETS 2009 evaluation [37].
Second, to highlight the fact that we do not depend on an appearance model, we track sets of similar-looking bouncing balls seen from above.
We compare against sequential Dynamic Programming and show that we can obtain good results even when using a single camera.

Probabilistic Occupancy Map
We used the publicly available implementation [40] of our earlier POM algorithm [3] to create the detection data needed as input by our tracker.
This method performs binary background/foreground segmentation and then uses a generative model to estimate the most likely locations.
The generative model at the heart of POM represents people as cylinders that project to rectangles in the images.
[40] POM: Probabilistic Occupancy Map

Probabilistic Occupancy Map
In our model, the resolution of the ground grid is independent of the target’s size.
If grid cells are smaller than a target, the detections do not spread over several cells: POM implicitly performs a nonmaximum suppression, so its output is peaky.

Evaluation Metrics
Video Analysis and Content Extraction (VACE) program:
1) Sequence Frame Detection Accuracy (SFDA)
2) Average Tracking Accuracy (ATA)
3) SODA
Classification of Events, Activities, and Relationships (CLEAR) consortium:
1) Multiple Object Detection Accuracy (MODA)
2) Multiple Object Detection Precision (MODP)
3) Multiple Object Tracking Accuracy (MOTA)
4) Multiple Object Tracking Precision (MOTP)

Evaluation Metrics
MODP: the quality of the bounding box alignment in case of correct detection
MODA: false positives and missed detections
MOTP: the alignment of tracks with the ground truth
MOTA: false positives, missed detections, and identity switches

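As a rough illustration of how the two accuracy scores combine the error types listed above, the sketch below uses the common unweighted form, with counts summed over the whole sequence. This is an assumption: the definitions used in the actual evaluation may weight the terms or treat identity switches differently. The function names are hypothetical.

```python
def moda(misses, false_positives, num_ground_truth):
    """Detection accuracy: penalizes missed detections and false positives."""
    return 1.0 - (misses + false_positives) / num_ground_truth

def mota(misses, false_positives, id_switches, num_ground_truth):
    """Tracking accuracy: additionally penalizes identity switches."""
    return 1.0 - (misses + false_positives + id_switches) / num_ground_truth

# Hypothetical counts for a sequence with 100 ground-truth object instances.
print(moda(5, 3, 100), mota(5, 3, 2, 100))
```

Both scores equal 1 for a perfect result and can become negative when the errors outnumber the ground-truth instances.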
Test Data
1) Laboratory Sequence
2) Basketball Sequence
3) Passageway Sequences
4) PETS 2009 Sequence

Test Data
Monocular Pedestrian

Test Data
Ball Tracking

Evaluation Metrics
To quantify our results, we manually labeled some of the test sequences:

Sequences                | Frames                           | Frames per Labeled
Ball * 2                 | 1,000                            | 3
PETS 2009 sequence S2/L1 | 800                              | 5
Passageway * 4           | 2,500, 800, 900, and [truncated] | [truncated]
Laboratory               | 5,[truncated]                    | [truncated]

Failure Modes
Our tracking algorithm can be mainly affected by two elements: false detections and missing ones.
We selected a 1,000-frame excerpt of the laboratory sequence and:
1) added various levels of random detection noise uniformly
2) randomly deleted detections from the same original sequence

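The two perturbations can be mimicked on a toy occupancy map as sketched below. The exact noise model used in the experiments is not given on the slide, so everything here is an assumption: the function names, the choice of setting a noisy cell to probability 1.0, and the 0.5 threshold used to decide what counts as a detection.

```python
import random

def add_false_detections(pom, noise_rate, rng):
    """Turn each cell into a spurious detection with probability noise_rate, uniformly over cells."""
    return [[1.0 if rng.random() < noise_rate else p for p in frame] for frame in pom]

def delete_detections(pom, drop_rate, rng, threshold=0.5):
    """Erase each existing detection (probability above threshold) with probability drop_rate."""
    return [
        [0.0 if p > threshold and rng.random() < drop_rate else p for p in frame]
        for frame in pom
    ]

rng = random.Random(0)                      # fixed seed so the experiment is repeatable
clean = [[0.9, 0.0, 0.0], [0.0, 0.8, 0.0]]
noisy = add_false_detections(clean, 0.1, rng)
sparse = delete_detections(clean, 0.3, rng)
print(noisy, sparse)
```

Both functions return a new map and leave the input untouched, so the same clean sequence can be reused for every noise level.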
200-frame excerpt of the laboratory sequence: linear

CONCLUSION
Combining frame-by-frame detections to estimate the most likely trajectories of an unknown number of targets, including their entrances and departures to and from the scene, is one of the most difficult components of a multiobject tracking algorithm.
Formalizing the motions of targets as flows leads to a standard Linear Programming problem.
The k-shortest paths algorithm solves it efficiently, performing robust multi-object tracking in real time.

CONCLUSION
The resulting algorithm is far simpler than current state-of-the-art alternatives.
It ensures that a global optimum can be found.
Future work will focus on integrating additional cues into our framework, such as appearance or motion models, to robustly handle the identities of intersecting trajectories.

Thanks for Listening!