Tracking by Sampling Trackers
Junseok Kwon* and Kyoung Mu Lee
Computer Vision Lab., Dept. of EECS, Seoul National University, Korea
Goal of Visual Tracking
Robustly track the target in real-world scenarios.
[Figure: tracking example, frame #1 and frame #43]
Bayesian Tracking Approach
Maximum a Posteriori (MAP) estimate:
  X̂_t = arg max_{X_t} p(X_t | Y_1:t)
[Figure: intensity and edge observations]
State Sampling
MAP estimate by Monte Carlo sampling: samples are drawn in the state space (x position, y position, scale), guided by the visual tracker.
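The MAP-by-sampling idea above can be sketched in a few lines. Everything here (the Gaussian-shaped `likelihood`, the uniform proposal, the box bounds) is an illustrative stand-in, not the paper's tracker:

```python
import math
import random

def likelihood(state):
    # Toy observation model that peaks when the state matches a known
    # target at (x=50, y=30, scale=1.0). In the paper the score comes
    # from the appearance/observation models; this is a stand-in.
    x, y, scale = state
    return math.exp(-((x - 50) ** 2 + (y - 30) ** 2
                      + 100 * (scale - 1.0) ** 2) / 200.0)

def map_by_sampling(n_samples=5000, seed=0):
    """Approximate the MAP state by drawing Monte Carlo samples in the
    state space (x position, y position, scale) and keeping the best."""
    rng = random.Random(seed)
    best_state, best_score = None, -1.0
    for _ in range(n_samples):
        state = (rng.uniform(0, 100),      # x position
                 rng.uniform(0, 60),       # y position
                 rng.uniform(0.5, 1.5))    # scale
        score = likelihood(state)
        if score > best_score:
            best_state, best_score = state, score
    return best_state
```

With more samples the estimate approaches the true MAP state; the paper's point is that the quality of these samples depends on the tracker that guides them.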
Problem of Previous Works
Conventional trackers have difficulty in obtaining good samples: the tracker is fixed while the tracking environment changes, so it cannot reflect the changing environment well.
Our Approach: Tracker Sampling
Sample the tracker itself as well as the state: trackers #1 ... #M are drawn from a tracker space, and each sampled tracker performs its own state sampling over (x position, y position, scale).
Two Challenges
1. How is the tracker space defined?
2. When and which tracker should be sampled?
Challenge 1: Tracker Space
No previous work has tried to define a tracker space. Designing the space is very difficult because a visual tracker is hard to describe.
Bayesian Tracking Approach
Go back to the Bayesian tracking formulation. Updating rule:
  p(X_t | Y_1:t) ∝ p(Y_t | X_t) ∫ p(X_t | X_t-1) p(X_t-1 | Y_1:t-1) dX_t-1
Bayesian Tracking Approach
What are the important ingredients of a visual tracker?
1. Appearance model
2. Motion model
3. State representation type
4. Observation type
Tracker Space
The tracker space is spanned by four components: appearance model, motion model, state representation, and observation.
Challenge 2: Tracker Sampling
When and which tracker (over the appearance model, motion model, state representation type, and observation type) should be sampled from the tracker space, so as to reflect the current tracking environment?
Reversible Jump MCMC
We use the RJ-MCMC method for tracker sampling. Add/Delete moves are applied to each of the four sets:
- Set of sampled appearance models
- Set of sampled motion models
- Set of sampled state representation types
- Set of sampled observation types
Their combinations yield the sampled basic trackers.
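A minimal sketch of these add/delete moves, assuming a generic `score` function in place of the paper's likelihood over recent frames; `rjmcmc_sample_set` and its arguments are hypothetical names for illustration, not the authors' code:

```python
import random

def rjmcmc_sample_set(candidates, score, n_iters=300, seed=0):
    """Sample a subset of tracker components (e.g. appearance models)
    via reversible-jump add/delete moves. `score(subset)` stands in for
    how well the subset explains recent frames."""
    rng = random.Random(seed)
    current = {candidates[0]}          # start from a single component
    best = set(current)
    for _ in range(n_iters):
        proposal = set(current)
        missing = [c for c in candidates if c not in proposal]
        if missing and (len(proposal) == 1 or rng.random() < 0.5):
            proposal.add(rng.choice(missing))                # Add move
        else:
            proposal.remove(rng.choice(sorted(proposal)))    # Delete move
        # Metropolis-Hastings acceptance for the proposed jump.
        if rng.random() < min(1.0, score(proposal) / score(current)):
            current = proposal
            if score(current) > score(best):
                best = set(current)    # remember the best subset seen
    return best
```

Because the moves change the dimensionality of the sampled set, a reversible-jump acceptance step is needed rather than plain Metropolis-Hastings; the ratio above is the simplified symmetric-proposal case.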
Sampling of Appearance Model
Make candidates using Sparse Principal Component Analysis (SPCA)*. The candidates are principal components of the target appearance.
* A. d'Aspremont et al. A direct formulation for sparse PCA using semidefinite programming. SIAM Review.
Sampling of Appearance Model
Accept an appearance model with the acceptance ratio. Our method keeps a limited number of models.
An appearance model is accepted if, when it is adopted as the target reference, it increases the total likelihood scores over recent frames.
Sampling of Motion Model
Make candidates using K-Harmonic Means clustering (KHM)*. The candidates are the mean vectors of the clusters of motion vectors.
* B. Zhang, M. Hsu, and U. Dayal. K-harmonic means - a data clustering algorithm. HP Technical Report, 1999.
Sampling of Motion Model
Accept a motion model with the acceptance ratio. Our method keeps a limited number of models.
A motion model is accepted if, when it is set to the mean vector of its cluster, it decreases the total clustering error of the motion vectors over recent frames.
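K-harmonic means can be sketched as follows for two clusters of 2-D motion vectors; the initialization from the first and last point and the parameter p = 3.5 are choices for this sketch, not taken from the paper:

```python
import math

def khm_cluster(points, p=3.5, n_iters=50):
    """Sketch of K-harmonic means (Zhang et al., 1999) for two clusters
    of motion vectors; the returned cluster means serve as candidate
    motion models."""
    eps = 1e-9
    centers = [list(points[0]), list(points[-1])]  # simple spread init
    dim = len(points[0])
    for _ in range(n_iters):
        num = [[0.0] * dim for _ in centers]
        den = [0.0] * len(centers)
        for x in points:
            d = [max(math.dist(x, c), eps) for c in centers]
            inv_p2 = [di ** (-p - 2) for di in d]     # d^(-p-2) terms
            inv_p = [di ** (-p) for di in d]          # d^(-p) terms
            weight = sum(inv_p2) / sum(inv_p) ** 2    # harmonic point weight
            for j in range(len(centers)):
                w = weight * inv_p2[j] / sum(inv_p2)  # soft membership
                den[j] += w
                for t in range(dim):
                    num[j][t] += w * x[t]
        centers = [[num[j][t] / den[j] for t in range(dim)]
                   for j in range(len(centers))]
    return centers
```

KHM's harmonic-mean objective makes it far less sensitive to initialization than plain k-means, which is why it is a reasonable source of stable motion-model candidates.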
Sampling of State Representation
Make candidates using Vertical Projection of Edges (VPE)*. The candidates describe the target as different combinations of multiple fragments (e.g. fragment 1, fragment 2), obtained by projecting edge intensity along the vertical axis.
* F. Wang, S. Yu, and J. Yang. Robust and efficient fragments-based tracking using mean shift. Int. J. Electron. Commun., 64(7):614–623.
Sampling of State Representation
Accept a state representation type with the acceptance ratio. Our method keeps a limited number of types.
A state representation type is accepted if it reduces the total variance of the target appearance in each fragment over recent frames.
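The fragment-variance criterion can be illustrated with a toy 1-D version: among candidate splits of the target into two fragments, prefer the split with the lowest total within-fragment appearance variance. The function name and the 1-D setup are assumptions for illustration only:

```python
def fragmentation_score(column_means, cut):
    """Total within-fragment variance for a two-fragment split of the
    target at column `cut`. Lower is better: a good fragmentation puts
    homogeneous appearance into each fragment."""
    def var(v):
        m = sum(v) / len(v)
        return sum((x - m) ** 2 for x in v) / len(v)
    left, right = column_means[:cut], column_means[cut:]
    return var(left) + var(right)
```

A split aligned with an appearance boundary (e.g. head vs. body) scores lower than an arbitrary split, so it is more likely to be accepted.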
Sampling of Observation
Make candidates using a Gaussian Filter Bank (GFB)*. The candidates are the responses of multiple Gaussian filters with different variances.
* J. Sullivan, A. Blake, M. Isard, and J. MacCormick. Bayesian object localisation in images. IJCV, 44(2):111–135, 2001.
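A Gaussian filter bank can be sketched on a 1-D signal as follows; the particular variances and the replicate-border handling are choices for this sketch, not the paper's settings:

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    vals = [math.exp(-(i * i) / (2 * sigma * sigma))
            for i in range(-radius, radius + 1)]
    s = sum(vals)
    return [v / s for v in vals]

def filter_bank_response(signal, sigmas=(1.0, 2.0, 4.0)):
    """Observation candidates = the signal smoothed at several scales,
    one response per filter variance (fine to coarse)."""
    responses = []
    for sigma in sigmas:
        k = gaussian_kernel(sigma)
        r = len(k) // 2
        out = []
        for i in range(len(signal)):
            acc = 0.0
            for j, w in enumerate(k):
                idx = min(max(i + j - r, 0), len(signal) - 1)  # replicate borders
                acc += w * signal[idx]
            out.append(acc)
        responses.append(out)
    return responses
```

Coarser filters blur fine detail but are more robust to noise and misalignment, which is exactly the trade-off the observation-type sampling exploits.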
Sampling of Observation
Accept an observation type with the acceptance ratio. Our method keeps a limited number of types.
An observation type is accepted if it makes foreground observations more similar to each other, and more dissimilar from the background, over recent frames.
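This criterion can be illustrated with a Fisher-style ratio (an assumption here, not the paper's exact formula): between-class separation of foreground and background divided by within-foreground variance, on scalar features for simplicity:

```python
def observation_score(foreground_feats, background_feats):
    """Fisher-style criterion sketch: high when foreground features are
    tightly clustered and far from the background mean."""
    def mean(v):
        return sum(v) / len(v)
    fg_mean = mean(foreground_feats)
    bg_mean = mean(background_feats)
    within = mean([(f - fg_mean) ** 2 for f in foreground_feats])
    between = (fg_mean - bg_mean) ** 2
    return between / (within + 1e-9)
```

An observation type whose features score high on this ratio discriminates the target from clutter, so it is more likely to be kept.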
Overall Procedure
Tracker sampling draws trackers #1 ... #M from the tracker space; each tracker performs state sampling over (x position, y position, scale), and the samplers interact with each other.
Qualitative Results Iron-man dataset
Qualitative Results Matrix dataset
Qualitative Results Skating1 dataset
Qualitative Results Soccer dataset
Quantitative Results
[Table: average center location errors in pixels for MC, IVT, MIL, VTD, and Ours on the Soccer*, Skating1*, Iron-man, and Matrix sequences]
IVT: Ross et al. Incremental learning for robust visual tracking. IJCV.
MIL: Babenko et al. Visual tracking with online multiple instance learning. CVPR.
MC: Khan et al. MCMC-based particle filtering for tracking a variable number of interacting targets. PAMI.
VTD: Kwon et al. Visual tracking decomposition. CVPR 2010.
Summary
Visual tracker sampler: a new framework that samples the visual tracker itself as well as the state, with an efficient strategy for sampling the visual tracker.