Fast and Robust Object Tracking with Adaptive Detection
Sushil Pratap Bharati, Soumyaroop Nandi, Yanwei Wu, Yao Sui, and Guanghui Wang
Aim of the research: To develop an autonomous, intelligent vision system that assists autonomous navigation for Unmanned Aerial Vehicles (UAVs). To automatically localize and track obstacles in the trajectory of a UAV in real time.
Contributions: Localizes the tracked object and generates an adaptive bounding box around it in real time as the object changes shape and size. Combines tracking with detection for long-term, error-free tracking. Avoids computationally expensive supervised training for detection. Runs automatically, without any manual initialization.
Breakdown: Object Tracking, Salient Object Detection, Post-processing.
Overview - Flowchart
OBJECT TRACKING
Correlation Filter based tracking: Filters are trained on previously tracked objects and their immediately surrounding background. A small test window is selected on the object that needs to be tracked. The filter is correlated over a search window in the next frame, and the peak of the correlation output gives the new position of the object. The Fast Fourier Transform (FFT) is used to reduce computational complexity.
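The correlate-and-find-the-peak step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `correlate_and_locate` and its arguments are hypothetical, and the correlation is circular because it is computed through the FFT.

```python
import numpy as np

def correlate_and_locate(filter_patch, search_window):
    """Cross-correlate a filter with a search window via the FFT and
    return the (row, col) of the correlation peak plus the response map.
    Hypothetical helper for illustration only."""
    window_f = np.fft.fft2(search_window)
    # Zero-pad the filter to the window size before transforming it.
    filter_f = np.fft.fft2(filter_patch, s=search_window.shape)
    # Correlation in the spatial domain is an element-wise product with
    # the complex conjugate in the Fourier domain.
    response = np.real(np.fft.ifft2(window_f * np.conj(filter_f)))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return peak, response
```

The peak index is the offset of the best match inside the search window, so a tracker would move its bounding box by that offset from one frame to the next.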
Advantages of Correlation Filters: Can track complex objects through rotations, occlusions, and other distractions at over 20x the rate of current state-of-the-art techniques. Tracking is computationally inexpensive: O(P log P), where P is the number of pixels in the tracking window.
Formulae used in MOSSE (A type of Correlation Filter)
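For reference, the standard MOSSE (Minimum Output Sum of Squared Error) formulation is usually written as below. It is given here on the assumption that the slide follows that standard form. Capital letters denote 2-D Fourier transforms of the training images f_i and desired outputs g_i, * is complex conjugation, \odot is element-wise multiplication, the divisions are element-wise, and \eta is a learning rate.

```latex
\begin{align}
G &= F \odot H^{*}
  && \text{(correlation output for an input window)} \\
H^{*} &= \frac{\sum_{i} G_i \odot F_i^{*}}{\sum_{i} F_i \odot F_i^{*}}
  && \text{(filter minimizing the output sum of squared error)} \\
A_t &= \eta \, G_t \odot F_t^{*} + (1 - \eta) \, A_{t-1}
  && \text{(running numerator)} \\
B_t &= \eta \, F_t \odot F_t^{*} + (1 - \eta) \, B_{t-1}
  && \text{(running denominator)} \\
H_t^{*} &= \frac{A_t}{B_t}
  && \text{(filter used at frame } t \text{)}
\end{align}
```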
Tracking using kernels: Why is it necessary? Earlier trackers use a sparse sampling strategy, and the sampled sub-windows overlap heavily. They are either unable to exploit this structure efficiently or run out of time.
Classifier and its use: The core component in tracking-by-detection is a classifier. In each frame, a set of samples is collected around the estimated position of the target; samples close to the target are labelled positive and those further away are labelled negative. Updating the classifier with these samples allows it to adapt over time. In Kernelized Correlation Filtering (KCF), the classifier is trained using dense sampling.
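The positive/negative labelling described above can be sketched as follows. The helper `label_samples` and the `radius` parameter are hypothetical; the slides do not state how "close" is defined, so a simple Euclidean distance cut-off is assumed here.

```python
import math

def label_samples(target, candidates, radius):
    """Label each candidate position +1 if it lies within `radius` pixels
    of the estimated target position, otherwise -1.
    The distance cut-off is an assumption made for this sketch."""
    tx, ty = target
    labels = []
    for x, y in candidates:
        distance = math.hypot(x - tx, y - ty)
        labels.append(1 if distance <= radius else -1)
    return labels

# Two samples near the target at (10, 10) and one far away.
print(label_samples((10, 10), [(10, 10), (12, 11), (30, 10)], radius=5))
```

A sparse tracker would draw only a handful of such candidates per frame, whereas dense sampling considers every shift of the window.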
KCF in action: The kernel trick can improve performance by allowing classification in a rich, high-dimensional feature space. Further, with the kernel trick we can express w as a linear combination of the training samples mapped into the non-linear feature space, w = Σ_i α_i φ(x_i).
Closed form solutions
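Assuming the classifier is trained by ridge regression, as in the standard KCF formulation, the closed-form solutions take the form below. X is the matrix whose rows are the training samples, y holds the regression targets, \lambda is the regularization weight, and a hat denotes the discrete Fourier transform. The last line holds when the rows of X are all cyclic shifts of one base sample x.

```latex
\begin{align}
\min_{w} \; & \sum_{i} \bigl( f(x_i) - y_i \bigr)^2 + \lambda \lVert w \rVert^2 ,
  \qquad f(x) = w^{\top} x \\
w &= \bigl( X^{\top} X + \lambda I \bigr)^{-1} X^{\top} y \\
\hat{w} &= \frac{\hat{x}^{*} \odot \hat{y}}{\hat{x}^{*} \odot \hat{x} + \lambda}
\end{align}
```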
Kernel Trick We need to solve for Train using: Kernel correlation: Track (fast detection) using:
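The three steps named above (training, kernel correlation, and fast detection) can be sketched with NumPy as follows, assuming the standard KCF formulation with a Gaussian kernel. The function names and the parameters `sigma` and `lam` are hypothetical, and the sketch omits feature extraction, cosine windowing, and model updates.

```python
import numpy as np

def gaussian_labels(shape, sigma):
    """Regression targets: a Gaussian peaked at shift (0, 0), wrapping around."""
    rows = np.fft.fftfreq(shape[0]) * shape[0]  # wrapped offsets 0, 1, ..., -1
    cols = np.fft.fftfreq(shape[1]) * shape[1]
    rr, cc = np.meshgrid(rows, cols, indexing="ij")
    return np.exp(-(rr ** 2 + cc ** 2) / (2 * sigma ** 2))

def gaussian_correlation(x, z, sigma):
    """Gaussian kernel between x and every cyclic shift of z, via the FFT."""
    cross = np.real(np.fft.ifft2(np.conj(np.fft.fft2(x)) * np.fft.fft2(z)))
    dist2 = np.sum(x ** 2) + np.sum(z ** 2) - 2.0 * cross
    # Normalizing by the number of elements is a common implementation choice.
    return np.exp(-np.maximum(dist2, 0.0) / (sigma ** 2 * x.size))

def train(x, y, sigma=0.5, lam=1e-4):
    """Solve for the dual coefficients alpha in the Fourier domain."""
    k_xx = gaussian_correlation(x, x, sigma)
    return np.fft.fft2(y) / (np.fft.fft2(k_xx) + lam)

def detect(alpha_f, x, z, sigma=0.5):
    """Response map over all cyclic shifts of the new patch z."""
    k_xz = gaussian_correlation(x, z, sigma)
    return np.real(np.fft.ifft2(alpha_f * np.fft.fft2(k_xz)))
```

In this sketch the location of the largest response is the estimated displacement of the target between the training patch `x` and the new patch `z`.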
SALIENT OBJECT DETECTION
Saliency Map and Raster Scan: Salient object detection is formulated as finding the shortest distance from pixel p_{i,j} to the set of pixels B along the image boundary.
Parameters for Saliency Map: We consider the 4-adjacent neighboring pixels when calculating the distance of p_{i,j}, namely p_{i-1,j}, p_{i+1,j}, p_{i,j-1}, and p_{i,j+1}. A path P = <P(0), P(1), …, P(k)> on image I is a sequence of pixels in which consecutive pairs of pixels are adjacent. Given a distance cost function D, the distance map for each pixel in I is defined as the minimum cost over all paths that connect the boundary set B to that pixel. The cost function is given as:
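The H and L maps mentioned in this section suggest a minimum barrier distance, where the cost of a path is the gap between the highest and lowest intensity along it. Under that assumption, the two definitions can be written as below. The symbol d for the distance map and \Pi_{B, p_{i,j}} for the set of paths from B to p_{i,j} are introduced here for illustration.

```latex
\begin{align}
d(p_{i,j}) &= \min_{P \in \Pi_{B,\, p_{i,j}}} D(P) \\
D(P) &= \max_{0 \le t \le k} I\bigl(P(t)\bigr) - \min_{0 \le t \le k} I\bigl(P(t)\bigr)
\end{align}
```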
Parameters for Saliency Map: During the raster scan, the distance at p_{i,j} is updated from the neighbors p_{i,j-1} and p_{i-1,j}; during the inverse raster scan, it is updated from p_{i,j+1} and p_{i+1,j}. Each iteration of the raster or inverse raster scan thus updates H and L whenever the path assignment changes. The final outcome is a saliency map S for region R, which must be post-processed into a binary image for final object detection.
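The alternating raster and inverse raster scans can be sketched in plain Python as follows, assuming the minimum barrier distance cost (highest minus lowest intensity along the path). The function name `fast_mbd`, the number of passes, and the `high`/`low` maps standing in for H and L are choices made for this sketch, not details taken from the slides.

```python
def fast_mbd(image, passes=3):
    """Approximate barrier distance of every pixel to the image boundary
    using alternating raster and inverse raster scans (4-adjacency)."""
    rows, cols = len(image), len(image[0])
    inf = float("inf")
    dist = [[inf] * cols for _ in range(rows)]
    high = [row[:] for row in image]  # highest value on the current path (H)
    low = [row[:] for row in image]   # lowest value on the current path (L)
    for i in range(rows):
        for j in range(cols):
            if i in (0, rows - 1) or j in (0, cols - 1):
                dist[i][j] = 0.0      # boundary pixels are the seed set B

    def relax(i, j, ni, nj):
        # Try to extend the path that ends at neighbor (ni, nj) to (i, j).
        new_high = max(high[ni][nj], image[i][j])
        new_low = min(low[ni][nj], image[i][j])
        if new_high - new_low < dist[i][j]:
            dist[i][j] = new_high - new_low
            high[i][j], low[i][j] = new_high, new_low

    for scan in range(passes):
        if scan % 2 == 0:             # raster scan: upper and left neighbors
            for i in range(rows):
                for j in range(cols):
                    if i > 0:
                        relax(i, j, i - 1, j)
                    if j > 0:
                        relax(i, j, i, j - 1)
        else:                         # inverse scan: lower and right neighbors
            for i in reversed(range(rows)):
                for j in reversed(range(cols)):
                    if i < rows - 1:
                        relax(i, j, i + 1, j)
                    if j < cols - 1:
                        relax(i, j, i, j + 1)
    return dist
```

Pixels that can be reached from the boundary without crossing a large intensity change receive a small distance, so objects that stand out from the border region appear bright in the resulting map.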
POST PROCESSING
Importance? Post-processing is vital to enhance the quality of the saliency map S and to obtain a binary image that separates the foreground salient object from the background. Applying a direct, fixed threshold would not work. Why? Because different scenes may have different noise levels, relative sizes of objects to the background, illuminance, and reflectance. Okay, so what do we need? Adaptive thresholding.
Adaptive Threshold
Mean Intensities
DEMO
Inter Class Variance
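The slide titles "Mean Intensities" and "Inter Class Variance" suggest a threshold chosen by maximizing the variance between the foreground and background classes, as in Otsu's method. That reading is an assumption, and the sketch below shows Otsu's criterion as a stand-in for the authors' adaptive threshold. The function name `otsu_threshold` is hypothetical, and the saliency map is assumed to be flattened and scaled to integers in 0..255.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance
    when pixels <= t are background and pixels > t are foreground."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        weight_fg = total - weight_bg
        if weight_bg == 0:
            continue              # no background pixels yet
        if weight_fg == 0:
            break                 # every pixel is already background
        mean_bg = sum_bg / weight_bg
        mean_fg = (total_sum - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_t = variance, t
    return best_t

saliency = [10] * 50 + [200] * 50
t = otsu_threshold(saliency)
binary = [1 if value > t else 0 for value in saliency]
```

Because the threshold is recomputed from each saliency map's own histogram, it adapts to scenes with different noise levels and object sizes.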
Experiments and Curves
Comparison Table
Our method in Action
THANK YOU