Nalin Pradeep Senthamil, Master's Student, ECE Dept. Advisor: Dr. Stan Birchfield. Committee Members: Dr. Adam Hoover, Dr. Brian Dean
Accurate Tracking of Non-Rigid Objects using Level Sets. Clemson University, Clemson, SC, USA. Accepted at ICCV 2009
Outline
- Tracking Overview
- Literature
- Proposed Approach
  - Object Fragmentation
  - Region Growing Mechanism
  - GMM modeling (feature-spatial)
  - Level Set Framework
  - Fragment Motion using Joint-KLT
- Results
- Conclusion
Tracking Overview
- Idea: obtain trajectories over time to locate the object
- Three main categories
  - Point tracking – Kalman, particle filters
  - Kernel tracking – Collins et al. (linear RGB), Comaniciu (mean shift)
  - Contour tracking – Shah et al., Cremers et al.
- Applied to surveillance – vessels, humans, vehicles, etc.
- Why not internet videos? 65,000 videos are uploaded to YouTube every day (a rich market)
Literature
- Linear RGB [Collins et al. 2003]
- AdaBoost classifier [Avidan 2005]
- Fixed-size fragments [Adam et al. 2006]
- Key-point feature learning [Grabner et al. 2007]
- Shape priors [Cremers et al. 2006]
- Contour tracking using texture [Shah et al. 2005]
Limitations
- Ignore secondary cues such as multimodality
- Fail to determine accurate object shape
- Non-contour-based techniques usually drift during occlusion
- Often ignore the spatial arrangement of pixels
Algorithm Block Diagram
- Tracker initialization: user-clicked ROI around the object
- Blocks: Object Fragmentation, Object Modeling, Strength Map Computation, Level-Set Formulation, Estimate Fragment Motion
- Each object is represented as a set of fragments
- An update is made at each frame
Object Fragmentation – Region Growing Mechanism
- A random pixel is selected from the mask to seed a fragment (f)
- Neighboring pixels are added to (f) if they lie within Γ (standard deviations) of its Gaussian model
- The Gaussian model of (f) is updated as pixels are added
- Each (f) represents a Gaussian ellipsoid
- Both object and background are fragmented
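
The region growing steps above can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes a scalar (grayscale) feature instead of a full color vector, 4-connected neighbors, and a hypothetical `min_sigma` floor so that a one-pixel fragment does not start with zero spread. The names `grow_fragments`, `gamma`, and `min_sigma` are made up for this sketch.

```python
import math
import random
from collections import deque

def grow_fragments(image, mask, gamma=2.5, min_sigma=4.0, seed=0):
    """Split the masked pixels of a 2D intensity grid into fragments.

    Returns (labels, count); labels[r][c] is -1 outside the mask.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    labels = [[-1] * w for _ in range(h)]
    seeds = [(r, c) for r in range(h) for c in range(w) if mask[r][c]]
    rng.shuffle(seeds)  # random seed pixels, as on the slide
    count = 0
    for sr, sc in seeds:
        if labels[sr][sc] != -1:
            continue  # already absorbed by an earlier fragment
        n, mean, m2 = 0, 0.0, 0.0  # running Gaussian statistics (Welford)
        labels[sr][sc] = count
        queue = deque([(sr, sc)])
        while queue:
            r, c = queue.popleft()
            # Update the fragment's Gaussian model with this pixel.
            n += 1
            delta = image[r][c] - mean
            mean += delta / n
            m2 += delta * (image[r][c] - mean)
            sigma = max(min_sigma, math.sqrt(max(m2 / n, 0.0)))
            # Add neighbors that lie within gamma standard deviations.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if (0 <= rr < h and 0 <= cc < w and mask[rr][cc]
                        and labels[rr][cc] == -1
                        and abs(image[rr][cc] - mean) <= gamma * sigma):
                    labels[rr][cc] = count
                    queue.append((rr, cc))
        count += 1
    return labels, count
```

Running the same function once on the object mask and once on the background mask would give the two fragment sets mentioned on the slide.
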
Object Modeling (GMM)
- Object and background are modeled in a joint feature-spatial space
Strength Map
- Positive for foreground (FGND)
- Negative for background (BKGND)
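
One way to obtain a map with this sign convention is a log-likelihood ratio between the foreground and background mixtures. The sketch below assumes that form; the slide's own formula is not recoverable from the text. Each fragment is taken to be one Gaussian over a joint vector such as (feature, x, y), and the function names are hypothetical.

```python
import numpy as np

def gmm_log_likelihood(points, weights, means, covs):
    """Log-likelihood of each row of `points` under a Gaussian mixture."""
    points = np.asarray(points, dtype=float)
    dim = points.shape[1]
    per_component = []
    for w, mu, cov in zip(weights, means, covs):
        cov = np.asarray(cov, dtype=float)
        diff = points - np.asarray(mu, dtype=float)
        maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
        log_norm = -0.5 * (dim * np.log(2.0 * np.pi) + np.linalg.slogdet(cov)[1])
        per_component.append(np.log(w) + log_norm - 0.5 * maha)
    # Log of the weighted sum over components, computed stably.
    return np.logaddexp.reduce(np.stack(per_component), axis=0)

def strength_map(points, fg_model, bg_model):
    """Assumed form: log p(point | FGND) - log p(point | BKGND)."""
    return gmm_log_likelihood(points, *fg_model) - gmm_log_likelihood(points, *bg_model)
```

Each model is a `(weights, means, covs)` tuple with one entry per fragment. Points that the foreground fragments explain better come out positive, the rest negative.
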
Level Set Framework
- A level set is a numerical technique for fitting a contour
- A level set on a 2D image is viewed as a 3D function (a surface over the image)
- The contour is identified at the zero level
Level Set for Strength Map
- In general, level set evolution is defined by a gradient descent iteration
- Terms annotated in the evolution equation: strength image, contour (zero level set), divergence operator, speed
- Figure labels: strength image, contour
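
Since the evolution equation itself did not survive in the slide text, the sketch below swaps in a generic Chan-Vese-style gradient descent update: a smoothed Dirac delta concentrates the update near the zero level, the strength image acts as the speed, and the divergence of the unit normal (curvature) regularizes the contour. The sign convention (phi positive inside the contour) and the parameters `dt`, `mu`, and `band` are assumptions of this sketch, not values from the talk.

```python
import numpy as np

def curvature(phi, eps=1e-8):
    """Divergence of the unit normal field grad(phi) / |grad(phi)|."""
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx ** 2 + gy ** 2) + eps
    d_ny_dy, _ = np.gradient(gy / norm)
    _, d_nx_dx = np.gradient(gx / norm)
    return d_nx_dx + d_ny_dy

def evolve(phi, strength, steps=200, dt=0.5, mu=0.2, band=1.5):
    """Gradient descent iterations; phi > 0 is taken to mean inside."""
    phi = np.asarray(phi, dtype=float).copy()
    for _ in range(steps):
        # Smoothed Dirac delta: largest near the zero level set.
        delta = (band / np.pi) / (band ** 2 + phi ** 2)
        # Positive strength pushes the contour outward, negative inward;
        # the curvature term keeps the contour smooth.
        phi += dt * delta * (strength + mu * curvature(phi))
    return phi
```

The tracked contour is then the boundary of the region where `phi > 0`. Because the speed can take either sign, the curve can move both inward and outward.
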
Level-Set Evolution
- Iterations shown using the “Elmo” strength map
- The curve can grow inward and outward
- The figure shows the first frame as an example
- In subsequent frames, the curve evolves from the previous contour
Fragment Motion
- Joint-KLT combines the KLT (Kanade-Lucas-Tomasi) and HS (Horn-Schunck) algorithms
- Used to align the coordinate systems of the object and the model fragments
- Increases the accuracy of the strength map
- The energy being minimized has a data term and a smoothness term
Fragment Motion (contd.)
- The ‘N’ features tracked in each fragment are averaged
- The motion of each fragment gives ‘prior’ information before the strength map is computed
- Drastic motion can be handled
- Comparison figure: KLT vs. Joint-KLT
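
The averaging step can be illustrated with a small sketch. It assumes the feature tracker (Joint-KLT in the talk) has already produced one displacement per feature and that each feature is tagged with the fragment it lies in. The function names and the dictionary layout are hypothetical.

```python
from collections import defaultdict

def fragment_motion(displacements, fragment_ids):
    """Average the (dx, dy) displacements of the features in each fragment."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (dx, dy), frag in zip(displacements, fragment_ids):
        acc = sums[frag]
        acc[0] += dx
        acc[1] += dy
        acc[2] += 1
    return {frag: (sx / n, sy / n) for frag, (sx, sy, n) in sums.items()}

def shift_fragment_means(spatial_means, motion):
    """Move each fragment's spatial mean by its motion prior.

    Fragments with no tracked feature are left where they are.
    """
    shifted = {}
    for frag, (x, y) in spatial_means.items():
        dx, dy = motion.get(frag, (0.0, 0.0))
        shifted[frag] = (x + dx, y + dy)
    return shifted
```

Shifting the spatial means before evaluating the mixture is one way to use the motion as a prior, so that model and image are aligned when the strength map is computed.
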
Results - Videos
Shape Matching
- The Hausdorff metric is a mathematical measure for comparing two sets of points
- Applications: occlusion handling and shape recognition
- ‘a’ and ‘b’ are two point sets
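
For reference, the standard (symmetric) Hausdorff distance between two point sets can be written in a few lines. This is the textbook definition in brute-force form, assuming non-empty sets and Euclidean distance; whether the talk used this exact variant or a modified one is not stated on the slide.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between non-empty point sets a and b."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

For example, `hausdorff([(0, 0), (1, 0)], [(0, 0), (4, 0)])` is 3.0: the point (4, 0) in the second set is 3 away from its nearest neighbor in the first. The brute-force form is quadratic in the number of points, which is acceptable for short contours.
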
Occlusion Handling
- The rate of decrease in object size determines occlusion
- Contour shapes learnt online are used to hallucinate the contour during occlusion
- The best shape is identified using the Hausdorff distance metric
- The shapes that followed the best match in the learnt sequence are hallucinated during occlusion
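
A rough sketch of this logic, under stated assumptions: occlusion is flagged when the object area drops by more than a fraction `max_drop` between consecutive frames (the threshold value is invented here), and the shapes to replay are the ones stored right after the best match in the learnt history. The `distance` argument is any shape distance supplied by the caller, for example a Hausdorff distance. All names are hypothetical.

```python
def occlusion_started(prev_area, curr_area, max_drop=0.3):
    """Flag occlusion when the relative area decrease exceeds max_drop."""
    if prev_area <= 0:
        return False
    return (prev_area - curr_area) / prev_area > max_drop

def shapes_to_replay(history, last_visible, distance, count=3):
    """Find the learnt shape closest to the last visible contour and
    return the shapes that followed it in the history."""
    best = min(range(len(history)),
               key=lambda i: distance(history[i], last_visible))
    return history[best + 1: best + 1 + count]
```

With contours stored as point lists, `distance` would compare two point sets. The replayed shapes stand in for the contour until the object's size recovers.
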
Results – Occlusion Videos
Results – More Comparison Videos
Quantitative Comparison
- Average normalized error against ground truth, measured every 5 frames of each sequence
- Sequences: Girl Circle, Walk Behind, Elmo Doll
Conclusion
- Tracking algorithm based on modeling object and background with a mixture of Gaussians
- Simple and efficient region growing mechanism for fast computation
- “Strength map” embedded into the level-set framework
- Joint-KLT introduced into the framework to improve accuracy
Future Work
- Robust shape prior learning and matching
- Self-occlusion handling for unknown fragments
Alternative Tracking Framework (outline)
- Overview
- Proposed Approach
  - Vessel Detection: Saliency Map, Thresholding
  - Vessel Tracking: Strength Map using Linear RGB, ML Framework for Search
- Results
Object Detection Using Saliency Map
- Saliency: the property of objects standing out relative to their neighbors
- The backgrounds of natural images share statistical regularities, which relates to the pre-attentive search performed by the human visual system
- Hou and Zhang (CVPR 2007) observed redundancies in the log Fourier spectra of natural images
- Hence, statistical singularities in the spectrum can be treated as anomalies
Saliency Map Computation Algorithm
- Input: the image
- Amplitude: real part of the Fourier spectrum
- Phase
- Log spectrum
- Spectral residual (smoothing in the frequency domain)
- Saliency map (smoothing in the spatial domain), with j = sqrt(-1)
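
The steps listed above follow the published spectral residual method, which can be sketched with NumPy as below. Two simplifications are assumptions of this sketch: the amplitude is taken as the magnitude of the spectrum, and a plain box (mean) filter stands in for both smoothing steps, where the published method uses a Gaussian in the spatial domain. The window sizes are illustrative.

```python
import numpy as np

def box_filter(img, k):
    """Mean filter with an odd k x k window; edges are replicated."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def spectral_residual_saliency(image, avg_size=3, smooth_size=5):
    spectrum = np.fft.fft2(np.asarray(image, dtype=float))
    log_amplitude = np.log(np.abs(spectrum) + 1e-8)  # log spectrum
    phase = np.angle(spectrum)
    # Spectral residual: log spectrum minus its local average
    # (the smoothing in the frequency domain).
    residual = log_amplitude - box_filter(log_amplitude, avg_size)
    # Back to the image domain, keeping the original phase.
    recon = np.fft.ifft2(np.exp(residual + 1j * phase))
    # Squared magnitude, then smoothing in the spatial domain.
    return box_filter(np.abs(recon) ** 2, smooth_size)
```

Thresholding the returned map (for instance at a multiple of its mean) gives candidate object regions; the threshold rule used in the talk is not stated in the slide text.
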
Sample Saliency Map detections
Object Tracking
- Objects detected through saliency are used as FGND
- The immediate surroundings are used as BKGND
- The strength model is computed similarly to Collins' linear RGB
- 49 features built from linear combinations of RGB are used to compute the strength map
- A maximum-likelihood (ML) framework based search localizes objects in each frame
- The search region is chosen based on object velocity
Object Tracking – Strength Model
- The 49 RGB features are normalized and discretized into 0-32 histogram bins
- For each feature, the variance ratio of the log-likelihood is computed to find the feature that best discriminates object from background
- Equation annotations: probability; small value (0.01); histogram index; variance of L(i) with respect to a distribution a(i)
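
The variance ratio can be sketched from the published Collins and Liu formulation, which matches the annotations on this slide: L(i) is the log ratio of the object and background histograms, with both clamped below by a small value, and the ratio compares the spread of L over the mixed distribution with its spread within each class. The 0.01 clamp is the value named on the slide; the function names are hypothetical.

```python
import math

def log_likelihood_ratio(p, q, delta=0.01):
    """L(i) = log(max(p(i), delta) / max(q(i), delta)) per histogram bin."""
    return [math.log(max(pi, delta) / max(qi, delta)) for pi, qi in zip(p, q)]

def variance(L, a):
    """Variance of L(i) with respect to a distribution a(i)."""
    mean = sum(ai * li for ai, li in zip(a, L))
    return sum(ai * li * li for ai, li in zip(a, L)) - mean * mean

def variance_ratio(p, q, delta=0.01):
    """Between-class spread of L over its within-class spread."""
    L = log_likelihood_ratio(p, q, delta)
    mixed = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    within = variance(L, p) + variance(L, q)
    return variance(L, mixed) / max(within, 1e-12)
```

Here `p` and `q` are the normalized object and background histograms of one feature. Evaluating `variance_ratio` for all 49 features and keeping the highest-scoring ones selects the features whose histograms separate object from background most cleanly.
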
Strength Model - Outputs
Object Tracking – ML Framework
- Objective: recover a tight bound around the object
- The ML framework is similar to the EM algorithm
- Search objective: maximize a function of (mean, covariance)
- Equation annotations: mean; covariance; strength map; a term that suppresses pixel locations far from the object
Object Tracking – ML Framework
- To maximize the function, the mean and covariance are computed iteratively (E-step, M-step)
- Iterated 2-3 times to reach the optimal values
- Equation annotation: mean and covariance of the current estimate
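
The exact objective is not recoverable from the slide text, so the following is only one plausible reading of the annotations: each pixel is weighted by its (non-negative) strength times a Gaussian window at the current estimate, which suppresses far-away pixels, and the mean and covariance are then re-estimated from those weights. The function name, the clipping of negative strength, and the small regularizer are assumptions of this sketch.

```python
import numpy as np

def ml_localize(strength, mean, cov, iterations=3):
    """Iteratively re-estimate the (row, col) mean and covariance."""
    rows, cols = np.indices(strength.shape)
    coords = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    s = np.clip(np.asarray(strength, dtype=float).ravel(), 0.0, None)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    for _ in range(iterations):
        # E-step-like: weight = strength x Gaussian window at the
        # current estimate (down-weights pixels far from the object).
        diff = coords - mean
        maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
        w = s * np.exp(-0.5 * maha)
        total = w.sum()
        if total <= 0:
            break  # no foreground evidence inside the window
        # M-step-like: weighted mean and covariance.
        mean = (w[:, None] * coords).sum(axis=0) / total
        diff = coords - mean
        cov = (w[:, None, None] * diff[:, :, None] * diff[:, None, :]).sum(axis=0) / total
        cov += 1e-6 * np.eye(2)  # keep the matrix invertible
    return mean, cov
```

The returned mean and covariance describe an ellipse around the object. Note that the window tightens with every pass, so a practical tracker would likely rescale the covariance; two or three iterations, as on the slide, keep this effect mild.
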
Conclusion
- The algorithm ran in real time
- Saliency-map-based detection was introduced
- The “strength map” concept from adaptive fragmentation is applied here
- The tracker depends only on color (linear RGB); combining it with KLT features would add robustness to the system
- A good way to combine the two is being explored
Thank you!