Part-based visual tracking with online latent structural learning -Rui Yao et al. ICCV 2013 Cvlab Jung ilchae.



Abstract
- Part-based tracking
- Online structural SVM training
- Two-stage training

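2.1 Representation
- $B_t$ = bounding box
- $b_t^i$ = $i$-th part box $\Rightarrow (c, r, h, w)$
- $y_t$ = bounding box offset
- $z_t^i$ = part box offset $\Rightarrow (\Delta c, \Delta r, \Delta w, \Delta h)$

$$\Phi(x_t, y, z) = [\phi_1(x_t, z^1), \phi_1(x_t, z^2), \dots, \phi_1(x_t, z^M), \phi_2(x_t, y), \phi_3(y, z^1), \phi_3(y, z^2), \dots, \phi_3(y, z^M)]$$

[Figure: bounding box $B_t$ with part boxes $b_t^1$, $b_t^2$, $b_t^3$, $b_t^4$]

- $\phi_1(\cdot)$ = appearance model for a part box
- $\phi_2(\cdot)$ = appearance model for the bounding box
- $\phi_3(\cdot)$ = relation between the bounding box and a part box
- $\phi_1(\cdot)$, $\phi_2(\cdot)$ are computed from 2 scales and 6 types of Haar-like masks
- $\phi_3(\cdot) = (a, a^2)$, where $a$ = distance between $B_t$ and $b_t^i$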
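As a rough illustration of the relation term $\phi_3 = (a, a^2)$ and of how the joint feature vector is laid out, here is a minimal Python sketch. The slide does not say how the distance $a$ is measured or whether $(c, r)$ is a corner or a center, so the sketch assumes $(c, r)$ is the top-left corner and $a$ is the Euclidean distance between box centers. The function names are hypothetical, and the appearance features are passed in as plain lists rather than computed from Haar-like masks.

```python
import math

def box_center(box):
    """Center of a box given as (c, r, h, w); (c, r) is ASSUMED to be the top-left corner."""
    c, r, h, w = box
    return (c + w / 2.0, r + h / 2.0)

def relation_feature(bbox, part_box):
    """phi_3 = (a, a^2); a is ASSUMED to be the Euclidean distance between box centers."""
    (bx, by), (px, py) = box_center(bbox), box_center(part_box)
    a = math.hypot(px - bx, py - by)
    return [a, a * a]

def joint_feature(part_feats, bbox_feat, bbox, part_boxes):
    """Concatenate [phi_1(z^1..z^M), phi_2(y), phi_3(y, z^1..z^M)] in the order of the slide."""
    phi = []
    for feat in part_feats:          # phi_1 for each of the M parts
        phi.extend(feat)
    phi.extend(bbox_feat)            # phi_2 for the bounding box
    for part in part_boxes:          # phi_3 for each (bounding box, part) pair
        phi.extend(relation_feature(bbox, part))
    return phi
```

With $M$ parts and $d$-dimensional appearance features, the resulting vector has $(M + 1)d + 2M$ entries.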

Framework
[Figure: tracking loop with panels "Finding target with $w_t$", "sampling", "true", "Training S-SVM"]
- Find the target by sampling near $B_{t-1}$, $\{b_{t-1}^i\}_{i=1,\dots,M}$:
$$(y_t^*, z_t^*) = \arg\max_{y,z} \langle w_t, \Phi(x_t, y, z) \rangle$$
- Sample training data near the new target
- Train the structured SVM to maximize the target's score
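The target-finding step of this loop can be sketched as an exhaustive argmax over sampled candidates. This is only an illustration under stated assumptions: `sample_offsets`, `locate_target`, and `feature_fn` are hypothetical names, candidates are sampled on a translation-only grid, and the feature function is supplied by the caller.

```python
from itertools import product

def sample_offsets(radius, step=1):
    """Translation-only offsets (dc, dr, dw, dh) on a square grid (ASSUMED sampling scheme)."""
    grid = range(-radius, radius + 1, step)
    return [(dc, dr, 0, 0) for dc, dr in product(grid, grid)]

def score(w, phi):
    """Linear score <w, Phi>."""
    return sum(wi * pi for wi, pi in zip(w, phi))

def locate_target(w, candidates, feature_fn):
    """Return the (y, z) pair maximizing <w, Phi(x_t, y, z)>.

    candidates: iterable of (y, z) pairs sampled near the previous boxes.
    feature_fn: hypothetical callable mapping (y, z) to the joint feature vector.
    """
    return max(candidates, key=lambda yz: score(w, feature_fn(*yz)))
```

After the target is located, new training samples are drawn around it and the weights are updated before moving to the next frame.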

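2.2 Latent Pegasos for online training

$$w_{t+1} = \arg\min_{w} \left\{ \frac{\lambda}{2}\|w\|^2 + \frac{1}{N}\sum_{i=1}^{N} \left[ \Delta(y_t, y_{t,i}) + \max_{z'} \langle w, \Phi(x_t, y_{t,i}, z') \rangle - \max_{z} \langle w, \Phi(x_t, y_t, z) \rangle \right]_+ \right\}$$

$$\Delta(y_t, y) = 1 - \frac{|(B_{t-1} + y_t) \cap (B_{t-1} + y)|}{|(B_{t-1} + y_t) \cup (B_{t-1} + y)|}, \qquad [a]_+ = \max(0, a)$$

Find $w$ by gradient descent, with $f(x, y, z; w) = \langle w, \Phi(x, y, z) \rangle$:

$$w_{t+1} \leftarrow (1 - \eta_t \lambda)\, w_t + \frac{\eta_t}{N} \sum_{i=1}^{N} \mathbb{1}\left[ \max_{z'} f(x_t, y_{t,i}, z'; w_t) - \max_{z} f(x_t, y_t, z; w_t) + \Delta(y_t, y_{t,i}) > 0 \right] \delta\Phi_t(y_{t,i})$$

$$\nabla_t = \lambda w_t - \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left[ \max_{z'} f(x_t, y_{t,i}, z'; w_t) - \max_{z} f(x_t, y_t, z; w_t) + \Delta(y_t, y_{t,i}) > 0 \right] \delta\Phi_t(y_{t,i})$$

where

$$\hat{z} = \arg\max_{z} f(x_t, y_t, z; w_t), \qquad \hat{z}' = \arg\max_{z'} f(x_t, y_{t,i}, z'; w_t)$$

$$\delta\Phi_t(y_{t,i}) = \Phi(x_t, y_t, \hat{z}) - \Phi(x_t, y_{t,i}, \hat{z}')$$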
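A single update of this kind can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the latent maximization is done by brute force over a precomputed list of candidate feature vectors (one per candidate $z$), and the label costs $\Delta(y_t, y_{t,i})$ are assumed to be computed elsewhere and passed in.

```python
def dot(w, phi):
    """Linear score f = <w, Phi>."""
    return sum(wi * pi for wi, pi in zip(w, phi))

def best_latent(w, feats):
    """Brute-force latent maximization: pick the candidate feature with the top score."""
    return max(feats, key=lambda phi: dot(w, phi))

def latent_pegasos_step(w, true_feats, sample_feats, costs, lam, eta):
    """One sub-gradient step: w <- (1 - eta*lam) w + (eta/N) * sum of active deltaPhi.

    true_feats:   candidate Phi(x_t, y_t, z) vectors for the true offset y_t.
    sample_feats: for each sampled y_{t,i}, its candidate Phi(x_t, y_{t,i}, z') vectors.
    costs:        label cost Delta(y_t, y_{t,i}) for each sample.
    """
    n = len(sample_feats)
    phi_true = best_latent(w, true_feats)
    f_true = dot(w, phi_true)
    step = [0.0] * len(w)
    for feats, cost in zip(sample_feats, costs):
        phi_neg = best_latent(w, feats)
        if dot(w, phi_neg) - f_true + cost > 0:      # margin violated
            for k in range(len(w)):
                step[k] += phi_true[k] - phi_neg[k]  # deltaPhi for this sample
    return [(1 - eta * lam) * wk + (eta / n) * sk for wk, sk in zip(w, step)]
```

In the original Pegasos algorithm the step size is $\eta_t = 1/(\lambda t)$. Whether the tracker uses exactly that schedule is not stated on the slide, so `eta` is left as a parameter here.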

2.2 Latent Pegasos for online training
- The label cost $\Delta$ does not take the part boxes into account:

$$\Delta(y_t, y) = 1 - \frac{|(B_{t-1} + y_t) \cap (B_{t-1} + y)|}{|(B_{t-1} + y_t) \cup (B_{t-1} + y)|}$$
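The label cost is one minus the overlap ratio (intersection over union) of the two offset bounding boxes, which the following sketch computes. It assumes boxes are stored as $(c, r, h, w)$ with $(c, r)$ the top-left corner, and that offsets are added component-wise in that same order. Both conventions and the function names are assumptions for illustration.

```python
def apply_offset(box, offset):
    """B_{t-1} + y: component-wise sum (ASSUMES offset uses the same order as the box)."""
    return tuple(b + o for b, o in zip(box, offset))

def overlap_ratio(a, b):
    """Intersection area over union area for boxes (c, r, h, w), top-left corner ASSUMED."""
    ac, ar, ah, aw = a
    bc, br, bh, bw = b
    inter_w = max(0.0, min(ac + aw, bc + bw) - max(ac, bc))
    inter_h = max(0.0, min(ar + ah, br + bh) - max(ar, br))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def label_cost(prev_box, y_true, y):
    """Delta(y_t, y) = 1 - overlap of the two offset boxes; part boxes play no role."""
    return 1.0 - overlap_ratio(apply_offset(prev_box, y_true), apply_offset(prev_box, y))
```

The cost is 0 when the sampled box coincides with the true box and 1 when the two do not overlap at all.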

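3. Two-stage training

Stage 1. Update $\{u_{t+1}^i\}_{i=1,\dots,M}$ for the part boxes:

$$u_{t+1}^j = \arg\min_{u^j} \frac{\lambda}{2}\|u^j\|^2 + \frac{1}{N}\sum_{k=1}^{N} \left[ \Delta(z_t^j, z_{t,k}^j) + \langle u^j, \Phi(x_t, z_{t,k}^j) \rangle - \langle u^j, \Phi(x_t, z_t^j) \rangle \right]_+$$

Stage 2. Update $\{v_{t+1}^i\}_{i=0,\dots}$ for the bounding box:

$$v_{t+1} \leftarrow (1 - \eta_t \lambda)\, v_t + \frac{\eta_t}{N} \sum_{i=1}^{N} \mathbb{1}\left[ \max_{z'} f(x_t, y_{t,i}, z'; v_t) - \max_{z} f(x_t, y_t, z; v_t) + \Delta(y_t, y_{t,i}) > 0 \right] \delta\Phi_t(y_{t,i})$$

$$\delta\Phi_t(y_{t,i}) = \Phi(x_t, y_t, \hat{z}) - \Phi(x_t, y_{t,i}, \hat{z}')$$

where $\hat{z}$ and $\hat{z}'$ are the maximizers of the two $\max$ terms.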
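Stage 1 has no latent variable, so each part weight vector can be updated with an ordinary structured hinge-loss step. The sketch below shows one such step per part, assuming the same Pegasos-style update as in stage 2. The slide only gives the stage 1 objective, so the update rule, the function names, and the data layout are all assumptions.

```python
def dot(u, phi):
    """Linear score <u, Phi>."""
    return sum(ui * pi for ui, pi in zip(u, phi))

def part_pegasos_step(u, phi_true, sample_phis, costs, lam, eta):
    """One sub-gradient step on the stage 1 objective for a single part j.

    phi_true:    Phi(x_t, z_t^j), feature of the true part box.
    sample_phis: Phi(x_t, z_{t,k}^j) for each of the N sampled part boxes.
    costs:       label cost for each sampled part box.
    """
    n = len(sample_phis)
    f_true = dot(u, phi_true)
    step = [0.0] * len(u)
    for phi, cost in zip(sample_phis, costs):
        if cost + dot(u, phi) - f_true > 0:          # hinge term is active
            for k in range(len(u)):
                step[k] += phi_true[k] - phi[k]
    return [(1 - eta * lam) * uk + (eta / n) * sk for uk, sk in zip(u, step)]

def train_parts(us, part_data, lam, eta):
    """Stage 1: update every part independently; part_data[j] = (phi_true, sample_phis, costs)."""
    return [part_pegasos_step(u, *data, lam, eta) for u, data in zip(us, part_data)]
```

Stage 2 would then keep the part weights fixed and apply the latent update to $v$ only.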

3. Two stage training

Another problem: part box initialization
- Tracking of a non-rigid object via patch-based dynamic appearance modeling and adaptive basin hopping Monte Carlo sampling (CVPR '09)
- This paper
- Is a sufficiently big part box advantageous?

3. Result

3. Result

3. Experiment

Contribution
- Robust to partial occlusion and shape deformation
- Online learning of a latent SVM
- Two-stage training -> more accurate

Discussion
Problems of this paper
- No accumulation of positive targets
- Restricted to a fixed-size bounding box
Problems of part-based tracking
- Part initialization: location, size
- Relations between the bounding box and the part boxes

Feedback
My recent work: tracking with part graph matching
- Part box initialization
- Feature change: size, or others
- Definition of the relation between the bounding box and a part box