Convolutional Neural Networks for Visual Tracking

Presentation transcript:

Convolutional Neural Networks for Visual Tracking Computer Vision Lab. 남현섭

Contents
Convolutional Neural Networks
Tracking by CNN
  J. Fan, et al., Human tracking using convolutional neural networks, Neural Networks, IEEE Transactions on, 2010
  H. Li, et al., DeepTrack: Learning Discriminative Feature Representations by Convolutional Neural Networks for Visual Tracking, BMVC, 2014
On-going research

Convolutional Neural Network

J. Fan, et al., Human tracking using convolutional neural networks, Neural Networks, IEEE Transactions on, 2010

Contributions
Learn both spatial and temporal features from pairs of adjacent frames.
Use multiple pathways in the CNN to fuse local and global information.
Use a shift-variant CNN architecture to reduce drift toward distracting objects.
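To illustrate the first contribution, here is a minimal sketch of how an input pair could be built from two adjacent frames. It assumes grayscale frames stored as nested lists and a window centred on the previous target position; the helper names `crop` and `make_pair`, the zero padding, and the window size are hypothetical and are not taken from the paper.

```python
def crop(frame, cx, cy, half):
    """Crop a (2*half+1)-square window centred on (cx, cy).

    Pixels that fall outside the frame are filled with 0 (an assumption).
    """
    h, w = len(frame), len(frame[0])
    return [
        [frame[y][x] if 0 <= y < h and 0 <= x < w else 0
         for x in range(cx - half, cx + half + 1)]
        for y in range(cy - half, cy + half + 1)
    ]


def make_pair(prev_frame, cur_frame, prev_center, half):
    """Return [previous patch, current patch], both cropped at the same location.

    Feeding both patches to the network lets it see appearance (spatial cue)
    and frame-to-frame change (temporal cue) together.
    """
    cx, cy = prev_center
    return [crop(prev_frame, cx, cy, half), crop(cur_frame, cx, cy, half)]


# Toy 5x5 frames: the current frame is the previous one plus 100.
prev = [[r * 5 + c for c in range(5)] for r in range(5)]
cur = [[v + 100 for v in row] for row in prev]
pair = make_pair(prev, cur, prev_center=(2, 2), half=1)
print(pair[0])  # 3x3 patch from the previous frame
print(pair[1])  # 3x3 patch from the current frame
```

Cropping both frames at the previous target position means the target's displacement shows up as a shift between the two patches.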

CNN Architecture

Shift-Variant Architecture (figure panels: shift-invariant vs. shift-variant)

Handling Scale Change

Results (plot legend: temporal and spatial features; spatial features only; global and local branches, shift-variant; global branch only; local branch only; shift-invariant)

Results

Results

H. Li, et al., DeepTrack: Learning Discriminative Feature Representations by Convolutional Neural Networks for Visual Tracking, BMVC, 2014

Contributions
A candidate pool of multiple CNNs, for temporal adaptation.
A structural loss function, which yields a large number of reliable training examples.
Class-specific tracking, which combines a class-level detector with an instance-level tracker.

CNN Architecture

Structural Loss Function
Traditional loss function vs. structural loss function (equation labels: structural importance, CNN loss, overlapping ratio).
Training samples with high importance can be used to avoid class ambiguity.
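The idea of weighting the CNN loss by an overlap-based importance can be sketched as follows. The overlapping ratio is the standard intersection-over-union of two boxes. The `importance` function is a hypothetical stand-in chosen only for illustration (it gives zero weight to samples with an overlap near 0.5) and is not the formula from the paper.

```python
def overlap_ratio(a, b):
    """Intersection-over-union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def importance(iou):
    """Hypothetical importance: clear positives (IoU near 1) and clear
    negatives (IoU near 0) get weight near 1; ambiguous samples
    (IoU near 0.5) get weight near 0."""
    return abs(2.0 * iou - 1.0)


def structural_loss(samples, target_box):
    """Sum of per-sample CNN losses, each scaled by its structural importance.

    `samples` is a list of (box, cnn_loss) pairs; the CNN loss values are
    assumed to be computed elsewhere.
    """
    return sum(importance(overlap_ratio(box, target_box)) * loss
               for box, loss in samples)


target = (0, 0, 10, 10)
samples = [
    ((0, 0, 10, 10), 2.0),   # identical to the target: clear positive
    ((20, 20, 5, 5), 3.0),   # no overlap: clear negative
    ((5, 0, 10, 10), 4.0),   # partial overlap: down-weighted
]
print(structural_loss(samples, target))
```

Under this weighting, samples whose label is ambiguous contribute little to the update, while clearly positive and clearly negative samples dominate.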

Online Learning: A Coordinate-Descent
Reduces overfitting and increases training speed.

Temporal Adaptation With a CNN Pool

Temporal Adaptation With a CNN Pool
Can accommodate as many appearance variations as possible without learning an ensemble of CNNs or a very complicated CNN.
Can explicitly refine the model pool and discard unreliable CNNs.
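A model pool of this kind might be organised as in the sketch below. Each "CNN" is represented by a plain scoring function, and the class name `ModelPool`, the capacity limit, the oldest-first eviction, and the reliability threshold are all assumptions made for illustration rather than details from the paper.

```python
class ModelPool:
    """Toy pool of candidate models; each model is a (name, score_fn) pair."""

    def __init__(self, capacity, min_score):
        self.capacity = capacity    # maximum number of models kept (assumed)
        self.min_score = min_score  # reliability threshold (assumed)
        self.models = []

    def add(self, name, score_fn):
        """Add a newly trained model, evicting the oldest if the pool is full."""
        self.models.append((name, score_fn))
        if len(self.models) > self.capacity:
            self.models.pop(0)

    def best(self, patch):
        """Pick the model that scores the given patch highest."""
        return max(self.models, key=lambda m: m[1](patch))

    def prune(self, trusted_patch):
        """Discard models that score a trusted target patch below the threshold.

        If every model would be discarded, the pool is left unchanged.
        """
        kept = [m for m in self.models if m[1](trusted_patch) >= self.min_score]
        if kept:
            self.models = kept


pool = ModelPool(capacity=2, min_score=0.5)
pool.add("early", lambda patch: 0.9)   # stand-in for a CNN's confidence
pool.add("recent", lambda patch: 0.2)
print(pool.best(None)[0])              # name of the model selected for this frame
pool.prune(None)
print([name for name, _ in pool.models])
```

Keeping several simple models and selecting among them per frame is one way to cover different appearances of the target without training a single large network.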

Class-Specific Tracking
Combine the class-level detector and the instance-level tracker.

Results

Results

Results – Class Specific Tracking

Observations
Low-level and high-level information need to be combined.
Deep CNN features lack exact localization ability.
Learning a CNN from few examples leads to overfitting.

On-Going Research (diagram labels: learning a CNN; probability map; re-initialize)