Sreekanth Vempati ( 200402044 ) Advisors: Dr. C. V. Jawahar ( IIIT Hyderabad ), Dr. Andrew Zisserman ( Univ. of Oxford ) Efficient SVM based object classification.

Efficient SVM based object classification and detection
Sreekanth Vempati (200402044)
Advisors: Dr. C. V. Jawahar (IIIT Hyderabad), Dr. Andrew Zisserman (Univ. of Oxford)
14 December 2010

Large Visual Data
– Cheap capturing, storage and internet devices

Rapid growth in the amount of data available from video sharing (for example, YouTube) and image sharing

Problems
– Scene/Object Classification: find specified categories of scenes/objects. Ex: Is there a bus in this image? Is there a demonstration/protest in this image?
– Object Detection: find the location of specified categories of scenes/objects. Ex: Output the bounding box of the bus in this image

Challenges
– Intra-class variations
– Inter-class similarity
Ex: Boat/Ship category. Image labels: Protest, Flowers, Cityscape

Challenges
– Viewpoint variation
– Occlusions/Truncations

Scalability
We need solutions that scale to large amounts of data. For example, if we have to test 140,000 images with the best-performing setup:
– Feature representation (visual words based, 6300 dimensions) takes ~50 seconds per image -> total time would be ~57 days
– Classification (SVM with non-linear kernel, 20 classes) runs at 3 images/second, a total time of ~10 days
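A quick back-of-the-envelope check of these totals, as a sketch only. It assumes strictly sequential processing and that the 3 images/second rate applies to a single class; the function and variable names are illustrative. With these inputs the classification total comes to about 10.8 days, in line with the slide, while feature extraction comes to about 81 days; the quoted ~57 days corresponds to roughly 100,000 images at 50 seconds each.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_days(num_images, seconds_per_image):
    """Total sequential processing time, in days."""
    return num_images * seconds_per_image / SECONDS_PER_DAY


# Figures quoted on the slide.
num_images = 140_000
feature_seconds_per_image = 50.0
num_classes = 20
images_per_second = 3.0  # assumption: rate for one class, so 20 classes cost 20/3 s per image

feature_days = total_days(num_images, feature_seconds_per_image)
classify_days = total_days(num_images, num_classes / images_per_second)

print(f"feature extraction: {feature_days:.1f} days")
print(f"classification:     {classify_days:.1f} days")
```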

Overview
– Large scale semantic concept retrieval in videos
– Modeling subcategories
– Efficient detection by using GRBF feature maps
– Conclusions

1. Semantic video retrieval
Given a large set of videos, retrieve the videos of a specific category
– Ex: Find all the videos containing soccer

Overview of the approach
– Training: Annotated Video Frames (Example Videos) -> Feature Extraction (Ex: PHOW, PHOG, GIST) -> Classifier (Ex: SVM, Random Forests)
– Testing: Unseen Videos -> Feature Extraction -> Classifier -> Ranked Shots

Features: GIST (Torralba et al., IJCV 01)
– Image divided into an m x m grid
– For each cell, a set of filters (different scales, orientations) is applied
– Final descriptor: filter responses averaged within each cell, collected over all cells
Images from “Image Classification for large number of object categories”, Anna Bosch, 2006

Features: Pyramid Histogram of Oriented Gradients (PHOG)
Images from “Image Classification for large number of object categories”, Anna Bosch, 2006

Features: Pyramid Histogram of Visual Words (PHOW)
– Dense Scale Invariant Feature Transform (SIFT) descriptors
– Vector quantization of the descriptors into visual words
“Beyond bag of features: Spatial pyramid matching for recognizing natural scene categories”, S. Lazebnik et al., CVPR 2006
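The two steps named on this slide (vector quantization, then a pyramid of visual-word histograms) can be sketched as follows. This is a minimal illustration, not the pipeline used in the talk: random arrays stand in for dense SIFT descriptors and for a k-means vocabulary, the per-level weights of spatial pyramid matching are omitted, and all names and sizes are assumptions. With two pyramid levels there are 1 + 4 + 16 = 21 cells, so a hypothetical 300-word vocabulary would give 300 x 21 = 6300 dimensions, the size quoted earlier, though the actual vocabulary size is not stated.

```python
import numpy as np


def quantize(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word (Euclidean distance)."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def pyramid_histogram(words, positions, num_words, levels=2):
    """Concatenate per-cell word histograms over a spatial pyramid.

    positions holds (x, y) coordinates normalised to [0, 1).
    Level l splits the image into 2**l x 2**l cells.
    """
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level
        cx = np.minimum((positions[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((positions[:, 1] * cells).astype(int), cells - 1)
        cell_index = cy * cells + cx
        for c in range(cells * cells):
            parts.append(np.bincount(words[cell_index == c], minlength=num_words))
    hist = np.concatenate(parts).astype(float)
    return hist / max(hist.sum(), 1.0)  # L1 normalisation


rng = np.random.default_rng(0)
descriptors = rng.random((500, 128))  # stand-in for dense SIFT descriptors
positions = rng.random((500, 2))      # stand-in for descriptor locations
vocabulary = rng.random((30, 128))    # stand-in for k-means cluster centres

words = quantize(descriptors, vocabulary)
phow = pyramid_histogram(words, positions, num_words=30, levels=2)
print(phow.shape)
```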

Support Vector Machines (SVM)
Training data: $x_i$, $y_i$, $i = 1, \dots, N$
Separating hyperplane $w^T x + b = 0$, with margins $w^T x + b = +1$ and $w^T x + b = -1$
Figure labels: $w$, $b$, support vector, misclassified point, $\xi = 0$, $\xi < 1$

SVM formulation
Evaluation function: $f(x) = w^T x + b$

Kernel Trick
Use a function that maps the input space to a feature space, and then build the classifier in the feature space.

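Moving to a different space: dot product in feature space
$f(x) = w^T x + b = \sum_i \alpha_i y_i \langle x_i, x \rangle + b$
In the feature space the dot product becomes $\langle \phi(x_i), \phi(x) \rangle$.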

Kernelizing SVMs
Replace the dot product with a kernel function $K(x_i, x)$
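The step from the primal evaluation function to the kernelized one can be checked numerically. The sketch below uses made-up support vectors and coefficients (not a trained model) and shows that, for the linear kernel, the dual form $\sum_i \alpha_i y_i K(x_i, x) + b$ gives the same value as $w^T x + b$ with $w = \sum_i \alpha_i y_i x_i$. Swapping in another kernel function changes only the `kernel` argument.

```python
import numpy as np


def linear_kernel(a, b):
    return float(a @ b)


def decision_primal(x, w, b):
    """f(x) = w^T x + b"""
    return float(w @ x + b)


def decision_dual(x, support_vectors, alphas, labels, b, kernel):
    """f(x) = sum_i alpha_i y_i K(x_i, x) + b"""
    return sum(a * y * kernel(sv, x)
               for a, y, sv in zip(alphas, labels, support_vectors)) + b


# Hypothetical values chosen for illustration only.
support_vectors = np.array([[1.0, 2.0], [2.0, 0.5], [0.0, 1.0]])
alphas = np.array([0.5, 0.7, 0.2])
labels = np.array([1.0, -1.0, 1.0])
b = -0.1

w = (alphas * labels) @ support_vectors  # w = sum_i alpha_i y_i x_i
x = np.array([0.7, 1.5])

f_primal = decision_primal(x, w, b)
f_dual = decision_dual(x, support_vectors, alphas, labels, b, linear_kernel)
print(f_primal, f_dual)
```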

Kernels
– Linear
– Polynomial
– Intersection kernel
– Generalized RBF kernel
– Weighted combination of multiple kernels
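The formulas for these kernels did not survive on the slide, so the sketch below uses their standard textbook forms. The parameter names (`degree`, `c`, `gamma`), their defaults, and the choice of the chi-squared distance inside the generalized RBF kernel are assumptions for illustration.

```python
import numpy as np


def linear_kernel(x, y):
    return float(x @ y)


def polynomial_kernel(x, y, degree=2, c=1.0):
    return float((x @ y + c) ** degree)


def intersection_kernel(x, y):
    """Histogram intersection: sum_i min(x_i, y_i)."""
    return float(np.minimum(x, y).sum())


def chi2_distance(x, y, eps=1e-12):
    """Chi-squared distance: sum_i (x_i - y_i)^2 / (x_i + y_i)."""
    return float(((x - y) ** 2 / (x + y + eps)).sum())


def grbf_kernel(x, y, gamma=1.0, distance=chi2_distance):
    """Generalized RBF kernel: exp(-gamma * D^2(x, y)) for a chosen distance."""
    return float(np.exp(-gamma * distance(x, y)))


def combined_kernel(x, y, kernels, weights):
    """Weighted combination of several kernels."""
    return float(sum(w * k(x, y) for w, k in zip(weights, kernels)))


x = np.array([0.2, 0.5, 0.3])
y = np.array([0.4, 0.4, 0.2])

k_lin = linear_kernel(x, y)
k_poly = polynomial_kernel(x, y)
k_int = intersection_kernel(x, y)
k_grbf = grbf_kernel(x, y)
k_mix = combined_kernel(x, y, [linear_kernel, intersection_kernel], [0.5, 0.5])
print(k_lin, k_poly, k_int, k_grbf, k_mix)
```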

TRECVID competition
– Objective: rank video shots based on the presence of a given concept
– Participated in High-level Feature Extraction, TRECVID, organized by NIST, USA
– 2008: around 180 submissions by 40 teams from all over the world

Some of the classes (High-level Feature Extraction)
Mountain, Hand, Street, Telephone, Flower, Bridge, Airplane flying, Boat/Ship, Bus, Dog, Cityscape, Classroom, Driver, Two People, Emergency Vehicle, Harbor, Kitchen, Nighttime, Singing, Demonstration/Protest

Data Statistics
Evaluation measure: Average Precision, the area under the Precision-Recall curve
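One common way to approximate the area under the precision-recall curve is non-interpolated average precision: the mean of the precision values at the rank of each relevant item. The sketch below implements that variant; the exact variant used in the evaluation is not stated on the slide, and the scores and labels are made up.

```python
import numpy as np


def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ranked = np.asarray(labels)[order].astype(bool)
    if not ranked.any():
        return 0.0
    hits = np.cumsum(ranked)
    precision_at_k = hits / np.arange(1, len(ranked) + 1)
    return float(precision_at_k[ranked].mean())


# Hypothetical ranked list: positives at ranks 1 and 3.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
ap = average_precision(scores, labels)
print(ap)  # (1/1 + 2/3) / 2
```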

Our Approach
Performance compared using different features and SVM parameters
– PHOW with the intersection kernel is efficient
– Testing is very fast, with little drop in performance
– Testing time: ~200,000 frames in 10 seconds
“Classification using Intersection kernel SVMs is efficient”, A. Berg et al., CVPR 2009
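Why intersection kernel SVMs can be evaluated quickly: the decision function decomposes over dimensions, and within each dimension the sum over support vectors can be read off from sorted values and precomputed cumulative sums. The sketch below shows this exact evaluation, costing O(d log m) per test vector instead of O(d m) for m support vectors. It follows the general idea of the cited work rather than the authors' code; the class and function names and the random data are hypothetical.

```python
import numpy as np


def naive_decision(x, support_vectors, coeffs, bias):
    """f(x) = sum_l c_l * sum_i min(x_i, sv_li) + b, cost O(m d)."""
    return float(coeffs @ np.minimum(support_vectors, x).sum(axis=1) + bias)


class FastIntersectionSVM:
    """Exact intersection kernel SVM evaluation in O(d log m)."""

    def __init__(self, support_vectors, coeffs, bias):
        # coeffs[l] = alpha_l * y_l for support vector l
        sv = np.asarray(support_vectors, dtype=float)
        coeffs = np.asarray(coeffs, dtype=float)
        _, d = sv.shape
        order = np.argsort(sv, axis=0)
        self.sorted_vals = np.take_along_axis(sv, order, axis=0)
        c_sorted = coeffs[order]
        zeros = np.zeros((1, d))
        # prefix[r, i]: sum of c * v over the r smallest values in dimension i
        self.prefix = np.vstack(
            [zeros, np.cumsum(c_sorted * self.sorted_vals, axis=0)])
        # suffix[r, i]: sum of c over the values ranked r and above
        self.suffix = np.vstack(
            [np.cumsum(c_sorted[::-1], axis=0)[::-1], zeros])
        self.bias = float(bias)

    def decision(self, x):
        total = self.bias
        for i, s in enumerate(x):
            # r = number of support vector values <= s in dimension i
            r = np.searchsorted(self.sorted_vals[:, i], s, side="right")
            total += self.prefix[r, i] + s * self.suffix[r, i]
        return float(total)


rng = np.random.default_rng(0)
sv = rng.random((40, 6))        # hypothetical support vectors
coeffs = rng.normal(size=40)    # hypothetical alpha_l * y_l
bias = 0.25
x = rng.random(6)

model = FastIntersectionSVM(sv, coeffs, bias)
f_fast = model.decision(x)
f_naive = naive_decision(x, sv, coeffs, bias)
print(f_fast, f_naive)
```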

Variation with features

Variation with kernels

Results

1. Summary
– A method of visual concept retrieval suitable for large-scale data
– PHOW with the fast intersection kernel is particularly useful

2. Modeling subcategories

Subcategories in the real world

What did we achieve?

Structural SVM vs SVM
– Joint feature map between input and output
– Allows the output label to be a complex variable
– Our case: the output is a combination of category and subcategory labels
“Support Vector Learning for Interdependent and Structured Output Spaces”, I. Tsochantaridis et al., ICML 04
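To make the idea of a joint feature map concrete, here is one simple construction in which the output is a (category, subcategory) pair and the input vector is copied into the block selected by that pair. This is an illustrative assumption, not necessarily the feature map used in the talk; the function names, sizes and weights are hypothetical. Prediction maximises the linear score over all outputs.

```python
import numpy as np


def joint_feature_map(x, category, subcategory, num_categories, num_subcategories):
    """Psi(x, y): copy x into the block selected by y = (category, subcategory)."""
    d = x.shape[0]
    psi = np.zeros(num_categories * num_subcategories * d)
    block = category * num_subcategories + subcategory
    psi[block * d:(block + 1) * d] = x
    return psi


def predict(x, w, num_categories, num_subcategories):
    """Return the output y = (category, subcategory) maximising w . Psi(x, y)."""
    outputs = [(c, s)
               for c in range(num_categories)
               for s in range(num_subcategories)]
    return max(
        outputs,
        key=lambda y: w @ joint_feature_map(
            x, y[0], y[1], num_categories, num_subcategories),
    )


# Hypothetical weights: block k responds to the k-th coordinate of x.
num_categories, num_subcategories, d = 2, 2, 4
w = np.zeros(num_categories * num_subcategories * d)
for k in range(num_categories * num_subcategories):
    w[k * d + k] = 1.0

x = np.array([0.0, 0.0, 1.0, 0.0])
category, subcategory = predict(x, w, num_categories, num_subcategories)
print(category, subcategory)
```

With this block construction the model is equivalent to one linear scorer per (category, subcategory) pair, and the predicted category is the one owning the best-scoring subcategory.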

Use of latent variables
“Learning structural SVMs with latent variables”, C. N. Yu et al., ICML 2009

Toy Datasets

Real world datasets
– TRECVID 2009 dataset
– PASCAL VOC (Visual Object Classes) 2007: object detection dataset

Results on TRECVID dataset

Improvement with latent SVM

Effect of the number of subclasses

2. Summary
– Method for modeling of subcategories using structural SVM
– Application of latent structural SVM for further improvements
– Improved the performance of the linear kernel
– Performed various experiments on toy and real data

3. Generalized RBF feature maps for Efficient Detection

Object Detection
Example categories: aeroplane, horse, bicycle, car, cow, motorbike

Part 3: Outline
– Introduction: Kernels and Feature maps
– Explicit feature maps for GRBF kernels
– Experiments & Results

General Framework for detection
– Feature representations
– Classifier (Ex: SVM)
Figure labels: Any Image, Ex: Car, Linear SVM, Non-linear SVM
“Multiple Kernel Learning for Object Detection”, Vedaldi et al., ICCV 2009
“Cascade Object Detection with Deformable Part Models”, Felzenszwalb et al., CVPR 2010

Kernels
– Linear SVM -> Additive kernels (Ex: intersection kernel) -> Generalized RBF kernels (Ex: exp-$\chi^2$ kernel)
– Towards the linear end: faster; towards the generalized RBF end: more discriminative
– Fast linear SVMs: Stochastic SVM (PEGASOS), Primal SVM (liblinear), One-slack SVM (SVM-perf)

Kernels
– Problem: GRBF kernels, which have high computational complexity, are required to get good performance
– Our Solution: approximate generalized RBF kernels with a linear one by using a feature map

Speeding up non-linear SVMs
– A kernel is a dot product in a high-dimensional feature space
– Define a feature map approximating the kernel
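A small case where the feature map is exact makes the idea concrete. For Hellinger's kernel, $K(x, y) = \sum_i \sqrt{x_i y_i}$, the map $\Psi(x) = \sqrt{x}$ (element-wise) reproduces the kernel as a plain dot product, so a linear SVM on $\Psi(x)$ behaves like a kernel SVM on $x$. The vectors below are made-up histograms.

```python
import numpy as np


def hellinger_kernel(x, y):
    """K(x, y) = sum_i sqrt(x_i * y_i)"""
    return float(np.sqrt(x * y).sum())


def hellinger_feature_map(x):
    """Exact feature map: Psi(x) = sqrt(x), element-wise."""
    return np.sqrt(x)


x = np.array([0.2, 0.5, 0.3])
y = np.array([0.4, 0.4, 0.2])

k_exact = hellinger_kernel(x, y)
k_mapped = float(hellinger_feature_map(x) @ hellinger_feature_map(y))
print(k_exact, k_mapped)
```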

Explicit feature maps
– Feature maps for RBF/multiplicative kernels: [Rahimi and Recht, NIPS 07], [F. Li et al., DAGM 2010]
– Feature maps for additive kernels: [Maji and Berg, ICCV 09], [Vedaldi and Zisserman, CVPR 2010], [Perronnin et al., CVPR 2010]
– Our Contribution: feature maps for generalized RBF kernels, giving a 2x to 3x speedup (with only a little drop in performance)

Part 3: Outline
– Introduction: Kernels and Feature maps
– Explicit feature maps for GRBF kernels
– Experiments & Results

Additive kernels
Examples: Intersection, Hellinger’s, $\chi^2$ kernel

Feature maps for additive kernels [Vedaldi & Zisserman 10]
– Closed-form function
– Additive kernel maps approximated by sampling
“Efficient Additive Kernels via Explicit Feature Maps”, A. Vedaldi and A. Zisserman, CVPR 2010
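As an illustration of a sampled, closed-form additive kernel map, the sketch below builds an approximate feature map for the $\chi^2$ kernel $k(x, y) = 2xy / (x + y)$ by sampling its spectrum $\kappa(\lambda) = \mathrm{sech}(\pi \lambda)$ at regular intervals, in the spirit of the cited paper. The sampling period `L`, the number of samples `n` and the random test histograms are assumptions chosen for accuracy in this example, not values taken from the talk.

```python
import numpy as np


def chi2_kernel(x, y, eps=1e-12):
    """Additive chi2 kernel: sum_i 2 x_i y_i / (x_i + y_i)."""
    return float((2.0 * x * y / (x + y + eps)).sum())


def chi2_feature_map(x, n=10, L=0.3, eps=1e-12):
    """Approximate map with 2n + 1 components per input dimension."""
    x = np.asarray(x, dtype=float)
    log_x = np.log(np.maximum(x, eps))  # guarded so that zero bins map to zero
    sqrt_x = np.sqrt(x)

    def kappa(lam):
        # Spectrum of the chi2 kernel: sech(pi * lambda)
        return 1.0 / np.cosh(np.pi * lam)

    feats = [sqrt_x * np.sqrt(L * kappa(0.0))]
    for j in range(1, n + 1):
        scale = sqrt_x * np.sqrt(2.0 * L * kappa(j * L))
        feats.append(scale * np.cos(j * L * log_x))
        feats.append(scale * np.sin(j * L * log_x))
    return np.concatenate(feats)


rng = np.random.default_rng(0)
x = rng.uniform(0.1, 1.0, size=8)  # hypothetical non-negative feature vectors
y = rng.uniform(0.1, 1.0, size=8)

k_exact = chi2_kernel(x, y)
k_approx = float(chi2_feature_map(x) @ chi2_feature_map(y))
print(k_exact, k_approx)
```

Fewer samples give a shorter feature vector at the cost of a coarser approximation, which is the trade-off a linear SVM on the mapped features has to balance.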

Feature maps for RBF kernels
Random Fourier features [Rahimi & Recht 07]
“Random Features for Large-Scale Kernel Machines”, Ali Rahimi, Ben Recht, NIPS 2007
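A minimal sketch of random Fourier features for the Gaussian RBF kernel $\exp(-\gamma \|x - y\|^2)$: project onto random directions drawn from $N(0, 2\gamma I)$, add a random phase, and take a cosine. The dot product of two mapped vectors is then an unbiased estimate of the kernel. The number of projections, $\gamma$, the seed and the helper names are assumptions for this example.

```python
import numpy as np


def rbf_kernel(x, y, gamma):
    return float(np.exp(-gamma * ((x - y) ** 2).sum()))


def make_random_fourier_map(input_dim, num_projections, gamma, seed=0):
    """Return z(.) such that z(x) . z(y) approximates exp(-gamma ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    # Frequencies from the kernel's Fourier transform: N(0, 2 * gamma * I)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(num_projections, input_dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_projections)

    def feature_map(x):
        return np.sqrt(2.0 / num_projections) * np.cos(W @ x + b)

    return feature_map


rng = np.random.default_rng(1)
x = rng.random(5)
y = rng.random(5)
gamma = 0.5

z = make_random_fourier_map(input_dim=5, num_projections=5000, gamma=gamma)
k_exact = rbf_kernel(x, y, gamma)
k_approx = float(z(x) @ z(y))
print(k_exact, k_approx)
```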

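Generalized RBF kernels
– Definition: $K(x, y) = \exp\left(-\frac{1}{2\sigma^2} D^2(x, y)\right)$, where $D^2(x, y) = k(x, x) + k(y, y) - 2k(x, y)$ is the distance induced by an additive kernel $k$
– Trick, in terms of the feature map $\Psi$ of $k$: $D^2(x, y) = \|\Psi(x) - \Psi(y)\|^2$
– Example: the $\chi^2$ kernel induces the $\chi^2$ distance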

GRBF feature maps algorithm

Part 3: Outline
– Introduction: Kernels and Feature maps
– Explicit feature maps for generalized RBF kernels
– Experiments & Results

Experimental Setup
– PASCAL VOC (Visual Object Classes) 2007: object detection dataset, 20 object categories
– PHOG features
– Cascade of classifiers [Vedaldi et al., ICCV 2009]
– Classifiers: Linear SVM, Additive Kernel SVM, exp-$\chi^2$ kernel SVM, exp-$\chi^2$ feature map SVM

Approximate vs Exact Kernels
Average precision increases with the number of projections, but so does testing time.

Approximate vs Exact Kernels
Average precision and testing time both increase with the number of projections

Additional improvement in testing time
A large number of projections is required for good performance, so sparse classifiers are used to reduce testing time:
– SVM SPARSE: L1-regularized, L2-loss function
– LR SPARSE: L1-regularized, logistic regression loss function

Speedup with $\ell_1$ regularization

Effect of C on Sparsity
– Parameter to control sparsity: SVM parameter C (recall the SVM objective function)
– Smaller C gives a sparser solution with only a slight drop in performance
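The link between C and sparsity can be seen on a small synthetic problem. The sketch below minimises $\|w\|_1 + C \sum_i \log(1 + e^{-y_i w^T x_i})$ with a plain proximal-gradient (ISTA) loop, which is a stand-in solver chosen for brevity and not the solver used in the experiments. The data, the values of C and the iteration count are assumptions. Soft-thresholding sets weights exactly to zero, so the number of non-zero weights can be counted directly.

```python
import numpy as np


def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def l1_logistic_regression(X, y, C, num_iters=500):
    """Minimise ||w||_1 + C * sum_i log(1 + exp(-y_i w.x_i)) with ISTA."""
    # Upper bound on the Lipschitz constant of the smooth part's gradient.
    lipschitz = C * np.linalg.norm(X, 2) ** 2 / 4.0
    step = 1.0 / lipschitz
    w = np.zeros(X.shape[1])
    for _ in range(num_iters):
        margins = np.clip(y * (X @ w), -500.0, 500.0)
        grad = -C * (X.T @ (y / (1.0 + np.exp(margins))))
        w = soft_threshold(w - step * grad, step)
    return w


# Synthetic data: only the first three of 50 features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = np.where(X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 2] >= 0.0, 1.0, -1.0)

nonzeros = {}
for C in (1e-4, 0.1, 10.0):
    w = l1_logistic_regression(X, y, C)
    nonzeros[C] = int(np.count_nonzero(w))
    print(f"C = {C:g}: {nonzeros[C]} non-zero weights out of {w.size}")
```

A zero weight means the corresponding projection never has to be computed at test time, which is where the extra speedup comes from.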

Example Results

Results on all the 20 categories
SVM dense is faster than exact exp-$\chi^2$ and performs better than $\chi^2$

Results on all the 20 categories
LR sparse is 2 to 3 times faster than exact exp-$\chi^2$ and performs better than $\chi^2$

3. Summary
– Feature maps for generalized RBF kernels
– Method for reducing the number of projections
– Results on VOC 2007: nearly 2x to 3x speedup with a slight loss in performance

Conclusions
– Proposed efficient methods based on SVM for visual scene/object categorization and detection
– Validated these methods on a large amount of data
– Further work: porting these techniques to GPUs; including temporal information to improve average precision

Publications
– 2010: Generalized RBF feature maps for efficient detection. Sreekanth Vempati, Andrea Vedaldi, Andrew Zisserman, C. V. Jawahar. 21st British Machine Vision Conference (BMVC), 2010 (Oral Presentation), Aberystwyth, UK.
– 2009: Oxford/IIIT TRECVID 2009, Notebook paper. Sreekanth Vempati, Mihir Jain, Omkar M. Parkhi, C. V. Jawahar, Andrea Vedaldi, Marcin Marszalek, Andrew Zisserman. TRECVID 2009 Workshop, Gaithersburg, Md., USA.
– 2008: Oxford/IIIT TRECVID 2008, Notebook paper. James Philbin, Manuel Marin-Jimenez, Siddharth Srinivasan, Andrew Zisserman, Mihir Jain, Sreekanth Vempati, Pramod Sankar, C. V. Jawahar. TRECVID 2008 Workshop, Gaithersburg, Md., USA.

Thank You

Object Detection
Should we put our results or this ground truth?


GRBF-Algorithm
1. Compute $\hat{\Psi}(x)$, the approximate feature map corresponding to the additive kernel
2. Compute the RBF feature map using $\hat{\Psi}(x)$ as the input vector
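The two steps can be sketched end to end for the exp-$\chi^2$ kernel $\exp(-\gamma \chi^2(x, y))$: first an approximate $\chi^2$ feature map (sampled spectrum), then random Fourier features applied to its output. This works because the squared Euclidean distance between mapped vectors approximates the $\chi^2$ distance. It is a sketch under assumptions, not the authors' implementation: `gamma` plays the role of the kernel's inverse bandwidth, and the sampling parameters, number of projections, seed and test histograms are chosen for illustration.

```python
import numpy as np


def chi2_distance(x, y, eps=1e-12):
    return float(((x - y) ** 2 / (x + y + eps)).sum())


def exact_exp_chi2(x, y, gamma):
    return float(np.exp(-gamma * chi2_distance(x, y)))


def chi2_feature_map(x, n=10, L=0.3, eps=1e-12):
    """Step 1: approximate map for the additive chi2 kernel 2xy / (x + y)."""
    x = np.asarray(x, dtype=float)
    log_x = np.log(np.maximum(x, eps))
    sqrt_x = np.sqrt(x)
    feats = [sqrt_x * np.sqrt(L)]  # sech(0) = 1
    for j in range(1, n + 1):
        scale = sqrt_x * np.sqrt(2.0 * L / np.cosh(np.pi * j * L))
        feats.append(scale * np.cos(j * L * log_x))
        feats.append(scale * np.sin(j * L * log_x))
    return np.concatenate(feats)


def make_grbf_map(input_dim, num_projections, gamma, n=10, L=0.3, seed=0):
    """Return a map whose dot products approximate exp(-gamma * chi2(x, y))."""
    rng = np.random.default_rng(seed)
    mapped_dim = input_dim * (2 * n + 1)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(num_projections, mapped_dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_projections)

    def feature_map(x):
        psi = chi2_feature_map(x, n=n, L=L)                          # step 1
        return np.sqrt(2.0 / num_projections) * np.cos(W @ psi + b)  # step 2

    return feature_map


rng = np.random.default_rng(1)
x = rng.uniform(0.1, 1.0, size=10)
y = rng.uniform(0.1, 1.0, size=10)
x, y = x / x.sum(), y / y.sum()  # L1-normalised histograms

gamma = 1.0
grbf_map = make_grbf_map(input_dim=10, num_projections=5000, gamma=gamma)
k_exact = exact_exp_chi2(x, y, gamma)
k_approx = float(grbf_map(x) @ grbf_map(y))
print(k_exact, k_approx)
```

The number of projections sets the length of the final vector, so it directly trades approximation quality against testing time.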

Choice of features
PHOG features are used in our experiments
– exp-$\chi^2$ performs better than $\chi^2$
Figure: results using exact kernels, for PHOG and PHOW

Sparsity vs Performance
