Published by Gavin Carroll. Modified over 8 years ago.
1
Efficient SVM based object classification and detection
Sreekanth Vempati (200402044)
Advisors: Dr. C. V. Jawahar (IIIT Hyderabad), Dr. Andrew Zisserman (Univ. of Oxford)
14 December 2010
2
Large Visual Data: cheap capturing, storage and internet devices
3
Rapid growth in the amount of data available through video sharing and image sharing, as in the case of YouTube
4
Problems
– Scene/Object Classification: find specified categories of scenes/objects. Ex: Is there a bus in this image? Is there a demonstration/protest in this image?
– Object Detection: find the location of specified categories of scenes/objects. Ex: Output the bounding box of the bus in this image
5
Challenges: intra-class variations and inter-class similarity. Ex: Boat/Ship category, Protest, Flowers, Cityscape
6
Challenges: viewpoint variation, occlusions/truncations
7
Scalability
We need solutions that scale to large amounts of data. For example, suppose we have to test 1,40,000 (140,000) images. For best performance:
– Feature representation (visual words based, 6300 dimensions) takes ~50 seconds per image -> total time would be ~57 days
– Classification (SVM with non-linear kernel), 20 classes at 3 images/second -> a total time of ~10 days
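The arithmetic behind these estimates can be checked with a short sketch. It assumes sequential processing, 140,000 images, 50 s per image for features, and one classification pass per class; these are readings of the slide, not figures confirmed elsewhere. Under those assumptions classification comes to about 10.8 days, which matches the slide. Feature extraction comes to about 81 days, so the quoted ~57 days would correspond to a somewhat different setting, such as roughly 35 s per image.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def total_days(num_images, seconds_per_image, passes=1):
    """Sequential processing time in days for `passes` passes over the images."""
    return num_images * seconds_per_image * passes / SECONDS_PER_DAY

num_images = 140_000  # "1,40,000" in Indian digit grouping

# Feature extraction, assuming ~50 s per image.
feature_days = total_days(num_images, 50.0)

# Non-linear SVM at 3 images/second, assuming one pass per class (20 classes).
classify_days = total_days(num_images, 1.0 / 3.0, passes=20)

print(f"feature extraction: {feature_days:.1f} days")
print(f"classification:     {classify_days:.1f} days")
```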
8
Overview
– Large scale semantic concept retrieval in videos
– Modeling subcategories
– Efficient detection by using GRBF feature maps
– Conclusions
9
1. Semantic video retrieval
Given a large set of videos, retrieve the videos of a specific category
– Ex: Find all the videos containing soccer
10
Overview of the approach
– Training: Annotated Video Frames (Example Videos) -> Feature Extraction (Ex: PHOW, PHOG, GIST) -> Classifier (Ex: SVM, Random Forests)
– Testing: Unseen Videos -> Feature Extraction -> Classifier -> Ranked Shots
11
Features: GIST (Oliva and Torralba, IJCV 2001)
– Image divided into an m x m grid
– For each cell, a set of filters (different scales, orientations) is applied
– Final descriptor: the average filter responses of each cell, concatenated over all cells
Images from “Image Classification for large number of object categories”, Anna Bosch, 2006
12
Features: Pyramid Histogram of Oriented Gradients (PHOG)
Images from “Image Classification for large number of object categories”, Anna Bosch, 2006
13
Features: Pyramid Histogram of Visual Words (PHOW), using dense SIFT (Scale Invariant Feature Transform) descriptors followed by vector quantization
“Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories”, S. Lazebnik et al., CVPR 2006
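The vector quantization step can be illustrated with a minimal sketch. It assumes a codebook of visual words is already available (for example from k-means) and uses tiny 2-D vectors in place of real 128-D SIFT descriptors. The function name `bag_of_words` and the toy data are hypothetical. A spatial pyramid would compute one such histogram per image cell and concatenate them, which is omitted here.

```python
import numpy as np

def bag_of_words(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and return a
    normalized histogram of visual-word counts."""
    # Squared Euclidean distances, shape (num_descriptors, num_words).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Toy codebook and descriptors, made up for illustration.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [0.1, 0.8]])

hist = bag_of_words(descriptors, codebook)
print(hist)
```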
14
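Support Vector Machines (SVM). Training data: points $x_i$, $i = 1, \ldots, N$, with labels $y_i$, $i = 1, \ldots, N$. [Figure: separating hyperplane $w^T x + b = 0$ with margin boundaries $w^T x + b = +1$ and $w^T x + b = -1$, showing the normal $w$, the offset $b$, the support vectors, and a misclassified point.]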
15
SVM formulation. Evaluation function: $f(x) = w^T x + b$
16
Kernel Trick: use a function $\phi$ which maps the input space to a feature space, and then build the classifier in the feature space.
17
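Moving to a different space: $f(x) = w^T \phi(x) + b = \sum_i \alpha_i y_i \, \phi(x_i)^T \phi(x) + b$, where $\phi$ is the feature map and $\phi(x_i)^T \phi(x)$ is a dot product in feature space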
18
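Kernelizing SVMs: replace the dot product with a kernel function $K(x_i, x) = \phi(x_i)^T \phi(x)$, so that $f(x) = \sum_i \alpha_i y_i K(x_i, x) + b$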
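A minimal sketch of the kernelized evaluation function, assuming the dual coefficients and support vectors are already known. The toy values and function names below are made up for illustration. With a linear kernel the dual form should agree with $w^T x + b$, which the example checks.

```python
import numpy as np

def decision_function(x, support_vectors, alphas, labels, b, kernel):
    """Evaluate f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    k = np.array([kernel(sv, x) for sv in support_vectors])
    return float(np.dot(alphas * labels, k) + b)

def linear_kernel(u, v):
    return float(np.dot(u, v))

# Toy model, made up for illustration.
support_vectors = np.array([[1.0, 0.0], [0.0, 1.0]])
alphas = np.array([0.5, 0.5])
labels = np.array([1.0, -1.0])
b = 0.0

# For the linear kernel, w = sum_i alpha_i * y_i * x_i.
w = (alphas * labels) @ support_vectors
x = np.array([2.0, 1.0])

f_dual = decision_function(x, support_vectors, alphas, labels, b, linear_kernel)
f_primal = float(w @ x + b)
print(f_dual, f_primal)
```

Note that each evaluation costs one kernel computation per support vector, which is why non-linear SVMs are slow at test time.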
19
Kernels
– Linear
– Polynomial
– Intersection kernel
– Generalized RBF kernel
– Weighted combination of multiple kernels
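The formulas for these kernels did not survive on this slide, so the sketch below uses their standard textbook forms as an assumption. The polynomial parameters `degree` and `c`, the bandwidth `sigma`, and the choice of the χ² distance for the generalized RBF kernel are illustrative choices rather than values taken from the talk.

```python
import numpy as np

def linear_kernel(x, y):
    return float(np.dot(x, y))

def polynomial_kernel(x, y, degree=2, c=1.0):
    return float((np.dot(x, y) + c) ** degree)

def intersection_kernel(x, y):
    # Histogram intersection: sum_j min(x_j, y_j).
    return float(np.sum(np.minimum(x, y)))

def chi2_distance(x, y, eps=1e-12):
    # sum_j (x_j - y_j)^2 / (x_j + y_j); eps guards against empty bins.
    return float(np.sum((x - y) ** 2 / (x + y + eps)))

def exp_chi2_kernel(x, y, sigma=1.0):
    # Generalized RBF kernel: a Gaussian of the chi-squared distance.
    return float(np.exp(-chi2_distance(x, y) / (2.0 * sigma ** 2)))

def combined_kernel(x, y, kernels, weights):
    # Weighted combination of several base kernels.
    return float(sum(w * k(x, y) for k, w in zip(kernels, weights)))

# Two toy L1-normalized histograms, made up for illustration.
x = np.array([0.2, 0.3, 0.5])
y = np.array([0.1, 0.6, 0.3])

print(linear_kernel(x, y), intersection_kernel(x, y), exp_chi2_kernel(x, y))
```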
20
TRECVID competition
– Objective: rank video shots based on the presence of a given concept
– Participated in the High-level Feature Extraction task of TRECVID, organized by NIST, USA
– 2008: around 180 submissions by 40 teams from all over the world
21
Some of the classes (High-level Feature Extraction): Mountain, Hand, Street, Telephone, Flower, Bridge, Airplane flying, Boat/Ship, Bus, Dog, Cityscape, Classroom, Driver, Two People, Emergency Vehicle, Harbor, Kitchen, Nighttime, Singing, Demonstration/Protest
22
Data Statistics. Evaluation measure: Average Precision, the area under the Precision-Recall curve
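A minimal sketch of average precision for a ranked list, assuming the plain non-interpolated definition (the mean of the precision values at the rank of each relevant item). Benchmarks such as TRECVID and PASCAL VOC use their own variants of this measure, so the numbers here are illustrative only.

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP: mean of precision@k over the ranks k
    at which a relevant (label 1) item appears."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    relevant = np.asarray(labels)[order].astype(bool)
    if not relevant.any():
        return 0.0
    hits = np.cumsum(relevant)
    ranks = np.arange(1, len(relevant) + 1)
    precision_at_k = hits / ranks
    return float(precision_at_k[relevant].mean())

# Toy ranking: relevant shots at ranks 1 and 3.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
ap = average_precision(scores, labels)
print(ap)
```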
23
Our Approach
Performance compared using different features and SVM parameters
– PHOW with the intersection kernel is efficient
– Testing is very fast, with little drop in performance. Testing time: ~2 lakh (200,000) frames in 10 seconds
“Classification using Intersection Kernel SVMs is Efficient”, S. Maji, A. C. Berg, J. Malik, CVPR 2008
24
Variation with features
25
Variation with kernels
26
Results. More Results
27
1. Summary
– A method for visual concept retrieval suitable for large-scale data
– PHOW with the fast intersection kernel is particularly effective
28
2. Modeling subcategories
29
Subcategories in real world
30
What did we achieve?
31
Structural SVM vs SVM
– Joint feature map between input and output
– Allows the output label to be a complex variable
– Our case: the output is a combination of category and subcategory labels
“Support Vector Machine Learning for Interdependent and Structured Output Spaces”, I. Tsochantaridis et al., ICML 2004
32
Use of latent variables
“Learning structural SVMs with latent variables”, C. N. Yu et al., ICML 2009
33
Toy Datasets
34
Real world datasets
– TRECVID 2009 dataset
– PASCAL VOC (Visual Object Classes) 2007: Object Detection dataset
35
Results on TRECVID dataset
36
Improvement with latent SVM
37
Effect of the number of subclasses
38
2. Summary
– Method for modeling subcategories using structural SVM
– Application of latent structural SVM for further improvements
– Improved the performance of the linear kernel
– Performed various experiments on toy and real data
39
3. Generalized RBF feature maps for Efficient Detection
40
Object Detection. Example categories: aeroplane, horse, bicycle, car, cow, motorbike
41
Part 3: Outline
– Introduction: Kernels and Feature maps
– Explicit feature maps for GRBF kernels
– Experiments & Results
42
General Framework for detection: feature representations followed by a classifier (Ex: SVM, either linear or non-linear), applied to any image to detect an object class (Ex: Car)
“Multiple Kernels for Object Detection”, Vedaldi et al., ICCV 2009
“Cascade Object Detection with Deformable Part Models”, Felzenszwalb et al., CVPR 2010
43
Kernels, from faster to more discriminative: Linear SVM -> Additive kernels (Ex: intersection kernel) -> Generalized RBF kernels (Ex: exp-χ² kernel)
Fast linear SVMs: Stochastic SVM (PEGASOS), Primal SVM (liblinear), One-slack SVM (SVM-perf)
44
Kernels
– Problem: GRBF kernels, which have high computational complexity, are required to get good performance
– Our solution: approximate generalized RBF kernels with a linear one by using a feature map
45
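Speeding up non-linear SVMs
– A kernel is a dot product in a high-dimensional feature space: $K(x, y) = \langle \Psi(x), \Psi(y) \rangle$
– Define a feature map $\hat{\Psi}$ approximating the kernel: $K(x, y) \approx \langle \hat{\Psi}(x), \hat{\Psi}(y) \rangle$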
46
Explicit feature maps
– Feature maps for RBF/multiplicative kernels: [Rahimi and Recht, NIPS 07], [F. Li et al., DAGM 2010]
– Feature maps for additive kernels: [Maji and Berg, ICCV 09], [Vedaldi and Zisserman, CVPR 2010], [Perronnin et al., CVPR 2010]
– Our contribution: feature maps for generalized RBF kernels, giving a 2x to 3x speedup with only a small drop in performance
47
Part 3: Outline
– Introduction: Kernels and Feature maps
– Explicit feature maps for GRBF kernels
– Experiments & Results
48
Additive kernels. Examples: χ², intersection, and Hellinger's kernels
49
Feature maps for additive kernels [Vedaldi & Zisserman 10]: a closed-form function, approximated by sampling
“Efficient Additive Kernels via Explicit Feature Maps”, A. Vedaldi and A. Zisserman, CVPR 2010
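As an illustration, here is a sketch of a sampled feature map for the χ² kernel $k(x, y) = 2xy/(x + y)$, following the construction in the cited paper as I understand it: the kernel's spectrum $\kappa(\lambda) = \mathrm{sech}(\pi\lambda)$ is sampled at multiples of a period `L`, giving $2n + 1$ features per input dimension. The values `n = 8` and `L = 0.3` are chosen here for accuracy in a toy example and are not settings taken from the talk.

```python
import numpy as np

def chi2_kernel(x, y):
    """Exact additive chi-squared kernel: sum_j 2 x_j y_j / (x_j + y_j)."""
    s = x + y
    safe = np.where(s > 0, s, 1.0)
    return float(np.sum(np.where(s > 0, 2.0 * x * y / safe, 0.0)))

def chi2_feature_map(x, n=8, L=0.3):
    """Approximate map Psi with <Psi(x), Psi(y)> close to chi2_kernel(x, y)."""
    x = np.asarray(x, dtype=float)
    # log(0) is avoided; the amplitude sqrt(x) is zero there anyway.
    log_x = np.log(np.where(x > 0, x, 1.0))
    kappa = lambda lam: 1.0 / np.cosh(np.pi * lam)  # spectrum of the chi2 kernel
    parts = [np.sqrt(x * L * kappa(0.0))]
    for j in range(1, n + 1):
        amp = np.sqrt(2.0 * x * L * kappa(j * L))
        parts.append(amp * np.cos(j * L * log_x))
        parts.append(amp * np.sin(j * L * log_x))
    return np.concatenate(parts)

# Toy histograms, made up for illustration.
x = np.array([0.2, 0.3, 0.5])
y = np.array([0.1, 0.6, 0.3])

exact = chi2_kernel(x, y)
approx = float(np.dot(chi2_feature_map(x), chi2_feature_map(y)))
print(exact, approx)
```

Once the data is mapped this way, a linear SVM trained on the mapped vectors approximates the additive-kernel SVM.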
50
Feature maps for RBF kernels: random Fourier features [Rahimi & Recht 07]
“Random Features for Large-Scale Kernel Machines”, Ali Rahimi, Ben Recht, NIPS 2007
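A minimal sketch of random Fourier features for the Gaussian RBF kernel $\exp(-\lVert x - y \rVert^2 / (2\sigma^2))$. Projection directions are drawn from a Gaussian with standard deviation $1/\sigma$ and phases uniformly from $[0, 2\pi)$. The dimensions, seed, and bandwidth are arbitrary illustrative choices.

```python
import numpy as np

def rff_map(x, W, b):
    """Random Fourier feature map: sqrt(2/D) * cos(W^T x + b)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(x @ W + b)

rng = np.random.default_rng(0)
d, D, sigma = 5, 5000, 1.0  # input dimension, number of projections, bandwidth

W = rng.normal(scale=1.0 / sigma, size=(d, D))  # random projection directions
b = rng.uniform(0.0, 2.0 * np.pi, size=D)       # random phases

x = 0.5 * rng.normal(size=d)
y = 0.5 * rng.normal(size=d)

exact = float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))
approx = float(np.dot(rff_map(x, W, b), rff_map(y, W, b)))
print(exact, approx)
```

The approximation error shrinks roughly as $1/\sqrt{D}$, so accuracy is traded against the number of projections.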
51
Generalized RBF kernels: definition; trick: express the distance in terms of the feature map; example for the χ² distance (χ² kernel and χ² distance)
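The equations on this slide were lost, so the block below writes out the usual form of this construction as an assumption: a Gaussian of the distance induced by an additive kernel $K$ with feature map $\Psi$, with the χ² case as the example. The symbols $\sigma$, $\Psi$ and $D$ are notation chosen here.

```latex
% Generalized RBF kernel built on an additive kernel K with feature map \Psi
K_{\mathrm{RBF}}(x, y) = \exp\!\left(-\frac{1}{2\sigma^2}\, D^2(x, y)\right),
\qquad
D^2(x, y) = K(x, x) + K(y, y) - 2K(x, y)
          = \lVert \Psi(x) - \Psi(y) \rVert^2 .

% Example: the \chi^2 kernel induces the \chi^2 distance
K_{\chi^2}(x, y) = \sum_j \frac{2\, x_j y_j}{x_j + y_j}
\quad\Longrightarrow\quad
D^2(x, y) = \sum_j \frac{(x_j - y_j)^2}{x_j + y_j} .
```

The second line follows because $K_{\chi^2}(x, x) = \sum_j x_j$, so $D^2(x, y) = \sum_j \bigl[(x_j + y_j)^2 - 4 x_j y_j\bigr] / (x_j + y_j)$.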
52
GRBF feature maps algorithm
53
Outline
– Introduction: Kernels and Feature maps
– Explicit feature maps for generalized RBF kernels
– Experiments & Results
54
Experimental Setup
– PASCAL VOC (Visual Object Classes) 2007: Object Detection dataset, 20 object categories
– Cascade of classifiers [Vedaldi et al., ICCV 2009]: Linear SVM, Additive Kernel SVM, exp-χ² kernel SVM
– PHOG features
– exp-χ² feature map SVM
55
Approximate vs Exact Kernels: average precision improves, but testing time increases, with the number of projections
56
Approximate vs Exact Kernels: average precision and testing time both increase with the number of projections
57
A large number of projections is required for good performance. Additional improvement in testing time:
– SVM sparse: L1-regularized, L2-loss function
– LR sparse: L1-regularized, logistic regression loss function
58
Speedup with $\ell_1$ regularization
59
Effect of C on Sparsity
– Recall the SVM objective function
– Parameter to control sparsity: SVM parameter C
– Smaller C gives a sparser solution with only a slight drop in performance
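The objective itself did not survive on this slide. As an assumption, the block below gives the L1-regularized formulations in the form used by liblinear-style solvers, where $x_i$ stands for the feature-mapped training vector and $y_i \in \{-1, +1\}$.

```latex
% SVM sparse: L1-regularized, L2-loss SVM
\min_{w} \;\; \lVert w \rVert_1
  + C \sum_{i=1}^{N} \max\bigl(0,\; 1 - y_i\, w^{T} x_i\bigr)^2

% LR sparse: L1-regularized logistic regression
\min_{w} \;\; \lVert w \rVert_1
  + C \sum_{i=1}^{N} \log\bigl(1 + e^{-y_i\, w^{T} x_i}\bigr)
```

A smaller $C$ gives the $\lVert w \rVert_1$ term more relative weight, which drives more components of $w$ to exactly zero. Each zero component is a projection that need not be computed at test time.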
60
Example Results
61
Results on all the 20 categories: SVM dense is faster than exact exp-χ² and performs better than χ²
62
Results on all the 20 categories: LR sparse is 2 to 3 times faster than exact exp-χ² and performs better than χ²
63
3. Summary
– Feature maps for generalized RBF kernels
– Method for reducing the number of projections
– Results on VOC 2007: nearly 2x to 3x speedup with a slight loss in performance
64
Conclusions
– Proposed efficient SVM-based methods for visual scene/object categorization and detection
– Validated these methods on a large amount of data
– Further work: porting these techniques to GPUs, and including temporal information to improve average precision
65
Publications
– Generalized RBF feature maps for efficient detection, Sreekanth Vempati, Andrea Vedaldi, Andrew Zisserman, C. V. Jawahar. 21st British Machine Vision Conference (BMVC), 2010 (Oral Presentation), Aberystwyth, UK
– Oxford/IIIT - TRECVID 2009 - Notebook paper, Sreekanth Vempati, Mihir Jain, Omkar M. Parkhi, C. V. Jawahar, Andrea Vedaldi, Marcin Marszalek, Andrew Zisserman. TRECVID 2009 Workshop, Gaithersburg, Md., USA, 2009
– Oxford/IIIT - TRECVID 2008 - Notebook paper, James Philbin, Manuel Marin-Jimenez, Siddharth Srinivasan, Andrew Zisserman, Mihir Jain, Sreekanth Vempati, Pramod Sankar, C. V. Jawahar. TRECVID 2008 Workshop, Gaithersburg, Md., USA, 2008
66
Thank You
67
Object Detection. Should we put our results or this ground truth?
68
GRBF-Algorithm
1. Compute the approximate feature map corresponding to the additive kernel
69
GRBF-Algorithm
1. Compute the approximate feature map $\hat{\Psi}(x)$ corresponding to the additive kernel
2. Compute the RBF feature map using $\hat{\Psi}(x)$ as the input vector
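The two steps can be sketched end to end for the exp-χ² kernel. This is an illustrative composition under stated assumptions, not the authors' implementation: step 1 uses a sampled χ² map with hypothetical settings `n = 8`, `L = 0.3`, and step 2 uses random Fourier features with Gaussian projections of standard deviation $1/\sigma$. The function names, seed, and toy histograms are made up.

```python
import numpy as np

def chi2_feature_map(x, n=8, L=0.3):
    """Step 1: approximate map for the additive chi2 kernel, 2n+1 features per bin."""
    x = np.asarray(x, dtype=float)
    log_x = np.log(np.where(x > 0, x, 1.0))  # amplitude is 0 where x == 0
    j = np.arange(1, n + 1)[:, None]         # sample indices, shape (n, 1)
    amp = np.sqrt(2.0 * L * x / np.cosh(np.pi * j * L))
    return np.concatenate([np.sqrt(L * x),
                           (amp * np.cos(j * L * log_x)).ravel(),
                           (amp * np.sin(j * L * log_x)).ravel()])

def grbf_feature_map(x, W, b):
    """Step 2: random Fourier features computed on the chi2 map of x."""
    psi = chi2_feature_map(x)
    return np.sqrt(2.0 / W.shape[1]) * np.cos(psi @ W + b)

def exp_chi2_kernel(x, y, sigma):
    """Exact generalized RBF kernel based on the chi2 distance."""
    d2 = np.sum((x - y) ** 2 / np.maximum(x + y, 1e-12))
    return float(np.exp(-d2 / (2.0 * sigma ** 2)))

rng = np.random.default_rng(0)
x = np.array([0.2, 0.3, 0.5])
y = np.array([0.1, 0.6, 0.3])
sigma, D = 0.5, 5000                      # bandwidth and number of projections

m = chi2_feature_map(x).shape[0]          # dimension after step 1
W = rng.normal(scale=1.0 / sigma, size=(m, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

exact = exp_chi2_kernel(x, y, sigma)
approx = float(np.dot(grbf_feature_map(x, W, b), grbf_feature_map(y, W, b)))
print(exact, approx)
```

A linear classifier trained on `grbf_feature_map` outputs then stands in for the exp-χ² kernel SVM, with the number of projections `D` controlling the trade-off between accuracy and testing time.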
70
Choice of features (using exact kernels, PHOG vs PHOW): PHOG features are used in our experiments
– exp-χ² performs better than χ²
71
Sparsity vs Performance