Empowering visual categorization with the GPU
Presented by 陳群元
Outline
- Introduction
- Overview of visual categorization
  - Image feature extraction
  - Category model learning
  - Test image classification
- GPU accelerated categorization
- Experimental setup
- Results
Introduction
- Use the GPU to accelerate the quantization and classification components of a visual categorization architecture.
- The algorithms and their implementations should push the state of the art in categorization accuracy.
- Visual categorization must be decomposable into components so that bottlenecks can be located.
- Given the same input, implementations of a component on different hardware architectures must give the same output.
Overview
Visual categorization system
- Image Feature Extraction
- Category Model Learning
- Test Image Classification
Visual categorization system
- Image Feature Extraction
  - Point Sampling Strategy
  - Descriptor Computation
  - Bag-of-Words
- Category Model Learning
- Test Image Classification
Point sampling strategy
- Dense sampling: typically, around 10,000 points are sampled per image (a grid-sampling sketch is given below).
- Salient point methods:
  - Harris-Laplace salient point detector [29]
  - Difference-of-Gaussians detector [28]
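A minimal sketch of dense sampling on a regular grid; the step and margin values are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def dense_sample_points(width, height, step=6, margin=8):
    """Return (x, y) grid coordinates covering the image.

    step and margin are illustrative only; the paper does not fix them here.
    """
    xs = np.arange(margin, width - margin, step)
    ys = np.arange(margin, height - margin, step)
    grid_x, grid_y = np.meshgrid(xs, ys)
    return np.stack([grid_x.ravel(), grid_y.ravel()], axis=1)

# A 640x480 frame sampled every 6 pixels yields on the order of 10,000 points,
# in line with the figure quoted above.
points = dense_sample_points(640, 480)
print(points.shape)
```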
Descriptors
- SIFT descriptor: 128 dimensions; about 10 frames per second for 640x480 images on the GPU.
- SURF descriptor: about 100 frames per second for 640x480 images on the GPU.
- ColorSIFT descriptor: 384 dimensions, i.e. a triple of SIFT descriptors computed over color channels.
Bag-of-words
- Vector quantization is computationally the most expensive part of the bag-of-words model.
- Bag -> the image, treated as an unordered set.
- Words -> the quantized descriptors (visual words).
Bag-of-words (cont.)
- With n descriptors of length d in an image and a codebook with m elements, vector quantization costs O(ndm) per image (a brute-force sketch is given below).
- A tree-based codebook reduces this to O(nd log m), enabling real-time processing on the GPU [25].
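A minimal NumPy sketch of the brute-force O(ndm) assignment described above; the array names and the histogram output are illustrative, not the paper's code:

```python
import numpy as np

def vector_quantize(descriptors, codebook):
    """Assign each descriptor to its nearest codebook element (brute force).

    descriptors: (n, d) array, codebook: (m, d) array.
    Building the full distance matrix costs O(n * d * m) per image.
    """
    diff = descriptors[:, None, :] - codebook[None, :, :]   # (n, m, d)
    dists = np.einsum('nmd,nmd->nm', diff, diff)             # squared distances, (n, m)
    assignments = dists.argmin(axis=1)                        # nearest codeword per descriptor
    # Bag-of-words representation: histogram of codeword assignments.
    return np.bincount(assignments, minlength=codebook.shape[0])
```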
Category model learning
- Precompute the kernel function values.
- Train a kernel-based SVM on the precomputed kernel values (a sketch follows below).
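A minimal sketch of how a precomputed kernel matrix is consumed, here using scikit-learn's precomputed-kernel SVM; the paper uses its own kernel-based SVM training, so this is only an illustration of the idea:

```python
import numpy as np
from sklearn.svm import SVC

# K_train[i, j] = k(F_i, F_j) for all pairs of training feature vectors,
# precomputed once (e.g. on the GPU) and reused across categories and parameters.
def train_with_precomputed_kernel(K_train, labels):
    clf = SVC(kernel='precomputed')
    clf.fit(K_train, labels)
    return clf

# At test time only kernel values between test and training images are needed:
# K_test has shape (n_test, n_train).
def classify(clf, K_test):
    return clf.decision_function(K_test)
```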
Support Vector Machines
Kernel Support Vector Machines
Test image classification
Outline
- Introduction
- Overview of visual categorization
  - Image feature extraction
  - Category model learning
  - Test image classification
- GPU accelerated categorization
  - Parallel Programming on the GPU and CPU
  - GPU-Accelerated Vector Quantization
  - GPU-Accelerated Kernel Value Precomputation
- Experimental setup
- Results
Parallel Programming on the GPU and CPU
- SIMD instructions perform the same operation on multiple data elements at the same time.
GPU-Accelerated Vector Quantization
- The most expensive computational step in vector quantization is the calculation of the n x m distance matrix.
- A: n x d matrix with all image descriptors as rows.
- B: m x d matrix with all codebook elements as rows.
GPU-Accelerated Vector Quantization (cont.)
GPU-Accelerated Vector Quantization (cont.)
- Compute the dot products between all rows of A and B (line 7 of the algorithm); a sketch of this step is given below.
- Matrix multiplication is the building block for many algorithms, and highly optimized BLAS linear algebra libraries containing this operation exist for both the CPU and the GPU.
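A minimal sketch of the distance-matrix computation: expanding the squared Euclidean distance turns the dominant cost into a single n x m matrix product, which maps directly onto a BLAS (or cuBLAS) GEMM. This is an illustration, not the paper's implementation:

```python
import numpy as np

def distance_matrix(A, B):
    """Squared Euclidean distances between all rows of A (n x d) and B (m x d).

    ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b, so the expensive part is the
    n x m product A @ B.T, i.e. one large matrix multiplication.
    """
    a_norms = (A * A).sum(axis=1)[:, None]  # (n, 1)
    b_norms = (B * B).sum(axis=1)[None, :]  # (1, m)
    cross = A @ B.T                          # (n, m) dot products, the GEMM
    return a_norms + b_norms - 2.0 * cross

# The nearest codeword per descriptor then follows from an argmin over each row.
```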
GPU-Accelerated Kernel Value Precomputation
- First compute the distance between feature vectors F and F'.
- The kernel function value is then computed from this distance (a sketch with an assumed χ²-based kernel is given below).
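The slide's formula is an image and is not reproduced here; a common choice for this kind of pipeline, and the one assumed in the sketch below, is the χ² distance combined with an exponential kernel. Both the exact distance and the normalization constant should be treated as assumptions rather than the paper's exact definition:

```python
import numpy as np

def chi_square_distance(F, Fp, eps=1e-10):
    """Chi-square distance between two histogram-like feature vectors."""
    return 0.5 * np.sum((F - Fp) ** 2 / (F + Fp + eps))

def kernel_value(F, Fp, normalization):
    """Kernel based on the distance: k(F, F') = exp(-dist(F, F') / normalization).

    'normalization' is often set from the mean distance over the training set;
    that choice is an assumption here, not taken from the slide.
    """
    return np.exp(-chi_square_distance(F, Fp) / normalization)
```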
GPU-Accelerated Kernel Value Precomputation (cont.)
- Kernel values are computed over multiple input features per image.
- For kernel value precomputation, memory usage is an important problem: for a dataset with 50,000 images, the input data is 12 GB and the output data is 19 GB.
- To avoid holding all data in memory simultaneously, we divide the processing into evenly sized chunks of 1024 x 1024 kernel values (sketched below).
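A minimal sketch of the chunking idea: the kernel matrix is produced one 1024 x 1024 tile at a time, so only one block of input rows and one block of output values are live at once. The kernel_chunk_fn callback and the memory-mapped output are illustrative assumptions:

```python
import numpy as np

CHUNK = 1024  # evenly sized chunks, as on the slide

def precompute_kernel_matrix(features, kernel_chunk_fn, out):
    """Fill the full kernel matrix tile by tile.

    features:        (n, d) array of image feature vectors.
    kernel_chunk_fn: computes kernel values between two blocks of rows,
                     e.g. on the GPU (assumed helper, not from the paper).
    out:             (n, n) output, e.g. an np.memmap, so the 19 GB result
                     never has to fit in RAM at once.
    """
    n = features.shape[0]
    for i in range(0, n, CHUNK):
        rows = features[i:i + CHUNK]
        for j in range(0, n, CHUNK):
            cols = features[j:j + CHUNK]
            out[i:i + CHUNK, j:j + CHUNK] = kernel_chunk_fn(rows, cols)
    return out
```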
GPU-Accelerated Kernel Value Precomputation (cont.)
Experimental setup
- Experiment 1: Vector Quantization Speed
  - The CPU implementation is SIMD-optimized.
  - Codebook of size m = 4,000; 20,000 descriptors per image; descriptor lengths d = 128 (SIFT) and d = 384 (ColorSIFT).
- Experiment 2: Kernel Value Precomputation Speed
  - Uses the large Mediamill Challenge training set of 30,993 frames.
- Experiment 3: Visual Categorization Throughput
  - Comparison between the quad-core Core i7 920 CPU (2.66 GHz) and the GeForce GTX 260 GPU (27 cores).
Results
- Experiment 1: Vector Quantization Speed
- Experiment 2: Kernel Value Precomputation Speed
- Experiment 3: Visual Categorization Throughput
Vector Quantization Speed (SIFT) [results chart]
Vector Quantization Speed (ColorSIFT) [results chart]
Kernel Value Precomputation Speed [results chart]
Visual Categorization Throughput [results chart]
Other applications
- Application 1: k-means Clustering
- Application 2: Bag-of-Words Model for Text Retrieval
- Application 3: Multi-Frame Processing for Video Retrieval
Conclusions
- This paper provides an efficiency analysis of a state-of-the-art visual categorization pipeline based on the bag-of-words model.
- Two large bottlenecks were identified: the vector quantization step in image feature extraction and the kernel value computation in category classification.
- Compared to a multi-threaded CPU implementation on a quad-core CPU, the GPU is 4.8 times faster.
The end
Thank you!
Conclusions
- This paper provides an efficiency analysis of a state-of-the-art visual categorization pipeline based on the bag-of-words model.
- Two large bottlenecks were identified: the vector quantization step in image feature extraction and the kernel value computation in category classification.
- The GPU implementation of vector quantization is 3.9 times faster than computing it on a modern quad-core CPU.
- Precomputing the kernel values on the GPU instead of a quad-core CPU accelerates this step by a factor of 10.
Conclusions (cont.)
- Overall, by using a parallel implementation on the GPU, classifying unseen images is 17 times faster than a single-threaded CPU version.
- Compared to a multi-threaded CPU implementation on a quad-core CPU, the GPU is 4.8 times faster.
Kernel SVM
http://crsouza.blogspot.com/2010/03/kernel-functions-for-machine-learning.html#kernel_method
http://tw.myblog.yahoo.com/jw!.3IXVZqLEQ5cTBMyDRvvmIM-/article?mid=25
http://crsouza.blogspot.com/2010/04/kernel-support-vector-machines-for.html