Published by Camron Todd. Modified over 9 years ago.
1
Gang Wang, Derek Hoiem, David Forsyth
2
INTRODUCTION
APPROACH (implementation details)
EXPERIMENTS
CONCLUSION
3
Use online photo sharing sites → Flickr (groups).
Determine which images are similar and how they are similar.
Learn these group membership likelihoods.
Because learning these categories would take a long time, propose a new method for stochastic learning of SVMs using the Histogram Intersection Kernel (HIK): SIKMA.
SIKMA combines [14] and [18].
4
Related work
Algorithm classes for training very large-scale kernel SVMs:
i) Exploit the sparseness of the Lagrange multipliers → SMO [22]
ii) Use stochastic gradient descent without touching every example
http://0rz.tw/BDHWJ
Kivinen [14] → method applies to kernel machines
Maji [18] → very fast evaluation of a histogram intersection kernel
5
Flickr provides an organizational structure: how people like to group images.
The SIKMA classifier allows efficient and accurate learning of these categories.
This property generalizes well, even when the test dataset was not obtained from Flickr.
6
Suppose we have a list of training examples.
For a test example u, compute the classification score.
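The formulas on this slide did not survive extraction. As a minimal sketch, assuming the standard kernel-machine form f(u) = Σ_i α_i K(x_i, u) with the histogram intersection kernel K(x, z) = Σ_d min(x_d, z_d), the score could be computed as follows. The names `hik`, `score`, `support` and `alphas` are illustrative, not taken from the slides.

```python
def hik(x, z):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, z))


def score(u, support, alphas):
    """Kernel-machine score f(u) = sum_i alpha_i * K(x_i, u).

    `support` holds the training examples x_i and `alphas` their weights
    (assumed names; the slide's own notation was lost).
    """
    return sum(a * hik(x, u) for x, a in zip(support, alphas))
```

Evaluated this way, each score touches every stored example, which is the cost the following slides set out to avoid.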
7
Approximate the gradient by replacing the sum over all examples (the batch) with a sum over a subset chosen at random.
It is usual to consider a single example.
New decision function.
It is expensive to calculate f_{t-1}.
The NORMA algorithm [14] keeps a fixed-length set of support vectors by dropping the oldest ones.
Doing so comes at a considerable cost in accuracy!
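The truncation idea described for NORMA can be sketched as below. This is a simplified illustration, assuming hinge loss with margin 1 and a fixed learning rate; the parameter names `eta`, `lam` and `budget` are assumptions, and the code is not the reference implementation from [14].

```python
from collections import deque


def hik(x, z):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, z))


def norma_train(examples, eta=0.1, lam=0.01, budget=100):
    """One online pass of single-example stochastic gradient descent.

    Keeps at most `budget` support vectors and drops the oldest first.
    """
    sv = deque()  # (example, coefficient) pairs, oldest on the left
    for x, y in examples:
        # Evaluating f_{t-1}(x) touches every stored support vector.
        f_x = sum(c * hik(s, x) for s, c in sv)
        # The regularizer shrinks all existing coefficients.
        sv = deque((s, (1 - eta * lam) * c) for s, c in sv)
        if y * f_x < 1:  # margin violated: add x as a new support vector
            sv.append((x, eta * y))
            if len(sv) > budget:
                sv.popleft()  # truncation: forget the oldest one
    return list(sv)


def predict(sv, u):
    """Score a test example against the retained support vectors."""
    return sum(c * hik(s, u) for s, c in sv)
```

With a small `budget`, each update stays cheap, but everything learned from the dropped examples is lost, which is the accuracy cost the slide points out.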
8
D is the feature dimension.
9
                          SIKMA     Conventional SVM solver
Computational complexity  O(TMD)    O(T^2 D)
Space cost                O(MD)     O(T^2)
Evaluation costs O(D) for each example.
T: number of training examples
M: number of quantization bins
D: number of feature dimensions
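The O(MD) space and O(D) evaluation figures follow from the fact that an intersection-kernel score decomposes per dimension: Σ_i α_i Σ_d min(x_{i,d}, u_d) = Σ_d h_d(u_d), so each h_d can be stored as a table over M quantized values. The sketch below illustrates that idea under several assumptions: features are scaled to [0, 1], lookup uses the nearest bin rather than interpolation, and the update uses hinge loss with margin 1. The class name `TableHIK` and its parameters are hypothetical.

```python
class TableHIK:
    """Per-dimension lookup tables for an intersection-kernel classifier."""

    def __init__(self, dim, bins):
        # Grid of M quantized feature values in [0, 1].
        self.grid = [m / (bins - 1) for m in range(bins)]
        # table[d][m] approximates h_d evaluated at grid[m]: O(MD) space.
        self.table = [[0.0] * bins for _ in range(dim)]

    def score(self, u):
        # One table lookup per dimension: O(D) per example.
        last = len(self.grid) - 1
        total = 0.0
        for row, v in zip(self.table, u):
            idx = min(last, max(0, round(v * last)))
            total += row[idx]
        return total

    def update(self, x, y, eta=0.1, lam=0.01):
        # One stochastic step touches every table entry: O(MD).
        violated = y * self.score(x) < 1
        for d, row in enumerate(self.table):
            for m, g in enumerate(self.grid):
                row[m] *= (1 - eta * lam)  # regularization shrinkage
                if violated:
                    row[m] += eta * y * min(x[d], g)
```

Because the tables absorb every update, no support vectors need to be stored or dropped, and T updates cost O(TMD) in total.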
10
Measuring image similarity
Found that a simple Euclidean distance between the SVM outputs works well.
Since the groups have names, we can also perform text-based queries (e.g., get images like "people dancing") and determine how two images are similar.
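As a small illustration of this similarity measure, assume each image is represented by a vector of per-group SVM outputs. The function names and the dictionary layout below are assumptions made for the example.

```python
import math


def similarity_distance(scores_a, scores_b):
    """Euclidean distance between two vectors of SVM outputs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(scores_a, scores_b)))


def rank_by_similarity(query_scores, database):
    """Return image names sorted from most to least similar to the query.

    `database` maps an image name to its vector of per-group SVM outputs.
    """
    return sorted(
        database,
        key=lambda name: similarity_distance(query_scores, database[name]),
    )
```

A smaller distance means the two images are predicted to belong to the same groups, which is the notion of similarity used here.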
11
Use four types of features:
SIFT feature: detect and describe local patches
Gist feature: 960-dimensional Gist descriptor
Color feature: RGB space, values range from 1 to 512
Gradient feature: the whole image is represented as a 256-dimensional vector
Combine the outputs of these four classifiers into a final prediction, using a validation data set.
12
For 103 Flickr categories, using 15,000 to 30,000 positive images and 60,000 negative images.
The average AP over these categories is 0.433.
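For reference, average precision (AP) for one category can be computed from a ranked list as sketched below. The slides do not say which AP variant was used, so the non-interpolated form here is an assumption, and the function name is illustrative.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive.

    `scores` are classifier outputs; `labels` are 1 (positive) or 0.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0
```

Averaging this value over all categories gives a single summary figure of the kind reported on the slide.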
14
Select the top five negative examples and five randomly chosen positive examples from among the top 50 ranked images.
y_i is 1 if the example is positive, otherwise 0.
18
A Flickr category can be described with several words, so we can support text-based queries.
Given a word query, find the Flickr group whose description contains that word.
Tested on the Corel data set with two queries, "airplane" and "sunset".
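The word-to-group lookup can be sketched as follows. The group names, descriptions and score layout are invented for illustration, and taking the maximum score when several groups match is an assumption, since the slides do not say how multiple matches are combined.

```python
def groups_matching(query, descriptions):
    """Return the groups whose description contains the query word."""
    word = query.lower()
    return [g for g, text in descriptions.items() if word in text.lower().split()]


def rank_images(query, descriptions, image_scores):
    """Rank images by the classifier score of the matching group(s).

    `image_scores` maps an image name to a dict of per-group SVM outputs.
    """
    groups = groups_matching(query, descriptions)
    if not groups:
        return []

    def best_score(image):
        return max(image_scores[image][g] for g in groups)

    return sorted(image_scores, key=best_score, reverse=True)
```

The query never touches the image pixels directly: it selects a group classifier, and that classifier's outputs rank the images.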
19
SIKMA is an algorithm that quickly trains an SVM with the histogram intersection kernel using tens of thousands of training examples.
Two images that are likely to belong to the same Flickr groups are considered similar.
Experimental results show that matching with prediction features works better than matching with visual features.