ICMR 2015: Image Classification and Retrieval are ONE (Online NN Estimation)

Presentation transcript:

ICMR 2015: Image Classification and Retrieval are ONE (Online NN Estimation)
Speaker: Lingxi Xie
Authors: Lingxi Xie (1), Richang Hong (2), Bo Zhang (1), Qi Tian (3)
(1) Department of Computer Science and Technology, Tsinghua University; (2) School of Computer and Information, Hefei University of Technology; (3) Department of Computer Science, University of Texas at San Antonio

Good afternoon everyone, this is Lingxi from Tsinghua University. Today I am very pleased to introduce my work "Image Classification and Retrieval are ONE". Here, ONE not only stands for the name of our model, Online Nearest-neighbor Estimation, but also implies that the conventional approaches to image classification and retrieval can be unified into one algorithm.

Outline: Introduction (Image Classification and Retrieval; the Conventional BoVW Model), Goal and Motivation, The ONE Algorithm, Experimental Results, Conclusions.

Here is the outline of my talk. First, I will briefly introduce image classification and retrieval, as well as the conventional BoVW models for solving them. Then, I will show the goal and motivation of this work, which is unifying the models for classification and retrieval, and the advantages of doing so. The formulation of the ONE algorithm, including the analysis and the acceleration techniques, forms the main part of this talk. After I show some promising experimental results, conclusions will be drawn.

Outline: Introduction (Image Classification and Retrieval; the Conventional BoVW Model), Goal and Motivation, The ONE Algorithm, Experimental Results, Conclusions.

Now let us start with the introduction.

Introduction: Image Classification

[Slide figure: example images from the Bird-200, Dog-120 and Flower-102 datasets (e.g., Black-footed Albatross, Groove-billed Ani, Rhinoceros Auklet; Chihuahua, Siberian Husky, Golden Retriever; daffodil, snowdrop, colt's foot), plus test images to be labeled both at the coarse level (FLOWER, DOG) and at the fine-grained level (colt's foot, Siberian Husky).]

Image classification and retrieval are both fundamental problems in the computer vision and multimedia communities. In image classification, we are given an image dataset and aim at predicting the category of test images, such as a bunch of FLOWERS or a pet DOG. In recent years, people have become more and more interested in fine-grained object recognition, in which we need to judge the category of an image at a finer level, such as the biological class of the flower or the breed of the dog.

Introduction: Image Retrieval

[Slide figure: a query from the Holiday dataset and the returned candidate list, with true-positives (TP) and false-positives (FP) marked.]

In image retrieval, we are also dealing with image datasets, such as a near-duplicate image set. Given a query image, the system is instructed to find a set of candidate images that are relevant to the query. This is an example of a returned image list, which contains both true-positives and false-positives. Obviously, the goal of retrieval is to find as many true-positives as possible while not introducing too many false-positives into the list.

BoVW for Classification & Retrieval

[Slide figure: the common part (raw images -> image descriptors -> visual vocabulary -> visual features), followed by the classification branch (global features) and the retrieval branch (inverted file).]

The Bag-of-Visual-Words model is one of the most popular algorithms for image classification and retrieval. Conventional BoVW models can be partitioned into two parts: a common part, and stages specially designed for classification or retrieval. In the common part, from raw images we extract local descriptors, train a visual vocabulary on top of them, and encode the descriptors into visual features. Then, classification models typically aggregate local features into a global representation and feed the global features into machine learning algorithms for training and testing. On the other hand, retrieval systems typically construct an efficient lookup table, such as an inverted index, for fast online querying.

Outline: Introduction (Image Classification and Retrieval; the Conventional BoVW Model), Goal and Motivation, The ONE Algorithm, Experimental Results, Conclusions.

Conventional models are verified to be very effective, but we still pose a question: why do classification and retrieval need to be solved with different flowcharts? Exploring this question forms the goal and motivation of our work.

The Goal: Designing a UNIFIED model for image classification and image retrieval.

We aim at designing a UNIFIED model for both image classification and retrieval. For this, we need to answer two questions. First, what is the difference between classification and retrieval, and can we alleviate that difference to design a unified model? Second, can we benefit from the unified model, and how? The first question will be answered immediately with a comparison between classification and retrieval, and the second one will be discussed in detail after the main algorithm is formulated.

Classification vs. Retrieval

[Slide figure: a toy example with a LIBRARY query image and 7 candidate images from the LIBRARY and BOOKSTORE classes, each described by visual attributes (e.g., dense books, tidy shelves, square tables, sitting people, cashier); the candidates are numbered 1-7 by distance to the query, and a classifier boundary separates the two classes.]

We know that the simplest model for both classification and retrieval is nearest-neighbor (NN) search. However, we will show here why a naive NN search fails to provide satisfying results, especially for classification. Here is a toy example involving two classes, LIBRARY and BOOKSTORE. The query image is a sample from the LIBRARY class, and its 3 most significant visual attributes are listed. We also have a set of 7 candidate images, each drawn from either the LIBRARY or the BOOKSTORE class. First let us consider the case of image retrieval, in which we do not know the labels of the candidate images; therefore, we can only sort the candidates according to their distance to the query image, illustrated with the numbers from 1 to 7. We can see that, since the most similar candidate is an outlier from the BOOKSTORE class, if we categorize the query image according to this sample, we will get an incorrect classification result. In classification, however, the extra label of each image, either LIBRARY or BOOKSTORE, is available, and we can train an optimal classifier, shown as the purple dashed line. With this, it is clear that sample #1 is an outlier, and the query image gets the correct categorization. From this example, we can conclude that the reason NN search fails in classification is that it does not utilize the image labels.

Any Inspirations?

Fact 1: classification tasks benefit from extra information (image labels)! Fact 2: image-to-class distance is more stable than image-to-image distance. Classification with NN search? No. Retrieval with class labels? Yes. Solution: define classes for retrieval by extracting multiple objects from each image!

Let us go a little further on top of this example. We propose the following two facts. The first, as observed from the previous slide, is that classification benefits from extra information, namely the image labels. In fact, image labels partition the candidates into several groups (or classes), so we measure image-to-class distance instead of image-to-image distance. The second fact is that image-to-class distance is much more stable than image-to-image distance, as shown in the NBNN paper. Therefore, it is the more sophisticated computation of image-to-class distance (such as using an SVM) that makes classification work better. To design a unified model, we do not degenerate classification algorithms to NN search (which does not use labels); instead, we introduce classification techniques into retrieval for improvement. So, our solution is to define pseudo class labels in retrieval tasks, implemented by extracting multiple objects from each image.

Outline: Introduction (Image Classification and Retrieval; the Conventional BoVW Model), Goal and Motivation, The ONE Algorithm, Experimental Results, Conclusions.

The above motivations lead to the ONE algorithm, which is very simple yet effective in real use.

ONE: Online NN Estimation

Measuring image-to-class distance! For classification, the categories are given; for retrieval, each image forms its own category.

Terminology:
Categories: $\{1, 2, \cdots, C\}$; for retrieval, $C = N$.
Image: $\mathbf{I}$, each with a category label.
Object proposal set: $\mathcal{P}$, with $|\mathcal{P}| = K$.
Feature: $\mathbf{f}$, one per object proposal.
Feature set: $\mathcal{F}_c$, all features in category $c$.

The full name of ONE is Online Nearest-neighbor Estimation. Although it is quite similar to NN search, we instead measure the image-to-class distance, which makes a big difference in practice. We first introduce some terminology. For both classification and retrieval, images are annotated with categories. Since there are no actual categories in retrieval, we simply regard each image as an independent category, so the number of categories is identical to the number of candidate images. On each image I, we define an object proposal set P, which contains a number of bounding boxes indicating the most probable object locations on the image. From each object proposal we extract a regional feature f, and all the features from a category form the feature set Fc.

ONE: Online NN Estimation

How do we compute the image-to-class distance?

$$\mathrm{dist}\left(\mathbf{I}_0, c\right) \triangleq \mathrm{dist}\left(\mathbf{I}_0, \mathcal{F}_c\right) = \frac{1}{K_0} \sum_{k=1}^{K_0} \mathrm{dist}\left(\mathbf{f}_{0,k}, \mathcal{F}_c\right) = \frac{1}{K_0} \sum_{k=1}^{K_0} \min_{\mathbf{f} \in \mathcal{F}_c} \left\| \mathbf{f}_{0,k} - \mathbf{f} \right\|_2^2$$

Now it is easy to estimate the distance from the query image I0 to the c-th category. The formula is listed here; this term is named the image-to-class distance, following the NBNN formulation. Briefly speaking, it is the average distance from the query features to the c-th category. After we have obtained the distance between the query and every category (or, equivalently, every candidate in the retrieval case), we can easily obtain the desired results on top of them.

Naive-Bayes Nearest Neighbor (NBNN): Boiman et al., In Defense of Nearest-Neighbor based Image Classification, CVPR'08.
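To make the formula concrete, here is a minimal NumPy sketch of the image-to-class distance. It is illustrative only (exact, unaccelerated search over plain arrays), not the authors' implementation:

```python
# A minimal sketch of the NBNN-style image-to-class distance (illustrative,
# not the authors' code). Exact search, no PQ/GPU acceleration.
import numpy as np

def image_to_class_distance(query_feats, class_feats):
    """dist(I_0, c): for each of the K_0 query features, take the squared L2
    distance to its nearest feature in class c, then average over k."""
    # Pairwise squared distances, shape (K_0, |F_c|).
    d2 = ((query_feats[:, None, :] - class_feats[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()
```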

ONE: Online NN Estimation

[Slide figure: candidate image #1, its object proposals, and the corresponding points ("class" 1) in the feature space.]

A toy example of the ONE algorithm is illustrated here. Three candidate images and a query image will be considered. For the first candidate, we use object detectors to locate several interest regions on the image, extract regional features, and map them into the feature space.

ONE: Online NN Estimation

[Slide figure: candidate image #2 and its features ("class" 2) added to the feature space.]

Similarly, we find interest regions and extract features for the second image.

ONE: Online NN Estimation

[Slide figure: candidate image #3 and its features ("class" 3) added to the feature space.]

And the third image.

ONE: Online NN Estimation

[Slide figure: the test image, its regional features, and their locations in the feature space relative to the three classes.]

When the test image comes, we also extract regional features on its interest regions. Then, for each test feature, we map it into the feature space and find its nearest neighbor in every category.

ONE: Online NN Estimation

[Slide figure: each test feature connected to its nearest neighbor in class 1, class 2 and class 3.]

This is the first feature and the computation of its distances to class 1, class 2, and class 3. Then come the second feature and the third feature.

ONE: Online NN Estimation

Classification? Retrieval?

[Slide figure: the per-class distances summarized in a chart; class 2 is ranked 1st, class 3 ranked 2nd, class 1 ranked 3rd.]

We summarize the computed distances in a figure. The image-to-class distance is estimated as the average of the feature-to-class distances. From the distances shown here, we can perform either classification or retrieval.
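As a rough sketch of this last step (reusing the hypothetical image_to_class_distance function from the earlier snippet), classification takes the class with the smallest distance, while retrieval sorts the candidate "classes" by distance:

```python
# Sketch: turn per-class distances into a classification label and a
# retrieval ranking. Assumes image_to_class_distance() from the snippet above.
import numpy as np

def one_predict(query_feats, class_feat_sets):
    dists = np.array([image_to_class_distance(query_feats, F)
                      for F in class_feat_sets])
    label = int(dists.argmin())          # classification: nearest class
    ranking = dists.argsort().tolist()   # retrieval: candidates sorted by distance
    return label, ranking
```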

What is the Benefit?

[Slide figure: a query image whose regions correspond to different visual concepts (natural scene, mountain, terrace); searching by each concept returns different true-positives, and the fused results recall more of them.]

After the introduction of the ONE algorithm, a direct question may arise: what is the benefit of such an algorithm? Besides the advantage of measuring image-to-class distance, we provide another intuitive clue on object detection and description. This is a query image on which we can find several interest regions, each of which may correspond to a visual concept or attribute. Conventional retrieval algorithms often use global features directly, so we can only find those candidates with similar global attributes. When additional regions are detected and described, we can find many more clues that help with the retrieval task. The ONE algorithm, by fusing all this information, achieves much better retrieval performance. This is, in effect, a good cooperation between object detection and description.

Definition of Object Proposals

Manual definition vs. automatic detection.

We briefly introduce the method of extracting object proposals. We have two choices: manual definition, which extracts regular boxes, or automatic detection such as objectness, which finds several high-confidence regions. In experiments, we observe that both strategies produce satisfying results, provided that the number of proposals is sufficiently large. This implies that it is the number of object proposals that helps to improve accuracy. For simplicity, we will only use manual definition in the later experiments; a rough illustration of that strategy follows below.
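The sketch below generates regular boxes at a few scales as an illustration of the manual-definition strategy; the scales and strides are placeholders, not the configuration used in the paper:

```python
# A minimal sketch of "manual definition" of object proposals: regular crops
# at several scales. The scales and strides are illustrative placeholders only.
def regular_proposals(width, height, scales=(1.0, 0.75, 0.5), stride_frac=0.5):
    """Return a list of (x, y, w, h) boxes tiling the image at each scale."""
    boxes = []
    for s in scales:
        w, h = int(width * s), int(height * s)
        step_x = max(1, int(w * stride_frac))
        step_y = max(1, int(h * stride_frac))
        for y in range(0, height - h + 1, step_y):
            for x in range(0, width - w + 1, step_x):
                boxes.append((x, y, w, h))
    return boxes
```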

Time & Memory Costs

Dataset scale: $N$ candidate images (~$10^6$); $K$ object proposals for each image (~$10^2$); a $D$-dimensional feature for each object ($D = 4096$).

For one single query:
Time complexity: $O(K \times NK \times D) = O(NK^2D)$, i.e., $K$ querying features against $NK$ indexed features. TOO EXPENSIVE!
Memory complexity: $O(N \times K \times D) = O(NKD)$.

Finally, we analyze the time and memory consumption of the ONE algorithm. We assume that there are N candidate images; on each image, we extract K interest regions, each equipped with a D-dimensional feature. This is the setting of a large-scale image retrieval task in which images are described with deep features. We find that the time and memory costs can be very high, more than 100 seconds for one single query.
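As a back-of-the-envelope check, plugging in the numbers above (and assuming 4-byte floating-point features) shows why brute force is impractical:

```latex
% Rough cost estimate with N = 10^6, K = 10^2, D = 4096 (4-byte floats assumed).
\begin{align*}
\text{time:}\quad  & N K^2 D \approx 10^6 \cdot 10^4 \cdot 4096 \approx 4.1 \times 10^{13} \text{ distance terms per query},\\
\text{memory:}\quad & N K D \cdot 4\,\mathrm{B} \approx 10^6 \cdot 10^2 \cdot 4096 \cdot 4\,\mathrm{B} \approx 1.6\ \mathrm{TB} \text{ of raw features}.
\end{align*}
```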

Approximation

Approximate NN search! PCA reduction: from $D$ to $D'$ (512) dimensions. Product Quantization (PQ) approximation: $M$ (32) segments, each with $T$ (4096) codewords.

For one single query:
Time complexity: $O(K \times NK \times M + K \times D' \times T)$, i.e., the PQ summation cost plus the cost of building the per-query codebook lookup tables. MUCH BETTER!
Memory complexity: $O(NK \times M \times \log_2 T + D' \times T)$.

To cope with this, we use approximate NN search, which involves PCA and PQ for approximation. With these simple techniques, we can significantly reduce the computational costs. It is also worth noting that our algorithm benefits from simple arithmetic computations and a highly parallelizable flowchart, which makes it possible to adopt powerful devices such as GPUs for fast computation. As a result of approximation and parallelization, it requires only about 1 second to process a retrieval query in a large-scale database.
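The snippet below is a minimal sketch of PCA reduction plus Product Quantization with asymmetric distance computation; it uses a toy configuration (M=8 segments, T=256 codewords) rather than the paper's settings, and scikit-learn's KMeans and PCA rather than any GPU implementation:

```python
# A minimal sketch of PCA + Product Quantization (PQ) approximate NN search.
# Toy configuration; not the authors' implementation or parameters.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def train_pq(X, M=8, T=256):
    """Split D'-dim vectors into M segments; learn T codewords per segment."""
    seg = X.shape[1] // M
    return [KMeans(n_clusters=T, n_init=4).fit(X[:, i*seg:(i+1)*seg]).cluster_centers_
            for i in range(M)]

def encode_pq(X, codebooks):
    """Encode each vector as M codeword indices (log2(T) bits per segment)."""
    seg = X.shape[1] // len(codebooks)
    codes = np.empty((X.shape[0], len(codebooks)), dtype=np.int32)
    for i, C in enumerate(codebooks):
        d = ((X[:, None, i*seg:(i+1)*seg] - C[None, :, :]) ** 2).sum(axis=2)
        codes[:, i] = d.argmin(axis=1)
    return codes

def adc_distances(q, codes, codebooks):
    """Asymmetric distance: per-segment lookup tables for the query, then
    M table lookups per database vector."""
    seg = q.shape[0] // len(codebooks)
    tables = [((q[i*seg:(i+1)*seg] - C) ** 2).sum(axis=1)
              for i, C in enumerate(codebooks)]
    return sum(tables[i][codes[:, i]] for i in range(len(codebooks)))

# Usage sketch: reduce features with PCA, quantize the database once,
# then answer each query with table lookups instead of exact distances.
# pca = PCA(n_components=64).fit(db_feats)
# dbp = pca.transform(db_feats); books = train_pq(dbp); codes = encode_pq(dbp, books)
# dists = adc_distances(pca.transform(query[None])[0], codes, books)
```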

Parallelization

Why parallelization? PQ needs a huge amount of regular computation; in comparison, conventional BoVW models, with either an SVM or an inverted index, are difficult to parallelize. The GPU is among the most powerful devices for parallelization.

After using a GPU: a 30-50x speed-up on top of PQ, and only ~1s for each query among 1M images.

Outline: Introduction (Image Classification and Retrieval; the Conventional BoVW Model), Goal and Motivation, The ONE Algorithm, Experimental Results, Conclusions.

Here I report some experimental results.

Experiments: Image Classification

Fine-Grained Object Recognition: the Pet-37 dataset (7390 images), the Flower-102 dataset (8189 images), the Bird-200 dataset (11788 images).
Scene Recognition: the LandUse-21 dataset (2100 images), the Indoor-67 dataset (15620 images), the SUN-397 dataset (108954 images).

First we conduct experiments on image classification. They are partitioned into two parts, fine-grained recognition and scene recognition, each composed of three popular and challenging datasets.

Results: Fine-Grained Recognition

Method               Pet-37    Flower-102   Bird-200
Wang, IJCV14         59.29%    75.26%       N/A
Murray, CVPR14       56.8%     84.6%        33.3%
Donahue, ICML14      N/A       N/A          58.75%
Razavian, CVPR14     N/A       86.8%        61.8%
Ours (ONE)           88.05%    85.49%       59.66%
SVM with deep feat.  89.50%    86.24%       61.54%
ONE+SVM              90.03%    86.82%       62.02%

For fine-grained object recognition, the results are shown here. We can see that, although ONE alone produces slightly inferior results to SVM with deep features, the combination of ONE and SVM gives higher accuracy than either individual model. This indicates that ONE provides complementary and helpful information to SVM.

Results: Scene Recognition

Method               LandUse-21   Indoor-67   SUN-397
Kobayashi, CVPR14    92.8%        63.4%       46.1%
Xie, CVPR14          N/A          63.48%      46.91%
Donahue, ICML14      N/A          N/A         40.94%
Razavian, CVPR14     N/A          69.0%       N/A
Ours (ONE)           94.52%       68.46%      53.00%
SVM with deep feat.  93.98%       69.61%      54.47%
ONE+SVM              94.71%       70.13%      54.87%

Similar results are also observed in the scene recognition experiments.

Experiments: Image Retrieval

Near-Duplicate Image Retrieval:
The Holiday dataset (1491 images): 500 image groups, 2-12 images per group; evaluation: the mAP score.
The UKBench dataset (10200 images): 2550 object groups, 4 objects per group; evaluation: the N-S score.
The Holiday+1M dataset: Holiday mixed with 1 million distractor images.

Here are the image retrieval experiments on the Holiday and UKBench datasets, with very common settings.

Results: Image Retrieval

Method             Holiday   UKBench   Holiday+1M
Zhang, ICCV13      0.809     3.60      0.633
Zheng, CVPR14      0.858     3.85      N/A
Zheng, arXiv14     0.881     3.873     0.724
Razavian, CVPR14   0.843     N/A       N/A
Ours (ONE)         0.887     3.873     N/A
BoVW with SIFT     0.518     3.134     N/A
ONE+BoVW           0.899     3.887     0.758

Here are the results. Once again we achieve the state of the art without any post-processing on top of the initial retrieval results. The scores, 0.899 and 3.887, rank among the top of the results known to us.

Outline: Introduction (Image Classification and Retrieval; the Conventional BoVW Model), Goal and Motivation, The ONE Algorithm, Experimental Results, Conclusions.

Finally, we draw some conclusions.

What have we Learned?

Image classification and retrieval: what is the difference? Classification benefits from extra labels, and measuring image-to-class distance is more stable!
Image classification and retrieval: what are the connections? Both deal with image similarity! From retrieval to categories: "pseudo" labels.
ONE (Online Nearest-neighbor Estimation): a unified model for classification and retrieval.

In this work, we design a unified model for both image classification and retrieval. We have learned several things from this effort. First, the difference between image classification and retrieval lies in that the former can benefit from extra labels, in other words, from measuring image-to-class distance, which is much more stable than image-to-image distance. Second, both classification and retrieval can be solved by computing image-to-class similarity, and we can use pseudo labels to achieve this in image retrieval. Summarizing these observations gives the ONE model, which is very effective in real use.

Why Does ONE Work Well?

Measuring image-to-class distance. Theory: NBNN [Boiman, CVPR'08]. Generalizing to image retrieval: "pseudo" labels.
How to perform excellent classification/retrieval? Good detection (object proposal definition) and good description (deep conv-net features). Make it fast: approximation and acceleration. GPUs may be the trend of big-data computation.

The reasons that ONE works well are illustrated here. Based on the theory of NBNN, it is the good cooperation between object detection and description that produces the excellent performance. Meanwhile, the GPU plays an important role in accelerating our algorithm. We think this is another important clue left by our work to motivate future research.

Thank you! Questions, please?

Thank you for your attention. Any questions are most welcome!