SVM Active Learning with Application to Image Retrieval

SVM Active Learning with Application to Image Retrieval. Simon Tong and Edward Chang, Proceedings of the Ninth ACM International Conference on Multimedia, September 30 to October 5, 2001.

Outline: Introduction, SVM, Version Space, Active Learning, Experiments, Conclusion

Introduction- what is image retrieval? An image retrieval system is a computer system for browsing, searching, and retrieving images from a large database of digital images. Most traditional methods of image retrieval attach metadata such as captions, keywords, or descriptions to the images so that retrieval can be performed over the annotation words. But as we know, users are lazy. So here is the question: how can the system automatically find the correct images for the user?

Introduction- relevance feedback Because hand-labeling each image with descriptive words is time-consuming and costly, there is a need for a way to allow a user to implicitly inform a database of his or her desired output or query concept. Relevance feedback can be used as a query-refinement scheme to derive or learn a user's query concept. To solicit feedback, the refinement scheme displays a few image instances and the user labels each image as 'relevant' or 'not relevant'.

Introduction- Active Learning Based on the answers, another set of images from the database is shown to the user for labeling. This scheme is often called pool-based active learning. [Figure: a pool of images displayed to the user, who labels each as relevant or not.]

Introduction- active learning The main issue with active learning is finding a way to choose informative images within the pool to ask the user to label. In general, and for the image retrieval task in particular, such a learner must meet two critical design goals: it must learn target concepts accurately, and it must grasp a concept quickly, with only a small number of labeled instances, since most users will not wait around to provide a great deal of feedback. The key idea in active learning is that the learner should choose its next pool-query based on the answers to previous pool-queries.

Introduction- proposed learner In this study, we propose using a support vector machine active learner (SVMact for short) to achieve our goals. SVMact is built on three ideas: it regards the task of learning a target concept as learning an SVM binary classifier; it learns the classifier quickly via active learning; and its active component selects the most informative instances with which to train the SVM classifier. Once the classifier is trained, SVMact returns the top-k most relevant images.

SVM- Linear Classifier A linear classifier has the form f(x, w, b) = sign(w·x + b): points with w·x + b > 0 are classified +1 and points with w·x + b < 0 are classified -1. [Figures: the same two-class dataset with several candidate separating lines.] How would you classify this data? Any of these lines would separate the training data... but which is best? A boundary drawn too close to one class misclassifies nearby points.

SVM- Linear Classifier Define the margin of a linear classifier as the width by which the boundary could be increased before hitting a datapoint.

SVM- Linear Classifier The maximum margin linear classifier is the linear classifier with the maximum margin; this is the simplest kind of SVM (called a linear SVM, or LSVM). Maximizing the margin is good: it implies that only the support vectors are important while the other training examples are ignorable, and empirically it works very well. Support vectors are the datapoints that the margin pushes up against.
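The maximum-margin idea above can be sketched in a few lines. This is an illustrative example, not the paper's code; it assumes scikit-learn is available (the paper predates that library):

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2-D data: two linearly separable clusters labeled +1 and -1.
X = np.array([[2.0, 2.0], [3.0, 3.0], [2.5, 3.5],
              [-2.0, -2.0], [-3.0, -3.0], [-2.5, -3.5]])
y = np.array([1, 1, 1, -1, -1, -1])

# A very large C approximates the hard-margin linear SVM (LSVM) described above.
clf = SVC(kernel='linear', C=1e6).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
margin = 2.0 / np.linalg.norm(w)  # width of the maximum-margin band
print("support vectors:\n", clf.support_vectors_)
print("margin width:", margin)
```

Only the support vectors determine w: deleting any non-support training point leaves the boundary unchanged.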

SVM- SVM Mathematically The classifier is the hyperplane w·x + b = 0, and the distance from a data point x to the hyperplane is |w·x + b| / ||w||.

SVM- SVM Mathematically Margin: the SVM maximizes the margin 2/||w||, which gives the objective function: minimize (1/2)||w||², subject to y_i(w·x_i + b) ≥ 1 for every training point. How is this constrained problem solved? With Lagrange multipliers.
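The formulas lost from this slide are the standard SVM primal problem and its Lagrangian; a reconstruction consistent with the distance formula above:

```latex
\max_{w,b}\ \frac{2}{\lVert w\rVert}
\quad\Longleftrightarrow\quad
\min_{w,b}\ \frac{1}{2}\lVert w\rVert^{2}
\quad\text{s.t.}\quad y_i\,(w\cdot x_i + b)\ \ge\ 1,\qquad i=1,\dots,n
```

Introducing multipliers \alpha_i \ge 0 for the constraints gives the Lagrangian

```latex
L(w,b,\alpha) \;=\; \frac{1}{2}\lVert w\rVert^{2}
\;-\;\sum_{i=1}^{n}\alpha_i\left[\,y_i\,(w\cdot x_i + b)-1\,\right]
```

Setting the derivatives with respect to w and b to zero yields w = \sum_i \alpha_i y_i x_i and \sum_i \alpha_i y_i = 0, which leads to the dual problem that SVM solvers optimize in practice.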

SVM- Nonlinear dataset If the data is not linearly separable, what should we do? Data that is not linearly separable in the input space may still become linearly separable after a nonlinear mapping into another (higher-dimensional) feature space.
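A hedged illustration of this kernel idea, again using scikit-learn (an assumption, not the paper's implementation): the XOR pattern below defeats any linear classifier in the input space, but an RBF kernel, which induces an implicit nonlinear mapping, separates it.

```python
import numpy as np
from sklearn.svm import SVC

# XOR-style data: no single line can separate the two classes.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
y = np.array([1, 1, -1, -1])

linear = SVC(kernel='linear').fit(X, y)
rbf = SVC(kernel='rbf', gamma=1.0, C=10.0).fit(X, y)  # implicit feature mapping

print("linear training accuracy:", (linear.predict(X) == y).mean())
print("rbf training accuracy:", (rbf.predict(X) == y).mean())
```

The RBF classifier fits all four points, while the linear one necessarily misclassifies at least one.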

Version Space- Notations F: feature space. H: the set containing all hypotheses (hyperplanes). W: parameter space. f(x): a classifier (hyperplane). V: version space. The version space is the set of all classifiers consistent with the labeled data seen so far.

Active Learning- concept Given an unlabeled pool U, an active learner l has three components: l(f, q, X). f: the classifier, trained on the current set of labeled data X. q: the query component, which, given the current labeled set X, decides which instance in U to query next. X: the labeled dataset. DEFINITION 4.1: Area(V) is the surface area that version space V occupies on the hypersphere ||w|| = 1. We want the classifier to become more precise as the number of query rounds increases, so we need to reduce the version space as quickly as possible.

Active Learning- concept LEMMA (Tong & Koller, 2000). Suppose we have an input space X, a finite-dimensional feature space F (induced via a kernel K), and parameter space W. Suppose the active learner l* always queries instances whose corresponding hyperplanes in parameter space W halve the area of the current version space, and let l be any other active learner. Denote the version spaces of l* and l after i queries by V_i* and V_i respectively, and let P denote the set of all conditional distributions of y given x. Then, for every i, the expected area of V_i* under any distribution in P is no greater than the expected area of V_i.

Active Learning- concept This discussion motivates an approach in which we query instances that split the current version space into two parts of as nearly equal size as possible. Given an unlabeled instance x from the pool, however, it is not practical to explicitly compute the sizes of the new version spaces V- and V+, so we approximate this procedure. A simple method: learn an SVM on the existing labeled data and choose as the next query the pool instance that comes closest to the SVM hyperplane.
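The simple method can be sketched as follows. The pool points and labels are made up for illustration; decision_function is used because it is proportional to the signed distance to the hyperplane:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Small labeled seed set and a larger unlabeled pool (stand-ins for image features).
X_labeled = np.array([[2.0, 2.0], [-2.0, -2.0]])
y_labeled = np.array([1, -1])
pool = rng.uniform(-3.0, 3.0, size=(50, 2))

clf = SVC(kernel='linear').fit(X_labeled, y_labeled)

# |decision_function| is proportional to the distance from the hyperplane,
# so the smallest value marks the most informative instance to query next.
dist = np.abs(clf.decision_function(pool))
query_idx = int(np.argmin(dist))
print("next query:", pool[query_idx], "score:", dist[query_idx])
```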

Active Learning- why simple work The SVM unit vector w obtained from the labeled data is the center of the largest hypersphere that can fit inside the current version space V, so the position of w is often approximately the center of the version space. We can therefore test each unlabeled instance x in the pool to see how close its corresponding hyperplane in W comes to the centrally placed w. The distance calculation is straightforward.

Active Learning- why simple work labeled instance unlabeled instance Choose this for labeling w

Active Learning- SVMact summary To summarize, our SVMact system performs the following for each round of relevance feedback: learn an SVM on the current labeled data; then, if this is the first feedback round, ask the user to label twenty randomly selected images, and otherwise ask the user to label the twenty pool images closest to the SVM boundary. After the relevance feedback rounds have been performed, SVMact retrieves the top-k most relevant images: learn a final SVM on the labeled data; the final SVM boundary separates 'relevant' images from irrelevant ones; display the k relevant images that are farthest from the SVM boundary.
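The whole feedback loop above can be sketched end to end. Everything here is illustrative: a synthetic 2-D "database", a hypothetical true_label oracle standing in for the user, and scikit-learn's SVC in place of the paper's SVM implementation.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Simulated database: 200 "images" as 2-D feature points. The hidden target
# concept is x0 > 0; in the real system a human user plays this oracle role.
database = rng.uniform(-1.0, 1.0, size=(200, 2))
true_label = lambda X: np.where(X[:, 0] > 0.0, 1, -1)

n_per_round, n_rounds, k = 20, 3, 10
# Round 1: twenty randomly selected images.
labeled_idx = list(rng.choice(len(database), n_per_round, replace=False))

for _ in range(n_rounds):
    clf = SVC(kernel='linear').fit(database[labeled_idx],
                                   true_label(database[labeled_idx]))
    unlabeled = [i for i in range(len(database)) if i not in labeled_idx]
    # Later rounds: label the twenty pool images closest to the boundary.
    dist = np.abs(clf.decision_function(database[unlabeled]))
    closest = np.argsort(dist)[:n_per_round]
    labeled_idx += [unlabeled[i] for i in closest]

# Final retrieval: the k relevant images farthest from the final boundary.
final = SVC(kernel='linear').fit(database[labeled_idx],
                                 true_label(database[labeled_idx]))
scores = final.decision_function(database)
top_k = np.argsort(-scores)[:k]
print("top-k accuracy:", (true_label(database[top_k]) == 1).mean())
```

Ranking by distance on the "relevant" side of the boundary is what turns the binary classifier into a retrieval system.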

Experiments- Image Characterization Our image retrieval system employs a multi-resolution image representation scheme. We characterize images by two main features: color and texture. Given an image, we extract this color and texture information to produce a 144-dimensional vector of numbers. Thus, the space X for our SVMs is a 144-dimensional space, and each image in our database corresponds to a point in this space.
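The paper's exact 144-dimensional color/texture extraction is not reproduced here; the sketch below only illustrates the general pattern of mapping an image to a fixed-length feature vector, using a made-up per-channel color histogram:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a decoded RGB image with values in [0, 1].
image = rng.uniform(0.0, 1.0, size=(64, 64, 3))

def color_histogram(img, bins_per_channel=4):
    """Concatenate normalized per-channel histograms into one feature vector."""
    feats = []
    for c in range(img.shape[2]):
        hist, _ = np.histogram(img[:, :, c], bins=bins_per_channel, range=(0, 1))
        feats.append(hist / img[:, :, c].size)  # each channel sums to 1
    return np.concatenate(feats)

x = color_histogram(image)
print("feature vector length:", len(x))  # 12 here; 144 dimensions in the paper
```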

Experiments- Image Databases We used three real-world image datasets: four-category, ten-category, and fifteen-category, where each category consisted of 100 to 150 images. Four-category: architecture, flowers, landscape, and people. Ten-category: architecture, bears, clouds, flowers, landscape, people, objectionable images, tigers, tools, and waves. Fifteen-category: the ten above plus elephants, fabrics, fireworks, food, and texture.

Experiments- method The goal of SVMact is to learn a given concept through a relevance feedback process. At each feedback round, SVMact selects twenty images and asks the user to label each as 'relevant' or 'not relevant' with respect to the query concept; it then uses the labeled instances to successively refine the concept boundary. After the relevance feedback rounds have finished, SVMact retrieves the top-k most relevant images from the dataset based on the final concept it has learned.

Experiments- average top-k accuracy [Chart: average top-k accuracy results; figure not preserved in the transcript.]

Experiments- compared with passive [Chart: comparison of active and passive (random) query selection; figure not preserved in the transcript.]

Conclusion Active learning with SVMs can provide a powerful tool for searching image databases, outperforming a number of traditional query refinement schemes. SVMact not only achieves consistently high accuracy on a wide variety of desired returned results, but also does so quickly, and it maintains high precision even when asked for large quantities of images.