Online Multiple Kernel Classification Steven C.H. Hoi, Rong Jin, Peilin Zhao, Tianbao Yang Machine Learning (2013) Presented by Audrey Cheong Electrical.

Slides:



Advertisements
Similar presentations
Koby Crammer Department of Electrical Engineering
Advertisements

Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.
Integrated Instance- and Class- based Generative Modeling for Text Classification Antti PuurulaUniversity of Waikato Sung-Hyon MyaengKAIST 5/12/2013 Australasian.
On-line learning and Boosting
Particle swarm optimization for parameter determination and feature selection of support vector machines Shih-Wei Lin, Kuo-Ching Ying, Shih-Chieh Chen,
Learning on Probabilistic Labels Peng Peng, Raymond Chi-wing Wong, Philip S. Yu CSE, HKUST 1.
Robust Moving Object Detection & Categorization using self- improving classifiers Omar Javed, Saad Ali & Mubarak Shah.
Linear Learning Machines  Simplest case: the decision function is a hyperplane in input space.  The Perceptron Algorithm: Rosenblatt, 1956  An on-line.
Sparse vs. Ensemble Approaches to Supervised Learning
1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.
2D1431 Machine Learning Boosting.
Boosting Rong Jin. Inefficiency with Bagging D Bagging … D1D1 D2D2 DkDk Boostrap Sampling h1h1 h2h2 hkhk Inefficiency with boostrap sampling: Every example.
Generic Object Detection using Feature Maps Oscar Danielsson Stefan Carlsson
Announcements  Project proposal is due on 03/11  Three seminars this Friday (EB 3105) Dealing with Indefinite Representations in Pattern Recognition.
1 Integrating User Feedback Log into Relevance Feedback by Coupled SVM for Content-Based Image Retrieval 9-April, 2005 Steven C. H. Hoi *, Michael R. Lyu.
Linear Learning Machines  Simplest case: the decision function is a hyperplane in input space.  The Perceptron Algorithm: Rosenblatt, 1956  An on-line.
Sparse vs. Ensemble Approaches to Supervised Learning
Genetic Algorithm What is a genetic algorithm? “Genetic Algorithms are defined as global optimization procedures that use an analogy of genetic evolution.
Text Classification Using Stochastic Keyword Generation Cong Li, Ji-Rong Wen and Hang Li Microsoft Research Asia August 22nd, 2003.
Selective Sampling on Probabilistic Labels Peng Peng, Raymond Chi-Wing Wong CSE, HKUST 1.
Online Learning Algorithms
Machine Learning CS 165B Spring 2012
Attention Deficit Hyperactivity Disorder (ADHD) Student Classification Using Genetic Algorithm and Artificial Neural Network S. Yenaeng 1, S. Saelee 2.
Using Error-Correcting Codes For Text Classification Rayid Ghani Center for Automated Learning & Discovery, Carnegie Mellon University.
Active Learning for Class Imbalance Problem
Data mining and machine learning A brief introduction.
CSSE463: Image Recognition Day 27 This week This week Last night: k-means lab due. Last night: k-means lab due. Today: Classification by “boosting” Today:
GA-Based Feature Selection and Parameter Optimization for Support Vector Machine Cheng-Lung Huang, Chieh-Jen Wang Expert Systems with Applications, Volume.
Benk Erika Kelemen Zsolt
Multiple Instance Real Boosting with Aggregation Functions Hossein Hajimirsadeghi and Greg Mori School of Computing Science Simon Fraser University International.
The Perceptron. Perceptron Pattern Classification One of the purposes that neural networks are used for is pattern classification. Once the neural network.
Iowa State University Department of Computer Science Artificial Intelligence Research Laboratory Research supported in part by a grant from the National.
Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen Dmitri V.kalashnikov Sharad Mehrotra.
Xiangnan Kong,Philip S. Yu Multi-Label Feature Selection for Graph Classification Department of Computer Science University of Illinois at Chicago.
Stefan Mutter, Mark Hall, Eibe Frank University of Freiburg, Germany University of Waikato, New Zealand The 17th Australian Joint Conference on Artificial.
Extending the Multi- Instance Problem to Model Instance Collaboration Anjali Koppal Advanced Machine Learning December 11, 2007.
CISC Machine Learning for Solving Systems Problems Presented by: Ashwani Rao Dept of Computer & Information Sciences University of Delaware Learning.
Linear Discrimination Reading: Chapter 2 of textbook.
Protein Classification Using Averaged Perceptron SVM
Learning with AdaBoost
Gang WangDerek HoiemDavid Forsyth. INTRODUCTION APROACH (implement detail) EXPERIMENTS CONCLUSION.
Online Learning Rong Jin. Batch Learning Given a collection of training examples D Learning a classification model from D What if training examples are.
CSSE463: Image Recognition Day 33 This week This week Today: Classification by “boosting” Today: Classification by “boosting” Yoav Freund and Robert Schapire.
Computational Approaches for Biomarker Discovery SubbaLakshmiswetha Patchamatla.
Consensus Group Stable Feature Selection
Iterative similarity based adaptation technique for Cross Domain text classification Under: Prof. Amitabha Mukherjee By: Narendra Roy Roll no: Group:
Wenyuan Dai, Ou Jin, Gui-Rong Xue, Qiang Yang and Yong Yu Shanghai Jiao Tong University & Hong Kong University of Science and Technology.
***Classification Model*** Hosam Al-Samarraie, PhD. CITM-USM.
Combining Evolutionary Information Extracted From Frequency Profiles With Sequence-based Kernels For Protein Remote Homology Detection Name: ZhuFangzhi.
Learning Photographic Global Tonal Adjustment with a Database of Input / Output Image Pairs.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Classifying Covert Photographs CVPR 2012 POSTER. Outline  Introduction  Combine Image Features and Attributes  Experiment  Conclusion.
Combining multiple learners Usman Roshan. Decision tree From Alpaydin, 2010.
Nawanol Theera-Ampornpunt, Seong Gon Kim, Asish Ghoshal, Saurabh Bagchi, Ananth Grama, and Somali Chaterji Fast Training on Large Genomics Data using Distributed.
1 Machine Learning in Natural Language More on Discriminative models Dan Roth University of Illinois, Urbana-Champaign
Max-Confidence Boosting With Uncertainty for Visual tracking WEN GUO, LIANGLIANG CAO, TONY X. HAN, SHUICHENG YAN AND CHANGSHENG XU IEEE TRANSACTIONS ON.
A distributed PSO – SVM hybrid system with feature selection and parameter optimization Cheng-Lung Huang & Jian-Fan Dun Soft Computing 2008.
Combining Models Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya.
Experience Report: System Log Analysis for Anomaly Detection
Debesh Jha and Kwon Goo-Rak
Asymmetric Gradient Boosting with Application to Spam Filtering
Finding Clusters within a Class to Improve Classification Accuracy
A New Boosting Algorithm Using Input-Dependent Regularizer
Incremental Training of Deep Convolutional Neural Networks
Behrouz Minaei, William Punch
Shih-Wei Lin, Kuo-Ching Ying, Shih-Chieh Chen, Zne-Jung Lee
Graph-based Security and Privacy Analytics via Collective Classification with Joint Weight Learning and Propagation Binghui Wang, Jinyuan Jia, and Neil.
Classification Breakdown
Jonathan Elsas LTI Student Research Symposium Sept. 14, 2007
The Voted Perceptron for Ranking and Structured Classification
Presentation transcript:

Online Multiple Kernel Classification Steven C.H. Hoi, Rong Jin, Peilin Zhao, Tianbao Yang Machine Learning (2013) Presented by Audrey Cheong Electrical & Computer Engineering MATH 6397: Data Mining

Online Multiple Kernel Classification (OMKC) Background - Online Online learning Learns one instance at a time and predicts labels for future instances Learner is given an instance Learner predicts the label of the instance Learner is given the correct label Learner refines its prediction mechanism 2

Online Multiple Kernel Classification (OMKC) Background – Multiple Kernel Composed of two online learning algorithms: Perceptron algorithm (Rosenblatt 1958) Type of linear classifier Learns a classifier for a given kernel Hedge algorithm (Freund and Schapire 1997) Combines classifiers by linear weights Perceptron Hedge 3

Online Multiple Kernel Classification (OMKC) Perceptron algorithm 4

Online Multiple Kernel Classification (OMKC) Hedge algorithm 5

Online Multiple Kernel Classification (OMKC) Notations 6

Online Multiple Kernel Classification (OMKC) Proposed framework 7

Online Multiple Kernel Classification (OMKC) Algorithms Deterministic approach: all kernels are used Stochastic approach: a subset of kernels are used 8 Deterministic Stochastic Deterministic Stochastic Update Combination

Online Multiple Kernel Classification (OMKC) OMKC (D,D) 9 … Kernel classifiers : Prediction: Combined Prediction: … Deterministic update Deterministic combination Deterministic Stochastic Deterministic Stochastic Update Combination

Online Multiple Kernel Classification (OMKC) OMKC (S,S) 10 … Kernel classifiers : Prediction: Combined Prediction: … Stochastic update Deterministic Stochastic Deterministic Stochastic Update Combination Stochastic combination

Online Multiple Kernel Classification (OMKC) Experimental setup 11 binary datasets

Online Multiple Kernel Classification (OMKC) Experimental setup 12

Online Multiple Kernel Classification (OMKC) Evaluation of the deterministic OMKC algorithm Comparison of the deterministic OMKC algorithm with three Perceptron based algorithms Perceptron : the well-known Perceptron baseline algorithm with a linear kernel (Rosenblatt 1958; Freund and Schapire 1999) Perceptron(u) : another Perceptron baseline algorithm with an unbiased/uniform combination of all the kernels Perceptron(*): an online validation procedure to search for the best kernel among the pool of kernels (using the first 10 % training examples), and then apply the Perceptron algorithm with the best kernel OM-2: a state-of-the-art online learning algorithm for multiple kernel learning (Jie et al. 2010; Orabona et al. 2010) 13

Online Multiple Kernel Classification (OMKC) Evaluation of the deterministic OMKC algorithm 14 < > <

Online Multiple Kernel Classification (OMKC) Average mistake rate (20 runs) 15

Online Multiple Kernel Classification (OMKC) Number of support vectors (20 runs) 16

Online Multiple Kernel Classification (OMKC) Kernel weights 17

Online Multiple Kernel Classification (OMKC) 18

Online Multiple Kernel Classification (OMKC) Time Efficiency 19 Decreases as size increases

Online Multiple Kernel Classification (OMKC) Conclusion All the OMKC algorithms usually perform better than the regular Perceptron algorithm with an unbiased linear combination of multiple kernels the Perceptron algorithm with the best kernel found by validation the state-of-the-art online MKL algorithm The deterministic combination strategy usually performs better Stochastic updating strategy improves computational efficiency without decreasing the accuracy significantly 20

21 1)How many kernel classifiers were used in the stochastic combination? 2)How was the number of support vectors determined? Should the support vectors be given in terms of the number of support vectors per kernel classifier? Did support vectors overlap between kernel classifiers?

Online Multiple Kernel Classification (OMKC) References Hoi, S. C. H., Jin, R., Zhao, P., & Yang, T. (2012). Online Multiple Kernel Classification. Machine Learning, 90(2), 289–316. doi: /s

Online Multiple Kernel Classification (OMKC) Algorithm 1 All kernels are used 23 Deterministic Stochastic Deterministic Stochastic Update Combination Normalize the weights Update Combination

Online Multiple Kernel Classification (OMKC) Algorithm 1 → 2 24 Stochastic combination Deterministic update 17: Update Combination Deterministic Stochastic Deterministic Stochastic

Online Multiple Kernel Classification (OMKC) Algorithm 2 → 3 25 Deterministic Stochastic Deterministic Stochastic Update Combination

Online Multiple Kernel Classification (OMKC) Algorithm 2 → 3 26 Deterministic Stochastic Deterministic Stochastic Deterministic combination Stochastic update Update Combination

Online Multiple Kernel Classification (OMKC) Algorithm 4 27 Deterministic Stochastic Deterministic Stochastic Stochastic update Stochastic combination Update Combination