Active Learning for Class Imbalance Problem


Active Learning for Class Imbalance Problem

Problem to be addressed Motivation: The class imbalance problem refers to the situation where at least one class has significantly fewer training examples than the others, i.e., the examples belonging to one class heavily outnumber those of the other class. Most machine learning algorithms, such as support vector machines, logistic regression and the naïve Bayes classifier, implicitly assume that the training data are balanced. Over the last few decades, several effective methods have been proposed to attack this problem, such as up-sampling, down-sampling and asymmetric bagging.

Problem to be addressed Detailed problem: Traditional machine learning algorithms are often biased toward the majority class, because their goal is to minimize the training error without taking the class distribution into account. Consequently, examples from the majority class are well classified, while examples from the minority class tend to be misclassified.

Several Common Approaches From the data perspective: over-sampling, under-sampling and asymmetric bagging (a minimal resampling sketch follows below). From the learning algorithm perspective: adjusting the cost function and tuning the related parameters.
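To make the data-level approaches concrete, here is a minimal sketch of random over- and under-sampling in NumPy; the function names, the binary setting and the equal-size target are illustrative assumptions, not details from the paper.

```python
import numpy as np

def random_oversample(X, y, minority_label, rng=np.random.default_rng(0)):
    """Duplicate minority examples at random until both classes have equal size."""
    minority_idx = np.where(y == minority_label)[0]
    majority_idx = np.where(y != minority_label)[0]
    # Draw extra minority indices with replacement up to the majority-class size.
    extra = rng.choice(minority_idx, size=len(majority_idx) - len(minority_idx), replace=True)
    keep = np.concatenate([majority_idx, minority_idx, extra])
    return X[keep], y[keep]

def random_undersample(X, y, minority_label, rng=np.random.default_rng(0)):
    """Drop majority examples at random until both classes have equal size."""
    minority_idx = np.where(y == minority_label)[0]
    majority_idx = np.where(y != minority_label)[0]
    # Keep only as many majority examples as there are minority examples.
    kept_majority = rng.choice(majority_idx, size=len(minority_idx), replace=False)
    keep = np.concatenate([kept_majority, minority_idx])
    return X[keep], y[keep]
```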

Background Knowledge Active Learning: Similar to semi-supervised learning, the key idea is to use both labeled and unlabeled data for classifier training. Active learning is composed of four components: a small set of labeled training data, a large pool of unlabeled data, a base learning algorithm, and an active learner (the selection strategy). Active learning is not itself a machine learning algorithm; it can be seen as an enhancing wrapper method. The difference between semi-supervised learning and active learning is that the active learner queries an oracle (e.g., a human annotator) for the labels of selected unlabeled examples, whereas semi-supervised learning exploits the unlabeled data without requesting any new labels.

Background Knowledge Active Learning Goals of active learning: maximize the learning performance while minimizing the number of labeled training examples required; achieve better performance using the same amount of labeled training data; or need fewer training samples to obtain the same learning performance.

Background Knowledge

Background Knowledge

An Example SVM-based Active Learning: a small set of labeled training examples; a large pool of unlabeled data; the base learning algorithm is an SVM; the active learner (selection strategy) picks the instances closest to the current separating hyperplane and asks a human to label them.
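A minimal sketch of this margin-based selection step, assuming scikit-learn's SVC and illustrative array names (X_labeled, y_labeled, X_pool) that are not from the paper:

```python
import numpy as np
from sklearn.svm import SVC

def query_closest_to_hyperplane(X_labeled, y_labeled, X_pool):
    """Train an SVM on the labeled data and return the index of the pool
    instance closest to the separating hyperplane (the most uncertain one)."""
    clf = SVC(kernel="linear").fit(X_labeled, y_labeled)
    # decision_function gives the signed distance (up to scaling) to the hyperplane;
    # the smallest absolute value marks the most informative example.
    margins = np.abs(clf.decision_function(X_pool))
    return int(np.argmin(margins))
```

In the active learning loop, the queried instance is labeled by a human, moved from the pool into the labeled set, and the SVM is retrained before the next query.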

Problems SVM-based Active Learning: In classical active learning methods, the most informative samples are selected from the entire unlabeled pool. In other words, each iteration of active learning requires computing the distance of every unlabeled sample to the decision boundary. For large-scale data sets, this is time-consuming and computationally inefficient.

Paper Contribution Proposed method: Instead of querying the whole unlabeled pool, a small random subset of L instances is first drawn. The sample closest to the hyperplane is selected from this subset, using the criterion that, with a chosen probability, it is among the top ε fraction of closest instances in the whole pool.

Paper Contribution Proposed Method: The probability that at least one of the L randomly drawn instances is among the top ε fraction of closest instances is P = 1 − (1 − ε)^L. Solving for L, we have L = ln(1 − P) / ln(1 − ε).

Paper Contribution Proposed Method For example: with ε = 0.05 and P = 0.95, L = ln(0.05) / ln(0.95) ≈ 59. The active learner will therefore pick an instance that, with 95% probability, is among the top 5% of instances closest to the separating hyperplane, by randomly sampling only 59 instances regardless of the training set size.
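A sketch of this randomized selection step, using the same illustrative names as above; the subset size follows from the formula on the previous slide (59 instances for ε = 0.05 and 95% confidence), and the fitted classifier clf is assumed to come from the earlier sketch.

```python
import math
import numpy as np

def subset_size(eps=0.05, confidence=0.95):
    """Number of random samples needed so that, with the given confidence,
    at least one lies within the top eps fraction closest to the hyperplane."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - eps))  # 59 for the defaults

def query_from_random_subset(clf, X_pool, eps=0.05, confidence=0.95,
                             rng=np.random.default_rng(0)):
    """Pick the most uncertain instance from a small random subset of the pool
    instead of scanning the entire unlabeled pool."""
    L = min(subset_size(eps, confidence), len(X_pool))
    candidates = rng.choice(len(X_pool), size=L, replace=False)
    margins = np.abs(clf.decision_function(X_pool[candidates]))
    return int(candidates[np.argmin(margins)])
```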

Experiments

Experiments Evaluation Metric g-means = sqrt(sensitivity × specificity), where sensitivity and specificity are the accuracies on the positive and negative instances, respectively.
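For reference, a small sketch of computing g-means from predictions, assuming binary labels with the minority class as the positive label:

```python
import numpy as np

def g_means(y_true, y_pred, positive_label=1):
    """Geometric mean of sensitivity (positive-class accuracy) and
    specificity (negative-class accuracy)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    pos, neg = (y_true == positive_label), (y_true != positive_label)
    sensitivity = np.mean(y_pred[pos] == y_true[pos]) if pos.any() else 0.0
    specificity = np.mean(y_pred[neg] == y_true[neg]) if neg.any() else 0.0
    return float(np.sqrt(sensitivity * specificity))
```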

Experiments

Experiments

Experiments

Experiments

Conclusions This paper proposes a method that addresses the class imbalance problem using an active learning technique. Experimental results show that the approach achieves a significant decrease in training time while maintaining the same or even a higher g-means value using fewer training examples. Actively selecting informative examples from a randomly drawn subset avoids searching the whole unlabeled pool.

Thank You

Q & A