Bidirectional Active Learning: A Two-Way Exploration Into Unlabeled and Labeled Data Set
By Xiao-Yu Zhang, Shupeng Wang, Xiaochun Yun
Presented by Ruhani Faiheem Rahman

Abstract
Labeling data is one of the major problems in machine learning: we have huge amounts of unlabeled data, and only a few instances have labels.
Active learning helps in that case. But what about mislabeled data? Such noise will propagate into the model.
This paper explores both the labeled and the unlabeled data sets simultaneously.

Introduction
Classic machine learning: supervised learning and unsupervised learning.
If the unlabeled data is explored along with the labeled data, a considerable amount of improvement is possible.
Actively select the most informative instances to improve the model.

Unidirectional Active Learning
Traditional active learning chooses instances from the unlabeled pool so as to learn the model effectively. Common selection mechanisms (a sketch of the first follows below):
- Uncertainty sampling: choose the instance the model is least certain about.
- Query by committee: a committee of models votes; the instances they disagree on most are selected.
- Decision-theoretic approach: choose the instance that would most reduce the model's generalization error if its label were known.
However, noise in the labeled data can jeopardize the learning performance.
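A minimal sketch of uncertainty sampling (not the authors' code): it assumes a scikit-learn style classifier exposing predict_proba and picks the unlabeled instance with the highest predictive entropy.

```python
import numpy as np

def uncertainty_sampling(model, X_unlabeled):
    """Return the index of the unlabeled instance the model is least
    certain about, measured by the entropy of its class distribution."""
    proba = model.predict_proba(X_unlabeled)              # (n_instances, n_classes)
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    return int(np.argmax(entropy))                        # most uncertain instance
```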

Unidirectional Active Learning

Bidirectional Active Learning
- Forward Learning
- Backward Learning
  - Backward instance detecting: instance-level detecting, label-level detecting
  - Backward instance processing: Undo; Redo (Relabel, Reselect)
- Backward Learning Algorithm

Forward Learning
Similar to Unidirectional Active Learning (UDAL).
Select a forward instance from the unlabeled data set, based on the selection mechanisms described before.
Add the instance to the labeled data set and remove it from the unlabeled data set.
Train a new model. (A sketch of one forward iteration follows below.)
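A sketch of one forward iteration, assuming numpy arrays, a scikit-learn style model, the uncertainty_sampling helper above, and a hypothetical oracle callable that returns a (possibly noisy) label:

```python
import numpy as np

def forward_step(model, X_lab, y_lab, X_unlab, oracle):
    """One forward-learning iteration: query, move, retrain."""
    i = uncertainty_sampling(model, X_unlab)      # selection mechanism from before
    x = X_unlab[i]
    y = oracle(x)                                 # the returned label may be noisy
    X_lab = np.vstack([X_lab, x])                 # add to the labeled set
    y_lab = np.append(y_lab, y)
    X_unlab = np.delete(X_unlab, i, axis=0)       # remove from the unlabeled set
    model.fit(X_lab, y_lab)                       # train a new model
    return model, X_lab, y_lab, X_unlab
```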

Backward Learning: Detect a Backward Instance
Explores the labeled data set instead of the unlabeled data set.
Detects an instance from the labeled data set based on one of two criteria (a sketch of the first follows below):
- Instance-level detecting: find the instance without which the entropy over the unlabeled data set would be minimized.
- Label-level detecting: find the most suspiciously mislabeled instance, i.e., the one whose label, if changed, would minimize the overall error.
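A brute-force leave-one-out sketch of instance-level detecting, mirroring the slide's description (the paper presumably computes this more efficiently): retrain without each labeled instance and keep the one whose removal leaves the lowest total entropy on the unlabeled set.

```python
import numpy as np
from sklearn.base import clone

def detect_backward_instance(model, X_lab, y_lab, X_unlab):
    """Return the index of the labeled instance whose removal
    minimizes the model's entropy over the unlabeled data set."""
    best_i, best_h = None, np.inf
    for i in range(len(X_lab)):
        m = clone(model).fit(np.delete(X_lab, i, axis=0),
                             np.delete(y_lab, i))
        proba = m.predict_proba(X_unlab)
        h = -np.sum(proba * np.log(proba + 1e-12))   # total entropy over unlabeled set
        if h < best_h:
            best_i, best_h = i, h
    return best_i
```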

Backward Learning: Process the Backward Instance
- Undo: eliminate the negative influence by removing the instance from the training set. Suitable for the instance-level method.
- Redo:
  - Relabel: the backward instance is returned to be labeled a second time. If the new label is the same as the previous one, the instance is copied twice; otherwise the old label is replaced with the new one.
  - Reselect: find the nearest neighbour of the backward instance in the unlabeled data set; the probability that the neighbour and the backward instance share the same label is high. Query the neighbour's label and add it to the training set. (A sketch follows below.)
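A minimal sketch of the Reselect strategy under a substitution reading (an assumption: the suspicious instance is dropped and the queried neighbour takes its place); oracle is again a hypothetical labeling callable.

```python
import numpy as np

def reselect(X_lab, y_lab, X_unlab, backward_i, oracle):
    """Replace the suspicious labeled instance with its nearest
    unlabeled neighbour and query that neighbour's label."""
    dists = np.linalg.norm(X_unlab - X_lab[backward_i], axis=1)
    j = int(np.argmin(dists))                     # nearest neighbour in unlabeled set
    X_lab = np.delete(X_lab, backward_i, axis=0)  # drop suspicious instance (assumption)
    y_lab = np.delete(y_lab, backward_i)
    X_lab = np.vstack([X_lab, X_unlab[j]])        # add the neighbour to the training set
    y_lab = np.append(y_lab, oracle(X_unlab[j]))
    X_unlab = np.delete(X_unlab, j, axis=0)
    return X_lab, y_lab, X_unlab
```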

BDAL Algorithm
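The algorithm slide itself is a figure and is not reproduced in the transcript. A rough sketch of the overall loop, composed from the helpers above (the pairing of detection and processing strategies here is an assumption, not the authors' exact algorithm):

```python
def bdal(model, X_lab, y_lab, X_unlab, oracle, n_iters=100):
    """Sketch of the BDAL loop: one forward step, then one backward step."""
    model.fit(X_lab, y_lab)
    for _ in range(n_iters):
        # Forward: query and add the most informative unlabeled instance.
        model, X_lab, y_lab, X_unlab = forward_step(model, X_lab, y_lab, X_unlab, oracle)
        # Backward: detect a suspicious labeled instance and process it (Reselect here).
        i = detect_backward_instance(model, X_lab, y_lab, X_unlab)
        X_lab, y_lab, X_unlab = reselect(X_lab, y_lab, X_unlab, i, oracle)
        model.fit(X_lab, y_lab)
    return model
```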

Experiments
A. Synthetic Data Classification
B. Handwritten Digit Classification
C. Image Classification
D. Patent Document Classification

A. Synthetic Data Classification
Two-class synthetic data: 410 instances, 205 per class.
5 instances from each class are selected randomly for initial training.

A. Synthetic Data Classification

B. Handwritten Digit Classification
MNIST data set: 10,000 test instances; each image is 28 × 28 pixels.
100 images are randomly picked for initial training.
For each model update, 100 images are labeled with 5% label noise.
Results are averaged over 20 runs.

B. Handwritten Digit Classification

C. Image Classification
50 categories of images (e.g., car, ship, human), each containing 100 images.
10 images from each category are used for initial training, i.e., 500 images of initial training data in total.
For each model update, 500 images are labeled with 5% label noise.
Results are averaged over 50 runs.

C. Image Classification

D. Patent Document Classification
5,000 patents from the Innography database, all manually classified by domain experts into 5 classes.
5,484 terms are extracted from the raw text; PCA is then used for dimension reduction to a 150-D feature vector. (A sketch of this step follows below.)
50 instances are picked randomly as initial training data.
For each model update, 100 instances are labeled with 5% label noise.
Results are averaged over 20 runs.
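A minimal sketch of the PCA dimension-reduction step using scikit-learn; X_terms here is a random stand-in for the hypothetical 5,000 × 5,484 patent term matrix.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for the patent term matrix (5,000 documents x 5,484 terms).
X_terms = np.random.rand(5000, 5484)

# Reduce the 5,484 extracted terms to a 150-D feature vector, as on the slide.
pca = PCA(n_components=150)
X_features = pca.fit_transform(X_terms)
print(X_features.shape)  # (5000, 150)
```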

D. Patent Document Classification

Conclusion
BDAL performs better in all the experiments: BDAL > UDAL > NonAL.
The Redo strategy achieves slightly better performance than UDAL.
The Undo strategy performs best in most of the experiments.

Future Plan
Different strategies can be adopted during the backward learning process, depending on the noise in the data.
Fast approximation algorithms will be studied for computational efficiency.

Thank You

Any Questions?