Semi-supervised Machine Learning
Gergana Lazarova, Sofia University “St. Kliment Ohridski”

Semi-Supervised Learning
- The training data consists of labeled examples and unlabeled examples.
- Usually, the number of unlabeled examples is much larger than the number of labeled ones.
- Unlabeled examples are easy to collect.

Self-Training
- An iterative procedure.
- At first, only the labeled instances are used to train a classifier.
- This classifier then predicts the labels of the unlabeled instances.
- A portion of the newly labeled examples (formerly unlabeled) augments the set of labeled examples, and the classifier is retrained.
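
A minimal sketch of this self-training loop in Python, assuming scikit-learn-style estimators; the logistic regression base learner, the confidence threshold and the iteration cap are illustrative assumptions, not details from the slides.

# Self-training: iteratively retrain on confidently self-labeled examples
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_l, y_l, X_u, threshold=0.95, max_iter=10):
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_iter):
        clf.fit(X_l, y_l)                        # 1. train on the current labeled set
        if len(X_u) == 0:
            break
        proba = clf.predict_proba(X_u)           # 2. predict labels for the unlabeled set
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        new_labels = clf.classes_[proba[confident].argmax(axis=1)]
        X_l = np.vstack([X_l, X_u[confident]])   # 3. move confident examples to the labeled set
        y_l = np.concatenate([y_l, new_labels])
        X_u = X_u[~confident]
    return clf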

Cluster-then-Label
- First, an unsupervised clustering algorithm groups all instances (labeled and unlabeled) into k clusters.
- Then, for each cluster Cj, a supervised learner is trained on the labeled examples belonging to Cj and is used to classify the unlabeled examples in Cj.
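
A minimal sketch of cluster-then-label under some assumptions: k-means for the clustering step, a small k-NN classifier per cluster, and a global majority-label fallback for clusters that contain no labeled points.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def cluster_then_label(X_l, y_l, X_u, k=3):
    X_all = np.vstack([X_l, X_u])
    clusters = KMeans(n_clusters=k, n_init=10).fit_predict(X_all)  # step 1: cluster all instances
    c_l, c_u = clusters[:len(X_l)], clusters[len(X_l):]
    y_u = np.empty(len(X_u), dtype=y_l.dtype)
    for j in range(k):
        in_l, in_u = c_l == j, c_u == j
        if not in_u.any():
            continue
        if in_l.any():
            # step 2: supervised learner trained on the labeled examples of cluster Cj
            clf = KNeighborsClassifier(n_neighbors=min(3, in_l.sum()))
            y_u[in_u] = clf.fit(X_l[in_l], y_l[in_l]).predict(X_u[in_u])
        else:
            vals, counts = np.unique(y_l, return_counts=True)
            y_u[in_u] = vals[counts.argmax()]                       # fallback (assumption)
    return y_u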

Semi-supervised Support Vector Machines

Semi-supervised Support Vector Machines
- Since unlabeled examples do not have labels, we do not know on which side of the decision boundary they lie.
- A “hat” loss function is therefore applied to the unlabeled examples (the slide illustrates the hat loss and the decision boundary).
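
The slide only names the hat loss; the form commonly used for the unlabeled term of semi-supervised SVMs (S3VMs), given here as a hedged reconstruction rather than the slide's own formula, is

\[
\hat{\ell}\bigl(f(x)\bigr) = \max\bigl(0,\; 1 - \lvert f(x) \rvert\bigr),
\]

where f(x) is the signed output of the decision function. Because the label is unknown, the absolute value is used: the loss is zero for unlabeled points far from the boundary on either side and large for points inside the margin, which pushes the boundary through low-density regions.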

Graph-based Semi-supervised Learning
- Graph-based semi-supervised learning constructs a graph from the training examples.
- The nodes of the graph are the data points (labeled and unlabeled), and the edges represent similarities between points.
Fig. 1 A semi-supervised graph

Graph-based Semi-supervised Learning
- An edge between two vertices carries a weight wij that represents their similarity: the closer two vertices are, the higher the value of wij.
- MinCut algorithm: find a minimum set of edges whose removal blocks all flow from one class to the other.
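
A minimal sketch of constructing such a weighted graph; the Gaussian (RBF) similarity weight and the k-nearest-neighbour sparsification used here are common choices assumed for illustration, not prescribed by the slides.

import numpy as np

def similarity_graph(X, sigma=1.0, k=10):
    """Symmetric weight matrix W with w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # squared pairwise distances
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # keep only each node's k strongest edges, then symmetrize (assumption)
    keep = np.argsort(-W, axis=1)[:, :k]
    mask = np.zeros_like(W, dtype=bool)
    mask[np.arange(n)[:, None], keep] = True
    return np.where(mask | mask.T, W, 0.0)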

Semi-supervised Multi-view Learning
Fig. 2 Semi-supervised Multi-view Learning

Multi-View Learning: examples
Fig. 3 Multiple Sources of Information

Semi-supervised Multi-view Learning
- Co-training: the algorithm augments the set of labeled examples of each classifier based on the other learner's predictions.
- Assumptions: (1) each view (set of features) is sufficient for classification on its own; (2) the two views (feature sets of each instance) are conditionally independent given the class.
- Co-EM: an EM-based variant of co-training.
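
A minimal sketch of the co-training loop described above, assuming naive Bayes learners for the two views and a shared labeled pool; the per-round quota and the choice of base learner are illustrative assumptions.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_train(X1_l, X2_l, y_l, X1_u, X2_u, rounds=10, per_round=5):
    clf1, clf2 = GaussianNB(), GaussianNB()
    for _ in range(rounds):
        clf1.fit(X1_l, y_l)                       # one classifier per view
        clf2.fit(X2_l, y_l)
        if len(X1_u) == 0:
            break
        chosen, new_labels = set(), {}
        for clf, X_view in ((clf1, X1_u), (clf2, X2_u)):
            proba = clf.predict_proba(X_view)
            # each view offers its most confident unlabeled examples to the other
            for i in np.argsort(-proba.max(axis=1))[:per_round]:
                if i not in chosen:
                    chosen.add(i)
                    new_labels[i] = clf.classes_[proba[i].argmax()]
        idx = np.array(sorted(chosen))
        labels = np.array([new_labels[i] for i in idx])
        X1_l = np.vstack([X1_l, X1_u[idx]])       # both views of a chosen example join the labeled pool
        X2_l = np.vstack([X2_l, X2_u[idx]])
        y_l = np.concatenate([y_l, labels])
        keep = np.setdiff1d(np.arange(len(X1_u)), idx)
        X1_u, X2_u = X1_u[keep], X2_u[keep]
    return clf1, clf2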

Multi-View Learning: error minimization
- Loss function: measures the amount of loss incurred by a prediction.
- Risk: the risk associated with f is defined as the expectation of the loss function.
- Empirical risk: the average loss of f on a labeled training set.
- Multi-view minimization problem.
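
The slide names these quantities without formulas; as a hedged reconstruction, the standard definitions and one common (co-regularization-style) form of the multi-view objective are

\[
R(f) = \mathbb{E}_{(x,y) \sim P}\bigl[L(f(x), y)\bigr],
\qquad
\hat{R}(f) = \frac{1}{l} \sum_{i=1}^{l} L\bigl(f(x_i), y_i\bigr),
\]
\[
\min_{f^{(1)}, \ldots, f^{(k)}} \; \sum_{v=1}^{k} \hat{R}\bigl(f^{(v)}\bigr)
\;+\; \lambda \sum_{u < v} \sum_{x \in U} \bigl(f^{(u)}(x) - f^{(v)}(x)\bigr)^{2},
\]

where L is the loss function, l the number of labeled examples and U the unlabeled set; the disagreement weight λ is an assumption of this sketch, and the exact objective used in the presented algorithm may differ.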

Semi-supervised Multi-view Genetic Algorithm
- Minimizes the semi-supervised multi-view learning error.
- Can be applied to multiple sources of data.
- Works for both convex and non-convex functions. Approaches based on gradient descent require a convex function; when a function is not convex, the optimization problem is hard.

Semi-supervised Multi-view Genetic Algorithm
- Individual (chromosome): one block of weights per view:
  view 1: w11 … w1s | view j: wj1 … wjl | view k: wk1 … wkp
- Fitness function: evaluates a chromosome by the multi-view learning error of the weights it encodes.
- Crossover and mutation must not change the size of the chromosome and must not mix the features of different views.
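
A minimal sketch of such view-preserving genetic operators; the one-point-per-view crossover and the Gaussian mutation are illustrative choices, and the fitness function is left out because its exact form is given on the slide rather than here.

import numpy as np

rng = np.random.default_rng(0)

def make_individual(view_sizes):
    """A chromosome: one weight block per view, e.g. [w11..w1s], ..., [wk1..wkp]."""
    return [rng.normal(size=s) for s in view_sizes]

def crossover(parent_a, parent_b):
    """Recombine block by block, so block sizes are preserved and views never mix."""
    child = []
    for block_a, block_b in zip(parent_a, parent_b):
        cut = rng.integers(1, len(block_a)) if len(block_a) > 1 else 0
        child.append(np.concatenate([block_a[:cut], block_b[cut:]]))
    return child

def mutate(individual, rate=0.05, scale=0.1):
    """Perturb a few genes in place, again within view boundaries only."""
    for block in individual:
        mask = rng.random(len(block)) < rate
        block[mask] += rng.normal(scale=scale, size=int(mask.sum()))
    return individual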

Experimental Results
- “Diabetes” dataset (UCI Machine Learning Repository)
- Views: k = 2, x = (x(1), x(2))
- MAX_ITER = 20000, N = 100
- Comparison to supervised equivalents:

Table 2. Comparison to supervised equivalents
  Algorithm                         % labeled examples    RMSE
  SSMVGA                            3%                    0.63
  Linear regression                 90%                   0.40
  kNN                                                     0.45
  Backpropagation (steps = 5000)                          0.54

Sentiment Analysis in Bulgarian
- Most of the research has been conducted for English.
- Sentiment analysis in Bulgarian suffers from a shortage of labeled examples.
- A sentiment analysis system for Bulgarian: each instance has attributes from multiple sources of data (a Bulgarian view and an English view).

Dataset
- English reviews: Amazon
- Bulgarian reviews: www.cinexio.com

Big Data
- Bulgarian view: 17099 features
- English view: 12391 features
Fig. 4 Big Data - Modelling

Examples (1)
- Rating: **
- F(SSMVGA) = 1.965
- F(supervised) = 3.13

Examples (2)
- Rating: **
- F(SSMVGA) = 1.985
- F(supervised) = 1.98

Examples (3)
- Rating: *****
- F(SSMVGA) = 1.985
- F(supervised) = 1.98

Multi-view Teaching Algorithm
- A semi-supervised two-view learning algorithm; a modification of the standard co-training algorithm.
- Only the weaker classifier is improved, using only the most confident examples of the stronger view.
- Combining the views.
- Application: object segmentation.
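
A minimal sketch of one possible reading of the teaching step described above: the stronger view's classifier labels its most confident unlabeled examples, and only the weaker view's classifier is retrained on them. The naive Bayes learners, the per-round quota, and the assumption that the stronger/weaker roles are given as inputs are all illustrative choices, not details from the slides.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def teach_weaker(X_strong_l, X_weak_l, y_l, X_strong_u, X_weak_u,
                 rounds=10, per_round=20):
    strong = GaussianNB().fit(X_strong_l, y_l)   # stronger view: trained once on the labeled data
    weak = GaussianNB()
    X_weak_l, y_weak = X_weak_l.copy(), y_l.copy()
    for _ in range(rounds):
        weak.fit(X_weak_l, y_weak)               # weaker view: retrained every round
        if len(X_strong_u) == 0:
            break
        proba = strong.predict_proba(X_strong_u)
        order = np.argsort(-proba.max(axis=1))[:per_round]   # most confident examples of the stronger view
        labels = strong.classes_[proba[order].argmax(axis=1)]
        X_weak_l = np.vstack([X_weak_l, X_weak_u[order]])
        y_weak = np.concatenate([y_weak, labels])
        keep = np.setdiff1d(np.arange(len(X_strong_u)), order)
        X_strong_u, X_weak_u = X_strong_u[keep], X_weak_u[keep]
    return strong, weak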

A Semi-supervised Image Segmentation System
- A “teacher” labels a few points of each class, giving the algorithm an idea of the clusters.
- The aim is to augment the training set with more labeled examples, reaching a better predictor.
- The first view contains the coordinates of the pixels: view1 = (x, y).
- The second view contains the RGB values of the pixels (red, green and blue values ranging from 0 to 255).
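
A minimal sketch of extracting the two pixel views from an image; the use of Pillow/NumPy and the flattening order are illustrative assumptions.

import numpy as np
from PIL import Image

def pixel_views(path):
    """Return view 1 (pixel coordinates) and view 2 (RGB values) for every pixel."""
    img = np.asarray(Image.open(path).convert("RGB"))   # H x W x 3, values in 0..255
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    view1 = np.stack([xs.ravel(), ys.ravel()], axis=1)  # (x, y) coordinates
    view2 = img.reshape(-1, 3).astype(float)            # (R, G, B) values
    return view1, view2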

Dataset
Fig. 5 Original image, desired segmentation

Experimental Results
Two experiments:
1. Comparison of the multi-view teaching algorithm, with naïve Bayes classifiers as the underlying learners, to a supervised naïve Bayes classifier.
2. Comparison of the multi-view teaching algorithm based on a multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on a multivariate normal distribution (MND-SL).

Results (1)
- Comparison of the multi-view teaching algorithm (with naïve Bayes classifiers as the underlying learners) to a supervised naïve Bayes classifier.
- The image consists of 50700 pixels. At each cross-validation step only a small number of labeled pixels is used.
- Multiple tests were run, varying the number of labeled examples (4, 6, 10, 16, 20, 50 pixels).

Table 4. Accuracy by number of labeled examples
  Algorithm   4        6        10       16       20       50
  NB          63.30%   76.23%   85.44%   89.57%   90.33%   92.37%
  MTA         68.62%   81.30%   88.14%   90.74%   91.24%   92.51%

Results (1)
- Comparison of the multi-view teaching algorithm (with naïve Bayes classifiers as the underlying learners) to a supervised naïve Bayes classifier, using 16 labeled examples.

Table 5. Comparison of the NB and MVTA algorithms
  Image     MVTA     NB
  Image 1   90.74%   89.57%
  Image 2   80.76%   78.82%
  Image 3   90.10%   89.12%

Results (2)
- Comparison of the multi-view teaching algorithm based on a multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on a multivariate normal distribution (MND-SL), using 16 labeled examples.

Table 6. Comparison of MND-MVTA and MND-SL
  Image     MND-MVTA   MND-SL
  Image 1   84.36%     79.22%
  Image 2   79.14%     73.74%
  Image 3   86.02%     80.18%

Examples
(Images on the slide: Multi-view Teaching vs. Naïve Bayes Supervised)

Examples
(Images on the slide: Multi-view Teaching vs. Naïve Bayes Supervised)

Thank you for your attention!