Semi-supervised Machine Learning
Gergana Lazarova
Sofia University “St. Kliment Ohridski”
Semi-Supervised Learning
- Training data consists of labeled and unlabeled examples.
- Usually the number of unlabeled examples is much larger than the number of labeled ones.
- Unlabeled examples are easy to collect.
Self-Training
- At first, only the labeled instances are used for learning.
- The resulting classifier then predicts the labels of the unlabeled instances.
- A portion of the newly labeled examples (formerly unlabeled) augments the set of labeled examples, and the classifier is retrained.
- The procedure is iterative (see the sketch below).
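A minimal self-training sketch, assuming scikit-learn-style base learners (LogisticRegression stands in for any classifier with fit/predict_proba; the confidence threshold and iteration cap are illustrative choices, not values from the talk):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, threshold=0.95, max_iter=10):
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_iter):
        clf.fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        pred = clf.predict(X_unlab)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # nothing is predicted confidently enough; stop early
        # Move the confidently pseudo-labeled examples into the labeled set
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, pred[confident]])
        X_unlab = X_unlab[~confident]
    return clf
```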
Cluster-then-label first clusters all instances (labeled and unlabeled) into k groups using an unsupervised clustering algorithm. Then, for each cluster Cj, a supervised learner is trained on the labeled examples in Cj and used to classify the unlabeled examples that belong to Cj (see the sketch below).
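A minimal cluster-then-label sketch, assuming KMeans as the unsupervised clusterer and logistic regression as the per-cluster supervised learner (both are illustrative stand-ins; k and the fallback rule for label-free clusters are assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def cluster_then_label(X_lab, y_lab, X_unlab, k=5):
    X_all = np.vstack([X_lab, X_unlab])
    clusters = KMeans(n_clusters=k, n_init=10).fit_predict(X_all)
    lab_c, unlab_c = clusters[:len(X_lab)], clusters[len(X_lab):]
    # Fallback: a global classifier covers clusters without labeled points
    global_clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    y_pred = global_clf.predict(X_unlab)
    for j in range(k):
        in_lab, in_unlab = lab_c == j, unlab_c == j
        if not in_unlab.any() or not in_lab.any():
            continue
        if len(np.unique(y_lab[in_lab])) > 1:
            # Train a supervised learner on the labeled points in cluster j
            clf = LogisticRegression(max_iter=1000)
            clf.fit(X_lab[in_lab], y_lab[in_lab])
            y_pred[in_unlab] = clf.predict(X_unlab[in_unlab])
        else:
            # Only one class present in the cluster: assign it directly
            y_pred[in_unlab] = y_lab[in_lab][0]
    return y_pred
```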
Semi-supervised Support Vector Machines
Since unlabeled examples have no labels, we do not know on which side of the decision boundary they lie. Unlabeled points are therefore penalized with the hat loss, max(0, 1 − |f(x)|), which is zero when a point lies confidently on either side of the boundary and largest on the boundary itself.
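A tiny sketch of the hat loss alongside the usual hinge loss for labeled points, to make the symmetry explicit (f_x is the classifier's real-valued output):

```python
import numpy as np

def hinge_loss(f_x, y):
    """Hinge loss for labeled points, with y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * f_x)

def hat_loss(f_x):
    """Hat loss for unlabeled points: penalizes |f(x)| < 1, i.e. points
    inside the margin, regardless of which side of the boundary they fall."""
    return np.maximum(0.0, 1.0 - np.abs(f_x))
```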
Graph-based Semi-supervised Learning
Graph-based semi-supervised learning constructs a graph from the training examples. The nodes of the graph are the data points (labeled and unlabeled) and the edges represent similarities between points.
Fig. 1 A semi-supervised graph
An edge between two vertices i and j carries a weight w_ij representing their similarity: the closer the two vertices, the higher w_ij. The MinCut algorithm finds a minimum set of edges whose removal blocks all flow from one class's labeled vertices to the other's; the resulting partition labels the unlabeled vertices (see the sketch below).
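A MinCut sketch using networkx for the flow computation, assuming binary labels and a Gaussian similarity kernel (the kernel, sigma, and the dense O(n²) graph construction are illustrative assumptions):

```python
import numpy as np
import networkx as nx

def mincut_ssl(X, y, labeled_idx, sigma=1.0):
    n = len(X)
    G = nx.DiGraph()
    # Symmetric similarity edges (Gaussian kernel) between all point pairs
    for i in range(n):
        for j in range(i + 1, n):
            w = float(np.exp(-np.sum((X[i] - X[j]) ** 2) / (2 * sigma ** 2)))
            G.add_edge(i, j, capacity=w)
            G.add_edge(j, i, capacity=w)
    # Labeled points are tied to virtual terminals; edges without a
    # capacity attribute are treated as having infinite capacity
    for i in labeled_idx:
        if y[i] == 1:
            G.add_edge('s', i)
        else:
            G.add_edge(i, 't')
    _, (source_side, _) = nx.minimum_cut(G, 's', 't')
    return np.array([1 if i in source_side else 0 for i in range(n)])
```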
Semi-supervised Multi-view Learning
Fig. 2 Semi-supervised Multi-view Learning
Multi-View Learning – examples
Fig. 3 – Multiple Sources of Information
Semi-supervised Multi-view Learning
Co-training: the algorithm augments the set of labeled examples of each classifier based on the other learner's predictions (see the sketch below). It assumes that:
(1) each view (set of features) is sufficient for classification;
(2) the two views (the feature sets of each instance) are conditionally independent given the class.
Co-EM is a related variant that exchanges probabilistic labels for all unlabeled examples at every iteration instead of committing to only the most confident ones.
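A co-training sketch with Gaussian naïve Bayes as both base learners (the learner choice, the number of rounds, and the per-round count are illustrative):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_train(X1, X2, y, labeled_idx, rounds=10, per_round=2):
    lab = list(labeled_idx)
    unlab = [i for i in range(len(y)) if i not in set(lab)]
    y = np.array(y)
    c1, c2 = GaussianNB(), GaussianNB()
    for _ in range(rounds):
        c1.fit(X1[lab], y[lab])
        c2.fit(X2[lab], y[lab])
        if not unlab:
            break
        # Each view labels the examples it is most confident about and
        # hands them to the shared labeled pool for the other view
        for clf, X in ((c1, X1), (c2, X2)):
            if not unlab:
                break
            proba = clf.predict_proba(X[unlab])
            best = np.argsort(proba.max(axis=1))[-per_round:]
            for b in sorted(best, reverse=True):
                i = unlab.pop(b)
                y[i] = clf.classes_[proba[b].argmax()]
                lab.append(i)
    return c1, c2
```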
Multi-View Learning – error minimization
Loss function: measures the cost of the prediction f(x) when the true label is y.
Risk: the risk associated with f is defined as the expectation of the loss function.
Empirical risk: the average loss of f on a labeled training set.
Multi-view minimization problem: typically minimizes each view's empirical risk together with a penalty on disagreement between the views on unlabeled data (a standard form is sketched below).
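The definitions above written out, together with a common co-regularized form of the two-view objective (a sketch of the standard formulation; the weight λ and the squared-difference disagreement term are assumptions, not necessarily the exact objective of the talk):

```latex
% Risk and empirical risk
R(f) = \mathbb{E}_{(x,y)}\bigl[L(f(x), y)\bigr], \qquad
\hat{R}(f) = \frac{1}{l}\sum_{i=1}^{l} L\bigl(f(x_i), y_i\bigr)

% Co-regularized two-view objective (sketch)
\min_{f_1, f_2}\;
\sum_{i=1}^{l} L\bigl(f_1(x_i^{(1)}), y_i\bigr)
+ \sum_{i=1}^{l} L\bigl(f_2(x_i^{(2)}), y_i\bigr)
+ \lambda \sum_{i=l+1}^{l+u} \bigl(f_1(x_i^{(1)}) - f_2(x_i^{(2)})\bigr)^{2}
```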
Semi-supervised Multi-view Genetic Algorithm
- Minimizes the semi-supervised multi-view learning error.
- It can be applied to multiple sources of data.
- It works for both convex and non-convex functions; approaches based on gradient descent are only guaranteed to find the global optimum of a convex function, and non-convex optimization is a hard problem.
Individual (chromosome): the concatenated weight vectors of all views —
view 1: w11 … w1s | … | view j: wj1 … wjl | … | view k: wk1 … wkp
Fitness function: the semi-supervised multi-view learning error of the weights encoded by the individual.
Crossover and mutation must not change the size of the chromosome and must not mix the features of different views (see the operators sketched below).
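A sketch of view-respecting GA operators, assuming a real-valued chromosome stored as one NumPy array with known per-view block sizes and at least two views (the mutation rate and scale are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def crossover(parent_a, parent_b, view_sizes):
    """One-point crossover at a view boundary: chromosome size is preserved
    and features of different views are never mixed."""
    bounds = np.cumsum(view_sizes)[:-1]   # cut points between view blocks
    cut = rng.choice(bounds)
    return np.concatenate([parent_a[:cut], parent_b[cut:]])

def mutate(chrom, rate=0.05, scale=0.1):
    """Gaussian perturbation of individual genes; size stays fixed."""
    mask = rng.random(len(chrom)) < rate
    chrom = chrom.copy()
    chrom[mask] += rng.normal(0.0, scale, mask.sum())
    return chrom
```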
Experimental Results: “Diabetes” (UCI Machine Learning Repository)
Views: k = 2, x = (x(1), x(2)); MAX_ITER = 20000, N = 100.

Table 2. Comparison to supervised equivalents

Algorithm                        % labeled examples   RMSE
SSMVGA                           3%                   0.63
Linear regression                90%                  0.40
kNN                              90%                  0.45
Backpropagation (steps = 5000)   90%                  0.54
Sentiment analysis in Bulgarian
- Most sentiment-analysis research has been conducted on English text.
- Sentiment analysis in Bulgarian suffers from a shortage of labeled examples.
- A sentiment analysis system for Bulgarian: each instance has attributes from multiple sources of data (a Bulgarian view and an English view).
Dataset
English reviews – Amazon
Bulgarian reviews –
Big Data
Bulgarian view: 17099 features
English view: 12391 features
Fig. 4 Big Data - Modelling
Examples (1)
Rating: **
F(SSMVGA) = F(supervised) = 3.13
Examples (2)
Rating: **
F(SSMVGA) = F(supervised) = 1.98
Examples (3)
Rating: *****
F(SSMVGA) = F(supervised) = 1.98
Multi-view Teaching Algorithm
- A semi-supervised two-view learning algorithm.
- A modification of the standard co-training algorithm: only the weaker classifier is improved, using only the most confident examples of the stronger view (see the sketch below).
- The views are combined for the final prediction.
- Application: object segmentation.
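A sketch of the teaching idea under the assumptions stated above: the stronger view (here judged by accuracy on the original labeled set — an assumption, the talk may use another criterion) pseudo-labels its most confident unlabeled examples, and only the weaker classifier is retrained on them. Gaussian naïve Bayes is an illustrative base learner:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def teach(X1_l, X2_l, y_l, X1_u, X2_u, per_round=10, rounds=5):
    # Per-view labeled pools; each grows as pseudo-labels arrive for it
    A1, y1 = X1_l.copy(), y_l.copy()
    A2, y2 = X2_l.copy(), y_l.copy()
    c1 = GaussianNB().fit(A1, y1)
    c2 = GaussianNB().fit(A2, y2)
    for _ in range(rounds):
        if len(X1_u) == 0:
            break
        # The stronger view teaches; only the weaker one is retrained
        if c1.score(X1_l, y_l) >= c2.score(X2_l, y_l):
            proba = c1.predict_proba(X1_u)
            top = np.argsort(proba.max(axis=1))[-per_round:]
            A2 = np.vstack([A2, X2_u[top]])
            y2 = np.concatenate([y2, c1.classes_[proba[top].argmax(axis=1)]])
            c2 = GaussianNB().fit(A2, y2)
        else:
            proba = c2.predict_proba(X2_u)
            top = np.argsort(proba.max(axis=1))[-per_round:]
            A1 = np.vstack([A1, X1_u[top]])
            y1 = np.concatenate([y1, c2.classes_[proba[top].argmax(axis=1)]])
            c1 = GaussianNB().fit(A1, y1)
        keep = np.setdiff1d(np.arange(len(X1_u)), top)
        X1_u, X2_u = X1_u[keep], X2_u[keep]
    return c1, c2
```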
A Semi-supervised Image Segmentation System
- A “teacher” labels a few points of each class, giving the algorithm an idea of the clusters.
- The aim is to augment the training set with more labeled examples, yielding a better predictor.
- The first view contains the coordinates of the pixels: view1 = (x, y).
- The second view contains the RGB values of the pixels (red, green and blue values ranging from 0 to 255); see the sketch below.
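A small sketch of building the two pixel views from an image, assuming an H×W×3 uint8 RGB array (e.g. as loaded by PIL or imageio):

```python
import numpy as np

def pixel_views(img):
    """Split an H x W x 3 RGB image into the two views used by the system."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    view1 = np.stack([xs.ravel(), ys.ravel()], axis=1)  # coordinates (x, y)
    view2 = img.reshape(-1, 3).astype(float)            # colors (r, g, b)
    return view1, view2
```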
Dataset
Fig. 5 – Original image, desired segmentation
Experimental Results
Two experiments:
1. Comparison of the multi-view teaching algorithm based on naïve Bayes classifiers (as the underlying learners) to a supervised naïve Bayes classifier.
2. Comparison of the multi-view teaching algorithm based on a multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on a multivariate normal distribution (MND-SL).
Results (1): comparison of the multi-view teaching algorithm based on naïve Bayes classifiers (as the underlying learners) to a supervised naïve Bayes classifier. Each pixel of the image is an instance; at each cross-validation step only a small number of labeled pixels is used. Multiple tests were run, varying the number of labeled examples (4, 6, 10, 16, 20, 50 pixels).

Table 4. Accuracy based on the number of labeled examples

Algorithm    4        6        10       16       20       50
NB           63.30%   76.23%   85.44%   89.57%   90.33%   92.37%
MTA          68.62%   81.30%   88.14%   90.74%   91.24%   92.51%
Results (1), continued: NB vs. MVTA with 16 labeled examples.

Table 5. Comparison of the NB and MVTA algorithms

           MVTA     NB
Image 1    90.74%   89.57%
Image 2    80.76%   78.82%
Image 3    90.10%   89.12%
Results (2): comparison of the multi-view teaching algorithm based on a multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on a multivariate normal distribution (MND-SL), with 16 labeled examples.

Table 6. Comparison of MND-MVTA and MND-SL

           MND-MVTA   MND-SL
Image 1    84.36%     79.22%
Image 2    79.14%     73.74%
Image 3    86.02%     80.18%
Examples: segmentation results of the multi-view teaching algorithm vs. the supervised naïve Bayes classifier
Thank you for your attention!