Semi-supervised Machine Learning Gergana Lazarova

Presentation Transcript


  1. Semi-supervised Machine Learning. Gergana Lazarova, Sofia University “St. Kliment Ohridski”

  2. Semi-Supervised Learning • Labeled examples • Unlabeled examples • Training data • Usually, the number of unlabeled examples is much larger than the number of labeled ones • Unlabeled examples are easy to collect
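
As a minimal illustration (not part of the slides), such a training set can be represented as a feature matrix plus a label vector in which unlabeled examples carry a sentinel value; -1 is the convention used by scikit-learn's semi-supervised module:

```python
import numpy as np

# Toy semi-supervised training set: a few labeled examples and many
# unlabeled ones. Following scikit-learn's convention, unlabeled
# examples carry the sentinel label -1.
X = np.random.rand(100, 2)       # 100 examples, 2 features
y = np.full(100, -1)             # everything starts unlabeled
y[:5] = [0, 1, 0, 1, 1]          # only a handful of labels are available
```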

  3. Self-Training • At first, only the labeled instances are used for learning • After that, this classifier predicts the labels of the unlabeled instances • A portion of the newly labeled examples (formerly unlabeled) augments the set of labeled examples, and the classifier is retrained • The procedure is iterative (see the sketch below)
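
A minimal sketch of this loop, assuming a scikit-learn-style classifier; the confidence threshold used to pick the "portion" of newly labeled examples is an assumption, since the slide does not specify the selection rule:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_training(X_lab, y_lab, X_unlab, threshold=0.95, max_iter=10):
    """Iteratively absorb confidently predicted unlabeled examples."""
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_iter):
        clf.fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        confident = proba.max(axis=1) >= threshold   # pick confident predictions
        if not confident.any():
            break
        new_labels = clf.classes_[proba[confident].argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[confident]])     # augment labeled set
        y_lab = np.concatenate([y_lab, new_labels])
        X_unlab = X_unlab[~confident]                      # shrink unlabeled set
    return clf
```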

  4. Cluster-then-label • It first clusters the instances (labeled and unlabeled) into k groups, using an unsupervised clustering algorithm • After that, for each cluster Cj, a supervised classifier is trained on the labeled examples in Cj and used to classify the unlabeled examples belonging to Cj
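
A sketch of the procedure with illustrative choices for both learners (KMeans for the clustering step, logistic regression per cluster); the slide does not name specific algorithms:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def cluster_then_label(X_lab, y_lab, X_unlab, k=3):
    """Cluster all points, then train one supervised model per cluster."""
    X_all = np.vstack([X_lab, X_unlab])
    clusters = KMeans(n_clusters=k, n_init=10).fit_predict(X_all)
    c_lab, c_unlab = clusters[:len(X_lab)], clusters[len(X_lab):]
    y_pred = np.full(len(X_unlab), -1, dtype=y_lab.dtype)
    for j in range(k):
        in_l, in_u = c_lab == j, c_unlab == j
        if not in_u.any() or not in_l.any():
            continue                      # no unlabeled points or no labels in Cj
        classes = np.unique(y_lab[in_l])
        if len(classes) == 1:             # only one class in Cj: assign it directly
            y_pred[in_u] = classes[0]
        else:                             # train on the labeled examples in Cj
            clf = LogisticRegression().fit(X_lab[in_l], y_lab[in_l])
            y_pred[in_u] = clf.predict(X_unlab[in_u])
    return y_pred
```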

  5. Semi-supervised Support Vector Machines

  6. Semi-supervised Support Vector Machines • Since unlabeled examples do not have labels, we do not know on which side of the decision boundary they lie • Hat loss function (see below) • Decision boundary
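
The slide's formula image is not preserved; in the standard S3VM formulation, the hat loss penalizes unlabeled points that fall inside the margin, regardless of which side of the boundary they are on:

```latex
% Hat loss on an unlabeled example x (standard S3VM form):
%   f(x) = w^T x + b is the decision function
\hat{L}(x) = \max\bigl(0,\; 1 - \lvert f(x) \rvert \bigr)
```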

  7. Graph-based Semi-supervised Learning • Graph-based semi-supervised learning constructs a graph from the training examples • The nodes of the graph are data points (labeled and unlabeled), and the edges represent similarities between points
Fig. 1: A semi-supervised graph

  8. Graph-based Semi-supervised Learning • An edge between two vertices represents the similarity wij between them; the closer two vertices are, the higher the value of wij • MinCut algorithm: find a minimum-weight set of edges whose removal blocks all flow from one class to the other
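
A sketch of the MinCut idea using networkx (an illustrative library choice, not named on the slide): labeled positive nodes are wired to a source, labeled negatives to a sink, and the minimum s-t cut partitions the remaining nodes into the two classes:

```python
import networkx as nx

def mincut_ssl(similarities, positives, negatives):
    """similarities: dict {(i, j): w_ij}; positives/negatives: labeled node ids."""
    G = nx.DiGraph()
    for (i, j), w in similarities.items():   # similarity edges, both directions
        G.add_edge(i, j, capacity=w)
        G.add_edge(j, i, capacity=w)
    # Terminal edges carry no 'capacity' attribute, which networkx treats
    # as infinite: the cut can never separate a labeled node from its class.
    for p in positives:
        G.add_edge("source", p)
    for n in negatives:
        G.add_edge(n, "sink")
    _, (pos_side, neg_side) = nx.minimum_cut(G, "source", "sink")
    return pos_side - {"source"}, neg_side - {"sink"}
```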

  9. Semi-supervised Multi-view Learning
Fig. 2: Semi-supervised multi-view learning

  10. Multi-View Learning: examples
Fig. 3: Multiple sources of information

  11. Semi-supervised Multi-view Learning • Co-training: the algorithm augments the set of labeled examples of each classifier based on the other learner's predictions. Its assumptions: (1) each view (set of features) is sufficient for classification; (2) the two views (feature sets of each instance) are conditionally independent given the class • Co-EM is a related variant
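
A condensed sketch of co-training, assuming scikit-learn-style learners and using -1 for unlabeled examples; the number of examples exchanged per round is an illustrative choice:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_training(X1, X2, y, n_rounds=10, per_round=5):
    """X1, X2: the two views of the data; y: labels, with -1 for unlabeled."""
    clf1, clf2 = GaussianNB(), GaussianNB()
    y = y.copy()
    for _ in range(n_rounds):
        labeled = y != -1
        if labeled.all():
            break
        clf1.fit(X1[labeled], y[labeled])
        clf2.fit(X2[labeled], y[labeled])
        # Each classifier labels the unlabeled examples it is most confident
        # about; those labels augment the shared labeled set, so each learner
        # benefits from the other's predictions.
        for clf, X in ((clf1, X1), (clf2, X2)):
            un_idx = np.flatnonzero(y == -1)
            if len(un_idx) == 0:
                break
            proba = clf.predict_proba(X[un_idx])
            top = np.argsort(proba.max(axis=1))[-per_round:]  # most confident
            y[un_idx[top]] = clf.classes_[proba[top].argmax(axis=1)]
    return clf1, clf2
```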

  12. Multi-View Learning: error minimization • Loss function: measures the loss incurred by a prediction • Risk: the risk associated with f is defined as the expectation of the loss function • Empirical risk: the average loss of f on a labeled training set • Multi-view minimization problem (see below)
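
The slide's formulas are not preserved; in standard notation, and with a generic agreement term whose exact form on the slide is unknown, the quantities read:

```latex
% Risk: the expected loss of a predictor f
R(f) = \mathbb{E}_{(x,y)} \bigl[ L(y, f(x)) \bigr]

% Empirical risk: the average loss over n labeled training examples
\hat{R}(f) = \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr)

% Multi-view objective (generic form): per-view empirical risks plus an
% agreement term that couples the views on the unlabeled examples u in U
\min_{f_1, \dots, f_k} \sum_{v=1}^{k} \hat{R}(f_v)
  + \lambda \sum_{u \in U} \sum_{v < v'}
      \bigl( f_v(x_u^{(v)}) - f_{v'}(x_u^{(v')}) \bigr)^{2}
```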

  13. Semi-supervised Multi-view Genetic Algorithm • Minimizes the semi-supervised multi-view learning error • It can be applied to multiple sources of data • It works for both convex and non-convex functions; approaches based on gradient descent require a convex function to guarantee the global optimum, and non-convex optimization is a hard problem

  14. Semi-supervised Multi-view Genetic Algorithm • Individual • Fitness function • When applying crossover and mutation, do not change the size of the chromosome and do not mix the features of different views (see the sketch below)
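
A minimal sketch of the view-respecting genetic operators; the chromosome layout and mutation parameters are illustrative assumptions, since the slide's individual and fitness definitions are not preserved:

```python
import numpy as np

rng = np.random.default_rng(0)
VIEW_SLICES = [slice(0, 10), slice(10, 25)]   # assumed per-view segments

def crossover(parent_a, parent_b):
    """One-point crossover applied *within* each view, so features of
    different views never mix and the chromosome length is unchanged."""
    child = parent_a.copy()
    for sl in VIEW_SLICES:
        point = rng.integers(sl.start + 1, sl.stop)   # cut inside this view
        child[point:sl.stop] = parent_b[point:sl.stop]
    return child

def mutate(chrom, rate=0.01, scale=0.1):
    """Gaussian mutation of a few genes; the chromosome keeps its size."""
    mask = rng.random(chrom.shape) < rate
    chrom[mask] += rng.normal(0.0, scale, mask.sum())
    return chrom
```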

  15. Experimental Results • “Diabetes” (UCI Machine Learning Repository) • Views: k = 2, x = (x(1), x(2)) • MAX_ITER = 20000, N = 100 • Comparison to supervised equivalents
Table 2: Comparison to supervised equivalents

  16. Sentiment analysis in Bulgarian • Most of the research has been conducted for English. • Sentiment analysis in Bulgarian suffers from a shortage of labeled examples. • A sentiment analysis system for Bulgarian: each instance has attributes from multiple sources of data (a Bulgarian view and an English view)

  17. DataSet • English reviews: Amazon • Bulgarian reviews: www.cinexio.com

  18. Big Data • Bulgarian view: 17099 features • English view: 12391 features
Fig. 4: Big Data modelling

  19. Examples (1) • Rating: ** • F(SSMVGA) = 1.965; F(supervised) = 3.13

  20. Examples (2) • Rating: ** • F(SSMVGA) = 1.985; F(supervised) = 1.98

  21. Examples (3) • Rating: ***** • F(SSMVGA) = 1.985; F(supervised) = 1.98

  22. Multi-view Teaching Algorithm • A semi-supervised two-view learning algorithm • A modification of the standard co-training algorithm • Improves only the weaker classifier • Uses only the most confident examples of the stronger view • Combines the views • Application: object segmentation (a sketch follows below)
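
A sketch of the asymmetric update described above, assuming scikit-learn-style learners; how the stronger view is identified (e.g., held-out accuracy) and the number of examples transferred per round are assumptions:

```python
import numpy as np

def teaching_round(strong, weak, Xs_lab, Xw_lab, y_lab,
                   Xs_un, Xw_un, per_round=10):
    """One round: the stronger view labels its most confident unlabeled
    examples, and only the weaker classifier's training set is augmented."""
    strong.fit(Xs_lab, y_lab)
    proba = strong.predict_proba(Xs_un)
    top = np.argsort(proba.max(axis=1))[-per_round:]    # most confident picks
    y_new = strong.classes_[proba[top].argmax(axis=1)]
    Xw_lab = np.vstack([Xw_lab, Xw_un[top]])            # augment the weak view only
    y_lab = np.concatenate([y_lab, y_new])
    weak.fit(Xw_lab, y_lab)                             # only the weak learner improves
    return weak, Xw_lab, y_lab
```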

  23. A Semi-supervised Image Segmentation System • A “teacher” labels a few points of each class, giving the algorithm an initial idea of the clusters • The aim is to augment the training set with more labeled examples, yielding a better predictor • The first view contains the coordinates of the pixels: view1 = (X, Y) • The second view contains the RGB values of the pixels (red, green, and blue values ranging from 0 to 255); building the views is sketched below
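
Constructing the two views from an image is straightforward; a sketch in numpy (function and variable names are illustrative):

```python
import numpy as np

def build_views(image):
    """image: H x W x 3 uint8 array (RGB).
    Returns view1 = pixel coordinates, view2 = pixel colors."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    view1 = np.stack([xs.ravel(), ys.ravel()], axis=1)   # (x, y) per pixel
    view2 = image.reshape(-1, 3).astype(float)           # (r, g, b) in [0, 255]
    return view1, view2
```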

  24. DataSet
Fig. 5: Original image and desired segmentation
Fig. 6: Original image and desired segmentation
Fig. 7: Original image and desired segmentation

  25. Experimental Results • 2 experiments: • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (for the underlying learners), to a supervised naïve Bayes classifier • Comparison of the multi-view teaching algorithm based on a multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on a multivariate normal distribution (MND-SL)

  26. Results (1) • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (for the underlying learners), to a supervised naïve Bayes classifier • The image consists of 50700 pixels. At each cross-validation step only a small number of labeled pixels is used. Multiple tests were run, varying the number of labeled examples (4, 6, 10, 16, 20, 50 pixels).
Table 4: Accuracy based on the number of labeled examples

  27. Results (1) • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (for the underlying learners), to a supervised naïve Bayes classifier • 16 labeled examples
Table 5: Comparison of the NB and MVTA algorithms

  28. Results (2) • Comparison of the multi-view teaching algorithm based on a multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on a multivariate normal distribution (MND-SL) • 16 labeled examples
Table 6: Comparison of MND-MVTA and MND-SL

  29. Examples • Multi-view Teaching vs. Naïve Bayes Supervised

  30. Examples • Multi-view Teaching vs. Naïve Bayes Supervised

  31. Thank you! Thank you for your attention!
