Semi-supervised Machine Learning Gergana Lazarova

Presentation Transcript


  1. Semi-supervised Machine Learning. Gergana Lazarova, Sofia University “St. Kliment Ohridski”

  2. Semi-Supervised Learning • Labeled examples • Unlabeled examples • Training data • Usually, the number of unlabeled examples is much larger than the number of labeled ones • Unlabeled examples are easy to collect
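
As a minimal illustration (not part of the slides), such a training set can be represented as a feature matrix plus a label vector in which unlabeled examples carry a sentinel value; -1 is the convention used by scikit-learn's semi-supervised module:

```python
import numpy as np

# Toy semi-supervised training set: a few labeled examples and many
# unlabeled ones. Following scikit-learn's convention, unlabeled
# examples carry the sentinel label -1.
X = np.random.rand(100, 2)       # 100 examples, 2 features
y = np.full(100, -1)             # everything starts unlabeled
y[:5] = [0, 1, 0, 1, 1]          # only a handful of labels are available
```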

  3. Self-Training • At first, only the labeled instances are used for learning • After that, this classifier predicts the labels of the unlabeled instances • A portion of the newly labeled examples (formerly unlabeled) augments the set of labeled examples, and the classifier is retrained • The procedure is iterative (see the sketch below)
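
A minimal sketch of this loop, assuming a scikit-learn-style classifier; the confidence threshold used to pick the "portion" of newly labeled examples is an assumption, since the slide does not specify the selection rule:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_training(X_lab, y_lab, X_unlab, threshold=0.95, max_iter=10):
    """Iteratively absorb confidently predicted unlabeled examples."""
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_iter):
        clf.fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        confident = proba.max(axis=1) >= threshold   # pick confident predictions
        if not confident.any():
            break
        new_labels = clf.classes_[proba[confident].argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[confident]])     # augment labeled set
        y_lab = np.concatenate([y_lab, new_labels])
        X_unlab = X_unlab[~confident]                      # shrink unlabeled set
    return clf
```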

  4. Cluster-then-label • It first clusters the instances (labeled and unlabeled) into k groups, using an unsupervised clustering algorithm • After that, for each cluster Cj, a supervised classifier is trained on the labeled examples in Cj and used to classify the unlabeled examples belonging to Cj
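
A sketch of the procedure with illustrative choices for both learners (KMeans for the clustering step, logistic regression per cluster); the slide does not name specific algorithms:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def cluster_then_label(X_lab, y_lab, X_unlab, k=3):
    """Cluster all points, then train one supervised model per cluster."""
    X_all = np.vstack([X_lab, X_unlab])
    clusters = KMeans(n_clusters=k, n_init=10).fit_predict(X_all)
    c_lab, c_unlab = clusters[:len(X_lab)], clusters[len(X_lab):]
    y_pred = np.full(len(X_unlab), -1, dtype=y_lab.dtype)
    for j in range(k):
        in_l, in_u = c_lab == j, c_unlab == j
        if not in_u.any() or not in_l.any():
            continue                      # no unlabeled points or no labels in Cj
        classes = np.unique(y_lab[in_l])
        if len(classes) == 1:             # only one class in Cj: assign it directly
            y_pred[in_u] = classes[0]
        else:                             # train on the labeled examples in Cj
            clf = LogisticRegression().fit(X_lab[in_l], y_lab[in_l])
            y_pred[in_u] = clf.predict(X_unlab[in_u])
    return y_pred
```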

  5. Semi-supervised Support Vector Machines

  6. Semi-supervised Support Vector Machines • Since unlabeled examples do not have labels, we do not know on which side of the decision boundary they lie • Hat loss function (see below) • Decision boundary
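
The slide's formula image is not preserved; in the standard S3VM formulation, the hat loss penalizes unlabeled points that fall inside the margin, regardless of which side of the boundary they are on:

```latex
% Hat loss on an unlabeled example x (standard S3VM form):
%   f(x) = w^T x + b is the decision function
\hat{L}(x) = \max\bigl(0,\; 1 - \lvert f(x) \rvert \bigr)
```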

  7. Graph-based Semi-supervised Learning • Graph-based semi-supervised learning constructs a graph from the training examples • The nodes of the graph are data points (labeled and unlabeled), and the edges represent similarities between points
Fig. 1: A semi-supervised graph

  8. Graph-based Semi-supervised Learning • An edge between two vertices represents the similarity wij between them; the closer two vertices are, the higher the value of wij • MinCut algorithm: find a minimum-weight set of edges whose removal blocks all flow from one class to the other
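
A sketch of the MinCut idea using networkx (an illustrative library choice, not named on the slide): labeled positive nodes are wired to a source, labeled negatives to a sink, and the minimum s-t cut partitions the remaining nodes into the two classes:

```python
import networkx as nx

def mincut_ssl(similarities, positives, negatives):
    """similarities: dict {(i, j): w_ij}; positives/negatives: labeled node ids."""
    G = nx.DiGraph()
    for (i, j), w in similarities.items():   # similarity edges, both directions
        G.add_edge(i, j, capacity=w)
        G.add_edge(j, i, capacity=w)
    # Terminal edges carry no 'capacity' attribute, which networkx treats
    # as infinite: the cut can never separate a labeled node from its class.
    for p in positives:
        G.add_edge("source", p)
    for n in negatives:
        G.add_edge(n, "sink")
    _, (pos_side, neg_side) = nx.minimum_cut(G, "source", "sink")
    return pos_side - {"source"}, neg_side - {"sink"}
```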

  9. Semi-supervised Multi-view Learning
Fig. 2: Semi-supervised multi-view learning

  10. Multi-View Learning: examples
Fig. 3: Multiple sources of information

  11. Semi-supervised Multi-view Learning • Co-training: the algorithm augments the set of labeled examples of each classifier based on the other learner's predictions. Its assumptions: (1) each view (set of features) is sufficient for classification; (2) the two views (feature sets of each instance) are conditionally independent given the class • Co-EM is a related variant
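
A condensed sketch of co-training, assuming scikit-learn-style learners and using -1 for unlabeled examples; the number of examples exchanged per round is an illustrative choice:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_training(X1, X2, y, n_rounds=10, per_round=5):
    """X1, X2: the two views of the data; y: labels, with -1 for unlabeled."""
    clf1, clf2 = GaussianNB(), GaussianNB()
    y = y.copy()
    for _ in range(n_rounds):
        labeled = y != -1
        if labeled.all():
            break
        clf1.fit(X1[labeled], y[labeled])
        clf2.fit(X2[labeled], y[labeled])
        # Each classifier labels the unlabeled examples it is most confident
        # about; those labels augment the shared labeled set, so each learner
        # benefits from the other's predictions.
        for clf, X in ((clf1, X1), (clf2, X2)):
            un_idx = np.flatnonzero(y == -1)
            if len(un_idx) == 0:
                break
            proba = clf.predict_proba(X[un_idx])
            top = np.argsort(proba.max(axis=1))[-per_round:]  # most confident
            y[un_idx[top]] = clf.classes_[proba[top].argmax(axis=1)]
    return clf1, clf2
```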

  12. Multi-View Learning: error minimization • Loss function: measures the loss incurred by a prediction • Risk: the risk associated with f is defined as the expectation of the loss function • Empirical risk: the average loss of f on a labeled training set • Multi-view minimization problem (see below)
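
The slide's formulas are not preserved; in standard notation, and with a generic agreement term whose exact form on the slide is unknown, the quantities read:

```latex
% Risk: the expected loss of a predictor f
R(f) = \mathbb{E}_{(x,y)} \bigl[ L(y, f(x)) \bigr]

% Empirical risk: the average loss over n labeled training examples
\hat{R}(f) = \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr)

% Multi-view objective (generic form): per-view empirical risks plus an
% agreement term that couples the views on the unlabeled examples u in U
\min_{f_1, \dots, f_k} \sum_{v=1}^{k} \hat{R}(f_v)
  + \lambda \sum_{u \in U} \sum_{v < v'}
      \bigl( f_v(x_u^{(v)}) - f_{v'}(x_u^{(v')}) \bigr)^{2}
```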

  13. Semi-supervised Multi-view Genetic Algorithm • Minimizes the semi-supervised multi-view learning error • It can be applied to multiple sources of data • It works for both convex and non-convex functions; approaches based on gradient descent require a convex function to guarantee the global optimum, and non-convex optimization is a hard problem

  14. Semi-supervised Multi-view Genetic Algorithm • Individual • Fitness function • When applying crossover and mutation, do not change the size of the chromosome and do not mix the features of different views (see the sketch below)
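
A minimal sketch of the view-respecting genetic operators; the chromosome layout and mutation parameters are illustrative assumptions, since the slide's individual and fitness definitions are not preserved:

```python
import numpy as np

rng = np.random.default_rng(0)
VIEW_SLICES = [slice(0, 10), slice(10, 25)]   # assumed per-view segments

def crossover(parent_a, parent_b):
    """One-point crossover applied *within* each view, so features of
    different views never mix and the chromosome length is unchanged."""
    child = parent_a.copy()
    for sl in VIEW_SLICES:
        point = rng.integers(sl.start + 1, sl.stop)   # cut inside this view
        child[point:sl.stop] = parent_b[point:sl.stop]
    return child

def mutate(chrom, rate=0.01, scale=0.1):
    """Gaussian mutation of a few genes; the chromosome keeps its size."""
    mask = rng.random(chrom.shape) < rate
    chrom[mask] += rng.normal(0.0, scale, mask.sum())
    return chrom
```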

  15. Experimental Results • “Diabetes” (UCI Machine Learning Repository) • Views: k = 2, x = (x(1), x(2)) • MAX_ITER = 20000, N = 100 • Comparison to supervised equivalents
Table 2: Comparison to supervised equivalents

  16. Sentiment analysis in Bulgarian • Most of the research has been conducted for English. • Sentiment analysis in Bulgarian suffers from a shortage of labeled examples. • A sentiment analysis system for Bulgarian: each instance has attributes from multiple sources of data (a Bulgarian view and an English view)

  17. DataSet • English reviews: Amazon • Bulgarian reviews: www.cinexio.com

  18. Big Data • Bulgarian view: 17099 features • English view: 12391 features
Fig. 4: Big Data modelling

  19. Examples (1) • Rating: ** • F(SSMVGA) = 1.965; F(supervised) = 3.13

  20. Examples (2) • Rating: ** • F(SSMVGA) = 1.985; F(supervised) = 1.98

  21. Examples (3) • Rating: ***** • F(SSMVGA) = 1.985; F(supervised) = 1.98

  22. Multi-view Teaching Algorithm • A semi-supervised two-view learning algorithm • A modification of the standard co-training algorithm • Improves only the weaker classifier • Uses only the most confident examples of the stronger view • Combines the views • Application: object segmentation (a sketch follows below)
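
A sketch of the asymmetric update described above, assuming scikit-learn-style learners; how the stronger view is identified (e.g., held-out accuracy) and the number of examples transferred per round are assumptions:

```python
import numpy as np

def teaching_round(strong, weak, Xs_lab, Xw_lab, y_lab,
                   Xs_un, Xw_un, per_round=10):
    """One round: the stronger view labels its most confident unlabeled
    examples, and only the weaker classifier's training set is augmented."""
    strong.fit(Xs_lab, y_lab)
    proba = strong.predict_proba(Xs_un)
    top = np.argsort(proba.max(axis=1))[-per_round:]    # most confident picks
    y_new = strong.classes_[proba[top].argmax(axis=1)]
    Xw_lab = np.vstack([Xw_lab, Xw_un[top]])            # augment the weak view only
    y_lab = np.concatenate([y_lab, y_new])
    weak.fit(Xw_lab, y_lab)                             # only the weak learner improves
    return weak, Xw_lab, y_lab
```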

  23. A Semi-supervised Image Segmentation System • A “teacher” labels a few points of each class, giving the algorithm an initial idea of the clusters • The aim is to augment the training set with more labeled examples, yielding a better predictor • The first view contains the coordinates of the pixels: view1 = (X, Y) • The second view contains the RGB values of the pixels (red, green, and blue values ranging from 0 to 255); building the views is sketched below
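
Constructing the two views from an image is straightforward; a sketch in numpy (function and variable names are illustrative):

```python
import numpy as np

def build_views(image):
    """image: H x W x 3 uint8 array (RGB).
    Returns view1 = pixel coordinates, view2 = pixel colors."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    view1 = np.stack([xs.ravel(), ys.ravel()], axis=1)   # (x, y) per pixel
    view2 = image.reshape(-1, 3).astype(float)           # (r, g, b) in [0, 255]
    return view1, view2
```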

  24. DataSet
Fig. 5: Original image and desired segmentation
Fig. 6: Original image and desired segmentation
Fig. 7: Original image and desired segmentation

  25. Experimental Results • 2 experiments: • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (for the underlying learners), to a supervised naïve Bayes classifier • Comparison of the multi-view teaching algorithm based on a multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on a multivariate normal distribution (MND-SL)

  26. Results (1) • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (for the underlying learners), to a supervised naïve Bayes classifier • The image consists of 50700 pixels. At each cross-validation step only a small number of labeled pixels is used. Multiple tests were run, varying the number of labeled examples (4, 6, 10, 16, 20, 50 pixels).
Table 4: Accuracy based on the number of labeled examples

  27. Results (1) • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (for the underlying learners), to a supervised naïve Bayes classifier • 16 labeled examples
Table 5: Comparison of the NB and MVTA algorithms

  28. Results (2) • Comparison of the multi-view teaching algorithm based on a multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on a multivariate normal distribution (MND-SL) • 16 labeled examples
Table 6: Comparison of MND-MVTA and MND-SL

  29. Examples • Multi-view Teaching vs. Naïve Bayes Supervised

  30. Examples • Multi-view Teaching vs. Naïve Bayes Supervised

  31. Thank you! Thank you for your attention!
