
Unsupervised and Transfer Learning Challenge


Presentation Transcript


  1. Unsupervised and Transfer Learning Challenge Isabelle Guyon Clopinet, California http://clopinet.com/ul

  2. CREDITS
Data donors:
• Handwriting recognition (AVICENNA) -- Reza Farrahi Moghaddam, Mathias Adankon, Kostyantyn Filonenko, Robert Wisnovsky, and Mohamed Chériet (École de technologie supérieure de Montréal, Quebec) contributed the dataset of Arabic manuscripts. The toy example (ULE) is the MNIST handwritten digit database made available by Yann LeCun and Corinna Cortes.
• Object recognition (RITA) -- Antonio Torralba, Rob Fergus, and William T. Freeman collected and made publicly available the 80 million tiny images dataset. Vinod Nair and Geoffrey Hinton collected and made publicly available the CIFAR datasets. See the tech report "Learning Multiple Layers of Features from Tiny Images" by Alex Krizhevsky, 2009, for details.
• Human action recognition (HARRY) -- Ivan Laptev and Barbara Caputo collected and made publicly available the KTH human action recognition datasets. Marcin Marszałek, Ivan Laptev, and Cordelia Schmid collected and made publicly available the Hollywood 2 dataset of human actions and scenes.
• Text processing (TERRY) -- David Lewis formatted and made publicly available the RCV1-v2 Text Categorization Test Collection.
• Ecology (SYLVESTER) -- Jock A. Blackard, Denis J. Dean, and Charles W. Anderson of the US Forest Service, USA, collected and made available the Forest Cover Type dataset.
Web platform: server made available by Prof. Joachim Buhmann, ETH Zurich, Switzerland. Computer admin.: Thomas Fuchs, ETH Zurich. Webmaster: Olivier Guyon, MisterP.net, France.
Protocol review and advising: • David W. Aha, Naval Research Laboratory, USA. • Gideon Dror, Academic College of Tel-Aviv Yaffo, Israel. • Vincent Lemaire, Orange Research Labs, France. • Gavin Cawley, University of East Anglia, UK. • Olivier Chapelle, Yahoo!, California, USA. • Gerard Rinkus, Brandeis University, USA. • Yoshua Bengio, Université de Montréal, Canada. • David Grangier, NEC Labs, USA. • Andrew Ng, Stanford Univ., Palo Alto, California, USA. • Graham Taylor, NYU, New York, USA. • Yann LeCun, NYU, New York, USA.
Beta testing and baseline methods: • Gideon Dror, Academic College of Tel-Aviv Yaffo, Israel. • Vincent Lemaire, Orange Research Labs, France. http://clopinet.com/ul

  3. What is the problem? http://clopinet.com/ul

  4. Labeling data is expensive [Figure: unlabeled data costs little ($$); labeling it costs much more ($$$$$).] http://clopinet.com/ul

  5. Examples of domains • Chemo-informatics • Handwriting & speech recognition • Image & video processing • Text processing • Marketing • Ecology • Embryology http://clopinet.com/ul

  6. Scenarios • Active Learning • Semi-supervised Learning • Transfer Learning • Unsupervised Learning http://clopinet.com/ul

  7. Setting [Figure: billions of images, unlabeled or labeled with different classes (e.g. "Philip and Thomas", "Anna, Thomas and GM", "Solene"), are used to learn a data representation; the representation is then applied to personal data for which only a few labeled examples exist (Martin, Thomas, Philip, Bernhard).] http://clopinet.com/ul

  8. Datasets http://clopinet.com/ul

  9. Datasets http://clopinet.com/ul

  10. Difficulties • Sparse data • Unbalanced class distributions • Noisy data • Large datasets • No categorical variables • No missing values • Must turn in results on ALL datasets. http://clopinet.com/ul

  11. Protocol http://clopinet.com/ul

  12. Data Split (phase 1) [Diagram: competitors work on the development data (unlabeled); evaluators hold the validation data (type-2 labels) and the final evaluation data (type-3 labels).] http://clopinet.com/ul

  13. Data Split (phase 2) [Diagram: competitors work on the development data, now with type-1 (transfer) labels; evaluators hold the validation data (type-2 labels) and the final evaluation data (type-3 labels).] http://clopinet.com/ul

  14. On-line feedback For each dataset in {Avicenna, Harry, …}: • Download the (P x N) development data matrix and the (4096 x N) validation & final data matrices. • Create transformed data matrices (4096 x M), M ≤ 4096, or similarity matrices (4096 x 4096), for the validation and/or final data. • Submit on the website. • Retrieve the learning curves on validation data. http://clopinet.com/ul
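In practice, preparing a submission for one dataset might look like the following sketch. The file names, the loader, and the placeholder transform are assumptions for illustration, not the official challenge kit:

```python
# Minimal sketch of preparing one dataset's submission; file names and the
# exact upload format are assumptions -- consult the challenge website.
import numpy as np

# Hypothetical loader: validation data has 4096 examples (rows), N features.
X_valid = np.loadtxt("avicenna_valid.data")          # shape (4096, N)

# Any transform is allowed as long as the output keeps 4096 rows and
# at most 4096 columns (or is a 4096 x 4096 similarity matrix).
X_transformed = np.log1p(np.abs(X_valid))            # placeholder transform

assert X_transformed.shape[0] == 4096
assert X_transformed.shape[1] <= 4096
np.savetxt("avicenna_valid.prepro", X_transformed)   # file to submit
```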

  15. Two phases • Phase 1: Unsupervised Learning • Only unlabeled data available. • Deadline: February 28, 2011. • Phase 2: Transfer Learning • A limited amount of transfer labels available (labels on examples of the development set of classes not used for evaluation). • Deadline: April 15, 2011. http://clopinet.com/ul

  16. Evaluation http://clopinet.com/ul

  17. AUC score For each set of samples queried, we assess the predictions of the learning machine with the Area under the ROC curve. http://clopinet.com/ul
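As a concrete illustration (the labels and scores below are made up), the AUC of a set of predictions can be computed with scikit-learn:

```python
# AUC of a toy prediction set; 1.0 = perfect ranking, 0.5 = random.
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]                 # binary labels of queried samples
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]    # classifier outputs f(x)
print(roc_auc_score(y_true, scores))
```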

  18. Area under the Learning Curve (ALC) The learning curve (AUC as a function of the number of labeled examples) is interpolated linearly between measured points and extrapolated horizontally beyond the last point. http://clopinet.com/ul
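A sketch of this computation follows; the sample values, the end of the x-axis, and the normalization are illustrative assumptions, and the official scorer may use a different axis scaling:

```python
# Sketch of an ALC computation: trapezoidal (linear-interpolation) area under
# the learning curve, extended horizontally from the last measured point.
import numpy as np

n_examples = np.array([1, 2, 4, 8, 16, 32])   # points where AUC was measured
auc = np.array([0.55, 0.60, 0.68, 0.74, 0.78, 0.80])

x_max = 64                                    # hypothetical end of the x-axis
x = np.append(n_examples, x_max)              # horizontal extrapolation:
y = np.append(auc, auc[-1])                   # last AUC carried out to x_max

alc = np.trapz(y, x) / (x_max - x[0])         # area, normalized by the x-range
print(f"ALC = {alc:.3f}")
```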

  19. Classifier used • Linear discriminant: f(x) = w · x = Σi wi xi • Hebbian learning: X = (p, N) training data matrix; Y ∈ {−1/p−, +1/p+}^p target vector (p+ and p− are the numbers of positive and negative examples); w = X'Y = (1/p+) Σk∈pos xk − (1/p−) Σk∈neg xk http://clopinet.com/ul
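This discriminant reduces to the difference of the class means, so it can be sketched in a few lines of numpy:

```python
# The Hebbian linear discriminant from the slide:
# w = X'Y with Y in {-1/p-, +1/p+}^p, i.e. the difference of class means.
import numpy as np

def hebbian_weights(X, y):
    """X: (p, N) training matrix; y: binary labels in {0, 1}."""
    pos, neg = X[y == 1], X[y == 0]
    return pos.mean(axis=0) - neg.mean(axis=0)  # (1/p+)*sum_pos - (1/p-)*sum_neg

def predict(w, X):
    return X @ w                                # f(x) = w . x
```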

  20. Kernel version • Kernel classifier: f(x) = Σk αk k(xk, x), with a linear kernel k(xk, x) = xk · x and with αk = −1/p− if xk ∈ neg, αk = +1/p+ if xk ∈ pos. • Equivalent linear discriminant: f(x) = (1/p+) Σk∈pos xk · x − (1/p−) Σk∈neg xk · x = w · x, with w = (1/p+) Σk∈pos xk − (1/p−) Σk∈neg xk. http://clopinet.com/ul
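The equivalence of the two forms is easy to check numerically; the random data below is purely illustrative:

```python
# Check that the kernel form equals the linear discriminant when the
# kernel is linear (a sketch on random data).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                 # 20 training examples, 5 features
y = rng.integers(0, 2, size=20)
x_test = rng.normal(size=5)

# Kernel classifier: f(x) = sum_k a_k (x_k . x)
a = np.where(y == 1, 1.0 / (y == 1).sum(), -1.0 / (y == 0).sum())
f_kernel = np.sum(a * (X @ x_test))

# Linear form: f(x) = w . x with w the difference of class means
w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
print(np.isclose(f_kernel, w @ x_test))      # True
```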

  21. Justification • Simple classifier • Robust against overfitting • Puts emphasis on learning a good data representation • Easily kernelized http://clopinet.com/ul

  22. Getting started… http://clopinet.com/ul

  23. Phase 1: no labels • No learning at all: • Normalization of examples or features • Construction of features (e.g. products) • Generic data transformations (e.g. taking the log, Fourier transform, smoothing, etc.) • Unsupervised learning: • Manifold learning to reduce dimension (and/or orthogonalize features) • Clustering to construct features • Generative models and latent variable models http://clopinet.com/ul
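A few of the "no learning at all" transformations listed above can be sketched in numpy; these are generic examples, not prescribed preprocessing:

```python
# Generic label-free transforms: normalization, log compression, Fourier.
import numpy as np

def normalize_examples(X):
    """Scale each example (row) to unit Euclidean norm."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)

def log_transform(X):
    """Compress dynamic range; log1p handles zeros gracefully."""
    return np.log1p(np.abs(X))

def fourier_features(X):
    """Magnitude of the FFT of each example, a generic representation."""
    return np.abs(np.fft.rfft(X, axis=1))
```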

  24. PCA • The canonical example of manifold learning • Diagonalize: X X' = U D U' • The eigenvectors U constitute a set of orthogonal features: U'U=I. • Select a few U corresponding to the largest eigenvalues as new feature vectors. • Similar effect as regularization (for example ridge regression). http://clopinet.com/ul
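A minimal sketch of this recipe, assuming X is stored with features as rows and examples as columns:

```python
# PCA as on the slide: diagonalize the centered scatter matrix X X' and
# keep the eigenvectors with the largest eigenvalues (U'U = I).
import numpy as np

def pca(X, k):
    Xc = X - X.mean(axis=1, keepdims=True)       # center each feature
    scatter = Xc @ Xc.T                          # X X'
    eigvals, U = np.linalg.eigh(scatter)         # U D U', ascending eigenvalues
    U_k = U[:, ::-1][:, :k]                      # top-k orthogonal directions
    return U_k.T @ Xc                            # project onto new features

X = np.random.default_rng(1).normal(size=(50, 200))  # 50 features, 200 examples
Z = pca(X, k=10)                                 # (10, 200) reduced data
```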

  25. Other manifold algorithms • ICA • Kernel PCA • Kohonen maps • Auto-encoders • MDS, Isomap, LLE, Laplacian Eigenmaps • Regularized principal manifolds http://clopinet.com/ul

  26. K-means clustering • Start with random cluster centers. • Iterate: assign the examples to their closest center to form clusters, then re-compute the centers by averaging the cluster members. • Create features, e.g. fk(x) = exp(−γ ||x − xk||), where xk is the k-th cluster center. [Figure: clusters of the ULE validation set after 5 iterations.] http://clopinet.com/ul
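The feature-construction step might be sketched as follows; the number of clusters and γ are free parameters chosen here for illustration:

```python
# K-means feature construction: cluster, then use exponentiated distances
# to the centers as new features, f_k(x) = exp(-gamma * ||x - center_k||).
import numpy as np
from sklearn.cluster import KMeans

def kmeans_features(X, n_clusters=50, gamma=1.0, random_state=0):
    km = KMeans(n_clusters=n_clusters, random_state=random_state, n_init=10)
    km.fit(X)
    # Pairwise distances between examples and cluster centers.
    dists = np.linalg.norm(X[:, None, :] - km.cluster_centers_[None, :, :],
                           axis=2)
    return np.exp(-gamma * dists)                # (n_examples, n_clusters)
```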

  27. Other clustering algorithms • Overlapping clusters (Gaussian mixtures, fuzzy C-means) • Hierarchical clustering • Graph partitioning • Spectral clustering http://clopinet.com/ul

  28. Deep Learning Greedy layer-wise unsupervised pre-training of multi-layer neural networks and Bayesian networks, including: • Deep Belief Networks (stacks of Restricted Boltzmann Machines) • Stacks of auto-encoders [Figure: auto-encoder, with an encoder mapping the input to a code and a decoder reconstructing the input.] http://clopinet.com/ul
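As a rough stand-in for one auto-encoder layer, scikit-learn's MLPRegressor can be trained to reconstruct its input, and its hidden activations reused as the code; greedy stacking repeats this on the codes. This is only an illustrative sketch, and a real deep-learning toolkit would be the usual choice:

```python
# One auto-encoder layer via MLPRegressor fit on X -> X; the hidden layer
# activations serve as the learned representation.
import numpy as np
from sklearn.neural_network import MLPRegressor

def autoencoder_layer(X, n_hidden=64):
    ae = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="logistic",
                      max_iter=500, random_state=0)
    ae.fit(X, X)                                 # encoder + decoder: X -> X
    # Hidden code = logistic(X W1 + b1), read off the first layer's weights.
    return 1.0 / (1.0 + np.exp(-(X @ ae.coefs_[0] + ae.intercepts_[0])))

X = np.random.default_rng(2).normal(size=(200, 100))
H1 = autoencoder_layer(X, 64)    # first layer of the stack
H2 = autoencoder_layer(H1, 32)   # greedy layer-wise stacking
```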

  29. Resources • Unsupervised Learning. Zoubin Ghahramani. http://www.gatsby.ucl.ac.uk/~zoubin/course04/ul.pdf • Nonlinear dimensionality reduction. http://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction • Data Clustering: A Review. Jain et al. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.2720 • Why Does Unsupervised Pre-training Help Deep Learning? Erhan et al. http://jmlr.csail.mit.edu/papers/volume11/erhan10a/erhan10a.pdf http://clopinet.com/ul

  30. Phase 2: transfer learning • This challenge: No transfer labels available for the primary/target task(s). Some labels available for secondary/source tasks. • Upcoming challenge on Inductive Transfer Learning: A few labels available for the primary/target task(s) and many more labels available for secondary/source tasks. http://clopinet.com/ul

  31. Transfer learning taxonomy [Figure: taxonomy adapted from the survey by Sinno Jialin Pan and Qiang Yang.] http://clopinet.com/ul

  32. Cross-task Learning • Similarity or kernel learning: • Siamese neural networks • Graph-theoretic methods • Data representation learning: • Deep neural networks • Deep belief networks (re-use the internal representation created by the hidden units and/or output units) http://clopinet.com/ul

  33. Resources • A Survey on Transfer Learning. Pan and Yang. http://www1.i2r.a-star.edu.sg/~jspan/publications/TLsurvey_0822.pdf • Signature Verification using a "Siamese" Time Delay Neural Network. Bromley et al. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.28.4792 • Fast Graph Laplacian Regularized Kernel Learning via Semidefinite–Quadratic–Linear Programming. Wu et al. http://books.nips.cc/papers/files/nips22/NIPS2009_0792.pdf • Transfer Learning Techniques for Deep Neural Nets. Gutstein thesis. http://robust.cs.utep.edu/~gutstein/sg_home_files/thesis.pdf http://clopinet.com/ul

  34. UTL Challenge, Dec 2010 – March 2011, http://clopinet.com/ul • Prizes: $6000 + free registrations + travel awards. • Dissemination: workshops at ICML and IJCNN; proceedings in JMLR W&CP. [Diagram: competitors receive the development data (type-1 labels); evaluators hold the validation data (type-2 labels) and the final evaluation data (type-3 labels).] http://clopinet.com/ul

  35. Inductive TL Challenge, July 2011 (ICML) – Dec 2011 (NIPS), http://clopinet.com/tl Development data: primary and secondary tasks. • Two domains of tasks: synthetic and real-world • Supervised training examples • Concept (binary class) tasks • 5-10 secondary tasks, 1 primary • Impoverished primary-task data • Diversity of tasks with varying degrees of relatedness to the primary task [Diagram: competitors receive development-data labels (July 2011) and validation-data labels (primary task); evaluators hold the challenge-data labels (primary task, Sept 2011).] http://clopinet.com/ul

  36. Gesture Recognition Challenge June 2011-Nov. 2011 http://clopinet.com/gs (in preparation) STEP 1: Develop a “generic” sign language recognition system that can learn new signs with a few examples. STEP 2: At conference: teach the system new signs. STEP 3: Live evaluation in front of audience. http://clopinet.com/ul
