Dive into the structure of Autoencoders (AE) in deep learning, including Denoising AE and tackling overfitting. Explore Variational AE and case studies. Learn about Generative Models and Representation vs Discrimination of Data.
ECE 599/692 – Deep Learning
Lecture 9 – Autoencoder (AE)
Hairong Qi, Gonzalez Family Professor
Electrical Engineering and Computer Science, University of Tennessee, Knoxville
http://www.eecs.utk.edu/faculty/qi
Email: hqi@utk.edu
Outline
• Lecture 9: Points crossed
  • General structure of AE
    • Unsupervised
    • Generative model?
    • The representative power
  • Basic structure of a linear autoencoder
  • Denoising autoencoder (DAE)
  • AE in solving the overfitting problem
• Lecture 10: Variational autoencoder
• Lecture 11: Case study
General structure
[Diagram: input x → weights W1 → code y → weights W2 → reconstruction z]
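The diagram reads as a forward pass: the encoder maps the input x through W1 to a code y, and the decoder maps y through W2 to a reconstruction z. A minimal numpy sketch (the layer sizes, sigmoid activation, and random initialization are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Encoder: x -> y = sigmoid(W1 x + b1); decoder: y -> z = W2 y + b2.
d, m = 8, 3                       # input dim d, code dim m (assumed sizes)
W1 = rng.normal(scale=0.1, size=(m, d)); b1 = np.zeros(m)
W2 = rng.normal(scale=0.1, size=(d, m)); b2 = np.zeros(d)

x = rng.normal(size=d)            # one input example
y = sigmoid(W1 @ x + b1)          # hidden code (the bottleneck)
z = W2 @ y + b2                   # reconstruction of x
loss = np.mean((z - x) ** 2)      # reconstruction error to minimize
```

Training minimizes the reconstruction error between z and x; no labels are needed, which is why the AE is unsupervised.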
Generative Model
• The goal is to learn a model P which we can sample from, such that P is as similar as possible to Pgt, where Pgt is the unknown distribution that generated the examples X
• The ingredients
  • An explicit estimate of the density
  • The ability to sample directly
Discrimination vs. Representation of Data
[Diagram: 2D samples in (x1, x2) with candidate projection directions y1 and y2 and the projection error]
• Best discriminating the data
  • Fisher's linear discriminant (FLD)
  • NN
  • CNN
• Best representing the data
  • Principal component analysis (PCA)
PCA as Linear Autoencoder
Raw data X (n×d) → covariance matrix S_X (d×d) → eigenvalue decomposition (eigenvalues λ (d×1), eigenvectors E (d×d)) → top-m principal components P (d×m) → projection Y (n×m) = X (n×d) · P (d×m)
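The pipeline above can be sketched in numpy (the sample count, dimensions, and synthetic data are assumptions). Encoding with P and decoding with its transpose is exactly the tied-weight linear-autoencoder view of PCA:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 200, 5, 2              # n samples, d features, keep m components

X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))  # correlated raw data
Xc = X - X.mean(axis=0)          # center the data
S = (Xc.T @ Xc) / (n - 1)        # covariance matrix S_X (d x d)

lam, E = np.linalg.eigh(S)       # eigenvalues (ascending) and eigenvectors
order = np.argsort(lam)[::-1]    # sort eigenvalues descending
P = E[:, order[:m]]              # top-m principal components (d x m)

Y = Xc @ P                       # linear "encoding": projection Y (n x m)
X_hat = Y @ P.T                  # linear "decoding" back to d dimensions
```

Because the columns of P are orthonormal, the decoder is simply the transpose of the encoder, i.e. a linear AE with tied weights.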
Denoising Autoencoder (DAE) [DAE:2008]
[DAE:2008] P. Vincent, H. Larochelle, Y. Bengio, P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," ICML, 2008.
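The DAE idea: corrupt the input, encode the corrupted version, but train to reconstruct the clean input. A minimal numpy sketch, assuming masking noise, a single sigmoid hidden layer, and plain gradient descent (all illustrative choices, not a reproduction of [DAE:2008]):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

n, d, m = 100, 8, 4                       # samples, input dim, hidden dim (assumed)
X = rng.normal(size=(n, d))               # clean data
mask = rng.random((n, d)) > 0.3           # masking noise: zero ~30% of entries
X_noisy = X * mask                        # corrupted input

W1 = rng.normal(scale=0.1, size=(d, m)); b1 = np.zeros(m)
W2 = rng.normal(scale=0.1, size=(m, d)); b2 = np.zeros(d)

losses = []
for _ in range(200):
    H = sigmoid(X_noisy @ W1 + b1)        # encode the CORRUPTED input
    Z = H @ W2 + b2                       # decode
    losses.append(np.mean((Z - X) ** 2))  # loss against the CLEAN target
    dZ = 2 * (Z - X) / (n * d)            # backprop through the MSE
    dW2 = H.T @ dZ; db2 = dZ.sum(0)
    dA = (dZ @ W2.T) * H * (1 - H)        # through the sigmoid
    dW1 = X_noisy.T @ dA; db1 = dA.sum(0)
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= 0.5 * g                      # plain gradient-descent update
```

Reconstructing the clean x from a corrupted version forces the code to capture structure in the data rather than copy the input through.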
The Two Papers in 2006 [Hinton:2006a] G.E. Hinton, S. Osindero, Y.W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, 18(7):1527-1554, 2006. [Hinton:2006b] G.E. Hinton, R.R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, 313:504-507, July 2006.
Techniques to Avoid Overfitting
• Regularization
  • Weight decay (L1/L2 regularization)
  • Dropout
• Data augmentation
• Use unlabeled data to train a different network, then use its weights to initialize our network
  • Deep belief networks (based on the restricted Boltzmann machine, or RBM)
  • Deep autoencoders (based on the autoencoder)
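The first two items can be illustrated in a few lines of numpy (the decay coefficient, keep probability, and shapes are arbitrary assumptions). Weight decay shrinks the weights toward zero at every update; inverted dropout zeroes random activations at train time and rescales the survivors so the expected activation is unchanged:

```python
import numpy as np

rng = np.random.default_rng(4)

# L2 weight decay: add lam * W to the data-loss gradient before updating.
W = rng.normal(size=(4, 4))
grad = rng.normal(size=(4, 4))      # gradient of the data loss (stand-in)
lam, lr = 1e-2, 0.1
W_new = W - lr * (grad + lam * W)   # each step decays W toward zero

# Inverted dropout with keep probability 0.8.
h = rng.normal(size=(5, 10))        # a batch of hidden activations
keep = 0.8
drop_mask = (rng.random(h.shape) < keep) / keep  # entries are 0 or 1/keep
h_train = h * drop_mask             # train-time activations; E[h_train] = h
```

At test time dropout is simply turned off; the 1/keep rescaling during training is what makes the two regimes consistent.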
RBM vs. AE
[Diagram: RBM with visible units v0, v1, v2, v3 and hidden units h0, h1, h2]
• Stochastic (RBM) vs. deterministic (AE)
• p(h, v) via the gradient of the log-likelihood (RBM) vs. deterministic reconstruction (AE)
AE as Pretraining Methods
• Pretraining step
  • Train a sequence of shallow autoencoders, greedily one layer at a time, using unsupervised data
• Fine-tuning step 1
  • Train the last layer using supervised data
• Fine-tuning step 2
  • Use backpropagation to fine-tune the entire network using supervised data
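The pretraining step can be sketched as follows, assuming a tiny numpy trainer for each shallow AE (the shapes, step count, and learning rate are illustrative): each layer's autoencoder is trained on the codes produced by the previous layer, and the resulting encoder weights initialize the deep network before supervised fine-tuning.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_shallow_ae(X, m, steps=100, lr=0.5):
    """Train one shallow AE on X; return encoder params and the codes."""
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, m)); b1 = np.zeros(m)
    W2 = rng.normal(scale=0.1, size=(m, d)); b2 = np.zeros(d)
    for _ in range(steps):
        H = sigmoid(X @ W1 + b1)          # encode
        Z = H @ W2 + b2                   # decode
        dZ = 2 * (Z - X) / (n * d)        # gradient of the MSE
        dW2 = H.T @ dZ; db2 = dZ.sum(0)
        dA = (dZ @ W2.T) * H * (1 - H)
        dW1 = X.T @ dA; db1 = dA.sum(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return (W1, b1), sigmoid(X @ W1 + b1)

# Greedy layer-wise pretraining on unlabeled data (assumed sizes 16-8-4):
X = rng.normal(size=(100, 16))
enc1, codes1 = train_shallow_ae(X, 8)      # layer 1 trained on raw inputs
enc2, codes2 = train_shallow_ae(codes1, 4) # layer 2 trained on layer-1 codes
# enc1 and enc2 now initialize a 16-8-4 network; the supervised
# fine-tuning steps from the slide would follow.
```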
Recap
• General structure of AE
  • Unsupervised
  • Generative model?
  • The representative power
• Basic structure of a linear autoencoder
• Denoising autoencoder (DAE)
• AE in solving the overfitting problem