Dive into the structure of Autoencoders (AE) in deep learning, including Denoising AE and tackling overfitting. Explore Variational AE and case studies. Learn about Generative Models and Representation vs Discrimination of Data.
ECE 599/692 – Deep Learning
Lecture 9 – Autoencoder (AE)
Hairong Qi, Gonzalez Family Professor
Electrical Engineering and Computer Science, University of Tennessee, Knoxville
http://www.eecs.utk.edu/faculty/qi
Email: hqi@utk.edu
Outline
• Lecture 9: Points crossed
  • General structure of AE
    • Unsupervised
    • Generative model?
    • The representative power
  • Basic structure of a linear autoencoder
  • Denoising autoencoder (DAE)
  • AE in solving the overfitting problem
• Lecture 10: Variational autoencoder
• Lecture 11: Case study
General structure
[Diagram: input x → weights W1 → code y → weights W2 → reconstruction z]
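The diagram reads as a forward pass: the encoder maps the input x through W1 to a code y, and the decoder maps y through W2 to a reconstruction z. A minimal numpy sketch (the layer sizes, sigmoid activation, and random initialization are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Encoder: x -> y = sigmoid(W1 x + b1); decoder: y -> z = W2 y + b2.
d, m = 8, 3                       # input dim d, code dim m (assumed sizes)
W1 = rng.normal(scale=0.1, size=(m, d)); b1 = np.zeros(m)
W2 = rng.normal(scale=0.1, size=(d, m)); b2 = np.zeros(d)

x = rng.normal(size=d)            # one input example
y = sigmoid(W1 @ x + b1)          # hidden code (the bottleneck)
z = W2 @ y + b2                   # reconstruction of x
loss = np.mean((z - x) ** 2)      # reconstruction error to minimize
```

Training minimizes the reconstruction error between z and x; no labels are needed, which is why the AE is unsupervised.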
Generative Model
• The goal is to learn a model P which we can sample from, such that P is as similar as possible to Pgt, where Pgt is the unknown distribution that generated the examples X
• The ingredients
  • An explicit estimate of the density
  • The ability to sample directly
Discrimination vs. Representation of Data
[Diagram: 2D samples in (x1, x2) with candidate projection directions y1 and y2 and the projection error]
• Best discriminating the data
  • Fisher's linear discriminant (FLD)
  • NN
  • CNN
• Best representing the data
  • Principal component analysis (PCA)
PCA as Linear Autoencoder
Raw data X (n×d) → covariance matrix S_X (d×d) → eigenvalue decomposition (eigenvalues λ (d×1), eigenvectors E (d×d)) → top-m principal components P (d×m) → projection Y (n×m) = X (n×d) · P (d×m)
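The pipeline above can be sketched in numpy (the sample count, dimensions, and synthetic data are assumptions). Encoding with P and decoding with its transpose is exactly the tied-weight linear-autoencoder view of PCA:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 200, 5, 2              # n samples, d features, keep m components

X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))  # correlated raw data
Xc = X - X.mean(axis=0)          # center the data
S = (Xc.T @ Xc) / (n - 1)        # covariance matrix S_X (d x d)

lam, E = np.linalg.eigh(S)       # eigenvalues (ascending) and eigenvectors
order = np.argsort(lam)[::-1]    # sort eigenvalues descending
P = E[:, order[:m]]              # top-m principal components (d x m)

Y = Xc @ P                       # linear "encoding": projection Y (n x m)
X_hat = Y @ P.T                  # linear "decoding" back to d dimensions
```

Because the columns of P are orthonormal, the decoder is simply the transpose of the encoder, i.e. a linear AE with tied weights.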
Denoising Autoencoder (DAE) [DAE:2008]
[DAE:2008] P. Vincent, H. Larochelle, Y. Bengio, P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," ICML, 2008.
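The DAE idea: corrupt the input, encode the corrupted version, but train to reconstruct the clean input. A minimal numpy sketch, assuming masking noise, a single sigmoid hidden layer, and plain gradient descent (all illustrative choices, not a reproduction of [DAE:2008]):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

n, d, m = 100, 8, 4                       # samples, input dim, hidden dim (assumed)
X = rng.normal(size=(n, d))               # clean data
mask = rng.random((n, d)) > 0.3           # masking noise: zero ~30% of entries
X_noisy = X * mask                        # corrupted input

W1 = rng.normal(scale=0.1, size=(d, m)); b1 = np.zeros(m)
W2 = rng.normal(scale=0.1, size=(m, d)); b2 = np.zeros(d)

losses = []
for _ in range(200):
    H = sigmoid(X_noisy @ W1 + b1)        # encode the CORRUPTED input
    Z = H @ W2 + b2                       # decode
    losses.append(np.mean((Z - X) ** 2))  # loss against the CLEAN target
    dZ = 2 * (Z - X) / (n * d)            # backprop through the MSE
    dW2 = H.T @ dZ; db2 = dZ.sum(0)
    dA = (dZ @ W2.T) * H * (1 - H)        # through the sigmoid
    dW1 = X_noisy.T @ dA; db1 = dA.sum(0)
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= 0.5 * g                      # plain gradient-descent update
```

Reconstructing the clean x from a corrupted version forces the code to capture structure in the data rather than copy the input through.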
The Two Papers in 2006 [Hinton:2006a] G.E. Hinton, S. Osindero, Y.W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, 18(7):1527-1554, 2006. [Hinton:2006b] G.E. Hinton, R.R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, 313:504-507, July 2006.
Techniques to Avoid Overfitting
• Regularization
  • Weight decay (L1/L2 regularization)
  • Dropout
• Data augmentation
• Use unlabeled data to train a different network, then use its weights to initialize our network
  • Deep belief networks (based on the restricted Boltzmann machine, or RBM)
  • Deep autoencoders (based on the autoencoder)
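The first two items can be illustrated in a few lines of numpy (the decay coefficient, keep probability, and shapes are arbitrary assumptions). Weight decay shrinks the weights toward zero at every update; inverted dropout zeroes random activations at train time and rescales the survivors so the expected activation is unchanged:

```python
import numpy as np

rng = np.random.default_rng(4)

# L2 weight decay: add lam * W to the data-loss gradient before updating.
W = rng.normal(size=(4, 4))
grad = rng.normal(size=(4, 4))      # gradient of the data loss (stand-in)
lam, lr = 1e-2, 0.1
W_new = W - lr * (grad + lam * W)   # each step decays W toward zero

# Inverted dropout with keep probability 0.8.
h = rng.normal(size=(5, 10))        # a batch of hidden activations
keep = 0.8
drop_mask = (rng.random(h.shape) < keep) / keep  # entries are 0 or 1/keep
h_train = h * drop_mask             # train-time activations; E[h_train] = h
```

At test time dropout is simply turned off; the 1/keep rescaling during training is what makes the two regimes consistent.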
RBM vs. AE
[Diagram: RBM with visible units v0, v1, v2, v3 and hidden units h0, h1, h2]
• Stochastic (RBM) vs. deterministic (AE)
• p(h, v) via the gradient of the log-likelihood (RBM) vs. deterministic reconstruction (AE)
AE as Pretraining Methods
• Pretraining step
  • Train a sequence of shallow autoencoders, greedily one layer at a time, using unsupervised data
• Fine-tuning step 1
  • Train the last layer using supervised data
• Fine-tuning step 2
  • Use backpropagation to fine-tune the entire network using supervised data
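The pretraining step can be sketched as follows, assuming a tiny numpy trainer for each shallow AE (the shapes, step count, and learning rate are illustrative): each layer's autoencoder is trained on the codes produced by the previous layer, and the resulting encoder weights initialize the deep network before supervised fine-tuning.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_shallow_ae(X, m, steps=100, lr=0.5):
    """Train one shallow AE on X; return encoder params and the codes."""
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, m)); b1 = np.zeros(m)
    W2 = rng.normal(scale=0.1, size=(m, d)); b2 = np.zeros(d)
    for _ in range(steps):
        H = sigmoid(X @ W1 + b1)          # encode
        Z = H @ W2 + b2                   # decode
        dZ = 2 * (Z - X) / (n * d)        # gradient of the MSE
        dW2 = H.T @ dZ; db2 = dZ.sum(0)
        dA = (dZ @ W2.T) * H * (1 - H)
        dW1 = X.T @ dA; db1 = dA.sum(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return (W1, b1), sigmoid(X @ W1 + b1)

# Greedy layer-wise pretraining on unlabeled data (assumed sizes 16-8-4):
X = rng.normal(size=(100, 16))
enc1, codes1 = train_shallow_ae(X, 8)      # layer 1 trained on raw inputs
enc2, codes2 = train_shallow_ae(codes1, 4) # layer 2 trained on layer-1 codes
# enc1 and enc2 now initialize a 16-8-4 network; the supervised
# fine-tuning steps from the slide would follow.
```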
Recap
• General structure of AE
  • Unsupervised
  • Generative model?
  • The representative power
• Basic structure of a linear autoencoder
• Denoising autoencoder (DAE)
• AE in solving the overfitting problem