
Non-classical Bias-Variance Trade-off: Surprising Insights in Deep Learning

This lecture explores the non-classical bias-variance trade-off in deep learning, highlighting the over-parametrization regime and the robustness achieved through data augmentation and adversarial training. It also discusses the decoupling of uncertainties, symmetries in non-convex optimization, and the impact of sample complexity on generalization. The lecture touches on stability, interpretability, reinforcement learning, GANs, and the challenges in achieving good results. It concludes with suggestions for improvement in peer grading, application questions, and more diverse discussions.


Presentation Transcript


  1. CMS 165 Lecture 18 Summing up..

  2. Most surprising/interesting things learned in this class
     • Bias-variance trade-off in the non-classical setting
     • Over-parametrization regime, where the traditional bias-variance trade-off no longer behaves as expected (see the first sketch after this list)
     • Robustness as a min-max game (data augmentation + adversarial training; see the second sketch after this list)
     • Decoupling the uncertainties in errors (active learning, fairness)
     • Symmetries in non-convex optimization can be exploited
     • Non-isolated optimal points (they live on a manifold)
     • Generalization can be improved with robust training
     • Data collection and the importance of a good train/test split
     • Role of local and global regularization
     • Sample complexity results, and why they are not straightforward in deep learning
     • Stability: the effect of a single data point on the learned model
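Two of these points can be made concrete with small, self-contained sketches. First, a minimal random-features regression experiment (an illustrative assumption, not code from the lecture) that typically reproduces the non-classical, double-descent shape: test error rises as the number of features approaches the number of training samples, then falls again in the over-parametrized regime.

```python
import numpy as np

# Illustrative sketch (not from the lecture): test error of minimum-norm
# least squares on random ReLU features, as the feature count crosses the
# training-sample count. All sizes and the noise level are assumptions.
rng = np.random.default_rng(0)
n_train, n_test, d = 100, 1000, 20
w_true = rng.normal(size=d)

def make_data(n):
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.5 * rng.normal(size=n)   # noisy linear target
    return X, y

X_tr, y_tr = make_data(n_train)
X_te, y_te = make_data(n_test)

W = rng.normal(size=(d, 2000))                          # fixed random projection
feats = lambda X, p: np.maximum(X @ W[:, :p], 0.0)      # p random ReLU features

for p in [10, 50, 90, 100, 110, 200, 1000, 2000]:
    Phi_tr, Phi_te = feats(X_tr, p), feats(X_te, p)
    # lstsq returns the minimum-norm solution when p > n_train (interpolation)
    theta, *_ = np.linalg.lstsq(Phi_tr, y_tr, rcond=None)
    err = np.mean((Phi_te @ theta - y_te) ** 2)
    print(f"features={p:5d}  test MSE={err:.3f}")
```

Second, a sketch of robust training as a min-max game: the inner maximization finds a worst-case perturbation of the input, and the outer minimization updates the model on that perturbed input. The inner problem is approximated here with a single FGSM step; the model, optimizer, loss, and epsilon are placeholder assumptions, not the course's setup.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.1):
    """One min-max step: inner max via a single FGSM step, outer min via SGD.

    `model`, `optimizer`, and `epsilon` are illustrative placeholders.
    """
    # Inner maximization: a perturbation delta with ||delta||_inf <= epsilon
    # that approximately maximizes the loss.
    x_adv = x.clone().detach().requires_grad_(True)
    loss_inner = F.cross_entropy(model(x_adv), y)
    loss_inner.backward()
    delta = epsilon * x_adv.grad.sign()
    x_adv = (x + delta).detach()

    # Outer minimization: update the model on the perturbed inputs.
    optimizer.zero_grad()
    loss_outer = F.cross_entropy(model(x_adv), y)
    loss_outer.backward()
    optimizer.step()
    return loss_outer.item()
```

In practice the inner maximization is usually run for several projected-gradient steps rather than one; the single step here just keeps the min-max structure visible.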

  3. Missing pieces (not covered in class)
     • Interpretability
     • Applications discussed were mostly standard ones; few recent deep-learning applications of the established results
     • Reinforcement learning / lifelong learning
     • Likelihood on held-out test data after training
     • GANs
     • Negative results and the tricks needed to achieve good results
     • Explicit discussion and categorization of types of learning algorithms
     • Decision trees / classical ML techniques
     • Recent neural network architectures and how their computation scales
     • Sequence models, autoencoders, and embeddings
     • Causal inference (beyond Bayesian networks?)

  4. Other comments
     • Peer grading: not much was learned from it (perhaps reserve it for the open-ended questions?)
     • Perhaps different application questions
     • Paper-to-code style questions
     • More applications and discussion of tensors
     • Share the best homework submissions with students instead of randomly chosen ones
     • More recitations on both theory and applications
