
CS 461: Machine Learning Lecture 4

CS 461: Machine Learning, Lecture 4. Dr. Kiri Wagstaff (kiri.wagstaff@calstatela.edu). Plan for Today: solution to HW 2; Support Vector Machines (review of classification; non-separable data: the kernel trick; regression); Neural Networks (perceptrons, multilayer perceptrons, backpropagation).




Presentation Transcript


  1. CS 461: Machine Learning, Lecture 4 Dr. Kiri Wagstaff kiri.wagstaff@calstatela.edu CS 461, Winter 2008

  2. Plan for Today • Solution to HW 2 • Support Vector Machines • Review of Classification • Non-separable data: the Kernel Trick • Regression • Neural Networks • Perceptrons • Multilayer Perceptrons • Backpropagation CS 461, Winter 2008

  3. Review from Lecture 3 • Decision trees • Regression trees, pruning • Evaluation • One classifier: • Errors, confidence intervals, significance • Baselines • Comparing two classifiers: McNemar’s test • Cross-validation • Support Vector Machines • Classification • Linear discriminants, maximum margin • Learning (optimization): gradient descent, QP • Non-separable classes CS 461, Winter 2008

  4. Support Vector Machines Chapter 10 CS 461, Winter 2008

  5. Support Vector Machines • Maximum-margin linear classifiers • Support vectors • How to find best w, b? Optimization in action: SVM applet CS 461, Winter 2008

  6. What if Data isn’t Linearly Separable? • Add “slack” variables to permit some errors • [Andrew Moore’s slides] • Embed data in higher-dimensional space • Explicit: Basis functions (new features) • Implicit: Kernel functions (new dot product) • Still need only a linear separator (a hyperplane) in the new space CS 461, Winter 2008

  7. Build a Better Dot Product: Basis Functions • Preprocess input x by basis functions: z = φ(x) • Then learn a linear model in z-space: g(z) = wᵀz, i.e., g(x) = wᵀφ(x) CS 461, Winter 2008 [Alpaydin 2004 © The MIT Press]
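A minimal sketch of the explicit route: a hypothetical basis function φ maps a 1-D input to [1, x, x²], so the linear model g(z) = wᵀz in z-space is quadratic in the original x. The weight values here are illustrative, not from the lecture.

```python
import numpy as np

# Hypothetical basis functions: map 1-D input x to z = phi(x) = [1, x, x^2],
# so a model linear in z-space, g(z) = w^T z, is quadratic in the original x.
def phi(x):
    return np.array([1.0, x, x * x])

w = np.array([0.5, -1.0, 2.0])  # example weight vector (illustrative values)

def g(x):
    return w @ phi(x)  # g(x) = w^T phi(x)

print(g(3.0))  # 0.5 - 3.0 + 18.0 = 15.5
```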

  8. Build a Better Dot Product: Basis Functions → Kernel Functions • Linear • Polynomial • RBF • Sigmoid Optimization in action: SVM applet CS 461, Winter 2008 [Alpaydin 2004 © The MIT Press]
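The implicit route (the kernel trick) can be checked numerically: for the degree-2 polynomial kernel K(x, y) = (x·y + 1)² on 2-D inputs, the corresponding feature map is known in closed form, and the kernel equals the dot product in that 6-D feature space without ever building the features.

```python
import numpy as np

# The kernel trick: a kernel computes a dot product in feature space without
# constructing the features. For the degree-2 polynomial kernel on 2-D inputs,
# the implicit feature map phi is known in closed form.
def poly_kernel(x, y):
    return (x @ y + 1.0) ** 2

def phi(x):  # explicit degree-2 feature map (6 dimensions)
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 * x1, x2 * x2, s * x1 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
print(poly_kernel(x, y), phi(x) @ phi(y))  # identical values (both 4.0)
```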

  9. Example: Orbital Classification • Linear SVM flying on the EO-1 Earth Orbiter since Dec. 2004 • Classify every pixel • Four classes: ice, water, land, snow • 12 features (of 256 collected) (Figure: Hyperion image and its classified output) CS 461, Winter 2008 [Castano et al., 2005]

  10. SVM Regression • Use a linear model (possibly kernelized) f(x) = wᵀx + w0 • Use the ε-insensitive error function CS 461, Winter 2008
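The ε-insensitive error can be sketched directly: the loss is zero inside a tube of half-width ε around the prediction and grows linearly outside it. The ε value below is an illustrative default.

```python
import numpy as np

# The epsilon-insensitive error used by SVM regression: zero inside a tube of
# half-width eps around the target, linear in the residual outside it.
def eps_insensitive(y_true, y_pred, eps=0.1):
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

print(eps_insensitive(1.0, 1.05))  # 0.0: inside the tube, no penalty
print(eps_insensitive(1.0, 1.5))   # 0.4: residual 0.5 minus eps
```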

  11. Neural Networks Chapter 11 CS 461, Winter 2008

  12. Perceptron • Graphical and mathematical views CS 461, Winter 2008 [Alpaydin 2004 © The MIT Press]

  13. Perceptrons in Action • Regression: y = wx + w0 • Classification: y = 1(wx + w0 > 0) (Figure: regression line, step output, and sigmoid output; x0 = +1 is the bias input) CS 461, Winter 2008 [Alpaydin 2004 © The MIT Press]

  14. “Smooth” Output: Sigmoid Function • Why? • Converts output to probability! • Less “brittle” boundary CS 461, Winter 2008
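A one-liner makes the point concrete: the sigmoid squashes the perceptron's raw score wᵀx + w0 into (0, 1), so the output can be read as P(class = 1 | x).

```python
import math

# Sigmoid: squash the perceptron's raw score into (0, 1) so it can be
# interpreted as a probability, with a smooth rather than brittle boundary.
def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

print(sigmoid(0.0))   # 0.5: exactly on the decision boundary
print(sigmoid(4.0))   # ~0.982: confidently class 1
```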

  15. K Outputs • Regression • Classification: softmax CS 461, Winter 2008 [Alpaydin 2004 © The MIT Press]
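For K outputs in classification, softmax exponentiates each score and normalizes, yielding a probability distribution over the K classes. A small sketch (subtracting the max score first is a standard numerical-stability trick, not part of the slide):

```python
import numpy as np

# Softmax over K output scores: exponentiate and normalize so the K values
# form a probability distribution. Subtracting the max avoids overflow.
def softmax(scores):
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())  # probabilities sum to 1; the largest score wins
```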

  16. Training a Neural Network • Randomly initialize weights • Update = Learning rate * (Desired - Actual) * Input CS 461, Winter 2008
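The update rule above can be written out directly. Weights, inputs, and learning rate here are illustrative values; x0 = +1 carries the bias, as on the earlier slide.

```python
# One perceptron weight update, exactly the rule on the slide:
# update = learning_rate * (desired - actual) * input.
def update_weights(w, x, desired, actual, lr=0.1):
    return [wi + lr * (desired - actual) * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0, 0.0]   # [bias weight, w1, w2], randomly initialized in practice
x = [1.0, 1.0, 1.0]   # x0 = +1 is the bias input
w = update_weights(w, x, desired=1, actual=0)
print(w)  # [0.1, 0.1, 0.1]
```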

  17. Learning Boolean AND Perceptron demo CS 461, Winter 2008 [Alpaydin 2004 © The MIT Press]
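The same update rule, looped over the four AND examples, converges because AND is linearly separable. Learning rate, epoch count, and initial weights below are illustrative choices.

```python
# A perceptron trained on Boolean AND with the update rule from the previous
# slide. AND is linearly separable, so training converges.
def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

# (x0 = +1 bias input, x1, x2) -> desired output
data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
w = [0.0, 0.0, 0.0]
for _ in range(20):  # a few epochs suffice for AND
    for x, desired in data:
        err = desired - predict(w, x)
        w = [wi + 0.1 * err * xi for wi, xi in zip(w, x)]

print([predict(w, x) for x, _ in data])  # [0, 0, 0, 1]
```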

  18. Multilayer Perceptrons = MLP = ANN CS 461, Winter 2008 [Alpaydin 2004 © The MIT Press]

  19. x1 XOR x2 = (x1 AND ~x2) OR (~x1 AND x2) CS 461, Winter 2008 [Alpaydin 2004 © The MIT Press]
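That decomposition can be wired up by hand as a two-layer perceptron with step units: one hidden unit per AND term and an OR at the output. The thresholds below are hand-picked to implement those gates; a trained MLP would learn equivalent weights.

```python
# XOR as a hand-built two-layer perceptron, following the slide's
# decomposition: h1 = x1 AND NOT x2, h2 = NOT x1 AND x2, output = h1 OR h2.
def step(a):
    return 1 if a > 0 else 0

def xor(x1, x2):
    h1 = step(x1 - x2 - 0.5)    # fires only for (1, 0)
    h2 = step(x2 - x1 - 0.5)    # fires only for (0, 1)
    return step(h1 + h2 - 0.5)  # OR of the hidden units

print([xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

A single perceptron cannot represent XOR (it is not linearly separable); the hidden layer is what makes it possible.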

  20. Backpropagation: MLP training CS 461, Winter 2008

  21. Backpropagation: Regression Backward Forward x CS 461, Winter 2008

  22. Backpropagation: Classification Backward Forward x CS 461, Winter 2008

  23. Backpropagation Algorithm CS 461, Winter 2008
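The backward pass is just the chain rule applied layer by layer, and it can be verified against a numerical gradient. Below is a minimal sketch on a tiny network (2 inputs, 2 sigmoid hidden units, 1 linear output, squared error); all weight values are fixed illustrative numbers so the check is deterministic.

```python
import numpy as np

# Backpropagation on a tiny network, checked against a finite-difference
# gradient. E = 0.5*(o - t)^2, o = h . W2, h = sigmoid(x . W1).
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

x = np.array([1.0, -2.0]); t = 0.5                       # one training example
W1 = np.array([[0.1, -0.3], [0.2, 0.4]])                 # input -> hidden
W2 = np.array([0.5, -0.6])                               # hidden -> output

def loss(W1):
    h = sigmoid(x @ W1)
    return 0.5 * (h @ W2 - t) ** 2

# Backward pass: propagate the error derivative layer by layer
h = sigmoid(x @ W1)
err = h @ W2 - t                # dE/d_output
d_h = err * W2 * h * (1 - h)    # through output weights, then the sigmoid
grad_W1 = np.outer(x, d_h)      # dE/dW1 (chain rule down to the first layer)

# Numerical check of one weight's gradient
eps = 1e-6
W1p = W1.copy(); W1p[0, 1] += eps
num = (loss(W1p) - loss(W1)) / eps
print(grad_W1[0, 1], num)       # the analytic and numeric gradients agree
```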

  24. Examples • Digit Recognition • Ball Balancing CS 461, Winter 2008

  25. ANN vs. SVM • SVM with sigmoid kernel = 2-layer MLP • Parameters • ANN: # hidden layers, # nodes • SVM: kernel, kernel params, C • Optimization • ANN: local minimum (gradient descent) • SVM: global minimum (QP) • Interpretability? About the same… • So why SVMs? • Sparse solution, geometric interpretation, less likely to overfit data CS 461, Winter 2008

  26. Summary: Key Points for Today • Support Vector Machines • Non-separable data: basis functions, the Kernel Trick • Regression • Neural Networks • Perceptrons • Sigmoid • Training by gradient descent • Multilayer Perceptrons • Backpropagation (gradient descent) • ANN vs. SVM CS 461, Winter 2008

  27. Review for Midterm CS 461, Winter 2008

  28. Next Time • Midterm Exam! • 9:00 - 10:30 a.m. • Open book, open notes (no computer) • Covers all material through today • Bayesian Methods (read Ch. 3.1, 3.2, 3.7, 3.9) • If you’re rusty on probability, read Appendix A • Questions to answer from the reading • Posted on the website (calendar) CS 461, Winter 2008
