
Support Vector Machine II



Presentation Transcript


  1. Support Vector Machine II. Jia-Bin Huang, Virginia Tech. ECE-5424G / CS-5824, Spring 2019

  2. Administrative • Please use Piazza; no emails. • HW 2 released.

  3. Support Vector Machine • Cost function • Large margin classification • Kernels • Using an SVM

  4. Support vector machine • If $y = 1$, we want $\theta^T x \geq 1$ (not just $\geq 0$). • If $y = 0$, we want $\theta^T x \leq -1$ (not just $\leq 0$). Slide credit: Andrew Ng

  5. SVM decision boundary • Let’s say we have a very large $C$. • Whenever $y^{(i)} = 1$: we need $\theta^T x^{(i)} \geq 1$. • Whenever $y^{(i)} = 0$: we need $\theta^T x^{(i)} \leq -1$. Slide credit: Andrew Ng

  6. SVM decision boundary: Linearly separable case Slide credit: Andrew Ng

  7. SVM decision boundary: Linearly separable case (figure: separating hyperplane with margin) Slide credit: Andrew Ng

  8. Why large margin classifiers? (figure: margin between the two classes)

  9. Vector inner product • $u^T v = p \cdot \|u\|$, where $\|u\|$ is the length of the vector $u$ and $p$ is the (signed) length of the projection of $v$ onto $u$. Slide credit: Andrew Ng

  10. SVM decision boundary • Simplification: set $\theta_0 = 0$. • $\min_\theta \frac{1}{2}\sum_j \theta_j^2$ s.t. $\theta^T x^{(i)} \geq 1$ if $y^{(i)} = 1$ and $\theta^T x^{(i)} \leq -1$ if $y^{(i)} = 0$. • What’s $\theta^T x^{(i)}$? It equals $p^{(i)} \cdot \|\theta\|$, where $p^{(i)}$ is the projection of $x^{(i)}$ onto $\theta$. Slide credit: Andrew Ng

  11. SVM decision boundary • Simplification: $\theta_0 = 0$. • If the projections $p^{(i)}$ are small, $\|\theta\|$ must be large to satisfy the constraints; if the $p^{(i)}$ are large, $\|\theta\|$ can be small. • Minimizing $\|\theta\|$ therefore favors large projections $p^{(i)}$, i.e., a large margin. Slide credit: Andrew Ng

  12. Rewrite the formulation • Let $y \in \{-1, +1\}$ and write the classifier as $w^T x + b$: $\min_{w,b} \frac{1}{2}\|w\|^2$ s.t. $y^{(i)}(w^T x^{(i)} + b) \geq 1$ for all $i$, with margin $2/\|w\|$.

  13. Data not linearly separable? • If the data are not linearly separable, no $(w, b)$ satisfies all constraints; directly minimizing the number of violated constraints is NP-hard.

  14. Convex relaxation • Instead of the NP-hard problem, introduce slack variables $\xi_i \geq 0$: $\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C\sum_i \xi_i$ s.t. $y^{(i)}(w^T x^{(i)} + b) \geq 1 - \xi_i$.

  15. Hinge loss • $\ell(y, f(x)) = \max(0, 1 - y\,f(x))$: zero for points beyond the margin ($y\,f(x) \geq 1$), linear penalty inside the margin or on the wrong side. Image credit: https://math.stackexchange.com/questions/782586/how-do-you-minimize-hinge-loss
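A minimal numeric sketch of the hinge loss (illustrative; not part of the original slides), for labels y in {-1, +1}:

```python
import numpy as np

def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y * score) for labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * score)

# Points beyond the margin incur zero loss; points inside the margin or on
# the wrong side of the boundary are penalized linearly.
scores = np.array([2.0, 1.0, 0.5, 0.0, -1.0])
print(hinge_loss(+1, scores))  # [0.  0.  0.5 1.  2. ]
```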

  16. Hard-margin SVM formulation: $\min_{w,b} \frac{1}{2}\|w\|^2$ s.t. $y^{(i)}(w^T x^{(i)} + b) \geq 1$. Soft-margin SVM formulation: $\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C\sum_i \xi_i$ s.t. $y^{(i)}(w^T x^{(i)} + b) \geq 1 - \xi_i$, $\xi_i \geq 0$. (A training sketch on this objective follows below.)
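As a rough illustration of the soft-margin objective, here is a minimal subgradient-descent sketch on the equivalent hinge-loss form $\frac{1}{2}\|w\|^2 + C\sum_i \max(0, 1 - y^{(i)}(w^T x^{(i)} + b))$. The function name, learning rate, and epoch count are illustrative choices, not from the slides:

```python
import numpy as np

def svm_subgradient(X, y, C=1.0, lr=0.01, epochs=200):
    """Soft-margin linear SVM via batch subgradient descent; y in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                    # examples violating the margin
        grad_w = w - C * (y[active][:, None] * X[active]).sum(axis=0)
        grad_b = -C * y[active].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy linearly separable data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([-1] * 20 + [+1] * 20)
w, b = svm_subgradient(X, y)
print("train accuracy:", np.mean(np.sign(X @ w + b) == y))
```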

  17. Support Vector Machine • Cost function • Large margin classification • Kernels • Using an SVM

  18. Non-linear classification • How do we separate the two classes using a hyperplane?

  19. Non-linear classification

  20. Kernel • A kernel $K(x, z)$ is a legal definition of an inner product: there exists a feature map $\phi$ s.t. $K(x, z) = \phi(x)^T \phi(z)$.

  21. Why do kernels matter? • Many algorithms interact with data only via dot products. • Replace $x^T z$ with $K(x, z) = \phi(x)^T \phi(z)$. • The algorithm then acts implicitly as if the data were in the higher-dimensional $\phi$-space.

  22. Example • $K(x, z) = (x^T z)^2$ for $x, z \in \mathbb{R}^2$ corresponds to the feature map $\phi(x) = (x_1^2, x_2^2, \sqrt{2}\,x_1 x_2)$.

  23. Example • More generally, $K(x, z) = (x^T z)^d$ corresponds to a feature map whose entries are the degree-$d$ monomials of $x$. Slide credit: Maria-Florina Balcan
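Assuming the quadratic-kernel example above, a quick numeric check that the kernel really equals the inner product of the explicit feature maps:

```python
import numpy as np

def phi(x):
    """Explicit feature map for the quadratic kernel on R^2."""
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

def k_quad(x, z):
    """Quadratic kernel K(x, z) = (x^T z)^2."""
    return (x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print(k_quad(x, z), phi(x) @ phi(z))  # both equal (1*3 + 2*(-1))^2 = 1.0
```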

  24. Example kernels • Linear kernel: $K(x, z) = x^T z$ • Gaussian (Radial basis function) kernel: $K(x, z) = \exp\left(-\frac{\|x - z\|^2}{2\sigma^2}\right)$ • Sigmoid kernel: $K(x, z) = \tanh(\alpha\, x^T z + c)$
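The three example kernels written out in code (the sigmoid-kernel parameters alpha and c are generic placeholders):

```python
import numpy as np

def linear_kernel(x, z):
    return x @ z

def gaussian_kernel(x, z, sigma=1.0):
    """RBF kernel exp(-||x - z||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

def sigmoid_kernel(x, z, alpha=0.5, c=0.0):
    """Sigmoid kernel tanh(alpha * x^T z + c)."""
    return np.tanh(alpha * (x @ z) + c)

x = np.array([1.0, 2.0])
z = np.array([0.5, -1.0])
print(linear_kernel(x, z), gaussian_kernel(x, z), sigmoid_kernel(x, z))
```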

  25. Constructing new kernels • If $K_1$ and $K_2$ are kernels, so are: • Positive scaling: $c\,K_1(x, z)$, $c > 0$ • Exponentiation: $\exp(K_1(x, z))$ • Addition: $K_1(x, z) + K_2(x, z)$ • Multiplication with a function: $f(x)\,K_1(x, z)\,f(z)$ • Multiplication: $K_1(x, z)\,K_2(x, z)$ (a numeric check follows below)
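A quick numeric sanity check (not a proof) of these closure rules: Gram matrices built by each construction should stay positive semidefinite, up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))

K1 = X @ X.T                                          # linear-kernel Gram matrix
D = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K2 = np.exp(-D / 2.0)                                 # Gaussian-kernel Gram matrix
f = np.cos(X[:, 0])                                   # an arbitrary function f(x)

candidates = {
    "3 * K1": 3 * K1,                 # positive scaling
    "exp(K1)": np.exp(K1),            # exponentiation (elementwise)
    "K1 + K2": K1 + K2,               # addition
    "f K1 f": np.outer(f, f) * K1,    # multiplication with a function f
    "K1 * K2": K1 * K2,               # multiplication (Schur product)
}
for name, K in candidates.items():
    # Minimum eigenvalue should be >= 0 up to roundoff (~ -1e-10)
    print(name, "min eigenvalue:", np.linalg.eigvalsh(K).min())
```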

  26. Non-linear decision boundary • Predict $y = 1$ if $\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1 x_2 + \theta_4 x_1^2 + \theta_5 x_2^2 + \cdots \geq 0$, i.e., $\theta_0 + \theta_1 f_1 + \theta_2 f_2 + \cdots \geq 0$ with hand-crafted features $f_1 = x_1$, $f_2 = x_2$, $f_3 = x_1 x_2$, ... • Is there a different/better choice of the features $f_1, f_2, f_3, \ldots$? Slide credit: Andrew Ng

  27. Kernel • Given $x$, compute new features depending on proximity to landmarks $l^{(1)}, l^{(2)}, l^{(3)}$: $f_i = \mathrm{similarity}(x, l^{(i)}) = \exp\left(-\frac{\|x - l^{(i)}\|^2}{2\sigma^2}\right)$ (Gaussian kernel; a small sketch follows below). Slide credit: Andrew Ng
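A small sketch of the landmark features from this slide; the landmark coordinates here are made up for illustration:

```python
import numpy as np

def landmark_features(x, landmarks, sigma=1.0):
    """f_i = exp(-||x - l_i||^2 / (2 sigma^2)) for each landmark l_i."""
    diffs = landmarks - x                       # broadcast over landmarks
    return np.exp(-np.sum(diffs ** 2, axis=1) / (2 * sigma ** 2))

landmarks = np.array([[3.0, 5.0], [0.0, 0.0], [5.0, 1.0]])  # l1, l2, l3 (illustrative)
x = np.array([3.1, 4.9])                                    # close to l1
print(landmark_features(x, landmarks))  # f1 near 1, f2 and f3 near 0
```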

  28. Predict $y = 1$ if $\theta_0 + \theta_1 f_1 + \theta_2 f_2 + \theta_3 f_3 \geq 0$. Ex: $\theta_0 = -0.5$, $\theta_1 = 1$, $\theta_2 = 1$, $\theta_3 = 0$: points near $l^{(1)}$ or $l^{(2)}$ are predicted positive. Slide credit: Andrew Ng

  29. Choosing the landmarks • Given $x$: compute $f_i = \mathrm{similarity}(x, l^{(i)})$; predict $y = 1$ if $\theta^T f \geq 0$. • Where do we get $l^{(1)}, l^{(2)}, l^{(3)}, \ldots$? Place one landmark at every training example: $l^{(i)} = x^{(i)}$. Slide credit: Andrew Ng

  30. SVM with kernels • Given $(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})$ • Choose landmarks $l^{(1)} = x^{(1)}, l^{(2)} = x^{(2)}, \ldots, l^{(m)} = x^{(m)}$ • Given example $x$: compute $f_i = \mathrm{similarity}(x, l^{(i)})$, giving $f \in \mathbb{R}^{m+1}$ (with $f_0 = 1$) • For training example $(x^{(i)}, y^{(i)})$: map $x^{(i)}$ to its feature vector $f^{(i)}$. Slide credit: Andrew Ng

  31. SVM with kernels • Hypothesis: given $x$, compute features $f \in \mathbb{R}^{m+1}$ • Predict $y = 1$ if $\theta^T f \geq 0$ • Training (original): $\min_\theta C \sum_{i=1}^m \left[ y^{(i)} \mathrm{cost}_1(\theta^T x^{(i)}) + (1 - y^{(i)}) \mathrm{cost}_0(\theta^T x^{(i)}) \right] + \frac{1}{2}\sum_j \theta_j^2$ • Training (with kernel): replace $\theta^T x^{(i)}$ by $\theta^T f^{(i)}$ (a runnable sketch follows below)
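Putting slides 30 and 31 together: place a landmark at every training example, map each example to its Gaussian-similarity feature vector, and train a regularized linear classifier on the f-features. The sketch below uses scikit-learn's LinearSVC as the linear solver; the toy data, sigma, and C are illustrative, and LinearSVC's default squared-hinge loss stands in for the cost on the slide:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Toy data: class 1 inside a disk, class 0 outside (not linearly separable in x-space)
X = rng.uniform(-3, 3, size=(200, 2))
y = (np.sum(X ** 2, axis=1) < 4).astype(int)

sigma = 1.0
landmarks = X                                              # l^(i) = x^(i)
D = np.sum((X[:, None, :] - landmarks[None, :, :]) ** 2, axis=-1)
F = np.exp(-D / (2 * sigma ** 2))                          # F[i, j] = f_j(x^(i))

clf = LinearSVC(C=1.0, max_iter=10000).fit(F, y)           # linear SVM in f-space
print("train accuracy:", clf.score(F, y))
```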

  32. Support vector machines (Primal/Dual) • Primal form: $\min_{w,b} \frac{1}{2}\|w\|^2$ s.t. $y^{(i)}(w^T x^{(i)} + b) \geq 1$ • Lagrangian dual form: $\max_\alpha \sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y^{(i)} y^{(j)} x^{(i)\top} x^{(j)}$ s.t. $\alpha_i \geq 0$, $\sum_i \alpha_i y^{(i)} = 0$

  33. SVM (Lagrangian dual) • Classifier: $f(x) = \mathrm{sign}\left(\sum_i \alpha_i y^{(i)} x^{(i)\top} x + b\right)$ • The points for which $\alpha_i > 0$ are the support vectors • To kernelize, replace $x^{(i)\top} x$ with $K(x^{(i)}, x)$
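In scikit-learn's SVC (which solves this Lagrangian dual via libsvm), the support vectors and the coefficients alpha_i * y_i are exposed after fitting; a short sketch with made-up data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (30, 2)), rng.normal(1, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

clf = SVC(kernel="rbf", C=1.0, gamma=0.5).fit(X, y)
print("number of support vectors:", len(clf.support_))    # points with alpha_i > 0
print("dual coefficients (alpha_i * y_i):", clf.dual_coef_)
```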

  34. SVM parameters • $C$ ($= 1/\lambda$). Large $C$: lower bias, higher variance. Small $C$: higher bias, lower variance. • $\sigma^2$. Large $\sigma^2$: features $f_i$ vary more smoothly; higher bias, lower variance. Small $\sigma^2$: features $f_i$ vary less smoothly; lower bias, higher variance. Slide credit: Andrew Ng

  35. SVM Demo • https://cs.stanford.edu/people/karpathy/svmjs/demo/

  36. SVM song • https://www.youtube.com/watch?v=g15bqtyidZs

  37. Support Vector Machine • Cost function • Large margin classification • Kernels • Using an SVM

  38. Using SVM • Use an SVM software package (e.g., liblinear, libsvm) to solve for $\theta$ • Need to specify: • Choice of parameter $C$ • Choice of kernel (similarity function): • Linear kernel (no kernel): predict $y = 1$ if $\theta^T x \geq 0$ • Gaussian kernel: $f_i = \exp\left(-\frac{\|x - l^{(i)}\|^2}{2\sigma^2}\right)$, where $l^{(i)} = x^{(i)}$ • Need to choose $\sigma^2$; need proper feature scaling before using the Gaussian kernel (see the sketch below) Slide credit: Andrew Ng
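The workflow on this slide maps onto scikit-learn (which wraps libsvm/liblinear) roughly as follows. Note that scikit-learn parameterizes the Gaussian kernel with gamma, which plays the role of $1/(2\sigma^2)$, and the StandardScaler handles the feature-scaling requirement; the dataset and parameter values are placeholders:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Gaussian-kernel SVM; gamma corresponds to 1 / (2 * sigma^2).
model = make_pipeline(
    StandardScaler(),                        # proper feature scaling before the RBF kernel
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))

# For a linear kernel (no kernel), use SVC(kernel="linear") or LinearSVC (liblinear).
```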

  39. Kernel (similarity) functions • Note: not all similarity functions make valid kernels. • Many off-the-shelf kernels available: • Polynomial kernel • String kernel • Chi-square kernel • Histogram intersection kernel Slide credit: Andrew Ng

  40. Multi-class classification • Use the one-vs.-all method: train $K$ SVMs, one to distinguish each class $y = i$ from the rest, obtaining $\theta^{(1)}, \ldots, \theta^{(K)}$ • Pick the class $i$ with the largest $(\theta^{(i)})^T x$ (sketch below) Slide credit: Andrew Ng
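A minimal one-vs.-all sketch: train one binary SVM per class and pick the class with the largest decision value. (scikit-learn's LinearSVC already does this internally; the explicit loop just spells out the recipe, and the dataset is illustrative.)

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# One binary SVM per class: class k vs. the rest
models = [LinearSVC(C=1.0, max_iter=10000).fit(X, (y == k).astype(int)) for k in classes]

# Predict the class whose classifier gives the largest decision value theta_k^T x
scores = np.column_stack([m.decision_function(X) for m in models])
pred = classes[np.argmax(scores, axis=1)]
print("train accuracy:", np.mean(pred == y))
```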

  41. Logistic regression vs. SVMs • $n$ = number of features ($x \in \mathbb{R}^{n+1}$), $m$ = number of training examples • If $n$ is large (relative to $m$): use logistic regression, or SVM without a kernel (“linear kernel”) • If $n$ is small and $m$ is intermediate: use SVM with a Gaussian kernel • If $n$ is small and $m$ is large: create/add more features, then use logistic regression or a linear SVM • A neural network is likely to work well for most of these cases, but is slower to train. Slide credit: Andrew Ng
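The rules of thumb above, written as a tiny helper; the numeric thresholds are illustrative guesses, not from the slide:

```python
def suggest_model(n_features, m_examples):
    """Rough heuristic following the slide; thresholds are illustrative only."""
    if n_features >= m_examples:
        return "logistic regression or linear SVM (no kernel)"
    if m_examples <= 10_000:
        return "SVM with Gaussian kernel"
    return "create/add more features, then logistic regression or linear SVM"

print(suggest_model(n_features=10_000, m_examples=1_000))   # n large relative to m
print(suggest_model(n_features=100, m_examples=5_000))      # n small, m intermediate
print(suggest_model(n_features=50, m_examples=1_000_000))   # n small, m large
```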

  42. Things to remember • Cost function • Large margin classification • Kernels • Using an SVM
