
Presentation Transcript


  1. CH. 13: Kernel Machines (A) Support Vector Machine (SVM) -- classifier, feedforward neural network, supervised learning. Limitations of the basic SVM: i) it is a binary classifier; ii) it requires linearly separable patterns.

  2. SVM finds the optimal separating hyperplane (OSH) with the maximal margin between two support hyperplanes, which are formed by support vectors.

  3. Data points: $\{(\mathbf{x}^t, r^t)\}$, where $r^t = +1$ if $\mathbf{x}^t \in C_1$ and $r^t = -1$ if $\mathbf{x}^t \in C_2$. Let the equation of the OSH be $\mathbf{w}^T\mathbf{x} + b = 0$, where $\mathbf{w}$: normal vector, points toward the positive data; $|b|/\|\mathbf{w}\|$: distance of the hyperplane to the origin. e.g., in 2-D the OSH is the line $w_1 x_1 + w_2 x_2 + b = 0$.
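As a quick numeric check of these definitions, here is a minimal sketch (NumPy assumed; the hyperplane and test point are made up for illustration):

```python
import numpy as np

# Hypothetical hyperplane w^T x + b = 0 in 2-D (w, b chosen for illustration).
w = np.array([3.0, 4.0])   # normal vector, points toward the positive side
b = -5.0

# Distance from the hyperplane to the origin: |b| / ||w||
print(abs(b) / np.linalg.norm(w))          # 1.0

# Signed distance of a point x: (w^T x + b) / ||w||  (positive side iff > 0)
x = np.array([3.0, 4.0])
print((w @ x + b) / np.linalg.norm(w))     # 4.0 -> x lies on the positive side
```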

  4. Let $\mathbf{w}^T\mathbf{x} + b = +1$ and $\mathbf{w}^T\mathbf{x} + b = -1$ be the support hyperplanes. The distance between them is $2/\|\mathbf{w}\|$. Then, for positive examples, $\mathbf{w}^T\mathbf{x}^t + b \geq +1$; rewrite as $r^t(\mathbf{w}^T\mathbf{x}^t + b) \geq 1$ with $r^t = +1$. Likewise, for negative examples $\mathbf{w}^T\mathbf{x}^t + b \leq -1$, which gives the same inequality with $r^t = -1$. Margin: $2/\|\mathbf{w}\|$.

  5. Maximizing the margin $2/\|\mathbf{w}\|$ is equivalent to minimizing $\frac{1}{2}\|\mathbf{w}\|^2$, subject to $r^t(\mathbf{w}^T\mathbf{x}^t + b) \geq 1,\ \forall t$. The data points satisfying the equality $r^t(\mathbf{w}^T\mathbf{x}^t + b) = 1$ are called support vectors. Lagrange multiplier method -- converts a constrained problem to an unconstrained one.

  6. The objective function (primal Lagrangian): $L_p = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_t \alpha^t\left[r^t(\mathbf{w}^T\mathbf{x}^t + b) - 1\right]$, with multipliers $\alpha^t \geq 0$. The optimal solution is given by the saddle point of $L_p$, which is minimized w.r.t. $\mathbf{w}$ and $b$ while maximized w.r.t. $\alpha^t$, i.e., $\min_{\mathbf{w},b}\,\max_{\alpha \geq 0} L_p$. Through the Karush-Kuhn-Tucker (KKT) conditions, $L$, defined in the primal space of $\mathbf{w}, b$, is translated to the dual space of $\alpha^t$.

  7. $\partial L_p/\partial \mathbf{w} = 0 \Rightarrow \mathbf{w} = \sum_t \alpha^t r^t \mathbf{x}^t$ --- (A); $\partial L_p/\partial b = 0 \Rightarrow \sum_t \alpha^t r^t = 0$ --- (B); $\alpha^t\left[r^t(\mathbf{w}^T\mathbf{x}^t + b) - 1\right] = 0,\ \forall t$ --- (C).

  8. From (B), $\sum_t \alpha^t r^t = 0$; from (A), $\mathbf{w} = \sum_t \alpha^t r^t \mathbf{x}^t$. Substituting into $L_p$, the problem becomes the dual: maximize $L_d = \sum_t \alpha^t - \frac{1}{2}\sum_t\sum_s \alpha^t \alpha^s r^t r^s (\mathbf{x}^t)^T\mathbf{x}^s$ subject to $\sum_t \alpha^t r^t = 0$ and $\alpha^t \geq 0$.

  9. After solving for the $\alpha^t$ (a quadratic programming problem), find $\mathbf{w}$ by (A). From (C), support vectors are those whose $\alpha^t > 0$; for non-support vectors, $\alpha^t = 0$. Determine $b$ using any support vector: for a support vector $\mathbf{x}^t$, $r^t(\mathbf{w}^T\mathbf{x}^t + b) = 1$, so $b = r^t - \mathbf{w}^T\mathbf{x}^t$; in practice, average over all support vectors, $b = \frac{1}{N_{SV}}\sum_{t \in SV}\left(r^t - \mathbf{w}^T\mathbf{x}^t\right)$, where $N_{SV}$ = # support vectors.
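The derivation above can be checked end to end with an off-the-shelf solver. A minimal sketch, assuming scikit-learn and a made-up separable toy set; a very large C approximates the hard-margin case:

```python
import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable toy set (made up for illustration); r^t in {+1, -1}.
X = np.array([[1.0, 1.0], [2.0, 2.5], [0.5, 2.0],
              [4.0, 4.0], [5.0, 3.5], [4.5, 5.0]])
r = np.array([-1, -1, -1, +1, +1, +1])

# A very large C approximates the hard-margin SVM.
clf = SVC(kernel="linear", C=1e6).fit(X, r)

w = clf.coef_[0]                 # w = sum_t alpha^t r^t x^t, condition (A)
sv = clf.support_vectors_        # support vectors: points with alpha^t > 0

# Recover b by averaging r^t - w^T x^t over the support vectors (slide 9).
r_sv = r[clf.support_]
b = np.mean(r_sv - sv @ w)
print(w, b, clf.intercept_[0])   # b agrees with the solver's own intercept
```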

  10. Overlapping patterns: the patterns that violate $r^t(\mathbf{w}^T\mathbf{x}^t + b) \geq 1$. Soft margin: redefine the constraint as $r^t(\mathbf{w}^T\mathbf{x}^t + b) \geq 1 - \xi^t$, with $\xi^t \geq 0$: slack variables. Two ways of violation: $0 < \xi^t \leq 1$ (correctly classified but inside the margin) and $\xi^t > 1$ (misclassified).

  11. Problem: find a separating hyperplane for which (i) $\frac{1}{2}\|\mathbf{w}\|^2$ is minimal; (ii) $r^t(\mathbf{w}^T\mathbf{x}^t + b) \geq 1 - \xi^t,\ \xi^t \geq 0$; (iii) the soft error $\sum_t \xi^t$ is minimal. Lagrange objective function in the primal space: $L_p = \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_t \xi^t - \sum_t \alpha^t\left[r^t(\mathbf{w}^T\mathbf{x}^t + b) - 1 + \xi^t\right] - \sum_t \mu^t \xi^t$, where $C$: penalty factor.

  12. Through the KKT conditions, the problem moves to the dual space (the space of $\alpha^t$): maximize $L_d = \sum_t \alpha^t - \frac{1}{2}\sum_t\sum_s \alpha^t \alpha^s r^t r^s (\mathbf{x}^t)^T\mathbf{x}^s$ subject to $\sum_t \alpha^t r^t = 0$ and $0 \leq \alpha^t \leq C$. Different from the separable case in that each $\alpha^t$ is now upper-bounded by $C$.
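A small sketch of the box constraint $0 \leq \alpha^t \leq C$ in practice, assuming scikit-learn (the overlapping toy data are made up; SVC exposes $\alpha^t r^t$ as dual_coef_):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two overlapping Gaussian clouds (made up) so some constraints must be violated.
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(1.5, 1.0, (40, 2))])
r = np.array([-1] * 40 + [+1] * 40)

C = 1.0
clf = SVC(kernel="linear", C=C).fit(X, r)

# dual_coef_ holds alpha^t r^t for the support vectors, so |dual_coef_| = alpha^t.
alphas = np.abs(clf.dual_coef_[0])
print(alphas.max() <= C + 1e-9)   # True: every alpha^t is capped at C
```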

  13. (B) Kernel Machines. 13.5 Kernel Trick. Cover's theorem: nonlinearly separable data can be made linearly separable by mapping them from a low- to a high-dimensional space, e.g., from 2-D to 3-D. $\mathbf{x}$: a vector in the original N-D space.

  14. $\boldsymbol{\phi}(\mathbf{x}) = [\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \ldots]^T$: a set of basis functions that transform $\mathbf{x}$ to a new space, possibly of infinite dimensionality. Let $\mathbf{z} = \boldsymbol{\phi}(\mathbf{x})$. The OSH in the new space: $g(\mathbf{z}) = \mathbf{w}^T\mathbf{z} + b$ --- (1), where $\mathbf{w} = \sum_t \alpha^t r^t \mathbf{z}^t$ --- (2) and $\mathbf{z}^t = \boldsymbol{\phi}(\mathbf{x}^t)$ --- (3).

  15. Substituting (2), (3) into (1): $g(\mathbf{x}) = \sum_t \alpha^t r^t \boldsymbol{\phi}(\mathbf{x}^t)^T\boldsymbol{\phi}(\mathbf{x}) + b$. Let $K(\mathbf{x}^t, \mathbf{x}) = \boldsymbol{\phi}(\mathbf{x}^t)^T\boldsymbol{\phi}(\mathbf{x})$: kernel function. Then $g(\mathbf{x}) = \sum_t \alpha^t r^t K(\mathbf{x}^t, \mathbf{x}) + b$, i.e., the discriminant needs only kernel evaluations, never $\boldsymbol{\phi}$ explicitly.
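The kernelized discriminant can be verified numerically. The sketch below, assuming scikit-learn, recomputes $g(\mathbf{x}) = \sum_t \alpha^t r^t K(\mathbf{x}^t, \mathbf{x}) + b$ by hand from a fitted RBF model and compares it with the library's decision function (data made up):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
r = np.where(np.linalg.norm(X, axis=1) > 1.0, 1, -1)   # nonlinearly separable labels

clf = SVC(kernel="rbf", gamma=0.5, C=10.0).fit(X, r)

# g(x) = sum_t alpha^t r^t K(x^t, x) + b, summed over support vectors only.
K = rbf_kernel(clf.support_vectors_, X, gamma=0.5)     # K(x^t, x)
g_manual = clf.dual_coef_[0] @ K + clf.intercept_[0]

print(np.allclose(g_manual, clf.decision_function(X)))  # True
```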

  16. Mercer conditions: requirements of a kernel function. A kernel function can be considered as a measure of similarity between data points. 1. Symmetric: $K(\mathbf{x}, \mathbf{y}) = K(\mathbf{y}, \mathbf{x})$. 2. $K(\mathbf{x}, \mathbf{x}) \geq 0$. 3. Cauchy-Schwarz: $K(\mathbf{x}, \mathbf{y})^2 \leq K(\mathbf{x}, \mathbf{x})\,K(\mathbf{y}, \mathbf{y})$. 4. Positive semidefinite: for any finite set of points, the Gram matrix $[K(\mathbf{x}^t, \mathbf{x}^s)]$ is positive semidefinite.
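A minimal numeric check of conditions 1 and 4 on a sample Gram matrix, assuming NumPy (the points and kernel width are made up):

```python
import numpy as np

def rbf(x, y, s=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 s^2))
    return np.exp(-np.sum((x - y) ** 2) / (2 * s ** 2))

rng = np.random.default_rng(2)
pts = rng.normal(size=(30, 3))
G = np.array([[rbf(a, b) for b in pts] for a in pts])   # Gram matrix

print(np.allclose(G, G.T))                    # symmetric
print(np.linalg.eigvalsh(G).min() >= -1e-8)   # eigenvalues >= 0: positive semidefinite
```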

  17. 13.6 Examples of Kernel Functions. i) Linear kernel: $K(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T\mathbf{y}$. ii) Polynomial kernel with degree $d$: $K(\mathbf{x}, \mathbf{y}) = (\mathbf{x}^T\mathbf{y} + 1)^d$; e.g., for $d = 2$ in 2-D, $(\mathbf{x}^T\mathbf{y} + 1)^2$ corresponds to the feature map $\boldsymbol{\phi}(\mathbf{x}) = [1, \sqrt{2}x_1, \sqrt{2}x_2, \sqrt{2}x_1x_2, x_1^2, x_2^2]^T$.

  18. iii) Perceptron kernel; iv) Sigmoidal kernel: $K(\mathbf{x}, \mathbf{y}) = \tanh(2\mathbf{x}^T\mathbf{y} + 1)$; v) Radial basis function kernel: $K(\mathbf{x}, \mathbf{y}) = \exp\left(-\|\mathbf{x} - \mathbf{y}\|^2 / (2s^2)\right)$. 13.8 Multiple Kernel Learning: a new kernel can be constructed by combining simpler kernels, e.g., $K(\mathbf{x}, \mathbf{y}) = \eta_1 K_1(\mathbf{x}, \mathbf{y}) + \eta_2 K_2(\mathbf{x}, \mathbf{y})$ with $\eta_1, \eta_2 \geq 0$.
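A sketch verifying that the degree-2 polynomial kernel equals the inner product under the explicit feature map from slide 17, and illustrating kernel combination (NumPy assumed; the vectors and weights are made up):

```python
import numpy as np

def poly2(x, y):
    # Polynomial kernel of degree 2: K(x, y) = (x^T y + 1)^2
    return (x @ y + 1.0) ** 2

def phi2(x):
    # Explicit feature map for 2-D input matching poly2:
    # (x^T y + 1)^2 = 1 + 2 x1 y1 + 2 x2 y2 + 2 x1 x2 y1 y2 + x1^2 y1^2 + x2^2 y2^2
    x1, x2 = x
    return np.array([1.0, np.sqrt(2) * x1, np.sqrt(2) * x2,
                     np.sqrt(2) * x1 * x2, x1 ** 2, x2 ** 2])

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(poly2(x, y), phi2(x) @ phi2(y))   # both 4.0: kernel = inner product of phi's

# A weighted sum of kernels with nonnegative weights is again a kernel (MKL idea).
print(0.7 * poly2(x, y) + 0.3 * (x @ y))   # polynomial + linear combination
```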

  19. 13.9 Multiclass Kernel Machines ($K > 2$ classes). 1. Train $K$ 2-class classifiers $g_i(\mathbf{x})$, each one distinguishing one class $C_i$ from all other classes combined; during testing, choose the class $i = \arg\max_j g_j(\mathbf{x})$. 2. Train $K(K-1)/2$ pairwise classifiers. 3. Train a single multiclass classifier.
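A minimal one-vs-rest sketch for strategy 1, assuming scikit-learn (the three Gaussian classes are made up):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
# Three made-up Gaussian classes (K = 3).
X = np.vstack([rng.normal(m, 0.5, (30, 2)) for m in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([0, 1, 2], 30)

# Train K one-vs-rest classifiers; predict by the largest decision value.
clfs = [SVC(kernel="rbf").fit(X, np.where(y == k, 1, -1)) for k in range(3)]
scores = np.column_stack([c.decision_function(X) for c in clfs])
pred = scores.argmax(axis=1)
print((pred == y).mean())   # training accuracy, ~1.0 on well-separated clouds
```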

  20. 13.10 Kernel Machines for Regression. Consider a linear model $f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$ with an $\epsilon$-insensitive tube. Define the constraints: $r^t - f(\mathbf{x}^t) \leq \epsilon + \xi_+^t$, $f(\mathbf{x}^t) - r^t \leq \epsilon + \xi_-^t$, $\xi_+^t, \xi_-^t \geq 0$: slack variables. Problem: minimize $\frac{1}{2}\|\mathbf{w}\|^2 + C\sum_t(\xi_+^t + \xi_-^t)$ subject to the constraints.

  21. The Lagrangian: $L_p = \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_t(\xi_+^t + \xi_-^t) - \sum_t \alpha_+^t\left[\epsilon + \xi_+^t - r^t + f(\mathbf{x}^t)\right] - \sum_t \alpha_-^t\left[\epsilon + \xi_-^t + r^t - f(\mathbf{x}^t)\right] - \sum_t(\mu_+^t \xi_+^t + \mu_-^t \xi_-^t)$. Through the KKT conditions: $\partial L_p/\partial \mathbf{w} = 0 \Rightarrow \mathbf{w} = \sum_t(\alpha_+^t - \alpha_-^t)\mathbf{x}^t$; $\partial L_p/\partial b = 0 \Rightarrow \sum_t(\alpha_+^t - \alpha_-^t) = 0$; $\partial L_p/\partial \xi_\pm^t = 0 \Rightarrow C - \alpha_\pm^t - \mu_\pm^t = 0$.

  22. The dual: maximize $L_d = -\frac{1}{2}\sum_t\sum_s(\alpha_+^t - \alpha_-^t)(\alpha_+^s - \alpha_-^s)(\mathbf{x}^t)^T\mathbf{x}^s - \epsilon\sum_t(\alpha_+^t + \alpha_-^t) + \sum_t r^t(\alpha_+^t - \alpha_-^t)$ subject to $0 \leq \alpha_+^t, \alpha_-^t \leq C$ and $\sum_t(\alpha_+^t - \alpha_-^t) = 0$.

  23. (a) The examples that fall inside the $\epsilon$-tube have $\alpha_+^t = \alpha_-^t = 0$. (b) The support vectors satisfy $\alpha_+^t > 0$ or $\alpha_-^t > 0$; they lie on the boundary of, or outside, the tube.

  24. The fitted line: $f(\mathbf{x}) = \sum_t(\alpha_+^t - \alpha_-^t)(\mathbf{x}^t)^T\mathbf{x} + b$; with a kernel function, $f(\mathbf{x}) = \sum_t(\alpha_+^t - \alpha_-^t)K(\mathbf{x}^t, \mathbf{x}) + b$.
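A closing sketch, assuming scikit-learn's SVR, checking property (a) from slide 23: points strictly inside the $\epsilon$-tube carry zero multipliers and hence are not support vectors (toy sine data made up; the 1e-3 buffer guards against solver tolerance):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(4)
X = np.sort(rng.uniform(-3, 3, (50, 1)), axis=0)
r = np.sin(X).ravel() + rng.normal(0, 0.1, 50)

eps = 0.2
svr = SVR(kernel="rbf", C=10.0, epsilon=eps).fit(X, r)

# Points strictly inside the eps-tube have alpha_+ = alpha_- = 0,
# i.e., they are not support vectors (slide 23a).
resid = np.abs(r - svr.predict(X))
inside = np.where(resid < eps - 1e-3)[0]
print(np.intersect1d(inside, svr.support_).size)   # 0: none of them is a SV
```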
