
Support Vector Machines Part 2


Presentation Transcript


  1. Support Vector Machines Part 2

  2. Recap of SVM algorithm Given training set S = {(x1, y1), (x2, y2), ..., (xm, ym)}, where xi ∈ ℝn and yi ∈ {+1, −1}: 1. Choose a cheap-to-compute kernel function k(x, z). 2. Apply a quadratic programming procedure (using the kernel function k) to find the {αi} and bias b, where αi ≠ 0 only if xi is a support vector.
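A minimal sketch of these two steps in Python, assuming scikit-learn (the slides don't name a library; the toy data and kernel parameters below are made up). SVC's fit() solves the quadratic program internally and exposes the nonzero αi (stored as yi·αi) and the bias b:

```python
import numpy as np
from sklearn.svm import SVC

# Toy training set S: x_i in R^2, y_i in {+1, -1} (made-up data)
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.5], [3.0, 3.0]])
y = np.array([-1, -1, +1, +1])

# Step 1: choose a cheap-to-compute kernel, here k(x, z) = (x . z + 1)^2
clf = SVC(kernel="poly", degree=2, gamma=1.0, coef0=1.0, C=1.0)

# Step 2: fit() solves the dual quadratic program; alpha_i != 0 only
# for the support vectors
clf.fit(X, y)

print("support vectors:", clf.support_vectors_)
print("y_i * alpha_i:  ", clf.dual_coef_)   # nonzero only for support vectors
print("bias b:         ", clf.intercept_)
```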

  3. Now, given a new instance x, find the classification of x by computing f(x) = sgn(Σi αi yi k(xi, x) + b), where the sum runs over the support vectors (the xi with αi ≠ 0).
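Continuing the hypothetical snippet above, the classification can be computed by hand from the learned αi and b; scikit-learn's dual_coef_ already stores the products yi·αi:

```python
import numpy as np

def poly_kernel(x, z):
    """The same quadratic kernel used above: k(x, z) = (x . z + 1)^2."""
    return (np.dot(x, z) + 1.0) ** 2

x_new = np.array([2.5, 2.5])

# f(x) = sgn( sum_i alpha_i y_i k(x_i, x) + b ), summed over support vectors
score = sum(coef * poly_kernel(sv, x_new)
            for coef, sv in zip(clf.dual_coef_[0], clf.support_vectors_))
score += clf.intercept_[0]

print("predicted class:", int(np.sign(score)))
print("agrees with SVC.predict:", clf.predict([x_new])[0] == np.sign(score))
```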

  4. Clarifications from last time

  5. Length of the margin (figure: http://nlp.stanford.edu/IR-book/html/htmledition/img1260.png) Without changing the problem, we can rescale our data to set a = 1; with a = 1, each margin hyperplane lies at distance 1/||w|| from the decision boundary, so the margin width is 2/||w||.
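A quick numerical check of the 2/||w|| formula, again assuming scikit-learn and made-up linearly separable data; a large C approximates the hard-margin setting:

```python
import numpy as np
from sklearn.svm import SVC

# Made-up linearly separable data
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 1.0], [2.0, 2.0]])
y = np.array([-1, -1, +1, +1])

clf = SVC(kernel="linear", C=1e6)   # large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]                    # w is explicit for the linear kernel
print("margin width 2/||w||:", 2.0 / np.linalg.norm(w))
```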

  6. w is perpendicular to the decision boundary: for any two points x1, x2 on the boundary, w·x1 + b = w·x2 + b = 0, so w·(x1 − x2) = 0.
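A small numeric illustration with a hypothetical w and b (values chosen only for this example):

```python
import numpy as np

# Hypothetical hyperplane w . x + b = 0 (illustrative values)
w = np.array([2.0, 1.0])
b = -3.0

# Two points on the decision boundary: w . x + b = 0 for both
x_a = np.array([0.0, 3.0])     # 2*0 + 1*3 - 3 = 0
x_b = np.array([1.5, 0.0])     # 2*1.5 + 1*0 - 3 = 0

# Their difference lies along the boundary, and w is orthogonal to it
print(np.dot(w, x_a - x_b))    # prints 0.0
```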

  7. More on Kernels • So far we've seen kernels that map instances in ℝn to instances in ℝz, where z > n. • One way to create a kernel: figure out the appropriate feature space Φ(x), and find a kernel function k that defines the inner product on that space. • More practically, we usually don't know the appropriate feature space Φ(x). • What people do in practice is either: use one of the "classic" kernels (e.g., polynomial), or define their own function that is appropriate for their task and show that it qualifies as a kernel. Both options are sketched below.
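A sketch of both options, assuming scikit-learn (our choice of library, not the slides'); when a callable is passed as the kernel, it must return the full kernel matrix:

```python
import numpy as np
from sklearn.svm import SVC

# Made-up data
X = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 2.0], [3.0, 2.5]])
y = np.array([-1, -1, +1, +1])

# Option 1: a "classic" kernel, here the built-in polynomial
classic = SVC(kernel="poly", degree=3, coef0=1.0).fit(X, y)

# Option 2: your own function; scikit-learn accepts a callable that
# returns the kernel matrix K[i, j] = k(X1[i], X2[j])
def my_kernel(X1, X2):
    return (X1 @ X2.T + 1.0) ** 3   # still must be shown to be a valid kernel

custom = SVC(kernel=my_kernel).fit(X, y)
print(classic.predict(X))
print(custom.predict(X))
```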

  8. How to define your own kernel • Given training data (x1, x2, ..., xn) • The algorithm for SVM learning uses the kernel matrix (also called the Gram matrix) K, where Kij = k(xi, xj). • We can choose some function k and compute the kernel matrix K using the training data. • We just have to guarantee that our kernel defines an inner product on some feature space. • Not as hard as it sounds; a sketch follows.
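A sketch of building the kernel matrix for a chosen k, here a quadratic polynomial kernel on made-up data:

```python
import numpy as np

# Made-up training data (x1, ..., xn)
X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])

def k(x, z):
    """A chosen kernel function, here quadratic polynomial (our choice)."""
    return (np.dot(x, z) + 1.0) ** 2

# Kernel (Gram) matrix: K[i, j] = k(x_i, x_j)
n = len(X)
K = np.array([[k(X[i], X[j]) for j in range(n)] for i in range(n)])
print(K)    # symmetric n x n matrix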

  9. What counts as a kernel? • Mercer's Theorem: if the kernel matrix K is symmetric and positive semidefinite, it defines a kernel on the training data; that is, it defines an inner product in some feature space. • We don't even have to know what that feature space is! It can have a huge number of dimensions.
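One way to check Mercer's condition numerically, as a sketch: verify that the kernel matrix is symmetric and has no negative eigenvalues (up to floating-point round-off):

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
K = (X @ X.T + 1.0) ** 2              # quadratic polynomial kernel matrix

symmetric = np.allclose(K, K.T)
eigvals = np.linalg.eigvalsh(K)       # eigenvalues of the symmetric matrix
psd = bool(np.all(eigvals >= -1e-9))  # tolerate tiny negative round-off
print("symmetric:", symmetric, "| positive semidefinite:", psd)
print("eigenvalues:", eigvals)
```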

  10. In-class exercises Note for part (c):
