
Support Vector Machine (SVM)



Presentation Transcript


  1. Support Vector Machine (SVM) • Presented by Robert Chen

  2. Introduction • A high-level explanation of SVM • SVM is a way to classify data • We are interested in text classification

  3. What is a SVM • “In essence, an SVM is a mathematical entity, an algorithm (or recipe) for maximizing a particular mathematical function with respect to a given collection of data.” — William S. Noble

  4. What is a SVM • It is a computer algorithm that learns from the training data we provide in order to categorize new, unseen data. • An SVM can’t cluster data, it can only classify it; we use SVD (singular value decomposition) to cluster data.
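
The deck stays at this high level, but a minimal sketch of the train-then-classify workflow it describes, using scikit-learn and a made-up toy corpus (both are assumptions, not named in the slides), could look like:

```python
# Minimal sketch: training an SVM to categorize text, then classifying
# new data. Library (scikit-learn) and corpus are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Hypothetical labeled training data: +1 = sports, -1 = politics.
train_texts = ["the team won the game", "a great match and final score",
               "the senate passed the bill", "voters went to the polls"]
train_labels = [1, 1, -1, -1]

# Turn text into numeric feature vectors, then fit a linear-kernel SVM.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
clf = SVC(kernel="linear").fit(X_train, train_labels)

# Categorize new, unseen text.
X_new = vectorizer.transform(["the final score of the match"])
print(clf.predict(X_new))   # expected: [1], the sports class
```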

  5. SVM hyperplanes • 1) Separating hyperplane • Works in 1-D, 2-D, 3-D, and beyond • 2) Maximum-margin hyperplane • Separates the classes while maintaining the maximal distance from any one of the given expression profiles • 3) Soft-margin hyperplane • Also called the generalized optimal hyperplane (the name used in Vapnik’s book)

  6. Soft Margin Hyperplane • Allows some outlier data points to push their way through the margin of the separating hyperplane without affecting the final result • “The soft margin parameter specifies a trade-off between hyperplane violations and the size of the margin.” — W. Noble

  7. Soft Margin Hyperplane • Proposed by Corinna Cortes and Vladimir Vapnik in 1995 • The work won the 2008 ACM Paris Kanellakis Award
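
In scikit-learn terms (a tooling assumption), Noble’s trade-off is the parameter C: a small C tolerates margin violations by outliers, a large C penalizes them. A minimal sketch:

```python
# Sketch of the soft-margin trade-off: small C tolerates violations
# (wider margin), large C penalizes them (narrower margin).
from sklearn.svm import SVC

# Toy 1-D data, separable except for one +1 outlier at x = 1.5.
X = [[0.0], [1.0], [1.5], [2.0], [3.0], [4.0]]
y = [-1, -1, 1, -1, 1, 1]

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Points on or inside the margin become support vectors, so their
    # count is one visible effect of varying C.
    print(f"C={C}: {int(clf.n_support_.sum())} support vectors")
```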

  8. Kernel function • A mathematical solution for determining the hyperplane when: • 1) There is no clear boundary • 2) A soft margin doesn’t help

  9. Kernel Function • Projects data from a low-dimensional space to a high-dimensional space • We then project the SVM hyperplane found in that space back to a lower, drawable space such as 2-D. • Kernels with a very high dimension can result in the SVM overfitting the data.

  10. Types of Kernels • linear: K(xi, xj) = xiᵀxj • polynomial: K(xi, xj) = (γ xiᵀxj + r)^d, γ > 0 • radial basis function (RBF): K(xi, xj) = exp(−γ‖xi − xj‖²), γ > 0 • sigmoid: K(xi, xj) = tanh(γ xiᵀxj + r)
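
These four formulas map directly onto, for instance, scikit-learn’s SVC options (a tooling assumption; gamma, coef0, and degree play the roles of γ, r, and d):

```python
# Sketch: the four kernel types above as scikit-learn configurations.
from sklearn.svm import SVC

linear  = SVC(kernel="linear")                                # xiᵀxj
poly    = SVC(kernel="poly", gamma=1.0, coef0=1.0, degree=3)  # (γ xiᵀxj + r)^d
rbf     = SVC(kernel="rbf", gamma=0.5)                        # exp(−γ‖xi − xj‖²)
sigmoid = SVC(kernel="sigmoid", gamma=0.1, coef0=0.0)         # tanh(γ xiᵀxj + r)
```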

  11. Notes • radial basis function (RBF): K(xi, xj) = exp(−γ‖xi − xj‖²), γ > 0 • An RBF kernel is equivalent to mapping the data into an infinite-dimensional Hilbert space
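
The kernel value itself stays cheap to compute even though the implied feature space is infinite-dimensional; a NumPy sketch (NumPy being an assumed tool) of the RBF formula:

```python
# Sketch: evaluating K(xi, xj) = exp(-gamma * ||xi - xj||^2) directly.
# The infinite-dimensional feature map is never formed explicitly.
import numpy as np

def rbf_kernel(X, gamma):
    # Pairwise squared Euclidean distances between rows of X.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq_dists)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
print(rbf_kernel(X, gamma=0.5))   # 4x4 similarity (kernel) matrix
```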

  12. Example • Data set: a 1-dimensional set • Class, X1 • +1, 0 • −1, 1 • −1, 2 • +1, 3 • Feature map: Φ(X1) = (X1, X1²)
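
Applying this feature map shows why it helps: in 1-D the classes interleave (+1, −1, −1, +1), so no single threshold separates them, but in the mapped 2-D space a line can. A NumPy sketch:

```python
# Sketch: applying the slide's feature map phi(x) = (x, x^2).
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
labels = np.array([1, -1, -1, 1])

phi = np.column_stack([x, x ** 2])   # rows: (0,0), (1,1), (2,4), (3,9)
for point, label in zip(phi, labels):
    print(label, point)
```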

  13. Support Vectors • ⟨w · x⟩ + b = +1 (positive labels) (1) • ⟨w · x⟩ + b = −1 (negative labels) (2) • ⟨w · x⟩ + b = 0 (hyperplane) (3) • Any vectors lying on hyperplane (1) or (2) are support vectors.

  14. Importance of Support Vectors in Support Vector Machines • The complexity of an SVM depends on the number of support vectors rather than on the dimensionality of the feature space

  15. Positive label • w1x1 + w2x2 + b = +1 • Φ(0) = (0, 0): w1·0 + w2·0 + b = +1 • Φ(3) = (3, 9): w1·3 + w2·9 + b = +1

  16. Negative label • w1x1 + w2x2 + b = −1 • Φ(1) = (1, 1): w1·1 + w2·1 + b = −1 • Φ(2) = (2, 4): w1·2 + w2·4 + b = −1 • Solving these gives w1 = −3, w2 = 1, b = 1
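
These constraints form a small linear system in (w1, w2, b); a NumPy check of the slide’s solution:

```python
# Sketch: solving the margin constraints from slides 15-16 as a 3x3
# linear system; the fourth point is then verified for consistency.
import numpy as np

A = np.array([[0.0, 0.0, 1.0],    # phi(0) = (0, 0), label +1
              [3.0, 9.0, 1.0],    # phi(3) = (3, 9), label +1
              [1.0, 1.0, 1.0]])   # phi(1) = (1, 1), label -1
rhs = np.array([1.0, 1.0, -1.0])

w1, w2, b = np.linalg.solve(A, rhs)
print(w1, w2, b)   # -3.0 1.0 1.0, matching the slide
# The remaining point phi(2) = (2, 4): -3*2 + 1*4 + 1 = -1, as required.
```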

  17. Hyperplane • w1x1 + w2x2 + b = 0 • −3x1 + 1x2 + 1 = 0 • x2 = 3x1 − 1 • Points on the hyperplane (X1, X2): (0, −1), (1, 2), (2, 5), (3, 8)

  18. Maximum-Margin Hyperplane • margin = 2 / sqrt(w · w) • = 2 / sqrt((−3)² + 1²) = 2 / sqrt(10) • margin ≈ 0.632456
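
A quick numeric check of the margin formula:

```python
# Sketch: verifying margin = 2 / sqrt(w . w) for w = (-3, 1).
import numpy as np

w = np.array([-3.0, 1.0])
margin = 2.0 / np.sqrt(w @ w)   # 2 / sqrt((-3)^2 + 1^2) = 2 / sqrt(10)
print(round(margin, 6))         # 0.632456
```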

  19. Recommended Article • “What is a support vector machine?” • By William S. Noble

  20. Recommended Article • Support Vector Machines for Text Categorization • A. Basu, C. Watters, and M. Shepherd • Faculty of Computer Science • Dalhousie University • Halifax, Nova Scotia, Canada B3H 1W5 • {basu | watters | shepherd}@cs.dal.ca

  21. Recommended Book • The Nature of Statistical Learning Theory • By Vladimir N. Vapnik

  22. Recommended Book • Author: Thorsten Joachims • (The library doesn’t have this book)

  23. Thank you • Questions? • Comments?

  24. Multiclass SVM • Multiclass ranking SVMs, in which one SVM decision function attempts to classify all classes. • One-against-all classification, in which there is one binary SVM for each class to separate members of that class from members of other classes. • Pairwise classification, in which there is one binary SVM for each pair of classes to separate members of one class from members of the other.
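
The last two strategies have direct counterparts in scikit-learn (a tooling assumption; note that SVC itself uses the pairwise strategy internally for multiclass problems):

```python
# Sketch: one-against-all vs. pairwise multiclass SVMs on a 3-class dataset.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # 3 classes

# One-against-all: one binary SVM per class -> 3 classifiers.
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)

# Pairwise: one binary SVM per pair of classes -> 3 classifiers here.
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)

print(len(ovr.estimators_), len(ovo.estimators_))   # 3 3
```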
