
Proximal Support Vector Machine Classifiers KDD 2001 San Francisco August 26-29, 2001


Presentation Transcript


  1. Proximal Support Vector Machine Classifiers. KDD 2001, San Francisco, August 26-29, 2001. Glenn Fung & Olvi Mangasarian, Data Mining Institute, University of Wisconsin - Madison

  2. Key Contributions • Fast new support vector machine classifier • An order of magnitude faster than standard classifiers • Extremely simple to implement • 4 lines of MATLAB code • NO optimization packages (LP,QP) needed

  3. Outline of Talk • (Standard) Support vector machines (SVM) • Classify by halfspaces • Proximal support vector machines (PSVM) • Classify by proximity to planes • Linear PSVM classifier • Nonlinear PSVM classifier • Full and reduced kernels • Numerical results • Correctness comparable to standard SVM • Much faster classification! • 2-million points in 10-space in 21 seconds • Compared to over 10 minutes for standard SVM

  4. Support Vector Machines: Maximizing the Margin between Bounding Planes [Figure: the two bounding planes and the margin between classes A+ and A-]

  5. Proximal Support Vector Machines: Fitting the Data Using Two Parallel Bounding Planes [Figure: parallel proximal planes fitted to classes A+ and A-]

  6. Standard Support Vector Machine: Algebra of the 2-Category Linearly Separable Case • Given $m$ points in $n$-dimensional space • Represented by an $m \times n$ matrix $A$ • Membership of each point in class $+1$ or $-1$ specified by: • An $m \times m$ diagonal matrix $D$ with $+1$ & $-1$ entries • Separate by two bounding planes, $x'w = \gamma + 1$ and $x'w = \gamma - 1$: $A_i w \ge \gamma + 1$ for $D_{ii} = +1$, and $A_i w \le \gamma - 1$ for $D_{ii} = -1$ • More succinctly: $D(Aw - e\gamma) \ge e$, where $e$ is a vector of ones.
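
  To make the notation concrete, here is a minimal MATLAB sketch (not from the slides) that builds $A$, $D$, and $e$ for a toy two-class problem and checks the separation condition above; the data and the candidate plane (w0, gamma0) are made up for illustration:

  % Toy data: m = 4 points in n = 2 dimensions (illustrative only)
  A = [2 3; 3 4; -1 -2; -2 -1];    % m-by-n matrix of points
  labels = [1; 1; -1; -1];         % class membership in {+1, -1}
  D = diag(labels);                % m-by-m diagonal matrix D
  e = ones(4, 1);                  % vector of ones
  w0 = [1; 1]; gamma0 = 0;         % hypothetical plane x'*w0 = gamma0
  all(D*(A*w0 - e*gamma0) >= e)    % true iff D(Aw - e*gamma) >= e holds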

  7. Standard Support Vector Machine Formulation • Solve the quadratic program for some $\nu > 0$: (QP) $\min_{w,\gamma,y} \; \nu e'y + \tfrac{1}{2} w'w$ s.t. $D(Aw - e\gamma) + y \ge e$, $y \ge 0$, where $D_{ii} = \pm 1$ denotes $A+$ or $A-$ membership • The margin $\tfrac{2}{\|w\|}$ is maximized by minimizing $\tfrac{1}{2}\|w\|^2$

  8. PSVM Formulation. We have from the QP SVM formulation: (QP) $\min_{w,\gamma,y} \; \nu e'y + \tfrac{1}{2} w'w$ s.t. $D(Aw - e\gamma) + y \ge e$, $y \ge 0$. Make the inequality an equality, penalize the squared 2-norm of $y$, and regularize $\gamma$ as well: $\min_{w,\gamma,y} \; \tfrac{\nu}{2}\|y\|^2 + \tfrac{1}{2}(w'w + \gamma^2)$ s.t. $D(Aw - e\gamma) + y = e$. Solving for $y$ in terms of $w$ and $\gamma$ gives $y = e - D(Aw - e\gamma)$, and substituting yields the unconstrained problem: $\min_{w,\gamma} \; \tfrac{\nu}{2}\|e - D(Aw - e\gamma)\|^2 + \tfrac{1}{2}(w'w + \gamma^2)$. This simple but critical modification changes the nature of the optimization problem tremendously!

  9. Advantages of New Formulation • Objective function remains strongly convex • An explicit exact solution can be written in terms of the problem data • PSVM classifier is obtained by solving a single system of linear equations in the usually small dimensional input space • Exact leave-one-out correctness can be obtained in terms of the problem data

  10. Linear PSVM. We want to solve: $\min_{w,\gamma} \; \tfrac{\nu}{2}\|e - D(Aw - e\gamma)\|^2 + \tfrac{1}{2}(w'w + \gamma^2)$ • Setting the gradient equal to zero gives a nonsingular system of linear equations • Solution of the system gives the desired PSVM classifier
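
  Spelling out the gradient step (a reconstruction consistent with the solution on the next slide, using $H = [A \;\; {-e}]$ and $r = [w; \gamma]$): the unconstrained objective is $f(r) = \tfrac{\nu}{2}\|e - DHr\|^2 + \tfrac{1}{2}\|r\|^2$, and since $D'D = I$, $\nabla f(r) = -\nu H'D(e - DHr) + r = (I + \nu H'H)r - \nu H'De = 0$, i.e. $\left(\tfrac{I}{\nu} + H'H\right) r = H'De$, whose matrix is symmetric positive definite and hence nonsingular.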

  11. Linear PSVM Solution: $\begin{bmatrix} w \\ \gamma \end{bmatrix} = \left(\tfrac{I}{\nu} + H'H\right)^{-1} H'De$ • Here, $H = [A \;\; {-e}]$ • The linear system to solve depends on $H'H$, which is of size $(n+1) \times (n+1)$ • $n+1$ is usually much smaller than $m$

  12. Linear Proximal SVM Algorithm • Input: $A$, $D$, $\nu$ • Define $H = [A \;\; {-e}]$ • Calculate $v = H'De$ • Solve $\left(\tfrac{I}{\nu} + H'H\right) r = v$ for $r = [w; \gamma]$ • Classifier: $\operatorname{sign}(x'w - \gamma)$

  13. Nonlinear PSVM Formulation. Linear PSVM (linear separating surface $x'w = \gamma$): $\min_{w,\gamma,y} \; \tfrac{\nu}{2}\|y\|^2 + \tfrac{1}{2}(w'w + \gamma^2)$ s.t. $D(Aw - e\gamma) + y = e$. By QP "duality", $w = A'Du$; maximizing the margin in the "dual space" gives: $\min_{u,\gamma,y} \; \tfrac{\nu}{2}\|y\|^2 + \tfrac{1}{2}(u'u + \gamma^2)$ s.t. $D(AA'Du - e\gamma) + y = e$ • Replace $AA'$ by a nonlinear kernel $K(A, A')$: $\min_{u,\gamma,y} \; \tfrac{\nu}{2}\|y\|^2 + \tfrac{1}{2}(u'u + \gamma^2)$ s.t. $D(K(A,A')Du - e\gamma) + y = e$

  14. The Nonlinear Classifier • The nonlinear classifier: $\operatorname{sign}(K(x', A')Du - \gamma)$ • Where $K$ is a nonlinear kernel, e.g. the Gaussian (radial basis) kernel: $(K(A, A'))_{ij} = \exp(-\mu\|A_i - A_j\|^2)$, $i, j = 1, \ldots, m$ • The $ij$-entry of $K(A, A')$ represents the "similarity" of data points $A_i$ and $A_j$
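
  A minimal MATLAB sketch (not from the slides) of this Gaussian kernel; the function name and the width parameter mu are illustrative:

  function K = gaussian_kernel(A, B, mu)
  % GAUSSIAN_KERNEL: K(i,j) = exp(-mu*norm(A(i,:) - B(j,:))^2)
  m = size(A, 1); k = size(B, 1);
  K = zeros(m, k);
  for i = 1:m
      for j = 1:k
          K(i, j) = exp(-mu*norm(A(i, :) - B(j, :))^2);
      end
  end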

  15. Nonlinear PSVM. Similar to the linear case, setting the gradient equal to zero we obtain: $\begin{bmatrix} u \\ \gamma \end{bmatrix} = \left(\tfrac{I}{\nu} + H'H\right)^{-1} H'De$, defining $H$ slightly differently: $H = [K(A,A')D \;\; {-e}]$ • Here, the linear system to solve is of size $(m+1) \times (m+1)$ • However, reduced kernel techniques (RSVM) can be used to reduce the dimensionality.
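
  A sketch of the reduced-kernel idea from RSVM (the subset size is illustrative, echoing the 8124 x 215 rectangular kernel on slide 21; gaussian_kernel is the sketch from slide 14):

  % Rectangular reduced kernel: random subset Abar of mbar << m rows of A
  mbar = 215;                            % illustrative reduced size
  Abar = A(randperm(size(A, 1), mbar), :);
  Kbar = gaussian_kernel(A, Abar, mu);   % m-by-mbar instead of m-by-m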

  16. Nonlinear Proximal SVM Algorithm • Input: $A$, $D$, $\nu$ • Define $K = K(A, A')$ and $H = [KD \;\; {-e}]$ • Calculate $v = H'De$ • Solve $\left(\tfrac{I}{\nu} + H'H\right) r = v$ for $r = [u; \gamma]$ • Classifier: $\operatorname{sign}(K(x', A')Du - \gamma)$
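
  A direct transcription of these steps into MATLAB (a sketch mirroring the linear code on the next slide, under the reconstruction H = [K*D -e]; gaussian_kernel is the sketch from slide 14):

  function [u, gamma] = psvm_nonlinear(A, d, nu, mu)
  % Nonlinear PSVM sketch: solve (I/nu + H'*H)r = H'*D*e with H = [K*D -e]
  m = size(A, 1); e = ones(m, 1); D = diag(d);
  K = gaussian_kernel(A, A, mu);   % full m-by-m Gaussian kernel
  H = [K*D -e];
  v = (d'*H)';                     % v = H'*D*e
  r = (speye(m+1)/nu + H'*H)\v;    % one (m+1)-by-(m+1) linear solve
  u = r(1:m); gamma = r(m+1);
  % Classify a row vector x: sign(gaussian_kernel(x, A, mu)*(D*u) - gamma)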

  17. Linear & Nonlinear PSVM MATLAB Code

  function [w, gamma] = psvm(A, d, nu)
  % PSVM: linear and nonlinear classification
  % INPUT: A, d = diag(D), nu. OUTPUT: w, gamma
  % [w, gamma] = psvm(A, d, nu);
  [m, n] = size(A); e = ones(m, 1); H = [A -e];
  v = (d'*H)';                    % v = H'*D*e
  r = (speye(n+1)/nu + H'*H)\v;   % solve (I/nu + H'*H)r = v
  w = r(1:n); gamma = r(n+1);     % extract w, gamma from r
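
  A hypothetical usage example for the code above (data and parameter value made up for illustration):

  % Train on a toy 2-class problem and classify a new point
  A = [2 3; 3 4; -1 -2; -2 -1];   % 4 points in 2-space
  d = [1; 1; -1; -1];             % labels, d = diag(D)
  nu = 1;                         % regularization parameter (illustrative)
  [w, gamma] = psvm(A, d, nu);
  x = [2 2];                      % new point, as a row vector
  sign(x*w - gamma)               % predicted class: +1 or -1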

  18. Linear PSVM Comparisons with Other SVMs: Much Faster, Comparable Correctness

  19. Linear PSVM vs. LSVM on a 2-Million-Point Dataset: Over 30 Times Faster

  20. Nonlinear PSVM: Spiral Dataset, 94 Red Dots & 94 White Dots

  21. Nonlinear PSVM Comparisons. * A rectangular kernel of size 8124 x 215 was used

  22. Conclusion • PSVM is an extremely simple procedure for generating linear and nonlinear classifiers • The PSVM classifier is obtained by solving a single system of linear equations, in the usually small dimensional input space for a linear classifier • Comparable test set correctness to standard SVM • Much faster than standard SVMs: typically an order of magnitude faster

  23. Future Work • Extension of PSVM to multicategory classification • Massive data classification using an incremental PSVM • Parallel formulation and implementation of PSVM
