
Proximal Plane Classification KDD 2001 San Francisco August 26-29, 2001


Presentation Transcript


  1. Proximal Plane Classification, KDD 2001, San Francisco, August 26-29, 2001. Glenn Fung & Olvi Mangasarian, Data Mining Institute, University of Wisconsin - Madison. Second Annual Review, June 1, 2001

  2. Key Contributions
  • Fast new support vector machine classifier
  • An order of magnitude faster than standard classifiers
  • Extremely simple to implement
  • 4 lines of MATLAB code
  • NO optimization packages (LP, QP) needed

  3. Outline of Talk
  • (Standard) support vector machine (SVM) classifiers
  • Proximal support vector machine (PSVM) classifiers
  • Geometric motivation
  • Linear PSVM classifier
  • Nonlinear PSVM classifier
  • Full and reduced kernels
  • Numerical results
  • Correctness comparable to standard SVM
  • Much faster classification: 2-million points in 10-space in 21 seconds, compared to over 10 minutes for standard SVM

  4. Support Vector Machines: Maximizing the Margin between Bounding Planes
  [Figure: point sets A+ and A- separated by two parallel bounding planes, with the margin between them maximized]

  5. Proximal Support Vector Machines: Fitting the Data Using Two Parallel Bounding Planes
  [Figure: point sets A+ and A- fitted by two parallel proximal planes]

  6. SVM as an Unconstrained Minimization Problem
  Changing to the 2-norm and measuring the margin in (w, γ) space gives:

  (QP)  min_{w,γ,y}  (ν/2)‖y‖² + (1/2)(w'w + γ²)
        s.t.  D(Aw − eγ) + y ≥ e

  (Here A is the m × n matrix of data points, D is the diagonal matrix of ±1 class labels, e is a vector of ones, and ν > 0 is a fixed weighting parameter.)

  At the solution of (QP): y = (e − D(Aw − eγ))₊, where (·)₊ replaces negative components by zeros. Hence (QP) is equivalent to:

  min_{w,γ}  (ν/2)‖(e − D(Aw − eγ))₊‖² + (1/2)(w'w + γ²)

  7. PSVM Formulation
  We have from the QP SVM formulation:

  (QP)  min_{w,γ,y}  (ν/2)‖y‖² + (1/2)(w'w + γ²)
        s.t.  D(Aw − eγ) + y ≥ e

  Replacing the inequality constraint by an equality:

        min_{w,γ,y}  (ν/2)‖y‖² + (1/2)(w'w + γ²)
        s.t.  D(Aw − eγ) + y = e

  This simple but critical modification changes the nature of the optimization problem tremendously! Solving for y in terms of w and γ gives:

  min_{w,γ}  (ν/2)‖e − D(Aw − eγ)‖² + (1/2)(w'w + γ²)

  8. Advantages of New Formulation
  • Objective function remains strongly convex
  • An explicit exact solution can be written in terms of the problem data
  • PSVM classifier is obtained by solving a single system of linear equations in the usually small dimensional input space
  • Exact leave-one-out correctness can be obtained in terms of the problem data

  9. Linear PSVM
  We want to solve:

  min_{w,γ}  (ν/2)‖e − D(Aw − eγ)‖² + (1/2)(w'w + γ²)

  • Setting the gradient equal to zero gives a nonsingular system of linear equations.
  • Solution of the system gives the desired PSVM classifier.
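  As a bridge to the solution on the next slide, a short derivation (the shorthand H and z is ours, consistent with the MATLAB code on slide 16): with H = [A −e] and z = (w; γ), the objective above is f(z) = (ν/2)‖e − DHz‖² + (1/2)‖z‖². Since D² = I, setting the gradient to zero gives

  ∇f(z) = −ν H'D(e − DHz) + z = 0,  i.e.  (I/ν + H'H) z = H'De.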

  10. Linear PSVM Solution
  Here, H = [A −e], and the solution is:

  (w; γ) = (I/ν + H'H)⁻¹ H'De

  • The linear system to solve depends on H'H, which is of size (n + 1) × (n + 1).
  • n is usually much smaller than m.

  11. Linear Proximal SVM Algorithm
  Input: A, D, ν.
  Define: H = [A −e].
  Calculate: v = H'De.
  Solve: (I/ν + H'H) r = v, where r = (w; γ).
  Classifier: sign(x'w − γ).

  12. Nonlinear PSVM Formulation
  Linear PSVM (linear separating surface x'w = γ):

  (QP)  min_{w,γ,y}  (ν/2)‖y‖² + (1/2)(w'w + γ²)
        s.t.  D(Aw − eγ) + y = e

  By QP "duality", w = A'Du. Maximizing the margin in the "dual space" (u, γ) gives:

        min_{u,γ,y}  (ν/2)‖y‖² + (1/2)(u'u + γ²)
        s.t.  D(AA'Du − eγ) + y = e

  • Replace AA' by a nonlinear kernel K(A, A'):

        min_{u,γ,y}  (ν/2)‖y‖² + (1/2)(u'u + γ²)
        s.t.  D(K(A, A')Du − eγ) + y = e

  13. The Nonlinear Classifier
  The nonlinear classifier: sign(K(x', A')Du − γ), where K is a nonlinear kernel, e.g.:
  • Polynomial kernel: the ij-th element of K(A, B) is (A_i B_{·j} + 1)^d
  • Gaussian (radial basis) kernel: the ij-th element of K(A, B) is exp(−μ‖A_i' − B_{·j}‖²)
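  For concreteness, a minimal MATLAB sketch of the Gaussian kernel matrix (the function name gaussian_kernel is ours; the slides do not prescribe an implementation):

  function K = gaussian_kernel(A, B, mu)
  % Gaussian (radial basis) kernel: K(i,j) = exp(-mu*norm(A(i,:)' - B(:,j))^2)
  % A is m1 x n, B is n x m2; use B = A' for the full kernel K(A,A')
  sqA = sum(A.^2, 2);                 % m1 x 1 squared row norms of A
  sqB = sum(B.^2, 1);                 % 1 x m2 squared column norms of B
  % expand ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a'*b (implicit expansion, R2016b+)
  K = exp(-mu*(sqA + sqB - 2*A*B));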

  14. Nonlinear PSVM
  Similar to the linear case, setting the gradient equal to zero we obtain:

  (I/ν + H'H) r = H'De, with H defined slightly differently: H = [K(A, A') −e]

  • Here the linear system to solve is of size (m + 1) × (m + 1).
  • However, reduced kernel techniques (RSVM) can be used to reduce the dimensionality.
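  Because the nonlinear problem has exactly the linear algebraic form with A replaced by the kernel matrix, the psvm code on slide 16 can be reused. A minimal sketch, assuming the gaussian_kernel helper above (the wrapper name psvm_nonlinear is ours):

  function [s, gamma] = psvm_nonlinear(A, d, nu, mu)
  % Nonlinear PSVM: with s = D*u we have s'*s = u'*u (since D*D = I), so the
  % nonlinear formulation becomes the linear one with A replaced by K(A,A')
  K = gaussian_kernel(A, A', mu);   % m x m full kernel matrix
  [s, gamma] = psvm(K, d, nu);      % same normal equations, now H = [K -e]

  A new point x is then classified by sign(gaussian_kernel(x', A', mu)*s − gamma).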

  15. Nonlinear Proximal SVM Algorithm
  Input: A, D, ν.
  Define: K = K(A, A'), H = [K −e].
  Calculate: v = H'De.
  Solve: (I/ν + H'H) r = v, where r = (Du; γ).
  Classifier: sign(K(x', A')Du − γ).

  16. PSVM MATLAB Code

  function [w, gamma] = psvm(A,d,nu)
  % PSVM: linear and nonlinear classification
  % INPUT: A, d = diag(D), nu. OUTPUT: w, gamma
  % [w, gamma] = psvm(A,d,nu);
  [m,n] = size(A); e = ones(m,1); H = [A -e];
  v = (d'*H)';                   % v = H'*D*e
  r = (speye(n+1)/nu + H'*H)\v;  % solve (I/nu + H'*H)r = v
  w = r(1:n); gamma = r(n+1);    % getting w, gamma from r
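  A quick usage sketch on synthetic data (the data generation and correctness check are ours, not from the slides):

  % two Gaussian clouds in R^2 with labels +1 / -1
  m = 200; n = 2;
  A = [randn(m/2,n) + 1; randn(m/2,n) - 1];   % one data point per row
  d = [ones(m/2,1); -ones(m/2,1)];            % d = diag(D), the class labels
  [w, gamma] = psvm(A, d, 1);                 % nu = 1
  pred = sign(A*w - gamma);                   % linear classifier sign(x'*w - gamma)
  fprintf('training correctness: %.1f%%\n', 100*mean(pred == d));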

  17. Linear PSVM Comparisons with Other SVMs: Much Faster, Comparable Correctness

  18. Linear PSVM Comparisons on the Larger Adult Dataset: Much Faster & Comparable Correctness

  19. Linear PSVM vs. LSVM on a 2-Million-Point Dataset: Over 30 Times Faster

  20. Nonlinear PSVM: Spiral Dataset (94 Red Dots & 94 White Dots)

  21. Nonlinear PSVM Comparisons
  * A rectangular kernel of size 8124 × 215 was used.

  22. Conclusion
  • PSVM is an extremely simple procedure for generating linear and nonlinear classifiers
  • The PSVM classifier is obtained by solving a single system of linear equations, in the usually small dimensional input space for a linear classifier
  • Test set correctness comparable to standard SVM
  • Much faster than standard SVMs: typically an order of magnitude less time

  23. Future Work
  • Extension of PSVM to multicategory classification
  • Massive data classification using an incremental PSVM
  • Parallel extension and implementation of PSVM
