A Practical Guide to SVM

Yihua Liao

Dept. of Computer Science

2/3/03


Outline

  • Support vector machine basics

  • GIST

  • LIBSVM (SVMLight)


Classification problems

  • Given: n training pairs (xi, yi), where

    xi = (xi1, xi2, …, xil) is an input vector, and yi = +1/-1 is the corresponding classification, H+ / H-

  • Output: a label y for a new vector x
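As a tiny worked illustration of this setup (the numbers and the discriminator w, b below are made up; in practice they come out of training):

```python
def classify(x, w, b):
    """Assign a label to a new vector x with a linear discriminator: y = sign(w . x + b)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical discriminator (w, b) and a new input vector:
w, b = [0.8, -0.5], 0.1
print(classify([1.0, 0.4], w, b))   # 1, i.e. class H+
```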


Support vector machines

Goal: to find the discriminator that maximizes the margin


A little math

  • Primal problem

  • Decision function
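The equation images on this slide did not survive the transcript. The standard soft-margin primal problem and kernel decision function that these bullets refer to are (a reconstruction of the usual formulations, not the original slide content):

```latex
% Soft-margin primal problem:
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad y_i\,(w\cdot x_i + b) \ge 1-\xi_i,\ \ \xi_i \ge 0

% Decision function (kernel form):
f(x) = \operatorname{sgn}\Big(\sum_{i=1}^{n}\alpha_i\,y_i\,K(x_i,\,x) + b\Big)
```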


Example

  • Functional classification of yeast genes based on DNA microarray expression data.

  • Training dataset

    • genes that are known to have the same function f

    • genes that are known to have a different function than f


Gist

  • http://microarray.cpmc.columbia.edu/gist/

  • Developed by William Stafford Noble et al.

  • Contains tools for SVM classification, feature selection and kernel principal components analysis.

  • Linux/Solaris. Installation is straightforward.


Data files

  • Sample.mtx (tab-delimited; test files use the same format)

    gene alpha_0X alpha_7X alpha_14X alpha_21X …

    YMR300C -0.1 0.82 0.25 -0.51 …

    YAL003W 0.01 -0.56 0.25 -0.17 …

    YAL010C -0.2 -0.01 -0.01 -0.36 …

  • Sample.labels

    gene Respiration_chain_complexes.mipsfc

    YMR300C -1

    YAL003W 1

    YAL010C -1
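Both files are plain tab-delimited text with a header row, so they can be parsed with a few lines of standard-library Python (a minimal sketch; in real use you would pass `open(path).read()` to these helpers):

```python
import csv
from io import StringIO

def read_mtx(text):
    """Parse Gist-style tab-delimited matrix text into (column names, {gene: [values]})."""
    rows = list(csv.reader(StringIO(text), delimiter="\t"))
    header, data = rows[0], rows[1:]
    return header[1:], {r[0]: [float(v) for v in r[1:]] for r in data}

def read_labels(text):
    """Parse Gist-style labels text into {gene: +1/-1}."""
    rows = list(csv.reader(StringIO(text), delimiter="\t"))
    return {r[0]: int(r[1]) for r in rows[1:]}

cols, matrix = read_mtx("gene\talpha_0X\talpha_7X\nYMR300C\t-0.1\t0.82\n")
print(cols, matrix)   # ['alpha_0X', 'alpha_7X'] {'YMR300C': [-0.1, 0.82]}
```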


Usage of Gist

  • $compute-weights -train sample.mtx -class sample.labels > sample.weights

  • $classify -train sample.mtx -learned sample.weights -test test.mtx > test.predict

  • $score-svm-results -test test.labels test.predict sample.weights


Test.predict

# Generated by classify
# Gist, version 2.0

….

gene classification discriminant

YKL197C -1 -3.349

YGL022W -1 -4.682

YLR069C -1 -2.799

YJR121W 1 0.7072


Output of score-svm-results

Number of training examples: 1644 (24 positive, 1620 negative)

Number of support vectors: 60 (14 positive, 46 negative) 3.65%

Training results: FP=0 FN=3 TP=21 TN=1620

Training ROC: 0.99874

Test results: FP=12 FN=1 TP=9 TN=801

Test ROC: 0.99397


Parameters

  • compute-weights

    • -power <value>

    • -radial

    • -widthfactor <value>

    • -posconstraint <value>

    • -negconstraint <value>


Rules of thumb

  • Radial basis kernel usually performs better.

  • Scale your data: map each attribute to [0, 1] or [-1, +1] to avoid over-fitting.

  • Try different penalty parameters C for the two classes in case of unbalanced data.
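The scaling advice can be sketched in plain Python (a minimal column-wise min-max rescaling; note that test data should be rescaled with the parameters computed on the training data):

```python
def scale_columns(rows, lo=-1.0, hi=1.0):
    """Rescale each attribute (column) of a list of feature vectors to [lo, hi]."""
    scaled_cols = []
    for col in zip(*rows):
        cmin, cmax = min(col), max(col)
        span = cmax - cmin
        if span == 0:                       # constant attribute: map to the midpoint
            scaled_cols.append([(lo + hi) / 2] * len(col))
        else:
            scaled_cols.append([lo + (v - cmin) * (hi - lo) / span for v in col])
    return [list(r) for r in zip(*scaled_cols)]

print(scale_columns([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]))
# [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
```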


LIBSVM

  • http://www.csie.ntu.edu.tw/~cjlin/libsvm/

  • Developed by Chih-Jen Lin et al.

  • Tools for (multi-class) SV classification and regression.

  • C++/Java/Python/Matlab/Perl

  • Linux/UNIX/Windows

  • SMO implementation, fast!


Data files for LIBSVM

  • Training.dat

    +1 1:0.708333 2:1 3:1 4:-0.320755

    -1 1:0.583333 2:-1 4:-0.603774 5:1

    +1 1:0.166667 2:1 3:-0.333333 4:-0.433962

    -1 1:0.458333 2:1 3:1 4:-0.358491 5:0.374429

  • Testing.dat
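The sparse `label index:value` format above (indices absent from a line are implicitly zero) can be parsed with a few lines of standard-library Python — a sketch for inspection only, since LIBSVM reads these files itself:

```python
def parse_libsvm_line(line):
    """Parse one LIBSVM-format line into (label, {index: value})."""
    parts = line.split()
    label = int(parts[0])
    features = {}
    for tok in parts[1:]:
        idx, val = tok.split(":")
        features[int(idx)] = float(val)   # missing indices mean the feature is zero
    return label, features

print(parse_libsvm_line("+1 1:0.708333 2:1 3:1 4:-0.320755"))
# (1, {1: 0.708333, 2: 1.0, 3: 1.0, 4: -0.320755})
```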


Usage of LIBSVM

  • $svm-train -c 10 -w1 1 -w-1 5 Train.dat My.model

    - trains a classifier with penalty 10 for class +1 and penalty 50 for class -1, using the (default) RBF kernel

  • $svm-predict Test.dat My.model My.out

  • $svm-scale Train_Test.dat > Scaled.dat


Output of LIBSVM

  • svm-train

    optimization finished, #iter = 219

    nu = 0.431030

    obj = -100.877286, rho = 0.424632

    nSV = 132, nBSV = 107

    Total nSV = 132


Output of LIBSVM

  • svm-predict

    Accuracy = 86.6667% (234/270) (classification)

    Mean squared error = 0.533333 (regression)

    Squared correlation coefficient = 0.532639 (regression)

  • Calculate FP, FN, TP, TN from My.out
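A minimal sketch of that last step, assuming My.out holds one predicted label (+1/-1) per line and the true labels are available in the same order:

```python
def confusion_counts(true_labels, predicted_labels):
    """Count TP, TN, FP, FN for +1/-1 labels (positive class = +1)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    return tp, tn, fp, fn

# e.g. read the predictions with: preds = [int(line) for line in open("My.out")]
print(confusion_counts([1, 1, -1, -1], [1, -1, -1, 1]))
# (1, 1, 1, 1)
```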

