
Gist 2.3

Gist 2.3. John H. Phan, MIBLab Summer Workshop, June 28th, 2006.


Presentation Transcript


  1. Gist 2.3 • John H. Phan • MIBLab Summer Workshop • June 28th, 2006

  2. Overview • Gist 2.3 Tools • Support Vector Machine (SVM) classification • Kernel Principal Component Analysis (KPCA)

  3. Gist 2.3 Overview • Gist is a set of command line programs written in C • Primary programs • SVM and KPCA • Auxiliary programs • Ranking and feature selection • Web interface for the SVM component

  4. Support Vector Machines • Supervised classification method • Maximal margin hyperplane • http://www.dtreg.com/svm.htm
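For reference, the "maximal margin hyperplane" on slide 4 is usually written as the textbook hard-margin problem below. This is the standard formulation, assumed here for illustration; the slides do not state the exact optimization that Gist solves. The symbols $w$, $b$, $x_i$, and $y_i \in \{-1, +1\}$ are not from the slides.

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i\,(w \cdot x_i + b) \ \ge\ 1, \qquad i = 1, \dots, n
```

The two classes are then separated by a margin of width $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes the margin.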

  5. Primary Gist Programs • gist-train-svm – train a support vector machine • gist-classify – classify points with a trained support vector machine • gist-fast-classify – classification optimized for the linear case • gist-kpca – kernel principal component analysis • gist-project – project points onto KPCA components

  6. Auxiliary Gist Programs • gist-fselect – linear feature selection • gist-matrix – basic matrix manipulations • gist-score-svm – performance of gist-train-svm and gist-classify • gist-rfe – recursive feature elimination • gist-sigmoid – classification probabilities • gist2html – convert output to HTML • gist-kernel – create a square kernel matrix

  7. gist-train-svm • Train a support vector machine • Input file is tab-delimited but transposed • Output file contains 5 columns: label, binary classification, SVM weights, predicted classification, discriminant value

  8. gist-fselect – Feature Selection • Fisher Criterion Score • t-test • Welch t-test • Mann-Whitney • SAM (significance analysis of microarrays) • Threshold number of mis-classifications
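Two of the scores on slide 8 can be sketched for a single feature as follows. This is a minimal illustration using the usual textbook definitions, with sample variances (n - 1 denominator) assumed; it is not the gist-fselect source, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def fisher_score(pos, neg):
    """Fisher criterion score: (mean+ - mean-)^2 / (var+ + var-).

    Assumes sample variances; a constant feature (zero variance in both
    classes) would need a guard against division by zero.
    """
    return (mean(pos) - mean(neg)) ** 2 / (variance(pos) + variance(neg))


def welch_t(pos, neg):
    """Welch t statistic: does not assume equal variances in the two classes."""
    standard_error = sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
    return (mean(pos) - mean(neg)) / standard_error


# Values of one feature (for example, one gene) in each class.
positive_values = [2.0, 4.0]
negative_values = [1.0, 3.0]
score = fisher_score(positive_values, negative_values)
t_stat = welch_t(positive_values, negative_values)
```

Features are then ranked by score, highest first, and the top ones are kept.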

  9. gist-score-svm • Compute false and true positives on training and test sets • Compute area under the ROC curves for training and test sets
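The quantities on slide 9 can be sketched from labels and discriminant values as below. This is an illustrative calculation, not the gist-score-svm implementation; it assumes labels of +1/-1 and a decision threshold of 0, and the function names are hypothetical.

```python
def confusion_counts(labels, discriminants, threshold=0.0):
    """Count (tp, fp, tn, fn), predicting +1 when the discriminant exceeds threshold."""
    tp = fp = tn = fn = 0
    for label, value in zip(labels, discriminants):
        predicted_positive = value > threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == -1:
            fp += 1
        elif not predicted_positive and label == -1:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(labels, discriminants):
    """Area under the ROC curve as the fraction of (positive, negative) pairs
    in which the positive has the higher discriminant; ties count as half."""
    pos = [d for y, d in zip(labels, discriminants) if y == 1]
    neg = [d for y, d in zip(labels, discriminants) if y == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, -1, -1]
discriminants = [0.9, -0.2, 0.1, -0.8]
counts = confusion_counts(labels, discriminants)
auc = roc_auc(labels, discriminants)
```

The pairwise form of the AUC is quadratic in the number of examples, which is fine for a sketch; a sort-based version would be preferable for large sets.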

  10. gist-rfe • Recursive feature elimination – SVM • Initialize the data to contain all features • Train an SVM on the data • Rank features according to SVM weights • Eliminate the lowest-ranked 50% of features • Repeat until 1 feature is left
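The loop on slide 10 can be sketched as follows. This is a minimal illustration, not the gist-rfe source: the trainer is a small stand-in linear SVM (batch subgradient descent on the regularized hinge loss, no bias term), and the names `train_linear_svm`, `rfe`, `lam`, and `epochs` are assumptions made for the example.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Stand-in linear SVM: batch subgradient descent on L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for t in range(1, epochs + 1):
        eta = 1.0 / (lam * t)                  # decaying step size
        grad = [lam * wj for wj in w]          # gradient of the regularizer
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:                     # point violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - eta * gj for wj, gj in zip(w, grad)]
    return w


def rfe(X, y, train=train_linear_svm):
    """Return feature indices ranked from most to least important."""
    active = list(range(len(X[0])))            # start with all features
    eliminated = []                            # filled in order of elimination
    while len(active) > 1:
        X_active = [[row[j] for j in active] for row in X]
        w = train(X_active, y)                 # train an SVM on surviving features
        order = sorted(range(len(active)), key=lambda k: abs(w[k]))
        n_drop = len(active) // 2              # lower half (rounded down)
        dropped = set(order[:n_drop])
        eliminated.extend(active[k] for k in order[:n_drop])
        active = [f for k, f in enumerate(active) if k not in dropped]
    return active + eliminated[::-1]


# Feature 0 separates the classes strongly, feature 1 weakly, 2 and 3 not at all.
X = [[2.0, 0.2, 0.0, 0.0], [2.5, 0.2, 0.0, 0.0],
     [-2.0, -0.2, 0.0, 0.0], [-2.5, -0.2, 0.0, 0.0]]
y = [1, 1, -1, -1]
ranking = rfe(X, y)
```

Ranking by the absolute value of the weight is one common choice; how a given tool breaks ties or rounds the 50% cut is an implementation detail not covered by the slide.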

  11. Gist SVM Web Interface • SVM Training and Testing • Normalize data by mean centering or z-score • Adjust kernel settings (linear, polynomial, or radial basis) • Demo (http://svm.sdsc.edu/svm-intro.html)
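The three kernel families named on slide 11 are commonly defined as in the sketch below. The parameter names (`coef`, `degree`, `width`) and their defaults are illustrative assumptions, not the web interface's actual settings.

```python
from math import exp


def linear_kernel(x, y):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, coef=1.0, degree=2):
    """(x . y + coef) ** degree."""
    return (linear_kernel(x, y) + coef) ** degree


def rbf_kernel(x, y, width=1.0):
    """Radial basis kernel: exp(-||x - y||^2 / (2 * width^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-squared_distance / (2.0 * width ** 2))


u, v = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(u, v)
k_poly = polynomial_kernel(u, v)
k_rbf_same = rbf_kernel(u, u)
```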

  12. Comparison to MAGMA • MAGMA: Normalizations (row (gene) mean center, row (gene) median center, column mean center, column median center, row z-score, column z-score, quantile); handles missing values • Gist (Web): Normalizations (column (sample) mean center, column (sample) z-score)
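The difference between row and column normalization on slide 12 can be sketched as below. This is an illustration only: it assumes rows are genes and columns are samples, uses the sample standard deviation for the z-score, and the helper names are hypothetical.

```python
from statistics import mean, stdev


def mean_center(values):
    """Subtract the mean so the values average to zero."""
    m = mean(values)
    return [v - m for v in values]


def z_score(values):
    """Mean center, then divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def normalize_rows(matrix, fn):
    """Apply fn to each row (gene) independently."""
    return [fn(list(row)) for row in matrix]


def normalize_columns(matrix, fn):
    """Apply fn to each column (sample) independently."""
    columns = [fn(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
column_centered = normalize_columns(data, mean_center)
row_z = normalize_rows([[1.0, 2.0, 3.0]], z_score)
```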

  13. Comparison to MAGMA • MAGMA: Classifiers (SVM, Fisher’s Discriminant, SDF); Data Representation (visualization of classifiers, database storage) • Gist (Web): Classifiers (SVM); Data Representation (text files, HTML output)

  14. Comparison to MAGMA • MAGMA: Ranking Methods (resubstitution, cross validation, bootstrap, bolstering) • Gist (Web): Ranking Methods (Fisher criterion, t-test, SAM, Mann-Whitney, Welch t-test)
