Introduction Support Vector Regression QSAR Problems and Data SVMs for QSAR

meena

Presentation Transcript

  1. Introduction • Support Vector Regression • QSAR Problems and Data • SVMs for QSAR • Linear Program Feature Selection • Model Selection and Bagging • Computational Results • Discussion

  2. Support Vector Regression: ε-insensitive loss function
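
The slide names the ε-insensitive loss without showing it. In its usual textbook form it is L(y, f(x)) = max(0, |y − f(x)| − ε), so errors inside the ε-tube cost nothing. A minimal sketch follows; the function names and sample values are illustrative assumptions, not taken from the presentation.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Return max(0, |y_true - y_pred| - eps); zero inside the eps-tube."""
    return max(0.0, abs(y_true - y_pred) - eps)


def mean_eps_insensitive_loss(ys, preds, eps):
    """Average the eps-insensitive loss over paired targets and predictions."""
    losses = [eps_insensitive_loss(y, p, eps) for y, p in zip(ys, preds)]
    return sum(losses) / len(losses)


# Hypothetical values: the first error is inside the tube, the second is not.
inside = eps_insensitive_loss(1.0, 1.05, eps=0.1)
outside = eps_insensitive_loss(1.0, 1.5, eps=0.1)
```

Only the part of an error that exceeds ε is penalized, and the penalty grows linearly rather than quadratically.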

  3. Quadratic SVMs with L2-norm
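
The equations on this slide did not survive in the transcript. For orientation, the standard ε-SVR primal with an L2-norm regularizer is a quadratic program of the form below. It is an assumption that the slide showed this variant. Here n is the number of molecules, w the weight vector, b the bias, and ξ, ξ* the slack variables.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad
  & \tfrac{1}{2}\lVert w\rVert_2^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right)\\
\text{s.t.}\quad
  & y_i - (w\cdot x_i + b) \le \varepsilon + \xi_i,\\
  & (w\cdot x_i + b) - y_i \le \varepsilon + \xi_i^*,\\
  & \xi_i,\ \xi_i^* \ge 0,\qquad i = 1,\dots,n.
\end{aligned}
```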

  4. Linear SVMs with L1-norm (ν-SVR)
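
The slide's formulation is missing from the transcript. One common way to write a linear ν-SVR with an L1-norm penalty is as the linear program sketched below, where the weights are split as w = u − v with u, v ≥ 0 so that the L1 norm becomes a linear sum. It is an assumption that the presentation used this form. Here n is the number of molecules, m the number of descriptors, and the tube width ε is itself a variable controlled by ν.

```latex
\begin{aligned}
\min_{u,\,v,\,b,\,\xi,\,\xi^*,\,\varepsilon}\quad
  & \sum_{j=1}^{m}\left(u_j + v_j\right)
    + C\left(\frac{1}{n}\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right) + \nu\,\varepsilon\right)\\
\text{s.t.}\quad
  & y_i - \big((u - v)\cdot x_i + b\big) \le \varepsilon + \xi_i,\\
  & \big((u - v)\cdot x_i + b\big) - y_i \le \varepsilon + \xi_i^*,\\
  & u_j,\ v_j \ge 0,\quad \xi_i,\ \xi_i^* \ge 0,\quad \varepsilon \ge 0.
\end{aligned}
```

Because both the objective and the constraints are linear, any LP solver can handle it, and the L1 penalty tends to drive many weights exactly to zero.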

  5. QSAR Problems and Data: Preparation of Input Data (Bioactivity Value, Structures) • 3D Geometry Optimization • Calculation of Descriptors • SVMs for QSAR • Statistical Analysis • QSAR Model Building

  6. Data Sets • HIV dataset: five classes of anti-HIV molecules, 64 molecules, 620 descriptors • Lombardo benchmark dataset: blood-brain barrier partitioning dataset, 62 molecules, 649 descriptors

     Data Matrix

                   descriptor 1   descriptor 2   ...   descriptor m   Activity
     Molecule 1    x11            x12            ...   x1m            ln BB
     Molecule 2    x21            x22            ...   x2m            ln BB
     ...
     Molecule n    xn1            xn2            ...   xnm            ln BB

  7. Data Matrix

                   descriptor 1   descriptor 2   descriptor 3   ...   descriptor m   Activity
     Molecule 1    x11            x12            x13            ...   x1m            ln BB
     Molecule 2    x21            x22            x23            ...   x2m            ln BB
     ...
     Molecule n    xn1            xn2            xn3            ...   xnm            ln BB

  8. SVMs for QSAR: Construct Datasets • Model Selection (C, ε, ν, σ) • Feature Selection • Bagging Models • Optimize Model • Final Model

  9. Linear Program Feature Selection
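
Only the slide title survives here. A common way to turn an L1-norm linear model into a feature selector is to keep the descriptors whose fitted weights are nonzero. The sketch below shows just that last step; it is an assumption that the presentation's method works this way. The helper name, the tolerance, and the weight values are hypothetical.

```python
def select_features(weights, names, tol=1e-8):
    """Keep the descriptor names whose weight magnitude exceeds tol."""
    return [name for w, name in zip(weights, names) if abs(w) > tol]


# Hypothetical weights, as might come out of a sparse linear program.
weights = [0.0, 1.3, 0.0, -0.7, 0.0]
names = ["descriptor1", "descriptor2", "descriptor3", "descriptor4", "descriptor5"]
selected = select_features(weights, names)
```

The tolerance guards against solver round-off, since LP solvers may return tiny nonzero values for weights that are effectively zero.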

  10. Model Selection • Choose SVM model parameters: C, ε or ν, σ • Select evaluation function Q² • Evaluate on testing data • Adjust using cross validation
      Bagging • Different validation sets give different models • Many local minima in SVM parameter search • Average models
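
The two ideas on this slide, scoring a model on held-out data and averaging several models, can be sketched as follows. The slides do not define Q². The code assumes it is the squared prediction error divided by the variation of the targets about their mean, so lower is better, which fits the reading of slides 11 and 12. The function names and the toy models are hypothetical.

```python
def q2_score(y_true, y_pred):
    """Assumed definition: sum of squared errors over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    sst = sum((y - mean_y) ** 2 for y in y_true)
    return sse / sst


def bagged_predict(models, x):
    """Average the predictions of several models (each a callable) at x."""
    preds = [model(x) for model in models]
    return sum(preds) / len(preds)


# Toy stand-ins for models trained on different training/validation splits.
models = [lambda x: 2 * x, lambda x: 2 * x + 1, lambda x: 2 * x - 1]
averaged = bagged_predict(models, 3)
```

Under this definition a perfect model scores 0 and a model that always predicts the mean scores 1. Averaging smooths out the differences between models that arise from different splits.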

  11. Computational Results

      Methods (10-fold CV)   Full Data (649)    LP FS (21)        NN SA (9)
                             Q²      q²         Q²      q²        Q²      q²
      L1-SVM                 .384    .382       .157    .153      .219    .217
      L2-SVM                 .310    .292       .171    .160      .247    .245
      NN                     .320    .301       .222    .193      .247    .238

  12. Discussion • Robust optimization methods • LPFS outperforms NNSA • L1-SVM can run faster than L2-SVM • ? May improve LPFS method • ? May improve performance of L1-SVM
      This work is supported by NSF (IIS-9979860 and 970923).