Introduction Support Vector Regression QSAR Problems and Data SVMs for QSAR

**Introduction**

• Support Vector Regression
• QSAR Problems and Data
• SVMs for QSAR
• Linear Program Feature Selection
• Model Selection and Bagging
• Computational Results
• Discussion

**Support Vector Regression**

• ε-insensitive loss function

**QSAR Problems and Data**

• Preparation of input data (bioactivity value, structures)
• 3D geometry optimization
• Calculation of descriptors
• SVMs for QSAR
• Statistical analysis
• QSAR model building

**Data Sets**

• HIV dataset: five classes of anti-HIV molecules, 64 molecules, 620 descriptors
• Lombardo benchmark dataset: blood-brain barrier partitioning dataset, 62 molecules, 649 descriptors

**Data Matrix**

| | descriptor 1 | descriptor 2 | descriptor 3 | ... | descriptor m | Activity |
|---|---|---|---|---|---|---|
| Molecule 1 | x11 | x12 | x13 | ... | x1m | ln BB |
| Molecule 2 | x21 | x22 | x23 | ... | x2m | ln BB |
| ... | ... | ... | ... | ... | ... | ... |
| Molecule n | xn1 | xn2 | xn3 | ... | xnm | ln BB |

**SVMs for QSAR**

• Construct datasets
• Model selection (C, ε, ν, σ)
• Feature selection
• Bagging models
• Optimize model
• Final model

**Model Selection**

• Choose SVM model parameters: C, ε or ν, σ
• Select evaluation function Q2
• Evaluate on testing data
• Adjust using cross-validation

**Bagging**

• Different validation sets give different models
• Many local minima in SVM parameter search
• Average models

**Computational Results**

| Method (10-fold CV) | Full Data (649) Q2 | Full Data (649) q2 | LP FS (21) Q2 | LP FS (21) q2 | NN SA (9) Q2 | NN SA (9) q2 |
|---|---|---|---|---|---|---|
| L1-SVM | .384 | .382 | .157 | .153 | .219 | .217 |
| L2-SVM | .310 | .292 | .171 | .160 | .247 | .245 |
| NN | .320 | .301 | .222 | .193 | .247 | .238 |

**Discussion**

• Robust optimization methods
• LPFS outperforms NNSA
• L1-SVM can run faster than L2-SVM
• ? May improve LPFS method
• ? May improve performance of L1-SVM

This work is supported by NSF (IIS-9979860 and 970923)
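
The slides name the ε-insensitive loss and the "average models" step of bagging without spelling either out. The sketch below is one plain-Python reading of both, not the authors' implementation. The function names and the toy models are hypothetical. The `q2_ratio` helper rests on an assumption the slides do not confirm: that Q2 is the sum of squared prediction errors divided by the sum of squared deviations from the mean, so that smaller values are better. That reading is at least consistent with the results, where the LPFS columns have the lowest values and the discussion states that LPFS outperforms NNSA.

```python
from statistics import mean


def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard epsilon-insensitive loss: zero when the error is within
    eps of the target, and linear in the excess error otherwise."""
    return max(0.0, abs(y_true - y_pred) - eps)


def bagged_predict(models, x):
    """Average the predictions of several models.

    Each model is any callable mapping an input to a prediction, for
    example regressors trained on different validation splits.
    """
    return mean(m(x) for m in models)


def q2_ratio(y_true, y_pred):
    """ASSUMPTION (not defined in the slides): Q2 as squared prediction
    error over total squared deviation from the mean; 0 is a perfect fit
    and 1 matches always predicting the mean."""
    y_bar = mean(y_true)
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    total = sum((y - y_bar) ** 2 for y in y_true)
    return press / total


# Hypothetical toy models standing in for regressors trained on
# different training/validation splits.
models = [
    lambda x: 2.0 * x,
    lambda x: 2.0 * x + 0.5,
    lambda x: 2.0 * x - 0.2,
]

if __name__ == "__main__":
    print(eps_insensitive_loss(1.0, 1.05, eps=0.1))  # error inside the tube
    print(eps_insensitive_loss(1.0, 1.5, eps=0.1))   # error outside the tube
    print(bagged_predict(models, 1.0))
    print(q2_ratio([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

Averaging is the simplest way to combine the bagged models; the slides do not say whether the authors weighted the models or averaged them uniformly, so uniform weights are assumed here.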