Introduction Support Vector Regression QSAR Problems and Data SVMs for QSAR Linear Program Feature Selection Model Selection and Bagging Computational.


1 Introduction
- Support Vector Regression
- QSAR Problems and Data
- SVMs for QSAR
- Linear Program Feature Selection
- Model Selection and Bagging
- Computational Results
- Discussion

2 Support Vector Regression: ε-insensitive loss function
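The slide's formula did not survive extraction. As a minimal sketch, the textbook ε-insensitive loss charges nothing for residuals inside a tube of half-width ε and grows linearly outside it. The function name and the sample values below are illustrative assumptions, not taken from the slides.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Return max(0, |y_true - y_pred| - eps): zero inside the tube, linear outside."""
    return max(0.0, abs(y_true - y_pred) - eps)

# Illustrative values (assumed): residuals of 0, 0.25 and 1.0 with eps = 0.5.
losses = [eps_insensitive_loss(1.0, p, eps=0.5) for p in (1.0, 1.25, 2.0)]
print(losses)
```

Only the last prediction falls outside the tube, so only it contributes to the loss.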

3 Quadratic SVMs with L2-norm
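The optimization problem shown on this slide was lost. For orientation, the standard linear ε-SVR primal with an L2-norm regularizer can be written as below. This is the textbook form, given as an assumption; the slide may have shown a kernelized or differently parameterized variant.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad & \tfrac{1}{2}\lVert w\rVert_2^2 + C\sum_{i=1}^{n}\left(\xi_i+\xi_i^*\right) \\
\text{s.t.}\quad & y_i - (w^\top x_i + b) \le \varepsilon + \xi_i, \\
& (w^\top x_i + b) - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i,\ \xi_i^* \ge 0, \qquad i = 1,\dots,n.
\end{aligned}
```

The quadratic term in w is what makes this a quadratic program.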

4 Linear SVMs with L1-norm (-SVR)
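The slide's formulation is missing, and the symbol in front of "-SVR" was lost in extraction. As a hedged sketch, replacing the squared L2 penalty with an L1 penalty and splitting w into nonnegative parts u and v turns the ε-insensitive regression problem into a linear program. The ε-insensitive form and the symbols u, v are assumptions made for illustration.

```latex
\begin{aligned}
\min_{u,\,v,\,b,\,\xi,\,\xi^*}\quad & \sum_{j=1}^{m}(u_j + v_j) + C\sum_{i=1}^{n}\left(\xi_i+\xi_i^*\right) \\
\text{s.t.}\quad & y_i - \big((u-v)^\top x_i + b\big) \le \varepsilon + \xi_i, \\
& \big((u-v)^\top x_i + b\big) - y_i \le \varepsilon + \xi_i^*, \\
& u_j,\ v_j,\ \xi_i,\ \xi_i^* \ge 0, \qquad w = u - v.
\end{aligned}
```

At an optimum at most one of u_j and v_j is nonzero, so the first sum equals the L1 norm of w, and many weights are typically driven exactly to zero.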

5 QSAR Problems and Data; SVMs for QSAR
QSAR workflow:
1. Preparation of input data (bioactivity values, structures)
2. 3D geometry optimization
3. Calculation of descriptors
4. Statistical analysis and QSAR model building

6 Data Sets
- HIV dataset: five classes of anti-HIV molecules, 64 molecules, 620 descriptors
- Lombardo benchmark dataset: blood-brain barrier partitioning dataset, 62 molecules, 649 descriptors
Data Matrix
             descriptor1  descriptor2  ...  descriptor m  Activity
Molecule 1   x11          x12          ...  x1m           ln BB
Molecule 2   x21          x22          ...  x2m           ln BB
...          ...          ...          ...  ...           ...
Molecule n   xn1          xn2          ...  xnm           ln BB

7 Data Matrix
             descriptor1  descriptor2  descriptor3  ...  descriptor m  Activity
Molecule 1   x11          x12          x13          ...  x1m           ln BB
Molecule 2   x21          x22          x23          ...  x2m           ln BB
...          ...          ...          ...          ...  ...           ...
Molecule n   xn1          xn2          xn3          ...  xnm           ln BB

8 SVMs for QSAR
- Construct Datasets
- Feature Selection
- Model Selection (C and other SVM parameters)
- Bagging Models
- Optimize Model
- Final Model

9 Linear Program Feature Selection
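The slide body is missing. One common reading of linear-program feature selection is: fit a sparse L1-norm linear SVR by linear programming, then keep only the descriptors whose weights are nonzero. The sketch below shows just that selection step under this assumption. The function name, tolerance, weights, and descriptor names are all hypothetical.

```python
def select_features(weights, names, tol=1e-8):
    """Keep descriptors whose weight magnitude exceeds tol (treated as nonzero)."""
    return [name for name, w in zip(names, weights) if abs(w) > tol]

# Hypothetical weight vector from an L1-norm linear SVR fit (values invented for illustration).
weights = [0.0, 0.42, 0.0, -1.3, 1e-12]
names = ["descriptor1", "descriptor2", "descriptor3", "descriptor4", "descriptor5"]

selected = select_features(weights, names)
print(selected)
```

The small tolerance guards against weights that an LP solver reports as tiny nonzero values rather than exact zeros.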

10 Bagging
- Different validation sets give different models
- Many local minima in SVM parameter search
- Average models
Model Selection
- Choose SVM model parameters (C and others)
- Select evaluation function Q²
- Evaluate on testing data
- Adjust using cross validation
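The two ideas on this slide can be sketched as follows: train several models on bootstrap resamples, average their predictions, and score the result with Q². Everything here is an assumption for illustration. A 1-D least-squares line stands in for the SVM base learner, and Q² is taken to be the squared prediction error divided by the total variance, so that smaller is better; the slides do not define it.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b for 1-D inputs (stand-in for an SVM)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return a, my - a * mx

def bagged_models(xs, ys, n_models=10, seed=0):
    """Fit one model per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagged_predict(models, x):
    """Average the predictions of all models."""
    return sum(a * x + b for a, b in models) / len(models)

def q2(y_true, y_pred):
    """Assumed definition: sum of squared errors over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return sse / sst

# Toy data (invented): a noisy line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]
models = bagged_models(xs, ys)
preds = [bagged_predict(models, x) for x in xs]
print(q2(ys, preds))
```

Averaging reduces the dependence on any single resample, which is the motivation the slide gives for bagging.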

11 Computational Results
Methods (10-fold CV)   Full Data (649)   LP FS (21)     NN SA (9)
                       Q²      q²        Q²     q²      Q²     q²
L1-SVM                 .384    .382      .157   .153    .219   .217
L2-SVM                 .310    .292      .171   .160    .247   .245
NN                     .320    .301      .222   .193    .247   .238

13 Discussion
- Robust optimization methods
- LPFS outperforms NNSA
- L1-SVM can run faster than L2-SVM
- ? May improve LPFS method
- ? May improve performance of L1-SVM
This work is supported by NSF (IIS-9979860 and 970923)

