ChemModLab: A Web-based Cheminformatics Modeling Laboratory S. Stanley Young + ECCR and ChemSpider Teams.

1 ChemModLab: A Web-based Cheminformatics Modeling Laboratory
S. Stanley Young + ECCR and ChemSpider Teams

2 ChemSpider: A Web-based Chemical Informatics Resource

3 What is ChemSpider?
ChemSpider is a molecular structure-centric web service for chemists:
- Chemical structure drawing, manipulation, visualization, modeling & databasing
- Web location to deposit, curate and enhance data associated with chemical structures
- Web structure-based access to federated chemistry databases representing chemical vendors, literature, online data, patents and other forms of chemistry data

4 How do people generally use ChemSpider?
Searching for chemical structures, in rank order, via:
- Registry numbers, trade names and synonyms
- Structure identifiers such as SMILES or InChI
- Intrinsic properties: commonly mass-based searches executed by mass spectrometrists
- Systematic names: IUPAC or CAS Index name
Generation of physicochemical properties
Text-based searching of Open Access articles
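A mass-based search of the kind mentioned above can be sketched in a few lines. This is only an illustration, not ChemSpider's implementation: the function names, the tiny in-memory record list and the 10 ppm default tolerance are all assumptions. It uses fluoxetine (the active ingredient of Prozac, C17H18F3NO) as the query.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of a few common elements.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "F": 18.9984032,
}


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C17H18F3NO'.

    Handles only element symbols followed by optional counts
    (no brackets, charges or isotope labels).
    """
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC_MASS[element] * (int(count) if count else 1)
    return total


def mass_search(records, target_mass, tolerance_ppm=10.0):
    """Return names of records whose mass is within tolerance_ppm of target_mass."""
    window = target_mass * tolerance_ppm / 1e6
    return [
        name
        for name, formula in records
        if abs(monoisotopic_mass(formula) - target_mass) <= window
    ]


# Hypothetical two-record "database" of (name, molecular formula) pairs.
records = [("fluoxetine", "C17H18F3NO"), ("caffeine", "C8H10N4O2")]
hits = mass_search(records, 309.1340)
print(hits)
```

A real service would index precomputed masses rather than recompute them per query; the linear scan here is only to keep the sketch short.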

5 ChemSpider Status, August 2007
Online database of over 16.5 million structures
Systems in place for:
- Single structure and data collection depositions
- Association of analytical data with structures
- Ability to curate data for each individual record
Indexing of and integration to:
- Over 70 individual databases
- Patents from the US, European and Asian patent offices
- Text-based searching of over 50,000 Open Access articles
Over a thousand unique users access ChemSpider per day

6 Flexible Boolean Searching

7 Predicted Properties Details: “Prozac”

8 Search result: 49 hits in 2.8 seconds

9 Integrated Visualization Tools

10 External Integrations - Wikipedia
The links between Wikipedia and ChemSpider are formed automatically.

11 What is ChemModLab?
ChemModLab is a web service for building and evaluating QSAR models.
- Send your data: assay results and SD file.
- Use any or all of five descriptor types (2D). (Use your own descriptors.)
- Use any or all of 16 statistical modeling methods.
- Predict the potency of untested compounds.

12 Virtual Screening: ChemSpider, ChemModLab

13 ChemModLab Dialog (1): Data Input

14 ChemModLab Dialog (2): Five 2D Descriptor Sets

15 ChemModLab Dialog (3): 16 Modeling Methods

16 ChemModLab Modeling Methods
16 Statistical Modeling Methods
- Trees: randomForest, rpart, tree
- Neural networks
- k-nearest neighbors
- Support vector machines
- Partial least squares
- Partial least squares with linear discriminant analysis
- Least angle regression
- Ridge regression
- Elastic net
- Principal components regression
- Family ensemble of k-nearest neighbors, using 70% selection
- Family ensemble of tree, using 70% selection
- Family ensemble of rpart, using 70% selection
- randomForest using 70% selection
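With five descriptor sets and 16 methods, a user who selects everything asks for up to 80 descriptor/method combinations. The short sketch below simply enumerates that grid. The method labels are copied from the slide; the descriptor set names are placeholders, since this slide does not name them, and nothing here reflects how ChemModLab schedules its runs.

```python
from itertools import product

# Method labels taken from the slide above.
methods = [
    "randomForest", "rpart", "tree",
    "neural networks",
    "k-nearest neighbors",
    "support vector machines",
    "partial least squares",
    "partial least squares with linear discriminant analysis",
    "least angle regression",
    "ridge regression",
    "elastic net",
    "principal components regression",
    "family ensemble of k-nearest neighbors (70% selection)",
    "family ensemble of tree (70% selection)",
    "family ensemble of rpart (70% selection)",
    "randomForest (70% selection)",
]

# Placeholder names: the five 2D descriptor sets are not named on this slide.
descriptor_sets = [f"descriptor_set_{i}" for i in range(1, 6)]

# Every (descriptor set, method) pair a user could request.
combinations = list(product(descriptor_sets, methods))
print(len(methods), len(descriptor_sets), len(combinations))
```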

17 ECCR@NCSU + ChemSpider Plan
1. User submits data to ChemModLab to get QSAR model(s).
2. Model is sent to ChemSpider.
3. ChemSpider computes a “virtual screen”.
4. The hit list is clustered and sent to the user.
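The plan above can be illustrated end to end with a toy example. Everything in this sketch is an assumption made for illustration: fingerprints are plain sets of integers, the "model" is a nearest-neighbor stand-in that scores a compound by its best Tanimoto similarity to a known active, and the clustering is a simple leader algorithm. The slides do not say which model format or clustering method the two services would actually exchange or use.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets of bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def fit_model(training):
    """Step 1 stand-in: the 'model' is just the fingerprints of the known actives."""
    return [fingerprint for fingerprint, active in training if active]


def virtual_screen(model, library, top_n):
    """Step 3 stand-in: score each library compound, return the top_n identifiers."""
    scored = [
        (max(tanimoto(fingerprint, ref) for ref in model), compound_id)
        for compound_id, fingerprint in library.items()
    ]
    # Highest score first; ties broken alphabetically so the result is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [compound_id for _, compound_id in scored[:top_n]]


def leader_cluster(hit_ids, library, threshold):
    """Step 4 stand-in: a hit joins the first cluster whose leader is similar enough."""
    clusters = []
    for compound_id in hit_ids:
        for cluster in clusters:
            leader = cluster[0]
            if tanimoto(library[compound_id], library[leader]) >= threshold:
                cluster.append(compound_id)
                break
        else:
            clusters.append([compound_id])
    return clusters


# Hypothetical data: (fingerprint, is_active) pairs and a four-compound library.
training = [({1, 2, 3}, True), ({7, 8, 9}, True), ({4, 5, 6}, False)]
library = {
    "cpd_a": {1, 2, 3, 4},
    "cpd_b": {1, 2, 3, 5},
    "cpd_c": {7, 8, 9, 10},
    "cpd_d": {20, 21},
}

model = fit_model(training)
hits = virtual_screen(model, library, top_n=3)
clusters = leader_cluster(hits, library, threshold=0.5)
print(hits)
print(clusters)
```

Clustering the hit list before returning it means the user sees one representative group per chemical series rather than many near-duplicates of the top-scoring compound.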

18 Accumulation Curves
Compare descriptor sets, given a method

19 Accumulation Curves
Compare modeling methods, given a descriptor set
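The slides do not define the accumulation curve, so the sketch below assumes one common reading: rank compounds by predicted activity, then count how many true actives have been found after testing the top 1, 2, 3, ... compounds. A curve that rises faster indicates a better descriptor set or method. The scores and activity labels are made up for illustration.

```python
def accumulation_curve(scores, is_active):
    """Cumulative number of actives found when testing compounds in score order.

    scores: predicted activity per compound (higher means tested earlier).
    is_active: the true activity label per compound.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve = []
    found = 0
    for index in order:
        if is_active[index]:
            found += 1
        curve.append(found)
    return curve


# Hypothetical predictions from two models on the same five compounds.
actives = [True, True, False, False, True]
scores_a = [0.9, 0.8, 0.4, 0.3, 0.1]
scores_b = [0.2, 0.9, 0.1, 0.8, 0.5]

curve_a = accumulation_curve(scores_a, actives)
curve_b = accumulation_curve(scores_b, actives)
print(curve_a)
print(curve_b)
```

In this toy data, model A finds two actives within its first two picks while model B needs three, so A's curve lies above B's early on, even though both end at the same total.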

20 Diversity Map
Cluster Active Compounds, Modeling Methods

21 Continuous Response

22 Continuous Response

23 Continuous Response

24 Model Evaluation
Take detailed looks at which models?
AID348 (NCGC):
- KNN – Ph
- ENet – CAP
- RF – B#
- RF – CAP
- RF – FF
- Tree – CAP
- Tree – Ph
- Tree – FF
- PLS – CAP

25 Summary
1. ChemSpider is a web chemical informatics center.
2. ChemModLab is a free web service for QSAR.
3. Together they support sophisticated virtual screening.
* ChemModLab is supported by the NCI RoadMap project.

26 ECCR@NCSU Group, ChemSpider Group
ChemModLab Team: Jacqueline M. Hughes-Oliver, Atina D. Brooks, Gary W. Howell, Kirtesh Patil, Stan Young, Qianyi Zhang
ChemSpider Team: Antony Williams (project lead); a rotating team of advisors and developers, including many contributions from the Open Source community
eccr.stat.ncsu.edu
www.chemspider.com

