Published by Barry Price. Modified over 9 years ago.
1
ChemModLab: A Web-based Cheminformatics Modeling Laboratory S. Stanley Young + ECCR and ChemSpider Teams
2
ChemSpider : A Web-based Chemical Informatics Resource
3
What is ChemSpider?
ChemSpider is a molecular structure-centric web service for chemists:
- Chemical structure drawing, manipulation, visualization, modeling & databasing
- Web location to deposit, curate and enhance data associated with chemical structures
- Web structure-based access to federated chemistry databases representing chemical vendors, literature, online data, patents and other forms of chemistry data
4
How do people generally use ChemSpider?
- Searching for chemical structures, in rank order, via:
  - Registry numbers, trade names and synonyms
  - Structure identifiers such as SMILES or InChI
  - Intrinsic properties: commonly mass-based searches executed by mass spectrometrists
  - Systematic names: IUPAC or CAS Index name
- Generation of physicochemical properties
- Text-based searching of Open Access articles
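A mass-based search of the kind mass spectrometrists run can be illustrated with a minimal sketch. This is not ChemSpider's implementation: the three-record list is a hypothetical stand-in for a structure database, the function names are invented for illustration, and the 5 ppm tolerance is an assumed default.

```python
import re

# Monoisotopic masses (in daltons) of the most abundant isotope of each element.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "F": 18.99840316,
    "Cl": 34.96885268,
}

def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C17H18F3NO'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC_MASS[symbol] * (int(count) if count else 1)
    return total

def search_by_mass(records, target_mass, tolerance_ppm=5.0):
    """Return names whose mass lies within tolerance_ppm of target_mass."""
    window = target_mass * tolerance_ppm / 1e6
    return [name for name, formula in records
            if abs(monoisotopic_mass(formula) - target_mass) <= window]

# Hypothetical three-record stand-in for a structure database.
records = [
    ("fluoxetine", "C17H18F3NO"),
    ("caffeine", "C8H10N4O2"),
    ("aspirin", "C9H8O4"),
]

hits = search_by_mass(records, target_mass=309.1340)
```

With these records, `hits` contains only fluoxetine (the active ingredient of Prozac), whose monoisotopic mass is about 309.1340 Da.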
5
ChemSpider Status, August 2007
- Online database of over 16.5 million structures
- Systems in place for:
  - Single structure and data collection depositions
  - Association of analytical data with structures
  - Ability to curate data for each individual record
- Indexing of and integration to:
  - Over 70 individual databases
  - Patents from the US, European and Asian Patent offices
  - Text-based searching of over 50,000 Open Access articles
- Over a thousand unique users access ChemSpider per day
6
Flexible Boolean Searching
7
Predicted Properties Details: “Prozac”
8
Search result: 49 hits in 2.8 seconds
9
Integrated Visualization Tools
10
External Integrations - Wikipedia
The links between Wikipedia and ChemSpider are formed automatically.
11
What is ChemModLab?
ChemModLab is a web service for building and evaluating QSAR models.
- Send your data: assay results and SD file.
- Use any or all of five descriptor types (2D).
  - (Use your own descriptors)
- Use any or all of 16 statistical modeling methods.
- Predict potency of untested compound.
12
Virtual Screening
ChemSpider, ChemModLab
13
ChemModLab Dialog (1): Data Input
14
ChemModLab Dialog (2): Five 2D Descriptor Sets
15
ChemModLab Dialog (3): 16 Modeling Methods
16
ChemModLab Modeling Methods
16 Statistical Modeling Methods
- Trees: randomForest, rpart, tree
- Neural networks
- k-nearest neighbors
- Support vector machines
- Partial least squares
- Partial least squares with linear discriminant analysis
- Least angle regression
- Ridge regression
- Elastic net
- Principal components regression
- Family ensemble of k-nearest neighbors, using 70% selection
- Family ensemble of tree, using 70% selection
- Family ensemble of rpart, using 70% selection
- randomForest using 70% selection
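To make one of the listed methods concrete, here is a minimal k-nearest neighbors sketch. It is illustrative only and is not ChemModLab's code: the two-number descriptors, the labels, the function name `knn_predict`, and the use of Euclidean distance are all assumptions.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Predict a label by majority vote among the k nearest training compounds."""
    # Rank training compounds by Euclidean distance in descriptor space.
    ranked = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: (descriptor vector, assay label).
training = [
    ((0.1, 0.2), "inactive"),
    ((0.2, 0.1), "inactive"),
    ((0.3, 0.3), "inactive"),
    ((0.8, 0.9), "active"),
    ((0.9, 0.8), "active"),
    ((0.7, 0.7), "active"),
]

prediction = knn_predict(training, query=(0.75, 0.8), k=3)
```

Here the three nearest neighbors of the query are all labeled active, so `prediction` is `"active"`.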
17
ECCR@NCSU + ChemSpider Plan
1. User submits data to ChemModLab to get QSAR Model(s).
2. Model is sent to ChemSpider.
3. ChemSpider computes a “virtual screen”.
4. The hit-list is clustered and sent to the user.
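The screening and clustering steps of this plan can be sketched as follows. Everything here is a hypothetical stand-in: `toy_model` replaces a fitted QSAR model, the small dictionary replaces the ChemSpider structure collection, and leader clustering on Tanimoto similarity is just one simple clustering choice, not necessarily the one the teams used.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' descriptor bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def virtual_screen(library, score, top_n):
    """Rank library compounds by predicted score and keep the top_n as hits."""
    ranked = sorted(library, key=lambda name: score(library[name]), reverse=True)
    return ranked[:top_n]

def leader_cluster(hits, library, threshold=0.5):
    """Assign each hit to the first cluster whose leader is similar enough."""
    clusters = []  # each cluster is a list of names; the first name is the leader
    for name in hits:
        for cluster in clusters:
            if tanimoto(library[name], library[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no similar leader found: start a new cluster
    return clusters

# Hypothetical library: compound name -> set of descriptor bits.
library = {
    "cmpd_a": {1, 2, 3, 4},
    "cmpd_b": {1, 2, 3, 5},
    "cmpd_c": {1, 8, 9},
    "cmpd_d": {6, 7},
    "cmpd_e": {1, 2, 8, 9},
}

def toy_model(bits):
    """Hypothetical stand-in for a fitted QSAR model: counts favourable bits."""
    return len(bits & {1, 2, 3, 9})

hits = virtual_screen(library, toy_model, top_n=4)
clusters = leader_cluster(hits, library)
```

With this toy data the hit list is `cmpd_a`, `cmpd_b`, `cmpd_e`, `cmpd_c`, and it falls into two clusters: `cmpd_a` with `cmpd_b`, and `cmpd_e` with `cmpd_c`.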
18
Accumulation Curves
Compare descriptor sets, given a method
19
Accumulation Curves
Compare modeling methods, given a descriptor set
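An accumulation curve is commonly understood as the running count of active compounds found when compounds are tested in order of decreasing predicted activity; a curve that rises faster indicates a better ranking. The sketch below assumes that definition, and the scores and activity labels are made-up illustration data, not results from the presentation.

```python
def accumulation_curve(scores, is_active):
    """Cumulative count of actives found when testing in descending score order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve, found = [], 0
    for i in order:
        found += is_active[i]
        curve.append(found)
    return curve

# Hypothetical assay outcome (1 = active) and predicted scores from two models.
is_active = [1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7]
model_b = [0.3, 0.9, 0.5, 0.8, 0.2, 0.6]

curve_a = accumulation_curve(model_a, is_active)
curve_b = accumulation_curve(model_b, is_active)
```

In this toy case `curve_a` is `[1, 2, 3, 3, 3, 3]` and `curve_b` is `[0, 0, 1, 2, 3, 3]`, so model A finds all three actives within the first three compounds tested while model B needs five.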
20
Diversity Map
Cluster Active Compounds
Modeling Methods
21
Continuous Response
22
Continuous Response
23
Continuous Response
24
Model Evaluation
Take detailed looks at which models?
AID348 (NCGC):
- KNN – Ph
- ENet – CAP
- RF – B#
- RF – CAP
- RF – FF
- Tree – CAP
- Tree – Ph
- Tree – FF
- PLS – CAP
25
Summary
1. ChemSpider is a web chemical informatics center.
2. ChemModLab is a free web service for QSAR.
3. Together they support sophisticated virtual screening.
* ChemModLab is supported by the NCI RoadMap project.
26
ECCR@NCSU Group, ChemSpider Group
ChemModLab Team
- Jacqueline M. Hughes-Oliver
- Atina D. Brooks
- Gary W. Howell
- Kirtesh Patil
- Stan Young
- Qianyi Zhang
ChemSpider Team
- Antony Williams (project lead)
- A rotating team of advisors and developers including many contributions from the Open Source community
eccr.stat.ncsu.edu
www.chemspider.com