Published by Wesley Flowers. Modified over 9 years ago.
1
Big Data Supporting Drug Discovery: Cautionary Tales from the World of Chemistry for Translational Informatics
Valery Tkachenko
RSC-CSIR/OSDD meeting, Pune, India
February 3rd, 2014
2
Big Data
Chemical Space
Drug Discovery pipeline
Machine learning
Training sets
RSC/ChemSpider platforms
RSC/Archive
Research data management
Data quality, crowdsourcing and AltMetrics
Building Global Chemistry Network
4
Science map
6
Chemical space - 10^60
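A rough back-of-the-envelope calculation helps convey the scale of the 10^60 estimate above. The sketch below is illustrative only: the enumeration rate of one billion structures per second is an assumption made for the example, not a figure from the talk.

```python
# Back-of-the-envelope scale of chemical space.
CHEMICAL_SPACE = 10**60            # estimate quoted on the slide
RATE_PER_SECOND = 10**9            # assumed enumeration speed (hypothetical)
SECONDS_PER_YEAR = 365 * 24 * 3600

# Integer arithmetic keeps the result exact for numbers this large.
years_to_enumerate = CHEMICAL_SPACE // (RATE_PER_SECOND * SECONDS_PER_YEAR)

# Report only the order of magnitude.
magnitude = len(str(years_to_enumerate)) - 1
print(f"Years to enumerate: on the order of 10^{magnitude}")
```

Even under this generous assumption, exhaustive enumeration is out of reach, which is why navigation strategies matter more than brute force.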
7
Navigation in chemical space
10
Structure-based Drug Design
12
Ligand-based Drug Design
15
Machine learning
16
Applied machine learning
18
~30 million chemicals and growing
Data sourced from >500 different sources
Crowdsourced curation and annotation
Ongoing deposition of data from our journals and our collaborators
A structure-centric hub for web searching
19
ChemSpider
21
Properties - experimental
22
Properties - ACDLabs
23
Properties – EPI Suite
24
Properties - ChemAxon
25
Literature references
26
Patent references
27
Books
28
Classification
29
Chemical vendors and datasources
30
Multimedia
32
ChemSpider Reactions
36
ChemSpider Spectra
38
ChemSpider Databases
ChemSpider Compounds
ChemSpider Reactions
ChemSpider Spectra
ChemSpider Crystals
ChemSpider Materials
ChemSpider Assays
ChemSpider Algorithms
39
Research data inflow
40
Research data outflow
42
RSC Archive – since 1841
43
DERA - Digitally Enabling RSC Archive
44
Semantic mark-up of articles
45
It is so difficult to navigate…
What’s the structure?
Are they in our file?
What’s similar?
What’s the target?
Pharmacology data?
Known Pathways?
Working On Now?
Connections to disease?
Expressed in right cell type?
Competitors?
IP?
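The "What's similar?" question is commonly approached with fingerprint similarity. The sketch below computes the Tanimoto coefficient between two sets of feature identifiers. The feature sets and function name are made up for illustration, and this is not a description of how ChemSpider or any RSC platform performs similarity search.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)

# Hypothetical fingerprints: each integer stands for a substructure feature.
query = {1, 4, 7, 9}
candidate = {1, 4, 8, 9, 12}

# Three shared features out of six distinct features overall.
score = tanimoto(query, candidate)
print(score)
```

A score of 1.0 means identical feature sets and 0.0 means no shared features; where to draw the "similar" threshold is a judgment call that depends on the fingerprint used.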
46
Data quality issue and CVSP
– Robochemistry
– Proliferation of errors in public and private databases
– Automated quality control system
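To make the idea of automated quality control concrete, here is a minimal rule-based sketch in plain Python. It is not CVSP's actual rule set: the record layout, the `validate` and `balanced` helpers, and the three checks (duplicate identifier, empty structure, unbalanced brackets in a SMILES string) are all assumptions chosen for illustration.

```python
def balanced(smiles: str) -> bool:
    """Check that () and [] are balanced and properly nested."""
    pairs = {")": "(", "]": "["}
    stack = []
    for ch in smiles:
        if ch in "([":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack

def validate(records):
    """Return a list of (record_id, issue) tuples for a batch of records."""
    issues = []
    seen = set()
    for rec in records:
        rid, smiles = rec["id"], rec.get("smiles", "")
        if rid in seen:
            issues.append((rid, "duplicate identifier"))
        seen.add(rid)
        if not smiles.strip():
            issues.append((rid, "empty structure"))
        elif not balanced(smiles):
            issues.append((rid, "unbalanced brackets"))
    return issues

# Hypothetical records; identifiers and structures are invented.
records = [
    {"id": "R1", "smiles": "CC(=O)O"},
    {"id": "R2", "smiles": "C1=CC=C(C=C1"},   # missing closing parenthesis
    {"id": "R3", "smiles": ""},                # no structure at all
    {"id": "R1", "smiles": "CCO"},             # identifier reused
]
issues = validate(records)
for rid, issue in issues:
    print(rid, issue)
```

Checks like these only catch syntactic problems. Chemically meaningful validation (valence, stereochemistry, charge) would need a cheminformatics toolkit and is outside this sketch.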
47
DrugBank dataset (6516 records)
J. Brecher, IUPAC Graphical Representation of Stereochemical Configuration, Section ST-1.1.10
DB06287
49
Research data management
51
Crowdsourcing
52
AltMetrics
53
RSC/Rewards and Recognition
Congratulations! Your 1st CSSP article has been published. Philosopher Lao Tzu said “A journey of a thousand miles begins with a single step”. In the same way we hope that this will be the first of many submissions that you make to CSSP.
The First Step badge is awarded when a user submits (& has published) their 1st CSSP article.
54
Big Data
Chemical Space
Drug Discovery pipeline
Machine learning
Training sets
RSC/ChemSpider platforms
RSC/Archive
Research data management
Visualization and navigation
Building Global Chemistry Network
55
Visualization
56
Visualization and navigation
59
We are a part of a larger world
60
ChemSpider APIs
61
National Chemistry Database
62
http://www.openphacts.org
Open PHACTS is an Innovative Medicines Initiative (IMI) project that aims to reduce the barriers to drug discovery in industry, academia and small businesses.
The Semantic Web is one of its cornerstones.
64
OSDD
65
Thank you
Email: tkachenkov@rsc.org
Slides: http://www.slideshare.net/valerytkachenko16