Big Data Supporting Drug Discovery Cautionary Tales from the World of Chemistry for Translational Informatics Valery Tkachenko RSC-CSIR/OSDD meeting Pune,


1 Big Data Supporting Drug Discovery: Cautionary Tales from the World of Chemistry for Translational Informatics. Valery Tkachenko, RSC-CSIR/OSDD meeting, Pune, India, February 3rd, 2014

2 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

3

4 Science map

5 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

6 Chemical space - 10^60

7 Navigation in chemical space

8

9 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

10 Structure-based Drug Design

11

12 Ligand-based Drug Design
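Ligand-based approaches commonly rank candidate molecules by their similarity to known actives, and the Tanimoto coefficient on binary fingerprints is one widely used measure. The sketch below is a minimal illustration only: the fingerprints are toy sets of "on" bit indices, and the names `tanimoto`, `rank_by_similarity`, `known_active`, and `candidates` are hypothetical, not taken from the talk or from any ChemSpider tool.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def rank_by_similarity(query: set, library: dict) -> list:
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed for illustration): each fingerprint is a set of "on" bit positions.
known_active = {1, 4, 7, 9}
candidates = {
    "cand_a": {1, 4, 7, 9, 12},
    "cand_b": {2, 3, 5},
    "cand_c": {1, 4, 8},
}

ranking = rank_by_similarity(known_active, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets; the ranking logic stays the same.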

13

14 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

15 Machine learning

16 Applied machine learning

17 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

18 ~30 million chemicals and growing; data sourced from >500 different sources; crowdsourced curation and annotation; ongoing deposition of data from our journals and our collaborators; a structure-centric hub for web searching
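To put the two figures quoted in the talk side by side, the short calculation below compares the ~30 million compounds on this slide with the 10^60 estimate for chemical space given on slide 6. Both numbers are order-of-magnitude estimates, and the names `CHEMICAL_SPACE_SIZE`, `CHEMSPIDER_COMPOUNDS`, and `coverage_fraction` are illustrative only.

```python
# Order-of-magnitude figures as quoted on the slides, not exact counts.
CHEMICAL_SPACE_SIZE = 10**60        # estimated size of chemical space (slide 6)
CHEMSPIDER_COMPOUNDS = 30_000_000   # ~30 million chemicals (slide 18)


def coverage_fraction(known: int, total: int) -> float:
    """Fraction of the estimated space represented by the known compounds."""
    return known / total


fraction = coverage_fraction(CHEMSPIDER_COMPOUNDS, CHEMICAL_SPACE_SIZE)
print(f"Fraction of estimated chemical space catalogued: {fraction:.1e}")
```

Even a database of this size samples a vanishingly small part of the estimated space, on the order of 10^-53.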

19 ChemSpider

20

21 Properties - experimental

22 Properties - ACDLabs

23 Properties – EPI Suite

24 Properties - ChemAxon

25 Literature references

26 Patent references

27 Books

28 Classification

29 Chemical vendors and datasources

30 Multimedia

31 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

32 ChemSpider Reactions

33

34

35

36 ChemSpider Spectra

37

38 ChemSpider Databases: ChemSpider Compounds; ChemSpider Reactions; ChemSpider Spectra; ChemSpider Crystals; ChemSpider Materials; ChemSpider Assays; ChemSpider Algorithms

39 Research data inflow

40 Research data outflow

41 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

42 RSC Archive – since 1841

43 DERA - Digitally Enabling RSC Archive

44 Semantic mark-up of articles

45 It is so difficult to navigate… What’s the structure? Are they in our file? What’s similar? What’s the target? Pharmacology data? Known Pathways? Working On Now? Connections to disease? Expressed in right cell type? Competitors? IP?

46 Data quality issue and CVSP: Robochemistry; proliferation of errors in public and private databases; automated quality control system
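One way to picture an automated quality-control pass over chemical records is a rule that cross-checks redundant fields. The sketch below flags records whose declared molecular mass disagrees with the mass computed from the molecular formula. It is an assumption-laden illustration and not CVSP's actual rule set: the record layout, the `check_record` and `formula_mass` helpers, the tolerance, and the sample records are all hypothetical, and the parser handles only simple formulas without brackets or charges.

```python
import re

# Average atomic masses for a few common elements (standard values, rounded).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06, "Cl": 35.45}

# One element symbol followed by an optional count, e.g. "C2", "H6", "O".
FORMULA_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def formula_mass(formula: str) -> float:
    """Sum atomic masses for a simple formula such as 'C2H6O'."""
    total = 0.0
    consumed = 0
    for symbol, count in FORMULA_TOKEN.findall(formula):
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"unknown element {symbol!r}")
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
        consumed += len(symbol) + len(count)
    if consumed != len(formula) or not formula:
        raise ValueError(f"cannot parse formula {formula!r}")
    return total


def check_record(record: dict, tolerance: float = 0.05) -> list:
    """Return a list of human-readable issues; an empty list means the rule passed."""
    try:
        expected = formula_mass(record["formula"])
    except ValueError as exc:
        return [f"unparseable formula: {exc}"]
    if abs(expected - record["mass"]) > tolerance:
        return [f"mass mismatch: declared {record['mass']}, computed {expected:.3f}"]
    return []


# Hypothetical records, made up for illustration.
records = [
    {"id": "rec-1", "formula": "C2H6O", "mass": 46.069},  # consistent
    {"id": "rec-2", "formula": "H2O", "mass": 20.0},      # declared mass is wrong
    {"id": "rec-3", "formula": "C2H6o", "mass": 46.069},  # malformed formula
]

report = {rec["id"]: check_record(rec) for rec in records}
for rec_id, issues in report.items():
    print(rec_id, "OK" if not issues else issues)
```

Running such rules automatically before data are merged or republished is one way to keep an error in a single source from spreading into downstream databases.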

47 DrugBank dataset (6516 records). J. Brecher, IUPAC Graphical Representation of Stereochemical Configuration, Section ST-1.1.10. DB06287

48 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

49 Research data management

50 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

51 Crowdsourcing

52 AltMetrics

53 RSC/Rewards and Recognition. Congratulations! Your 1st CSSP article has been published. The philosopher Lao Tzu said, “A journey of a thousand miles begins with a single step.” In the same way, we hope that this will be the first of many submissions that you make to CSSP. The First Step badge is awarded when a user submits (and has published) their 1st CSSP article.

54 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Visualization and navigation; Building Global Chemistry Network

55 Visualization

56 Visualization and navigation

57

58 Big Data; Chemical Space; Drug Discovery pipeline; Machine learning; Training sets; RSC/ChemSpider platforms; RSC/Archive; Research data management; Data quality, crowdsourcing and AltMetrics; Building Global Chemistry Network

59 We are a part of a larger world

60 ChemSpider APIs

61 National Chemistry Database

62 http://www.openphacts.org Open PHACTS is an Innovative Medicines Initiative (IMI) project that aims to reduce the barriers to drug discovery in industry, academia, and small businesses. The Semantic Web is one of the cornerstones.

63

64 OSDD

65 Thank you. Email: tkachenkov@rsc.org Slides: http://www.slideshare.net/valerytkachenko16

