Big Data Supporting Drug Discovery Cautionary Tales from the World of Chemistry for Translational Informatics Valery Tkachenko RSC-CSIR/OSDD meeting Pune,

Presentation transcript:

Big Data Supporting Drug Discovery: Cautionary Tales from the World of Chemistry for Translational Informatics. Valery Tkachenko, RSC-CSIR/OSDD meeting, Pune, India, February 3rd 2014

Big Data
Chemical Space
Drug Discovery pipeline
Machine learning
Training sets
RSC/ChemSpider platforms
RSC/Archive
Research data management
Data quality, crowdsourcing and AltMetrics
Building Global Chemistry Network

Science map

Chemical space

Navigation in chemical space

Structure-based Drug Design

Ligand-based Drug Design

Machine learning

Applied machine learning

~30 million chemicals and growing
Data sourced from >500 different sources
Crowdsourced curation and annotation
Ongoing deposition of data from our journals and our collaborators
A structure-centric hub for web searching

ChemSpider

Properties - experimental

Properties - ACDLabs

Properties – EPI Suite

Properties - ChemAxon

Literature references

Patent references

Books

Classification

Chemical vendors and data sources

Multimedia

ChemSpider Reactions

ChemSpider Spectra

ChemSpider Databases
ChemSpider Compounds
ChemSpider Reactions
ChemSpider Spectra
ChemSpider Crystals
ChemSpider Materials
ChemSpider Assays
ChemSpider Algorithms

Research data inflow

Research data outflow

RSC Archive – since 1841

DERA - Digitally Enabling RSC Archive

Semantic mark-up of articles

It is so difficult to navigate… What’s the structure? Are they in our file? What’s similar? What’s the target? Pharmacology data? Known Pathways? Working On Now? Connections to disease? Expressed in right cell type? Competitors? IP?

Data quality issue and CVSP
– Robochemistry
– Proliferation of errors in public and private databases
– Automated quality control system

DrugBank dataset (6516 records). J. Brecher, IUPAC Graphical Representation of Stereochemical Configuration. Section: ST. DB06287

Research data management

Crowdsourcing

AltMetrics

RSC/Rewards and Recognition
Congratulations! Your 1st CSSP article has been published. Philosopher Lao Tzu said “A journey of a thousand miles begins with a single step”. In the same way, we hope that this will be the first of many submissions that you make to CSSP.
The First Step badge is awarded when a user submits (and has published) their 1st CSSP article.

Big Data
Chemical Space
Drug Discovery pipeline
Machine learning
Training sets
RSC/ChemSpider platforms
RSC/Archive
Research data management
Visualization and navigation
Building Global Chemistry Network

Visualization

Visualization and navigation

We are a part of a larger world

ChemSpider APIs

National Chemistry Database

Open PHACTS is an Innovative Medicines Initiative (IMI) project aiming to reduce the barriers to drug discovery in industry, academia and small businesses. The semantic web is one of its cornerstones.

OSDD

Thank you