DO (WILL) GRIDS MATTER IN DRUG DISCOVERY? Arthur Thomas SIB/Vital-IT and SwissBioGrid.


Biology: Big Science!
– Osaka/Hitachi UHVEM: World’s largest electron microscope
– Argonne Advanced Photon Source: World’s largest X-ray Crystallography System
– US NHMFL: 900 MHz 21-T wide-bore NMR Facility
– Automation Partnership: HTS “Factory”, 2.5 x 10⁵/8 hr, 10⁷ data points/year
– Sanger Institute Sequencing Factory
– Siemens PET scanner

Biology: Big Data!
– 32,000 measures/spectrum
– 900 spectra/LC run = 28,800,000 measurements (55 MB)/LC run
– 3 MS-MS/spectrum, 200 KB/MS-MS
– (900 x 3 x 200 KB) + 55 MB = 595 MB
– 10 spectra/mm = 100 spectra/mm²; x 100 = 10,000 spectra/cm²
– 16 x 16 cm² gel: 16 x 16 x 10,000 = 2,560,000 spectra/gel
– 2,560,000 x 200 KB = 512 GB
[Source: Ron Appel (SIB)]
[Source: Selinger et al. Trends in Biotech. (2003)]
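The arithmetic on this slide can be re-derived step by step. The sketch below is only a check of the quoted figures, assuming decimal units (1 MB = 1,000 KB, 1 GB = 1,000,000 KB); the variable names are illustrative and not from the original slides. Under that assumption the per-gel product comes to 512 GB.

```python
# Re-derivation of the slide's data-volume figures.
# Assumption: decimal units (1 MB = 1,000 KB; 1 GB = 1,000,000 KB).
MEASURES_PER_SPECTRUM = 32_000
SPECTRA_PER_LC_RUN = 900
MSMS_PER_SPECTRUM = 3
KB_PER_MSMS = 200
MB_PER_LC_RUN = 55  # size of the raw LC run as quoted on the slide

# LC-MS run: raw measurements plus the MS-MS spectra
measurements_per_run = MEASURES_PER_SPECTRUM * SPECTRA_PER_LC_RUN
msms_mb = SPECTRA_PER_LC_RUN * MSMS_PER_SPECTRUM * KB_PER_MSMS / 1_000
total_mb_per_run = msms_mb + MB_PER_LC_RUN

# Gel scan: 10 spectra/mm in each direction gives 100 per mm^2,
# and 1 cm^2 = 100 mm^2
spectra_per_cm2 = (10 * 10) * 100
spectra_per_gel = 16 * 16 * spectra_per_cm2
gel_gb = spectra_per_gel * KB_PER_MSMS / 1_000_000

print(f"{measurements_per_run:,} measurements per LC run")
print(f"{total_mb_per_run:.0f} MB per LC run including MS-MS")
print(f"{spectra_per_gel:,} spectra per 16 x 16 cm gel")
print(f"{gel_gb:.0f} GB per gel")
```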

Biology: Big Data!
Source: GenomeNet, Kyoto
~1000 different biology reference databases:
– Genome/Nucleotide Sequence Databases
– RNA sequence databases
– Protein sequence databases
– Structure Databases
– Metabolic and Signaling Pathways
– Human Genes and Diseases
– Microarray and other Gene Expression Databases
– Proteomics Resources
– Other Molecular Biology Databases
– Organelle databases
– Plant databases
– Immunological databases
Source: M Y Galperin, Nucleic Acids Research (2006)

Biology: Visualisation! Collaboration!
– NCMIR “BioWall”
– SAGE
– HP Halo Collaboration Studio

Drug Discovery & Development
12+ years, $ billion
[Diagram labels: HTS; QSAR; ADME/Tox; Sequence Homology, Gene Expression, Proteomics, Comb. Libraries; System & Disease Modelling; Trial Design; ‘Omics]
Paradigm Change (Old Science → New Science)
– Classical chemistry → Combinatorial chemistry
– Basic biology → ‘Omics, Biotechnology
– Experimentation → Computation
– Low throughput → High throughput
– Animal studies → Molecular imaging

Impact of ‘omics Source: H. Rauwerda et al Drug Discovery Today (2006)

The Discovery Sieve

Getting Less and Less for More and More Source: PPD Inc.

Pharma Challenges
Declining productivity and ROI
– $1+ billion to bring a drug to market, $1 million/day revenue lost to delay, declining post-patent lifetimes (5-7 years)
– Most drug candidates fail: 1:10 development candidates fail, 1:2 clinical trial candidates fail
– Number of NCEs has been falling for a decade
– 2:3 drugs do not generate a lifetime return
– Blockbuster (“one size fits all”) and “me too” mentalities not sustainable; many patents (~$72b) expiring in next 5 years
– Stricter regulation (pre- and post-market), greater price pressure and greater liability (Vioxx, Baycol, …)
Deluge of data, drought of knowledge
– Huge investment in high-throughput data generation technologies not matched by investment in data analysis technologies
– Poorly integrated data silos
Increasingly collaborative landscape
– Challenges of sharing information across enterprise boundaries

New Pharma Ecosystem?
1,500 ($50b+) pharma/biotech partnerships in last 7 years
– e.g. 50% of Roche pharma/diagnostic revenues from licensing deals
Source: Recombinant Capital

Typical Grid Applications
Drug Discovery
– Sequence analysis
– Microarray analysis/network inference
– Virtual Screening (Autodock, CHARMM, Glide, FlexX)
Development
– ADME, PK/PD (NONMEM, WinNonLin)
– Trial design (TrialSimulator)
– Process validation, compliance
Marketing
– Market data analysis (SAS, SPSS)
“Instead of spending millions of dollars and years in the lab screening hundreds of thousands of compounds, now it will be possible to screen hundreds of millions of molecules in months” (Graham Richards)

Pharma Grids: the Good News
J&JPRD¹
– 1,200 rising to 3,000 PCs; mix of Linux (clusters) and Windows (desktops)
– 20+ applications
– United Devices GridMP
Novartis²
– Began in 2001
– Now 2,700+ PCs (out of 65,000), 5+ Tflops; 25,000 PCs eventually?
– Apps: docking, genome annotation, chemoinformatics, clinical trial simulation, text mining
– $400k investment, $2+ million annual savings
– United Devices GridMP for PC farm
– Rigidly standardized PC environment
GSK¹
– 1,000+ PCs
– $1 million estimated annual savings
– United Devices GridMP for PC farm
¹ Source: United Devices, Inc.
² Source: Manuel Peitsch, Novartis
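The Novartis figures give a feel for the business case. The sketch below computes a simple, undiscounted payback period from the two numbers quoted on the slide. The function name is hypothetical, and "$2+ million" is taken as exactly $2 million, so the result is an upper bound on the payback time rather than a reported value.

```python
def simple_payback_months(investment_usd: float, annual_savings_usd: float) -> float:
    """Months until cumulative savings equal the up-front investment (no discounting)."""
    if annual_savings_usd <= 0:
        raise ValueError("annual savings must be positive")
    return investment_usd * 12 / annual_savings_usd


# Slide figures for Novartis: $400k investment, $2+ million annual savings.
# Assumption: savings taken as exactly $2 million per year.
novartis_payback = simple_payback_months(400_000, 2_000_000)
print(f"Novartis simple payback: about {novartis_payback:.1f} months")
```

This ignores running costs, staff time and discounting, so it is a rough illustration only.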

Pharma Grids: not-so-good News
“Less than half of the top 20 pharmaceutical companies are implementing Grids” [William Fellows, 451 Group]

Barriers to Grid adoption
Difficulty of Building a Business Case
– Cui bono?
– Measuring the ROI?
Unsuitable licensing models: driving open source?
Trust and Access Control issues
– Extending to the balkanized (fire-walled) global enterprise
– Extending to the whole development ecosystem
Technical Barriers
– Lack of suitable (“embarrassingly parallel”) applications
– Heterogeneity of platforms
– Poor standardization of middleware (commercial vs open source): will SOA (OGSA) solve this?
– Poor data grid management, semantic integration: driving development of ontologies?
– Limited bandwidth: increasing use of Lambda rails?
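An "embarrassingly parallel" application is one whose work splits into units that need nothing from each other, which is why virtual screening suits a grid so well. The sketch below illustrates the idea on a single machine with a thread pool standing in for grid nodes. `toy_score` is a hypothetical placeholder and not a real docking function, and the ligand IDs are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_score(ligand_id: str) -> int:
    """Hypothetical stand-in for one docking run; NOT a real scoring function."""
    return sum(ord(ch) for ch in ligand_id) % 100


def make_work_units(items, unit_size):
    """Split the task list into independent chunks, one per grid job."""
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]


def run_unit(unit):
    """Process one chunk; it needs no data from any other chunk."""
    return [(ligand, toy_score(ligand)) for ligand in unit]


def screen(ligands, unit_size=4, workers=3):
    """Fan the chunks out to workers, gather the results, rank by score."""
    units = make_work_units(ligands, unit_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = [pair for chunk in pool.map(run_unit, units) for pair in chunk]
    return sorted(results, key=lambda pair: pair[1])


ligands = [f"LIG{i:03d}" for i in range(10)]  # made-up ligand IDs
ranked = screen(ligands)
print(ranked[:3])
```

Because no chunk depends on another, the same split works whether the units run on three threads or on thousands of desktop PCs; only the dispatch layer changes.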

Overcoming the Barriers: Building a Business Case
Capacity Improvement
– Driven by ROI
– Reduced build and running costs of PC Grids cf. dedicated clusters
R&D Process Innovation
– Driven by need for new ways of doing
– Collaborative research (industry/academia)
– “Open source research” (NIH, Wellcome)

Overcoming the Barriers: Technical
Software
– Less intrusive, more standardized middleware
– Web services, OGSA
Data Management
– DataGrid technologies
Data Integration
– Ontologies and shared knowledge spaces
“Utility/On-Demand” Computing
Bandwidth
– National and international LambdaRails
Virtual Laboratories/Organizations

LambdaRails™ Source: OptiPuter Group

SwissBioGrid: A National Resource
– Dedicated to large-scale computational applications in bioinformatics, modelling, chemoinformatics and bio-medical sciences
– CSCS manages GRID infrastructure, middleware, security
– SIB/Vital-IT has primary responsibility for providing bioinformatics application validation and optimization, Web services, database services
– Some sites compute-intensive, some data-intensive

SwissBioGrid: A Mixture of Clusters and PCs
– UniZH Matterhorn (Sun Grid Engine)
– SIB Vital-IT (Platform LSF)
– ETHZ Hreidar (Sun Grid Engine)
– CSCS: Ticino Cluster (Itanium, LSF); Terrane Cluster (PS 5, PBS); Sun Cluster (PBS)
– UniBS/FMI PC farms
– UniBS BC2 cluster (Platform LSF)
– NorduGRID/ARC
– ProtoGRID Metascheduler

Some Good News…
“Open source discovery” is thriving!
– Anthrax (7,000+ CPU years)
– Smallpox (68,000+ CPU years)
– 400,000+ CPUs, 53,000+ CPU years to date, 75+ CPU years/day
– Human Proteome folding, Phase II (761+ CPU years)
– Cancer project Phase II (437+ CPU years)
– AIDS project (25,000+ CPU years)

More Good News…
WISDOM
Malaria [500 million infections, 1.3 million deaths/year]
– Autodock, FlexX
– 80 CPU years in 6 weeks
– 1,000,000 ligands against 11 targets
– Top 1,000 hits identified
Avian Flu [the next Big One]
– 77 CPU years on 2000 computers
– 300,000 ligands against 8 Influenza A neuraminidase targets
– Hits now being analyzed
Dengue [10 million infections, 100,000 deaths/year]
– Autodock, Glide
– Mixed PC and cluster Grid
– 130,000 ligands from NCI DTP library docked against dengue NS5 protein
– ~1 CPU min/dock
– 70 hits found, being evaluated in vitro
– Plan to dock 2.7 million ligands from ZINC library
– 1,875 CPU-days for 1 target/1 site/1 parameter set/1 library (“parameter sweep”)
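The CPU-time figures on this slide follow from simple unit conversions. The sketch below reproduces the 1,875 CPU-day estimate from the quoted ~1 CPU min/dock and derives two idealized quantities that are not stated on the slide: the average number of CPUs implied by "80 CPU years in 6 weeks", and the wall-clock time for 77 CPU years on 2000 fully used computers. Function names are illustrative, and a 365-day year and perfect utilization are assumptions.

```python
MINUTES_PER_DAY = 60 * 24
DAYS_PER_YEAR = 365  # assumption: calendar year, leap days ignored


def cpu_days(n_docks: int, cpu_min_per_dock: float = 1.0) -> float:
    """Total CPU time in days for n_docks independent docking runs."""
    return n_docks * cpu_min_per_dock / MINUTES_PER_DAY


def average_cpus(cpu_years: float, wall_clock_days: float) -> float:
    """Average number of CPUs kept busy to deliver cpu_years in the given time."""
    return cpu_years * DAYS_PER_YEAR / wall_clock_days


def ideal_wall_clock_days(cpu_years: float, n_cpus: int) -> float:
    """Wall-clock days if n_cpus run continuously with perfect utilization."""
    return cpu_years * DAYS_PER_YEAR / n_cpus


zinc_cpu_days = cpu_days(2_700_000)         # planned ZINC library sweep
nci_cpu_days = cpu_days(130_000)            # NCI DTP library already docked
wisdom_avg_cpus = average_cpus(80, 6 * 7)   # malaria: 80 CPU years in 6 weeks
flu_days = ideal_wall_clock_days(77, 2000)  # avian flu: 77 CPU years, 2000 computers

print(f"ZINC sweep: {zinc_cpu_days:,.0f} CPU-days")
print(f"NCI library: {nci_cpu_days:.1f} CPU-days")
print(f"WISDOM malaria: ~{wisdom_avg_cpus:.0f} CPUs busy on average")
print(f"Avian flu: ~{flu_days:.1f} days at perfect utilization")
```

Real grid runs fall short of perfect utilization, so the last two numbers are best read as lower bounds on the resources or time needed.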

From Data Sharing to Knowledge Sharing
DataGrid
– SwissBioGrid experiment in data grid using Avaki
– Complex update patterns
KnowledgeGrid
– Aggressive use of ontologies for knowledge standardization and sharing
Gene Ontology

Thank You! Questions?