Milanesi Luciano Catania, Italy 13/03/2007 Bioinformatics challenges in European projects in Grid. Milanesi Luciano National Research Council Institute.

Presentation transcript:

Bioinformatics challenges in European projects in Grid
Milanesi Luciano, National Research Council, Institute for Biomedical Technologies, Milan, Italy
Alessandro Orro, National Research Council, Institute for Biomedical Technologies, Milan, Italy
Catania, Italy, 13/03/2007

Related EU projects
EUGRID, ISSeG, BEinGRID, EUIndia

Introduction
Bioinformatics applications have become an ideal research area where computer scientists can apply and further develop new intelligent computation methods, in both experimental and theoretical cases.
Bioinformatics needs:
– Data storage (sequencing, genotyping, microarray)
– Connection with HPC infrastructure
– Data sharing and distribution
The European bioinformatics initiative, based on the infrastructure created by EGEE and BioinfoGRID, tries to address these issues.
2 years: 1 January 2006 – 31 December 2007

BioinfoGRID Project
The BIOINFOGRID project aims to:
– Promote the Bioinformatics Grid application for life science in the bioinformatics community
– Evaluate and adopt high-level user interfaces
– Evaluate bioinformatics applications in five main fields: Genomics, Proteomics, Transcriptomics, Molecular dynamics, Biological databases
Partners: [not captured in the transcript]

BioinfoGRID

Research
Main research fields:
– Genomics
– Proteomics
– Transcriptomics
– Molecular dynamics
– Biological databases

Genomics applications
Genomics bioinformatics applications are typically data-driven and have long running times because many different biological databases and tools must be integrated.
Comparative approach: sequence search, multiple alignment, domain search

Genomics applications
Validation of the W3H-Task-System

Proteomics Applications
Evaluation of different programs and databases for high-throughput proteomics analysis on the grid, in order to support genome-scale analysis both in sequence-based functional identification and in structural studies of three-dimensional atomic configurations.

Proteomics Applications
Pipeline for protein functional domain analysis:
– BlastProDom is a wrapper script on top of a Blast package, used to search against PRODOM families.
– FPrintScan is used to search against the PRINTS collection of protein signatures.
– HMMPfam is used to search against the Pfam HMM database, the SMART HMM database, and the TIGRFAMs collection of HMMs.
– ScanRegExp is used to search against the PROSITE patterns collection and to verify the matches by statistically significant CONFIRM patterns.
– Superfamily is used to search against the SUPERFAMILY database of structures.
– SignalPHMM is used for prediction and location of signal peptide cleavage sites, using HMMs.
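
One way such a pipeline could be prepared for the grid is to plan one job per (sequence chunk, application) pair. The following is a minimal sketch under that assumption; the slide does not describe how the work was actually split. The names `APPLICATIONS`, `read_fasta`, `chunk`, and `plan_jobs` are hypothetical, and the code only plans jobs without invoking any of the real tools.

```python
from itertools import islice

# Application -> databases it searches, as listed on the slide.
APPLICATIONS = {
    "BlastProDom": ["PRODOM"],
    "FPrintScan": ["PRINTS"],
    "HMMPfam": ["Pfam", "SMART", "TIGRFAMs"],
    "ScanRegExp": ["PROSITE"],
    "Superfamily": ["SUPERFAMILY"],
    "SignalPHMM": [],  # signal peptide prediction; the slide names no database
}


def read_fasta(text):
    """Parse FASTA text into a list of (identifier, sequence) pairs."""
    records, name, parts = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(parts)))
            header = line[1:].split()
            name, parts = (header[0] if header else ""), []
        else:
            parts.append(line)
    if name is not None:
        records.append((name, "".join(parts)))
    return records


def chunk(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def plan_jobs(records, chunk_size):
    """Return one job description per (chunk, application) pair."""
    jobs = []
    for index, block in enumerate(chunk(records, chunk_size)):
        for app, databases in APPLICATIONS.items():
            jobs.append({
                "chunk": index,
                "application": app,
                "databases": databases,
                "sequence_ids": [name for name, _ in block],
            })
    return jobs
```

With three sequences and a chunk size of two, `plan_jobs` yields two chunks and therefore twelve job descriptions, which could then be translated into whatever job format the grid middleware expects.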

Proteomics Applications
Protein surface calculation: the grid will be used to compute a volumetric description of the protein, yielding a precise representation of the corresponding surface.

Transcriptomics Applications
Computational GRIDs to analyse transcriptomics data
Description: to run algorithmic tools for gene expression data analysis on the GRID, and to evaluate computational tools for extracting biologically significant information from gene expression data. Algorithms will focus on clustering steady-state and time-series gene expression data, multiple testing, and meta-analysis of different microarray experiments.
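
The slide does not name a clustering algorithm. As an illustration only, the sketch below clusters gene expression profiles with average-linkage agglomerative clustering on a Pearson correlation distance, which is one common choice and not necessarily what the project used. The function names and the toy profiles are hypothetical.

```python
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # a flat profile is treated as uncorrelated
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    `profiles` maps gene name -> list of expression values.
    Returns the clusters as lists of gene names.
    """
    names = list(profiles)
    clusters = [[name] for name in names]
    dist = {(a, b): pearson_distance(profiles[a], profiles[b])
            for a in names for b in names}

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Toy data: two rising and two falling profiles.
toy = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 4, 1],
}
toy_clusters = average_linkage(toy, 2)
```

On the toy data the rising profiles (`g1`, `g2`) end up in one cluster and the falling profiles (`g3`, `g4`) in the other. This naive version recomputes linkages at every step, so it only suits small inputs; genome-scale matrices are where distributing the distance computation becomes attractive.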

Transcriptomics Applications
Figure labels: Samples, Genes, Sample annotations, Gene annotations, Gene expression matrix, Gene expression levels.

Transcriptomics Applications
– Green = expression level low with respect to the reference sample.
– Red = expression level high with respect to the reference sample.
– Black = expression level comparable to the reference sample.
The columns are ordered such that similar expression profiles neighbor each other.
Eisen et al. PNAS 1998.
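
The colour convention above can be sketched as a small mapping function. This is an illustration under two assumptions that the slide does not state: the input is a log ratio of sample to reference expression, and colours saturate at a hypothetical `saturation` value of 2.0.

```python
def expression_color(log_ratio, saturation=2.0):
    """Map a log ratio (sample vs. reference) to an (R, G, B) tuple in 0..255."""
    # Scale to [-1, 1]; values beyond +/- saturation are clipped.
    level = max(-1.0, min(1.0, log_ratio / saturation))
    intensity = round(abs(level) * 255)
    if level > 0:
        return (intensity, 0, 0)  # higher than reference: red
    if level < 0:
        return (0, intensity, 0)  # lower than reference: green
    return (0, 0, 0)              # comparable to reference: black
```

For example, `expression_color(0.0)` gives black, `expression_color(2.0)` gives full red, and any value at or below `-2.0` gives full green.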

Transcriptomics Applications
Case studies: breast cancer

Molecular applications in GRID
Molecular Dynamics = computation of the motion of atoms within a molecular system using molecular mechanics.
Molecular Dynamics is commonly used for drug design and drug discovery:
– Molecular modelling of drugs
– Measurement of binding energies between ligands and biological targets
Grids offer promising perspectives for in silico drug discovery:
– Identification of drug candidates using computing tools
– Virtual screening (docking) = rapid assessment of large libraries of chemical structures in order to guide the selection of likely drug candidates
Figure: result from docking a diphenyl urea compound against plasmepsins (WISDOM-I, credit: V. Kasam)
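
To make the definition of molecular dynamics concrete, the sketch below integrates the motion of a single particle on a harmonic "bond" with the velocity Verlet scheme. It is a toy illustration in arbitrary units: the slide does not name an integrator or a force field, and production simulations handle many atoms in three dimensions. All function names are hypothetical.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate 1-D motion with the velocity Verlet scheme; returns (x, v)."""
    a = force(x) / mass
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x, v


def harmonic_force(k, x0):
    """Return the force function of a harmonic bond with stiffness k and rest position x0."""
    return lambda x: -k * (x - x0)


def total_energy(x, v, k, x0, mass):
    """Kinetic plus harmonic potential energy."""
    return 0.5 * mass * v * v + 0.5 * k * (x - x0) ** 2


# Start displaced by 1 unit, at rest; integrate for roughly one period (2*pi).
k, x0, mass = 1.0, 0.0, 1.0
x_end, v_end = velocity_verlet(1.0, 0.0, harmonic_force(k, x0), mass, dt=0.01, steps=628)
energy_end = total_energy(x_end, v_end, k, x0, mass)
```

After about one oscillation period the particle is back near its starting position and the total energy stays close to its initial value of 0.5, which is the property that makes this family of integrators popular for long simulations.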

Molecular applications in GRID
Aim: to run docking and molecular dynamics simulations, which usually take a very long time to complete.
Description: Wide In Silico Docking On Malaria initiative, WISDOM-II. This project performs docking and molecular dynamics simulations on the GRID platform to discover new targets for neglected diseases. Analysis can be performed notably using the data generated by the WISDOM application on the EGEE infrastructure.

Influenza A Neuraminidase
Grid-enabled high-throughput in silico screening against Influenza A Neuraminidase
Encouraged by the success of the first EGEE biomedical data challenge against malaria (WISDOM), the second data challenge, battling avian flu, was kicked off in April 2006 to identify new drugs for the potential variants of the Influenza A virus. This project demonstrated the impact of a world-wide Grid infrastructure in efficiently deploying large-scale virtual screening to speed up the drug design process.

Influenza A Neuraminidase
Results:
Completed dockings: 308,585
Estimated duration on 1 CPU: 16.7 years
Duration of experiment: 30 days
Number of jobs: 2580
Max number of concurrent CPUs: 240
Number of CEs (computing elements): 36
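
The reported figures allow a quick back-of-the-envelope check of what the grid delivered. The sketch below derives a few ratios from the numbers on the slide; the only added assumption is that a "year" counts as 365.25 days, which the slide does not specify.

```python
DAYS_PER_YEAR = 365.25  # assumption: the slide does not say how a year is counted

# Figures reported on the slide.
completed_dockings = 308_585
single_cpu_years = 16.7
experiment_days = 30
number_of_jobs = 2580
max_concurrent_cpus = 240

single_cpu_days = single_cpu_years * DAYS_PER_YEAR

# Wall-clock speedup compared with running everything on one CPU.
speedup = single_cpu_days / experiment_days

# Speedup relative to the peak number of CPUs that were busy at once.
fraction_of_peak = speedup / max_concurrent_cpus

# Average work packed into each grid job, and average CPU time per docking.
dockings_per_job = completed_dockings / number_of_jobs
cpu_minutes_per_docking = single_cpu_days * 24 * 60 / completed_dockings

print(f"speedup: {speedup:.0f}x")
print(f"fraction of peak CPUs: {fraction_of_peak:.2f}")
print(f"dockings per job: {dockings_per_job:.1f}")
print(f"CPU minutes per docking: {cpu_minutes_per_docking:.1f}")
```

Under that assumption the experiment ran about 200 times faster than a single CPU would have, with each job carrying roughly 120 dockings of just under half an hour of CPU time each.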

Conclusions
2007 activities:
– Interface improvement
– Dissemination and training
– Target extensions (medical and biomedical informatics)

Acknowledgments
– BioinfoGRID
– EGEE: Enabling Grids for E-sciencE project
– EELA: e-Infrastructure between Europe and Latin America project
– EUChinaGRID: Interconnection & Interoperability of Grids between Europe & China project
– FIRB-MIUR LITBIO: Laboratory for Interdisciplinary Technologies in Bioinformatics