BIOINFOGRID: Bioinformatics Grid Application for Life Science Giorgio Maggi INFN and Politecnico di Bari

Slides:



Advertisements
Similar presentations
An overview of the EGEE project Bob Jones EGEE Technical Director DTI International Technology Service-GlobalWatch Mission CERN – June 2004.
Advertisements

INFSO-RI Enabling Grids for E-sciencE WISDOM mini-workshop Vincent Breton (CNRS-IN2P3, LPC Clermont-Ferrand) ISGC 2007 March 28th,
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
Bioinformatics Grid Application for Life Science. COMMUNICATION NETWORK DEVELOPMENT SPECIFIC SUPPORT ACTION BIOINFOGRID Luciano Milanesi CNR-ITB.
08/11/908 WP2 e-NMR Grid deployment and operations Technical Review in Brussels, 8 th of December 2008 Marco Verlato.
GRACE Project IST EGAAP meeting – Den Haag, 25/11/2004 Giuseppe Sisto – Telecom Italia Lab.
KISTI’s Activities on the NA4 Biomed Cluster Soonwook Hwang, Sunil Ahn, Jincheol Kim, Namgyu Kim and Sehoon Lee KISTI e-Science Division.
1 Dynamic Application Installation (Case of CMS on OSG) Introduction CMS Software Installation Overview Software Installation Issues Validation Considerations.
BaBar Grid Computing Eleonora Luppi INFN and University of Ferrara - Italy.
GENIUS kick-off - November 2013 GENIUS kick-off meeting WP400 – Tools for data exploitation X. Luri.
INFSO-RI Enabling Grids for E-sciencE SA1: Cookbook (DSA1.7) Ian Bird CERN 18 January 2006.
Grid Workload Management & Condor Massimo Sgaravatto INFN Padova.
Enabling Grids for E-sciencE ENEA and the EGEE project gLite and interoperability Andrea Santoro, Carlo Sciò Enea Frascati, 22 November.
INFSO-RI Enabling Grids for E-sciencE EGEE - a worldwide Grid infrastructure opportunities for the biomedical community Bob Jones.
N*Grid – Korean Grid Research Initiative Funded by Government (Ministry of Information and Communication) 5 Years from 2002 to million US$ Including.
CSIU Submission of BLAST jobs via the Galaxy Interface Rob Quick Open Science Grid – Operations Area Coordinator Indiana University.
IST E-infrastructure shared between Europe and Latin America High Energy Physics Applications in EELA Raquel Pezoa Universidad.
INFSO-RI Enabling Grids for E-sciencE V. Breton, 30/08/05, seminar at SERONO Grid added value to fight malaria Vincent Breton EGEE.
Work Package 2 Coordination with EGEE activities Marco Verlato (INFN-Padova) CYCLOPS Review 26 October 2007 Bruxelles, Belgium.
Responsibilities of ROC and CIC in EGEE infrastructure A.Kryukov, SINP MSU, CIC Manager Yu.Lazin, IHEP, ROC Manager
EMBRACE An example of Grid Integration (I): The EMBRACE project Jean SALZEMANN CNRS/IN2P3.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Training services offered by SZTAKI for EGEE and EGI Gergely Sipos MTA SZTAKI (Hungarian.
Overall Goal of the Project  Develop full functionality of CMS Tier-2 centers  Embed the Tier-2 centers in the LHC-GRID  Provide well documented and.
INFSO-RI Enabling Grids for E-sciencE Biomedical applications V. Breton, CNRS-IN2P3.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Next steps with EGEE Gergely Sipos
INFSO-RI Enabling Grids for E-sciencE In silico docking on EGEE infrastructure, the case of WISDOM Nicolas Jacq LPC of Clermont-Ferrand,
EU-IndiaGrid (RI ) is funded by the European Commission under the Research Infrastructure Programme WP5 Application Support Marco.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Application Porting Support in EGEE Gergely Sipos MTA SZTAKI EGEE’08.
EGEE-II INFSO-RI Enabling Grids for E-sciencE WISDOM in EGEE-2, biomed meeting, 2006/04/28 WISDOM : Grid-enabled Virtual High Throughput.
INFSO-RI Enabling Grids for E-sciencE EGEE Review WISDOM demonstration Vincent Bloch, Vincent Breton, Matteo Diarena, Jean Salzemann.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The EGEE User Support Infrastructure Torsten.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The GILDA t-Infrastructure Roberto Barbera.
B i o i n f o r m a t i c s / B i o m e d i c a l A p p l i c a t i o n s i n E E L A Mexico, D.F., october 22 – 26, e – s c i e n c e M e x i c.
BIOINFOGRID: Bioinformatics Grid Application for life science MILANESI, Luciano National Research Council Institute of.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Progress on first user scenarios Stephen.
INFSO-RI Enabling Grids for E-sciencE GILDA and GENIUS Guy Warner NeSC Training Team An induction to EGEE for GOSC and the NGS NeSC,
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
Testing and integrating the WLCG/EGEE middleware in the LHC computing Simone Campana, Alessandro Di Girolamo, Elisa Lanciotti, Nicolò Magini, Patricia.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI How to integrate portals with the EGI monitoring system Dusan Vudragovic.
Università di Perugia Enabling Grids for E-sciencE Status of and requirements for Computational Chemistry NA4 – SA1 Meeting – 6 th April.
INFSO-RI Enabling Grids for E-sciencE The EGEE Project Owen Appleton EGEE Dissemination Officer CERN, Switzerland Danish Grid Forum.
Kati Lassila-Perini EGEE User Support Workshop Outline: – CMS collaboration – User Support clients – User Support task definition – passive support:
EGEE is a project funded by the European Union under contract IST Enabling bioinformatics applications to.
INFSO-RI SA2 ETICS2 first Review Valerio Venturi INFN Bruxelles, 3 April 2009 Infrastructure Support.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite – UNICORE interoperability Daniel Mallmann.
D.Spiga, L.Servoli, L.Faina INFN & University of Perugia CRAB WorkFlow : CRAB: CMS Remote Analysis Builder A CMS specific tool written in python and developed.
Thomas Gutberlet HZB User Coordination NMI3-II Neutron scattering and Muon spectroscopy Integrated Initiative WP5 Integrated User Access.
II EGEE conference Den Haag November, ROC-CIC status in Italy
EGEE is a project funded by the European Union under contract IST Compchem VO's user support EGEE Workshop for VOs Karlsruhe (Germany) March.
EGEE is a project funded by the European Union under contract IST GENIUS and GILDA Guy Warner NeSC Training Team Induction to Grid Computing.
Bioinformatics Grid Application for Life Science. COMMUNICATION NETWORK DEVELOPMENT SPECIFIC SUPPORT ACTION BIOINFOGRID Andreas Gisel & Luciano Milanesi.
ACGT Architecture and Grid Infrastructure Juliusz Pukacki ‏ EGEE Conference Budapest, 4 October 2007.
SAM architecture EGEE 07 Service Availability Monitor for the LHC experiments Simone Campana, Alessandro Di Girolamo, Nicolò Magini, Patricia Mendez Lorenzo,
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
1-2 March 2006 P. Capiluppi INFN Tier1 for the LHC Experiments: ALICE, ATLAS, CMS, LHCb.
Enabling Grids for E-sciencE University of Perugia Computational Chemistry status report EGAAP Meeting – 21 rst April 2005 Athens, Greece.
Grid Computing: Running your Jobs around the World
Regional Operations Centres Core infrastructure Centres
BaBar-Grid Status and Prospects
Eleonora Luppi INFN and University of Ferrara - Italy
Tamas Kiss University Of Westminster
INFN-GRID Workshop Bari, October, 26, 2004
Brief overview on GridICE and Ticketing System
Roberto Barbera (a nome di Livia Torterolo)
Nicolas Jacq LPC, IN2P3/CNRS, France
Leigh Grundhoefer Indiana University
The GENIUS portal and the GILDA t-Infrastructure
Job Application Monitoring (JAM)
Introduction to the SHIWA Simulation Platform EGI User Forum,
Presentation transcript:

BIOINFOGRID: Bioinformatics Grid Application for Life Science Giorgio Maggi INFN and Politecnico di Bari

EGAAP meeting 3 March 2006, Geneve 2 Project descriptions Bioinformatics Grid Application for Life Science (BIOINFOGRID) project. The BIOINFOGRID projects proposes to combine the Bioinformatics services and applications for molecular biology users with the Grid Infrastructure created by EGEE. In the BIOINFOGRID initiative we plan to evaluate genomics, transcriptomics, proteomics and molecular dynamics applications studies based on GRID technology. BIOINFOGRID will evaluate the Grid usability in wide variety of applications, and explore and exploit common solutions. The project start date: 1st January 2006 kickoff meeting: last night

EGAAP meeting 3 March 2006, Geneve 3 BIOINFOGRID: the Workpackages WPDescription (Responsible Institution) WP1Genomics Applications in GRID (DKFZ) WP2Proteomics Applications in GRID (CNR-ITB - CILEA) WP3Transcriptomics Applications in GRID (UCAM-CLAB) WP4Database and Functional Genomics Applications (CNR-ITB - CILEA) WP5Molecular Dynamics Applications (CNRS) WP6Coordination of technical aspects and relation with RI Projects, user training, application support and resources integration. (INFN) WP7Dissemination and Outreach. (CNRITB, INFN, DKFZ, CNRS, UCAM-CLAB, CILEA) WP8Project Management Office (CNRITB)

EGAAP meeting 3 March 2006, Geneve 4 The BIOINFOGRID resources connected to EGEE Limited amount for the moment Enabled VO: –Biomed, Dteam –Bio, Cms More resources are expected to come in (in one year time)

EGAAP meeting 3 March 2006, Geneve 5 Dissemination of knowledge: WP6 and WP7 Major objective of the BIOINFOGRID project are: raise the awareness, inside the bioinformatics community, about the potentialities offered by the Grid technology in solving Bioinformatics research problems. built a solid kernel of specialists with knowledge about the major aspects of bioinformatics applications on the GRID. evaluate and adopt common solutions to port the BIOINFOGRID applications to the Grid.

EGAAP meeting 3 March 2006, Geneve 6 WP6 and WP7 activity: the web site

EGAAP meeting 3 March 2006, Geneve 7 WP6 and WP7 activity The BIOINFOGRID Initial training course Bari 8-10 March 2006 Course objectives –provide to bioinformatics users a general overview of the state of the art in the development of the Grid Middleware and infrastructures. In particular the state of LCG and gLite Middleware and of the EGEE infrastructure will be presented; –provide detailed technical information and precise instructions on how to use the GRID to enable new users to start using the Grid in the best possible way. Course program ( events/training-course-program ) events/training-course-program –The EGEE infrastructure,the GILDA t-infrastructure, the g-Lite middleware, Authorisation and Authentication, the Information System, the Workload Management System, the Data Management System, the Grid Monitoring –Practical –Access to Databases: AMGA, OGSA-DAI, G-DSE –Portals and workflows: GENIUS and TRIANA, Taverna, GRB –Examples of Bioinformatics applications and tools already deployed on the EGEE GRID infrastructure –Ethical issues

EGAAP meeting 3 March 2006, Geneve 8 WP6 and WP7 activity The BIOINFOGRID Initial training course has been opened to other bioinformatics national projects (≈ 25 participants) Many thanks to EGEE for the really strong support. We expect that the trained BIOINFOGRID personnel will join the Biomed VO after the course. A second training course will be organized in about one year from now A project conference will be organized towards the end of the project –(I.e. one year and half from now)

EGAAP meeting 3 March 2006, Geneve 9 Project applications (WP1 trough WP5) The project will support studies on applications for: distributed laboratory management systems for microarray technology gene expression studies gene data mining analysis of cDNA data phylogenetics analysis distributed database access protein functional analysis molecular dynamics simulations in GRID

EGAAP meeting 3 March 2006, Geneve 10 WP5 Mandate: deploy Molecular Dynamics application in a grid environment –Evaluation of performances Goal: contribute to the WISDOM initiative dedicated to in silico drug discovery –Start from the results of WISDOM docking data challenge in 2005 –Rerank the best hits using Molecular Dynamics Strategy: deployment of MD softwares on different grid infrastructures –Grid of PCs: EGEE-II –Grid of supercomputers: DEISA

EGAAP meeting 3 March 2006, Geneve 11 WP4: Functional Analogous Finder Goal: compare gene products according to their described function instead of by the more conventional sequence comparison. Data source: Gene Ontology (GO) and gene association  GO-terms, ~ 1.3M gene products, 7.1M associations Selection: only well described gene products are considered (>15 go terms) (≈1 million gene products) Processing: one gene against all others  1 CPU hour Output:text file with 100 best hits  compiled as an additional packages of the GO DB go terms associated to the BCL2_Human gene

EGAAP meeting 3 March 2006, Geneve 12 Functional Analogous Finder: the grid implementation Build, once for all, a text file with all the gene products well described –The text file will be transferred at run time from the SE’s where it is located to the WN The gene comparison is done by a perl script which uses statistical libraries –Need to install the perl libraries in every WN (do it at run time). –The libraries are “relocated” to avoid the need for “root” privileges The list of gene products to compare to all the others is stored into a central mySQL DB. –The DB keeps track of the completed gene products, the failed and the running ones. –The DB acts as “task queue” for automatic job submission

EGAAP meeting 3 March 2006, Geneve 13 Functional Analogous Finder: when a job lands on a WN Downloads the input files (from one of three available SE) Decompresses them Installs the perl libraries Reads from a DB the 10 genes to compare (10 hours jobs) (chosen between the not completed genes or running ones from more then 48 hours) Start the perl script and the comparison UI RB Farm1 Farm2 SE2 SE1 The job submitted is always the same Don’t have to worry about failed jobs DB

EGAAP meeting 3 March 2006, Geneve 14 Functional Analogous Finder First test run ( gene products) done over Christmas After optimization of the script and procedure –a new test run was started (100 k gene products) –It is running now using the VO “bio” on the Italian infrastructure We would like to extend the comparison to the full set of “well described” gene products (of the order of 1 million) –Using the biomed VO over the full EGEE infrastructure