EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org I. Blanquer(1), V. Hernandez(1), G. Aparicio (1), M. Pignatelli(2), J. Tamames(2)

Slides:



Advertisements
Similar presentations
The Quantum Chromodynamics Grid James Perry, Andrew Jackson, Matthew Egbert, Stephen Booth, Lorna Smith EPCC, The University Of Edinburgh.
Advertisements

Deployment and Preparation of Metagenomic Analysis on the EELA Grid Gabriel Aparício, et al.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Supporting MPI Applications on EGEE Grids Zoltán Farkas MTA SZTAKI.
Massive Ray Tracing in Fusion Plasmas on EGEE J.L. Vázquez-Poletti, E. Huedo, R.S. Montero and I.M. Llorente Distributed Systems Architecture Group Universidad.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Support for Astrophysical applications in.
Metagenomic Analysis Using MEGAN4
INFSO-RI Enabling Grids for E-sciencE Application of the Grid to Pharmacokinetic Modelling of Contrast Agents in Abdominal Imaging.
Grid Data Management A network of computers forming prototype grids currently operate across Britain and the rest of the world, working on the data challenges.
03/27/2003CHEP20031 Remote Operation of a Monte Carlo Production Farm Using Globus Dirk Hufnagel, Teela Pulliam, Thomas Allmendinger, Klaus Honscheid (Ohio.
RISICO on the GRID architecture First implementation Mirko D'Andrea, Stefano Dal Pra.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
INFSO-RI Enabling Grids for E-sciencE Project Gridification: the UNOSAT experience Patricia Méndez Lorenzo CERN (IT-PSS/ED) CERN,
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ignacio Blanquer Vicente Hernández Damià.
CSIU Submission of BLAST jobs via the Galaxy Interface Rob Quick Open Science Grid – Operations Area Coordinator Indiana University.
EGEE-III INFSO-RI Enabling Grids for E-sciencE I. Blanquer (1), L. Martí (2), V. Hernandez (1) (1) Universidad Politécnica de Valencia.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The network monitoring in grid context Operations.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ignacio Blanquer Vicente Hernández Bioinformatics.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks S. Natarajan (CSU) C. Martín (UCM) J.L.
Enabling Grids for E-sciencE EGEE-III INFSO-RI Using DIANE for astrophysics applications Ladislav Hluchy, Viet Tran Institute of Informatics Slovak.
1 Large-Scale Profile-HMM on the Grid Laurent Falquet Swiss Institute of Bioinformatics CH-1015 Lausanne, Switzerland Borrowed from Heinz Stockinger June.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks WMSMonitor: a tool to monitor gLite WMS/LB.
Metagenomic Analysis Using MEGAN4 Peter R. Hoyt Director, OSU Bioinformatics Graduate Certificate Program Matthew Vaughn iPlant, University of Texas Super.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America EELA Demo: Blast in Grids Ignacio Blanquer.
INFSO-RI Enabling Grids for E-sciencE Status of the Biomedical Applications in EELA Project (E-Infrastructures Shared Between Europe.
EGEE-III INFSO-RI Enabling Grids for E-sciencE I. Blanquer (1), V. Hernandez (1), L. Martí (2), D. Quilis (1), J. Salavert (1) (1)
Using SWARM service to run a Grid based EST Sequence Assembly Karthik Narayan Primary Advisor : Dr. Geoffrey Fox 1.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks SA2 Quality Plan for EGEE III Geneviève.
Update on replica management
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Stuart Kenny and Stephen Childs Trinity.
INFSO-RI Enabling Grids for E- sciencE Pharmacokinetics on Contrast Agents in Abdominal Cancer Technical University of Valencia.
Enabling Grids for E-sciencE EGEE-III INFSO-RI EGEE Training Follow-on survey 2009 Conducted on past EGEE training course participants 120 respondents.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Design of an Expert System for Enhancing.
INFSO-RI Enabling Grids for E-sciencE Graphical User Interface. for Charon Extension Layer System. and Application Dashboards Jan.
INFSO-RI Enabling Grids for E-sciencE EGEE Review WISDOM demonstration Vincent Bloch, Vincent Breton, Matteo Diarena, Jean Salzemann.
INFSO-RI Grupo de Redes y Computación de Altas Prestaciones Actividades del Grupo de Redes y Computación de Altas Prestaciones.
INFSO-RI Enabling Grids for E-sciencE EGEE is a project funded by the European Union under contract INFSO-RI Grid Accounting.
INFSO-RI Enabling Grids for E-sciencE Ganga 4 – The Ganga Evolution Andrew Maier.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Abel Carrión Ignacio Blanquer Vicente Hernández.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Using GStat 2.0 for Information Validation.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Progress on first user scenarios Stephen.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks BiG: A Grid Service to Distribute Large BLAST.
INFSO-RI Enabling Grids for E-sciencE Charon Extension Layer. Modular environment for Grid jobs and applications management Jan.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid2Win : gLite for Microsoft Windows Roberto.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ignacio Blanquer Vicente Hernández Damià.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Standard network trouble tickets exchange.
INFSO-RI Enabling Grids for E-sciencE Activities of the UPV in NA4- Biomed Ignacio Blanquer Vicente Hernández Universidad Politécnica.
INFSO-RI Enabling Grids for E-sciencE CRAB: a tool for CMS distributed analysis in grid environment Federica Fanzago INFN PADOVA.
User Scenarios in VENUS-C Focus on Structural Analysis Ignacio Blanquer I3M - UPV.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Computational chemistry with ECCE on EGEE.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks C. Martín, A. Lorca (UCM) Introduction to.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks On-line Visualization for Grid-based Astronomical.
INFSO-RI Enabling Grids for E-sciencE Using of GANGA interface for Athena applications A. Zalite / PNPI.
EGEE is a project funded by the European Union under contract IST Enabling bioinformatics applications to.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE III User Forum – Clermont Ferrand Analysis of Metagenomes on the EGEE Grid Gabriel.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Practical using WMProxy advanced job submission.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Response of the ATLAS Spanish Tier2 for.
Alien and GSI Marian Ivanov. Outlook GSI experience Alien experience Proposals for further improvement.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Bioinformatics activity Christophe BLANCHET.
Scaling bio-analyses from computational clusters to grids George Byelas University Medical Centre Groningen, the Netherlands IWSG-2013, Zürich, Switzerland,
INFSO-RI Enabling Grids for E-sciencE Pharmacokinetics on Contrast Agents in Abdominal Cancer Technical University of Valencia Ignacio.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks ROC model assessment AP ROC ShuTing Liao.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The Dashboard for Operations Cyril L’Orphelin.
Enabling Grids for E-sciencE LRMN ThIS on the Grid Sorina CAMARASU.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks GOCDB4 Gilles Mathieu, RAL-STFC, UK An introduction.
Galaxy based BLAST submission to distributed high throughput computing resources Rob Quick and Soichi Hayashi Open Science Grid Operations Indiana University.
INFSO-RI Enabling Grids for E-sciencE EGEE is a project funded by the European Union under contract IST Report from.
ScotGRID is the Scottish prototype Tier 2 Centre for LHCb and ATLAS computing resources. It uses a novel distributed architecture and cutting-edge technology,
U.S. ATLAS Grid Production Experience
Nicolas Jacq LPC, IN2P3/CNRS, France
Presentation transcript:

EGEE-III INFSO-RI Enabling Grids for E-sciencE I. Blanquer(1), V. Hernandez(1), G. Aparicio (1), M. Pignatelli(2), J. Tamames(2) (1) Universidad Politécnica de Valencia - ITACA (2) Instituto Cavanilles de la Biodiversidad - Universidad de Valencia Experiences on metagenomic analysis using the EGEE Grid

Enabling Grids for E-sciencE EGEE-III INFSO-RI Contents Metagenomic Analysis Execution Environment at the UPV Experiments Performed. Current Results and Conclusions. Work in Progress. 2

Enabling Grids for E-sciencE EGEE-III INFSO-RI Metagenomic Analysis Execution Environment at the UPV Objective: Perform the Alignment of Large Eukariotic and Prokariotic Metagenomic Samples. Procedure BLASTBLAST Mas king NRNR NRFilteredNRFiltered MetagenomeMetagenome Target sequences & score ~2Gbs <1Gb 5-40Gbs CPU years 3

Enabling Grids for E-sciencE EGEE-III INFSO-RI Steps of the Execution Script Objective: Platform-Independent, Fault- Tolerant. –Download BLAST from NCBI. –Configure and Install Locally. –Download and Reference Database from SE+Catalogue. –Execute Processing. –Copy and Register Output on SE+Catalogue. –Uninstall and Clean BLAST. 4

Enabling Grids for E-sciencE EGEE-III INFSO-RI Steps of the Submission Schema Preprocessing –Download (NCBI), Filter, Reindex, Copy and Register the Reference Databases. –Split the Input Metagenome Sample. –Define the JDL Template and Create JDLs. Execution –Submit all Jobs. –Periodically Monitor the Status of the Execution.  Resubmit Failing Jobs or Jobs on Extremely Slow Resources. Data Retrieval –Copy, Check and Sort all Result Files from the Storage Elements. 5

Enabling Grids for E-sciencE EGEE-III INFSO-RI Case Studies - Biological Farm Soil –A Sample from a Nutrient-rich and Moderately Contaminated Soil Environment.  This Community is very Diverse and Complex.  Many yet Unknown Enzymes are Probably Present There. Whale Fall –Sample From a Whale Carcass.  They are Known to be a Nutrient-rich Environment in the Bottom of the Ocean.  A Heterogeneous Mixture of Bacteria Flourish There. Sargasos’s Sea –Oceanic Samples Taken from Surface Waters.  They Represent the Diversity of Bacteria that Live Planktonically. 6

Enabling Grids for E-sciencE EGEE-III INFSO-RI Case Studies – Medical Gut Metagenomes –Several Metagenomes of the Human Intestinal Microbiota. –A consortia of Bacteria that helps its host to Metabolize Many Nutrients that would be Indigestible Otherwise. –It is Involved in other Functions  Maturation and Modulation of the Immune Response of the Host.  Prevention of Infection by Bacterial Pathogens. 7

Enabling Grids for E-sciencE EGEE-III INFSO-RI Resource Consume The Different Experiments have Consumed more than 10 CPU Years with more than 20 thousand jobs, Producing More than 120 Gbytes of Effective Results. The Performance has Reached a Peak Performance of Sequences per Hour (More than 120 Times Faster). The Load Balancing Still Requires Some Replication of Jobs To Ensure High Reliability and Quicker Response Time. 8

Enabling Grids for E-sciencE EGEE-III INFSO-RI Metrics Time (minutes) Evolution of the Sequences Processed per Hour Seqs per Hour Completion Percentage (%) Finished Jobs Time (minutes) 9

Enabling Grids for E-sciencE EGEE-III INFSO-RI Experiment Objective –Analysis of the Current Knowledge of the Universe of Sequences, and how This Knowledge has Evolved in the Past Years. –The possible Extent of Failures when Taxonomically Assigning ORFs to their Closest Relatives. Methods –Execution of BLASTX Searches for Several Metagenomes Against the Release 159 (April 2007) of GeneBank Non-redundant Database. –Restrict the list of the Homologues to those Already Present in a Particular Date, to Simulate the Results in that Time. 10

Enabling Grids for E-sciencE EGEE-III INFSO-RI Conclusions In the Experiment of Soil Farm the Results Showed that Many Assignments Have Changed, Even in Recent Times. The Rate of Change Does not Decrease in Recent Times, and Even it Increases for Most Cases. This Indicates that the Full Diversity of Theses Communities is Still not Well Described in the Current Databases. Ref: Miguel Pignatelli, Gabriel Aparicio, Ignacio Blanquer, Vicente Hernández, Andrés Moya and Javier Tamames, "Metagenomics reveals our incomplete knowledge of global diversity", Bioinformatics, ISSN , Oxford University Press

Enabling Grids for E-sciencE EGEE-III INFSO-RI Further Work Work is Continuing Through Four Ways: –Execution of new Experiments.  Experiments of Virus Metagenomes have been Recently Started. –Migration of the Execution Environment to Different Problems  The Execution Environment is Being Generalised for Other Problems.  Currently we are Testing this Generalisation with the JAGS (Just Another Gibbs Sampler). –Extension of the Execution Pipeline with More Steps.  Creation of Phylogenetic Trees. –Refinement of the Resubmission System  A Trade-off Between Efficiency and Performance is Being Analysed to Increase the Number of Replicas in Failing Jobs. 12