THE BIOVEL PROJECT: ROBUST PHYLOGENETIC WORKFLOWS RUNNING ON THE GRID
Bachir Balech (IBBE-CNR)
www.biovel.eu



The BioVeL Project

BioVeL is a virtual e-laboratory that supports research on biodiversity issues using large amounts of data from cross-disciplinary sources. It is a consortium of 15 partners from 9 countries, as well as an outer circle of "Friends of BioVeL".
- Access to a worldwide network of expert scientists
- Sharing knowledge on biodiversity research

Biodiversity issues:
- Species identification, discovery and distributions
- The changing nature of ecosystems altering organismal composition
- The increased risks of species extinction

Decision making in biodiversity management at multiple scales (genomic, organismal, habitat, ecosystem, landscape, etc.)

Biodiversity Solutions

Services: data processing techniques. Each technique is available as a single executable application that can be used either alone or within a workflow builder environment (e.g. Taverna).

Workflows: examples of service use that can be modified.

Services and workflows for biodiversity analysis:
- Taxonomy
- Phylogenetics
- Metagenomics
- Ecological Niche Modeling
- Ecosystem Functioning and Valuation
- Geospatial Visualization

Sharing: services and workflows

Example of Phylogenetic Services
- Alignment
- Phylogeny Inference
- Job Submission Tool

Job Submission Tool: JST

Frontend:
- Username
- Task status
- Dependencies of each task
- Priority
- Job provenance
- Task description
- Number of failures
- Date and time of execution
- Infrastructure information (grid, local farm, interactive server)

Backend:
- Submits tasks at a given rate
- Stops job submission when no more unassigned tasks are found in the TaskList
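The backend behaviour described above (submit at a fixed rate, stop once the TaskList holds no unassigned tasks) can be sketched roughly as follows. This is an illustrative sketch only, not the JST implementation: the function name `submit_tasks`, the dictionary layout of a task, the status strings and the `submit` callback are all assumptions made for the example.

```python
import time
from collections import deque


def submit_tasks(task_list, submit, rate_per_second=2.0, sleep=time.sleep):
    """Submit every unassigned task at a fixed rate, then stop.

    task_list: list of dicts with hypothetical keys "id" and "status".
    submit:    callable that hands one task to the infrastructure
               (grid, local farm or interactive server); hypothetical.
    Returns the ids of the tasks that were submitted, in order.
    """
    # Only tasks still marked "unassigned" are candidates for submission.
    pending = deque(t for t in task_list if t["status"] == "unassigned")
    interval = 1.0 / rate_per_second
    submitted = []
    while pending:  # the loop ends when no unassigned task is left
        task = pending.popleft()
        submit(task)
        task["status"] = "submitted"
        submitted.append(task["id"])
        if pending:
            sleep(interval)  # throttle to the requested submission rate
    return submitted
```

Passing `sleep` as a parameter keeps the throttling testable; a real backend would presumably also re-read the TaskList between submissions so that failed tasks (see "Number of failures" in the frontend) can be queued again.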

Multiple Sequence Alignment Workflow

In progress: Multiple Domain Coding Sequences Alignment

Higher alignment precision given by:
- HMM search assigning a per-site quality score (posterior probability)
- Back-align (amino acid -> DNA)

Workflow diagram labels:
- Multiple Alignment of DNA coding
- Translation
- HMM search
- Pfam profile selection
- HMM align & Back-align
- File upload
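The back-align step (amino acid -> DNA) can be illustrated with a minimal sketch: each residue of the aligned protein is replaced by the codon it was translated from, and each protein gap becomes a three-nucleotide gap. The function name `back_align` and the assumption that the DNA is in frame, ungapped and without a trailing stop codon are choices made for this example, not details taken from the BioVeL service.

```python
def back_align(aligned_protein, dna, gap="-"):
    """Thread an unaligned coding DNA sequence onto its aligned protein.

    aligned_protein: protein sequence with gap characters, as produced
                     by a protein-level alignment.
    dna:             the unaligned, in-frame coding sequence that was
                     translated into that protein (assumed, see lead-in).
    Returns the codon-aligned DNA sequence.
    """
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(dna) % 3 != 0 or residues != len(codons):
        raise ValueError("DNA length does not match the number of residues")

    codon_iter = iter(codons)
    out = []
    for aa in aligned_protein:
        # A protein gap spans three nucleotide positions in the DNA alignment.
        out.append(gap * 3 if aa == gap else next(codon_iter))
    return "".join(out)


print(back_align("M-K", "ATGAAA"))  # prints ATG---AAA
```

Because gaps are only ever inserted between whole codons, the reading frame of every sequence is preserved in the resulting DNA alignment.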

Example Phylogenetic Inference Workflow

Example Phylogenetic Inference Workflow in Taverna

Workflow diagram labels:
- MrBayes Web Interface
- Bayesian Phylogeny Computation & Output Retrieval
- GeoKS Execution
- Consensus Tree Calculation
- Tree Visualization

Other available phylogenetic services:
- Maximum Likelihood (RAxML)
- Phylogenetic Diversity (Phylocom)

Peculiarities:
- Partitioned models
- Convergence calculation
- Short computation time on the Grid (even for long jobs)

Acknowledgments

Bioinformatic scientists: Prof. Graziano Pesole, Dr. Saverio Vicario
ICT specialists: Dr. Giacinto DONVITO, Dr. Pasquale NOTARANGELO
Funding: European Commission 7th Framework Programme (FP7), through the grant agreement: