E-Science Tools For The Genomic Scale Characterisation Of Bacterial Secreted Proteins Tracy Craddock, Phillip Lord, Colin Harwood and Anil Wipat Newcastle University


e-Science Tools For The Genomic Scale Characterisation Of Bacterial Secreted Proteins Tracy Craddock, Phillip Lord, Colin Harwood and Anil Wipat Newcastle University

Outline
- Computational challenges of bioinformatics
- Secretion in Bacillus
- Classification and analysis workflows
- Results and discussion

Computational Challenges of Bioinformatics
- New requirements from bioinformatics
- 3 major problems
  - Heterogeneity
  - Distribution
  - Autonomy
- Experiments: series of workflows

myGrid and Taverna
- Scufl: Simple Conceptual Unified Flow Language
- Taverna: writing, running workflows & examining results
- SOAPLAB: makes applications available
- Freefluo: workflow engine to run workflows
[Diagram labels: Freefluo, SOAPLAB, Web Service, Any Application, Web Service e.g. DDBJ BLAST]
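
The idea of a workflow as services connected by data links can be sketched in a few lines of Python. This is only an illustration of the concept, not Scufl, Taverna or Freefluo: the function `run_workflow`, the step names and the local lambdas standing in for web services are all hypothetical. It assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, links, inputs):
    """Run each step once every step it depends on has produced a result.

    steps:  step name -> function taking a list of upstream results
    links:  step name -> list of upstream names (the data links)
    inputs: name -> value supplied to the workflow from outside
    """
    results = dict(inputs)
    # static_order() yields each name only after all of its upstream names.
    for name in TopologicalSorter(links).static_order():
        if name in results:  # workflow input, nothing to run
            continue
        upstream = [results[dep] for dep in links.get(name, [])]
        results[name] = steps[name](upstream)
    return results


# Hypothetical steps; in a real workflow each would call a remote service.
steps = {
    "uppercase": lambda up: up[0].upper(),
    "length": lambda up: len(up[0]),
    "report": lambda up: f"{up[0]} ({up[1]} residues)",
}
links = {
    "uppercase": ["sequence"],
    "length": ["uppercase"],
    "report": ["uppercase", "length"],
}
results = run_workflow(steps, links, {"sequence": "mkkrl"})
print(results["report"])
```

Keeping the links separate from the step functions mirrors the split between a workflow description and the services it invokes.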

Microbase
- Grid-based system for microbial genome comparison and analysis
- Information repository (and execution environment)
- Pre-computed data

Secretion in Bacillus
- Predict characteristics & behaviour of bacteria
- Identify secreted proteins
- Bacillus species show diverse behaviour
  - Soil inhabitants
  - Harmful bacteria

Importance of Secretion
- Mechanism of interaction with environment
- Reveal capabilities of an organism
- Pathogens are of great interest

Secretory Proteins
[Diagram labels. Compartments: Cytoplasm, Medium, Membrane, Cell Wall. Protein types: Signal Peptide, Lipoprotein, Cell wall binding, Transmembrane, LPXTG]

Bioinformatic Tools
[Diagram labels. Compartments: Cytoplasm, Medium, Membrane, Cell Wall. Protein types: Signal Peptide, Lipoprotein, Cell wall binding, Transmembrane, LPXTG. Tools: SignalP, TMHMM, tmap, MEMSAT, LipoP, ps_scan]

Classification Workflow
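
The slides do not give the decision logic of the classification workflow, so the following is only a minimal sketch of how per-protein tool predictions might be combined into one of the protein types named above. The `Predictions` fields, the order of the rules and the category strings are assumptions made for illustration; the only fixed element is the LPXTG motif itself, written as the regular expression `LP.TG`.

```python
import re
from dataclasses import dataclass

# LPXTG cell-wall anchoring motif; "." matches any residue at the X position.
LPXTG_MOTIF = re.compile(r"LP.TG")


@dataclass
class Predictions:
    """Hypothetical summary of tool output for one protein."""
    sequence: str
    has_signal_peptide: bool = False            # e.g. from a signal peptide predictor
    is_lipoprotein: bool = False                # e.g. from a lipoprotein predictor
    tm_helices: int = 0                         # e.g. from a transmembrane predictor
    has_cell_wall_binding_domain: bool = False  # e.g. from a pattern scan


def classify(p: Predictions) -> str:
    """Assign one category; the rule order is an assumption, not the authors'."""
    if p.is_lipoprotein:
        return "lipoprotein"
    if p.tm_helices > 0:
        return "transmembrane"
    if not p.has_signal_peptide:
        return "cytoplasmic"
    if LPXTG_MOTIF.search(p.sequence):
        return "LPXTG"
    if p.has_cell_wall_binding_domain:
        return "cell wall binding"
    return "secreted to medium"


print(classify(Predictions("MKKLLPETGE", has_signal_peptide=True)))
```

A real workflow would also need to reconcile disagreements between tools, which this sketch ignores.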

Process of Analysis
- Putative secreted proteins
- Protein families
- Functional classification
- Relations

Analysis Workflow
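
The slides do not say how putative secreted proteins were grouped into families. As a stand-in, the sketch below uses single-linkage grouping: any two proteins reported as similar end up in the same family. The function `group_families`, the protein identifiers and the notion of a precomputed list of similar pairs are all hypothetical, and a different clustering method may well have been used in the actual analysis.

```python
def group_families(proteins, similar_pairs):
    """Group proteins into families by single linkage (union-find).

    proteins:      iterable of protein identifiers
    similar_pairs: iterable of (a, b) pairs judged similar, e.g. above
                   some sequence-similarity threshold chosen by the user
    """
    parent = {p: p for p in proteins}

    def find(x):
        # Follow parent links to the family representative, shortening
        # the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        parent[find(a)] = find(b)

    families = {}
    for p in parent:
        families.setdefault(find(p), []).append(p)
    # Largest families first, ties broken alphabetically.
    return sorted((sorted(m) for m in families.values()),
                  key=lambda m: (-len(m), m))


families = group_families(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
print(families)
```

Single linkage is simple but can chain loosely related proteins together, which is one reason other clustering methods are often preferred for protein families.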

Architecture
- Custom-designed database
- Provenance tracking
- Analysis: computationally intensive
- Architecture differs from other systems

Web Portal

Classification Results

Functions of the Clusters
[Chart; axis label: Number of families]

Biologist’s Outlook
- Results available for subsequent analysis
- Data and results are of great interest

eScientist’s Outlook
- Microbase simplified data analysis
But …
- Autonomy: most services were originally provided by external parties
- Licensing: limits exposure of services
- Distribution: difficulty came from the relatively large datasets

Future Enhancements
- Use notification to automatically analyse recently annotated genomes
- Migrate workflows to a remote enclosed environment?

Acknowledgments
- Phillip Lord
- Colin Harwood
- Anil Wipat
myGrid
- Carole Goble
- Tom Oinn
- … and the rest of the myGrid team
Microbase
- Yudong Sun
- Anil Wipat
- Matthew Pocock
- Pete A. Lee
- Paul Watson
- Keith Flanagan
- James T. Worthington