EMBOSS, MyGrid and EMBRACE

Presentation transcript:

EMBOSS, MyGrid and EMBRACE
  European Molecular Biology Open Software Suite
  Taverna workbench and workflows
  Web Services
Peter Rice
pmr@ebi.ac.uk

Who do we serve? Expert software developers Expert users Bioinformaticians Computer scientists Expert users Biology research community Industry Scientific users 22.05.2018 China-UK Data Transmission

What do we serve?
  Sequence analysis tools
    Open source
    Comprehensive package
  Data resources
    Public sequence database resources
    Locally installed data
    Users’ own datasets
  Workflows
    Taverna workbench
  Web services
    Standard SOAP web services
    Web service registry

EMBOSS: A quick introduction
  European Molecular Biology Open Software Suite
  Open source package for sequence analysis
    ANSI C source code
    GPL licensed applications, LGPL libraries
  200+ applications
  100+ third party applications in 15 associated packages
  Project started 1996 at Sanger Centre and HGMP
  Now based at EBI
  Release 6.1.0, 15th July 2009
  Funded by UK-BBSRC and EMBL-EBI

EMBOSS World Wide
  We have users in every continent - and a picture to prove it.
  This is British Antarctica. We are promised another photo from the frozen North.
  The first EMBOSS course was in Beijing, April 1999.
  The wEMBOSS interface is from Canada, Argentina and Belgium.

EMBOSS command line interface
  EMBOSS applications run from the command line
  This is not the only interface
    There are over 100 interfaces and packaged systems available
    Web interfaces
    Graphical user interfaces (GUIs)
    Web services
  All applications have a command definition file (.acd)
    Defines all inputs, outputs, and other options
    Read at startup
    Contains all command line options with descriptions
    Template for any other interface

EMBOSS command line example

  % antigenic
  Input protein sequence(s): uniprot:actb1_fugru
  Minimum length of antigenic region [6]:
  Output report [actb1_fugru.antigenic]:

  % antigenic uniprot:actb1_fugru -auto
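The non-interactive form of the command can also be assembled from a script. The following is a minimal sketch, not part of the original slides: the helper names build_emboss_command and run_if_installed are hypothetical, and it assumes each option is passed as a -name value qualifier, with -auto suppressing the prompts as in the example above.

```python
import shutil
import subprocess


def build_emboss_command(app, sequence, **qualifiers):
    """Build an argument list for a non-interactive EMBOSS run.

    Hypothetical helper: each keyword becomes a "-name value" pair,
    and "-auto" is appended so the program does not prompt.
    """
    command = [app, sequence]
    for name, value in qualifiers.items():
        command += [f"-{name}", str(value)]
    command.append("-auto")
    return command


def run_if_installed(command):
    """Run the command only if the program is on PATH.

    Returns the exit code, or None when the program is not installed.
    """
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, check=False).returncode


command = build_emboss_command("antigenic", "uniprot:actb1_fugru", minlen=6)
print(" ".join(command))
```

Building the argument list separately from running it keeps the sketch usable on machines without EMBOSS installed; run_if_installed(command) would only start the program when it is found on PATH.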

EMBOSS ACD File

  application: antigenic [
    documentation: "Finds antigenic sites in proteins"
    groups: "Protein:Motifs"
  ]

  section: input [
    information: "Input section"
    type: "page"
  ]

    seqall: sequence [
      parameter: "Y"
      type: "PureProtein"
    ]

  endsection: input

  section: required [
    information: "Required section"
  ]

    integer: minlen [
      standard: "Y"
      minimum: "1"
      maximum: "50"
      default: "6"
      information: "Minimum length of antigenic region"
    ]

  endsection: required

  section: output [
    information: "Output section"
    type: "page"
  ]

    report: outfile [
      parameter: "Y"
      rformat: "motif"
      multiple: "Y"
      taglist: "int:pos=Max_score_pos"
    ]

  endsection: output
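Because the ACD file holds every option together with its description, another interface can be generated from it. The following is a rough sketch of that idea, not the EMBOSS parser: parse_acd and prompt_for are hypothetical names, the regular expressions cover only the simple quoted attributes seen in this example, and real ACD syntax is richer than this.

```python
import re

# Trimmed copy of the antigenic definition from the slide.
ACD_TEXT = '''
application: antigenic [
  documentation: "Finds antigenic sites in proteins"
  groups: "Protein:Motifs"
]

section: required [
  information: "Required section"
]

integer: minlen [
  standard: "Y"
  minimum: "1"
  maximum: "50"
  default: "6"
  information: "Minimum length of antigenic region"
]

endsection: required
'''

# A block looks like  kind: name [ ...attributes... ]
BLOCK_RE = re.compile(r'(\w+):\s*(\w+)\s*\[(.*?)\]', re.DOTALL)
# An attribute looks like  key: "value"
ATTR_RE = re.compile(r'(\w+):\s*"([^"]*)"')


def parse_acd(text):
    """Return a list of (kind, name, attributes) tuples (simplified)."""
    blocks = []
    for kind, name, body in BLOCK_RE.findall(text):
        blocks.append((kind, name, dict(ATTR_RE.findall(body))))
    return blocks


def prompt_for(attributes):
    """Build a command-line style prompt from a block's attributes."""
    return f'{attributes["information"]} [{attributes.get("default", "")}]: '


blocks = parse_acd(ACD_TEXT)
for kind, name, attributes in blocks:
    # Only options marked standard: "Y" are prompted for in this sketch.
    if attributes.get("standard") == "Y":
        print(prompt_for(attributes))
```

The same parsed attributes (minimum, maximum, default) could equally drive a web form or a GUI field, which is the sense in which the definition file acts as a template for other interfaces.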

EMBOSS makes things easy
  ACD files define sequence input
    Sequence type for DNA/protein, possible ambiguity codes, gaps
  Sequences in files
    40+ formats supported - auto detection
  Sequence databases
    Remote servers - SRS, Entrez, MRS, URL
    Locally indexed - using the original data files
    Local script utilities
  Sequence output
    40+ formats supported: sequence and features
    DAS support (Distributed Annotation System)

Example Dasty screen: Protein annotation

Example Ensembl: DNA annotation

EMBOSS Future plans
  Three open source books: users, developers, admin
    Cambridge University Press
    Original text can be freely reused
  New areas of interest
    Metadata and ontologies (EDAM, taxonomy, GO, SO, …)
    (all) public data resources
    Coordinate systems (Ensembl, gene/protein input/results)
    Project-based working
    Next-generation sequence data - used by ordinary biologists
  100+ new applications
  Database index updates
  Scientific advisory board
  Developer / User courses: anywhere, any time

Taverna workbench

Taverna Workbench
  MyGrid UK e-Science project
  Taverna 2.0
  Workbench for bioinformatics data and tools services
    Open source
    Integrates SOAP web services
    Workflows can be saved, and exchanged by email
    Data passed to service, results returned
    Complete record available
  Taverna 2.0
    Designed for scalability
    New workflow model
    Multiple servers
    Data passed by reference

EMBRACE Web Services
  EMBRACE EC Network of Excellence
    18 partners
  Application interface standards for data content:
    DNA and protein sequence data
    Structure and image data
    Gene and protein expression
    Literature and text mining
  Analysis tools using data content standards
    Sequence analysis tools (EMBOSS etc.)
    Structure analysis tools
    ... and tools for other data types
  Taverna as an example user interface

EMBRACE Registry

EMBRACE Registry
  Registry of EMBRACE Web Services
    Requires standard web service definitions
      Test suites defined by service providers
      Simple report of service availability
    Standard annotation
      Requires an ontology of terms for datatypes and methods
  BioCatalogue
    Manchester/EBI joint project
    EMBRACE Registry is a prototype
    Sharing a common schema
    BioCatalogue will take over when EMBRACE ends in 2010.

What do we serve?
  Sequence analysis tools
    Open source
    Comprehensive package
  Data resources
    Public sequence database resources
    Locally installed data
    Users’ own datasets
  Workflows
    Taverna workbench
  Web services
    Standard SOAP web services
    Web service registry

Acknowledgements
  EBI: Peter Rice, Alan Bleasby, Jon Ison, Martin Senger, Tom Oinn, Jaina Mistry, Rodrigo Lopez, Sharmilla Pillai, Hamish McWilliam
  RFCGR/HGMP: Alan Bleasby, Jon Ison, Tim Carver, Hugh Morgan, Claude Beazley, Lisa Mullan, Damian Counsell, Gary Williams, Val Curwen, Mark Faller, Sinead O’Leary, Thon deBoer, Martin Bishop
  LION: Thomas Laurent, Bijay Jassal, Bren Vaughan, Thure Etzold
  Sanger Institute: Ian Longden, Richard Bruskiewich, Simon Kelley
  National bioinformatics service providers in: Norway, Spain, Italy, Netherlands, Germany, Belgium, Russia, China, Canada, Australia, Argentina
  Others: Catherine Letondal, Don Gilbert, Rodger Staden, Bill Pearson, Webb Miller, Marie-Laetitia Denayer, Amandine Schurmann, Gabriele Weiler, Luke McCarthy, David Mathog, David Bauer, Henrikki Almusa, Thomas Siegmund, Scott Markel, Darryl Leon, Bastien Chevreux...
  IBM, Hewlett-Packard, (Compaq), Apple, SGI, Sun, LION bioscience, SciTegic, Accelrys, Cambridge University Press
  Open-Bio Foundation, Sourceforge
  ... And the British Antarctic Survey

  http://emboss.sourceforge.net
  http://emboss.open-bio.org/wiki