Implementing computational analysis through Web services Arnaud Kerhornou CRG/INB Barcelona - BioMed Workshop IRB November 2007
Current situation in Bioinformatics
Limits
Discovery
Service description
Ontologies
Data transfer
Automation
BioMoby architecture
[Diagram: a Service Provider publishes its service description (WSDL, UDDI) to the Service Registry; a Service Requestor finds service descriptions in the registry (WSDL, UDDI) and then binds to the Service Provider.]
A web service is an interface that describes a collection of operations that are network-accessible through standardized XML messaging.
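The publish / find / bind cycle can be sketched with an in-memory registry. This is only an illustration of the pattern, not the BioMoby API: the class names, the `ReverseSequence` service and its `SequenceAnalysis` type are all hypothetical, and a plain Python function stands in for a network-accessible operation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """What a provider publishes: a name, a service type, and an endpoint."""
    name: str
    service_type: str
    endpoint: Callable[[str], str]  # stand-in for a remote operation


@dataclass
class ServiceRegistry:
    """Hypothetical in-memory stand-in for the central service registry."""
    services: Dict[str, ServiceDescription] = field(default_factory=dict)

    def publish(self, description: ServiceDescription) -> None:
        # Publish: the provider registers its description under its name.
        self.services[description.name] = description

    def find(self, service_type: str) -> List[ServiceDescription]:
        # Find: the requestor looks up descriptions by service type.
        return [d for d in self.services.values() if d.service_type == service_type]


def reverse_sequence(sequence: str) -> str:
    """Hypothetical provider operation, used only to have something to call."""
    return sequence[::-1]


registry = ServiceRegistry()
registry.publish(ServiceDescription("ReverseSequence", "SequenceAnalysis", reverse_sequence))

found = registry.find("SequenceAnalysis")
# Bind: the requestor calls the provider directly, using the description it found.
result = found[0].endpoint("AAATG")
```

In a real deployment the registry lookup and the service call would both be XML messages over the network; only the division of roles is shown here.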
BioMoby: a unifying framework approach
The BioMoby project aims to provide bioinformatics resources through the web. These can be data retrieval resources or analysis resources.
It defines an ontology-based messaging standard.
Services are registered in a central “yellow pages” server to facilitate their discovery.
Service specifications are formalized in a description language.
The BioMoby framework
It provides:
A central registry of services
A set of standards to specify:
–Message formatting
–Error reporting
–Asynchronous requests
An API written in two languages, Perl and Java
Ontologies to represent:
–Types of services
–Data types
Ontology
Data exchange relies on the use of ontologies.
An ontology represents knowledge in a given domain.
In bioinformatics:
–OBO (GO, SO and many more)
–BioMoby data types, to classify service inputs and outputs
–BioMoby service types
The BioMoby ontologies
Establish ontologies to formalize the representation of:
–Types of services
–Types of data
The Service Type Ontology
[Diagram: service types connected by is-a links, rooted at Service. Nodes shown: Bioinformatics, Sequence Analysis, Alignment, MultipleSequence Alignment, PairwiseSequence Alignment, GeneFinding.]
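An is-a hierarchy of service types can be represented as a child-to-parent map, which makes "is this service a kind of X?" a simple walk up the tree. The node names below come from the slide, but the exact parent of each node is an assumption, since the diagram layout is not recoverable from the text. The function names are hypothetical.

```python
from typing import Dict, List, Optional

# child -> parent ("is-a") edges. Node names are from the slide;
# the arrangement of the edges is ASSUMED for illustration.
SERVICE_TYPE_PARENT: Dict[str, Optional[str]] = {
    "Service": None,
    "Bioinformatics": "Service",
    "SequenceAnalysis": "Bioinformatics",
    "Alignment": "SequenceAnalysis",
    "MultipleSequenceAlignment": "Alignment",
    "PairwiseSequenceAlignment": "Alignment",
    "GeneFinding": "SequenceAnalysis",
}


def ancestors(service_type: str) -> List[str]:
    """Return the chain of parents from the given type up to the root."""
    chain: List[str] = []
    parent = SERVICE_TYPE_PARENT[service_type]
    while parent is not None:
        chain.append(parent)
        parent = SERVICE_TYPE_PARENT[parent]
    return chain


def is_a(service_type: str, candidate: str) -> bool:
    """True if service_type equals candidate or descends from it."""
    return service_type == candidate or candidate in ancestors(service_type)
```

A registry query for "Alignment" services could then use `is_a` to return pairwise and multiple alignment services as well, which is the practical benefit of classifying services in an ontology rather than by free-text labels.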
The Data Type Ontology
[Diagram: data types connected by is-a and has-a links, rooted at Object. Nodes shown: Object, String, Integer, Virtual Sequence, Generic Sequence, DNA Sequence, AminoAcid Sequence, text_plain, text_formatted, GFF.]
Moby DNASequence Object
[Diagram: a DNASequence object holding the sequence string AAATGTCGCTCGATACGATCAGCTACGA and its length, 28.]
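A composite object such as DNASequence can be sketched as an XML element that has-a String (the sequence) and has-a Integer (its length). The element and attribute names used here (`articleName`, `SequenceString`, `Length`, `namespace`, `id`) are assumptions for illustration and are not taken from the slide; the real BioMoby message format should be checked against its specification.

```python
import xml.etree.ElementTree as ET


def build_dna_sequence(sequence: str, namespace: str = "", identifier: str = "") -> ET.Element:
    """Build a DNASequence-like element with a String and an Integer member.

    Element and attribute names are ASSUMED, not the official schema.
    """
    root = ET.Element("DNASequence", {"namespace": namespace, "id": identifier})

    # has-a Integer: the sequence length, derived from the string itself
    length = ET.SubElement(root, "Integer", {"articleName": "Length"})
    length.text = str(len(sequence))

    # has-a String: the raw sequence
    seq = ET.SubElement(root, "String", {"articleName": "SequenceString"})
    seq.text = sequence
    return root


dna = build_dna_sequence("AAATGTCGCTCGATACGATCAGCTACGA")
xml_text = ET.tostring(dna, encoding="unicode")
```

Computing the length from the string, rather than passing it in separately, keeps the two members consistent.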
BioMoby Service specs
Service name: free text
Service type: from the Moby service type ontology
Description: free text
One or more inputs: from the Moby data type ontology
One or more outputs: from the Moby data type ontology
One or more parameters:
–name (a string)
–value (a ‘primitive’, i.e. a String, an Integer, etc.)
Example: RunGeneIDGFF service specifications
Service type: GeneFinding
Description: ab-initio gene finding software
Input: a DNASequence object
Output: a GFF object
Parameters:
–Profile (Default is Human)
–Strand (Default is both strands)
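The RunGeneIDGFF specification above maps naturally onto a small record type. The sketch below is hypothetical (the `ServiceSpec` class and `resolve_parameters` method are not part of any BioMoby library), and the literal parameter values "Both" and "Forward" are assumptions; the slide only states that the defaults are Human and both strands.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# A parameter value is a 'primitive' such as a String or an Integer.
Primitive = Union[str, int, float, bool]


@dataclass
class ServiceSpec:
    """Hypothetical record mirroring the fields of a service specification."""
    name: str                 # free text
    service_type: str         # term from the service type ontology
    description: str          # free text
    inputs: List[str]         # terms from the data type ontology
    outputs: List[str]        # terms from the data type ontology
    parameters: Dict[str, Primitive] = field(default_factory=dict)  # name -> default

    def resolve_parameters(self, overrides: Dict[str, Primitive]) -> Dict[str, Primitive]:
        """Merge caller-supplied values over the defaults, rejecting unknown names."""
        unknown = set(overrides) - set(self.parameters)
        if unknown:
            raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
        return {**self.parameters, **overrides}


run_geneid_gff = ServiceSpec(
    name="RunGeneIDGFF",
    service_type="GeneFinding",
    description="ab-initio gene finding software",
    inputs=["DNASequence"],
    outputs=["GFF"],
    parameters={"Profile": "Human", "Strand": "Both"},  # value spellings are assumed
)

settings = run_geneid_gff.resolve_parameters({"Strand": "Forward"})
```

Rejecting unknown parameter names up front gives the caller an immediate error instead of a service run that silently ignores a misspelled option.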
Client Side
There are different kinds of clients.
Some of them allow the creation of workflows.
Programmatic libraries:
Client Side: Taverna I
A Java-based graphical integrated workbench
It allows the construction of complex distributed workflows.
It can handle different kinds of services (Moby and others).
Client Side: Taverna II
[Screenshot: a Taverna workflow. Processors = Web services, connected to the workflow Inputs and Outputs.]
Client Side: Taverna III Moby Web service Configuration
BioMoby on the Web
All the info is accessible at the Moby homepage at:
– Taverna Web site
– Remora Web interface
– MowServ Web interface
– Genome Analysis services page