Published by Tyler Jones. Modified over 9 years ago.
1
Workflows over Grid-based Web services
General framework and a practical case in structural biology
BioMOBY Services
Enrique de Andrés
2
2/20/2007 BioMOBY Services
Outline
The problem
The BioMOBY idea
BioMOBY ontologies
How BioMOBY works
Message exchanges
BioMOBY elements
3
The problem…
Scientific work requires:
– Data resources: genomic sequences, protein sets, expression data, …
– Computational resources: similarity searches, alignments, domain prediction, functional classification, clustering, …
Often these resources exist and are available, but they are:
– Hard to find.
– Distributed all over the world.
– Not in a common format.
4
Result… painful research!
5
Solution…
Web Services:
– Provide data or computational resources over the WWW.
– Can be accessed automatically: an application-centric web.
Additional advantages:
– Work for everyone who has internet access (no firewall obstacles, …).
– Independent of programming languages.
– Use broadly accepted protocols.
6
Outline
The problem
The BioMOBY idea
BioMOBY ontologies
How BioMOBY works
Message exchanges
BioMOBY elements
7
BioMOBY
BioMOBY was initiated in 2001 as a collaboration of several model organism database providers.
It is a system for interoperability between biological data hosts and analytical services:
– A simple, open source platform for discovery, integration, representation and retrieval of biological data.
Two branches:
– MOBY-S: follows the Web Service paradigm.
– S-MOBY: uses semantic web technology (not covered here).
8
The MOBY-S plan
Create an ontology of bioinformatics data-types.
Define a serialization of this ontology (data syntax).
Create an open API over this ontology (let independent service providers build data-types).
Define Web Service inputs and outputs using that ontology.
Register services in an ontology-aware registry.
BioMOBY advantages:
– Machines can find an appropriate service.
– Machines can execute that service unattended.
– The ontology is community-extensible.
9
MOBY-S vs. general Web Services
The registry is MOBY Central.
Ontologies are used.
BioMOBY services operate on MOBY objects.
Namespaces are used.
MOBY-S has its own messaging structure for registration, detection and invocation of services.
10
Outline
The problem
The BioMOBY idea
BioMOBY ontologies
– Object ontology
– Service ontology
How BioMOBY works
Message exchanges
BioMOBY elements
11
BioMOBY ontology
Ontology:
– A formally defined system of things and the relations between them, used to represent knowledge.
– Usually, an ontology builds a hierarchy of objects to describe relations in a certain domain.
BioMOBY ontology:
– Usage of namespaces.
– Object (data) ontology: semantic/syntactic data-types.
– Service ontology.
12
Object ontology
Any identifiable piece of data is an “entity”.
Identifiers for these entities fall under “Namespaces”:
– NCBI has gi numbers (gi namespace).
– GO terms have accession numbers (GO namespace).
Namespaces indicate the data’s semantic type:
– GO:0003476 is a Gene Ontology term.
– gi|163483 is a GenBank record.
Namespace + ID precisely specifies a data “entity”.
Identifiers are not opaque: they are semantically rich.
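To make the namespace + ID idea concrete, here is a minimal Python sketch that splits the two identifier spellings shown on this slide into a (namespace, id) pair. The names `EntityRef` and `parse_identifier` are hypothetical helpers for illustration only; they are not part of any BioMOBY library.

```python
from typing import NamedTuple


class EntityRef(NamedTuple):
    """A data entity named by namespace + id (hypothetical helper type)."""
    namespace: str
    id: str


def parse_identifier(text: str) -> EntityRef:
    """Split identifiers such as 'GO:0003476' or 'gi|163483'.

    Only the two separators seen on the slide ('|' and ':') are handled.
    """
    for separator in ("|", ":"):
        namespace, found, local_id = text.partition(separator)
        if found and namespace and local_id:
            return EntityRef(namespace, local_id)
    raise ValueError(f"no namespace separator in {text!r}")


go_term = parse_identifier("GO:0003476")
genbank_record = parse_identifier("gi|163483")
```

The pair says what the data is (a GO term, a GenBank record) without saying where it is stored or how it is represented.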
13
Object ontology
Data types are defined in an open, shared, GO-like ontology:
– GO is used as a model because of its familiarity in the community.
– Nodes define data classes.
– Edges define the relationships between classes.
Edges define one of three relationships:
– ISA: inheritance relationship. All properties of the parent are present in the child.
– HASA: container relationship of exactly 1.
– HAS: container relationship with 1 or more.
[Figure: graph with a labelled node and edge]
14
The simplest MOBY data-type
Object
The combination of a namespace and an identifier within that namespace uniquely identifies a data entity, not its location(s) or its representation.
15
Primitive data-types
[Figure: nodes Object, Integer, String, Float and DateTime, connected by ISA edges; example value 38]
16
Derived data-types
[Figure: nodes Object, Integer, String and Virtual Sequence, connected by ISA and HASA edges; example value 38]
17
Derived data-types
[Figure: nodes Object, Integer, String, Virtual Sequence and Generic Sequence, connected by ISA and HASA edges; example values 38 and ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC]
18
Derived data-types
[Figure: nodes Object, Integer, String, Virtual Sequence, Generic Sequence and DNA Sequence, connected by ISA and HASA edges; example values 38 and ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC]
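The ISA and HASA relationships in these figures can be sketched as a small Python data structure. This is an illustration under assumptions, not the BioMOBY implementation: the exact edges are read off the figure labels (Virtual Sequence HASA Integer, Generic Sequence HASA String), and the member names `Length` and `SequenceString` are assumed for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DataType:
    """One node of a toy object ontology (hypothetical model)."""
    name: str
    isa: Optional["DataType"] = None  # single parent (ISA edge)
    hasa: Dict[str, "DataType"] = field(default_factory=dict)  # exactly-one members

    def is_a(self, other: "DataType") -> bool:
        """Walk the ISA chain upwards looking for `other`."""
        node: Optional["DataType"] = self
        while node is not None:
            if node is other:
                return True
            node = node.isa
        return False

    def all_members(self) -> Dict[str, str]:
        """Members declared here plus everything inherited through ISA."""
        inherited = self.isa.all_members() if self.isa else {}
        own = {member: dtype.name for member, dtype in self.hasa.items()}
        return {**inherited, **own}


OBJECT = DataType("Object")
INTEGER = DataType("Integer", isa=OBJECT)
STRING = DataType("String", isa=OBJECT)
# Member names below are assumptions made for this sketch.
VIRTUAL_SEQUENCE = DataType("VirtualSequence", isa=OBJECT, hasa={"Length": INTEGER})
GENERIC_SEQUENCE = DataType(
    "GenericSequence", isa=VIRTUAL_SEQUENCE, hasa={"SequenceString": STRING}
)
DNA_SEQUENCE = DataType("DNASequence", isa=GENERIC_SEQUENCE)
```

Because ISA carries all properties of the parent to the child, `DNA_SEQUENCE.all_members()` reports both the length and the sequence string even though the DNA sequence type declares no members of its own.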
19
Legacy file formats
Containing “String” allows us to define ontological classes that represent legacy data-types.
Example (BLAST report):
TBLASTN 2.0.4 [Feb-24-1998]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.
Query= gi|1401126 (504 letters)
Database: Non-redundant GenBank+EMBL+DDBJ+PDB sequences
336,723 sequences; 677,679,054 total letters
Searching...done
Sequences producing significant alignments: Score (bits), E Value
gb|U49928|HSU49928 Homo sapiens TAK1 binding protein (TAB1) mRNA... 1009 0.0
emb|Z36985|PTPP2CMR P.tetraurelia mRNA for protein phosphatase t... 58 4e-07
20
Binaries – pictures, movies, …
We base64-encode binaries, and then define a hierarchy of data classes that contain String:
base64_encoded_jpeg ISA text/base64 ISA text/plain HASA String
Example content:
MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCC
Av4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
BAgTDFdlc3Rlcm4gQ2FwZTESMBAGA1UEBxMJQ2FwZSBUb3duMQ8wDQYDVQQKEwZUaGF3dGUx
HTAbBgNVBAsTFENlcnRpZmljYXRlIFNlcnZpY2VzMSgwJgYDVQQDEx9QZXJzb25hbCBGcmVl
bWFpbCBSU0EgMjAwMC44LjMwMB4XDTAyMDkxNTIxMDkwMVoXDTAzMDkxNTIxMDkwMVowQjEf
MB0GA1UEAxMWVGhhd3RlIEZyZWVtYWlsIE1lbWJlcjEfMB0GCSqGSIb3DQEJARYQamprM0Bt
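A minimal sketch of the encoding step, using Python's standard `base64` module. The helper names and the four sample bytes are assumptions for illustration; a real service would read the whole image file.

```python
import base64


def encode_binary(payload: bytes) -> str:
    """Return the base64 text that a String-containing data class would carry."""
    return base64.b64encode(payload).decode("ascii")


def decode_binary(text: str) -> bytes:
    """Recover the original bytes on the receiving side."""
    return base64.b64decode(text)


# Four bytes standing in for the start of a JPEG file (illustrative sample only).
sample = b"\xff\xd8\xff\xe0"
encoded = encode_binary(sample)
restored = decode_binary(encoded)
```

Since the result is plain ASCII text, it can sit inside an XML message like any other String value.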
21
Extending legacy data-types
With legacy data-types defined, we can extend them as we see fit:
– annotated_jpeg ISA base64_encoded_jpeg
– annotated_jpeg HASA 2D_Coordinate_set
– annotated_jpeg HASA Description
Example values (XML markup lost in extraction):
3554
663
This is the phenotype of a ufo-1 mutant under long daylength, 16°C
MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCC
Av4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
22
Additional information
Information Blocks make it possible to include additional information in the objects:
– Cross Reference Information Blocks (CRIB)
– Provision Information Blocks (PIB)
Example values (XML markup lost in extraction):
3554
663
This is the phenotype of a ufo-1 mutant under long daylength, 16°C
MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIIJQDCC
Av4wggJnoAMCAQICAwhH9jANBgkqhkiG9w0BAQQFADCBkjELMAkGA1UEBhMCWkExFTATBgNV
23
Cross Reference Information Blocks (CRIB)
The content of the CRIB may include only two types of element:
– A base MOBY Object ('Object' class): a cross-referenced piece of data.
– An Xref-type cross-reference: an object plus a service which could be executed in order to interpret the meaning of the piece of data.
Structure (surrounding markup lost in extraction):
... one or more cross-references ...
<Xref namespace='' id='' authURI='' serviceName='' evidenceCode='' xrefType=''>... Description ...</Xref>
24
Cross Reference Information Blocks (CRIB)
namespace and id: fulfil the same role as in the Object-style cross-reference.
authURI and serviceName: together act as a unique identifier of a particular MOBY Service that the current service provider suggests you execute with this cross-reference (namespace/id) in order to interpret its meaning correctly.
xrefType:
– Should get its value from the Cross-Reference-Type Ontology, which defines a variety of semantic relationships that may exist between cross-references and the Objects that contain them. This ontology doesn't exist yet.
– For now, xrefType values are free-form strings.
evidenceCode: indicates the 'quality' of the evidence that was used to make the cross-reference assertion. It is a term from the GO evidence codes list:
– IC: Inferred by Curator
– IDA: Inferred from Direct Assay
– …
25
Cross Reference Information Blocks (CRIB)
<moby:Xref moby:namespace="EMBL" moby:id="X112345" authURI="www.illuminae.com" serviceName="getEMBLRecord" evidenceCode="IEA" xrefType="transform"/>
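As a sketch of how a client or service might produce such an element, the following Python builds the same Xref with the standard `xml.etree.ElementTree` module. The attribute values come from the slide; the namespace URI bound to the `moby` prefix and the `build_xref` helper are assumptions, since the slide does not show the `xmlns:moby` declaration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the "moby" prefix; not shown on the slide.
MOBY_NS = "http://www.biomoby.org/moby"
ET.register_namespace("moby", MOBY_NS)


def build_xref(namespace, entity_id, auth_uri, service_name, evidence_code, xref_type):
    """Build an Xref-style cross-reference element (hypothetical helper)."""
    attributes = {
        f"{{{MOBY_NS}}}namespace": namespace,
        f"{{{MOBY_NS}}}id": entity_id,
        "authURI": auth_uri,
        "serviceName": service_name,
        "evidenceCode": evidence_code,
        "xrefType": xref_type,
    }
    return ET.Element(f"{{{MOBY_NS}}}Xref", attributes)


xref = build_xref(
    "EMBL", "X112345", "www.illuminae.com", "getEMBLRecord", "IEA", "transform"
)
xref_xml = ET.tostring(xref, encoding="unicode")
```

Reading the element back, `authURI` plus `serviceName` name the service to run, and `moby:namespace` plus `moby:id` name the entity to run it on.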
26
Provision Information Blocks (PIB)
Contain metadata concerning the service that was invoked:
– database version, software version, execution time
– additional parameters used to invoke the service, ...
In the current MOBY API, the content of these elements is only loosely defined, and is meant primarily to be human-readable.
Structure (markup lost in extraction):
... one or more of the provision elements (below) ... comment here
27
Service ontology
Simple ISA hierarchy.
The primitive types include the following, but the list can be modified:
– Analysis
– Parsing
– Registration
– Retrieval
– Resolution
– Conversion
– Rendering
28
Service ontology
[Figure: ISA hierarchy with nodes Service, Analysis, Parsing, Alignment, Blast, NCBI_Blast, WU_Blast and Parse_NCBI_Blast]
29
Outline
The problem
The BioMOBY idea
BioMOBY ontologies
How BioMOBY works
Message exchanges
BioMOBY elements
30
How BioMOBY works
1) Service development
2) Service publication
3) Service discovery
4) Service request
5) Service response
Technologically, BioMOBY services are general Web Services.
31
How BioMOBY works
BioMOBY defines a new layer on the protocol stack in order to work with its ontology.
BioMOBY has its own messaging structure for registration, detection and invocation of services.
32
Outline
The problem
The BioMOBY idea
BioMOBY ontologies
How BioMOBY works
Message exchanges
BioMOBY elements
33
Client-Provider interaction
[Figure: protocol stack and message layout]
Stack labels: TCP/IP, HTTP, SOAP, WSDL, Moby.
Layer roles: network, XML message, service description, biological data.
The call carries 1 input parameter containing the full XML BioMOBY input and 1 output parameter containing the full XML BioMOBY output.
Each message holds BioMOBY service requests 0, 1, …, N; each request is drawn with primary articles (simples / collections) and secondary articles.
34
Client → Provider messages
Several service requests are packed into one invocation:
– BioMOBY service request 0
– BioMOBY service request 1
– …
– BioMOBY service request N
35
Provider → Client messages
Several service responses are packed into one invocation response:
– BioMOBY service response 0
– BioMOBY service response 1
– …
– BioMOBY service response N
36
Elemental requests/responses
[XML example not recoverable; surviving text: … param_value …]
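Since the XML on these message slides did not survive, here is a hedged Python sketch of what an invocation carrying several requests could look like. The element and attribute names (`MOBY`, `mobyContent`, `mobyData`, `queryID`, `Simple`, `Parameter`, `Value`, `articleName`) and the namespace URI are assumptions based on a general recollection of the MOBY-S message format; none of them can be confirmed from this transcript. Only the overall shape follows the slides: one envelope, requests 0..N, primary articles plus secondary parameters.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI and element names; see the note above.
MOBY_NS = "http://www.biomoby.org/moby"
ET.register_namespace("moby", MOBY_NS)


def qname(tag):
    """Qualify a tag or attribute name with the assumed MOBY namespace."""
    return f"{{{MOBY_NS}}}{tag}"


def build_invocation(requests):
    """Pack several service requests into one message.

    `requests` is a list of (namespace, id, secondary_parameters) tuples.
    """
    root = ET.Element(qname("MOBY"))
    content = ET.SubElement(root, qname("mobyContent"))
    for number, (namespace, entity_id, parameters) in enumerate(requests):
        data = ET.SubElement(content, qname("mobyData"), {qname("queryID"): f"q{number}"})
        # Primary article: a simple wrapping a base Object.
        simple = ET.SubElement(data, qname("Simple"), {qname("articleName"): "input"})
        ET.SubElement(
            simple, qname("Object"), {qname("namespace"): namespace, qname("id"): entity_id}
        )
        # Secondary articles: named parameters with a value each.
        for name, value in parameters.items():
            parameter = ET.SubElement(data, qname("Parameter"), {qname("articleName"): name})
            ET.SubElement(parameter, qname("Value")).text = str(value)
    return root


message = build_invocation([
    ("gi", "163483", {"format": "fasta"}),
    ("GO", "0003476", {}),
])
query_ids = [data.get(qname("queryID")) for data in message.iter(qname("mobyData"))]
```

The query identifiers let the provider return responses 0..N in a single reply and still tell the client which response answers which request.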
37
Global service information
Global service information block: serviceNotes
[XML example; surviving text: Free text Service Notes …]
38
Error handling
Extension of the global service information block (serviceNotes).
[XML example; surviving text: code, message, Free text Service Notes]
Severity levels:
– error: fatal error in the service.
– warning: the service detects an error or potential problem but continues.
– information: non-erroneous informative message.
References to the offending input:
– (optional) refers to the queryID of the offending input mobyData.
– (optional) refers to the article of the offending input simple or collection.
39
Error handling: example response
[XML example; surviving values: code 600, message "Unable to execute the service", notes "Free text Service Notes"]
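A client has to inspect the serviceNotes block before trusting the output. The Python sketch below parses an exception block and checks for fatal errors. The tag and attribute names in `SAMPLE` (`mobyException`, `severity`, `refQueryID`, `refElement`, `exceptionCode`, `exceptionMessage`, `Notes`) are assumptions, because the markup was lost from these slides; only the values 600, "Unable to execute the service" and "Free text Service Notes" and the three severity levels come from the source.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical markup: element and attribute names are assumed, values are from the slide.
SAMPLE = """<serviceNotes>
  <mobyException severity="error" refQueryID="q0">
    <exceptionCode>600</exceptionCode>
    <exceptionMessage>Unable to execute the service</exceptionMessage>
  </mobyException>
  <Notes>Free text Service Notes</Notes>
</serviceNotes>"""


@dataclass
class ServiceException:
    severity: str                # "error", "warning" or "information"
    code: int
    message: str
    ref_query_id: Optional[str]  # offending input mobyData, if reported
    ref_element: Optional[str]   # offending simple or collection, if reported


def parse_exceptions(xml_text: str) -> List[ServiceException]:
    """Collect every exception entry found in a serviceNotes block."""
    root = ET.fromstring(xml_text)
    return [
        ServiceException(
            severity=entry.get("severity", "error"),
            code=int(entry.findtext("exceptionCode", default="0")),
            message=entry.findtext("exceptionMessage", default=""),
            ref_query_id=entry.get("refQueryID"),
            ref_element=entry.get("refElement"),
        )
        for entry in root.iter("mobyException")
    ]


def has_fatal_error(exceptions: List[ServiceException]) -> bool:
    """Only 'error' is fatal; warnings and information let the service continue."""
    return any(entry.severity == "error" for entry in exceptions)


exceptions = parse_exceptions(SAMPLE)
```

The optional references matter when one invocation carries several requests: they tell the client which request, and which article within it, caused the problem.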
40
Outline
The problem
The BioMOBY idea
BioMOBY ontologies
How BioMOBY works
Message exchanges
BioMOBY elements
– MOBY-Central
– Client side
– Server side
41
BioMOBY Elements
42
The Registry: Moby Central
[Figure: worldwide distribution of MOBY services]
The Moby project provides Moby Central as a Perl server.
It is a directory of services and data-types, and of how to locate them.
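To illustrate why an ontology-aware registry helps discovery, here is a toy in-memory lookup in Python. It is not the Moby Central API: the service names, the `ISA` table and both functions are hypothetical, and the sketch assumes that a service registered for a parent data-type also accepts its children, which follows from the ISA definition (all properties of the parent are present in the child).

```python
# Toy ISA table for data-types (child -> parent); names are illustrative.
ISA = {
    "DNASequence": "GenericSequence",
    "GenericSequence": "VirtualSequence",
    "VirtualSequence": "Object",
}

# Hypothetical registrations: (service name, input data-type).
SERVICES = [
    ("getLength", "VirtualSequence"),
    ("translate", "DNASequence"),
    ("describe", "Object"),
]


def ancestors(datatype):
    """Return the data-type followed by every parent up the ISA chain."""
    chain = [datatype]
    while chain[-1] in ISA:
        chain.append(ISA[chain[-1]])
    return chain


def find_services(datatype):
    """List services whose declared input is the data-type or one of its parents."""
    accepted = set(ancestors(datatype))
    return [name for name, input_type in SERVICES if input_type in accepted]


for_generic = find_services("GenericSequence")
for_dna = find_services("DNASequence")
```

A client holding a DNA sequence therefore finds services written for any sequence, without those services having been registered for DNA explicitly.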
43
Client Side
There are different kinds of clients.
Some of them allow the creation of workflows.
Programmatic libraries: …
44
Client Side: MOWServ
Web-browser-based client.
Discovery of services based on the data type ontology or on the service type ontology.
It makes it easy to connect service outputs to service inputs.
The interface helps with constructing Moby objects.
45
Client Side: MOWServ
[Screenshot: data types and service ontologies]
46
Client Side: MOWServ
1) Ontology browsing & service selection
2) Input submission
3) Output name selection
4) Service submission
5) Check execution status
6) Check results
47
Client Side: MOWServ
[Screenshot labels]
– List of available services for this data-type object
– Integrated HTML visualizer
– Raw XML visualizer
– Download MOBY object
48
Client Side: Taverna
Java-based graphical integrated workbench.
It allows the construction of complex distributed workflows.
It can handle different kinds of services (Moby and others).
49
Client Side: Taverna
[Screenshot labels: Processors = Web services; Inputs; Outputs]
50
Client Side: Dashboard
1) Select client execution tab
2) Select service to execute
3) Fill in the input
4) Execute service
5) Check output
51
Client comparison
Taverna | MOWServ | Dashboard
Easy to build workflows | Hard to build workflows | No workflow support
Discovery of services based on providers | Discovery of services based on ontology |
Secondary inputs cannot be modified | Secondary inputs can be modified | Secondary inputs can be modified
Java program | Web browser access | Java program
52
Server Side
Moby provides libraries that make service development easier on different platforms and in different languages (Perl and Java).
These libraries provide an abstraction of the underlying protocols.
The developer does not need to handle internet connections or SOAP messages and can concentrate on the biological problem.
53
Server Side: Steps for Developing MOBY services
1) Design the MOBY Objects for the inputs/outputs of your service. Register them if they don’t exist.
2) Choose the MOBY Service Type for your service. Register it if it doesn’t exist.
3) Choose the MOBY Namespaces that your service will use. Register them if they don’t exist.
4) Construct your MOBY Service.
5) Register your MOBY Service.
6) Test your MOBY Service as a client (discover and execute it).
54
Workflows over Grid-based Web services
General framework and a practical case in structural biology
References
55
References
BioMOBY homepage:
– http://www.biomoby.org/
– All the tools and libraries are downloadable via CVS.
Tutorial on INB Technologies (MSc on Bioinformatics for Health Sciences, Universitat Pompeu Fabra):
– http://genome.imim.es/courses/INB2006/index.html
PlaNet Workshop:
– http://mips.gsf.de/projects/plants/PlaNetPortal/workshop/index.html
Taverna:
– http://taverna.sourceforge.net/
MOWServ:
– http://www.inab.org/MOWServ/
Dashboard (as part of jMoby):
– http://biomoby.open-bio.org/CVS_CONTENT/moby-live/Java/docs/