Wrapping analytical services for caBIG
Taverna-caGrid technical review meeting
Stian Soiland-Reyes, myGrid
University of Manchester, UK
Agenda
- Project overview
- Primary goals
- Service selection
- Services identified
- Architecture
- Service outputs
- UML model
- Template workflow
- Work so far
- Implementation plan
Project overview
- Taverna caGrid cooperation
  - Taverna workbench enhancements for caGrid
  - Grid-enabling analytical services
  - caGrid security support for Taverna
- This presentation deals with the analytical services
Primary goals
- Identify two publicly available analytical web services currently accessible through Taverna
- caGrid-enable the services; semantically described using caBIG’s infrastructure
- Demonstrate building of workflows combining the new services with existing caBIG services
Service selection
- Selected services in collaboration with the caGrid Workflow working group, led by Juli
- Winners:
  - NCBI Blast hosted by EBI
  - InterProScan hosted by EBI
Why these services?
- Freely available
- Highly reliable, hosted by EBI
- Widely used by the scientific community
- Can be combined with existing caBIG tools in biologically meaningful workflows
  - caBIO, GridPIR, etc.
Services identified: NCBI Blast
- A popular similarity search tool using local sequence alignment
- Supports sequences of proteins, DNA, RNA
- Searches sequences in a whole range of databases
  - SWISSPROT, UNIPROT, NCBI, EMBL, etc.
- SOAP web service hosted by EMBL-EBI
Services identified: InterProScan
- Integrates various databases of protein domains and functional sites
- Searches using protein signature recognition methods
- SOAP web service hosted by EMBL-EBI
Architecture
Architecture as pseudo code

    class CaGridClient:
        def main():
            endpointReference = wrappedService.invoke(inputs)
            endpointReference.subscribe()

        def resourcePropertyChanged():
            outputs = endpointReference.getResourceProperty()
            print "Result", outputs

    class WrappedService:
        def invoke(inputs):
            convertedInputs = dataConverter.convertFromCaGrid(inputs)
            jobId = serviceInvoker.invoke(convertedInputs)
            endpointReference = new EndpointReference(jobId)
            return endpointReference

        def outputReturned(jobId, outputs):
            convertedOutputs = dataConverter.convertToCaGrid(outputs)
            endpointReference.setResourceProperty(convertedOutputs)

    class ServiceInvoker:
        def invoke(convertedInputs):
            jobId = originalService.invoke(convertedInputs)
            return jobId
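The pseudo code above can be turned into a small runnable Python sketch of the same asynchronous pattern: the wrapped service returns a handle immediately, and the client is notified once the result has been stored on it. This is only an illustration under stated assumptions, not the Introduce-generated implementation. The in-memory EndpointReference, the background thread, the converter behaviour and fake_original_service are all hypothetical stand-ins for the WS-Resource machinery and the remote SOAP call.

```python
import threading
import uuid


class EndpointReference:
    """Stand-in for a WS-Resource handle holding one resource property."""

    def __init__(self, job_id):
        self.job_id = job_id
        self._lock = threading.Lock()
        self._value = None
        self._is_set = False
        self._listeners = []

    def subscribe(self, listener):
        # Register the listener, or call it at once if the result already arrived.
        with self._lock:
            if not self._is_set:
                self._listeners.append(listener)
                return
        listener(self)

    def set_resource_property(self, value):
        with self._lock:
            self._value = value
            self._is_set = True
            listeners = list(self._listeners)
        for listener in listeners:
            listener(self)

    def get_resource_property(self):
        with self._lock:
            return self._value


class DataConverter:
    """Hypothetical conversions between caGrid-style and native data."""

    def convert_from_cagrid(self, inputs):
        return {"sequence": inputs["sequence"].strip().upper()}

    def convert_to_cagrid(self, outputs):
        return {"result": outputs}


class ServiceInvoker:
    """Runs the original service in a background thread and reports back."""

    def __init__(self, original_service, on_output):
        self._original_service = original_service
        self._on_output = on_output

    def invoke(self, converted_inputs):
        job_id = str(uuid.uuid4())
        threading.Thread(target=self._run, args=(job_id, converted_inputs)).start()
        return job_id

    def _run(self, job_id, converted_inputs):
        self._on_output(job_id, self._original_service(converted_inputs))


class WrappedService:
    def __init__(self, original_service):
        self._converter = DataConverter()
        self._invoker = ServiceInvoker(original_service, self.output_returned)
        self._references = {}
        self._lock = threading.Lock()

    def invoke(self, inputs):
        converted = self._converter.convert_from_cagrid(inputs)
        # Hold the lock so a fast job cannot report before its reference exists.
        with self._lock:
            job_id = self._invoker.invoke(converted)
            reference = EndpointReference(job_id)
            self._references[job_id] = reference
        return reference

    def output_returned(self, job_id, outputs):
        with self._lock:
            reference = self._references[job_id]
        reference.set_resource_property(self._converter.convert_to_cagrid(outputs))


def fake_original_service(inputs):
    # Hypothetical stand-in for the remote SOAP service.
    return "hits for " + inputs["sequence"]


results = []
done = threading.Event()


def resource_property_changed(reference):
    results.append(reference.get_resource_property())
    done.set()


service = WrappedService(fake_original_service)
endpoint_reference = service.invoke({"sequence": " mkt "})
endpoint_reference.subscribe(resource_property_changed)
done.wait(timeout=5)
print("Result", results)
```

One design point the sketch makes explicit: the client subscribes only after invoke has returned, so a fast job could finish first. Here subscribe calls the listener immediately in that case; how the real WS-Resource notification handles this is not covered by the slides.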
Output InterProScan (Untranslated)
xsi:noNamespaceSchemaLoca.. /Header>
<protein id="unipro
<interpro id="IPR008197" name="Whey acidic protein, 4-disulphide core" type="Domain" parent_id="IPR015874">
Molecular Function protease inhibitor activity
<match id="G3DSA: " name="Whey_acidic_protein_4-diS_core" dbname="GENE3D">
<location start="77" end="128" score=" E-5" status="T" evidence="Gene3D" />
<location start="30" end="72" score=" E-5" status="T" evidence="HMMPfam" />
<location start="79" end="126" score=" E-14" status="T" evidence="HMMPfam" />
<interpro id="IPR008198" name="Proteinase inhibitor I17" type="Domain" parent_id="IPR008197">...
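To illustrate what a data converter has to pull out of this untranslated output, here is a minimal Python sketch that reads the interpro, match and location elements with the standard library. The SAMPLE document is a cut-down, hypothetical fragment modelled on the excerpt above: the protein and match ids are placeholders, the score attribute is left out because it is truncated on the slide, and the real output has additional wrapping elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; ids marked EXAMPLE are placeholders, not real data.
SAMPLE = """
<protein id="EXAMPLE">
  <interpro id="IPR008197" name="Whey acidic protein, 4-disulphide core"
            type="Domain" parent_id="IPR015874">
    <match id="EXAMPLE_MATCH" name="Whey_acidic_protein_4-diS_core" dbname="GENE3D">
      <location start="77" end="128" status="T" evidence="Gene3D" />
    </match>
  </interpro>
</protein>
"""


def extract_locations(xml_text):
    """Return (interpro id, database name, start, end) for every match location."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for interpro in root.iter("interpro"):
        for match in interpro.findall("match"):
            for location in match.findall("location"):
                rows.append((
                    interpro.get("id"),
                    match.get("dbname"),
                    int(location.get("start")),
                    int(location.get("end")),
                ))
    return rows


for row in extract_locations(SAMPLE):
    print(row)
```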
UML model: wrapped InterproScan
UML model: wrapped NCBIBlast
Template workflow
- EBI_dbfetch_fetchBatch will be replaced with the caBIG service caBIO
- This workflow uses both NCBIBlast and InterproScan, which will be replaced with the wrapped services
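The idea of swapping individual workflow steps can be sketched in Python by writing the workflow against plain callables. This is an assumption-heavy illustration, not the Taverna workflow itself: the data flow (fetched sequences passed to both analyses) is a guess, and every stub function below is hypothetical.

```python
def run_workflow(ids, fetch, blast, interproscan):
    """Fetch sequences for the ids, then run both analyses on each sequence."""
    results = []
    for sequence in fetch(ids):
        results.append({
            "sequence": sequence,
            "blast": blast(sequence),
            "interproscan": interproscan(sequence),
        })
    return results


# Hypothetical stubs standing in for the real services.
def stub_dbfetch(ids):
    return ["SEQ_FOR_" + identifier for identifier in ids]


def stub_cabio(ids):
    # Replacement data source with the same interface as stub_dbfetch.
    return ["SEQ_FOR_" + identifier for identifier in ids]


def stub_blast(sequence):
    return "blast(" + sequence + ")"


def stub_interproscan(sequence):
    return "interproscan(" + sequence + ")"


before = run_workflow(["P1"], stub_dbfetch, stub_blast, stub_interproscan)
after = run_workflow(["P1"], stub_cabio, stub_blast, stub_interproscan)
print(before == after)
```

Because the workflow depends only on the call signatures, replacing a step leaves the rest of the workflow untouched, provided the replacement accepts and returns the same kind of data.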
Work so far
- Identified services and example workflow
- Described services (Deliverable 3.2)
- Modelled service inputs and outputs in UML according to caGrid guidelines
  - Still a few tweaks needed for WS-Resource usage
- Architecture and implementation plan for wrapping services (Deliverable 3.3)
  - JavaDoc needs updating for WS-Resource
Implementation plan
- Generate Common Data Elements for inputs and outputs and verify Silver compatibility
- Generate semantically annotated XMIs
- Submit Silver compatibility review package
- Implement and deploy wrapped services
  - Using Introduce and possibly gRavi
  - Implement, test, deploy
  - We’ll start with this before submitting CDEs
- Build caGrid-based workflow using services
Any questions?