1
New EMBOSS Web Service Shaun McGlinchey (shaun@ebi.ac.uk)
2
Outline The presentation will discuss the challenges encountered in exposing the EMBOSS suite of command line sequence analysis tools as a ‘stateful’ SOAP based web service. An overview of the proposed framework for client-side requests, server-side job submission and results delivery will then be given.
3
What is EMBOSS? EMBOSS is "The European Molecular Biology Open Software Suite". What can I use EMBOSS for? It consists of approximately 300 command-line applications covering areas such as: sequence alignment; rapid database searching with sequence patterns; protein motif identification, including domain analysis; phylogenetic analysis; and presentation tools for publication.
4
What is JAX-WS? In the words of Sun: Java API for XML Web Services (JAX-WS) is the centerpiece of a newly rearchitected API stack for Web services, the so-called "integrated stack" that includes JAX-WS 2.0, JAXB 2.0, and SAAJ 1.3. Essentially, it is a SOAP toolkit for Java. It is the renamed successor to JAX-RPC. It brings clear improvements in data binding through its tight integration with JAXB (Java Architecture for XML Binding).
5
Current State of (old) EBI EMBOSS Web Service The current server-side implementation is Perl-based. Sample clients are available for .NET, SOAP::Lite and Java (Axis). The service currently accepts free text as data input, which means weak typing and poor validation capability. It supports both synchronous and asynchronous job submission; asynchronous requests are allocated a job id. Migrating to a Java-based JAX-WS server-side implementation gives us more control over the generated artifacts, increased data validation capabilities, and the ability to rapidly improve the functionality provided.
6
EMBOSS Data Types There are 52 datatypes (at the last count) used within the EMBOSS suite of applications. These fall under five headings: 1. Simple – Array, Boolean, Integer, String … 2. Input – Codon, Features, Sequence, Seqall … 3. Selection Lists – List, Selection … 4. Output – Align, Report, Seqout … 5. Graphics – Graph, Xygraph
7
EMBOSS Qualifiers An EMBOSS command-line program accepts an application name plus qualifiers, each of which has a datatype: water -asequence tsw:hba_human -bsequence tsw:hbb_human (water sequence seqall). -asequence is of datatype Sequence, -bsequence of datatype Seqall. Qualifiers have associated qualifiers, which can also be passed on the command line to enable advanced configuration of the application call: -sbegin, -send, -sformat.
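The mapping from qualifiers to a command line can be sketched in a few lines of Java. This is only an illustration: the class `QualifierCommandLine` and its `build` method are hypothetical names, not part of EMBOSS or the EBI service, and the sketch assumes every qualifier is a simple name/value pair.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of EMBOSS or the EBI service.
class QualifierCommandLine {

    /** Builds an argument list such as [water, -asequence, tsw:hba_human, ...]. */
    static List<String> build(String application, Map<String, String> qualifiers) {
        List<String> args = new ArrayList<>();
        args.add(application);
        for (Map.Entry<String, String> q : qualifiers.entrySet()) {
            args.add("-" + q.getKey()); // qualifier name, e.g. -asequence
            args.add(q.getValue());     // qualifier value, e.g. tsw:hba_human
        }
        return args;
    }

    public static void main(String[] argv) {
        // LinkedHashMap keeps insertion order, so an associated qualifier
        // (-sbegin) stays next to the qualifier it was added after.
        Map<String, String> qualifiers = new LinkedHashMap<>();
        qualifiers.put("asequence", "tsw:hba_human");
        qualifiers.put("sbegin", "10");
        qualifiers.put("bsequence", "tsw:hbb_human");
        System.out.println(String.join(" ", build("water", qualifiers)));
    }
}
```

Keeping the arguments as a list rather than one concatenated string means they can later be handed to a process launcher without passing through a shell.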
8
General, Additional & Advanced Qualifiers General qualifiers are common to all EMBOSS applications: -auto true turns off prompts (boolean datatype); -stdout true writes to standard output (boolean datatype).
9
Web Service Development In accordance with the Technology Recommendation we have chosen the Top-Down approach to WS development, not Bottom-Up. Top-Down Approach to WS Development: express data types in schema; write WSDL (include schema); generate artifacts (JavaBeans data objects, server-side stubs, implementation class).
10
Top-Down Approach to WS Development Top-Down: express data types in schema; write WSDL (include schema); generate artifacts (JavaBeans data objects, server-side stubs, implementation class); package (WAR file); deploy WAR file to server.
11
Sample EMBOSS Application Schema (Head) <definitions targetNamespace="emboss" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.ebi.ac.uk/ws/emboss/water/"> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.ebi.ac.uk/ws/emboss/applications/water/" xmlns:jxb="http://java.sun.com/xml/ns/jaxb" jxb:version="1.0">
12
Application Schema – Custom Bindings (cont’d)
13
Express Application Parameters
14
Express asequenceQualifiers ……
15
Encapsulate all data types inside an application element
16
Using JAXB Generated Java Beans at the client side JavaBean objects are generated for the client using the JAX-WS ‘wsimport’ tool, which compiles the WSDL and schema. Generated objects are populated using setters (client-side), e.g. Sequence asequence = new Sequence(); asequence.setUsa("tsw:hba_human"); asequenceQual.setSprotein(true); asequenceQual.setSbegin(0);
17
EMBOSS Applications (300) Manually creating the schema for each application is not scalable. Maven is a software project management and build tool. We have written an EMBOSS ACD parser plugin (a Java class) for our Maven WS software build. It takes EMBOSS application definitions (ACD) as input and outputs an XML Schema and WSDL representing each EMBOSS application. These schemas are passed to a JAXB compiler, which generates our JavaBean objects.
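As a rough sketch of the idea, and not the actual Maven plugin, the Java below reads a heavily simplified ACD fragment and emits one schema element per qualifier. The class name `AcdToSchemaSketch`, the `tns:` type naming and the output layout are all assumptions made for illustration; real ACD files carry many more attributes than this regular expression looks at.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a toy ACD reader, not the plugin described on the slide.
class AcdToSchemaSketch {

    // Matches entries of the form "datatype: name [" at the start of a line.
    private static final Pattern ENTRY =
            Pattern.compile("(?m)^\\s*(\\w+):\\s*(\\w+)\\s*\\[");

    static String toSchemaFragment(String acd) {
        StringBuilder elements = new StringBuilder();
        String application = "unknown";
        Matcher m = ENTRY.matcher(acd);
        while (m.find()) {
            String datatype = m.group(1);
            String name = m.group(2);
            if (datatype.equals("application")) {
                application = name;                 // e.g. water
            } else if (!datatype.equals("section")) {
                // Assumed convention: the schema type is named after the ACD datatype.
                elements.append("    <xsd:element name=\"").append(name)
                        .append("\" type=\"tns:").append(datatype).append("\"/>\n");
            }
        }
        return "<xsd:complexType name=\"" + application + "\">\n"
                + "  <xsd:sequence>\n" + elements + "  </xsd:sequence>\n"
                + "</xsd:complexType>\n";
    }

    public static void main(String[] argv) {
        String acd = "application: water [\n"
                + "  documentation: \"Smith-Waterman local alignment\"\n"
                + "]\n"
                + "sequence: asequence [\n  parameter: \"Y\"\n]\n"
                + "seqall: bsequence [\n  parameter: \"Y\"\n]\n";
        System.out.print(toSchemaFragment(acd));
    }
}
```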
18
Advantages of WS EMBOSS Software Build The advantages of this approach are: we can auto-generate the XML schema and application WSDLs; we can generate Java objects for use on the client side; and we can easily integrate new EMBOSS applications as a WS by running the ACD file through our software build.
19
Generated Artifacts
20
Why go to these lengths? Because of the sheer number of EMBOSS apps, it is necessary to provide a clear means of representing the invocation of separate applications and the passing of parameters appropriate to each app.
******* CLIENT SIDE CODE **********
RunEmbossRequest run = new RunEmbossRequest();
EmbossParams water = new EmbossParams();
water.setAsequence(asequence);
water.setBsequence(bsequence);
Emboss emboss = new Emboss();
emboss.setApplication(EmbossApplication.WATER);
emboss.setApplicationParams(water);
run.setEmbossParams(emboss);
WSEmbossService service = new WSEmbossService();
WSEmboss wsemboss = service.getWSEmboss();
RunEmbossResponse response = wsemboss.run(run);
21
Server-side – Reverse Process At the server side, values are obtained from the de-serialised objects using the Java getter methods, e.g.
******* SERVER-SIDE CODE **********
Emboss emboss = input.getEmbossParams();
EmbossApplication embossApp = emboss.getApplication();
String appname = embossApp.value();
EmbossParams water = emboss.getApplicationParams();
Sequence asequence = water.getAsequence();
Seqall bsequence = water.getBsequence();
This solution does not scale well across some 300 applications.
22
How do we get from a Web Service payload to a valid command line? We are looking at the possibility of developing a generic mechanism that uses XSL (Extensible Stylesheet Language) to transform the SOAP envelope (our WS inputs: water params etc.) into a form that can be used to invoke the EMBOSS binary (application).
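One way such a generic transformation might look is sketched below using the XSLT support built into the JDK (`javax.xml.transform`). The payload shape (a root element named after the application, with one child element per qualifier) and the class name `PayloadToCommandLine` are assumptions for illustration; a real SOAP envelope would first need its body element extracted, and every value would need validating before use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative sketch: one stylesheet serves every application.
class PayloadToCommandLine {

    // Emits the root element name, then " -child value" for each child element.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/*\">"
          + "<xsl:value-of select=\"local-name()\"/>"
          + "<xsl:for-each select=\"*\">"
          + "<xsl:text> -</xsl:text><xsl:value-of select=\"local-name()\"/>"
          + "<xsl:text> </xsl:text><xsl:value-of select=\".\"/>"
          + "</xsl:for-each>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    static String transform(String payloadXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(payloadXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not transform payload", e);
        }
    }

    public static void main(String[] argv) {
        String payload = "<water><asequence>tsw:hba_human</asequence>"
                + "<bsequence>tsw:hbb_human</bsequence></water>";
        System.out.println(transform(payload));
    }
}
```

Because the stylesheet only looks at element names, the same transformation applies to any application payload that follows the assumed shape, which is the scaling property the hand-written getter code lacks.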
23
Understanding our Job Submission Requirements Building a valid and secure command line (approx 300 EMBOSS applications). Issuing the command line (300 applications). Retrieving results from the EMBOSS application. Our WS job submission should fulfil the EMBRACE Technology Recommendations of being a ‘Stateful Web Service’ and implementing both synchronous and asynchronous functionality. Synchronous: submit a job and remain locked in to that application until it returns a result. Asynchronous: submit a job but retain a free hand (not locked in); we can poll the service with a jobid to obtain job status and results.
24
Operations to support requirement of ‘Stateful’ WS RunJob: e.g. runJob(water); all parameters for the job are encapsulated in the water object. The operation returns a jobid. CancelJob: e.g. cancelJob("water12"); cancels the job execution. GetStatus: e.g. getStatus("water12"); returns one of Waiting, Scheduled, Running, Done, Cancelled, Aborted. GetResult: e.g. getResult("water12"); retrieves the result of a job, given an identifier.
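The four operations can be pictured with a small in-memory sketch. Everything here is hypothetical: the class `JobServiceSketch`, the job id format and the `complete` method (a stand-in for the back end finishing a job) are illustrative choices, and a real service would delegate to a job manager rather than hold state in maps.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative in-memory model of the stateful operations; not the real service.
class JobServiceSketch {

    enum Status { WAITING, SCHEDULED, RUNNING, DONE, CANCELLED, ABORTED }

    private final Map<String, Status> statuses = new HashMap<>();
    private final Map<String, String> results = new HashMap<>();
    private int counter = 0;

    /** Registers a job and returns its id, e.g. "water1". */
    synchronized String runJob(String application) {
        String jobId = application + (++counter);
        statuses.put(jobId, Status.WAITING);
        return jobId;
    }

    /** Stand-in for the back end reporting that a job has finished. */
    synchronized void complete(String jobId, String output) {
        results.put(jobId, output);
        statuses.put(jobId, Status.DONE);
    }

    /** Cancels a job that has not yet reached a final state. */
    synchronized boolean cancelJob(String jobId) {
        Status s = statuses.get(jobId);
        if (s == null || s == Status.DONE || s == Status.CANCELLED || s == Status.ABORTED) {
            return false;
        }
        statuses.put(jobId, Status.CANCELLED);
        return true;
    }

    synchronized Status getStatus(String jobId) {
        Status s = statuses.get(jobId);
        if (s == null) {
            throw new IllegalArgumentException("Unknown job id: " + jobId);
        }
        return s;
    }

    synchronized String getResult(String jobId) {
        if (getStatus(jobId) != Status.DONE) {
            throw new IllegalStateException("Job not finished: " + jobId);
        }
        return results.get(jobId);
    }

    public static void main(String[] argv) {
        JobServiceSketch service = new JobServiceSketch();
        String jobId = service.runJob("water");
        System.out.println(jobId + " " + service.getStatus(jobId));
        service.complete(jobId, "alignment output");
        System.out.println(service.getResult(jobId));
    }
}
```

An asynchronous client would call `runJob`, then poll `getStatus` with the returned id until the job reaches a final state before calling `getResult`.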
25
Do we have to reinvent the wheel? – Enter OMII We propose borrowing established technology as one possible solution to our requirements. Recently (this week) I met with the Software Group Leader at OMII, the Open Middleware Infrastructure Institute, based at the University of Southampton (www.omii.ac.uk). OMII is an established GRID middleware service provider and is very keen to have real users (developers using their products). OMII design GRID-related software products.
26
What can they offer us? We are interested in their GridSAM product. GridSAM consists of several subsystems that support: pluggable job persistence (if your job fails, it will be retried); job queuing and launching; job monitoring (pending, staging in, active, executed, staging out, job completed).
27
GridSAM cont’d File staging (stage in input files, stage out output files). All this functionality is available through an API, the JobManager interface, providing us with rich job submission functionality at little cost. Typically this functionality will be invoked from within the embedding application (the web service) using the API.
28
How do I pass my job content to the GridSAM server? Jobs are launched by passing a JSDL (Job Submission Description Language) document to the GridSAM server from a GridSAM client using the JobManager API. All of this can exist underneath your web service layer. Opportunity for a shared EMBRACE server perhaps!
29
Sample JSDL [XML markup lost in extraction; surviving values: namespaces http://schemas.ggf.org/jsdl/2005/11/jsdl and http://schemas.ggf.org/jsdl/2005/11/jsdl-posix, executable /bin/echo]
30
Very good! – What about the EMBOSS WS? As mentioned, we propose to transform the EMBOSS WS payloads (SOAP messages) at runtime into valid JSDL documents to be submitted to GridSAM. GridSAM looks promising! We will use the EMBOSS WS as a test bed. If successful, we may make a recommendation to WP3.
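A minimal sketch of producing such a JSDL document from an executable and its arguments is given below, using only the JDK's DOM and transformer APIs. The element names come from the JSDL 1.0 POSIX application extension; the class name `JsdlSketch` is hypothetical, and whether GridSAM needs further elements (for example file staging) is not covered here.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative sketch: builds a bare JSDL job description for one command line.
class JsdlSketch {

    static final String JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl";
    static final String POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix";

    static String toJsdl(String executable, List<String> arguments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element definition = doc.createElementNS(JSDL, "JobDefinition");
            doc.appendChild(definition);
            Element description = child(doc, definition, JSDL, "JobDescription");
            Element application = child(doc, description, JSDL, "Application");
            Element posix = child(doc, application, POSIX, "POSIXApplication");
            child(doc, posix, POSIX, "Executable").setTextContent(executable);
            for (String argument : arguments) {
                // One Argument element per command-line token; no shell quoting involved.
                child(doc, posix, POSIX, "Argument").setTextContent(argument);
            }

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build JSDL document", e);
        }
    }

    private static Element child(Document doc, Element parent, String ns, String name) {
        Element element = doc.createElementNS(ns, name);
        parent.appendChild(element);
        return element;
    }

    public static void main(String[] argv) {
        System.out.println(toJsdl("/bin/echo", Arrays.asList("hello")));
    }
}
```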
31
Thank you for listening!