1 New EMBOSS Web Service Shaun McGlinchey


1 New EMBOSS Web Service Shaun McGlinchey (shaun@ebi.ac.uk)

2 Outline This presentation discusses the challenges encountered in exposing the EMBOSS suite of command-line sequence analysis tools as a 'stateful', SOAP-based web service. It then gives an overview of the proposed framework for client-side requests, server-side job submission and results delivery.

3 What is EMBOSS? EMBOSS is "The European Molecular Biology Open Software Suite". What can I use EMBOSS for? It consists of approximately 300 command-line applications covering areas such as: sequence alignment; rapid database searching with sequence patterns; protein motif identification, including domain analysis; phylogenetic analysis; presentation tools for publication.

4 What is JAX-WS? In the words of Sun: the Java API for XML Web Services (JAX-WS) "is the centerpiece of a newly rearchitected API stack for Web services, the so-called 'integrated stack' that includes JAX-WS 2.0, JAXB 2.0, and SAAJ 1.3." Essentially, it is a SOAP toolkit for Java. It is the renamed successor to JAX-RPC. It brings clear improvements in data binding capabilities through its tight integration with JAXB, the Java Architecture for XML Binding.

5 Current State of (old) EBI EMBOSS Web Service The current server-side implementation is Perl-based. Sample clients are available for .NET, SOAP::Lite and Java (Axis). It currently accepts free text as data input, which means weak typing and poor validation capability. It supports both synchronous and asynchronous job submission; asynchronous requests are allocated a job id. Migrating to a Java-based JAX-WS server-side implementation gives us more control over the generated artifacts, increased data validation capabilities, and the ability to rapidly improve the functionality provided.

6 EMBOSS Data Types There are 52 datatypes (at the last count) used within the EMBOSS suite of applications. These fall under five headings: 1. Simple – Array, Boolean, Integer, String …; 2. Input – Codon, Features, Sequence, Seqall …; 3. Selection Lists – List, Selection …; 4. Output – Align, Report, Seqout …; 5. Graphics – Graph, Xygraph.

7 EMBOSS Qualifiers An EMBOSS command-line program accepts an application name plus qualifiers, each of which has a datatype: water -asequence tsw:hba_human -bsequence tsw:hbb_human (water sequence seqall). Here -asequence is of datatype Sequence and -bsequence is of datatype Seqall. Each qualifier has associated qualifiers, which can also be passed on the command line to enable advanced configuration of the application call: -sbegin, -send, -sformat.
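The mapping from qualifiers to a command line can be sketched in Java as below. This is a minimal illustration rather than the service's actual code: the class EmbossCommandLine, its build method and the name check are hypothetical. Keeping the result as a list of arguments, instead of one concatenated string, means it could be handed to java.lang.ProcessBuilder without going through a shell.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns an application name plus qualifiers into an argument list. */
final class EmbossCommandLine {

    /** Builds e.g. [water, -asequence, tsw:hba_human, -bsequence, tsw:hbb_human]. */
    static List<String> build(String application, Map<String, String> qualifiers) {
        List<String> args = new ArrayList<>();
        args.add(application);
        for (Map.Entry<String, String> q : qualifiers.entrySet()) {
            String name = q.getKey();
            // Assumption for this sketch: qualifier names are lower-case identifiers.
            if (!name.matches("[a-z][a-z0-9_]*")) {
                throw new IllegalArgumentException("Bad qualifier name: " + name);
            }
            args.add("-" + name);      // qualifier name, e.g. -asequence
            args.add(q.getValue());    // qualifier value, e.g. tsw:hba_human
        }
        return args;
    }

    public static void main(String[] argv) {
        // LinkedHashMap keeps the qualifiers in insertion order.
        Map<String, String> qualifiers = new LinkedHashMap<>();
        qualifiers.put("asequence", "tsw:hba_human");
        qualifiers.put("bsequence", "tsw:hbb_human");
        qualifiers.put("auto", "true");
        System.out.println(String.join(" ", build("water", qualifiers)));
    }
}
```

Running main prints the water call from the slide with the general qualifier -auto true appended.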

8 General, Additional & Advanced Qualifiers General qualifiers are common to all EMBOSS applications: -auto true turns off prompts (boolean datatype); -stdout true writes to standard output (boolean datatype).

9 Web Service Development In accordance with the Technology Recommendation, we have chosen the Top-Down approach to WS development, not Bottom-Up. Top-Down Approach to WS Development: express data types in schema; write WSDL (include schema); generate artifacts (JavaBeans as data objects, server-side stubs, implementation class).

10 Top-Down Approach to WS Development Top-Down: express data types in schema; write WSDL (include schema); generate artifacts (JavaBeans as data objects, server-side stubs, implementation class); package (WAR file); deploy WAR file to server.

11 Sample EMBOSS Application Schema (Head) <definitions targetNamespace="emboss" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.ebi.ac.uk/ws/emboss/water/"> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.ebi.ac.uk/ws/emboss/applications/water/" xmlns:jxb="http://java.sun.com/xml/ns/jaxb" jxb:version="1.0">

12 Application Schema – Custom Bindings (cont’d)

13 Express Application Parameters

14 Express asequenceQualifiers ……

15 Encapsulate all data types inside an application element

16 Using JAXB Generated Java Beans at the client side Java Bean objects are generated for the client using the JAX-WS 'wsimport' tool, which compiles the WSDL and schema. The generated objects are populated using setters (client-side), i.e. Sequence asequence = new Sequence(); asequence.setUsa("tsw:hba_human"); asequenceQual.setSprotein(true); asequenceQual.setSbegin(0);

17 EMBOSS Applications (300) Manually creating the schema for each application is not scalable. Maven is a software project management and build tool. We have written an EMBOSS ACD parser plugin (a Java class) for our Maven WS software build. It takes EMBOSS application definitions (ACD) as input and outputs the XML Schema and WSDL representing each EMBOSS application. These schemas are passed to a JAXB compiler, which generates our Java Bean objects.
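To give a feel for the ACD-to-schema step, here is a rough sketch and not the plugin itself. It assumes a simplified ACD layout in which each qualifier is declared on a line of the form "datatype: name [", and it emits one xsd:element per qualifier. The class name AcdToSchemaSketch and the tns: type names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: derives xsd:element declarations from simplified ACD text. */
final class AcdToSchemaSketch {

    // Matches declaration lines such as "sequence: asequence [" (datatype, then name).
    private static final Pattern DECL = Pattern.compile("^\\s*(\\w+):\\s*(\\w+)\\s*\\[");

    static List<String> elements(String acd) {
        List<String> out = new ArrayList<>();
        for (String line : acd.split("\\R")) {
            Matcher m = DECL.matcher(line);
            if (!m.find()) {
                continue; // attribute lines such as parameter: "Y" do not match
            }
            String datatype = m.group(1);
            if (datatype.equals("application") || datatype.equals("section")) {
                continue; // structural blocks, not qualifiers
            }
            // Assumption: each EMBOSS datatype has a same-named schema type in tns.
            out.add("<xsd:element name=\"" + m.group(2)
                    + "\" type=\"tns:" + datatype + "\"/>");
        }
        return out;
    }

    public static void main(String[] argv) {
        String acd = String.join("\n",
                "application: water [",
                "  documentation: \"Smith-Waterman local alignment\"",
                "]",
                "sequence: asequence [",
                "  parameter: \"Y\"",
                "]",
                "seqall: bsequence [",
                "  parameter: \"Y\"",
                "]");
        for (String element : elements(acd)) {
            System.out.println(element);
        }
    }
}
```

A real parser would also need to handle sections, qualifier attributes and the associated qualifiers, which this sketch ignores.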

18 Advantages of WS EMBOSS Software Build The advantages of this approach are that we can auto-generate the XML schema and application WSDLs, generate Java objects for use on the client side, and easily integrate new EMBOSS applications as a WS by running the ACD file through our software build.

19 Generated Artifacts

20 Why go to these lengths? Because of the sheer number of EMBOSS apps, it is necessary to provide a clear means of representing the invocation of separate applications and the passing of parameters appropriate to each app. ******* CLIENT SIDE CODE ********** RunEmbossRequest run = new RunEmbossRequest(); EmbossParams water = new EmbossParams(); water.setAsequence(asequence); water.setBsequence(bsequence); Emboss emboss = new Emboss(); emboss.setApplication(EmbossApplication.WATER); emboss.setApplicationParams(water); run.setEmbossParams(emboss); WSEmbossService service = new WSEmbossService(); WSEmboss wsemboss = service.getWSEmboss(); RunEmbossResponse response = wsemboss.run(run);

21 Server-side – Reverse Process At the server side, values are obtained from the de-serialised objects using the Java getter methods, i.e. ******* SERVER-SIDE CODE ********** Emboss emboss = input.getEmbossParams(); EmbossApplication embossApp = emboss.getApplication(); String appname = embossApp.value(); EmbossParams water = emboss.getApplicationParams(); Sequence asequence = water.getAsequence(); Seqall bsequence = water.getBsequence(); This solution does not scale well, because every application needs its own hand-written getter calls.

22 How do we get from a Web Service payload to a valid command line? We are looking at the possibility of developing a generic mechanism to transform the SOAP envelope (our WS inputs, such as the Water params) using XSL (Extensible Stylesheet Language) into a form that can be used to invoke the EMBOSS binary (application).
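One way such a generic mechanism might look is sketched below using the javax.xml.transform API that ships with the JDK. The payload is a deliberately simplified, hypothetical stand-in for the real SOAP body (no envelope, no namespaces), and the stylesheet and the class name PayloadToCommandLine are illustrative assumptions. The point is that one stylesheet handles any application, because it walks whatever child elements are present instead of calling per-application getters.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: XSLT turns a simplified payload into a command-line string. */
final class PayloadToCommandLine {

    // Simplified payload; a real SOAP message would carry an envelope and namespaces.
    static final String PAYLOAD =
          "<emboss application=\"water\">"
        + "<asequence>tsw:hba_human</asequence>"
        + "<bsequence>tsw:hbb_human</bsequence>"
        + "</emboss>";

    // Emits the application name, then " -name value" for every child element.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/emboss\">"
        + "<xsl:value-of select=\"@application\"/>"
        + "<xsl:for-each select=\"*\">"
        + "<xsl:text> -</xsl:text><xsl:value-of select=\"name()\"/>"
        + "<xsl:text> </xsl:text><xsl:value-of select=\".\"/>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] argv) {
        System.out.println(transform(PAYLOAD, STYLESHEET));
    }
}
```

The values would still need validating before anything is executed, since the stylesheet copies them through unchanged.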

23 Understanding our Job Submission Requirements Building a valid and secure command line (approx 300 EMBOSS applications). Issuing the command line (300 applications). Retrieving results from the EMBOSS application. Our WS job submission should fulfil the EMBRACE Technology Recommendations of being a 'Stateful Web Service' and implementing both synchronous and asynchronous functionality. Synchronous: submit a job and stay locked in to that application until it returns a result. Asynchronous (not synchronised): submit a job but retain a free hand (not locked in); we can poll the service with a jobid to obtain job status and results.

24 Operations to support the requirement of a 'Stateful' WS RunJob: e.g. runJob(water); all parameters for the job are encapsulated in the water object, and the operation returns a jobid. CancelJob: e.g. cancelJob("water12"); cancels the job execution. GetStatus: e.g. getStatus("water12"); returns one of Waiting, Scheduled, Running, Done, Cancelled, Aborted. GetResult: e.g. getResult("water12"); retrieves the result of a job, given an identifier.
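The four operations can be illustrated with a small in-memory sketch built on java.util.concurrent. This is not the proposed implementation: the class JobStore is hypothetical, job state lives only in memory, and only a subset of the statuses is modelled (queued and running jobs are both reported as RUNNING).

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory job store offering run, status, result and cancel by job id. */
final class JobStore {

    enum Status { RUNNING, DONE, CANCELLED, ABORTED }

    // Daemon threads so a forgotten shutdown() does not keep the JVM alive.
    private final ExecutorService pool = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
    private final Map<String, Future<String>> jobs = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    /** Asynchronous submission: returns a job id immediately, e.g. "water1". */
    String runJob(String application, Callable<String> work) {
        String jobId = application + counter.incrementAndGet();
        jobs.put(jobId, pool.submit(work));
        return jobId;
    }

    Status getStatus(String jobId) {
        Future<String> job = lookup(jobId);
        if (job.isCancelled()) return Status.CANCELLED;
        if (!job.isDone()) return Status.RUNNING;
        try {
            job.get();              // already finished, so this does not block
            return Status.DONE;
        } catch (ExecutionException e) {
            return Status.ABORTED;  // the job threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.ABORTED;
        }
    }

    /** Blocks until the job finishes, which gives synchronous behaviour when needed. */
    String getResult(String jobId) {
        try {
            return lookup(jobId).get();
        } catch (ExecutionException e) {
            throw new IllegalStateException("Job failed: " + jobId, e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted waiting for " + jobId, e);
        }
    }

    boolean cancelJob(String jobId) {
        return lookup(jobId).cancel(true);
    }

    void shutdown() {
        pool.shutdownNow();
    }

    private Future<String> lookup(String jobId) {
        Future<String> job = jobs.get(jobId);
        if (job == null) throw new IllegalArgumentException("Unknown job id: " + jobId);
        return job;
    }

    public static void main(String[] argv) {
        JobStore store = new JobStore();
        String id = store.runJob("water", () -> "alignment output");
        System.out.println(id + " -> " + store.getResult(id) + " (" + store.getStatus(id) + ")");
        store.shutdown();
    }
}
```

A client that calls runJob and then polls getStatus is using the service asynchronously; calling getResult straight away gives the synchronous behaviour.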

25 Do we have to reinvent the wheel? – Enter OMII We propose borrowing established technology as one possible solution to our requirements. Recently (this week) I met with the Software Group Leader at OMII, the Open Middleware Infrastructure Institute, based at the University of Southampton (www.omii.ac.uk). OMII is an established GRID middleware service provider and is very keen to have real users (developers using their products). OMII designs GRID-related software products.

26 What can they offer us? We are interested in their GridSAM product. GridSAM consists of several subsystems that support: pluggable job persistence (if your job fails, it will be retried); job queuing and launching; job monitoring (pending, staging in, active, executed, staging out, job completed).

27 GridSAM cont’d File staging (stage in input files, stage out output files). All this functionality is available through an API, the JobManager interface, providing us with rich job submission functionality at little cost. Typically this functionality will be invoked from within the embedding application (the web service) using the API.

28 How do I pass my job content to the GridSAM server? Jobs are launched by passing a JSDL (Job Submission Description Language) document to the GridSAM server from a GridSAM client using the JobManager API. All of this can exist underneath your web service layer. Opportunity for a shared EMBRACE server, perhaps!

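29 Sample JSDL (XML markup lost in extraction; only these values survive) Namespaces: http://schemas.ggf.org/jsdl/2005/11/jsdl and http://schemas.ggf.org/jsdl/2005/11/jsdl-posix. Executable, inside an Application element: /bin/echo.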

30 Very good! – What about the EMBOSS WS? As mentioned, we propose to transform the EMBOSS WS payloads (SOAP messages) at runtime into valid JSDL documents to be submitted to GridSAM. GridSAM looks promising! We will use the EMBOSS WS as a test bed. If successful, we may make a recommendation to WP3.
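As an illustration of the target of that transformation, the sketch below builds a minimal JSDL document for an executable and its arguments using the DOM and transformer APIs in the JDK. The element names follow the JSDL 1.0 and JSDL POSIX application schemas; the class JsdlBuilder is hypothetical, and nothing here calls GridSAM itself, since its JobManager API is not shown in the slides.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch: wraps an executable and its arguments in a minimal JSDL document. */
final class JsdlBuilder {

    static final String JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl";
    static final String POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix";

    static String build(String executable, List<String> arguments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // JobDefinition > JobDescription > Application > POSIXApplication
            Element definition = doc.createElementNS(JSDL, "JobDefinition");
            doc.appendChild(definition);
            Element description = doc.createElementNS(JSDL, "JobDescription");
            definition.appendChild(description);
            Element application = doc.createElementNS(JSDL, "Application");
            description.appendChild(application);
            Element posix = doc.createElementNS(POSIX, "POSIXApplication");
            application.appendChild(posix);

            Element exe = doc.createElementNS(POSIX, "Executable");
            exe.setTextContent(executable);
            posix.appendChild(exe);
            for (String argument : arguments) {
                Element arg = doc.createElementNS(POSIX, "Argument");
                arg.setTextContent(argument); // text is escaped on serialisation
                posix.appendChild(arg);
            }

            // Identity transform serialises the DOM tree to a string.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build JSDL", e);
        }
    }

    public static void main(String[] argv) {
        // The executable name is illustrative; a deployment would use the installed path.
        System.out.println(build("water",
                Arrays.asList("-asequence", "tsw:hba_human", "-bsequence", "tsw:hbb_human")));
    }
}
```

Passing each qualifier and value as a separate Argument element keeps the job description free of shell quoting concerns.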

31 Thank you for listening!

