
New EMBOSS Web Service

New EMBOSS Web Service. Shaun McGlinchey (shaun@ebi.ac.uk). Outline.


  1. New EMBOSS Web Service Shaun McGlinchey (shaun@ebi.ac.uk)

  2. Outline • The presentation will discuss the challenges encountered in exposing the EMBOSS suite of command line sequence analysis tools as a ‘stateful’ SOAP-based web service. An overview of the proposed framework for client-side requests, server-side job submission and results delivery will then be given.

  3. What is EMBOSS? EMBOSS is "The European Molecular Biology Open Software Suite". What can I use EMBOSS for? • Consists of approx 300 command line applications covering areas such as: • Sequence alignment • Rapid database searching with sequence patterns • Protein motif identification, including domain analysis • Phylogenetic analysis • Presentation tools for publication

  4. What is JAX-WS? • In the words of Sun: Java API for XML Web Services (JAX-WS) is the centerpiece of a newly rearchitected API stack for Web services, the so-called "integrated stack" that includes JAX-WS 2.0, JAXB 2.0, and SAAJ 1.3. • Essentially a SOAP toolkit for Java • It is the renamed successor to JAX-RPC • It brings clear improvements in data binding capabilities through its tight integration with JAXB – Java Architecture for XML Binding

  5. Current State of (old) EBI EMBOSS Web Service • The current server-side implementation is Perl-based. Sample clients are available in .NET, SOAP::Lite and Java (Axis). • Currently accepts free text as data input – weak typing – poor validation capability • Supports both synchronous and asynchronous job submission. • Asynchronous requests are allocated a job id • Migrating to a Java-based JAX-WS server-side implementation gives us more control over the generated artifacts, increases data validation capabilities and lets us rapidly improve on the functionality provided.

  6. EMBOSS Data Types • There are 52 datatypes (at the last count) used within the EMBOSS suite of applications. These fall under five headings: • Simple – Array, Boolean, Integer, String … • Input – Codon, Features, Sequence, Seqall … • Selection Lists – List, Selection … • Output – Align, Report, Seqout … • Graphics – Graph, Xygraph

  7. EMBOSS Qualifiers • An EMBOSS command line program accepts an application name + qualifiers (each of which is a datatype): • water -asequence tsw:hba_human -bsequence tsw:hbb_human (water sequence seqall) • -asequence is of datatype Sequence, -bsequence of datatype Seqall • Qualifiers have associated qualifiers, which can also be passed on the command line to enable advanced configuration of the application call. • -sbegin, -send, -sformat
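
The mapping from an application name plus qualifiers to a command line can be sketched in Java as below. This is a minimal illustration, not the service's actual code: the class `EmbossCommandLine` and its `build` method are hypothetical names, and the only validation shown is a conservative check on application and qualifier names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of EMBOSS): turns a name plus qualifiers into an argument list. */
final class EmbossCommandLine {

    /** Returns one list element per argument, e.g. [water, -asequence, tsw:hba_human]. */
    static List<String> build(String application, Map<String, String> qualifiers) {
        // Assumption: application and qualifier names are lower-case alphanumerics only.
        if (!application.matches("[a-z0-9]+")) {
            throw new IllegalArgumentException("Unexpected application name: " + application);
        }
        List<String> args = new ArrayList<>();
        args.add(application);
        for (Map.Entry<String, String> q : qualifiers.entrySet()) {
            if (!q.getKey().matches("[a-z0-9]+")) {
                throw new IllegalArgumentException("Unexpected qualifier name: " + q.getKey());
            }
            args.add("-" + q.getKey());  // qualifier name, e.g. -asequence
            args.add(q.getValue());      // qualifier value, kept as a single argument
        }
        return args;
    }

    public static void main(String[] argv) {
        Map<String, String> qualifiers = new LinkedHashMap<>();  // preserves insertion order
        qualifiers.put("asequence", "tsw:hba_human");
        qualifiers.put("bsequence", "tsw:hbb_human");
        qualifiers.put("auto", "true");
        // prints: water -asequence tsw:hba_human -bsequence tsw:hbb_human -auto true
        System.out.println(String.join(" ", build("water", qualifiers)));
    }
}
```

Keeping each argument as a separate list element, rather than concatenating a shell string, means the list could be handed to something like `ProcessBuilder` without a shell ever interpreting the values.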

  8. General, Additional & Advanced Qualifiers • General are common to all EMBOSS applications • -auto true - Turn off prompts (boolean datatype) • -stdout true - Write standard output (boolean)

  9. Web Service Development • In accordance with the Technology Recommendation we have chosen a Top-Down approach to WS Development, not Bottom-Up. • Top-Down Approach to WS Development • Express data types in schema • Write WSDL (include schema) • Generate Artifacts (JavaBeans – data objects, server-side stubs, implementation class)

  10. Top-Down Approach to WS Development • Top-Down • Express data types in schema • Write WSDL (include schema) • Generate Artifacts (JavaBeans – data objects, server-side stubs, implementation class) • Package (WAR file) • Deploy WAR file to server

  11. Sample EMBOSS Application Schema (Head) <?xml version="1.0" encoding="UTF-8"?> <definitions targetNamespace="emboss" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <types> <xsd:schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.ebi.ac.uk/ws/emboss/water/"> <?xml version="1.0" encoding="UTF-8"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.ebi.ac.uk/ws/emboss/applications/water/" xmlns:jxb="http://java.sun.com/xml/ns/jaxb" jxb:version="1.0">

  12. Application Schema – Custom Bindings (cont’d) <xsd:annotation> <xsd:appinfo> <jxb:schemaBindings> <jxb:package name="uk.ac.ebi.ws.emboss.applications.water"> </jxb:package> </jxb:schemaBindings> </xsd:appinfo> </xsd:annotation>

  13. Express Application Parameters <xsd:element name="asequence" type="tns:asequence"/> <xsd:complexType name="asequence"> <xsd:sequence> <xsd:element name="asequence" type="xsd:string" nillable="false"/> <xsd:element name="asequenceQualifiers" type="tns:asequenceQualifiers" nillable="true"/> </xsd:sequence> </xsd:complexType>

  14. Express asequenceQualifiers <xsd:element name="asequenceQualifiers" type="tns:asequenceQualifiers"/> <xsd:complexType name="asequenceQualifiers"> <xsd:sequence> <xsd:element name="sbegin" type="xsd:integer"/> <xsd:element name="send" type="xsd:integer"/> <xsd:element name="usa" type="xsd:string"/> …… </xsd:sequence> </xsd:complexType>

  15. Encapsulate all data types inside an application element <xsd:element name="water" type="tns:water"/> <xsd:complexType name="water"> <xsd:sequence> <xsd:element name="asequence" type="tns:asequence"/> <xsd:element name="bsequence" type="tns:bsequence"/> <xsd:element name="datafile" type="xsd:string"/> </xsd:sequence> </xsd:complexType>

  16. Using JAXB Generated Java Beans at the client side • Java Bean objects are generated for the client using the JAX-WS ‘wsimport’ tool, which compiles the WSDL + schema • Generated objects are populated using setters (client-side), i.e. Sequence asequence = new Sequence(); asequence.setUsa("tsw:hba_human"); asequenceQual.setSprotein(true); asequenceQual.setSbegin(0);

  17. EMBOSS Applications (300) • Manually creating the schema for each application is not scalable • Maven is a software project management & build tool. • We have written an EMBOSS ACD parser plugin for our Maven WS Software Build • Java class • Takes EMBOSS application definitions (ACD) as input • Outputs XML Schema and WSDL representing each EMBOSS application • These schemas are passed to a JAXB compiler, which generates our Java Bean objects
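
The ACD-to-schema step can be illustrated with a small Java sketch. It is not the Maven plugin itself: the class name `AcdToSchemaSketch`, the simplified ACD fragment and the datatype-to-XSD mapping are all assumptions made for the example. It scans for `datatype: name [` headers and emits one `xsd:element` per qualifier, using a `tns:` type named after the qualifier for anything that is not a simple type.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: derive xsd:element declarations from a simplified ACD fragment. */
final class AcdToSchemaSketch {

    // Matches headers of the form "datatype: name [" at the start of a line.
    private static final Pattern HEADER = Pattern.compile("(?m)^\\s*(\\w+):\\s*(\\w+)\\s*\\[");

    // Illustrative mapping only; the real build would have to cover every EMBOSS datatype.
    private static final Map<String, String> SIMPLE_TYPES = new LinkedHashMap<>();
    static {
        SIMPLE_TYPES.put("boolean", "xsd:boolean");
        SIMPLE_TYPES.put("integer", "xsd:integer");
        SIMPLE_TYPES.put("string", "xsd:string");
    }

    static String toSchemaElements(String acd) {
        StringBuilder xsd = new StringBuilder();
        Matcher m = HEADER.matcher(acd);
        while (m.find()) {
            String datatype = m.group(1);
            String name = m.group(2);
            if (datatype.equals("application") || datatype.equals("section")) {
                continue;  // structural entries, not qualifiers
            }
            // Complex datatypes get a type in the target namespace named after the qualifier.
            String type = SIMPLE_TYPES.getOrDefault(datatype, "tns:" + name);
            xsd.append("<xsd:element name=\"").append(name)
               .append("\" type=\"").append(type).append("\"/>\n");
        }
        return xsd.toString();
    }

    public static void main(String[] args) {
        String acd = "application: water [\n  documentation: \"Local alignment\"\n]\n"
                   + "sequence: asequence [\n  parameter: \"Y\"\n]\n"
                   + "seqall: bsequence [\n  parameter: \"Y\"\n]\n"
                   + "boolean: brief [\n  default: \"Y\"\n]\n";
        System.out.print(toSchemaElements(acd));
    }
}
```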

  18. Advantages of WS EMBOSS Software Build • Advantages of this approach: • We can auto-generate XML schema and application WSDLs • We can generate Java objects for use on the client side • We can easily integrate new EMBOSS applications as a WS by running the ACD file through our software build

  19. Generated Artifacts

  20. Why go to these lengths? • Because of the sheer number of EMBOSS apps, it is necessary to provide a clear means of representing the invocation of separate applications and the passing of parameters appropriate to each app. ******* CLIENT SIDE CODE ********** RunEmbossRequest run = new RunEmbossRequest(); EmbossParams water = new EmbossParams(); water.setAsequence(asequence); water.setBsequence(bsequence); Emboss emboss = new Emboss(); emboss.setApplication(EmbossApplication.WATER); emboss.setApplicationParams(water); run.setEmbossParams(emboss); WSEmbossService service = new WSEmbossService(); WSEmboss wsemboss = service.getWSEmboss(); RunEmbossResponse response = wsemboss.run(run);

  21. Server-side – Reverse Process • At the server side, the de-serialised objects expose their values through the Java getter methods, i.e. ******* SERVER-SIDE CODE ********** Emboss emboss = input.getEmbossParams(); EmbossApplication embossApp = emboss.getApplication(); String appname = embossApp.value(); EmbossParams water = emboss.getApplicationParams(); Sequence asequence = water.getAsequence(); Seqall bsequence = water.getBsequence(); • This solution does not scale well

  22. How do we get from a Web Service payload to a valid command line? • We are looking at the possibility of developing a generic mechanism to transform the SOAP envelope (our WS inputs – Water params etc.) using XSL (Extensible Stylesheet Language) into a form that can be used to invoke the EMBOSS binary (application)
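
As a rough illustration of the XSL idea, the Java sketch below applies an inline stylesheet to a simplified payload using the standard `javax.xml.transform` API. The class name `PayloadToCommandLine`, the stylesheet and the flat payload shape (one child element per qualifier, no SOAP envelope) are assumptions for the example, not the proposed mechanism itself. The result is a flat string, so real qualifier values would still need validating before anything is executed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: XSLT turns <water><asequence>..</asequence>..</water> into a command line. */
final class PayloadToCommandLine {

    // Root element name becomes the application; each child becomes "-name value".
    private static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/*'>"
        + "<xsl:value-of select='local-name()'/>"
        + "<xsl:for-each select='*'>"
        + "<xsl:text> -</xsl:text><xsl:value-of select='local-name()'/>"
        + "<xsl:text> </xsl:text><xsl:value-of select='normalize-space(.)'/>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String payloadXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(payloadXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not transform payload", e);
        }
    }

    public static void main(String[] args) {
        String payload = "<water><asequence>tsw:hba_human</asequence>"
                       + "<bsequence>tsw:hbb_human</bsequence></water>";
        // prints: water -asequence tsw:hba_human -bsequence tsw:hbb_human
        System.out.println(transform(payload));
    }
}
```

Because the stylesheet works on element names rather than on generated classes, the same transformation would apply to any application whose payload follows this shape, which is the appeal of a generic mechanism over per-application getter code.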

  23. Understanding our Job Submission Requirements • Building a valid & secure command line (approx 300 EMBOSS applications) • Issuing the command line (300 applications) • Retrieving results from the EMBOSS application • Our WS Job Submission should fulfill the EMBRACE Technology recommendations of: • Being a ‘Stateful Web Service’ • Implementing both synchronous and asynchronous functionality • Synchronous – submit a job (locked in to that application until it returns a result) • Asynchronous (not synchronised) – submit a job but retain a free hand (not locked in) – we can poll the service with a jobid to obtain job status and results

  24. Operations to support requirement of ‘Stateful’ WS • RunJob: i.e. runJob(water); – all parameters for the job are encapsulated in the water object. Operation will return a jobid. • CancelJob: i.e. cancelJob("water12"); • This can be used to cancel the job execution • GetStatus: i.e. getStatus("water12"); • (Waiting, Scheduled, Running, Done, Cancelled, Aborted) • GetResult: i.e. getResult("water12"); • Retrieve the result of a job, given an identifier
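
The four operations can be sketched with an in-memory registry built on `java.util.concurrent`. This is only an illustration of the stateful pattern under stated assumptions: the class `JobRegistrySketch` is hypothetical, jobs are plain `Callable`s rather than EMBOSS processes, state is lost on restart, and the sketch does not distinguish Waiting or Scheduled from Running.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory sketch of runJob / getStatus / cancelJob / getResult. */
final class JobRegistrySketch {

    /** Simplification: queued jobs are reported as RUNNING too. */
    enum Status { RUNNING, DONE, CANCELLED, ABORTED }

    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<String, Future<String>> jobs = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    /** Submits the work and returns a job id such as "water1". */
    String runJob(String application, Callable<String> work) {
        String jobId = application + counter.incrementAndGet();
        jobs.put(jobId, pool.submit(work));
        return jobId;
    }

    Status getStatus(String jobId) {
        Future<String> job = lookup(jobId);
        if (job.isCancelled()) return Status.CANCELLED;
        if (!job.isDone()) return Status.RUNNING;
        try {
            job.get();                       // already finished, so this does not block
            return Status.DONE;
        } catch (ExecutionException e) {
            return Status.ABORTED;           // the job threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.RUNNING;
        }
    }

    boolean cancelJob(String jobId) {
        return lookup(jobId).cancel(true);   // interrupts the job if it is running
    }

    /** Blocks until the job finishes; failures surface as IllegalStateException. */
    String getResult(String jobId) {
        try {
            return lookup(jobId).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for " + jobId, e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Job " + jobId + " failed", e.getCause());
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    private Future<String> lookup(String jobId) {
        Future<String> job = jobs.get(jobId);
        if (job == null) throw new IllegalArgumentException("Unknown job id: " + jobId);
        return job;
    }

    public static void main(String[] args) {
        JobRegistrySketch registry = new JobRegistrySketch();
        String jobId = registry.runJob("water", () -> "alignment output");
        System.out.println(jobId + " -> " + registry.getResult(jobId));
        System.out.println(registry.getStatus(jobId));
        registry.shutdown();
    }
}
```

A synchronous call is then just `runJob` followed immediately by `getResult`, while an asynchronous client keeps the job id and polls `getStatus`.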

  25. Do we have to reinvent the wheel? – Enter OMII • We propose borrowing established technology as one possible solution to our requirements • Recently (this week) I met with the Software Group Leader at OMII – Open Middleware Infrastructure Institute, based at the University of Southampton – www.omii.ac.uk • OMII is an established GRID middleware service provider – very keen to have real users (developers using their products) • OMII designs GRID-related software products

  26. What can they offer us? • We are interested in their GridSAM product • GridSAM consists of several subsystems that support: • Pluggable job persistence (if your job fails, it will be retried) • Job Queuing, Launching • Job Monitoring • Pending, staging in, active, executed, staging out, job completed

  27. GridSAM cont’d • File Staging (stage in input files, stage out output files) • All this functionality is available through an API – JobManager Interface • Providing us with rich job submission functionality at little cost • Typically this functionality will be invoked from within the embedding Application – web service – using the API

  28. How do I pass my job content to the GridSAM Server? • Jobs are launched by passing a JSDL (Job Submission Description Language) document to the GridSAM server from a GridSAM client using the JobManager API • All of this can exist underneath your web service layer • Opportunity for a shared EMBRACE server perhaps!

  29. Sample JSDL <?xml version="1.0" encoding="UTF-8"?> <JobDefinition xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl"> <JobDescription> <Application> <POSIXApplication xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"> <Executable>/bin/echo</Executable> </POSIXApplication> </Application> </JobDescription> </JobDefinition>
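
A JSDL document of this shape could be produced from an executable and its arguments along the lines of the Java sketch below. The class `JsdlSketch` and the executable path in `main` are hypothetical, and the sketch only emits the `Executable` and `Argument` elements of the POSIX application extension; submitting the document through the GridSAM JobManager API is not shown.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: render an executable plus arguments as a minimal JSDL document. */
final class JsdlSketch {

    static final String JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl";
    static final String POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix";

    /** Escapes the characters that are significant in XML element content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String toJsdl(String executable, List<String> arguments) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
           .append("<JobDefinition xmlns=\"").append(JSDL_NS).append("\">\n")
           .append("  <JobDescription>\n")
           .append("    <Application>\n")
           .append("      <POSIXApplication xmlns=\"").append(POSIX_NS).append("\">\n")
           .append("        <Executable>").append(escape(executable)).append("</Executable>\n");
        for (String argument : arguments) {
            // One Argument element per command line token.
            xml.append("        <Argument>").append(escape(argument)).append("</Argument>\n");
        }
        xml.append("      </POSIXApplication>\n")
           .append("    </Application>\n")
           .append("  </JobDescription>\n")
           .append("</JobDefinition>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // The install path is an assumption made for the example.
        System.out.print(toJsdl("/usr/bin/water",
                Arrays.asList("-asequence", "tsw:hba_human", "-bsequence", "tsw:hbb_human")));
    }
}
```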

  30. Very good! – What about the EMBOSS WS? • As mentioned, we propose to transform the EMBOSS WS payloads (SOAP messages) at runtime into a valid JSDL document to be submitted to GridSAM • GridSAM looks promising! • We will use the EMBOSS WS as a test bed • If successful we may make a recommendation to WP3

  31. Thank you for listening!
