
CATH SOAP Web Services



Presentation Transcript


  1. CATH SOAP Web Services
     SWSScan: http://farmer2.biochem.ucl.ac.uk/cgi-bin/SWSScan.cgi

  2. What we want to offer
     • Access to the CATH algorithms
     • Access to CATH data
     • "Programmable" access to our own resources for a new automated update protocol
     Orengo Group, UCL

  3. Sequence Comparisons
     • BLAST query sequences against the complete CATH domain sequence database
     • Run Needleman-Wunsch (NW) alignments of query sequences against the complete CATH domain sequence database
     • Scan query sequences against the CATH SamHMM library
     • Scan query sequences against the CATH HMMer HMM library

  4. Structural Comparisons
     • SSAP is a residue-based pairwise comparison algorithm that scores structural similarity between chains/domains in PDB format
     • Cathedral uses graphs to identify the fold of the query structure and then confirms the fold assignment using SSAP
     • Both algorithms require additional files that can be derived from the original PDB on the fly

  5. Why SOAP?
     • It's simple
     • Allows management of our farms without changing the current configuration
     • A protocol being adopted by other FP6 projects such as BioSapiens

  6. [Architecture diagram. Labels: SWSScan, SOAP HTTP Layer, Perl Layer, SGE Farm Manager, and a farm of nodes each labelled "Animal"]

  7. SOAP Bits
     Client:
       #!/usr/local/bin/perl -w
       use SOAP::Lite;
       my $urn   = "SWSScan";
       my $proxy = "http://farmer2.biochem.ucl.ac.uk/cgi-bin/SWSScan.cgi";
       my $URI   = SOAP::Lite->uri("urn:$urn");
       my $hWebService = $URI->proxy($proxy);
       # $in holds the query data; "blast" selects the scan type.
       # A job id comes back, not the results themselves.
       my $job_id = $hWebService->submit_scan($in, "blast")->result;
     Server (CGI script):
       #!/usr/local/bin/perl -w
       use SOAP::Transport::HTTP;
       # Note: dispatch_to normally takes a module directory and/or module names.
       SOAP::Transport::HTTP::CGI
         ->dispatch_to("/usr/local/bin/cath/perl/SWSScan.pm")
         ->handle;

  8. Input formats
     • Accept as many formats as possible to make the service as usable as possible
     • Makes the web service, rather than the user, responsible for parsing data
     • Ultimately parses all input data into a standard data structure

  9. Input Formats
     • Scalar
         • Single or multiple sequence FASTA
         • Full or chain/domain PDB
         • Single or list of CATH chain/domain ids
         • Single or list of PDB codes
     • Array
         • As scalar in each element
     • Object
         • Standard data structure only
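
One way the scalar/array/object handling listed above could be sketched in Perl. This is an illustration only, not the SWSScan code: the helper names (`classify_scalar`, `normalise_input`), the `{ type, data }` record layout, and the rule "4 characters means a PDB code, anything longer a CATH id" are all assumptions made for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Hypothetical helper: guess what a single scalar contains.
# Returns a list of { type => ..., data => ... } records (assumed layout).
sub classify_scalar {
    my ($text) = @_;
    return { type => 'fasta', data => $text } if $text =~ /^>/;
    return { type => 'pdb',   data => $text } if $text =~ /^(?:HEADER|ATOM)\b/m;
    # Otherwise treat it as one or more identifiers separated by commas/whitespace.
    my @ids = grep { length } split /[\s,]+/, $text;
    # Assumption: 4-character ids are PDB codes, longer ones are CATH ids.
    return map {
        +{ type => (length($_) == 4 ? 'pdb_code' : 'cath_id'), data => $_ }
    } @ids;
}

# Hypothetical entry point: accept a scalar, an array of scalars,
# or an object (hash) that is already in the standard structure.
sub normalise_input {
    my ($in) = @_;
    my $kind = reftype($in) // '';
    return [ classify_scalar($in) ]             if $kind eq '';
    return [ map { classify_scalar($_) } @$in ] if $kind eq 'ARRAY';
    return [ $in ]                              if $kind eq 'HASH';
    die "Unsupported input type: $kind\n";
}

my $records = normalise_input([ ">query1\nMKTAYIAKQR", "1cuk, 12asA0" ]);
print join(",", map { $_->{type} } @$records), "\n";   # prints fasta,pdb_code,cath_id
```

Doing the detection on the server keeps the client call down to a single argument, which matches the stated aim of making the service, not the user, responsible for parsing.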

  10. SWSScan Output
     • A job id is returned rather than the results because
         • Apache has a timeout
         • The farm could be busy
         • The job could be big
     • Requires a monitoring call and a retrieval call
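
The submit / monitor / retrieve pattern described above might look roughly like this from the client side. Only `submit_scan` appears in the slides; the `job_status` and `get_results` method names, the status strings, and the `MockSWSScan` class (a stand-in so the sketch runs without the real service) are assumptions.

```perl
use strict;
use warnings;

# Stand-in for the remote service so the sketch runs offline (hypothetical).
package MockSWSScan;
sub new { return bless { jobs => {}, next_id => 1 }, shift }
sub submit_scan {
    my ($self, $in, $type) = @_;
    my $id = $self->{next_id}++;
    $self->{jobs}{$id} = { polls_left => 2, type => $type };
    return $id;
}
sub job_status {     # hypothetical monitoring call
    my ($self, $id) = @_;
    my $job = $self->{jobs}{$id} or return 'unknown';
    return $job->{polls_left}-- > 0 ? 'running' : 'done';
}
sub get_results {    # hypothetical retrieval call
    my ($self, $id) = @_;
    return { job_id => $id, matches => [] };
}

package main;

# Submit a job, then poll until it is done or we give up.
sub run_scan {
    my ($service, $in, $type, $max_polls) = @_;
    my $job_id = $service->submit_scan($in, $type);
    for my $attempt (1 .. $max_polls) {
        my $status = $service->job_status($job_id);
        return $service->get_results($job_id) if $status eq 'done';
        die "Job $job_id failed or is unknown\n" if $status ne 'running';
        # sleep 5;   # a real client would wait between polls
    }
    return;          # still running after $max_polls checks
}

my $scan = run_scan(MockSWSScan->new, ">q\nMKT", "blast", 5);
print defined($scan) ? "finished job $scan->{job_id}\n" : "still running\n";
```

In this sketch an undefined return means the job has not finished, while a returned object with an empty `matches` array means it finished without hits; keeping the monitoring call separate from the retrieval call is what makes that distinction possible.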

  11. SWSScan Output
     • The result is a structured data object: a results object holds query sequence objects in a keyed hash, and each query sequence object contains an array of matching sequence objects
     • All returned data, including any requested files, is contained in the object
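
As a rough picture of the nesting described above (results object, keyed hash of query sequence objects, array of matches per query), here is a small Perl sketch. The `queries` key and the two query ids are invented for illustration; the match id and e-value are the example values quoted elsewhere in the slides.

```perl
use strict;
use warnings;

# Assumed layout: results -> queries (keyed hash) -> query object -> matches (array).
my $results = {
    queries => {
        query1 => { id => 'query1', matches => [ { id => '12asA0', evalue => 1.9 } ] },
        query2 => { id => 'query2', matches => [] },
    },
};

# Count the matches held under each query id.
sub summarise {
    my ($res) = @_;
    my %count;
    for my $qid (sort keys %{ $res->{queries} }) {
        $count{$qid} = scalar @{ $res->{queries}{$qid}{matches} };
    }
    return \%count;
}

my $summary = summarise($results);
print "$_: $summary->{$_} match(es)\n" for sort keys %$summary;
```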

  12. Wrapper
     date        e.g. 'date'
     source      e.g. 'CATH'
     Sequence    e.g. Sequence Object

  13. Sequence
     date                       e.g. 'set date'
     source                     e.g. 'source of data'
     type                       e.g. '?'
     id                         e.g. '1cuk001'
     length                     e.g. size of residue array
     sequence
     pdb_header/footer          e.g. 'raw text'
     fasta_header               e.g. 'raw text'
     wolf_file, sec_file, etc.  e.g. 'raw text'
     matches                    -> an array of sequence objects

  14. Residue
     sequence_number    e.g. key to Residue array
     letter             e.g. 'residue single letter'
     coords             e.g. '
       ATOM  1  N   MET  1  -7.750  -4.498  -20.265  1.00  21.82
       ATOM  2  CA  MET  1  -7.178  -5.177  -19.122  1.00  18.39
       ATOM  3  C   MET  1  -7.857  -4.686  -17.798  1.00  20.00
       ATOM  4  O   MET  1  -8.202  -5.472  -16.932  1.00  18.10
       ATOM  5  CB  MET  1  -5.728  -4.808  -19.125  1.00  19.97
       ATOM  6  CG  MET  1  -4.838  -5.882  -18.694  1.00  29.38
       ATOM  7  SD  MET  1  -3.162  -5.241  -18.626  1.00  33.40
       ATOM  8  CE  MET  1  -3.477  -3.687  -19.473  1.00  39.23
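
Since `coords` is returned as raw text, a client would need to parse it. The sketch below splits each line on whitespace, which works for the ten-field records shown on this slide but is a simplifying assumption: real PDB files are fixed-column, include a chain identifier, and adjacent fields can run together. The `parse_atoms` name and the output keys are hypothetical.

```perl
use strict;
use warnings;

# Two records copied from the slide's example.
my $coords = <<'END';
ATOM 1 N MET 1 -7.750 -4.498 -20.265 1.00 21.82
ATOM 2 CA MET 1 -7.178 -5.177 -19.122 1.00 18.39
END

# Parse whitespace-separated ATOM lines into hashes (assumes exactly 10 fields).
sub parse_atoms {
    my ($text) = @_;
    my @atoms;
    for my $line (split /\n/, $text) {
        my @f = split ' ', $line;
        next unless @f == 10 && $f[0] eq 'ATOM';
        push @atoms, {
            serial   => $f[1], name => $f[2], res_name => $f[3],
            res_seq  => $f[4], x    => $f[5], y        => $f[6], z => $f[7],
        };
    }
    return \@atoms;
}

my $atoms = parse_atoms($coords);
my ($ca)  = grep { $_->{name} eq 'CA' } @$atoms;
print "CA of $ca->{res_name} $ca->{res_seq}: $ca->{x} $ca->{y} $ca->{z}\n";
```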

  15. Sequence - matches
     date               e.g. 'Tue Nov 29 10:29:27 GMT 2005'
     type               e.g. 'BLAST'
     id                 e.g. '12asA0'
     source             e.g. 'CATH'
     sequence_length    e.g. '330'
     matched_length     e.g. '22'
     query_start        e.g. '1'
     query_stop         e.g. '20'
     match_start        e.g. '185'
     match_stop         e.g. '206'
     raw_score          e.g. '15.4'
     evalue             e.g. '1.9'
     sequence_id        e.g. '36.364'
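
With match records shaped like the one above, a client could filter and rank hits locally. In this sketch the first record uses the slide's example values; the second record, the `significant_matches` helper, and the cutoff values are invented purely for illustration.

```perl
use strict;
use warnings;

my @matches = (
    # Example values from the slide.
    { type => 'BLAST', id => '12asA0', source => 'CATH',
      sequence_length => 330, matched_length => 22,
      query_start => 1, query_stop => 20, match_start => 185, match_stop => 206,
      raw_score => 15.4, evalue => 1.9, sequence_id => 36.364 },
    # Hypothetical second hit, made up so the filter has something to keep.
    { type => 'BLAST', id => 'hypothetical_hit', source => 'CATH',
      sequence_length => 120, matched_length => 118,
      query_start => 1, query_stop => 118, match_start => 2, match_stop => 119,
      raw_score => 210.0, evalue => 1e-20, sequence_id => 95.0 },
);

# Keep matches at or below an e-value cutoff, best (lowest e-value) first.
sub significant_matches {
    my ($all, $max_evalue) = @_;
    return [ sort { $a->{evalue} <=> $b->{evalue} }
             grep { $_->{evalue} <= $max_evalue } @$all ];
}

my $kept = significant_matches(\@matches, 0.001);
print "$_->{id}\t$_->{evalue}\n" for @$kept;
```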

  16. Other Web Services
     • Retrieval of CATH data
         • Accession codes
         • PDB definitions of domains
         • Derived CATH files
     • Gene3D
     • CATH <-> Uniprot mappings

  17. Questions
     • We are also offering our machines for use, but is some kind of quota system expected?
     • Is using a job id an acceptable protocol?
     • If you only use a retrieval call, how do you tell the difference between "no results" and "no results YET"?
     • Will there be a central repository of code?
     • What about BioPerl?
     • How much more complex is the structured data model going to get?
