
BLAST Sequence Searching in Registry

BLAST Sequence Searching in Registry®. Soichi Tokizane, November 2002. You will learn: how sequences are represented in the Registry file today; how to use BLAST for similarity searching; techniques for finding references to BLAST results. Sequence Information from CAS.


Presentation Transcript


  1. BLAST Sequence Searching in Registry ® Soichi Tokizane November 2002

  2. You will learn… • How sequences are represented in the Registry file today • How to use BLAST for similarity searching • Techniques for finding references to BLAST results

  3. Sequence Information from CAS

  4. CAS creates the Registry database [Chart: CAS Registry growth since 1965; axis: substances registered (millions)]

  5. CA has covered biochemistry journals and patents since 1907 Oxidizing enzymes - (III) Specific nature of tyrosinase and its action on products of disintegration of protein compds. Arch. Sci. phys. nat. gen. 24 1907

  6. Today, the CA database contains a very complete bioscience collection It covers: • Journals and patents: more than 3,000 bioscience titles; patents from 33 countries plus EP and WO • Over 500 books and over 300 book series • Conference proceedings • Dissertations

  7. Over 40% of the 21.5 million bibliographic records in CA cover bioscience information

  8. Biomolecules (sequences) are a major substance class in REGISTRY • 36 million substances

  9. Virtually all types of sequences are covered in Registry • Sequences from earlier literature • Novel nucleic acid primers and probes • Protein sequences deduced from gene translation and ESTs • Sequences with uncommon or non-natural residues • Chemically modified sequences • Fusion proteins • Genetically engineered sequences • Protein nucleic acids (PNAs)

  10. BLAST Sequence Similarity Searching

  11. Registry offers several sequence search techniques • BLAST similarity (“homology”) searching • similarity searching is the retrieval of sequence matches based on identity, conservation, and gaps • Sequence code match: exact, family, motif, pattern • Sequence name search
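The three ingredients named on this slide (identity, conservation, gaps) can be illustrated on an already-aligned pair of sequences. The following is a minimal Python sketch, not the BLAST algorithm itself: the function name `alignment_stats` and the two short example strings are made up for illustration, and conservation ("positives") is left out because it would need a substitution matrix such as BLOSUM62.

```python
def alignment_stats(aligned_query: str, aligned_subject: str) -> dict:
    """Count identities, mismatches, and gap positions in a pairwise alignment.

    Both strings must already be aligned: same length, '-' marks a gap.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")

    identities = mismatches = gaps = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q == "-" or s == "-":
            gaps += 1          # a gap in either sequence
        elif q == s:
            identities += 1    # identical residue at this position
        else:
            mismatches += 1    # different residues (conserved or not)

    length = len(aligned_query)
    return {
        "length": length,
        "identities": identities,
        "mismatches": mismatches,
        "gaps": gaps,
        "percent_identity": 100.0 * identities / length if length else 0.0,
    }


# Hypothetical 10-column alignment: 8 identities, 1 mismatch, 1 gap.
stats = alignment_stats("MRAW-IFFLL", "MRAWAIFYLL")
print(stats)
```

BLAST reports figures of this kind for each alignment it returns, alongside a score and an expectation value.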

  12. BLAST is a similarity matching algorithm • BLAST stands for Basic Local Alignment Search Tool • Produced and offered by the U.S. National Center for Biotechnology Information (NCBI) • Designed to quickly compare nucleic acid and amino acid sequences against desired databases

  13. Search Application Find patent references for sequences similar to the following recombinant human collagen. Conduct a comprehensive search in Registry on STN. MRAWIFFLLCLAGRALAAPLADYKDDDDKP GYLGGFLLVLHSQTDQEPTCPLGMPRLWTG YSLLYLEGQEKAHNQDLGLAGSCLPVFSTL HQVCHYAQRNDRSYWLASAAPLPRAWIFF MMPLSEEAIRPYVSRCAVCEAPAQAVAVHS QDQSIPPCPQTWRSLWIGYSFLMHTGAGDQ GGGQALMSPRAAPFLECQGRQGTLADY CHFFANKYSFWLTTVKADLQFSSAPAPDTL KESQAISRCQVCVKYS

  14. CAS Registry BLAST via STN on the Web is easy to use 1. Install sequence plug-in 2. Conduct Registry BLAST similarity search 3. Search selected BLAST answers in STN to get the literature references

  15. BLAST is available via STN on the Web • A plug-in must be downloaded and installed before using the BLAST module • This is a one-time requirement • The plug-in is free • Clicking on “Get Sequence Plug-in” takes you to easy-to-use instructions

  16. Plug-in instruction page

  17. Conduct Registry BLAST Similarity Search

  18. Follow these steps for Registry BLAST searching • Launch CAS Registry BLAST • Submit sequence query • Examine results and return to STN • Continue searching in STN on the Web

  19. Log on to STN on the Web and select the Sequence Assistant

  20. Select one of three STN online options before launch • Click on the Launch button

  21. The main and new search windows appear

  22. Submit sequence query • In a new session, the only available option is Similar Sequences • Fast BLAST is available after the first search • Click on the Similar Sequences button to open the Search by Sequence query page

  23. The Search by Sequence screen is easy to use

  24. Type in a result name • Type desired name for sequence search • Alpha or numeric • Spaces and punctuation allowed • STN will assign sequential number if you do not name the search • The name can also be changed later in the Main Menu

  25. Recall Sequence is useful for re-submitting the same query with different settings • The most recently searched sequence is stored in a buffer that can be retrieved using this function • This function is grayed out when you first begin

  26. Read from File allows you to upload directly from a file • The file can be: • A text file (e.g. .txt) • In GCG or FASTA format • An STN record (SQIDE display)
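Of the file types listed, FASTA is the simplest to illustrate: a header line starting with `>` followed by one or more lines of sequence. The Python sketch below shows how such a file can be read into (header, sequence) pairs before submission. The function name `read_fasta` and the sample text are illustrative assumptions; GCG files and STN SQIDE displays are not handled here.

```python
def read_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records: list[tuple[str, str]] = []
    header = None
    chunks: list[str] = []

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(">"):
            if header is not None:        # close the previous record
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)           # sequence may span several lines

    if header is not None:                # close the last record
        records.append((header, "".join(chunks)))
    return records


# Hypothetical two-record FASTA text.
sample = ">q1 demo\nMRAW\nIFFL\n\n>q2\nACGT\n"
for name, seq in read_fasta(sample):
    print(name, len(seq), seq)
```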

  27. The sequence query must be in 1-letter code • The sequence query can be • Copied and pasted • Read from File • Typed directly • Recalled from the previous search • The sequence length limit is 50,000 characters
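A pasted query often carries spaces and line breaks, as in the collagen example on slide 13. The following Python sketch pre-checks a query against the two constraints stated on this slide. It rests on assumptions: the helper name `clean_query` is hypothetical, any letter A to Z is accepted (the exact alphabet the BLAST module allows is not given here), and the 50,000-character limit is applied after whitespace is removed.

```python
MAX_QUERY_LENGTH = 50_000  # length limit stated on the slide


def clean_query(raw: str) -> str:
    """Strip whitespace, upper-case, and check a 1-letter-code sequence query."""
    seq = "".join(raw.split()).upper()   # drop spaces, tabs, and line breaks
    if not seq:
        raise ValueError("empty sequence query")

    # Assumption: only the letters A-Z are valid 1-letter codes.
    bad = sorted({ch for ch in seq if not ("A" <= ch <= "Z")})
    if bad:
        raise ValueError(f"not 1-letter code, unexpected characters: {bad}")

    if len(seq) > MAX_QUERY_LENGTH:
        raise ValueError(f"query has {len(seq)} residues; limit is {MAX_QUERY_LENGTH}")
    return seq


# First two blocks of the example query from slide 13, pasted with a space.
query = clean_query("MRAWIFFLLCLAGRALAAPLADYKDDDDKP GYLGGFLLVLHSQTDQEPTCPLGMPRLWTG")
print(len(query), query)
```

A three-letter-code input such as `Met-Arg-Ala` is rejected by this check because of the hyphens, which is a convenient early warning that the query needs converting.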

  28. This screen is for inserting a sequence query from file

  29. The BLAST program to be used is selected next

  30. Searches can be run on a subset of the Registry File • For proteins, there are three options • The default is all CA sequences • Other options are available for nucleic acids, such as including or excluding GenBank records

  31. BLAST default settings are optimized • Parameters can be modified • Search sensitivity • Low complexity filtering • Maximum number of answers • Show advanced options

  32. Advanced functions should only be modified with a thorough understanding of BLAST principles • Users are encouraged to contact bioinformatics departments for details, advice, and recommendations • Additional information is also available at the NCBI Web page http://www.ncbi.nlm.nih.gov/

  33. The Main Window is for managing results • The Main Window has columns for • Assigned name • Type of search • Time created • Status • Results • Reviewed status

  34. Results can be viewed once the search is complete • The results are stored on STN until deleted by the user • Old results can be reviewed when desired • Up to 50 result sets can be stored • Highlight a result, then view it

  35. Alignments can be viewed individually

  36. Alignments can be saved or printed

  37. The saved file has a summary of all the hits and scores

  38. Select desired alignments for transfer to STN • Check boxes • Select by score category • Select all
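Selecting by score category amounts to filtering the hit list on a score cut-off. The Python sketch below shows the idea on made-up data: the `Hit` class, the placeholder identifiers, the scores, and the threshold are all hypothetical, and the actual score categories offered by the interface are not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    registry_number: str  # placeholder identifier, not a real CAS RN
    score: float          # alignment score reported for the hit


def select_hits(hits: list[Hit], min_score: float) -> list[str]:
    """Return identifiers of hits scoring at or above min_score, best first."""
    kept = [h for h in hits if h.score >= min_score]
    kept.sort(key=lambda h: h.score, reverse=True)
    return [h.registry_number for h in kept]


# Hypothetical result set.
hits = [
    Hit("RN-0001", 512.0),
    Hit("RN-0002", 87.5),
    Hit("RN-0003", 240.0),
]
selected = select_hits(hits, min_score=200.0)
print(selected)
```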

  39. Transfer RNs to STN • Select Transfer RNs to STN • Message indicates when the transfer is complete • Log off the BLAST system -- Select Exit from File menu or close browser

  40. Retrieve RNs from BLAST • The Sequence Assistant page appears after you exit BLAST • Select the Retrieve RNs from BLAST option

  41. Return to STN on the Web • STN will indicate if session is logged off • If so, log on to STN on the Web • Select Sequence Assistant • Retrieve RNs from BLAST To obtain a transcript of your session, you must log in again. Back to the STN on the Web login page

  42. Continue STN Searching

  43. L-Numbers are created from the automatic transfer • The Sequence Assistant transferred several “packets” of numbers, which are all OR’ed together in L6
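OR-ing the packets together is equivalent to taking the union of the Registry Numbers they contain, so a number that appears in more than one packet is counted once. A minimal Python sketch of that logic follows; the function name `combine_packets` and the placeholder identifiers are hypothetical, and this only models the result, not how STN performs the transfer.

```python
def combine_packets(packets: list[list[str]]) -> set[str]:
    """Union several packets of identifiers, as a Boolean OR would."""
    combined: set[str] = set()
    for packet in packets:
        combined.update(packet)   # duplicates across packets collapse
    return combined


# Hypothetical packets with one identifier repeated.
packets = [
    ["RN-0001", "RN-0002"],
    ["RN-0003"],
    ["RN-0002", "RN-0004"],
]
answer_set = combine_packets(packets)
print(len(answer_set), sorted(answer_set))
```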

  44. L-Numbers are used for reference searches • These search results can optionally be combined with DGENE using STN’s routine multifile searching

  45. STN Express with Discover! 6.01 is now available for Sequence Searching http://www.cas.org/ONLINE/STN/interact/express.html

  46. CAS REGISTRY BLAST is now searchable from Express

  47. Transferring BLAST data into an STN session is seamlessly integrated into the software

  48. A report merges an STN transcript and BLAST alignment data

  49. A report merges an STN transcript and BLAST alignment data
