
Tour of ViroBIKE: Sequence comparison

jhelms

Presentation Transcript


  1. Tour of ViroBIKE: Sequence comparison. ViroBIKE (Biological Integrated Knowledge Environment) combines: • Knowledge: all completed viral genomes known to NCBI and many viral metagenomes • Analytical tools: a powerful graphical language that permits creative expression to those with no programming experience. It is accessible through a virological community web site: http://ixion.csbc.vcu.edu/virobike This demonstration is best viewed as a slide show, enabling you to simulate a session and make changes in cursor position more obvious. To do this, click Slide Show on the top tool bar, then View show. Click anywhere to go on to the next slide.

  2. Tour of ViroBIKE: Sequence comparison. In this tour, you'll see how to: • Log onto ViroBIKE (slide 4) • Speak BioBIKE, the language of ViroBIKE (slide 8) • Display the sequence of a metagenome contig (slide 17) • Find similar sequences amongst metagenomes (slide 24) • Find similar sequences amongst known viruses (slide 32) • Find similar sequences amongst everything in GenBank (slide 37) • Make a sequence alignment (slide 80) • Make a phylogenetic tree (slide 85) • Save your work session (slide 92) You can go to any slide in this tour at any time by typing the slide number and pressing Enter.

  3. Coming Attractions! If you like this tour, you might also try Analysis of Metagenome Aggregates, where you'll see how to: • Find the number of contigs in a metagenome • Find the average contig size in a metagenome • Find the average GC content within a metagenome • Visualize the distribution of GC content amongst the contigs of a metagenome

  4. URL: http://ixion.csbc.vcu.edu/virobike The public can access everything at the community web site (except member names and e-mails), but only registered users can write to it. For now you're public. Access ViroBIKE through the blue bar.

  5. Click the link to the public access login screen.

  6. Enter anything you like as a login name in the "Your name (no spaces)" box, but no spaces or symbols. No password necessary. Click New Login.

  7. You can leave the blue bar to access other resources, but screen space is scarce, so grab it and move it offscreen to the left.

  8. The BioBIKE environment is divided into three areas, as shown: the function palette, the workspace, and the results window. You'll bring functions down from the function palette to the workspace, execute them, and note the results in the results window.

  9. Two very important buttons on the function palette: • HELP!: on-line help (general) • PROBLEM: Something went wrong? Tell us!

  10. Two very important buttons in the workspace: • Undo (return to the workspace before the last action) • Redo (get back the workspace you undid)

  11. Our Story Suppose you have a special interest in a sequence, a contig, derived from the metagenome taken from the Arctic Ocean. The metagenome is called p-arct. The sequence is called C60790. What does the sequence look like?

  12. Clicking on any palette button brings down choices of functions or data to bring into the workspace. Click the function DISPLAY-SEQUENCE-OF.

  13. A DISPLAY-SEQUENCE-OF function box is now in the workspace. Before continuing with the problem, let's consider what function boxes mean.

  14. General Syntax of BioBIKE. [Diagram: a function box labeled with Function-name, Argument (object), Keyword (object), and Flag.] The basic unit of BioBIKE is the function box. It consists of the name of a function, perhaps one or more required arguments, and optional keywords and flags. A function may be thought of as a black box: you feed it information, it produces a product.

  15. General Syntax of BioBIKE. Function boxes contain the following elements: • Function-name (e.g. SEQUENCE-OF or LENGTH-OF) • Argument: required, acted on by the function • Keyword clause: optional, more information • Flag: optional, more (yes/no) information

  16. General Syntax of BioBIKE. … and icons to help you work with functions: • Option icon: brings up a menu of keywords and flags • Action icon: brings up a menu enabling you to execute a function, copy and paste, get information, get help, etc. • Clear/Delete icon: removes information you entered or removes the box entirely
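If it helps to think of a function box as a data structure, here is a minimal Python sketch of the parts described in the last three slides. It is only an illustration: the `FunctionBox` class, its methods, and the line-length value of 60 are assumptions made up for this example, not part of BioBIKE.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionBox:
    """Hypothetical model of a function box (not BioBIKE's own code)."""
    name: str                                        # Function-name
    arguments: list = field(default_factory=list)    # required, acted on by the function
    keywords: dict = field(default_factory=dict)     # optional keyword -> value
    flags: list = field(default_factory=list)        # optional yes/no switches

    def is_complete(self):
        # Every required argument slot must hold a value before execution.
        return all(arg is not None for arg in self.arguments)

    def describe(self):
        # Flatten the box into one line: name, arguments, keywords, flags.
        parts = [self.name] + [str(arg) for arg in self.arguments]
        for key, value in self.keywords.items():
            parts += [key, str(value)]
        parts += self.flags
        return " ".join(parts)


# The line length of 60 is an arbitrary value chosen for illustration.
box = FunctionBox("DISPLAY-SEQUENCE-OF", ["C60790"], {"LINE-LENGTH": 60}, ["FastA"])
print(box.describe())
```

A box whose argument slot is still empty, such as `FunctionBox("DISPLAY-SEQUENCE-OF", [None])`, reports `is_complete()` as false in this sketch.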

  17. Back to our story… we were displaying the sequence of our favorite metagenome contig, C60790. Click on the gray argument box to activate it for entry, either from the keyboard or by insertion.

  18. Now that the box is open, type in the name of the contig, C60790. Upper/lower case doesn't matter. When you're done, close the box by pressing Enter or Tab. If you forget to close the box, the function will not work.

  19. To set the length of the lines to be displayed, mouse over the Options icon and click LINE-LENGTH. Actually, the default line length is perfectly OK. I did this just to show you an option in action.

  20. Enter a value into the option entry box in the same way you entered a value into the argument box: Click on the box, type, then close the box by pressing Enter or Tab.

  21. The default format for sequences is lines preceded by coordinates. If you want the sequence in FastA format, mouse over the Options icon and click FastA. (An example of a Flag in action)
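To make the difference between the two display formats concrete, here is a rough Python sketch of both. The function names and the exact spacing are assumptions for illustration, and the short sequence is a made-up placeholder, not the real C60790; ViroBIKE's actual layout may differ.

```python
def split_lines(sequence, line_length):
    """Cut a sequence into consecutive pieces of at most line_length letters."""
    return [sequence[i:i + line_length] for i in range(0, len(sequence), line_length)]


def format_with_coordinates(sequence, line_length=60):
    """Each line is preceded by the 1-based coordinate of its first letter."""
    width = len(str(len(sequence)))  # pad coordinates to a common width
    lines = []
    for index, chunk in enumerate(split_lines(sequence, line_length)):
        start = index * line_length + 1
        lines.append(f"{start:>{width}} {chunk}")
    return "\n".join(lines)


def format_fasta(name, sequence, line_length=60):
    """FastA: a header line starting with '>' followed by the sequence lines."""
    return "\n".join([f">{name}"] + split_lines(sequence, line_length))


demo = "ACGTACGTAC"  # made-up placeholder, not the real contig sequence
print(format_with_coordinates(demo, 4))
print(format_fasta("C60790", demo, 4))
```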

  22. The function is now complete. To execute it, mouse over the Action icon and click Execute.

  23. Displayed results appear in popup windows, which you can copy or save. When you're done with a window, click the red X in the upper right-hand corner to get rid of it. Firefox has an upper limit on popup windows, so it's a good idea to clean up as you go.

  24. Is the DNA sequence similar to any other metagenome sequence? To find out, mouse over the STRINGS-SEQUENCES menu and click SEQUENCE-SIMILAR-TO. This function allows you to search for similarity by pattern, by mismatches, or by Blast (default).
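For a feel of what searching "by mismatches" means, here is a small Python sketch that slides a query along a target and reports every position where the two differ in at most a given number of letters. It is a plain illustration of the idea, not the way SEQUENCE-SIMILAR-TO is implemented; the function name and the toy sequences are made up for this example.

```python
def find_with_mismatches(query, target, max_mismatches):
    """Return 0-based start positions where query fits target within the mismatch limit."""
    query, target = query.upper(), target.upper()  # upper/lower case doesn't matter
    hits = []
    for start in range(len(target) - len(query) + 1):
        window = target[start:start + len(query)]
        mismatches = sum(1 for a, b in zip(query, window) if a != b)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


# Toy sequences, chosen only to show one exact hit and one near hit.
print(find_with_mismatches("ACGT", "TTACGTTTACCT", 0))
print(find_with_mismatches("ACGT", "TTACGTTTACCT", 1))
```

Blast, the default, works differently: it is a heuristic that can also cope with insertions and deletions, which this simple scan cannot.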

  25. The function asks for two arguments: the query sequence and the target sequences against which the query will be compared. The query is c60790, of course. We could enter it by typing, as before, but it is more interesting to copy and paste what you already typed. To do this, mouse over the Action icon of the box containing c60790.

  26. Click Copy.

  27. To paste, mouse over the Action icon of the box into which you're pasting and click Paste.

  28. Now to enter the target sequences – the set of all metagenome sequences. Click on the target box to open it for entry. Once the box is open, you could specify by typing that you want to search metagenomic sequences… if you knew what to type.

  29. If you don't know, then mouse over the DATA button, then Organisms, then Metagenomes. Clicking on Metagenomes transfers it to the open target box.

  30. Execute the completed function as before, mousing over the Action icon of the function and clicking Execute. Doing so starts Blast, which may take several seconds to complete execution.

  31. You might expect that your sequence from P-Arct would find other sequences from the same metagenome. It does, but interestingly, after itself, the next 10 best hits are from the P-BBC metagenome. Use browser controls to save the box, if you like, then X out of it.

  32. Of course the metagenome sequences are not annotated. Perhaps you can learn more about your sequence by comparing it to sequences from known viruses. To do this, clear the target box, open it up again by clicking on it…

  33. …and bring down Known Viruses into the box.

  34. Protein searches will find more sequences, so mouse over the Options icon and specify that your DNA sequence is to be translated and compared to viral proteins.
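What does it mean to translate a DNA sequence before comparing it to proteins? One common approach is to read the DNA in all six reading frames (three offsets on each strand) using the standard genetic code. The Python sketch below shows that idea; the slides do not say exactly how ViroBIKE does the translation, so treat the function names and the six-frame choice as assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Complement each base, then reverse, to read the opposite strand."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from the first letter; unknown codons become X."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Three offsets on the given strand, then three on the reverse complement."""
    strands = [dna, reverse_complement(dna)]
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


# Made-up toy sequence: start codon, one alanine codon, stop codon.
print(six_frames("ATGGCCTAA"))
```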

  35. Execute the completed function. Again, execution may take several seconds.

  36. Only one hit, and a very poor one at that! This is typical, because while ViroBIKE has virtually all known viral genomes, those that are known cover only a tiny fraction of viruses that exist in nature. X out of the window and clear known viruses so that we can try another approach.

  37. There is a good deal more variety in organismal genomes than viral genomes, so let's search them. ViroBIKE does not keep organismal genomes locally, so we need to go out to GenBank. Click on the DATA button again.

  38. …and this time click GenBank.

  39. Execute the function as usual. This time we will be at the mercy of NCBI, and depending on the time of day and the phase of the moon, execution may take a minute or longer. By default, ViroBIKE times out execution at 40 seconds. If this occurs, you'll get a message like…

  40. *** TIMEOUT ! TIMEOUT ! TIMEOUT ***
      *** COMPUTATION ABORTED AFTER 40 SECONDS ***
      *** YOU CAN:
      *** - contact support for help: BioLinguaSupport@lists.Stanford.EDU
      *** - use the TOOLS -> PREFS menu or the SET-TIMELIMIT function to extend your timeout up to 1 hour
      *** - use RUNJOB to run your code in a separate process
      *** - type (explain-timeout) at the weblistener for detailed info.
      You can change the time limit, but let's say that fate is with us and you get your result.
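The idea of a time limit on execution can be sketched in a few lines of Python: hand the work to a helper, wait up to a set number of seconds, and give up waiting if no result has arrived. This is a generic illustration, not ViroBIKE's mechanism; `run_with_timelimit` and `slow_search` are hypothetical names, and the delays are shortened so the demo finishes quickly.

```python
import concurrent.futures
import time


def run_with_timelimit(func, seconds, *args):
    """Return (True, result) if func finishes in time, else (False, None)."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args)
    try:
        return True, future.result(timeout=seconds)
    except concurrent.futures.TimeoutError:
        # We stop waiting; the worker thread itself is not killed.
        return False, None
    finally:
        executor.shutdown(wait=False)


def slow_search(delay):
    """Stand-in for a remote search that takes `delay` seconds to answer."""
    time.sleep(delay)
    return "result table"


print(run_with_timelimit(slow_search, 1.0, 0.01))   # finishes within the limit
print(run_with_timelimit(slow_search, 0.05, 0.3))   # limit is reached first
```

Raising the limit (the second argument here) plays the same role as extending the timeout in the message above.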

  41. Interesting! Many highly significant hits from various bacteria…

  42. …at different regions of your sequence. At NCBI, that would be the end of the story. In ViroBIKE, it's the beginning, since you can work with your Blast results. First, we'll want to give the result a name.

  43. To name a result, mouse over the DEFINITION menu and click DEFINE.

  44. The DEFINE function asks for two arguments: the name of the variable and the value that will be assigned to it. Click on the variable entry box.

  45. You can name the result anything you like, so long as the name does not contain spaces (hyphens and underscores are OK). I chose c67090-vs-NR. Press Tab after typing a name.

  46. Tabbing opens up the next argument, the value box. The value to be assigned is the Blast table. There are many ways to retrieve that result. One way is to recognize that it is the result of the previous function. Click the OTHER-COMMAND button...

  47. …and click Previous-Result.

  48. Executing the function will cause the variable you named to spring into existence, accessible through a new button. Watch for it!

  49. We'll be using that VARIABLES button in a moment. For now, mouse over STRINGS-SEQUENCES, then SEARCH/COMPARE, and…

  50. Click on BLAST-VALUE. This function allows you to extract values from the Blast table.
