
BLAST

BLAST is a heuristic method for performing local alignments by searching for high-scoring segment pairs (HSPs). It is the fastest and most frequently used sequence alignment tool, offering both sensitivity and speed. This lecture covers the uses of BLAST, where to access its different flavors, and how to run NCBI BLAST.

snurse

Presentation Transcript


  1. Lecture 3 BLAST

  2. BLAST • Basic Local Alignment Search Tool • A heuristic method for performing local alignments by searching for high-scoring segment pairs (HSPs) • First to use statistics to predict the significance of initial matches • Offers both sensitivity and speed

  3. BLAST • Looks for clusters of nearby or locally dense “similar or homologous” k-tuples • Uses “look-up” tables to shorten search time • Uses a larger “word size” than FASTA to accelerate the search process • Performs local alignment only (global alignment requires a different tool) • Fastest and most frequently used sequence alignment tool: THE STANDARD
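The look-up table idea on this slide can be sketched in a few lines of Python. This is a simplified illustration, not BLAST's actual implementation: it indexes only exact words (BLAST's protein search also admits similar "neighborhood" words that score above a threshold), and the function names and the word size of 3 are choices made for this example.

```python
from collections import defaultdict


def build_word_index(sequence, word_size):
    """Map every word (k-tuple) of length word_size to its start positions."""
    index = defaultdict(list)
    for pos in range(len(sequence) - word_size + 1):
        index[sequence[pos:pos + word_size]].append(pos)
    return index


def find_seed_hits(query, subject, word_size=3):
    """Return (query_pos, subject_pos) pairs where the same word occurs in both."""
    index = build_word_index(query, word_size)
    hits = []
    for s_pos in range(len(subject) - word_size + 1):
        word = subject[s_pos:s_pos + word_size]
        # .get() avoids creating empty entries for words absent from the query
        for q_pos in index.get(word, ()):
            hits.append((q_pos, s_pos))
    return hits


def group_by_diagonal(hits):
    """Group hits by diagonal (subject_pos - query_pos).

    Several hits on one diagonal form the kind of locally dense cluster
    that is worth extending into a high-scoring segment pair.
    """
    diagonals = defaultdict(list)
    for q_pos, s_pos in hits:
        diagonals[s_pos - q_pos].append((q_pos, s_pos))
    return dict(diagonals)


# Toy example: the subject contains the start of the query, shifted by two residues.
query = "KIQIYGTGCANCQ"
subject = "AAKIQIYGTGAA"
hits = find_seed_hits(query, subject)
clusters = group_by_diagonal(hits)
```

In this toy case all six word hits fall on diagonal 2, which is the signal that the two sequences share a contiguous similar region starting two residues into the subject.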

  4. Uses of BLAST
     • Identifying species: with BLAST you may be able to identify a species and/or find homologous species.
     • Locating domains: when working with a protein sequence, you can submit it to BLAST to locate known domains within the sequence of interest.
     • Establishing phylogeny: using BLAST results, you can create a phylogenetic tree on the BLAST web page. Phylogenies based on BLAST alone are less reliable than purpose-built computational phylogenetic methods, so they should be relied upon only for "first pass" phylogenetic analyses.
     • DNA mapping: when working with a known species and looking to sequence a gene at an unknown location, BLAST can compare the chromosomal position of the sequence of interest to relevant sequences in the database(s).
     • Comparison: when working with genes, BLAST can locate common genes in two related species and can be used to map annotations from one organism to another.

  5. BLAST Access • NCBI BLAST • http://www.ncbi.nlm.nih.gov/BLAST/ • Canadian Bioinformatics Resource BLAST • http://cbr-rbc.nrc-cnrc.gc.ca/blast/ • European Bioinformatics Institute BLAST • http://www.ebi.ac.uk/blastall/ • http://www.ebi.ac.uk/blast2/

  6. Different Flavours of BLAST • BLASTP - protein query against protein DB • BLASTN - DNA/RNA query against GenBank (DNA) • BLASTX - 6-frame translated DNA query against protein DB • TBLASTN - protein query against 6-frame translated GenBank • TBLASTX - 6-frame translated DNA query against 6-frame translated GenBank • PSI-BLAST - protein ‘profile’ query against protein DB • PHI-BLAST - protein pattern against protein DB
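The "6 frame" translation used by BLASTX, TBLASTN and TBLASTX can be illustrated with a short Python sketch. It assumes the standard genetic code, treats unknown codons as "X", and uses frame labels (+1 to +3, -1 to -3) chosen for this example; it shows the concept only, not how BLAST itself is implemented.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the start of dna; unknown codons become X."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def six_frame_translation(dna):
    """Translate a nucleotide sequence in three forward and three reverse frames."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGAAAATTTAA")
```

Each of the six protein strings is then compared as if it were an ordinary protein sequence, which is why the translated searches cost more than BLASTP or BLASTN.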

  7. Other BLAST Services • MEGABLAST - for comparison of large sets of long DNA sequences • RPS-BLAST - Conserved Domain Detection • BLAST 2 Sequences - for performing pairwise alignments for 2 chosen sequences • Genomic BLAST - for alignments against select human, microbial or malarial genomes

  8. Running NCBI BLAST

  9. MT0895 • MMKIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIMGRVASKEEIKKILS

  10. Running NCBI BLAST • Paste in sequence (FASTA format, raw sequence or type in GI or accession number)

      >Mysequence MT0895
      KIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIDS

      OR

      >
      KIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIDS

      OR

      KIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIDS
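The three accepted input forms (FASTA with a description, FASTA with an empty description line, and raw sequence) all reduce to the same query. A minimal Python sketch of that normalization follows; `parse_query` is a hypothetical helper written for this example, handles a single sequence only, and is not part of any BLAST tool.

```python
def parse_query(text):
    """Split pasted query text into (description, sequence).

    A first line starting with ">" is treated as the FASTA description line;
    everything else is joined into one upper-case sequence string.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    description = ""
    if lines and lines[0].startswith(">"):
        description = lines[0][1:].strip()
        lines = lines[1:]
    # Remove any whitespace inside sequence lines before joining them.
    sequence = "".join("".join(line.split()) for line in lines).upper()
    return description, sequence


SEQ = "KIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIDS"

with_description = parse_query(">Mysequence MT0895\n" + SEQ)
empty_description = parse_query(">\n" + SEQ)
raw_sequence = parse_query(SEQ)
```

All three calls return the same sequence string; only the description differs.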

  11. Running NCBI BLAST • Choose a range of interest in the sequence “set subsequences” (not usually used) • Select the database from pull-down menu (usually choose nr = non-redundant) • Leave “Options” unchanged (use defaults)
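The same choices (optional subsequence, database, default options) can be expressed as request parameters instead of form fields. The sketch below only builds the parameters and sends nothing. It assumes the parameter names `CMD`, `PROGRAM`, `DATABASE` and `QUERY` from NCBI's public URL API, so check the current NCBI documentation before relying on them; `build_submission` and its `subrange` argument are hypothetical names for this example, and the subsequence is cut locally rather than through any server option.

```python
from urllib.parse import urlencode

# Assumed endpoint of the NCBI BLAST URL API; verify against current NCBI docs.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_submission(sequence, program="blastp", database="nr", subrange=None):
    """Build the parameters for a search submission without sending it.

    subrange is an optional (start, end) pair, 1-based and inclusive,
    mirroring the "range of interest" choice on the web form.
    All other search options are left at the server defaults.
    """
    if subrange is not None:
        start, end = subrange
        sequence = sequence[start - 1:end]
    return {
        "CMD": "Put",          # submit a new search
        "PROGRAM": program,    # e.g. blastp for a protein query
        "DATABASE": database,  # nr = non-redundant
        "QUERY": sequence,
    }


params = build_submission(
    "KIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIDS"
)
encoded_body = urlencode(params)  # form-encoded body for a POST to BLAST_URL
```

Keeping the parameter construction separate from the network call makes it easy to inspect exactly what would be submitted before any request is made.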

  12. Running NCBI BLAST Select Database

  13. Running NCBI BLAST Click BLAST!

  14. Formatting Results

  15. BLAST Format Options

  16. BLAST Output

  17. BLAST Output

  18. BLAST Output

  19. BLAST Output

  20. BLAST Output

  21. BLAST Output

  22. BLAST Parameters • Identities - number and % of exact residue matches • Positives - number and % of similar plus identical matches • Gaps - number and % of gaps introduced • Score - summed HSP score (S) • Bit Score - a normalized score (S’) • Expect (E) - expected number of HSP alignments arising by chance
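The relationship between these quantities can be sketched with the standard Karlin-Altschul formulas: the bit score is S’ = (λS − ln K) / ln 2 and the expect value is E = m · n · 2^(−S’), where m and n are the query and database lengths. In the Python sketch below, the λ and K values are illustrative placeholders that do not come from the slides, and the function names are chosen for this example. Real BLAST output also applies length corrections, so its reported E values will differ somewhat.

```python
import math


def alignment_counts(aligned_query, aligned_subject):
    """Count identities and gap positions in a gapped pairwise alignment.

    Both strings must have equal length, with "-" marking a gap.
    """
    pairs = list(zip(aligned_query, aligned_subject))
    identities = sum(a == b and a != "-" for a, b in pairs)
    gaps = sum(a == "-" or b == "-" for a, b in pairs)
    return {"length": len(pairs), "identities": identities, "gaps": gaps}


def bit_score(raw_score, lam, k):
    """Normalize a raw score S: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, query_length, database_length):
    """Expected number of chance alignments: E = m * n * 2 ** (-S')."""
    return query_length * database_length * 2 ** (-bits)


counts = alignment_counts("KIQ-YG", "KIQIYA")

# Illustrative placeholder parameters; real values depend on the scoring system.
LAMBDA, K = 0.267, 0.041
bits = bit_score(100, LAMBDA, K)
e_value = expect_value(bits, query_length=62, database_length=1_000_000)
```

Because E scales with the database length, the same alignment becomes less significant as the database grows, which is why E values rather than raw scores are compared across searches.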

  23. Conclusions • BLAST is the most important program in bioinformatics (maybe all of biology) • BLAST is based on sound statistical principles (key to its speed and sensitivity) • A basic understanding of its principles is key for using/interpreting BLAST output • Use BLASTN or MEGABLAST for DNA • Use PSI-BLAST for protein searches
