Practical algorithms in Sequence Alignment. Sushmita Roy BMI/CS 576 www.biostat.wisc.edu/bmi576/ [email protected] Sep 17 th , 2013. Key concepts. Extreme value distribution gives an analytical form to compute the significance of a score Heuristic algorithms BLAST algorithm
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
Practical algorithms in Sequence Alignment
Sep 17th, 2013
Prior log odds score
query sequence: QLNFSAGW
word length w = 2 (default for protein usually w = 3)
word score threshold T = 9
Step 1: determine all words of length w in query sequence
QL LN NF FS SA AG GW
Determine all words that score at least T when compared to a word in the query sequence
words with T≥9
Additional words in
Transition from 2 or 3 to 0,
happens when C is not observed.
Can result in “restarting” or
Enter the query sequence
Choose the database
The sequence corresponds to the human HBB (hemoglobin) gene.
But we will select the mouse DB
Use Megablast (large word size)
Assesses significance of a score.
Related to P-value, but gives the
expected number of alignments of
this score value or higher.