
Final presentation Tandem Cyclic Alignment

Final presentation: Tandem Cyclic Alignment. Sequence alignment: Needleman-Wunsch algorithm – global alignment, fixed gap penalty; Waterman-Smith-Beyer algorithm – local alignment, affine gap penalty function; Gotoh’s algorithm – local alignment, affine gap penalty function.

cybil

Presentation Transcript


  1. Final presentation: Tandem Cyclic Alignment

  2. Sequence Alignment • Needleman-Wunsch Algorithm – global alignment, fixed gap penalty • Waterman-Smith-Beyer Algorithm – local alignment, affine gap penalty function • Gotoh’s Algorithm – local alignment, affine gap penalty function

  3. Needleman-Wunsch Algorithm (Global Alignment)

  4. Waterman-Smith-Beyer Algorithm (Local Alignment)

  5. Gotoh’s Algorithm (Local Alignment) Consider the gapless sequences a and b. Let g(k) = α + kβ be an affine gap penalty function for a gap of length k, where α is the gap-opening cost and β the per-position extension cost, and let w(a_i, b_j) be a cost function. D is the distance matrix. P is the matrix of minimal distances over all alignments in which b ends in a gap. Q is the matrix of minimal distances over all alignments in which a ends in a gap.
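As a hedged sketch of how the three matrices on this slide relate, the recurrences below are the standard global, distance-minimizing form of Gotoh's algorithm. The gap penalty is written as g(k) = α + kβ; the Greek letters are an assumption, used only to keep the penalty parameters apart from the sequences a and b. The boundary conditions are likewise assumed rather than taken from the slides.

```latex
\begin{aligned}
P_{i,j} &= \min\{\, D_{i-1,j} + g(1),\; P_{i-1,j} + \beta \,\} \\
Q_{i,j} &= \min\{\, D_{i,j-1} + g(1),\; Q_{i,j-1} + \beta \,\} \\
D_{i,j} &= \min\{\, D_{i-1,j-1} + w(a_i, b_j),\; P_{i,j},\; Q_{i,j} \,\} \\
D_{0,0} &= 0, \qquad D_{i,0} = g(i), \qquad D_{0,j} = g(j)
\end{aligned}
```

Here P extends or opens a gap in b (a symbol of a is left unmatched) and Q does the same for a gap in a, so each cell is filled in constant time.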

  6. Gotoh’s Algorithm • Uses dynamic programming with three matrices (instead of one). • The traceback must track movement through all three matrices.
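The three-matrix idea can be sketched in Python as follows. This is a minimal illustration rather than the presenters' implementation: it computes only the global alignment distance (no traceback), and the function name `gotoh_distance`, the default costs `alpha=2` and `beta=1`, and the unit mismatch cost are all assumptions chosen for the example.

```python
import math

def gotoh_distance(a, b, alpha=2, beta=1, w=lambda x, y: 0 if x == y else 1):
    """Global alignment distance with affine gap cost g(k) = alpha + k*beta.

    D: best distance overall; P: best distance with b ending in a gap;
    Q: best distance with a ending in a gap. All defaults are illustrative.
    """
    n, m = len(a), len(b)
    inf = math.inf
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    P = [[inf] * (m + 1) for _ in range(n + 1)]
    Q = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for i in range(1, n + 1):          # first column: one gap of length i in b
        P[i][0] = alpha + beta * i
        D[i][0] = P[i][0]
    for j in range(1, m + 1):          # first row: one gap of length j in a
        Q[0][j] = alpha + beta * j
        D[0][j] = Q[0][j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # open a new gap (alpha + beta) or extend an existing one (beta)
            P[i][j] = min(D[i - 1][j] + alpha + beta, P[i - 1][j] + beta)
            Q[i][j] = min(D[i][j - 1] + alpha + beta, Q[i][j - 1] + beta)
            D[i][j] = min(D[i - 1][j - 1] + w(a[i - 1], b[j - 1]),
                          P[i][j], Q[i][j])
    return D[n][m]
```

For example, `gotoh_distance("AAAA", "A")` charges one gap of length three (2 + 3·1) rather than three separate gaps.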

  7. Tandem Repeats • Tandem repeats are a special class of repeats with very short repeat units. Each repeat unit is often only a few nucleotides long. • For example, one tandem repeat in humans comprises hundreds of copies of the 6-nucleotide repeat TTAGGG. These are often called microsatellites. • In eukaryotic genomes, repeats with longer repeating units of up to 25 nucleotides (called minisatellites) are also abundant. They are located mostly in non-transcribed regions.

  8. Finding Tandem Repeats • A straightforward approach to finding tandem repeats with a repeat unit of length k is to look for consecutive exact occurrences of a pattern of length k. This can be accomplished efficiently. • However, some of the repeat units are often mutated, so mismatches must be allowed when looking for these imperfect repeats. • Obtaining an efficient algorithm becomes much more difficult as the number of allowed mismatches increases.
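The exact-match approach from the first bullet can be sketched as a simple scan. The function name `exact_tandem_repeats` and the `min_copies` parameter are hypothetical, and the scan is written for clarity rather than speed; it handles only perfect repeats, not the mutated units discussed in the later bullets.

```python
def exact_tandem_repeats(s, k, min_copies=2):
    """Find runs of consecutive exact copies of a length-k unit in s.

    Returns a list of (start, copies, unit) tuples. A position j continues
    a run while s[j] == s[j + k], i.e. the text has period k there.
    """
    runs = []
    n = len(s)
    i = 0
    while i + 2 * k <= n:
        j = i
        while j + k < n and s[j] == s[j + k]:
            j += 1
        # j - i characters agree with the character k positions later,
        # which corresponds to (j - i) // k extra full copies of the unit
        copies = (j - i) // k + 1
        if copies >= min_copies:
            runs.append((i, copies, s[i:i + k]))
            i += copies * k            # continue after the reported run
        else:
            i += 1
    return runs
```

On `"AC" + "TTAGGG" * 3 + "AC"` with `k=6`, the scan reports a single run starting at index 2 with three copies of `TTAGGG`.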

  9. Finding Tandem Repeats by Alignment • If the dominating repeating pattern is known, another way to locate imperfect repeats is to solve the following alignment problem: • Let p be a pattern of length m (the repeat unit) and s be a sequence of length n (the search string). Let p^n be the concatenation of p with itself n times. Finding an imperfect tandem repeat is equivalent to finding an optimal local alignment between p^n and s. [Figure: repeated copies p p p … aligned against s]
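A direct, unoptimized reading of this formulation is to build p^n explicitly and run an ordinary local alignment against s. The sketch below does that with a linear gap cost; the function names and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are assumptions for illustration only.

```python
def local_alignment_score(x, y, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of x and y (linear gap cost)."""
    prev = [0] * (len(y) + 1)
    best = 0
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            # a local alignment may restart anywhere, hence the floor of 0
            cur[j] = max(0, prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best

def tandem_repeat_score_naive(p, s, **scoring):
    """Score of the best imperfect tandem repeat of unit p inside s,
    obtained by aligning s against p concatenated len(s) times."""
    return local_alignment_score(p * len(s), s, **scoring)
```

Because p^n has length mn, this naive version fills a table of size mn × n, which is what the wraparound method on the following slides avoids.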

  10. Local alignment [Figure: local-alignment matrix of S against repeated copies of P, with boundary cells initialized to 0]

  11. Wraparound Method O(mn) • When aligning a sequence with tandem repeats, use the ‘wraparound’ method to reduce the amount of computation. • When implementing the wraparound method, treat the section with tandem repeats separately. • Write the repeated sequence only once in the similarity matrix. • Align as usual, except that on reaching the end of the repeated sequence, the value there is used as the first value in the next row; repeat this procedure for every row.
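The wraparound idea can be sketched for the simpler case of a linear gap cost. In this hypothetical version each row corresponds to one character of the search string and each column to one position of the single written copy of the repeat unit; column 0 reads its diagonal and left neighbours from the last column, which is the wraparound step. The function name and scoring values are assumptions, and horizontal (gap) moves are propagated around the cycle twice so that a gap may cross the end of the unit.

```python
def wraparound_local_score(p, s, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of s against p repeated indefinitely,
    using one copy of p per row: O(len(p) * len(s)) time."""
    m = len(p)
    prev = [0] * m
    best = 0
    for ch in s:
        cur = [0] * m
        # Pass 1: diagonal and vertical moves. prev[j - 1] with j == 0 is
        # prev[-1], the last column of the previous row (the wraparound).
        for j in range(m):
            sub = match if ch == p[j] else mismatch
            cur[j] = max(0, prev[j - 1] + sub, prev[j] + gap)
        # Pass 2: horizontal moves within the row, swept twice so that
        # values from the end of the unit can flow back to its start.
        for _ in range(2):
            for j in range(m):
                cur[j] = max(cur[j], cur[j - 1] + gap)
        best = max(best, max(cur))
        prev = cur
    return best
```

With these scores, `wraparound_local_score("TTAGGG", "TTAGGG" + "TTGGG" + "TTAGGG")` spans all three units and pays for one gap where the second copy has lost its A.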

  12. Wraparound Method

  13. Wraparound Algorithm • When developing a dynamic programming implementation of the wraparound algorithm, there is a problem with determining the Q matrix. • In order to define Q_{i,1}, it is necessary to know Q_{i,|b|}. • Hence, two passes over each row are needed to determine Q correctly.
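The circular dependency in Q can be illustrated on a single row. In the sketch below, `base` is assumed to hold the row's values computed without any horizontal gap (the diagonal and P contributions only), and the gap cost is g(k) = alpha + k*beta; the function name and parameter names are hypothetical. Sweeping the row twice lets the value in the last column reach the first column, which is the two-pass idea on this slide.

```python
import math

def wrap_row_affine(base, alpha, beta):
    """Fill D and Q for one row of a wraparound distance matrix.

    base[j] is the best distance into column j without a horizontal gap.
    Returns the final row D, where a horizontal gap of length k costs
    alpha + k * beta and may wrap from the last column to the first.
    """
    m = len(base)
    D = list(base)
    Q = [math.inf] * m
    for _ in range(2):                 # two passes resolve the cycle
        for j in range(m):
            # index j - 1 is -1 for j == 0, i.e. the last column
            Q[j] = min(D[j - 1] + alpha + beta, Q[j - 1] + beta)
            D[j] = min(base[j], Q[j])
    return D
```

A full loop around the unit only adds cost, so no useful gap needs more than the two sweeps shown here.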

  14. Wraparound Method

  15. Wraparound Method

  16. Wraparound Method

  17. Cyclic global alignment O(n^2 m) • Given sequences X and Y, find the best-scoring alignment of X[i] vs. Y over all possible i, 1 ≤ i ≤ |X|, where all of Y and exactly one whole (cyclically permuted) copy of X must occur in the alignment.
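The brute-force bound on this slide corresponds to running one global alignment per cyclic shift of X. The sketch below assumes that X[i] means X rotated to start at position i (0-based here), uses a linear gap cost, and picks illustrative scores; the function names are hypothetical.

```python
def global_score(x, y, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with a linear gap cost."""
    prev = [j * gap for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        cur = [i * gap] + [0] * len(y)
        for j in range(1, len(y) + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[len(y)]

def cyclic_global_alignment(x, y, **scoring):
    """Try every rotation of a non-empty x against y.

    Returns (best_score, shift), where shift is the 0-based start of the
    best rotation; ties keep the smallest shift.
    """
    best_score, best_shift = None, None
    for i in range(len(x)):
        rotated = x[i:] + x[:i]
        score = global_score(rotated, y, **scoring)
        if best_score is None or score > best_score:
            best_score, best_shift = score, i
    return best_score, best_shift
```

Each of the |X| rotations costs one full dynamic programming table, which is the overhead the Maes algorithm on the next slide reduces.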

  18. The Maes algorithm for cyclic global alignment O(nm log n)

  19. Non-crossing alignments

  20. Tandem Cyclic Alignment X* Y

  21. An example

  22. No alignment crosses “the same” alignment more than once

  23. Proof

  24. O(nm log n) [Figure: Y aligned against repeated copies of X, with copies C-1, C, and C+1 marked]

  25. Bounded wraparound dynamic programming
