
Sequence Alignment I: Dot Matrices



Presentation Transcript


  1. Sequence Alignment I: Dot Matrices

  2. Reading • Mount, Chapters 1, 2, and 3 (up to page 94)

  3. Why compare sequences? • To find whether two (or more) genes or proteins are evolutionarily related to each other • To find structurally or functionally similar regions within proteins

  4. Similar genes arise by gene duplication • Copy of a gene inserted next to the original • Two copies mutate independently • Each can take on separate functions • All or part can be transferred from one part of genome to another

  5. Sequence Comparison Methods • Dot matrix analysis • Dynamic Programming • Word or k-tuple methods (FASTA and BLAST)

  6. Dot matrices • [Figure: example dot matrix; sequence letters shown on the axes: c g g a c a c a c g]

  7. Dot matrix comparison
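
To make the basic idea concrete, here is a minimal sketch of a dot matrix in Python: one sequence runs down the side, the other across the top, and a mark is placed wherever the two characters are identical. The function names `dot_matrix` and `render` and the two short sequences are illustrative assumptions, not taken from the slides or from Mount.

```python
def dot_matrix(seq_a, seq_b):
    """Return a grid where cell [i][j] is True when seq_a[i] == seq_b[j]."""
    return [[a == b for b in seq_b] for a in seq_a]


def render(seq_a, seq_b):
    """Draw the matrix as text: seq_b across the top, seq_a down the side."""
    grid = dot_matrix(seq_a, seq_b)
    lines = ["  " + " ".join(seq_b)]
    for a, row in zip(seq_a, grid):
        # '*' marks a match (a dot), '.' marks an empty cell
        lines.append(a + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Illustrative sequences only (hypothetical example input)
print(render("acacg", "cggac"))
```

Every identical pair gets a mark, so even unrelated sequences produce many scattered dots; only diagonal runs of marks are meaningful.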

  8. Interpretation • Regions of similarity appear as diagonal runs of dots • Reverse diagonals (perpendicular to diagonal) indicate inversions • Reverse diagonals crossing diagonals (Xs) indicate palindromes
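
As a rough sketch of how diagonal runs could be picked out automatically, the function below walks each forward diagonal of the matrix and reports maximal runs of consecutive matches. The name `diagonal_runs` and the `min_len` cutoff are assumptions made for illustration. Reversing one of the sequences before calling it turns reverse diagonals into forward ones, so the same function can look for those as well.

```python
def diagonal_runs(seq_a, seq_b, min_len=3):
    """Return (start_i, start_j, length) for each maximal run of matches
    along a forward diagonal that is at least min_len long."""
    runs = []
    n, m = len(seq_a), len(seq_b)
    # offset = j - i identifies one diagonal of the n x m matrix
    for offset in range(-(n - 1), m):
        i = max(0, -offset)
        j = i + offset
        run = 0
        while i < n and j < m:
            if seq_a[i] == seq_b[j]:
                run += 1
            else:
                if run >= min_len:
                    runs.append((i - run, j - run, run))
                run = 0
            i += 1
            j += 1
        # close a run that reaches the edge of the matrix
        if run >= min_len:
            runs.append((i - run, j - run, run))
    return runs


# Hypothetical example: "abcd" appears at position 2 of the first string
print(diagonal_runs("xxabcdyy", "abcd"))
# Reverse diagonals: compare against the reversed second sequence
print(diagonal_runs("abcd", "dcba"[::-1]))
```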

  9. Interpretation • Separate diagonals can be linked to form an alignment with gaps • Each amino acid or base can be used only once • The path cannot double back • Each vertical or horizontal skip introduces a gap

  10. Filtering • Dot matrices for long sequences can be noisy because of insignificant matches • Solution: use a window and a threshold • Compare character by character within a window (the window size must be chosen) • Require a certain fraction of matches within the window before displaying a dot

  11. Dot plot comparison using windows • Window size = 11 • Stringency = 7 (put a dot only if at least 7 of the next 11 positions are identical)
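
The window-and-stringency filter can be sketched as follows. For each pair of starting positions, the code counts identical characters over the next `window` positions and records a dot only when the count reaches `stringency`. The function name `windowed_dots` is hypothetical, and placing the dot at the start of the window is an assumption here; a program could equally place it at the window's center.

```python
def windowed_dots(seq_a, seq_b, window=11, stringency=7):
    """Return (i, j) pairs where at least `stringency` of the `window`
    positions starting at seq_a[i] and seq_b[j] are identical."""
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= stringency:
                dots.append((i, j))
    return dots


# With window=1 and stringency=1 this reduces to the unfiltered dot matrix.
print(windowed_dots("ac", "ca", window=1, stringency=1))
```

This direct version does work proportional to the window size for every cell; it is written for clarity rather than speed.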

  12. Uses for dot matrices • Aligning two proteins or two nucleic acid sequences • Finding amino acid repeats within a protein by comparing a protein sequence to itself • Repeats appear as a set of diagonal runs stacked vertically and/or horizontally

  13. Repeats • Human LDL receptor protein sequence (GenBank P01130) • Window W = 1, stringency S = 1 (Mount, Fig. 3.6)

  14. Repeats • Window W = 23, stringency S = 7 (Mount, Fig. 3.6)
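
A self-comparison for repeats can be sketched in the same style. Comparing a sequence to itself always produces the main diagonal, so the code below looks only above it and tallies how many dots fall on each off-diagonal; an off-diagonal with many dots suggests a repeat at that spacing. The function name `repeat_offsets`, its default parameters, and the toy sequence are assumptions for illustration and are not the LDL receptor data from the figure.

```python
from collections import Counter


def repeat_offsets(seq, window=4, stringency=4):
    """Compare seq to itself and count dots per off-diagonal offset (j - i)."""
    counts = Counter()
    last_start = len(seq) - window + 1
    for i in range(last_start):
        # j > i skips the trivial main diagonal and the mirrored lower half
        for j in range(i + 1, last_start):
            matches = sum(seq[i + k] == seq[j + k] for k in range(window))
            if matches >= stringency:
                counts[j - i] += 1
    return counts


# Hypothetical toy sequence: "MKTAYIA" occurs at positions 0 and 10
toy = "MKTAYIAGGSMKTAYIAPPL"
print(repeat_offsets(toy).most_common(1))
```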

  15. Using substitution matrices • Dots can have weights • Some matches are rewarded more than others, depending on likelihood • Use PAM or BLOSUM matrix (more on these later) • Put a dot only if a minimum total or average weight is achieved • See Mount, Fig. 3.5
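
Weighted dots can be sketched by replacing the identity test with a score lookup and thresholding the window total. The score table below is a small illustrative stand-in, not a real PAM or BLOSUM table, and the names `TOY_SCORES`, `MISMATCH`, `pair_score`, and `weighted_dots` are hypothetical.

```python
# Illustrative toy scores only; a real analysis would load a full
# PAM or BLOSUM table instead.
TOY_SCORES = {
    ("L", "L"): 4, ("I", "I"): 4, ("L", "I"): 2,
    ("K", "K"): 5, ("R", "R"): 5, ("K", "R"): 2,
    ("W", "W"): 11,
}
MISMATCH = -1  # assumed score for any pair missing from the toy table


def pair_score(a, b):
    """Look up a symmetric score for residues a and b."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), MISMATCH))


def weighted_dots(seq_a, seq_b, window=3, min_total=8):
    """Return (i, j) pairs whose summed window score is at least min_total."""
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            total = sum(pair_score(seq_a[i + k], seq_b[j + k])
                        for k in range(window))
            if total >= min_total:
                dots.append((i, j))
    return dots


# "LKW" and "IRW" share only one identical residue but score well overall
print(weighted_dots("LKW", "IRW"))
```

Thresholding on the total, as here, and thresholding on the average are equivalent for a fixed window size, since the average is just the total divided by `window`.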
