Basic String Alignment

Basic String Alignment. Topics: probability theory and statistics, the string alignment problem, and basic string alignment algorithms. Author: Roel Wijgers, email: rwijgers@cs.uu.nl. Probability theory. Conditional probability: P(A|B) = P(A ∧ B) / P(B). A and B are independent when P(A ∧ B) = P(A)P(B).


Presentation Transcript


  1. Basic String Alignment
     • Probability theory and statistics
     • String alignment problem
     • Basic string alignment algorithms
     Author: Roel Wijgers, email: rwijgers@cs.uu.nl

  2. Probability Theory
     • Conditional probability: P(A|B) = P(A ∧ B) / P(B)
     • Independence of A and B: when P(A ∧ B) = P(A)P(B)
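
The two definitions on this slide can be checked numerically. The sketch below uses an assumed toy sample space (two fair dice) that is not part of the slides; the helper names `prob`, `cond_prob` and `independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed toy sample space: all ordered outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate over outcomes."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B); only defined when P(B) > 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(lambda o: a(o) and b(o)) / p_b

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def sum_is_twelve(o):
    return sum(o) == 12
```

Exact fractions are used so that the independence test is an exact equality rather than a floating-point comparison.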

  3. String Alignment
     • No gaps allowed
     • Gaps allowed in one of the strings
     • Gaps allowed in both strings

  4. Matching models
     The random model: each letter a occurs independently with some frequency q_a. This means that the probability of two sequences x and y is the product of the frequencies of all their letters.
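
As a minimal sketch of the random model, the probability of x and y is the product of q_a over every letter of both sequences. The function name and the uniform DNA frequencies below are assumptions made for illustration, not values from the slides.

```python
import math

def random_model_probability(x, y, q):
    """Product of the letter frequencies q[a] over every letter of x and of y."""
    return math.prod(q[a] for a in x) * math.prod(q[b] for b in y)

# Hypothetical uniform frequencies over a four-letter DNA alphabet.
q = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
p_xy = random_model_probability("ACG", "AT", q)
```

With uniform frequencies the result depends only on the total number of letters, which shows why the random model by itself cannot tell related sequences from unrelated ones.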

  5. Matching models (2)
     Independence between the values x_i and y_j is not very useful, so an odds ratio is used instead.
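
A common way to write such an odds ratio for an ungapped alignment of two equal-length sequences is given below. It assumes a match model M in which aligned letters a and b occur with joint probability p_{ab}, compared against the random model R with letter frequencies q_a. The names M, R and p_{ab} are assumptions; the slide text does not state them.

```latex
\frac{P(x, y \mid M)}{P(x, y \mid R)}
  = \frac{\prod_i p_{x_i y_i}}{\prod_i q_{x_i} \prod_i q_{y_i}}
  = \prod_i \frac{p_{x_i y_i}}{q_{x_i}\, q_{y_i}}
```

A ratio above 1 favours the match model, a ratio below 1 favours the random model.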

  6. Matching models (3)
     We would rather have an additive scoring system, obtained by taking the logarithm of the odds ratio. This score is called the log-odds ratio; the score of a single aligned pair associated with it is the log-likelihood ratio.
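
A minimal sketch of an additive log-odds score, assuming the pair score s(a, b) = log(p_ab / (q_a q_b)) and an ungapped alignment. The joint probabilities `p`, the frequencies `q` and the two-letter alphabet are hypothetical numbers chosen only to make the example runnable.

```python
import math

def log_odds_score(a, b, p, q):
    """Log-likelihood ratio of one aligned pair: log(p_ab / (q_a * q_b))."""
    return math.log(p[(a, b)] / (q[a] * q[b]))

def alignment_score(x, y, p, q):
    """Sum of the pair scores over an ungapped alignment of x and y."""
    if len(x) != len(y):
        raise ValueError("an ungapped alignment needs equal-length sequences")
    return sum(log_odds_score(a, b, p, q) for a, b in zip(x, y))

# Hypothetical two-letter alphabet; the numbers are illustrative only.
q = {"A": 0.5, "B": 0.5}
p = {("A", "A"): 0.4, ("B", "B"): 0.4, ("A", "B"): 0.1, ("B", "A"): 0.1}
```

Pairs that are more likely under the match model than by chance get a positive score, and unlikely pairs get a negative one, so the total can be accumulated one position at a time.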

  7. Log likelihood table

  8. Gap penalties
     Gaps are expected to be penalised. Different functions can be used for this, although the linear function is the most common.
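
Two gap-penalty functions can be sketched as follows. The linear form is the one the slide names; the affine form is included as an assumed example of a different function. The parameter names `d` and `e` and their default values are hypothetical.

```python
def linear_gap_penalty(g, d=8):
    """Linear penalty: every gap position costs d, so a gap of length g scores -g * d."""
    return -g * d

def affine_gap_penalty(g, d=12, e=2):
    """Affine penalty: opening a gap costs d, each further position costs e."""
    if g < 1:
        return 0
    return -d - (g - 1) * e
```

With these defaults a gap of length 3 scores -24 under the linear function and -16 under the affine one, so the affine form is more tolerant of long gaps once a gap has been opened.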

  9. Gap penalties (2)
     Where f(g) is a geometric distribution over the gap length g.
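
One way to see how a geometric f(g) relates to a gap cost that grows with length is sketched below. The parametrisation is an assumption; p is a hypothetical name for the probability of extending a gap by one position.

```latex
\begin{align}
f(g) &= (1 - p)\, p^{\,g-1}, \qquad g \ge 1,\; 0 < p < 1 \\
\log f(g) &= \log(1 - p) + (g - 1) \log p
\end{align}
```

Since log p < 0, the log-probability falls by the same amount for every extra gap position, which matches a penalty that grows linearly with gap length.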

  10. Alignment algorithms

  11. Global alignment: Needleman-Wunsch algorithm
      Find the optimal global alignment between two sequences, allowing gaps.

  12. Global alignment: Needleman-Wunsch algorithm (2)
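
A minimal sketch of Needleman-Wunsch with a linear gap cost, written as a standard dynamic-programming table plus traceback. The function names, the `simple_score` substitution function (+1 match, -1 mismatch) and the gap cost `d` are assumptions for illustration, not the parameters used in the slides.

```python
def needleman_wunsch(x, y, score, d):
    """Return (best score, aligned x, aligned y) for a global alignment with linear gap cost d."""
    n, m = len(x), len(y)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against nothing costs one gap per letter.
    for i in range(1, n + 1):
        F[i][0] = -i * d
    for j in range(1, m + 1):
        F[0][j] = -j * d
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),  # align x_i with y_j
                F[i - 1][j] - d,                              # x_i against a gap
                F[i][j - 1] - d,                              # y_j against a gap
            )
    # Traceback from the bottom-right corner to the origin.
    ax, ay = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score(x[i - 1], y[j - 1]):
            ax.append(x[i - 1])
            ay.append(y[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1])
            ay.append("-")
            i -= 1
        else:
            ax.append("-")
            ay.append(y[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(ax)), "".join(reversed(ay))

def simple_score(a, b):
    """Hypothetical substitution score: +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1
```

The traceback compares table entries for exact equality, which is safe with integer scores; with floating-point log-odds scores it would be more robust to store a pointer for each cell instead.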

  13. Local alignment: Smith-Waterman algorithm
      Find the best alignment between subsequences of x and y.

  14. Local alignment: Smith-Waterman algorithm
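
A minimal sketch of Smith-Waterman with a linear gap cost. It differs from the global case in two ways: every cell may restart at 0, and the traceback starts at the highest-scoring cell and stops at the first 0. The scoring function and gap cost are hypothetical.

```python
def smith_waterman(x, y, score, d):
    """Return (best score, aligned part of x, aligned part of y) for a local alignment."""
    n, m = len(x), len(y)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                0,                                            # start a new local alignment
                F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                F[i - 1][j] - d,
                F[i][j - 1] - d,
            )
            if F[i][j] > best:
                best, best_pos = F[i][j], (i, j)
    # Traceback from the best cell until a cell with score 0 is reached.
    ax, ay = [], []
    i, j = best_pos
    while i > 0 and j > 0 and F[i][j] > 0:
        if F[i][j] == F[i - 1][j - 1] + score(x[i - 1], y[j - 1]):
            ax.append(x[i - 1])
            ay.append(y[j - 1])
            i, j = i - 1, j - 1
        elif F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1])
            ay.append("-")
            i -= 1
        else:
            ax.append("-")
            ay.append(y[j - 1])
            j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))

def match_score(a, b):
    """Hypothetical substitution score: +2 for a match, -1 for a mismatch."""
    return 2 if a == b else -1
```

For the scheme to find local matches, the expected score of a random pair has to be negative; otherwise long random alignments would outscore the real local ones.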

  15. Repeated Matches
      Search for multiple local matches.
      • One of the sequences is fixed and contains the domain or motif.
      • We have some threshold T to exclude short local alignments.

  16. Repeated Matches (2)
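
A sketch of one standard dynamic-programming formulation of repeated matching, assuming y is the fixed motif and x is the sequence searched for copies of it. Column 0 holds the total of all matches closed so far, and a match is only closed when its score exceeds the threshold T. The function returns only the total score (the sum of match scores minus T per match), not the matches themselves; all names and parameter values are hypothetical.

```python
def repeated_matches_score(x, y, score, d, T):
    """Total score of non-overlapping matches of motif y in x, each reduced by threshold T."""
    n, m = len(x), len(y)
    if m == 0:
        return 0
    F = [[0] * (m + 1) for _ in range(n + 2)]  # one extra row to close the final match
    for i in range(1, n + 2):
        # Either leave the previous letter of x unmatched, or close a match that ended there.
        F[i][0] = max(F[i - 1][0], max(F[i - 1][j] - T for j in range(1, m + 1)))
        if i == n + 1:
            break
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i][0],                                      # no open match at this letter
                F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                F[i - 1][j] - d,
                F[i][j - 1] - d,
            )
    return F[n + 1][0]

def motif_score(a, b):
    """Hypothetical substitution score: +2 for a match, -1 for a mismatch."""
    return 2 if a == b else -1
```

In the sequence `ACGTNNACGT` with motif `ACGT`, each copy scores 8 and contributes 8 - T to the total; raising T above 8 excludes both copies and the total drops to 0.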

  17. Overlap matches
      We expect that one of the sequences contains the other, or that they overlap.

  18. Overlap matches (2)
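
A minimal sketch of overlap matching, assuming the usual variant of the global recurrence in which overhanging ends are free: the first row and column are 0, and the best score is read from the last row or last column. Only the score is returned; the scoring function and gap cost are hypothetical.

```python
def overlap_score(x, y, score, d):
    """Best score of an alignment that starts on the top or left edge and ends on the bottom or right edge."""
    n, m = len(x), len(y)
    # First row and column stay 0, so leading overhang is not penalised.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                F[i - 1][j] - d,
                F[i][j - 1] - d,
            )
    # Trailing overhang is free too: take the maximum over the last row and last column.
    last_column = max(F[i][m] for i in range(n + 1))
    last_row = max(F[n][j] for j in range(m + 1))
    return max(last_column, last_row)

def pair_score(a, b):
    """Hypothetical substitution score: +2 for a match, -1 for a mismatch."""
    return 2 if a == b else -1
```

The same function covers both cases named on the slide: a suffix of one sequence overlapping a prefix of the other, and one sequence contained entirely in the other.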

  19. Questions
