
MAT 4830 Mathematical Modeling



Presentation Transcript


  1. MAT 4830 Mathematical Modeling 4.5 Phylogenetic Distances I http://myhome.spu.edu/lauw

  2. Preview • Phylogenetic: of or relating to the evolutionary development of organisms • Estimate the total number of mutations (observed and hidden).

  3. Example from 4.1 S0 : Ancestral sequence S1 : Descendant of S0 S2 : Descendant of S1

  4. Example from 4.1 S0 : Ancestral sequence S1 : Descendant of S0 S2 : Descendant of S1 Observed mutations: 2

  5. Example from 4.1 S0 : Ancestral sequence S1 : Descendant of S0 S2 : Descendant of S1 Actual mutations: 5

  6. Example from 4.1 S0 : Ancestral sequence S1 : Descendant of S0 S2 : Descendant of S1 Actual mutations: 5, (some are hidden mutations)

  7. Distance of Two Sequences • We want to define the “distance” between two sequences. • It measures the average no. of mutations per site that occurred, including the hidden ones.

  8. Distance of Two Sequences • Let d(S0,S) be the distance between sequences S0 and S. What properties “should” it have? 1. 2. 3.

  9. Jukes-Cantor Model • Assume α is small. • Mutations per time step are “rare”.

  10. Jukes-Cantor Model • q(t)=conditional prob. that the base at time t is the same as the base at time 0

  11. Jukes-Cantor Model • q(t)=fraction of sites with no observed mutations

  12. Jukes-Cantor Model • p(t)=1-q(t)=fraction of sites with observed mutations

  13. Jukes-Cantor Model • p(t)=1-q(t)=fraction of sites with observed mutations

  14. Jukes-Cantor Model • p can be estimated from the two sequences

  15. Example from 4.1 Observed mutations: 2
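Estimating p from two aligned sequences amounts to counting the sites where they differ and dividing by the sequence length. A minimal Python sketch follows; the function name `observed_fraction` and the two 10-base sequences are made up for illustration and are not the sequences from Example 4.1.

```python
def observed_fraction(s0: str, s1: str) -> float:
    """Fraction of sites at which two aligned, equal-length sequences differ."""
    if len(s0) != len(s1):
        raise ValueError("sequences must have the same length")
    if not s0:
        raise ValueError("sequences must be non-empty")
    # Each differing site counts as one observed mutation.
    differing = sum(a != b for a, b in zip(s0, s1))
    return differing / len(s0)


# Hypothetical sequences (assumption, not taken from the slides).
ancestor = "ACGTACGTAC"
descendant = "ACGAACGTCC"
p_hat = observed_fraction(ancestor, descendant)
print(p_hat)
```

This estimate only sees sites that differ between the two sequences; a site that mutated and then mutated back is counted as unchanged, which is why p undercounts the actual mutations.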

  16. Jukes-Cantor Distance • Given p (and t), the J-C distance between two sequences S0 and S1 is defined as d_JC(S0,S1) = -(3/4) ln(1 - (4/3) p)

  17. Jukes-Cantor Distance • Given p (and t), the J-C distance between two sequences S0 and S1 is defined as d_JC(S0,S1) = -(3/4) ln(1 - (4/3) p) Why?

  18. Jukes-Cantor Distance

  19. Jukes-Cantor Distance

  20. Jukes-Cantor Distance
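The derivation on these slides did not survive in the transcript. A sketch of the standard argument is given here, assuming the usual Jukes-Cantor transition matrix with diagonal entries 1-α and off-diagonal entries α/3 (this parameterization is an assumption; the slides only state that α is small). It is consistent with the figures in Example 4.3, where p = 0.275 gives 0.3426.

```latex
% Probability that a site shows the same base at time t as at time 0
q(t) = \frac{1}{4} + \frac{3}{4}\left(1 - \frac{4}{3}\alpha\right)^{t}

% Fraction of sites with an observed mutation
p(t) = 1 - q(t) = \frac{3}{4} - \frac{3}{4}\left(1 - \frac{4}{3}\alpha\right)^{t}

% Solve for t
\left(1 - \frac{4}{3}\alpha\right)^{t} = 1 - \frac{4}{3}p
\quad\Longrightarrow\quad
t = \frac{\ln\left(1 - \frac{4}{3}p\right)}{\ln\left(1 - \frac{4}{3}\alpha\right)}

% For small alpha, ln(1 - 4 alpha / 3) is approximately -4 alpha / 3, so the
% expected number of mutations per site, alpha * t, is approximately
d_{JC} = \alpha t \approx -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)
```

The product αt is the expected number of mutations per site over t steps, hidden ones included, which is the quantity the distance is meant to measure.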

  21. Example from 4.3 Suppose the 40-base ancestral and descendant DNA sequences are

  22. Example from 4.3 Suppose the 40-base ancestral and descendant DNA sequences are

  23. Example from 4.3 0.275 observed sub. per site. 0.3426 sub. estimated per site.

  24. Example from 4.3 11 observed sub. 13.7 sub. estimated.
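The arithmetic of this example can be checked with a short Python sketch. It assumes the standard Jukes-Cantor formula d = -(3/4) ln(1 - (4/3) p); the function name `jc_distance` is chosen here for illustration.

```python
import math


def jc_distance(p: float) -> float:
    """Jukes-Cantor distance for an observed fraction p of differing sites."""
    if not 0 <= p < 0.75:
        # The logarithm is undefined once p reaches 3/4.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


n_sites = 40      # sequence length in Example 4.3
n_observed = 11   # observed substitutions in Example 4.3
p = n_observed / n_sites   # observed substitutions per site
d = jc_distance(p)         # estimated substitutions per site
print(round(p, 3), round(d, 4), round(d * n_sites, 1))
```

Multiplying the per-site distance by the 40 sites gives the estimated total number of substitutions, which exceeds the 11 observed because some mutations are hidden.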

  25. Performance of JC distance (Homework Problem 4) • Write a program to simulate the mutations of a sequence for t time steps using the Jukes-Cantor model with parameter α. • Count the number of base substitutions that occurred. • Compute the Jukes-Cantor distance between the initial and final sequences. • Compare the actual number of base substitutions with the estimate from the Jukes-Cantor distance.
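One possible shape for such a simulation is sketched below in Python rather than Maple, so it illustrates the steps but is not the course's intended solution. It assumes that at each time step every site mutates independently with probability α, and that a mutating site changes to one of the other three bases with equal probability. The names `simulate_jc` and `jc_distance`, the seed, and the parameter values are all illustrative choices.

```python
import math
import random

BASES = "ACGT"


def simulate_jc(seq, alpha, t, rng):
    """Mutate seq for t time steps; return (final sequence, substitutions made)."""
    current = list(seq)
    count = 0
    for _ in range(t):
        for i, base in enumerate(current):
            # Each site mutates with probability alpha per time step.
            if rng.random() < alpha:
                # Change to one of the three other bases, equally likely.
                current[i] = rng.choice([b for b in BASES if b != base])
                count += 1
    return "".join(current), count


def jc_distance(p):
    """Jukes-Cantor distance for an observed fraction p of differing sites."""
    return -0.75 * math.log(1 - 4 * p / 3)


rng = random.Random(4830)               # fixed seed so runs are repeatable
n_sites, alpha, t = 10000, 0.01, 20     # illustrative parameter values
start = "".join(rng.choice(BASES) for _ in range(n_sites))
end, actual = simulate_jc(start, alpha, t, rng)

# Observed fraction of differing sites between initial and final sequences.
p = sum(a != b for a, b in zip(start, end)) / n_sites
estimated = jc_distance(p) * n_sites
print("actual substitutions:", actual)
print("JC estimate:", round(estimated))
```

With α small and a long sequence, the two printed numbers should be close; shorter sequences or larger αt make the estimate noisier.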

  26. Maple: Strings Handling II • Concatenating two strings

  27. Maple: Strings Handling II • However, no “re-assignment”.

  28. Classwork • Work on HW #1, 2
