
Phylogenetic Tree Generation



Presentation Transcript


  1. Phylogenetic Tree Generation Brandon Andrews CS6030

  2. Topics • What is a phylogenetic tree? • Goals in a phylogenetic tree generator • Distance based method • Fitch-Margoliash Method • Example • Verification • Demo

  3. What is a phylogenetic tree? • Also known as an evolutionary tree • Attempts to map the genetic similarity of organisms onto a tree in which longer branches indicate greater dissimilarity • [Figure: tree with leaves A, B, and C] • B and C are similar • A and B are more similar than A and C, which are separated by a longer distance

  4. Goals in a phylogenetic tree generator • Given the sequences and their calculated or known dissimilarities, construct a tree that correctly represents this data • Naïve method: generate every possible tree and grade its quality
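
To give a sense of why the naïve method does not scale, here is a minimal sketch that counts the candidate trees. It assumes unrooted, strictly bifurcating trees with labeled leaves, for which the count is the standard result (2n - 5)!!; that formula and the function name are additions for illustration, not something taken from the slides.

```python
def count_unrooted_trees(n_leaves):
    """Number of distinct unrooted binary trees with n labeled leaves: (2n-5)!!."""
    if n_leaves < 3:
        return 1  # one or two leaves admit only a single topology
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


for n in (5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Five sequences give 15 candidate trees and ten give 2,027,025, so grading every tree quickly becomes impractical.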

  5. Distance based method • Take a distance matrix that stores the distance from every sequence to every other sequence • Construct a tree which preserves these distances • Most methods do not preserve the distances exactly

  6. Fitch-Margoliash Method • Clustering algorithm that works bottom up to create an unrooted tree • Weights are used to help lower the error rate for long paths

  7. Example • Calculate a distance matrix • Hamming distance can be used, but a better dissimilarity function is advised
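
As a rough sketch of the first step, the following builds a distance matrix from Hamming distances. The sequences are made up for illustration, and the function names are hypothetical; a real analysis would likely substitute a better dissimilarity function, as the slide advises.

```python
def hamming(a, b):
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs):
    """Symmetric matrix of pairwise Hamming distances."""
    return [[hamming(a, b) for b in seqs] for a in seqs]


# Made-up sequences, for illustration only.
seqs = ["ACGTAC", "ACGTTC", "TCGTTG"]
matrix = distance_matrix(seqs)
for row in matrix:
    print(row)
```

Hamming distance only works on sequences of equal length, which is one reason an alignment-aware measure is usually preferred.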

  8. Steps • Add all the sequences to an array of nodes and mark them as leaves • Select the closest pair of nodes by scanning the distance matrix • Those two nodes, D and E in our example, form two of the branches in a 3-branch calculation that finds the branch lengths; the third branch leads to the remaining nodes A, B, C • [Figure: three branches d, e, and abc joining D, E, and the group A, B, C] • dist(ABC, D) is the average distance from ABC to D • dist(ABC, E) is the average distance from ABC to E • d = (dist(D, E) + (dist(ABC, D) - dist(ABC, E))) / 2 • e = dist(D, E) - d • abc = dist(ABC, D) - d

  9. Steps Continued • dist(ABC, D) and dist(ABC, E) are calculated by averaging the distances from each of the elements A, B, and C to D and to E, respectively • d = (10 + (32.6… - 34.6…)) / 2 = 4 • e = 10 - 4 = 6 • abc = 32.6… - 4 = 28.6…
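
The 3-branch calculation can be sketched as follows. The leaf distances in `DIST` are an assumption: the slide's matrix did not survive in this transcript, so the values were reconstructed from the final branch lengths given later in the deck. The function names are hypothetical.

```python
from statistics import mean

# Assumed leaf-to-leaf distances (reconstructed, not copied from the slide).
DIST = {
    ("A", "D"): 39, ("B", "D"): 41, ("C", "D"): 18,
    ("A", "E"): 41, ("B", "E"): 43, ("C", "E"): 20,
    ("D", "E"): 10,
}


def avg_dist(group, leaf):
    """Average distance from every leaf in `group` to `leaf`."""
    return mean(DIST[(g, leaf)] for g in group)


def three_branch(d_xy, d_rest_x, d_rest_y):
    """Branch lengths (x, y, rest) for the closest pair x, y and everything else."""
    x = (d_xy + (d_rest_x - d_rest_y)) / 2
    y = d_xy - x
    rest = d_rest_x - x
    return x, y, rest


d, e, abc = three_branch(
    DIST[("D", "E")], avg_dist("ABC", "D"), avg_dist("ABC", "E")
)
print(round(d, 2), round(e, 2), round(abc, 2))
```

Under these assumed distances the sketch reproduces the slide's d = 4 and e = 6, with abc close to 28.67.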

  10. [Figure: D on a branch of length 4, E on a branch of length 6, and the group A, B, C on a branch of length 28.6…] • Now we can create a new node with distance 28.6… and set D and E to their respective distances • Since D and E are leaves, their distances are kept. If they were not leaves, the average of the child distances would be subtracted, as seen later

  11. Steps Continued • The final step in this iteration is to update the node array and the distance matrix • The new merged node DE is appended to the end of the node array, and D and E are removed • The distance matrix is updated to match: DE is added, and D and E are removed [Table: updated distance matrix]
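
A minimal sketch of this update step follows. It assumes the distance to the merged node is the plain average of the two rows being replaced, which agrees with the dist(C, DE) = 19 used later in the deck. The starting matrix is also an assumption, reconstructed from the deck's final branch lengths, and `merge_nodes` is a hypothetical name.

```python
def merge_nodes(nodes, dist, i, j):
    """Merge nodes[i] and nodes[j]; return the new node list and distance matrix."""
    keep = [k for k in range(len(nodes)) if k not in (i, j)]
    # The merged node is appended to the end; the originals are dropped.
    new_nodes = [nodes[k] for k in keep] + [nodes[i] + nodes[j]]
    # Distances among the surviving nodes are copied unchanged.
    new_dist = [[dist[a][b] for b in keep] for a in keep]
    # Assumption: distance to the merged node is the average of the two old rows.
    merged_row = [(dist[i][k] + dist[j][k]) / 2 for k in keep]
    for row, value in zip(new_dist, merged_row):
        row.append(value)
    new_dist.append(merged_row + [0.0])
    return new_nodes, new_dist


nodes = ["A", "B", "C", "D", "E"]
dist = [  # assumed values, reconstructed rather than copied from the slide
    [0, 22, 39, 39, 41],
    [22, 0, 41, 41, 43],
    [39, 41, 0, 18, 20],
    [39, 41, 18, 0, 10],
    [41, 43, 20, 10, 0],
]
nodes, dist = merge_nodes(nodes, dist, 3, 4)
print(nodes)
for row in dist:
    print(row)
```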

  12. Steps Continued • Look at the new distance matrix and find the closest pair, C and DE • Now there is a special step: C is a leaf, so it gets the calculated distance • DE is not a leaf, so we need to subtract the average child distance from DE • [Figure: three branches c, de, and ab joining C, DE, and the group A, B] • dist(AB, C) is the average distance from AB to C • dist(AB, DE) is the average distance from AB to DE • c = (dist(C, DE) + (dist(AB, C) - dist(AB, DE))) / 2 • de = dist(C, DE) - c (before subtracting the average child distance) • ab = dist(AB, C) - c

  13. Merging A and B to calculate the average distance to C and DE. • dist(AB, C) • dist(AB, DE)

  14. Steps Continued • Average child distance example [Figure: tree with branch lengths 1, 2, 3, 4, 5, and 6] • Recursively take the average of each pair of branches • ((5 + ((2 + (4 + 6) / 2) + 3) / 2) + 1) / 2 = 5.5
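
The recursion can be sketched as below. The tree representation (a leaf is `None`, an internal node is a pair of `(branch_length, child)` tuples) is a hypothetical choice, and the nesting of the example tree is inferred from the slide's expression rather than from its figure.

```python
def average_distance(node):
    """Average distance from `node` down to its leaves, averaged pairwise."""
    if node is None:  # a leaf contributes no further distance
        return 0.0
    (len_a, child_a), (len_b, child_b) = node
    return (
        (len_a + average_distance(child_a))
        + (len_b + average_distance(child_b))
    ) / 2


# Nesting inferred from ((5 + ((2 + (4 + 6) / 2) + 3) / 2) + 1) / 2.
inner = ((4, None), (6, None))
middle = ((2, inner), (3, None))
top = ((5, middle), (1, None))
print(average_distance(top))  # 5.5
```

Each level averages its two branches only after adding the child's own average, so deeper subtrees are not weighted by their number of leaves.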

  15. Steps Continued • DE has two child nodes, so we need to subtract the average distance of its children • Since both children of DE are leaves, we compute: • (4 + 6) / 2 = 5 • So now we calculate c, de, and ab: • c = (dist(C, DE) + (dist(AB, C) - dist(AB, DE))) / 2 = (19 + (40 - 41)) / 2 = 9 • de = dist(C, DE) - c - AverageDistance(DE) = 19 - 9 - (4 + 6) / 2 = 5 • ab = dist(AB, C) - c = 40 - 9 = 31 • Notice that the distance at de replaces whatever was previously there
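
One way to fold the subtraction into the 3-branch calculation is sketched below, using the numbers quoted on this slide. The function name and the `avg_x`/`avg_y` parameters are hypothetical; they stand for the average child distance of each merged node and are 0 for a leaf.

```python
def adjusted_branches(d_xy, d_rest_x, d_rest_y, avg_x=0.0, avg_y=0.0):
    """3-branch lengths, with each merged node's average child distance removed."""
    x = (d_xy + (d_rest_x - d_rest_y)) / 2
    y = d_xy - x
    rest = d_rest_x - x
    # Leaves keep the raw length (avg = 0); merged nodes lose their subtree average.
    return x - avg_x, y - avg_y, rest


# dist(C, DE) = 19, dist(AB, C) = 40, dist(AB, DE) = 41, AverageDistance(DE) = 5
c, de, ab = adjusted_branches(19, 40, 41, avg_x=0.0, avg_y=(4 + 6) / 2)
print(c, de, ab)  # 9.0 5.0 31.0
```

The raw lengths are computed first and the averages subtracted afterwards, so the correction for DE does not change c or ab.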

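  16. Steps Continued • With the new node added: [Figure: C on a branch of length 9, the group A, B on a branch of length 31, and DE on a branch of length 5 leading to D (4) and E (6)] • Recalculated distance matrix: [Table: recalculated distance matrix]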

  17. Steps Continued • As before, choose the next closest nodes by looking at the distance matrix • A and B are chosen • a and b can be calculated directly since A and B are leaves, but notice that we are linking two trees at cde, so we need a special step to subtract the average distance • [Figure: three branches a, b, and cde joining A, B, and CDE] • dist(CDE, A) is the average distance from CDE to A • dist(CDE, B) is the average distance from CDE to B • a = (dist(A, B) + (dist(CDE, A) - dist(CDE, B))) / 2 = 10 • b = dist(A, B) - a = 12 • cde = dist(CDE, A) - a = 29.5

  18. [Figure: the tree before and after the correction; the cde branch of 29.5 becomes 20, with A = 10, B = 12, C = 9, DE = 5, D = 4, E = 6] • So cde = 29.5 - AverageDistance(CDE) • 29.5 - ((5 + (4 + 6) / 2) + 9) / 2 = 29.5 - 9.5 = 20

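  19. Steps Continued • So we have a completely defined unrooted tree. How do we root it? • Just take the last branch and divide it by two: here the branch of length 20 becomes two branches of length 10 [Figure: rooted tree with branch lengths A = 10, B = 12, C = 9, DE = 5, D = 4, E = 6, and 10 on each side of the root]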

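  20. Verification • Compare the original distance matrix with the distances measured from the generated tree [Tables: original matrix and tree-derived matrix] • In this example they match exactly • An exact match is rare; the tree distances are usually off by a small amount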
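
The check can be sketched by summing branch lengths along the path between each pair of leaves. The branch lengths are the ones derived in the earlier slides; the internal node names X, Y, and Z and the function names are hypothetical. Only two original distances are quoted in the transcript, dist(D, E) = 10 and dist(A, B) = 22 (from a + b), so the sketch compares against those.

```python
from itertools import combinations

# Unrooted tree from the worked example; X, Y, Z are made-up internal node names.
EDGES = [
    ("A", "X", 10), ("B", "X", 12), ("X", "Y", 20),
    ("C", "Y", 9), ("Y", "Z", 5), ("D", "Z", 4), ("E", "Z", 6),
]


def build_adjacency(edges):
    """Map each node to its (neighbor, branch length) pairs."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def path_length(adj, start, goal):
    """Sum of branch lengths on the unique path between two nodes of a tree."""
    stack = [(start, None, 0)]
    while stack:
        node, parent, total = stack.pop()
        if node == goal:
            return total
        for nxt, w in adj[node]:
            if nxt != parent:  # never walk back along the edge just used
                stack.append((nxt, node, total + w))
    raise ValueError(f"no path from {start} to {goal}")


adj = build_adjacency(EDGES)
tree_dist = {
    pair: path_length(adj, *pair) for pair in combinations("ABCDE", 2)
}
for pair, value in tree_dist.items():
    print(pair, value)
```

Comparing `tree_dist` entry by entry with the original matrix gives the per-pair error that the verification slide refers to.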

  21. Demo http://sirisian.com/javascript/CS6030Project.html

  22. Conclusion • Distance based methods such as the Fitch-Margoliash method quickly produce very accurate trees when given an accurate distance matrix

  23. References • Bacardit, J., Krasnogor, N. Phylogenetic Trees [PPT document]. Retrieved from http://www.cs.nott.ac.uk/~jqb/G53BIO/Slides/Phylogenetic%20Trees.ppt • Louhisuo, K. (2004, May 4). Constructing phylogenetic trees with UPGMA and Fitch-Margoliash. Retrieved from http://www.niksula.cs.hut.fi/~klouhisu/Bioinfo/phyltree.pdf
