
Distance matrix methods

Distance matrix methods calculate a measure of distance between each pair of species, then find a tree that predicts the observed set of distances.

gabby

Presentation Transcript


  1. Distance matrix methods calculate a measure of distance between each pair of species, then find a tree that predicts the observed set of distances.

  2. Branch lengths and times: in distance matrix methods, branch lengths reflect the expected amount of evolution in different branches of the tree. Branch length $= r_i \cdot t_i$, where $r_i$ is the rate of evolution and $t_i$ is the elapsed time.

  3. The least squares method: minimise the difference between the observed matrix of distances and the matrix of distances predicted by the tree. [Table: observed matrix]

  4. The least squares method. [Figure: tree with tips a, b, c, d, e and branch lengths 0.08, 0.10, 0.05, 0.06, 0.03, 0.07, 0.05; table: expected matrix]


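  6. The least squares method. [Figure: the same tree; an expected distance is the sum of the branch lengths on the path between two species, here 0.08 + 0.05 + 0.10; table: expected matrix]

  7. The least squares method. [Figure: tree with tips a, b, c, d, e and branch lengths 0.08, 0.10, 0.05, 0.06, 0.03, 0.07, 0.05; table: expected matrix]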

  8. The least squares method: Q is a measure of the discrepancy between the observed and the expected matrix. $Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (D_{ij} - d_{ij})^2$, where $D_{ij}$ is the observed distance between species i and j and $d_{ij}$ is the expected distance between species i and j.

  9. The least squares method: distances can be weighted or not. The weight $w_{ij}$ can be $1$, $1/D_{ij}^2$ or $1/D_{ij}$ in $Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (D_{ij} - d_{ij})^2$.
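
As a rough illustration of the criterion on slides 8 and 9, the sketch below evaluates Q for a pair of small matrices under the three weighting schemes. The function name `least_squares_q`, the string labels for the schemes, and the 3 x 3 example matrices are all made up for this example; they are not taken from the slides.

```python
def least_squares_q(observed, expected, weight="unweighted"):
    """Return Q = sum_i sum_j w_ij * (D_ij - d_ij)**2 over all pairs i != j."""
    n = len(observed)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the diagonal is zero in both matrices
            D, d = observed[i][j], expected[i][j]
            if weight == "unweighted":
                w = 1.0
            elif weight == "1/D":
                w = 1.0 / D
            elif weight == "1/D^2":
                w = 1.0 / (D * D)
            else:
                raise ValueError(f"unknown weight scheme: {weight}")
            q += w * (D - d) ** 2
    return q


# Hypothetical observed (D) and expected (d) matrices for three species.
observed = [[0.0, 0.2, 0.4],
            [0.2, 0.0, 0.5],
            [0.4, 0.5, 0.0]]
expected = [[0.0, 0.3, 0.4],
            [0.3, 0.0, 0.5],
            [0.4, 0.5, 0.0]]

for scheme in ("unweighted", "1/D", "1/D^2"):
    print(scheme, least_squares_q(observed, expected, scheme))
```

Only one pair differs between the two matrices here, so the three schemes differ only in how heavily that single discrepancy of 0.1 is counted.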

  10. The least squares method: $x_{ij,k}$ is a handy variable. $x_{ij,k} = 1$ if branch k is on the path between species i and j, and $x_{ij,k} = 0$ if branch k is not on the path between species i and j. [Figure: tree with tips a, b, c, d, e and branches v1 to v7]

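  11. The least squares method. [Figure: tree with tips a, b, c, d, e and branches v1 to v7] $x_{ab,1} = 1$

  12. The least squares method. [Figure: tree with tips a, b, c, d, e and branches v1 to v7] $x_{ab,1} = 1$, $x_{ab,7} = 1$

  13. The least squares method. [Figure: tree with tips a, b, c, d, e and branches v1 to v7] $x_{ab,1} = 1$, $x_{ab,7} = 1$, $x_{ab,3} = 0$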
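
The indicator variable can be sketched in code as a path lookup on the tree. The figure itself is not recoverable from the transcript, so the topology below is an assumption inferred from the path expansions and equations on slides 21 to 26; the internal node names `x`, `y`, `z` and the function names are invented for this example.

```python
# Assumed tree: branch number -> the two nodes it connects.
# Tips are "a".."e"; "x", "y", "z" are made-up labels for internal nodes.
BRANCHES = {
    1: ("a", "x"),
    2: ("b", "y"),
    3: ("c", "z"),
    4: ("d", "y"),
    5: ("e", "z"),
    6: ("x", "z"),
    7: ("x", "y"),
}


def path_branches(start, goal):
    """Return the set of branch numbers on the unique path from start to goal."""
    neighbours = {}
    for k, (u, v) in BRANCHES.items():
        neighbours.setdefault(u, []).append((v, k))
        neighbours.setdefault(v, []).append((u, k))
    stack = [(start, None, frozenset())]
    while stack:
        node, parent, used = stack.pop()
        if node == goal:
            return set(used)
        for nxt, k in neighbours[node]:
            if nxt != parent:  # a tree has no cycles, so never step back
                stack.append((nxt, node, used | {k}))
    raise ValueError(f"no path between {start} and {goal}")


def x(i, j, k):
    """Indicator x_ij,k: 1 if branch k lies on the path between i and j, else 0."""
    return 1 if k in path_branches(i, j) else 0


print(x("a", "b", 1), x("a", "b", 7), x("a", "b", 3))  # 1 1 0
```

With this assumed topology the three values agree with slides 11 to 13.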

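  14. The least squares method: rewrite $d_{ij}$, the expected values. $Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (D_{ij} - d_{ij})^2$ with $d_{ij} = \sum_k x_{ij,k} v_k$.

  15. The least squares method: $Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(D_{ij} - \sum_k x_{ij,k} v_k\right)^2$

  16. The least squares method: differentiate Q and equate the derivative to zero. $Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(D_{ij} - \sum_k x_{ij,k} v_k\right)^2$, so $\frac{dQ}{dv_k} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} x_{ij,k} \left(D_{ij} - \sum_k x_{ij,k} v_k\right)$.

  17. The least squares method, for the unweighted case: $\frac{dQ}{dv_k} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij,k} \left(D_{ij} - \sum_k x_{ij,k} v_k\right) = 0$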

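  18. The least squares method, written in full: $\frac{dQ}{dv_1} = -2 \sum_{i=1}^{n} \sum_{j: j \neq i} x_{ij,1} \left(D_{ij} - \sum_k x_{ij,k} v_k\right) = 0$. Listing each pair of species once (i = 1 with j = 2, 3, 4, 5; i = 2 with j = 3, 4, 5; i = 3 with j = 4, 5; i = 4 with j = 5):
      $$\begin{aligned} & x_{AB,1}(D_{AB} - \textstyle\sum_k x_{AB,k} v_k) + x_{AC,1}(D_{AC} - \textstyle\sum_k x_{AC,k} v_k) + x_{AD,1}(D_{AD} - \textstyle\sum_k x_{AD,k} v_k) + x_{AE,1}(D_{AE} - \textstyle\sum_k x_{AE,k} v_k) \\ & + x_{BC,1}(D_{BC} - \textstyle\sum_k x_{BC,k} v_k) + x_{BD,1}(D_{BD} - \textstyle\sum_k x_{BD,k} v_k) + x_{BE,1}(D_{BE} - \textstyle\sum_k x_{BE,k} v_k) \\ & + x_{CD,1}(D_{CD} - \textstyle\sum_k x_{CD,k} v_k) + x_{CE,1}(D_{CE} - \textstyle\sum_k x_{CE,k} v_k) + x_{DE,1}(D_{DE} - \textstyle\sum_k x_{DE,k} v_k) = 0 \end{aligned}$$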

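  19. The least squares method. [Figure: tree with tips a, b, c, d, e and branches v1 to v7]

  20. The least squares method: many terms are zero. $\frac{dQ}{dv_1} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij,1} \left(D_{ij} - \sum_k x_{ij,k} v_k\right) = 0$ becomes
      $$\begin{aligned} & 1 \cdot (D_{AB} - \textstyle\sum_k x_{AB,k} v_k) + 1 \cdot (D_{AC} - \textstyle\sum_k x_{AC,k} v_k) + 1 \cdot (D_{AD} - \textstyle\sum_k x_{AD,k} v_k) + 1 \cdot (D_{AE} - \textstyle\sum_k x_{AE,k} v_k) \\ & + 0 \cdot (D_{BC} - \textstyle\sum_k x_{BC,k} v_k) + 0 \cdot (D_{BD} - \textstyle\sum_k x_{BD,k} v_k) + 0 \cdot (D_{BE} - \textstyle\sum_k x_{BE,k} v_k) \\ & + 0 \cdot (D_{CD} - \textstyle\sum_k x_{CD,k} v_k) + 0 \cdot (D_{CE} - \textstyle\sum_k x_{CE,k} v_k) + 0 \cdot (D_{DE} - \textstyle\sum_k x_{DE,k} v_k) = 0 \end{aligned}$$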

  21. The least squares method: non-zero terms expanded. $(D_{AB} - \sum_k x_{AB,k} v_k) + (D_{AC} - \sum_k x_{AC,k} v_k) + (D_{AD} - \sum_k x_{AD,k} v_k) + (D_{AE} - \sum_k x_{AE,k} v_k) = 0$, where for the path between a and b, $\sum_k x_{AB,k} v_k = 1 \cdot v_1 + 1 \cdot v_2 + 0 \cdot v_3 + 0 \cdot v_4 + 0 \cdot v_5 + 0 \cdot v_6 + 1 \cdot v_7$. [Figure: tree with tips a, b, c, d, e and branches v1 to v7]

  22. The least squares method: likewise, for the path between a and c, $\sum_k x_{AC,k} v_k = 1 \cdot v_1 + 0 \cdot v_2 + 1 \cdot v_3 + 0 \cdot v_4 + 0 \cdot v_5 + 1 \cdot v_6 + 0 \cdot v_7$. [Figure: tree with tips a, b, c, d, e and branches v1 to v7]

  23. The least squares method: rearranging $(D_{AB} - \sum_k x_{AB,k} v_k) + (D_{AC} - \sum_k x_{AC,k} v_k) + (D_{AD} - \sum_k x_{AD,k} v_k) + (D_{AE} - \sum_k x_{AE,k} v_k) = 0$ gives $D_{AB} + D_{AC} + D_{AD} + D_{AE} - 4v_1 - v_2 - v_3 - v_4 - v_5 - 2v_6 - 2v_7 = 0$.

  24. The least squares method: equation for $v_1$: $D_{AB} + D_{AC} + D_{AD} + D_{AE} = 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7$

  25. The least squares method: mutatis mutandis for $v_2$.
      $$\begin{aligned} D_{AB} + D_{AC} + D_{AD} + D_{AE} &= 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7 && \text{(equation for } v_1\text{)} \\ D_{AB} + D_{BC} + D_{BD} + D_{BE} &= v_1 + 4v_2 + v_3 + v_4 + v_5 + 2v_6 + 3v_7 && \text{(equation for } v_2\text{)} \end{aligned}$$

  26. The least squares method: and all other branches.
      $$\begin{aligned} D_{AB} + D_{AC} + D_{AD} + D_{AE} &= 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7 && (v_1) \\ D_{AB} + D_{BC} + D_{BD} + D_{BE} &= v_1 + 4v_2 + v_3 + v_4 + v_5 + 2v_6 + 3v_7 && (v_2) \\ D_{AC} + D_{BC} + D_{CD} + D_{CE} &= v_1 + v_2 + 4v_3 + v_4 + v_5 + 3v_6 + 2v_7 && (v_3) \\ D_{AD} + D_{BD} + D_{CD} + D_{DE} &= v_1 + v_2 + v_3 + 4v_4 + v_5 + 2v_6 + 3v_7 && (v_4) \\ D_{AE} + D_{BE} + D_{CE} + D_{DE} &= v_1 + v_2 + v_3 + v_4 + 4v_5 + 3v_6 + 2v_7 && (v_5) \\ D_{AC} + D_{AE} + D_{BC} + D_{BE} + D_{CD} + D_{DE} &= 2v_1 + 2v_2 + 3v_3 + 2v_4 + 3v_5 + 6v_6 + 4v_7 && (v_6) \\ D_{AB} + D_{AD} + D_{BC} + D_{BE} + D_{CD} + D_{DE} &= 2v_1 + 3v_2 + 2v_3 + 3v_4 + 2v_5 + 4v_6 + 6v_7 && (v_7) \end{aligned}$$
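
The seven equations can be checked mechanically: with X the matrix of indicators $x_{ij,k}$ (one row per pair of species, one column per branch), they are the unweighted normal equations $X^T D = X^T X v$. The sketch below builds them from the path sets and solves them exactly. The `PATHS` table is an assumption read off the path expansions on the slides, the branch lengths in `v_true` are hypothetical, and all function names are invented for this example.

```python
from fractions import Fraction

# Assumed branches on each tip-to-tip path (inferred from the slides).
PATHS = {
    ("A", "B"): {1, 2, 7}, ("A", "C"): {1, 3, 6},
    ("A", "D"): {1, 4, 7}, ("A", "E"): {1, 5, 6},
    ("B", "C"): {2, 3, 6, 7}, ("B", "D"): {2, 4},
    ("B", "E"): {2, 5, 6, 7}, ("C", "D"): {3, 4, 6, 7},
    ("C", "E"): {3, 5}, ("D", "E"): {4, 5, 6, 7},
}
PAIRS = list(PATHS)
# Design matrix: X[row][k-1] = x_ij,k for the pair in that row.
X = [[1 if k in PATHS[p] else 0 for k in range(1, 8)] for p in PAIRS]


def normal_equations(D):
    """Return (X^T X, X^T D) for a list D of distances ordered like PAIRS."""
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(7)] for a in range(7)]
    XtD = [sum(r[a] * d for r, d in zip(X, D)) for a in range(7)]
    return XtX, XtD


def solve(A, b):
    """Solve A v = b by Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [row[-1] for row in M]


# Hypothetical branch lengths v1..v7, used to generate tree-additive distances.
v_true = [Fraction(n, 100) for n in (8, 10, 5, 7, 6, 5, 3)]
D = [sum(x * v for x, v in zip(row, v_true)) for row in X]

XtX, XtD = normal_equations(D)
print(XtX[0])  # coefficients of the equation for v1
v_hat = solve(XtX, XtD)
print(v_hat == v_true)
```

Because the distances here were generated from the tree itself, solving the normal equations recovers the branch lengths exactly; with real observed distances the solution is the least squares fit instead.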

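  27. The least squares method: solving linear equations with matrices. The system $x + 2y = 4$, $3x - 5y = 1$ has $A = \begin{pmatrix} 1 & 2 \\ 3 & -5 \end{pmatrix}$ and $B = \begin{pmatrix} 4 \\ 1 \end{pmatrix}$. Then $A^{-1} = \frac{1}{|A|} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} = \frac{1}{1 \cdot (-5) - 3 \cdot 2} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} = -\frac{1}{11} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix}$ and $X = A^{-1} B = -\frac{1}{11} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} \begin{pmatrix} 4 \\ 1 \end{pmatrix} = -\frac{1}{11} \begin{pmatrix} -22 \\ -11 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$.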
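
The same 2 x 2 example can be reproduced in a few lines. This is a minimal sketch that uses the explicit inverse formula from the slide with exact fractions; the function name `solve_2x2` is made up for this example.

```python
from fractions import Fraction


def solve_2x2(A, B):
    """Solve A X = B for a 2 x 2 matrix A using the explicit inverse."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    # Inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    inv = [[Fraction(d, det), Fraction(-b, det)],
           [Fraction(-c, det), Fraction(a, det)]]
    return [inv[0][0] * B[0] + inv[0][1] * B[1],
            inv[1][0] * B[0] + inv[1][1] * B[1]]


# x + 2y = 4 and 3x - 5y = 1, as on the slide.
x, y = solve_2x2([[1, 2], [3, -5]], [4, 1])
print(x, y)  # 2 1
```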

  28. Clustering algorithms: clustering methods have no criterion but apply algorithms to come up with trees.

  29. Clustering algorithms: UPGMA (Unweighted Pair Group Method with Arithmetic mean). UPGMA assumes that evolutionary rates are the same in all lineages and produces an ultrametric tree.

  30. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j.

  31. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. [Figure: seal and sealion joined, branch length 12]

  32. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. Lump i and j into a new group.

  33. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. Lump i and j into a new group. Compute distance between new group and all other groups (weight by the number of species in each group).


  37. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j.

  38. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. [Figure: bear and raccoon joined, branch length 13; seal and sealion joined, branch length 12]

  39. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. Lump i and j into a new group. Compute distance between new group and all other groups (weight by the number of species in each group).

  40. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j.

  41. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. [Figure: tree of raccoon, bear, sealion, seal; labels 13, 12, 18.75, 6.75, 5.75]

  42. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. Lump i and j into a new group. Compute distance between new group and all other groups (weight by the number of species in each group).

  43. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j.

  44. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. [Figure: tree of raccoon, bear, sealion, seal, weasel; labels 13, 12, 19.75, 6.75, 5.75]

  45. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. Lump i and j into a new group. Compute distance between new group and all other groups (weight by the number of species in each group). Example: $(4 \times 44.5 + 1 \times 51)/5$, with 4 species in BRSS (bear, raccoon, seal, sealion) and 1 species in weasel.
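
The size-weighted update in the last step can be written as a one-line helper. This is a small sketch using the numbers on the slide; the function name and argument names are made up for this example.

```python
def upgma_group_distance(d_i, n_i, d_j, n_j):
    """Distance from the merged group (i, j) to another group.

    d_i, d_j: distances from groups i and j to that other group.
    n_i, n_j: number of species in groups i and j.
    """
    return (n_i * d_i + n_j * d_j) / (n_i + n_j)


# 4 species in BRSS at distance 44.5, 1 species in weasel at distance 51.
d_new = upgma_group_distance(44.5, 4, 51, 1)
print(d_new, d_new / 2)  # 45.8 22.9
```

Half of the resulting distance, 22.9, is the value that appears in the tree figure on slide 49.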


  47. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. Lump i and j into a new group. Compute distance between new group and all other groups (weight by the number of species in each group).

  48. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. Lump i and j into a new group.

  49. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. [Figure: tree of raccoon, bear, sealion, seal, weasel, dog; labels 13, 12, 19.75, 22.9, 6.75, 5.75]

  50. Clustering algorithms: UPGMA. Find species i and j with the smallest distance. Calculate branch length between i and j. Lump i and j into a new group. Compute distance between new group and all other groups (weight by the number of species in each group). Example: $(5 \times 88.2 + 1 \times 98)/6$, with 5 species in BRSSW (bear, raccoon, seal, sealion, weasel) and 1 species in dog.
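
The four steps repeated on these slides can be put together as a short loop. The sketch below is one possible arrangement, not the presenter's code. The transcript does not include the full distance matrix, so the four-species input is partly assumed: 24 and 26 follow from the branch lengths 12 and 13 on the slides, while the four cross distances are illustrative values chosen so that their mean is 37.5, twice the 18.75 shown on slide 41.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster by UPGMA.

    names: list of species names.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a list of merges (group_i, group_j, node_height) in order.
    """
    size = {name: 1 for name in names}
    d = dict(dist)
    merges = []
    while len(size) > 1:
        # Step 1: find the two groups with the smallest distance.
        i, j = min(combinations(sorted(size), 2), key=lambda p: d[frozenset(p)])
        # Step 2: the new node sits at half that distance.
        height = d[frozenset((i, j))] / 2
        # Step 3: lump i and j into a new group.
        new = f"({i},{j})"
        # Step 4: distances to all other groups, weighted by group size.
        for k in size:
            if k not in (i, j):
                d[frozenset((new, k))] = (
                    size[i] * d[frozenset((i, k))] + size[j] * d[frozenset((j, k))]
                ) / (size[i] + size[j])
        size[new] = size[i] + size[j]
        del size[i], size[j]
        merges.append((i, j, height))
    return merges


names = ["bear", "raccoon", "seal", "sealion"]
dist = {
    frozenset(("seal", "sealion")): 24,     # implied by branch length 12
    frozenset(("bear", "raccoon")): 26,     # implied by branch length 13
    frozenset(("bear", "seal")): 29,        # assumed, for illustration only
    frozenset(("bear", "sealion")): 33,     # assumed, for illustration only
    frozenset(("raccoon", "seal")): 44,     # assumed, for illustration only
    frozenset(("raccoon", "sealion")): 44,  # assumed, for illustration only
}

merges = upgma(names, dist)
for i, j, height in merges:
    print(i, j, height)
```

The node heights come out as 12, 13 and 18.75, so the internal branches leading to the two pairs have lengths 18.75 - 13 = 5.75 and 18.75 - 12 = 6.75, matching the labels on slide 41.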
