Building phylogenetic trees
Contents:
Phylogeny
Phylogenetic trees
How to make a phylogenetic tree from pairwise distances
UPGMA method (+ an example)
Neighbor-Joining method (+ an example)
Comparison of methods
Conclusion
Distance methods:
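Hamming distance (the number of positions at which two aligned sequences differ)
Nucleotide substitution models: Jukes-Cantor, Kimura 2-parameter (K2P), HKY (Hasegawa-Kishino-Yano), F84, Tamura-Nei, general time-reversible model, general 12-parameter model
Amino acid substitution matrices: PAM matrices, BLOSUM matrices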
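As a rough illustration of the two simplest entries in this list, the sketch below computes the proportion of differing sites between two aligned sequences (the Hamming distance divided by the alignment length) and applies the Jukes-Cantor correction d = -(3/4) ln(1 - (4/3) p). The function names `p_distance` and `jukes_cantor` and the toy sequences are illustrative assumptions, not part of the original slides; gaps and ambiguous bases are not handled.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p of differences."""
    if not 0.0 <= p < 0.75:
        # At p >= 3/4 the logarithm is undefined (the sequences are saturated).
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy sequences (hypothetical): one difference in eight sites.
p = p_distance("ACGTACGT", "ACGTACGA")
d = jukes_cantor(p)
print(p, d)
```

The corrected distance is always at least as large as the observed proportion, because it accounts for multiple substitutions at the same site.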
The divergence of sequences is assumed to occur at the same constant rate at all points in the tree
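Distances are ultrametric if, for any three sequences i, j and k, either all three distances are equal, $d_{ij} = d_{ik} = d_{jk}$, or two of them are equal and the third is smaller: $d_{jk} < d_{ij} = d_{ik}$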
UPGMA is guaranteed to build the correct tree if distances are ultrametric
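The ultrametric condition can be checked mechanically: in every triple of sequences, the two largest distances must be equal. The sketch below is one way to do this. The function name `is_ultrametric`, the dictionary layout and the tolerance are assumptions made for illustration, and the four-sequence matrix is also an assumption: the original matrix is not preserved in this text, so it was reconstructed from the branch lengths of the worked examples.

```python
from itertools import combinations


def is_ultrametric(labels, dist, tol=1e-9):
    """Three-point condition: in each triple the two largest distances are equal.

    `dist` maps frozenset({a, b}) to the distance between a and b.
    """
    for i, j, k in combinations(labels, 3):
        triple = sorted(
            [dist[frozenset((i, j))], dist[frozenset((i, k))], dist[frozenset((j, k))]]
        )
        if abs(triple[2] - triple[1]) > tol:
            return False
    return True


# Assumed example matrix (reconstructed, not taken verbatim from the slides).
# frozenset("AB") is the set {"A", "B"}.
EXAMPLE = {
    frozenset("AB"): 8.0, frozenset("AC"): 7.0, frozenset("AD"): 12.0,
    frozenset("BC"): 9.0, frozenset("BD"): 14.0, frozenset("CD"): 11.0,
}

# A hypothetical clock-like matrix for comparison.
CLOCK_LIKE = {frozenset("AB"): 4.0, frozenset("AC"): 2.0, frozenset("BC"): 4.0}

print(is_ultrametric("ABCD", EXAMPLE))    # the triple A, B, C has distances 7, 8, 9
print(is_ultrametric("ABC", CLOCK_LIKE))
```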
Initialisation:
Assign each sequence i in dataset to its own cluster
Define one leaf of T for each sequence, and place at height zero
Iteration:
Find the two clusters i and j for which $d_{ij}$ is the smallest (pick randomly if several distances are equal)
Define a new cluster ij by $C_{ij} = C_i \cup C_j$. Cluster ij has $n_{ij} = n_i + n_j$ members (initially $n_i = 1$)
Connect i and j on the tree to a new node v, placed at height $d_{ij}/2$
The branch lengths from the new node v to i and j are $d_{ij}/2 - h_i$ and $d_{ij}/2 - h_j$, where $h_i$ and $h_j$ are the heights of i and j (zero for leaves)
Compute the distances between the new cluster and each remaining cluster k by using $d_{(ij)k} = \frac{n_i d_{ik} + n_j d_{jk}}{n_i + n_j}$
Add ij to the current clusters and remove i and j
Termination:
When only two clusters i and j remain, place the root at height $d_{ij}/2$
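The initialisation, iteration and termination steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `upgma`, the dictionary-based matrix and the tuple layout of the result are assumptions, ties are broken by taking the first minimum rather than at random, and the example matrix is an assumption reconstructed from the branch lengths of the worked examples because the original matrix is not preserved in this text.

```python
def upgma(labels, dist):
    """Average-linkage clustering.

    `dist` maps frozenset({a, b}) to a distance. Returns a list of merges
    (child_i, child_j, node_height, branch_to_i, branch_to_j); the last
    merge is the root. Cluster names are concatenated labels, so labels
    are assumed not to collide with such concatenations.
    """
    d = dict(dist)
    size = {x: 1 for x in labels}        # n_i, initially 1
    height = {x: 0.0 for x in labels}    # leaves sit at height zero
    clusters = list(labels)
    merges = []
    while len(clusters) > 1:
        pairs = ((a, b) for n, a in enumerate(clusters) for b in clusters[n + 1:])
        i, j = min(pairs, key=lambda p: d[frozenset(p)])  # first minimum wins ties
        new = i + j
        h = d[frozenset((i, j))] / 2.0   # new node is placed at height d_ij / 2
        merges.append((i, j, h, h - height[i], h - height[j]))
        for k in clusters:
            if k not in (i, j):
                # Size-weighted average of the distances to the two merged clusters.
                d[frozenset((new, k))] = (
                    size[i] * d[frozenset((i, k))] + size[j] * d[frozenset((j, k))]
                ) / (size[i] + size[j])
        size[new] = size[i] + size[j]
        height[new] = h
        clusters = [k for k in clusters if k not in (i, j)] + [new]
    return merges


# Assumed example matrix (reconstructed, not taken verbatim from the slides).
DIST = {
    frozenset("AB"): 8.0, frozenset("AC"): 7.0, frozenset("AD"): 12.0,
    frozenset("BC"): 9.0, frozenset("BD"): 14.0, frozenset("CD"): 11.0,
}

for merge in upgma(["A", "B", "C", "D"], DIST):
    print(merge)
```

With this matrix the merge heights come out as 3.5, 4.25 and about 6.17, which agrees with the branch lengths shown in the four-sequence example.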
UPGMA example for four items (sequences) A, B, C and D
In fact, the distances in this example are not ultrametric: for some triples i, j and k the three distances are neither all equal nor two equal with the third smaller (for example $d_{jk} < d_{ij} \neq d_{ik}$)
Step 1. Find the smallest distance, $d_{ij}$, between two clusters: A and C, where $d_{AC} = 7$
Step 2. Define the new cluster ij, which has $n_{ij} = n_i + n_j$ members (initially $n_i = 1$): the new cluster is AC, with $n_{AC} = n_A + n_C = 2$
Step 3. Connect A and C on the tree to a new node v1
Step 4. The branch lengths from the new node v1 to A and C are both $d_{AC}/2 = 3.5$
Step 5. Compute the distances between the new cluster AC and the remaining clusters (k = B and D): $d_{(AC)k} = \frac{d_{Ak} + d_{Ck}}{2}$
Step 6. Delete the columns and rows of the distance matrix that correspond to clusters A and C, and add a column and a row for cluster AC
New distance matrix
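Current tree: A and C are joined at height 3.5 (branch lengths 3.5 each). In the next iteration the closest clusters are AC and B, which are joined at height 4.25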
Step 5. Compute the distance between the new cluster ACB and the remaining cluster (D): $d_{(ACB)D} = \frac{2\,d_{(AC)D} + d_{BD}}{3}$
Step 6. Delete the columns and rows of the distance matrix that correspond to clusters AC and B, and add a column and a row for cluster ACB
New distance matrix
Termination:
Only two clusters (ACB and D) remain
Place the root at height $d_{(ACB)D}/2 = 6.17$
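Original distance matrix and final phylogenetic tree (including the branch lengths)
Final tree: A and C join at height 3.5 (branches of 3.5 each); B joins AC at height 4.25 (branch of 4.25 to B, 0.75 to the AC node); D joins ACB at the root, at height 6.17 (branch of 6.17 to D, 1.92 to the ACB node)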
Def. Edge lengths are said to be additive if the distance between any pair of leaves is the sum of lengths of the edges on the path connecting them
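To make the definition concrete, the sketch below sums the edge lengths along the path between two leaves of a small unrooted tree. The edge lengths are the ones that appear in the neighbor-joining example of these slides (A to v1: 3, B to v1: 5, v1 to v2: 1, C to v2: 3, D to v2: 8); the function name `path_length` and the edge-list layout are assumptions made for illustration.

```python
from collections import defaultdict


def path_length(edges, start, goal):
    """Sum of edge lengths on the unique path between two nodes of a tree."""
    adjacency = defaultdict(list)
    for a, b, length in edges:
        adjacency[a].append((b, length))
        adjacency[b].append((a, length))
    # Depth-first walk; in a tree it is enough to avoid stepping back to the parent.
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, total = stack.pop()
        if node == goal:
            return total
        for neighbour, length in adjacency[node]:
            if neighbour != parent:
                stack.append((neighbour, node, total + length))
    raise ValueError("no path between the two nodes")


TREE = [("A", "v1", 3.0), ("B", "v1", 5.0), ("v1", "v2", 1.0),
        ("C", "v2", 3.0), ("D", "v2", 8.0)]

print(path_length(TREE, "A", "C"))  # 3 + 1 + 3
print(path_length(TREE, "B", "D"))  # 5 + 1 + 8
```

A distance matrix is additive with respect to this tree if every entry equals the corresponding path length.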
Initialisation:
Define T to be the set of leaf nodes, one for each given sequence
Iteration:
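Compute $u_i = \sum_{k \neq i} \frac{d_{ik}}{n - 2}$ for each sequence i, where n is the number of sequences in the distance matrix
Pick a pair i and j for which $d_{ij} - u_i - u_j$ is the smallest (pick randomly if several are equal)
Join items i and j with a new node v
Compute the branch lengths from the new node v to items i and j: $d_{iv} = \frac{1}{2} d_{ij} + \frac{1}{2}(u_i - u_j)$ and $d_{jv} = \frac{1}{2} d_{ij} + \frac{1}{2}(u_j - u_i)$
Compute the distances between the new node v and each remaining item k: $d_{kv} = \frac{1}{2}(d_{ik} + d_{jk} - d_{ij})$
Remove i and j from the distance matrix and replace them by the new node v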
Termination:
When only two items i and j remain, add the remaining edge between i and j, with length $d_{ij}$
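The neighbor-joining steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `neighbor_joining`, the node names `v1`, `v2`, ... and the dictionary-based matrix are choices made here, ties are broken by taking the first minimum rather than at random, and the example matrix is an assumption reconstructed from the edge lengths of the worked examples because the original matrix is not preserved in this text.

```python
def neighbor_joining(labels, dist):
    """Return the edges (node_a, node_b, length) of an unrooted tree.

    `dist` maps frozenset({a, b}) to a distance. New internal nodes are
    named v1, v2, ...; labels are assumed not to use those names.
    """
    d = dict(dist)
    items = list(labels)
    edges = []
    counter = 1
    while len(items) > 2:
        n = len(items)
        # u_i: sum of distances from i to all other items, divided by n - 2.
        u = {i: sum(d[frozenset((i, k))] for k in items if k != i) / (n - 2)
             for i in items}
        pairs = ((a, b) for m, a in enumerate(items) for b in items[m + 1:])
        i, j = min(pairs, key=lambda p: d[frozenset(p)] - u[p[0]] - u[p[1]])
        v = f"v{counter}"
        counter += 1
        d_ij = d[frozenset((i, j))]
        edges.append((i, v, 0.5 * d_ij + 0.5 * (u[i] - u[j])))
        edges.append((j, v, 0.5 * d_ij + 0.5 * (u[j] - u[i])))
        for k in items:
            if k not in (i, j):
                d[frozenset((v, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d_ij
                )
        # Replace i and j by the new node v.
        items = [v] + [k for k in items if k not in (i, j)]
    i, j = items
    edges.append((i, j, d[frozenset((i, j))]))  # the last remaining edge
    return edges


# Assumed example matrix (reconstructed, not taken verbatim from the slides).
DIST = {
    frozenset("AB"): 8.0, frozenset("AC"): 7.0, frozenset("AD"): 12.0,
    frozenset("BC"): 9.0, frozenset("BD"): 14.0, frozenset("CD"): 11.0,
}

for edge in neighbor_joining(["A", "B", "C", "D"], DIST):
    print(edge)
```

Several pairs tie for the smallest value of $d_{ij} - u_i - u_j$ in this matrix; because the distances are additive, any choice among the tied pairs leads to the same unrooted tree, only with different names for the internal nodes.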
Step 1. Compute $u_i$ for each row i in the distance matrix
Step 2. Compute $d_{ij} - u_i - u_j$ for each pair (the lower-diagonal matrix) and choose the smallest (most negative)
Step 3. Join A and B together with a new node v1. Compute the edge lengths from A to node v1 (3) and from B to node v1 (5)
Step 4. Compute the distances between the new node v1 and the remaining items (C and D)
New reduced distance matrix
Step 5. Delete A and B from the distance matrix and replace them by the new item AB (node v1)
Step 6. Continue from step 1, because more than two items remain
Step 1. Compute $u_i$ for each row i in the distance matrix
Step 2. Compute $d_{ij} - u_i - u_j$ for each pair (the lower-diagonal matrix) and choose the smallest
Step 3. Join v1 and C together with a new node v2. Compute the edge lengths from v1 to node v2 (1) and from C to node v2 (3)
Step 4. Compute the distances between the new node v2 and the remaining items (D)
Step 5. Delete AB and C from the distance matrix and replace them by ABC (node v2)
Step 6. Only two nodes remain (ABC and D); connect them with an edge of length 8
Original distance matrix and final phylogenetic tree (including the edge lengths)
Final unrooted tree: A to v1: 3; B to v1: 5; v1 to v2: 1; C to v2: 3; D to v2: 8
UPGMA
The total branch length from the root to every leaf is the same
Produces a rooted tree, where the root is the hypothesized ancestor of the sequences in the tree
Suitable for closely related sequences
Can be used to infer phylogenies if one can assume that evolutionary rates are the same in all lineages
Neighbor-joining
Produces an unrooted tree, in which the direction of evolution is unknown
Suitable for datasets with widely varying rates of evolution
Suitable for large datasets
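Comparison
UPGMA tree (rooted): A and C at 3.5, B at 4.25, D at 6.17. Neighbor-joining tree (unrooted): edges of 3 to A, 5 to B, 3 to C and 8 to D, with an internal edge of 1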
Multiple sequence alignment
Phylogeny packages
Viewing/plotting phylogenetic trees