Terminology of phylogenetic trees

Slide1 l.jpg

Terminology of phylogenetic trees

Types of phylogenetic trees

Types of Data

Character Evolution

Approaches to Phylogeny Reconstruction

Slide2 l.jpg

Phylogenetic tree (dendrogram)


Nodes: branching points

Branches: lines

Topology: branching pattern

Slide3 l.jpg

Sister taxa: two taxa that are more closely related to each other than either is to a third taxon.

A + B

C + D

Slide6 l.jpg

Hard polytomy: simultaneous divergence.

Soft polytomy: lack of resolution.

Slide7 l.jpg

Rooted: unique path from root.

Unrooted: degree of kinship, no evolutionary path.

Slide8 l.jpg

Number of possible

phylogenetic trees

3 OTUs: 1 unrooted tree, 3 rooted trees

4 OTUs: 3 unrooted trees, 15 rooted trees
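
These counts follow a standard closed form for fully bifurcating trees: (2n-5)!! unrooted and (2n-3)!! rooted trees for n OTUs. The sketch below is only an illustration; the function names are made up here and do not come from the slides.

```python
def double_factorial(k: int) -> int:
    """Product k * (k-2) * (k-4) * ... down to 1 (returns 1 when k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted(n_otus: int) -> int:
    """Number of fully bifurcating unrooted trees, defined for n_otus >= 3."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs")
    return double_factorial(2 * n_otus - 5)


def count_rooted(n_otus: int) -> int:
    """Number of fully bifurcating rooted trees, defined for n_otus >= 2."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs")
    return double_factorial(2 * n_otus - 3)


for n in (3, 4, 5, 10):
    print(n, count_unrooted(n), count_rooted(n))
```

The rapid growth (over two million unrooted trees for 10 OTUs) is why exhaustive tree searches become impractical so quickly.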

Slide11 l.jpg

Newick (shorthand) format

- text-based representation of relationships.
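
As a small illustration of the format, the sketch below writes a tree stored as nested Python tuples in Newick notation. The nested-tuple representation and the name `to_newick` are assumptions made for this example, and branch lengths are left out.

```python
def to_newick(tree) -> str:
    """Serialize a nested tuple of taxon names as a Newick string."""
    def fmt(node):
        if isinstance(node, str):  # a leaf: just the taxon name
            return node
        # an internal node: its children, comma separated, in parentheses
        return "(" + ",".join(fmt(child) for child in node) + ")"
    return fmt(tree) + ";"  # every Newick string ends with a semicolon


# Sister pairs A + B and C + D
print(to_newick((("A", "B"), ("C", "D"))))  # ((A,B),(C,D));
# A polytomy is simply a node with more than two children
print(to_newick(("A", "B", "C")))           # (A,B,C);
```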

Slide12 l.jpg

Qualitative vs. quantitative data

Quantitative: continuous data

(e.g., height or length)

Qualitative: discrete (2 or more values)

Binary: 2 values

Multistate: more than 2 values

Most molecular data are qualitative

Binary: presence or absence of band, or gap in sequence

Multistate: nucleotide data (A, T, G, C)

Slide13 l.jpg

Nucleotide character data

Characters: positions in the nucleotide sequence (e.g., position 352).

Character states: the nucleotide at that position in the sequence (G, A, T, or C).

Slide14 l.jpg

Assumptions About Character Evolution

Unordered: change from one character state to another occurs in one step.

(e.g., nucleotide changes)

Ordered: number of steps from one state to another equals the absolute value of the difference between their state numbers.

1 → 2 → 3 → 4 → 5 requires 4 steps

5 → 4 → 3 → 2 → 1 requires 4 steps

(reversible vs. irreversible)
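
The two assumptions can be written as small cost functions. This is a minimal sketch with invented function names; the irreversible variant assumes that change is only allowed toward higher state numbers, which is one common convention and is not stated on the slide.

```python
import math


def unordered_cost(a, b) -> int:
    """Unordered: any change between two different states costs one step."""
    return 0 if a == b else 1


def ordered_cost(a: int, b: int) -> int:
    """Ordered and reversible: cost is the absolute difference in state number."""
    return abs(a - b)


def irreversible_cost(a: int, b: int) -> float:
    """Ordered and irreversible (assumed: only increases are allowed)."""
    return b - a if b >= a else math.inf


print(unordered_cost("A", "G"))  # one step, whichever nucleotides are involved
print(ordered_cost(1, 5))        # 4 steps
print(ordered_cost(5, 1))        # 4 steps, because reversals are allowed
print(irreversible_cost(5, 1))   # inf: the reversal is forbidden
```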

Slide15 l.jpg

Phylogenetic reconstruction methods make assumptions about:

(1) the number of discrete steps required for one character state to change into another

(2) the probability with which such a change occurs.

Slide16 l.jpg

Step matrix

- number of steps required between character states

Slide17 l.jpg

Approaches to Phylogeny Reconstruction

Cladistics (parsimony): recency of common ancestry

Maximum Likelihood: model of sequence evolution

Phenetics (UPGMA, neighbor joining): overall similarity

Slide18 l.jpg



Parsimony: general scientific criterion for choosing among competing hypotheses, which states that we should accept the hypothesis that explains the data most simply.


Maximum parsimony method of phylogeny reconstruction:

The optimum reconstruction of ancestral character states is

the one which requires the fewest mutations in the phylogenetic

tree to account for contemporary character states.

Slide19 l.jpg

First step in maximum parsimony analysis:

Identify all of the informative sites.

Invariant: all OTUs possess the same character state at the site.

Any invariant site is uninformative.

Slide20 l.jpg

Two types of variable sites:

Informative: favors a subset of trees over other possible trees.

Uninformative: a character that contains no grouping information relevant to a cladistic problem (e.g., autapomorphies).
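
A common operational test for nucleotide data is that a site is parsimony-informative when at least two character states each occur in at least two OTUs. The sketch below applies that rule to a small made-up alignment; the sequences and function names are hypothetical.

```python
from collections import Counter


def classify_site(column: str) -> str:
    """Label one alignment column as invariant, informative, or uninformative."""
    counts = Counter(column)
    if len(counts) == 1:
        return "invariant"
    # States carried by at least two OTUs can group taxa together.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "uninformative"


def classify_alignment(sequences):
    """Classify every column of a list of equal-length sequences."""
    return [classify_site("".join(column)) for column in zip(*sequences)]


# Hypothetical 4-OTU alignment, one string per OTU
sequences = ["AAGA", "AGCA", "AGAG", "AGAG"]
print(classify_alignment(sequences))
```

In this toy alignment the second column is variable but uninformative because the differing state occurs in a single OTU (an autapomorphy).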

Slide22 l.jpg

Parsimony Analysis 2nd step: Calculate the minimum number

of substitutions at each informative site

Tree 1: 1 step

Tree 2: 2 steps

Tree 3: 2 steps

Informative: favors tree 1 over the other 2 trees.

Slide23 l.jpg

Final step in parsimony analysis: Sum the number of changes

over all informative sites for each possible tree and choose the tree

associated with the smallest number of changes.

Site 3

Site 4

Site 5

Site 9

3 steps

3 steps

4 steps
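
The counting and summing steps can be sketched with Fitch's algorithm, a standard way to get the minimum number of changes for an unordered character on a bifurcating tree. The four-OTU alignment, the tree labels, and the function names below are invented for illustration and are not the data from the slides. The trees are written as rooted nested tuples, but the step count for unordered characters does not depend on where the root is placed.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum steps) for one site."""
    if isinstance(tree, str):              # leaf: its observed state, no steps
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    common = left_set & right_set
    if common:                             # children agree: no extra change
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, alignment):
    """Sum the minimum number of changes over every site of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch(tree, site)[1]
    return total


# Hypothetical alignment: three informative sites for four OTUs
alignment = {"A": "GAT", "B": "GAC", "C": "ACT", "D": "ACC"}
trees = {
    "tree 1": (("A", "B"), ("C", "D")),
    "tree 2": (("A", "C"), ("B", "D")),
    "tree 3": (("A", "D"), ("B", "C")),
}
lengths = {name: tree_length(t, alignment) for name, t in trees.items()}
best = min(lengths, key=lengths.get)
print(lengths, "shortest:", best)
```

Uninformative sites could be included without changing the ranking, since they add the same number of steps to every tree.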

Slide24 l.jpg

Parsimony Search Methods:

Exhaustive search method: searches all possible fully resolved topologies

and guarantees that all of the minimum length cladograms will be found.

(not a practical option, time consuming)

Branch and bound methods: begin with a cladogram. The length of the starting cladogram is retained as an upper bound for use during subsequent cladogram construction. As soon as the length of a partial tree exceeds the upper bound, that cladogram is abandoned. If a completed cladogram has equal length, it is saved as an additional optimal topology. If its length is less, it replaces the original and its length becomes the new upper bound. (good option for fewer than 20 taxa, time consuming)

Heuristic methods: approximate or “hill climbing technique”

Begin with a cladogram, add taxa and swap branches until

a shorter length cladogram is found. Procedure can be replicated many

times to increase chance of finding minimum length cladogram.

Slide25 l.jpg

Different types of parsimony analyses:

Unweighted parsimony: all character state changes are

given equal weight in the step matrix.

Weighted parsimony: different weights assigned to

different character state changes.

Transversion parsimony: transitions are completely

ignored in the analysis, only transversions are considered.
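
Each of these schemes corresponds to a different step matrix. The sketch below builds all three for nucleotide data; the 2:1 transversion-to-transition weight is an arbitrary value chosen for illustration, and the function names are not from the slides.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(a: str, b: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return a != b and ((a in PURINES) == (b in PURINES))


def step_matrix(transition_cost, transversion_cost):
    """Nested dict: matrix[a][b] is the cost of changing state a into b."""
    matrix = {}
    for a in BASES:
        matrix[a] = {}
        for b in BASES:
            if a == b:
                matrix[a][b] = 0
            elif is_transition(a, b):
                matrix[a][b] = transition_cost
            else:
                matrix[a][b] = transversion_cost
    return matrix


unweighted = step_matrix(1, 1)          # every change costs one step
weighted = step_matrix(1, 2)            # assumed 2:1 weighting, for illustration
transversion_only = step_matrix(0, 1)   # transitions cost nothing

for name, m in [("unweighted", unweighted), ("weighted", weighted),
                ("transversion", transversion_only)]:
    print(name)
    for a in BASES:
        print(" ", a, [m[a][b] for b in BASES])
```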

Slide26 l.jpg

Maximum Likelihood Method:

The likelihood (L) of a phylogenetic tree is the

probability of observing the data (nucleotide sequences)

under a given tree and a specified model of

character state changes.

The aim is to find the tree (among all possible trees)

with the highest L value.

Slide27 l.jpg

Models of character state changes (sequence evolution):

Jukes and Cantor 1-parameter model: all changes are equally probable.

Kimura 2-parameter model: transitions are more frequent than transversions.

Other, more complicated models...
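
For the Jukes and Cantor model the change probabilities have a simple closed form. The sketch below assumes the branch length d is measured in expected substitutions per site; the function name is illustrative.

```python
import math


def jc69_probs(d: float):
    """Return (P(same base), P(one specific different base)) after distance d."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay   # identical for each of the 3 other bases
    return p_same, p_diff


for d in (0.0, 0.1, 1.0, 10.0):
    p_same, p_diff = jc69_probs(d)
    print(f"d={d:5.1f}  same={p_same:.4f}  each other base={p_diff:.4f}")
```

At d = 0 no change is possible, and for long branches all four bases approach probability 0.25, which is what "all changes equally probable" implies.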

Slide28 l.jpg

1. Calculate the likelihood for each site on a specific tree.

2. Multiply the L values for all sites on the tree (equivalently, sum their logarithms).

3. Compare the L value for all possible trees.

4. Choose the tree with the highest L value.
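
Because sites are treated as independent, per-site likelihoods combine by multiplication, which in practice is done by adding logarithms to avoid numerical underflow. The sketch below shows only that bookkeeping; the per-site values are invented placeholders, not output from a real model.

```python
import math


def tree_log_likelihood(site_likelihoods):
    """Sum of log site likelihoods, i.e. the log of their product."""
    return sum(math.log(value) for value in site_likelihoods)


# Hypothetical per-site likelihoods for three candidate trees
site_likelihoods = {
    "tree 1": [0.020, 0.010, 0.030],
    "tree 2": [0.010, 0.010, 0.030],
    "tree 3": [0.020, 0.005, 0.010],
}
log_l = {name: tree_log_likelihood(v) for name, v in site_likelihoods.items()}
best = max(log_l, key=log_l.get)
for name, value in log_l.items():
    print(name, round(value, 3))
print("highest likelihood:", best)
```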

Slide29 l.jpg

Distance Methods: evolutionary distances (number of substitutions)

are computed for all pairs of taxa.

UPGMA: unweighted pair group method with arithmetic means

- assumes equal rate of substitutions

- sequential clustering algorithms

- pairs of taxa are clustered in order of decreasing similarity

Neighbor Joining: finding shortest (minimum evolution) tree by finding

neighbors that minimize the total length of the tree. Shortest pairs are

chosen to be neighbors and then joined in distance matrix as one OTU.
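
A minimal sketch of UPGMA clustering follows, assuming a complete table of pairwise distances. It returns only the topology as nested tuples and does not compute branch lengths; the distances and names are made up for illustration. Neighbor joining uses a different pair-selection criterion and is not shown.

```python
from itertools import combinations


def upgma(distances):
    """distances: dict {(a, b): d} over all taxon pairs. Returns a nested tuple."""
    sizes = {}   # cluster (taxon name or nested tuple) -> number of taxa in it
    d = {}       # frozenset({cluster1, cluster2}) -> distance
    for (a, b), value in distances.items():
        sizes[a] = sizes[b] = 1
        d[frozenset((a, b))] = value
    while len(sizes) > 1:
        # Cluster the closest (most similar) pair first.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        for other in sizes:
            # Size-weighted mean equals the arithmetic mean over all taxon pairs.
            d[frozenset((merged, other))] = (
                na * d[frozenset((a, other))] + nb * d[frozenset((b, other))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Hypothetical distance table for four OTUs
distances = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
}
print(upgma(distances))
```

With these distances A and B are joined first, then C, then D, which is the expected order for data that fit the equal-rate assumption.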

Slide30 l.jpg

Consensus Methods:

Consensus trees are derived from a set of trees and

summarize the phylogenetic information of several

trees in a single tree.

Most commonly used consensus trees:

Strict consensus: all conflicting branching patterns are collapsed.

50% majority rule consensus: branching patterns that occur with a frequency of more than 50% are retained; all others are collapsed.
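
Both consensus types can be sketched by treating each tree as a set of clades (groups of taxa). The representation and the example trees below are assumptions made for illustration. The majority rule here keeps clades found in strictly more than half of the trees, which guarantees that the retained clades are compatible with each other.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Clades present in more than `threshold` of the input trees."""
    counts = Counter(clade for tree in trees for clade in set(tree))
    return {c for c, n in counts.items() if n / len(trees) > threshold}


def strict_clades(trees):
    """Clades present in every input tree."""
    return set.intersection(*(set(tree) for tree in trees))


AB, ABC, ABD = frozenset("AB"), frozenset("ABC"), frozenset("ABD")
# Three hypothetical rooted trees on taxa A-D, each given as its set of clades
trees = [{AB, ABC}, {AB, ABD}, {AB, ABC}]

print("majority rule:", [sorted(c) for c in majority_rule_clades(trees)])
print("strict:", [sorted(c) for c in strict_clades(trees)])
```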

Slide32 l.jpg

Bootstrap method of assessing tree reliability:

Inferred tree is constructed from data set.

Characters are resampled from the data set with replacement.

Resampling is replicated several (100-1000) times.

Bootstrap trees are constructed from the resampled data sets.

Bootstrap tree is compared to original inferred tree.

% of bootstrap trees supporting a node are determined for

each node in the tree.
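
The resampling and tallying steps can be sketched as follows. Tree inference itself is left out: the sketch assumes each bootstrap tree has already been reduced to a set of clades by whatever method is in use. The alignment, the seed, and the function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in picks)
            for taxon, seq in alignment.items()}


def node_support(inferred_clades, replicate_clades):
    """Percentage of bootstrap trees containing each clade of the inferred tree."""
    counts = Counter(c for clades in replicate_clades for c in set(clades))
    n = len(replicate_clades)
    return {clade: 100.0 * counts[clade] / n for clade in inferred_clades}


rng = random.Random(0)  # fixed seed so the example is repeatable
alignment = {"A": "GATTACA", "B": "GACTACA", "C": "AATTGCA", "D": "AACTGCC"}
replicate = bootstrap_alignment(alignment, rng)
print(replicate)

AB = frozenset("AB")
# Suppose the clade A + B was recovered in 3 of 4 bootstrap trees
print(node_support([AB], [{AB}, set(), {AB}, {AB}]))
```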

Slide33 l.jpg

Homoplasy: non-homologous similarity

- resemblance not due to common ancestry

- evolved independently

- considered “noise”

Slide35 l.jpg

Known bacterial phylogeny:

ancestors at each node known.

Hillis & Huelsenbeck 1992 tested the ability of different methods to find the “true” phylogeny.

Maximum parsimony and

maximum likelihood performed

well, UPGMA & neighbor

joining did not.

Slide36 l.jpg

Strengths and Weaknesses:

UPGMA & neighbor-joining: fast but not as accurate as

other methods.

Maximum parsimony: time consuming, but more accurate. Can combine morphological characters with DNA characters in a single analysis.

Maximum likelihood: very time consuming. Can invoke a specific model of sequence evolution. Including information from morphology is a new (and controversial) technique.

Reference: Hillis et al. (1996), Molecular Systematics, 2nd Ed., Sinauer Associates. ISBN: 0-87893-282-8