What is phylogenetic analysis and why should we perform it?

Phylogenetic analysis has two major components:

(1) Phylogeny inference or “tree building”

the inference of the branching orders, and ultimately the evolutionary relationships, between “taxa” (entities such as genes, populations, species, etc.)

(2) Analyzing change in traits (phenotypes, genes)

using phylogenies as analytical frameworks for rigorous understanding of the evolution of various traits or conditions of interest

Germline and somatic evolution included!


Uses of Phylogenetics in the Study of Health & Disease

(1) Evolutionary history of humans, between and within species

(2) Analysis of evolution of phenotypic and genetic traits in humans, especially human-specific traits - evolved when, where, why, how

(3) Evolution of parasites and pathogens, in relation to their hosts (us)

(4) Evolution of cancer cell lineages, and somatic evolution more generally

(5) Study of adaptation in humans and other taxa


What you will learn in this lecture

(1) About phylogenies, terminology, what they are, how they work, ‘tree thinking’

(2) How to infer phylogenies

(3) How we can use phylogenies to answer questions related to human adaptation, health and disease


Common Phylogenetic Tree Terminology

[Figure: a rooted tree with five taxa, A to E, carrying the labels below]

Terminal Nodes: represent the TAXA (genes, populations, species, etc.) used to infer the phylogeny

Branches or Lineages

Internal Nodes or Divergence Points: represent hypothetical ancestors of the taxa

Ancestral Node or ROOT of the Tree


Phylogenetic trees diagram the evolutionary relationships between the taxa

[Figure: a rooted tree whose tips, from top to bottom, are Taxon B, Taxon C, Taxon A, Taxon D, and Taxon E]

There is no meaning to the spacing between the taxa, or to the order in which they appear from top to bottom.

The dimension along the branches (from root to tips) either can have no scale (for ‘cladograms’), can be proportional to genetic distance or amount of change (for ‘phylograms’ or ‘additive trees’), or can be proportional to time (for ‘ultrametric trees’ or true evolutionary trees).

((A,(B,C)),(D,E)) = The above phylogeny as nested parentheses

This says that B and C are more closely related to each other than either is to A, and that A, B, and C form a clade that is a sister group to the clade composed of D and E. If the tree has a time scale, then D and E are the most closely related.
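
The nested-parentheses notation can be read mechanically. The following is a minimal sketch, assuming taxon names contain no parentheses or commas and that no branch lengths are present; the helper names `parse_nested` and `clades` are made up for this illustration. It turns the string into nested tuples and lists every clade (group of taxa sharing an internal node).

```python
def parse_nested(text):
    """Parse a string such as '((A,(B,C)),(D,E))' into nested tuples."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # skip '('
            children = [node()]
            while text[pos] == ",":
                pos += 1  # skip ','
                children.append(node())
            pos += 1  # skip ')'
            return tuple(children)
        # Otherwise read a taxon name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]

    return node()


def clades(tree):
    """Return (taxa below this node, list of all clades found at or below it)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found += child_found
    found.append(leaves)
    return leaves, found


tree = parse_nested("((A,(B,C)),(D,E))")
_, groups = clades(tree)
for group in groups:
    print(sorted(group))
```

The printed clades include B and C together, and A, B, and C together, but never A and B alone, which is the sense in which B and C are each other's closest relatives in this tree.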


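Three types of trees

[Figure: the same tree of Taxon A, Taxon B, Taxon C, and Taxon D drawn three ways]

Cladogram: branch lengths have no meaning

Phylogram: branch lengths are proportional to genetic change (branch labels in the figure: 6, 1, 1, 3, 1, 5)

Ultrametric tree: branch lengths are proportional to time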

All show the same evolutionary relationships, or branching orders, between the taxa.


A major goal of phylogeny inference is to resolve the branching orders of lineages in evolutionary trees:

[Figure: three trees of taxa A to E, with one node labeled "Polytomy or multifurcation" and another labeled "A bifurcation"]

Completely unresolved or "star" phylogeny

Partially resolved phylogeny

Fully resolved, bifurcating phylogeny

RESOLUTION AND SUPPORT for nodes


There are three possible unrooted trees for four taxa (A, B, C, D)

[Figure: Tree 1, Tree 2, and Tree 3, the three ways of pairing the four taxa across the internal branch: AB|CD, AC|BD, and AD|BC]

Phylogenetic tree building (or inference) methods are aimed at discovering which of the possible unrooted trees is "correct".

We would like this to be the “true” biological tree — that is, one that accurately represents the evolutionary history of the taxa.

However, we must settle for discovering the computationally correct or optimal tree for the phylogenetic method of choice.


[Figure: example unrooted trees with progressively more taxa, up to six (A to F)]

The number of unrooted trees increases faster than exponentially with the number of taxa

(2N - 5)!! = # unrooted trees for N taxa, where !! is the double factorial: (2N - 5) × (2N - 7) × ... × 3 × 1
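
To get a feel for how quickly this count grows, here is a small sketch that evaluates the formula directly. The function names are chosen for this example, and the count applies to fully resolved (bifurcating) unrooted trees with at least three taxa.

```python
def double_factorial(n):
    """Product n * (n - 2) * (n - 4) * ... down to 1 or 2; returns 1 for n <= 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def unrooted_trees(n_taxa):
    """Number of bifurcating unrooted trees for N taxa, (2N - 5)!!."""
    if n_taxa < 3:
        raise ValueError("the formula applies to 3 or more taxa")
    return double_factorial(2 * n_taxa - 5)


for n in (3, 4, 5, 6, 10, 20):
    print(n, unrooted_trees(n))
```

Four taxa give 3 trees, as on the previous slide; ten taxa already give more than two million, which is why exhaustive comparison of all trees is feasible only for small data sets.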


Inferring evolutionary relationships between the taxa requires rooting the tree:

To root a tree mentally, imagine that the tree is made of string. Grab the string at the root and tug on it until the ends of the string (the taxa) fall opposite the root:

[Figure: an unrooted tree of taxa A, B, C, and D with the root marked on one branch, and the resulting rooted tree drawn along a TIME axis]

Note that in this rooted tree, taxon A is no more closely related to taxon B than it is to C or D.


Now, try it again with the root at another position:

[Figure: the same unrooted tree of taxa A, B, C, and D with the root marked at a different position, and the resulting rooted tree drawn along a TIME axis]

Note that in this rooted tree, taxon A is most closely related to taxon B, and together they are equally distantly related to taxa C and D.


An unrooted, four-taxon tree theoretically can be rooted in five different places to produce five different rooted trees

[Figure: the unrooted tree 1 of taxa A, B, C, and D with its five branches numbered 1 to 5, and the corresponding rooted trees 1a, 1b, 1c, 1d, and 1e]

These trees show five different evolutionary relationships among the taxa!
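
The five rootings can be enumerated by placing the root on each branch in turn (an unrooted bifurcating tree of N taxa has 2N - 3 branches, so 5 for four taxa). This is a sketch only: the grouping of A with B and C with D, and the internal node names `x` and `y`, are assumptions made for the illustration rather than a reading of the figure.

```python
# Unrooted four-taxon tree: internal nodes "x" and "y" joined by one internal branch.
# The grouping A,B | C,D is assumed for this illustration.
EDGES = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]


def neighbors(node):
    """All nodes joined to `node` by a branch."""
    return [b if a == node else a for a, b in EDGES if node in (a, b)]


def subtree(node, parent):
    """Nested-parentheses string for the part of the tree on the far side of `parent`."""
    children = [n for n in neighbors(node) if n != parent]
    if not children:  # a terminal node (taxon)
        return node
    return "(" + ",".join(sorted(subtree(c, node) for c in children)) + ")"


def root_on(edge):
    """Place the root on one branch and read off the resulting rooted tree."""
    a, b = edge
    return "(" + ",".join(sorted([subtree(a, b), subtree(b, a)])) + ")"


rooted = [root_on(edge) for edge in EDGES]
for edge, tree in zip(EDGES, rooted):
    print(edge, "->", tree)
```

Rooting on the internal branch gives the balanced tree ((A,B),(C,D)); rooting on any of the four terminal branches makes that taxon the sister of the other three, so all five results differ.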


All of these rearrangements show the same evolutionary relationships between the taxa

[Figure: rooted tree 1a of taxa A, B, C, and D redrawn several times with branches rotated around its internal nodes]
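
One way to check that two drawings differ only by rotation is to compare them with the order of children ignored. The sketch below does this by converting nested tuples into nested frozensets; the topologies are hypothetical examples, not transcriptions of the figure.

```python
def canonical(tree):
    """Convert nested tuples into nested frozensets, which ignore child order."""
    if isinstance(tree, str):
        return tree
    return frozenset(canonical(child) for child in tree)


original = (("A", "B"), ("C", "D"))    # hypothetical rooted tree
rotated = (("D", "C"), ("B", "A"))     # same tree with every node rotated
different = (("A", "C"), ("B", "D"))   # a genuinely different branching order

print(canonical(original) == canonical(rotated))    # True
print(canonical(original) == canonical(different))  # False
```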


Main way to root trees:

By outgroup:

Uses taxa (the “outgroup”) that are known to fall outside of the group of interest (the “ingroup”). Requires some prior knowledge about the relationships among the taxa.

[Figure: a rooted tree with the outgroup branch labeled]


Molecular phylogenetic tree building methods:

Are mathematical and/or statistical methods for inferring the divergence order of taxa, as well as the lengths of the branches that connect them. There are many phylogenetic methods available today, each having strengths and weaknesses. Most can be classified as follows:

DATA TYPE      COMPUTATIONAL METHOD
               Optimality criterion                 Clustering algorithm
Characters     PARSIMONY, MAXIMUM LIKELIHOOD        (none listed)
Distances      MINIMUM EVOLUTION, LEAST SQUARES     UPGMA, NEIGHBOR-JOINING


Types of data used in phylogenetic inference:

Character-based methods: Use the aligned characters, such as DNA or protein sequences, directly during tree inference.

Taxa         Characters
Species A    ATCGCTAGTCCTATAGTGCA
Species B    ATCGCTAGTCCTATATTGCA
Species C    TTCGCTAGACCTGTGGTCCA
Species D    TTGACCAGACCTGTGGTCCG
Species E    TTGACCAGTTCTGTGGTCCG
etc.
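
Distance-based methods, the other data type in the classification, start from pairwise distances computed from such an alignment rather than from the characters themselves. A minimal sketch, assuming the simplest possible distance (the count of aligned sites that differ, with no correction for multiple substitutions at a site); the function name `differences` is made up for this example.

```python
from itertools import combinations

# The five aligned sequences listed above.
ALIGNMENT = {
    "Species A": "ATCGCTAGTCCTATAGTGCA",
    "Species B": "ATCGCTAGTCCTATATTGCA",
    "Species C": "TTCGCTAGACCTGTGGTCCA",
    "Species D": "TTGACCAGACCTGTGGTCCG",
    "Species E": "TTGACCAGTTCTGTGGTCCG",
}


def differences(seq1, seq2):
    """Count aligned positions at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


# Pairwise distance table: one entry per unordered pair of taxa.
distances = {
    (x, y): differences(ALIGNMENT[x], ALIGNMENT[y])
    for x, y in combinations(ALIGNMENT, 2)
}

for (x, y), d in distances.items():
    print(f"{x} vs {y}: {d} differences ({d / len(ALIGNMENT[x]):.2f} per site)")
```

Real analyses usually apply a substitution model to correct these raw counts before building a tree from them.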


Similarity vs. Evolutionary Relationship:

Similarity and relationship are not the same thing, even though evolutionary relationship is inferred from certain types of similarity.

Similar: having likeness or resemblance (an observation)

Related: genetically connected (a historical fact)

Two taxa can be most similar without being most closely related:

[Figure: phylogram of Taxon A, Taxon B (e.g., HUMANS!), Taxon C, and Taxon D, with branch lengths 6, 1, 1, 3, 1, and 5]

C is more similar in sequence to A (d = 3) than to B (d = 7), but C and B are most closely related (that is, C and B shared a common ancestor more recently than either did with A).
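
The distinction can be made concrete by computing both quantities on a small tree. In this sketch the branch lengths (B = 6, C = 1, the branch above the B-C ancestor = 1, A = 1) are an assumption: one assignment that reproduces the quoted distances d = 3 and d = 7. Taxon D is left out, and the node names `BC` and `root` are invented for the example.

```python
# Each entry maps a node to (its parent, length of the branch leading to that parent).
PARENT = {
    "B": ("BC", 6),
    "C": ("BC", 1),
    "BC": ("root", 1),
    "A": ("root", 1),
}


def path_to_root(node):
    """Map `node` and each of its ancestors to their distance from `node`."""
    dist, path = 0, {node: 0}
    while node in PARENT:
        parent, length = PARENT[node]
        dist += length
        path[parent] = dist
        node = parent
    return path


def distance(x, y):
    """Sum of branch lengths along the path joining two taxa (similarity)."""
    px, py = path_to_root(x), path_to_root(y)
    return min(px[n] + py[n] for n in px if n in py)


def common_ancestor(x, y):
    """Most recent node shared by the two lineages (relationship)."""
    px, py = path_to_root(x), path_to_root(y)
    return min((n for n in px if n in py), key=lambda n: px[n])


print("d(C, A) =", distance("C", "A"), "via", common_ancestor("C", "A"))
print("d(C, B) =", distance("C", "B"), "via", common_ancestor("C", "B"))
```

C is closer to A in distance, yet its common ancestor with B (`BC`) is more recent than its common ancestor with A (`root`). The long branch leading to B is what separates similarity from relationship here.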


Main computational approach:

Optimality approaches: Use either character or distance data. First define an optimality criterion (minimum branch lengths, fewest number of events, highest likelihood), and then use a specific algorithm for finding trees with the best value for the objective function. Can identify many equally optimal trees, if such exist.

Warning: Finding an optimal tree is not necessarily the same as finding the "true" tree. Random data will give you an ‘optimal’ (best) tree!


Parsimony methods:

  • Optimality criterion: The ‘most-parsimonious’ tree is the one that requires the fewest evolutionary events (e.g., nucleotide substitutions, amino acid replacements) to explain the sequences.

  • Advantages:

      • Are simple, intuitive, and logical (many possible by ‘pencil-and-paper’).

      • Can be used on molecular and non-molecular (e.g., morphological) data.

      • Can be used for character (can infer the exact substitutions) and rate analysis.

      • Can be used to infer the sequences of the extinct (hypothetical) ancestors.

  • Disadvantages:

      • Not explicitly statistical

      • Can be fooled by high levels of parallel evolution


Use parsimony to infer the optimal (best) tree

Character-based methods: Use the aligned characters, such as DNA or protein sequences, directly during tree inference.

Taxa         Characters
Species A    ATCG CTAGACCTATAGTGCA
Species B    ATCG CTAGACCTATATTGCA
Species C    TTCG CTAGACCTGTGGTCCA
Species D    TTGA CCAGACCTGTGGTCCG
Species E    TTGA CCAGTTGTGTGGTCCG
OUTGROUP     TTAC CCATTTGTGTCCTCCG

Infer maximum parsimony tree using first four characters

Quality of trees (how likely it is that they reflect the one True Tree) can be evaluated in various ways (random data will give you a low-quality ‘best’ tree)
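
The exercise can be checked by counting, for a candidate tree, the minimum number of changes each of the first four characters requires. The sketch below uses Fitch's small-parsimony algorithm, a standard technique not named in the slides, on two candidate rooted trees that are my own choices for illustration; a full analysis would score every possible tree and keep the lowest.

```python
# First four aligned characters of each taxon, copied from the alignment above.
SITES = {"A": "ATCG", "B": "ATCG", "C": "TTCG",
         "D": "TTGA", "E": "TTGA", "OUT": "TTAC"}


def fitch(tree, site):
    """Return (candidate ancestral states, minimum changes) for one character."""
    if isinstance(tree, str):  # a terminal taxon: its observed state, no changes
        return {SITES[tree][site]}, 0
    left, right = tree
    left_states, left_cost = fitch(left, site)
    right_states, right_cost = fitch(right, site)
    shared = left_states & right_states
    if shared:  # children can agree, so no extra change is needed here
        return shared, left_cost + right_cost
    return left_states | right_states, left_cost + right_cost + 1


def parsimony_score(tree):
    """Total minimum number of changes over the four characters."""
    return sum(fitch(tree, site)[1] for site in range(4))


# Two hypothetical candidate trees, rooted with the outgroup.
tree_1 = ("OUT", ((("A", "B"), "C"), ("D", "E")))
tree_2 = ("OUT", ((("A", "C"), "B"), ("D", "E")))

print(parsimony_score(tree_1))  # 5
print(parsimony_score(tree_2))  # 6
```

Of these two candidates, the tree grouping A with B needs one change fewer and is therefore preferred under the parsimony criterion.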


We can statistically compare alternative trees, corresponding to specific biological hypotheses of the history of some set of lineages


Timescales on trees: molecular clocks

[Figure: % genetic divergence (25% to 100%) plotted against time since divergence (300 to 1500 Myr) for fibrinopeptides, hemoglobin, cytochrome c, and histone IV]

Why such different profiles? Variation in mutation rate?

Variation in selection. Genes coding for some molecules under very strong stabilizing selection.


Dates for calibrating molecular clocks can come from geology, fossils, or historical data

[Figure: calibration from known ages of islands, for two genes]


Calibrating using fossil data

[Figure labels: chimps and humans, 6 substitutions; whales and hippos, 60 substitutions; fossil date, 56 mya]
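
The arithmetic behind such a calibration is simple proportional scaling. This sketch assumes that the 56 mya fossil date applies to the whale-hippo split, that both substitution counts were measured in the same way over the same sequence, and that substitutions accumulate at a constant rate (a strict clock); none of these assumptions is stated in the slide, and the function names are invented for the example.

```python
def clock_rate(substitutions, age_mya):
    """Substitutions accumulated per million years for a pair of known age."""
    return substitutions / age_mya


def estimate_age(substitutions, rate):
    """Divergence time (mya) implied by a substitution count under a strict clock."""
    return substitutions / rate


# Calibrate on the pair with a fossil date, then date the other pair.
rate = clock_rate(substitutions=60, age_mya=56)
human_chimp_mya = estimate_age(substitutions=6, rate=rate)

print(f"rate: {rate:.3f} substitutions per Myr")
print(f"estimated human-chimp divergence: {human_chimp_mya:.1f} mya")
```

Under these assumptions, one tenth as many substitutions implies one tenth of the time, or about 5.6 million years.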


Calibrating from known dates of the ages of samples: for very fast-evolving taxa such as HIV


Uses of Phylogenetics in the Study of Health & Disease

(1) Evolutionary history of humans, between and within species

(2) Analysis of evolution of phenotypic and genetic traits in humans, especially human-specific traits - evolved when, where, why, how

(3) Taxonomy and evolution of parasites and pathogens, and evolution in relation to their hosts

(4) Evolution of cancer cell lineages, and somatic evolution more generally

(5) Study of adaptation in humans and other taxa, via analysis of divergence and convergence


EMERGING VIRUSES - THE GREATEST KNOWN HEALTH THREAT TO HUMANITY

VIRUS - what IS it?

Sequence its genome and relate the sequence to known viruses

Evolution of SIV and HIV viruses: multiple transfers to humans, from chimps (HIV-1) and from sooty mangabeys (HIV-2)


SARS (severe acute respiratory syndrome)

What causes it and where did it come from?


HIV phylogeny within humans in different regions: Haiti as stepping stone to North America


HIV evolves very rapidly WITHIN hosts, as a result of interactions with the immune system

Can do phylogenetics:

- Pathogens within individuals

- Pathogens between individuals (e.g., in different or same regions)

How do they originate? From other species?

How do they spread?

How does resistance to antibiotics evolve in pathogens, and resistance to chemotherapeutic agents evolve in cancer?


Cancer evolves genetically in the body during carcinogenesis, allowing the inference of ‘oncogenetic trees’

Cytogenetic data: gains and losses of chromosomal regions during evolution of cancers; lose tumor suppressor gene copies, gain oncogene copies

Involves losses of heterozygosity and losses of imprinting


Cancer Evolutionary Phylogenomics

Compare primary cancer with metastatic tumors


What you learned in this lecture

(1) About phylogenies, terminology, what they are, how they work, ‘tree thinking’

(2) How to infer and evaluate phylogenies

(3) How to use phylogenies to answer questions related to human adaptation, health and disease (viruses, cancer, etc.)

(4) How to THINK in terms of evolutionary trees (historical patterns of evolution), within and between species

