TREES

TREES - PowerPoint PPT Presentation



PowerPoint Slideshow about 'TREES' - bessie


Presentation Transcript

Trees

[Figure: several drawings of the same tree over Gorilla, Human and Chimp, joined by "=" signs to show that they are equivalent]


Trees

[Figure: two drawings of the same tree over s1, s2, s3, s4, s5, joined by "="]

Same thing…


Trees

Terminology

A branch = an edge

The root

Internal nodes

External node = leaf

[Figure: rooted tree over Chicken, Gorilla, Human and Chimp with these parts labeled]


Trees

Exercise

Which of the following statements are correct with respect to the tree above?

a. Human and gorilla are closer to each other than chimpanzee and gorilla are.

b. Human is equally close to the chicken and to the duck.

c. The chicken is closer to the gorilla than to the human.

d. a + b.

e. a + c.

f. b + c.

g. a + b + c.

h. None of the answers is correct.


Trees

The maximum parsimony principle.

Tree building


Trees

Tree building


Trees

Tree building

Evaluate this tree…

[Figure: tree with leaves s2, s1, s4, s3, s5]


Trees

Tree building

Gene number 1

[Figure: tree with leaves s1, s4, s3, s2, s5; 0/1 node labels as extracted: 1 0 1 1 1 1 0 0]


Trees

Tree building

Gene number 1, Option number 1.

[Figure: tree with leaves s1, s4, s3, s2, s5; 0/1 node labels as extracted: 1 0 1 1 1 1 1 0 0]


Trees

Tree building

Gene number 1, Option number 2.

[Figure: tree with leaves s1, s4, s3, s2, s5; 0/1 node labels as extracted: 0 1 0 1 1 1 1 0 0]

Number of changes for gene 1 (character 1) = 1


Trees

Tree building

Gene number 2, Option number 1.

[Figure: tree with leaves s2, s1, s4, s3, s5; 0/1 node labels as extracted: 0 1 0 1 0 1 1 0 0]


Trees

Tree building

Gene number 2, Option number 2.

[Figure: tree with leaves s2, s1, s4, s3, s5; 0/1 node labels as extracted: 1 1 0 1 0 1 1 0 0]


Trees

Tree building

Gene number 2, Option number 3.

[Figure: tree with leaves s2, s1, s4, s3, s5; 0/1 node labels as extracted: 0 0 0 0 0 1 1 0 0]

Number of changes for gene 2 (character 2) = 2


Trees

Tree building

Gene number 3, Option number 1.

[Figure: tree with leaves s2, s1, s4, s3, s5; 0/1 node labels as extracted: 0 0 1 0 0 0 0 1 1]


Trees

Tree building

Gene number 3, Option number 2.

[Figure: tree with leaves s2, s1, s4, s3, s5; 0/1 node labels as extracted: 1 0 1 0 0 0 0 1 1]

Number of changes for gene 3 (character 3) = 1


Trees

Tree building

Gene number 4, Option number 1.

[Figure: tree with leaves s2, s1, s4, s3, s5; 0/1 node labels as extracted: 1 1 1 1 1 1 0 0 1]


Trees

Tree building

Gene number 4, Option number 2.

[Figure: tree with leaves s2, s1, s4, s3, s5; 0/1 node labels as extracted: 0 0 0 1 1 1 0 0 1]

Number of changes for gene 4 (character 4) = 2


Trees

Tree building

Gene number 5 is the same as Gene number 4

Number of changes for gene 5 (character 5) = 2


Trees

Tree building

Gene number 6, 1 option only:

[Figure: tree with leaves s2, s1, s4, s3, s5; 0/1 node labels as extracted: 0 0 0 0 0 1 0 0 0]

Number of changes for gene 6 (character 6) = 1


Trees

Tree building

Sum of changes

Number of changes for gene 1 (character 1) = 1

Number of changes for gene 2 (character 2) = 2

Number of changes for gene 3 (character 3) = 1

Number of changes for gene 4 (character 4) = 2

Number of changes for gene 5 (character 5) = 2

Number of changes for gene 6 (character 6) = 1

Sum of changes for this tree topology = 9

Can we do better?


Trees

Tree building

The MP (most parsimonious) tree:

[Figure: tree with leaves s2, s1, s4, s3, s5]

Sum of changes for this tree topology = 8



Trees

The Fitch algorithm (1971):

[Figure: tree with leaves Chicken = C, Duck = A, Gorilla = C, Human = A, Chimp = G; internal nodes labeled with the sets {A,C}, {A,C,G}, {A,G}, {A,C}]

Postorder tree scan. At each internal node, take the intersection of the sets of its two children. If that intersection is empty, apply the union operator instead; otherwise, keep the intersection.


Trees

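Number of changes

[Figure: tree with leaves Chicken = C, Duck = A, Gorilla = C, Human = A, Chimp = G; internal nodes labeled with the sets {A,C}, {A,C,G}, {A,G}, {A,C} and the operator applied at each node]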

Total number of changes = number of union operators => 3 in this case.
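The postorder scan and the union count can be sketched in a few lines of Python. This is only an illustration: the nested-tuple tree encoding and the function name `fitch` are choices made here, and the topology ((Chicken, Duck), (Gorilla, (Human, Chimp))) is an assumption, chosen because it reproduces the node sets listed on the slide.

```python
def fitch(tree, states):
    """Postorder Fitch pass on a rooted binary tree.

    tree   : a leaf name (str) or a 2-tuple of subtrees (hypothetical encoding)
    states : dict mapping each leaf name to its character state
    Returns (state set at this node, number of union operations below it).
    """
    if isinstance(tree, str):          # leaf: singleton set, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:                         # non-empty intersection: keep it
        return common, left_changes + right_changes
    # empty intersection: take the union and count one change
    return left_set | right_set, left_changes + right_changes + 1


# Assumed topology; leaf states as on the slide.
tree = (("Chicken", "Duck"), ("Gorilla", ("Human", "Chimp")))
states = {"Chicken": "C", "Duck": "A", "Gorilla": "C", "Human": "A", "Chimp": "G"}

root_set, changes = fitch(tree, states)
print(sorted(root_set), changes)  # ['A', 'C'] 3
```

Under this topology the three unions occur at (Chicken, Duck), (Human, Chimp) and (Gorilla, (Human, Chimp)); the root is an intersection.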


Trees

Exercise

[Figure: tree with leaves Chicken, Duck, Gorilla, Human, Chimp; leaf sequences as extracted: CAAG, GAAA, GCGA, GACA, GGGA]

Find minimum number of changes.


Trees

Chimpanzee

Gorilla

Human


Trees

Human    ACTAG
Chimp    ACAAC
Gorilla  AAAAT

Position   Gorilla  Chimp  Human   Unions (U)
1          A        A      A       0
2          A        C      C       1
3          A        A      T       1
4          A        A      A       0
5          T        C      G       2
Total                              4


Trees

Human    ACTAG
Chimp    ACAAC
Gorilla  AAAAT

Position   Gorilla  Chimp  Human   Unions (U)
1          A        A      A       0
2          A        C      C       1
3          A        A      T       1
4          A        A      A       0
5          T        C      G       2
Total                              4


Trees

Human    ACTAG
Chimp    ACAAC
Gorilla  AAAAT

Position   Chimp  Gorilla  Human   Unions (U)
1          A      A        A       0
2          C      A        C       1
3          A      A        T       1
4          A      A        A       0
5          C      T        G       2
Total                              4


Trees

Human    ACTAG
Chimp    ACAAC
Gorilla  AAAAT

[Figure: three rooted trees, with leaf orders (Chimp, Gorilla, Human), (Gorilla, Chimp, Human) and (Gorilla, Chimp, Human)]

These 3 trees will ALWAYS get the same score
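As a quick check of this claim, the sketch below scores the three sequences from the slides on all three rooted topologies for three taxa. The nested-tuple tree encoding and the helper names `fitch_site` and `parsimony_score` are illustrative choices, not part of the original slides.

```python
def fitch_site(tree, column):
    """Fitch pass for one alignment column; returns (state set, changes)."""
    if isinstance(tree, str):
        return {column[tree]}, 0
    left_set, left_changes = fitch_site(tree[0], column)
    right_set, right_changes = fitch_site(tree[1], column)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, seqs):
    """Sum the per-position change counts over all alignment columns."""
    length = len(next(iter(seqs.values())))
    total = 0
    for i in range(length):
        column = {name: seq[i] for name, seq in seqs.items()}
        total += fitch_site(tree, column)[1]
    return total


seqs = {"Human": "ACTAG", "Chimp": "ACAAC", "Gorilla": "AAAAT"}

# The three possible rooted trees for three taxa.
rooted_trees = [
    (("Human", "Chimp"), "Gorilla"),
    (("Human", "Gorilla"), "Chimp"),
    (("Chimp", "Gorilla"), "Human"),
]

scores = [parsimony_score(t, seqs) for t in rooted_trees]
print(scores)  # [4, 4, 4]
```

All three are rootings of the same unrooted three-taxon tree, which is why the scores agree.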



Trees

A general observation: the position of the root does not affect the MP score.

[Figure: an unrooted tree over A, B, C, D, E and rooted versions of it, with leaf orders D E C A B, A B C E D and A B C E D]


Trees

Intuition as to why rooting does not change the score.

[Figure: tree with leaves s1, s4, s3, s2, s5; 0/1 node labels as extracted: 1 0 1 1 1 1 1 0 0]

The change will always be on the same branch, no matter where the root is positioned…


Trees

Exercise

Which is not a rooted version of this tree?

[Figure: an unrooted tree over A, B, C, D, E and three candidate rooted trees T1, T2 and T3; leaf orders as extracted: C E D A B, A B D E C, A B C D E]


Trees

Gorilla gorilla (Gorilla)

Pan troglodytes (Chimpanzee)

Homo sapiens (human)

Gallus gallus (chicken)


Trees

Evaluate all 3 possible UNROOTED trees:

[Figure: the three unrooted trees over Human, Chimp, Gorilla and Chicken; one of them is marked as the MP tree]


Trees

Rooting based on a priori knowledge:

Human

Chicken

Gorilla

Chimp

Chicken

Gorilla

Human

Chimp


Trees

Ingroup / Outgroup:

Chicken

Gorilla

Human

Chimp

OUTGROUP

INGROUP


Trees

Chicken

Duck

Gorilla

Human

Chimp

Subtrees

A subtree


Trees

Monophyletic groups

Chicken

Gorilla

Human

Chimp

Gorilla + Human + Chimp are monophyletic. A clade is a monophyletic group.


Trees

Paraphyletic = Non-monophyletic groups

Whale

Chimp

Drosophila

Zebrafish

The Zebrafish+Whale are paraphyletic


Trees

When an unrooted tree is given, you cannot know which groups are monophyletic. You can only say which are not.

Human

Chicken

Rat

Gorilla

Chimp

Chicken + Rat seem to be monophyletic, but they are not, since the root of the tree is between Chicken and the rest.

Human and Gorilla are not monophyletic no matter where the root is…


Trees

HOW MANY TREES


Trees

[Figure: all rooted trees for two taxa (a, b), three taxa (a, b, c) and four taxa (a, b, c, d)]

TR = “TREE ROOTED”

How many rooted trees

N=2, TR(2) = 1

N=3, TR(3) = 3

N=4, TR(4) = 15


Trees

TR = “TREE ROOTED”

How many rooted trees

[Figure: rooted trees over a, b, then a, b, c, then a, b, c, d, showing the places where each new taxon can be attached]

2 branches. 3 possible places to add “c”

4 branches. 5 possible places to add “d”

6 branches. 7 possible places to add “e”

The number of branches increases by 2 each time, so the number of branches is an arithmetic series: 0, 2, 4, 6, 8, …

A(n) = A(1) + (n-1)d, with A(1) = 0 and d = 2, so A(n) = (n-1)*2 = 2n-2.


Trees

TR = “TREE ROOTED”

How many rooted trees

Each time, we can add a new branch in BR(n)+1 places: on each of the BR(n) existing branches, or above the current root.

[BR(n) = number of branches in a rooted tree with n sequences]

[TR(n) = number of rooted trees with n sequences]

TR(n+1) = TR(n)*(BR(n)+1)=TR(n)*(2n-1)

TR(5) = TR(4)*7=TR(3)*5*7=TR(2)*3*5*7=1*3*5*7

TR(n) = 1*3*5*7*…..*(2n-3)


Trees

TR = “TREE ROOTED”

How many rooted trees

n! = 1*2*3*4*5*6*…*n = n factorial.

\[
TR(n) = 1 \cdot 3 \cdot 5 \cdot 7 \cdots (2n-3)
      = \frac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdots (2n-3)}{2 \cdot 4 \cdot 6 \cdot 8 \cdots (2n-4)}
      = \frac{(2n-3)!}{(2 \cdot 1)(2 \cdot 2)(2 \cdot 3)(2 \cdot 4) \cdots (2(n-2))}
      = \frac{(2n-3)!}{2^{n-2}\,\bigl(1 \cdot 2 \cdot 3 \cdot 4 \cdots (n-2)\bigr)}
      = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}
\]


Trees

TR = “TREE ROOTED”

How many rooted trees

\[
TR(n) = 1 \cdot 3 \cdot 5 \cdot 7 \cdots (2n-3) = (2n-3)!! = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}
\]
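The recursion TR(n+1) = TR(n)*(2n-1) and the closed form with 2 raised to the power n-2 in the denominator can be checked against each other numerically. The function names below are illustrative only.

```python
from math import factorial


def rooted_trees_recursive(n):
    """TR(n) built from TR(2) = 1 and TR(k+1) = TR(k) * (2k - 1)."""
    count = 1
    for k in range(2, n):
        count *= 2 * k - 1
    return count


def rooted_trees_closed_form(n):
    """TR(n) = (2n-3)! / (2^(n-2) * (n-2)!)."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


for n in range(2, 11):
    assert rooted_trees_recursive(n) == rooted_trees_closed_form(n)
    print(n, rooted_trees_recursive(n))
```

For n = 2, 3, 4, 5 this gives 1, 3, 15, 105, matching the counts on the slides, and for n = 10 it already gives 34459425 rooted trees.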


Trees

HEURISTIC SEARCH


Trees

There are many trees…

We cannot go over all the trees. We will try to find a way to find the best tree.

These are approximate solutions…


Trees

Finding the maximum is the same thing as finding the minimum

Say we have a computer procedure that, given a function, finds its minimum, and we want to find the maximum of a function f(x). We can just find the minimum of -f(x); this minimum is minus the maximum of f(x), and it is attained at the same x.

Example.

f(0) = 3; f(1) = 7; f(2) = -5; f(3) = 0; max f(x) = 7; argmax f(x) = 1.

-f(0) = -3; -f(1) = -7; -f(2) = 5; -f(3) = 0; min(-f(x)) = -7; argmin(-f(x)) = 1.
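The same example can be written as a short Python sketch. Here `argmin` stands in for the "computer procedure" that only knows how to minimize; its name and signature are assumptions made for illustration.

```python
def argmin(func, domain):
    """Stand-in minimizer: return the point of the domain where func is smallest."""
    return min(domain, key=func)


# The function from the example, tabulated on the points 0..3.
f = {0: 3, 1: 7, 2: -5, 3: 0}

# Maximize f by minimizing -f.
x_best = argmin(lambda x: -f[x], f)
max_value = -min(-f[x] for x in f)

print(x_best, max_value)  # 1 7
```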


Trees

Score = 1700


Trees

Score = 1825

Score = 1700

Score = 1710

Score = 1695

Score = 1410


Trees

Score = 1828

Score = 1825

Score = 1910

Score = 1800


Trees

Max score = 2900


Trees

Problem number 1: local maximum

Score = 3100

Global max

Score = 2900

Local max

Score = 2100


Trees

This algorithm is “greedy” – it seizes the first improvement encountered.

One way to avoid local maxima is to start from many random starting points
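A toy version of this idea is sketched below. It is not a tree search: the "landscape" is a hypothetical list of scores in which each index has its left and right index as neighbors, with a local maximum (2900) and a global maximum (3100). The names `hill_climb` and `multi_start` are illustrative.

```python
import random


def hill_climb(scores, start):
    """Greedy ascent: move to the first better neighbor until none is better."""
    current = start
    improved = True
    while improved:
        improved = False
        for i in (current - 1, current + 1):      # left neighbor, then right
            if 0 <= i < len(scores) and scores[i] > scores[current]:
                current = i                        # seize the first improvement
                improved = True
                break
    return current


def multi_start(scores, n_starts, seed=0):
    """Run hill_climb from several random starting points and keep the best result."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(scores)) for _ in range(n_starts)]
    return max((hill_climb(scores, s) for s in starts), key=lambda i: scores[i])


# Hypothetical landscape: local maximum at index 2, global maximum at index 6.
scores = [2100, 2500, 2900, 2600, 2100, 2700, 3100, 2800]

print(hill_climb(scores, 0))  # 2 (stuck on the local maximum)
print(hill_climb(scores, 5))  # 6 (reaches the global maximum)
print(scores[multi_start(scores, n_starts=20)])
```

A single start can end on either peak depending on where it begins; with more random starts, it becomes more likely that at least one of them climbs the global peak.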


Trees

Option 1

Several options to define a neighbor.

Option 2


Trees

Nearest-neighbor interchange

Each internal branch defines two neighbors.

[Figure: a tree with subtrees A, B, C and D around one internal branch, and its two NNI neighbors]


Trees

How many neighbors do we check each time?

NNI is possible only in internal branches.

[Figure: unrooted tree over A, B, C, D and E with its internal and external branches marked]

For unrooted trees of n taxa, we have 2n-3 branches. However, only internal branches are interesting, thus we have n-3. Each defines two neighbors, thus the total number of neighbors in each NNI cycle is 2n-6.
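The counting argument, and the two rearrangements around a single internal branch, can be sketched as follows. The function names and the tuple encoding of the four subtrees around a branch are illustrative assumptions.

```python
def nni_counts(n):
    """For an unrooted binary tree with n taxa (n >= 3), return
    (total branches, internal branches, NNI neighbors)."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    branches = 2 * n - 3
    internal = n - 3
    return branches, internal, 2 * internal


def nni_neighbors(a, b, c, d):
    """An internal branch separates subtrees (a, b) from (c, d).
    The two NNI neighbors swap one subtree across that branch."""
    return [((a, c), (b, d)), ((a, d), (b, c))]


print(nni_counts(5))                      # (7, 2, 4)
print(nni_neighbors("A", "B", "C", "D"))
```

For the five-taxon tree on the slide this gives 7 branches, 2 of them internal, and therefore 4 neighbors to evaluate in each NNI round.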


Trees

I am greedy


Trees

Greedy variants

  • Most greedy: Start searching your neighbors. If you find something better – move there, and start the search again.

  • Just greedy: Check ALL your neighbors. Move to the one that is the highest.

  • Smart greedy: Try all NNI of trees that are tied for the best score.

There are many other variants of the greedy search that will not be discussed in this course.
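The difference between the first two variants can be seen in a single search step. The sketch below uses a hypothetical one-dimensional score list in place of tree space, with the left and right index as the neighbors; the function names are illustrative.

```python
def neighbors(i, n):
    """Neighbors of index i on a line of n points (toy stand-in for NNI neighbors)."""
    return [j for j in (i - 1, i + 1) if 0 <= j < n]


def most_greedy_step(scores, current):
    """'Most greedy': move to the first neighbor that is better."""
    for j in neighbors(current, len(scores)):
        if scores[j] > scores[current]:
            return j
    return current


def just_greedy_step(scores, current):
    """'Just greedy': check ALL neighbors and move to the best one, if it is better."""
    best = max(neighbors(current, len(scores)), key=lambda j: scores[j])
    return best if scores[best] > scores[current] else current


# Hypothetical scores; from index 4 both neighbors are improvements.
scores = [2100, 2500, 2900, 2600, 2100, 2700, 3100, 2800]

print(most_greedy_step(scores, 4))  # 3 (first improvement found)
print(just_greedy_step(scores, 4))  # 5 (best of all neighbors)
```

In this toy landscape the first-improvement move heads toward the local maximum at index 2, while the best-neighbor move heads toward the global maximum at index 6; on other landscapes the outcome can be the other way around.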


Trees

Likelihood


Trees

  • Parsimony has many shortcomings. To name a few:

  • All changes are counted the same, which is not true for biological systems (Leu->Ile is much more likely than Leu->His).

  • Cannot take biological context into account (secondary structures, dependencies among sites, evolutionary distances between the analyzed organisms, etc).

  • Statistical basis questionable.


Trees

Alternative:

MAXIMUM-LIKELIHOOD METHOD.


Trees

Maximum likelihood uses a probabilistic model of evolution

Each amino acid has a certain probability to change and this probability depends on the evolutionary distances.

Evolutionary distances are inferred from the entire set of sequences.
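To make the dependence on evolutionary distance concrete, here is a minimal sketch using the Jukes-Cantor (JC69) model. This is an assumption made for illustration: JC69 is a simple nucleotide model, not the amino-acid model the slides refer to, and the function name `jc69_prob` is hypothetical. The distance t is measured in expected substitutions per site.

```python
import math


def jc69_prob(same, t):
    """JC69 probability that a site ends in the same nucleotide (same=True)
    or in one specific different nucleotide (same=False) after distance t."""
    decay = math.exp(-4.0 * t / 3.0)
    if same:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


for t in (0.0, 0.1, 1.0, 10.0):
    p_same = jc69_prob(True, t)
    p_diff = jc69_prob(False, t)
    # One "same" outcome plus three "different" outcomes always sum to 1.
    print(t, round(p_same, 4), round(p_diff, 4), round(p_same + 3 * p_diff, 4))
```

At distance 0 nothing has changed; as the distance grows, every nucleotide approaches probability 1/4, so an observed difference says less and less about the site itself.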


Trees

Evolutionary distances

Positions can be conserved for two reasons: either because of functional constraints, or because of short evolutionary time.

5 replacements in 10 positions between 2 chimps is considered very variable. 5 replacements between human and cucumber is not considered that variable…

Maximum likelihood takes this information into account.


Trees

The likelihood computations

[Figure: tree with internal nodes X, Y, Z, branch lengths t1 to t6, and leaves labeled K, M, A, C]

We can infer the phylogenetic tree using maximum likelihood. This is generally considered more accurate than maximum parsimony.


Trees

Maximum likelihood tree reconstruction

This is incredibly difficult (and challenging) from the computational point of view, but efficient algorithms to find approximate solutions were developed.



Trees

Human Immunodeficiency Virus (HIV)

The virus = HIV

The disease = AIDS (Acquired Immunodeficiency Syndrome)

First recognized clinically in 1981

By 1992, it had become the major cause of death in individuals 25-44 years of age in the United States.


Trees

HIV Statistics

  • By December 2007, 25 million people had died of AIDS (20 million as of 2002).

  • People living with HIV/AIDS in 2007: 33.2 million.

  • Africa had 12 million AIDS orphans (2007). In some areas, 1 out of 3 children has lost at least one parent.


Trees

HIV is a lentivirus

Species = HIV

Genus = Lentivirus

Family = Retroviridae

Lentiviruses have a long incubation time, and are thus called “slow viruses”.


Trees

HIV-1 and HIV-2

In 1986, a distinct type of HIV prevalent in certain regions of West Africa was discovered and was termed HIV type 2.

Individuals infected with type 2 also developed AIDS, but with a longer incubation time and lower morbidity (number of cases / population size).


HIV subtypes

published by the International AIDS Vaccine Initiative


Trees

Five lines of evidence have been used to substantiate zoonotic transmission of primate lentivirus:

1. Similarities in viral genome organization;

2. Phylogenetic relatedness;

3. Prevalence in the natural host;

4. Geographic coincidence;

5. Plausible routes of transmission.


Trees

For HIV-2, a virus (SIVsm) that is genomically indistinguishable and closely related phylogenetically was found in substantial numbers of wild-living sooty mangabeys, whose natural habitat coincides with the epicenter of the HIV-2 epidemic.


Trees

Mangabey: a long-tailed monkey of the genus Cercocebus, found in the forest regions of Africa.


Trees

Close contact between sooty mangabeys and humans is common because these monkeys are hunted for food and kept as pets.

No fewer than six independent transmissions of SIVsm to humans have been proposed.

The origin of HIV-1 is much less certain.


Trees

HIV and SIV tree based on maximum parsimony

1990


Trees

Virus A

Primate A

Primate B

Virus B

Primate C

Virus C

Host-pathogen co-evolution in other SIV

This tree can be explained by co-evolution of virus and host.


Trees

Phylogenetic tree

1999

There are at least two different HIV-1 clades, and two different SIVcpz clades


Trees

2006. Nature


Trees

The origin of HIV-O

“We tested 378 chimpanzees and 213 gorilla fecal samples from remote forest regions in Cameroon for HIV-1 cross-reactive antibodies”

“Surprisingly, 6 of 213 fecal samples from wild-living gorillas also gave a positive HIV-1 signal”


Trees

Bayesian analysis

HIV-1 O is a sister clade of SIV from Gorilla!


Trees

The origin of HIV-O

It seems that either chimpanzees transmitted SIV to gorillas, and gorillas then transmitted it to humans (type O), or chimpanzees transmitted it to both gorillas and humans (type O).

Note: gorillas and chimps rarely interact, and gorillas are herbivores.

?


Trees

Thank You…