
Molecular phylogenetics 4




  1. Molecular phylogenetics 4
     • Level 3 Molecular Evolution and Bioinformatics
     • Jim Provan
     • Page and Holmes: Sections 6.7-8

  2. Have we got the true tree?
     • Several approaches have been developed to answer this question:
       • Analysis:
         • In some cases (e.g. UPGMA) the phylogenetic method is simple enough that we can establish mathematically the exact conditions under which it will fail
         • Parsimony can fail under particular distributions of edge lengths
       • Known phylogenies:
         • The best evidence for the success of a tree-building method would be its accurate reconstruction of a known phylogeny
         • Typically, “known” phylogenies exist only for crop plants and laboratory animals, and even these are often suspect
         • Growth of bacteriophage T7 in the presence of mutagens produced a known phylogeny, allowing comparison of tree-building methods
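For UPGMA, the condition can be made concrete: the method is guaranteed to recover the tree when the pairwise distances are ultrametric (clock-like), meaning that for any three taxa the two largest of the three distances are equal. When rates differ between lineages this condition breaks and UPGMA may fail. The following is a minimal Python sketch of that three-point check; the taxon names and distance values are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def is_ultrametric(dist, taxa, tol=1e-9):
    """Return True if every triple of taxa satisfies the three-point condition."""
    for a, b, c in combinations(taxa, 3):
        d = sorted([
            dist[frozenset((a, b))],
            dist[frozenset((a, c))],
            dist[frozenset((b, c))],
        ])
        # The two largest distances in each triple must be equal.
        if abs(d[2] - d[1]) > tol:
            return False
    return True

def make_dist(pairs):
    """Store each pairwise distance under an unordered pair of taxon names."""
    return {frozenset(pair): value for pair, value in pairs.items()}

taxa = ["A", "B", "C", "D"]

# Hypothetical clock-like distances: UPGMA's assumption holds.
clock_like = make_dist({
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
})

# Hypothetical distances where lineage B has evolved faster than the others.
unequal_rates = make_dist({
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 8,
    ("A", "D"): 10, ("B", "D"): 12, ("C", "D"): 10,
})

print(is_ultrametric(clock_like, taxa))
print(is_ultrametric(unequal_rates, taxa))
```

A failed check does not prove that UPGMA returns the wrong tree for a given data set; it only shows that the guarantee no longer applies.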

  3. Have we got the true tree?
     • Several approaches (continued):
       • Simulation:
         • Provide software with a tree and “evolve” DNA sequences along its branches according to some model
         • Supply the resulting sequences to a range of tree-building methods and determine which (if any) recover the original tree
         • An advantage of this approach is that we can explore the effects of a wide range of parameters on the performance of tree reconstruction methods
         • A disadvantage is that the models used to generate the sequences may be unrealistic, and may bias the comparison towards a particular method
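The simulation step can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the Jukes-Cantor model (the slides do not say which model is used), and the tree shape, branch lengths, sequence length and all function names are hypothetical. The long branch leading to taxon D is there simply to show that branch lengths are a parameter one can vary.

```python
import math
import random

BASES = "ACGT"

def evolve(seq, branch_length, rng):
    """Evolve a sequence along one branch under the Jukes-Cantor model."""
    # Probability that a site ends as a different base after `branch_length`
    # expected substitutions per site.
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            # Under Jukes-Cantor the three other bases are equally likely.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def simulate(node, seq, rng, leaves):
    """Recursively evolve `seq` down the tree, collecting the leaf sequences."""
    name, children = node
    if not children:
        leaves[name] = seq
        return
    for child, length in children:
        simulate(child, evolve(seq, length, rng), rng, leaves)

# Hypothetical tree ((A:0.1,B:0.1):0.05,(C:0.1,D:0.6):0.05).
# Each node is (name, [(child_node, branch_length), ...]).
tree = ("root", [
    (("AB", [(("A", []), 0.1), (("B", []), 0.1)]), 0.05),
    (("CD", [(("C", []), 0.1), (("D", []), 0.6)]), 0.05),
])

rng = random.Random(1)  # fixed seed so the run is repeatable
root_seq = "".join(rng.choice(BASES) for _ in range(200))
leaves = {}
simulate(tree, root_seq, rng, leaves)

for name, seq in sorted(leaves.items()):
    print(name, seq[:40])
```

The leaf sequences would then be passed to each tree-building method under test, and the inferred trees compared with the tree used to generate them.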

  4. [Figure: UPGMA and parsimony panels; the “Felsenstein Zone”]

  5. Congruence
     • Congruence is the agreement between estimates of phylogeny based on different characters:
       • If data sets are independent, the probability of obtaining similar trees by chance is extremely small
       • Conversely, if different data sets give similar trees, this suggests that they reflect the same underlying cause, namely the same evolutionary history
     • Two ways of using congruence:
       • To validate a method of inference: a method that consistently recovers similar trees from different data sets is preferred to one that produces different trees from different data sets
       • To validate a new source of data: does a newly sequenced gene contain phylogenetic information?
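One simple way to quantify agreement between two trees is to count the clades found in one tree but not the other, in the spirit of the Robinson-Foulds distance. The Python sketch below assumes rooted trees written as nested tuples; the three gene trees are hypothetical examples, not results from the course material.

```python
def clades(tree):
    """Return the set of clades (as frozensets of leaf names) in a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    all_leaves = walk(tree)
    found.discard(all_leaves)  # the clade of all taxa is shared by every tree
    return found

def clade_difference(t1, t2):
    """Number of clades present in exactly one of the two trees (0 = identical)."""
    return len(clades(t1) ^ clades(t2))

# Hypothetical trees inferred from three different genes.
gene1 = ((("Human", "Chimp"), "Gorilla"), ("Orang-utan", "Gibbon"))
gene2 = ((("Chimp", "Human"), "Gorilla"), ("Gibbon", "Orang-utan"))
gene3 = ((("Human", "Gorilla"), "Chimp"), ("Orang-utan", "Gibbon"))

print(clade_difference(gene1, gene2))  # same topology, different drawing order
print(clade_difference(gene1, gene3))  # disagree on the Human/Chimp/Gorilla grouping
```

A score of zero means the two data sets recovered the same topology; larger values indicate more conflict between them.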

  6. Sampling error
     • If a data set contains homoplasy then different nucleotide sites support different trees:
       • Which tree(s) a given data set supports depends on which characters have been sampled
       • Estimates of phylogeny based on samples will be accompanied by sampling error
     • The effects of sampling error are evident when comparing trees for different mitochondrial genes:
       • Since there is no recombination, all mitochondrial genes share the same evolutionary history
       • Nevertheless, several different trees were obtained
     • Sampling of taxa is also important

  7. Bootstrapping
     • Bootstrapping is a way of estimating sampling error without taking repeated samples from the population / species under study:
       • Mimics the technique of repeated sampling from the original population by resampling from the original sample
       • Each resampling is a pseudoreplicate
     • Bootstrapping can be applied to phylogenetics by taking several pseudoreplicates:
       • Sampling sites with replacement gives a new data set based on the original sample:
         • Some sites are represented more than once
         • Some sites are not represented at all
       • Each pseudoreplicate can be used to construct a new tree
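The resampling step can be sketched in Python as follows. The alignment is the five-taxon, nine-site example from the next slide; the function name `pseudoreplicate` and the fixed random seed are assumptions made for illustration. Whole columns are resampled, so each taxon keeps the same sites as every other taxon in the pseudoreplicate.

```python
import random

alignment = {
    "Human":      "TCCTTAAAA",
    "Chimp":      "TTCTATAAA",
    "Gorilla":    "TTACAATAA",
    "Orang-utan": "CCACAAATA",
    "Gibbon":     "CCACAAAAT",
}

def pseudoreplicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices; the same column may be drawn more than once.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    resampled = {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }
    return columns, resampled

rng = random.Random(0)  # fixed seed so the run is repeatable
columns, resampled = pseudoreplicate(alignment, rng)

print("Sites drawn:", [c + 1 for c in columns])  # 1-based, as on the slide
for taxon, seq in resampled.items():
    print(f"{taxon:<11}{seq}")
```

In a full analysis this would be repeated many times (100 pseudoreplicates is a common choice), with a tree built from each resampled alignment.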

  8. Bootstrapping
     Original alignment:
       Site         1 2 3 4 5 6 7 8 9
       Human        T C C T T A A A A
       Chimp        T T C T A T A A A
       Gorilla      T T A C A A T A A
       Orang-utan   C C A C A A A T A
       Gibbon       C C A C A A A A T
     Pseudoreplicate (sites resampled with replacement):
       Site         2 7 7 3 1 7 4 9 6
       Human        C A A C T A T A A
       Chimp        C A A C T A T A T
       Gorilla      A T T A T T C A A
       Orang-utan   A A A A C A C A A
       Gibbon       A A A A C A C T A
     [Figure: original tree and bootstrap tree]

  9. Bootstrapping
     [Figure: three alternative trees with tips labelled H, C, G, O and B; bootstrap frequencies 41/100, 28/100 and 31/100]
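The fractions on this slide are bootstrap proportions: the number of pseudoreplicates, out of 100, that gave each tree. A minimal Python sketch of the tally is shown below. The counts are those on the slide, but the labels "tree 1" to "tree 3" are placeholders, since the topologies themselves are only in the figure.

```python
from collections import Counter

# One entry per pseudoreplicate, recording which topology it produced.
# Labels are placeholders; the counts 41, 28 and 31 come from the slide.
replicate_trees = ["tree 1"] * 41 + ["tree 2"] * 28 + ["tree 3"] * 31

counts = Counter(replicate_trees)
total = len(replicate_trees)

# Bootstrap proportion for each topology.
support = {tree: n / total for tree, n in counts.items()}
best_tree, best_count = counts.most_common(1)[0]

for tree, proportion in support.items():
    print(f"{tree}: {counts[tree]}/{total} = {proportion:.2f}")
print("Most frequent:", best_tree)
```

None of the three trees is recovered in a majority of pseudoreplicates here, so the data do not strongly favour any one of them.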

  10. What can go wrong?
     • Sampling error:
       • Almost all phylogenies are based on a sample of some sort
       • Especially true given the vagaries of homoplasy
     • Incorrect model of sequence evolution:
       • All methods make implicit or explicit assumptions about the evolutionary process
       • An example is the problem of base composition:
         • An AT-rich part of a gene may be more similar to an AT-rich part of a different gene purely by chance
     • Tree structure:
       • Evolutionary history is not always simple:
         • Rapid cladogenesis
         • Widely differing rates of divergence
         • Horizontal gene transfer
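The base composition problem can be illustrated with a short calculation. If two unrelated sequences share the same base frequencies, the probability that they match at a given site by chance is the sum of the squared frequencies. The Python sketch below compares an even composition with an AT-rich one; the AT-rich frequencies are hypothetical values chosen for illustration.

```python
def chance_identity(freqs):
    """Probability that two independent sites with these base frequencies match."""
    assert abs(sum(freqs.values()) - 1.0) < 1e-9, "frequencies must sum to 1"
    return sum(f * f for f in freqs.values())

# Equal base frequencies.
even = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# Hypothetical AT-rich composition (80% A+T).
at_rich = {"A": 0.40, "C": 0.10, "G": 0.10, "T": 0.40}

print(round(chance_identity(even), 2))     # 0.25
print(round(chance_identity(at_rich), 2))  # 0.34
```

With this composition, two unrelated AT-rich regions are expected to match at about 34% of sites rather than 25%, which a method assuming equal base frequencies could misread as evidence of relationship.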
