1 / 19

Building Phylogenies

Building Phylogenies. Maximum Likelihood. Methods. Distance-based Parsimony Maximum likelihood. Methods. Distance-based Parsimony Maximum likelihood. ML is based on a Markov model of evolution. Observed: The species labeling the leaves Hidden: The ancestral states

Download Presentation

Building Phylogenies

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Building Phylogenies Maximum Likelihood

  2. Methods • Distance-based • Parsimony • Maximum likelihood

  3. Methods • Distance-based • Parsimony • Maximum likelihood

  4. ML is based on a Markov model of evolution • Observed: The species labeling the leaves • Hidden: The ancestral states • Transition probabilities: The mutation probabilities • Assumptions: • Only mutations are allowed • Sites are independent

  5. Models of evolution at a site • Transition probability matrix:M = [mij], i, j{A, C, T, G}where mij = Prob(ij mutation in 1 time unit) • Branches may have different lengths

  6. The probability of an assignment T G T A G C T Probability = mTG ·mGA · mGG · mTT · mTC · mTT

  7. Ancestral reconstruction: most likely assignment X Y Z A G C T L* = maxX,Y,Z {mXY · mYA · mYG· mXZ· mZC· mZT} Compute using Viterbi algorithm

  8. Likelihood of a tree X Y Z A G C T L* = X,Y,Z {mXY · mYA · mYG· mXZ· mZC· mZT} Compute using forward algorithm

  9. Analyzing a site

  10. Analysis for site j

  11. Analysis for all sites Use enumeration (exhaustive, branch and bound, branch swapping, etc.) to find ML tree

  12. Comments • ML is robust • ML converges to correct answer as more data is added • Can put in a Bayesian statistical framework, to obtain a distribution of possible phylogenies • ML can be slow

  13. Complicating factors

  14. Issues • Complicating factors: • Gene duplication • Horizontal gene transfer: Exchange of genetic material between species • Chimeric genes • Evolution may not be described by a tree, but by a network

  15. Gene Duplication  1 2    5’ 3’ human -globin

  16. Homology, orthology, and paralogy • Homology:Similarity attributed to descent from a common ancestor. • Orthologous sequences: Homologous sequences in different species that arose from a common ancestral gene during speciation • May or may not be responsible for a similar function • Paralogous sequences: Homologous sequences within a single species that arose by gene duplication.

  17. Orthology and Paralogy http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/Orthology.html

  18. A A B C C B Conflicts between genes y species? Species Genes

  19.    A B C B A A B C C Resolving the conflict Problem: Resolve conflicts using the minimum number of duplications

More Related