Probabilistic Approaches to Phylogeny

Wouter Van Gool & Thomas Jellema

Presentation Transcript


  1. Probabilistic Approaches to Phylogeny Wouter Van Gool & Thomas Jellema

  2. Probabilistic Approaches to Phylogeny
     Contents
     • Introduction/Overview (Wouter)
     • Probabilistic Models of Evolution (Wouter)
     • Calculating the Likelihood (Wouter)
     • Pause
     • Evolution Demo (Thomas)
     • Using the likelihood for inference (Thomas)
     • Phylogeny Demo (Thomas)
     • Summary/Conclusion (Thomas)
     • Questions

  3. 8.1 Introduction
     Goal:
     • Formulate probabilistic models for phylogeny
     • Infer trees from sets of sequences
     Aim of probability-based phylogeny: rank trees according to
     - likelihood P(data | tree)
     - posterior probability P(tree | data)

  4. 8.1 Introduction
     Compute the probability of a set of data given a tree: P(x* | T, t*)
     x*: set of n sequences x^j (j = 1…n)
     T: tree with n leaves, with sequence j at leaf j
     t*: edge lengths of the tree

  5. 8.1 Introduction Example

  7. 8.2 Probabilistic Models of Evolution
     Given the sequences at the leaves, x^1…x^n:
     • Pick a model of evolution: P(x | y, t), P(x)
     • Enumerate all possible tree topologies T with n leaves
     • For each T, maximize P(x* | T, t*) over all possible edge lengths t*
     • Pick the T and t* that have the largest probability

  8. 8.2 Probabilistic Models of Evolution
     Simplifying assumptions:
     • Single-base substitutions only, so ungapped alignments only
     • Each base evolves independently, with the same model of evolution, based on a substitution matrix

  9. 8.2 Probabilistic Models of Evolution
     Substitution matrix for phylogeny
     Many important families of substitution matrices are multiplicative: S(t)S(s) = S(t+s)
     Substitution matrices used in phylogeny:
     • Jukes & Cantor model [1969]
     • Kimura DNA model [1980]
     • PAM matrix [1978]
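The multiplicative property can be checked numerically. The sketch below assumes the usual closed form of the Jukes-Cantor matrix (diagonal entries (1 + 3e^(-4αt))/4, off-diagonal entries (1 - e^(-4αt))/4); the function names and the rate `alpha` are illustrative choices, not taken from the slides.

```python
import math

BASES = "ACGT"

def jukes_cantor(t, alpha=1.0):
    """Jukes-Cantor matrix S(t): rows are the ancestral base, columns the descendant base."""
    e = math.exp(-4.0 * alpha * t)
    same = 0.25 * (1.0 + 3.0 * e)   # probability that the base is unchanged after time t
    diff = 0.25 * (1.0 - e)         # probability of each specific substitution
    return [[same if a == b else diff for b in BASES] for a in BASES]

def matmul(A, B):
    """Plain 4x4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def max_abs_diff(A, B):
    """Largest entry-wise difference between two 4x4 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

t, s = 0.3, 0.5
gap = max_abs_diff(matmul(jukes_cantor(t), jukes_cantor(s)), jukes_cantor(t + s))
print("max |S(t)S(s) - S(t+s)| =", gap)
```

The printed gap should be at the level of floating-point rounding, which is what S(t)S(s) = S(t+s) predicts.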

  10. 8.2 Probabilistic Models of Evolution Jukes-Cantor Model

  11. 8.2 Probabilistic Models of Evolution Kimura DNA model

  12. 8.2 Probabilistic Models of Evolution PAM matrix model

  14. 8.3 Calculating the likelihood for ungapped alignments Example: The likelihood of two nucleotide sequences
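For two sequences joined at a root, the likelihood at each site is a sum over the unknown root base, and the sites multiply. The sketch below is one way to write that out; it assumes a Jukes-Cantor model with a uniform root distribution, and `jc_prob`, `pair_likelihood`, the example sequences and the edge lengths are all illustrative.

```python
import math

BASES = "ACGT"

def jc_prob(b, a, t, alpha=1.0):
    """P(b | a, t) under Jukes-Cantor: base a becomes base b along an edge of length t."""
    e = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * e) if a == b else 0.25 * (1.0 - e)

def pair_likelihood(x1, x2, t1, t2):
    """P(x1, x2 | T, t1, t2) for two ungapped sequences of equal length."""
    q = 0.25                      # assumed uniform distribution over the root base
    total = 1.0
    for u in range(len(x1)):      # sites are treated as independent
        total *= sum(q * jc_prob(x1[u], a, t1) * jc_prob(x2[u], a, t2)
                     for a in BASES)
    return total

print(pair_likelihood("AATC", "AATT", 0.1, 0.2))
```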

  15. 8.3 Calculating the likelihood for ungapped alignments
      Likelihood for the general case
      Notation: node α(i) is the ancestor of node i.
      A fixed set of values t1…t2n-1 and a topology T are required.
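For reference, the general likelihood is usually written as a sum over all assignments of bases to the internal nodes. The form below is a sketch under an assumed numbering: leaves are nodes 1…n, internal nodes are n+1…2n-1, and node 2n-1 is the root, which has no ancestor and therefore contributes the prior P(x^{2n-1}) rather than a transition term.

```latex
P(x^{1},\dots,x^{n} \mid T, t^{*})
  = \sum_{x^{n+1},\dots,x^{2n-1}}
    P\!\left(x^{2n-1}\right)
    \prod_{i=n+1}^{2n-2} P\!\left(x^{i} \mid x^{\alpha(i)}, t_{i}\right)
    \prod_{i=1}^{n} P\!\left(x^{i} \mid x^{\alpha(i)}, t_{i}\right)
```

The sum has 4^(n-1) terms for DNA, which is why a recursive evaluation is preferred over direct enumeration.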

  17. 8.3 Calculating the likelihood for ungapped alignments
      Felsenstein’s recursive algorithm
      Define a table of probabilities F_{k,a} for each site u, for all tree nodes k and input characters a:
      F_{k,a} = probability, at site u, of the subtree below node k, assuming the character at node k is a

  18. 8.3 Calculating the likelihood for ungapped alignments Felsenstein’s recursive algorithm
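A minimal sketch of the recursion, assuming its standard form: for a node k with children i and j, F[k][a] = (Σ_b P(b | a, t_i) F[i][b]) · (Σ_c P(c | a, t_j) F[j][c]), and for a leaf F[k][a] is 1 if the leaf shows a and 0 otherwise. The tree encoding (a leaf is a string, an internal node is a pair of `(child, edge_length)` tuples), the Jukes-Cantor model, the uniform root distribution and all names are assumptions made for illustration.

```python
import math

BASES = "ACGT"

def jc_prob(b, a, t, alpha=1.0):
    """P(b | a, t) under Jukes-Cantor."""
    e = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * e) if a == b else 0.25 * (1.0 - e)

def felsenstein(node, site):
    """Return {a: F[node][a]} for one alignment column."""
    if isinstance(node, str):                       # leaf: the node is its sequence
        return {a: 1.0 if node[site] == a else 0.0 for a in BASES}
    (left, t_left), (right, t_right) = node         # internal node: two (child, length) pairs
    F_left = felsenstein(left, site)
    F_right = felsenstein(right, site)
    return {
        a: sum(jc_prob(b, a, t_left) * F_left[b] for b in BASES)
           * sum(jc_prob(c, a, t_right) * F_right[c] for c in BASES)
        for a in BASES
    }

def tree_likelihood(root, n_sites):
    """Multiply the per-site likelihoods, weighting the root base uniformly."""
    total = 1.0
    for u in range(n_sites):
        F_root = felsenstein(root, u)
        total *= sum(0.25 * F_root[a] for a in BASES)
    return total

# Hypothetical three-leaf tree: (AATC, AATT) form a cherry that joins TTAA at the root.
cherry = (("AATC", 0.1), ("AATT", 0.1))
root = ((cherry, 0.2), ("TTAA", 0.3))
print(tree_likelihood(root, 4))
```

Each node is visited once per site, so the cost grows linearly with the number of nodes instead of exponentially with the number of internal nodes.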

  19. 8.3 Calculating the likelihood for ungapped alignments
      Likelihood for the general case
      Overall algorithm:
      • Enumerate each tree topology T
      • Enumerate sets of edge lengths t (using some n-dimensional optimisation technique)
      • Run Felsenstein’s recursive algorithm for each site u and multiply the likelihoods
      • Return the best T and t

  20. 8.3 Calculating the likelihood for ungapped alignments
      Reversibility and independence of root position
      • The score of the optimal tree is independent of the root position if and only if:
        - the substitution matrix is multiplicative
        - the substitution matrix is reversible
      • A substitution matrix is reversible if, for all a, b and t: P(b | a, t) P(a) = P(a | b, t) P(b)
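Reversibility can be tested directly from its definition, P(a) P(b | a, t) = P(b) P(a | b, t). The sketch below assumes the usual closed form of the Kimura two-parameter matrix (transition rate `alpha`, transversion rate `beta`) and contrasts it with a made-up cyclic matrix that has the same uniform stationary distribution but is not reversible. All names and parameter values are illustrative.

```python
import math

BASES = "ACGT"
PURINES = {"A", "G"}

def kimura(t, alpha=1.0, beta=0.5):
    """Kimura two-parameter matrix; entry [a][b] is P(b | a, t)."""
    s = 0.25 * (1.0 - math.exp(-4.0 * beta * t))                  # each transversion
    u = 0.25 * (1.0 + math.exp(-4.0 * beta * t)
                - 2.0 * math.exp(-2.0 * (alpha + beta) * t))      # the transition
    r = 1.0 - 2.0 * s - u                                         # no change

    def entry(a, b):
        if a == b:
            return r
        if (a in PURINES) == (b in PURINES):
            return u      # purine <-> purine or pyrimidine <-> pyrimidine
        return s

    return [[entry(a, b) for b in BASES] for a in BASES]

def is_reversible(P, q, tol=1e-12):
    """Check q[a] * P[a][b] == q[b] * P[b][a] for every pair of states."""
    n = len(q)
    return all(abs(q[a] * P[a][b] - q[b] * P[b][a]) <= tol
               for a in range(n) for b in range(n))

uniform = [0.25] * 4
# Hypothetical counter-example: substitutions only flow A -> C -> G -> T -> A.
cyclic = [[0.7, 0.3, 0.0, 0.0],
          [0.0, 0.7, 0.3, 0.0],
          [0.0, 0.0, 0.7, 0.3],
          [0.3, 0.0, 0.0, 0.7]]

print(is_reversible(kimura(0.4), uniform))
print(is_reversible(cyclic, uniform))
```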

  23. Demo

  25. 8.4 Using the likelihood for inference
      Maximum likelihood:
      • The best tree “could be” the tree that maximises the likelihood
      • Computationally demanding

  26. 8.4 Using the likelihood for inference Sampling from the posterior distribution: • We use Bayes’ rule to compute the posterior probability • This is the probability of a model given the data

  27. 8.4 Using the likelihood for inference
      Example
      Model name   Prior chance of model   Data
      Model 1      10                      100% A
      Model 2      40                      50% A, 50% B
      Model 3      50                      100% B

  28. 8.4 Using the likelihood for inference
      Sampling from the posterior distribution:
      • We use Bayes’ rule to compute the posterior probability
      • This is the probability of a model given the data
      P(Model 1 | A) = (100% × 10) / 30 ≈ 33%
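The three-model example can be worked through with Bayes' rule in a few lines. The sketch below takes the priors and data probabilities from the example table and assumes the observed data is A; the dictionary layout and the function name `posterior` are illustrative.

```python
# Prior weight of each model and the probability it assigns to each outcome.
priors = {"Model 1": 10, "Model 2": 40, "Model 3": 50}
likelihood = {
    "Model 1": {"A": 1.0, "B": 0.0},
    "Model 2": {"A": 0.5, "B": 0.5},
    "Model 3": {"A": 0.0, "B": 1.0},
}

def posterior(data):
    """P(model | data) = prior * P(data | model) / sum over all models."""
    joint = {m: priors[m] * likelihood[m][data] for m in priors}
    evidence = sum(joint.values())          # 10*1.0 + 40*0.5 + 50*0.0 = 30 when data is A
    return {m: joint[m] / evidence for m in joint}

for model, p in posterior("A").items():
    print(f"P({model} | A) = {p:.2f}")
```

Observing A rules out Model 3 entirely and leaves Model 2 twice as probable as Model 1, even though Model 1 explains A better, because Model 2 started with four times the prior weight.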

  29. 8.4 Using the likelihood for inference Metropolis algorithm • It samples from the trees with probabilities given by their posterior distribution. • It is a sampling procedure that generates a sequence of trees, each from the previous one.
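A toy version of the sampler may help fix the idea. The sketch below assumes a tiny state space of four hypothetical "trees" with made-up unnormalised posterior weights and a uniform (hence symmetric) proposal; a real phylogenetic sampler would propose changes to topology and edge lengths instead. All names and numbers are illustrative.

```python
import random

# Hypothetical unnormalised posterior weights (prior x likelihood) for four trees.
weights = {"T1": 1.0, "T2": 2.0, "T3": 3.0, "T4": 4.0}
states = list(weights)

def metropolis(n_steps, seed=0):
    """Generate a chain of trees, each proposed from the previous one, and return visit frequencies."""
    rng = random.Random(seed)
    current = states[0]
    counts = {s: 0 for s in states}
    for _ in range(n_steps):
        proposal = rng.choice(states)                  # symmetric proposal distribution
        ratio = weights[proposal] / weights[current]   # posterior ratio; the normaliser cancels
        if rng.random() < min(1.0, ratio):             # accept with probability min(1, ratio)
            current = proposal
        counts[current] += 1                           # a rejected move repeats the current tree
    return {s: counts[s] / n_steps for s in states}

freqs = metropolis(100_000)
print(freqs)
```

Only ratios of posterior values are needed, so the sampler never has to compute the sum over all trees that normalises the posterior. With enough steps the visit frequencies approach 0.1, 0.2, 0.3 and 0.4.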

  30. 8.4 Using the likelihood for inference Metropolis algorithm

  31. 8.4 Using the likelihood for inference A proposal distribution [Figure: nodes numbered 1 to 8; axes "Time from root" and "Order of traversal"]

  32. 8.4 Using the likelihood for inference Metropolis algorithm [Figure: nodes numbered 1 to 8; axes "Time from root" and "Order of traversal"]

  36. 8.4 Using the likelihood for inference Metropolis algorithm

  37. 8.4 Using the likelihood for inference Other phylogenetic uses of sampling [Figure labels: AATC, AATT]

  38. 8.4 Using the likelihood for inference Other phylogenetic uses of sampling [Figure labels: AATC, AATC, AATT]

  39. 8.4 Using the likelihood for inference Other phylogenetic uses of sampling [Figure labels: AATT, TTAA]

  40. 8.4 Using the likelihood for inference Other phylogenetic uses of sampling [Figure labels: AAAA, AATC, TCAA, AATC, AATT, TTAA, TCAA]

  41. 8.4 Using the likelihood for inference
      Other phylogenetic uses of sampling
      • Inferring the history of populations
      Probability density of a coalescence in time: [equation]
      Probability of a coalescence between any pair: [equation]

  42. 8.4 Using the likelihood for inference
      Inferring the history of populations
      • When n is large and p is close to 0, the binomial distribution with parameters n and p can be approximated by a Poisson distribution with mean n*p
      Here n*p = [equation] and x = 1
      The probability of a coalescence at the end of the period t_k: [equation]
      The total probability of the tree: [equation]
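The quality of the Poisson approximation is easy to check numerically at x = 1, the case used on this slide. The values of `n` and `p` below are arbitrary illustrations, not the coalescent quantities from the slide.

```python
import math

def binomial_pmf(x, n, p):
    """Exact probability of x successes in n independent trials with success probability p."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)

def poisson_pmf(x, mean):
    """Poisson probability of x events when the expected number of events is `mean`."""
    return mean**x * math.exp(-mean) / math.factorial(x)

n, p = 1000, 0.002          # large n, small p (illustrative values)
exact = binomial_pmf(1, n, p)
approx = poisson_pmf(1, n * p)
print(exact, approx)
```

For these values the two probabilities agree to about three decimal places; the agreement improves as n grows and p shrinks with n*p held fixed.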

  43. 8.4 Using the likelihood for inference
      The bootstrap
      • The bootstrap can give an approximation to the posterior.
      • It takes too much labour, so it is an unattractive alternative to sampling.
      • The bootstrap is probably more useful for non-probabilistic tree-building methods.
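The resampling step of the bootstrap is simple to sketch: columns of the alignment are drawn with replacement to build artificial data sets of the same size, and a tree would then be built from each one. The tree-building step is omitted here, which is where the labour lies. The alignment, the function name and the number of replicates are illustrative assumptions.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the number of sites fixed."""
    n_sites = len(alignment[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    # The same column choices are applied to every sequence so columns stay intact.
    return ["".join(seq[c] for c in columns) for seq in alignment]

alignment = ["AATC", "AATT", "TTAA", "TCAA"]   # hypothetical ungapped alignment
rng = random.Random(0)
replicates = [bootstrap_alignment(alignment, rng) for _ in range(3)]
for rep in replicates:
    print(rep)
```

The fraction of replicate trees that contain a given grouping of leaves is then used as a measure of support for that grouping.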

  45. Demo

  47. Probabilistic Approaches to Phylogeny
      Conclusion
      • The methods covered today can be used to find the most probable tree.
      • Most of the methods are computationally demanding.
      • More realistic evolutionary models will be explained on Thursday.

  49. Probabilistic Approaches to Phylogeny Questions?