Maximum likelihood (ML) method

Jarno Tuimala

Thanks to James McInerney for the slides with a darker background!

Maximum likelihood
  • Historically a relatively new method (Felsenstein, 1980s)
  • ML assumes a model of sequence evolution
  • Using the model, the ML method tries to answer the question: what is the likelihood (conditional probability) of observing these data, given a certain model?
Maximum Likelihood - goal
  • To estimate the probability that we would observe a particular dataset, given a phylogenetic tree and some notion of how the evolutionary process worked over time.

Probability of (the data) given (the tree and the model of evolution)

Probability of observing a sequence
  • What is the probability of observing a sequence ACGT, if
    • p(a)=p(c)=p(g)=p(t)=0.25 ?
    • Assumption: sequence sites evolve independently
  • P(ACGT) = p(a)*p(c)*p(g)*p(t) = 0.25*0.25*0.25*0.25 = 0.00390625
  • LogP = ln(0.00390625) = -5.545177 (natural logarithm)
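The calculation above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `sequence_probability` is only illustrative and not part of the original slides.

```python
import math

def sequence_probability(seq, freqs):
    """Probability of a sequence when every site is independent."""
    p = 1.0
    for base in seq.lower():
        p *= freqs[base]  # multiply the per-site base frequencies
    return p

# Equal base frequencies, as on the slide
freqs = {"a": 0.25, "c": 0.25, "g": 0.25, "t": 0.25}

p = sequence_probability("ACGT", freqs)
log_p = math.log(p)  # natural logarithm

print(p, log_p)
```

Working with the logarithm turns the product over sites into a sum, which avoids numerical underflow for long alignments.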
Substitution matrix
  • For nucleotide sequences, there are 16 possible ways to describe substitutions - a 4x4 matrix.

Convention dictates that the order of the nucleotides is a,c,g,t

Note: for amino acids, the matrix is a 20 x 20 matrix and for codon-based models, the matrix is 61 x 61

Does changing a model affect the outcome?
  • There are different models
  • Jukes and Cantor (JC69):
    • All base compositions equal (0.25 each), rate of change from one base to another is the same
  • Kimura 2-Parameter (K2P):
    • All base compositions equal (0.25 each), different substitution rates for transitions and transversions.
  • Hasegawa-Kishino-Yano (HKY):
    • Like the K2P, but with base composition free to vary.
  • General Time Reversible (GTR):
    • Base composition free to vary, all possible substitutions can differ.
  • All these models can be extended to accommodate invariable sites and site-to-site rate variation.
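One way to see how these models nest is to build their 4x4 rate matrices from a single helper. The sketch below is an illustration only: the function names, the `kappa` parameter (transition/transversion rate ratio), and the unscaled form of the matrix are assumptions, not something taken from the slides.

```python
BASES = "acgt"  # conventional nucleotide order
TRANSITIONS = {("a", "g"), ("g", "a"), ("c", "t"), ("t", "c")}
EQUAL_FREQS = dict.fromkeys(BASES, 0.25)

def rate_matrix(freqs, exchange):
    """Unscaled rate matrix: Q[i][j] = exchange(i, j) * freqs[j] for i != j.
    The diagonal is set so that every row sums to zero."""
    q = []
    for i in BASES:
        row = [0.0 if i == j else exchange(i, j) * freqs[j] for j in BASES]
        row[BASES.index(i)] = -sum(row)
        q.append(row)
    return q

def ts_tv(kappa):
    """Exchange rate: kappa for transitions, 1 for transversions."""
    return lambda i, j: kappa if (i, j) in TRANSITIONS else 1.0

def jc69():
    # Equal frequencies, one rate for every substitution
    return rate_matrix(EQUAL_FREQS, lambda i, j: 1.0)

def k2p(kappa):
    # Equal frequencies, transitions and transversions differ
    return rate_matrix(EQUAL_FREQS, ts_tv(kappa))

def hky(freqs, kappa):
    # Like K2P, but base frequencies are free to vary
    return rate_matrix(freqs, ts_tv(kappa))

def gtr(freqs, rates):
    # rates maps an unordered pair, e.g. frozenset("ac"), to its exchange rate
    return rate_matrix(freqs, lambda i, j: rates[frozenset((i, j))])

q_jc = jc69()
q_k2p = k2p(2.0)
print(q_jc[0])
print(q_k2p[0])
```

Each model is the next one with extra constraints: GTR with two exchange rates is HKY, HKY with equal frequencies is K2P, and K2P with `kappa = 1` is JC69.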
Probability of observing a sequence change 1/2
  • Alignment: ACCT

GCCT

  • Change probabilities (Jukes-Cantor, μ=0.1): 0.9815 for a base staying the same, 0.0062 for each specific change
  • Tree: ACCT – GCCT
  • Nucleotide frequencies: p(a)=p(c)=p(g)=p(t)=0.25
Probability of observing a sequence change 2/2
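  • P(ACCT, GCCT) = ∏ over the k sites i of (frequency of the base at site i * change probability at site i)
  • P(ACCT, GCCT) = 0.25*0.0062*0.25*0.9815*0.25*0.9815*0.25*0.9815 = 0.00002289932
  • Log10(P(ACCT, GCCT)) = -4.64 (base-10 logarithm; the natural logarithm is -10.68)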
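The same product over sites can be written as a short Python sketch. It uses the numbers from the slide (0.9815 for no change, 0.0062 for a change, base frequency 0.25); the function name `pair_likelihood` is only illustrative.

```python
import math

FREQ = 0.25        # equal base frequencies
P_SAME = 0.9815    # probability that a base stays the same
P_CHANGE = 0.0062  # probability of one specific change

def pair_likelihood(seq1, seq2):
    """Likelihood of two aligned sequences joined by a single branch."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    like = 1.0
    for x, y in zip(seq1, seq2):
        like *= FREQ * (P_SAME if x == y else P_CHANGE)
    return like

like = pair_likelihood("ACCT", "GCCT")
print(like)              # likelihood
print(math.log10(like))  # base-10 logarithm
print(math.log(like))    # natural logarithm
```

Printing both logarithms makes the base explicit: the value of about -4.64 is the base-10 logarithm, while the natural logarithm of the same likelihood is about -10.68.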
Different Branch Lengths
  • For very short branch lengths, the probability of a character staying the same is high and the probability of it changing is low (for our particular matrix).
  • For longer branch lengths, the probability of character change becomes higher and the probability of staying the same is lower.
  • The previous calculations are based on the assumption that the branch length describes one Certain Evolutionary Distance or CED.
  • If we want to consider a branch length that is twice as long (2 CED), then we can multiply the substitution matrix by itself (matrix^2): P(2 CED) = P(1 CED) × P(1 CED).
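As a sketch of the matrix squaring, the snippet below builds the Jukes-Cantor change-probability matrix from the values used in the earlier example (0.9815 on the diagonal, 0.0062 elsewhere) and multiplies it by itself. The helper `mat_mul` is a plain-Python stand-in for a library matrix product.

```python
P_SAME, P_CHANGE = 0.9815, 0.0062

# Change-probability matrix for a branch of length 1 CED (order a, c, g, t)
p1 = [[P_SAME if i == j else P_CHANGE for j in range(4)] for i in range(4)]

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Change-probability matrix for a branch of length 2 CED
p2 = mat_mul(p1, p1)

print(p1[0][0], p1[0][1])  # stay / change over 1 CED
print(p2[0][0], p2[0][1])  # stay / change over 2 CED
```

The diagonal entries shrink and the off-diagonal entries grow after squaring, which matches the statement that longer branches make change more probable.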

Invariable sites
  • For a given dataset we can assume that a certain proportion of sites are not free to vary: purifying selection (related to function) prevents these sites from changing.
  • We can therefore observe invariable positions either because they are under this selective constraint or because they have not had a chance to vary or because there is homoplasy in the dataset and a reversal (say) has caused the site to appear constant.
  • The likelihood that a site is invariable can be calculated by incorporating this possibility into our model and calculating for every site the likelihood that it is an invariable site.
  • It might improve the likelihood of the dataset if we remove a certain proportion of invariable sites (in a way that is analogous to the preceding discussion).
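A common way to incorporate invariable sites is to treat each site as a mixture: with probability `p_inv` the site cannot change, otherwise it evolves under the substitution model. The sketch below assumes that mixture form and reuses the Jukes-Cantor values 0.9815 and 0.0062 from the earlier two-sequence example; `p_inv` and the function names are illustrative, and a real analysis would re-estimate the change probabilities together with `p_inv`.

```python
FREQ = 0.25
P_SAME, P_CHANGE = 0.9815, 0.0062

def site_likelihood(x, y, p_inv):
    """Mixture of an invariable class and a variable class for one site."""
    # Variable class: the usual frequency * change probability
    variable = FREQ * (P_SAME if x == y else P_CHANGE)
    # Invariable class: only possible if the site shows no difference
    invariable = FREQ if x == y else 0.0
    return p_inv * invariable + (1.0 - p_inv) * variable

def pair_likelihood(seq1, seq2, p_inv):
    like = 1.0
    for x, y in zip(seq1, seq2):
        like *= site_likelihood(x, y, p_inv)
    return like

for p_inv in (0.0, 0.25, 0.5):
    print(p_inv, pair_likelihood("ACCT", "GCCT", p_inv))
```

With `p_inv = 0` the result reduces to the likelihood without invariable sites.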
Variable sites
  • Obviously other sites in the dataset are free to vary.
  • Selection intensity on these sites is rarely uniform, so it is desirable to model site-by-site rate variation.
  • This is done in two ways:
    • site specific (codon position, or alpha helix etc.)
    • using a discrete approximation to a continuous distribution (gamma distribution).
  • Again, these variables are modeled over all possibilities of sequence change over all possibilities of branch length over all possibilities of tree topology.
Incorporating gamma 1/2
  • Alignment: ACCT

GCCT

  • Change probabilities (Jukes-Cantor, μ=0.1): 0.9815 for a base staying the same, 0.0062 for each specific change
  • Tree: ACCT – GCCT
  • Nucleotide frequencies: p(a)=p(c)=p(g)=p(t)=0.25
  • Two gamma classes, p(g1)=0.8, p(g2)=0.2
Incorporating gamma 2/2
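  • P(ACCT, GCCT) =
  • (0.25*0.0062*0.8 + 0.25*0.0062*0.2)* (0.25*0.9815*0.8 + 0.25*0.9815*0.2)* (0.25*0.9815*0.8 + 0.25*0.9815*0.2)* (0.25*0.9815*0.8 + 0.25*0.9815*0.2) = 0.00002289932
  • Log10(P(ACCT, GCCT)) = -4.64 (base-10 logarithm)
  • Both gamma classes use the same change probabilities in this example, so the result equals the value without gamma; in practice each class has its own rate and therefore its own change probabilities.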
  • Using gamma, more calculations are done, and more time is consumed
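The per-site sum over rate classes can be sketched in Python with class-specific change probabilities. The snippet assumes the standard Jukes-Cantor formula for a branch of length d (expected substitutions per site); the branch length 0.0187 and the class rates 0.5 and 3.0 are hypothetical values chosen for illustration (the rates average to 1 under weights 0.8 and 0.2), not values from the slides.

```python
import math

FREQ = 0.25

def jc69_prob(same, d):
    """Jukes-Cantor probability of no change / one specific change."""
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def gamma_pair_likelihood(seq1, seq2, d, classes):
    """classes is a list of (relative rate, class probability) pairs."""
    like = 1.0
    for x, y in zip(seq1, seq2):
        # Sum over rate classes at each site, then multiply across sites
        like *= sum(w * FREQ * jc69_prob(x == y, r * d) for r, w in classes)
    return like

d = 0.0187  # hypothetical branch length; gives roughly 0.9815 / 0.0062

one_rate = gamma_pair_likelihood("ACCT", "GCCT", d, [(1.0, 1.0)])
same_rates = gamma_pair_likelihood("ACCT", "GCCT", d, [(1.0, 0.8), (1.0, 0.2)])
two_rates = gamma_pair_likelihood("ACCT", "GCCT", d, [(0.5, 0.8), (3.0, 0.2)])

print(one_rate, same_rates, two_rates)
```

When both classes share one rate the mixture collapses to the single-rate likelihood; the likelihood only changes once the classes have different rates. The inner sum is also where the extra computing time comes from, since it is evaluated once per class at every site.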
Selecting the correct model 1/4
  • It was previously pointed out that parsimony can be inconsistent.
  • ML can be inconsistent too!
  • If the model used in the ML analysis is incorrect, the method might become inconsistent.
  • Before analysis, the correct model should be selected.
Practical issues
  • There is an ML equivalent to the Wagner method for generating initial trees, but it is very slow.
  • Many programs create an initial tree using parsimony or distance methods or use a completely random tree.
  • Search strategy is similar to parsimony:
    • 100 random addition sequences (RAS) + tree bisection and reconnection (TBR) branch swapping for small datasets
    • In addition, simulated annealing can be used for larger datasets