Molecular Clocks and HIV-1
Polly R. Walker
D. Phil Student
Dept of Zoology, University of Oxford
Summary of Talk
• Molecular clocks
• Measurably Evolving Populations (MEPs)
• Methods for measuring evolution
• Coalescent theory
• Application of the molecular clock
• Estimating divergence times
[Table: substitution rates (per site, per year) for mammalian nuclear DNA, plant nuclear DNA, mammalian mitochondrial DNA and other genomes. The exponents and several row labels are missing from the extracted values.]
Constant Molecular Clocks are Difficult to Obtain Under Natural Selection
• For natural selection to produce a molecular clock, population sizes, selection pressures, and mutation rates must all be constant over evolutionary time.
How true is THAT for HIV?
• The rate of substitution (k) of mutations with a selective advantage depends on:
i. effective population size (N_e)
ii. degree of selective advantage (s)
iii. mutation rate (m)
k = 4 N_e s m
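The dependence of k on population size can be made concrete with a short calculation. The sketch below simply evaluates k = 4 N_e s m; the parameter values are hypothetical and chosen only to show that k scales linearly with N_e, which is why a changing population size breaks a selection-driven clock.

```python
def advantageous_substitution_rate(n_e, s, m):
    """Substitution rate k = 4 * N_e * s * m for advantageous mutations."""
    return 4.0 * n_e * s * m


# Hypothetical values (not estimates for any real organism).
m = 1e-5   # mutation rate, assumed
s = 0.01   # selective advantage, assumed

for n_e in (1_000, 10_000, 100_000):
    k = advantageous_substitution_rate(n_e, s, m)
    print(f"N_e = {n_e:>7}: k = {k:.2e}")
```

Under strict neutrality the substitution rate would instead equal the mutation rate alone, independent of N_e.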
• So, is there a good molecular clock?
• There are a variety of ways to test the molecular clock.
i. The dispersion index, R(t)
ii. The relative rate test
iii. The Likelihood Ratio test using ML statistics.
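• Likelihood Ratio Test: the difference in log likelihood between the clock and non-clock models can be tested directly against a chi-squared distribution:
p = CHIDIST(2 × |ΔlnL|, df = n - 2), where n is the number of sequences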
(not significantly different in this case - primate mitochondrial DNA)
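As a rough illustration of the likelihood ratio test, the sketch below computes the statistic 2|ΔlnL| and its chi-squared p-value with n - 2 degrees of freedom. The log likelihood values and the number of sequences are made up for the example, and `chi2_sf` is a small helper written here (closed-form upper tail for integer degrees of freedom), not a function from any phylogenetics package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-squared with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: start from df = 1 (erfc) and add one term per extra 2 df.
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def clock_lrt(lnl_clock, lnl_no_clock, n_sequences):
    """Return (statistic, df, p) for the molecular clock likelihood ratio test."""
    statistic = 2.0 * abs(lnl_no_clock - lnl_clock)
    df = n_sequences - 2
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical log likelihoods for a 12-sequence data set.
statistic, df, p = clock_lrt(-5230.4, -5221.9, n_sequences=12)
print(f"2|dlnL| = {statistic:.2f}, df = {df}, p = {p:.3f}")
print("clock rejected" if p < 0.05 else "clock not rejected")
```

The helper is adequate for the modest degrees of freedom typical of these data sets; for very large n a library routine for the chi-squared distribution would be the safer choice.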
A measurably evolving population (MEP) is heterochronously sampled (sequences are collected at different times), with samples spanning hundreds or thousands of generations and containing a significant amount of genetic variation.
Hence, this typically includes either
1. Organisms with rapid evolution and short generation times
e.g. RNA viruses
2. Organisms with a wide range of sampling dates
e.g. ancient DNA samples
• RNA viruses often have different sampling times. Small differences can have big effects.
Maximum Likelihood Estimation of Viral Substitution Rates
Programme “Tip-Date” or “Rhino”
• Construct rooted maximum likelihood tree
• Optimise branch lengths under a single rate with relative tip positions consistent with isolation dates
• Test molecular clock using a likelihood ratio test
• Estimate confidence intervals
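The idea that isolation dates constrain the rate can be illustrated with a much simpler stand-in for the maximum likelihood procedure above: a root-to-tip regression. This is a different technique from the one the listed programmes implement, shown only because it fits in a few lines. The slope of genetic distance against sampling year approximates the substitution rate, and the x-intercept approximates the date of the root. The data below are noise-free toy values, not HIV estimates.

```python
def root_to_tip_regression(years, distances):
    """Least-squares fit of root-to-tip distance on sampling year.

    Returns (rate, root_year): the slope in substitutions/site/year and the
    year at which the fitted line crosses zero distance.
    """
    n = len(years)
    mean_t = sum(years) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(years, distances))
    rate = sxy / sxx
    intercept = mean_d - rate * mean_t
    return rate, -intercept / rate


# Toy data: perfectly clock-like, rate 0.002 per site per year, root in 1960.
toy_years = [1993, 1995, 1996, 1998, 1999, 2000]
toy_distances = [0.002 * (year - 1960) for year in toy_years]

rate, root_year = root_to_tip_regression(toy_years, toy_distances)
print(f"rate = {rate:.4f} subs/site/year, root year = {root_year:.1f}")
```

Unlike the likelihood approach, this regression treats tips as independent points and gives no valid confidence intervals, so it is best read as a quick check for temporal signal.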
• Changes in population size affect the distribution of coalescent times (i.e. when in time branching events occur).
• In a constant-sized population more coalescent events occur near the tips than near the root. In a growing population coalescent events occur more towards the root, because the population was smaller in the past and coalescence is more likely in small populations (i.e. drift is more powerful in small populations).
• It is therefore possible to distinguish continually large populations from those that have only recently grown in size.
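A small simulation can show the contrast between the two cases. The sketch below draws coalescent times for a constant population and for an exponentially growing one, assuming N(t) = N0 e^(-rt) with t measured backwards from the present, and reports how far up the tree (as a fraction of the root height) the median coalescent event falls. The population sizes, growth rate and sample size are hypothetical.

```python
import math
import random


def coalescent_times_constant(n, pop_size, rng):
    """Times before present of the n-1 coalescent events, constant size."""
    t, times = 0.0, []
    for k in range(n, 1, -1):
        t += rng.expovariate(k * (k - 1) / (2.0 * pop_size))
        times.append(t)
    return times


def coalescent_times_exponential(n, n0, r, rng):
    """Same, for N(t) = n0 * exp(-r * t) going backwards in time."""
    scaled, times = 0.0, []  # 'scaled' is the cumulative integral of 1/N(t)
    for k in range(n, 1, -1):
        scaled += rng.expovariate(k * (k - 1) / 2.0)
        # Invert the time rescaling to get back to real time.
        times.append(math.log(1.0 + r * n0 * scaled) / r)
    return times


def mean_relative_median(simulate, replicates=200):
    """Average position of the median event as a fraction of root height."""
    total = 0.0
    for _ in range(replicates):
        times = simulate()
        total += times[len(times) // 2] / times[-1]
    return total / replicates


rng = random.Random(1)
n = 20  # hypothetical sample size
mean_constant = mean_relative_median(
    lambda: coalescent_times_constant(n, 1e4, rng))
mean_growing = mean_relative_median(
    lambda: coalescent_times_exponential(n, 1e5, 0.1, rng))
print(f"constant size: median event at {mean_constant:.2f} of root height")
print(f"growing:       median event at {mean_growing:.2f} of root height")
```

A small fraction means events cluster near the tips; a fraction close to 1 means they cluster near the root, giving the star-like trees typical of epidemic growth.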
• Constant size (endemic) population:
- 1 parameter, population size (N)
• Exponentially growing (epidemic) population:
- 2 parameters, current population size (N0) and rate of growth (r)
• More complex models:
- logistic (growth slows down toward the present)
- expansion (sudden change in growth rate)
• Estimate all parameters (e.g. N0, r) from tree structure
Can compare these nested models using the likelihood ratio test
A) Lineages coalesce independently
B) No more than one coalescent event can occur in a single generation
C) The time-scale is so large that it can be represented as continuous
• Works best for neutral mutations subject to genetic drift in non-recombining populations. In this case any change in the structure of the genealogy must be due to demographic processes rather than fitness differences (e.g. fitter alleles producing more branches).
Estimating Demographic History of HIV-1 Subtype C
Step 1. Sequence selection
Example: CgagSR - ntax = 29, nchar = 1659
1993: C.IN.93.N904 C.IN.93.IN905 C.IN.93.IN101 C.IN.93.IN99
1995: C.IN.95.IN21068 C.IN.95.IN21301
1996: C.BW.96.BW17B05 C.BW.96.BWM032 C.BW.96.BW0504 C.BW.96.BW1626 C.ZM.96.ZM651 C.ZM.96.ZM751
1998: C.TZ.98.TZ013 C.TZ.98.TZ017 C.ZA.98.TV001 C.ZA.TV002
1999: C.ZA.99.DU151 C.ZA.99.DU179 C.BW.99.BW47547 C.BW.99.BWMC168
2000: C.BW.00.BW18595 C.BW.00.BW18802 C.BW.00.BW192113 C.BW.00.BW20361 C.BW.00.BW20636
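The sequence names above appear to follow a subtype.country.year.isolate pattern, so the sampling dates needed for tip-dating can be read straight from the names. The sketch below assumes that pattern and a two-digit year; the century cutoff is an assumption, and names that do not match (such as C.ZA.TV002, which has no year field) return None rather than a guessed date.

```python
def sampling_year(name):
    """Return the sampling year encoded in a taxon name, or None.

    Assumes names of the form subtype.country.YY.isolate.
    """
    fields = name.split(".")
    if len(fields) != 4:
        return None
    yy = fields[2]
    if len(yy) != 2 or not yy.isdigit():
        return None
    # Assumption: two-digit years below 30 belong to the 2000s.
    return 2000 + int(yy) if int(yy) < 30 else 1900 + int(yy)


for name in ("C.IN.93.N904", "C.BW.00.BW18595", "C.ZA.TV002"):
    print(name, sampling_year(name))
```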
Step 1. Sequence Alignment
AND manual alignment e.g. Se-Al version
Remove all incomplete or stop codons (?, *), and make sure the alignment is in the correct reading frame.
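The codon clean-up step can be sketched as a filter over codon columns. The version below assumes the alignment is already in frame, held as a dictionary of equal-length nucleotide strings, and that a codon is unusable if it contains "?" or "*" or is a stop codon in the standard genetic code. The sequence names and data are toy values.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def is_bad_codon(codon):
    """True for incomplete codons (containing ? or *) and stop codons."""
    codon = codon.upper()
    return "?" in codon or "*" in codon or codon in STOP_CODONS


def strip_bad_codons(alignment):
    """Drop every codon column that is unusable in at least one sequence."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    length = lengths.pop()
    usable = length - length % 3  # ignore a trailing partial codon
    keep = [
        i for i in range(0, usable, 3)
        if not any(is_bad_codon(seq[i:i + 3]) for seq in alignment.values())
    ]
    return {
        name: "".join(seq[i:i + 3] for i in keep)
        for name, seq in alignment.items()
    }


toy_alignment = {
    "toy_a": "ATGGCATAAGGA",  # third codon is a stop (TAA)
    "toy_b": "ATG?CAGGTGGA",  # second codon is incomplete
}
cleaned = strip_bad_codons(toy_alignment)
print(cleaned)
```

Removing whole codon columns, rather than single characters, keeps every remaining site in the same reading frame across all sequences.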
Return to your original tree and use this sequence to root the tree (under rooting options)
Subtype B is the most distantly related sequence.
Step 2. ML tree construction
Macintosh version - Runs on MacOS9 and MacOSX
UNIX/Linux version - could be compiled for Windows
Step 3. Tip-dating the Tree
The likelihood ratio test tells us whether we are justified in assuming a molecular clock. If the clock holds, the difference in log likelihood between the clock and non-clock models is not significant.
Using The Clock: 1. Timing the origin of the epidemic
TMRCA = tree node height / substitution rate = years since MRCA
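As a worked example of the timing calculation, the sketch below divides a node height (in substitutions per site) by a substitution rate (in substitutions per site per year) to get years since the MRCA, then subtracts that from the most recent sampling year. It assumes node height is measured back from the most recent tip, and the numbers are hypothetical rather than estimates for any HIV-1 subtype.

```python
def tmrca_years(node_height, substitution_rate):
    """Years since the MRCA: node height divided by substitution rate."""
    if substitution_rate <= 0:
        raise ValueError("substitution rate must be positive")
    return node_height / substitution_rate


def origin_year(latest_sampling_year, node_height, substitution_rate):
    """Calendar year of the MRCA, counting back from the most recent tip."""
    return latest_sampling_year - tmrca_years(node_height, substitution_rate)


# Hypothetical values for illustration only.
height = 0.1     # substitutions per site, root to most recent tip
rate = 0.0025    # substitutions per site per year

print(f"TMRCA = {tmrca_years(height, rate):.1f} years")
print(f"Origin = {origin_year(2000, height, rate):.0f}")
```

Running the same calculation with the lower and upper bounds of the rate gives a rough interval for the origin date.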
No significant difference between the timing of the two subtypes.
Subtype C has a slightly lower point estimate for rate but broader CIs
Can apply the rates to other data sets, provided it is the same gene region
1. Determine the maximum likelihood population growth model.
2. Estimate the parameter rho under the best-fit growth model.
3. Scale skyline plots according to the substitution rate.
4. Estimate parameter r, which is the growth rate in units of year^-1, or rho/
5. Estimate the doubling time:
Doubling time (years) = ln(2) / r
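For an exponentially growing population, N(t) = N0 e^(rt), the population doubles every ln(2)/r years when r is in units of year^-1. The sketch below evaluates that relationship for a few hypothetical growth rates.

```python
import math


def doubling_time(r):
    """Doubling time in years for an exponential growth rate r (per year)."""
    if r <= 0:
        raise ValueError("growth rate must be positive for doubling")
    return math.log(2) / r


# Hypothetical growth rates (per year), for illustration only.
for r in (0.1, 0.25, 0.5):
    print(f"r = {r:.2f} per year -> doubling time = {doubling_time(r):.2f} years")
```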
Comparing growth rates within different groups, e.g. risk group, HLA type, or the spread of different clades.
Detecting decreases in epidemic growth rate.
Molecular Clocks can be used to:
a.) time the origin of an epidemic
b.) determine population dynamics
c.) Your estimates are only as good as your clock.
d.) HIV is subject to variable rates of evolution among branches: needs new models which allow for this (relaxed clocks).