PowerPoint Slideshow about 'how do we explain the patterns of variation observed in DNA sequences?' - ashtyn



Molecular evolution:

how do we explain the patterns of variation observed in DNA sequences?

how do we detect selection by comparing silent site substitutions to replacement substitutions?

how do we detect selection by comparing fixed differences between species to polymorphisms within species?

how do we detect selection by using hitchhiking?

Goal: understand the logic behind key tests.


Neutralist vs. selectionist view

Are most substitutions due to drift or natural selection?

“Neutralist” vs. “selectionist”

Agree that:

Most mutations are deleterious and are removed.

Some mutations are favourable and are fixed.

Dispute:

Are most replacement mutations that fix beneficial or neutral?

Is observed polymorphism due to selection or drift?


Reminder: substitution vs. polymorphism

What happens after a mutation changes a nucleotide at a locus?

Polymorphism: mutant allele is one of several present in population

Substitution: the mutant allele fixes in the population. (New mutations at other nucleotides may occur later.)


Substitution schematic

Individual 1 2 3 4 5 6 7

Time 0: aaat aaat aaat aaat aaat aaat aaat

Time 10: aaat aaat aaat aaat acat aaat aaat

Time 20: aaat aaat acat aaat acat acat acat

Time 30: acat acat acat acat acat acat acat

Time 40: acat acat actt acat acat acat acat

Times 10-29: polymorphism

Time 30: mutation fixed -> substitution

Time 40: new mutation: polymorphism


Reminder: substitution rates for neutral mutations

Most neutral mutations are lost

Only 1 out of every 2N fixes

Most that are lost go quickly (< 20 generations for population sizes from 100 - 2000)

Most replacement mutations are lost because they are deleterious: their rate of loss is faster than neutral


Data in favor of neutrality

  • Substitutions in DNA appear to be clock-like

Figure 6.21


Drift model pseudocode

Population with 2N – 1 copies of allele A, 1 of allele a

For each generation, draw from prior generation alleles.

-> generate a random number. If less than f(A), new allele = A. Otherwise, allele = a.

-> repeat until 2N alleles drawn

Check to see outcome of drift

->If a is lost, start over.

->If a has fixed, note the number of generations

->Otherwise, move to the next generation with the new allele frequencies

Repeat 100x per population size

Test populations of 100, 500, 1000, 1500, and 2000
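The drift pseudocode above can be sketched in Python roughly as follows. This is a minimal sketch, not the original simulation: the function names, the seed, and the small population sizes in the demo are assumptions chosen so that it runs quickly. Each time step is treated as one generation.

```python
import random

def simulate_neutral_fixation(n_diploid, rng):
    """Return the number of generations until a single new copy of allele a fixes.

    Runs in which a is lost are discarded and restarted, as in the pseudocode.
    """
    total = 2 * n_diploid              # 2N allele copies in the population
    while True:
        count_a = 1                    # 2N - 1 copies of A, 1 copy of a
        generations = 0
        while 0 < count_a < total:
            freq_a = count_a / total
            # Draw 2N alleles from the prior generation, one random number each.
            count_a = sum(rng.random() < freq_a for _ in range(total))
            generations += 1
        if count_a == total:           # a has fixed; if a was lost, start over
            return generations

def mean_fixation_time(n_diploid, replicates=100, seed=1):
    """Average fixation time over several replicate runs (seed is arbitrary)."""
    rng = random.Random(seed)
    times = [simulate_neutral_fixation(n_diploid, rng) for _ in range(replicates)]
    return sum(times) / len(times)

if __name__ == "__main__":
    # Small, hypothetical population sizes so the demo finishes quickly.
    for n in (10, 25):
        print(n, mean_fixation_time(n, replicates=20), "theory 4N =", 4 * n)
```

Larger populations (100 to 2000, as on the slide) use the same code but take much longer in pure Python, because only about 1 run in 2N ends in fixation.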


Times to fix for neutral alleles (Only 1/2N fix: how long do they take?)

Estimated formula: fixation time = 4.07 * N – 57

Theoretical formula: fixation time = 4N


Puzzle for neutrality

[Figure: two panels, "Expected pattern" and "Actual pattern", each plotting substitutions against years for rabbits and elephants]

  • Rates of substitution are clock-like per year, not per generation.



Can we distinguish selection from drift using sequence data?

  • Compare two species: infer where substitutions have occurred.

  • Silent site substitutions should be neutral (dS)

  • Non-synonymous substitutions are expected to be deleterious (usually) (dN)

  • so, expect dN / dS < 1

    Translation: the rate of non-synonymous substitutions (dN) is less than the rate of synonymous substitutions (dS)


dN / dS and inferences about selection

dN / dS < 1: replacements are deleterious

dN / dS = 1: replacements are neutral

dN / dS > 1: replacements are beneficial
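As a rough illustration of how the ratio is computed and read, here is a minimal Python sketch. It assumes that the numbers of synonymous and non-synonymous sites and differences have already been counted, and it applies a Jukes-Cantor correction for multiple hits (the approach used in Nei-Gojobori style estimates). The slides do not specify an estimation method, so the method, the function names, and the counts in the demo are all assumptions.

```python
import math

def jukes_cantor(p):
    """Convert a proportion of differing sites into substitutions per site."""
    return -0.75 * math.log(1 - 4 * p / 3)

def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ratio of non-synonymous to synonymous substitution rates."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    return dn / ds

def interpret(ratio, tol=1e-9):
    """Map a dN / dS ratio onto the three cases listed on the slide."""
    if abs(ratio - 1) <= tol:
        return "replacements are neutral"
    if ratio < 1:
        return "replacements are deleterious"
    return "replacements are beneficial"

if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    ratio = dn_ds(nonsyn_diffs=10, nonsyn_sites=700, syn_diffs=30, syn_sites=300)
    print(round(ratio, 3), interpret(ratio))
```

In practice a ratio is never exactly 1, so a real analysis would test whether it differs significantly from 1 rather than compare it with a fixed tolerance.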


What happens to fixation time with selection? Model pseudocode

Population with 2N – 1 copies of allele A, 1 of allele a

WA = 1 + s; Wa = 1

For each generation, draw from prior generation alleles.

-> generate a random number. If greater than f(A), new allele = a. Otherwise, test fitness: if random < WA, new allele = A.

-> repeat until 2N alleles drawn

Check to see outcome of drift

->If a is lost, start over.

->If a has fixed, note the number of generations

->Otherwise, move to the next generation with the new allele frequencies

Repeat 100x per fitness

Test populations of 100
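One way to sketch the selection model in Python is shown below. It is not a line-by-line translation of the pseudocode: it assumes the new mutant is the favoured allele with relative fitness 1 + s (the resident allele has fitness 1), and it applies selection by reweighting the sampling probability before the 2N alleles are drawn. The function name, seed, and demo values are hypothetical.

```python
import random

def fixation_time_with_selection(n_diploid, s, rng):
    """Generations until a single new mutant with fitness 1 + s fixes.

    Runs in which the mutant is lost are discarded and restarted.
    Setting s = 0 recovers the neutral drift model.
    """
    total = 2 * n_diploid
    while True:
        count = 1                      # one new copy of the mutant allele
        generations = 0
        while 0 < count < total:
            p = count / total
            # Selection: the mutant is sampled in proportion to its fitness.
            p_sel = p * (1 + s) / (1 + p * s)
            count = sum(rng.random() < p_sel for _ in range(total))
            generations += 1
        if count == total:
            return generations

if __name__ == "__main__":
    rng = random.Random(7)             # arbitrary seed
    for s in (0.0, 0.1, 0.5):
        times = [fixation_time_with_selection(20, s, rng) for _ in range(20)]
        print(s, sum(times) / len(times))
```

Averaged over replicates, the favoured mutant should fix in far fewer generations than the neutral one.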



Time to fix: neutral vs. favourable

Simulation results: black – neutral mutations; red – favourable mutations


Time to fixation: drift is slow

Neutral:

New mutations per generation: 2Neu

Probability of fixing a new mutation: 1 / 2Ne

Fixations per generation: 2Neu * 1 / 2Ne = u

Time to fix: 4Ne

Favored by selection

New mutations per generation: 2Neu (but how many favourable??)

Favored mutation probability of fixing: 2|s|

Fixations per generation: 2Neu * 2|s| * prob. favourable

Time to fix: 2 ln (2Ne) / |s|

2 ln (2Ne) / |s| << 4Ne

Shorter time to fixation

Derivations of these results are tough! See Kimura (1962) and Kimura and Ohta (1969).
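To get a feel for the size of the difference, the formulas on this slide can be evaluated directly. The sketch below only encodes those formulas. The function names and the example values Ne = 1000 and s = 0.01 are assumptions chosen for illustration.

```python
import math

def neutral_substitution_rate(ne, u):
    """Fixations per generation: 2Ne*u new mutations, each fixing with probability 1 / 2Ne."""
    new_mutations = 2 * ne * u
    prob_fix = 1 / (2 * ne)
    return new_mutations * prob_fix    # simplifies to u

def neutral_fixation_time(ne):
    """Time to fix for a neutral mutation: 4Ne generations."""
    return 4 * ne

def favoured_fixation_time(ne, s):
    """Time to fix for a favoured mutation: 2 ln(2Ne) / |s| generations."""
    return 2 * math.log(2 * ne) / abs(s)

if __name__ == "__main__":
    ne, s = 1000, 0.01                 # hypothetical values
    print("neutral:", neutral_fixation_time(ne))
    print("favoured:", round(favoured_fixation_time(ne, s), 1))
```

With these values the favoured mutation fixes in roughly 1,500 generations, against 4,000 for a neutral one, and the gap widens as Ne or |s| grows.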



dN / dS data: BRCA1

[Figure 6.21: BRCA1 data, marking where dN / dS > 1 and where dN / dS < 1]


Molecular evidence of selection II: McDonald-Kreitman Test

dN / dS is very conservative: many selective events may be missed.

Example: immunoglobulins.

dN / dS = 0.37 overall

We suspect selection favoring new combinations at key sites. Antigen recognition sites:

dN / dS > 3.0



McDonald-Kreitman test III

If evolution of protein is neutral, the percentage of mutations that alter amino acids should be the same along any branch

If all mutations are neutral, all should have the same probability of persisting

So: dN / dS among polymorphisms should be the same as within fixed differences


McDonald-Kreitman logic

  • Silent sites

    - always neutral

    - fix slowly

    - contribute to polymorphism

  • Replacement sites

    • mainly unfavourable

    • if neutral, fix at same rate as silent and contribute to polymorphism

    • proportion of replacement mutations that are neutral determines dN / dS for polymorphism

    • if favourable, fix quickly and do not contribute to polymorphism: higher dN / dS for fixed differences, lower rate for polymorphism



Polymorphism and fixation

[Diagram: silent and replacement mutations sorted into neutral and deleterious classes]

1 / 2N neutral mutations fix


Polymorphism and fixation

[Diagram: silent and replacement mutations sorted into neutral, deleterious, and favourable classes]

Neutral: 1 / 2N fix - slow

Favourable: 2|s| fix - fast


dN / dS for neutral and favourable

Neutral: (dN / dS) polymorphism = (dN / dS) fixation

Favourable: (dN / dS) polymorphism < (dN / dS) fixation


McDonald-Kreitman hypotheses

H0: All mutations are neutral.

Then, dN / dS for polymorphic sites should equal dN / dS for fixed differences

H1: replacements are favoured. Favoured mutations fix rapidly, so dN / dS for polymorphic < dN / dS fixed


Example of MK test: ADH in Drosophila

Compare sequences of D. simulans and D. yakuba for ADH (alcohol dehydrogenase)

Significance? Use χ² test for independence
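A test of independence on the 2 x 2 McDonald-Kreitman table (replacement vs. silent, fixed vs. polymorphic) can be sketched as below. The counts in the demo are hypothetical and are not the ADH data. The function name is also an assumption. The p-value uses the identity that, for one degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(chi2 / 2)).

```python
import math

def mk_chi_square(fixed_repl, fixed_silent, poly_repl, poly_silent):
    """Chi-square test of independence (1 d.f.) on a 2 x 2 MK table.

    Returns (chi2, p_value). No continuity correction is applied, and all
    row and column totals are assumed to be non-zero.
    """
    a, b, c, d = fixed_repl, fixed_silent, poly_repl, poly_silent
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    chi2, p = mk_chi_square(fixed_repl=10, fixed_silent=20, poly_repl=5, poly_silent=40)
    print("replacement/silent, fixed:", 10 / 20)
    print("replacement/silent, polymorphic:", 5 / 40)
    print("chi2 =", round(chi2, 3), "p =", round(p, 4))
```

A higher replacement-to-silent ratio among fixed differences than among polymorphisms, together with a small p-value, is the pattern H1 predicts. With small counts an exact test may be a safer choice than χ².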


Evidence of selection III: selective sweeps

  • Imagine a new mutation that is strongly favored (e.g. insecticide resistance in mosquitoes)


Detecting selection using linkage: G6PD in humans

Natural history:

  • Located on X chromosome

  • encodes glucose-6-phosphate dehydrogenase

  • Red blood cells lack mitochondria

  • Glycolysis only

  • NADPH only via pentose-phosphate shunt –requires G6PD

  • NADPH needed for glutathione, which protects against oxidation


G6PD and malaria

  • Malaria (Plasmodium falciparum) infects red blood cells

  • Has limited G6PD function typically (but can produce the enzyme)

  • Uses NADPH from red blood cell

  • In G6PD deficient individuals?


G6PD mutants

  • Different mutants result in different levels of enzymatic activity

  • Severe mutants result in destruction of red blood cells and anemia

  • Most common mutant: G6PD-202A

  • Usually mild effects: may increase risk of miscarriage

  • Prediction: G6PD and malaria?



Has G6PD-202A been selected?

  • 14 markers up to 413,000 bp from G6PD

  • LD?

  • Long distance LD implies strong, recent selection
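For reference, linkage disequilibrium between the selected allele and a marker allele is commonly summarised with the coefficient D and with r². The sketch below uses those standard definitions. The slides do not say which LD measure the study used, so the choice of statistic, the function name, and the frequencies in the demo are assumptions.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic loci.

    p_ab: frequency of haplotypes carrying both allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b               # excess of AB haplotypes over independence
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2

if __name__ == "__main__":
    # Hypothetical frequencies: the two alleles always occur together.
    print(linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5))
    # Hypothetical frequencies: the two alleles are independent.
    print(linkage_disequilibrium(p_ab=0.25, p_a=0.5, p_b=0.5))
```

An r² near 1 at markers far from the gene is the kind of long-distance LD the slide refers to.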


Has G6PD-202A been selected?

[Fig 7.14: linkage disequilibrium plotted against distance (kb) from the core region]


Alternative hypothesis: drift caused linkage disequilibrium

[Figure 7.14b: G6PD-202A; axis label: allele frequency]



Detecting selection II: CCR5-Δ32

  • Stephens (1998) found strong disequilibrium between CCR5-Δ32 and nearby markers

  • Implies recent origin (< 2000 years): recombination breaks down linkage

  • Implies selected
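The link between strong disequilibrium and recent origin follows from the standard result that recombination erodes LD geometrically: with recombination fraction r between two loci, D after t generations is D0 * (1 - r)^t. The sketch below encodes that relation. The function names and the numbers in the demo are hypothetical, and this is not the calculation from the study cited above.

```python
import math

def ld_after(d0, r, generations):
    """Expected D after a number of generations of recombination at fraction r."""
    return d0 * (1 - r) ** generations

def generations_to_decay(fraction_remaining, r):
    """Generations needed for D to fall to a given fraction of its initial value."""
    return math.log(fraction_remaining) / math.log(1 - r)

if __name__ == "__main__":
    # Hypothetical marker with a recombination fraction of 0.001 to the allele.
    print(ld_after(0.2, 0.001, 100))
    print(generations_to_decay(0.5, 0.001))
```

If LD with a marker at a known recombination distance is still strong, the allele cannot have been in the population for many such decay times, which is how an upper bound on its age is argued.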


Detecting selection II: CCR5-Δ32

  • But: new data – November 2005.

  • Better map:


Detecting selection: summary

  • Several approaches to detecting selection

    • dN / dS

    • McDonald-Kreitman test

    • using hitchhiking

      Challenges of each method?


Other uses of molecular data: the coalescent

Any two alleles in a population share a common ancestor in the previous generation with probability 1 / 2Ne

Therefore, going backwards in time, the expected time to find the common ancestor is 1 / (1 / 2Ne) = 2Ne generations
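The 2Ne expectation can be checked with a small simulation: step back one generation at a time, and in each generation let the two lineages coalesce with probability 1 / 2Ne. This is a minimal sketch. The function names, seed, and the small Ne in the demo are assumptions.

```python
import random

def coalescence_time(ne, rng):
    """Generations back until two lineages share a common ancestor."""
    p = 1 / (2 * ne)                   # chance of coalescing in any one generation
    t = 1
    while rng.random() >= p:           # no common ancestor yet: go back another generation
        t += 1
    return t

def mean_coalescence_time(ne, replicates=2000, seed=3):
    """Average coalescence time over replicate pairs (seed is arbitrary)."""
    rng = random.Random(seed)
    return sum(coalescence_time(ne, rng) for _ in range(replicates)) / replicates

if __name__ == "__main__":
    ne = 10                            # hypothetical, small so the demo is fast
    print("simulated:", mean_coalescence_time(ne), "expected 2Ne =", 2 * ne)
```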


Coalescent II


Coalescent and sequences

Imagine that you have two sequences at a locus.

They shared a common ancestor 2Ne generations ago.

They accumulate mutations at rate u per generation per basepair.

2Ne generations / lineage * 2 lineages * u = 4Neu differences per basepair between the two sequences.


Coalescent example

We sequence 1000 base pairs from two sequences and find 16 base pair differences. How large is the population?

Assume u = 2 x 10^-8.

4Neu * 1000 = 16; 8 x 10^-5 * Ne = 16;

Ne * 10^-5 = 2; Ne = 200,000
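The same arithmetic can be written as a small function, which makes it easy to try other sequence lengths or mutation rates. The function name is an assumption. The numbers in the demo are the ones from the example above.

```python
def estimate_ne(differences, sequence_length, mutation_rate):
    """Estimate Ne from pairwise differences, using differences per basepair = 4 * Ne * u."""
    per_basepair = differences / sequence_length
    return per_basepair / (4 * mutation_rate)

if __name__ == "__main__":
    # 16 differences in 1000 bp, with u = 2 x 10^-8 per generation per basepair.
    print(round(estimate_ne(16, 1000, 2e-8)))
```

The estimate rests on a single pair of sequences, so it carries a large sampling error. The coalescence time of any one pair varies widely around 2Ne.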



Additional readings

Eyre-Walker (2006) The genomic rate of adaptive evolution. Trends in Ecology and Evolution 21:569-575. (Well-written review)

Gillespie (2004). Population genetics: a concise guide. Johns Hopkins: Baltimore, MD. (Very short, clear, but dense!)

Graur and Li (2000) Fundamentals of molecular evolution. Sinauer: Sunderland, MA. (Very clear)

Kimura (1962) On the probability of fixation of mutant genes in populations. Genetics 47:713-719. (If you really want the derivation)

Kimura and Ohta (1969) The average number of generations until fixation of a mutant gene in a finite population. Genetics 61:763-771. (If you really want the derivation)

Sabeti et al (2005) The case for selection at CCR5-Δ32. PLoS Biology 3:1963-1969.

Questions: 1. Explain why clock-like rates of substitutions per year did not fit with the neutral theory.

See posted molecular evolution practice questions: highly recommended!