slide1
RA Fisher

1890 - 1962

“Natural selection is a mechanism for generating an exceedingly high degree of improbability”

Testing for the Extreme Value Domain of Attraction of Beneficial Fitness Effects

Craig J. Beisel

Bioinformatics and Computational Biology

Department of Mathematics

craig@beisel.net

www.beisel.net

slide3

Concepts

Natural Selection: The differential survival and reproduction of individuals within a population based on hereditary characteristics.
slide4

Concepts

Adaptation: The adjustment of an organism or population to a new or altered environment through genetic changes brought about by natural selection.

slide5

Concepts

Phenotype: The overall attributes of an organism arising due to the interaction of its genotype with the environment.
slide7

Concepts

Fitness: Describes the ability of a genotype to reproduce. More formally, it is defined as the ratio of a genotype's count after one generation to its count before.


Concepts

Fitness Distribution: The distribution of fitness for every possible genotype in a fixed environment.

[Figure: fitness distribution with regions labeled Lethal, Moderate, and High]

Mutational Landscape Model

John Maynard Smith

(1920 – 2004)

First remarked that adaptation does not take place in phenotypic space, but in sequence space…

Mutational Landscape Model

Gillespie (1983)

Given a sequence of nucleotides of length L,

There are 4^L possible sequences.

Each sequence has 3L neighboring sequences which are exactly one point mutation away.
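
The two counts above can be checked by brute force for a short sequence. This is only an illustrative sketch: the helper names `one_step_neighbors` and `all_sequences` and the example sequence "ACG" are assumptions made here, not part of the talk.

```python
from itertools import product

BASES = "ACGT"

def one_step_neighbors(seq):
    """Return every sequence exactly one point mutation away from seq."""
    neighbors = []
    for pos, base in enumerate(seq):
        for alt in BASES:
            if alt != base:
                # Swap in one of the three other nucleotides at this position.
                neighbors.append(seq[:pos] + alt + seq[pos + 1:])
    return neighbors

def all_sequences(length):
    """Enumerate all nucleotide sequences of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]

seq = "ACG"  # hypothetical example with L = 3
print(len(all_sequences(len(seq))))  # 4**L sequences in total
print(len(one_step_neighbors(seq)))  # 3*L one-step neighbors
```

Enumeration is only feasible for tiny L; for a genome of thousands of bases the 4^L sequence space is far too large to list, while the 3L neighbors remain manageable.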

Mutational Landscape Model

Additionally, if we assume Strong Selection and Weak Mutation (SSWM) then we can ignore the possibility of clonal interference.

Formally, 2Ns >> 1 and Nμ < 1, where N is the population size, s the selection coefficient, and μ the mutation rate.

Therefore new mutants will fix (or not) in the population before the next mutant arises.

Also, double mutants and neutral/deleterious mutations can be ignored.

Mutational Landscape Model

Consider a sequence in an environment where it is currently the most fit.

A small change occurs in the environment which shifts it to be the ith most fit sequence among its one-step mutant neighbors where i is small.

Mutational Landscape Model

There are then i-1 more fit sequences which the population could move to.

Notice that the fitnesses of these sequences are in the tail of the fitness distribution.

Mutational Landscape Model

We would like to find the probability of the population fixing mutant j when starting with sequence i.

Since we are dealing with only the tail of the fitness distribution, we can apply extreme value theory (EVT).

Orr’s One Step Model

Assumptions

The fitness distribution is in the Gumbel domain of attraction, so the fitnesses of the i-1 more fit one-step mutants can be treated as draws from an exponential distribution, which is the Gumbel-domain case of the generalized Pareto distribution (GPD).

This will allow a result which is independent of the underlying fitness distribution.

Orr’s One Step Model

Lemma

Let X1, …, Xn be iid observations with Xi ~ Exp, and let X(1) ≥ … ≥ X(n) be their order statistics, ranked from largest to smallest.

Then the spacings ΔXi = X(i) – X(i+1) are exponentially distributed and

E(ΔXi) = E(ΔX1) / i

Sukhatme (1937)
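
The spacing result can be checked by simulation. The sketch below assumes order statistics ranked from largest to smallest, so that spacing i is the gap between the i-th and (i+1)-th largest values; the function name `mean_spacings` and the parameter values are illustrative choices, not taken from the talk.

```python
import random

def mean_spacings(n, theta, reps, seed=0):
    """Average the gaps between consecutive descending order statistics
    of n iid exponential draws with mean theta."""
    rng = random.Random(seed)
    totals = [0.0] * (n - 1)
    for _ in range(reps):
        # expovariate takes the rate, so rate = 1 / mean.
        xs = sorted((rng.expovariate(1.0 / theta) for _ in range(n)), reverse=True)
        for i in range(n - 1):
            totals[i] += xs[i] - xs[i + 1]
    return [t / reps for t in totals]

theta = 2.0
for i, observed in enumerate(mean_spacings(n=5, theta=theta, reps=20000), start=1):
    # The lemma predicts E(spacing i) = theta / i.
    print(i, round(observed, 3), round(theta / i, 3))
```

The simulated means should track theta / i, which is what lets the expected spacings be expressed through rank alone.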

Orr’s One Step Model

Sincej 2sj (Haldane 1927)

Orr’s One Step Model

Taking the expected value…

Orr’s One Step Model

Notice, we have an expression for the expected transition probability which is independent of the fitness of the individual sequences and depends only on i and j.
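
As a hedged sketch of what such a rank-only expression looks like: the formula is not reproduced in this transcript, so the code below assumes the form usually attributed to Orr (2002), E[P_ij] = (1/(i-1)) Σ_{k=j}^{i-1} 1/k, for a population at rank i moving to a fitter mutant of rank j < i. The function name is hypothetical.

```python
from fractions import Fraction

def expected_transition_probability(i, j):
    """Assumed form of the expected probability of moving from rank i to rank j:
    (1 / (i - 1)) * sum_{k=j}^{i-1} 1/k, valid for 1 <= j < i."""
    if not 1 <= j < i:
        raise ValueError("need 1 <= j < i")
    return Fraction(1, i - 1) * sum(Fraction(1, k) for k in range(j, i))

i = 4  # hypothetical starting rank: three fitter one-step mutants exist
for j in range(1, i):
    print(j, expected_transition_probability(i, j))
```

Under this assumed form the probabilities over j = 1, …, i-1 sum to one, and the fittest mutant (j = 1) is always the most likely destination, though not a certain one.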

Orr’s One Step Model

Can this model be validated empirically?

Orr’s One Step Model

Experimental Evolution

Natural Isolate ID11

  • Differs from G4 by ~3%
  • Family: Microviridae
  • Host: E. coli
  • Genome: 5577 bp

Orr’s One Step Model

20 one-step walks

9 observed mutations

Rokyta et al. (2005)

Orr’s One Step Model

Rokyta et al. concluded that Orr’s transition probabilities did not explain the data as well as the Wahl model, even after the model was corrected for mutation bias.

Orr’s One Step Model

Where did Orr go wrong?

Perhaps the tail of the fitness distribution is not in the Gumbel domain of attraction, and is therefore not exponentially distributed?

Extreme Value Theory

Field of statistical theory which attempts to describe the distribution of extreme values (maxima and minima) of a sample from a given probability distribution.

slide27

Extreme Value Theory

Notice that extreme values of a sample generally fall in the tail of the underlying probability distribution. For example the maximum of a sample of size 10 from a standard normal distribution…
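
A quick simulation illustrates the point for this example. The sketch below draws many samples of size 10 from a standard normal and looks at where the maxima fall; the function name, replicate count, and seed are arbitrary choices made here.

```python
import random
from statistics import NormalDist, mean

def sample_maxima(sample_size, reps, seed=0):
    """Return the maximum of each of `reps` standard normal samples."""
    rng = random.Random(seed)
    return [max(rng.gauss(0.0, 1.0) for _ in range(sample_size)) for _ in range(reps)]

maxima = sample_maxima(sample_size=10, reps=20000)
average_max = mean(maxima)
# Probability mass of the underlying distribution lying above the typical maximum.
tail_mass = 1.0 - NormalDist().cdf(average_max)
print(round(average_max, 2), round(tail_mass, 3))
```

The typical maximum sits well out in the upper tail, where only a few percent of the underlying distribution's mass lies, which is why tail behavior alone governs the distribution of extremes.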

slide28

Extreme Value Theory

Since the tail is all that must be considered, many results of extreme value theory are independent of the underlying probability distribution.

In fact, EVT shows almost all probability distributions can be classified into three groups by their tail behavior.

slide29

Extreme Value Theory

These three types are…

  • Gumbel: most common distributions (Exponential, Normal, Gamma, etc.)
  • Weibull: finite-tail distributions
  • Fréchet: heavy-tail distributions (Cauchy)

slide30

Extreme Value Theory

EVT allows all three types of tail behavior to be described by the Generalized Pareto Distribution (GPD)

tau: scale parameter; kappa: shape parameter
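
The GPD formula itself is not reproduced in this transcript. The sketch below assumes the common parameterization F(x) = 1 - (1 + kappa x / tau)^(-1/kappa) for x ≥ 0, with the exponential as the kappa = 0 limit, kappa > 0 giving heavy (Fréchet-type) tails and kappa < 0 giving a finite upper endpoint at -tau/kappa (Weibull-type). Function names are hypothetical.

```python
import math
import random

def gpd_cdf(x, tau, kappa):
    """CDF of the GPD under the assumed (tau = scale, kappa = shape) form."""
    if x <= 0:
        return 0.0
    if kappa == 0:
        return 1.0 - math.exp(-x / tau)
    z = 1.0 + kappa * x / tau
    if z <= 0:
        return 1.0  # beyond the finite upper endpoint (only when kappa < 0)
    return 1.0 - z ** (-1.0 / kappa)

def gpd_pdf(x, tau, kappa):
    """Density of the GPD under the same assumed parameterization."""
    if x < 0:
        return 0.0
    if kappa == 0:
        return math.exp(-x / tau) / tau
    z = 1.0 + kappa * x / tau
    if z <= 0:
        return 0.0
    return z ** (-1.0 / kappa - 1.0) / tau

def gpd_sample(tau, kappa, rng):
    """Draw one value by inverting the CDF."""
    u = rng.random()  # uniform on [0, 1)
    if kappa == 0:
        return -tau * math.log(1.0 - u)
    return tau * ((1.0 - u) ** (-kappa) - 1.0) / kappa

rng = random.Random(0)
print([round(gpd_sample(1.0, -2.0, rng), 3) for _ in range(5)])  # all below 0.5
```

With tau = 1 and kappa = -2 the upper endpoint is 0.5, so every draw falls in [0, 0.5).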

slide32

Extreme Value Theory

The GPD not only provides the natural alternative distribution for testing against the exponential in this context; the null model (kappa = 0) is also nested within it, which allows the application of Maximum Likelihood and Likelihood Ratio Testing (LRT).

slide33

Maximum Likelihood and LRT

Log-Likelihood for the GPD is given…
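
The expression is missing from this transcript. As a sketch under an assumed parameterization (density (1/tau)(1 + kappa x/tau)^(-1/kappa - 1), with tau the scale and kappa the shape), the log-likelihood of n observations is -n log(tau) - (1/kappa + 1) Σ log(1 + kappa x_i/tau), reducing to -n log(tau) - Σ x_i/tau at kappa = 0. The function name is hypothetical.

```python
import math

def gpd_log_likelihood(data, tau, kappa):
    """Log-likelihood of non-negative data under the assumed GPD form."""
    if tau <= 0:
        return -math.inf
    n = len(data)
    if kappa == 0:
        # Exponential null model.
        return -n * math.log(tau) - sum(data) / tau
    total = 0.0
    for x in data:
        w = kappa * x / tau
        if w <= -1.0:
            return -math.inf  # observation lies outside the support
        total += math.log1p(w)
    return -n * math.log(tau) - (1.0 / kappa + 1.0) * total

data = [0.2, 0.5, 1.3]  # made-up values for illustration only
print(gpd_log_likelihood(data, 1.0, 0.0))
print(gpd_log_likelihood(data, 1.0, 0.3))
```

Using `log1p` keeps the kappa near 0 case numerically close to the exponential value, which matters when the two models are compared in a likelihood ratio.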

slide34

Maximum Likelihood and LRT

Distribution of the LRT test statistic?

Although a common approximation is to assume Chi-squared with one degree of freedom, this does not appear to be the case here.

Distribution of the test statistic was calculated using parametric bootstrap.
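
A parametric bootstrap of this kind can be sketched as follows. This is not the talk's implementation: it assumes the GPD form with scale tau and shape kappa, maximizes the alternative likelihood by a crude grid over theta = kappa/tau (for fixed theta the best kappa is the mean of log(1 + theta x)), caps theta at an arbitrary value, and skips kappa ≤ -1 where the likelihood is unbounded. The data values are made up.

```python
import math
import random

def exp_loglik(data):
    """Maximized log-likelihood under the exponential null (kappa = 0)."""
    n = len(data)
    return -n * (math.log(sum(data) / n) + 1.0)

def gpd_profile_loglik(data, theta):
    """GPD log-likelihood maximized over kappa for fixed theta = kappa / tau."""
    n = len(data)
    kappa = sum(math.log1p(theta * x) for x in data) / n
    if kappa <= -1.0:
        return -math.inf  # assumption: exclude the unbounded-likelihood region
    return -n * (math.log(kappa / theta) + kappa + 1.0)

def lrt_statistic(data, grid_size=200, theta_cap=20.0):
    """2 * (max alternative log-lik - max null log-lik), via grid search."""
    x_max = max(data)
    null_ll = exp_loglik(data)
    best = null_ll  # the null is nested, so the statistic cannot be negative
    for g in range(1, grid_size + 1):
        frac = g / grid_size
        # Negative theta must keep 1 + theta * x positive for every x.
        for theta in (-0.999 * frac / x_max, theta_cap * frac / x_max):
            best = max(best, gpd_profile_loglik(data, theta))
    return 2.0 * (best - null_ll)

def bootstrap_p_value(data, reps=200, seed=0):
    """Simulate the null distribution of the statistic from the fitted exponential."""
    rng = random.Random(seed)
    observed = lrt_statistic(data)
    n, scale = len(data), sum(data) / len(data)
    exceed = 0
    for _ in range(reps):
        sim = [rng.expovariate(1.0 / scale) for _ in range(n)]
        if lrt_statistic(sim) >= observed:
            exceed += 1
    return observed, exceed / reps

example = [0.05, 0.11, 0.2, 0.31, 0.42, 0.58, 0.77, 1.03, 1.5, 2.4]  # made-up data
print(bootstrap_p_value(example))
```

The p-value is the fraction of simulated null statistics at least as large as the observed one, so no chi-squared approximation is needed.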

slide35

Maximum Likelihood and LRT

Power

Probability of rejecting the null when the alternative is true.

1-P(Type II error)

Can we hope to reject the null with a given data set?

slide37

Maximum Likelihood and LRT

Sensitivity Analysis

Determine the inflation of the Type I error rate under violations of the null.

If the null is rejected, what is the chance that the rejection resulted from inflation of alpha caused by violations of the null hypothesis's assumptions?

slide38

Maximum Likelihood and LRT

Violations of the Null Assumptions

  • Small effect mutations have low probability of fixation and therefore may not be observed.
  • Observations include measurement error, which may be normal or log-normal.
slide40

Maximum Likelihood and LRT

The GPD is stable under shifts of the threshold, so the data can be analyzed relative to the smallest observed value!
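
Threshold stability can be verified numerically. The sketch assumes the GPD survival function S(x) = (1 + kappa x/tau)^(-1/kappa): conditional on exceeding a threshold u, the excess is again GPD with the same kappa and scale tau + kappa u. Function names and parameter values are illustrative.

```python
import math

def gpd_survival(x, tau, kappa):
    """P(X > x) under the assumed (tau = scale, kappa = shape) GPD form."""
    if kappa == 0:
        return math.exp(-x / tau)
    z = 1.0 + kappa * x / tau
    return z ** (-1.0 / kappa) if z > 0 else 0.0

def excess_survival(y, u, tau, kappa):
    """P(X > u + y | X > u): survival of the excess over threshold u."""
    return gpd_survival(u + y, tau, kappa) / gpd_survival(u, tau, kappa)

tau, u, y = 1.0, 0.7, 0.4  # arbitrary illustrative values
for kappa in (-0.5, 0.0, 0.5):
    shifted = gpd_survival(y, tau + kappa * u, kappa)  # same shape, new scale
    print(kappa, round(excess_survival(y, u, tau, kappa), 6), round(shifted, 6))
```

Because only the scale changes, shifting the origin to the smallest observed value leaves the shape parameter, and hence the test of kappa = 0, unaffected.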

slide42

Maximum Likelihood and LRT

If measurement error is not considered and our test rejects, we are likely safe in our conclusion, assuming the error is small.

If we fail to reject, it is likely due to the loss of power encountered when operating under a false null hypothesis.

In this case, we must reanalyze our data incorporating measurement error.

slide43

Maximum Likelihood and LRT

The likelihood equation of normal or lognormal measurement error conditional on the GPD has no closed form ;(

slide45

Maximum Likelihood and LRT

Standard optimization procedures fail to converge…

slide46

Metropolis-Hastings and Bayesian Methods

MH Algorithm

Given X(t), with f the target (posterior) density and g the proposal density:

1. Generate Y(t) ~ g(y - X(t))

2. Take X(t+1) =

Y(t) with probability min(1, f(Y(t))/f(X(t)))

X(t) otherwise

If g(z) is normal (symmetric) then convergence to the posterior is assured
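
The two steps can be sketched as a random-walk sampler with a normal proposal. The target used here is a standard normal log-density, chosen purely as a toy stand-in; it is not the GPD-with-measurement-error posterior from the talk, and the function names, step size, and burn-in length are assumptions.

```python
import math
import random

def metropolis(log_target, x0, step, n_samples, seed=0):
    """Random-walk Metropolis: propose Y = X + N(0, step^2), accept with
    probability min(1, f(Y) / f(X)), working on the log scale."""
    rng = random.Random(seed)
    x, log_fx = x0, log_target(x0)
    chain = []
    for _ in range(n_samples):
        y = x + rng.gauss(0.0, step)
        log_fy = log_target(y)
        log_ratio = log_fy - log_fx
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x, log_fx = y, log_fy  # accept the proposal
        chain.append(x)            # on rejection the current state repeats
    return chain

def log_standard_normal(x):
    """Toy target: log-density of N(0, 1) up to an additive constant."""
    return -0.5 * x * x

chain = metropolis(log_standard_normal, x0=0.0, step=1.0, n_samples=50000)
kept = chain[5000:]  # discard an arbitrary burn-in period
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 2))
```

Only ratios of f are needed, so the normalizing constant of the posterior never has to be computed; posterior means and credible intervals are then read off the retained draws.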

slide47

Metropolis-Hastings and Bayesian Methods

tau=1, kappa=-2, sigma=.1

mean=-1.64 95%CI=(-2.70,-.826)

slide48

Metropolis-Hastings and Bayesian Methods

tau=1, kappa=-2, sigma=.1

mean=.893 95%CI=(.509,1.41)

slide49

Metropolis-Hastings and Bayesian Methods

tau=1, kappa=-2, sigma=.1

mean=-1.818 CI=(-2.23,-1.47)

slide50

Metropolis-Hastings and Bayesian Methods

tau=1, kappa=-2, sigma=.1

mean=.083 95%CI=(.034,.160)

Thanks to…

Darin Rokyta

Paul Joyce

Holly Wichman

Jim Bull

IBEST

NIH

E. coli

References

Gillespie, J. H. 1984. Molecular evolution over the mutational landscape. Evolution 38:1116–1129.

Gillespie, J. H. 1991. The causes of molecular evolution. Oxford Univ. Press, New York.

Gumbel, E. J. 1958. Statistics of Extremes. Columbia Univ. Press, New York.

Orr, H. A. 2002. The population genetics of adaptation: The adaptation of DNA sequences. Evolution 56:1317–1330.

Orr, H. A. 2003a. The distribution of fitness effects among beneficial mutations. Genetics 163:1519–1526.

Rokyta, D. R., Joyce, P., Caudle, S. B., and Wichman, H. A. 2005. An empirical test of the mutational landscape model of adaptation using a single-stranded DNA virus. Nat. Genet. 37:441–444.

Rokyta, D. R., Beisel, C. J., and Joyce, P. 2006. Properties of adaptive walks on uncorrelated landscapes under strong selection and weak mutation. Journal of Theoretical Biology 243(1):114–120.

Beisel, C. J., Rokyta, D. R., Wichman, H. A., and Joyce, P. Testing the extreme value domain of attraction for beneficial fitness effects. (Submitted to Genetics)