# Distances - PowerPoint PPT Presentation

Distances. A natural or ideal measure of distance between two sequences should have an evolutionary meaning. One such measure may be the number of nucleotide substitutions that have accumulated in the two sequences since they have diverged from each other.

## PowerPoint Slideshow about 'Distances' - tarannum-hasan


To derive a measure of distance, we need to make several simplifying assumptions regarding the probability of substitution of one nucleotide by another.

Jukes & Cantor one-parameter model

Assumption:

• Substitutions occur with equal probabilities among the four nucleotide types.

Kimura’s two-parameter model

Assumptions:

• The rate of transitional substitution at each nucleotide site is α per unit time.

• The rate of each type of transversional substitution is β per unit time.

NUMBER OF NUCLEOTIDE SUBSTITUTIONS BETWEEN TWO DNA SEQUENCES

After two nucleotide sequences diverge from each other, each of them will start accumulating nucleotide substitutions. If two sequences of length N differ from each other at n sites, then the proportion of differences, n/N, is referred to as the degree of divergence or Hamming distance. Degrees of divergence are usually expressed as percentages (n/N × 100%).
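The degree of divergence can be computed directly from an ungapped alignment. The following is a minimal sketch; the function name `degree_of_divergence` and the two short example sequences are illustrative assumptions, not part of the original slides.

```python
def degree_of_divergence(seq_a: str, seq_b: str) -> float:
    """Return n/N, the proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # n = number of sites where the two nucleotides differ
    n = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return n / len(seq_a)


# Two hypothetical aligned sequences of length N = 5 that differ at n = 2 sites
p = degree_of_divergence("ATCGG", "ACCCG")
print(f"{p:.0%}")  # prints 40%
```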

The observed number of differences is likely to be smaller than the actual number of substitutions due to multiple hits at the same site.

13 mutations = 3 differences

Number of substitutions between two noncoding sequences

The one-parameter model

In this model, it is sufficient to consider only I(t), which is the probability that the nucleotide at a given site at time t is the same in both sequences.

where p is the observed proportion of different nucleotides between the two sequences.

L = number of sites compared in the ungapped alignment between the two sequences.
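Assuming the standard Jukes and Cantor estimator, K = -(3/4) ln(1 - 4p/3), with approximate sampling variance V(K) = p(1 - p) / [L (1 - 4p/3)^2], the calculation from p and L can be sketched as follows. The function name and the example values p = 0.1 and L = 100 are hypothetical.

```python
import math


def jukes_cantor(p: float, L: int) -> tuple[float, float]:
    """Return (K, V(K)) under the one-parameter model.

    p: observed proportion of different nucleotides
    L: number of sites compared in the ungapped alignment
    """
    if L <= 0:
        raise ValueError("L must be positive")
    if not 0 <= p < 0.75:
        # The logarithm is undefined once p reaches 3/4 (saturation)
        raise ValueError("p must satisfy 0 <= p < 0.75")
    correction = 1 - 4 * p / 3
    K = -0.75 * math.log(correction)
    variance = p * (1 - p) / (L * correction ** 2)
    return K, variance


# Hypothetical input: 10% observed differences over 100 sites
K, var_K = jukes_cantor(0.1, 100)
print(f"K = {K:.4f}, V(K) = {var_K:.5f}")
```

Because of multiple hits, the estimate K is always at least as large as the observed proportion p.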

The two-parameter model

The differences between two sequences are classified into transitions and transversions.

P = proportion of transitional differences

Q = proportion of transversional differences

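ATCGG

ACCCG

One transitional difference (T ↔ C, site 2) and one transversional difference (G ↔ C, site 4) in 5 sites:

P = 0.2

Q = 0.2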
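The classification behind P and Q can be sketched in a few lines: a difference is a transition when both nucleotides are purines (A, G) or both are pyrimidines (C, T), and a transversion otherwise. The names `PURINES`, `PYRIMIDINES`, and `transition_transversion_proportions` are assumptions made for this illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion_proportions(seq_a: str, seq_b: str) -> tuple[float, float]:
    """Return (P, Q): proportions of transitional and transversional differences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    n_sites = len(seq_a)
    return transitions / n_sites, transversions / n_sites


P, Q = transition_transversion_proportions("ATCGG", "ACCCG")
print(P, Q)  # prints 0.2 0.2
```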

Numerical example (2P-model)
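A calculation of this kind can be sketched as follows, assuming the standard Kimura two-parameter estimator K = (1/2) ln[1 / (1 - 2P - Q)] + (1/4) ln[1 / (1 - 2Q)]. The function name is hypothetical, and the input values P = Q = 0.2 are taken from the short ATCGG / ACCCG example purely for illustration.

```python
import math


def kimura_two_parameter(P: float, Q: float) -> float:
    """Return K, the estimated number of substitutions per site.

    P: proportion of transitional differences
    Q: proportion of transversional differences
    """
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        # The logarithms are undefined when the sequences are too divergent
        raise ValueError("P and Q are too large for the two-parameter correction")
    return 0.5 * math.log(1 / a) + 0.25 * math.log(1 / b)


K = kimura_two_parameter(0.2, 0.2)
print(f"K = {K:.3f}")  # prints K = 0.586
```

The corrected estimate (about 0.59 substitutions per site) is noticeably larger than the observed proportion of differences, P + Q = 0.4.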

• Substitution schemes with more than two parameters.

• Parameter-free substitution schemes.

Difficulties with denominator:

1. The classification of a site changes with time: For example, the third position of CGG (Arg) is synonymous. However, if the first position changes to T, then the third position of the resulting codon, TGG (Trp), becomes nonsynonymous.

CGG (Arg) → TGG (Trp): the third position becomes nonsynonymous.

2. Many sites are neither completely synonymous nor completely nonsynonymous. For example, a transition in the third position of GAT (Asp) will be synonymous, while a transversion to either GAG or GAA will alter the amino acid.
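Both difficulties can be checked against the standard genetic code. The sketch below counts, for a given codon position, how many of the three possible nucleotide changes leave the amino acid unchanged. The compact codon-table construction and the helper name `synonymous_changes` are assumptions of this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_changes(codon: str, position: int) -> int:
    """Count the changes (0 to 3) at a 0-based codon position that keep the amino acid."""
    original = CODON_TABLE[codon]
    count = 0
    for base in BASES:
        if base == codon[position]:
            continue
        mutant = codon[:position] + base + codon[position + 1:]
        if CODON_TABLE[mutant] == original:
            count += 1
    return count


# Third position (index 2) of the codons discussed above
for codon in ("CGG", "TGG", "GAT"):
    print(codon, CODON_TABLE[codon], synonymous_changes(codon, 2))
# prints:
# CGG R 3
# TGG W 0
# GAT D 1
```

The third position of CGG is fully synonymous (3 of 3 changes), that of TGG is fully nonsynonymous (0 of 3), and that of GAT is only partly synonymous (1 of 3).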

Difficulties with numerator:

1. The classification of the change depends on the order in which the substitutions occurred.

2. Transitions occur with different frequencies than transversions.

3. The type of substitution depends on the mutation. Transitions result more frequently in synonymous substitutions than transversions.
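The dependence on order can be made concrete by enumerating every pathway between two codons that differ at more than one position. The sketch below uses the codon pair TTT (Phe) and GTA (Val) as an illustrative assumption; the function name `pathways` is hypothetical, and pathways through stop codons are not excluded here, which a full method would need to handle.

```python
from itertools import permutations, product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def pathways(codon_a: str, codon_b: str) -> list[tuple[int, int]]:
    """Return (synonymous, nonsynonymous) step counts for each order of substitutions."""
    differing = [i for i in range(3) if codon_a[i] != codon_b[i]]
    results = []
    for order in permutations(differing):
        current = codon_a
        synonymous = 0
        nonsynonymous = 0
        for i in order:
            # Apply one substitution at position i and classify that step
            following = current[:i] + codon_b[i] + current[i + 1:]
            if CODON_TABLE[current] == CODON_TABLE[following]:
                synonymous += 1
            else:
                nonsynonymous += 1
            current = following
        results.append((synonymous, nonsynonymous))
    return results


print(pathways("TTT", "GTA"))  # prints [(1, 1), (0, 2)]
```

Via GTT (Val) the two changes are one nonsynonymous and one synonymous; via TTA (Leu) both are nonsynonymous, so the count depends on the assumed order.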

Miyata & Yasunaga (1980) and Nei & Gojobori (1986) method

Step 1: Classify nucleotides into non-degenerate, twofold and fourfold degenerate sites

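L0 = number of non-degenerate sites

L2 = number of twofold degenerate sites

L4 = number of fourfold degenerate sites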
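Step 1 can be sketched as a count over the codons of a coding sequence, assuming the labels L0, L2, and L4 denote the numbers of non-degenerate, twofold, and fourfold degenerate sites. A site is treated as non-degenerate when none of its three possible changes is synonymous, fourfold degenerate when all three are, and twofold degenerate otherwise (a simplification that lumps the threefold sites of isoleucine with the twofold class). All function names and the input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_changes(codon: str, position: int) -> int:
    """Count the changes (0 to 3) at a codon position that keep the amino acid."""
    original = CODON_TABLE[codon]
    return sum(
        1
        for base in BASES
        if base != codon[position]
        and CODON_TABLE[codon[:position] + base + codon[position + 1:]] == original
    )


def count_degeneracy_classes(coding_seq: str) -> dict[str, int]:
    """Count non-degenerate (L0), twofold (L2) and fourfold (L4) degenerate sites."""
    counts = {"L0": 0, "L2": 0, "L4": 0}
    # Walk over complete codons only; a trailing partial codon is ignored
    for start in range(0, len(coding_seq) - 2, 3):
        codon = coding_seq[start:start + 3]
        for position in range(3):
            n_syn = synonymous_changes(codon, position)
            if n_syn == 0:
                counts["L0"] += 1
            elif n_syn == 3:
                counts["L4"] += 1
            else:
                counts["L2"] += 1
    return counts


# Hypothetical coding sequence: CGG (Arg), GAT (Asp), TGG (Trp)
print(count_degeneracy_classes("CGGGATTGG"))  # prints {'L0': 6, 'L2': 2, 'L4': 1}
```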

Number of Amino-Acid Replacements between Two Proteins

• The observed proportion of different amino acids between the two sequences (p) is

p = n /L

• n = number of amino acid differences between the two sequences

• L = length of the aligned sequences.

The Poisson model is used to convert p into the number of amino acid replacements between two sequences (d):

d = - ln(1 – p)

The variance of d is estimated as

V(d) = p / [L (1 – p)]
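The three quantities p, d, and V(d) can be computed together. This is a minimal sketch that reads the variance as p divided by the product L(1 - p); the function name and the example counts (n = 20 differences over L = 100 aligned positions) are hypothetical.

```python
import math


def poisson_distance(n: int, L: int) -> tuple[float, float, float]:
    """Return (p, d, V(d)) for two aligned protein sequences.

    n: number of amino acid differences
    L: length of the aligned sequences
    """
    if L <= 0 or not 0 <= n < L:
        # p = 1 would make ln(1 - p) undefined
        raise ValueError("require L > 0 and 0 <= n < L")
    p = n / L
    d = -math.log(1 - p)
    variance = p / (L * (1 - p))
    return p, d, variance


p, d, var_d = poisson_distance(20, 100)
print(f"p = {p:.2f}, d = {d:.4f}, V(d) = {var_d:.4f}")
```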

Theoretical Expectations

Neutral mutations

Deleterious mutations

Overdominant mutations