Zhi John Lu, Jason Gloor, and David H. Mathews
University of Rochester Medical Center, Rochester, New York

RNA Secondary and Tertiary Structure:

AAUUGCGGGAAAGGGGUCAA

CAGCCGUUCAGUACCAAGUC

UCAGGGGAAACUUUGAGAUG

GCCUUGCAAAGGGUAUGGUA

AUAAGCUGACGGACAUGGUC

CUAACCACGCAGCCAAGUCC

UAAGUCAACAGAUCUUCUGU

UGAUAUGGAUGCAGUUCA

Cate, et al. (Cech & Doudna). (1996) Science 273: 1678.

Waring & Davies. (1984) Gene 28: 277.

Gibbs Free Energy Change:

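$$K_i = \frac{[\text{Conformation } i]}{[\text{Unfolded}]} = e^{-\Delta G^\circ_i / RT}$$

$$\frac{[\text{Conformation } i]}{[\text{Conformation } j]} = K_i / K_j = e^{-(\Delta G^\circ_i - \Delta G^\circ_j) / RT}$$

The structure with the lowest ΔG° is the most favored at a given temperature.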
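A minimal sketch of how a difference in folding free energy translates into relative populations, assuming the standard relation K = exp(-ΔG°/RT). The function names, the example energies (-10.0 and -9.0 kcal/mol), and the default temperature of 310.15 K are illustrative assumptions, not values from the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(delta_g, temp_k=310.15):
    """Return K = exp(-dG/(R*T)) for folding from the unfolded state.

    delta_g is the folding free energy change in kcal/mol.
    """
    return math.exp(-delta_g / (R_KCAL * temp_k))


def population_ratio(delta_g_i, delta_g_j, temp_k=310.15):
    """Return [conformation i] / [conformation j] = K_i / K_j."""
    return equilibrium_constant(delta_g_i, temp_k) / equilibrium_constant(delta_g_j, temp_k)


# Hypothetical example: structure i is 1 kcal/mol more stable than structure j.
ratio = population_ratio(-10.0, -9.0)
print(f"population ratio i/j: {ratio:.2f}")
```

With these assumed numbers, a 1 kcal/mol difference already favors the more stable structure roughly fivefold at 37 °C, which is why small errors in the energy parameters can change which structure is predicted.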

Nearest Neighbor Model for Free Energy Change of a Sample Hairpin Loop:

Mathews et al., J. Mol. Biol., 1999, 288: 911.

Mathews et al., PNAS, 2004, 101: 7287.

Percentage of Known Base Pairs Correctly Predicted:

Mathews, Disney, Childs, Schroeder, Zuker, & Turner. 2004. PNAS 101: 7287.

- A minimum free energy structure provides the single best guess for the secondary structure.
- Assumes that:
  - RNA is at equilibrium
  - RNA has a single conformation
  - RNA thermodynamic parameters are without error, but:
    - Non-nearest neighbor effects are neglected
    - Some sequence-specific stabilities are averaged

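- A partition function can be used to determine the probability of a structure at equilibrium.

$$Q = \sum_{s} e^{-\Delta G^\circ_s / RT}$$

$$P_{i,j} = \frac{\sum_{k} e^{-\Delta G^\circ_k / RT}}{Q}$$

where Q is the partition function (the sum over all structures s) and the sum over k runs over all structures that contain the i-j base pair.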
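The idea can be illustrated by brute force on a tiny, explicitly enumerated ensemble. This is only a sketch: the three structures and their free energies are hypothetical, and practical programs compute the partition function by dynamic programming rather than by listing structures.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
TEMP_K = 310.15       # 37 degrees C, an assumed temperature


def pair_probabilities(ensemble, temp_k=TEMP_K):
    """Compute structure and base pair probabilities from an explicit ensemble.

    ensemble: list of (delta_g_kcal_per_mol, set_of_(i, j)_pairs).
    Returns (structure_probs, pair_probs).
    """
    rt = R_KCAL * temp_k
    weights = [math.exp(-dg / rt) for dg, _ in ensemble]  # Boltzmann weights
    q = sum(weights)                                      # partition function
    structure_probs = [w / q for w in weights]
    pair_probs = {}
    # A pair's probability is the summed probability of structures containing it.
    for prob, (_, pairs) in zip(structure_probs, ensemble):
        for pair in pairs:
            pair_probs[pair] = pair_probs.get(pair, 0.0) + prob
    return structure_probs, pair_probs


# Hypothetical ensemble: unfolded state plus two competing helices.
toy_ensemble = [
    (0.0, frozenset()),
    (-2.0, frozenset({(1, 10), (2, 9)})),
    (-1.5, frozenset({(1, 10), (3, 8)})),
]
structure_probs, pair_probs = pair_probabilities(toy_ensemble)
for pair, prob in sorted(pair_probs.items()):
    print(pair, round(prob, 3))
```

In this toy case the pair (1, 10) appears in both folded structures, so its probability exceeds that of either structure alone, even though neither structure dominates the ensemble.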

- Sensitivity – what percentage of known pairs occur in the predicted structure.
- Positive Predictive Value (PPV) – what percentage of predicted pairs occur in the known structure.
- PPV ≤ Sensitivity because the structures determined by comparative sequence analysis do not have all pairs and there is a tendency to over-predict base pairs by free energy minimization.
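The two definitions above can be written as a short calculation. This sketch assumes structures are given as sets of (i, j) index pairs and counts only exact matches as correct; the function name and example pairs are hypothetical, and some benchmarks use a more lenient matching rule.

```python
def sensitivity_and_ppv(known_pairs, predicted_pairs):
    """Return (sensitivity, PPV) as fractions between 0 and 1.

    sensitivity = correctly predicted pairs / known pairs
    PPV         = correctly predicted pairs / predicted pairs
    """
    known = set(known_pairs)
    predicted = set(predicted_pairs)
    correct = known & predicted
    sens = len(correct) / len(known) if known else 0.0
    ppv = len(correct) / len(predicted) if predicted else 0.0
    return sens, ppv


# Hypothetical structures: 4 known pairs, 5 predicted pairs, 3 in common.
known = {(1, 20), (2, 19), (3, 18), (4, 17)}
predicted = {(1, 20), (2, 19), (3, 18), (5, 16), (6, 15)}
sens, ppv = sensitivity_and_ppv(known, predicted)
print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")
```

Here three of the four known pairs are recovered (sensitivity 0.75), but two extra pairs are predicted, so PPV is lower (0.60), matching the over-prediction tendency described above.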

[Figure: Positive Predictive Value (PPV) and Sensitivity for predicted pairs with PBP ≥ 99%, ≥ 95%, ≥ 90%, ≥ 70%, and > 50%.]

Mathews. RNA. 10: 1178. (2004).

[Figure: PPV for predicted pairs with PBP ≥ 99%, ≥ 95%, ≥ 90%, ≥ 70%, and > 50%.]

Mathews. RNA. 10: 1178. (2004).

[Figure: E. coli 5S rRNA, with base pairs annotated by probability: PBP ≥ 99%, ≥ 90%, ≥ 70%, > 50%.]

- “Statistical learning method” to predict Pi,j
- Generate structures:

Where:

Bioinformatics. 22: e90-e98. (2006).

- Zhi John Lu, Jason Gloor, David Mathews
- Implement dynamic programming algorithm using partition function prediction of Pi,j.

- Also implement suboptimal structure prediction.
  - Alternative hypotheses.
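A dynamic programming approach of this kind can be sketched as follows. This is not the authors' implementation: the objective used here (2γ·P for each predicted pair plus the unpaired probability of each unpaired nucleotide), the weight `gamma`, the minimum hairpin loop size, and the toy probabilities are all assumptions made for illustration.

```python
def max_expected_accuracy(pair_prob, n, gamma=1.0, min_loop=3):
    """Find a nested structure maximizing an assumed expected-accuracy score.

    pair_prob: dict {(i, j): probability} with 0-based indices and i < j.
    Returns (score, sorted list of predicted pairs).
    """
    # Probability that each nucleotide is unpaired.
    unpaired = [1.0] * n
    for (i, j), p in pair_prob.items():
        unpaired[i] -= p
        unpaired[j] -= p

    # best[i][j] is the best score for nucleotides i..j-1 (half-open interval).
    best = [[0.0] * (n + 1) for _ in range(n + 1)]
    partner = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            # Case 1: nucleotide i is unpaired.
            best[i][j] = unpaired[i] + best[i + 1][j]
            # Case 2: nucleotide i pairs with k, splitting the interval in two.
            for k in range(i + min_loop + 1, j):
                p = pair_prob.get((i, k), 0.0)
                if p == 0.0:
                    continue
                score = 2.0 * gamma * p + best[i + 1][k] + best[k + 1][j]
                if score > best[i][j]:
                    best[i][j] = score
                    partner[i][j] = k

    # Traceback with an explicit stack to recover the chosen pairs.
    pairs, stack = [], [(0, n)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        k = partner[i][j]
        if k is None:
            stack.append((i + 1, j))
        else:
            pairs.append((i, k))
            stack.append((i + 1, k))
            stack.append((k + 1, j))
    return best[0][n], sorted(pairs)


# Hypothetical pair probabilities for a 10-nucleotide sequence.
toy_probs = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.3, (1, 7): 0.1}
score, pairs = max_expected_accuracy(toy_probs, n=10)
print(pairs)
```

With `gamma=1.0` the low-probability pair (2, 7) is left out because leaving both nucleotides unpaired scores higher; raising `gamma` (for example to 3.0) favors sensitivity and brings that pair in.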

- Maximizing expected accuracy can predict structures with greater sensitivity and positive predictive value than free energy minimization.
- Maximizing expected accuracy using an underlying thermodynamic model is more accurate than an underlying statistical model.

Methanococcus thermolithotrophicus 5S rRNA (Szymanski et al., 1998):

MaxExpect Predicted Structure:

Minimum Free Energy Structure:

CONTRAfold Predicted Structure: