

Improving Free Energy Functions for RNA Folding

RNA Secondary Structure Prediction


Why RNA is Important

  • Machinery of protein construction

  • Catalytic role in cells

    • May be possible to destroy specific sequences of RNA (to interrupt protein production)

    • RNase P (Cech/Altman c.1981)


RNA Structural Levels

  • Primary: AAUCG...CUUCUUCCA

  • Secondary: figure from http://anx12.bio.uci.edu/~hudel/bs99a/lecture21/lecture2_2.html

  • Tertiary: figure from http://www.leeds.ac.uk/bmb/courses/teachers/trnballs.html


Abstracting the problem

[Figure: diagram labeled with the bases A, G, C, G, C, A, U, C]

Zuker (1981) Nucleic Acids Research 9(1) 133-149


Why it is hard

  • Large search space (hard to enumerate)

Hofacker et al. (1994) Monat. Chem. 125 167-188
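To give a feel for how quickly the search space grows, here is a minimal sketch (not from the slides) that counts pseudoknot-free secondary structures on n bases with a standard recurrence. It assumes any two positions may pair, so the sequence itself is ignored, and that a hairpin must enclose at least `min_loop` unpaired bases. The name `count_structures` and the parameter `min_loop` are illustrative assumptions.

```python
def count_structures(n: int, min_loop: int = 3) -> int:
    """Count pseudoknot-free secondary structures on n bases.

    Assumption: every pair of positions is allowed to pair, and each
    hairpin encloses at least `min_loop` unpaired bases.
    """
    counts = [1] * (n + 1)  # counts[0] = 1 for the empty structure
    for length in range(1, n + 1):
        # Case 1: the last base is unpaired.
        total = counts[length - 1]
        # Case 2: the last base pairs with position k (1-based).
        # k - 1 bases lie to the left, length - k - 1 bases are enclosed.
        for k in range(1, length - min_loop):
            total += counts[k - 1] * counts[length - k - 1]
        counts[length] = total
    return counts[n]


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, count_structures(n))
```

The counts grow exponentially with n, which is why explicit enumeration is impractical for sequences of realistic length.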


Why it is hard

  • Secondary structure does not exist.

    • Unlike proteins

    • Putative structures (prone to revision)

  • Quality of Energy Functions

    • Discussed later


Current Algorithms

  • Single-Strand

    • Minimum Free Energy (Zuker et al. 1981)

    • Partition Functions (McCaskill 1990)

  • Comparative Sequence Analysis

    • Max. Weighted Matching (Nussinov et al. 1978)

    • Stochastic CFG (Sakakibara et al. 1994)

    • Phylogenetic Trees (Gulko et al. 1995)

    • Statistical Significance (Noller & Woese, early 1980s)

See proposal for references
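As an illustration of the dynamic-programming style shared by several of these methods, the following is a minimal sketch of base-pair maximization in the spirit of Nussinov et al. It is a simplified stand-in, not the energy-based MFE algorithm: every allowed pair scores 1, and the pairing rules, the `min_loop` constraint, and the name `max_pairs` are assumptions made for the example.

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq: str, min_loop: int = 3) -> int:
    """Return the maximum number of nested base pairs in seq.

    dp[i][j] holds the best pair count for the subsequence seq[i..j].
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # base j left unpaired
            # base j paired with k, keeping at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(max_pairs("GGGAAACCC"))  # prints 3
```

The table has O(n^2) entries and each takes O(n) work, so the search over an exponential number of structures collapses to cubic time. This only works because the score decomposes over independent substructures.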


MFE / Tinoco Hypothesis

The free energy of a secondary structure equals the sum of the free energies of its loops and stacked pairs.

Tinoco et al. (1971) Nature 230 362-367.
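Written as a formula, the hypothesis says the energy is additive over structural elements. The notation below is an assumption introduced for illustration: loops(S) denotes the set of loops and stacked pairs of a structure S, e(L) the free energy contribution of one element, and S* the minimum free energy structure.

```latex
E(S) \;=\; \sum_{L \,\in\, \mathrm{loops}(S)} e(L),
\qquad
S^{*} \;=\; \arg\min_{S} \, E(S)
```

Because each term depends on a single element only, the minimum can be found by dynamic programming over subsequences.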


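Proposed System

[Figure: flow diagram connecting the input sequence AAUCG...CUUCUUCCA, an MFE stage (energy function E), a GA stage (energy function E’), and the resulting secondary structures, with steps numbered 1 to 3]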


Step I - Calculate MFE Structure

  • Given a sequence, apply the MFE algorithm

    • Generates secondary structure S


Step II - Structural Similarity

  • Given a database of experimentally verified RNA structures

    • Let Q be the database structure most similar to S

    • Based on RNase P Database (Brown 1999)
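The slides do not say which similarity measure is used to pick Q. As a hedged sketch, the code below assumes structures are given in dot-bracket notation and scores similarity as the Jaccard overlap of their base-pair sets. The functions `parse_pairs`, `similarity`, and `most_similar` are hypothetical names, and comparing pairs by raw position is a simplification that ignores alignment between sequences of different length.

```python
from typing import List, Set, Tuple


def parse_pairs(dot_bracket: str) -> Set[Tuple[int, int]]:
    """Convert a dot-bracket string into a set of (i, j) base pairs."""
    stack: List[int] = []
    pairs: Set[Tuple[int, int]] = set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of base-pair sets (assumed measure, range 0 to 1)."""
    pa, pb = parse_pairs(a), parse_pairs(b)
    if not pa and not pb:
        return 1.0
    return len(pa & pb) / len(pa | pb)


def most_similar(s: str, database: List[str]) -> str:
    """Return the database structure Q with the highest similarity to s."""
    return max(database, key=lambda q: similarity(s, q))


if __name__ == "__main__":
    db = ["......", "(....)", "((..))"]
    print(most_similar("((..))", db))
```

Any measure that maps two structures to a number could replace `similarity` here; tree-edit or alignment-based distances would be natural alternatives.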


Step III - Construct E’

  • Create a new energy function:


Discussion on E’

  • E’ has global information

  • Global information precludes the use of dynamic programming (MFE, Partition)

  • Leaves (stochastic) combinatorial optimization

    • Gradient Descent (not applicable: no derivative ∂E/∂S)

    • Genetic Algorithms / Simulated Annealing


Step IV - Genetic Algorithm

  • RNA Structural Prediction by GA

    • Input: sequence

    • Output: structure that minimizes E’ for the input sequence

    • Steady State Genetic Algorithm

    • Pseudoknots forbidden (conflicts)

    • Fitness = -E’

    • Effect of Similarity(Q, S) diminishes with each generation (pseudo-SA).
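The transcript does not preserve the formula for E’, so the following is only a hypothetical illustration of the last bullet: a similarity bonus whose weight decays geometrically with the generation number, loosely like a cooling schedule. The additive form of `e_prime`, the parameters `w0` and `decay`, and all function names are assumptions, not the authors' definition.

```python
def similarity_weight(generation: int, w0: float = 1.0, decay: float = 0.95) -> float:
    """Weight of the similarity term; shrinks toward 0 as generations pass."""
    return w0 * decay ** generation


def e_prime(energy: float, similarity: float, generation: int,
            w0: float = 1.0, decay: float = 0.95) -> float:
    """Hypothetical E': base energy lowered by a decaying similarity bonus."""
    return energy - similarity_weight(generation, w0, decay) * similarity


def fitness(energy: float, similarity: float, generation: int,
            w0: float = 1.0, decay: float = 0.95) -> float:
    """Fitness = -E', as stated on the slide."""
    return -e_prime(energy, similarity, generation, w0, decay)


if __name__ == "__main__":
    for g in (0, 10, 50, 200):
        print(g, fitness(-10.0, 0.5, g))
```

Under this assumed form, early generations are pulled toward structures resembling Q, while late generations are ranked almost entirely by the base energy.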


Genetic Algorithm - Representation

  • Stem-loop representation (Chen et al. 2000)

    • Example stem: (23 52 3 3.2), with fields start, end, length, and weight

    • Window method (EMBOSS Palindrome)
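A stem pool of this kind can be sketched as follows. This is a simplified scan for maximal runs of stacked pairs, not the EMBOSS Palindrome implementation. The field order (start, end, length, weight) is inferred from the example stem, positions are 0-based here, and using the stem length as its weight is a placeholder assumption because the slides do not say how weights are computed.

```python
from typing import List, NamedTuple

# Watson-Crick plus G-U wobble pairs (an assumption for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


class Stem(NamedTuple):
    start: int     # 5' position of the outermost pair (0-based)
    end: int       # 3' position of the outermost pair
    length: int    # number of stacked pairs
    weight: float  # placeholder score; real weighting is not given


def can_pair(seq: str, i: int, j: int) -> bool:
    return (seq[i], seq[j]) in PAIRS


def build_stem_pool(seq: str, min_len: int = 3, min_loop: int = 3) -> List[Stem]:
    """Collect maximal stems with at least min_len stacked pairs."""
    n = len(seq)
    pool: List[Stem] = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if not can_pair(seq, i, j):
                continue
            # Skip pairs that extend outward; they belong to a longer stem.
            if i > 0 and j < n - 1 and can_pair(seq, i - 1, j + 1):
                continue
            length = 0
            # Extend inward while bases pair and the loop stays large enough.
            while ((j - length) - (i + length) > min_loop
                   and can_pair(seq, i + length, j - length)):
                length += 1
            if length >= min_len:
                pool.append(Stem(i, j, length, float(length)))
    return pool


if __name__ == "__main__":
    print(build_stem_pool("GGGAAACCC"))
```

Representing an individual as a list of such stems keeps the genetic operators simple: they only add, move, or drop whole stems.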


Genetic Algorithm - Operators

  • Mutation

    • Add stem from stem pool to a child

  • Crossover

    • Fit stems of P2 into C1 or C2 randomly.

    • Placement must be conflict free.

[Figure: parents P1 and P2 with children C1 and C2]
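The two operators can be sketched as below, under stated assumptions. A stem is taken to conflict with another if the two share a base or if their pairs cross (a pseudoknot). The slides do not say how the children are seeded before stems of P2 are fitted in, so this sketch deals the stems of P1 randomly between C1 and C2 first. All names (`Stem`, `conflicts`, `try_add`, `mutate`, `crossover`) are hypothetical.

```python
import random
from typing import List, NamedTuple, Tuple


class Stem(NamedTuple):
    start: int
    end: int
    length: int
    weight: float


def stem_pairs(s: Stem) -> List[Tuple[int, int]]:
    """Base pairs of a stem, from the outermost pair inward."""
    return [(s.start + k, s.end - k) for k in range(s.length)]


def conflicts(a: Stem, b: Stem) -> bool:
    """True if the stems share a base or form crossing pairs."""
    for i, j in stem_pairs(a):
        for k, l in stem_pairs(b):
            if len({i, j, k, l}) < 4:
                return True  # a base is used twice
            if i < k < j < l or k < i < l < j:
                return True  # crossing pairs: pseudoknot
    return False


def try_add(structure: List[Stem], stem: Stem) -> bool:
    """Append stem only if it conflicts with nothing already placed."""
    if any(conflicts(stem, s) for s in structure):
        return False
    structure.append(stem)
    return True


def mutate(child: List[Stem], stem_pool: List[Stem], rng: random.Random) -> bool:
    """Mutation: try to add one random stem from the pool to the child."""
    if not stem_pool:
        return False
    return try_add(child, rng.choice(stem_pool))


def crossover(p1: List[Stem], p2: List[Stem],
              rng: random.Random) -> Tuple[List[Stem], List[Stem]]:
    """Crossover: fit stems of P2 into C1 or C2 at random, conflict free."""
    c1: List[Stem] = []
    c2: List[Stem] = []
    for stem in p1:  # assumption: children are seeded from P1
        (c1 if rng.random() < 0.5 else c2).append(stem)
    for stem in p2:
        first, second = (c1, c2) if rng.random() < 0.5 else (c2, c1)
        if not try_add(first, stem):
            try_add(second, stem)  # dropped if it conflicts with both
    return c1, c2
```

Because every placement goes through `try_add`, children remain valid pseudoknot-free structures as long as the parents were valid.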


Preliminary Results

  • E’ does not lead to a drastic speed-up

  • Genetic algorithm is very slow if the initial population is generated randomly from the stem pool

    • Remedy: use suboptimal foldings for the initial population


Preliminary Results Explained

  • The real structure is usually very similar to the Tinoco optimal structure.

  • View E’ as a way of choosing among the suboptimal structures.


Future Work

  • More testing on the entire RNase P Database (> 400 structures)

  • Tune E’

  • Accuracy comparison to MFE and Partition Function Algorithms

  • Parallelize genetic algorithm

