
Prediction of Secondary Structure of RNA

Prediction of Secondary Structure of RNA. Swetha Nandyala. Overview: Introduction; How RNA Secondary Structure Is Obtained; Various Types of Loops Possible in RNA Secondary Structure; Motivation; Resource (implements the algorithms to be understood); Algorithm; Nussinov's RNA Folding; Example.


Presentation Transcript


  1. Prediction of Secondary Structure of RNA Swetha Nandyala

  2. Overview • Introduction • How RNA Secondary Structure Is Obtained • Various Types of Loops Possible in RNA Secondary Structure • Motivation • Resource (implements the algorithms to be understood) • Algorithm • Nussinov's RNA Folding • Example

  3. RNA Secondary Structure
     • Primary structure of RNA: a sequence of the bases A, G, C and U.
     • Through hydrogen bonds, the bases of an RNA may form base pairs:
       a) Watson-Crick base pairs: G≡C, formed by three hydrogen bonds; A=U, formed by two hydrogen bonds.
       b) Wobble base pairs: G-U, formed by two hydrogen bonds in a geometry shifted relative to a Watson-Crick pair.
     • Secondary structure of an RNA: the set of Watson-Crick and wobble base pairs occurring in an RNA fold.
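
The pairing rules on this slide can be written down as a small lookup. This is only a sketch: the names `WATSON_CRICK`, `WOBBLE`, `pair_type` and `can_pair` are hypothetical and do not come from the presentation.

```python
# Allowed base pairs, following the slide's definitions.
# All names here are illustrative assumptions, not part of the original deck.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_type(x, y):
    """Return 'watson-crick', 'wobble', or None for two RNA bases."""
    if (x, y) in WATSON_CRICK:
        return "watson-crick"
    if (x, y) in WOBBLE:
        return "wobble"
    return None


def can_pair(x, y):
    """True if the two bases can form either kind of pair."""
    return pair_type(x, y) is not None
```

A secondary structure is then a set of position pairs (i, j) for which `can_pair` holds on the corresponding bases.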

  4. Structural Features of RNA [Figure labels: hairpin loop, multibranched loop, stem, bulge, internal loop, external base]

  5. MOTIVATION Why is Prediction of RNA Secondary Structure important? • Tertiary Structure Prediction • Identify highly conserved Motifs • Design and Testing of Pharmaceutical products.

  6. Algorithm • Input: an alignment of RNA sequences • Output: a single common structure for the sequences • The model: the evolutionary model (Nussinov's); the SCFG

  7. Resource: http://ludwig-sun2.unil.ch/~bsondere/nussinov/form.html#nussinov
     Questions:
     • Is this the only secondary structure possible? No.
     • How to distinguish the correct structure? Both a scoring function and an algorithm are needed at this point, hence dynamic programming.
     • Key idea of Nussinov's algorithm: it is recursive, with only 4 possible cases.

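  8. Nussinov's Dynamic Programming [Figure: the four cases for the subchain from i to j: base i unpaired (continue with i+1..j), base j unpaired (continue with i..j-1), i and j paired (continue with i+1..j-1), and bifurcation (split into i..k and k+1..j)]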

  9. Energy Matrix
     • E(i,j) = maximum energy for the subchain starting at i and ending at j
     • s(r_i,r_j) = energy of the pair r_i, r_j (r_j = base at position j)
     • i is unpaired: E(i,j) = E(i+1,j)
     • j is unpaired: E(i,j) = E(i,j-1)
     • i,j is paired: E(i,j) = E(i+1,j-1) + s(r_i,r_j)
     • Bifurcation: E(i,j) = E(i,k) + E(k+1,j) for some k with i<k<j
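
The four cases above translate almost directly into a memoized recursive function. The following is a minimal sketch rather than the presenter's implementation: it assumes the unit score s = 1 for any Watson-Crick or wobble pair, uses 0-based indices, and the names `PAIRS`, `s` and `max_energy` are hypothetical.

```python
from functools import lru_cache

# Assumed scoring: every allowed pair is worth 1, everything else 0.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def s(x, y):
    return 1 if (x, y) in PAIRS else 0


def max_energy(seq):
    """Maximum energy of the whole sequence under the four-case recursion."""

    @lru_cache(maxsize=None)
    def E(i, j):
        # 0-based, inclusive bounds; empty or single-base subchains score 0.
        if i >= j:
            return 0
        best = max(
            E(i + 1, j),                        # i is unpaired
            E(i, j - 1),                        # j is unpaired
            E(i + 1, j - 1) + s(seq[i], seq[j]),  # i and j are paired
        )
        for k in range(i + 1, j):               # bifurcation at k
            best = max(best, E(i, k) + E(k + 1, j))
        return best

    return E(0, len(seq) - 1)
```

The recursive form is convenient for checking small cases by hand; for longer sequences a table filled by increasing subchain length avoids deep recursion.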

  10. RNA Secondary Structure Algorithm
      • Given: RNA sequence x1, x2, x3, ..., xL
      • Initialization: E(i,i-1) = 0 for i = 2 to L; E(i,i) = 0 for i = 1 to L
      • Recursion: for n = 2 to L (iteration over subchain length), for i = 1 to L-n+1 with j = i+n-1:
        E(i,j) = max { E(i+1,j), E(i,j-1), E(i+1,j-1) + s(r_i,r_j), max over i<k<j of [E(i,k) + E(k+1,j)] }
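
The initialization and recursion on this slide can be sketched as a bottom-up table fill. This is an illustrative version under stated assumptions: the table is padded so that the slide's 1-based indices (including E(i,i-1)) can be used directly, and `unit_score` and `fill_matrix` are hypothetical names with an assumed score of 1 per allowed pair.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def unit_score(x, y):
    # Assumed scoring function: 1 for an allowed pair, 0 otherwise.
    return 1 if (x, y) in PAIRS else 0


def fill_matrix(seq, score):
    """Fill E using 1-based indices i, j as on the slide."""
    L = len(seq)
    # Indices 0..L+1 so that E[i][i-1] and E[i+1][j] always exist.
    # Starting from all zeros covers E(i,i-1) = 0 and E(i,i) = 0.
    E = [[0] * (L + 2) for _ in range(L + 2)]
    for n in range(2, L + 1):              # subchain length
        for i in range(1, L - n + 2):      # start position
            j = i + n - 1                  # end position
            best = max(
                E[i + 1][j],
                E[i][j - 1],
                E[i + 1][j - 1] + score(seq[i - 1], seq[j - 1]),
            )
            for k in range(i + 1, j):      # bifurcation
                best = max(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E
```

With this layout, `fill_matrix(seq, unit_score)[1][len(seq)]` is the score of the whole sequence, and any other scoring function with the same two-argument shape can be passed in instead of `unit_score`.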

  11. Example • Let s(r_i,r_j) = 1 if r_i, r_j form a base pair and 0 otherwise • Input: GGGAAAUCC • E(i,j) = maximum energy conformation for the subchain from i to j • [Figure: energy matrix with rows i and columns j] Here we should have the maximum energy for AAAUC

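  12. GGG (i=1, j=3): max { E(2,3), E(1,2), E(2,2)+s(G,G) } = max { 0, 0, 0+0 } = 0
      AAU (i=5, j=7): max { E(6,7), E(5,6), E(6,6)+s(A,U) } = max { 1, 0, 0+1 } = 1 (E(6,7) = 1 because the adjacent A and U already pair at length 2)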

  13. Recovering the Structure
      • Main difference from sequence alignment: we trace back a tree-like structure, not a single optimal path (bifurcation introduces branch points).
      • Method 1: leave pointers as you compute the table: for each element of the table, store (at most two) pointers to the subsequences used in the solution.
      • Method 2: recover the history from the numerical values in the table. Stacking: check the value along the diagonal. Bifurcation: find k such that E(i,k) + E(k+1,j) = E(i,j).

  14. Trace back Algorithm
      • Initialization: push (1,L) onto the stack
      • Recursion: repeat until the stack is empty:
        pop (i,j)
        if i >= j: continue
        else if E(i+1,j) = E(i,j): push (i+1,j)
        else if E(i,j-1) = E(i,j): push (i,j-1)
        else if E(i+1,j-1) + s(r_i,r_j) = E(i,j): record i,j base pair; push (i+1,j-1)
        else for k = i+1 to j-1: if E(i,k) + E(k+1,j) = E(i,j): push (k+1,j); push (i,k); break
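
A stack-based traceback along these lines might look as follows. It is a sketch under the same assumptions as the fill step (unit score, 1-based padded table); `unit_score`, `fill_matrix` and `traceback` are hypothetical names, and the fill is repeated here only so the example runs on its own.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def unit_score(x, y):
    # Assumed scoring function: 1 for an allowed pair, 0 otherwise.
    return 1 if (x, y) in PAIRS else 0


def fill_matrix(seq, score):
    """Bottom-up fill with 1-based indices, padded with zeros."""
    L = len(seq)
    E = [[0] * (L + 2) for _ in range(L + 2)]
    for n in range(2, L + 1):
        for i in range(1, L - n + 2):
            j = i + n - 1
            best = max(E[i + 1][j], E[i][j - 1],
                       E[i + 1][j - 1] + score(seq[i - 1], seq[j - 1]))
            for k in range(i + 1, j):
                best = max(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E


def traceback(seq, E, score):
    """Recover one optimal set of base pairs as 1-based (i, j) tuples."""
    pairs = []
    stack = [(1, len(seq))]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if E[i + 1][j] == E[i][j]:            # i unpaired
            stack.append((i + 1, j))
        elif E[i][j - 1] == E[i][j]:          # j unpaired
            stack.append((i, j - 1))
        elif E[i + 1][j - 1] + score(seq[i - 1], seq[j - 1]) == E[i][j]:
            pairs.append((i, j))              # i and j paired
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):         # bifurcation
                if E[i][k] + E[k + 1][j] == E[i][j]:
                    stack.append((k + 1, j))
                    stack.append((i, k))
                    break
    return sorted(pairs)
```

Several structures can share the maximum score, so the pairs returned depend on the order in which the cases are checked and need not match the structure drawn in the slides.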

  15. [Figure: folded structure of GGGAAAUCC: a hairpin with the loop A A A closed by the stacked pairs G-U, G-C and G-C]

  16. Problems and Improvements
      • Advantages: accurate output can be obtained if certain basic details can be provided; cost: O(n^3); success rate was 70%.
      • Main drawbacks: hairpin loops could be of any length; developed in 1978, so it is not state of the art today, but it is a good starting point.
      • Improvements: minimization of Gibbs free energy; SCFGs.
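
The cubic cost quoted above can be checked by counting the bifurcation candidates: for every cell (i, j) the recursion scans each k with i < k < j, so the total number of (i, k, j) combinations is the number of ordered triples among L positions. Here L is the sequence length (the n of the slide); the derivation is a sketch of the standard counting argument, not taken from the presentation.

```latex
\[
\sum_{1 \le i < j \le L} (j - i - 1)
  \;=\; \#\{(i,k,j) : 1 \le i < k < j \le L\}
  \;=\; \binom{L}{3}
  \;=\; \frac{L(L-1)(L-2)}{6}
  \;=\; O(L^{3})
\]
```

The table itself has on the order of L^2 cells, so memory use grows quadratically while running time grows cubically.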

  17. References:
      • http://ludwig-sun2.unil.ch/~bsondere/nussinov/
      • http://www.daimi.au.dk/~schauser/genome_analysis_F03/lectures_F03/RNA-struct-prediction.pdf
      • http://www.scs-pw.gmu.edu/jamison/CSI730F01/Lec10.pdf
      • R. Durbin, S. Eddy, A. Krogh, G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.

  18. THANK YOU
