1 / 42

Predicting RNA Structure and Function

Predicting RNA Structure and Function. Following the human genome sequencing there is a high interest in RNA.

redford
Download Presentation

Predicting RNA Structure and Function

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PredictingRNA Structure and Function

  2. Following the human genome sequencing there is a high interest in RNA “Just when scientists thought they had deciphered the roles played by the cell's leading actors, a familiar performer has turned up in a stunning variety of guises. RNA, long upstaged by its more glamorous sibling, DNA, is turning out to have star qualities of its own “ SCINECE NEWS 12: 2002

  3. Ribozyme

  4. The Ribosome : The protein factory of the cell mainly made of RNA

  5. Non coding DNA (98.5% human genome) • Intergenic • Repetitive elements • Promoters • Introns • untranslated region (UTR)

  6. Some biological functions of ncRNA • mRNA cellular localization • Control of mRNA stability • Control of splicing • Control of translation The function of the RNA molecule depends on its folded structure

  7. RNA Structural levels Tertiary Structure Secondary Structure tRNA

  8. Control of Ironlevels by mRNA structure Iron Responsive Element IRE G U A G CN N N’ N N’ N N’ N N’ C N N’ N N’ N N’ N N’ N N’ conserved Recognized by IRP1, IRP2 5’ 3’

  9. Low Iron IRE-IRP inhibits translation of ferritin IRE-IRP Inhibition of degradation of TR High Iron IRE-IRP off -> ferritin translated Transferin receptor degradated F: Ferritin = iron storage TR: Transferin receptor = iron uptake IRP1/2 IRE 3’ 5’ F mRNA IRP1/2 3’ TR mRNA 5’

  10. 3’ G A U C U U G A U C RNA Secondary Structure • The RNA molecule folds on itself. • The base pairing is as follows: G C A U G U hydrogen bond. LOOP U U C G U A A U G C 5’ 3’ STEM 5’

  11. RNA Secondary structureShort Range Interactions HAIRPIN LOOP G G A U U G C C G G A U A A U G C AG C U U BULGE INTERNAL LOOP STEM DANGLING ENDS 5’ 3’

  12. long range interactions of RNA secondary structural elements These patterns are excluded from the prediction schemes as their computation is too intensive. Pseudo-knot Kissing hairpins Hairpin-bulge contact

  13. Predicting RNA secondary Structure • Searching for a structure with Minimal Free Energy (MFE) • According to base pairing rules only Watson Crick A-T G-C and wobble pairs G-T can from stems

  14. Simplifying Assumptions for Structure Prediction • RNA folds into one minimum free-energy structure. • There are no knots (base pairs never cross). • The energy of a particular base pair in a double stranded regions is calculated independently • Neighbors do not influence the energy. Solution : Searching for MFE with Dynamic Programming Zucker and Steigler 1981

  15. Sequence dependent free-energy values of the base pairs (nearest neighbor model) U U C G U A A U G C A UCGAC 3’ U U C G G C A U G C A UCGAC 3’ 5’ 5’ • Assign negative energies to interactions between base pair regions. • Energy is influenced by the previous base pair • (not by the base pairs further down).

  16. Sequence dependent free-energy values of the base pairs (nearest neighbor model) U U C G U A A U G C A UCGAC 3’ U U C G G C A U G C A UCGAC 3’ 5’ 5’ • These energies are estimated experimentally from small synthetic RNAs. Example values: GC GC GC GC AU GC CG UA -2.3 -2.9 -3.4 -2.1

  17. Adding Complexity to Energy Calculations • Positive energy - added for destabilizing regions such as bulges, loops, etc. • More than one structure can be predicted

  18. Free energy computation U U A A G C G C A G C U A A U C G A U A3’ A 5’ +5.9 4 nt loop -1.1 mismatch of hairpin -2.9 stacking +3.3 1nt bulge -2.9 stacking -1.8 stacking -0.9 stacking -1.8 stacking 5’ dangling -2.1 stacking -0.3 G= -4.6 KCAL/MOL -0.3

  19. Prediction Tools based on Energy Calculation Fold, Mfold Zucker & Stiegler (1981) Nuc. Acids Res. 9:133-148 Zucker (1989) Science 244:48-52 RNAfold Vienna RNA secondary structure server Hofacker (2003) Nuc. Acids Res. 31:3429-3431

  20. Insight from Multiple Alignment Information from multiple sequence alignment (MSA) can help to predict the probability of positions i,j to be base-paired. G C C U U C G G G C G A C U U C G G U C G G C U U C G G C C

  21. Compensatory Substitutions Mutations that maintain the secondary structure U U C G U A A U G C A UCGAC 3’ C G 5’

  22. RNA secondary structure can be revealed by identification of compensatory mutations U C U G C G N N’ G C G C C U U C G G G C G A C U U C G G U C G G C U U C G G C C

  23. Insight from Multiple Alignment Information from multiple sequence alignment (MSA) can help to predict the probability of positions i,j to be base-paired. • Conservation – no additional information • Consistent mutations (GC GU) – support stem • Inconsistent mutations – does not support stem. • Compensatory mutations – support stem.

  24. RNAalifold (Hofacker 2002) From the vienna RNA package Predicts the consensus secondary structure for a set of aligned RNA sequences by using modified dynamic programming algorithm that add alignment information to the standard energy model Improvement in prediction accuracy

  25. Other related programs Sean Eddy’s Lab WU http://www.genetics.wustl.edu/eddy • COVE RNA structure analysis using the covariance model (implementation of the stochastic free grammar method) • QRNA (Rivas and Eddy 2001) Searching for conserved RNA structures • tRNAscan-SEtRNA detection in genome sequences

  26. RNA families • Rfam : General non-coding RNA database (most of the data is taken from specific databases) http://www.sanger.ac.uk/Software/Rfam/ Includes many families of non coding RNAs and functional motifs, as well as their alignment and their secondary structures

  27. Rfam /Pfam • Pfam uses the HMMER (based on Hidden Markov Models) • Rfam uses the INFERNAL (based on Covariation Model)

  28. Rfam (currently version 7.0) • 503 different RNA families or functional Motifs from mRNA, UTRs etc. • View and download multiple sequence alignments • Read family annotation • Examine species distribution of family members • Follow links to otherdatabases

  29. An example of an RNA family miR-1 MicroRNAs mir-1 microRNA precursor family This family represents the microRNA (miRNA) mir-1 family. miRNAs are transcribed as ~70nt precursors (modelled here) and subsequently processed by the Dicer enzyme to give a ~22nt product. The products are thought to have regulatory roles through complementarity to mRNA.

  30. Seed alignment (based on 7 sequences)

  31. BACK TO PROTEINS

  32. Predicting Protein function • Expression data • Protein Structure

  33. 2.0 0 -2.0 Microarray data for yeast genes other RNA processing splicing export splicing transcription decay wt

  34. Using SVMs to predict function based on expression data Each dot represents a vector of the expression pattern taken from a microarray experiment . For example the expression pattern of all genes coding for proteins involved in splicing Splicing factors others

  35. kernel How do SVM’s work with expression data? In this example blue dots can be proteins involved in splicing and red are all the rest The SVM is trained on experimentally verified data

  36. How do SVM’s work with expression data? In this example blue dots can be proteins involved in splicing and red are all the rest ? After training the SVM we can use it to predict hypothetical genes based on their expression pattern

  37. Predicting function from structure Structural Genomics: a large scale structure determination project designed to cover all representative protein structures ATP binding domain of protein MJ0577 Zarembinski, et al., Proc.Nat.Acad.Sci.USA, 99:15189 (1998)

  38. Wanted ! Automated methodsto predict function from the protein structures resulting from the structural genomic project. As a result of the Structure Genomic initiative many structures of proteins with unknown function will be solved

  39. Approaches for predicting function from structure ConSurf - Mapping the evolution conservation on the protein structure http://consurf.tau.ac.il/

  40. Approaches for predicting function from structure PHPlus – Identifying positive electrostatic patches on the protein structure http://pfp.technion.ac.il/

  41. Approaches for predicting function from structure SHARP2 – Identifying positive electrostatic patches on the protein structure http://www.bioinformatics.sussex.ac.uk/SHARP2

  42. ALL TOGETHER….

More Related