Modeling RNA motifs by graph-grammars François.Major@UMontreal.CA


Presentation Transcript


  1. Modeling RNA motifs by graph-grammars. François.Major@UMontreal.CA, www.iric.ca

  2. MC-Tools: Functions
     • ( MC-Annotate 3-D ) -> graph
     • ( MC-Cycles graph ) -> [ NCM ]
     • ( MC-Seq graph ) -> [ sequence ]
     • ( MC-Fold sequence ) -> [ graph ]
     • ( MC-Cons [ ( sequence, [ graph ] ) ] ) -> [ graph ]
     • ( MC-Search ( graph, [ 3-D ] ) ) -> [ 3-D ]
     • ( MC-Sym graph ) -> [ 3-D ]

  3. MC-Tools: Objects (rat 28S rRNA sarcin/ricin stem-loop)
     Sequence: GGGUGCUCAGUACGAGAGGAACCGCACCC
     [Figure panels: Graph; Nucleotide cyclic motifs; 3-D structure. Figure labels: ( MC-Fold sequence ) -> [ graph ]; ( MC-Sym graph ) -> [ 3-D ].]
     Szewczak et al., PNAS (USA) 1993; Lemieux & Major, NAR 2006; Parisien, Thibault & Major (in prep.)

  4. Graph: ( MC-Annotate 3-D ) -> graph. Gendron, Lemieux & Major, JMB 2001; Lemieux & Major, NAR 2002; Leontis & Westhof, RNA 2001

  5. Shortest Cycle Basis: ( MC-Cycles graph ) -> [ NCM ]. [Figure labels: X1 to X4, Y1 to Y3, C1 to C5, 5′, 3′.] Horton, SIAM J Comp 1987; St-Onge et al., NAR 2007
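The shortest-cycle-basis step can be illustrated with a small sketch. This is not the MC-Cycles implementation. It is a simplified greedy procedure in the spirit of Horton's algorithm: candidate cycles come from breadth-first-search trees, and the shortest linearly independent ones (over GF(2)) are kept. It makes no claim of minimality when several shortest paths tie. The function names and the toy hairpin graph (backbone edges plus two base pairs) are assumptions made for illustration only.

```python
from collections import deque


def bfs_parents(adj, root):
    """Return the BFS tree rooted at `root` as a child -> parent map."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adj[node]):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return parent


def path_to_root(parent, node):
    """Edges on the tree path from `node` up to the root."""
    edges = set()
    while parent[node] is not None:
        edges.add(frozenset((node, parent[node])))
        node = parent[node]
    return edges


def short_cycle_basis(edges):
    """Greedily pick short cycles that are independent over GF(2)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # Candidate cycles: tree path to a + tree path to b + the edge (a, b).
    candidates = set()
    for root in adj:
        parent = bfs_parents(adj, root)
        for a, b in edges:
            if a in parent and b in parent:
                cycle = (path_to_root(parent, a) ^ path_to_root(parent, b)
                         ^ {frozenset((a, b))})
                if cycle:  # empty when (a, b) is itself a tree edge
                    candidates.add(frozenset(cycle))

    # Encode each cycle as a bit vector over the edges and keep a cycle
    # only if Gaussian elimination does not reduce it to zero.
    bit = {frozenset(e): i for i, e in enumerate(edges)}
    pivots = {}
    basis = []
    ordered = sorted(candidates,
                     key=lambda c: (len(c), sorted(sorted(e) for e in c)))
    for cycle in ordered:
        vector = sum(1 << bit[e] for e in cycle)
        while vector:
            top = vector.bit_length() - 1
            if top not in pivots:
                pivots[top] = vector
                basis.append(sorted(set().union(*cycle)))
                break
            vector ^= pivots[top]
    return basis


# Hypothetical hairpin: nucleotides 0..5, backbone plus two base pairs.
backbone = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
pairs = [(0, 5), (1, 4)]
cycles = short_cycle_basis(backbone + pairs)
```

On this toy graph the two four-nucleotide cycles are kept and the six-nucleotide outer cycle is rejected as their sum. Adjacent cycles share a base pair, which matches the way the NCMs are described as joined through a common base pair.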

  6. The Nucleotide Cyclic Motifs (NCM)
     • Encompass all base-pairing types without distinction (Watson-Crick and others)
     • Designate precisely how each nucleotide in the sequence relates to the others
     • Are joined through a common base pair (context). This helps us predict coherent chains of NCMs and project them in 3-D. Tentative definition of a motif: an “ordered” chain of NCMs.
     • Recur within and across all RNAs
     • Are short (< 10 nts; most have 3 to 5 nts)
     • Compose the classical motifs (cf. GNRA tetraloop, sarcin/ricin motif, etc.). There are exceptions (cf. AA platform).
     Lemieux & Major (2006) NAR 34:2340; Parisien, Thibault & Major (in prep.)

  7. Aim: We want a computational model that can encode the valid sequences and structural features of RNA motifs. Hypothesis: A relation between the sequence and the structure of RNA motifs exists.

  8. Graph Grammars
     • A graph grammar is to a set of graphs what a formal generative grammar is to a set of strings, i.e. a precise and formal description of that set.
     • A graph grammar consists of a set of rules, or productions, for transforming graphs.
     • Formally, a graph grammar, H = {N, Σ, P}, consists of a set of non-terminal symbols, N, a set of terminal symbols, Σ, and a set of production rules, P.
     Hypothesis: NCMs are “independent” building blocks.
     Nagl, Computing 1976; Nagl, in H. Ehrig et al., eds, 1987; St-Onge et al., NAR 2007

  9. Yeast tRNA; 23S H. marismortui; 16S E. coli ⇒ Sarcin/Ricin Graph Grammar ⇒
     • N = {C1, C2, …, C5}, the set of NCMs
     • Σ = {S1, S2, …, S5}, the sets of sequences for each NCM
     • P, the production rules: a set of consistent assignments of the sequences in Σ to the NCMs in N
     St-Onge et al., NAR 2007

  10. Sarcin/Ricin Building Blocks [Figure: nucleotide labels of the building blocks]
      • C1: Theoretical: 256 (16 x 16); IMs: 120 (10 x 12); PDB: 7
      • C2: Theoretical: 64 (16 x 4); IMs: 40 (10 x 4); PDB: 5
      • C3: Theoretical: 64 (16 x 4); IMs: 56 (14 x 4); PDB: 2
      • C4: Theoretical: 256 (16 x 16); IMs: 160 (16 x 10); PDB: 3
      • C5: Theoretical: 64 (16 x 4); IMs: 40 (10 x 4); PDB: 8
      • (unlabeled): Theoretical: 16; IMs: 10; PDB: 15
      St-Onge et al., NAR 2007
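The products quoted for each building block can be checked with a few lines of arithmetic. The sketch below only re-multiplies the factors given on the slide. Two readings in it are assumptions and are not stated in the source: that 16 counts the 4 x 4 nucleotide identities of one base pair, and that 4 counts a single unpaired nucleotide. The names `slide_counts`, `products_match` and `retained_fraction` are hypothetical.

```python
NUCLEOTIDES = "ACGU"

# Assumption: 16 = ordered nucleotide identities for one base pair.
pair_identities = len(NUCLEOTIDES) ** 2

# (first factor, second factor, product) exactly as quoted on the slide.
slide_counts = {
    "C1": {"theoretical": (16, 16, 256), "IMs": (10, 12, 120)},
    "C2": {"theoretical": (16, 4, 64), "IMs": (10, 4, 40)},
    "C3": {"theoretical": (16, 4, 64), "IMs": (14, 4, 56)},
    "C4": {"theoretical": (16, 16, 256), "IMs": (16, 10, 160)},
    "C5": {"theoretical": (16, 4, 64), "IMs": (10, 4, 40)},
}


def products_match(counts):
    """True when every quoted product equals the product of its factors."""
    return all(a * b == total
               for row in counts.values()
               for a, b, total in row.values())


def retained_fraction(counts, ncm):
    """Share of the theoretical sequences that the IMs figure retains."""
    return counts[ncm]["IMs"][2] / counts[ncm]["theoretical"][2]


all_consistent = products_match(slide_counts)
c3_fraction = retained_fraction(slide_counts, "C3")
```

Read this way, the IMs column narrows each building block to a fraction of its theoretical sequence space, and the PDB column is much smaller again.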

  11. ( MC-Seq sarcin-ricin-graph ) -> [ sequence ]
      Sequences supported by the NCMs in the PDB: AGUA-GAA, AGUA-AAA, GGUA-GAA, GGUA-AAA.
      If we remove the instances of the sarcin/ricin motif found by ( MC-Search ( sarcin-ricin-graph, [ PDB ] ) ) -> [ 3-D ], the same four sequences are still supported, so the NCMs are also found outside the sarcin/ricin context.
      Larose et al. (in prep.); St-Onge et al., NAR 2007
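The idea of assigning sequences to building blocks consistently can be sketched as a small enumeration. This is not MC-Seq. The three blocks, their positions and their allowed sequences are invented for illustration and are not the real sarcin/ricin NCMs or PDB data. Each block lists the motif positions it covers and the sequences it accepts. A full sequence is produced only when every pair of overlapping blocks agrees on the positions they share.

```python
from itertools import product


def enumerate_sequences(blocks, length):
    """Return all full sequences on which every block's choice agrees."""
    results = set()
    for choice in product(*(allowed for _, allowed in blocks)):
        assignment = {}
        consistent = True
        for (positions, _), seq in zip(blocks, choice):
            for pos, nt in zip(positions, seq):
                # setdefault records the first nucleotide seen at `pos`;
                # a different one from a later block is a conflict.
                if assignment.setdefault(pos, nt) != nt:
                    consistent = False
                    break
            if not consistent:
                break
        if consistent and len(assignment) == length:
            results.add("".join(assignment[i] for i in range(length)))
    return sorted(results)


# Hypothetical blocks over a 7-position motif (strand 1: 0-3, strand 2: 4-6).
toy_blocks = [
    ((0, 1, 6), ["AGA", "GGA"]),
    ((1, 2, 5, 6), ["GUAA", "CUAA"]),   # "CUAA" conflicts with the first block
    ((2, 3, 4, 5), ["UAGA", "UAAA"]),
]

supported = [f"{s[:4]}-{s[4:]}" for s in enumerate_sequences(toy_blocks, 7)]
```

With these made-up blocks, `supported` contains the four sequences listed on the slide. The choice "CUAA" is discarded because it disagrees with its neighbour on a shared position, which is the role the slides give to the common base pair (context).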

  12. Graph Grammar Parsing: 806 sequences aligned according to the E. coli 23S rRNA structure; site 204-207 / 189-191. Westhof (personal comm.); St-Onge et al., NAR 2007

  13. Validation (MC-Seq vs. PDB vs. Alignment)
      [Figure labels, in transcript order: Isostericity matrices; MC-Seq; PDB]
      GGUA-AAA, AGUA-AAA, AGUA-GAA, GGUA-GAA
      10 000 sequences
      AAUA-AAA, AAUA-GAA, ACUA-AAA, ACUA-GAA, ACUA-GAC, AGUA-AAC, AGUA-CAA, AGUA-GAC, AGUA-GAU, AGUA-GCC, AGUA-GGG, AGUA-GUG, AGUC-GAA, AUUA-GAA, CGUA-GAA, GAUA-GAA, GGUA-GAU, GUUA-GAA, UGUA-GAA, UGUA-GAC
      Alignment: 5S, 16S, 23S
      St-Onge et al., NAR 2007

  14. Perspectives
      • We want to develop a version of MC-Seq that would be useful during the alignment process.
      • PDB does not seem to contain enough structural information yet.
      • To avoid too many sequences, the NCMs (context) are necessary.
      • Two more things need to be considered…

  15. Sarcin/Ricin (Sequence/Structure Space Is Not Simple). St-Onge et al. (in prep.)

  16. Modeling In 3-D Might Be Necessary. MC-Fold: CAUU-AAG (2.1 Å); Alignment: AUUA-GAA (0.9 Å). St-Onge et al., NAR 2007

  17. Acknowledgments
      Martin Larose (Res. assistant); Philippe Thibault (Res. assistant); Patrick Gendron (Res. assistant); Romain Rivière (Postdoc, CS); Véronique Lisi (Ph.D. Molecular Biology); Marc Parisien (Ph.D. Computer Science); Emmanuelle Permal (Ph.D. Bioinformatics); Karine St-Onge (Ph.D. Computer Science); Louis-Philippe Lavoie (M.Sc. Bioinformatics); Maxime Caron (M.Sc. Bioinformatics); Caroline Louis-Jeune (M.Sc. Bioinformatics)
      Montréal: Pascal Chartrand, Gerardo Ferbeyre, Sylvie Hamel, Sébastien Lemieux, Pascale Legault, Luc Desgroseillers, Kathy Borden, Daniel Lamarre
      Éric Westhof (Strasbourg); Alain Denise (Paris); Dave Mathews (Rochester)
