A Graph-Based Similarity Approach to Classify Recurrent Complex Motifs from Their Context in RNA Structures

Authors: Coline Gianfrotta, Vladimir Reinharz, Dominique Barth, Alain Denise

Author Details

Coline Gianfrotta
  • Université de Versailles Saint-Quentin-en-Yvelines, Université Paris-Saclay, DAVID lab, France
  • Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique, 91400, Orsay, France
Vladimir Reinharz
  • Department of Computer Science, Université du Québec à Montréal, Québec, Canada
Dominique Barth
  • Université de Versailles Saint-Quentin-en-Yvelines, Université Paris-Saclay, DAVID lab, France
Alain Denise
  • Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique, 91400, Orsay, France
  • Université Paris-Saclay, CNRS, I2BC, 91400, Orsay, France

Cite As

Coline Gianfrotta, Vladimir Reinharz, Dominique Barth, and Alain Denise. A Graph-Based Similarity Approach to Classify Recurrent Complex Motifs from Their Context in RNA Structures. In 19th International Symposium on Experimental Algorithms (SEA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 190, pp. 19:1-19:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Abstract

This article proposes using an RNA graph similarity metric, based on the maximum common edge subgraph (MCES) problem, to compare occurrences of specific complex motifs in RNA graphs according to their context, represented as subgraphs. We rely on a new graph model of these contexts at two levels of granularity, and obtain a classification of these graphs that is consistent with the RNA 3D structure. Many of the non-translational functions of RNA, such as those of ribozymes, riboswitches, or the ribosome, require complex structures. These structures are built on a rigid skeleton, a set of canonical interactions called the secondary structure. Decades of experimental and theoretical work have produced precise thermodynamic parameters and efficient algorithms to predict, from sequence, the secondary structure of RNA molecules. On top of this skeleton, the nucleotides form an intricate network of interactions that is not captured by present thermodynamic models. This network has been shown to be composed of modular motifs that are linked to function and have been leveraged for better prediction and design. A peculiar subclass of complex structural motifs consists of those connecting RNA regions that are far apart in the secondary structure. They are crucial to predict because they determine the global shape of the molecule and are therefore important for its function. In this paper, we use our graph approach to show that the context is important for the formation of conserved complex structural motifs. We furthermore show that a natural classification of structural variants of the motifs emerges from their context. We explore three known motif families and exhibit the classification that emerges experimentally for each.
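To make the idea of an MCES-based similarity concrete, the following is a minimal sketch, not the authors' implementation. It assumes small undirected graphs with labeled edges, finds the size of a maximum common edge subgraph by brute force over vertex mappings, and normalizes by the larger edge count. The function names, the edge labels `backbone` and `pair`, and the normalization are illustrative assumptions; the abstract does not specify the exact metric or graph model used in the paper.

```python
from itertools import permutations


def build_graph(edge_list):
    """Turn (u, v, label) triples into a dict keyed by undirected edge.

    Assumes no self-loops and comparable node identifiers.
    """
    return {frozenset((u, v)): label for u, v, label in edge_list}


def mces_size(edges_a, edges_b):
    """Number of edges in a maximum common edge subgraph (brute force).

    Tries every injective mapping from the nodes of the smaller graph
    into the nodes of the larger one and counts edges whose image is an
    edge with the same label. Exponential time: tiny graphs only.
    """
    nodes_a = sorted({n for edge in edges_a for n in edge})
    nodes_b = sorted({n for edge in edges_b for n in edge})
    if len(nodes_a) > len(nodes_b):
        edges_a, edges_b = edges_b, edges_a
        nodes_a, nodes_b = nodes_b, nodes_a

    best = 0
    for image in permutations(nodes_b, len(nodes_a)):
        mapping = dict(zip(nodes_a, image))
        matched = 0
        for edge, label in edges_a.items():
            u, v = tuple(edge)
            if edges_b.get(frozenset((mapping[u], mapping[v]))) == label:
                matched += 1
        best = max(best, matched)
    return best


def mces_similarity(edges_a, edges_b):
    """Illustrative normalization (an assumption): MCES size / larger edge count."""
    largest = max(len(edges_a), len(edges_b))
    if largest == 0:
        return 1.0
    return mces_size(edges_a, edges_b) / largest


# Two hypothetical four-nucleotide contexts that differ in one pairing.
context_a = build_graph([
    (1, 2, "backbone"), (2, 3, "backbone"), (3, 4, "backbone"), (1, 4, "pair"),
])
context_b = build_graph([
    (1, 2, "backbone"), (2, 3, "backbone"), (3, 4, "backbone"), (1, 3, "pair"),
])

print(mces_size(context_a, context_b))        # 3
print(mces_similarity(context_a, context_b))  # 0.75
```

In this toy example the two contexts share three of their four edges, so the similarity is 0.75. Because the MCES problem is computationally hard in general, exhaustive enumeration like this is only usable on very small graphs.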

Subject Classification

ACM Subject Classification
  • Applied computing → Molecular structural biology

Keywords
  • Graph similarity
  • Clustering
  • RNA 3D folding
  • RNA motif



