RNA Inverse Folding Can Be Solved in Linear Time for Structures Without Isolated Stacks or Base Pairs

Authors: Théo Boury, Laurent Bulteau, Yann Ponty



File

LIPIcs.WABI.2024.19.pdf
  • Filesize: 1.33 MB
  • 23 pages

Author Details

Théo Boury
  • Laboratoire d’Informatique de l’Ecole Polytechnique (LIX, UMR 7161), Institut Polytechnique de Paris, France
Laurent Bulteau
  • LIGM, CNRS, Université Gustave Eiffel, France
Yann Ponty
  • Laboratoire d’Informatique de l’Ecole Polytechnique (LIX, UMR 7161), Institut Polytechnique de Paris, France

Cite As

Théo Boury, Laurent Bulteau, and Yann Ponty. RNA Inverse Folding Can Be Solved in Linear Time for Structures Without Isolated Stacks or Base Pairs. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 19:1-19:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.WABI.2024.19

Abstract

Inverse folding is a classic instance of negative RNA design, which consists in finding a sequence that uniquely folds into a target secondary structure with respect to energy minimization. A breakthrough result of Bonnet et al. shows that, even in simple base pair-based (BP) models, the decision version of a mildly constrained version of inverse folding is NP-hard. In this work, we show that inverse folding can be solved in linear time for a large collection of targets, including every structure that contains no isolated BP and no isolated stack (or, equivalently, when all helices consist of 3^{+} base pairs). For structures featuring shorter helices, our linear algorithm is no longer guaranteed to produce a solution, but still does so for a large proportion of instances. Our approach introduces a notion of modulo m-separability, generalizing a property pioneered by Hales et al. Separability is a sufficient condition for the existence of a solution to the inverse folding problem. We show that, for any input secondary structure of length n, a modulo m-separated sequence can be produced in time 𝒪(n 2^m) whenever such a sequence exists. Meanwhile, we show that any structure whose helices all consist of 3^{+} base pairs is either trivially non-designable, or always admits a modulo-2 separated solution (m = 2). Solution sequences can thus be produced in linear time, and even be uniformly generated within the set of modulo-2 separable sequences.
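
The structural condition in the abstract (no isolated base pair and no isolated stack, i.e. every helix has at least 3 base pairs) can be checked directly on a target structure. The Python sketch below is an illustration only and is not taken from the authors' LinearBPDesign software. It assumes the target is given in dot-bracket notation without pseudoknots, and it assumes a helix is a maximal run of directly stacked pairs (i, j), (i+1, j-1), ... with no bulge or interior loop; the function names `parse_dot_bracket`, `helix_lengths` and `has_only_long_helices` are hypothetical.

```python
def parse_dot_bracket(structure):
    """Return a dict mapping every paired position to its partner (0-based)."""
    stack, partner = [], {}
    for pos, char in enumerate(structure):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            opening = stack.pop()
            partner[opening] = pos
            partner[pos] = opening
        elif char != ".":
            raise ValueError("unexpected character %r" % char)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return partner


def helix_lengths(structure):
    """List the number of base pairs in each helix, outermost pair first.

    Assumption: a helix is a maximal run of pairs (i, j), (i+1, j-1), ...
    """
    partner = parse_dot_bracket(structure)
    lengths = []
    for i in sorted(partner):
        j = partner[i]
        if i > j:
            continue  # visit each pair once, from its opening position
        if partner.get(i - 1) == j + 1:
            continue  # stacked on an enclosing pair, so not a helix start
        length = 1
        # Extend inward while the next inner positions pair with each other.
        while i + length < j - length and partner.get(i + length) == j - length:
            length += 1
        lengths.append(length)
    return lengths


def has_only_long_helices(structure, min_len=3):
    """True if no helix is shorter than min_len base pairs."""
    return all(length >= min_len for length in helix_lengths(structure))


if __name__ == "__main__":
    for target in ["(((...)))", "((..((...))..))", "(.(((...))))"]:
        print(target, helix_lengths(target), has_only_long_helices(target))
```

Under these assumptions, a target for which `has_only_long_helices` returns `True` falls in the class that the abstract says can be handled in linear time; a `False` result only means the guarantee stated in the abstract does not apply, not that the structure is undesignable.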

Subject Classification

ACM Subject Classification
  • Applied computing → Molecular structural biology
Keywords
  • RNA structure
  • String Design
  • Parameterized Complexity
  • Uniform Sampling

References

  1. Mirela Andronescu, Anthony P. Fejes, Frank Hutter, Holger H. Hoos, and Anne Condon. A new algorithm for RNA secondary structure design. Journal of Molecular Biology, 336(3):607-624, 2004. URL: https://doi.org/10.1016/j.jmb.2003.12.041.
  2. Édouard Bonnet, Paweł Rzążewski, and Florian Sikora. Designing RNA secondary structures is hard. Journal of Computational Biology, 27(3):302-316, 2020. PMID:32160034. URL: https://doi.org/10.1089/cmb.2019.0420.
  3. Théo Boury, Laurent Bulteau, and Yann Ponty. LinearBPDesign. Software, version 1.0, swhId: https://archive.softwareheritage.org/swh:1:dir:73673b14e891528ae11d29515662b482f730be12;origin=https://gitlab.inria.fr/amibio/linearbpdesign;visit=swh:1:snp:c8ad7229d32bb5e86b05dda530f3280ae4d87608;anchor=swh:1:rev:c4ba4998d0790a1fc14115c33d500a7e22e5fe9b (visited on 2024-08-19). URL: https://gitlab.inria.fr/amibio/linearbpdesign.
  4. Anke Busch and Rolf Backofen. INFO-RNA-a fast approach to inverse RNA folding. Bioinformatics, 22(15):1823-31, 2006.
  5. Ali Esmaili-Taheri and Mohammad Ganjtabesh. ERD: a fast and reliable tool for RNA design including constraints. BMC Bioinform., 16:20:1-20:11, 2015.
  6. Juan Antonio Garcia-Martin, Ivan Dotu, and Peter Clote. RNAiFold 2.0: a web server and software to design custom and Rfam-based RNA molecules. Nucleic Acids Research, 43(W1):W513-W521, May 2015. URL: https://doi.org/10.1093/nar/gkv460.
  7. Jozef Hales, Alice Héliou, Ján Manuch, Yann Ponty, and Ladislav Stacho. Combinatorial RNA design: Designability and structure-approximating algorithm in Watson-Crick and Nussinov-Jacobson energy models. Algorithmica, 79(3):835-856, 2017.
  8. Stefan Hammer, Wei Wang, Sebastian Will, and Yann Ponty. Fixed-parameter tractable sampling for RNA design with multiple target structures. BMC Bioinformatics, 20:209, April 2019. URL: https://doi.org/10.1186/s12859-019-2784-7.
  9. Ivo L Hofacker, Walter Fontana, Peter F Stadler, L Sebastian Bonhoeffer, Manfred Tacker, and Peter Schuster. Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie/Chemical Monthly, 125(2):167-188, 1994.
  10. Robert Kleinkauf, Martin Mann, and Rolf Backofen. antaRNA: ant colony-based RNA sequence design. Bioinformatics, 31(19):3114-3121, May 2015. URL: https://doi.org/10.1093/bioinformatics/btv319.
  11. William Andrew Lorenz and Yann Ponty. Non-redundant random generation algorithms for weighted context-free grammars. Theoretical Computer Science, 502:177-194, 2013. Generation of Combinatorial Structures. URL: https://doi.org/10.1016/j.tcs.2013.01.006.
  12. Rune B. Lyngsø, James W. J. Anderson, Elena Sizikova, Amarendra Badugu, Tomas Hyland, and Jotun Hein. Frnakenstein: multiple target inverse RNA folding. BMC Bioinform., 13:260, 2012.
  13. Nono S. C. Merleau and Matteo Smerlak. aRNAque: an evolutionary algorithm for inverse pseudoknotted RNA folding inspired by Lévy flights. BMC Bioinform., 23(1):335, 2022.
  14. R Nussinov and A B Jacobson. Fast algorithm for predicting the secondary structure of single-stranded RNA. Proceedings of the National Academy of Sciences, 77(11):6309-6313, 1980. URL: https://doi.org/10.1073/pnas.77.11.6309.
  15. Yann Ponty. Ensemble Algorithms and Analytic Combinatorics in RNA Bioinformatics and Beyond. Habilitation à diriger des recherches, Université Paris-Saclay, May 2020. URL: https://theses.hal.science/tel-03219977.
  16. Vladimir Reinharz, Yann Ponty, and Jérôme Waldispühl. A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotide distribution. Bioinformatics, 29(13):i308-i315, June 2013. URL: https://doi.org/10.1093/bioinformatics/btt217.
  17. Matan Drory Retwitzer, Vladimir Reinharz, Alexander Churkin, Yann Ponty, Jérôme Waldispühl, and Danny Barash. incaRNAfbinv 2.0: a webserver and software with motif control for fragment-based design of RNAs. Bioinformatics, 36(9):2920-2922, January 2020. URL: https://doi.org/10.1093/bioinformatics/btaa039.
  18. Frederic Runge, Danny Stoll, Stefan Falkner, and Frank Hutter. Learning to design RNA. In Proceedings of ICLR 2019, 2019.
  19. Michael Schnall-Levin, Leonid Chindelevitch, and Bonnie Berger. Inverting the Viterbi algorithm: an abstract framework for structure design. In ICML, volume 307 of ACM International Conference Proceeding Series, pages 904-911. ACM, 2008.
  20. Douglas H. Turner and David H. Mathews. NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure. Nucleic Acids Research, 38(suppl_1):D280-D282, October 2009. URL: https://doi.org/10.1093/nar/gkp892.
  21. Hua-Ting Yao, Cedric Chauve, Mireille Regnier, and Yann Ponty. Exponentially few RNA structures are designable. In ACM-BCB 2019 - 10th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, pages 289-298, Niagara-Falls, United States, September 2019. ACM Press. URL: https://doi.org/10.1145/3307339.3342163.
  22. Hua-Ting Yao, Jérôme Waldispühl, Yann Ponty, and Sebastian Will. Taming Disruptive Base Pairs to Reconcile Positive and Negative Structural Design of RNA. In Proc. of the 25th Annual International Conferences on Computational Molecular Biology (RECOMB'21), 2021. URL: https://inria.hal.science/hal-02987566.
  23. Joseph N. Zadeh, Brian R. Wolfe, and Niles A. Pierce. Nucleic acid sequence design via efficient ensemble defect optimization. Journal of Computational Chemistry, 32(3):439-452, 2011. URL: https://doi.org/10.1002/jcc.21633.