Hardness of RNA Folding Problem With Four Symbols

Author Yi-Jun Chang



PDF
Thumbnail PDF

File

LIPIcs.CPM.2016.13.pdf
  • Filesize: 1.04 MB
  • 12 pages

Document Identifiers

Author Details

Yi-Jun Chang

Cite As Get BibTex

Yi-Jun Chang. Hardness of RNA Folding Problem With Four Symbols. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 13:1-13:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016) https://doi.org/10.4230/LIPIcs.CPM.2016.13

Abstract

An RNA sequence is a string composed of four types of nucleotides, A, C, G, and U. Given an RNA sequence, the goal of the RNA folding problem is to find a maximum cardinality set of crossing-free pairs of the form {A,U} or {C,G}. The problem is central in bioinformatics and has received much attention over the years. Whether the RNA folding problem can be solved in O(n^{3-epsilon}) time remains an open problem. Recently, Abboud, Backurs, and Williams (FOCS'15) made the first progress by showing a conditional lower bound for a generalized version of the RNA folding problem based on a conjectured hardness of the $k$-clique problem. However, their proof requires alphabet size >= 36 to work, making the result biologically irrelevant. In this paper, by constructing the gadgets using a lemma of Bringmann and Künnemann (FOCS'15) and surrounding them with some carefully designed sequences, we improve upon the framework of Abboud et al. to handle the case of alphabet size 4, yielding a conditional lower bound for the RNA folding problem. We also investigate the Dyck edit distance problem. We demonstrate a reduction from RNA folding problem to Dyck edit distance problem of alphabet size 10, establishing a connection between the two fundamental string problems. This leads to a much simpler proof of the conditional lower bound for Dyck edit distance problem given by Abboud et al. and lowers the required alphabet size for the lower bound to work.

Subject Classification

Keywords
  • RNA folding
  • Dyck edit distance
  • longest common subsequence
  • conditional lower bound
  • clique

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. If the current clique algorithms are optimal, so is Valiant’s parser. In Proceedings of 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 98-117, 2015. Google Scholar
  2. Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In Proceedings of 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 59-78, 2015. Google Scholar
  3. Tatsuya Akutsu. Approximation and exact algorithms for RNA secondary structure prediction and recognition of stochastic context-free languages. Journal of Combinatorial Optimization, 3(2):321-336, 1999. Google Scholar
  4. Amihood Amir, Timothy M. Chan, Moshe Lewenstein, and Noa Lewenstein. On hardness of jumbled indexing. In Proceedings of 41st International Colloquium Automata, Languages, and Programming (ICALP), pages 114-125, 2014. Google Scholar
  5. Amihood Amir and Gad M. Landau. Fast parallel and serial multidimensional approximate array matching. Theoretical Computer Science, 81(1):97-115, 1991. Google Scholar
  6. Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In Proceedings of 47th Annual ACM Symposium on Theory of Computing (STOC), pages 51-58, 2015. Google Scholar
  7. Karl Bringmann and Marvin Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In Proceedings of 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 79-97, 2015. Google Scholar
  8. Timothy M. Chan and Moshe Lewenstein. Clustered integer 3SUM via additive combinatorics. In Proceedings of 47th Annual ACM Symposium on Theory of Computing (STOC), pages 31-40, 2015. Google Scholar
  9. Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme J. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998. Google Scholar
  10. Friedrich Eisenbrand and Fabrizio Grandoni. On the complexity of fixed parameter clique and dominating set. Theoretical Computer Science, 326(1):57-67, 2004. Google Scholar
  11. Yelena Frid and Dan Gusfield. A simple, practical and complete 𝒪(\fracn³log n)-time algorithm for RNA folding using the Four-Russians speedup. Algorithms for Molecular Biology, 5(1):1-8, 2010. Google Scholar
  12. Tamar Pinhas, Dekel Tsur, Shay Zakov, and Michal Ziv-Ukelson. Edit distance with duplications and contractions revisited. In Proceedings of 22nd Annual Symposium on Combinatorial Pattern Matching (CPM), pages 441-454. Springer Berlin Heidelberg, 2011. Google Scholar
  13. Tamar Pinhas, Shay Zakov, Dekel Tsur, and Michal Ziv-Ukelson. Efficient edit distance with duplications and contractions. Algorithms for Molecular Biology, 8(1):1-28, 2013. Google Scholar
  14. Mihai Pătraşcu and Ryan Williams. On the possibility of faster SAT algorithms. In Proceedings of 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1065-1075, 2010. Google Scholar
  15. Liam Roditty and Virginia Vassilevska Williams. Fast approximation algorithms for the diameter and radius of sparse graphs. In Proceedings of 45th ACM Symposium on Theory of Computing (STOC), pages 515-524, 2013. Google Scholar
  16. Barna Saha. The Dyck language edit distance problem in near-linear time. In Proceedings of 55th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 611-620, 2014. Google Scholar
  17. Barna Saha. Language edit distance and maximum likelihood parsing of stochastic grammars: Faster algorithms and connection to fundamental graph problems. In Proceedings of 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 118-135, 2015. Google Scholar
  18. Yinglei Song. Time and space efficient algorithms for RNA folding with the Four-Russians technique. Technical Report arXiv:1503.05670, 2015. Google Scholar
  19. Leslie G. Valiant. General context-free recognition in less than cubic time. Journal of Computer and System Sciences, 10(2):308-315, 1975. Google Scholar
  20. Virginia Vassilevska. Efficient algorithms for clique problems. Information Processing Letters, 109(4):254-257, 2009. Google Scholar
  21. Balaji Venkatachalam, Dan Gusfield, and Yelena Frid. Faster algorithms for RNA-folding using the Four-Russians method. Algorithms for Molecular Biology, 9(1):1-12, 2014. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail