Double String Tandem Repeats

Authors Amihood Amir, Ayelet Butman, Gad M. Landau, Shoshana Marcus, Dina Sokol



PDF
Thumbnail PDF

File

LIPIcs.CPM.2020.3.pdf
  • Filesize: 1.19 MB
  • 13 pages

Document Identifiers

Author Details

Amihood Amir
  • Department of Computer Science, Bar Ilan University, Ramat-Gan, 52900, Israel
Ayelet Butman
  • Department of Computer Science, Holon Institude of Technology, Golomb St 52, Holon, 5810201, Israel
Gad M. Landau
  • Department of Computer Science, University of Haifa, Haifa 31905, Israel
  • NYU Tandon School of Engineering, New York University, Six MetroTech Center, Brooklyn, NY 11201, USA
Shoshana Marcus
  • Department of Mathematics and Computer Science, Kingsborough Community College of the City University of New York, 2001 Oriental Boulevard, Brooklyn, NY 11235, USA
Dina Sokol
  • Department of Computer and Information Science, Brooklyn College and The Graduate Center, City University of New York, Brooklyn, NY, USA

Cite As Get BibTex

Amihood Amir, Ayelet Butman, Gad M. Landau, Shoshana Marcus, and Dina Sokol. Double String Tandem Repeats. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 3:1-3:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.CPM.2020.3

Abstract

A tandem repeat is an occurrence of two adjacent identical substrings. In this paper, we introduce the notion of a double string, which consists of two parallel strings, and we study the problem of locating all tandem repeats in a double string. The problem introduced here has applications beyond actual double strings, as we illustrate by solving two different problems with the algorithm of the double string tandem repeats problem. The first problem is that of finding all corner-sharing tandems in a 2-dimensional text, defined by Apostolico and Brimkov. The second problem is that of finding all scaled tandem repeats in a 1d text, where a scaled tandem repeat is defined as a string UU' such that U' is discrete scale of U. In addition to the algorithms for exact tandem repeats, we also present algorithms that solve the problem in the inexact sense, allowing up to k mismatches. We believe that this framework will open a new perspective for other problems in the future.

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms
  • Theory of computation → Pattern matching
Keywords
  • double string
  • tandem repeat
  • 2-dimensional
  • scale

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amihood Amir, Ayelet Butman, and Moshe Lewenstein. Real scaled matching. Information Processing Letters, 70(4):185-190, 1999. URL: https://doi.org/10.1016/S0020-0190(99)00060-5.
  2. Alberto Apostolico and Valentin E. Brimkov. Optimal discovery of repetitions in 2d. Discrete Applied Mathematics, 151(1-3):5-20, 2005. URL: https://doi.org/10.1016/j.dam.2005.02.019.
  3. Ayelet Butman, Revital Eres, and Gad M. Landau. Scaled and permuted string matching. Information processing letters, 92(6):293-297, 2004. URL: https://doi.org/10.1016/j.ipl.2004.09.002.
  4. Maxime Crochemore, Lucian Ilie, and Wojciech Rytter. Repetitions in strings: Algorithms and combinatorics. Theoretical Computer Science, 410(50):5227-5235, 2009. Mathematical Foundations of Computer Science (MFCS 2007). URL: https://doi.org/10.1016/j.tcs.2009.08.024.
  5. Tali Eilam-Tsoreff and Uzi Vishkin. Matching patterns in strings subject to multilinear trasformations. Theoretical Computer Science, 60(3):231-254, 1988. URL: https://doi.org/10.1016/0304-3975(88)90112-0.
  6. Zvi Galil and Raffaele Giancarlo. Improved string matching with k mismatches. SIGACT News, 17(4):52-54, 1986. Google Scholar
  7. Sara H. Geizhals and Dina Sokol. Finding maximal 2-dimensional palindromes. Information and Computation, 266:161-172, 2019. URL: https://doi.org/10.1016/j.ic.2019.03.001.
  8. David Harel and Robert E. Tarjan. Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing, 13(2):338-355, 1984. URL: https://doi.org/10.1137/0213024.
  9. Costas S. Iliopoulos, Dennis Moore, and W.F. Smyth. A characterization of the squares in a fibonacci string. Theoretical Computer Science, 172(1):281-291, 1997. URL: https://doi.org/10.1016/S0304-3975(96)00141-7.
  10. Donald E. Knuth, James H. Morris Jr., and Vaughan R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323-350, 1977. URL: https://doi.org/10.1137/0206024.
  11. Roman M. Kolpakov and Gregory Kucherov. Finding maximal repetitions in a word in linear time. In 40th Annual Symposium on Foundations of Computer Science, FOCS '99, 17-18 October, 1999, New York, NY, USA, pages 596-604. IEEE Computer Society, 1999. URL: https://doi.org/10.1109/SFFCS.1999.814634.
  12. Gad M. Landau, Jeanette P. Schmidt, and Dina Sokol. An algorithm for approximate tandem repeats. Journal of Computational Biology, 8:1-18, 2001. URL: https://doi.org/10.1089/106652701300099038.
  13. Gad M. Landau and Uzi Vishkin. Fast string matching with k differences. Journal of Computer and System Sciences, 37(1):63-78, 1988. URL: https://doi.org/10.1016/0022-0000(88)90045-1.
  14. Gad M. Landau and Uzi Vishkin. Fast parallel and serial approximate string matching. Journal of Algorithms, 10(2):157-169, 1989. URL: https://doi.org/10.1016/0196-6774(89)90010-2.
  15. J.J. Liu, G.S. Huang, and Y.L. Wang. A fast algorithm for finding the positions of all squares in a run-length encoded string. Theoretical Computer Science, 410(38):3942-3948, 2009. URL: https://doi.org/10.1016/j.tcs.2009.05.032.
  16. Michael G. Main and Richard J. Lorentz. An O(n log n) algorithm for finding all repetitions in a string. Journal of Algorithms, 5(3):422-432, 1984. URL: https://doi.org/10.1016/0196-6774(84)90021-X.
  17. Esko Ukkonen. On-line construction of suffix trees. Algorithmica, 14(3):249-260, 1995. URL: https://doi.org/10.1007/BF01206331.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail