Quasi-Linear-Time Algorithm for Longest Common Circular Factor

Authors Mai Alzamel , Maxime Crochemore , Costas S. Iliopoulos , Tomasz Kociumaka , Jakub Radoszewski , Wojciech Rytter , Juliusz Straszyński , Tomasz Waleń , Wiktor Zuba



PDF
Thumbnail PDF

File

LIPIcs.CPM.2019.25.pdf
  • Filesize: 0.6 MB
  • 14 pages

Document Identifiers

Author Details

Mai Alzamel
  • Department of Informatics, King’s College London, UK
  • Department of Computer Science, King Saud University, Riyadh, Saudi Arabia
Maxime Crochemore
  • Department of Informatics, King’s College London, UK
  • Laboratoire d\'Informatique Gaspard-Monge, Université Paris-Est, Marne-la-Vallée, France
Costas S. Iliopoulos
  • Department of Informatics, King’s College London, UK
Tomasz Kociumaka
  • Institute of Informatics, University of Warsaw, Poland
  • Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
Jakub Radoszewski
  • Institute of Informatics, University of Warsaw, Poland
Wojciech Rytter
  • Institute of Informatics, University of Warsaw, Poland
Juliusz Straszyński
  • Institute of Informatics, University of Warsaw, Poland
Tomasz Waleń
  • Institute of Informatics, University of Warsaw, Poland
Wiktor Zuba
  • Institute of Informatics, University of Warsaw, Poland

Cite AsGet BibTex

Mai Alzamel, Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba. Quasi-Linear-Time Algorithm for Longest Common Circular Factor. In 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 128, pp. 25:1-25:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.CPM.2019.25

Abstract

We introduce the Longest Common Circular Factor (LCCF) problem in which, given strings S and T of length at most n, we are to compute the longest factor of S whose cyclic shift occurs as a factor of T. It is a new similarity measure, an extension of the classic Longest Common Factor. We show how to solve the LCCF problem in O(n log^4 n) time using O(n log^2 n) space.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • longest common factor
  • circular pattern matching
  • internal pattern matching
  • intersection of hyperrectangles

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amihood Amir, Gad M. Landau, Moshe Lewenstein, and Dina Sokol. Dynamic text and static pattern matching. ACM Transactions on Algorithms, 3(2):19, 2007. URL: http://dx.doi.org/10.1145/1240233.1240242.
  2. Alberto Apostolico, Maxime Crochemore, Martin Farach-Colton, Zvi Galil, and S. Muthukrishnan. 40 years of suffix trees. Communications of the ACM, 59(4):66-73, 2016. URL: http://dx.doi.org/10.1145/2810036.
  3. Tanver Athar, Carl Barton, Widmer Bland, Jia Gao, Costas S. Iliopoulos, Chang Liu, and Solon P. Pissis. Fast circular dictionary-matching algorithm. Mathematical Structures in Computer Science, 27(2):143-156, 2017. URL: http://dx.doi.org/10.1017/S0960129515000134.
  4. Md. Aashikur Rahman Azim, Costas S. Iliopoulos, Mohammad Sohel Rahman, and M. Samiruzzaman. A fast and lightweight filter-based algorithm for circular pattern matching. In Pierre Baldi and Wei Wang, editors, 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2014, pages 621-622. ACM, 2014. URL: http://dx.doi.org/10.1145/2649387.2660804.
  5. Hideo Bannai, Tomohiro I, Shunsuke Inenaga, Yuto Nakashima, Masayuki Takeda, and Kazuya Tsuruta. The "Runs" Theorem. SIAM Journal on Computing, 46(5):1501-1514, 2017. URL: http://dx.doi.org/10.1137/15M1011032.
  6. Carl Barton, Costas S. Iliopoulos, and Solon P. Pissis. Fast algorithms for approximate circular string matching. Algorithms for Molecular Biology, 9:9, 2014. URL: http://dx.doi.org/10.1186/1748-7188-9-9.
  7. Carl Barton, Costas S. Iliopoulos, and Solon P. Pissis. Average-Case Optimal Approximate Circular String Matching. In Adrian-Horia Dediu, Enrico Formenti, Carlos Martín-Vide, and Bianca Truthe, editors, Language and Automata Theory and Applications, LATA 2015, volume 8977 of LNCS, pages 85-96. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-319-15579-1_6.
  8. Or Birenzwige, Shay Golan, and Ely Porat. Locally Consistent Parsing for Text Indexing in Small Space. ArXiv preprint, 2018. URL: http://arxiv.org/abs/1812.00359.
  9. Domenico Cantone, Simone Faro, and Arianna Pavone. Sequence Searching Allowing for Non-Overlapping Adjacent Unbalanced Translocations. ArXiv preprint, 2018. URL: http://arxiv.org/abs/1812.00421.
  10. Kuei-Hao Chen, Guan-Shieng Huang, and Richard Chia-Tung Lee. Bit-Parallel Algorithms for Exact Circular String Matching. The Computer Journal, 57(5):731-743, 2014. URL: http://dx.doi.org/10.1093/comjnl/bxt023.
  11. Da-Jung Cho, Yo-Sub Han, and Hwee Kim. Alignment with non-overlapping inversions and translocations on two strings. Theoretical Computer Science, 575:90-101, 2015. URL: http://dx.doi.org/10.1016/j.tcs.2014.10.036.
  12. Maxime Crochemore, Christophe Hancart, and Thierry Lecroq. Algorithms on strings. Cambridge University Press, 2007. URL: http://dx.doi.org/10.1017/cbo9780511546853.
  13. Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Extracting powers and periods in a word from its runs structure. Theoretical Computer Science, 521:29-41, 2014. URL: http://dx.doi.org/10.1016/j.tcs.2013.11.018.
  14. Herbert Edelsbrunner. A new approach to rectangle intersections, Part I. International Journal of Computer Mathematics, 13(3-4):209-219, 1983. URL: http://dx.doi.org/10.1080/00207168308803364.
  15. Martin Farach-Colton, Paolo Ferragina, and S. Muthukrishnan. On the sorting-complexity of suffix tree construction. Journal of the ACM, 47(6):987-1011, 2000. URL: http://dx.doi.org/10.1145/355541.355547.
  16. Nathan J. Fine and Herbert S. Wilf. Uniqueness Theorems for Periodic Functions. Proceedings of the American Mathematical Society, 16:109-114, 1965. URL: http://dx.doi.org/10.1090/S0002-9939-1965-0174934-9.
  17. Kimmo Fredriksson and Szymon Grabowski. Average-optimal string matching. Journal of Discrete Algorithms, 7(4):579-594, 2009. URL: http://dx.doi.org/10.1016/j.jda.2008.09.001.
  18. Kimmo Fredriksson and Gonzalo Navarro. Average-optimal single and multiple approximate string matching. ACM Journal of Experimental Algorithmics, 9:1.4:1-1.4:47, 2004. URL: http://dx.doi.org/10.1145/1005813.1041513.
  19. Szymon Grabowski, Tomasz Kociumaka, and Jakub Radoszewski. On Abelian Longest Common Factor with and without RLE. Fundamenta Informaticae, 163(3):225-244, 2018. URL: http://dx.doi.org/10.3233/FI-2018-1740.
  20. Tommi Hirvola and Jorma Tarhio. Bit-Parallel Approximate Matching of Circular Strings with k Mismatches. ACM Journal of Experimental Algorithmics, 22:1.5:1-1.5:22, 2017. URL: http://dx.doi.org/10.1145/3129536.
  21. Costas S. Iliopoulos, Solon P. Pissis, and M. Sohel Rahman. Searching and Indexing Circular Patterns. In Mourad Elloumi, editor, Algorithms for Next-Generation Sequencing Data, Techniques, Approaches, and Applications, pages 77-90. Springer, 2017. URL: http://dx.doi.org/10.1007/978-3-319-59826-0_3.
  22. Costas S. Iliopoulos and M. Sohel Rahman. Indexing Circular Patterns. In Shin-Ichi Nakano and Md. Saidur Rahman, editors, Algorithms and Computation, WALCOM 2008, volume 4921 of LNCS, pages 46-57. Springer, 2008. URL: http://dx.doi.org/10.1007/978-3-540-77891-2_5.
  23. Dominik Kempa and Tomasz Kociumaka. String synchronizing sets: Sublinear-time BWT construction and optimal LCE data structure. In Edith Cohen, editor, 51st Annual ACM Symposium on Theory of Computing, STOC 2019. ACM, 2019. URL: http://dx.doi.org/10.1145/3313276.3316368.
  24. Tomasz Kociumaka. Efficient Data Structures for Internal Queries in Texts. PhD thesis, University of Warsaw, October 2018. URL: https://www.mimuw.edu.pl/~kociumaka/files/phd.pdf.
  25. Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. A Linear Time Algorithm for Seeds Computation. ArXiv preprint, 2019. URL: http://arxiv.org/abs/1107.2422v2.
  26. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Efficient Data Structures for the Factor Periodicity Problem. In Liliana Calderón-Benavides, Cristina N. González-Caro, Edgar Chávez, and Nivio Ziviani, editors, String Processing and Information Retrieval, SPIRE 2012, volume 7608 of LNCS, pages 284-294. Springer, 2012. URL: http://dx.doi.org/10.1007/978-3-642-34109-0_30.
  27. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Internal Pattern Matching Queries in a Text and Applications. In Piotr Indyk, editor, 26th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, pages 532-551. SIAM, 2015. URL: http://dx.doi.org/10.1137/1.9781611973730.36.
  28. Jie Lin and Donald A. Adjeroh. All-Against-All Circular Pattern Matching. The Computer Journal, 55(7):897-906, 2012. URL: http://dx.doi.org/10.1093/comjnl/bxr126.
  29. Jie Lin, Yue Jiang, and Don Adjeroh. Circular Pattern Discovery. The Computer Journal, 58(5):1061-1073, 2015. URL: http://dx.doi.org/10.1093/comjnl/bxu009.
  30. Udi Manber and Eugene W. Myers. Suffix Arrays: A New Method for On-Line String Searches. SIAM Journal on Computing, 22(5):935-948, 1993. URL: http://dx.doi.org/10.1137/0222058.
  31. Hideaki Ogiwara, Takashi Kohno, Hirofumi Nakanishi, Kazuhiro Nagayama, Masanori Sato, and Jun Yokota. Unbalanced translocation, a major chromosome alteration causing loss of heterozygosity in human lung cancer. Oncogene, 27(35):4788-4797, 2008. URL: http://dx.doi.org/10.1038/onc.2008.113.
  32. Robert Susik, Szymon Grabowski, and Sebastian Deorowicz. Fast and Simple Circular Pattern Matching. In Aleksandra Gruca, Tadeusz Czachórski, and Stanisław Kozielski, editors, Man-Machine Interactions, ICMMI 2013, volume 242 of Advances in Intelligent Systems and Computing, pages 537-544. Springer, 2013. URL: http://dx.doi.org/10.1007/978-3-319-02309-0_59.
  33. Dorothy Warburton. De novo balanced chromosome rearrangements and extra marker chromosomes identified at prenatal diagnosis: clinical significance and distribution of breakpoints. The American Journal of Human Genetics, 49(5):995-1013, 1991. URL: http://www.ncbi.nlm.nih.gov/pubmed/1928105.
  34. Brooke Weckselblatt, Karen E. Hermetz, and M. Katharine Rudd. Unbalanced translocations arise from diverse mutational mechanisms including chromothripsis. Genome Research, 25(7):937-947, 2015. URL: http://dx.doi.org/10.1101/gr.191247.115.
  35. Peter Weiner. Linear Pattern Matching Algorithms. In 14th Annual Symposium on Switching and Automata Theory, SWAT 1973, pages 1-11, Washington, DC, USA, 1973. IEEE Computer Society. URL: http://dx.doi.org/10.1109/SWAT.1973.13.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail