Algorithms for Galois Words: Detection, Factorization, and Rotation

Cite As

Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, and Ayumi Shinohara. Algorithms for Galois Words: Detection, Factorization, and Rotation. In 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 296, pp. 18:1-18:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.CPM.2024.18

Abstract

Lyndon words are extensively studied in combinatorics on words; they play a crucial role in upper-bounding the number of runs a word can have [Bannai+, SIAM J. Comput.'17]. We can determine whether a word is a Lyndon word, factorize a word into Lyndon words in lexicographically non-increasing order, and find the Lyndon rotation of a word, all in linear time within constant additional working space. Recent research interest has emerged from the question of what happens when we change the lexicographic order, which is at the heart of the definition of Lyndon words. In particular, the alternating order, where the order of all odd positions becomes reversed, has been recently proposed. While a Lyndon word is, among all its cyclic rotations, the smallest one with respect to the lexicographic order, a Galois word has the same property when the lexicographic order is exchanged for the alternating order. Unfortunately, this exchange has a large impact on the properties Galois words exhibit, which makes it a nontrivial task to translate results from Lyndon words to Galois words. Until now, it has only been conjectured that linear-time algorithms with constant additional working space in the spirit of Duval's algorithm are possible for computing the Galois factorization or the Galois rotation. Here, we affirm this conjecture as follows. Given a word T of length n, we can determine whether T is a Galois word in O(n) time with constant additional working space. Within the same complexities, we can also determine the Galois rotation of T, and compute the Galois factorization of T online. The last result settles Open Problem 1 in [Dolce et al., TCS 2019] for Galois words.
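
To make the definition concrete, the following is a minimal brute-force sketch of the alternating order and of Galois-word detection. It is not the paper's linear-time, constant-space algorithm: it compares all rotations explicitly and takes quadratic time. The function names (alt_less, is_galois_naive, galois_rotation_naive) are hypothetical, and the indexing convention is an assumption: positions are counted from 0, so the usual order applies at indices 0, 2, 4, ... and the reversed order at indices 1, 3, 5, .... Only words of equal length are compared, which sidesteps the prefix case.

```python
def alt_less(x: str, y: str) -> bool:
    """Return True if x strictly precedes y in the alternating order.

    Assumption: 0-based indexing, with the usual character order at even
    indices and the reversed order at odd indices. Only equal-length
    words are supported, so the prefix case never arises.
    """
    if len(x) != len(y):
        raise ValueError("this sketch only compares words of equal length")
    for i, (a, b) in enumerate(zip(x, y)):
        if a != b:
            # The first mismatch decides; its parity selects the direction.
            return a < b if i % 2 == 0 else a > b
    return False  # equal words: neither precedes the other


def rotations(t: str) -> list[str]:
    """All cyclic rotations of t, starting with t itself."""
    return [t[k:] + t[:k] for k in range(len(t))]


def is_galois_naive(t: str) -> bool:
    """Check whether t is strictly smaller than each of its other rotations."""
    if not t:
        return False
    return all(alt_less(t, r) for r in rotations(t)[1:])


def galois_rotation_naive(t: str) -> str:
    """Return a smallest rotation of t with respect to the alternating order."""
    best = t
    for r in rotations(t)[1:]:
        if alt_less(r, best):
            best = r
    return best
```

Under these assumptions, is_galois_naive("aba") holds while is_galois_naive("aab") does not, even though "aab" is a Lyndon word; galois_rotation_naive("aab") returns "aba". A word with a repeated rotation, such as "aa", is rejected because no rotation is strictly smallest.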

Subject Classification

ACM Subject Classification
• Theory of computation

Keywords
• Galois Factorization
• Alternating Order
• Word Factorization Algorithm
• Regularity Detection

References

1. Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, and Marcin Piątkowski. Indexing the bijective BWT. In Proc. CPM, volume 128 of LIPIcs, pages 17:1-17:14, 2019. URL: https://doi.org/10.4230/LIPIcs.CPM.2019.17.
2. Timothy C. Bell, Ian H. Witten, and John G. Cleary. Modeling for text compression. ACM Comput. Surv., 21(4):557-591, 1989. URL: https://doi.org/10.1145/76894.76896.
3. Amanda Burcroff and Eric Winsor. Generalized Lyndon factorizations of infinite words. Theor. Comput. Sci., 809:30-38, 2020. URL: https://doi.org/10.1016/J.TCS.2019.11.003.
4. Michael Burrows and David J. Wheeler. A block-sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation, Palo Alto, California, 1994.
5. Francesco Dolce, Antonio Restivo, and Christophe Reutenauer. On generalized Lyndon words. Theor. Comput. Sci., 777:232-242, 2019. URL: https://doi.org/10.1016/j.tcs.2018.12.015.
6. Francesco Dolce, Antonio Restivo, and Christophe Reutenauer. Some variations on Lyndon words (invited talk). In Proc. CPM, volume 128 of LIPIcs, pages 2:1-2:14, 2019. URL: https://doi.org/10.4230/LIPIcs.CPM.2019.2.
7. Jean-Pierre Duval. Factorizing words over an ordered alphabet. J. Algorithms, 4(4):363-381, 1983. URL: https://doi.org/10.1016/0196-6774(83)90017-2.
8. Paolo Ferragina, Rodrigo González, Gonzalo Navarro, and Rossano Venturini. Compressed text indexes: From theory to practice. ACM Journal of Experimental Algorithmics, 13:1.12:1-1.12:31, 2008. URL: https://doi.org/10.1145/1412228.1455268.
9. Nathan J. Fine and Herbert S. Wilf. Uniqueness theorems for periodic functions. Proceedings of the American Mathematical Society, 16(1):109-114, 1965.
10. Travis Gagie, Gonzalo Navarro, and Nicola Prezza. Optimal-time text indexing in BWT-runs bounded space. In Proc. SODA, pages 1459-1477, 2018. URL: https://doi.org/10.1137/1.9781611975031.96.
11. Ira M. Gessel, Antonio Restivo, and Christophe Reutenauer. A bijection between words and multisets of necklaces. Eur. J. Comb., 33(7):1537-1546, 2012. URL: https://doi.org/10.1016/j.ejc.2012.03.016.
12. Raffaele Giancarlo, Giovanni Manzini, Antonio Restivo, Giovanna Rosone, and Marinella Sciortino. The alternating BWT: An algorithmic perspective. Theor. Comput. Sci., 812:230-243, 2020. URL: https://doi.org/10.1016/j.tcs.2019.11.002.
13. Raffaele Giancarlo, Giovanni Manzini, Antonio Restivo, Giovanna Rosone, and Marinella Sciortino. A new class of string transformations for compressed text indexing. Inf. Comput., 294:105068, 2023. URL: https://doi.org/10.1016/J.IC.2023.105068.
14. Joseph Yossi Gil and David Allen Scott. A bijective string sorting transform. arXiv:1201.3077, 2012. URL: https://arxiv.org/abs/1201.3077.
15. J. Ian Munro, Gonzalo Navarro, and Yakov Nekrich. Space-efficient construction of compressed indexes in deterministic linear time. In Proc. SODA, pages 408-424, 2017. URL: https://doi.org/10.1137/1.9781611974782.26.
16. Christophe Reutenauer. Mots de Lyndon généralisés. Séminaire Lotharingien de Combinatoire, 54(B54h):1-16, 2005.
17. Yossi Shiloach. Fast canonization of circular strings. J. Algorithms, 2(2):107-121, 1981. URL: https://doi.org/10.1016/0196-6774(81)90013-4.