k-Universality of Regular Languages

Adamson, Duncan; Fleischmann, Pamela; Huch, Annika; Koß, Tore; Manea, Florin; Nowotka, Dirk

doi:10.4230/LIPIcs.ISAAC.2023.4

Abstract

A subsequence of a word w is a word u such that u = w[i₁] w[i₂] … w[i_k], for some set of indices 1 ≤ i₁ < i₂ < … < i_k ≤ |w|. A word w is k-subsequence universal over an alphabet Σ if every word in Σ^k appears in w as a subsequence. In this paper, we study the intersection between the set of k-subsequence universal words over some alphabet Σ and regular languages over Σ. We call a regular language L k-∃-subsequence universal if there exists a k-subsequence universal word in L, and k-∀-subsequence universal if every word of L is k-subsequence universal. We give algorithms solving the problems of deciding if a given regular language, represented by a finite automaton recognising it, is k-∃-subsequence universal and, respectively, if it is k-∀-subsequence universal, for a given k. The algorithms are FPT w.r.t. the size of the input alphabet, and their run-time does not depend on k; they run in polynomial time in the number n of states of the input automaton when the size of the input alphabet is O(log n). Moreover, we show that the problem of deciding if a given regular language is k-∃-subsequence universal is NP-complete, when the language is over a large alphabet. Further, we provide algorithms for counting the number of k-subsequence universal words (paths) accepted by a given deterministic (respectively, nondeterministic) finite automaton, and ranking an input word (path) within the set of k-subsequence universal words accepted by a given finite automaton.
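The definitions above can be made concrete with a small sketch. It is not the algorithm from the paper: it relies on the known fact that a word is k-subsequence universal exactly when its greedy arch factorisation has at least k arches, and it checks k-∃-subsequence universality by a naive search whose state space grows with k, whereas the paper's algorithms have run-time independent of k. The function names, the transition-dictionary encoding of the automaton, and the example automata are assumptions made for illustration only.

```python
from collections import deque


def universality_index(word, alphabet):
    """Count the arches of `word`: greedily cut the word each time every
    letter of `alphabet` has been seen. The word is k-subsequence
    universal iff the returned value is at least k.
    Assumes `alphabet` is non-empty and `word` uses only its letters."""
    sigma = set(alphabet)
    seen, arches = set(), 0
    for letter in word:
        seen.add(letter)
        if seen == sigma:
            arches += 1
            seen = set()
    return arches


def exists_k_universal(transitions, start, accepting, alphabet, k):
    """Naive check of k-exists-subsequence universality for an NFA.
    `transitions` maps (state, letter) to a set of successor states
    (a hypothetical encoding chosen for this sketch).
    Searches configurations (state, letters seen in the current arch,
    arches completed so far, capped at k)."""
    sigma = frozenset(alphabet)
    initial = (start, frozenset(), 0)
    visited = {initial}
    queue = deque([initial])
    while queue:
        state, seen, arches = queue.popleft()
        if arches >= k and state in accepting:
            return True
        for letter in sigma:
            for successor in transitions.get((state, letter), ()):
                new_seen, new_arches = seen | {letter}, arches
                if new_seen == sigma:
                    # An arch is complete: start a new one.
                    new_seen, new_arches = frozenset(), min(arches + 1, k)
                config = (successor, new_seen, new_arches)
                if config not in visited:
                    visited.add(config)
                    queue.append(config)
    return False


# Illustrative automata over the alphabet {a, b}.
AB_STAR = {(0, "a"): {1}, (1, "b"): {0}}                      # (ab)*
A_STAR_B_STAR = {(0, "a"): {0}, (0, "b"): {1}, (1, "b"): {1}}  # a*b*
```

For instance, `universality_index("abba", "ab")` counts the two arches `ab` and `ba`, matching the fact that `abba` contains `aa`, `ab`, `ba` and `bb` as subsequences. The search visits at most n · 2^|Σ| · (k + 1) configurations for an automaton with n states, which shows where the exponential dependence on the alphabet size comes from, but it does not reproduce the k-independent bounds claimed in the abstract.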

