85 Search Results for "Radoszewski, Jakub"


Volume

LIPIcs, Volume 78

28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)

CPM 2017, July 4-6, 2017, Warsaw, Poland

Editors: Juha Kärkkäinen, Jakub Radoszewski, and Wojciech Rytter

Document
Sorting Magazines and Boxes

Authors: Gabriele Fici, Manal Mohamed, and Jakub Radoszewski

Published in: LIPIcs, Volume 366, 13th International Conference on Fun with Algorithms (FUN 2026)


Abstract
It is a rainy Sunday. Agata has decided to sort the magazines on her shelf. Because the magazines are quite thin, she refuses to insert one between two others, preferring to move them only to the ends of the shelf. She has conceived a strategy for this but is unsure of its efficiency. Meanwhile, in the adjacent room, her three-year-old son, Szymon, has just finished his Montessori tower puzzle and is figuring out how to put it away. He has adopted a very intuitive approach to nesting the boxes, though he is not certain it will ultimately succeed. Agata and Szymon are employing very primitive strategies. While many sorting algorithms are remarkably simple to explain and implement-specifically, the class of in-place sorting algorithms with 𝒪(n²) worst-case and average-case running time and constant space requirements (e.g., Bubble Sort, Gnome Sort)-the strategies discussed here offer a unique perspective on "intuitive" sorting. Our contribution aims to enrich the field of simple sorting algorithms. Interestingly, determining the exact worst-case complexity of some of the proposed algorithms remains an open problem.

Cite as

Gabriele Fici, Manal Mohamed, and Jakub Radoszewski. Sorting Magazines and Boxes. In 13th International Conference on Fun with Algorithms (FUN 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 366, pp. 17:1-17:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{fici_et_al:LIPIcs.FUN.2026.17,
  author =	{Fici, Gabriele and Mohamed, Manal and Radoszewski, Jakub},
  title =	{{Sorting Magazines and Boxes}},
  booktitle =	{13th International Conference on Fun with Algorithms (FUN 2026)},
  pages =	{17:1--17:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-417-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{366},
  editor =	{Iacono, John},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FUN.2026.17},
  URN =		{urn:nbn:de:0030-drops-257368},
  doi =		{10.4230/LIPIcs.FUN.2026.17},
  annote =	{Keywords: Sorting algorithm, analysis of algorithms, intuitive sorting}
}
Document
On the Complexity of Language Membership for Probabilistic Words

Authors: Antoine Amarilli, Mikaël Monet, Paul Raphaël, and Sylvain Salvati

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)


Abstract
We study the membership problem to context-free languages L (CFLs) on probabilistic words, that specify for each position a probability distribution on the letters (assuming independence across positions). Our task is to compute, given a probabilistic word, what is the probability that a word drawn according to the distribution belongs to L. This problem generalizes the problem of counting how many words of length n belong to L, or of counting how many completions of a partial word belong to L. We show that this problem is in polynomial time for unambiguous context-free languages (uCFLs), but can be #P-hard already for unions of two linear uCFLs. More generally, we show that the problem is in polynomial time for so-called poly-slicewise-unambiguous languages, where given a length n we can tractably compute an uCFL for the words of length n in the language. This class includes some inherently ambiguous languages, and implies the tractability of bounded CFLs and of languages recognized by unambiguous polynomial-time counter automata; but we show that the problem can be #P-hard for nondeterministic counter automata, even for Parikh automata with a single counter. We then introduce classes of circuits from knowledge compilation which we use for tractable counting, and show that this covers the tractability of poly-slicewise-unambiguous languages and of some CFLs that are not poly-slicewise-unambiguous. Extending these circuits with negation further allows us to show tractability for the language of primitive words, and for the language of concatenations of two palindromes. We finally show the conditional undecidability of the meta-problem that asks, given a CFG, whether the probabilistic membership problem for that CFG is tractable or #P-hard.

Cite as

Antoine Amarilli, Mikaël Monet, Paul Raphaël, and Sylvain Salvati. On the Complexity of Language Membership for Probabilistic Words. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 5:1-5:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.STACS.2026.5,
  author =	{Amarilli, Antoine and Monet, Mika\"{e}l and Rapha\"{e}l, Paul and Salvati, Sylvain},
  title =	{{On the Complexity of Language Membership for Probabilistic Words}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{5:1--5:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.5},
  URN =		{urn:nbn:de:0030-drops-254943},
  doi =		{10.4230/LIPIcs.STACS.2026.5},
  annote =	{Keywords: Automaton, probabilistic words, context-free grammar, membership problem}
}
Document
Time-Optimal Construction of String Synchronizing Sets

Authors: Jonas Ellert and Tomasz Kociumaka

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)


Abstract
A powerful design principle behind many modern string algorithms is local consistency: breaking the symmetry between string positions based on their small contexts so that matching fragments are handled consistently. Among the most influential instantiations of this principle are string synchronizing sets [Kempa & Kociumaka; STOC 2019]. A τ-synchronizing set of a string of length n is a set of O(n/τ) string positions, chosen using their length-2τ contexts, such that (outside of highly periodic regions) every block of τ consecutive positions contains at least one element of the set. Synchronizing sets have found dozens of applications in diverse settings, from quantum and dynamic algorithms to fully compressed computation. In the classic word RAM model, particularly for strings over small alphabets, they enabled faster solutions to core problems in data compression, text indexing, and string similarity. In this work, we show that any string T ∈ [0 .. σ)ⁿ can be preprocessed in O(n log σ / log n) time so that, for any given integer τ ∈ [1 .. n], a τ-synchronizing set of T can be constructed in O((n log τ)/(τ log n)) time. Both bounds are optimal in the word RAM model with machine word size w = Θ(log n), matching the information-theoretic minimum for the input and output sizes, respectively. Previously, constructing a τ-synchronizing set required O(n/τ) time after an O(n)-time preprocessing [Kociumaka, Radoszewski, Rytter, and Waleń; SICOMP 2024], or, in the restricted regime of τ < 0.2 log_σ n, without any preprocessing needed [Kempa & Kociumaka; STOC 2019]. A simple instantiation of our method outputs the synchronizing set as a sorted list in O(n/τ) time, or as a bitmask in O(n/log n) time. Our optimal construction produces a compact fully indexable dictionary, supporting select queries in O(1) time and rank queries in O(log ((log τ)/(log log n))) time. The latter complexity matches known unconditional cell-probe lower bounds for τ ≤ n^{1-Ω(1)}. To achieve this, we introduce a general framework for efficiently processing sparse integer sequences via a custom variable-length encoding. We also augment the optimal variant of van Emde Boas trees [Pătraşcu & Thorup; STOC 2006] with a deterministic linear-time construction. When the set is represented as a bitmask under our sparse encoding, the same guarantees for select and rank queries hold after preprocessing in time proportional to the size of our encoding (in words).

Cite as

Jonas Ellert and Tomasz Kociumaka. Time-Optimal Construction of String Synchronizing Sets. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 36:1-36:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{ellert_et_al:LIPIcs.STACS.2026.36,
  author =	{Ellert, Jonas and Kociumaka, Tomasz},
  title =	{{Time-Optimal Construction of String Synchronizing Sets}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{36:1--36:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.36},
  URN =		{urn:nbn:de:0030-drops-255258},
  doi =		{10.4230/LIPIcs.STACS.2026.36},
  annote =	{Keywords: synchronizing sets, local consistency, packed strings}
}
Document
Languages of Words of Low Automatic Complexity Are Hard to Compute

Authors: Joey Chen, Bjørn Kjos-Hanssen, Ivan Koswara, Linus Richter, and Frank Stephan

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)


Abstract
The automatic complexity of a finite word (string) is an analogue for finite automata of Sipser’s distinguishing complexity (1983) and was introduced by Shallit and Wang (2001). For a finite alphabet Σ of at least two elements, we consider the non-deterministic automatic complexity given by exactly - yet not necessarily uniquely - accepting automata: a word x ∈ Σ^* has exact non-deterministic automatic complexity k ∈ ℕ if there exists a non-deterministic automaton of k states which accepts x while rejecting every other word of the same length as x, and no automaton of fewer states has this property. Importantly, and in contrast to the classical notion, the witnessing automaton may have multiple paths of computation accepting x. We denote this measure of complexity by A_{Ne}, and study a class of languages of low A_{Ne}-complexity defined as L_q = {x ∈ Σ^* : A_{Ne}(x) < q|x|}, which is parameterised by rationals q ∈ (0,1/2) (generalising a class of sets first studied by Kjos-Hanssen). We show that for every q ∈ (0,1/2), this class is neither context-free nor recognisable by certain Boolean circuits. In the process, we answer an open question of Kjos-Hanssen quantifying the complexity of L_{1/3} in terms of Boolean circuits, and also prove the Shannon effect for A_{Ne}.

Cite as

Joey Chen, Bjørn Kjos-Hanssen, Ivan Koswara, Linus Richter, and Frank Stephan. Languages of Words of Low Automatic Complexity Are Hard to Compute. In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 24:1-24:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.FSTTCS.2025.24,
  author =	{Chen, Joey and Kjos-Hanssen, Bj{\o}rn and Koswara, Ivan and Richter, Linus and Stephan, Frank},
  title =	{{Languages of Words of Low Automatic Complexity Are Hard to Compute}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{24:1--24:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.24},
  URN =		{urn:nbn:de:0030-drops-251055},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.24},
  annote =	{Keywords: Automatic complexity, automata theory, formal languages, Boolean circuits, Shannon effect}
}
Document
Fast Computation of k-Runs, Parameterized Squares, and Other Generalised Squares

Authors: Yuto Nakashima, Jakub Radoszewski, and Tomasz Waleń

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
A k-mismatch square is a string of the form XY where X and Y are two equal-length strings that have at most k mismatches. Kolpakov and Kucherov [Theor. Comput. Sci., 2003] defined two notions of k-mismatch repeats, called k-repetitions and k-runs, each representing a sequence of consecutive k-mismatch squares of equal length. They proposed algorithms for computing k-repetitions and k-runs working in 𝒪(nklog k+output) time for a string of length n over an integer alphabet, where output is the number of the reported repeats. We show that output = 𝒪(nk log k), both in case of k-repetitions and k-runs, which implies that the complexity of their algorithms is actually 𝒪(nk log k). We apply this result to computing parameterized squares. A parameterized square is a string of the form XY such that X and Y parameterized-match, i.e., there exists a bijection f on the alphabet such that f(X) = Y. Two parameterized squares XY and X'Y' are equivalent if they parameterized match. Recently Hamai et al. [SPIRE 2024] showed that a string of length n over an alphabet of size σ contains less than nσ non-equivalent parameterized squares, improving an earlier bound by Kociumaka et al. [Theor. Comput. Sci., 2016]. We apply our bound for k-mismatch repeats to propose an algorithm that reports all non-equivalent parameterized squares in 𝒪(nσ log σ) time. We also show that the number of non-equivalent parameterized squares can be computed in 𝒪(n log n) time. This last algorithm applies to squares under any substring compatible equivalence relation and also to counting squares that are distinct as strings. In particular, this improves upon the 𝒪(nσ)-time algorithm of Gawrychowski et al. [CPM 2023] for counting order-preserving squares that are distinct as strings if σ = ω(log n).

Cite as

Yuto Nakashima, Jakub Radoszewski, and Tomasz Waleń. Fast Computation of k-Runs, Parameterized Squares, and Other Generalised Squares. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{nakashima_et_al:LIPIcs.ESA.2025.8,
  author =	{Nakashima, Yuto and Radoszewski, Jakub and Wale\'{n}, Tomasz},
  title =	{{Fast Computation of k-Runs, Parameterized Squares, and Other Generalised Squares}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{8:1--8:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.8},
  URN =		{urn:nbn:de:0030-drops-244768},
  doi =		{10.4230/LIPIcs.ESA.2025.8},
  annote =	{Keywords: string algorithm, k-mismatch square, parameterized square, order-preserving square, maximum gapped repeat}
}
Document
Counting Distinct Square Substrings in Sublinear Time

Authors: Panagiotis Charalampopoulos, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 345, 50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025)


Abstract
We show that the number of distinct squares in a packed string of length n over an alphabet of size σ can be computed in 𝒪(n/log_{σ}n) time in the word-RAM model of computation. This paper is the first to introduce a sublinear time algorithm for the packed version of squares counting. The packed representation of a string of length n over an alphabet of size σ is given as a sequence of 𝒪(n/ log_{σ} n) machine words in the word-RAM model (a machine word consists of ω ≥ log₂ n bits). Previously it was known how to count distinct squares in 𝒪(n) time [Gusfield and Stoye, JCSS 2004], even for a string over an integer alphabet, see [Crochemore et al., TCS 2014; Bannai et al., CPM 2017; Charalampopoulos et al., SPIRE 2020]. We use techniques of squares extraction from runs described by Crochemore et al. [TCS 2014]. However, the packed model requires novel approaches. In particular, we need an 𝒪(n/log_{σ}n) sized representation of all long-period runs (runs with periods that are Ω(log_{σ}n)) which guarantees sublinear time counting of potentially linearly-many implied squares. The long-period runs with a string period that is periodic itself (called layer runs) are an obstacle, since their number can be Ω(n). Fortunately, the number of all other long-period runs is 𝒪(n/log_{σ}n) and we can construct an implicit representation of all long-period runs in 𝒪(n/log_{σ}n) time by adopting the insights of Amir et al. [ESA 2019], combined with sublinear time tools provided by the PILLAR model of computations in case of packed strings. We count squares in layer runs in sublinear time by exploiting combinatorial properties of types of pyramidally-shaped groups of layer runs. As a by-product, we discover several new structural properties of runs. Another difficulty is to compute, in sublinear time, locations of Lyndon roots of runs in packed strings, which is needed for grouping of runs that can generate equal squares. To overcome this difficulty, we introduce sparse-Lyndon roots which are based on the notion of string synchronizers proposed by Kempa and Kociumaka [STOC 2019].

Cite as

Panagiotis Charalampopoulos, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. Counting Distinct Square Substrings in Sublinear Time. In 50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 345, pp. 36:1-36:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.MFCS.2025.36,
  author =	{Charalampopoulos, Panagiotis and Mohamed, Manal and Radoszewski, Jakub and Rytter, Wojciech and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Counting Distinct Square Substrings in Sublinear Time}},
  booktitle =	{50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025)},
  pages =	{36:1--36:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-388-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{345},
  editor =	{Gawrychowski, Pawe{\l} and Mazowiecki, Filip and Skrzypczak, Micha{\l}},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2025.36},
  URN =		{urn:nbn:de:0030-drops-241439},
  doi =		{10.4230/LIPIcs.MFCS.2025.36},
  annote =	{Keywords: square in a string, packed model, run (maximal repetition), Lyndon word}
}
Document
Research
On the Construction of Elastic Degenerate Strings

Authors: Nicola Rizzo, Veli Mäkinen, and Nadia Pisanti

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)


Abstract
An elastic degenerate string (EDS) is a sequence of sets of strings. In the context of bioinformatics, EDSes can be used to represent the variations observed in a population from its consensus genome. Pattern matching and comparison problems on EDSes have been widely studied in the literature, but their construction has been largely omitted. We fill this gap by showing how algorithms originally developed for related problems of founder reconstruction can be adapted to minimize the total cardinality of the EDS sets and total length of the EDS strings in linear time, given suitable multiple alignments representing the input data.

Cite as

Nicola Rizzo, Veli Mäkinen, and Nadia Pisanti. On the Construction of Elastic Degenerate Strings. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 2:1-2:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{rizzo_et_al:OASIcs.Grossi.2,
  author =	{Rizzo, Nicola and M\"{a}kinen, Veli and Pisanti, Nadia},
  title =	{{On the Construction of Elastic Degenerate Strings}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{2:1--2:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.2},
  URN =		{urn:nbn:de:0030-drops-238014},
  doi =		{10.4230/OASIcs.Grossi.2},
  annote =	{Keywords: multiple sequence alignment, pattern matching, data structures, segmentation algorithms, founder reconstruction, dynamic programming, semi-dynamic range minimum queries, positional Burrows-Wheeler transform}
}
Document
BWT for String Collections

Authors: Davide Cenzato, Zsuzsanna Lipták, Nadia Pisanti, Giovanna Rosone, and Marinella Sciortino

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)


Abstract
We survey the different methods used for extending the BWT to collections of strings, following largely [Cenzato and Lipták, CPM 2022, Bioinformatics 2024]. We analyze the specific aspects and combinatorial properties of the resulting BWT variants and give a categorization of publicly available tools for computing the BWT of string collections. We show how the specific method used impacts on the resulting transform, including the number of runs, and on the dynamicity of the transform with respect to adding or removing strings from the collection. We then focus on the number of runs of these BWT variants and present the optimal BWT introduced in [Cenzato et al., DCC 2023], which implements an algorithm originally proposed by [Bentley et al., ESA 2020] to minimize the number of BWT-runs. We also discuss several recent heuristics and study their impact on the compression of biological sequences. We conclude with an overview of the applications and the impact of the BWT of string collections in bioinformatics.

Cite as

Davide Cenzato, Zsuzsanna Lipták, Nadia Pisanti, Giovanna Rosone, and Marinella Sciortino. BWT for String Collections. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 3:1-3:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{cenzato_et_al:OASIcs.Manzini.3,
  author =	{Cenzato, Davide and Lipt\'{a}k, Zsuzsanna and Pisanti, Nadia and Rosone, Giovanna and Sciortino, Marinella},
  title =	{{BWT for String Collections}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{3:1--3:29},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.3},
  URN =		{urn:nbn:de:0030-drops-239113},
  doi =		{10.4230/OASIcs.Manzini.3},
  annote =	{Keywords: Burrows-Wheeler transform, Extended Burrows-Wheeler transform, compressed text indexes, text compression, string collections, bioinformatics}
}
Document
Circular Dictionary Matching Using Extended BWT

Authors: Wing-Kai Hon, Rahul Shah, and Sharma V. Thankachan

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)


Abstract
The dictionary matching problem involves preprocessing a set of strings (patterns) into a data structure that efficiently identifies all occurrences of these patterns within a query string (text). In this work, we investigate a variation of this problem, termed circular dictionary matching, where the patterns are circular, meaning their cyclic shifts are also considered valid patterns. Such patterns naturally occur in areas such as bioinformatics and computational geometry. Based on the extended Burrows-Wheeler Transformation (eBWT), we design a space-efficient solution for this problem. Specifically, we show that a dictionary of d circular patterns of total length n can be indexed in nlog σ + O(n+dlog n+σ log n) bits of space and support circular dictionary matching on a query text T in O((|T|+occ)log n) time, where σ represents the size of the underlying alphabet and occ represents the output size.

Cite as

Wing-Kai Hon, Rahul Shah, and Sharma V. Thankachan. Circular Dictionary Matching Using Extended BWT. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{hon_et_al:OASIcs.Manzini.11,
  author =	{Hon, Wing-Kai and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Circular Dictionary Matching Using Extended BWT}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{11:1--11:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.11},
  URN =		{urn:nbn:de:0030-drops-239195},
  doi =		{10.4230/OASIcs.Manzini.11},
  annote =	{Keywords: String algorithms, Burrows-Wheeler transformation, suffix trees, succinct data structures}
}
Document
A Simple Integer Successor-Delete Data Structure

Authors: Gerth Stølting Brodal

Published in: LIPIcs, Volume 338, 23rd International Symposium on Experimental Algorithms (SEA 2025)


Abstract
We consider a simple decremental data structure for maintaining a set of integers, that supports initializing the set to {1,2,…,n} followed by d deletions and s successor queries in arbitrary order in total 𝒪(n+d+s⋅(1+log_{max(2,s/n)} min(s,n))) time. The data structure consists of a single array of n integers. A straightforward modification allows the data structure to also support p predecessor and r range queries, with a total output k, in total 𝒪(n+d+k+q ⋅ (1+log_{max(2,q/n)} min(q,n))) time, where q = s+p+r. The data structure is essentially a special case of the classic union-find data structure with path compression but with unweighted linking (i.e., without linking by rank or size), that is known to achieve logarithmic amortized time bounds (Tarjan and van Leeuwen, 1984). In this paper we study the efficiency of this simple data structure, and compare it to other, theoretically superior, data structures.

Cite as

Gerth Stølting Brodal. A Simple Integer Successor-Delete Data Structure. In 23rd International Symposium on Experimental Algorithms (SEA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 338, pp. 8:1-8:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{brodal:LIPIcs.SEA.2025.8,
  author =	{Brodal, Gerth St{\o}lting},
  title =	{{A Simple Integer Successor-Delete Data Structure}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{8:1--8:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.8},
  URN =		{urn:nbn:de:0030-drops-232461},
  doi =		{10.4230/LIPIcs.SEA.2025.8},
  annote =	{Keywords: Successor queries, deletions, interval union-find, union-find}
}
Document
Track A: Algorithms, Complexity and Games
The Role of Regularity in (Hyper-)Clique Detection and Implications for Optimizing Boolean CSPs

Authors: Nick Fischer, Marvin Künnemann, Mirza Redžić, and Julian Stieß

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
Is detecting a k-clique in k-partite regular (hyper-)graphs as hard as in the general case? Intuition suggests yes, but proving this - especially for hypergraphs - poses notable challenges. Concretely, we consider a strong notion of regularity in h-uniform hypergraphs, where we essentially require that any subset of at most h-1 is incident to a uniform number of hyperedges. Such notions are studied intensively in the combinatorial block design literature. We show that any f(k)n^{g(k)}-time algorithm for detecting k-cliques in such graphs transfers to an f'(k)n^{g(k)}-time algorithm for the general case, establishing a fine-grained equivalence between the h-uniform hyperclique hypothesis and its natural regular analogue. Equipped with this regularization result, we then fully resolve the fine-grained complexity of optimizing Boolean constraint satisfaction problems over assignments with k non-zeros. Our characterization depends on the maximum degree d of a constraint function. Specifically, if d ≤ 1, we obtain a linear-time solvable problem, if d = 2, the time complexity is essentially equivalent to k-clique detection, and if d ≥ 3 the problem requires exhaustive-search time under the 3-uniform hyperclique hypothesis. To obtain our hardness results, the regularization result plays a crucial role, enabling a very convenient approach when applied carefully. We believe that our regularization result will find further applications in the future.

Cite as

Nick Fischer, Marvin Künnemann, Mirza Redžić, and Julian Stieß. The Role of Regularity in (Hyper-)Clique Detection and Implications for Optimizing Boolean CSPs. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 78:1-78:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{fischer_et_al:LIPIcs.ICALP.2025.78,
  author =	{Fischer, Nick and K\"{u}nnemann, Marvin and Red\v{z}i\'{c}, Mirza and Stie{\ss}, Julian},
  title =	{{The Role of Regularity in (Hyper-)Clique Detection and Implications for Optimizing Boolean CSPs}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{78:1--78:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.78},
  URN =		{urn:nbn:de:0030-drops-234559},
  doi =		{10.4230/LIPIcs.ICALP.2025.78},
  annote =	{Keywords: fine-grained complexity theory, clique detections in hypergraphs, constraint satisfaction, parameterized algorithms}
}
Document
Minimal Generators in Optimal Time

Authors: Jonas Ellert, Paweł Gawrychowski, and Tatiana Starikovskaya

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)


Abstract
A walk of length n on a string S of length m is a function f : {1, … , n} → {1, … , m} such that ∀ i ∈ {2, … , n} : |f(i) - f(i - 1)| ≤ 1. The walk generates the string T of length n defined by {∀ i ∈ {1, … , n} : T[i] = S[f(i)]}. Intuitively, this can be seen as walking n steps in S and outputting the encountered symbols, where in each step we either remain at the same position, or move one position to the left or to the right. The minimal generator of a string T is the shortest string S such that a walk on S generates T. Recently, it was shown that each string admits exactly one (up to reversal) minimal generator (Pratt-Hartmann, CPM 2024). However, no efficient algorithm for computing the minimal generator was known. We provide an optimal algorithm for this task, taking {O}(n) time for a string of length n over general unordered alphabet, i.e., accessing the string only by equality comparisons of symbols. The main challenge is to detect substrings of the form axbx̃axb and replace them with axb, where a,b are symbols and x is a string with reversal x̃. We solve this problem with a non-trivial adaptation of Manacher’s classic algorithm for computing maximal palindromic substrings (Manacher, J. ACM 1975). To obtain the final algorithm, we solve small subinstances of the problem in optimal time by adapting the "Four Russians" technique to strings over general unordered alphabet, which may be of independent interest.

Cite as

Jonas Ellert, Paweł Gawrychowski, and Tatiana Starikovskaya. Minimal Generators in Optimal Time. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 14:1-14:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{ellert_et_al:LIPIcs.CPM.2025.14,
  author =	{Ellert, Jonas and Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  title =	{{Minimal Generators in Optimal Time}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{14:1--14:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.14},
  URN =		{urn:nbn:de:0030-drops-231082},
  doi =		{10.4230/LIPIcs.CPM.2025.14},
  annote =	{Keywords: string algorithms, walking on words, minimal generator, palindromic substrings, general unordered alphabet, decision tree complexity}
}
Document
Pattern Matching on Run-Length Grammar-Compressed Strings in Linear Time

Authors: Yuto Iguchi, Ryo Yoshinaka, and Ayumi Shinohara

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)


Abstract
Run-length straight-line programs (RLSLPs) are a technique for grammar-based compression, allowing any string to be represented with optimal space for δ, the substring complexity of the string. We address the compressed pattern matching problem for RLSLPs: Given a compressed text in RLSLP format and an uncompressed pattern, determine if the pattern appears in the text. This paper proposes an algorithm that solves this problem in linear time with respect to the size of the grammar and the length of the pattern.

Cite as

Yuto Iguchi, Ryo Yoshinaka, and Ayumi Shinohara. Pattern Matching on Run-Length Grammar-Compressed Strings in Linear Time. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 9:1-9:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{iguchi_et_al:LIPIcs.CPM.2025.9,
  author =	{Iguchi, Yuto and Yoshinaka, Ryo and Shinohara, Ayumi},
  title =	{{Pattern Matching on Run-Length Grammar-Compressed Strings in Linear Time}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{9:1--9:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.9},
  URN =		{urn:nbn:de:0030-drops-231034},
  doi =		{10.4230/LIPIcs.CPM.2025.9},
  annote =	{Keywords: pattern matching, run-length straight-line programs, compression, suffix tree}
}
Document
Sorted Consecutive Occurrence Queries in Substrings

Authors: Waseem Akram and Takuya Mieno

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)


Abstract
The string indexing problem is a fundamental computational problem with numerous applications, including information retrieval and bioinformatics. It aims to efficiently solve the pattern matching problem: given a text T of length n for preprocessing and a pattern P of length m as a query, the goal is to report all occurrences of P as substrings of T. Navarro and Thankachan [CPM 2015, Theor. Comput. Sci. 2016] introduced a variant of this problem called the gap-bounded consecutive occurrence query, which reports pairs of consecutive occurrences of P in T such that their gaps (i.e., the distances between them) lie within a query-specified range [g₁, g₂]. Recently, Bille et al. [FSTTCS 2020, Theor. Comput. Sci. 2022] proposed the top-k close consecutive occurrence query, which reports the k closest consecutive occurrences of P in T, sorted in non-decreasing order of distance. Both problems are optimally solved in query time with O(n log n)-space data structures. In this paper, we generalize these problems to the range query model, which focuses only on occurrences of P in a specified substring T[a.. b] of T. Our contributions are as follows: - We propose an O(n log² n)-space data structure that answers the range top-k consecutive occurrence query in O(|P| + log log n + k) time. - We propose an O(n log^{2+ε} n)-space data structure that answers the range gap-bounded consecutive occurrence query in O(|P| + log log n + output) time, where ε is a positive constant and output denotes the number of outputs. Additionally, as by-products, we present algorithms for geometric problems involving weighted horizontal segments in a 2D plane, which are of independent interest.

Cite as

Waseem Akram and Takuya Mieno. Sorted Consecutive Occurrence Queries in Substrings. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 24:1-24:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{akram_et_al:LIPIcs.CPM.2025.24,
  author =	{Akram, Waseem and Mieno, Takuya},
  title =	{{Sorted Consecutive Occurrence Queries in Substrings}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{24:1--24:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.24},
  URN =		{urn:nbn:de:0030-drops-231187},
  doi =		{10.4230/LIPIcs.CPM.2025.24},
  annote =	{Keywords: string algorithm, consecutive occurrences, suffix tree}
}
  • Refine by Type
  • 84 Document/PDF
  • 21 Document/HTML
  • 1 Volume

  • Refine by Publication Year
  • 3 2026
  • 19 2025
  • 2 2024
  • 3 2023
  • 4 2022
  • Show More...

  • Refine by Author
  • 35 Radoszewski, Jakub
  • 17 Rytter, Wojciech
  • 14 Charalampopoulos, Panagiotis
  • 14 Pissis, Solon P.
  • 14 Waleń, Tomasz
  • Show More...

  • Refine by Series/Journal
  • 81 LIPIcs
  • 3 OASIcs

  • Refine by Classification
  • 33 Theory of computation → Pattern matching
  • 3 Theory of computation → Data structures design and analysis
  • 2 Mathematics of computing → Combinatorial algorithms
  • 2 Theory of computation → Design and analysis of algorithms
  • 2 Theory of computation → Regular languages
  • Show More...

  • Refine by Keyword
  • 7 string algorithms
  • 6 edit distance
  • 6 suffix tree
  • 5 Hamming distance
  • 5 pattern matching
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail