165 Search Results for "Pissis, Solon P."


Volume

LIPIcs, Volume 312

24th International Workshop on Algorithms in Bioinformatics (WABI 2024)

WABI 2024, September 2-4, 2024, Royal Holloway, London, United Kingdom

Editors: Solon P. Pissis and Wing-Kin Sung

Volume

LIPIcs, Volume 128

30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019)

CPM 2019, June 18-20, 2019, Pisa, Italy

Editors: Nadia Pisanti and Solon P. Pissis

Volume

LIPIcs, Volume 75

16th International Symposium on Experimental Algorithms (SEA 2017)

SEA 2017, June 21-23, 2017, London, UK

Editors: Costas S. Iliopoulos, Solon P. Pissis, Simon J. Puglisi, and Rajeev Raman

Document
On the Complexity of Language Membership for Probabilistic Words

Authors: Antoine Amarilli, Mikaël Monet, Paul Raphaël, and Sylvain Salvati

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)


Abstract
We study the membership problem to context-free languages L (CFLs) on probabilistic words, that specify for each position a probability distribution on the letters (assuming independence across positions). Our task is to compute, given a probabilistic word, what is the probability that a word drawn according to the distribution belongs to L. This problem generalizes the problem of counting how many words of length n belong to L, or of counting how many completions of a partial word belong to L. We show that this problem is in polynomial time for unambiguous context-free languages (uCFLs), but can be #P-hard already for unions of two linear uCFLs. More generally, we show that the problem is in polynomial time for so-called poly-slicewise-unambiguous languages, where given a length n we can tractably compute an uCFL for the words of length n in the language. This class includes some inherently ambiguous languages, and implies the tractability of bounded CFLs and of languages recognized by unambiguous polynomial-time counter automata; but we show that the problem can be #P-hard for nondeterministic counter automata, even for Parikh automata with a single counter. We then introduce classes of circuits from knowledge compilation which we use for tractable counting, and show that this covers the tractability of poly-slicewise-unambiguous languages and of some CFLs that are not poly-slicewise-unambiguous. Extending these circuits with negation further allows us to show tractability for the language of primitive words, and for the language of concatenations of two palindromes. We finally show the conditional undecidability of the meta-problem that asks, given a CFG, whether the probabilistic membership problem for that CFG is tractable or #P-hard.

Cite as

Antoine Amarilli, Mikaël Monet, Paul Raphaël, and Sylvain Salvati. On the Complexity of Language Membership for Probabilistic Words. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 5:1-5:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.STACS.2026.5,
  author =	{Amarilli, Antoine and Monet, Mika\"{e}l and Rapha\"{e}l, Paul and Salvati, Sylvain},
  title =	{{On the Complexity of Language Membership for Probabilistic Words}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{5:1--5:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.5},
  URN =		{urn:nbn:de:0030-drops-254943},
  doi =		{10.4230/LIPIcs.STACS.2026.5},
  annote =	{Keywords: Automaton, probabilistic words, context-free grammar, membership problem}
}
Document
Time-Optimal Construction of String Synchronizing Sets

Authors: Jonas Ellert and Tomasz Kociumaka

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)


Abstract
A powerful design principle behind many modern string algorithms is local consistency: breaking the symmetry between string positions based on their small contexts so that matching fragments are handled consistently. Among the most influential instantiations of this principle are string synchronizing sets [Kempa & Kociumaka; STOC 2019]. A τ-synchronizing set of a string of length n is a set of O(n/τ) string positions, chosen using their length-2τ contexts, such that (outside of highly periodic regions) every block of τ consecutive positions contains at least one element of the set. Synchronizing sets have found dozens of applications in diverse settings, from quantum and dynamic algorithms to fully compressed computation. In the classic word RAM model, particularly for strings over small alphabets, they enabled faster solutions to core problems in data compression, text indexing, and string similarity. In this work, we show that any string T ∈ [0 .. σ)ⁿ can be preprocessed in O(n log σ / log n) time so that, for any given integer τ ∈ [1 .. n], a τ-synchronizing set of T can be constructed in O((n log τ)/(τ log n)) time. Both bounds are optimal in the word RAM model with machine word size w = Θ(log n), matching the information-theoretic minimum for the input and output sizes, respectively. Previously, constructing a τ-synchronizing set required O(n/τ) time after an O(n)-time preprocessing [Kociumaka, Radoszewski, Rytter, and Waleń; SICOMP 2024], or, in the restricted regime of τ < 0.2 log_σ n, without any preprocessing needed [Kempa & Kociumaka; STOC 2019]. A simple instantiation of our method outputs the synchronizing set as a sorted list in O(n/τ) time, or as a bitmask in O(n/log n) time. Our optimal construction produces a compact fully indexable dictionary, supporting select queries in O(1) time and rank queries in O(log ((log τ)/(log log n))) time. The latter complexity matches known unconditional cell-probe lower bounds for τ ≤ n^{1-Ω(1)}. To achieve this, we introduce a general framework for efficiently processing sparse integer sequences via a custom variable-length encoding. We also augment the optimal variant of van Emde Boas trees [Pătraşcu & Thorup; STOC 2006] with a deterministic linear-time construction. When the set is represented as a bitmask under our sparse encoding, the same guarantees for select and rank queries hold after preprocessing in time proportional to the size of our encoding (in words).

Cite as

Jonas Ellert and Tomasz Kociumaka. Time-Optimal Construction of String Synchronizing Sets. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 36:1-36:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{ellert_et_al:LIPIcs.STACS.2026.36,
  author =	{Ellert, Jonas and Kociumaka, Tomasz},
  title =	{{Time-Optimal Construction of String Synchronizing Sets}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{36:1--36:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.36},
  URN =		{urn:nbn:de:0030-drops-255258},
  doi =		{10.4230/LIPIcs.STACS.2026.36},
  annote =	{Keywords: synchronizing sets, local consistency, packed strings}
}
Document
Dynamic Pattern Matching with Wildcards

Authors: Arshia Ataee Naeini, Amir-Parsa Mobed, Masoud Seddighin, and Saeed Seddighin

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)


Abstract
We study the fully dynamic pattern matching problem where the pattern may contain up to k wildcard symbols, each matching any symbol of the alphabet. Both the text and the pattern are subject to updates (insert, delete, change). We design an algorithm with 𝒪(n log² n) preprocessing and update/query time 𝒪̃(kn^{k/{k+1}} + k² log n). The bound is truly sublinear for a constant k, and sublinear when k = o(log n). We further complement our results with a conditional lower bound: assuming subquadratic preprocessing time, achieving truly sublinear update time for the case k = Ω(log n) would contradict the Strong Exponential Time Hypothesis (SETH). Finally, we develop sublinear algorithms for two special cases: - If the pattern contains w non-wildcard symbols, we give an algorithm with preprocessing time 𝒪(nw) and update time 𝒪(w + log n), which is truly sublinear whenever w is truly sublinear. - Using FFT technique combined with block decomposition, we design a deterministic truly sublinear algorithm with preprocessing time 𝒪(n^{1.8}) and update time 𝒪(n^{0.8} log n) for the case that there are at most two non-wildcards.

Cite as

Arshia Ataee Naeini, Amir-Parsa Mobed, Masoud Seddighin, and Saeed Seddighin. Dynamic Pattern Matching with Wildcards. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 68:1-68:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{naeini_et_al:LIPIcs.STACS.2026.68,
  author =	{Naeini, Arshia Ataee and Mobed, Amir-Parsa and Seddighin, Masoud and Seddighin, Saeed},
  title =	{{Dynamic Pattern Matching with Wildcards}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{68:1--68:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.68},
  URN =		{urn:nbn:de:0030-drops-255579},
  doi =		{10.4230/LIPIcs.STACS.2026.68},
  annote =	{Keywords: pattern matching, wildcards, dynamic algorithms, string algorithms, data structures}
}
Document
A Finer View of the Parameterized Landscape of Labeled Graph Contractions

Authors: Yashaswini Mathur and Prafullkumar Tale

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)


Abstract
We study the Labeled Contractibility problem, where the input consists of two vertex-labeled graphs G and H, and the goal is to determine whether H can be obtained from G via a sequence of edge contractions. Lafond and Marchand [WADS 2025] initiated the parameterized complexity study of this problem, showing it to be W[1]-hard when parameterized by the number k of allowed contractions. They also proved that the problem is fixed-parameter tractable when parameterized by the tree-width tw of G, via an application of Courcelle’s theorem resulting in a non-constructive algorithm. In this work, we present a constructive fixed-parameter algorithm for Labeled Contractibility with running time 2^{𝒪(tw²)} ⋅ |V(G)|^{𝒪(1)}. We also prove that unless the Exponential Time Hypothesis ({ETH}) fails, it does not admit an algorithm running in time 2^{o(tw²)} ⋅ |V(G)|^{𝒪(1)}. This result adds Labeled Contractibility to a small list of problems that admit such a lower bound and matching algorithm. We further strengthen existing hardness results by showing that the problem remains NP-complete even when both input graphs have bounded maximum degree. We also investigate parameterizations by (k + δ(G)) where δ(G) denotes the degeneracy of G, and rule out the existence of subexponential-time algorithms. This answers question raised in Lafond and Marchand [WADS 2025]. We additionally provide an improved FPT algorithm with better dependence on (k + δ(G)) than previously known. Finally, we analyze a brute-force algorithm for Labeled Contractibility with running time |V(H)|^{𝒪(|V(G)|)}, and show that this running time is optimal under {ETH}.

Cite as

Yashaswini Mathur and Prafullkumar Tale. A Finer View of the Parameterized Landscape of Labeled Graph Contractions. In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 43:1-43:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{mathur_et_al:LIPIcs.FSTTCS.2025.43,
  author =	{Mathur, Yashaswini and Tale, Prafullkumar},
  title =	{{A Finer View of the Parameterized Landscape of Labeled Graph Contractions}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{43:1--43:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.43},
  URN =		{urn:nbn:de:0030-drops-251237},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.43},
  annote =	{Keywords: Labeled Contraction, ETH Lower-bound, Treewidth, NP-hard}
}
Document
Safe Sequences via Dominators in DAGs for Path-Covering Problems

Authors: Francisco Sena, Romeo Rizzi, and Alexandru I. Tomescu

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
A path-covering problem on a directed acyclic graph (DAG) requires finding a set of source-to-sink paths that cover all the nodes, all the arcs, or subsets thereof, and additionally they are optimal with respect to some function. In this paper we study safe sequences of nodes or arcs, namely sequences that appear in some path of every path cover of a DAG. We show that safe sequences admit a simple characterization via cutnodes. Moreover, we establish a connection between maximal safe sequences and leaf-to-root paths in the source- and sink-dominator trees of the DAG, which may be of independent interest in the extensive literature on dominators. With dominator trees, safe sequences admit an O(n)-size representation and a linear-time output-sensitive enumeration algorithm running in time O(m + o), where n and m are the number of nodes and arcs, respectively, and o is the total length of the maximal safe sequences. We then apply maximal safe sequences to simplify Integer Linear Programs (ILPs) for two path-covering problems, LeastSquares and MinPathError, which are at the core of RNA transcript assembly problems from bioinformatics. On various datasets, maximal safe sequences can be computed in under 0.1 seconds per graph, on average, and ILP solvers whose search space is reduced in this manner exhibit significant speed-ups. For example on graphs with a large width, average speed-ups are in the range 50-250× for MinPathError and in the range 80-350× for LeastSquares. Optimizing ILPs using safe sequences can thus become a fast building block of practical RNA transcript assembly tools, and more generally, of path-covering problems.

Cite as

Francisco Sena, Romeo Rizzi, and Alexandru I. Tomescu. Safe Sequences via Dominators in DAGs for Path-Covering Problems. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 55:1-55:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{sena_et_al:LIPIcs.ESA.2025.55,
  author =	{Sena, Francisco and Rizzi, Romeo and Tomescu, Alexandru I.},
  title =	{{Safe Sequences via Dominators in DAGs for Path-Covering Problems}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{55:1--55:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.55},
  URN =		{urn:nbn:de:0030-drops-245230},
  doi =		{10.4230/LIPIcs.ESA.2025.55},
  annote =	{Keywords: directed acyclic graph, path cover, dominator tree, integer linear programming, least squares, minimum path error}
}
Document
Better Indexing for Rectangular Pattern Matching

Authors: Paweł Gawrychowski and Adam Górkiewicz

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
We revisit the complexity of building, given a two-dimensional string of size n, an indexing structure that allows locating all k occurrences of a two-dimensional pattern of size m. While a structure of size 𝒪(n) with query time 𝒪(m+k) is known for this problem under the additional assumption that the pattern is a square [Giancarlo, SICOMP 1995], a popular belief was that for rectangular patterns one cannot achieve such (or even similar) bounds, due to a lower bound for a certain natural class of approaches [Giancarlo, WADS 1993]. We show that, in fact, it is possible to construct a very simple structure of size 𝒪(nlog n) that supports such queries for any rectangular pattern in 𝒪(m+klog^{ε}n) time, for any ε > 0.

Cite as

Paweł Gawrychowski and Adam Górkiewicz. Better Indexing for Rectangular Pattern Matching. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 33:1-33:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{gawrychowski_et_al:LIPIcs.ESA.2025.33,
  author =	{Gawrychowski, Pawe{\l} and G\'{o}rkiewicz, Adam},
  title =	{{Better Indexing for Rectangular Pattern Matching}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{33:1--33:7},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.33},
  URN =		{urn:nbn:de:0030-drops-245011},
  doi =		{10.4230/LIPIcs.ESA.2025.33},
  annote =	{Keywords: 2D strings, pattern matching, string indexing}
}
Document
Fast Computation of k-Runs, Parameterized Squares, and Other Generalised Squares

Authors: Yuto Nakashima, Jakub Radoszewski, and Tomasz Waleń

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
A k-mismatch square is a string of the form XY where X and Y are two equal-length strings that have at most k mismatches. Kolpakov and Kucherov [Theor. Comput. Sci., 2003] defined two notions of k-mismatch repeats, called k-repetitions and k-runs, each representing a sequence of consecutive k-mismatch squares of equal length. They proposed algorithms for computing k-repetitions and k-runs working in 𝒪(nklog k+output) time for a string of length n over an integer alphabet, where output is the number of the reported repeats. We show that output = 𝒪(nk log k), both in case of k-repetitions and k-runs, which implies that the complexity of their algorithms is actually 𝒪(nk log k). We apply this result to computing parameterized squares. A parameterized square is a string of the form XY such that X and Y parameterized-match, i.e., there exists a bijection f on the alphabet such that f(X) = Y. Two parameterized squares XY and X'Y' are equivalent if they parameterized match. Recently Hamai et al. [SPIRE 2024] showed that a string of length n over an alphabet of size σ contains less than nσ non-equivalent parameterized squares, improving an earlier bound by Kociumaka et al. [Theor. Comput. Sci., 2016]. We apply our bound for k-mismatch repeats to propose an algorithm that reports all non-equivalent parameterized squares in 𝒪(nσ log σ) time. We also show that the number of non-equivalent parameterized squares can be computed in 𝒪(n log n) time. This last algorithm applies to squares under any substring compatible equivalence relation and also to counting squares that are distinct as strings. In particular, this improves upon the 𝒪(nσ)-time algorithm of Gawrychowski et al. [CPM 2023] for counting order-preserving squares that are distinct as strings if σ = ω(log n).

Cite as

Yuto Nakashima, Jakub Radoszewski, and Tomasz Waleń. Fast Computation of k-Runs, Parameterized Squares, and Other Generalised Squares. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{nakashima_et_al:LIPIcs.ESA.2025.8,
  author =	{Nakashima, Yuto and Radoszewski, Jakub and Wale\'{n}, Tomasz},
  title =	{{Fast Computation of k-Runs, Parameterized Squares, and Other Generalised Squares}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{8:1--8:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.8},
  URN =		{urn:nbn:de:0030-drops-244768},
  doi =		{10.4230/LIPIcs.ESA.2025.8},
  annote =	{Keywords: string algorithm, k-mismatch square, parameterized square, order-preserving square, maximum gapped repeat}
}
Document
Parameterized Algorithms for Computing Pareto Sets

Authors: Joshua Marc Könen, Heiko Röglin, and Tarek Stuck

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
The problem of computing the set of Pareto-optimal solutions has been studied for a variety of multiobjective optimization problems. For many such problems, algorithms are known that compute the Pareto set in (weak) output-polynomial time. These algorithms are often based on dynamic programming and by weak output-polynomial time, we mean that the running time depends polynomially on the size of the Pareto set but also on the sizes of the Pareto sets of the subproblems that occur in the dynamic program. For some problems, like the multiobjective minimum spanning tree problem, such algorithms are not known to exist and for other problems, like multiobjective versions of many NP-hard problems, such algorithms cannot exist, unless 𝒫 = 𝒩𝒫. Dynamic programming over tree decompositions is a common technique in parameterized algorithms. In this paper, we study whether this technique can also be applied to compute Pareto sets of multiobjective optimization problems. We first derive an algorithm to compute the Pareto set for the multicriteria s-t cut problem and show how this result can be applied to a polygon aggregation problem arising in cartography that has recently been introduced by Rottmann et al. (GIScience 2021). We also show how to apply these techniques to also compute the Pareto set of the multiobjective minimum spanning tree problem and for the multiobjective TSP. The running time of our algorithms is O(f(w)⋅poly(n,p_{max})), where f is some function in the treewidth w, n is the input size, and p_{max} is an upper bound on the size of the Pareto sets of the subproblems that occur in the dynamic program. Finally, we present an experimental evaluation of computing Pareto sets on real-world instances of polygon aggregation problems. For this matter we devised a task-specific data structure that allows for efficient storage and modification of large sets of Pareto-optimal solutions. Throughout the implementation process, we incorporated several improved strategies and heuristics that significantly reduced both runtime and memory usage, enabling us to solve instances with treewidth of up to 22 within reasonable amount of time. Moreover, we conducted a preprocessing study to compare different tree decompositions in terms of their estimated overall runtime.

Cite as

Joshua Marc Könen, Heiko Röglin, and Tarek Stuck. Parameterized Algorithms for Computing Pareto Sets. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 105:1-105:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{konen_et_al:LIPIcs.ESA.2025.105,
  author =	{K\"{o}nen, Joshua Marc and R\"{o}glin, Heiko and Stuck, Tarek},
  title =	{{Parameterized Algorithms for Computing Pareto Sets}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{105:1--105:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.105},
  URN =		{urn:nbn:de:0030-drops-245749},
  doi =		{10.4230/LIPIcs.ESA.2025.105},
  annote =	{Keywords: parameterized algorithms, treewidth, multicriteria optimization problems, multicriteria MST, multicriteria TSP, polygon aggregation}
}
Document
When Is String Reconstruction Using de Bruijn Graphs Hard?

Authors: Ben Bals, Sebastiaan van Krieken, Solon P. Pissis, Leen Stougie, and Hilde Verbeek

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
The reduction of the fragment assembly problem to (variations of) the classical Eulerian trail problem [Pevzner et al., PNAS 2001] has led to remarkable progress in genome assembly. This reduction employs the notion of de Bruijn graph G = (V,E) of order k over an alphabet Σ. A single Eulerian trail in G represents a candidate genome reconstruction. Bernardini et al. have also introduced the complementary idea in data privacy [ALENEX 2020] based on z-anonymity. Let S be a private string that we would like to release, preventing, however, its full reconstruction. For a privacy threshold z > 0, we compute the largest k for which there exist at least z Eulerian trails in the order-k de Bruijn graph of S, and release a string S' obtained via a random Eulerian trail. The pressing question is: How hard is it to reconstruct a best string from a de Bruijn graph given a function that models domain knowledge? Such a function maps every length-k string to an interval of positions where it may occur in the reconstructed string. By the above reduction to de Bruijn graphs, the latter function translates into a function c mapping every edge to an interval where it may occur in an Eulerian trail. This gives rise to the following basic problem on graphs: Given an instance (G,c), can we efficiently compute an Eulerian trail respecting c? Hannenhalli et al. [CABIOS 1996] formalized this problem and showed that it is NP-complete. Ben-Dor et al. [J. Comput. Biol. 2002] showed that it is NP-complete, even on de Bruin graphs with |Σ| = 4. In this work, we settle the lower-bound side of this problem by showing that finding a c-respecting Eulerian trail in de Bruijn graphs over alphabets of size 2 is NP-complete. We then shift our focus to parametrization aiming to capture the quality of our domain knowledge in the complexity. Ben-Dor et al. developed an algorithm to solve the problem on de Bruijn graphs in 𝒪(m⋅w^{1.5} 4^w) time, where m = |E| and w is the maximum interval length over all edges in E. Bumpus and Meeks [Algorithmica 2023] later rediscovered the same algorithm on temporal graphs, which highlights the relevance of this problem in other contexts. Our central contribution is showing how combinatorial insights lead to exponential-time improvements over the state-of-the-art algorithm. In particular, for the important class of de Bruijn graphs, we develop an algorithm parametrized by w (log w+1) /(k-1): for a de Bruijn graph of order k, it runs in 𝒪(mw⋅2^{w(log(w)+1)/(k-1)}) time. Our result improves on the state of the art by roughly an exponent of (log(w)+1)/(k-1). The existing algorithms have a natural interpretation for string reconstruction: when for each length-k string, we know a small range of positions it must lie in, string reconstruction can be solved in linear time. Our improved algorithm shows that it is enough when the range of positions is small relative to k. We then generalize both the existing and our novel FPT algorithm by allowing the cost at every position of an interval to vary. In this optimization version, our hardness result translates into inapproximability and the FPT algorithms work with a slight extension. Surprisingly, even in this more general setting, we extend the FPT algorithms to count and enumerate the min-cost Eulerian trails. The counting result has direct applications in the data privacy framework of Bernardini et al.

Cite as

Ben Bals, Sebastiaan van Krieken, Solon P. Pissis, Leen Stougie, and Hilde Verbeek. When Is String Reconstruction Using de Bruijn Graphs Hard?. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 53:1-53:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{bals_et_al:LIPIcs.ESA.2025.53,
  author =	{Bals, Ben and van Krieken, Sebastiaan and Pissis, Solon P. and Stougie, Leen and Verbeek, Hilde},
  title =	{{When Is String Reconstruction Using de Bruijn Graphs Hard?}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{53:1--53:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.53},
  URN =		{urn:nbn:de:0030-drops-245215},
  doi =		{10.4230/LIPIcs.ESA.2025.53},
  annote =	{Keywords: string algorithm, graph algorithm, de Bruijn graph, Eulerian trail}
}
Document
Counting Distinct Square Substrings in Sublinear Time

Authors: Panagiotis Charalampopoulos, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 345, 50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025)


Abstract
We show that the number of distinct squares in a packed string of length n over an alphabet of size σ can be computed in 𝒪(n/log_{σ}n) time in the word-RAM model of computation. This paper is the first to introduce a sublinear time algorithm for the packed version of squares counting. The packed representation of a string of length n over an alphabet of size σ is given as a sequence of 𝒪(n/ log_{σ} n) machine words in the word-RAM model (a machine word consists of ω ≥ log₂ n bits). Previously it was known how to count distinct squares in 𝒪(n) time [Gusfield and Stoye, JCSS 2004], even for a string over an integer alphabet, see [Crochemore et al., TCS 2014; Bannai et al., CPM 2017; Charalampopoulos et al., SPIRE 2020]. We use techniques of squares extraction from runs described by Crochemore et al. [TCS 2014]. However, the packed model requires novel approaches. In particular, we need an 𝒪(n/log_{σ}n) sized representation of all long-period runs (runs with periods that are Ω(log_{σ}n)) which guarantees sublinear time counting of potentially linearly-many implied squares. The long-period runs with a string period that is periodic itself (called layer runs) are an obstacle, since their number can be Ω(n). Fortunately, the number of all other long-period runs is 𝒪(n/log_{σ}n) and we can construct an implicit representation of all long-period runs in 𝒪(n/log_{σ}n) time by adopting the insights of Amir et al. [ESA 2019], combined with sublinear time tools provided by the PILLAR model of computations in case of packed strings. We count squares in layer runs in sublinear time by exploiting combinatorial properties of types of pyramidally-shaped groups of layer runs. As a by-product, we discover several new structural properties of runs. Another difficulty is to compute, in sublinear time, locations of Lyndon roots of runs in packed strings, which is needed for grouping of runs that can generate equal squares. To overcome this difficulty, we introduce sparse-Lyndon roots which are based on the notion of string synchronizers proposed by Kempa and Kociumaka [STOC 2019].

Cite as

Panagiotis Charalampopoulos, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. Counting Distinct Square Substrings in Sublinear Time. In 50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 345, pp. 36:1-36:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.MFCS.2025.36,
  author =	{Charalampopoulos, Panagiotis and Mohamed, Manal and Radoszewski, Jakub and Rytter, Wojciech and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Counting Distinct Square Substrings in Sublinear Time}},
  booktitle =	{50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025)},
  pages =	{36:1--36:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-388-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{345},
  editor =	{Gawrychowski, Pawe{\l} and Mazowiecki, Filip and Skrzypczak, Micha{\l}},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2025.36},
  URN =		{urn:nbn:de:0030-drops-241439},
  doi =		{10.4230/LIPIcs.MFCS.2025.36},
  annote =	{Keywords: square in a string, packed model, run (maximal repetition), Lyndon word}
}
Document
An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT

Authors: Ahsan Sanaullah, Degui Zhi, and Shaojie Zhang

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
String matching problems in bioinformatics are typically for finding exact substring matches between a query and a reference text. Previous formulations often focus on maximum exact matches (MEMs). However, multiple occurrences of substrings of the query in the text that are long enough but not maximal may not be captured by MEMs. Such long matches can be informative, especially when the text is a collection of similar sequences such as genomes. In this paper, we describe a new type of match between a pattern and a text that aren't necessarily maximal in the query, but still contain useful matching information: locally maximal exact matches (LEMs). There are usually a large amount of LEMs, so we only consider those above some length threshold ℒ. These are referred to as long LEMs. The purpose of long LEMs is to capture substring matches between a query and a text that are not necessarily maximal in the pattern but still long enough to be important. Therefore efficient long LEMs finding algorithms are desired for these datasets. However, these datasets are too large to query on traditional string indexes. Fortunately, these datasets are very repetitive. Recently, compressed string indexes that take advantage of the redundancy in the data but retain efficient querying capability have been proposed as a solution. We therefore give an efficient algorithm for computing all the long LEMs of a query and a text in a BWT runs compressed string index. We describe an O(m+occ) expected time algorithm that relies on an O(r) words space string index for outputting all long LEMs of a pattern with respect to a text given the matching statistics of the pattern with respect to the text. Here m is the length of the query, occ is the number of long LEMs outputted, and r is the number of runs in the BWT of the text. The O(r) space string index we describe relies on an adaptation of the move data structure by Nishimoto and Tabei. We are able to support LCP[i] queries in constant time given SA[i]. In other words, we answer PLCP[i] queries in constant time. These PLCP queries enable the efficient long LEM query. Long LEMs may provide useful similarity information between a pattern and a text that MEMs may ignore. This information is particularly useful in pangenome and biobank scale haplotype panel contexts.

Cite as

Ahsan Sanaullah, Degui Zhi, and Shaojie Zhang. An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 17:1-17:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{sanaullah_et_al:LIPIcs.WABI.2025.17,
  author =	{Sanaullah, Ahsan and Zhi, Degui and Zhang, Shaojie},
  title =	{{An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{17:1--17:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.17},
  URN =		{urn:nbn:de:0030-drops-239433},
  doi =		{10.4230/LIPIcs.WABI.2025.17},
  annote =	{Keywords: BWT, LEM, Long LEM, MEM, Run Length Compressed BWT, Move Data Structure, Pangenome}
}
Document
Design of Worst-Case-Optimal Spaced Seeds

Authors: Jens Zentgraf and Sven Rahmann

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
Read mapping (and alignment) is a fundamental problem in biological sequence analysis. For speed and computational efficiency, many popular read mappers tolerate only a few differences between the read and the corresponding part of the reference genome, which leads to reference bias: Reads with too many differences are not guaranteed to be mapped correctly or at all, because to even consider a genomic position, a sufficiently long exact match (seed) must exist. While pangenomes and their graph-based representations provide one way to avoid reference bias by enlarging the reference, we explore an orthogonal approach and consider stronger substitution-tolerant primitives, namely spaced seeds or gapped k-mers. Given two integers k ≤ w, one considers k selected positions, described by a mask, from each length-w window in a sequence. In the existing literature, masks with certain probabilistic guarantees have been designed for small values of k. Here, for the first time, we take a combinatorial approach from a worst-case perspective. For any mask, using integer linear programs, we find least favorable distributions of sequence changes in two different senses: (1) minimizing the number of unchanged windows; (2) minimizing the number of positions covered by unchanged windows. Then, among all masks or all symmetric masks of a given shape (k,w), we find the set of best masks that maximize these minima. As a result, we obtain robust masks, even for large numbers of changes. We illustrate the properties of these masks by constructing a challenging set of reads that contain many approximately equidistributed substitutions (but no indels) that many existing tools cannot map, even though they are in principle easily mappable (apart from the large number of changes) because they originate from selected non-repetitive regions of the human reference genome. We observe that the majority of these reads can be mapped with a simple alignment-free approach using chosen spaced masks, where seeding approaches based on contiguous k-mers fail.

Cite as

Jens Zentgraf and Sven Rahmann. Design of Worst-Case-Optimal Spaced Seeds. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 22:1-22:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{zentgraf_et_al:LIPIcs.WABI.2025.22,
  author =	{Zentgraf, Jens and Rahmann, Sven},
  title =	{{Design of Worst-Case-Optimal Spaced Seeds}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{22:1--22:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.22},
  URN =		{urn:nbn:de:0030-drops-239488},
  doi =		{10.4230/LIPIcs.WABI.2025.22},
  annote =	{Keywords: Spaced seed, Gapped k-mer, Integer linear program (ILP), Worst-case design, Reference bias}
}
  • Refine by Type
  • 162 Document/PDF
  • 34 Document/HTML
  • 3 Volume

  • Refine by Publication Year
  • 3 2026
  • 32 2025
  • 33 2024
  • 3 2023
  • 4 2022
  • Show More...

  • Refine by Author
  • 42 Pissis, Solon P.
  • 16 Radoszewski, Jakub
  • 12 Charalampopoulos, Panagiotis
  • 11 Bernardini, Giulia
  • 11 Loukides, Grigorios
  • Show More...

  • Refine by Series/Journal
  • 152 LIPIcs
  • 10 OASIcs

  • Refine by Classification
  • 58 Theory of computation → Pattern matching
  • 15 Applied computing → Bioinformatics
  • 14 Theory of computation → Design and analysis of algorithms
  • 11 Applied computing → Computational biology
  • 8 Mathematics of computing → Combinatorics on words
  • Show More...

  • Refine by Keyword
  • 15 string algorithms
  • 8 Hamming distance
  • 8 de Bruijn graph
  • 8 pattern matching
  • 7 string algorithm
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail