DROPS

Document

DOI: 10.4230/LIPIcs.STACS.2026.58

A Polylogarithmic Competitive Algorithm for Stochastic Online Sorting and TSP

Authors: Andreas Kalavas, Charalampos Platanos, and Thanos Tolias

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

In Online Sorting, an array of n initially empty cells is given. At each time step t, an element x_t ∈ [0,1] arrives and must be irrevocably placed in an empty cell without knowledge of future arrivals. We aim to minimize the sum of absolute differences between pairs of elements placed in consecutive array cells, seeking an online placement strategy that results in a final array close to a sorted one. An interesting multidimensional generalization, referred to as the Online Traveling Salesperson Problem, arises when the request sequence consists of points in the d-dimensional unit cube and the objective is to minimize the sum of Euclidean distances between points in consecutive cells. Motivated by the recent work of (Abrahamsen, Bercea, Beretta, Klausen and Kozma; ESA 2024), we consider the stochastic version of Online Sorting (resp. Online TSP), where each element (resp. point) x_t is an i.i.d. sample from the uniform distribution on [0, 1] (resp. [0,1]^d). By carefully decomposing the request sequence into a hierarchy of balls-into-bins instances, where the balls to bins ratio is large enough so that bin occupancy is sharply concentrated around its mean and small enough so that we can efficiently deal with the elements placed in the same bin, we obtain an online algorithm that approximates the optimal cost within a factor of O(log² n) with high probability. Our result comprises an exponential improvement over the previously best known competitive ratio of Õ(n^{1/4}) for Stochastic Online Sorting due to (Abrahamsen et al.; ESA 2024) and O(√n) for (adversarial) Online TSP due to (Bertram, ESA 2025).

Cite as

Andreas Kalavas, Charalampos Platanos, and Thanos Tolias. A Polylogarithmic Competitive Algorithm for Stochastic Online Sorting and TSP. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 58:1-58:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@InProceedings{kalavas_et_al:LIPIcs.STACS.2026.58,
  author =	{Kalavas, Andreas and Platanos, Charalampos and Tolias, Thanos},
  title =	{{A Polylogarithmic Competitive Algorithm for Stochastic Online Sorting and TSP}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{58:1--58:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.58},
  URN =		{urn:nbn:de:0030-drops-255473},
  doi =		{10.4230/LIPIcs.STACS.2026.58},
  annote =	{Keywords: sorting, online algorithm, balls-into-bins, TSP}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2026.45

Fairness in the k-Server Problem

Authors: Mohammadreza Daneshvaramoli, Mohammad Hajiesmaili, Shahin Kamali, Helia Karisani, and Cameron Musco

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)

Abstract

We initiate a formal study of fairness for the k-server problem, where the objective is not only to minimize the total movement cost, but also to distribute the cost equitably among servers. We first define a general notion of (α,β)-fairness, where, for parameters α ≥ 1 and β ≥ 0, no server incurs more than an α/k-fraction of the total cost plus an additive term β. We then show that fairness can be achieved without a loss in competitiveness in both the offline and online settings. In the offline setting, we give a deterministic algorithm that, for any ε > 0, transforms any optimal solution into an (α,β)-fair solution for α = 1 + ε and β = O(diam ⋅ log k / ε), while increasing the cost of the solution by just an additive O(diam ⋅ k log k / ε) term. Here diam is the diameter of the underlying metric space. We give a similar result in the online setting, showing that any competitive algorithm can be transformed into a randomized online algorithm that is fair with high probability against an oblivious adversary and still competitive up to a small loss. The above results leave open a significant question: can fairness be achieved in the online setting, either with a deterministic algorithm or a randomized algorithm, against a fully adaptive adversary? We make progress towards answering this question, showing that the classic deterministic Double Coverage Algorithm (DCA) is fair on line metrics and on tree metrics when k = 2. However, we also show a negative result: DCA fails to be fair for any non-vacuous parameters on general tree metrics. We further show that on uniform metrics (i.e., the paging problem), the deterministic First-In First-Out (FIFO) algorithm is fair. We show that any "marking algorithm", including the Least Recently Used (LRU) algorithm, also satisfies a weaker, but still meaningful notion of fairness.

Cite as

Mohammadreza Daneshvaramoli, Mohammad Hajiesmaili, Shahin Kamali, Helia Karisani, and Cameron Musco. Fairness in the k-Server Problem. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 45:1-45:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@InProceedings{daneshvaramoli_et_al:LIPIcs.ITCS.2026.45,
  author =	{Daneshvaramoli, Mohammadreza and Hajiesmaili, Mohammad and Kamali, Shahin and Karisani, Helia and Musco, Cameron},
  title =	{{Fairness in the k-Server Problem}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{45:1--45:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.45},
  URN =		{urn:nbn:de:0030-drops-253328},
  doi =		{10.4230/LIPIcs.ITCS.2026.45},
  annote =	{Keywords: k-server problem, online algorithms, fairness, competitive analysis}
}

Document

DOI: 10.4230/LIPIcs.IPEC.2025.8

Binary k-Center with Missing Entries: Structure Leads to Tractability

Authors: Tobias Friedrich, Kirill Simonov, and Farehe Soheil

Published in: LIPIcs, Volume 358, 20th International Symposium on Parameterized and Exact Computation (IPEC 2025)

Abstract

k-Center clustering is a fundamental classification problem, where the task is to categorize the given collection of entities into k clusters and come up with a representative for each cluster, so that the maximum distance between an entity and its representative is minimized. In this work, we focus on the setting where the entities are represented by binary vectors with missing entries, which model incomplete categorical data. This version of the problem has wide applications, from predictive analytics to bioinformatics. Our main finding is that the problem, which is notoriously hard from the classical complexity viewpoint, becomes tractable as soon as the known entries are sparse and exhibit a certain structure. Formally, we show fixed-parameter tractable algorithms for the parameters vertex cover, fracture number, and treewidth of the row-column graph, which encodes the positions of the known entries of the matrix. Additionally, we tie the complexity of the 1-cluster variant of the problem, which is famous under the name Closest String, to the complexity of solving integer linear programs with few constraints. This implies, in particular, that improving upon the running times of our algorithms would lead to more efficient algorithms for integer linear programming in general.

Cite as

Tobias Friedrich, Kirill Simonov, and Farehe Soheil. Binary k-Center with Missing Entries: Structure Leads to Tractability. In 20th International Symposium on Parameterized and Exact Computation (IPEC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 358, pp. 8:1-8:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{friedrich_et_al:LIPIcs.IPEC.2025.8,
  author =	{Friedrich, Tobias and Simonov, Kirill and Soheil, Farehe},
  title =	{{Binary k-Center with Missing Entries: Structure Leads to Tractability}},
  booktitle =	{20th International Symposium on Parameterized and Exact Computation (IPEC 2025)},
  pages =	{8:1--8:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-407-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{358},
  editor =	{Agrawal, Akanksha and van Leeuwen, Erik Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.IPEC.2025.8},
  URN =		{urn:nbn:de:0030-drops-251403},
  doi =		{10.4230/LIPIcs.IPEC.2025.8},
  annote =	{Keywords: Clustering, Missing Entries, k-Center, Parameterized Algorithms}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.27

Online Makespan Scheduling Under Scenarios

Authors: Ekin Ergen

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

We consider a natural extension of online makespan scheduling on identical parallel machines by introducing scenarios. A scenario is a subset of jobs, and the task of our problem is to find a global assignment of the jobs to machines so that the maximum makespan under a scenario, i.e., the maximum makespan of any schedule restricted to a scenario, is minimized. For varying values of the number of scenarios and machines, we explore the competitiveness of online algorithms. We prove tight and near-tight bounds, several of which are achieved through novel constructions. In particular, we leverage the interplay between the unit processing time case of our problem and the hypergraph coloring problem both ways: We use hypergraph coloring techniques to steer an adversarial family of instances proving lower bounds for our problem, which in turn leads to lower bounds for several variants of online hypergraph coloring.

Cite as

Ekin Ergen. Online Makespan Scheduling Under Scenarios. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 27:1-27:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ergen:LIPIcs.ESA.2025.27,
  author =	{Ergen, Ekin},
  title =	{{Online Makespan Scheduling Under Scenarios}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{27:1--27:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.27},
  URN =		{urn:nbn:de:0030-drops-244950},
  doi =		{10.4230/LIPIcs.ESA.2025.27},
  annote =	{Keywords: online scheduling, scenario-based model, online algorithms}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.80

Online Metric TSP

Authors: Christian Bertram

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

In the online metric traveling salesperson problem, n points of a metric space arrive one by one and have to be placed (immediately and irrevocably) into empty cells of a size-n array. The goal is to minimize the sum of distances between consecutive points in the array. This problem was introduced by Abrahamsen, Bercea, Beretta, Klausen, and Kozma [ESA'24] as a generalization of the online sorting problem, which was introduced by Aamand, Abrahamsen, Beretta, and Kleist [SODA'23] as a tool in their study of online geometric packing problems. Online metric TSP has been studied for a range of fixed metric spaces. For 1-dimensional Euclidean space, the problem is equivalent to online sorting, where an optimal competitive ratio of Θ(√n) is known. For d-dimensional Euclidean space, the best-known upper bound is O(2^d √{dn log n}), leaving a gap to the Ω(√n) lower bound. Finally, for the uniform metric, where all distances are 0 or 1, the optimal competitive ratio is known to be Θ(log n). We study the problem for a general metric space, presenting an algorithm with competitive ratio O(√n). In particular, we close the gap for d-dimensional Euclidean space, completely removing the dependence on dimension. One might hope to simultaneously guarantee competitive ratio O(√n) in general and O(log n) for the uniform metric, but we show that this is impossible.

Cite as

Christian Bertram. Online Metric TSP. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 80:1-80:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{bertram:LIPIcs.ESA.2025.80,
  author =	{Bertram, Christian},
  title =	{{Online Metric TSP}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{80:1--80:9},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.80},
  URN =		{urn:nbn:de:0030-drops-245485},
  doi =		{10.4230/LIPIcs.ESA.2025.80},
  annote =	{Keywords: online algorithm, metric space, TSP}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.53

When Is String Reconstruction Using de Bruijn Graphs Hard?

Authors: Ben Bals, Sebastiaan van Krieken, Solon P. Pissis, Leen Stougie, and Hilde Verbeek

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

The reduction of the fragment assembly problem to (variations of) the classical Eulerian trail problem [Pevzner et al., PNAS 2001] has led to remarkable progress in genome assembly. This reduction employs the notion of de Bruijn graph G = (V,E) of order k over an alphabet Σ. A single Eulerian trail in G represents a candidate genome reconstruction. Bernardini et al. have also introduced the complementary idea in data privacy [ALENEX 2020] based on z-anonymity. Let S be a private string that we would like to release, preventing, however, its full reconstruction. For a privacy threshold z > 0, we compute the largest k for which there exist at least z Eulerian trails in the order-k de Bruijn graph of S, and release a string S' obtained via a random Eulerian trail. The pressing question is: How hard is it to reconstruct a best string from a de Bruijn graph given a function that models domain knowledge? Such a function maps every length-k string to an interval of positions where it may occur in the reconstructed string. By the above reduction to de Bruijn graphs, the latter function translates into a function c mapping every edge to an interval where it may occur in an Eulerian trail. This gives rise to the following basic problem on graphs: Given an instance (G,c), can we efficiently compute an Eulerian trail respecting c? Hannenhalli et al. [CABIOS 1996] formalized this problem and showed that it is NP-complete. Ben-Dor et al. [J. Comput. Biol. 2002] showed that it is NP-complete, even on de Bruin graphs with |Σ| = 4. In this work, we settle the lower-bound side of this problem by showing that finding a c-respecting Eulerian trail in de Bruijn graphs over alphabets of size 2 is NP-complete. We then shift our focus to parametrization aiming to capture the quality of our domain knowledge in the complexity. Ben-Dor et al. developed an algorithm to solve the problem on de Bruijn graphs in 𝒪(m⋅w^{1.5} 4^w) time, where m = |E| and w is the maximum interval length over all edges in E. Bumpus and Meeks [Algorithmica 2023] later rediscovered the same algorithm on temporal graphs, which highlights the relevance of this problem in other contexts. Our central contribution is showing how combinatorial insights lead to exponential-time improvements over the state-of-the-art algorithm. In particular, for the important class of de Bruijn graphs, we develop an algorithm parametrized by w (log w+1) /(k-1): for a de Bruijn graph of order k, it runs in 𝒪(mw⋅2^{w(log(w)+1)/(k-1)}) time. Our result improves on the state of the art by roughly an exponent of (log(w)+1)/(k-1). The existing algorithms have a natural interpretation for string reconstruction: when for each length-k string, we know a small range of positions it must lie in, string reconstruction can be solved in linear time. Our improved algorithm shows that it is enough when the range of positions is small relative to k. We then generalize both the existing and our novel FPT algorithm by allowing the cost at every position of an interval to vary. In this optimization version, our hardness result translates into inapproximability and the FPT algorithms work with a slight extension. Surprisingly, even in this more general setting, we extend the FPT algorithms to count and enumerate the min-cost Eulerian trails. The counting result has direct applications in the data privacy framework of Bernardini et al.

Cite as

Ben Bals, Sebastiaan van Krieken, Solon P. Pissis, Leen Stougie, and Hilde Verbeek. When Is String Reconstruction Using de Bruijn Graphs Hard?. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 53:1-53:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{bals_et_al:LIPIcs.ESA.2025.53,
  author =	{Bals, Ben and van Krieken, Sebastiaan and Pissis, Solon P. and Stougie, Leen and Verbeek, Hilde},
  title =	{{When Is String Reconstruction Using de Bruijn Graphs Hard?}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{53:1--53:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.53},
  URN =		{urn:nbn:de:0030-drops-245215},
  doi =		{10.4230/LIPIcs.ESA.2025.53},
  annote =	{Keywords: string algorithm, graph algorithm, de Bruijn graph, Eulerian trail}
}

Document

DOI: 10.4230/LIPIcs.WADS.2025.41

Scheduling on Identical Machines with Setup Time and Unknown Execution Time

Authors: Yasushi Kawase, Kazuhisa Makino, Vinh Long Phan, and Hanna Sumita

Published in: LIPIcs, Volume 349, 19th International Symposium on Algorithms and Data Structures (WADS 2025)

Abstract

In this study, we investigate a scheduling problem on identical machines in which jobs require initial setup before execution. We assume that an algorithm can dynamically form a batch (i.e., a collection of jobs to be processed together) from the remaining jobs. The setup time is modeled as a known monotone function of the set of jobs within a batch, while the execution time of each job remains unknown until completion. This uncertainty poses significant challenges for minimizing the makespan. We address these challenges by considering two scenarios: each job batch must be assigned to a single machine, or a batch may be distributed across multiple machines. For both scenarios, we analyze settings with and without preemption. Across these four settings, we design online algorithms that achieve asymptotically optimal competitive ratios with respect to both the number of jobs and the number of machines.

Cite as

Yasushi Kawase, Kazuhisa Makino, Vinh Long Phan, and Hanna Sumita. Scheduling on Identical Machines with Setup Time and Unknown Execution Time. In 19th International Symposium on Algorithms and Data Structures (WADS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 349, pp. 41:1-41:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{kawase_et_al:LIPIcs.WADS.2025.41,
  author =	{Kawase, Yasushi and Makino, Kazuhisa and Phan, Vinh Long and Sumita, Hanna},
  title =	{{Scheduling on Identical Machines with Setup Time and Unknown Execution Time}},
  booktitle =	{19th International Symposium on Algorithms and Data Structures (WADS 2025)},
  pages =	{41:1--41:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-398-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{349},
  editor =	{Morin, Pat and Oh, Eunjin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WADS.2025.41},
  URN =		{urn:nbn:de:0030-drops-242728},
  doi =		{10.4230/LIPIcs.WADS.2025.41},
  annote =	{Keywords: Online scheduling, Competitive analysis, Makespan minimization, Identical machines scheduling}
}

Document

DOI: 10.4230/LIPIcs.WABI.2025.15

Average-Tree Phylogenetic Diversity of Networks

Authors: Leo van Iersel, Mark Jones, Jannik Schestag, Celine Scornavacca, and Mathias Weller

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)

Abstract

Phylogenetic diversity is a measure used to quantify the biodiversity of a set of species. Here, we introduce the "average-tree" phylogenetic diversity score in rooted binary phylogenetic networks and consider algorithms for computing and maximizing the score on a given network. Basically, the score is the weighted average of the phylogenetic diversity scores in all trees displayed by the network, with the weights determined by the inheritance probabilities on the reticulation edges used in the embeddings. We show that computing the score of a given set of taxa in a given network is #P-hard, directly implying #P-hardness of finding a subset of k taxa achieving maximum diversity score and, thereby, ruling out polynomial-time algorithms for these problems unless the polynomial hierarchy collapses. However, we show that both problems can be solved efficiently if the input network is close to being a tree in the sense that its reticulation number is small. More precisely, we prove that we can solve the optimization problem in networks with n leaves and r reticulations in 2^{𝒪(r)}⋅ n⋅ k time. Using experiments on data produced by simulating a reticulate-evolution process, we show that our algorithm runs efficiently on networks with hundreds of taxa and tens of reticulations.

Cite as

Leo van Iersel, Mark Jones, Jannik Schestag, Celine Scornavacca, and Mathias Weller. Average-Tree Phylogenetic Diversity of Networks. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 15:1-15:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{vaniersel_et_al:LIPIcs.WABI.2025.15,
  author =	{van Iersel, Leo and Jones, Mark and Schestag, Jannik and Scornavacca, Celine and Weller, Mathias},
  title =	{{Average-Tree Phylogenetic Diversity of Networks}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{15:1--15:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.15},
  URN =		{urn:nbn:de:0030-drops-239405},
  doi =		{10.4230/LIPIcs.WABI.2025.15},
  annote =	{Keywords: phylogenetic diversity, phylogenetic networks, network phylogenetic diversity, algorithms, computational complexity}
}

Document

DOI: 10.4230/LIPIcs.WABI.2025.4

Haplotype-Aware Long-Read Error Correction

Authors: Parvesh Barak, Daniel Gibney, and Chirag Jain

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)

Abstract

Error correction of long reads is an important initial step in genome assembly workflows. For organisms with ploidy greater than one, it is important to preserve haplotype-specific variation during read correction. This challenge has driven the development of several haplotype-aware correction methods. However, existing methods are based on either ad-hoc heuristics or deep learning approaches. In this paper, we introduce a rigorous formulation for this problem. Our approach builds on the minimum error correction framework used in reference-based haplotype phasing. We prove that the proposed formulation for error correction of reads in de novo context, i.e., without using a reference genome, is NP-hard. To make our exact algorithm scale to large datasets, we introduce practical heuristics. Experiments using PacBio HiFi sequencing datasets from human and plant genomes show that our approach achieves accuracy comparable to state-of-the-art methods. The software is freely available at https://github.com/at-cg/HALE.

Cite as

Parvesh Barak, Daniel Gibney, and Chirag Jain. Haplotype-Aware Long-Read Error Correction. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 4:1-4:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{barak_et_al:LIPIcs.WABI.2025.4,
  author =	{Barak, Parvesh and Gibney, Daniel and Jain, Chirag},
  title =	{{Haplotype-Aware Long-Read Error Correction}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{4:1--4:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.4},
  URN =		{urn:nbn:de:0030-drops-239300},
  doi =		{10.4230/LIPIcs.WABI.2025.4},
  annote =	{Keywords: Genome assembly, phasing, clustering, overlap graph, consensus}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.2

On the Construction of Elastic Degenerate Strings

Authors: Nicola Rizzo, Veli Mäkinen, and Nadia Pisanti

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

An elastic degenerate string (EDS) is a sequence of sets of strings. In the context of bioinformatics, EDSes can be used to represent the variations observed in a population from its consensus genome. Pattern matching and comparison problems on EDSes have been widely studied in the literature, but their construction has been largely omitted. We fill this gap by showing how algorithms originally developed for related problems of founder reconstruction can be adapted to minimize the total cardinality of the EDS sets and total length of the EDS strings in linear time, given suitable multiple alignments representing the input data.

Cite as

Nicola Rizzo, Veli Mäkinen, and Nadia Pisanti. On the Construction of Elastic Degenerate Strings. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 2:1-2:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{rizzo_et_al:OASIcs.Grossi.2,
  author =	{Rizzo, Nicola and M\"{a}kinen, Veli and Pisanti, Nadia},
  title =	{{On the Construction of Elastic Degenerate Strings}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{2:1--2:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.2},
  URN =		{urn:nbn:de:0030-drops-238014},
  doi =		{10.4230/OASIcs.Grossi.2},
  annote =	{Keywords: multiple sequence alignment, pattern matching, data structures, segmentation algorithms, founder reconstruction, dynamic programming, semi-dynamic range minimum queries, positional Burrows-Wheeler transform}
}

@InProceedings{rizzo_et_al:OASIcs.Grossi.2,
  author =	{Rizzo, Nicola and M\"{a}kinen, Veli and Pisanti, Nadia},
  title =	{{On the Construction of Elastic Degenerate Strings}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{2:1--2:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.2},
  URN =		{urn:nbn:de:0030-drops-238014},
  doi =		{10.4230/OASIcs.Grossi.2},
  annote =	{Keywords: multiple sequence alignment, pattern matching, data structures, segmentation algorithms, founder reconstruction, dynamic programming, semi-dynamic range minimum queries, positional Burrows-Wheeler transform}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.14

Specific Patterns Against Reference Sequences

Authors: Marie-Pierre Béal and Maxime Crochemore

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

We design alignment-free techniques for comparing a set of sequences or just a word, called a target, against another set of words, called a reference. This is done with the detection of factor patterns that distinguish the target from the reference. A target-specific factor of a target T against a reference R is then a factor w of a word in T that is not a factor of a word in R but whose proper factors of w are factors of a word in R. The strategy is based on the notion of minimal absent/forbidden words. We first address the computation of the set of target-specific factors of a target T against a reference R, where T and R are finite sets of sequences. The result is the construction of an automaton accepting the set of all considered target-specific factors. The construction algorithm runs in linear time according to the size of T ∪ R. The second result is the design of an algorithm to compute all the occurrences in a single sequence T of its target-specific factors against a reference R. The algorithm runs in real-time on the target sequence, independently of the number of occurrences of target-specific factors.

Cite as

Marie-Pierre Béal and Maxime Crochemore. Specific Patterns Against Reference Sequences. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 14:1-14:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{beal_et_al:OASIcs.Grossi.14,
  author =	{B\'{e}al, Marie-Pierre and Crochemore, Maxime},
  title =	{{Specific Patterns Against Reference Sequences}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{14:1--14:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.14},
  URN =		{urn:nbn:de:0030-drops-238130},
  doi =		{10.4230/OASIcs.Grossi.14},
  annote =	{Keywords: Specific pattern, Minimal absent word, Minimal forbidden word, Directed Acyclic Word Graph (DAWG), Suffix automaton}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.9

On String and Graph Sanitization

Authors: Giulia Bernardini, Huiping Chen, Grigorios Loukides, and Solon P. Pissis

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

Data sanitization is a process that conceals sensitive patterns from a given dataset. A secondary goal is to not severely harm the utility of the underlying data along this process. We survey some recent advancements on two related data sanitization topics: string and graph sanitization. In particular, we highlight the important contributions of our friend Prof. Roberto Grossi along this journey.

Cite as

Giulia Bernardini, Huiping Chen, Grigorios Loukides, and Solon P. Pissis. On String and Graph Sanitization. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 9:1-9:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{bernardini_et_al:OASIcs.Grossi.9,
  author =	{Bernardini, Giulia and Chen, Huiping and Loukides, Grigorios and Pissis, Solon P.},
  title =	{{On String and Graph Sanitization}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{9:1--9:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.9},
  URN =		{urn:nbn:de:0030-drops-238086},
  doi =		{10.4230/OASIcs.Grossi.9},
  annote =	{Keywords: data privacy, data sanitization, string algorithms, graph algorithms}
}

Document

DOI: 10.4230/LIPIcs.CPM.2025.12

Shortest Undirected Paths in de Bruijn Graphs

Authors: Wiktor Zuba, Oded Lachish, and Solon P. Pissis

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)

Abstract

Computing shortest directed paths in de Bruijn graphs is well studied and well understood. This is not the case for computing undirected paths, which is much more challenging algorithmically. In this paper, we present a general framework for computing shortest undirected paths in arbitrary de Bruijn graphs, that is, arbitrary subgraphs of the complete de Bruijn graph. We then present an application of our techniques for making any arbitrary order-k de Bruijn graph G(V,E) weakly connected by adding a set of edges of minimum total cost. This improves the running time of the recent (2-2/d)-approximation algorithm by Bernardini et al. [CPM 2024] from 𝒪(k|V|²) to 𝒪(k|V|log d) time, where d is the number of weakly connected components of graph G.

Cite as

Wiktor Zuba, Oded Lachish, and Solon P. Pissis. Shortest Undirected Paths in de Bruijn Graphs. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 12:1-12:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{zuba_et_al:LIPIcs.CPM.2025.12,
  author =	{Zuba, Wiktor and Lachish, Oded and Pissis, Solon P.},
  title =	{{Shortest Undirected Paths in de Bruijn Graphs}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{12:1--12:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.12},
  URN =		{urn:nbn:de:0030-drops-231060},
  doi =		{10.4230/LIPIcs.CPM.2025.12},
  annote =	{Keywords: string algorithm, graph algorithm, de Bruijn graph, Eulerian graph}
}

Document

DOI: 10.4230/LIPIcs.CPM.2025.29

Faster Approximate Elastic-Degenerate String Matching - Part B

Authors: Paweł Gawrychowski, Adam Górkiewicz, Pola Marciniak, Solon P. Pissis, and Karol Pokorski

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)

Abstract

We revisit the complexity of approximate pattern matching in an elastic-degenerate string. Such a string is a sequence of n finite sets of strings of total length N, and compactly describes a collection of strings obtained by first choosing exactly one string in every set, and then concatenating them together. This is motivated by the need of storing a collection of highly similar DNA sequences. The basic algorithmic question on elastic-degenerate strings is pattern matching: given such an elastic-degenerate string and a standard pattern of length m, check if the pattern occurs in one of the strings in the described collection. Bernardini et al. [SICOMP 2022] showed how to leverage fast matrix multiplication to obtain an Õ(nm^{ω-1})+𝒪(N)-time complexity for this problem, where ω is the matrix multiplication exponent. However, from the point of view of possible applications, it is more desirable to work with approximate pattern matching, where we seek approximate occurrences of the pattern. This generalization has been considered in a few papers already, but the best result so far for occurrences with k mismatches, where k is a constant, is the Õ(nm²+N)-time algorithm presented in Part A [CPM 2025]. This brings the question whether increasing the dependency on m from m^{ω-1} to quadratic is necessary when moving from k = 0 to larger (but still constant) k. We design an Õ(nm^{1.5}+N)-time algorithm for pattern matching with k mismatches in an elastic-degenerate string, for any constant k. To obtain this time bound, we leverage the structural characterization of occurrences with k mismatches of Charalampopoulos, Kociumaka, and Wellnitz [FOCS 2020] together with fast Fourier transform. We need to work with multiple patterns at the same time, instead of a single pattern, which requires refining the original characterization. This might be of independent interest.

Cite as

Paweł Gawrychowski, Adam Górkiewicz, Pola Marciniak, Solon P. Pissis, and Karol Pokorski. Faster Approximate Elastic-Degenerate String Matching - Part B. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 29:1-29:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{gawrychowski_et_al:LIPIcs.CPM.2025.29,
  author =	{Gawrychowski, Pawe{\l} and G\'{o}rkiewicz, Adam and Marciniak, Pola and Pissis, Solon P. and Pokorski, Karol},
  title =	{{Faster Approximate Elastic-Degenerate String Matching - Part B}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{29:1--29:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.29},
  URN =		{urn:nbn:de:0030-drops-231236},
  doi =		{10.4230/LIPIcs.CPM.2025.29},
  annote =	{Keywords: ED string, approximate pattern matching, Hamming distance, k mismatches}
}

Document

DOI: 10.4230/LIPIcs.CPM.2025.28

Faster Approximate Elastic-Degenerate String Matching - Part A

Authors: Solon P. Pissis, Jakub Radoszewski, and Wiktor Zuba

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)

Abstract

An elastic-degenerate (ED) string 𝐓 is a sequence 𝐓 = 𝐓[1] ⋯ 𝐓[n] of n finite sets of strings. The cardinality m of 𝐓 is the total number of strings in 𝐓[i], for all i ∈ [1..n]. The size N of 𝐓 is the total length of all m strings of 𝐓. ED strings have been introduced to represent a set of closely-related DNA sequences. Let P = P[1..p] be a pattern of length p and k > 0 be an integer. We consider the problem of k-Approximate ED String Matching (EDSM): searching k-approximate occurrences of P in the language of 𝐓. We call k-Approximate EDSM under the Hamming distance, k-Mismatch EDSM; and we call k-Approximate EDSM under edit distance, k-Edit EDSM. Bernardini et al. (Theoretical Computer Science, 2020) showed a simple 𝒪(k m p + kN)-time algorithm for k-Mismatch EDSM and an 𝒪(k² m p + kN)-time algorithm for k-Edit EDSM. We improve the dependency on k in both results, obtaining an Õ(k^{2/3}mp+√kN)-time algorithm for k-Mismatch EDSM and an Õ(kmp+ kN)-time algorithm for k-Edit EDSM. Bernardini et al. (Theory of Computing Systems, 2024) presented several algorithms for 1-Approximate EDSM working in Õ(np²+N) time. They have also left the possibility to generalize these solutions for k > 1 as an open problem. We improve the runtime of their solution for 1-Mismatch and 1-Edit EDSM from Õ(np²+N) to 𝒪(np²+N). We further show algorithms for k-Approximate EDSM for the Hamming and edit distances working in Õ(np² + N) time, for any constant k > 0. Finally, we show how our techniques can be applied to improve upon the complexity of the k-Approximate ED String Intersection and k-Approximate Doubly EDSM problems that were introduced very recently by Gabory et al. (Information and Computation, 2025).

Cite as

Solon P. Pissis, Jakub Radoszewski, and Wiktor Zuba. Faster Approximate Elastic-Degenerate String Matching - Part A. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 28:1-28:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{pissis_et_al:LIPIcs.CPM.2025.28,
  author =	{Pissis, Solon P. and Radoszewski, Jakub and Zuba, Wiktor},
  title =	{{Faster Approximate Elastic-Degenerate String Matching - Part A}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{28:1--28:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.28},
  URN =		{urn:nbn:de:0030-drops-231227},
  doi =		{10.4230/LIPIcs.CPM.2025.28},
  annote =	{Keywords: ED string, approximate string matching, Hamming distance, edit distance}
}

27 Search Results for "Stougie, Leen"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message