DROPS

Document

DOI: 10.4230/LIPIcs.ITCS.2026.81

Supercritical Tradeoff Between Size and Depth for Resolution over Parities

Authors: Dmitry Itsykson and Alexander Knop

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)

Abstract

Alekseev and Itsykson (STOC 2025) proved the existence of an unsatisfiable CNF formula such that any resolution over parities (Res(⊕)) refutation must either have exponential size (in the formula size) or superlinear depth (in the number of variables). In this paper, we extend this result by constructing a formula with the same hardness properties, but which additionally admits a resolution refutation of quasi-polynomial size. This establishes a supercritical tradeoff between size and depth for resolution over parities. The proof builds on the framework of Alekseev and Itsykson and relies on a lifting argument applied to the supercritical tradeoff between width and depth in resolution, proposed by Buss and Thapen (IPL 2026).

Cite as

Dmitry Itsykson and Alexander Knop. Supercritical Tradeoff Between Size and Depth for Resolution over Parities. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 81:1-81:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@InProceedings{itsykson_et_al:LIPIcs.ITCS.2026.81,
  author =	{Itsykson, Dmitry and Knop, Alexander},
  title =	{{Supercritical Tradeoff Between Size and Depth for Resolution over Parities}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{81:1--81:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.81},
  URN =		{urn:nbn:de:0030-drops-253680},
  doi =		{10.4230/LIPIcs.ITCS.2026.81},
  annote =	{Keywords: lifting theorems, resolution depth, resolution over parities, resolution width, supercritical tradeoff}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.FSTTCS.2025.2

Unboundedness Problems for Formal Languages (Invited Talk)

Authors: Georg Zetzsche

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)

Abstract

Informally, unboundedness problems are decision problems that ask about the existence of infinitely many words (satisfying certain properties) in a formal language. For example: Is a given language infinite? Or: Does a given language have super-polynomial growth? These came into focus in recent years because of their connections to downward closure computation and separability problems. Although unboundedness problems may seem difficult at first, it turns out that there are techniques that are at the same time conceptually very simple, but also apply to a surprisingly wide variety of language classes. The talk will survey recent results (and techniques) concerning unboundedness problems.

Cite as

Georg Zetzsche. Unboundedness Problems for Formal Languages (Invited Talk). In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 2:1-2:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{zetzsche:LIPIcs.FSTTCS.2025.2,
  author =	{Zetzsche, Georg},
  title =	{{Unboundedness Problems for Formal Languages}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{2:1--2:10},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.2},
  URN =		{urn:nbn:de:0030-drops-250810},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.2},
  annote =	{Keywords: Decidability, formal languages, unifying frameworks, downward closure, separability}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2025.42

Degrees of Second and Higher-Order Polynomials

Authors: Donghyun Lim and Martin Ziegler

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)

Abstract

Second-order polynomials generalize classical (=first-order) ones in allowing for additional variables that range over functions rather than values. We are motivated by their applications in higher-order computational complexity theory, extending for instance discrete classes (like P/FP or PSPACE/FPSPACE) to operators in Analysis [http://doi.org/10.1137/S0097539794263452], [http://doi.org/10.1145/2189778.2189780]. The degree subclassifies ordinary polynomial growth into linear, quadratic, cubic, etc. To similarly classify second-order polynomials, we (well-)define their degree by structural induction as an "arctic" first-order polynomial: a term/expression over integer variable D and operations + and ⋅ and binary max(). This generalized degree turns out to transform nicely under (now two kinds of) polynomial composition. As examples, we collect and determine the degrees of previous and new asymptotic analyses of algorithms and operators receiving function/oracle arguments. Then we motivate and introduce third-order polynomials and their degrees as arctic second-order polynomials, along with their transformations under three kinds of composition. Proceeding to fourth order and beyond yields a hierarchy, with characterization in Simply Typed Lambda Calculus.

Cite as

Donghyun Lim and Martin Ziegler. Degrees of Second and Higher-Order Polynomials. In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 42:1-42:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{lim_et_al:LIPIcs.FSTTCS.2025.42,
  author =	{Lim, Donghyun and Ziegler, Martin},
  title =	{{Degrees of Second and Higher-Order Polynomials}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{42:1--42:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.42},
  URN =		{urn:nbn:de:0030-drops-251225},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.42},
  annote =	{Keywords: Logic in Computer Science, Higher Order Program Analysis, Asymptotic Type Theory}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.47

Fast and Lightweight Distributed Suffix Array Construction

Authors: Manuel Haag, Florian Kurpicz, Peter Sanders, and Matthias Schimek

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

The suffix array contains the lexicographical order of all suffixes of a text. It is one of the most well-studied text indices with applications in bioinformatics, compression, and pattern matching. The main bottleneck of distributed-memory suffix array construction algorithms is their memory requirements. Even careful implementations require 30×-60× the input size as working memory. We present a scalable and lightweight distributed-memory adaptation of the difference cover (DCX) suffix array construction algorithm. Our approach relies on novel bucketing and random chunk redistribution techniques which reduce our memory requirement to 20×-26× the input size for medium-sized inputs and to 14×-15× for large-sized inputs. Regarding running time, we achieve speedups of up to 5× over current state-of-the-art distributed suffix array construction algorithms.

Cite as

Manuel Haag, Florian Kurpicz, Peter Sanders, and Matthias Schimek. Fast and Lightweight Distributed Suffix Array Construction. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 47:1-47:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{haag_et_al:LIPIcs.ESA.2025.47,
  author =	{Haag, Manuel and Kurpicz, Florian and Sanders, Peter and Schimek, Matthias},
  title =	{{Fast and Lightweight Distributed Suffix Array Construction}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{47:1--47:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.47},
  URN =		{urn:nbn:de:0030-drops-245154},
  doi =		{10.4230/LIPIcs.ESA.2025.47},
  annote =	{Keywords: Distributed Computing, Suffix Array Construction}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.109

Combined Search and Encoding for Seeds, with an Application to Minimal Perfect Hashing

Authors: Hans-Peter Lehmann, Peter Sanders, Stefan Walzer, and Jonatan Ziegler

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

Randomised algorithms often employ methods that can fail and that are retried with independent randomness until they succeed. Randomised data structures therefore often store indices of successful attempts, called seeds. If n such seeds are required (e.g., for independent substructures) the standard approach is to compute for each i ∈ [n] the smallest successful seed S_i and store S = (S_1,…,S_n). The central observation of this paper is that this is not space-optimal. We present a different algorithm that computes a sequence S' = (S_1',…,S_n') of successful seeds such that the entropy of S' undercuts the entropy of S by Ω(n) bits in most cases. To achieve a memory consumption of OPT+εn, the expected number of inspected seeds increases by a factor of 𝒪(1/ε). We demonstrate the usefulness of our findings with a novel construction for minimal perfect hash functions that, for n keys and any ε ∈ [n^{-3/7},1], has space requirement (1+ε)OPT and construction time 𝒪(n/ε). All previous approaches only support ε = ω(1/log n) or have construction times that increase exponentially with 1/ε. Our implementation beats the construction throughput of the state of the art by more than two orders of magnitude for ε ≤ 3%.

Cite as

Hans-Peter Lehmann, Peter Sanders, Stefan Walzer, and Jonatan Ziegler. Combined Search and Encoding for Seeds, with an Application to Minimal Perfect Hashing. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 109:1-109:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{lehmann_et_al:LIPIcs.ESA.2025.109,
  author =	{Lehmann, Hans-Peter and Sanders, Peter and Walzer, Stefan and Ziegler, Jonatan},
  title =	{{Combined Search and Encoding for Seeds, with an Application to Minimal Perfect Hashing}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{109:1--109:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.109},
  URN =		{urn:nbn:de:0030-drops-245780},
  doi =		{10.4230/LIPIcs.ESA.2025.109},
  annote =	{Keywords: Random Seed, Encoding, Bernoulli Process, Backtracking, Perfect Hashing}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.99

Engineering Minimal k-Perfect Hash Functions

Authors: Stefan Hermann, Sebastian Kirmayer, Hans-Peter Lehmann, Peter Sanders, and Stefan Walzer

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

Given a set S of n keys, a k-perfect hash function (kPHF) is a data structure that maps the keys to the first m integers, where each output integer can be hit by at most k input keys. When m = ⌈n/k⌉, the resulting function is called a minimal k-perfect hash function (MkPHF). Applications of kPHFs can be found in external memory data structures or to create efficient 1-perfect hash functions, which in turn have a wide range of applications from databases to bioinformatics. Several papers from the 1980s look at external memory data structures with small internal memory indexes. However, actual k-perfect hash functions are surprisingly rare, and the area has not seen a lot of research recently. At the same time, recent research in 1-perfect hashing shows that there is a lack of efficient kPHFs. In this paper, we revive the area of k-perfect hashing, presenting four new constructions. Our implementations simultaneously dominate older approaches in space consumption, construction time, and query time. We see this paper as a possible starting point of an active line of research, similar to the area of 1-perfect hashing.

Cite as

Stefan Hermann, Sebastian Kirmayer, Hans-Peter Lehmann, Peter Sanders, and Stefan Walzer. Engineering Minimal k-Perfect Hash Functions. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 99:1-99:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{hermann_et_al:LIPIcs.ESA.2025.99,
  author =	{Hermann, Stefan and Kirmayer, Sebastian and Lehmann, Hans-Peter and Sanders, Peter and Walzer, Stefan},
  title =	{{Engineering Minimal k-Perfect Hash Functions}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{99:1--99:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.99},
  URN =		{urn:nbn:de:0030-drops-245685},
  doi =		{10.4230/LIPIcs.ESA.2025.99},
  annote =	{Keywords: Compressed Data Structures, Perfect Hashing}
}

Document

DOI: 10.4230/LIPIcs.CCC.2025.24

Super-Critical Trade-Offs in Resolution over Parities via Lifting

Authors: Arkadev Chattopadhyay and Pavel Dvořák

Published in: LIPIcs, Volume 339, 40th Computational Complexity Conference (CCC 2025)

Abstract

Razborov [Alexander A. Razborov, 2016] exhibited the following surprisingly strong trade-off phenomenon in propositional proof complexity: for a parameter k = k(n), there exists k-CNF formulas over n variables, having resolution refutations of O(k) width, but every tree-like refutation of width n^{1-ε}/k needs size exp(n^Ω(k)). We extend this result to tree-like Resolution over parities, commonly denoted by Res(⊕), with parameters essentially unchanged. To obtain our result, we extend the lifting theorem of Chattopadhyay, Mande, Sanyal and Sherif [Arkadev Chattopadhyay et al., 2023] to handle tree-like affine DAGs. We introduce additional ideas from linear algebra to handle forget nodes along long paths.

Cite as

Arkadev Chattopadhyay and Pavel Dvořák. Super-Critical Trade-Offs in Resolution over Parities via Lifting. In 40th Computational Complexity Conference (CCC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 339, pp. 24:1-24:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{chattopadhyay_et_al:LIPIcs.CCC.2025.24,
  author =	{Chattopadhyay, Arkadev and Dvo\v{r}\'{a}k, Pavel},
  title =	{{Super-Critical Trade-Offs in Resolution over Parities via Lifting}},
  booktitle =	{40th Computational Complexity Conference (CCC 2025)},
  pages =	{24:1--24:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-379-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{339},
  editor =	{Srinivasan, Srikanth},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2025.24},
  URN =		{urn:nbn:de:0030-drops-237186},
  doi =		{10.4230/LIPIcs.CCC.2025.24},
  annote =	{Keywords: Proof complexity, Lifting, Resolution over parities}
}

Document

DOI: 10.4230/LIPIcs.SEA.2025.14

Efficient Terabyte-Scale Text Compression via Stable Local Consistency and Parallel Grammar Processing

Authors: Diego Díaz-Domínguez

Published in: LIPIcs, Volume 338, 23rd International Symposium on Experimental Algorithms (SEA 2025)

Abstract

We present compression algorithms designed to process terabyte-sized datasets in parallel. Our approach builds on locally consistent grammars, a lightweight form of compression, combined with simple post-processing techniques to achieve further space reductions. Locally consistent grammar algorithms are suitable for scaling as they need minimal satellite information to compact the text, but they are not inherently parallel. To enable parallelisation, we introduce a novel concept that we call stable local consistency. A grammar algorithm ALG is stable if for any pattern P occurring in a collection 𝒯 = {T_1, T_2, …, T_k}, instances ALG(T_1), ALG(T_2), …, ALG(T_k) independently produce cores for P with the same topology. In a locally consistent grammar, the core of P is a subset of nodes and edges in the parse tree of 𝒯 that remains the same in all the occurrences of P. This feature enables compression, but it only holds if ALG defines a common set of nonterminal symbols for the strings. Stability removes this restriction, allowing us to run ALG(T_1), ALG(T_2), …, ALG(T_k) in parallel and subsequently merge their grammars into a single output equivalent to that of ALG(𝒯). We implemented our ideas and tested them on massive datasets. Our experiments showed that our method could process 7.9 TB of bacterial genomes in around nine hours, using 16 threads and 0.43 bits/symbol of working memory, achieving a compression ratio of 85x.

Cite as

Diego Díaz-Domínguez. Efficient Terabyte-Scale Text Compression via Stable Local Consistency and Parallel Grammar Processing. In 23rd International Symposium on Experimental Algorithms (SEA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 338, pp. 14:1-14:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{diazdominguez:LIPIcs.SEA.2025.14,
  author =	{D{\'\i}az-Dom{\'\i}nguez, Diego},
  title =	{{Efficient Terabyte-Scale Text Compression via Stable Local Consistency and Parallel Grammar Processing}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{14:1--14:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.14},
  URN =		{urn:nbn:de:0030-drops-232525},
  doi =		{10.4230/LIPIcs.SEA.2025.14},
  annote =	{Keywords: Grammar compression, locally consistent parsing, hashing}
}

Document

DOI: 10.4230/LIPIcs.SEA.2025.21

PtrHash: Minimal Perfect Hashing at RAM Throughput

Authors: Ragnar Groot Koerkamp

Published in: LIPIcs, Volume 338, 23rd International Symposium on Experimental Algorithms (SEA 2025)

Abstract

Motivation. Given a set K of n keys, a minimal perfect hash function (MPHF) is a collision-free bijective map H_mphf from K to {0, … , n-1}. These functions have uses in databases, search engines, and are used in bioinformatics indexing tools such as Pufferfish (using BBHash), and Piscem (PTHash). PTHash is also used in SSHash, a data structure on k-mers that supports membership queries. PTHash only takes around 5% of the total space of SSHash, and thus, trading slightly more space for faster queries is beneficial. Thus, this work presents a (minimal) perfect hash function that first prioritizes query throughput, while also allowing efficient construction for 10⁹ or more elements using 2.4 bits of memory per key. Contributions. Both PTHash and PHOBIC first map all n keys to n/λ < n buckets. Then, each bucket stores a pilot that controls the final hash value of the keys mapping to it. PtrHash builds on this by using 1) fixed-width (uncompressed) 8-bit pilots, 2) a construction algorithm similar to Cuckoo hashing to find suitable pilot values. Further, it partitions the keys, so that keys in each part map to their own set of slots. PtrHash 3) uses the same number of buckets and slots for each part, with 4) a single remap table to map intermediate positions ≥ n to < n, 5) encoded using per-cacheline Elias-Fano coding. Lastly, 6) PtrHash supports streaming queries, where we use prefetching to answer a stream of multiple queries more efficiently than one-by-one processing. Results. With default parameters, PtrHash takes 2.4 bits per key. On 300 million string keys, PtrHash is as fast or faster to build than other MPHFs at a similar size, and at least 2.1× faster to query. When streaming multiple queries, this improves to 3.3× speedup over the fastest alternative, while also being significantly faster to construct. When using 10⁹ integer keys instead, query times are as low as 12 ns/key when iterating in a for loop, or even down to 8 ns/key when using the streaming approach, just short of the 7.4 ns inverse throughput of random memory accesses.

Cite as

Ragnar Groot Koerkamp. PtrHash: Minimal Perfect Hashing at RAM Throughput. In 23rd International Symposium on Experimental Algorithms (SEA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 338, pp. 21:1-21:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{grootkoerkamp:LIPIcs.SEA.2025.21,
  author =	{Groot Koerkamp, Ragnar},
  title =	{{PtrHash: Minimal Perfect Hashing at RAM Throughput}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{21:1--21:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.21},
  URN =		{urn:nbn:de:0030-drops-232597},
  doi =		{10.4230/LIPIcs.SEA.2025.21},
  annote =	{Keywords: Minimal perfect hashing, Compressed Data Structures}
}

Document

DOI: 10.4230/LIPIcs.SEA.2025.4

U-Index: A Universal Indexing Framework for Matching Long Patterns

Authors: Lorraine A. K. Ayad, Gabriele Fici, Ragnar Groot Koerkamp, Grigorios Loukides, Rob Patro, Giulio Ermanno Pibiri, and Solon P. Pissis

Published in: LIPIcs, Volume 338, 23rd International Symposium on Experimental Algorithms (SEA 2025)

Abstract

Motivation. Text indexing is a fundamental and well-studied problem. Classic solutions to this problem either replace the original text with a compressed representation, e.g., the FM-index and its variants, or keep it uncompressed but attach some redundancy - an index - to accelerate matching, e.g., the suffix array. The former solutions thus retain excellent compressed space, but are practically slow to construct and query. The latter approaches, instead, sacrifice space efficiency but are typically faster; for example, the suffix array takes much more space than the text itself for commonly used alphabets, like ASCII or DNA, but it is very fast to construct and query. Methods. In this paper, we show that efficient text indexing can be achieved using just a small extra space on top of the original text, provided that the query patterns are sufficiently long. More specifically, we develop a new indexing paradigm in which a sketch of a query pattern is first matched against a sketch of the text. Once candidate matches are retrieved, they are verified using the original text. This paradigm is thus universal in the sense that it allows us to use any solution to index the sketched text, like a suffix array, FM-index, or r-index. Results. We explore both the theory and the practice of this universal framework. With an extensive experimental analysis, we show that, surprisingly, universal indexes can be constructed much faster than their unsketched counterparts and take a fraction of the space, as a direct consequence of (i) having a lower bound on the length of patterns and (ii) working in sketch space. Furthermore, these data structures have the potential of retaining or even improving query time, because matching against the sketched text is faster and verifying candidates can be theoretically done in constant time per occurrence (or, in practice, by short and cache-friendly scans of the text). Finally, we discuss some important applications of this novel indexing paradigm to computational biology. We hypothesize that such indexes will be particularly effective when the queries are sufficiently long, and so we demonstrate applications in long-read mapping.

Cite as

Lorraine A. K. Ayad, Gabriele Fici, Ragnar Groot Koerkamp, Grigorios Loukides, Rob Patro, Giulio Ermanno Pibiri, and Solon P. Pissis. U-Index: A Universal Indexing Framework for Matching Long Patterns. In 23rd International Symposium on Experimental Algorithms (SEA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 338, pp. 4:1-4:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ayad_et_al:LIPIcs.SEA.2025.4,
  author =	{Ayad, Lorraine A. K. and Fici, Gabriele and Groot Koerkamp, Ragnar and Loukides, Grigorios and Patro, Rob and Pibiri, Giulio Ermanno and Pissis, Solon P.},
  title =	{{U-Index: A Universal Indexing Framework for Matching Long Patterns}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{4:1--4:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.4},
  URN =		{urn:nbn:de:0030-drops-232420},
  doi =		{10.4230/LIPIcs.SEA.2025.4},
  annote =	{Keywords: Text Indexing, Sketching, Minimizers, Hashing}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.58

A New Impossibility Result for Online Bipartite Matching Problems

Authors: Flavio Chierichetti, Mirko Giacchini, Alessandro Panconesi, and Andrea Vattani

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

Online Bipartite Matching with random user arrival is a fundamental problem in the online advertisement ecosystem. Over the last 30 years, many algorithms and impossibility results have been developed for this problem. In particular, the latest impossibility result was established by Manshadi, Oveis Gharan and Saberi [Manshadi et al., 2011] in 2011. Since then, several algorithms have been published in an effort to narrow the gap between the upper and the lower bounds on the competitive ratio. In this paper we show that no algorithm can achieve a competitive ratio better than 1- e/(e^e) = 0.82062…, improving upon the 0.823 upper bound presented in [Manshadi et al., 2011]. Our construction is simple to state, accompanied by a fully analytic proof, and yields a competitive ratio bound intriguingly similar to 1 - 1/e, the optimal competitive ratio for the fully adversarial Online Bipartite Matching problem. Although the tightness of our upper bound remains an open question, we show that our construction is extremal in a natural class of instances.

Cite as

Flavio Chierichetti, Mirko Giacchini, Alessandro Panconesi, and Andrea Vattani. A New Impossibility Result for Online Bipartite Matching Problems. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 58:1-58:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{chierichetti_et_al:LIPIcs.ICALP.2025.58,
  author =	{Chierichetti, Flavio and Giacchini, Mirko and Panconesi, Alessandro and Vattani, Andrea},
  title =	{{A New Impossibility Result for Online Bipartite Matching Problems}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{58:1--58:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.58},
  URN =		{urn:nbn:de:0030-drops-234354},
  doi =		{10.4230/LIPIcs.ICALP.2025.58},
  annote =	{Keywords: Bipartite Matching, Random Graphs, Competitive Ratio}
}

Document

DOI: 10.4230/LIPIcs.CPM.2025.20

Text Indexing for Simple Regular Expressions

Authors: Hideo Bannai, Philip Bille, Inge Li Gørtz, Gad M. Landau, Gonzalo Navarro, Nicola Prezza, Teresa Anna Steiner, and Simon Rumle Tarnow

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)

Abstract

We study the problem of indexing a text T[1..n] ∈ Σⁿ so that, later, given a query regular expression pattern R of size m = |R|, we can report all the occ substrings T[i..j] of T matching R. The problem is known to be hard for arbitrary patterns R, so in this paper, we consider the following two types of patterns. (1) Character-class Kleene-star patterns of the form P₁ D^* P₂, where P₁ and P₂ are strings and D = {c₁, …, c_k} ⊂ Σ is a character-class (shorthand for the regular expression (c₁ | c₂ | ⋯ | c_k)) and (2) String Kleene-star patterns of the form P₁ P^* P₂ where P, P₁ and P₂ are strings. In case (1), we describe an index of O(nlog^{1+ε}n) space (for any constant ε > 0) solving queries in time O(m + log n/log log n + occ) on constant-sized alphabets. We also describe a general solution for any alphabet size. This result is conditioned on the existence of an anchor: a character of P₁P₂ that does not belong to D. We justify this assumption by proving that no efficient indexing solution can exist if an anchor is not present unless the Set Disjointness Conjecture fails. In case (2), we describe an index of size O(n) answering queries in time O(m + (occ+1)log^{ε}n) on any alphabet size.

Cite as

Hideo Bannai, Philip Bille, Inge Li Gørtz, Gad M. Landau, Gonzalo Navarro, Nicola Prezza, Teresa Anna Steiner, and Simon Rumle Tarnow. Text Indexing for Simple Regular Expressions. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 20:1-20:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{bannai_et_al:LIPIcs.CPM.2025.20,
  author =	{Bannai, Hideo and Bille, Philip and G{\o}rtz, Inge Li and Landau, Gad M. and Navarro, Gonzalo and Prezza, Nicola and Steiner, Teresa Anna and Tarnow, Simon Rumle},
  title =	{{Text Indexing for Simple Regular Expressions}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{20:1--20:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.20},
  URN =		{urn:nbn:de:0030-drops-231143},
  doi =		{10.4230/LIPIcs.CPM.2025.20},
  annote =	{Keywords: Text indexing, regular expressions, data structures}
}

Document

DOI: 10.4230/LIPIcs.CPM.2025.3

Counting on General Run-Length Grammars

Authors: Gonzalo Navarro and Alejandro Pacheco

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)

Abstract

We introduce a data structure for counting pattern occurrences in texts compressed with any run-length context-free grammar. Our structure uses space proportional to the grammar size and counts the occurrences of a pattern of length m in a text of length n in time O(mlog^{2+ε} n), for any constant ε > 0 chosen at indexing time. This is the first solution to an open problem posed by Christiansen et al. [ACM TALG 2020] and enhances our abilities for computation over compressed data; we give an example application.

Cite as

Gonzalo Navarro and Alejandro Pacheco. Counting on General Run-Length Grammars. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 3:1-3:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{navarro_et_al:LIPIcs.CPM.2025.3,
  author =	{Navarro, Gonzalo and Pacheco, Alejandro},
  title =	{{Counting on General Run-Length Grammars}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{3:1--3:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.3},
  URN =		{urn:nbn:de:0030-drops-230977},
  doi =		{10.4230/LIPIcs.CPM.2025.3},
  annote =	{Keywords: Grammar-based indexing, Run-length context-free grammars, Counting pattern occurrences, Periods in strings}
}

Document

DOI: 10.4230/LIPIcs.OPODIS.2024.23

Distributed Recoverable Sketches

Authors: Diana Cohen, Roy Friedman, and Rana Shahout

Published in: LIPIcs, Volume 324, 28th International Conference on Principles of Distributed Systems (OPODIS 2024)

Abstract

Sketches are commonly used in computer systems and network monitoring tools to provide efficient query executions while maintaining a compact data representation. Switches and routers maintain sketches to track statistical characteristics of the network traffic. The availability of such data is essential for the network analysis as a whole. Consequently, being able to recover sketches is critical following a switch crash. In this paper, we explore how nodes in a network environment can cooperate to recover sketch data whenever any of them crashes. In particular, we focus on frequency estimation linear sketches, such as the Count-Min Sketch. We consider various approaches to ensure data reliability and explore the trade-offs between space consumption, runtime overheads, and traffic during recovery, which we point out as design guidelines. Besides different aspects of efficacy, we design a modular system for ease of maintenance and further scaling. A key aspect we examine is how nodes update each other about their sketch content as it evolves over time. In particular, we compare between periodic full updates vs. incremental updates. We also examine several data structures to economically represent and encode a batch of latest changes. Our framework is generic, and other data structures can be plugged-in via an abstract API as long as they implement the corresponding API methods.

Cite as

Diana Cohen, Roy Friedman, and Rana Shahout. Distributed Recoverable Sketches. In 28th International Conference on Principles of Distributed Systems (OPODIS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 324, pp. 23:1-23:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{cohen_et_al:LIPIcs.OPODIS.2024.23,
  author =	{Cohen, Diana and Friedman, Roy and Shahout, Rana},
  title =	{{Distributed Recoverable Sketches}},
  booktitle =	{28th International Conference on Principles of Distributed Systems (OPODIS 2024)},
  pages =	{23:1--23:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-360-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{324},
  editor =	{Bonomi, Silvia and Galletta, Letterio and Rivi\`{e}re, Etienne and Schiavoni, Valerio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.OPODIS.2024.23},
  URN =		{urn:nbn:de:0030-drops-225594},
  doi =		{10.4230/LIPIcs.OPODIS.2024.23},
  annote =	{Keywords: Sketches, Stream Processing, Distributed Recovery, Incremental Updates, Sketch Partitioning}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2023.39

Subset Sum in Time 2^{n/2} / poly(n)

Authors: Xi Chen, Yaonan Jin, Tim Randolph, and Rocco A. Servedio

Published in: LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

Abstract

A major goal in the area of exact exponential algorithms is to give an algorithm for the (worst-case) n-input Subset Sum problem that runs in time 2^{(1/2 - c)n} for some constant c > 0. In this paper we give a Subset Sum algorithm with worst-case running time O(2^{n/2} ⋅ n^{-γ}) for a constant γ > 0.5023 in standard word RAM or circuit RAM models. To the best of our knowledge, this is the first improvement on the classical "meet-in-the-middle" algorithm for worst-case Subset Sum, due to Horowitz and Sahni, which can be implemented in time O(2^{n/2}) in these memory models [Horowitz and Sahni, 1974]. Our algorithm combines a number of different techniques, including the "representation method" introduced by Howgrave-Graham and Joux [Howgrave-Graham and Joux, 2010] and subsequent adaptations of the method in Austrin, Kaski, Koivisto, and Nederlof [Austrin et al., 2016], and Nederlof and Węgrzycki [Jesper Nederlof and Karol Wegrzycki, 2021], and "bit-packing" techniques used in the work of Baran, Demaine, and Pǎtraşcu [Baran et al., 2005] on subquadratic algorithms for 3SUM.

Cite as

Xi Chen, Yaonan Jin, Tim Randolph, and Rocco A. Servedio. Subset Sum in Time 2^{n/2} / poly(n). In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 39:1-39:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.APPROX/RANDOM.2023.39,
  author =	{Chen, Xi and Jin, Yaonan and Randolph, Tim and Servedio, Rocco A.},
  title =	{{Subset Sum in Time 2^\{n/2\} / poly(n)}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)},
  pages =	{39:1--39:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-296-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{275},
  editor =	{Megow, Nicole and Smith, Adam},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.39},
  URN =		{urn:nbn:de:0030-drops-188641},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2023.39},
  annote =	{Keywords: Exact algorithms, subset sum, log shaving}
}

28 Search Results for "Dietzfelbinger, Martin"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message