DROPS

Document

Database Theory in Action

DOI: 10.4230/LIPIcs.ICDT.2026.25

Database Theory in Action: Yannakakis' Algorithm

Authors: Paraschos Koutris, Stijn Vansummeren, Qichen Wang, Yisu Remy Wang, and Xiangyao Yu

Published in: LIPIcs, Volume 365, 29th International Conference on Database Theory (ICDT 2026)

Abstract

Yannakakis' seminal algorithm is optimal for acyclic joins, yet it has not been widely adopted due to its poor performance in practice. This paper briefly surveys recent advancements in making Yannakakis' algorithm more practical, in terms of both efficiency and ease of implementation, and points out several avenues for future research.

Cite as

Paraschos Koutris, Stijn Vansummeren, Qichen Wang, Yisu Remy Wang, and Xiangyao Yu. Database Theory in Action: Yannakakis' Algorithm. In 29th International Conference on Database Theory (ICDT 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 365, pp. 25:1-25:6, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@InProceedings{koutris_et_al:LIPIcs.ICDT.2026.25,
  author =	{Koutris, Paraschos and Vansummeren, Stijn and Wang, Qichen and Wang, Yisu Remy and Yu, Xiangyao},
  title =	{{Database Theory in Action: Yannakakis' Algorithm}},
  booktitle =	{29th International Conference on Database Theory (ICDT 2026)},
  pages =	{25:1--25:6},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-413-0},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{365},
  editor =	{ten Cate, Balder and Funk, Maurice},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2026.25},
  URN =		{urn:nbn:de:0030-drops-256395},
  doi =		{10.4230/LIPIcs.ICDT.2026.25},
  annote =	{Keywords: Join algorithms, acyclicity, Yannakakis' algorithm}
}

Document

Invited Paper

DOI: 10.4230/OASIcs.RW.2024/2025.5

Inconsistency-Tolerant Semantics Based on (Preferred) Repairs (Invited Paper)

Authors: Camille Bourgaux

Published in: OASIcs, Volume 138, Joint Proceedings of the 20th and 21st Reasoning Web Summer Schools (RW 2024 & RW 2025)

Abstract

Real-world datasets are plagued by data quality issues which may render the data inconsistent w.r.t. a set of constraints, be they given by database integrity constraints or ontologies. A prominent way to handle such inconsistent data is to use inconsistency-tolerant semantics to obtain meaningful answers to queries. Most of these semantics are based on some notion of repairs, which represent ways of restoring the data consistency. The most basic kind of repairs is that of subset repairs, which are maximal consistent subsets of the dataset. However, in many scenarios, one can define preferred repairs based on some preference information. These lecture notes present inconsistency-tolerant semantics, focusing on the repair-based ones, then review different kinds of preferred repairs that have been considered in the literature. We present in particular the relationships between different kinds of preferred repairs and other notions related to inconsistency handling, the computational complexity of reasoning with (preferred) repairs, and some implementations.

Cite as

Camille Bourgaux. Inconsistency-Tolerant Semantics Based on (Preferred) Repairs (Invited Paper). In Joint Proceedings of the 20th and 21st Reasoning Web Summer Schools (RW 2024 & RW 2025). Open Access Series in Informatics (OASIcs), Volume 138, pp. 5:1-5:67, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{bourgaux:OASIcs.RW.2024/2025.5,
  author =	{Bourgaux, Camille},
  title =	{{Inconsistency-Tolerant Semantics Based on (Preferred) Repairs}},
  booktitle =	{Joint Proceedings of the 20th and 21st Reasoning Web Summer Schools (RW 2024 \& RW 2025)},
  pages =	{5:1--5:67},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-405-5},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{138},
  editor =	{Artale, Alessandro and Bienvenu, Meghyn and Garc{\'\i}a, Yazm{\'\i}n Ib\'{a}\~{n}ez and Murlak, Filip},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.RW.2024/2025.5},
  URN =		{urn:nbn:de:0030-drops-250504},
  doi =		{10.4230/OASIcs.RW.2024/2025.5},
  annote =	{Keywords: Knowledge bases, databases, inconsistency handling, repairs, preferences}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.64

Fully Scalable MPC Algorithms for Euclidean k-Center

Authors: Artur Czumaj, Guichen Gao, Mohsen Ghaffari, and Shaofeng H.-C. Jiang

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

The k-center problem is a fundamental optimization problem with numerous applications in machine learning, data analysis, data mining, and communication networks. The k-center problem has been extensively studied in the classical sequential setting for several decades, and more recently there have been some efforts in understanding the problem in parallel computing, on the Massively Parallel Computation (MPC) model. For now, we have a good understanding of k-center in the case where each local MPC machine has sufficient local memory to store some representatives from each cluster, that is, when one has Ω(k) local memory per machine. While this setting covers the case of small values of k, for a large number of clusters these algorithms require undesirably large local memory, making them poorly scalable. The case of large k has been considered only recently for the fully scalable low-local-memory MPC model for the Euclidean instances of the k-center problem. However, the earlier works have been considering only the constant dimensional Euclidean space, required a super-constant number of rounds, and produced only k(1+o(1)) centers whose cost is a super-constant approximation of k-center. In this work, we significantly improve upon the earlier results for the k-center problem for the fully scalable low-local-memory MPC model. In the low dimensional Euclidean case in ℝ^d, we present the first constant-round fully scalable MPC algorithm for (2+ε)-approximation. We push the ratio further to (1 + ε)-approximation albeit using slightly more (1 + ε)k centers. All these results naturally extends to slightly super-constant values of d. In the high-dimensional regime, we provide the first fully scalable MPC algorithm that in a constant number of rounds achieves an O(log n/ log log n)-approximation for k-center.

Cite as

Artur Czumaj, Guichen Gao, Mohsen Ghaffari, and Shaofeng H.-C. Jiang. Fully Scalable MPC Algorithms for Euclidean k-Center. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 64:1-64:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{czumaj_et_al:LIPIcs.ICALP.2025.64,
  author =	{Czumaj, Artur and Gao, Guichen and Ghaffari, Mohsen and Jiang, Shaofeng H.-C.},
  title =	{{Fully Scalable MPC Algorithms for Euclidean k-Center}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{64:1--64:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.64},
  URN =		{urn:nbn:de:0030-drops-234416},
  doi =		{10.4230/LIPIcs.ICALP.2025.64},
  annote =	{Keywords: Massively Parallel Computing, Euclidean Spaces, k-Center Clustering}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2025.10

Bottom-Up Synthesis of Memory Mutations with Separation Logic

Authors: Kasra Ferdowsi and Hila Peleg

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)

Abstract

Programming-by-Example (PBE) is the paradigm of program synthesis specified via input-output pairs. It is commonly used because examples are easy to provide and collect from the environment. A popular optimization for enumerative synthesis with examples is Observational Equivalence (OE), which groups programs into equivalence classes according to their evaluation on example inputs. Current formulations of OE, however, are severely limited by the assumption that the synthesizer’s target language contains only pure components with no side-effects, either enforcing this in their target language, or ignoring it, leading to an incorrect enumeration. This limits their ability to use realistic component sets. We address this limitation by borrowing from Separation Logic, which can compositionally reason about heap mutations. We reformulate PBE using a restricted Separation Logic: Concrete Heap Separation Logic (CHSL), transforming the search for programs into a proof search in CHSL. This lets us perform bottom-up enumerative synthesis without the need for expert-provided annotations or domain-specific inferences, but with three key advantages: we (i) preserve correctness in the presence of memory-mutating operations, (ii) compact the search space by representing many concrete programs as one under CHSL, and (iii) perform a provably correct OE-reduction. We present SObEq (Side-effects in OBservational EQuivalence), a bottom-up enumerative algorithm that, given a PBE task, searches for its CHSL derivation. The SObEq algorithm is proved correct with no purity assumptions: we show it is guaranteed to lose no solutions. We also evaluate our implementation of SObEq on benchmarks from the literature and online sources, and show that it produces high-quality results quickly.

Cite as

Kasra Ferdowsi and Hila Peleg. Bottom-Up Synthesis of Memory Mutations with Separation Logic. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 10:1-10:32, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ferdowsi_et_al:LIPIcs.ECOOP.2025.10,
  author =	{Ferdowsi, Kasra and Peleg, Hila},
  title =	{{Bottom-Up Synthesis of Memory Mutations with Separation Logic}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{10:1--10:32},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.10},
  URN =		{urn:nbn:de:0030-drops-233036},
  doi =		{10.4230/LIPIcs.ECOOP.2025.10},
  annote =	{Keywords: Program synthesis, observational equivalence}
}

Document

DOI: 10.4230/LIPIcs.CPM.2025.6

String Problems in the Congested Clique Model

Authors: Shay Golan and Matan Kraus

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)

Abstract

In this paper we present algorithms for several string problems in the Congested Clique model. In the Congested Clique model, n nodes (computers) are used to solve some problem. The input to the problem is distributed among the nodes, and the communication between the nodes is conducted in rounds. In each round, every node is allowed to send an O(log n)-bit message to every other node in the network. We consider three fundamental string problems in the Congested Clique model. First, we present an O(1) rounds algorithm for string sorting that supports strings of arbitrary length. Second, we present an O(1) rounds combinatorial pattern matching algorithm. Finally, we present an O(log log n) rounds algorithm for the computation of the suffix array and the corresponding Longest Common Prefix array of a given string.

Cite as

Shay Golan and Matan Kraus. String Problems in the Congested Clique Model. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 6:1-6:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{golan_et_al:LIPIcs.CPM.2025.6,
  author =	{Golan, Shay and Kraus, Matan},
  title =	{{String Problems in the Congested Clique Model}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{6:1--6:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.6},
  URN =		{urn:nbn:de:0030-drops-231003},
  doi =		{10.4230/LIPIcs.CPM.2025.6},
  annote =	{Keywords: String Sorting, Pattern Matching, Suffix Array, Congested Clique, Sorting}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2025.1

The Quest for Faster Join Algorithms (Invited Talk)

Authors: Paraschos Koutris, Shaleen Deep, Austen Fan, and Hangdong Zhao

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Joins are the cornerstone of relational databases. Surprisingly, even after several decades of research in the systems and theory database community, we still lack an understanding of how to design the fastest possible join algorithm. In this talk, we will present the exciting progress the database theory community has achieved in join algorithms over the last two decades. The talk will revolve around five key ideas fundamentally shaping this research area: tree decompositions, data partitioning, leveraging statistical information, enumeration, and algebraic techniques.

Cite as

Paraschos Koutris, Shaleen Deep, Austen Fan, and Hangdong Zhao. The Quest for Faster Join Algorithms (Invited Talk). In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 1:1-1:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{koutris_et_al:LIPIcs.ICDT.2025.1,
  author =	{Koutris, Paraschos and Deep, Shaleen and Fan, Austen and Zhao, Hangdong},
  title =	{{The Quest for Faster Join Algorithms}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{1:1--1:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.1},
  URN =		{urn:nbn:de:0030-drops-229428},
  doi =		{10.4230/LIPIcs.ICDT.2025.1},
  annote =	{Keywords: Conjunctive Queries, Joins, Tree Decompositions, Enumeration, Semirings}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.25

Optimal Oblivious Algorithms for Multi-Way Joins

Authors: Xiao Hu and Zhiang Wu

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

In cloud databases, cloud computation over sensitive data uploaded by clients inevitably causes concern about data security and privacy. Even if cryptographic primitives and trusted computing environments are integrated into query processing to safeguard the actual contents of the data, access patterns of algorithms can still leak private information about data. Oblivious RAM (ORAM) and circuits are two generic approaches to address this issue, ensuring that access patterns of algorithms remain oblivious to the data. However, deploying these methods on insecure algorithms, particularly for multi-way join processing, is computationally expensive and inherently challenging. In this paper, we propose a novel sorting-based algorithm for multi-way join processing that operates without relying on ORAM simulations or other security assumptions. Our algorithm is a non-trivial, provably oblivious composition of basic primitives, with time complexity matching the insecure worst-case optimal join algorithm, up to a logarithmic factor. Furthermore, it is cache-agnostic, with cache complexity matching the insecure lower bound, also up to a logarithmic factor. This clean and straightforward approach has the potential to be extended to other security settings and implemented in practical database systems.

Cite as

Xiao Hu and Zhiang Wu. Optimal Oblivious Algorithms for Multi-Way Joins. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 25:1-25:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{hu_et_al:LIPIcs.ICDT.2025.25,
  author =	{Hu, Xiao and Wu, Zhiang},
  title =	{{Optimal Oblivious Algorithms for Multi-Way Joins}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{25:1--25:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.25},
  URN =		{urn:nbn:de:0030-drops-229662},
  doi =		{10.4230/LIPIcs.ICDT.2025.25},
  annote =	{Keywords: oblivious algorithms, multi-way joins, worst-case optimality}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.26

Dynamic Direct Access of MSO Query Evaluation over Strings

Authors: Pierre Bourhis, Florent Capelli, Stefan Mengel, and Cristian Riveros

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

We study the problem of evaluating a Monadic Second Order (MSO) query over strings under updates in the setting of direct access. We present an algorithm that, given an MSO query with first-order free variables represented by an unambiguous variable-set automaton 𝒜 with state set Q and variables X and a string s, computes a data structure in time 𝒪(|Q|^ω⋅ |X|² ⋅ |s|) and, then, given an index i retrieves, using the data structure, the i-th output of the evaluation of 𝒜 over s in time 𝒪(|Q|^ω ⋅ |X|³ ⋅ log(|s|)²) where ω is the exponent for matrix multiplication. Ours is the first efficient direct access algorithm for MSO query evaluation over strings; such algorithms so far had only been studied for first-order queries and conjunctive queries over relational data. Our algorithm gives the answers in lexicographic order where, in contrast to the setting of conjunctive queries, the order between variables can be freely chosen by the user without degrading the runtime. Moreover, our data structure can be updated efficiently after changes to the input string, allowing more powerful updates than in the enumeration literature, e.g. efficient deletion of substrings, concatenation and splitting of strings, and cut-and-paste operations. Our approach combines a matrix representation of MSO queries and a novel data structure for dynamic word problems over semi-groups which yields an overall algorithm that is elegant and easy to formulate.

Cite as

Pierre Bourhis, Florent Capelli, Stefan Mengel, and Cristian Riveros. Dynamic Direct Access of MSO Query Evaluation over Strings. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 26:1-26:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{bourhis_et_al:LIPIcs.ICDT.2025.26,
  author =	{Bourhis, Pierre and Capelli, Florent and Mengel, Stefan and Riveros, Cristian},
  title =	{{Dynamic Direct Access of MSO Query Evaluation over Strings}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{26:1--26:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.26},
  URN =		{urn:nbn:de:0030-drops-229675},
  doi =		{10.4230/LIPIcs.ICDT.2025.26},
  annote =	{Keywords: Query evaluation, direct access, MSO queries}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.7

O(1)-Round MPC Algorithms for Multi-Dimensional Grid Graph Connectivity, Euclidean MST and DBSCAN

Authors: Junhao Gan, Anthony Wirth, and Zhuo Zhang

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

In this paper, we investigate three fundamental problems in the Massively Parallel Computation (MPC) model: (i) grid graph connectivity, (ii) approximate Euclidean Minimum Spanning Tree (EMST), and (iii) approximate DBSCAN. Our first result is a O(1)-round Las Vegas (i.e., succeeding with high probability) MPC algorithm for computing the connected components on a d-dimensional c-penetration grid graph ((d,c)-grid graph), where both d and c are positive integer constants. In such a grid graph, each vertex is a point with integer coordinates in ℕ^d, and an edge can only exist between two distinct vertices with 𝓁_∞-norm at most c. To our knowledge, the current best existing result for computing the connected components (CC’s) on (d,c)-grid graphs in the MPC model is to run the state-of-the-art MPC CC algorithms that are designed for general graphs: they achieve O(log log n + log D) [Behnezhad et al., 2019] and O(log log n + log 1/(λ)) [Sepehr Assadi et al., 2019] rounds, respectively, where D is the diameter and λ is the spectral gap of the graph. With our grid graph connectivity technique, our second main result is a O(1)-round Las Vegas MPC algorithm for computing approximate Euclidean MST. The existing state-of-the-art result on this problem is the O(1)-round MPC algorithm proposed by Andoni et al. [Alexandr Andoni et al., 2014], which only guarantees an approximation on the overall weight in expectation. In contrast, our algorithm not only guarantees a deterministic overall weight approximation, but also achieves a deterministic edge-wise weight approximation. The latter property is crucial to many applications, such as finding the Bichromatic Closest Pair and Single-Linkage Clustering. Last, but not least, our third main result is a O(1)-round Las Vegas MPC algorithm for computing an approximate DBSCAN clustering in O(1)-dimensional Euclidean space.

Cite as

Junhao Gan, Anthony Wirth, and Zhuo Zhang. O(1)-Round MPC Algorithms for Multi-Dimensional Grid Graph Connectivity, Euclidean MST and DBSCAN. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 7:1-7:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{gan_et_al:LIPIcs.ICDT.2025.7,
  author =	{Gan, Junhao and Wirth, Anthony and Zhang, Zhuo},
  title =	{{O(1)-Round MPC Algorithms for Multi-Dimensional Grid Graph Connectivity, Euclidean MST and DBSCAN}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{7:1--7:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.7},
  URN =		{urn:nbn:de:0030-drops-229483},
  doi =		{10.4230/LIPIcs.ICDT.2025.7},
  annote =	{Keywords: Massively Parallel Computation, Graph Connectivity, Grid Graphs, Euclidean Minimum Spanning Tree, DBSCAN}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.6

PAC: Computing Join Queries with Semi-Covers

Authors: Heba Aamer and Bas Ketsman

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

An increased and growing interest in large-scale data processing has triggered a demand for specialized algorithms that thrive in massively parallel shared-nothing systems. To answer the question of how to efficiently compute join queries in this setting, a rich line of research has emerged specifically for the Massively Parallel Communication (MPC) model. In the MPC model, algorithms are executed in rounds, with each round consisting of a synchronized communication phase and a separate local computation phase. The main cost measure is the load of the algorithm, defined as the maximum number of messages received by any server in any round. We study worst-case optimal algorithms for the join query evaluation problem in the constant-round MPC model. In the single-round variant of MPC, the worst-case optimal load for this problem is well understood and algorithms exist that guarantee this load for any join query. In the constant-round variant of MPC, queries can often be computed with a lower load compared to the single-round variant, but the worst-case optimal load is only known for specific classes of join queries, including graph-like and acyclic join queries, and the associated algorithms use very different techniques. In this paper, we propose a new constant-round MPC algorithm for computing join queries. Our algorithm is correct for every join query and its load matches (up to a polylog factor) the worst-case optimal load for at least all join queries that are acyclic or graph-like.

Cite as

Heba Aamer and Bas Ketsman. PAC: Computing Join Queries with Semi-Covers. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 6:1-6:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{aamer_et_al:LIPIcs.ICDT.2025.6,
  author =	{Aamer, Heba and Ketsman, Bas},
  title =	{{PAC: Computing Join Queries with Semi-Covers}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{6:1--6:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.6},
  URN =		{urn:nbn:de:0030-drops-229474},
  doi =		{10.4230/LIPIcs.ICDT.2025.6},
  annote =	{Keywords: Worst-case optimal load, MPC model, join queries}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.32

The Free Termination Property of Queries over Time

Authors: Conor Power, Paraschos Koutris, and Joseph M. Hellerstein

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Building on prior work on distributed databases and the CALM Theorem, we define and study the question of free termination: in the absence of distributed coordination, what query properties allow nodes in a distributed (database) system to unilaterally terminate execution even though they may receive additional data or messages in the future? This completeness question is complementary to the soundness questions studied in the CALM literature. We also develop a new model based on semiautomata that allows us to bridge from the relational transducer model of the CALM papers to algebraic models that are popular among software engineers (e.g. CRDTs) and of increasing interest to database theory for datalog extensions and incremental view maintenance.

Cite as

Conor Power, Paraschos Koutris, and Joseph M. Hellerstein. The Free Termination Property of Queries over Time. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 32:1-32:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{power_et_al:LIPIcs.ICDT.2025.32,
  author =	{Power, Conor and Koutris, Paraschos and Hellerstein, Joseph M.},
  title =	{{The Free Termination Property of Queries over Time}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{32:1--32:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.32},
  URN =		{urn:nbn:de:0030-drops-229736},
  doi =		{10.4230/LIPIcs.ICDT.2025.32},
  annote =	{Keywords: distributed systems, algebraic data models, coordination-free systems}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.27

Parallel Query Processing with Heterogeneous Machines

Authors: Simon Frisk and Paraschos Koutris

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

We study the problem of computing a full Conjunctive Query in parallel using p heterogeneous machines. Our computational model is similar to the MPC model, but each machine has its own cost function mapping from the number of bits it receives to a cost. An optimal algorithm should minimize the maximum cost across all machines. We consider algorithms over a single communication round and give a lower bound and matching upper bound for databases where each relation has the same cardinality. We do this for both linear cost functions like in previous work, but also for more general cost functions. For databases with relations of different cardinalities, we also find a lower bound, and give matching upper bounds for specific queries like the cartesian product, the join, the star query, and the triangle query. Our approach is inspired by the HyperCube algorithm, but there are additional challenges involved when machines have heterogeneous cost functions.

Cite as

Simon Frisk and Paraschos Koutris. Parallel Query Processing with Heterogeneous Machines. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 27:1-27:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{frisk_et_al:LIPIcs.ICDT.2025.27,
  author =	{Frisk, Simon and Koutris, Paraschos},
  title =	{{Parallel Query Processing with Heterogeneous Machines}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{27:1--27:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.27},
  URN =		{urn:nbn:de:0030-drops-229683},
  doi =		{10.4230/LIPIcs.ICDT.2025.27},
  annote =	{Keywords: Joins, Massively Parallel Computation, Heterogeneous}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.28

Generalized Covers for Conjunctive Queries

Authors: Paraschos Koutris

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Covers of query results were introduced as succinct lossless representations of join query outputs. A cover is a subset of the query result from which we can efficiently enumerate the output with constant delay and linear preprocessing time. However, covers are dependent on a single tree decomposition of the query. In this work, we generalize the notion of a cover to a set of multiple tree decompositions. We show that this generalization can potentially produce asymptotically smaller covers while maintaining the properties of constant-delay enumeration and linear preprocessing time. In particular, given a set of tree decompositions, we can determine exactly the asymptotic size of a minimum cover, which is tied to the notion of entropic width of the query. We also provide a simple greedy algorithm that computes this cover efficiently. Finally, we relate covers to semiring circuits when the semiring is idempotent.

Cite as

Paraschos Koutris. Generalized Covers for Conjunctive Queries. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 28:1-28:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{koutris:LIPIcs.ICDT.2025.28,
  author =	{Koutris, Paraschos},
  title =	{{Generalized Covers for Conjunctive Queries}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{28:1--28:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.28},
  URN =		{urn:nbn:de:0030-drops-229698},
  doi =		{10.4230/LIPIcs.ICDT.2025.28},
  annote =	{Keywords: Conjunctive Query, tree decomposition, cover}
}

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

DOI: 10.4230/LIPIcs.ICALP.2023.127

The Fine-Grained Complexity of Boolean Conjunctive Queries and Sum-Product Problems

Authors: Austen Z. Fan, Paraschos Koutris, and Hangdong Zhao

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)

Abstract

We study the fine-grained complexity of evaluating Boolean Conjunctive Queries and their generalization to sum-of-product problems over an arbitrary semiring. For these problems, we present a general semiring-oblivious reduction from the k-clique problem to any query structure (hypergraph). Our reduction uses the notion of embedding a graph to a hypergraph, first introduced by Marx [Dániel Marx, 2013]. As a consequence of our reduction, we can show tight conditional lower bounds for many classes of hypergraphs, including cycles, Loomis-Whitney joins, some bipartite graphs, and chordal graphs. These lower bounds have a dependence on what we call the clique embedding power of a hypergraph H, which we believe is a quantity of independent interest. We show that the clique embedding power is always less than the submodular width of the hypergraph, and present a decidable algorithm for computing it. We conclude with many open problems for future research.

Cite as

Austen Z. Fan, Paraschos Koutris, and Hangdong Zhao. The Fine-Grained Complexity of Boolean Conjunctive Queries and Sum-Product Problems. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 127:1-127:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{fan_et_al:LIPIcs.ICALP.2023.127,
  author =	{Fan, Austen Z. and Koutris, Paraschos and Zhao, Hangdong},
  title =	{{The Fine-Grained Complexity of Boolean Conjunctive Queries and Sum-Product Problems}},
  booktitle =	{50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)},
  pages =	{127:1--127:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-278-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{261},
  editor =	{Etessami, Kousha and Feige, Uriel and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.127},
  URN =		{urn:nbn:de:0030-drops-181791},
  doi =		{10.4230/LIPIcs.ICALP.2023.127},
  annote =	{Keywords: Fine-grained complexity, conjunctive queries, semiring-oblivious reduction}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2022.6

Certifiable Robustness for Nearest Neighbor Classifiers

Authors: Austen Z. Fan and Paraschos Koutris

Published in: LIPIcs, Volume 220, 25th International Conference on Database Theory (ICDT 2022)

Abstract

ML models are typically trained using large datasets of high quality. However, training datasets often contain inconsistent or incomplete data. To tackle this issue, one solution is to develop algorithms that can check whether a prediction of a model is certifiably robust. Given a learning algorithm that produces a classifier and given an example at test time, a classification outcome is certifiably robust if it is predicted by every model trained across all possible worlds (repairs) of the uncertain (inconsistent) dataset. This notion of robustness falls naturally under the framework of certain answers. In this paper, we study the complexity of certifying robustness for a simple but widely deployed classification algorithm, k-Nearest Neighbors (k-NN). Our main focus is on inconsistent datasets when the integrity constraints are functional dependencies (FDs). For this setting, we establish a dichotomy in the complexity of certifying robustness w.r.t. the set of FDs: the problem either admits a polynomial time algorithm, or it is coNP-hard. Additionally, we exhibit a similar dichotomy for the counting version of the problem, where the goal is to count the number of possible worlds that predict a certain label. As a byproduct of our study, we also establish the complexity of a problem related to finding an optimal subset repair that may be of independent interest.

Cite as

Austen Z. Fan and Paraschos Koutris. Certifiable Robustness for Nearest Neighbor Classifiers. In 25th International Conference on Database Theory (ICDT 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 220, pp. 6:1-6:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{fan_et_al:LIPIcs.ICDT.2022.6,
  author =	{Fan, Austen Z. and Koutris, Paraschos},
  title =	{{Certifiable Robustness for Nearest Neighbor Classifiers}},
  booktitle =	{25th International Conference on Database Theory (ICDT 2022)},
  pages =	{6:1--6:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-223-5},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{220},
  editor =	{Olteanu, Dan and Vortmeier, Nils},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2022.6},
  URN =		{urn:nbn:de:0030-drops-158809},
  doi =		{10.4230/LIPIcs.ICDT.2022.6},
  annote =	{Keywords: Inconsistent databases, k-NN classification, certifiable robustness}
}

24 Search Results for "Koutris, Paraschos"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message