DROPS

Volume

LIPIcs, Volume 328

28th International Conference on Database Theory (ICDT 2025)

ICDT 2025, March 25-28, 2025, Barcelona, Spain

Editors: Sudeepa Roy and Ahmet Kara

Document

DOI: 10.4230/LIPIcs.ICDT.2026.18

Output-Sensitive Evaluation of Acyclic Conjunctive Regular Path Queries

Authors: Mahmoud Abo Khamis, Alexandru-Mihai Hurjui, Ahmet Kara, Dan Olteanu, Dan Suciu, and Zilu Tian

Published in: LIPIcs, Volume 365, 29th International Conference on Database Theory (ICDT 2026)

Abstract

Conjunctive Regular Path Queries, or CRPQs for short, are an essential construct in graph query languages. In this paper, we propose the first output-sensitive algorithm for evaluating acyclic CRPQs. It is output-sensitive in the sense that its complexity is a function of the sizes of the input graph and of the query output and not of the output sizes of the regular expressions that appear in the query, as these latter sizes can be larger than the query output size. Our algorithm proceeds in two stages. In the first stage, it contracts the given query into a free-connex acyclic one such that the output of the original query can be obtained from the output of the contracted one. This contraction removes bound variables by composing regular expressions or by promoting bound variables to free ones. The minimum necessary number of promoted bound variables gives the contraction width, which is a novel parameter specific to CRPQs. In the second stage, our algorithm evaluates the free-connex acyclic CRPQ and projects away the columns of the promoted bound variables. It ensures output-sensitivity by computing the calibrated outputs of the regular expressions appearing in the free-connex acyclic CRPQ in time proportional to their sizes. Our algorithm has lower complexity than the state-of-the-art approaches for problem instances where the query output is asymptotically smaller than the output sizes of the regular expressions that appear in the query.

Cite as

Mahmoud Abo Khamis, Alexandru-Mihai Hurjui, Ahmet Kara, Dan Olteanu, Dan Suciu, and Zilu Tian. Output-Sensitive Evaluation of Acyclic Conjunctive Regular Path Queries. In 29th International Conference on Database Theory (ICDT 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 365, pp. 18:1-18:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{abokhamis_et_al:LIPIcs.ICDT.2026.18,
  author =	{Abo Khamis, Mahmoud and Hurjui, Alexandru-Mihai and Kara, Ahmet and Olteanu, Dan and Suciu, Dan and Tian, Zilu},
  title =	{{Output-Sensitive Evaluation of Acyclic Conjunctive Regular Path Queries}},
  booktitle =	{29th International Conference on Database Theory (ICDT 2026)},
  pages =	{18:1--18:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-413-0},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{365},
  editor =	{ten Cate, Balder and Funk, Maurice},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2026.18},
  URN =		{urn:nbn:de:0030-drops-256321},
  doi =		{10.4230/LIPIcs.ICDT.2026.18},
  annote =	{Keywords: graph databases, regular path queries, output-sensitive algorithms}
}

Document

DOI: 10.4230/LIPIcs.STACS.2026.29

A Pumping-Like Lemma for Languages over Infinite Alphabets

Authors: Yoav Danieli

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

We prove a kind of pumping lemma for languages accepted by one-register alternating finite-memory automata. As a corollary, we obtain that the set of lengths of words in such languages is semi-linear.

Cite as

Yoav Danieli. A Pumping-Like Lemma for Languages over Infinite Alphabets. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 29:1-29:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{danieli:LIPIcs.STACS.2026.29,
  author =	{Danieli, Yoav},
  title =	{{A Pumping-Like Lemma for Languages over Infinite Alphabets}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{29:1--29:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguyễn Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.29},
  URN =		{urn:nbn:de:0030-drops-255185},
  doi =		{10.4230/LIPIcs.STACS.2026.29},
  annote =	{Keywords: infinite alphabets, pumping lemma, alternation, semi-linearity}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.STACS.2026.1

Query Languages for Machine-Learning Models (Invited Talk)

Authors: Martin Grohe

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

In my invited talk and this accompanying paper, I discuss two logics for weighted finite structures: first-order logic with summation (FO(SUM)) and its recursive extension IFP(SUM). These logics originate from foundational work by Grädel, Gurevich, and Meer in the 1990s. In recent joint work with Standke, Steegmans, and Van den Bussche, we have investigated these logics as query languages for machine learning models, specifically neural networks, which are naturally represented as weighted graphs. I present illustrative examples of queries to neural networks that can be expressed in these logics and discuss fundamental results on their expressiveness and computational complexity.

Cite as

Martin Grohe. Query Languages for Machine-Learning Models (Invited Talk). In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 1:1-1:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{grohe:LIPIcs.STACS.2026.1,
  author =	{Grohe, Martin},
  title =	{{Query Languages for Machine-Learning Models}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{1:1--1:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguyễn Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.1},
  URN =		{urn:nbn:de:0030-drops-254904},
  doi =		{10.4230/LIPIcs.STACS.2026.1},
  annote =	{Keywords: Expressive power of query languages, fixed-point logics, weighted structures, neural networks, explainable AI}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2026.102

Smoothed Analysis of Dynamic Graph Algorithms

Authors: Uri Meir and Ami Paz

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)

Abstract

Recent years have seen significant progress in the study of dynamic graph algorithms, and most notably, the introduction of strong lower bound techniques for them (e.g., Henzinger, Krinninger, Nanongkai and Saranurak, STOC 2015; Larsen and Yu, FOCS 2023). As worst-case analysis (adversarial inputs) may lead to the necessity of high running times, a natural question arises: in which cases are high running times really necessary, and in which cases do these inputs merely manifest unique pathological cases? Early attempts to tackle this question were made by Nikoletseas, Reif, Spirakis and Yung (ICALP 1995) and by Alberts and Henzinger (Algorithmica 1998), who considered models with very little adversarial control over the inputs, and showed fast algorithms exist for them. The question was then overlooked for decades, until Henzinger, Lincoln and Saha (SODA 2022) recently addressed uniformly random inputs, and presented algorithms and impossibility results for several subgraph counting problems. To tackle the above question more thoroughly, we employ smoothed analysis, a celebrated framework introduced by Spielman and Teng (J. ACM, 2004). An input is proposed by an adversary but then a noisy version of it is processed by the algorithm instead. This model of inputs is parameterized by the amount of adversarial control, and fully interpolates between worst-case inputs and a uniformly random input. Doing so, we extend impossibility results for some problems to the smoothed model with only a minor quantitative loss. That is, we show that partially-adversarial inputs suffice to impose high running times for certain problems. In contrast, we show that other problems become easy even with the slightest amount of noise. In addition, we study the interplay between the adversary and the noise, leading to three natural models of smoothed inputs, for which we show a hierarchy of increasing difficulty stretching between the average-case and the worst-case complexities.

Cite as

Uri Meir and Ami Paz. Smoothed Analysis of Dynamic Graph Algorithms. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 102:1-102:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{meir_et_al:LIPIcs.ITCS.2026.102,
  author =	{Meir, Uri and Paz, Ami},
  title =	{{Smoothed Analysis of Dynamic Graph Algorithms}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{102:1--102:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.102},
  URN =		{urn:nbn:de:0030-drops-253896},
  doi =		{10.4230/LIPIcs.ITCS.2026.102},
  annote =	{Keywords: Dynamic graph algorithms, Smoothed analysis, Shortest paths}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.4

Learning Aggregate Queries Defined by First-Order Logic with Counting

Authors: Steffen van Bergerem and Nicole Schweikardt

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

In the logical framework introduced by Grohe and Turán (TOCS 2004) for Boolean classification problems, the instances to classify are tuples from a logical structure, and Boolean classifiers are described by parametric models based on logical formulas. This is a specific scenario for supervised passive learning, where classifiers should be learned based on labelled examples. Existing results in this scenario focus on Boolean classification. This paper presents learnability results beyond Boolean classification. We focus on multiclass classification problems where the task is to assign input tuples to arbitrary integers. To represent such integer-valued classifiers, we use aggregate queries specified by an extension of first-order logic with counting terms called FOC₁. Our main result shows the following: given a database of polylogarithmic degree, within quasi-linear time, we can build an index structure that makes it possible to learn FOC₁-definable integer-valued classifiers in time polylogarithmic in the size of the database and polynomial in the number of training examples.

Cite as

Steffen van Bergerem and Nicole Schweikardt. Learning Aggregate Queries Defined by First-Order Logic with Counting. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 4:1-4:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{vanbergerem_et_al:LIPIcs.ICDT.2025.4,
  author =	{van Bergerem, Steffen and Schweikardt, Nicole},
  title =	{{Learning Aggregate Queries Defined by First-Order Logic with Counting}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{4:1--4:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.4},
  URN =		{urn:nbn:de:0030-drops-229457},
  doi =		{10.4230/LIPIcs.ICDT.2025.4},
  annote =	{Keywords: Supervised learning, multiclass classification problems, counting logic}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2025.1

The Quest for Faster Join Algorithms (Invited Talk)

Authors: Paraschos Koutris, Shaleen Deep, Austen Fan, and Hangdong Zhao

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Joins are the cornerstone of relational databases. Surprisingly, even after several decades of research in the systems and theory database community, we still lack an understanding of how to design the fastest possible join algorithm. In this talk, we will present the exciting progress the database theory community has achieved in join algorithms over the last two decades. The talk will revolve around five key ideas fundamentally shaping this research area: tree decompositions, data partitioning, leveraging statistical information, enumeration, and algebraic techniques.

Cite as

Paraschos Koutris, Shaleen Deep, Austen Fan, and Hangdong Zhao. The Quest for Faster Join Algorithms (Invited Talk). In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 1:1-1:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{koutris_et_al:LIPIcs.ICDT.2025.1,
  author =	{Koutris, Paraschos and Deep, Shaleen and Fan, Austen and Zhao, Hangdong},
  title =	{{The Quest for Faster Join Algorithms}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{1:1--1:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.1},
  URN =		{urn:nbn:de:0030-drops-229428},
  doi =		{10.4230/LIPIcs.ICDT.2025.1},
  annote =	{Keywords: Conjunctive Queries, Joins, Tree Decompositions, Enumeration, Semirings}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2025.2

Data Management Perspectives on Prescriptive Analytics (Invited Talk)

Authors: Alexandra Meliou, Azza Abouzied, Peter J. Haas, Riddho R. Haque, Anh Mai, and Vasileios Vittis

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Decision makers in a broad range of domains, such as finance, transportation, manufacturing, and healthcare, often need to derive optimal decisions given a set of constraints and objectives. Traditional solutions to such constrained optimization problems are typically application-specific, complex, and do not generalize. Further, the usual workflow requires slow, cumbersome, and error-prone data movement between a database, and predictive-modeling and optimization packages. All of these problems are exacerbated by the unprecedented size of modern data-intensive optimization problems. The emerging research area of in-database prescriptive analytics aims to provide seamless domain-independent, declarative, and scalable approaches powered by the system where the data typically resides: the database. Integrating optimization with database technology opens up prescriptive analytics to a much broader community, amplifying its benefits. We discuss how deep integration between the DBMS, predictive models, and optimization software creates opportunities for rich prescriptive-query functionality with good scalability and performance. Summarizing some of our main results and ongoing work in this area, we highlight challenges related to usability, scalability, data uncertainty, and dynamic environments, and argue that perspectives from data management research can drive novel strategies and solutions.

Cite as

Alexandra Meliou, Azza Abouzied, Peter J. Haas, Riddho R. Haque, Anh Mai, and Vasileios Vittis. Data Management Perspectives on Prescriptive Analytics (Invited Talk). In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 2:1-2:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{meliou_et_al:LIPIcs.ICDT.2025.2,
  author =	{Meliou, Alexandra and Abouzied, Azza and Haas, Peter J. and Haque, Riddho R. and Mai, Anh and Vittis, Vasileios},
  title =	{{Data Management Perspectives on Prescriptive Analytics}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{2:1--2:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.2},
  URN =		{urn:nbn:de:0030-drops-229432},
  doi =		{10.4230/LIPIcs.ICDT.2025.2},
  annote =	{Keywords: Prescriptive analytics, decision making, scalable constrained optimization}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2025.3

Evaluating First-Order Formulas in Structured Graphs (Invited Talk)

Authors: Szymon Toruńczyk

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

A central problem in database theory concerns the complexity of the query evaluation problem, also called the model-checking problem in finite model theory: the problem of evaluating a given formula in a given structure. Here, I will focus on formulas of first-order logic, and the data complexity (or parameterized complexity) of their evaluation. Leveraging tools from structural graph theory, I will assume that the input structure is a graph which comes from a fixed class of well-structured graphs, such as the class of planar graphs, classes of bounded treewidth or clique-width, or much more general "tame" graph classes, such as the nowhere dense graph classes introduced by Ossona de Mendez and Nešetřil, or classes of bounded twin-width studied by Bonnet, Thomassé, and coauthors. I will survey the recent progress in this area, which connects tools from structural graph theory, from model theory (such as stability and dependence), and from statistical learning theory and computational geometry (such as VC-dimension and ε-nets).

Cite as

Szymon Toruńczyk. Evaluating First-Order Formulas in Structured Graphs (Invited Talk). In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 3:1-3:2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{torunczyk:LIPIcs.ICDT.2025.3,
  author =	{Toru\'{n}czyk, Szymon},
  title =	{{Evaluating First-Order Formulas in Structured Graphs}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{3:1--3:2},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.3},
  URN =		{urn:nbn:de:0030-drops-229449},
  doi =		{10.4230/LIPIcs.ICDT.2025.3},
  annote =	{Keywords: Finite model theory, first-order model checking, graph parameters}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.20

A Formal Language Perspective on Factorized Representations

Authors: Benny Kimelfeld, Wim Martens, and Matthias Niewerth

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Factorized representations (FRs) are a well-known tool to succinctly represent results of join queries and were originally defined using the named database perspective. We define FRs in the unnamed database perspective and use them to establish several new connections. First, unnamed FRs can be exponentially more succinct than named FRs, but this difference can be alleviated by imposing a disjointness condition on columns. Conversely, named FRs can also be exponentially more succinct than unnamed FRs. Second, unnamed FRs are the same as (i.e., isomorphic to) context-free grammars for languages in which each word has the same length. This tight connection allows us to transfer a wide range of results on context-free grammars to database factorization, of which we offer a selection in the paper. Third, when we generalize unnamed FRs to arbitrary sets of tuples, they become a generalization of path multiset representations, a formalism that was recently introduced to succinctly represent sets of paths in the context of graph database query evaluation.

Cite as

Benny Kimelfeld, Wim Martens, and Matthias Niewerth. A Formal Language Perspective on Factorized Representations. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 20:1-20:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{kimelfeld_et_al:LIPIcs.ICDT.2025.20,
  author =	{Kimelfeld, Benny and Martens, Wim and Niewerth, Matthias},
  title =	{{A Formal Language Perspective on Factorized Representations}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{20:1--20:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.20},
  URN =		{urn:nbn:de:0030-drops-229614},
  doi =		{10.4230/LIPIcs.ICDT.2025.20},
  annote =	{Keywords: Databases, relational databases, graph databases, factorized databases, regular path queries, compact representations}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.16

Quantum Data Sketches

Authors: Qin Zhang and Mohsen Heidari

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Recent advancements in quantum technologies, particularly in quantum sensing and simulation, have facilitated the generation and analysis of inherently quantum data. This progress underscores the necessity for developing efficient and scalable quantum data management strategies. This goal faces immense challenges due to the exponential dimensionality of quantum data and its unique quantum properties such as no-cloning and measurement stochasticity. Specifically, classical storage and manipulation of an arbitrary n-qubit quantum state requires exponential space and time. Hence, there is a critical need to revisit foundational data management concepts and algorithms for quantum data. In this paper, we propose succinct quantum data sketches to support basic database operations such as search and selection. We view our work as an initial step towards the development of a quantum data management model, opening up many possibilities for future research in this direction.

Cite as

Qin Zhang and Mohsen Heidari. Quantum Data Sketches. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 16:1-16:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{zhang_et_al:LIPIcs.ICDT.2025.16,
  author =	{Zhang, Qin and Heidari, Mohsen},
  title =	{{Quantum Data Sketches}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{16:1--16:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.16},
  URN =		{urn:nbn:de:0030-drops-229570},
  doi =		{10.4230/LIPIcs.ICDT.2025.16},
  annote =	{Keywords: quantum data representation, data sketching, query execution}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.21

Targeted Least Cardinality Candidate Key for Relational Databases

Authors: Vasileios Nakos, Hung Q. Ngo, and Charalampos E. Tsourakakis

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Functional dependencies (FDs) are a central theme in databases, playing a major role in the design of database schemas and the optimization of queries [Ramakrishnan and Gehrke, 2003]. In this work, we introduce the targeted least cardinality candidate key problem (TCAND). This problem is defined over a set of functional dependencies ℱ and a target variable set T ⊆ V, and it aims to find the smallest set X ⊆ V such that the FD X → T can be derived from ℱ. The TCAND problem generalizes the well-known NP-hard problem of finding the least cardinality candidate key [Lucchesi and Osborn, 1978], which has been previously demonstrated to be at least as difficult as the set cover problem. We present an integer programming (IP) formulation for the TCAND problem, analogous to a layered set cover problem. We analyze its linear programming (LP) relaxation from two perspectives: we propose two approximation algorithms and investigate the integrality gap. Our findings indicate that the approximation upper bounds for our algorithms are not significantly improvable through LP rounding, a notable distinction from the standard Set Cover problem. Additionally, we discover that a generalization of the TCAND problem is equivalent to a variant of the Set Cover problem, named Red Blue Set Cover [Carr et al., 2000], which cannot be approximated within a sub-polynomial factor in polynomial time under plausible conjectures [Chlamtáč et al., 2023]. Despite the extensive history surrounding the issue of identifying the least cardinality candidate key, our research contributes new theoretical insights and novel algorithms, and demonstrates that the general TCAND problem poses complexities beyond those encountered in the Set Cover problem.

Cite as

Vasileios Nakos, Hung Q. Ngo, and Charalampos E. Tsourakakis. Targeted Least Cardinality Candidate Key for Relational Databases. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 21:1-21:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{nakos_et_al:LIPIcs.ICDT.2025.21,
  author =	{Nakos, Vasileios and Ngo, Hung Q. and Tsourakakis, Charalampos E.},
  title =	{{Targeted Least Cardinality Candidate Key for Relational Databases}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{21:1--21:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.21},
  URN =		{urn:nbn:de:0030-drops-229628},
  doi =		{10.4230/LIPIcs.ICDT.2025.21},
  annote =	{Keywords: functional dependencies, candidate key, approximation algorithms, hardness}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.25

Optimal Oblivious Algorithms for Multi-Way Joins

Authors: Xiao Hu and Zhiang Wu

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

In cloud databases, cloud computation over sensitive data uploaded by clients inevitably causes concern about data security and privacy. Even if cryptographic primitives and trusted computing environments are integrated into query processing to safeguard the actual contents of the data, access patterns of algorithms can still leak private information about data. Oblivious RAM (ORAM) and circuits are two generic approaches to address this issue, ensuring that access patterns of algorithms remain oblivious to the data. However, deploying these methods on insecure algorithms, particularly for multi-way join processing, is computationally expensive and inherently challenging. In this paper, we propose a novel sorting-based algorithm for multi-way join processing that operates without relying on ORAM simulations or other security assumptions. Our algorithm is a non-trivial, provably oblivious composition of basic primitives, with time complexity matching the insecure worst-case optimal join algorithm, up to a logarithmic factor. Furthermore, it is cache-agnostic, with cache complexity matching the insecure lower bound, also up to a logarithmic factor. This clean and straightforward approach has the potential to be extended to other security settings and implemented in practical database systems.

Cite as

Xiao Hu and Zhiang Wu. Optimal Oblivious Algorithms for Multi-Way Joins. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 25:1-25:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{hu_et_al:LIPIcs.ICDT.2025.25,
  author =	{Hu, Xiao and Wu, Zhiang},
  title =	{{Optimal Oblivious Algorithms for Multi-Way Joins}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{25:1--25:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.25},
  URN =		{urn:nbn:de:0030-drops-229662},
  doi =		{10.4230/LIPIcs.ICDT.2025.25},
  annote =	{Keywords: oblivious algorithms, multi-way joins, worst-case optimality}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.19

Beyond Logarithmic Bounds: Querying in Constant Expected Time with Learned Indexes

Authors: Luis Alberto Croquevielle, Guang Yang, Liang Liang, Ali Hadian, and Thomas Heinis

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Learned indexes leverage machine learning models to accelerate query answering in databases, showing impressive practical performance. However, theoretical understanding of these methods remains incomplete. Existing research suggests that learned indexes have superior asymptotic complexity compared to their non-learned counterparts, but these findings have been established under restrictive probabilistic assumptions. Specifically, for a sorted array with n elements, it has been shown that learned indexes can find a key in O(log(log n)) expected time using at most linear space, compared with O(log n) for non-learned methods. In this work, we prove O(1) expected time can be achieved with at most linear space, thereby establishing the tightest upper bound so far for the time complexity of an asymptotically optimal learned index. Notably, we use weaker probabilistic assumptions than prior research, meaning our work generalizes previous results. Furthermore, we introduce a new measure of statistical complexity for data. This metric exhibits an information-theoretical interpretation and can be estimated in practice. This characterization provides further theoretical understanding of learned indexes, by helping to explain why some datasets seem to be particularly challenging for these methods.

Cite as

Luis Alberto Croquevielle, Guang Yang, Liang Liang, Ali Hadian, and Thomas Heinis. Beyond Logarithmic Bounds: Querying in Constant Expected Time with Learned Indexes. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 19:1-19:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{croquevielle_et_al:LIPIcs.ICDT.2025.19,
  author =	{Croquevielle, Luis Alberto and Yang, Guang and Liang, Liang and Hadian, Ali and Heinis, Thomas},
  title =	{{Beyond Logarithmic Bounds: Querying in Constant Expected Time with Learned Indexes}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{19:1--19:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.19},
  URN =		{urn:nbn:de:0030-drops-229603},
  doi =		{10.4230/LIPIcs.ICDT.2025.19},
  annote =	{Keywords: Learned Indexes, Expected Time, Stochastic Processes, R\'{e}nyi Entropy}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.18

A Framework for Extraction and Transformation of Documents

Authors: Cristian Riveros, Markus L. Schmid, and Nicole Schweikardt

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

We present a theoretical framework for the extraction and transformation of text documents as a two-phase process: The first phase uses document spanners to extract information from the input document. The second phase transforms the extracted information into a suitable output. To support several reasonable extract-transform scenarios, we propose for the first phase an extension of document spanners from span-tuples to so-called multispan-tuples, where variables are mapped to sets of spans instead of only single spans. We focus on multispanners described by regex formulas, and we prove that these have the same desirable properties as standard regular spanners. To formalize the second phase, we consider transformations that map every pair document-tuple, where each tuple comes from the (multi)span-relation extracted in the first phase, into a new output document. The specification of the two phases is what we call an extract-transform (ET) program, which covers practically relevant extract-transform tasks. In this paper, our main technical goal is to identify a broad class of ET programs that can be evaluated efficiently. We specifically focus on the scenario of regular ET programs: the extraction phase is given by a regex multispanner and the transformation phase is given by a regular string-to-string function. We show that for any regular ET program, given an input document, we can enumerate all final output documents with output-linear delay after linear preprocessing. As a side effect, we characterize the expressive power of regular ET programs and also show that they have desirable properties, like being closed under composition.

Cite as

Cristian Riveros, Markus L. Schmid, and Nicole Schweikardt. A Framework for Extraction and Transformation of Documents. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 18:1-18:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{riveros_et_al:LIPIcs.ICDT.2025.18,
  author =	{Riveros, Cristian and Schmid, Markus L. and Schweikardt, Nicole},
  title =	{{A Framework for Extraction and Transformation of Documents}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{18:1--18:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.18},
  URN =		{urn:nbn:de:0030-drops-229593},
  doi =		{10.4230/LIPIcs.ICDT.2025.18},
  annote =	{Keywords: Information extraction, Document spanners, Transducers, Query evaluation}
}

49 Search Results for "Kara, Ahmet"
