DROPS

Document

Research

Native Provenance Computation for Federated and Non-Federated SPARQL Queries

Authors: Zubaria Asma, Daniel Hernández, Luis Galárraga, Giorgos Flouris, Irini Fundulaki, and Katja Hose

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 4, Issue 1 (2026)

Abstract

The popularity of knowledge graphs (KGs) owes credit to their flexible data model, which is suitable for data integration from multiple sources. Several KG-based applications, such as trust assessment, view maintenance, or data valuation on dynamic data, rely on the ability to compute provenance explanations for query results. This need becomes more urgent in federated query processing systems, which allow the online consumption of heterogeneous and decentralized Web data. However, the problem of computing and interacting with provenance has received little attention, especially in the federated setting. On those grounds, this paper introduces the NPCS (Native Provenance Computation for SPARQL) approach, and its federated variant Fed-NPCS, that compute provenance for SPARQL query results. Both approaches build upon spm-semirings to annotate the results of monotonic and non-monotonic SPARQL queries with their provenance. Due to their reliance on query rewriting techniques, the approaches are directly applicable to already deployed SPARQL engines and federations using different reification schemes, including RDF-star. Our experimental evaluation shows that our novel query rewriting approach brings significant run-time improvements w.r.t. the state-of-the-art across both centralized and federated settings. In centralized settings, our tests on two popular SPARQL engines (GraphDB and Stardog) reveal substantial runtime gains over existing query rewriting solutions, enabling scalability to RDF graphs with billions of triples. In federated settings, our experiments on the FedShop benchmark with GraphDB show the viability of Fed-NPCS for federations with up to 200 sources.

Cite as

Zubaria Asma, Daniel Hernández, Luis Galárraga, Giorgos Flouris, Irini Fundulaki, and Katja Hose. Native Provenance Computation for Federated and Non-Federated SPARQL Queries. In Transactions on Graph Data and Knowledge (TGDK), Volume 4, Issue 1, pp. 4:1-4:43, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@Article{asma_et_al:TGDK.4.1.4,
  author =	{Asma, Zubaria and Hern\'{a}ndez, Daniel and Gal\'{a}rraga, Luis and Flouris, Giorgos and Fundulaki, Irini and Hose, Katja},
  title =	{{Native Provenance Computation for Federated and Non-Federated SPARQL Queries}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:43},
  ISSN =	{2942-7517},
  year =	{2026},
  volume =	{4},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.4.1.4},
  URN =		{urn:nbn:de:0030-drops-259642},
  doi =		{10.4230/TGDK.4.1.4},
  annote =	{Keywords: native provenance computation, federated SPARQL queries, data provenance, NPCS, Fed-NPCS}
}

Document

Research

DOI: 10.4230/TGDK.4.1.3

Semantically Reflected Programs

Authors: Eduard Kamburjan, Vidar Norstein Klungre, Yuanwei Qu, Rudolf Schlatte, Egor V. Kostylev, Martin Giese, and Einar Broch Johnsen

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 4, Issue 1 (2026)

Abstract

This paper addresses the dichotomy between the formalization of structural and the formalization of executable behavioral knowledge by means of semantically lifted programs, which explore an intuitive connection between imperative programs and knowledge graphs. While knowledge graphs and ontologies are eminently useful to represent formal knowledge about a system’s individuals and universals, programming languages are designed to describe the system’s evolution. To address this dichotomy, we introduce a semantic lifting of the program states of an executing program into a knowledge graph, for an object-oriented programming language. The resulting graph is exposed as a semantic reflection layer within the programming language, allowing programmers to leverage knowledge of the application domain in their programs during execution. In this paper, we formalize semantic lifting and semantic reflection for a small imperative programming language, SMOL, explain the operational aspects of the language, and consider type correctness and virtualization for runtime program queries through the semantic reflection layer. We illustrate semantic lifting and semantic reflection through a case study of geological modeling and discuss different applications of the technique. The language implementation is open source and available online.

Cite as

Eduard Kamburjan, Vidar Norstein Klungre, Yuanwei Qu, Rudolf Schlatte, Egor V. Kostylev, Martin Giese, and Einar Broch Johnsen. Semantically Reflected Programs. In Transactions on Graph Data and Knowledge (TGDK), Volume 4, Issue 1, pp. 3:1-3:52, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@Article{kamburjan_et_al:TGDK.4.1.3,
  author =	{Kamburjan, Eduard and Klungre, Vidar Norstein and Qu, Yuanwei and Schlatte, Rudolf and Kostylev, Egor V. and Giese, Martin and Johnsen, Einar Broch},
  title =	{{Semantically Reflected Programs}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:52},
  ISSN =	{2942-7517},
  year =	{2026},
  volume =	{4},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.4.1.3},
  URN =		{urn:nbn:de:0030-drops-256884},
  doi =		{10.4230/TGDK.4.1.3},
  annote =	{Keywords: Knowledge Graphs, Ontologies, Object-Oriented Modelling, Imperative Programming Languages, Reflection, Type Safety}
}

Document

DOI: 10.4230/LIPIcs.STACS.2026.5

On the Complexity of Language Membership for Probabilistic Words

Authors: Antoine Amarilli, Mikaël Monet, Paul Raphaël, and Sylvain Salvati

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

We study the membership problem to context-free languages L (CFLs) on probabilistic words, that specify for each position a probability distribution on the letters (assuming independence across positions). Our task is to compute, given a probabilistic word, what is the probability that a word drawn according to the distribution belongs to L. This problem generalizes the problem of counting how many words of length n belong to L, or of counting how many completions of a partial word belong to L. We show that this problem is in polynomial time for unambiguous context-free languages (uCFLs), but can be #P-hard already for unions of two linear uCFLs. More generally, we show that the problem is in polynomial time for so-called poly-slicewise-unambiguous languages, where given a length n we can tractably compute a uCFL for the words of length n in the language. This class includes some inherently ambiguous languages, and implies the tractability of bounded CFLs and of languages recognized by unambiguous polynomial-time counter automata; but we show that the problem can be #P-hard for nondeterministic counter automata, even for Parikh automata with a single counter. We then introduce classes of circuits from knowledge compilation which we use for tractable counting, and show that this covers the tractability of poly-slicewise-unambiguous languages and of some CFLs that are not poly-slicewise-unambiguous. Extending these circuits with negation further allows us to show tractability for the language of primitive words, and for the language of concatenations of two palindromes. We finally show the conditional undecidability of the meta-problem that asks, given a CFG, whether the probabilistic membership problem for that CFG is tractable or #P-hard.
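
To make the problem statement concrete, the following is a minimal sketch for a much simpler special case than the one studied in the paper: a regular language given by a deterministic finite automaton. Because a DFA has exactly one run per word, a forward pass over the positions sums probabilities without double counting. The function name, the input encoding, and the example automaton are illustrative assumptions and are not taken from the paper.

```python
from fractions import Fraction

def acceptance_probability(prob_word, delta, start, accepting):
    """Probability that a word drawn from prob_word is accepted by a DFA.

    prob_word: list of dicts mapping letter -> probability, one per position
               (positions are assumed independent).
    delta:     dict mapping (state, letter) -> state; a missing entry rejects.
    """
    # dist[q] = probability that the prefix read so far leads to state q.
    dist = {start: Fraction(1)}
    for letter_dist in prob_word:
        nxt = {}
        for state, p in dist.items():
            for letter, q in letter_dist.items():
                target = delta.get((state, letter))
                if target is not None:
                    nxt[target] = nxt.get(target, Fraction(0)) + p * q
        dist = nxt
    return sum((p for s, p in dist.items() if s in accepting), Fraction(0))

# Hypothetical DFA over {a, b}: accepts words with an even number of 'a'.
EVEN_A = {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
}
word = [
    {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    {"a": Fraction(1)},
    {"a": Fraction(1, 4), "b": Fraction(3, 4)},
]
result = acceptance_probability(word, EVEN_A, "even", {"even"})
print(result)  # 1/2
```

With a uniform distribution at every position, multiplying the result by the number of words of length n recovers the counting problem mentioned in the abstract.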

Cite as

Antoine Amarilli, Mikaël Monet, Paul Raphaël, and Sylvain Salvati. On the Complexity of Language Membership for Probabilistic Words. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 5:1-5:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{amarilli_et_al:LIPIcs.STACS.2026.5,
  author =	{Amarilli, Antoine and Monet, Mika\"{e}l and Rapha\"{e}l, Paul and Salvati, Sylvain},
  title =	{{On the Complexity of Language Membership for Probabilistic Words}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{5:1--5:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.5},
  URN =		{urn:nbn:de:0030-drops-254943},
  doi =		{10.4230/LIPIcs.STACS.2026.5},
  annote =	{Keywords: Automaton, probabilistic words, context-free grammar, membership problem}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.STACS.2026.1

Query Languages for Machine-Learning Models (Invited Talk)

Authors: Martin Grohe

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

In my invited talk and this accompanying paper, I discuss two logics for weighted finite structures: first-order logic with summation (FO(SUM)) and its recursive extension IFP(SUM). These logics originate from foundational work by Grädel, Gurevich, and Meer in the 1990s. In recent joint work with Standke, Steegmans, and Van den Bussche, we have investigated these logics as query languages for machine learning models, specifically neural networks, which are naturally represented as weighted graphs. I present illustrative examples of queries to neural networks that can be expressed in these logics and discuss fundamental results on their expressiveness and computational complexity.

Cite as

Martin Grohe. Query Languages for Machine-Learning Models (Invited Talk). In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 1:1-1:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{grohe:LIPIcs.STACS.2026.1,
  author =	{Grohe, Martin},
  title =	{{Query Languages for Machine-Learning Models}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{1:1--1:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.1},
  URN =		{urn:nbn:de:0030-drops-254904},
  doi =		{10.4230/LIPIcs.STACS.2026.1},
  annote =	{Keywords: Expressive power of query languages, fixed-point logics, weighted structures, neural networks, explainable AI}
}

Document

DOI: 10.4230/LIPIcs.ITP.2025.36

Verifying Datalog Reasoning with Lean

Authors: Johannes Tantow, Lukas Gerlach, Stephan Mennicke, and Markus Krötzsch

Published in: LIPIcs, Volume 352, 16th International Conference on Interactive Theorem Proving (ITP 2025)

Abstract

Datalog is an essential logical rule language with many applications, and modern rule engines compute logical consequences for Datalog with high performance and scalability. While Datalog is rather simple and, in principle, explainable by design, such sophisticated implementations and optimizations are hard to verify. We therefore propose a certificate-based approach to validate results of Datalog reasoners in a formally verified checker for Datalog proofs. Using the proof assistant Lean, we implement such a checker and verify its correctness against direct formalizations of the Datalog semantics. We propose two JSON encodings for Datalog proofs: one using the widely supported Datalog proof trees, and one using directed acyclic graphs for succinctness. To evaluate the practical feasibility and performance of our approach, we validate proofs that we obtain by converting derivation traces of an existing Datalog reasoner into our tool-independent format.

Cite as

Johannes Tantow, Lukas Gerlach, Stephan Mennicke, and Markus Krötzsch. Verifying Datalog Reasoning with Lean. In 16th International Conference on Interactive Theorem Proving (ITP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 352, pp. 36:1-36:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{tantow_et_al:LIPIcs.ITP.2025.36,
  author =	{Tantow, Johannes and Gerlach, Lukas and Mennicke, Stephan and Kr\"{o}tzsch, Markus},
  title =	{{Verifying Datalog Reasoning with Lean}},
  booktitle =	{16th International Conference on Interactive Theorem Proving (ITP 2025)},
  pages =	{36:1--36:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-396-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{352},
  editor =	{Forster, Yannick and Keller, Chantal},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITP.2025.36},
  URN =		{urn:nbn:de:0030-drops-246342},
  doi =		{10.4230/LIPIcs.ITP.2025.36},
  annote =	{Keywords: Certifying Algorithms, Datalog, Formal Verification}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.108

Near-Optimal Hypergraph Sparsification in Insertion-Only and Bounded-Deletion Streams

Authors: Sanjeev Khanna, Aaron Putterman, and Madhu Sudan

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

We study the problem of constructing hypergraph cut sparsifiers in the streaming model where a hypergraph on n vertices is revealed either via an arbitrary sequence of hyperedge insertions alone (insertion-only streaming model) or via an arbitrary sequence of hyperedge insertions and deletions (dynamic streaming model). For any ε ∈ (0,1), a (1 ± ε) hypergraph cut-sparsifier of a hypergraph H is a reweighted subgraph H' whose cut values approximate those of H to within a (1 ± ε) factor. Prior work shows that in the static setting, one can construct a (1 ± ε) hypergraph cut-sparsifier using Õ(nr/ε²) bits of space [Chen-Khanna-Nagda FOCS 2020], and in the setting of dynamic streams using Õ(nr log m/ε²) bits of space [Khanna-Putterman-Sudan FOCS 2024]; here the Õ notation hides terms that are polylogarithmic in n, and we use m to denote the total number of hyperedges in the hypergraph. Up until now, the best known space complexity for insertion-only streams has been the same as that for the dynamic streams. This naturally poses the question of understanding the complexity of hypergraph sparsification in insertion-only streams. Perhaps surprisingly, in this work we show that in insertion-only streams, a (1 ± ε) cut-sparsifier can be computed in Õ(nr/ε²) bits of space, matching the complexity of the static setting. As a consequence, this also establishes an Ω(log m) factor separation between the space complexity of hypergraph cut sparsification in insertion-only streams and dynamic streams, as the latter is provably known to require Ω(nr log m) bits of space. To better explain this gap, we then show a more general result: namely, if the stream has at most k hyperedge deletions then Õ(nr log k/ε²) bits of space suffice for hypergraph cut sparsification. Thus the space complexity smoothly interpolates between the insertion-only regime (k = 0) and the fully dynamic regime (k = m).
Our algorithmic results are driven by a key technical insight: once sufficiently many hyperedges have been inserted into the stream (relative to the number of allowed deletions), we can significantly reduce the underlying hypergraph in size by irrevocably contracting large subsets of vertices. Finally, we complement this result with an essentially matching lower bound of Ω(nr log(k/n)) bits, thus providing essentially a tight characterization of the space complexity for hypergraph cut-sparsification across a spectrum of streaming models.

Cite as

Sanjeev Khanna, Aaron Putterman, and Madhu Sudan. Near-Optimal Hypergraph Sparsification in Insertion-Only and Bounded-Deletion Streams. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 108:1-108:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{khanna_et_al:LIPIcs.ICALP.2025.108,
  author =	{Khanna, Sanjeev and Putterman, Aaron and Sudan, Madhu},
  title =	{{Near-Optimal Hypergraph Sparsification in Insertion-Only and Bounded-Deletion Streams}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{108:1--108:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.108},
  URN =		{urn:nbn:de:0030-drops-234851},
  doi =		{10.4230/LIPIcs.ICALP.2025.108},
  annote =	{Keywords: Sparsification, sketching, hypergraphs}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.26

Dynamic Direct Access of MSO Query Evaluation over Strings

Authors: Pierre Bourhis, Florent Capelli, Stefan Mengel, and Cristian Riveros

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

We study the problem of evaluating a Monadic Second Order (MSO) query over strings under updates in the setting of direct access. We present an algorithm that, given an MSO query with first-order free variables represented by an unambiguous variable-set automaton 𝒜 with state set Q and variables X and a string s, computes a data structure in time 𝒪(|Q|^ω ⋅ |X|² ⋅ |s|) and, then, given an index i retrieves, using the data structure, the i-th output of the evaluation of 𝒜 over s in time 𝒪(|Q|^ω ⋅ |X|³ ⋅ log(|s|)²) where ω is the exponent for matrix multiplication. Ours is the first efficient direct access algorithm for MSO query evaluation over strings; such algorithms so far had only been studied for first-order queries and conjunctive queries over relational data. Our algorithm gives the answers in lexicographic order where, in contrast to the setting of conjunctive queries, the order between variables can be freely chosen by the user without degrading the runtime. Moreover, our data structure can be updated efficiently after changes to the input string, allowing more powerful updates than in the enumeration literature, e.g. efficient deletion of substrings, concatenation and splitting of strings, and cut-and-paste operations. Our approach combines a matrix representation of MSO queries and a novel data structure for dynamic word problems over semi-groups which yields an overall algorithm that is elegant and easy to formulate.

Cite as

Pierre Bourhis, Florent Capelli, Stefan Mengel, and Cristian Riveros. Dynamic Direct Access of MSO Query Evaluation over Strings. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 26:1-26:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{bourhis_et_al:LIPIcs.ICDT.2025.26,
  author =	{Bourhis, Pierre and Capelli, Florent and Mengel, Stefan and Riveros, Cristian},
  title =	{{Dynamic Direct Access of MSO Query Evaluation over Strings}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{26:1--26:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.26},
  URN =		{urn:nbn:de:0030-drops-229675},
  doi =		{10.4230/LIPIcs.ICDT.2025.26},
  annote =	{Keywords: Query evaluation, direct access, MSO queries}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.17

Partition Constraints for Conjunctive Queries: Bounds and Worst-Case Optimal Joins

Authors: Kyle Deeds and Timo Camillo Merkl

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

In the last decade, various works have used statistics on relations to improve both the theory and practice of conjunctive query execution. Starting with the AGM bound, which took advantage of relation sizes, later works incorporated statistics like functional dependencies and degree constraints. Each new statistic prompted work along two lines: bounding the size of conjunctive query outputs and worst-case optimal join algorithms. In this work, we continue in this vein by introducing a new statistic called a partition constraint. This statistic captures latent structure within relations by partitioning them into sub-relations which each have much tighter degree constraints. We show that this approach can both refine existing cardinality bounds and improve existing worst-case optimal join algorithms.

Cite as

Kyle Deeds and Timo Camillo Merkl. Partition Constraints for Conjunctive Queries: Bounds and Worst-Case Optimal Joins. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 17:1-17:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{deeds_et_al:LIPIcs.ICDT.2025.17,
  author =	{Deeds, Kyle and Merkl, Timo Camillo},
  title =	{{Partition Constraints for Conjunctive Queries: Bounds and Worst-Case Optimal Joins}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{17:1--17:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.17},
  URN =		{urn:nbn:de:0030-drops-229588},
  doi =		{10.4230/LIPIcs.ICDT.2025.17},
  annote =	{Keywords: Worst-Case Optimal Joins, Cardinality Bounds, Degeneracy, Degree Constraints, Partition Constraints}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.8

Enumeration of Minimal Hitting Sets Parameterized by Treewidth

Authors: Batya Kenig and Dan Shlomo Mizrahi

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Enumerating the minimal hitting sets of a hypergraph is a problem which arises in many data management applications that include constraint mining, discovering unique column combinations, and enumerating database repairs. Previously, Eiter et al. [Thomas Eiter et al., 2003] showed that the minimal hitting sets of an n-vertex hypergraph, with treewidth w, can be enumerated with delay O^*(n^w) (ignoring polynomial factors), with space requirements that scale with the output size. We improve this to fixed-parameter-linear delay, following an FPT preprocessing phase. The memory consumption of our algorithm is exponential with respect to the treewidth of the hypergraph.
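
For readers unfamiliar with the problem, the sketch below enumerates minimal hitting sets by brute force. It only illustrates the definition (a set of vertices that intersects every hyperedge and has no proper subset with that property). It takes exponential time and is not the treewidth-based algorithm of the paper; the function name and the example hypergraph are illustrative assumptions.

```python
from itertools import combinations

def minimal_hitting_sets(vertices, hyperedges):
    """Enumerate all minimal hitting sets by brute force, smallest first."""
    edges = [frozenset(e) for e in hyperedges]
    found = []
    for size in range(len(vertices) + 1):
        for candidate in combinations(sorted(vertices), size):
            cand = frozenset(candidate)
            hits_all = all(cand & e for e in edges)
            # Candidates are visited by increasing size, so a hitting set is
            # minimal exactly when no previously found set is a proper subset.
            if hits_all and not any(prev < cand for prev in found):
                found.append(cand)
    return found

# Hypothetical example: a path-shaped hypergraph on vertices 1..4.
sets = minimal_hitting_sets({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}])
print([sorted(s) for s in sets])  # [[1, 3], [2, 3], [2, 4]]
```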

Cite as

Batya Kenig and Dan Shlomo Mizrahi. Enumeration of Minimal Hitting Sets Parameterized by Treewidth. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 8:1-8:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{kenig_et_al:LIPIcs.ICDT.2025.8,
  author =	{Kenig, Batya and Mizrahi, Dan Shlomo},
  title =	{{Enumeration of Minimal Hitting Sets Parameterized by Treewidth}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{8:1--8:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.8},
  URN =		{urn:nbn:de:0030-drops-229498},
  doi =		{10.4230/LIPIcs.ICDT.2025.8},
  annote =	{Keywords: Enumeration, Hitting sets}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.9

Query Languages for Neural Networks

Authors: Martin Grohe, Christoph Standke, Juno Steegmans, and Jan Van den Bussche

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

We lay the foundations for a database-inspired approach to interpreting and understanding neural network models by querying them using declarative languages. Towards this end we study different query languages, based on first-order logic, that mainly differ in their access to the neural network model. First-order logic over the reals naturally yields a language which views the network as a black box; only the input-output function defined by the network can be queried. This is essentially the approach of constraint query languages. On the other hand, a white-box language can be obtained by viewing the network as a weighted graph, and extending first-order logic with summation over weight terms. The latter approach is essentially an abstraction of SQL. In general, the two approaches are incomparable in expressive power, as we will show. Under natural circumstances, however, the white-box approach can subsume the black-box approach; this is our main result. We prove the result concretely for linear constraint queries over real functions definable by feedforward neural networks with a fixed number of hidden layers and piecewise linear activation functions.

Cite as

Martin Grohe, Christoph Standke, Juno Steegmans, and Jan Van den Bussche. Query Languages for Neural Networks. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 9:1-9:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{grohe_et_al:LIPIcs.ICDT.2025.9,
  author =	{Grohe, Martin and Standke, Christoph and Steegmans, Juno and Van den Bussche, Jan},
  title =	{{Query Languages for Neural Networks}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{9:1--9:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.9},
  URN =		{urn:nbn:de:0030-drops-229508},
  doi =		{10.4230/LIPIcs.ICDT.2025.9},
  annote =	{Keywords: Expressive power of query languages, Machine learning models, languages for interpretability, explainable AI}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.6

PAC: Computing Join Queries with Semi-Covers

Authors: Heba Aamer and Bas Ketsman

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

An increased and growing interest in large-scale data processing has triggered a demand for specialized algorithms that thrive in massively parallel shared-nothing systems. To answer the question of how to efficiently compute join queries in this setting, a rich line of research has emerged specifically for the Massively Parallel Communication (MPC) model. In the MPC model, algorithms are executed in rounds, with each round consisting of a synchronized communication phase and a separate local computation phase. The main cost measure is the load of the algorithm, defined as the maximum number of messages received by any server in any round. We study worst-case optimal algorithms for the join query evaluation problem in the constant-round MPC model. In the single-round variant of MPC, the worst-case optimal load for this problem is well understood and algorithms exist that guarantee this load for any join query. In the constant-round variant of MPC, queries can often be computed with a lower load compared to the single-round variant, but the worst-case optimal load is only known for specific classes of join queries, including graph-like and acyclic join queries, and the associated algorithms use very different techniques. In this paper, we propose a new constant-round MPC algorithm for computing join queries. Our algorithm is correct for every join query and its load matches (up to a polylog factor) the worst-case optimal load for at least all join queries that are acyclic or graph-like.

Cite as

Heba Aamer and Bas Ketsman. PAC: Computing Join Queries with Semi-Covers. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 6:1-6:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{aamer_et_al:LIPIcs.ICDT.2025.6,
  author =	{Aamer, Heba and Ketsman, Bas},
  title =	{{PAC: Computing Join Queries with Semi-Covers}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{6:1--6:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.6},
  URN =		{urn:nbn:de:0030-drops-229474},
  doi =		{10.4230/LIPIcs.ICDT.2025.6},
  annote =	{Keywords: Worst-case optimal load, MPC model, join queries}
}

Document

Database Theory in Action

DOI: 10.4230/LIPIcs.ICDT.2025.33

Database Theory in Action: Making Provenance and Probabilistic Database Theory Work in Practice (Invited Talk)

Authors: Silviu Maniu and Pierre Senellart

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

There has been a rich literature in database theory on how to model and manage the provenance of data (for instance using the semiring framework) and its uncertainty (in particular via probabilistic databases). In this article, we explain how these results have been used as the basis for practical implementations, notably in the ProvSQL system, and how these implementations need to be adapted for the efficient management of provenance and probability for real-world data.
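
As a rough illustration of the semiring framework mentioned in the abstract, the sketch below annotates tuples with semiring values: using two tuples jointly multiplies their annotations, and alternative derivations of the same answer are added. The relation encoding, the query, and all names are assumptions made for this example. It does not use or reproduce the ProvSQL API.

```python
from collections import namedtuple

# A commutative semiring: two constants and two binary operations.
Semiring = namedtuple("Semiring", "zero one plus times")

COUNTING = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)

def join_project(r, s, sr):
    """Annotated evaluation of Q(x) :- R(x, y), S(y).

    r maps (x, y) tuples to annotations and s maps y values to annotations.
    The join multiplies annotations; the projection onto x adds them.
    """
    out = {}
    for (x, y), a in r.items():
        if y in s:
            out[x] = sr.plus(out.get(x, sr.zero), sr.times(a, s[y]))
    return out

# Hypothetical data, annotated with multiplicities (counting semiring).
R = {("alice", "p1"): 2, ("alice", "p2"): 1, ("bob", "p2"): 3}
S = {"p1": 1, "p2": 2}
counts = join_project(R, S, COUNTING)
print(counts)  # {'alice': 4, 'bob': 6}

# The same query under the Boolean semiring only records whether an answer exists.
exists = join_project({k: True for k in R}, {"p1": True}, BOOLEAN)
print(exists)  # {'alice': True}
```

Swapping the semiring changes what the annotations mean (multiplicity, existence, and so on) without changing the query evaluation code.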

Cite as

Silviu Maniu and Pierre Senellart. Database Theory in Action: Making Provenance and Probabilistic Database Theory Work in Practice (Invited Talk). In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 33:1-33:6, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{maniu_et_al:LIPIcs.ICDT.2025.33,
  author =	{Maniu, Silviu and Senellart, Pierre},
  title =	{{Database Theory in Action: Making Provenance and Probabilistic Database Theory Work in Practice (Invited Talk)}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{33:1--33:6},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.33},
  URN =		{urn:nbn:de:0030-drops-229746},
  doi =		{10.4230/LIPIcs.ICDT.2025.33},
  annote =	{Keywords: provenance, probabilistic data, ProvSQL}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.32

The Free Termination Property of Queries over Time

Authors: Conor Power, Paraschos Koutris, and Joseph M. Hellerstein

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Building on prior work on distributed databases and the CALM Theorem, we define and study the question of free termination: in the absence of distributed coordination, what query properties allow nodes in a distributed (database) system to unilaterally terminate execution even though they may receive additional data or messages in the future? This completeness question is complementary to the soundness questions studied in the CALM literature. We also develop a new model based on semiautomata that allows us to bridge from the relational transducer model of the CALM papers to algebraic models that are popular among software engineers (e.g. CRDTs) and of increasing interest to database theory for datalog extensions and incremental view maintenance.

Cite as

Conor Power, Paraschos Koutris, and Joseph M. Hellerstein. The Free Termination Property of Queries over Time. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 32:1-32:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{power_et_al:LIPIcs.ICDT.2025.32,
  author =	{Power, Conor and Koutris, Paraschos and Hellerstein, Joseph M.},
  title =	{{The Free Termination Property of Queries over Time}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{32:1--32:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.32},
  URN =		{urn:nbn:de:0030-drops-229736},
  doi =		{10.4230/LIPIcs.ICDT.2025.32},
  annote =	{Keywords: distributed systems, algebraic data models, coordination-free systems}
}

Document

DOI: 10.4230/LIPIcs.CSL.2025.14

Finite Variable Counting Logics with Restricted Requantification

Authors: Simon Raßmann, Georg Schindling, and Pascal Schweitzer

Published in: LIPIcs, Volume 326, 33rd EACSL Annual Conference on Computer Science Logic (CSL 2025)

Abstract

Counting logics with a bounded number of variables form one of the central concepts in descriptive complexity theory. Although they restrict the number of variables that a formula can contain, the variables can be nested within scopes of quantified occurrences of themselves. In other words, the variables can be requantified. We study the fragments obtained from counting logics by restricting requantification for some but not necessarily all the variables. Similar to the logics without limitation on requantification, we develop tools to investigate the restricted variants. Specifically, we introduce a bijective pebble game in which certain pebbles can only be placed once and for all, and a corresponding two-parametric family of Weisfeiler-Leman algorithms. We show close correspondences between the three concepts. By using a suitable cops-and-robber game and adaptations of the Cai-Fürer-Immerman construction, we completely clarify the relative expressive power of the new logics. We show that the restriction of requantification has beneficial algorithmic implications in terms of graph identification. Indeed, we argue that with regard to space complexity, non-requantifiable variables only incur an additive polynomial factor when testing for equivalence. In contrast, for all we know, requantifiable variables incur a multiplicative linear factor. Finally, we observe that graphs of bounded tree-depth and 3-connected planar graphs can be identified using no, respectively, only a very limited number of requantifiable variables.

Cite as

Simon Raßmann, Georg Schindling, and Pascal Schweitzer. Finite Variable Counting Logics with Restricted Requantification. In 33rd EACSL Annual Conference on Computer Science Logic (CSL 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 326, pp. 14:1-14:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{ramann_et_al:LIPIcs.CSL.2025.14,
  author =	{Ra{\ss}mann, Simon and Schindling, Georg and Schweitzer, Pascal},
  title =	{{Finite Variable Counting Logics with Restricted Requantification}},
  booktitle =	{33rd EACSL Annual Conference on Computer Science Logic (CSL 2025)},
  pages =	{14:1--14:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-362-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{326},
  editor =	{Endrullis, J\"{o}rg and Schmitz, Sylvain},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CSL.2025.14},
  URN =		{urn:nbn:de:0030-drops-227716},
  doi =		{10.4230/LIPIcs.CSL.2025.14},
  annote =	{Keywords: Requantification, Finite variable counting logics, Weisfeiler-Leman algorithm}
}

Document

DOI: 10.4230/DagRep.14.3.9

Shapes in Graph Data: Theory and Implementation (Dagstuhl Seminar 24102)

Authors: Shqiponja Ahmetaj, Slawomir Staworko, Jan Van den Bussche, and Maxime Jakubowski

Published in: Dagstuhl Reports, Volume 14, Issue 3 (2024)

Abstract

This report documents the program and the outcomes of Dagstuhl Seminar "Shapes in Graph Data: Theory and Implementation" (24102). The seminar brought together active expert and junior researchers, both from academia and industry, to discuss the many open problems and research directions that arise from shapes in graph data, and, more generally, flexible and expressive schema and constraint languages for graph databases. The participants informed each other on how we perceive the research area, reported on the most recent results, discussed open problems and future directions, and in particular, four working groups were formed with promising intentions to work on new research and vision papers.

Cite as

Shqiponja Ahmetaj, Slawomir Staworko, Jan Van den Bussche, and Maxime Jakubowski. Shapes in Graph Data: Theory and Implementation (Dagstuhl Seminar 24102). In Dagstuhl Reports, Volume 14, Issue 3, pp. 9-30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@Article{ahmetaj_et_al:DagRep.14.3.9,
  author =	{Ahmetaj, Shqiponja and Staworko, Slawomir and Van den Bussche, Jan and Jakubowski, Maxime},
  title =	{{Shapes in Graph Data: Theory and Implementation (Dagstuhl Seminar 24102)}},
  pages =	{9--30},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2024},
  volume =	{14},
  number =	{3},
  editor =	{Ahmetaj, Shqiponja and Staworko, Slawomir and Van den Bussche, Jan and Jakubowski, Maxime},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.14.3.9},
  URN =		{urn:nbn:de:0030-drops-211818},
  doi =		{10.4230/DagRep.14.3.9},
  annote =	{Keywords: constraint languages, data for the semantic web, graph data, schema languages}
}

27 Search Results for "Van den Bussche, Jan"
