Search Results

Documents authored by Barceló, Pablo


Found 2 Possible Name Variants:

Barceló, Pablo

Document
Representation, Provenance, and Explanations in Database Theory and Logic (Dagstuhl Seminar 24032)

Authors: Pablo Barcelo, Pierre Bourhis, Stefan Mengel, and Sudeepa Roy

Published in: Dagstuhl Reports, Volume 14, Issue 1 (2024)


Abstract
This report documents the program and the outcomes of Dagstuhl Seminar "Representation, Provenance, and Explanations in Database Theory and Logic" (24032), which was broadly in the area of database theory. Database theory formalizes the theoretical underpinnings of databases and analyzes them with mathematical tools. We focused on questions related to the fundamental problem of efficient query evaluation: compute the answers of a query on a database. This seminar focused on three key aspects of query evaluations. (1) Representation studies the tradeoff between expressivity, compactness, and efficient computation of outputs from the inputs, including circuits and knowledge compilation forms, enumeration, and direct access. (2) Provenance captures the computation process of outputs from the inputs using a compact formula, and has applications to probabilistic databases. (3) Explanations give meaningful insights to responsibilities of different inputs toward an output beyond provenance, e.g., by using Shapley Values from co-operative game theory that has been recently popular in both DB and ML.

Cite as

Pablo Barcelo, Pierre Bourhis, Stefan Mengel, and Sudeepa Roy. Representation, Provenance, and Explanations in Database Theory and Logic (Dagstuhl Seminar 24032). In Dagstuhl Reports, Volume 14, Issue 1, pp. 49-71, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Article{barcelo_et_al:DagRep.14.1.49,
  author =	{Barcelo, Pablo and Bourhis, Pierre and Mengel, Stefan and Roy, Sudeepa},
  title =	{{Representation, Provenance, and Explanations in Database Theory and Logic (Dagstuhl Seminar 24032)}},
  pages =	{49--71},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2024},
  volume =	{14},
  number =	{1},
  editor =	{Barcelo, Pablo and Bourhis, Pierre and Mengel, Stefan and Roy, Sudeepa},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.14.1.49},
  URN =		{urn:nbn:de:0030-drops-204904},
  doi =		{10.4230/DagRep.14.1.49},
  annote =	{Keywords: Circuits, database theory, factorized databases, provenance, shapley values}
}
Document
Separating Automatic Relations

Authors: Pablo Barceló, Diego Figueira, and Rémi Morvan

Published in: LIPIcs, Volume 272, 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)


Abstract
We study the separability problem for automatic relations (i.e., relations on finite words definable by synchronous automata) in terms of recognizable relations (i.e., finite unions of products of regular languages). This problem takes as input two automatic relations R and R', and asks if there exists a recognizable relation S that contains R and does not intersect R'. We show this problem to be undecidable when the number of products allowed in the recognizable relation is fixed. In particular, checking if there exists a recognizable relation S with at most k products of regular languages that separates R from R' is undecidable, for each fixed k ⩾ 2. Our proofs reveal tight connections, of independent interest, between the separability problem and the finite coloring problem for automatic graphs, where colors are regular languages.

Cite as

Pablo Barceló, Diego Figueira, and Rémi Morvan. Separating Automatic Relations. In 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 272, pp. 17:1-17:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.MFCS.2023.17,
  author =	{Barcel\'{o}, Pablo and Figueira, Diego and Morvan, R\'{e}mi},
  title =	{{Separating Automatic Relations}},
  booktitle =	{48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)},
  pages =	{17:1--17:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-292-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{272},
  editor =	{Leroux, J\'{e}r\^{o}me and Lombardy, Sylvain and Peleg, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2023.17},
  URN =		{urn:nbn:de:0030-drops-185514},
  doi =		{10.4230/LIPIcs.MFCS.2023.17},
  annote =	{Keywords: Automatic relations, recognizable relations, separability, finite colorability}
}
Document
Invited Talk
Explainability Queries for ML Models and its Connections with Data Management Problems (Invited Talk)

Authors: Pablo Barceló

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)


Abstract
In this talk I will present two recent examples of my research on explainability problems over machine learning (ML) models. In rough terms, these explainability problems deal with specific queries one poses over a ML model in order to obtain meaningful justifications for their results. Both of the examples I will present deal with “local” and “post-hoc” explainability queries. Here “local” means that we intend to explain the output of the ML model for a particular input, while “post-hoc” refers to the fact that the explanation is obtained after the model is trained. In the process I will also establish connections with problems studied in data management. This with the intention of suggesting new possibilities for cross-fertilization between the area and ML. The first example I will present refers to computing explanations with scores based on Shapley values, in particular with the recently proposed, and already influential, SHAP-score. This score provides a measure of how different features in the input contribute to the output of the ML model. We provide a detailed analysis of the complexity of this problem for different classes of Boolean circuits. In particular, we show that the problem of computing SHAP-scores is tractable as long as the circuit is deterministic and decomposable, but becomes computationally hard if any of these restrictions is lifted. The tractability part of this result provides a generalization of a recent result stating that, for Boolean hierarchical conjunctive queries, the Shapley-value of the contribution of a tuple in the database to the final result can be computed in polynomial time. The second example I will present refers to the comparison of different ML models in terms of important families of (local and post-hoc) explainability queries. For the models, I will consider multi-layer perceptrons and binary decision diagrams. The main object of study will be the computational complexity of the aforementioned queries over such models. The obtained results will show an interesting theoretical counterpart to wisdom’s claims on interpretability. This work also suggests the need for developing query languages that support the process of retrieving explanations from ML models, and also for obtaining general tractability results for such languages over specific classes of models.

Cite as

Pablo Barceló. Explainability Queries for ML Models and its Connections with Data Management Problems (Invited Talk). In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{barcelo:LIPIcs.ICDT.2021.1,
  author =	{Barcel\'{o}, Pablo},
  title =	{{Explainability Queries for ML Models and its Connections with Data Management Problems}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.1},
  URN =		{urn:nbn:de:0030-drops-137091},
  doi =		{10.4230/LIPIcs.ICDT.2021.1},
  annote =	{Keywords: ML models, Explainability, Shapley values, decision trees, OBDDs, deterministic and decomposable Boolean circuits}
}
Document
On the Expressiveness of LARA: A Unified Language for Linear and Relational Algebra

Authors: Pablo Barceló, Nelson Higuera, Jorge Pérez, and Bernardo Subercaseaux

Published in: LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)


Abstract
We study the expressive power of the Lara language - a recently proposed unified model for expressing relational and linear algebra operations - both in terms of traditional database query languages and some analytic tasks often performed in machine learning pipelines. We start by showing Lara to be expressive complete with respect to first-order logic with aggregation. Since Lara is parameterized by a set of user-defined functions which allow to transform values in tables, the exact expressive power of the language depends on how these functions are defined. We distinguish two main cases depending on the level of genericity queries are enforced to satisfy. Under strong genericity assumptions the language cannot express matrix convolution, a very important operation in current machine learning operations. This language is also local, and thus cannot express operations such as matrix inverse that exhibit a recursive behavior. For expressing convolution, one can relax the genericity requirement by adding an underlying linear order on the domain. This, however, destroys locality and turns the expressive power of the language much more difficult to understand. In particular, although under complexity assumptions the resulting language can still not express matrix inverse, a proof of this fact without such assumptions seems challenging to obtain.

Cite as

Pablo Barceló, Nelson Higuera, Jorge Pérez, and Bernardo Subercaseaux. On the Expressiveness of LARA: A Unified Language for Linear and Relational Algebra. In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, pp. 6:1-6:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2020.6,
  author =	{Barcel\'{o}, Pablo and Higuera, Nelson and P\'{e}rez, Jorge and Subercaseaux, Bernardo},
  title =	{{On the Expressiveness of LARA: A Unified Language for Linear and Relational Algebra}},
  booktitle =	{23rd International Conference on Database Theory (ICDT 2020)},
  pages =	{6:1--6:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-139-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{155},
  editor =	{Lutz, Carsten and Jung, Jean Christoph},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.6},
  URN =		{urn:nbn:de:0030-drops-119305},
  doi =		{10.4230/LIPIcs.ICDT.2020.6},
  annote =	{Keywords: languages for linear and relational algebra, expressive power, first order logic with aggregation, matrix convolution, matrix inverse, query genericity, locality of queries, safety}
}
Document
Track B: Automata, Logic, Semantics, and Theory of Programming
Monadic Decomposability of Regular Relations (Track B: Automata, Logic, Semantics, and Theory of Programming)

Authors: Pablo Barceló, Chih-Duo Hong, Xuan-Bach Le, Anthony W. Lin, and Reino Niskanen

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
Monadic decomposibility - the ability to determine whether a formula in a given logical theory can be decomposed into a boolean combination of monadic formulas - is a powerful tool for devising a decision procedure for a given logical theory. In this paper, we revisit a classical decision problem in automata theory: given a regular (a.k.a. synchronized rational) relation, determine whether it is recognizable, i.e., it has a monadic decomposition (that is, a representation as a boolean combination of cartesian products of regular languages). Regular relations are expressive formalisms which, using an appropriate string encoding, can capture relations definable in Presburger Arithmetic. In fact, their expressive power coincide with relations definable in a universal automatic structure; equivalently, those definable by finite set interpretations in WS1S (Weak Second Order Theory of One Successor). Determining whether a regular relation admits a recognizable relation was known to be decidable (and in exponential time for binary relations), but its precise complexity still hitherto remains open. Our main contribution is to fully settle the complexity of this decision problem by developing new techniques employing infinite Ramsey theory. The complexity for DFA (resp. NFA) representations of regular relations is shown to be NLOGSPACE-complete (resp. PSPACE-complete).

Cite as

Pablo Barceló, Chih-Duo Hong, Xuan-Bach Le, Anthony W. Lin, and Reino Niskanen. Monadic Decomposability of Regular Relations (Track B: Automata, Logic, Semantics, and Theory of Programming). In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 103:1-103:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICALP.2019.103,
  author =	{Barcel\'{o}, Pablo and Hong, Chih-Duo and Le, Xuan-Bach and Lin, Anthony W. and Niskanen, Reino},
  title =	{{Monadic Decomposability of Regular Relations}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{103:1--103:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.103},
  URN =		{urn:nbn:de:0030-drops-106790},
  doi =		{10.4230/LIPIcs.ICALP.2019.103},
  annote =	{Keywords: Transducers, Automata, Synchronized Rational Relations, Ramsey Theory, Variable Independence, Automatic Structures}
}
Document
Track B: Automata, Logic, Semantics, and Theory of Programming
Boundedness of Conjunctive Regular Path Queries (Track B: Automata, Logic, Semantics, and Theory of Programming)

Authors: Pablo Barceló, Diego Figueira, and Miguel Romero

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
We study the boundedness problem for unions of conjunctive regular path queries with inverses (UC2RPQs). This is the problem of, given a UC2RPQ, checking whether it is equivalent to a union of conjunctive queries (UCQ). We show the problem to be ExpSpace-complete, thus coinciding with the complexity of containment for UC2RPQs. As a corollary, when a UC2RPQ is bounded, it is equivalent to a UCQ of at most triple-exponential size, and in fact we show that this bound is optimal. We also study better behaved classes of UC2RPQs, namely acyclic UC2RPQs of bounded thickness, and strongly connected UCRPQs, whose boundedness problem is, respectively, PSpace-complete and Pi_2^P-complete. Most upper bounds exploit results on limitedness for distance automata, in particular extending the model with alternation and two-wayness, which may be of independent interest.

Cite as

Pablo Barceló, Diego Figueira, and Miguel Romero. Boundedness of Conjunctive Regular Path Queries (Track B: Automata, Logic, Semantics, and Theory of Programming). In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 104:1-104:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICALP.2019.104,
  author =	{Barcel\'{o}, Pablo and Figueira, Diego and Romero, Miguel},
  title =	{{Boundedness of Conjunctive Regular Path Queries}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{104:1--104:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.104},
  URN =		{urn:nbn:de:0030-drops-106803},
  doi =		{10.4230/LIPIcs.ICALP.2019.104},
  annote =	{Keywords: regular path queries, boundedness, limitedness, distance automata}
}
Document
Complete Volume
LIPIcs, Volume 127, ICDT'19, Complete Volume

Authors: Pablo Barcelo and Marco Calautti

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)


Abstract
LIPIcs, Volume 127, ICDT'19, Complete Volume

Cite as

22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@Proceedings{barcelo_et_al:LIPIcs.ICDT.2019,
  title =	{{LIPIcs, Volume 127, ICDT'19, Complete Volume}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019},
  URN =		{urn:nbn:de:0030-drops-103630},
  doi =		{10.4230/LIPIcs.ICDT.2019},
  annote =	{Keywords: Computing Methodologies, Knowledge Representation and Reasoning, Theory of computation, Data modeling, Incomplete, inconsistent and uncertain database Information systems, Data management systems, Data streams, Database query processing, Incomplete data, Inconsistent data, Relational database model}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Pablo Barcelo and Marco Calautti

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 0:i-0:xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2019.0,
  author =	{Barcelo, Pablo and Calautti, Marco},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{0:i--0:xvi},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.0},
  URN =		{urn:nbn:de:0030-drops-103020},
  doi =		{10.4230/LIPIcs.ICDT.2019.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)

Authors: Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, and Ke Yi

Published in: Dagstuhl Manifestos, Volume 7, Issue 1 (2018)


Abstract
The area of Principles of Data Management (PDM) has made crucial contributions to the development of formal frameworks for understanding and managing data and knowledge. This work has involved a rich cross-fertilization between PDM and other disciplines in mathematics and computer science, including logic, complexity theory, and knowledge representation. We anticipate on-going expansion of PDM research as the technology and applications involving data management continue to grow and evolve. In particular, the lifecycle of Big Data Analytics raises a wealth of challenge areas that PDM can help with. In this report we identify some of the most important research directions where the PDM community has the potential to make significant contributions. This is done from three perspectives: potential practical relevance, results already obtained, and research questions that appear surmountable in the short and medium term.

Cite as

Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, and Ke Yi. Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151). In Dagstuhl Manifestos, Volume 7, Issue 1, pp. 1-29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@Article{abiteboul_et_al:DagMan.7.1.1,
  author =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  title =	{{Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)}},
  pages =	{1--29},
  journal =	{Dagstuhl Manifestos},
  ISSN =	{2193-2433},
  year =	{2018},
  volume =	{7},
  number =	{1},
  editor =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagMan.7.1.1},
  URN =		{urn:nbn:de:0030-drops-86772},
  doi =		{10.4230/DagMan.7.1.1},
  annote =	{Keywords: database theory, principles of data management, query languages, efficient query processing, query optimization, heterogeneous data, uncertainty, knowledge-enriched data management, machine learning, workflows, human-related data, ethics}
}
Document
A More General Theory of Static Approximations for Conjunctive Queries

Authors: Pablo Barceló, Miguel Romero, and Thomas Zeume

Published in: LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)


Abstract
Conjunctive query (CQ) evaluation is NP-complete, but becomes tractable for fragments of bounded hypertreewidth. If a CQ is hard to evaluate, it is thus useful to evaluate an approximation of it in such fragments. While underapproximations (i.e., those that return correct answers only) are well-understood, the dual notion of overapproximations that return complete (but not necessarily sound) answers, and also a more general notion of approximation based on the symmetric difference of query results, are almost unexplored. In fact, the decidability of the basic problems of evaluation, identification, and existence of those approximations, is open. We develop a connection with existential pebble game tools that allows the systematic study of such problems. In particular, we show that the evaluation and identification of overapproximations can be solved in polynomial time. We also make progress in the problem of existence of overapproximations, showing it to be decidable in 2EXPTIME over the class of acyclic CQs. Furthermore, we look at when overapproximations do not exist, suggesting that this can be alleviated by using a more liberal notion of overapproximation. We also show how to extend our tools to study symmetric difference approximations. We observe that such approximations properly extend under- and over-approximations, settle the complexity of its associated identification problem, and provide several results on existence and evaluation.

Cite as

Pablo Barceló, Miguel Romero, and Thomas Zeume. A More General Theory of Static Approximations for Conjunctive Queries. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 7:1-7:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2018.7,
  author =	{Barcel\'{o}, Pablo and Romero, Miguel and Zeume, Thomas},
  title =	{{A More General Theory of Static Approximations for Conjunctive Queries}},
  booktitle =	{21st International Conference on Database Theory (ICDT 2018)},
  pages =	{7:1--7:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-063-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{98},
  editor =	{Kimelfeld, Benny and Amsterdamer, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.7},
  URN =		{urn:nbn:de:0030-drops-86021},
  doi =		{10.4230/LIPIcs.ICDT.2018.7},
  annote =	{Keywords: conjunctive queries, hypertreewidth, approximations, pebble games}
}
Document
The Complexity of Reverse Engineering Problems for Conjunctive Queries

Authors: Pablo Barceló and Miguel Romero

Published in: LIPIcs, Volume 68, 20th International Conference on Database Theory (ICDT 2017)


Abstract
Reverse engineering problems for conjunctive queries (CQs), such as query by example (QBE) or definability, take a set of user examples and convert them into an explanatory CQ. Despite their importance, the complexity of these problems is prohibitively high (coNEXPTIME-complete). We isolate their two main sources of complexity and propose relaxations of them that reduce the complexity while having meaningful theoretical interpretations. The first relaxation is based on the idea of using existential pebble games for approximating homomorphism tests. We show that this characterizes QBE/definability for CQs up to treewidth k, while reducing the complexity to EXPTIME. As a side result, we obtain that the complexity of the QBE/definability problems for CQs of treewidth k is EXPTIME-complete for each k > 1. The second relaxation is based on the idea of "desynchronizing" direct products, which characterizes QBE/definability for unions of CQs and reduces the complexity to coNP. The combination of these two relaxations yields tractability for QBE and characterizes it in terms of unions of CQs of treewidth at most k. We also study the complexity of these problems for conjunctive regular path queries over graph databases, showing them to be no more difficult than for CQs.

Cite as

Pablo Barceló and Miguel Romero. The Complexity of Reverse Engineering Problems for Conjunctive Queries. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 7:1-7:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2017.7,
  author =	{Barcel\'{o}, Pablo and Romero, Miguel},
  title =	{{The Complexity of Reverse Engineering Problems for Conjunctive Queries}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{7:1--7:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.7},
  URN =		{urn:nbn:de:0030-drops-70525},
  doi =		{10.4230/LIPIcs.ICDT.2017.7},
  annote =	{Keywords: reverse engineering, conjunctive queries, query by example, definability, treewidth, complexity of pebble games}
}
Document
On the Data Complexity of Consistent Query Answering over Graph Databases

Authors: Pablo Barceló and Gaëlle Fontaine

Published in: LIPIcs, Volume 31, 18th International Conference on Database Theory (ICDT 2015)


Abstract
Areas in which graph databases are applied - such as the semantic web, social networks and scientific databases - are prone to inconsistency, mainly due to interoperability issues. This raises the need for understanding query answering over inconsistent graph databases in a framework that is simple yet general enough to accommodate many of its applications. We follow the well-known approach of consistent query answering (CQA), and study the data complexity of CQA over graph databases for regular path queries (RPQs) and regular path constraints (RPCs), which are frequently used. We concentrate on subset, superset and symmetric difference repairs. Without further restrictions, CQA is undecidable for the semantics based on superset and symmetric difference repairs, and Pi_2^P-complete for subset repairs. However, we provide several tractable restrictions on both RPCs and the structure of graph databases that lead to decidability, and even tractability of CQA. We also compare our results with those obtained for CQA in the context of relational databases.

Cite as

Pablo Barceló and Gaëlle Fontaine. On the Data Complexity of Consistent Query Answering over Graph Databases. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 380-397, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2015.380,
  author =	{Barcel\'{o}, Pablo and Fontaine, Ga\"{e}lle},
  title =	{{On the Data Complexity of Consistent Query Answering over Graph Databases}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{380--397},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.380},
  URN =		{urn:nbn:de:0030-drops-49962},
  doi =		{10.4230/LIPIcs.ICDT.2015.380},
  annote =	{Keywords: graph databases, regular path queries, consistent query answering, description logics, rewrite systems}
}
Document
Parameterized Regular Expressions and Their Languages

Authors: Pablo Barceló, Leonid Libkin, and Juan L. Reutter

Published in: LIPIcs, Volume 13, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011)


Abstract
We study regular expressions that use variables, or parameters, which are interpreted as alphabet letters. We consider two classes of languages denoted by such expressions: under the possibility semantics, a word belongs to the language if it is denoted by some regular expression obtained by replacing variables with letters; under the certainty semantics, the word must be denoted by every such expression. Such languages are regular, and we show that they naturally arise in several applications such as querying graph databases and program analysis. As the main contribution of the paper, we provide a complete characterization of the complexity of the main computational problems related to such languages: nonemptiness, universality, containment, membership, as well as the problem of constructing NFAs capturing such languages. We also look at the extension when domains of variables could be arbitrary regular languages, and show that under the certainty semantics, languages remain regular and the complexity of the main computational problems does not change.

Cite as

Pablo Barceló, Leonid Libkin, and Juan L. Reutter. Parameterized Regular Expressions and Their Languages. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011). Leibniz International Proceedings in Informatics (LIPIcs), Volume 13, pp. 351-362, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2011)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.FSTTCS.2011.351,
  author =	{Barcel\'{o}, Pablo and Libkin, Leonid and Reutter, Juan L.},
  title =	{{Parameterized Regular Expressions and Their Languages}},
  booktitle =	{IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011)},
  pages =	{351--362},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-34-7},
  ISSN =	{1868-8969},
  year =	{2011},
  volume =	{13},
  editor =	{Chakraborty, Supratik and Kumar, Amit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2011.351},
  URN =		{urn:nbn:de:0030-drops-33334},
  doi =		{10.4230/LIPIcs.FSTTCS.2011.351},
  annote =	{Keywords: Regular expressions, complexity, decision problems, regular languages}
}

Barcelo, Pablo

Document
Representation, Provenance, and Explanations in Database Theory and Logic (Dagstuhl Seminar 24032)

Authors: Pablo Barcelo, Pierre Bourhis, Stefan Mengel, and Sudeepa Roy

Published in: Dagstuhl Reports, Volume 14, Issue 1 (2024)


Abstract
This report documents the program and the outcomes of Dagstuhl Seminar "Representation, Provenance, and Explanations in Database Theory and Logic" (24032), which was broadly in the area of database theory. Database theory formalizes the theoretical underpinnings of databases and analyzes them with mathematical tools. We focused on questions related to the fundamental problem of efficient query evaluation: compute the answers of a query on a database. This seminar focused on three key aspects of query evaluations. (1) Representation studies the tradeoff between expressivity, compactness, and efficient computation of outputs from the inputs, including circuits and knowledge compilation forms, enumeration, and direct access. (2) Provenance captures the computation process of outputs from the inputs using a compact formula, and has applications to probabilistic databases. (3) Explanations give meaningful insights to responsibilities of different inputs toward an output beyond provenance, e.g., by using Shapley Values from co-operative game theory that has been recently popular in both DB and ML.

Cite as

Pablo Barcelo, Pierre Bourhis, Stefan Mengel, and Sudeepa Roy. Representation, Provenance, and Explanations in Database Theory and Logic (Dagstuhl Seminar 24032). In Dagstuhl Reports, Volume 14, Issue 1, pp. 49-71, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Article{barcelo_et_al:DagRep.14.1.49,
  author =	{Barcelo, Pablo and Bourhis, Pierre and Mengel, Stefan and Roy, Sudeepa},
  title =	{{Representation, Provenance, and Explanations in Database Theory and Logic (Dagstuhl Seminar 24032)}},
  pages =	{49--71},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2024},
  volume =	{14},
  number =	{1},
  editor =	{Barcelo, Pablo and Bourhis, Pierre and Mengel, Stefan and Roy, Sudeepa},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.14.1.49},
  URN =		{urn:nbn:de:0030-drops-204904},
  doi =		{10.4230/DagRep.14.1.49},
  annote =	{Keywords: Circuits, database theory, factorized databases, provenance, shapley values}
}
Document
Separating Automatic Relations

Authors: Pablo Barceló, Diego Figueira, and Rémi Morvan

Published in: LIPIcs, Volume 272, 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)


Abstract
We study the separability problem for automatic relations (i.e., relations on finite words definable by synchronous automata) in terms of recognizable relations (i.e., finite unions of products of regular languages). This problem takes as input two automatic relations R and R', and asks if there exists a recognizable relation S that contains R and does not intersect R'. We show this problem to be undecidable when the number of products allowed in the recognizable relation is fixed. In particular, checking if there exists a recognizable relation S with at most k products of regular languages that separates R from R' is undecidable, for each fixed k ⩾ 2. Our proofs reveal tight connections, of independent interest, between the separability problem and the finite coloring problem for automatic graphs, where colors are regular languages.

Cite as

Pablo Barceló, Diego Figueira, and Rémi Morvan. Separating Automatic Relations. In 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 272, pp. 17:1-17:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.MFCS.2023.17,
  author =	{Barcel\'{o}, Pablo and Figueira, Diego and Morvan, R\'{e}mi},
  title =	{{Separating Automatic Relations}},
  booktitle =	{48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)},
  pages =	{17:1--17:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-292-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{272},
  editor =	{Leroux, J\'{e}r\^{o}me and Lombardy, Sylvain and Peleg, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2023.17},
  URN =		{urn:nbn:de:0030-drops-185514},
  doi =		{10.4230/LIPIcs.MFCS.2023.17},
  annote =	{Keywords: Automatic relations, recognizable relations, separability, finite colorability}
}
Document
Invited Talk
Explainability Queries for ML Models and its Connections with Data Management Problems (Invited Talk)

Authors: Pablo Barceló

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)


Abstract
In this talk I will present two recent examples of my research on explainability problems over machine learning (ML) models. In rough terms, these explainability problems deal with specific queries one poses over a ML model in order to obtain meaningful justifications for their results. Both of the examples I will present deal with “local” and “post-hoc” explainability queries. Here “local” means that we intend to explain the output of the ML model for a particular input, while “post-hoc” refers to the fact that the explanation is obtained after the model is trained. In the process I will also establish connections with problems studied in data management. This with the intention of suggesting new possibilities for cross-fertilization between the area and ML. The first example I will present refers to computing explanations with scores based on Shapley values, in particular with the recently proposed, and already influential, SHAP-score. This score provides a measure of how different features in the input contribute to the output of the ML model. We provide a detailed analysis of the complexity of this problem for different classes of Boolean circuits. In particular, we show that the problem of computing SHAP-scores is tractable as long as the circuit is deterministic and decomposable, but becomes computationally hard if any of these restrictions is lifted. The tractability part of this result provides a generalization of a recent result stating that, for Boolean hierarchical conjunctive queries, the Shapley-value of the contribution of a tuple in the database to the final result can be computed in polynomial time. The second example I will present refers to the comparison of different ML models in terms of important families of (local and post-hoc) explainability queries. For the models, I will consider multi-layer perceptrons and binary decision diagrams. The main object of study will be the computational complexity of the aforementioned queries over such models. The obtained results will show an interesting theoretical counterpart to wisdom’s claims on interpretability. This work also suggests the need for developing query languages that support the process of retrieving explanations from ML models, and also for obtaining general tractability results for such languages over specific classes of models.

Cite as

Pablo Barceló. Explainability Queries for ML Models and its Connections with Data Management Problems (Invited Talk). In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{barcelo:LIPIcs.ICDT.2021.1,
  author =	{Barcel\'{o}, Pablo},
  title =	{{Explainability Queries for ML Models and its Connections with Data Management Problems}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.1},
  URN =		{urn:nbn:de:0030-drops-137091},
  doi =		{10.4230/LIPIcs.ICDT.2021.1},
  annote =	{Keywords: ML models, Explainability, Shapley values, decision trees, OBDDs, deterministic and decomposable Boolean circuits}
}
Document
On the Expressiveness of LARA: A Unified Language for Linear and Relational Algebra

Authors: Pablo Barceló, Nelson Higuera, Jorge Pérez, and Bernardo Subercaseaux

Published in: LIPIcs, Volume 155, 23rd International Conference on Database Theory (ICDT 2020)


Abstract
We study the expressive power of the Lara language - a recently proposed unified model for expressing relational and linear algebra operations - both in terms of traditional database query languages and some analytic tasks often performed in machine learning pipelines. We start by showing Lara to be expressive complete with respect to first-order logic with aggregation. Since Lara is parameterized by a set of user-defined functions which allow to transform values in tables, the exact expressive power of the language depends on how these functions are defined. We distinguish two main cases depending on the level of genericity queries are enforced to satisfy. Under strong genericity assumptions the language cannot express matrix convolution, a very important operation in current machine learning operations. This language is also local, and thus cannot express operations such as matrix inverse that exhibit a recursive behavior. For expressing convolution, one can relax the genericity requirement by adding an underlying linear order on the domain. This, however, destroys locality and turns the expressive power of the language much more difficult to understand. In particular, although under complexity assumptions the resulting language can still not express matrix inverse, a proof of this fact without such assumptions seems challenging to obtain.

Cite as

Pablo Barceló, Nelson Higuera, Jorge Pérez, and Bernardo Subercaseaux. On the Expressiveness of LARA: A Unified Language for Linear and Relational Algebra. In 23rd International Conference on Database Theory (ICDT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 155, pp. 6:1-6:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2020.6,
  author =	{Barcel\'{o}, Pablo and Higuera, Nelson and P\'{e}rez, Jorge and Subercaseaux, Bernardo},
  title =	{{On the Expressiveness of LARA: A Unified Language for Linear and Relational Algebra}},
  booktitle =	{23rd International Conference on Database Theory (ICDT 2020)},
  pages =	{6:1--6:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-139-9},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{155},
  editor =	{Lutz, Carsten and Jung, Jean Christoph},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2020.6},
  URN =		{urn:nbn:de:0030-drops-119305},
  doi =		{10.4230/LIPIcs.ICDT.2020.6},
  annote =	{Keywords: languages for linear and relational algebra, expressive power, first order logic with aggregation, matrix convolution, matrix inverse, query genericity, locality of queries, safety}
}
Document
Track B: Automata, Logic, Semantics, and Theory of Programming
Monadic Decomposability of Regular Relations (Track B: Automata, Logic, Semantics, and Theory of Programming)

Authors: Pablo Barceló, Chih-Duo Hong, Xuan-Bach Le, Anthony W. Lin, and Reino Niskanen

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
Monadic decomposibility - the ability to determine whether a formula in a given logical theory can be decomposed into a boolean combination of monadic formulas - is a powerful tool for devising a decision procedure for a given logical theory. In this paper, we revisit a classical decision problem in automata theory: given a regular (a.k.a. synchronized rational) relation, determine whether it is recognizable, i.e., it has a monadic decomposition (that is, a representation as a boolean combination of cartesian products of regular languages). Regular relations are expressive formalisms which, using an appropriate string encoding, can capture relations definable in Presburger Arithmetic. In fact, their expressive power coincide with relations definable in a universal automatic structure; equivalently, those definable by finite set interpretations in WS1S (Weak Second Order Theory of One Successor). Determining whether a regular relation admits a recognizable relation was known to be decidable (and in exponential time for binary relations), but its precise complexity still hitherto remains open. Our main contribution is to fully settle the complexity of this decision problem by developing new techniques employing infinite Ramsey theory. The complexity for DFA (resp. NFA) representations of regular relations is shown to be NLOGSPACE-complete (resp. PSPACE-complete).

Cite as

Pablo Barceló, Chih-Duo Hong, Xuan-Bach Le, Anthony W. Lin, and Reino Niskanen. Monadic Decomposability of Regular Relations (Track B: Automata, Logic, Semantics, and Theory of Programming). In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 103:1-103:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICALP.2019.103,
  author =	{Barcel\'{o}, Pablo and Hong, Chih-Duo and Le, Xuan-Bach and Lin, Anthony W. and Niskanen, Reino},
  title =	{{Monadic Decomposability of Regular Relations}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{103:1--103:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.103},
  URN =		{urn:nbn:de:0030-drops-106790},
  doi =		{10.4230/LIPIcs.ICALP.2019.103},
  annote =	{Keywords: Transducers, Automata, Synchronized Rational Relations, Ramsey Theory, Variable Independence, Automatic Structures}
}
Document
Track B: Automata, Logic, Semantics, and Theory of Programming
Boundedness of Conjunctive Regular Path Queries (Track B: Automata, Logic, Semantics, and Theory of Programming)

Authors: Pablo Barceló, Diego Figueira, and Miguel Romero

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
We study the boundedness problem for unions of conjunctive regular path queries with inverses (UC2RPQs). This is the problem of, given a UC2RPQ, checking whether it is equivalent to a union of conjunctive queries (UCQ). We show the problem to be ExpSpace-complete, thus coinciding with the complexity of containment for UC2RPQs. As a corollary, when a UC2RPQ is bounded, it is equivalent to a UCQ of at most triple-exponential size, and in fact we show that this bound is optimal. We also study better behaved classes of UC2RPQs, namely acyclic UC2RPQs of bounded thickness, and strongly connected UCRPQs, whose boundedness problem is, respectively, PSpace-complete and Pi_2^P-complete. Most upper bounds exploit results on limitedness for distance automata, in particular extending the model with alternation and two-wayness, which may be of independent interest.

Cite as

Pablo Barceló, Diego Figueira, and Miguel Romero. Boundedness of Conjunctive Regular Path Queries (Track B: Automata, Logic, Semantics, and Theory of Programming). In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 104:1-104:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICALP.2019.104,
  author =	{Barcel\'{o}, Pablo and Figueira, Diego and Romero, Miguel},
  title =	{{Boundedness of Conjunctive Regular Path Queries}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{104:1--104:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.104},
  URN =		{urn:nbn:de:0030-drops-106803},
  doi =		{10.4230/LIPIcs.ICALP.2019.104},
  annote =	{Keywords: regular path queries, boundedness, limitedness, distance automata}
}
Document
Complete Volume
LIPIcs, Volume 127, ICDT'19, Complete Volume

Authors: Pablo Barcelo and Marco Calautti

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)


Abstract
LIPIcs, Volume 127, ICDT'19, Complete Volume

Cite as

22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@Proceedings{barcelo_et_al:LIPIcs.ICDT.2019,
  title =	{{LIPIcs, Volume 127, ICDT'19, Complete Volume}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019},
  URN =		{urn:nbn:de:0030-drops-103630},
  doi =		{10.4230/LIPIcs.ICDT.2019},
  annote =	{Keywords: Computing Methodologies, Knowledge Representation and Reasoning, Theory of computation, Data modeling, Incomplete, inconsistent and uncertain database Information systems, Data management systems, Data streams, Database query processing, Incomplete data, Inconsistent data, Relational database model}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Pablo Barcelo and Marco Calautti

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 0:i-0:xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2019.0,
  author =	{Barcelo, Pablo and Calautti, Marco},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{0:i--0:xvi},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.0},
  URN =		{urn:nbn:de:0030-drops-103020},
  doi =		{10.4230/LIPIcs.ICDT.2019.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)

Authors: Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, and Ke Yi

Published in: Dagstuhl Manifestos, Volume 7, Issue 1 (2018)


Abstract
The area of Principles of Data Management (PDM) has made crucial contributions to the development of formal frameworks for understanding and managing data and knowledge. This work has involved a rich cross-fertilization between PDM and other disciplines in mathematics and computer science, including logic, complexity theory, and knowledge representation. We anticipate on-going expansion of PDM research as the technology and applications involving data management continue to grow and evolve. In particular, the lifecycle of Big Data Analytics raises a wealth of challenge areas that PDM can help with. In this report we identify some of the most important research directions where the PDM community has the potential to make significant contributions. This is done from three perspectives: potential practical relevance, results already obtained, and research questions that appear surmountable in the short and medium term.

Cite as

Serge Abiteboul, Marcelo Arenas, Pablo Barceló, Meghyn Bienvenu, Diego Calvanese, Claire David, Richard Hull, Eyke Hüllermeier, Benny Kimelfeld, Leonid Libkin, Wim Martens, Tova Milo, Filip Murlak, Frank Neven, Magdalena Ortiz, Thomas Schwentick, Julia Stoyanovich, Jianwen Su, Dan Suciu, Victor Vianu, and Ke Yi. Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151). In Dagstuhl Manifestos, Volume 7, Issue 1, pp. 1-29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@Article{abiteboul_et_al:DagMan.7.1.1,
  author =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  title =	{{Research Directions for Principles of Data Management (Dagstuhl Perspectives Workshop 16151)}},
  pages =	{1--29},
  journal =	{Dagstuhl Manifestos},
  ISSN =	{2193-2433},
  year =	{2018},
  volume =	{7},
  number =	{1},
  editor =	{Abiteboul, Serge and Arenas, Marcelo and Barcel\'{o}, Pablo and Bienvenu, Meghyn and Calvanese, Diego and David, Claire and Hull, Richard and H\"{u}llermeier, Eyke and Kimelfeld, Benny and Libkin, Leonid and Martens, Wim and Milo, Tova and Murlak, Filip and Neven, Frank and Ortiz, Magdalena and Schwentick, Thomas and Stoyanovich, Julia and Su, Jianwen and Suciu, Dan and Vianu, Victor and Yi, Ke},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagMan.7.1.1},
  URN =		{urn:nbn:de:0030-drops-86772},
  doi =		{10.4230/DagMan.7.1.1},
  annote =	{Keywords: database theory, principles of data management, query languages, efficient query processing, query optimization, heterogeneous data, uncertainty, knowledge-enriched data management, machine learning, workflows, human-related data, ethics}
}
Document
A More General Theory of Static Approximations for Conjunctive Queries

Authors: Pablo Barceló, Miguel Romero, and Thomas Zeume

Published in: LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)


Abstract
Conjunctive query (CQ) evaluation is NP-complete, but becomes tractable for fragments of bounded hypertreewidth. If a CQ is hard to evaluate, it is thus useful to evaluate an approximation of it in such fragments. While underapproximations (i.e., those that return correct answers only) are well-understood, the dual notion of overapproximations that return complete (but not necessarily sound) answers, and also a more general notion of approximation based on the symmetric difference of query results, are almost unexplored. In fact, the decidability of the basic problems of evaluation, identification, and existence of those approximations, is open. We develop a connection with existential pebble game tools that allows the systematic study of such problems. In particular, we show that the evaluation and identification of overapproximations can be solved in polynomial time. We also make progress in the problem of existence of overapproximations, showing it to be decidable in 2EXPTIME over the class of acyclic CQs. Furthermore, we look at when overapproximations do not exist, suggesting that this can be alleviated by using a more liberal notion of overapproximation. We also show how to extend our tools to study symmetric difference approximations. We observe that such approximations properly extend under- and over-approximations, settle the complexity of its associated identification problem, and provide several results on existence and evaluation.

Cite as

Pablo Barceló, Miguel Romero, and Thomas Zeume. A More General Theory of Static Approximations for Conjunctive Queries. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 7:1-7:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2018.7,
  author =	{Barcel\'{o}, Pablo and Romero, Miguel and Zeume, Thomas},
  title =	{{A More General Theory of Static Approximations for Conjunctive Queries}},
  booktitle =	{21st International Conference on Database Theory (ICDT 2018)},
  pages =	{7:1--7:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-063-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{98},
  editor =	{Kimelfeld, Benny and Amsterdamer, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.7},
  URN =		{urn:nbn:de:0030-drops-86021},
  doi =		{10.4230/LIPIcs.ICDT.2018.7},
  annote =	{Keywords: conjunctive queries, hypertreewidth, approximations, pebble games}
}
Document
The Complexity of Reverse Engineering Problems for Conjunctive Queries

Authors: Pablo Barceló and Miguel Romero

Published in: LIPIcs, Volume 68, 20th International Conference on Database Theory (ICDT 2017)


Abstract
Reverse engineering problems for conjunctive queries (CQs), such as query by example (QBE) or definability, take a set of user examples and convert them into an explanatory CQ. Despite their importance, the complexity of these problems is prohibitively high (coNEXPTIME-complete). We isolate their two main sources of complexity and propose relaxations of them that reduce the complexity while having meaningful theoretical interpretations. The first relaxation is based on the idea of using existential pebble games for approximating homomorphism tests. We show that this characterizes QBE/definability for CQs up to treewidth k, while reducing the complexity to EXPTIME. As a side result, we obtain that the complexity of the QBE/definability problems for CQs of treewidth k is EXPTIME-complete for each k > 1. The second relaxation is based on the idea of "desynchronizing" direct products, which characterizes QBE/definability for unions of CQs and reduces the complexity to coNP. The combination of these two relaxations yields tractability for QBE and characterizes it in terms of unions of CQs of treewidth at most k. We also study the complexity of these problems for conjunctive regular path queries over graph databases, showing them to be no more difficult than for CQs.

Cite as

Pablo Barceló and Miguel Romero. The Complexity of Reverse Engineering Problems for Conjunctive Queries. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 7:1-7:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2017.7,
  author =	{Barcel\'{o}, Pablo and Romero, Miguel},
  title =	{{The Complexity of Reverse Engineering Problems for Conjunctive Queries}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{7:1--7:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.7},
  URN =		{urn:nbn:de:0030-drops-70525},
  doi =		{10.4230/LIPIcs.ICDT.2017.7},
  annote =	{Keywords: reverse engineering, conjunctive queries, query by example, definability, treewidth, complexity of pebble games}
}
Document
On the Data Complexity of Consistent Query Answering over Graph Databases

Authors: Pablo Barceló and Gaëlle Fontaine

Published in: LIPIcs, Volume 31, 18th International Conference on Database Theory (ICDT 2015)


Abstract
Areas in which graph databases are applied - such as the semantic web, social networks and scientific databases - are prone to inconsistency, mainly due to interoperability issues. This raises the need for understanding query answering over inconsistent graph databases in a framework that is simple yet general enough to accommodate many of its applications. We follow the well-known approach of consistent query answering (CQA), and study the data complexity of CQA over graph databases for regular path queries (RPQs) and regular path constraints (RPCs), which are frequently used. We concentrate on subset, superset and symmetric difference repairs. Without further restrictions, CQA is undecidable for the semantics based on superset and symmetric difference repairs, and Pi_2^P-complete for subset repairs. However, we provide several tractable restrictions on both RPCs and the structure of graph databases that lead to decidability, and even tractability of CQA. We also compare our results with those obtained for CQA in the context of relational databases.

Cite as

Pablo Barceló and Gaëlle Fontaine. On the Data Complexity of Consistent Query Answering over Graph Databases. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 380-397, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.ICDT.2015.380,
  author =	{Barcel\'{o}, Pablo and Fontaine, Ga\"{e}lle},
  title =	{{On the Data Complexity of Consistent Query Answering over Graph Databases}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{380--397},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.380},
  URN =		{urn:nbn:de:0030-drops-49962},
  doi =		{10.4230/LIPIcs.ICDT.2015.380},
  annote =	{Keywords: graph databases, regular path queries, consistent query answering, description logics, rewrite systems}
}
Document
Parameterized Regular Expressions and Their Languages

Authors: Pablo Barceló, Leonid Libkin, and Juan L. Reutter

Published in: LIPIcs, Volume 13, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011)


Abstract
We study regular expressions that use variables, or parameters, which are interpreted as alphabet letters. We consider two classes of languages denoted by such expressions: under the possibility semantics, a word belongs to the language if it is denoted by some regular expression obtained by replacing variables with letters; under the certainty semantics, the word must be denoted by every such expression. Such languages are regular, and we show that they naturally arise in several applications such as querying graph databases and program analysis. As the main contribution of the paper, we provide a complete characterization of the complexity of the main computational problems related to such languages: nonemptiness, universality, containment, membership, as well as the problem of constructing NFAs capturing such languages. We also look at the extension when domains of variables could be arbitrary regular languages, and show that under the certainty semantics, languages remain regular and the complexity of the main computational problems does not change.

Cite as

Pablo Barceló, Leonid Libkin, and Juan L. Reutter. Parameterized Regular Expressions and Their Languages. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011). Leibniz International Proceedings in Informatics (LIPIcs), Volume 13, pp. 351-362, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2011)


Copy BibTex To Clipboard

@InProceedings{barcelo_et_al:LIPIcs.FSTTCS.2011.351,
  author =	{Barcel\'{o}, Pablo and Libkin, Leonid and Reutter, Juan L.},
  title =	{{Parameterized Regular Expressions and Their Languages}},
  booktitle =	{IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011)},
  pages =	{351--362},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-34-7},
  ISSN =	{1868-8969},
  year =	{2011},
  volume =	{13},
  editor =	{Chakraborty, Supratik and Kumar, Amit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2011.351},
  URN =		{urn:nbn:de:0030-drops-33334},
  doi =		{10.4230/LIPIcs.FSTTCS.2011.351},
  annote =	{Keywords: Regular expressions, complexity, decision problems, regular languages}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail