Search Results

Documents authored by Bourhis, Pierre


Document
On Distances Between Words with Parameters

Authors: Pierre Bourhis, Aaron Boussidan, and Philippe Gambette

Published in: LIPIcs, Volume 259, 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)


Abstract
The edit distance between parameterized words is a generalization of the classical edit distance where it is allowed to map particular letters of the first word, called parameters, to parameters of the second word before computing the distance. This problem has been introduced in particular for detection of code duplication, and the notion of words with parameters has also been used with different semantics in other fields. The complexity of several variants of edit distances between parameterized words has been studied, however, the complexity of the most natural one, the Levenshtein distance, remained open. In this paper, we solve this open question and close the exhaustive analysis of all cases of parameterized word matching and function matching, showing that these problems are np-complete. To this aim, we also provide a comparison of the different problems, exhibiting several equivalences between them. We also provide and implement a MaxSAT encoding of the problem, as well as a simple FPT algorithm in the alphabet size, and study their efficiency on real data in the context of theater play structure comparison.

Cite as

Pierre Bourhis, Aaron Boussidan, and Philippe Gambette. On Distances Between Words with Parameters. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 6:1-6:23, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{bourhis_et_al:LIPIcs.CPM.2023.6,
  author =	{Bourhis, Pierre and Boussidan, Aaron and Gambette, Philippe},
  title =	{{On Distances Between Words with Parameters}},
  booktitle =	{34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
  pages =	{6:1--6:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-276-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{259},
  editor =	{Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.6},
  URN =		{urn:nbn:de:0030-drops-179602},
  doi =		{10.4230/LIPIcs.CPM.2023.6},
  annote =	{Keywords: String matching, edit distance, Levenshtein, parameterized matching, parameterized words, parameter words, instantiable words, NP-completeness, MAX-SAT}
}
Document
Ranked Enumeration of MSO Logic on Words

Authors: Pierre Bourhis, Alejandro Grez, Louis Jachiet, and Cristian Riveros

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)


Abstract
In the last years, enumeration algorithms with bounded delay have attracted a lot of attention for several data management tasks. Given a query and the data, the task is to preprocess the data and then enumerate all the answers to the query one by one and without repetitions. This enumeration scheme is typically useful when the solutions are treated on the fly or when we want to stop the enumeration once the pertinent solutions have been found. However, with the current schemes, there is no restriction on the order how the solutions are given and this order usually depends on the techniques used and not on the relevance for the user. In this paper we study the enumeration of monadic second order logic (MSO) over words when the solutions are ranked. We present a framework based on MSO cost functions that allows to express MSO formulae on words with a cost associated with each solution. We then demonstrate the generality of our framework which subsumes, for instance, document spanners and adds ranking to them. The main technical result of the paper is an algorithm for enumerating all the solutions of formulae in increasing order of cost efficiently, namely, with a linear preprocessing phase and logarithmic delay between solutions. The novelty of this algorithm is based on using functional data structures, in particular, by extending functional Brodal queues to suit with the ranked enumeration of MSO on words.

Cite as

Pierre Bourhis, Alejandro Grez, Louis Jachiet, and Cristian Riveros. Ranked Enumeration of MSO Logic on Words. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 20:1-20:19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bourhis_et_al:LIPIcs.ICDT.2021.20,
  author =	{Bourhis, Pierre and Grez, Alejandro and Jachiet, Louis and Riveros, Cristian},
  title =	{{Ranked Enumeration of MSO Logic on Words}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{20:1--20:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.20},
  URN =		{urn:nbn:de:0030-drops-137286},
  doi =		{10.4230/LIPIcs.ICDT.2021.20},
  annote =	{Keywords: Persistent data structures, Query evaluation, Enumeration algorithms}
}
Document
Constant-Delay Enumeration for Nondeterministic Document Spanners

Authors: Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)


Abstract
We consider the information extraction framework known as document spanners, and study the problem of efficiently computing the results of the extraction from an input document, where the extraction task is described as a sequential variable-set automaton (VA). We pose this problem in the setting of enumeration algorithms, where we can first run a preprocessing phase and must then produce the results with a small delay between any two consecutive results. Our goal is to have an algorithm which is tractable in combined complexity, i.e., in the sizes of the input document and the VA; while ensuring the best possible data complexity bounds in the input document size, i.e., constant delay in the document size. Several recent works at PODS'18 proposed such algorithms but with linear delay in the document size or with an exponential dependency in size of the (generally nondeterministic) input VA. In particular, Florenzano et al. suggest that our desired runtime guarantees cannot be met for general sequential VAs. We refute this and show that, given a nondeterministic sequential VA and an input document, we can enumerate the mappings of the VA on the document with the following bounds: the preprocessing is linear in the document size and polynomial in the size of the VA, and the delay is independent of the document and polynomial in the size of the VA. The resulting algorithm thus achieves tractability in combined complexity and the best possible data complexity bounds. Moreover, it is rather easy to describe, in particular for the restricted case of so-called extended VAs.

Cite as

Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth. Constant-Delay Enumeration for Nondeterministic Document Spanners. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 22:1-22:19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2019.22,
  author =	{Amarilli, Antoine and Bourhis, Pierre and Mengel, Stefan and Niewerth, Matthias},
  title =	{{Constant-Delay Enumeration for Nondeterministic Document Spanners}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{22:1--22:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.22},
  URN =		{urn:nbn:de:0030-drops-103246},
  doi =		{10.4230/LIPIcs.ICDT.2019.22},
  annote =	{Keywords: enumeration, spanners, automata}
}
Document
Enumeration on Trees under Relabelings

Authors: Antoine Amarilli, Pierre Bourhis, and Stefan Mengel

Published in: LIPIcs, Volume 98, 21st International Conference on Database Theory (ICDT 2018)


Abstract
We study how to evaluate MSO queries with free variables on trees, within the framework of enumeration algorithms. Previous work has shown how to enumerate answers with linear-time preprocessing and delay linear in the size of each output, i.e., constant-delay for free first-order variables. We extend this result to support relabelings, a restricted kind of update operations on trees which allows us to change the node labels. Our main result shows that we can enumerate the answers of MSO queries on trees with linear-time preprocessing and delay linear in each answer, while supporting node relabelings in logarithmic time. To prove this, we reuse the circuit-based enumeration structure from our earlier work, and develop techniques to maintain its index under node relabelings. We also show how enumeration under relabelings can be applied to evaluate practical query languages, such as aggregate, group-by, and parameterized queries.

Cite as

Antoine Amarilli, Pierre Bourhis, and Stefan Mengel. Enumeration on Trees under Relabelings. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 5:1-5:18, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2018.5,
  author =	{Amarilli, Antoine and Bourhis, Pierre and Mengel, Stefan},
  title =	{{Enumeration on Trees under Relabelings}},
  booktitle =	{21st International Conference on Database Theory (ICDT 2018)},
  pages =	{5:1--5:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-063-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{98},
  editor =	{Kimelfeld, Benny and Amsterdamer, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2018.5},
  URN =		{urn:nbn:de:0030-drops-86060},
  doi =		{10.4230/LIPIcs.ICDT.2018.5},
  annote =	{Keywords: enumeration, trees, updates, MSO, circuits, knowledge compilation}
}
Document
Characterizing Definability in Decidable Fixpoint Logics

Authors: Michael Benedikt, Pierre Bourhis, and Michael Vanden Boom

Published in: LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)


Abstract
We look at characterizing which formulas are expressible in rich decidable logics such as guarded fixpoint logic, unary negation fixpoint logic, and guarded negation fixpoint logic. We consider semantic characterizations of definability, as well as effective characterizations. Our algorithms revolve around a finer analysis of the tree-model property and a refinement of the method of moving back-and-forth between relational logics and logics over trees.

Cite as

Michael Benedikt, Pierre Bourhis, and Michael Vanden Boom. Characterizing Definability in Decidable Fixpoint Logics. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 107:1-107:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{benedikt_et_al:LIPIcs.ICALP.2017.107,
  author =	{Benedikt, Michael and Bourhis, Pierre and Vanden Boom, Michael},
  title =	{{Characterizing Definability in Decidable Fixpoint Logics}},
  booktitle =	{44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)},
  pages =	{107:1--107:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-041-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{80},
  editor =	{Chatzigiannakis, Ioannis and Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2017.107},
  URN =		{urn:nbn:de:0030-drops-74062},
  doi =		{10.4230/LIPIcs.ICALP.2017.107},
  annote =	{Keywords: Guarded logics, bisimulation, definability, automata}
}
Document
A Circuit-Based Approach to Efficient Enumeration

Authors: Antoine Amarilli, Pierre Bourhis, Louis Jachiet, and Stefan Mengel

Published in: LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)


Abstract
We study the problem of enumerating the satisfying valuations of a circuit while bounding the delay, i.e., the time needed to compute each successive valuation. We focus on the class of structured d-DNNF circuits originally introduced in knowledge compilation, a sub-area of artificial intelligence. We propose an algorithm for these circuits that enumerates valuations with linear preprocessing and delay linear in the Hamming weight of each valuation. Moreover, valuations of constant Hamming weight can be enumerated with linear preprocessing and constant delay. Our results yield a framework for efficient enumeration that applies to all problems whose solutions can be compiled to structured d-DNNFs. In particular, we use it to recapture classical results in database theory, for factorized database representations and for MSO evaluation. This gives an independent proof of constant-delay enumeration for MSO formulae with first-order free variables on bounded-treewidth structures.

Cite as

Antoine Amarilli, Pierre Bourhis, Louis Jachiet, and Stefan Mengel. A Circuit-Based Approach to Efficient Enumeration. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 111:1-111:15, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICALP.2017.111,
  author =	{Amarilli, Antoine and Bourhis, Pierre and Jachiet, Louis and Mengel, Stefan},
  title =	{{A Circuit-Based Approach to Efficient Enumeration}},
  booktitle =	{44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)},
  pages =	{111:1--111:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-041-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{80},
  editor =	{Chatzigiannakis, Ioannis and Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2017.111},
  URN =		{urn:nbn:de:0030-drops-74626},
  doi =		{10.4230/LIPIcs.ICALP.2017.111},
  annote =	{Keywords: circuits, constant-delay, enumeration, d-DNNFs, MSO}
}
Document
Combined Tractability of Query Evaluation via Tree Automata and Cycluits

Authors: Antoine Amarilli, Pierre Bourhis, Mikaël Monet, and Pierre Senellart

Published in: LIPIcs, Volume 68, 20th International Conference on Database Theory (ICDT 2017)


Abstract
We investigate parameterizations of both database instances and queries that make query evaluation fixed-parameter tractable in combined complexity. We introduce a new Datalog fragment with stratified negation, intensional-clique-guarded Datalog (ICG-Datalog), with linear-time evaluation on structures of bounded treewidth for programs of bounded rule size. Such programs capture in particular conjunctive queries with simplicial decompositions of bounded width, guarded negation fragment queries of bounded CQ-rank, or two-way regular path queries. Our result is shown by compiling to alternating two-way automata, whose semantics is defined via cyclic provenance circuits (cycluits) that can be tractably evaluated. Last, we prove that probabilistic query evaluation remains intractable in combined complexity under this parameterization.

Cite as

Antoine Amarilli, Pierre Bourhis, Mikaël Monet, and Pierre Senellart. Combined Tractability of Query Evaluation via Tree Automata and Cycluits. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 6:1-6:19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{amarilli_et_al:LIPIcs.ICDT.2017.6,
  author =	{Amarilli, Antoine and Bourhis, Pierre and Monet, Mika\"{e}l and Senellart, Pierre},
  title =	{{Combined Tractability of Query Evaluation via Tree Automata and Cycluits}},
  booktitle =	{20th International Conference on Database Theory (ICDT 2017)},
  pages =	{6:1--6:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-024-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{68},
  editor =	{Benedikt, Michael and Orsi, Giorgio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2017.6},
  URN =		{urn:nbn:de:0030-drops-70516},
  doi =		{10.4230/LIPIcs.ICDT.2017.6},
  annote =	{Keywords: query evaluation, tree automata, provenance, treewidth, circuits}
}
Document
A Formal Study of Collaborative Access Control in Distributed Datalog

Authors: Serge Abiteboul, Pierre Bourhis, and Victor Vianu

Published in: LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)


Abstract
We formalize and study a declaratively specified collaborative access control mechanism for data dissemination in a distributed environment. Data dissemination is specified using distributed datalog. Access control is also defined by datalog-style rules, at the relation level for extensional relations, and at the tuple level for intensional ones, based on the derivation of tuples. The model also includes a mechanism for "declassifying" data, that allows circumventing overly restrictive access control. We consider the complexity of determining whether a peer is allowed to access a given fact, and address the problem of achieving the goal of disseminating certain information under some access control policy. We also investigate the problem of information leakage, which occurs when a peer is able to infer facts to which the peer is not allowed access by the policy. Finally, we consider access control extended to facts equipped with provenance information, motivated by the many applications where such information is required. We provide semantics for access control with provenance, and establish the complexity of determining whether a peer may access a given fact together with its provenance. This work is motivated by the access control of the Webdamlog system, whose core features it formalizes.

Cite as

Serge Abiteboul, Pierre Bourhis, and Victor Vianu. A Formal Study of Collaborative Access Control in Distributed Datalog. In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 10:1-10:17, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{abiteboul_et_al:LIPIcs.ICDT.2016.10,
  author =	{Abiteboul, Serge and Bourhis, Pierre and Vianu, Victor},
  title =	{{A Formal Study of Collaborative Access Control in Distributed Datalog}},
  booktitle =	{19th International Conference on Database Theory (ICDT 2016)},
  pages =	{10:1--10:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-002-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{48},
  editor =	{Martens, Wim and Zeume, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2016.10},
  URN =		{urn:nbn:de:0030-drops-57794},
  doi =		{10.4230/LIPIcs.ICDT.2016.10},
  annote =	{Keywords: Distributed datalog, access control, provenance}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail