Document

# Ranked Enumeration of MSO Logic on Words

## File

LIPIcs.ICDT.2021.20.pdf
• Filesize: 0.77 MB
• 19 pages

## Cite As

Pierre Bourhis, Alejandro Grez, Louis Jachiet, and Cristian Riveros. Ranked Enumeration of MSO Logic on Words. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 20:1-20:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ICDT.2021.20

## Abstract

In the last years, enumeration algorithms with bounded delay have attracted a lot of attention for several data management tasks. Given a query and the data, the task is to preprocess the data and then enumerate all the answers to the query one by one and without repetitions. This enumeration scheme is typically useful when the solutions are treated on the fly or when we want to stop the enumeration once the pertinent solutions have been found. However, with the current schemes, there is no restriction on the order how the solutions are given and this order usually depends on the techniques used and not on the relevance for the user. In this paper we study the enumeration of monadic second order logic (MSO) over words when the solutions are ranked. We present a framework based on MSO cost functions that allows to express MSO formulae on words with a cost associated with each solution. We then demonstrate the generality of our framework which subsumes, for instance, document spanners and adds ranking to them. The main technical result of the paper is an algorithm for enumerating all the solutions of formulae in increasing order of cost efficiently, namely, with a linear preprocessing phase and logarithmic delay between solutions. The novelty of this algorithm is based on using functional data structures, in particular, by extending functional Brodal queues to suit with the ranked enumeration of MSO on words.

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Data structures and algorithms for data management
• Theory of computation → Complexity theory and logic
• Theory of computation → Formal languages and automata theory
##### Keywords
• Persistent data structures
• Query evaluation
• Enumeration algorithms

## Metrics

• Access Statistics
• Total Accesses (updated on a weekly basis)
0

## References

1. Alfred V Aho and John E Hopcroft. The design and analysis of computer algorithms. Pearson Education India, 1974.
2. Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth. Constant-delay enumeration for nondeterministic document spanners. In ICDT, pages 22:1-22:19, 2019.
3. Guillaume Bagan. Mso queries on tree decomposable structures are computable with linear delay. In International Workshop on Computer Science Logic, pages 167-181. Springer, 2006.
4. Gerth Stølting Brodal and Chris Okasaki. Optimal purely functional priority queues. Journal of Functional Programming, 6(6):839-857, 1996.
5. Thomas H Cormen, Charles E Leiserson, Ronald L Rivest, and Clifford Stein. Introduction to algorithms. MIT press, 2009.
6. Bruno Courcelle. Linear delay enumeration and monadic second-order logic. Discrete Applied Mathematics, 157(12):2675-2700, 2009.
7. Shaleen Deep and Paraschos Koutris. Ranked enumeration of conjunctive query results. CoRR, abs/1902.02698, 2019.
8. Johannes Doleschal, Noa Bratman, Benny Kimelfeld, and Wim Martens. The complexity of aggregates over extractions by regular expressions. In ICDT, 2021.
9. Johannes Doleschal, Benny Kimelfeld, Wim Martens, and Liat Peterfreund. Weight annotation in information extraction. In ICDT, volume 155, pages 8:1-8:18, 2020.
10. James R Driscoll, Neil Sarnak, Daniel Dominic Sleator, and Robert Endre Tarjan. Making data structures persistent. In STOC, pages 109-121, 1986.
11. Manfred Droste and Paul Gastin. Weighted automata and weighted logics. In ICALP, volume 3580, pages 513-525, 2005.
12. Manfred Droste, Werner Kuich, and Heiko Vogler. Handbook of weighted automata. Springer Science & Business Media, 2009.
13. Ronald Fagin, Benny Kimelfeld, Frederick Reiss, and Stijn Vansummeren. Document spanners: A formal approach to information extraction. J. ACM, 62(2):12:1-12:51, 2015.
14. Fernando Florenzano, Cristian Riveros, Martín Ugarte, Stijn Vansummeren, and Domagoj Vrgoc. Efficient enumeration algorithms for regular document spanners. ACM Trans. Database Syst., 45(1):3:1-3:42, 2020.
15. Dominik D Freydenberger, Benny Kimelfeld, and Liat Peterfreund. Joining extractions of regular expressions. In Proceedings of PODS, pages 137-149, 2018.
16. Markus Frick and Martin Grohe. The complexity of first-order and monadic second-order logic revisited. Ann. Pure Appl. Log., 130(1-3):3-31, 2004.
17. Jonathan S Golan. Semirings and their Applications. Springer Science & Business Media, 2013.
18. Alejandro Grez, Cristian Riveros, and Martín Ugarte. A Formal Framework for Complex Event Processing. In ICDT, pages 5:1-5:18, 2019.
19. Stephan Kreutzer and Cristian Riveros. Quantitative monadic second-order logic. In LICS, pages 113-122, 2013.
20. Leonid Libkin. Elements of finite model theory. Springer Science & Business Media, 2013.
21. Francisco Maturana, Cristian Riveros, and Domagoj Vrgoc. Document spanners for extracting incomplete information: Expressiveness and complexity. In Proceedings of PODS, pages 125-136. ACM, 2018.
22. Klaus Reinhardt. The complexity of translating logic to finite automata. In Automata logics, and infinite games, pages 231-238. Springer, 2002.
23. Luc Segoufin. Enumerating with constant delay the answers to a query. In Proceedings of the 16th International Conference on Database Theory, pages 10-20, 2013.
24. Nikolaos Tziavelis, Deepak Ajwani, Wolfgang Gatterbauer, Mirek Riedewald, and Xiaofeng Yang. Optimal algorithms for ranked enumeration of answers to full conjunctive queries. VLDB, 13(9):1582-1597, 2020.
25. Nikolaos Tziavelis, Wolfgang Gatterbauer, and Mirek Riedewald. Optimal join algorithms meet top-k. In SIGMOD, pages 2659-2665. ACM, 2020.
X

Feedback for Dagstuhl Publishing