Ranked Enumeration of MSO Logic on Words

Authors Pierre Bourhis, Alejandro Grez, Louis Jachiet, Cristian Riveros

Thumbnail PDF


  • Filesize: 0.77 MB
  • 19 pages

Document Identifiers

Author Details

Pierre Bourhis
  • CNRS Lille, CRIStAL UMR 9189, University of Lille, INRIA Lille, France
Alejandro Grez
  • Pontificia Universidad Católica de Chile, Santiago, Chile
  • Millennium Institute for Foundational Research on Data, Santiago, Chile
Louis Jachiet
  • LTCI, IP Paris, France
Cristian Riveros
  • Pontificia Universidad Católica de Chile, Santiago, Chile
  • Millennium Institute for Foundational Research on Data, Santiago, Chile

Cite AsGet BibTex

Pierre Bourhis, Alejandro Grez, Louis Jachiet, and Cristian Riveros. Ranked Enumeration of MSO Logic on Words. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 20:1-20:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


In the last years, enumeration algorithms with bounded delay have attracted a lot of attention for several data management tasks. Given a query and the data, the task is to preprocess the data and then enumerate all the answers to the query one by one and without repetitions. This enumeration scheme is typically useful when the solutions are treated on the fly or when we want to stop the enumeration once the pertinent solutions have been found. However, with the current schemes, there is no restriction on the order how the solutions are given and this order usually depends on the techniques used and not on the relevance for the user. In this paper we study the enumeration of monadic second order logic (MSO) over words when the solutions are ranked. We present a framework based on MSO cost functions that allows to express MSO formulae on words with a cost associated with each solution. We then demonstrate the generality of our framework which subsumes, for instance, document spanners and adds ranking to them. The main technical result of the paper is an algorithm for enumerating all the solutions of formulae in increasing order of cost efficiently, namely, with a linear preprocessing phase and logarithmic delay between solutions. The novelty of this algorithm is based on using functional data structures, in particular, by extending functional Brodal queues to suit with the ranked enumeration of MSO on words.

Subject Classification

ACM Subject Classification
  • Theory of computation → Data structures and algorithms for data management
  • Theory of computation → Complexity theory and logic
  • Theory of computation → Formal languages and automata theory
  • Persistent data structures
  • Query evaluation
  • Enumeration algorithms


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Alfred V Aho and John E Hopcroft. The design and analysis of computer algorithms. Pearson Education India, 1974. Google Scholar
  2. Antoine Amarilli, Pierre Bourhis, Stefan Mengel, and Matthias Niewerth. Constant-delay enumeration for nondeterministic document spanners. In ICDT, pages 22:1-22:19, 2019. Google Scholar
  3. Guillaume Bagan. Mso queries on tree decomposable structures are computable with linear delay. In International Workshop on Computer Science Logic, pages 167-181. Springer, 2006. Google Scholar
  4. Gerth Stølting Brodal and Chris Okasaki. Optimal purely functional priority queues. Journal of Functional Programming, 6(6):839-857, 1996. Google Scholar
  5. Thomas H Cormen, Charles E Leiserson, Ronald L Rivest, and Clifford Stein. Introduction to algorithms. MIT press, 2009. Google Scholar
  6. Bruno Courcelle. Linear delay enumeration and monadic second-order logic. Discrete Applied Mathematics, 157(12):2675-2700, 2009. Google Scholar
  7. Shaleen Deep and Paraschos Koutris. Ranked enumeration of conjunctive query results. CoRR, abs/1902.02698, 2019. Google Scholar
  8. Johannes Doleschal, Noa Bratman, Benny Kimelfeld, and Wim Martens. The complexity of aggregates over extractions by regular expressions. In ICDT, 2021. Google Scholar
  9. Johannes Doleschal, Benny Kimelfeld, Wim Martens, and Liat Peterfreund. Weight annotation in information extraction. In ICDT, volume 155, pages 8:1-8:18, 2020. Google Scholar
  10. James R Driscoll, Neil Sarnak, Daniel Dominic Sleator, and Robert Endre Tarjan. Making data structures persistent. In STOC, pages 109-121, 1986. Google Scholar
  11. Manfred Droste and Paul Gastin. Weighted automata and weighted logics. In ICALP, volume 3580, pages 513-525, 2005. Google Scholar
  12. Manfred Droste, Werner Kuich, and Heiko Vogler. Handbook of weighted automata. Springer Science & Business Media, 2009. Google Scholar
  13. Ronald Fagin, Benny Kimelfeld, Frederick Reiss, and Stijn Vansummeren. Document spanners: A formal approach to information extraction. J. ACM, 62(2):12:1-12:51, 2015. Google Scholar
  14. Fernando Florenzano, Cristian Riveros, Martín Ugarte, Stijn Vansummeren, and Domagoj Vrgoc. Efficient enumeration algorithms for regular document spanners. ACM Trans. Database Syst., 45(1):3:1-3:42, 2020. Google Scholar
  15. Dominik D Freydenberger, Benny Kimelfeld, and Liat Peterfreund. Joining extractions of regular expressions. In Proceedings of PODS, pages 137-149, 2018. Google Scholar
  16. Markus Frick and Martin Grohe. The complexity of first-order and monadic second-order logic revisited. Ann. Pure Appl. Log., 130(1-3):3-31, 2004. Google Scholar
  17. Jonathan S Golan. Semirings and their Applications. Springer Science & Business Media, 2013. Google Scholar
  18. Alejandro Grez, Cristian Riveros, and Martín Ugarte. A Formal Framework for Complex Event Processing. In ICDT, pages 5:1-5:18, 2019. Google Scholar
  19. Stephan Kreutzer and Cristian Riveros. Quantitative monadic second-order logic. In LICS, pages 113-122, 2013. Google Scholar
  20. Leonid Libkin. Elements of finite model theory. Springer Science & Business Media, 2013. Google Scholar
  21. Francisco Maturana, Cristian Riveros, and Domagoj Vrgoc. Document spanners for extracting incomplete information: Expressiveness and complexity. In Proceedings of PODS, pages 125-136. ACM, 2018. Google Scholar
  22. Klaus Reinhardt. The complexity of translating logic to finite automata. In Automata logics, and infinite games, pages 231-238. Springer, 2002. Google Scholar
  23. Luc Segoufin. Enumerating with constant delay the answers to a query. In Proceedings of the 16th International Conference on Database Theory, pages 10-20, 2013. Google Scholar
  24. Nikolaos Tziavelis, Deepak Ajwani, Wolfgang Gatterbauer, Mirek Riedewald, and Xiaofeng Yang. Optimal algorithms for ranked enumeration of answers to full conjunctive queries. VLDB, 13(9):1582-1597, 2020. Google Scholar
  25. Nikolaos Tziavelis, Wolfgang Gatterbauer, and Mirek Riedewald. Optimal join algorithms meet top-k. In SIGMOD, pages 2659-2665. ACM, 2020. Google Scholar
Questions / Remarks / Feedback

Feedback for Dagstuhl Publishing

Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail