Expressivity and Complexity of MongoDB Queries

Authors Elena Botoeva, Diego Calvanese, Benjamin Cogrel, Guohui Xiao



PDF
Thumbnail PDF

File

LIPIcs.ICDT.2018.9.pdf
  • Filesize: 0.81 MB
  • 23 pages

Document Identifiers

Author Details

Elena Botoeva
Diego Calvanese
Benjamin Cogrel
Guohui Xiao

Cite AsGet BibTex

Elena Botoeva, Diego Calvanese, Benjamin Cogrel, and Guohui Xiao. Expressivity and Complexity of MongoDB Queries. In 21st International Conference on Database Theory (ICDT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 98, pp. 9:1-9:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ICDT.2018.9

Abstract

In this paper, we consider MongoDB, a widely adopted but not formally understood database system managing JSON documents and equipped with a powerful query mechanism, called the aggregation framework. We provide a clean formal abstraction of this query language, which we call MQuery. We study the expressivity of MQuery, showing the equivalence of its well-typed fragment with nested relational algebra. We further investigate the computational complexity of significant fragments of it, obtaining several (tight) bounds in combined complexity, which range from LogSpace to alternating exponential-time with a polynomial number of alternations.
Keywords
  • MongoDB
  • NoSQL
  • aggregation framework
  • expressivity

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Serge Abiteboul and Catriel Beeri. The power of languages for the manipulation of complex values. Very Large Database Journal, 4(4):727-794, 1995. URL: http://www.vldb.org/journal/VLDBJ4/P727.pdf.
  2. Elena Botoeva, Diego Calvanese, Benjamin Cogrel, Martin Rezk, and Guohui Xiao. OBDA beyond relational DBs: A study for MongoDB. In Proc. of the 29th Int. Workshop on Description Logics (DL), volume 1577 of CEUR Workshop Proceedings, http://ceur-ws.org/, 2016.
  3. Elena Botoeva, Diego Calvanese, Benjamin Cogrel, and Guohui Xiao. Expressivity and complexity of MongoDB (Extended version). CoRR Technical Report arXiv:1603.09291, arXiv.org e-Print archive, 2017. Available at URL: http://arxiv.org/abs/1603.09291.
  4. Pierre Bourhis, Juan L. Reutter, Fernando Suárez, and Domagoj Vrgoc. JSON: data model, query languages and schema specification. In Emanuel Sallinger, Jan Van den Bussche, and Floris Geerts, editors, Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pages 123-135. ACM, 2017. URL: http://dx.doi.org/10.1145/3034786.3056120.
  5. Peter Buneman, Shamim A. Naqvi, Val Tannen, and Limsoon Wong. Principles of programming with complex objects and collection types. Theor. Comput. Sci., 149(1):3-48, 1995. URL: http://dx.doi.org/10.1016/0304-3975(95)00024-Q.
  6. Ashok K. Chandra, Dexter Kozen, and Larry J. Stockmeyer. Alternation. J. ACM, 28(1):114-133, 1981. URL: http://dx.doi.org/10.1145/322234.322243.
  7. Evgeny Dantsin and Andrei Voronkov. Complexity of query answering in logic databases with complex values. In Sergei I. Adian and Anil Nerode, editors, Logical Foundations of Computer Science, 4th International Symposium, LFCS'97, Yaroslavl, Russia, July 6-12, 1997, Proceedings, volume 1234 of Lecture Notes in Computer Science, pages 56-66. Springer, 1997. URL: http://dx.doi.org/10.1007/3-540-63045-7_7.
  8. Jan Van den Bussche. Simulation of the nested relational algebra by the flat relational algebra, with an application to the complexity of evaluating powerset algebra expressions. Theor. Comput. Sci., 254(1-2):363-377, 2001. URL: http://dx.doi.org/10.1016/S0304-3975(99)00301-1.
  9. Jan Van den Bussche and Jan Paredaens. The expressive power of complex values in object-based data models. Inf. Comput., 120(2):220-236, 1995. URL: http://dx.doi.org/10.1006/inco.1995.1110.
  10. Daniela Florescu and Ghislain Fourny. Jsoniq: The history of a query language. IEEE Internet Computing, 17(5):86-90, 2013. URL: http://dx.doi.org/10.1109/MIC.2013.97.
  11. Stéphane Grumbach and Victor Vianu. Tractable query languages for complex object databases. In Daniel J. Rosenkrantz, editor, Proceedings of the Tenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 29-31, 1991, Denver, Colorado, USA, pages 315-327. ACM Press, 1991. URL: http://dx.doi.org/10.1145/113413.113442.
  12. Jan Hidders, Jan Paredaens, and Jan Van den Bussche. J-logic: Logical foundations for JSON querying. In Emanuel Sallinger, Jan Van den Bussche, and Floris Geerts, editors, Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pages 137-149. ACM, 2017. URL: http://dx.doi.org/10.1145/3034786.3056106.
  13. Gerhard Jaeschke and Hans-Jörg Schek. Remarks on the algebra of non first normal form relations. In Jeffrey D. Ullman and Alfred V. Aho, editors, Proceedings of the ACM Symposium on Principles of Database Systems, March 29-31, 1982, Los Angeles, California, USA, pages 124-138. ACM, 1982. URL: http://dx.doi.org/10.1145/588111.588133.
  14. David S. Johnson. A catalog of complexity classes. In Handbook of Theoretical Computer Science, volume A, chapter 2, pages 67-161. Elsevier Science Publishers, 1990. Google Scholar
  15. Christoph Koch. On the complexity of nonrecursive XQuery and functional query languages on complex values. ACM Trans. on Database Systems, 31(4):1215-1256, 2006. URL: http://dx.doi.org/10.1145/1189769.1189771.
  16. Kian Win Ong, Yannis Papakonstantinou, and Romain Vernoux. The SQL++ semi-structured data model and query language: A capabilities survey of SQL-on-Hadoop, NoSQL and NewSQL databases. CoRR Technical Report arXiv:1405.3631, arXiv.org e-Print archive, 2017. Available at URL: http://arxiv.org/abs/1405.3631.
  17. Felipe Pezoa, Juan L. Reutter, Fernando Suárez, Martín Ugarte, and Domagoj Vrgoc. Foundations of JSON schema. In Jacqueline Bourdeau, Jim Hendler, Roger Nkambou, Ian Horrocks, and Ben Y. Zhao, editors, Proceedings of the 25th International Conference on World Wide Web, WWW 2016, Montreal, Canada, April 11 - 15, 2016, pages 263-273. ACM, 2016. URL: http://dx.doi.org/10.1145/2872427.2883029.
  18. Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Riccardo Rosati. Linking data to ontologies. J. Data Semantics, 10:133-173, 2008. URL: http://dx.doi.org/10.1007/978-3-540-77688-8_5.
  19. Stan J. Thomas and Patrick C. Fischer. Nested relational structures. Advances in Computing Research, 3:269-307, 1986. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail