Ranked Enumeration of Conjunctive Query Results

Authors Shaleen Deep, Paraschos Koutris



PDF
Thumbnail PDF

File

LIPIcs.ICDT.2021.5.pdf
  • Filesize: 0.91 MB
  • 19 pages

Document Identifiers

Author Details

Shaleen Deep
  • Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, USA
Paraschos Koutris
  • Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, USA

Cite As Get BibTex

Shaleen Deep and Paraschos Koutris. Ranked Enumeration of Conjunctive Query Results. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 5:1-5:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021) https://doi.org/10.4230/LIPIcs.ICDT.2021.5

Abstract

We study the problem of enumerating answers of Conjunctive Queries ranked according to a given ranking function. Our main contribution is a novel algorithm with small preprocessing time, logarithmic delay, and non-trivial space usage during execution. To allow for efficient enumeration, we exploit certain properties of ranking functions that frequently occur in practice. To this end, we introduce the notions of decomposable and compatible (w.r.t. a query decomposition) ranking functions, which allow for partial aggregation of tuple scores in order to efficiently enumerate the output. We complement the algorithmic results with lower bounds that justify why restrictions on the structure of ranking functions are necessary. Our results extend and improve upon a long line of work that has studied ranked enumeration from both a theoretical and practical perspective.

Subject Classification

ACM Subject Classification
  • Theory of computation → Database theory
Keywords
  • Query result enumeration
  • joins
  • ranking

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Mahmoud Abo Khamis, Hung Q Ngo, and Dan Suciu. What do shannon-type inequalities, submodular width, and disjunctive datalog have to do with one another? In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 429-444. ACM, 2017. Google Scholar
  2. Albert Atserias, Martin Grohe, and Dániel Marx. Size bounds and query plans for relational joins. SIAM Journal on Computing, 42(4):1737-1767, 2013. Google Scholar
  3. Guillaume Bagan, Arnaud Durand, and Etienne Grandjean. On acyclic conjunctive queries and constant delay enumeration. In International Workshop on Computer Science Logic, pages 208-222. Springer, 2007. Google Scholar
  4. Nurzhan Bakibayev, Tomáš Kočisky, Dan Olteanu, and Jakub Závodny. Aggregation and ordering in factorised databases. Proceedings of the VLDB Endowment, 6(14):1990-2001, 2013. Google Scholar
  5. Nurzhan Bakibayev, Dan Olteanu, and Jakub Závodny. Fdb: A query engine for factorised relational databases. Proceedings of the VLDB Endowment, 5(11):1232-1243, 2012. Google Scholar
  6. Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt. Answering conjunctive queries under updates. In proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI symposium on Principles of database systems, pages 303-318. ACM, 2017. Google Scholar
  7. Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt. Answering fo+ mod queries under updates on bounded degree databases. ACM Transactions on Database Systems (TODS), 43(2):7, 2018. Google Scholar
  8. Endre Boros, Benny Kimelfeld, Reinhard Pichler, and Nicole Schweikardt. Enumeration in Data Management (Dagstuhl Seminar 19211). Dagstuhl Reports, 9(5):89-109, 2019. URL: https://doi.org/10.4230/DagRep.9.5.89.
  9. David Bremner, Timothy M Chan, Erik D Demaine, Jeff Erickson, Ferran Hurtado, John Iacono, Stefan Langerman, and Perouz Taslakian. Necklaces, convolutions, and x+ y. In European Symposium on Algorithms, pages 160-171. Springer, 2006. Google Scholar
  10. Lijun Chang, Xuemin Lin, Wenjie Zhang, Jeffrey Xu Yu, Ying Zhang, and Lu Qin. Optimal enumeration: Efficient top-k tree matching. Proceedings of the VLDB Endowment, 8(5):533-544, 2015. Google Scholar
  11. Amrita Roy Chowdhury, Theodoros Rekatsinas, and Somesh Jha. Data-dependent differentially private parameter learning for directed graphical models. In International Conference on Machine Learning, pages 1939-1951. PMLR, 2020. Google Scholar
  12. Sara Cohen and Yehoshua Sagiv. An incremental algorithm for computing ranked full disjunctions. Journal of Computer and System Sciences, 73(4):648-668, 2007. Google Scholar
  13. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Third Edition. The MIT Press, 3rd edition, 2009. Google Scholar
  14. Shaleen Deep, Xiao Hu, and Paraschos Koutris. Fast join project query evaluation using matrix multiplication. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pages 1213-1223, 2020. Google Scholar
  15. Shaleen Deep, Xiao Hu, and Paraschos Koutris. Enumeration algorithms for conjunctive queries with projection. In To appear the the proceedings of ICDT '21 Proceedings, 2021. Google Scholar
  16. Shaleen Deep and Paraschos Koutris. Compressed representations of conjunctive query results. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 307-322. ACM, 2018. Google Scholar
  17. Shaleen Deep and Paraschos Koutris. Ranked enumeration of conjunctive query results. arXiv preprint arXiv:1902.02698, 2019. Google Scholar
  18. Erik D Demaine and Joseph O'Rourke. Open problems from cccg 2005. In Canadian Conference on Computational Geometry, pages 75-80, 2005. Google Scholar
  19. David Eppstein. Finding the k shortest paths. SIAM Journal on computing, 28(2):652-673, 1998. Google Scholar
  20. Ronald Fagin. Combining fuzzy information: an overview. ACM SIGMOD Record, 31(2):109-118, 2002. Google Scholar
  21. Ronald Fagin, Amnon Lotem, and Moni Naor. Optimal aggregation algorithms for middleware. Journal of computer and system sciences, 66(4):614-656, 2003. Google Scholar
  22. Michael L Fredman. How good is the information theory bound in sorting? Theoretical Computer Science, 1(4):355-361, 1976. Google Scholar
  23. Manish Gupta, Jing Gao, Xifeng Yan, Hasan Cam, and Jiawei Han. Top-k interesting subgraph discovery in information networks. In Data Engineering (ICDE), 2014 IEEE 30th International Conference on, pages 820-831. IEEE, 2014. Google Scholar
  24. John E Hopcroft, Jeffrey D Ullman, and AV Aho. The design and analysis of computer algorithms, 1975. Google Scholar
  25. Ihab F Ilyas, George Beskales, and Mohamed A Soliman. A survey of top-k query processing techniques in relational database systems. ACM Computing Surveys (CSUR), 40(4):11, 2008. Google Scholar
  26. Ihab F Ilyas, Rahul Shah, Walid G Aref, Jeffrey Scott Vitter, and Ahmed K Elmagarmid. Rank-aware query optimization. In Proceedings of the 2004 ACM SIGMOD international conference on Management of data, pages 203-214. ACM, 2004. Google Scholar
  27. Ahmet Kara, Milos Nikolic, Dan Olteanu, and Haozhe Zhang. Trade-offs in static and dynamic evaluation of hierarchical queries. In Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 375-392, 2020. Google Scholar
  28. Benny Kimelfeld and Yehoshua Sagiv. Incrementally computing ordered answers of acyclic conjunctive queries. In International Workshop on Next Generation Information Technologies and Systems, pages 141-152. Springer, 2006. Google Scholar
  29. Chengkai Li, Kevin Chen-Chuan Chang, Ihab F Ilyas, and Sumin Song. Ranksql: query algebra and optimization for relational top-k queries. In Proceedings of the 2005 ACM SIGMOD international conference on Management of data, pages 131-142. ACM, 2005. Google Scholar
  30. Chengkai Li, Mohamed A Soliman, Kevin Chen-Chuan Chang, and Ihab F Ilyas. Ranksql: supporting ranking queries in relational database management systems. In Proceedings of the 31st international conference on Very large data bases, pages 1342-1345. VLDB Endowment, 2005. Google Scholar
  31. Dániel Marx. Tractable hypergraph properties for constraint satisfaction and conjunctive queries. Journal of the ACM (JACM), 60(6):42, 2013. Google Scholar
  32. Apostol Natsev, Yuan-Chi Chang, John R Smith, Chung-Sheng Li, and Jeffrey Scott Vitter. Supporting incremental join queries on ranked inputs. In VLDB, volume 1, pages 281-290, 2001. Google Scholar
  33. Hung Q Ngo, Ely Porat, Christopher Ré, and Atri Rudra. Worst-case optimal join algorithms. In Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems, pages 37-48. ACM, 2012. Google Scholar
  34. Hung Q. Ngo, Christopher Ré, and Atri Rudra. Skew strikes back: new developments in the theory of join algorithms. SIGMOD Record, 42(4):5-16, 2013. URL: https://doi.org/10.1145/2590989.2590991.
  35. Dan Olteanu and Maximilian Schleich. Factorized databases. ACM SIGMOD Record, 45(2):5-16, 2016. Google Scholar
  36. Dan Olteanu and Jakub Závodný. Size bounds for factorised representations of query results. ACM Trans. Database Syst., 40(1):2, 2015. URL: https://doi.org/10.1145/2656335.
  37. Yan Qi, K Selçuk Candan, and Maria Luisa Sapino. Sum-max monotonic ranked joins for evaluating top-k twig queries on weighted data graphs. In Proceedings of the 33rd international conference on Very large data bases, pages 507-518. VLDB Endowment, 2007. Google Scholar
  38. Christopher Re, Nilesh Dalvi, and Dan Suciu. Efficient top-k query evaluation on probabilistic data. In Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on, pages 886-895. IEEE, 2007. Google Scholar
  39. Amrita Roy Chowdhury, Chenghong Wang, Xi He, Ashwin Machanavajjhala, and Somesh Jha. Crypt?: Crypto-assisted differential privacy on untrusted servers. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pages 603-619, 2020. Google Scholar
  40. Luc Segoufin. Enumerating with constant delay the answers to a query. In Proceedings of the 16th International Conference on Database Theory, pages 10-20. ACM, 2013. Google Scholar
  41. Luc Segoufin. Constant delay enumeration for conjunctive queries. SIGMOD Record, 44(1):10-17, 2015. URL: https://doi.org/10.1145/2783888.2783894.
  42. Luc Segoufin. Constant delay enumeration for conjunctive queries. ACM SIGMOD Record, 44(1):10-17, 2015. Google Scholar
  43. William L Steiger and Ileana Streinu. A pseudo-algorithmic separation of lines from pseudo-lines. Inf. Process. Lett., 53(5):295-299, 1995. Google Scholar
  44. Xiaofeng Yang, Deepak Ajwani, Wolfgang Gatterbauer, Patrick K Nicholson, Mirek Riedewald, and Alessandra Sala. Any-k: Anytime top-k tree pattern retrieval in labeled graphs. In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pages 489-498. International World Wide Web Conferences Steering Committee, 2018. Google Scholar
  45. Zhaonian Zou, Jianzhong Li, Hong Gao, and Shuo Zhang. Finding top-k maximal cliques in an uncertain graph, 2010. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail