Diversity of Answers to Conjunctive Queries

Merkl, Timo Camillo; Pichler, Reinhard; Skritek, Sebastian

doi:10.4230/LIPIcs.ICDT.2023.10

Abstract

Enumeration problems aim at outputting, without repetition, the set of solutions to a given problem instance. However, outputting the entire solution set may be prohibitively expensive if it is too big. In this case, outputting a small, sufficiently diverse subset of the solutions would be preferable. This leads to the Diverse-version of the original enumeration problem, where the goal is to achieve a certain level d of diversity by selecting k solutions. In this paper, we look at the Diverse-version of the query answering problem for Conjunctive Queries and extensions thereof. That is, we study the problem if it is possible to achieve a certain level d of diversity by selecting k answers to the given query and, in the positive case, to actually compute such k answers.

Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. URL: http://webdam.inria.fr/Alice/.
Paola Alimonti and Viggo Kann. Hardness of approximating problems on cubic graphs. In Gian Carlo Bongiovanni, Daniel P. Bovet, and Giuseppe Di Battista, editors, Algorithms and Complexity, Third Italian Conference, CIAC '97, Rome, Italy, March 12-14, 1997, Proceedings, volume 1203 of Lecture Notes in Computer Science, pages 288-298. Springer, 1997. URL: https://doi.org/10.1007/3-540-62592-5_80.
Antoine Amarilli, Louis Jachiet, Martin Muñoz, and Cristian Riveros. Efficient enumeration for annotated grammars. In Leonid Libkin and Pablo Barceló, editors, PODS '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 - 17, 2022, pages 291-300. ACM, 2022. URL: https://doi.org/10.1145/3517804.3526232.
Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. Efficient logspace classes for enumeration, counting, and uniform generation. In Proc. PODS 2019, pages 59-73. ACM, 2019. URL: https://doi.org/10.1145/3294052.3319704.
Julien Baste, Michael R. Fellows, Lars Jaffke, Tomás Masarík, Mateus de Oliveira Oliveira, Geevarghese Philip, and Frances A. Rosamond. Diversity of solutions: An exploration through the lens of fixed-parameter tractability theory. Artif. Intell., 303:103644, 2022. URL: https://doi.org/10.1016/j.artint.2021.103644.
Hans L. Bodlaender. A linear-time algorithm for finding tree-decompositions of small treewidth. SIAM J. Comput., 25(6):1305-1317, 1996. URL: https://doi.org/10.1137/S0097539793251219.
Angela Bonifati, Wim Martens, and Thomas Timm. An analytical study of large SPARQL query logs. VLDB J., 29(2-3):655-679, 2020. URL: https://doi.org/10.1007/s00778-019-00558-9.
Endre Boros, Benny Kimelfeld, Reinhard Pichler, and Nicole Schweikardt. Enumeration in data management (dagstuhl seminar 19211). Dagstuhl Reports, 9(5):89-109, 2019. URL: https://doi.org/10.4230/DagRep.9.5.89.
Nofar Carmeli, Shai Zeevi, Christoph Berkholz, Benny Kimelfeld, and Nicole Schweikardt. Answering (unions of) conjunctive queries using random access and random-order enumeration. In Proc. PODS 2020, pages 393-409. ACM, 2020. URL: https://doi.org/10.1145/3375395.3387662.
Ashok K. Chandra and Philip M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In John E. Hopcroft, Emily P. Friedman, and Michael A. Harrison, editors, Proceedings of the 9th Annual ACM Symposium on Theory of Computing, May 4-6, 1977, Boulder, Colorado, USA, pages 77-90. ACM, 1977. URL: https://doi.org/10.1145/800105.803397.
Surajit Chaudhuri and Rajeev Motwani. On sampling and relational operators. IEEE Data Eng. Bull., 22(4):41-46, 1999. URL: http://sites.computer.org/debull/99dec/surajit.ps.
Miroslav Chlebík and Janka Chlebíková. Hard coloring problems in low degree planar bipartite graphs. Discret. Appl. Math., 154(14):1960-1965, 2006. URL: https://doi.org/10.1016/j.dam.2006.03.014.
Ting Deng and Wenfei Fan. On the complexity of query result diversification. ACM Trans. Database Syst., 39(2):15:1-15:46, 2014. URL: https://doi.org/10.1145/2602136.
Rodney G. Downey and Michael R. Fellows. Parameterized Complexity. Monographs in Computer Science. Springer, 1999. URL: https://doi.org/10.1007/978-1-4612-0515-9.
Thomas Eiter, Esra Erdem, Halit Erdogan, and Michael Fink. Finding similar/diverse solutions in answer set programming. Theory Pract. Log. Program., 13(3):303-359, 2013. URL: https://doi.org/10.1017/S1471068411000548.
Henning Fernau, Petr A. Golovach, and Marie-France Sagot. Algorithmic enumeration: Output-sensitive, input-sensitive, parameterized, approximative (dagstuhl seminar 18421). Dagstuhl Reports, 8(10):63-86, 2018. URL: https://doi.org/10.4230/DagRep.8.10.63.
Wolfgang Fischl, Georg Gottlob, Davide Mario Longo, and Reinhard Pichler. Hyperbench: A benchmark and tool for hypergraphs and empirical findings. ACM J. Exp. Algorithmics, 26:1.6:1-1.6:40, 2021. URL: https://doi.org/10.1145/3440015.
Georg Gottlob, Nicola Leone, and Francesco Scarcello. Hypertree decompositions and tractable queries. J. Comput. Syst. Sci., 64(3):579-627, 2002. URL: https://doi.org/10.1006/jcss.2001.1809.
Marc H. Graham. On The Universal Relation. Technical report, University of Toronto, 1979.
Emmanuel Hebrard, Brahim Hnich, Barry O'Sullivan, and Toby Walsh. Finding diverse and similar solutions in constraint programming. In Manuela M. Veloso and Subbarao Kambhampati, editors, Proceedings, The Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, July 9-13, 2005, Pittsburgh, Pennsylvania, USA, pages 372-377. AAAI Press / The MIT Press, 2005. URL: http://www.aaai.org/Library/AAAI/2005/aaai05-059.php.
Linnea Ingmar, Maria Garcia de la Banda, Peter J. Stuckey, and Guido Tack. Modelling diversity of solutions. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020, pages 1528-1535. AAAI Press, 2020. URL: https://aaai.org/ojs/index.php/AAAI/article/view/5512.
David S. Johnson, Christos H. Papadimitriou, and Mihalis Yannakakis. On generating all maximal independent sets. Inf. Process. Lett., 27(3):119-123, 1988. URL: https://doi.org/10.1016/0020-0190(88)90065-8.
Ton Kloks. Treewidth, Computations and Approximations, volume 842 of Lecture Notes in Computer Science. Springer, 1994. URL: https://doi.org/10.1007/BFb0045375.
Yasuaki Kobayashi, Kazuhiro Kurita, and Kunihiro Wasa. Linear-delay enumeration for minimal steiner problems. In Leonid Libkin and Pablo Barceló, editors, PODS '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 - 17, 2022, pages 301-313. ACM, 2022. URL: https://doi.org/10.1145/3517804.3524148.
Viktor Leis, Bernhard Radke, Andrey Gubichev, Alfons Kemper, and Thomas Neumann. Cardinality estimation done right: Index-based join sampling. In Proc. CIDR 2017. www.cidrdb.org, 2017. URL: http://cidrdb.org/cidr2017/papers/p9-leis-cidr17.pdf.
Feifei Li, Bin Wu, Ke Yi, and Zhuoyue Zhao. Wander join and XDB: online aggregation via random walks. ACM Trans. Database Syst., 44(1):2:1-2:41, 2019. URL: https://doi.org/10.1145/3284551.
Carsten Lutz and Marcin Przybylko. Efficiently enumerating answers to ontology-mediated queries. In Leonid Libkin and Pablo Barceló, editors, PODS '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12 - 17, 2022, pages 277-289. ACM, 2022. URL: https://doi.org/10.1145/3517804.3524166.
Timo Camillo Merkl, Reinhard Pichler, and Sebastian Skritek. Diversity of answers to conjunctive queries. CoRR, arXiv:2301.08848, 2023. URL: https://doi.org/10.48550/arXiv.2301.08848.
Alexander Nadel. Generating diverse solutions in SAT. In Karem A. Sakallah and Laurent Simon, editors, Theory and Applications of Satisfiability Testing - SAT 2011 - 14th International Conference, SAT 2011, Ann Arbor, MI, USA, June 19-22, 2011. Proceedings, volume 6695 of Lecture Notes in Computer Science, pages 287-301. Springer, 2011. URL: https://doi.org/10.1007/978-3-642-21581-0_23.
Thierry Petit and Andrew C. Trapp. Finding diverse solutions of high quality to constraint optimization problems. In Qiang Yang and Michael J. Wooldridge, editors, Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, pages 260-267. AAAI Press, 2015. URL: http://ijcai.org/Abstract/15/043.
Neil Robertson and Paul D. Seymour. Graph minors. III. planar tree-width. J. Comb. Theory, Ser. B, 36(1):49-64, 1984. URL: https://doi.org/10.1016/0095-8956(84)90013-3.
Marko Samer and Stefan Szeider. Algorithms for propositional model counting. J. Discrete Algorithms, 8(1):50-64, 2010. URL: https://doi.org/10.1016/j.jda.2009.06.002.
Moshe Y. Vardi. The complexity of relational query languages (extended abstract). In Harry R. Lewis, Barbara B. Simons, Walter A. Burkhard, and Lawrence H. Landweber, editors, Proceedings of the 14th Annual ACM Symposium on Theory of Computing, May 5-7, 1982, San Francisco, California, USA, pages 137-146. ACM, 1982. URL: https://doi.org/10.1145/800070.802186.
Mihalis Yannakakis. Algorithms for acyclic database schemes. In Very Large Data Bases, 7th International Conference, September 9-11, 1981, Cannes, France, Proceedings, pages 82-94. IEEE Computer Society, 1981.
C. T. Yu and M. Z. Özsoyoglu. An algorithm for tree-query membership of a distributed query. In The IEEE Computer Society’s Third International Computer Software and Applications Conference, COMPSAC 1979, pages 306-312, 1979.
Zhuoyue Zhao, Robert Christensen, Feifei Li, Xiao Hu, and Ke Yi. Random sampling over joins revisited. In Proc. SIGMOD 2018, pages 1525-1539. ACM, 2018. URL: https://doi.org/10.1145/3183713.3183739.
Kaiping Zheng, Hongzhi Wang, Zhixin Qi, Jianzhong Li, and Hong Gao. A survey of query result diversification. Knowl. Inf. Syst., 51(1):1-36, 2017. URL: https://doi.org/10.1007/s10115-016-0990-4.

Diversity of Answers to Conjunctive Queries

Authors Timo Camillo Merkl, Reinhard Pichler, Sebastian Skritek

File

Document Identifiers

Author Details

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Diversity of Answers to Conjunctive Queries

Authors Timo Camillo Merkl, Reinhard Pichler, Sebastian Skritek

File

Document Identifiers

Author Details

Funding

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Related Versions

References

Thanks for your feedback!

Could not send message