Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability

Authors: Antoine Amarilli, Timothy van Bremen, Kuldeep S. Meel




File

LIPIcs.ICDT.2024.15.pdf
  • Filesize: 0.79 MB
  • 20 pages

Author Details

Antoine Amarilli
  • LTCI, Télécom Paris, Institut Polytechnique de Paris, France
Timothy van Bremen
  • National University of Singapore, Singapore
Kuldeep S. Meel
  • University of Toronto, Canada

Acknowledgements

The authors thank Octave Gaspard for pointing out some oversights in some proofs, which are corrected in the present version.

Cite As

Antoine Amarilli, Timothy van Bremen, and Kuldeep S. Meel. Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 15:1-15:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ICDT.2024.15

Abstract

Query evaluation over probabilistic databases is a notoriously intractable problem, not only in combined complexity but, for many natural queries, in data complexity as well [Antoine Amarilli et al., 2017; Nilesh N. Dalvi and Dan Suciu, 2012]. This motivates the study of probabilistic query evaluation through the lens of approximation algorithms, and particularly of combined FPRASes, whose runtime is polynomial in both the query and instance size. In this paper, we focus on tuple-independent probabilistic databases over binary signatures, which can be equivalently viewed as probabilistic graphs. We study in which cases we can devise combined FPRASes for probabilistic query evaluation in this setting. We settle the complexity of this problem for a variety of query and instance classes, by proving both approximability and (conditional) inapproximability results. This allows us to deduce many corollaries of possible independent interest. For example, we show how the results of [Marcelo Arenas et al., 2021] on counting fixed-length strings accepted by an NFA imply the existence of an FPRAS for the two-terminal network reliability problem on directed acyclic graphs: this was an open problem until now [Rico Zenklusen and Marco Laumanns, 2011]. We also show that one cannot extend a recent result [Timothy van Bremen and Kuldeep S. Meel, 2023] that gives a combined FPRAS for self-join-free conjunctive queries of bounded hypertree width on probabilistic databases: neither the bounded-hypertree-width condition nor the self-join-freeness hypothesis can be relaxed. Finally, we complement all our inapproximability results with unconditional lower bounds, showing that DNNF provenance circuits must have at least moderately exponential size in combined complexity.
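As an illustration of the setting only, the following minimal Python sketch computes two-terminal reliability on a tiny probabilistic directed acyclic graph by enumerating every possible world. It is not the FPRAS discussed in the paper: it runs in time exponential in the number of edges and is meant solely to make the problem definition concrete. The function names and the toy graph are assumptions introduced for this example.

```python
from itertools import product


def reachable(edges, source, target):
    """Return True if target can be reached from source along the directed edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def two_terminal_reliability(prob_edges, source, target):
    """Exact s-t reliability by brute force over all 2^m possible worlds.

    prob_edges maps each directed edge (u, v) to the probability that the
    edge is present; edges are assumed independent (tuple-independent model).
    """
    edges = list(prob_edges)
    total = 0.0
    for keep in product([False, True], repeat=len(edges)):
        weight = 1.0
        world = []
        for edge, kept in zip(edges, keep):
            p = prob_edges[edge]
            weight *= p if kept else 1.0 - p
            if kept:
                world.append(edge)
        # Add the probability of this world if the query (s reaches t) holds in it.
        if reachable(world, source, target):
            total += weight
    return total


# Hypothetical toy instance: two s-t paths, every edge present with probability 0.5.
toy_graph = {("s", "a"): 0.5, ("a", "t"): 0.5, ("s", "t"): 0.5}
reliability = two_terminal_reliability(toy_graph, "s", "t")
print(reliability)  # 0.625 = 1 - (1 - 0.5) * (1 - 0.25)
```

The enumeration visits all 2^m subsets of the m edges, which is what makes exact evaluation infeasible on larger instances and motivates approximation schemes whose runtime is polynomial in the input size.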

Subject Classification

ACM Subject Classification
  • Theory of computation → Database query processing and optimization (theory)
Keywords
  • Probabilistic query evaluation
  • tuple-independent databases
  • approximation


References

  1. Antoine Amarilli. Uniform reliability for unbounded homomorphism-closed graph queries. In ICDT, 2023. URL: https://doi.org/10.4230/LIPIcs.ICDT.2023.14.
  2. Antoine Amarilli, Pierre Bourhis, Mikaël Monet, and Pierre Senellart. Combined tractability of query evaluation via tree automata and cycluits. In ICDT, 2017. URL: https://doi.org/10.4230/LIPIcs.ICDT.2017.6.
  3. Antoine Amarilli, Pierre Bourhis, and Pierre Senellart. Provenance circuits for trees and treelike instances. In ICALP, 2015. URL: https://doi.org/10.1007/978-3-662-47666-6_5.
  4. Antoine Amarilli, Florent Capelli, Mikaël Monet, and Pierre Senellart. Connecting knowledge compilation classes and width parameters. ToCS, 2020. URL: https://doi.org/10.1007/s00224-019-09930-2.
  5. Antoine Amarilli and İsmail İlkan Ceylan. The dichotomy of evaluating homomorphism-closed queries on probabilistic graphs. LMCS, 2022. URL: https://doi.org/10.46298/lmcs-18(1:2)2022.
  6. Antoine Amarilli and Benny Kimelfeld. Uniform reliability of self-join-free conjunctive queries. LMCS, 2022. URL: https://doi.org/10.46298/lmcs-18(4:3)2022.
  7. Antoine Amarilli, Mikaël Monet, and Pierre Senellart. Conjunctive queries on probabilistic graphs: Combined complexity. In PODS, 2017. URL: https://doi.org/10.1145/3034786.3056121.
  8. Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. #NFA admits an FPRAS: efficient enumeration, counting, and uniform generation for logspace classes. J. ACM, 68(6), 2021. Extended version available as arXiv preprint https://arxiv.org/abs/1906.09226 [cs.DS]. URL: https://doi.org/10.1145/3477045.
  9. Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. When is approximate counting for conjunctive queries tractable? In STOC. ACM, 2021. Extended version available as arXiv preprint https://arxiv.org/abs/2005.10029 [cs.DS]. URL: https://doi.org/10.1145/3406325.3451014.
  10. Paul Beame, Jerry Li, Sudeepa Roy, and Dan Suciu. Exact model counting of query expressions: Limitations of propositional methods. TODS, 42(1), 2017. URL: https://doi.org/10.1145/2984632.
  11. Dietmar Berwanger, Anuj Dawar, Paul Hunter, Stephan Kreutzer, and Jan Obdrzálek. The DAG-width of directed graphs. J. Comb. Theory, Ser. B, 102(4), 2012. URL: https://doi.org/10.1016/j.jctb.2012.04.004.
  12. Robert M. Corless, Gaston H. Gonnet, D. E. G. Hare, David J. Jeffrey, and Donald E. Knuth. On the Lambert W function. Adv. Comput. Math., 5(1), 1996. URL: https://doi.org/10.1007/BF02124750.
  13. Nilesh N. Dalvi and Dan Suciu. Efficient query evaluation on probabilistic databases. In VLDB, 2004. URL: https://doi.org/10.1016/B978-012088469-8.50076-0.
  14. Nilesh N. Dalvi and Dan Suciu. The dichotomy of probabilistic inference for unions of conjunctive queries. J. ACM, 59(6), 2012. URL: https://doi.org/10.1145/2395116.2395119.
  15. Adnan Darwiche. Decomposable negation normal form. J. ACM, 48(4), 2001. URL: https://doi.org/10.1145/502090.502091.
  16. Adnan Darwiche and Pierre Marquis. A knowledge compilation map. J. Artif. Intell. Res., 17, 2002. URL: https://doi.org/10.1613/jair.989.
  17. Weiming Feng and Heng Guo. An FPRAS for two terminal reliability in directed acyclic graphs, 2023. URL: https://doi.org/10.48550/arXiv.2310.00938.
  18. Martin Grohe and Dániel Marx. On tree width, bramble size, and expansion. J. Comb. Theory, Ser. B, 99(1), 2009. URL: https://doi.org/10.1016/j.jctb.2008.06.004.
  19. Heng Guo and Mark Jerrum. A polynomial-time approximation algorithm for all-terminal network reliability. SIAM J. Comput., 48(3), 2019. URL: https://doi.org/10.1137/18M1201846.
  20. Tomasz Imielinski and Witold Lipski Jr. Incomplete information in relational databases. J. ACM, 31(4), 1984. URL: https://doi.org/10.1145/1634.1886.
  21. Abhay Kumar Jha and Dan Suciu. Knowledge compilation meets database theory: Compiling queries to decision diagrams. ToCS, 52(3), 2013. URL: https://doi.org/10.1007/s00224-012-9392-5.
  22. Ravi Kannan. Markov chains and polynomial time algorithms. In FOCS. IEEE, 1994. URL: https://doi.org/10.1109/SFCS.1994.365726.
  23. David R. Karger. A randomized fully polynomial time approximation scheme for the all-terminal network reliability problem. SIAM Rev., 43(3), 2001. URL: https://doi.org/10.1137/S0036144501387141.
  24. Batya Kenig and Dan Suciu. A dichotomy for the generalized model counting problem for unions of conjunctive queries. In PODS, 2021. URL: https://doi.org/10.1145/3452021.3458313.
  25. Jingcheng Liu and Pinyan Lu. FPTAS for counting monotone CNF. In SODA. SIAM, 2015. URL: https://doi.org/10.1137/1.9781611973730.101.
  26. Mikaël Monet. Combined complexity of probabilistic query evaluation. (Complexité combinée d'évaluation de requêtes sur des données probabilistes). PhD thesis, University of Paris-Saclay, France, 2018. URL: https://pastel.archives-ouvertes.fr/tel-01980366.
  27. Mikaël Monet. Solving a special case of the intensional vs extensional conjecture in probabilistic databases. In PODS. ACM, 2020. URL: https://doi.org/10.1145/3375395.3387642.
  28. Knot Pipatsrisawat and Adnan Darwiche. New compilation languages based on structured decomposability. In AAAI. AAAI Press, 2008. URL: http://www.aaai.org/Library/AAAI/2008/aaai08-082.php.
  29. J. Scott Provan and Michael O. Ball. The complexity of counting cuts and of computing the probability that a graph is connected. SIAM J. Comput., 12(4), 1983. URL: https://doi.org/10.1137/0212053.
  30. Pierre Senellart. Provenance in databases: Principles and applications. In Reasoning Web, volume 11810 of LNCS. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-31423-1_3.
  31. Allan Sly. Computational transition at the uniqueness threshold. In FOCS. IEEE Computer Society, 2010. URL: https://doi.org/10.1109/FOCS.2010.34.
  32. Dan Suciu, Dan Olteanu, Christopher Ré, and Christoph Koch. Probabilistic Databases. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2011. ISBN: 978-1608456802. URL: https://doi.org/10.2200/S00362ED1V01Y201105DTM016.
  33. Leslie G. Valiant. The complexity of enumeration and reliability problems. SIAM J. Comput., 8(3), 1979. URL: https://doi.org/10.1137/0208032.
  34. Timothy van Bremen and Kuldeep S. Meel. Probabilistic query evaluation: The combined FPRAS landscape. In PODS. ACM, 2023. URL: https://doi.org/10.1145/3584372.3588677.
  35. Moshe Y. Vardi. The complexity of relational query languages (extended abstract). In STOC. ACM, 1982. URL: https://doi.org/10.1145/800070.802186.
  36. Mihalis Yannakakis. Algorithms for acyclic database schemes. In VLDB. IEEE Computer Society, 1981.
  37. Rico Zenklusen and Marco Laumanns. High-confidence estimation of small s-t reliabilities in directed acyclic networks. Networks, 57(4), 2011. URL: https://doi.org/10.1002/net.20412.