A Simple Algorithm for Worst Case Optimal Join and Sampling

Authors: Florent Capelli, Oliver Irwin, Sylvain Salvati






Author Details

Florent Capelli
  • Université d'Artois, CNRS, UMR 8188 - CRIL, F-62300 Lens, France
Oliver Irwin
  • Université de Lille, CNRS, Inria, UMR 9189 - CRIStAL, F-59000 Lille, France
Sylvain Salvati
  • Université de Lille, CNRS, Inria, UMR 9189 - CRIStAL, F-59000 Lille, France

Cite As

Florent Capelli, Oliver Irwin, and Sylvain Salvati. A Simple Algorithm for Worst Case Optimal Join and Sampling. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 23:1-23:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025) https://doi.org/10.4230/LIPIcs.ICDT.2025.23

Abstract

We present an elementary branch and bound algorithm with a simple analysis of why it achieves worst-case optimality for join queries on classes of databases defined by cardinality constraints or by acyclic degree constraints. We then show that, given a reasonable way of recursively estimating upper bounds on the number of answers of the join queries, our algorithm can be turned into an algorithm for uniformly sampling answers with expected running time Õ(UP/OUT), where UP is the upper bound, OUT is the actual number of answers, and Õ(⋅) ignores polylogarithmic factors. Our approach recovers recent results on worst-case optimal join algorithms and sampling in a modular, clean and elementary way.
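The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm and carries none of the paper's worst-case optimality guarantees. It only shows the general shape of (a) a branch and bound join that assigns one variable at a time and prunes partial assignments already ruled out by some relation, and (b) uniform sampling by a random descent that picks each child with probability proportional to an upper bound on the answers below it, and rejects otherwise. All names (`consistent`, `join`, `upper_bound`, `sample_once`, `sample`) and the triangle-query data are hypothetical, and the upper bound used here is the trivial one, |domain| raised to the number of unassigned variables, chosen only because it is clearly at least the sum of the bounds of the children.

```python
import random

def consistent(partial, relations):
    """True if every relation has a tuple agreeing with the partial assignment."""
    for schema, tuples in relations:
        if not any(all(var not in partial or partial[var] == val
                       for var, val in zip(schema, t))
                   for t in tuples):
            return False
    return True

def join(variables, domain, relations, partial=None):
    """Enumerate answers: branch on the next variable, prune dead branches."""
    partial = {} if partial is None else partial
    if not consistent(partial, relations):
        return  # bound step: some relation already rules this branch out
    if len(partial) == len(variables):
        yield dict(partial)
        return
    var = variables[len(partial)]
    for val in domain:
        partial[var] = val
        yield from join(variables, domain, relations, partial)
        del partial[var]

def upper_bound(partial, variables, domain, relations):
    """Trivial bound (an assumption of this sketch): 0 if pruned, else |D|^k."""
    if not consistent(partial, relations):
        return 0
    return len(domain) ** (len(variables) - len(partial))

def sample_once(variables, domain, relations, rng):
    """One random descent; returns an answer or None (rejection)."""
    partial = {}
    ub = upper_bound(partial, variables, domain, relations)
    if ub == 0:
        return None
    while len(partial) < len(variables):
        var = variables[len(partial)]
        r = rng.randrange(ub)  # integer in [0, ub): exact, no rounding issues
        chosen = None
        for val in domain:
            child = {**partial, var: val}
            child_ub = upper_bound(child, variables, domain, relations)
            if r < child_ub:  # child picked with probability child_ub / ub
                chosen = (child, child_ub)
                break
            r -= child_ub
        if chosen is None:
            return None  # leftover probability mass: reject this trial
        partial, ub = chosen
    return partial

def sample(variables, domain, relations, rng, max_trials=10_000):
    """Repeat descents until one succeeds; None if all trials are rejected."""
    for _ in range(max_trials):
        answer = sample_once(variables, domain, relations, rng)
        if answer is not None:
            return answer
    return None

# Hypothetical triangle query R(a, b), S(b, c), T(a, c) over domain {0, 1, 2}.
VARIABLES = ["a", "b", "c"]
DOMAIN = [0, 1, 2]
RELATIONS = [
    (("a", "b"), {(0, 1), (1, 2), (0, 2)}),
    (("b", "c"), {(1, 2), (2, 0)}),
    (("a", "c"), {(0, 2), (1, 0)}),
]

if __name__ == "__main__":
    print(list(join(VARIABLES, DOMAIN, RELATIONS)))
    print(sample(VARIABLES, DOMAIN, RELATIONS, random.Random(0)))
```

In this sketch a fully assigned, consistent node has bound 1, so the probabilities along a root-to-leaf path telescope to 1/UP for every answer, where UP is the bound at the root. Each descent therefore succeeds with probability OUT/UP, and the expected number of descents is UP/OUT, which matches the shape of the bound stated in the abstract. A tighter bound in place of the trivial one would lower the rejection rate.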

Subject Classification

ACM Subject Classification
  • Information systems → Relational database model
  • Theory of computation → Branch-and-bound
Keywords
  • join queries
  • worst-case optimality
  • uniform sampling

