Parallel Acyclic Joins with Canonical Edge Covers

Author Yufei Tao



PDF
Thumbnail PDF

File

LIPIcs.ICDT.2022.9.pdf
  • Filesize: 0.93 MB
  • 19 pages

Document Identifiers

Author Details

Yufei Tao
  • Chinese University of Hong Kong, Hong Kong

Cite AsGet BibTex

Yufei Tao. Parallel Acyclic Joins with Canonical Edge Covers. In 25th International Conference on Database Theory (ICDT 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 220, pp. 9:1-9:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ICDT.2022.9

Abstract

In PODS'21, Hu presented an algorithm in the massively parallel computation (MPC) model that processes any acyclic join with an asymptotically optimal load. In this paper, we present an alternative analysis of her algorithm. The novelty of our analysis is in the revelation of a new mathematical structure - which we name canonical edge cover - for acyclic hypergraphs. We prove non-trivial properties for canonical edge covers that offer us a graph-theoretic perspective about why Hu’s algorithm works.

Subject Classification

ACM Subject Classification
  • Theory of computation → Massively parallel algorithms
Keywords
  • Joins
  • Conjunctive Queries
  • MPC Algorithms
  • Parallel Computing

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. Google Scholar
  2. Foto N. Afrati, Manas R. Joglekar, Christopher Ré, Semih Salihoglu, and Jeffrey D. Ullman. GYM: A multiround distributed join algorithm. In Proceedings of International Conference on Database Theory (ICDT), pages 4:1-4:18, 2017. Google Scholar
  3. Foto N. Afrati and Jeffrey D. Ullman. Optimizing multiway joins in a map-reduce environment. IEEE Transactions on Knowledge and Data Engineering (TKDE), 23(9):1282-1298, 2011. Google Scholar
  4. Kaleb Alway, Eric Blais, and Semih Salihoglu. Box covers and domain orderings for beyond worst-case join processing. In Proceedings of International Conference on Database Theory (ICDT), pages 3:1-3:23, 2021. Google Scholar
  5. Albert Atserias, Martin Grohe, and Daniel Marx. Size bounds and query plans for relational joins. SIAM Journal of Computing, 42(4):1737-1767, 2013. Google Scholar
  6. Paul Beame, Paraschos Koutris, and Dan Suciu. Communication steps for parallel query processing. Journal of the ACM (JACM), 64(6):40:1-40:58, 2017. Google Scholar
  7. Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt. Answering conjunctive queries under updates. In Proceedings of ACM Symposium on Principles of Database Systems (PODS), pages 303-318, 2017. Google Scholar
  8. Xiao Hu. Cover or pack: New upper and lower bounds for massively parallel joins. In Proceedings of ACM Symposium on Principles of Database Systems (PODS), pages 181-198, 2021. Google Scholar
  9. Xiao Hu and Ke Yi. Instance and output optimal parallel algorithms for acyclic joins. In Proceedings of ACM Symposium on Principles of Database Systems (PODS), pages 450-463, 2019. Google Scholar
  10. Xiao Hu, Ke Yi, and Yufei Tao. Output-optimal massively parallel algorithms for similarity joins. ACM Transactions on Database Systems (TODS), 44(2):6:1-6:36, 2019. Google Scholar
  11. Muhammad Idris, Martín Ugarte, and Stijn Vansummeren. The dynamic yannakakis algorithm: Compact and efficient query processing under updates. In Proceedings of ACM Management of Data (SIGMOD), pages 1259-1274. ACM, 2017. Google Scholar
  12. Bas Ketsman and Dan Suciu. A worst-case optimal multi-round algorithm for parallel computation of conjunctive queries. In Proceedings of ACM Symposium on Principles of Database Systems (PODS), pages 417-428, 2017. Google Scholar
  13. Bas Ketsman, Dan Suciu, and Yufei Tao. A near-optimal parallel algorithm for joining binary relations. CoRR, abs/2011.14482, 2020. Google Scholar
  14. Mahmoud Abo Khamis, Hung Q. Ngo, Christopher Ré, and Atri Rudra. Joins via geometric resolutions: Worst-case and beyond. In Proceedings of ACM Symposium on Principles of Database Systems (PODS), pages 213-228, 2015. Google Scholar
  15. Paraschos Koutris, Paul Beame, and Dan Suciu. Worst-case optimal algorithms for parallel query processing. In Proceedings of International Conference on Database Theory (ICDT), pages 8:1-8:18, 2016. Google Scholar
  16. Hung Q. Ngo, Dung T. Nguyen, Christopher Re, and Atri Rudra. Beyond worst-case analysis for joins with minesweeper. In Proceedings of ACM Symposium on Principles of Database Systems (PODS), pages 234-245, 2014. Google Scholar
  17. Hung Q. Ngo, Ely Porat, Christopher Re, and Atri Rudra. Worst-case optimal join algorithms: [extended abstract]. In Proceedings of ACM Symposium on Principles of Database Systems (PODS), pages 37-48, 2012. Google Scholar
  18. Hung Q. Ngo, Ely Porat, Christopher Re, and Atri Rudra. Worst-case optimal join algorithms. Journal of the ACM (JACM), 65(3):16:1-16:40, 2018. Google Scholar
  19. Hung Q. Ngo, Christopher Re, and Atri Rudra. Skew strikes back: new developments in the theory of join algorithms. SIGMOD Rec., 42(4):5-16, 2013. Google Scholar
  20. Miao Qiao and Yufei Tao. Two-attribute skew free, isolated CP theorem, and massively parallel joins. In Proceedings of ACM Symposium on Principles of Database Systems (PODS), pages 166-180, 2021. Google Scholar
  21. Yufei Tao. A simple parallel algorithm for natural joins on binary relations. In Proceedings of International Conference on Database Theory (ICDT), pages 25:1-25:18, 2020. Google Scholar
  22. Yufei Tao. Parallel acyclic joins with canonical edge covers. CoRR, abs/2201.03832, 2022. Google Scholar
  23. Mihalis Yannakakis. Algorithms for acyclic database schemes. In Proceedings of Very Large Data Bases (VLDB), pages 82-94, 1981. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail