eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2020-03-11
25:1
25:18
10.4230/LIPIcs.ICDT.2020.25
article
A Simple Parallel Algorithm for Natural Joins on Binary Relations
Tao, Yufei
1
Chinese University of Hong Kong, Hong Kong
In PODS'17, Ketsman and Suciu gave an algorithm in the MPC model for computing the result of any natural join where every input relation has two attributes. Their algorithm achieves an optimal load of O(m/p^{1/ρ}), where m is the total size of the input relations, p is the number of machines, and ρ is the fractional edge covering number of the join, but requires 7 rounds to finish. This paper presents a simpler algorithm that ensures the same load with 3 rounds (in fact, the second round incurs only a load of O(p²), to transmit certain statistics that assist machine allocation in the last round). Our algorithm is made possible by a new theorem that provides fresh insight into the structure of the problem and brings us closer to understanding the intrinsic reason why joins on binary relations can be settled with load O(m/p^{1/ρ}).
https://drops.dagstuhl.de/storage/00lipics/lipics-vol155-icdt2020/LIPIcs.ICDT.2020.25/LIPIcs.ICDT.2020.25.pdf
Natural Joins
Conjunctive Queries
MPC Algorithms
Parallel Computing