PAC: Computing Join Queries with Semi-Covers
Abstract
An increased and growing interest in large-scale data processing has triggered a demand for specialized algorithms that thrive in massively parallel shared-nothing systems. To answer the question of how to efficiently compute join queries in this setting, a rich line of research has emerged specifically for the Massively Parallel Communication (MPC) model. In the MPC model, algorithms are executed in rounds, with each round consisting of a synchronized communication phase and a separate local computation phase. The main cost measure is the load of the algorithm, defined as the maximum number of messages received by any server in any round.
We study worst-case optimal algorithms for the join query evaluation problem in the constant-round MPC model. In the single-round variant of MPC, the worst-case optimal load for this problem is well understood and algorithms exist that guarantee this load for any join query. In the constant-round variant of MPC, queries can often be computed with a lower load compared to the single-round variant, but the worst-case optimal load is only known for specific classes of join queries, including graph-like and acyclic join queries, and the associated algorithms use very different techniques. In this paper, we propose a new constant-round MPC algorithm for computing join queries. Our algorithm is correct for every join query and its load matches (up to a polylog factor) the worst-case optimal load for at least all join queries that are acyclic or graph-like.
Keywords and phrases: Worst-case optimal load, MPC model, join queries
2012 ACM Subject Classification: Theory of computation → Distributed computing models; Theory of computation → Logic and databases; Theory of computation → Abstract machines
Funding: This work is partially funded by FWO-grant G062721N.
Editors: Sudeepa Roy and Ahmet Kara

1 Introduction
In this paper, we study the query evaluation problem in distributed shared-nothing systems with very large numbers of servers. More precisely, we present a new algorithm for the computation of join queries (also called full self-join free conjunctive queries) in the MPC model. The MPC model is a well-established model within the database theory community for studying distributed algorithms. The model takes a number p as a parameter, representing the number of available servers. Servers have no direct access to each other’s memory, but they are connected through a network and can communicate with each other via private (one-to-one) message-passing. Algorithms for the MPC model are executed in rounds, with each round consisting of a local computation phase followed by a global communication phase. During the local computation phase, every server performs a computation over the data it has locally. During this phase, no messages are sent or received. During the communication phase, every server can send messages to other servers in the network. During this phase, no computation is performed by the servers. The communication phase ends with a global synchronization step after which all messages sent during the communication phase arrive in batch at their destination. We refer to the number of messages (i.e., tuples) received by a server during the communication phase as the server’s load in that round.
The cost of an algorithm in the MPC model is measured in terms of maximum load and number of rounds (the number of synchronization barriers, to be precise). By maximum load, we mean the maximum load of any server during any of the algorithm’s rounds. In this paper, we focus on worst-case optimality results for join queries. That is, we are interested in algorithms that optimize the maximum load w.r.t. given cardinality constraints rather than for the specific database instance over which the query is computed. We assume that initially the database instance D is evenly (and randomly) partitioned over the servers, hence requiring a linear load of N/p with N being the number of tuples in D, which is essentially the best load one can hope for. Clearly, for the join query evaluation problem, a linear load of N/p in the worst case will be unrealistically low for most queries. In this context, we remark that some care is necessary with respect to the number of allowed rounds, as with p rounds one server can learn the entire database with a load as low as N/p by having every other server send all its data in a round, after which computing any query over D becomes trivial. For this reason, it is common to add a restriction on the number of rounds to the model, with single-round and constant-round algorithms being of particular interest in the literature.
An algorithm in the MPC model evaluates a join query correctly when every tuple in the output is produced by at least one server in some round. Several lower bounds for the join query evaluation problem in the MPC model are known. Since many of these bounds [10, 7, 12, 6, 13] rely on the assumption that all relation instances have equal cardinalities N, we will also assume this condition throughout the paper. In other words, |D| = n · N, with n the number of relations of the query and N the cardinality of the individual relation instances. The bounds relevant to this paper are all of the form Ω(N/p^{1/x}), with x a number depending on the structure of the target query, where x varies depending on the considered variant of the problem. It is noteworthy that the bounds are in terms of tuples (and not the number of bits) because we only consider tuple-based algorithms, similar to other works.
Early work mostly focuses on single-round MPC algorithms [2, 4, 5, 3, 10] and in this variant of the model the worst-case load is now well understood. An algorithm of particular importance in this context is HyperCube [2, 4], also called the shares algorithm. The HyperCube algorithm is parameterized with an assignment of values for the query attributes that represent their shares. These shares are often computed based on a fractional vertex cover, which is a function assigning rational weights between 0 and 1 to the attributes of the query such that, for every relation, the sum of weights assigned to its attributes is at least 1 (i.e., the relation is covered). An example of a fractional vertex cover of the query whose hypergraph is shown in Figure 1(a) is given in Figure 1(b). The sum of the assigned weights is called the weight of the fractional vertex cover. The choice of cover influences the load that HyperCube will guarantee. More precisely, the lower the weight of the chosen vertex cover, the lower the load. The optimal (i.e., least) weight is called the fractional vertex covering number, denoted τ*(Q). The fractional vertex cover in Figure 1(b) is an example of an optimal cover, and hence its weight equals τ*(Q). When HyperCube is parameterized with an optimal fractional vertex cover, the algorithm is optimal in the sense that its load matches a lower bound [5] up to a polylog factor and with high probability when the input database is skew-free. Skew-freeness means that the degrees of attribute values (and by extension, partial tuples over sets of attributes) do not exceed certain threshold values. While the precise definition of the threshold values is unimportant for the discussion, we remark that they depend on the considered fractional vertex cover. Intuitively: a higher weight on an attribute requires a lower degree for the attribute’s values. In particular, a weight of 0 means that no constraints apply on the attribute’s values.
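To make the hashing scheme concrete, the following sketch (our own illustration, not code from the literature; the function name and the toy shares are ours) shows how HyperCube assigns a tuple to servers organized in a grid of shares, replicating it along the dimensions of the attributes it does not contain:

```python
import itertools

def hypercube_targets(tuple_vals, attr_names, shares, seed=0):
    """Return the grid coordinates (servers) a tuple must be sent to.

    shares: dict attr -> number of buckets for that attribute
            (the product of all shares equals the number of servers p).
    Attributes missing from the tuple leave their dimension
    unconstrained, so the tuple is replicated along that dimension.
    """
    axes = []
    for a in attr_names:
        if a in tuple_vals:
            # hash the attribute value into one of the attribute's buckets
            axes.append([hash((seed, a, tuple_vals[a])) % shares[a]])
        else:
            axes.append(range(shares[a]))  # replicate along this dimension
    return list(itertools.product(*axes))
```

On a triangle query R(x, y), S(y, z), T(z, x), matching tuples of R, S, and T always meet at some common server, which is the property the algorithm relies on.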
Without constraints on the database, the worst-case optimal load of single-round MPC algorithms is O(N/p^{1/ψ*(Q)}) and an algorithm exists that can achieve it w.h.p. and up to a polylog factor [10]. The number ψ*(Q) is the edge quasi-packing number and equals the maximum fractional vertex covering number over the residuals of the target query. A residual of a query Q is the query obtained after removing some of the attributes of Q (from all relations that have these attributes). Since every query is a residual of itself, it is immediate that τ*(Q) ≤ ψ*(Q). An example of a residual query is given in Figure 1(c). A simple algorithm to compute join queries with load O(N/p^{1/ψ*(Q)}) [10] goes as follows: first partition the database in fragments such that, in each fragment, for every attribute of the query, the values for that attribute either all have a low degree or all have a high degree (with respect to some threshold value). Given a specific such database fragment, say with H the set of heavy attributes, HyperCube is applied parameterized with a fractional vertex cover that assigns weight 0 to all the attributes in H. Since the threshold values guarantee that the fragment is skew-free w.r.t. any fractional vertex cover assigning weight 0 to the attributes in H, the load of this computation is decided by the weight of the considered fractional vertex cover, and is thus at best dependent on the fractional vertex covering number of the residual of Q based on H. Since the total number of considered fragments is constant (under data complexity), the overall algorithm computes the target query over all fragments in parallel using the same set of servers.
The main advantage of using multiple rounds lies in the key insight that the evaluation of some (residual) queries can be simplified by computing semi-joins in one round with linear load. For example, the evaluation of the residual query in Figure 1(c) over some fragment can be simplified into the evaluation of the simpler query in the same figure (over a modification of the fragment with similar size) by applying semi-joins, hence lowering the load compared to the single-round algorithm. Leveraging this technique, multi-round join algorithms rely on the following reduction procedure: a set H of heavy attributes is chosen based on some threshold values. Now, suppose there are k possible heavy tuples over the attributes of H in the considered fragment. The set of available servers gets equally divided into k (disjoint) groups such that each heavy tuple t gets a dedicated group of p/k servers that is responsible for computing the query over the considered fragment only when the attributes of H are fixed as per the values in t. The attributes of H are then dropped from the query and, over every group of servers, semi-joins are applied. How to proceed and evaluate the simplified query afterwards varies depending on the algorithm under consideration.
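The two local building blocks of this reduction can be sketched as follows (a toy, single-machine illustration under our own naming; a real MPC implementation distributes these steps over the server groups):

```python
def fix_heavy_values(rel, heavy_vals):
    """Restrict a relation (list of dict tuples) to the tuples agreeing
    with the chosen heavy tuple heavy_vals, then drop the heavy
    attributes, which have become constants."""
    out = []
    for t in rel:
        if all(t.get(a, v) == v for a, v in heavy_vals.items()):
            out.append({a: x for a, x in t.items() if a not in heavy_vals})
    return out

def semijoin(rel, other):
    """Keep the tuples of rel that agree with some tuple of other
    on their common attributes."""
    common = lambda t, s: all(t[a] == s[a] for a in t.keys() & s.keys())
    return [t for t in rel if any(common(t, s) for s in other)]
```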
For the multi-round variant, a few results are known and several questions remain open. It is known that the worst-case optimal load is at least Ω(N/p^{1/ρ*(Q)}), where ρ*(Q) is the fractional edge covering number of the query. This lower bound was obtained in [10], and for some join queries it is tight. This is the case for all graph-like queries (i.e., join queries involving relations with at most two attributes) [10, 7, 12, 11, 8] as well as for all join queries that are acyclic [6, 13]. For a while, it was thought that O(N/p^{1/ρ*(Q)}) could be the optimal load in general, but this idea was recently debunked. In particular, it is shown in [6] that a strictly higher lower bound holds for a specific class of cyclic queries over relations with an arbitrary number of attributes. We remark that both graph-like and acyclic join queries have an important property in common that fails for the class of queries considered in [6]. (This property for reduced acyclic queries was only conjectured in [6]; interestingly, our construction in Section 6.2 gives a proof for this conjecture as a side result.) It is worth noting that the worst-case optimal algorithms proposed in the literature for either the graph-like or the acyclic queries cannot straightforwardly be generalized to arbitrary join queries. Thus, it remained open whether there exists a unified multi-round MPC algorithm that can work on arbitrary join queries. Later, in [11] an algorithm was introduced that can compute all join queries with a load based on a newly specified number. This number strictly exceeds ψ*(Q) for some queries, and hence on such queries, using more rounds incurs more load. For this reason and for brevity, we no longer mention this algorithm in the rest of the paper and we will compare it against ours in future work.
In this paper, we propose a new distributed join algorithm whose load is bounded by O(N/p^{1/pac(Q)}) up to a polylog factor and with high probability, where pac(Q) is a new query-related number that we call the PAC number. Our algorithm runs in three rounds and is relatively simple, yet it is powerful enough to show the following:
1. We show that pac(Q) ≤ ψ*(Q) for all join queries. Moreover, we show that pac(Q) = ρ*(Q) for acyclic and graph-like join queries, i.e., our algorithm is worst-case optimal for all classes of queries for which the worst-case optimal load is currently known (with one exception discussed in Section 7).
2. We also show that pac(Q) = ρ*(Q) (and our algorithm is hence worst-case optimal) for some join queries that are not considered by any previous algorithms.
PAC refers to three main concepts (Patches, Anchors, and semi-Covers) that our number, pac(Q), is based on and our algorithm is parameterized with. A high-level overview: both patches and anchors are sets of sets of attributes from the query, and semi-covers are a relaxation of normal fractional vertex covers in which only a subset of the relations of the query needs to be covered. Similar to before, our algorithm evaluates a query over a database by considering the different fragments separately. Over a particular fragment, the algorithm is parameterized with a choice of patches, anchors, and a semi-cover. The algorithm then generalizes the aforementioned reduction procedure in several aspects. The attributes in the chosen patches take the role of the set H of heavy attributes. This results in evenly dividing the set of available servers among the possible heavy tuples over these attributes. Before applying the semi-joins, the algorithm performs the following extra step: every group of servers, for some heavy tuple, gets unevenly divided based on the attributes in the chosen anchors, such that the higher the degree of the values in the anchors, the larger the subgroup of servers that gets assigned to them. The semi-join step gets applied in every subgroup separately, and afterwards the simplified query gets evaluated using HyperCube parameterized with the chosen semi-cover and extra shares to cover the rest of the query.
The definition of pac(Q) is more complicated than the earlier considered numbers, but it links quite directly to the underlying algorithm, similar to how the worst-case optimal single-round MPC algorithm relates to the definition of ψ*(Q). While this number is not quite as elegant as τ*(Q), ρ*(Q), or ψ*(Q), we believe that it brings us another step closer to a worst-case optimal algorithm for all join queries.
The paper is organized as follows: we give the preliminaries in Section 2. In Section 3 we introduce the PAC number, which is the central number in this paper. In Section 4 and Section 5 we introduce algorithmic techniques followed by the algorithm underlying our main theorem. In Section 6 we relate pac(Q) to τ*(Q), ρ*(Q), and ψ*(Q). We conclude in Section 7. (Due to space limitations, most of the proofs are omitted and will appear in the full version of the paper.)
2 Preliminaries
For a set S, P⁺(S) denotes the set of all non-empty subsets of S. We write [0, 1] as an abbreviation for the set of rationals between (and including) 0 and 1.
Relations and join queries
Let rels and attrs be respectively an infinite domain of relation names and attribute names. For convenience of notation, specifically to avoid non-determinacy in some of our notations, we assume a total order over rels.
A join query Q is a pair consisting of a finite set of relation names and a mapping att associating each relation name to a finite set of attribute names from attrs. (Our definition of a join query disallows self-joins and repeated occurrences of attributes in relations, which are restrictions also imposed by all previous papers on the topic.)
For a join query Q, we write attrs(Q) and ins(Q) (ins stands for incidence (attribute) subsets) to denote, respectively, the set of all attributes appearing in the relations of Q, and the set of sets of attributes of Q appearing together in at least one relation of Q. For consistency, we also write attrs(R) instead of att(R).
Two attributes a, b from attrs(Q) are called adjacent if they are different and there is a relation R with {a, b} ⊆ attrs(R). Moreover, for two different relations R, R′ of Q, we say that R is reducible into R′ subject to a set of attributes U, if attrs(R) \ U ⊆ attrs(R′). When U = ∅, we simply say that R is reducible into R′. Finally, we say that Q is reduced if it has no reducible relations. When Q is not reduced, we will write red(Q) to denote the reduced query of Q, which is the query obtained from Q by removing all reducible relations. (We assume that reductions are done using some deterministic procedure, making use of the total order over rels to break ties, such that red(Q) defines a unique query.)
Tuples and instances
A tuple t over a finite set U of attributes is a function from U to dom. We will use the term U-tuple to refer to a tuple with the specific domain U (in other words, the tuple is precisely defined over the attributes of U). A finite set of tuples over the same set of attributes is called a relation instance over that set. For a tuple t over U and a subset V ⊆ U of its attributes, we write t[V] to denote the tuple over V with t[V](a) = t(a) for every a ∈ V, in which case we say that t is an extension of t[V]. By convention, we consider every tuple an extension of itself and of the empty tuple. We say that a U-tuple t and a V-tuple s are consistent if t[U ∩ V] = s[U ∩ V].
A (database) instance D for a join query Q consists of a mapping associating every relation name R of Q with a relation instance D(R) over attrs(R). We write |D| to denote the total number of tuples in D.
For a tuple t over a set of attributes V, we use the notations deg_{D(R)}(t), deg_D(t), and D(R)[V], denoting, respectively, the number of tuples in D(R) that are extensions of t; the number of tuples in D that are extensions of t; and the set of all V-tuples for which an extension in D(R) exists.
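These notions translate directly into code; the following sketch (names are ours) computes the degree of a partial tuple within a relation instance and the projection of an instance onto a set of attributes:

```python
def degree(rel, t):
    """Number of tuples in the relation instance rel (a list of dicts)
    that are extensions of the partial tuple t (also a dict)."""
    return sum(all(s.get(a) == v for a, v in t.items()) for s in rel)

def project(rel, attrs):
    """Set of attrs-tuples for which an extension in rel exists,
    encoded as sorted (attribute, value) pairs to stay hashable."""
    return {tuple(sorted((a, s[a]) for a in attrs)) for s in rel}
```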
Join query semantics
Given an instance D for some join query Q and a tuple t over attrs(Q), we say that t is consistent with D if, for every relation R of Q, D(R) contains a tuple that is consistent with t.
The output Q(D) of a join query Q over an instance D can then be defined as the set of all attrs(Q)-tuples that are consistent with D.
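This declarative definition yields a naive (exponential) reference evaluator; the sketch below (our own illustration) enumerates candidate output tuples over the active domain and keeps those consistent with every relation instance:

```python
import itertools

def join_output(query, db):
    """All attrs(Q)-tuples consistent with every relation instance.

    query: dict mapping relation name -> list of attribute names
    db:    dict mapping relation name -> list of tuples (dicts)
    """
    all_attrs = sorted({a for attrs in query.values() for a in attrs})
    # candidate values per attribute, taken from the active domain
    adom = {a: {t[a] for R, attrs in query.items() for t in db[R] if a in attrs}
            for a in all_attrs}
    out = []
    for vals in itertools.product(*(sorted(adom[a]) for a in all_attrs)):
        cand = dict(zip(all_attrs, vals))
        # consistent: every relation contains a tuple agreeing with cand
        if all(any(all(cand[a] == t[a] for a in attrs) for t in db[R])
               for R, attrs in query.items()):
            out.append(cand)
    return out
```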
Weight mappings
In this paper we will often consider weight mappings for join queries Q, which are functions w over some set of attributes of Q associating a non-negative rational weight w(a) to every attribute a in that set. Remark that we do not require a weight mapping to assign weights to all attributes of the query. Henceforth, given a particular weight mapping w for Q, we will use dom(w) to denote the set of attributes w is defined over. For simplicity of notation, we often write w(U), with U a set of attributes, to denote the sum of the respective attribute weights. We remark that w(U) can be larger than 1. Moreover, we associate a cost to a weight mapping w, defined as cost(w) = w(dom(w)).
We say that a weight mapping w for Q covers a relation R if w(attrs(R) ∩ dom(w)) ≥ 1. Given a weight mapping w and a join query Q (it is possible that w is not defined for all attributes of Q), we define the covered part of Q to be the join query consisting of precisely the relations of Q that are covered by w.
Consistent with the literature, we call w a fractional vertex cover for Q if dom(w) = attrs(Q) and w covers every relation of Q (in which case the covered part is Q itself and we say that w is a cover for Q). (We note that every fractional vertex cover for a query Q is a cover for Q and that every cover for Q can be extended to a fractional vertex cover of Q with the same cost by assigning weight 0 to the attributes in attrs(Q) \ dom(w).)
Residual queries
For a join query Q and a subset U ⊆ attrs(Q), the residual query of Q w.r.t. U is the join query obtained from Q by removing the attributes of U from all relations.
In Figure 1(c), we give an example of a residual query. Notice that it also exemplifies that the residual query of a reduced query is not necessarily reduced.
3 Towards PAC
We define the PAC number for join queries. The PAC number is defined relative to a sequence of rationals that we call a spectrum. More precisely, a spectrum S = (s_1, . . . , s_k) is a finite sequence of strictly increasing weights from [0, 1] with s_1 = 0 and s_k = 1.
3.1 Configurations
A configuration c of a join query Q is a monotone mapping that assigns a weight to every set of attributes in ins(Q), that is, to every set of attributes appearing together in at least one relation of Q. By monotone we mean that c(U) ≤ c(V) for sets U, V ∈ ins(Q) with U ⊆ V. In addition, we require c(attrs(R)) = 1 for relations R for which no relation R′ with attrs(R) ⊊ attrs(R′) exists. We say that c adheres to S if its image contains only weights occurring in S, i.e., c(U) ∈ S for every U ∈ ins(Q).
We will use configurations as a type of histogram for database fragments. More precisely, c(U) = s will mean that the number of tuples extending any specific U-tuple is below a threshold value based on s and above a threshold value based on the element following s in S (if such an element exists). The precise threshold values are unimportant for the moment and will be defined in Section 4.
We call a set of attributes U ∈ ins(Q) heavy w.r.t. c if c(U) = 0, and we say that an individual attribute a is heavy if {a} is heavy. We say that a relation is heavy w.r.t. c if all its attributes are heavy w.r.t. c. We denote by heavy(c) the set of all heavy attributes of Q w.r.t. c, and similarly for heavy relations.
Example 1.
An example of a configuration for is depicted in Figure 2. For this configuration, , and . The sets , of are the (only) heavy sets of attributes for w.r.t. .
Under the presence of a configuration c, we will not be interested in arbitrary weight mappings for Q, but only in those that are compatible with c in the following sense: a weight mapping w for Q is compatible with c if w(U) ≤ c(U) for every U ∈ ins(Q) with U ⊆ dom(w).
Example 2.
In our algorithm, compatible weight mappings will be used like fractional vertex covers in HyperCube over skew-free databases (a formal argument is given in Section 4.4). Since compatible weight mappings covering all relations of the query do not always exist, we next introduce two new concepts, anchors and patches, that, intuitively, allow us to cover the remaining relations of the query. In fact, even when compatible weight mappings exist that cover the entire query, they are not necessarily the best choice cost-wise.
3.2 Anchors
We call a pair (A, B) of non-empty sets of attributes of Q an anchor for Q and refer to the relations R with A ∪ B ⊆ attrs(R) as the relations anchored by the anchor. From now on, we will only consider anchors for which at least one anchored relation exists. We remark that A and B do not need to be disjoint.
For a set Λ of anchors for Q, we write rels(Λ) to denote the set of all anchored relations, that is, the union over the anchors in Λ of the relations they anchor. We write attrs(Λ) to denote the union of the attribute sets of the anchors in Λ.
We say that an anchor (A, B) of Q is compatible with c if A and B satisfy the degree constraints that c imposes on them, and we call a (possibly empty) set of anchors for Q that are compatible with c an anchoring of Q w.r.t. c. Moreover, we associate a cost with such an anchoring.
Example 3.
Set is an example anchoring of w.r.t. with and . Sets and are examples of anchors for not compatible with .
3.3 Patches
Formally, a patch for Q (w.r.t. c) is a subset P of ins(Q) whose elements are all heavy and pairwise disjoint. We write attrs(P) for the set of all attributes occurring in elements of P, i.e., the union of the sets in P. We also associate a cost to every element U of P, based on the weight following c(U) in the considered spectrum; we recall that, by definition of a patch, c(U) = 0 ≠ 1, hence this following weight indeed always exists. The cost of a patch P, denoted cost(P), is then defined as the sum of the costs of its elements.
Example 4.
For an example of a patch for , consider the set . Then, for w.r.t. adhering to spectrum , we have . However, if the considered spectrum had weights , would be . Furthermore, we remark that sets and cannot be chosen as patches for w.r.t. because their elements are not disjoint (for the former) or not all heavy (for the latter).
3.4 Semi-Covers and Solutions
Finally, we combine all concepts in the notion of a solution for a query Q w.r.t. a configuration c. Before giving the definition, we make formal what we expect from a weight mapping that does not cover all relations:
Definition 5.
We call a weight mapping w a semi-cover for Q w.r.t. a configuration c of Q, a set U of attributes, and a set F of relations, if w is compatible with c and, for every relation R of Q, there is a relation R′ such that R is reducible into R′ subject to U, and R′ is either covered by w or contained in F.
Example 6.
Remark that every fractional vertex cover for Q is a semi-cover for Q w.r.t. any configuration of Q, any set of attributes, and any set of relations, because it covers every relation of Q.
For a more interesting example, consider the weight mapping w in Figure 3, which covers only one relation of the query. Choosing the set of attributes and the set of relations appropriately, w is nevertheless a semi-cover: every remaining relation is reducible, subject to the chosen attributes, into a relation that is either covered by w or contained in the chosen set of relations.
A solution for Q w.r.t. c is a triple consisting of an anchoring of Q w.r.t. c, a patch for Q w.r.t. c, and a compatible semi-cover for Q w.r.t. c, the set of attributes induced by the patch and the anchoring, and the set of anchored relations. The cost we associate with a solution is defined as the sum of the costs of its three components.
3.5 The PAC Number
As mentioned earlier, configurations correspond to database fragments. We will show later that the cost of a solution w.r.t. a given configuration is the cost of computing the query over the corresponding fragment using the different components of the solution. We can now define the PAC number for a join query Q w.r.t. a spectrum S (to be precise, pac(Q) should be parameterized with the considered spectrum; we omit this parameterization as it would significantly overload the notation throughout the paper, and except where specified differently all results hold for any choice of S) as
pac(Q) = max_c min_σ cost(σ),
with c ranging over all configurations of Q adhering to S and σ ranging over all possible solutions for Q w.r.t. c. Intuitively, the formula of the PAC number states that the cost of a query is determined (and bounded) by its most computationally difficult configuration. Since there can be multiple solutions w.r.t. the same configuration, we are only interested in a solution with minimal cost.
Example 7.
The reader may wonder whether the complexity of the PAC number is justified and whether all of its concepts are necessary. The answer to this question is open, but we will show in the full version of the paper that all obvious simplifications of our number, like restricting or forbidding anchors, considering only the trivial spectrum (0, 1), etc., give a number that is strictly less tight than pac(Q) for some join queries.
The remainder of the paper is devoted to showing the theorem below and how pac(Q) relates to the existing numbers τ*(Q), ρ*(Q), and ψ*(Q).
Theorem 8.
Let Q be a join query, D an instance, and S a fixed spectrum. Then, Q can be computed over D using p servers in three rounds with load O(N/p^{1/pac(Q)}), up to a polylog factor, with high probability.
4 General Techniques
In our algorithm, we will make use of techniques and insights that follow directly or with minor modifications from classical results in the literature. In this section, we state these results.
4.1 Database Fragments
Like several of the earlier proposed algorithms in the literature [10, 7, 12, 8], our algorithm will first divide the input database instance into sub-databases. This is where configurations adhering to a spectrum become useful: the consecutive weights in a spectrum define intervals of threshold values that can be used to partition sets of tuples based on their degrees. With this intuition in place, a configuration c adhering to S thus pinpoints an interval for every set U ∈ ins(Q). The intervals implied by a spectrum are based on the weights in S, but also take N and p into account. Formally, a tuple is in the interval of a weight s if its degree is below (or equal to) the threshold value for s and (if a following weight exists) strictly above the threshold value for the weight following s in S.
We can now define a database instance that, intuitively, contains all tuples from D that respect all degree constraints imposed by c. Henceforth, we will refer to this instance as the fragment of D induced by c. We remark that some care must be taken when interpreting the imposed degree constraints in the context of the fragment, as they are defined w.r.t. D. For example, for a tuple in the fragment it does not necessarily follow that its degree within the fragment lies in the interval pinpointed by c.
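The bucketing by degree intervals can be sketched as follows (a toy illustration with arbitrary thresholds and names of our own choosing; the actual algorithm buckets sets of attributes, not a single one):

```python
from collections import Counter, defaultdict

def fragment_by_degree(rel, attr, thresholds):
    """Split a relation (list of dict tuples) into fragments keyed by
    the index of the degree interval (lo, hi] that the degree of each
    tuple's value for attr falls into. thresholds is an increasing
    list of upper bounds, the last one at least the maximum degree."""
    deg = Counter(t[attr] for t in rel)

    def interval(d):
        for i, hi in enumerate(thresholds):
            if d <= hi:
                return i
        raise ValueError("degree exceeds largest threshold")

    frags = defaultdict(list)
    for t in rel:
        frags[interval(deg[t[attr]])].append(t)
    return dict(frags)
```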
The next proposition, which states that the output of over is the same as the combined output of over the different fragments of , justifies our definition:
Proposition 9.
For a join query Q, an instance D for Q, and a spectrum S, the output of Q over D equals the union of the outputs of Q over the fragments of D induced by the configurations for Q adhering to S. Moreover, when Q is fixed, the number of configurations adhering to S is constant.
4.2 Tuple-Restricted Database Fragments
Sometimes we will consider a value assignment t for some of the attributes of Q (defined as a tuple over a subset of attrs(Q)), and then be interested in the subinstance of an instance D for Q containing all and only those tuples of D consistent with t. Henceforth, we refer to this instance as the restriction of D to t. Accordingly, we can easily see that computing the output tuples consistent with t amounts to the following:
Proposition 10.
Let Q be a join query, D an instance for Q, and t a tuple over a subset of attrs(Q). Then the output of Q over the restriction of D to t consists of precisely the output tuples of Q over D that are consistent with t.
4.3 Semi-Joins and Reductions
It is well known that semi-joins and intersections can be computed with a single round of communication using p servers with a load not exceeding O(N/p) w.h.p., and that guarded join queries (i.e., join queries having a single relation into which all the other relations of the query are reducible) can be computed in precisely two rounds with the same load guarantees, independent of the number of reducible relations [10]. The latter follows from the observation that joining a reducible relation with its guard is a special type of semi-join and that all these semi-joins can be computed at the same time in one round. Since after this step there will exist many copies of the guard relation (one for each reducible relation), a second step is needed to compute the intersection of these copies. Formally,
Proposition 11.
A database D for a join query Q can be translated to a database for the reduced query of Q using p servers in two rounds with load not exceeding O(N/p) w.h.p.
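The two-step reduction can be sketched sequentially as follows (our own illustration; in the MPC setting the semi-joins of the first step run in one distributed round and the intersection in a second):

```python
def reduce_guarded(guard, others):
    """Semi-join the guard relation with each of the other relations,
    then intersect the resulting copies of the guard. Tuples are dicts,
    encoded as sorted item-tuples to allow set operations."""
    consistent = lambda t, s: all(t[a] == s[a] for a in t.keys() & s.keys())
    key = lambda t: tuple(sorted(t.items()))
    # step 1: one semi-joined copy of the guard per reducible relation
    copies = [{key(t) for t in guard if any(consistent(t, s) for s in rel)}
              for rel in others]
    # step 2: intersect all copies
    surviving = set.intersection(*copies) if copies else {key(t) for t in guard}
    return [dict(k) for k in sorted(surviving)]
```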
4.4 HyperCube-Style Query Evaluation
Compatible weight mappings for Q w.r.t. a configuration c are essentially generalizations of the weight mappings used by the HyperCube algorithm [9] together with the degree constraints associated with so-called skew-free databases w.r.t. these weight mappings, as made formal by the proposition below.
Proposition 12.
Let Q be a join query, D an instance for Q, S a spectrum, c a configuration of Q adhering to S, and w a compatible weight mapping for Q w.r.t. c that covers every relation of Q. Then, Q can be computed over the fragment of D induced by c using p servers in one round with load O(N/p^{1/cost(w)}) w.h.p.
4.5 Join Query Decompositions
Some join queries can be decomposed into a set of subqueries and then computed in one round by computing each subquery, in that round, over its own share of servers. This is formulated in Proposition 13. For this, we say that a set of queries {Q_1, . . . , Q_k} is a decomposition of a join query Q if the relations of Q_1, . . . , Q_k together are exactly the relations of Q and no relation occurs in more than one of the subqueries.
Proposition 13.
Let Q be a join query, D an instance for Q, and {Q_1, . . . , Q_k} a decomposition of Q. Suppose that, for every i, query Q_i is computable over D using p_i servers in one round with load L_i. Then, Q can be computed over D using p_1 · · · p_k servers in one round with load O(L_1 + · · · + L_k) w.h.p.
The observation follows from viewing the evaluation of a join query as a Cartesian product of the queries in its decomposition. When distinct queries share common attributes, the Cartesian product returns copies of those attributes (one for each occurrence in a query).
The formal argument of the above load analysis is based on a result in [8, Lemma 3.2], in which this result is proven for decompositions with join queries not sharing any attributes. It follows immediately, however, that this technique also works if the involved queries share attributes, only requiring an additional local computation step during which tuples are combined taking the values of common attributes into account (see [6] for further discussion).
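The extra local combination step can be sketched as follows (our own illustration): take the Cartesian product of the per-subquery outputs and keep only the combinations that agree on shared attributes:

```python
import itertools

def combine(sub_outputs):
    """Merge per-subquery outputs (lists of dict tuples) into output
    tuples of the full query: Cartesian product, keeping only the
    combinations whose tuples agree on all shared attributes."""
    out = []
    for combo in itertools.product(*sub_outputs):
        merged, ok = {}, True
        for t in combo:
            for a, v in t.items():
                if merged.setdefault(a, v) != v:
                    ok = False  # conflicting values on a shared attribute
        if ok:
            out.append(merged)
    return out
```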
5 Algorithm
In this section, we show how to compute a join query Q over a specific fragment of the database instance using p servers, based on some spectrum S and a configuration c for Q adhering to S, using a specific solution for Q w.r.t. c. More precisely, this section is entirely devoted to showing the following:
Theorem 14.
Let Q be a join query, D an instance, and c a configuration for Q adhering to a spectrum S. For any solution σ for Q w.r.t. c, we can compute Q over the fragment of D induced by c using p servers in three rounds with load O(N/p^{1/cost(σ)}) with high probability.
It is worth mentioning that the number “three” refers to the number of communication rounds of the described algorithm. For the sake of precision, we remark that extra rounds are needed if the required degree information is not readily present. It is not uncommon to assume that this information is available for each server at the start of the algorithm [7].
5.1 Server Organization
Towards the algorithm proving Theorem 14, we first show how the servers are organized. We partition the available p servers into disjoint groups, with equally many groups as there are heavy tuples over the patch attributes in the considered fragment.
For each such heavy tuple, the associated group consists of a number of servers proportional to the total number of tuples in the fragment having matching values for that heavy tuple.
with (intuitively denoting the total number of -tuples having values in fragment ). The next lemma shows some desirable properties of the server organization (a proof is given in Appendix B.1):
Lemma 15.
-
1.
; and
-
2.
.
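As an illustration of this server organization, the sketch below (all names and the rounding scheme are our own, hypothetical choices, not the paper's construction) splits the available servers into disjoint groups whose sizes are proportional to the number of tuples associated with each heavy tuple:

```python
# Hypothetical sketch of the server organization: given, for each heavy
# tuple h, a count n[h] of matching tuples in the fragment, the p servers
# are split into disjoint groups whose sizes are proportional to n[h].
# All names (p, counts, groups) are illustrative.

def allocate_groups(p, counts):
    """Assign p servers to one disjoint group per tuple, proportional to counts."""
    total = sum(counts.values())
    groups, next_server = {}, 0
    for h, n in counts.items():
        size = max(1, p * n // total)          # at least one server per group
        groups[h] = list(range(next_server, next_server + size))
        next_server += size
    return groups

groups = allocate_groups(12, {("a",): 6, ("b",): 3, ("c",): 3})
print(groups)  # {('a',): [0, 1, 2, 3, 4, 5], ('b',): [6, 7, 8], ('c',): [9, 10, 11]}
```

The groups are pairwise disjoint by construction, mirroring the property stated in Lemma 15.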
5.2 Communication and Computation
Our algorithm proceeds by computing for each instance separately, using its designated group of servers. The correctness of this approach follows directly from Lemma 15 (1), Proposition 10, and the following lemma.
Lemma 16.
.
Now the computation of over for a specific goes as follows. Here, we assume w.l.o.g. that all servers are aware of the tuples in as well as their associated server groups.
- Step 1:
-
Instance is transformed into instance for by dropping the attributes of from all relations in and by computing semi-joins as explained in Section 4.3, using the servers.
- Step 2:
-
The join query is now viewed as a set of join queries , (clearly a decomposition of ), with a query for every anchor having precisely the relations of anchored by ; with the query consisting of all relations from remaining in ; and with . In this step, we broadcast all tuples from relations in to all servers of the group. Moreover, we hash-partition the relations of over the share using HyperCube parameterized with , and, for each anchor , we hash-partition the relations of the query over share using the values of attributes (pretending they are a single value). We remark that it is not an issue if this requires applying hash functions over attributes from (which are not present in nor ), as in that case there is only one value to consider, namely .
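The hash-partitioning in Step 2 follows the HyperCube pattern. The sketch below is a minimal, hypothetical rendering of that pattern (the share assignment, names, and hash functions are illustrative, not the paper's): each attribute receives a share, the shares multiply up to the number of servers, and a tuple missing some of the share attributes is replicated along the corresponding dimensions of the server grid:

```python
# Illustrative HyperCube-style hash-partitioning. Each attribute a has a
# share p_a, with the product of shares equal to the number of servers.
# A tuple is sent to every grid coordinate consistent with its hashed values.

import hashlib
import itertools

def h(attr, value, share):
    """A per-attribute hash function into [0, share)."""
    digest = hashlib.sha256(f"{attr}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % share

def destinations(tup, shares):
    """Server coordinates a tuple is replicated to.

    tup: dict attr -> value; shares: dict attr -> share (insertion-ordered).
    Attributes missing from the tuple are left unconstrained, so the tuple
    is broadcast along those dimensions of the server grid."""
    axes = []
    for attr, share in shares.items():
        if attr in tup:
            axes.append([h(attr, tup[attr], share)])   # fixed coordinate
        else:
            axes.append(range(share))                  # replicate along this axis
    return list(itertools.product(*axes))

shares = {"x": 2, "y": 3, "z": 2}               # 2 * 3 * 2 = 12 servers
print(destinations({"x": 1, "y": 7}, shares))   # two coordinates, one per z-value
```

A tuple containing all share attributes lands on exactly one server, while tuples of relations covering only some attributes are replicated just enough to meet every potential join partner.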
5.3 Load Analysis
We argue that the outlined algorithm computes over using servers with the desired load. A key aspect of the analysis is that our algorithm only requires communication to servers in the group . In other words, although this algorithm will be applied in parallel for all tuples of over their respective database fragments, the load of the servers is determined entirely by the specific run of the algorithm for the tuple they are responsible for.
In the remainder of this subsection, we thus focus on the load of the individual Steps 1 and 2. For Step 1, the load follows directly from Proposition 11 and Lemma 15. We also remark that the result of a semi-join (and, by extension, the size of the relations in ) does not exceed the size of the relations it is based on (the size of the relations in , respectively). For Step 2, the desired load follows from Proposition 13 and the fact that the individual queries as well as and can be computed with the desired load over their respective shares.
Lemma 17.
The load of computing each of the following in one round is w.h.p.
-
1.
using server;
-
2.
using servers; and
-
3.
using servers, for each anchor .
6 PAC in Relation to Other Numbers
In this section, we show how relates to the different existing numbers , , and that have been considered in the literature. The main result of this section is the following:
Theorem 18.
For any choice of spectrum:
-
(a)
for all join queries ; and
-
(b)
for reduced join queries .
For any spectrum including weight :
-
(i)
for join queries that are acyclic or graph-like;
-
(ii)
for some join queries that are neither acyclic nor graph-like.
Property is a direct consequence of the correctness of our algorithm (see Theorem 8) and the known lower bound on the load of such algorithms based on [10]. Properties (for reduced queries ) and follow from Lemmas 19 and 20 below. Intuitively, Lemma 19 shows that a solution w.r.t. the configuration that has no heavy attribute sets always makes use of a solution whose semi-cover is also a cover for , thus with .
Lemma 19.
Let be a configuration for some join query with , for every . Then, for every solution for w.r.t. , we have the following properties:
-
1.
; and
-
2.
is a cover for .
To see , it suffices to consider a minimal spectrum and observe that, for every configuration adhering to , a fractional vertex cover for is a semi-cover for compatible with .
Lemma 20.
Let be a fractional vertex cover for . Then, is a solution for w.r.t. . In other words:
-
1.
is a semi-cover for w.r.t. , , and ; and
-
2.
is compatible with .
For graph-like and acyclic join queries (which are the main classes of join queries for which the worst-case optimal load is known), we have an even stronger result, namely that the load of our algorithm matches the optimal load.
Theorem 21.
For join queries that are graph-like or acyclic we have that .
In the next two subsections, we show why for graph-like and acyclic join queries, respectively.
As for item (ii) in Theorem 18, it is enough to give an example of a join query showing that our algorithm is also optimal for join queries that are neither acyclic nor graph-like. A simple example is
Notice that and . With spectrum , we obtain that , which is clearly the worst-case optimal cost for this join query. Verifying that yields this cost is left as an exercise for the reader. We remark that and thus that our algorithm has a strictly better load than any previously known algorithm that could compute this query.
6.1 Optimality for Graph-like Joins
To show for reduced graph-like join queries, it is sufficient to consider spectrum . The remainder of this section will be devoted to showing for an arbitrary configuration adhering to that there exists a solution with .
For the construction, we need some extra terminology. Based on , we call an attribute
-
relevant if it occurs in a relation ;
-
heavy if it is relevant and ;
-
isolated if and all adjacent attributes are heavy;
-
light if it is relevant, non-isolated, and .
We denote the set of isolated attributes by , the light attributes by , and the heavy attributes by .
For the construction of , we first choose an anchoring for . For this, let us call a relation an anchor candidate if it has an attribute from (and therefore also an attribute from ). Now let be any maximal set of anchor candidates with the property that the relations in share no attributes. From we then construct the desired anchoring of . For convenience, we refer by to the isolated attributes occurring in and by to the remaining attributes in . As patch, we choose . Finally, as weight mapping we take , assigning weight to all attributes in and weight to the attributes in . The desired result now follows from the next proposition.
Proposition 22.
-
1.
is compatible with ;
-
2.
is a semi-cover for w.r.t. , , and ;
-
3.
.
6.2 Optimality for Acyclic Joins
To show for reduced acyclic join queries , it is sufficient to consider spectrum . In the remainder of the section, we show how, for an arbitrary configuration adhering to , a solution for and can be constructed with .
Before going into the details of the construction, we remark that every reduced acyclic join query has an integral fractional edge cover [6]. That is, there is a set with the property that and every attribute appears in at least one relation in EC. Since is acyclic, it also has a join-tree [1]: a tree having the relations of as nodes, such that any two relations sharing an attribute are connected by a path in the tree with the property that every relation on this path has .
In the special case , it follows immediately (by definition of ) that and hence that is the desired solution with trivial cost .
Therefore, we continue with the assumption that there is a relation and we make use of this assumption to also assume w.l.o.g. an orientation of making its root. The solution that we will construct for this case is of the form with , i.e., with empty patch and a weight mapping using only weights .
Next, we describe an iterative procedure applied to . During this procedure, we pinpoint the anchors of as well as the attributes of . The procedure keeps track of a set of already visited relations from , which is initially empty. The procedure continues as long as , but it is guaranteed to terminate, since every step adds one or more relations to . The procedure goes as follows: Take a relation whose ancestors (according to ) are already in . We distinguish between three cases:
-
1.
is the root of ;
-
2.
a relation exists that is a descendant of (according to ) and there is a non-empty set of attributes with ; or
-
3.
there is no such set of attributes for .
In case (1), we know . We add to and one of its attributes with to .
In case (2), note that it is possible that and are the same relation. We can assume w.l.o.g. that none of the relations between and in are in EC (besides itself). Indeed, if there is such a relation , we know that by the definition of a join-tree, and thus we can substitute by . If there are multiple choices for the set , we look at the relation with occurring closest to the root in (remark that, by the definition of a join-tree, is unique). We then choose the set with whose occurs closest to the root of in .
Now, if there is an attribute with and either has no parent, or, has a parent with , then in both cases we add to . Otherwise, we add as anchor to , with a singleton containing an attribute with (if has no parent) or (if has a parent ). We add all relations on the path from to in to .
In case (3), we simply add to .
The aforementioned construction is indeed a solution w.r.t. as stated in the following lemma.
Lemma 23.
is a solution for w.r.t. .
For the cost of , we remark that every step of the iterative procedure contributing to (namely cases (1) and (2)) either creates one anchor or adds one attribute to . Since both steps add exactly one relation from to , it is immediate that . Furthermore, since (by the choice of the root of , which falls in case (1)), we have that , as desired.
7 Conclusion and Future Work
Our algorithm can compute every join query in the constant-round MPC model with a load bounded by the PAC number . Since , this load is at least as low as the optimal load for in the single-round MPC model. Moreover, for any join query for which the worst-case optimal load is known, including all join queries that are graph-like or acyclic. We remark that the queries in [6] shown to have an optimal load w.r.t. have property and hence are also computed optimally by our algorithm. There is one exception: the Loomis-Whitney (LW) join [10]. For some configurations of an LW join query , we only have to apply Step 1 of our algorithm, since the query consists of exactly one relation. In that case, our algorithm requires more load than what is expected “optimally”; the reason is that our analysis of Step 1 is not fine-grained enough to show that, in some peculiar cases, Step 1 can be computed with a better load guarantee.
Although our algorithm is relatively simple (compared to other worst-case optimal algorithms described in the literature), the number is complicated and relative to a spectrum that needs to be decided in advance. Although the spectrum is sufficient for all the proven relationships with , , and , there exist queries for which this choice of spectrum is suboptimal. In general, it is not known whether a bound on spectra can be assumed or whether it is computable from the structure of the query. We remark that it is currently still unknown what the worst-case optimal load for arbitrary join queries in the constant-round MPC model is.
References
- [1] Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. URL: http://webdam.inria.fr/Alice/.
- [2] Foto N. Afrati and Jeffrey D. Ullman. Optimizing multiway joins in a map-reduce environment. IEEE Trans. Knowl. Data Eng., 23(9):1282–1298, 2011. doi:10.1109/TKDE.2011.47.
- [3] Tom J. Ameloot, Gaetano Geck, Bas Ketsman, Frank Neven, and Thomas Schwentick. Parallel-correctness and transferability for conjunctive queries. In Tova Milo and Diego Calvanese, editors, Proceedings of the 34th ACM Symposium on Principles of Database Systems, PODS, 2015, pages 47–58. ACM, 2015. doi:10.1145/2745754.2745759.
- [4] Paul Beame, Paraschos Koutris, and Dan Suciu. Communication steps for parallel query processing. In Richard Hull and Wenfei Fan, editors, Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2013, New York, NY, USA, 2013, pages 273–284. ACM, 2013. doi:10.1145/2463664.2465224.
- [5] Paul Beame, Paraschos Koutris, and Dan Suciu. Skew in parallel query processing. In Richard Hull and Martin Grohe, editors, Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS’14, Snowbird, UT, USA, June 22-27, 2014, pages 212–223. ACM, 2014. doi:10.1145/2594538.2594558.
- [6] Xiao Hu. Cover or pack: New upper and lower bounds for massively parallel joins. In Leonid Libkin, Reinhard Pichler, and Paolo Guagliardo, editors, PODS’21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 2021, pages 181–198. ACM, 2021. doi:10.1145/3452021.3458319.
- [7] Bas Ketsman and Dan Suciu. A worst-case optimal multi-round algorithm for parallel computation of conjunctive queries. In Emanuel Sallinger, Jan Van den Bussche, and Floris Geerts, editors, Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pages 417–428. ACM, 2017. doi:10.1145/3034786.3034788.
- [8] Bas Ketsman, Dan Suciu, and Yufei Tao. A near-optimal parallel algorithm for joining binary relations. Log. Methods Comput. Sci., 18(2), 2022. doi:10.46298/lmcs-18(2:6)2022.
- [9] Paraschos Koutris. Query Processing for Massively Parallel Systems. PhD thesis, University of Washington, USA, 2015. URL: https://hdl.handle.net/1773/33697.
- [10] Paraschos Koutris, Paul Beame, and Dan Suciu. Worst-case optimal algorithms for parallel query processing. In Wim Martens and Thomas Zeume, editors, 19th International Conference on Database Theory, ICDT 2016, volume 48 of LIPIcs, pages 8:1–8:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ICDT.2016.8.
- [11] Miao Qiao and Yufei Tao. Two-attribute skew free, isolated CP theorem, and massively parallel joins. In PODS, pages 166–180, 2021. doi:10.1145/3452021.3458321.
- [12] Yufei Tao. A simple parallel algorithm for natural joins on binary relations. In Carsten Lutz and Jean Christoph Jung, editors, 23rd International Conference on Database Theory, ICDT 2020, volume 155 of LIPIcs, pages 25:1–25:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.ICDT.2020.25.
- [13] Yufei Tao. Parallel acyclic joins with canonical edge covers. In Dan Olteanu and Nils Vortmeier, editors, 25th International Conference on Database Theory, ICDT 2022, volume 220 of LIPIcs, pages 9:1–9:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.ICDT.2022.9.
Appendix A Proofs for Section 4
A.1 Proof of Proposition 12
One of the most well-known algorithms in the literature is the HyperCube algorithm [2, 9], which is known to compute join queries with a load based on shares assigned to the attributes of , provided that certain degree constraints on the relations in the input database hold. The shares assigned based on fractional vertex covers for are of special interest, because they are sufficient to find an optimal assignment of shares. The following result (translated to our terminology) is known:
Theorem 24 ([9]).
Let be a join query and an instance for . Let be the shares of the HyperCube algorithm, with each the share for a specific attribute . Suppose that for every relation and every -tuple with , we have for some constant . Then, w.h.p., the maximum load per server is .
Its proof follows from the next lemma:
Lemma 25 ([9]).
Let be a join query and an instance for . Let with be the attributes of listed in some order. Let be integers and let . Suppose that we hash each tuple to the bin , where are independent and perfectly random hash functions from the domain dom to respectively. Then suppose that for every -tuple with we have
for some constant . Then the probability that the maximum load exceeds is exponentially small in .
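The hashing scheme of the lemma can be illustrated empirically. The sketch below (with illustrative parameters and names; it is not part of the formal proof) hashes every tuple coordinate-wise with independent random hash functions and reports the maximum number of tuples in any bin, which for a skew-free relation stays close to its expectation:

```python
# Hypothetical illustration of the lemma's hashing scheme: each tuple is
# mapped to the bin (h1(a1), ..., hk(ak)) using independent random hash
# functions, one per attribute; the quantity of interest is the maximum
# number of tuples landing in any single bin.

import random
from collections import Counter

def max_bin_load(tuples, bin_sizes, seed=0):
    """Hash each tuple coordinate-wise into a grid of bins; return the max load."""
    rng = random.Random(seed)
    # One independent random hash function (built lazily) per attribute position.
    hashes = [dict() for _ in bin_sizes]
    def h(i, v):
        if v not in hashes[i]:
            hashes[i][v] = rng.randrange(bin_sizes[i])
        return hashes[i][v]
    loads = Counter(tuple(h(i, v) for i, v in enumerate(t)) for t in tuples)
    return max(loads.values())

tuples = [(a, b) for a in range(40) for b in range(40)]  # a skew-free relation
print(max_bin_load(tuples, [4, 4]))  # close to 1600 / 16 = 100 in expectation
```

The degree constraints in the lemma rule out heavy values, which is exactly what prevents a single bin from receiving far more than this average.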
Next, we show that compatible weight mappings resemble the degree constraints required by the HyperCube algorithm. More precisely, let be a join query, an instance for , a spectrum, a configuration of adhering to , the fragment , and a compatible weight mapping for w.r.t. with and with . (Notice that in case , we assume for every .) Moreover, let be some integer and the share used by HyperCube for attribute . Now, suppose that a relation is covered by (i.e., and ); then we argue next that w.h.p. the load of over using HyperCube does not exceed .
To compute the load of over , we rely on the HyperCube algorithm, as already mentioned, assigning a share of to every attribute . When , we can verify that the degree constraints of Lemma 25 are satisfied, and hence the load of is in . Indeed, since is compatible w.r.t. and , we know that for every . Accordingly, the following holds for any : .
Since every -tuple in is light w.r.t. , we obtain that
Moreover, we can see that
with and , which yields that . Thus, we obtain that the load of does not exceed w.h.p. by Lemma 25 and the fact that . Moreover, from the fact that is covered by (i.e., ), we guarantee that the load of does not exceed .
Now, consider the case when . To show that the load does not exceed the required bound, we show that if we instead hash-partition using slightly smaller buckets, by assigning a lower share than the one assigned by (i.e., fewer bins), we can still guarantee a load of . Hence, the load can only improve when we assign the larger shares.
To this end, for a relation that is covered by , we use the shares with for every . Notice that, for each , we can verify that since . From the compatibility of , we know that for every , which implies that . Remark that according to this construction. Therefore, using the argument given above for the case , we see that with servers the load of is indeed .
Appendix B Proofs for Section 5
B.1 Proof of Lemma 15
In what follows, let be the set of anchors in in some order. Recall that each group of servers is defined as where . Moreover, the set that ranges over is defined as
Lemma 26.
For a join query , a fragment of w.r.t. and , and a solution for w.r.t. , we have
-
1.
the number of tuples over in is at most ;
-
2.
; and
-
3.
.
Proof.
(1) By definition, we know that for any set of attributes in , and that all -tuples in are heavy for . Consequently, we have at most heavy tuples in . Now recall that the sets of attributes in are pairwise disjoint. Hence, the total number of heavy tuples over is at most .
(2) First remark that the values for range over the tuples in , while ranges over the tuples from . We can see that . Indeed, is a restriction of the values in .
The inequality holds since the possible values for are a subset of the values possible for . Moreover, the second-to-last equality follows from the definition of (recall that ).
(3) Now, we verify that as follows:
where the first inequality follows from the fact that the possible values of are a subset of those possible over , and disjointly so. The second inequality follows from item (1), and the last inequality follows from item (2).
Lemma 27.
For a join query , a fragment of w.r.t. and , and a solution for w.r.t. , we have
-
1.
for every anchor and every tuple over in ; and
-
2.
.
Proof.
(1) Equivalently, we want to show that . Since , we obtain that indeed . Moreover, it is clear that and it is safe to assume that .
B.2 Proof of Lemma 17
Proof.
Since , and given that the size of a relation in cannot exceed its size in , it is sufficient to show, in order to establish item (1), that for every relation we have . Let be a relation from the set . By definition, for every . Moreover, by definition, we know that for any -tuple in . Therefore, the total number of possible -tuples in respecting is clearly less than . Consequently, the total number of -tuples in is less than . Since is the instance corresponding to such a configuration, we see that , which is a low load compared to by our assumption that (and, more precisely, the load is larger than ).
The proof of item (2) follows directly from Proposition 12. As for item (3), it is sufficient to show that the set of relations anchoring can be hash-partitioned based on the attributes of with the required load. Accordingly, Lemma 25 requires that for every -tuple . Recall that, by the definition of anchor, we know that , and we also know that because (recall that for some tuple ).