Distribution Policies for Datalog

Modern data management systems extensively use parallelism to speed up query processing over massive volumes of data. This trend has inspired a rich line of research on how to formally reason about the parallel complexity of join computation. In this paper, we go beyond joins and study the parallel evaluation of recursive queries. We introduce a novel framework to reason about multi-round evaluation of Datalog programs, which combines implicit predicate restriction with distribution policies to allow expressing a combination of data-parallel and query-parallel evaluation strategies. Using our framework, we reason about key properties of distributed Datalog evaluation, including parallel-correctness of the evaluation strategy, disjointness of the computation effort, and bounds on the number of communication rounds.


Introduction
Modern data management systems, such as Spark [32,38], Hadoop [13,18], and others [19], have extensively used parallelism to speed up query processing over massive volumes of data. Parallelism enables the distribution of computation across multiple servers, and thus significantly reduces the completion time for several critical data processing tasks. This trend has inspired a rich line of research on how to formally reason about the parallel complexity of join computation, one of the core tasks in massively parallel systems. Several papers [9,10,22,23,24] have studied the trade-off between synchronization (number of rounds) and communication cost, and have proposed and analyzed known and new parallel algorithms [4,11]. Among these, the Hypercube algorithm [4,16] can compute any multiway join query in one round by properly distributing the input data.
To reason about Hypercube-like algorithms, Ameloot et al. [6,7] recently introduced a framework that captures one-round evaluation of joins under different data distributions. Their framework implicitly describes a single-round parallel algorithm through a distribution policy, which specifies how the facts in the input relations are distributed among the servers. While for non-recursive queries a distribution policy defines a scalable parallel evaluation strategy, for Datalog programs this is typically not the case. For instance, a simple transitive closure query already requires that for each component of the input database there exists a server containing all facts of the component.
To reason about Datalog evaluation in a distributed setting, we introduce a general theoretical framework that allows a combination of data and query parallelization strategies. The central concept in this framework is the notion of an economic policy.
Our key observation is that, in order to deal with intensional predicates, we need to specify not only where a fact must be located to be consumed by a rule, but also where a fact must be produced by evaluating a rule of the program. An economic policy in our framework is defined as a pair of distribution policies: a consumption policy, which specifies the location of the facts that are used in the body of rules, and a production policy, which specifies the location of facts that appear in the head of a rule. The evaluation strategy that is implicitly defined by the data distribution must communicate any produced facts to the servers where they will be consumed, and thus can run over multiple rounds.
Our framework is inspired by a rich line of research on parallel evaluation strategies for Datalog programs from the early 90's [16,35,36,39]. There, Datalog evaluation strategies are based on the idea of partitioning the instantiations of the program rules among servers by adding conditions to the bodies of the rules, called program restrictions. Some of the strategies proposed require no communication of intermediate (intensional) facts and thus can be completed in one round; other strategies require communication over multiple rounds. We show that an economic policy can capture several algorithms used for parallel evaluation of recursive and nonrecursive queries, including the Hypercube algorithm [4,16], and the decomposable strategies based on program restrictions [35].
In this framework we study several properties of economic policies. We first explore the property of parallel-correctness: when does an economic policy lead to a correct evaluation strategy? As can be expected, it is undecidable to show parallel-correctness for a general Datalog program, even for the simplest of economic policies. We therefore identify a sufficient condition: every minimal valuation of a rule must be supported by the policy. A rule valuation is supported if some server consumes all the facts in the body, and produces the fact in the head. For unions of conjunctive queries, this condition is also necessary, recovering the result of Ameloot et al. [7]; however, we show that even for non-recursive programs with intermediate relations, the condition is no longer required. To overcome the undecidability of parallel-correctness, we identify a general family of economic policies, called Generalized Hypercube Policies (GHPs), which are always parallel-correct, and further capture several commonly used parallel evaluation strategies.
Second, we study the property of boundedness: can we decide whether a given economic policy terminates in k rounds, independent of the input size? We show that there exists a sharp increase in complexity as we move from k = 1 to k ≥ 2. For k = 1, we can succinctly characterize the structure of a policy that always terminates in one step. Additionally, given a GHP, we can do this in polynomial time in the description of the GHP. On the other hand, for k ≥ 2 it is undecidable to determine whether it terminates in at most k steps, even for a GHP. We then ask which Datalog programs admit economic policies that are bounded by one round: we show that such programs are characterized by a syntactic property called pivoting, which was also identified by Wolfson and Silberschatz [37] in the context of decomposable programs.
The present paper is the full version of the extended abstract [21] and provides the missing proofs.

Parallel Complexity
The parallel complexity of Datalog was first investigated by Cosmadakis and Kanellakis [12,20]. Later work used the complexity class NC to theoretically capture which Datalog programs are efficiently parallelizable. Since Datalog evaluation is P-complete and the question whether P equals NC is a longstanding open problem, it is not known if every Datalog program belongs to NC, which implies that certain Datalog programs may not be significantly sped up through parallelism. Ullman and Van Gelder [33] showed that if a Datalog program has the polynomial fringe property, which says that every fact in the output has a proof tree of polynomial size, evaluation is in NC. Every linear Datalog program has the polynomial fringe property and is thus in NC. Afrati and Papadimitriou [3] showed that for simple chain queries (including non-linear queries) evaluation is either in NC or P-complete. Recently, Afrati and Ullman [5] studied the trade-off between communication and number of rounds. They describe a very restricted class of Datalog programs where it is possible to reduce the number of recursion steps (to a number that is logarithmic in the size of the input) without significantly increasing the communication cost.

Decomposability
The concept of predicate decomposability was first introduced by Wolfson and Silberschatz [37]. A predicate T is decomposable if there are r > 1 restricted copies $P_1, P_2, \ldots, P_r$ of the Datalog program P (using arithmetic predicates) such that (i) the copies compute a partition of T for every input, and (ii) there exists an input instance where each copy will produce tuples over T. The main result is that decomposability is equivalent to pivoting for sirups where there are no constants, no repeating variables, and the sirup is linear or a simple chain rule. Here, a sirup is a Datalog program with one intensional predicate S and two rules: (i) a base rule S(x) ← B(x), and (ii) a recursive rule with head predicate S. A sirup is linear if S appears exactly once in the body of the recursive rule.
Later works [35,36] redefine the concept of decomposability semantically. A Datalog program is decomposable if it is possible to partition the output domain (to at least two blocks) such that for every instance I , every output fact has a proof tree where all the intensional database facts belong in the same partition block. Wolfson and Ozen [36] show that deciding whether a given Datalog program is decomposable is undecidable. Cohen and Wolfson [35] provide necessary and sufficient syntactic conditions for decomposability for sirups where the arity of the intensional predicate is ≤ 2. They also define the notion of strongly decomposable sirups, where the partition must guarantee that, for some input, at least two blocks will produce a fact using the recursive rule of the sirup. Following the same line of work, Zhang et al. [39] present a more general framework that constructs partitionings of the rule instantiations. A related notion has also been studied by Ameloot et al. [8] in the context of connected Datalog programs.

Other Parallel Schemes
In addition to decomposability, several frameworks for parallel recursive processing were introduced in the early 90s [16,35,36]. Wolfson [35] generalizes decomposability to load sharing schemes, by allowing the output of a predicate to overlap across the copies of the program P. Under a load sharing scheme, every linear program can be parallelized, even if it is not pivoting. In [15,16,36], general schemes are introduced that parallelize the evaluation by partitioning the set of rule instantiations, and allowing for communication among the servers (decomposable and load sharing schemes need no communication). Dewan et al. [14] propose similar techniques with dynamic adjustments, to balance the load of a computation. Our framework differs in that the set of rule instantiations is distributed implicitly among the servers, according to the production and consumption policies, and that the communication between servers is made explicit.

Systems
Recent work studies the implementation of Datalog (or fragments of Datalog) on modern shared-nothing distributed systems. Seo et al. [29] present a distributed version of a Datalog variant for social network analysis called Socialite; however, their framework requires that the user provides annotations to guide the distribution of data. Wang et al. [34] implement a variant of Datalog on the Myria system [19], focusing mostly on asynchronous evaluation and fault-tolerance. The BigDatalog system [31] describes an implementation of Datalog on Apache Spark, but focuses mostly on linear Datalog programs that use aggregation. The task of parallelizing Datalog has also been studied in the context of the popular MapReduce framework [2,5,30]. Motik et al. [26] provide an implementation of parallel Datalog in main-memory multicore systems.

Preliminaries
We assume an infinite domain dom. A database schema σ is a finite set of relation names $\{R_1, \ldots, R_n\}$ with associated arities $ar(R_i)$. We shall write $R^{(k)}$ to denote a relation R with arity k. A fact $R(a_1, \ldots, a_k)$ is a tuple consisting of a relation name and a sequence of values from dom. We say that $R(a_1, \ldots, a_k)$ is over schema σ if $R^{(k)} \in \sigma$. For a schema σ, we denote by facts(σ) the complete set of facts over σ. An instance I over σ is defined as a finite subset of facts(σ). We write $I|_{\sigma}$ to denote the subset of I containing all facts in I that are over schema σ.
For i ∈ N, we abbreviate the set {1, . . . , i} by [i], and for a set S we denote by P(S) its powerset.

Datalog
We assume an infinite domain of variables var, disjoint from dom. An atom is a formula R(t 1 , . . . , t k ) consisting of a relation name and a tuple of terms; a term t i is either a variable from var or a constant from dom.
A Datalog rule τ is of the form R(x) ← S 1 (y 1 ), . . . , S n (y n ), where R(x) is a single atom called the head of τ , denoted head τ , and all S i (y i ) are atoms called body atoms of τ , denoted body τ . We say that S i (y i ) is over schema σ , when S i ∈ σ and y i is a tuple of ar(S i ) terms. We say that τ is over schema σ if all its atoms are. We assume that Datalog rules are always safe, i.e., that all variables in the head occur in at least one body atom. By vars(τ ) we denote the set of variables in rule τ .
A Datalog program P is a finite set of Datalog rules. A program P is said to be over schema σ if all its rules are. Particularly, by EDB(P ) ⊆ σ we denote the relation names occurring only in the body of rules, and by IDB(P ) ⊆ σ all other relation names occurring in P . We further distinguish the names in IDB(P ) by calling some of them output relations, denoted out(P ) ⊆ IDB(P ); all other intensional relations are auxiliary. We write σ (P ) to denote EDB(P ) ∪ IDB(P ).
Consider the directed graph whose nodes are the intensional relation names, with an edge from S to S' if S' occurs in the head of some rule τ of P and S occurs in the body of τ. We say that P is recursive if the graph is cyclic; otherwise, we say it is non-recursive. A non-recursive Datalog program with only one rule is called a conjunctive query (CQ).
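The recursion test above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: a rule is abbreviated to its head relation name plus the list of relation names in its body, the function name `is_recursive` is hypothetical, and edges are drawn from body relation to head relation (the direction does not affect whether the graph is cyclic).

```python
def is_recursive(program):
    """program: list of (head_relation, [body_relations]) pairs (assumed encoding)."""
    idb = {head for head, _ in program}          # relations occurring in some head
    edges = {r: set() for r in idb}
    for head, body in program:
        for s in body:
            if s in idb:                         # extensional relations are not nodes
                edges[s].add(head)

    WHITE, GREY, BLACK = 0, 1, 2                 # standard DFS colouring
    color = dict.fromkeys(idb, WHITE)

    def visit(u):
        color[u] = GREY
        for w in edges[u]:
            # A grey successor closes a cycle; a white one is explored recursively.
            if color[w] == GREY or (color[w] == WHITE and visit(w)):
                return True
        color[u] = BLACK
        return False

    return any(color[u] == WHITE and visit(u) for u in idb)
```

For instance, a transitive closure program with rules T ← R and T ← T, R is reported as recursive because of the self-loop on T, while a single-rule program over extensional relations only is not.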

Evaluation Semantics
We define the evaluation semantics of Datalog programs as usual, through the immediate consequence operator. Let P be a Datalog program and I an instance over EDB(P ). A valuation v for rule τ ∈ P is a constant-preserving mapping of the terms in τ to values in dom. For a rule τ ∈ P and valuation v, we say that τ derives fact v(head τ ) over instance I if v(body τ ) ⊆ I . We refer to v(τ ) as the instantiation of rule τ with valuation v.
We use $T_P$ to denote the immediate consequence operator for P, which applies all rules in P exactly once over a given instance and adds all derived facts to that instance. Formally,

$$T_P(I) = I \cup \{\, v(\mathit{head}_\tau) \mid \tau \in P,\ v \text{ is a valuation for } \tau,\ v(\mathit{body}_\tau) \subseteq I \,\}.$$

Then, P(I) is defined as the fixpoint reached after iteratively applying the immediate consequence operator over I. It is not difficult to see that $T_P$ is monotone, and thus always reaches a fixpoint after a finite number of iterations. Moreover, the output of the query that P computes is defined as $P(I)|_{\mathit{out}(P)}$. We refer to Abiteboul et al. [1] for a detailed description.
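To make the fixpoint semantics concrete, the following Python sketch applies an immediate consequence operator repeatedly until nothing new is derived. The encoding is an assumption made for illustration only: a fact is a pair (relation, tuple of values), an atom is a pair (relation, tuple of terms), variables are strings starting with an uppercase letter, and the names `match`, `valuations`, `instantiate`, `T_P`, and `fixpoint` are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def match(atom, fact, v):
    """Extend valuation v so that v(atom) == fact; return None if impossible."""
    (rel, terms), (frel, vals) = atom, fact
    if rel != frel or len(terms) != len(vals):
        return None
    v = dict(v)
    for t, a in zip(terms, vals):
        if is_var(t):
            if v.setdefault(t, a) != a:   # variable already bound to another value
                return None
        elif t != a:                      # valuations preserve constants
            return None
    return v

def valuations(body, facts):
    """All valuations v with v(body) contained in facts."""
    vs = [{}]
    for atom in body:
        vs = [w for v in vs for f in facts if (w := match(atom, f, v)) is not None]
    return vs

def instantiate(atom, v):
    rel, terms = atom
    return (rel, tuple(v[t] if is_var(t) else t for t in terms))

def T_P(program, instance):
    """One application of the immediate consequence operator."""
    derived = {instantiate(head, v)
               for head, body in program
               for v in valuations(body, instance)}
    return instance | derived

def fixpoint(program, instance):
    current = set(instance)
    while True:
        nxt = T_P(program, current)
        if nxt == current:                # no new facts: fixpoint reached
            return current
        current = nxt
```

The sketch is the naive strategy, re-deriving all facts in every iteration; it relies on rule safety so that every head variable is bound by the body.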
We call a fact f P -derivable if f ∈ P (I ) for some instance I , and P -consumable if during the evaluation of P on some instance I a rule instantiation v(τ ) fires that requires f . Both notions naturally generalize to atoms and relation names, e.g., relation name R is said to be P -consumable if some P -consumable fact f exists with symbol R. Atom A is P -consumable if a rule instantiation as above exists, with A ∈ body τ .

Proof Theoretic Concepts
Let T = (V, E) be a tree. By $\mathit{fringe}_T$ we denote its leaves and by $\mathit{root}_T$ its root vertex. All other vertices are called internal vertices. For a vertex n ∈ V we denote by $\mathit{children}_T(n)$ the set of child vertices of n in T. We now recall the classical notion of proof tree [1]. A proof tree T for a fact f on instance I and Datalog program P is a tree with vertices over facts(σ(P)), where $\mathit{fringe}_T \subseteq I$, $\mathit{root}_T = f$, and for every non-leaf vertex g, there is a rule τ ∈ P and valuation v such that $g = v(\mathit{head}_\tau)$ and $\mathit{children}_T(g) = v(\mathit{body}_\tau)$. In this case, we shall say that T uses the instantiation of τ with valuation v. It is easy to see that P(I) consists of exactly those facts f for which a proof tree for f on I and P exists. We say that a rule instantiation v(τ) is useless if $v(\mathit{head}_\tau) \in v(\mathit{body}_\tau)$; otherwise, we say that it is useful. W.l.o.g. we will consider only proof trees where all rule instantiations are useful.
We say that a proof tree $T'$ is subsumed by a proof tree $T$ for P, denoted $T' \sqsubseteq T$, if $\mathit{fringe}_{T'} \subseteq \mathit{fringe}_T$ and $\mathit{root}_{T'} = \mathit{root}_T$.

The Framework
Our framework considers a setting with p servers that share no memory and can communicate only via messages; this is commonly referred to as a shared-nothing parallel architecture. The set of servers forms a network [p] that we assume is fully connected. In order to define how computation is performed, we will use policies that specify how the data (input and output facts) are distributed over the network. We borrow the definition of a distribution policy from [7]:

Definition 1 (Distribution Policy) A distribution policy $P = (\mathit{facts}_P)$ over schema σ and network [p] consists of a function $\mathit{facts}_P : [p] \to \mathcal{P}(\mathit{facts}(\sigma))$ mapping servers to sets of facts over σ.
Distribution policies are instance independent, i.e., they are oblivious of the specific database instance. Intuitively, a policy expresses on which servers a fact should reside if the fact is in the network, but not whether the fact is in the network. Henceforth, we slightly abuse notation and write P(f) to denote the set of servers responsible for f, i.e., $P(f) = \{\, i \in [p] \mid f \in \mathit{facts}_P(i) \,\}$. In contrast to [7], where the focus is on single-round query evaluation and policies that define only the initial data distribution over extensional database facts, we consider a multi-round setting that allows the communication of intermediate facts.
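Since the sets assigned to servers may be infinite, one natural way to represent a distribution policy in code is as a membership test. The sketch below is a minimal illustration under that assumption: `facts_policy(i, fact)` is a hypothetical predicate answering whether the fact belongs to the set of server i, facts are pairs (relation, tuple of integers), and the hash function `a % p + 1` is chosen only to keep the example deterministic.

```python
def servers_for(facts_policy, p, fact):
    """P(f): the servers i in [p] whose assigned set contains the fact."""
    return {i for i in range(1, p + 1) if facts_policy(i, fact)}

def hash_on_first(p):
    """Policy that places a fact on the server given by a hash of its first value."""
    return lambda i, fact: fact[1][0] % p + 1 == i

def broadcast(i, fact):
    """Policy that places every fact on every server."""
    return True

def nowhere(i, fact):
    """Policy that assigns no server to any fact."""
    return False
```

Note that none of these functions looks at a database instance, which mirrors the instance independence of distribution policies.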

Definition 2 (Economic Policy) An economic policy E over schema σ and network [p] is a pair (P, C) of distribution policies over the same universe U, where:
- P is defined over IDB(P) and is called the production policy; and
- C is defined over EDB(P) ∪ IDB(P) and is called the consumption policy.
A production policy describes which servers have the responsibility of producing a certain intensional database fact. A consumption policy describes which servers need an extensional or intensional database fact to satisfy the body of a rule instantiation. We say that a fact f is C-consumable if $C(f) \neq \emptyset$ and that relation R is C-consumable if some fact over dom and R is C-consumable.
A family of economic policies F is a set of economic policies over a common universe and schema. We say that a family F satisfies property P if all the policies in F satisfy P.

Datalog Evaluation Modulo Policies
Instead of letting a server compute the full program over its local instance, we restrict the evaluation process based on a server's economic policy. That is, for economic policy E = (P, C) and Datalog program P, the following sequential evaluation algorithm takes place on server i:
- First, every rule τ ∈ P is annotated with policy-predicates as follows. For the head R(x), we add a predicate $\mathit{Policy}^P_R(x)$ to the body of τ. Here, relation name $\mathit{Policy}^P_R$ refers to the relation $\mathit{facts}_P(i)|_{\{R\}}$.
- Second, for every atom S(y) in the body of τ, we add the predicate $\mathit{Policy}^C_S(y)$, where now $\mathit{Policy}^C_S$ refers to the relation $\mathit{facts}_C(i)|_{\{S\}}$.
The added predicates may be infinitely large, but can be accessed through queries of the form "$t \in \mathit{facts}_P(i)|_{\{R\}}$?" or "$t \in \mathit{facts}_C(i)|_{\{S\}}$?".
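As an illustration of the annotation step, consider a transitive closure program with extensional relation R and intensional relation T. The program and its relation names are assumptions made for this example, not taken from the text above. On a fixed server i, the two rules would be annotated roughly as follows, with one production predicate for the head and one consumption predicate per body atom:

```latex
\begin{align*}
T(x,y) &\leftarrow R(x,y),\ \mathit{Policy}^{C}_{R}(x,y),\ \mathit{Policy}^{P}_{T}(x,y) \\
T(x,z) &\leftarrow T(x,y),\ R(y,z),\ \mathit{Policy}^{C}_{T}(x,y),\ \mathit{Policy}^{C}_{R}(y,z),\ \mathit{Policy}^{P}_{T}(x,z)
\end{align*}
```

A rule instantiation can therefore fire on server i only if server i consumes every body fact and is responsible for producing the head fact.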
Throughout the paper, we assume the semi-naive evaluation strategy for Datalog programs. Semi-naive evaluation proceeds as usual over the annotated program: after each application of the fixpoint operator, the newly derived facts are added to a delta relation, and a rule instantiation is triggered only if at least one of its facts is in the delta relation from the previous iteration. We denote by P E (I, J ) the fixpoint instance when we execute P restricted to E on input I , with delta relations initialized with J .

Distributed Evaluation Strategy
We now present how an economic policy induces a parallel evaluation strategy. Our parallel model is the BSP-based Massively Parallel Communication Model (MPC) [25]. In this model, computation is performed over servers in a multi-round fashion. Each round has two distinct phases: a local computation phase, and a synchronized communication phase.
Consider a Datalog program P, a network [p], and an economic policy E = (P, C). Moreover, let I be the input instance, which we initially assume to be partitioned arbitrarily over the p servers. Denote by $\mathit{local}^0_i$ the initial local instance of server i. Let $\mathit{local}^k_i$ be the instance on server i right after the k-th communication phase.
At the k-th round (for k ≥ 1), we perform the following procedure:
1. Communication: Every server sends its facts as defined by the consumption policy C. That is, server i sends local fact $f \in \mathit{local}^{k-1}_i$ to server j if (and only if) $f \in \mathit{facts}_C(j)$. Let $\mathit{rec}^k_i$ be the facts received by server i during the k-th communication phase.
2. Computation: Every server i computes its new local instance $\mathit{local}^k_i$ as the local fixpoint of P restricted to E over $\mathit{local}^{k-1}_i \cup \mathit{rec}^k_i$, with the delta relations initialized to all of these facts if k = 1, and to the newly received facts otherwise.
Intuitively, the algorithm terminates when, after a round is finished, for every server all locally derived facts that need to be sent to some other server according to the consumption policy were already sent to these servers in an earlier round.
Formally, for server i, we define the set $F_i = \{\, f \mid C(f) \setminus \{i\} \neq \emptyset \,\}$. Intuitively, $F_i$ represents all facts consumed by servers other than i itself. We say that server i has reached a local fixpoint state for E and P after round k if every fact in $F_i$ that server i has derived locally by the end of round k was already sent to the servers consuming it in an earlier round. We say that the network [p] has reached a global fixpoint state for E and P after round k if all servers i ∈ [p] have reached a local fixpoint state after round k. Notice that this condition is as desired, because every round goes into the communication phase first, then into the local computation phase. Hence, all earlier sent messages have been taken into account.
One could imagine a smarter communication procedure that incorporates Datalog semantics as well. For example, a server does not need to send a local fact f ∈ facts C (j ) to server j if for every input I server j is guaranteed to already have f in its local instance. However, it is in general undecidable to make such a decision (see Lemma 2).
For an instance I , let [P , E](I ) denote the union of all facts over out(P ) found at any server after reaching the global fixpoint. Notice that the above evaluation strategy always reaches a fixpoint, due to monotonicity of Datalog.
For any function h : dom → [p], we define the economic policy $(P_1, C_1)$ by letting $C_1(f) = [p]$ for every extensional fact f, and $P_1(T(a,b)) = C_1(T(a,b)) = \{h(a)\}$ for every intensional fact T(a, b). This policy works as follows: it replicates the extensional database facts everywhere, and then produces/consumes each fact T(a, b) at server h(a). It is easy to see that the economic policy correctly computes the transitive closure. In fact, the evaluation always terminates in a single round.
Consider a different policy $(P_2, C_2)$, which again takes any function h : dom → [p] and which has the following definition: $P_2(T(a,b)) = [p]$ and $C_2(T(a,b)) = \{h(b)\}$ for every intensional fact T(a, b), and $C_2(f) = \{h(a)\}$ for every extensional fact f whose first attribute is a. This policy does not replicate the extensional database facts, but it hash-partitions them according to the first attribute. Whenever a server discovers a new fact, the new fact has to be sent to the server determined by the hash of its second attribute, where it is consumed. Observe that the production policy is [p] because we do not know where each fact will be produced (in other words, each server will produce as many intensional database facts as possible from its local input without any restrictions).
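The two policies can be compared with a small round-by-round simulation. The Python sketch below is illustrative only and rests on several assumptions that are not stated in the text: the program is the transitive closure T(x,y) ← R(x,y), T(x,z) ← T(x,y), R(y,z); values are integers and h(a) = a mod p + 1; policies are given as membership predicates `produces(i, f)` and `consumes(i, f)`; and the evaluation stops after the first round in which no server derives a new fact that some other server consumes, which is one reading of the fixpoint condition.

```python
def h(a, p):
    return a % p + 1                      # assumed hash function into [p]

def local_fixpoint(i, produces, consumes, facts, delta):
    """Semi-naive local evaluation on server i, restricted by the policy."""
    known = set(facts)
    usable = {f for f in known if consumes(i, f)}   # only consumed facts may feed rule bodies
    delta = {f for f in delta if f in usable}
    while delta:
        derived = set()
        for rel, (a, b) in delta:
            if rel == "R":
                derived.add(("T", (a, b)))                                  # T(x,y) <- R(x,y)
                derived |= {("T", (x, b)) for r, (x, y) in usable if r == "T" and y == a}
            else:
                derived |= {("T", (a, z)) for r, (y, z) in usable if r == "R" and y == b}
        fresh = {f for f in derived if produces(i, f)} - known              # production policy
        known |= fresh
        delta = {f for f in fresh if consumes(i, f)}
        usable |= delta
    return known

def evaluate(p, produces, consumes, initial):
    """Return (all T facts found on any server, number of rounds)."""
    servers = range(1, p + 1)
    local = {i: set(initial.get(i, ())) for i in servers}
    rounds = 0
    while True:
        rounds += 1
        # Communication: a local fact f goes to server j iff j consumes f.
        rec = {j: {f for i in servers for f in local[i] if consumes(j, f)} for j in servers}
        pending = False
        for i in servers:
            before = local[i] | rec[i]
            delta = before if rounds == 1 else rec[i] - local[i]
            local[i] = local_fixpoint(i, produces, consumes, before, delta)
            new = local[i] - before
            # Assumed stopping rule: is a newly derived fact consumed elsewhere?
            pending |= any(consumes(j, f) for f in new for j in servers if j != i)
        if not pending:
            return {f for i in servers for f in local[i] if f[0] == "T"}, rounds

def policy1(p):
    """Replicate R everywhere; produce and consume T(a,b) at h(a)."""
    return (lambda i, f: f[0] == "T" and h(f[1][0], p) == i,
            lambda i, f: f[0] == "R" or h(f[1][0], p) == i)

def policy2(p):
    """Produce T anywhere; consume R(a,b) at h(a) and T(a,b) at h(b)."""
    return (lambda i, f: f[0] == "T",
            lambda i, f: h(f[1][0] if f[0] == "R" else f[1][1], p) == i)
```

On the chain R(1,2), R(2,3), R(3,4) with two servers and all input initially on server 1, both policies yield the full transitive closure; under the stopping rule assumed here the first policy finishes after one round, while the second needs four because each newly derived T fact must travel to the server that holds the matching R facts.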
We will see later in Section 6 that all the above economic policies belong in a specific family of policies that we call Generalized Hypercube Policies (GHPs). We notice that our framework supports evaluation strategies that are oblivious of the instance: each fact is communicated, consumed, and produced independent of whether other facts are in the same local instance or not. Lastly, we note that monotonicity of Datalog ensures monotonic behaviour of economic policies for Datalog programs, as made formal by Proposition 1.

Proposition 1 For every Datalog program P and economic policy E, we have $[P, E](I) \subseteq [P, E](I')$ for all instances $I \subseteq I'$.
For the proof, we first extend the concept of proof tree for Datalog programs to annotated proof trees for Datalog evaluation with economic policies. For program P, economic policy E, instance I, and fact f, an annotated proof tree T is a proof tree for P, I, and f, where, additionally, every node g in T has a label $\mathit{server}_T(g)$. For non-leaf nodes we assume the following constraint: $g \in \mathit{facts}_P(\mathit{server}_T(g))$ and $\mathit{children}_T(g) \subseteq \mathit{facts}_C(\mathit{server}_T(g))$.
We also assign to all nodes in T a number $\mathit{round}_T(g)$, which is obtained through the following iterative argument: For leaf nodes g in T, $\mathit{round}_T(g) = 1$. For all nodes g for which all nodes in $\mathit{children}_T(g)$ already have a number assigned, let $\mathit{max}_g = \max_{g' \in \mathit{children}_T(g)} \mathit{round}_T(g')$ and let $L = \{g_1, \ldots, g_k\} \subseteq \mathit{children}_T(g)$ be exactly those child nodes with $\mathit{round}_T(g_i) = \mathit{max}_g$. Now, we define $\mathit{round}_T(g) = \mathit{max}_g$ if $\mathit{server}_T(g_i) = \mathit{server}_T(g)$ for all $g_i \in L$, and $\mathit{round}_T(g) = \mathit{max}_g + 1$ otherwise.
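The round numbering is a simple bottom-up recursion, sketched below in Python. The tree encoding is an assumption made for illustration: a node is a triple (fact, server label, tuple of child nodes), leaves have an empty tuple of children, and the function name `round_of` is hypothetical.

```python
def round_of(node):
    """Round number of a node in an annotated proof tree (assumed encoding)."""
    fact, server, children = node
    if not children:                       # leaf nodes are available in round 1
        return 1
    rounds = [round_of(c) for c in children]
    latest = max(rounds)
    # Children that become available last decide whether a further round is needed.
    last_children = [c for c, r in zip(children, rounds) if r == latest]
    if all(c[1] == server for c in last_children):
        return latest                      # everything needed is already on this server
    return latest + 1                      # some late child must first be communicated
```

For example, if T(1,3) is labelled with server 1 and derived from T(1,2) labelled with server 2, the root receives round 2, whereas the same tree with all labels equal to 1 stays in round 1.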
Intuitively, an annotated proof tree encodes possible runs in the evaluation of P over I using E. More specifically, T encodes an upper bound on the moment where a fact is derived during the evaluation. More formally:

Lemma 1 For a Datalog program P, economic policy E, instance I, and fact f, the following implications hold:
1. f ∈ [P, E](I) implies existence of an annotated proof tree T for P, E, I and f. If f is derived by E in round i on server s, then T exists with $\mathit{round}_T(\mathit{root}_T) = i$ and $\mathit{server}_T(\mathit{root}_T) = s$.
2. Existence of an annotated proof tree T for E, P, I and f implies f ∈ [P, E](I). More specifically, f is derived on server $\mathit{server}_T(\mathit{root}_T)$ in round at most $\mathit{round}_T(\mathit{root}_T)$.

Proof (1) The proof is by induction on the round in which f is derived. Clearly, after round 1, all facts residing in the network have a desired annotated proof tree. The proof then proceeds by induction, assuming that condition (1) of the lemma holds for up to k rounds, for some k. Now suppose that f is derived at round k + 1 on server s. The latter means that some proof tree T for f, P and $\mathit{rec}^{k}_s \cup \mathit{local}^{k-1}_s$ exists. We take T and set, for all facts g in T, $\mathit{server}_T(g) = s$. Since for all leaves g in T there exists a desired annotated proof tree $T'$ with $\mathit{round}_{T'}(g) \leq k$ (by the hypothesis), we can simply attach these to T. It is now easy to see that $\mathit{round}_T(g) \leq k + 1$ for all nodes g in T. Hence, the proof tree is as desired.
(2) By definition of annotated proof tree, particularly due to the constraints on $\mathit{server}_T$, fact f becomes derivable on server s during the computation of E over I. We only need to show that this happens in round at most $\mathit{round}_T(f)$. The proof is by induction on $\mathit{round}_T(f)$. Clearly, if $\mathit{round}_T(f) = 1$, then all facts g in T are marked $\mathit{round}_T(g) = 1$, and therefore $\mathit{server}_T(g) = s$. The latter means that all leaves of T were present on server s after the first communication phase, and thus either f ∈ I, or (because T is also a valid proof tree for f) f has been derived on server s in the first computation phase.
Assume now that condition (2) holds for i ≤ k (induction hypothesis). Suppose $\mathit{round}_T(f) = k + 1$. By the induction hypothesis, all facts g in T with $\mathit{round}_T(g) \leq k$ have been derived at some server in round ≤ k. Now it is easy to see that the top fragment $T'$ of T (i.e., the subtree consisting of all facts g marked with $\mathit{round}_T(g) = k + 1$ and their immediate children) describes a proof tree for f and P on server s.
Let j be the earliest communication round after which all leaf nodes of $T'$ have reached server s. Since all these leaf nodes have been derived in round ≤ k (by the hypothesis), and the semantics of E and T guarantees arrival of these facts on server s after the next communication phase, we have that j ≤ k + 1. It is now easy to see (due to $T'$) that f will be derived on server s in computation round j ≤ k + 1. This concludes the proof.
Since for instances $I \subseteq I'$, every annotated proof tree T for E, P, and I is trivially also an annotated proof tree for E, P, and $I'$, Proposition 1 is now a corollary of Lemma 1.

Parallel-Correctness
An economic policy for a Datalog program does not necessarily lead to the desired output. For example, if the production policy maps every fact onto the empty set of servers, then the execution will generate only empty intensional database relations. Henceforth, we are only interested in economic policies that generate the expected output.

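Definition 3 (Parallel-correctness) An economic policy E is parallel-correct for a Datalog program P if $[P, E](I) = P(I)|_{\mathit{out}(P)}$ for every instance I over EDB(P).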
Parallel-correctness is in general undecidable, even for simple classes of policies. For instance, consider the class of policies, where P (f 1 ) = P (f 2 ) and C(f 1 ) = C(f 2 ), whenever f 1 , f 2 are facts with the same relation symbol. We call this class of policies value-independent, denoted E indep , since the facts are mapped to servers only according to the relations they belong to. Value-independent policies allow a succinct representation by simply enumerating the intensional relation names of P and the subsets of [p] where each relation is assigned.
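We consider the following decision problem: given a Datalog program P and a value-independent economic policy E, decide whether E is parallel-correct for P. This problem is undecidable.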
Proof The proof is by a reduction from the Datalog containment problem, which is well-known to be undecidable [1]. Let P 1 and P 2 be two arbitrary Datalog programs given as input for the containment problem. We assume that both are over the same nullary output relation name, say O.
We first denote by $P^*_i$ an indexed version of program $P_i$; in particular, we define $P^*_i$ as $P_i$ in which all intensional relation names are annotated with index i. We now construct a program P by taking all rules from $P^*_1$ and $P^*_2$, and adding the rules $O() \leftarrow O_i()$, for i ∈ {1, 2}. We note that $\mathit{EDB}(P) = \mathit{EDB}(P^*_1) \cup \mathit{EDB}(P^*_2)$ and out(P) = {O}. As economic policy we take E = (P, C) over the 2-node network {1, 2}. The consumption policy maps all facts with index i to server i. The production policy maps all facts with index i to server i, and the fact O() to server 2. The extensional database facts are consumed on all servers.
Intuitively, programs $P^*_1$ and $P^*_2$ are computed locally on server 1 and server 2, respectively. It thus follows from the construction that (†) $P_1(I) \cup P_2(I) \subseteq [P, E](I)$, for every instance I. Notice that rule $O() \leftarrow O_1()$ is never used, since server 2 cannot consume facts over relation names with index 1.
It remains to show that E is parallel-correct for P if and only if $P_1 \subseteq P_2$. Indeed, if $P_1 \subseteq P_2$, then $P(I) = P_2(I)$ for every instance I, which implies that the policy will compute the correct result for O. The other direction follows from monotonicity of P. From (†) it follows that this condition is satisfied if and only if all facts over the O relation produced by P(I) are also produced by [...]

In fact, the above proof yields an even stronger result:

Lemma 2 Let P be an arbitrary Datalog program and E = (P, C) an economic policy over σ that is parallel-correct for P. Let f ∈ facts(σ) and C be a consumption [...]

Proof We simply observe that the economic policy E = (P, C) constructed in the preceding undecidability proof has this property. Indeed, updating $C(O_1()) = \{1\}$ to $C(O_1()) = \{1, 2\}$ makes the policy trivially parallel-correct for P.
Despite the above results, we can present some syntactic conditions that are necessary for parallel-correctness, and some that are sufficient.
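We say that an economic policy E = (P, C) supports a rule instantiation v(τ) if some server i exists with $v(\mathit{head}_\tau) \in \mathit{facts}_P(i)$ and $v(\mathit{body}_\tau) \subseteq \mathit{facts}_C(i)$. We say that E supports a proof tree T if all the rule instantiations in T are supported.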
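Checking support is a direct search over the servers: a rule instantiation counts as supported when a single server consumes every body fact and produces the head fact. The Python sketch below illustrates this under assumed encodings: policies are membership predicates `produces(i, f)` and `consumes(i, f)`, a rule instantiation is a pair (head fact, list of body facts), and the function names are hypothetical.

```python
def supports_instantiation(produces, consumes, p, head, body):
    """True if one server in [p] consumes all body facts and produces the head."""
    return any(produces(i, head) and all(consumes(i, f) for f in body)
               for i in range(1, p + 1))

def supports_tree(produces, consumes, p, instantiations):
    """A proof tree, given as its list of rule instantiations, is supported
    if each of its rule instantiations is supported."""
    return all(supports_instantiation(produces, consumes, p, head, body)
               for head, body in instantiations)
```

With two servers and the hash h(a) = a mod 2 + 1, consider the instantiation T(1,3) ← T(1,2), R(2,3). If both relations are consumed at the hash of their first attribute, T(1,2) and R(2,3) land on different servers and the instantiation is not supported; if T facts are consumed at the hash of their second attribute instead, both body facts meet on server 1 and the instantiation is supported whenever server 1 may produce T(1,3).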

Lemma 3 Let P be a Datalog program and E an economic policy. If a proof tree T for P is supported by E, then for every instance I with $\mathit{fringe}_T \subseteq I$, we have $\mathit{root}_T \in [P, E](I)$.

Proof The proof is by induction on the depth d of T. In particular, we show using a simple inductive argument that $\mathit{root}_T \in \mathit{local}^k_i$ for some server i and k ≤ d, which implies $\mathit{root}_T \in [P, E](I)$. Recall that $\mathit{local}^k_i$ denotes the facts residing locally on server i after the k-th computation round.
As base case let d = 1, meaning that T describes a single rule instantiation. After the first communication round, all servers j have $\mathit{local}^0_j \cup \mathit{rec}^1_j \subseteq I \cap \mathit{facts}_C(j)$. By the assumption that E supports T, it follows that for some server i, $\mathit{root}_T \in \mathit{facts}_P(i)$ and $\mathit{fringe}_T \subseteq \mathit{facts}_C(i)$; thus after the first computation round, $\mathit{root}_T \in \mathit{local}^1_i$. For d > 1 we observe that $\mathit{root}_T$ and its children in T define a rule instantiation (τ, v), and, by the assumptions of the lemma, this rule instantiation is supported by E. More specifically, some server i exists where $\mathit{root}_T \in \mathit{facts}_P(i)$ and $\mathit{children}_T(\mathit{root}_T) \subseteq \mathit{facts}_C(i)$. Further, for all facts $f \in \mathit{children}_T(\mathit{root}_T)$, the respective subtree $T_f$ of T with root f is supported by E and has depth d − 1. By the induction hypothesis it follows that for all these facts f there is a server j and [...]

We now have a characterization for parallel-correctness of a program P w.r.t. an economic policy. [...]

(If) For this, let f ∈ P(I), which means that a proof tree T exists with $\mathit{fringe}_T \subseteq I$ and $\mathit{root}_T = f$. In particular, by the assumption we can choose T so that it is also supported by E. It now follows from Lemma 3 that f ∈ [P, E](I).
(Only if) We assume (P , C) is parallel-correct for P . Let T be an arbitrary proof tree. The proof is by construction following the derivation of root T using E. First, from parallel-correctness it follows that P (I ) = [P , E](I ), for any instance I . Here we take I = fringe T , implying root T ∈ [P , E](I ). The proof now continues by induction on the number of rounds needed for E to derive root T .
The induction hypothesis is that if k rounds are needed to derive root T , then a supported proof-tree of depth k subsumed by T exists.
As a base case suppose k = 1. That is, root T ∈ local 1 i , meaning that root T ∈ P E (local 0 j ∪ j rec 1 j ) for some server j . Particularly, a valuation v and rule τ ∈ P existed with v(body τ ) ⊆ facts C (j ) ∩ I and v(head τ ) = root T , which means that the corresponding rule instantiation is supported by E. Here, the proof tree admitted by (τ, v) is as desired.
For k > 1 the proof is analogous, but now we take as proof tree the tree obtained by combining the rule instantiation with the proof trees for each child, whose existence follows from the induction hypothesis. As the number of rounds decreases by one in each inductive step, and the fringes of the obtained trees cannot contain facts other than those in I , the constructed proof tree is again as desired.
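We consider various categories of economic policies based on which rule instantiations are supported for a given Datalog program P :
- N all P : the set of all rule instantiations of P .
- N min P : the set of all minimal rule instantiations of P . An instantiation of rule τ with valuation v is minimal if no instantiation (τ′, v′) of P exists with v′(head τ′) = v(head τ ) and v′(body τ′) ⊊ v(body τ ).
- N use P : the set of all useful rule instantiations of P . An instantiation of rule τ with valuation v is useful if v(head τ ) ∉ v(body τ ).
- N ess P : the set of all essential rule instantiations of P . An instantiation of rule τ with valuation v is essential if for some P -derivable fact f and instance I , every proof tree T for f on I and P has a vertex g with g = v(head τ ) and v(body τ ) ⊆ children T (g).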
If the program is non-recursive, then N use P = N all P , since there will be no rule that contains the same relation in the head and the body. We also have:
Proposition 3 For every Datalog program P , N ess P ⊆ N min P ∩ N use P .
Before giving a proof, we first show the following lemma.

Lemma 4 For every proof tree T of depth d, there exists a proof tree T′ subsumed by T , of depth at most d, that uses only minimal and useful rule instantiations.
Proof The proof is by induction on the depth of T , which we denote d.
For the base case, let d = 1. Then, T corresponds to a single rule instantiation (τ, v) for P where all the facts in v(body τ ) are extensional database facts. By definition, there is also a minimal rule instantiation (τ′, v′), with v′(head τ′) = v(head τ ) and v′(body τ′) ⊆ v(body τ ), which admits the desired proof tree.
As induction hypothesis we take the statement of the lemma. Now, for the induction step, suppose T has depth d > 1. Then, the root of T , together with its children, defines a rule instantiation (τ, v) for P . Now take a subsumed minimal instantiation (τ′, v′), such that v′(head τ′) = v(head τ ) and v′(body τ′) ⊆ v(body τ ). For every fact f ∈ v′(body τ′), let T f be the subtree of T with root f (a child of root T ). By the induction hypothesis, there is a proof tree T′ f subsumed by T f with depth ≤ d − 1 that uses only minimal rule instantiations. The proof tree that combines instantiation (τ′, v′) with T′ f for all f ∈ v′(body τ′) is as desired.
Proof of Proposition 3 The containment N ess P ⊆ N use P is straightforward, since a proof tree does not use any useless rule instantiations. We next show that N ess P ⊆ N min P . Suppose that we have an instantiation of rule τ with valuation v that is essential. Then, there exists some fact f and instance I for which every proof tree T has a vertex g with g = v(head τ ) and v(body τ ) ⊆ children T (g). By Lemma 4, we can pick this tree such that it uses only minimal rule instantiations. This implies that the rule instantiation with head g and body children T (g) is minimal. Hence, the instantiation with head v(head τ ) and body v(body τ ) is also minimal.
The following example demonstrates the different types of rule instantiations.
Example 2 Let P be the left-linear transitive closure program from Example 1; consider a rule instantiation of the recursive rule: T (a, b) ← T (a, c), R(c, b), for some (not necessarily different) constants a, b, c. We distinguish the following cases:
- c = a: in this case, the instantiation is not minimal, since we can derive the same head fact from the instantiation T (a, b) ← R(a, b) of the first rule.
- c = b: in this case, the instantiation is useless, since T (a, b) also belongs to the body.
Depending on which types of rule instantiations are supported by an economic policy, we can define different types of policies. An economic policy that supports all possible rule instantiations, that is, N all P , is said to be strongly supporting for Datalog program P .
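The case analysis of Example 2 can be replayed mechanically. The sketch below classifies an instantiation T(a, b) ← T(a, c), R(c, b) of the recursive rule, assuming that "useful" means the head fact does not occur in the body, and that "minimal" means no other instantiation with the same head has a strictly smaller body. For this particular program the only candidate for a smaller body is the base-rule instantiation T(a, b) ← R(a, b). The function name and the tuple encoding of facts are hypothetical.

```python
def classify_tc_instantiation(a, b, c):
    """Classify T(a,b) <- T(a,c), R(c,b) for the left-linear TC program."""
    head = ("T", a, b)
    body = {("T", a, c), ("R", c, b)}

    # Useless exactly when the head reappears in the body (the case c = b).
    useful = head not in body

    # The base rule T(a,b) <- R(a,b) derives the same head; its body is a
    # strict subset of `body` exactly when R(a,b) = R(c,b), i.e. c = a.
    base_body = {("R", a, b)}
    minimal = not (base_body < body)

    return {"minimal": minimal, "useful": useful}


for triple in [("a", "b", "a"), ("a", "b", "b"), ("a", "b", "c")]:
    print(triple, classify_tc_instantiation(*triple))
```

Only the last triple, with c different from both a and b, is reported as both minimal and useful.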

Proposition 4
Let P be a Datalog program and E an economic policy.
(1) If E supports all minimal and useful rule instantiations in P , then it is parallel-correct for P .
(2) If E is parallel-correct for P , then it supports all essential rule instantiations.
Proof The first item follows from Proposition 2 and Lemma 4. For the second item, consider a parallel-correct policy E and an essential instantiation of rule τ with valuation v. By the definition of essential, for some fact f and instance I , every proof tree T for f on I and P has a vertex g with g = v(head τ ) and v(body τ ) ⊆ children T (g). By Proposition 2, there must exist such a tree T that is supported. This implies that there exists server s with v(head τ ) = g ∈ facts P (s) and v(body τ ) ⊆ children T (g) ⊆ facts C (s). Hence, the essential rule instantiation is indeed supported.

Proposition 5
Let P be a Datalog program where each intensional relation name occurs only in the head of rules (i.e., P is a union of CQs). Then, N ess P = N min P .
Proof Because P is not recursive, N use P = N all P ; hence, because of Proposition 3 it suffices to show that N min P ⊆ N ess P . Indeed, consider a minimal instantiation for rule τ with valuation v, and consider the instance I = v(body τ ) and fact f = v(head τ ). Take any proof tree T for f on I and P ; T must have depth one. Because of the minimality of the rule instantiation, it must be that children T (f ) = v(body τ ), which proves the essentiality.
Together with Proposition 4, the above proposition implies that, for a Datalog program where the body of each rule contains only extensional database relations, an economic policy is parallel-correct if and only if it supports every minimal rule instantiation, or equivalently if and only if it supports every essential rule instantiation. Notice that this class of Datalog programs corresponds to programs that compute a set of UCQs, and thus the above result captures the characterization of parallel-correctness for CQs and UCQs in [7,17]. We should emphasize here that [7,17] consider only economic policies where P assigns every fact to every server, while a general economic policy can assign facts to any subset of servers.
For general Datalog programs, the equality N ess P = N min P does not hold anymore, and thus supporting essential instantiations is not a sufficient condition for parallel-correctness, even if P is non-recursive. (Recall that non-recursiveness is a syntactic condition, and that all such programs are straightforwardly rewritable to UCQs.)
Example 3 Consider the following non-recursive Datalog program P :

and take the rule instantiation with head U() and body {V (), R(a, b), S(c, d)}.
Assume that c ≠ d. This rule instantiation is minimal, but we will show that it is not essential.
For the sake of contradiction, assume that it is essential. Then, for some instance I there exists a proof tree T for U() on I and P with a vertex U() such that {V (), R(a, b), S(c, d)} ⊆ children T (U ()). Since the proof tree contains the fact V (), it also contains a rule instantiation that derives the fact V () with body {R(a′, b′), S(c′, d′), S(d′, c′)} for some constants a′, b′, c′, d′. We can now construct two proof trees for U() on the same instance, as seen in Fig. 1. Because c ≠ d, one of the facts S(c′, d′), S(d′, c′) must be different from S(c, d) (in Fig. 1 we assume this fact is S(d′, c′)). Thus, for one of the two trees, the children of U() will not be a subset of {V (), R(a, b), S(c, d)}. This implies that the rule instantiation we considered is indeed not essential.
T (x, y) ← R(x, y), is trivially minimal, useful and essential. As for the recursive rule, we showed in Example 2 that an instantiation that is minimal and useful is also essential. Observe that if this instantiation is only minimal but not useful, or only useful and not minimal, it is not essential. Thus, both properties are necessary to guarantee essentiality.
We conclude this section by commenting on whether it is computationally feasible to test the different properties of rule instantiations. It is easy to see that, given an instantiation, it is possible to check whether it is useful in polynomial time. Checking the minimality of a rule instantiation is coNP-complete [7]. Unfortunately, testing essentiality of a rule instantiation is undecidable.

Proposition 6 Testing essentiality of rule instantiations, as well as whether for a given rule an essential instantiation exists, is undecidable.
Proof We first show the latter. The proof is again by a reduction from the Datalog containment problem. For this let P 1 and P 2 be programs serving as input, with output predicate O (k) . As before, let P * 1 and P * 2 be the indexed versions of these programs.
Now the question whether P 1 ⊆ P 2 reduces to the question whether some essential rule instantiation for O(x) ← O 1 (x) exists. Indeed, if P 1 ⊆ P 2 , this cannot be the case, since a proof tree over {O} ∪ σ P 2 will always exist.
If P 1 ⊈ P 2 , then some I and t exist, with O 1 (t) ∈ P 1 (I ) and O 2 (t) ∉ P 2 (I ). Then, it is easy to see that all proof trees T with root T = O(t) contain the instantiation O(t) ← O 1 (t), which is thus essential.
Next, we show that essentiality of rule instantiations is undecidable. More precisely the proof is by contradiction. We show that, if essentiality of rule instantiations is decidable, then testing whether an essential instantiation for a given rule exists is decidable as well, which contradicts the earlier obtained result.
The algorithm relies on the observation that positive Datalog programs (without function symbols) are C-generic, with C being the constants occurring in P . Thus, if a rule instantiation is essential, all isomorphic instantiations (where values from C are preserved) are essential. Clearly there are only finitely many distinct instantiations (up to isomorphisms).
For a given rule τ ∈ P , one can thus iterate over the equivalence classes defined above, choose from each a specific instantiation, and test whether the chosen instantiation is essential. An essential instantiation is found if and only if the rule has an essential instantiation.
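The finiteness claim used in this argument can be illustrated by enumerating one representative valuation per isomorphism class. The sketch below assumes that two valuations are isomorphic when a bijective renaming of domain values that fixes the constants in C maps one to the other; each variable is therefore assigned either a constant from C, a previously introduced fresh value, or the next fresh value. The function name and the placeholder names `_n0`, `_n1`, ... for fresh values (assumed not to clash with constants) are hypothetical. The essentiality test itself, which the proof shows cannot exist, is of course not part of the sketch.

```python
from typing import Dict, Iterator, List, Sequence


def canonical_valuations(variables: Sequence[str],
                         constants: Sequence[str]) -> Iterator[Dict[str, str]]:
    """Yield one valuation per isomorphism class (constants are preserved)."""

    def extend(i: int, assignment: List[str], fresh_used: int):
        if i == len(variables):
            yield dict(zip(variables, assignment))
            return
        # Reuse a constant, reuse an earlier fresh value, or open a new one.
        choices = list(constants) + [f"_n{j}" for j in range(fresh_used + 1)]
        for value in choices:
            opens_new = value == f"_n{fresh_used}"
            yield from extend(i + 1, assignment + [value],
                              fresh_used + (1 if opens_new else 0))

    yield from extend(0, [], 0)


# Representatives for the variables of T(x, y) <- T(x, z), R(z, y), no constants.
for valuation in canonical_valuations(["x", "y", "z"], []):
    print(valuation)
```

Without constants, the number of representatives for n variables is the n-th Bell number, which is finite although it grows quickly with n.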

Generalized Hypercube Policies
In this section, we present a general class of economic policies, called Generalized Hypercube Policies (GHP), which encompass a broad variety of evaluation strategies.
We first give an intuitive explanation. The formalism of GHPs relies on the Hypercube partitioning for CQs [4], which has been shown to provide guarantees on the communication cost for CQ evaluation [9]. Let $P = \{\tau\}$ be a CQ with $k$ distinct variables. Hypercube conceptually orders the $p$ servers as a hypercube $H = [p_1] \times [p_2] \times \cdots \times [p_k]$, with $\prod_i p_i = p$, where every dimension $p_i \geq 1$ corresponds to a variable $x_i$ from the query; every server is assigned a unique point in the space $H$; and every variable $x_i$ is associated with a hash function $h_{x_i} : \mathbf{dom} \to [p_i]$. Then, a fact $R(a_1, \ldots, a_r)$, matching atom $R(y_1, \ldots, y_r) \in \mathit{body}_\tau$, is sent to all servers whose coordinate in the dimension of $H$ associated with variable $y_j$ is equal to $h_{y_j}(a_j)$, for all $j \in [r]$. Then, program $P$ is computed on each server over the data at hand.
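The routing step of Hypercube described above can be sketched as follows for a single CQ. The sketch fixes coordinates along the dimensions of the variables that occur in the atom and broadcasts along all other dimensions. The 0-based coordinates, the CRC32-based hash functions, and the function names are assumptions made for illustration; any family of hash functions into the respective dimension ranges would do.

```python
import itertools
import zlib
from typing import Dict, Sequence, Set, Tuple


def h(var: str, value, size: int) -> int:
    """Illustrative deterministic hash function for variable `var`."""
    return zlib.crc32(f"{var}:{value}".encode()) % size


def destinations(atom_vars: Sequence[str], fact_values: Sequence,
                 shares: Dict[str, int]) -> Set[Tuple[int, ...]]:
    """Points of the hypercube that receive the fact for the given atom."""
    binding = {}
    for var, val in zip(atom_vars, fact_values):
        if binding.setdefault(var, val) != val:
            return set()  # the fact does not match the atom
    axes = []
    for var, size in shares.items():
        if var in binding:
            axes.append([h(var, binding[var], size)])  # coordinate is fixed
        else:
            axes.append(range(size))                   # broadcast
    return set(itertools.product(*axes))


# Triangle query R(x, y), S(y, z), T(z, x) on p = 2 * 2 * 2 = 8 servers.
shares = {"x": 2, "y": 2, "z": 2}
r = destinations(("x", "y"), (1, 2), shares)
s = destinations(("y", "z"), (2, 3), shares)
t = destinations(("z", "x"), (3, 1), shares)
print(r & s & t)
```

The three facts of a satisfying valuation meet at exactly one point of the hypercube, which is why every server can evaluate the query locally over the data it received.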
For GHPs, we associate to every rule a hypercube over the full p-server network, and intuitively define the consumption policy so that "a fact is consumed at server i if and only if one of the considered Hypercube specifications would send it to server i", and "a fact is in the production policy of server i if and only if one of the Hypercube specifications would derive it on server i".

GHP Parameters
Let $P$ be a Datalog program, and assume we have a network $[p]$. A GHP for $P$ defines a finite set of $k$-dimensional hypercubes $H_1, \ldots, H_\ell$, for some parameter $k$. We note that the assumption that every hypercube has the same number of dimensions is without loss of generality, since we allow the range $[1]$ for dimensions. The ranges of the dimensions of the hypercubes are parametrized by a matrix of dimensions $\ell \times k$ with entries $p_{j,i}$, such that $\prod_{i=1}^{k} p_{j,i} = p$, for each $j \in [\ell]$. Each hypercube is then defined as $H_j = [p_{j,1}] \times [p_{j,2}] \times \cdots \times [p_{j,k}]$. For each hypercube $H_j$, we also define a bijective mapping $\mathit{map}_j$ that assigns to every point in $H_j$ a server $s \in [p]$. The latter thus provides the mapping between conceptual servers in the cube and real servers in the considered network.
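The constraints on these parameters are easy to check mechanically. The sketch below validates that every row of the dimension matrix multiplies to p and builds one possible bijection from the points of a hypercube to the servers 1, ..., p, namely the lexicographic enumeration of points. This particular bijection and the function names are assumptions for illustration; the framework allows any bijection.

```python
from itertools import product
from math import prod
from typing import Dict, List, Sequence, Tuple


def check_dimensions(matrix: Sequence[Sequence[int]], p: int) -> bool:
    """Every hypercube (row of the matrix) must contain exactly p points."""
    return all(prod(row) == p for row in matrix)


def lexicographic_map(row: Sequence[int]) -> Dict[Tuple[int, ...], int]:
    """One possible bijection from points of a hypercube to servers 1..p."""
    points = product(*[range(1, size + 1) for size in row])
    return {point: server for server, point in enumerate(points, start=1)}


# Two 2-dimensional hypercubes over a network of p = 6 servers.
matrix: List[List[int]] = [[2, 3], [6, 1]]
print(check_dimensions(matrix, 6))
print(lexicographic_map(matrix[0]))
```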
A GHP policy next assigns each rule $\tau$ to exactly one of the hypercubes: let $\chi : P \to [\ell]$ be the function that encodes this assignment. Given this assignment, a GHP defines a mapping $\rho_\tau : [k] \to \mathcal{P}(\mathit{vars}(\tau))$ that maps each dimension of the hypercube $H_{\chi(\tau)}$ to a subset of the variables that appear in $\tau$.
Finally, the GHP defines for each dimension i ∈ [k] and each hypercube H j a hash function h j i that maps subsets of dom of size at most α, where α is the largest size of ρ τ (i) (for any τ with χ(τ ) = j ), to a value in the i-th dimension. For hash functions that accept non-empty sets, we require that they are surjective. Notice that our concept of hash function is a generalization of the hash functions used in, e.g., the Hypercube algorithm, where α = 1. This generalization allows scattering tuples over a row in a more fine-grained way than is possible via a single variable. Further, we notice that, by definition, rules that use the same hypercube also use the same hash function for each dimension of that hypercube.

GHP Semantics
Let f be a fact and suppose that f = v(A), for some valuation v and atom A = R(y) that appears in rule τ . We define the following set of servers:
Intuitively, S τ f ,A denotes the set of servers whose coordinate q is consistent with the hash mappings specified for τ . Notice that if the atom R(y) has only a part of the variables that correspond to some dimension i, then facts are broadcast over dimension i, as happens if none of these variables are in y.
The consumption policy C(f ) is defined as the union over all sets S τ f ,A for rules τ and atoms A ∈ body τ with instantiation f . The production policy P (f ) is similarly defined as the union over all sets S τ f ,A for rules τ and atom A = head τ with instantiation f .

Example 5 Consider the Datalog program depicted in Fig. 2. We choose two hypercubes H 1 , H 2 (ℓ = 2) with dimension k = 2. The first two rules τ 1 , τ 2 are mapped to the hypercube H 1 , and the third rule τ 3 is mapped to H 2 . We choose the dimensions of the hypercubes such that p 1,1 · p 1,2 = p, p 2,1 = p, and p 2,2 = 1. The two functions map 1 , map 2 map the points of H 1 , H 2 respectively to {1, . . . , p} in a one-to-one fashion. Finally, the mapping of variables to dimensions is:

Consider the first two rules (which form the left-linear TC example), and assume that p 1,1 = 1 and p 1,2 = p. Then, the resulting GHP is equivalent to the hash partitioning policy that we described in Example 1. Notice that since we use the same hypercube for both rules, the extensional database relation R will be hash partitioned only once. If we now change the dimensions to p 1,1 = p, p 1,2 = 1, we obtain the decomposable policy of Example 1 that broadcasts the extensional R to every server and can terminate in a single round. Apart from the above two GHPs, we can also define other GHPs by configuring different dimensions of the hypercube H 1 . For example, we can choose p 1,1 = p 1,2 = √p.
We next show that GHPs are strongly supporting policies.

Proposition 7
Let P be a Datalog program. Every GHP E for P is strongly supporting for P and, as a consequence, parallel-correct for P .
Proof To show that E is strongly supporting, consider some rule τ ∈ P , and its instantiation w.r.t. some valuation v. Consider some atom A = R(y) in the body of τ ; then the consumption policy says that its instantiation f = v(A) will be consumed in the set S τ f ,A . Similarly, if A is the head, the fact f will be produced in S τ f ,A . Now we can write the intersection ⋂ A∈τ S τ v(A),A as:
In other words, there will be at least one server in ⋂ A∈τ S τ v(A),A , which means that every instantiation of the rule τ is supported.

GHP Families
Since we do not want to consider an encoding mechanism for hash functions-which is necessary to formally reason about properties for GHPs-we introduce the concept of GHP families. Given a Datalog program P and network [p], a GHP family F is defined as the set of GHPs over P and [p] that all have the same parametrization for P, map j , χ, ρ τ . In other words, policies in F can differ only with respect to the choice of hash functions, and for every choice of hash functions, the associated GHP is in the family. By F GHP we denote the class of all GHP families.

Bounded & Disjoint Evaluation
In this section, we ask two main questions. First, can we reason about the number of rounds that an economic policy needs to compute a Datalog program? Second, can we constrain the number of servers that derive a copy of the same fact? We start with a formal definition of boundedness.
Definition 5 (Boundedness) An economic policy E for Datalog program P is bounded if some constant k exists such that, for every instance I , the network reaches a global fixpoint for E and P when round k is finished. We say E is ℓ-bounded if k ≤ ℓ.
First, we remark that setting ρ τ to map to the empty set for all rules τ does not eliminate communication. Indeed, economic policies always send facts to all servers that may need the fact according to the policy, independently of whether the fact is already known by the target server. In other words, the responsibility to decide whether a fact is sent lies entirely with the sender. There is no trivial way to provide a 1-bounded economic policy.
Second, one should not confuse the number of rounds in the parallel computation with the number of iterations of semi-naive evaluation. Nevertheless, as the following proposition shows, boundedness of the Datalog program implies boundedness of the evaluation.

Proposition 8 If P is a bounded Datalog program, then every parallel-correct economic policy E for P is k-bounded, for some constant k that depends on P .
Proof We use the following claim.
( †) if we run E on an instance with bounded size, then E will finish its evaluation in a bounded number of rounds.
The result now follows from boundedness of P and Proposition 1. Boundedness of P implies that some constant exists, such that for every instance I and fact f , f ∈ P (I ) implies the existence of a proof tree with depth no more than the bound. We observe that a bound on depth implies also a bound on fringe size. Now, for arbitrary f and I , for f ∈ [P , E](I ) we observe that f ∈ P (I ), due to parallel-correctness, and thus, due to boundedness of P , that some proof-tree T with bounded fringe exists. Then, it follows from ( †) that the number of rounds of E on the instance consisting only of the fringe is bounded, and due to parallel-correctness of E, that f ∈ [P , E](fringe T ).
Since this observation holds for all f ∈ [P , E](I ), it follows from Proposition 1 that the number of rounds of E over the whole instance is also bounded.
It remains to show ( †). The crucial observation is that, in all but the last computation round at least some fact is communicated in the network that has not been communicated in any earlier round. Indeed, only new derivations can trigger a next communication round, and when a fact is received it will trigger new derivations only if it is not already known by the receiving server.
Since the instance is bounded, the active domain (of this instance) is bounded, and thus the number of facts that can be introduced during the evaluation is bounded as well. The result follows.
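The counting argument behind ( †) can be turned into a crude numeric bound. The sketch below counts the facts that can be formed over a schema and an active domain of a given size, and adds one for the final round in which nothing new is communicated. It assumes that the program introduces no constants beyond the active domain (otherwise they would have to be added to the domain size), and the function names are hypothetical. The bound is far from tight and only illustrates why it depends on the instance size alone.

```python
from typing import Dict


def max_facts(arities: Dict[str, int], adom_size: int) -> int:
    """Number of distinct facts over the schema and the active domain."""
    return sum(adom_size ** arity for arity in arities.values())


def round_bound(arities: Dict[str, int], adom_size: int) -> int:
    """Every round but the last communicates at least one new fact."""
    return max_facts(arities, adom_size) + 1


# Transitive closure schema over an instance with three domain values.
schema = {"R": 2, "T": 2}
print(max_facts(schema, 3), round_bound(schema, 3))
```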
Surprisingly, there exist economic policies for bounded Datalog programs that are not bounded. However, due to Proposition 8, such policies cannot be parallel-correct.
Example 6 Consider the following bounded program.

T (x) ← A(x). T (x) ← B(x), T (y).
We construct a network with p > 1 servers. Consider a policy that consumes T (i) and B(i) at server (i mod p) + 1, and produces T (i) at server (i mod p). Every tuple in A is consumed at server 1. Now, consider the following input instance: {A(0), B(1), B(2), . . . , B(p − 1)}. It is easy to see that T (0) is produced at server 1 at round 1, T (1) is produced at server 2 at round 2, and so on, until T (p − 1) is produced at round p at server p.
In the remainder of this section, we focus on pure Datalog (PureDatalog). A Datalog program is pure if it is free of constants and variables occur at most once in every atom [28]. We emphasize that this definition prohibits a variable from occurring on multiple positions in an atom, but that a variable can still occur in multiple (distinct) atoms of a rule. We note that, for a program P in pure Datalog, every fact over a P -consumable (P -derivable) relation itself is P -consumable (P -derivable).
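Pureness is a purely syntactic property and can be tested rule by rule. The sketch below assumes a hypothetical encoding in which a rule is a pair (head, body), an atom is a pair (relation name, argument tuple), variables are strings, and constants are any non-string values such as integers.

```python
from typing import List, Sequence, Tuple

Atom = Tuple[str, Tuple]          # (relation name, arguments)
Rule = Tuple[Atom, List[Atom]]    # (head, body)


def is_pure(program: Sequence[Rule]) -> bool:
    """True if no atom contains a constant or a repeated variable."""
    for head, body in program:
        for _relation, args in [head, *body]:
            if any(not isinstance(term, str) for term in args):
                return False  # a constant occurs in the atom
            if len(set(args)) != len(args):
                return False  # a variable occurs twice in the same atom
    return True


# The left-linear transitive closure program is pure: z occurs in two
# distinct atoms of the recursive rule, which the definition allows.
tc = [
    (("T", ("x", "y")), [("R", ("x", "y"))]),
    (("T", ("x", "y")), [("T", ("x", "z")), ("R", ("z", "y"))]),
]
print(is_pure(tc))
```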
We consider the following decision problems.

Since the proof is by reduction to an economic policy that is either 2-bounded or not bounded at all, it follows that BOUNDEDNESS
Proof The proof is by a reduction from the undecidable containment problem for Datalog programs. Let P 1 , P 2 be two Datalog programs with the same distinguished nullary output predicate O that serves as input. As before, we annotate the relation names of both programs P 1 and P 2 with index 1 and 2, respectively, and denote the obtained programs by P * 1 and P * 2 .
We now construct program P over the schema σ (P * 1 ) ∪ σ (P * 2 ) ∪ {Adom (1) , O (2) , E (2) }, by combining the rules from P * 1 , P * 2 , and those mentioned below. First we add rules Adom(x j ) ← X(x 1 , . . . , x α ) for every relation X (α) ∈ σ (P * 1 ) ∪ σ (P * 2 ) ∪ {E (2) } and j ∈ [α]. Further, we add:
Notice that the new relation E is an extensional database relation, while O and Adom are intensional database relations. Next, we define a GHP family H. For this, take a single 1-dimensional cube of p = 2 servers, say cube 1, and define χ(τ ) = 1 for all rules in P . For rules in P * 1 and P * 2 , as well as the Adom producing rules, we define ρ τ (1) = ∅. For rule τ 1 we again define ρ τ 1 (1) = ∅, for τ 2 and τ 3 we define ρ τ 2
We claim that P is 2-bounded if, and only if, P 1 ⊆ P 2 . Otherwise, a GHP exists in H for which the number of rounds depends on the size of the input, particularly on the size of relation E.
(If) Let E = (P , C) be an arbitrary economic policy from H. We observe that after a single round, programs P * 1 and P * 2 , as well as relation Adom are fully computed on both servers. (The latter is due to our choice ρ τ (1) = ∅ for the involved rules.) During the first round, O-facts may be produced by rule τ 2 . After this first round, several facts will be communicated, particularly the C-consumable facts derived with rules from P * 1 and P * 2 , as well as the facts with relation name Adom and O. Since these relations are computed on both servers, no server receives a new fact (particularly due to P 1 ⊆ P 2 ). Hence, the fixpoint is reached and no further communication steps are needed.
(Only if) Since P 1 ⊈ P 2 , some instance I exists, with O() ∈ P 1 (I ) and O() ∉ P 2 (I ). We convert instance I to an instance for P , by annotating the relations with the respective index, and add a relation E, with the chain E(0, 1), E(1, 2), . . . , E(m − 1, m) for some integer m.
We define a specific GHP from H. For this, let T I be the transitive closure relation of E I . As hash function, we choose h 1 1 ({i}) = i mod 2. We note that, by the choice of h 1 1 , E(0, 1) is consumed at server 1, E(1, 2) is consumed at server 2, E(2, 3) is consumed again at server 1, etc. We observe that server 1 derives the fact O(0, 1) in the first round and sends it to server 2. Then, server 2 can derive (in the second round) the fact O(1, 2), based on O(0, 1) and the fact E(1, 2) which it had already received in an earlier round. Now, a straightforward inductive argument shows that server i mod 2 receives fact O(i − 2, i − 1) for the first time in round i, and thus that we need Ω(m) rounds to reach a fixpoint. So for m large enough we need more than k rounds.
Proof The proof is again by a reduction from the undecidable Datalog containment problem. Let P 1 , P 2 be two Datalog programs over a single output relation, which serve as input for Datalog containment. We construct program P by taking all rules in P 1 , where all IDB relations are annotated with index 1, and all rules in P 2 , where all IDB relations are annotated with index 2. Here we assume that O 1 is the output predicate for P 1 , and O 2 for P 2 . We add the following rules, with fresh relation names {D i | i ∈ [k]}:
And for each i ∈ {1, . . . , k − 1} we add the rule: D i+1 () ← D i ().
We take an economic policy E = (P , C) over a two-node network. The consumption and production policies are defined as follows:
- All relations with index 1 are consumed and produced by server 1;
- All relations with index 2 are consumed and produced by server 2;
- All relations D i with even i are produced at server 2 and consumed at server 1;
- All relations D i with odd i are produced at server 1 and consumed at server 2; and
- Relation D k is produced at server 2 (even if k is odd).
Next, we show that E is k-bounded if and only if P 1 ⊆ P 2 .
(Only if) Suppose P 1 ⊈ P 2 . Then let I be an instance with O 1 () ∈ P 1 (I ) and O 2 () ∉ P 2 (I ). In the first round, server 1 derives fact D 1 (), which needs to be communicated in the next round (round 2) to server 2. In round 2, server 2 receives D 1 and produces D 2 , which needs to be communicated in the next round (round 3) to server 1. Since server 1 and server 2 cannot produce facts for relations D i in another way than via rule D i+1 () ← D i (), they are forced to continue this exchange of facts until D k is produced (at round k) and received by its consuming server (at round k + 1). Policy E is thus clearly not k-bounded.
(If) Suppose P 1 ⊆ P 2 . On every instance I , server 1 computes P 1 (I ), and server 2 computes P 2 (I ). We distinguish between three cases. If P 2 (I ) is empty, then the network fixpoint is reached immediately after the first round. If P 1 (I ) is empty and P 2 (I ) is not, then the fact D k () is derived at server 2 and may have to be sent to server 1 in the next round (if k is even). Since D 1 is not derived, and will not be derived after receiving D k (), the network fixpoint is reached after at most two rounds. The more interesting case is when P 1 (I ) is not empty. Then server 1 derives fact D 1 (), which triggers the consecutive exchange of D i () facts between the two servers as described in the only-if case of the proof, except that, when receiving fact D k−1 () (in round k), the fact D k () is already known by its consuming server (i.e., server 1 if k is even, server 2 otherwise). Therefore, the network fixpoint is reached already in round k, which concludes the proof.
Result (4) follows from the syntactical characterization shown in the next subsection. Towards this characterization, we first give a general characterization of 1-boundedness for strongly supporting policies.
Let P be a Datalog program and E = (P , C) an economic policy. We denote by P * the policy obtained by removing from every P (f ) any server s for which no rule instantiation v(τ ) exists with v(head τ ) = f , v(body τ ) ⊆ facts C (s), with v(body τ ) being all P -derivable. Intuitively, P * (f ) removes those servers that are allowed to produce f , but cannot due to limitations of the consumption policy C. Notice that if E = (P , C) is strongly supporting for P , then so is E = (P * , C), since we have not removed the support of any rule instantiation.

Proposition 11
Let P be a Datalog program and E = (P , C) a strongly supporting economic policy for P . E is 1-bounded if and only if for every P -derivable intensional database fact f :
(1) |C(f )| ≤ 1; and
(2) if |C(f )| = 1, then C(f ) = P * (f ).
Proof (If) All intensional database facts derived during the distributed evaluation are P -derivable. Consider a rule instantiation (τ, v) that is fired on some server s and produces fact f = v(head τ ). Then, condition (1) tells us that |C(f )| ≤ 1. If |C(f )| = 0, then f is not consumed anywhere and thus will not be communicated.
If |C(f )| = 1, condition (2) tells us that C(f ) = P * (f ). But since s ∈ P * (f ), this implies that C(f ) = {s}. Hence, s is the only server that consumes f , and f does not have to be sent to another server. Thus indeed E is 1-bounded. Notice that extensional database facts are never communicated after round 1.
(Only if) Suppose that E is 1-bounded. Let f be a P -derivable fact. Since E is strongly supporting, it is parallel-correct, thus f is derived at some server s over some instance I in round 1. In particular, s ∈ P * (f ). If C(f ) ⊈ {s}, then f needs to be communicated by s, which enforces another round and contradicts 1-boundedness. Hence, C(f ) ⊆ {s} and |C(f )| ≤ 1. Assume C(f ) = {s} and suppose that there exists some s′ ∈ P * (f ) \ {s}. Then, over some instance J , f is derived at s′ in round 1, and then needs to be communicated to s, which again contradicts 1-boundedness.
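The two conditions used in this proof, |C(f )| ≤ 1 and C(f ) = P * (f ) whenever |C(f )| = 1, can be checked directly on any finite sample of facts. The sketch below does so for policies given as dictionaries. Since the proposition quantifies over all P -derivable intensional facts, which are infinitely many in general, passing this check on a sample is only a necessary indication and not a decision procedure; the restricted production policy P * is assumed to be precomputed, and all names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Fact = Tuple
Policy = Dict[Fact, Set[int]]


def violates(fact: Fact, consumption: Policy, production_star: Policy) -> bool:
    """True if `fact` breaks condition (1) or condition (2)."""
    consumers = consumption.get(fact, set())
    if len(consumers) > 1:
        return True   # (1): consumed at two or more servers
    if len(consumers) == 1 and consumers != production_star.get(fact, set()):
        return True   # (2): some producer differs from the unique consumer
    return False


def conditions_hold(facts: Iterable[Fact], consumption: Policy,
                    production_star: Policy) -> bool:
    return not any(violates(f, consumption, production_star) for f in facts)


sample = [("T", 0), ("T", 1)]
consumption = {("T", 0): {1}, ("T", 1): {2}}
production_star = {("T", 0): {1}, ("T", 1): {2}}
print(conditions_hold(sample, consumption, production_star))
```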

Weakly Pivoting GHPs
We present a necessary and sufficient syntactic condition for 1-boundedness of GHP families. Here, for atom A and set of variables X ⊆ vars(A), we denote by pos A (X) the set of positions in A containing variables from X.
Definition 6 (Pivoting Relation) A relation R is pivoting for GHP family H if for every two atoms A 1 , A 2 (in rules τ 1 , τ 2 respectively) over R, and for all dimensions i of cube χ(τ 1 ) with p χ(τ 1 ),i > 1:
Intuitively, if R is pivoting, then every rule that sends R tuples will send each R tuple to exactly one server, and the rules agree on this server.
Definition 7 (Pivoting/Weakly pivoting) We say that a GHP family is pivoting (weakly pivoting, resp.) for P if all (all P -consumable, resp.) intensional database relations are pivoting.
The program from Example 7 is weakly pivoting. For pure programs we can test whether a GHP family is weakly pivoting in polynomial time, since we need to go over all P -consumable intensional database relations (for pure programs, these are all relations occurring in the body of a rule), and then for each such relation R test all pairs of atoms over R. This observation, along with the proposition below-that shows that weakly pivoting is a necessary and sufficient condition for 1-boundedness-implies that deciding 1-boundedness for GHP families is indeed in PTIME.

Proposition 12 Let P be a pure Datalog program, and H a GHP family. Then, H is 1-bounded for P if and only if it is weakly pivoting for P .
Proof (If) Let E = (P , C) be an arbitrary economic policy in H that is weakly pivoting for P . We show that H is 1-bounded for P by making use of Proposition 11. For this, let f be an arbitrary P -derivable fact. We first deal with the case when f is not P -consumable. Due to pureness of P , the latter implies that no rule in P exists with a body atom that can match with f . It follows from the definition of GHP that f is not C-consumable (i.e., |C(f )| = 0) and thus that the conditions in Proposition 11 are true for f .
For the remainder of this direction of the proof we have to deal only with the case when f is P -derivable and P -consumable, which implies |P (f )| ≥ 1 and |C(f )| ≥ 1. Recall that a server s is in C(f ) iff there is a rule τ ∈ P , atom A ∈ body τ , and valuation v, with v(A) = f . Analogously, server s is in P (f ) iff there is a rule τ ∈ P and valuation v, with v(A) = f for A = head τ . Moreover, by Definition 6 for weakly pivoting programs, in both cases server s is uniquely determined (for given τ and A) by the parameters χ(τ ) and pos A (ρ τ (i)), for all dimensions i of χ(τ ). (We ignore map χ(τ ) , which is fixed for χ(τ ).) Next, we show that |C(f )| = 1. For this, take arbitrary elements s 1 , s 2 ∈ C(f ). Then, due to Definition 6 it follows directly that s 1 = s 2 . Hence, indeed |C(f )| = 1, which corresponds to condition (1) of Proposition 11. The argument for condition (2) of Proposition 11, that s 1 = s 2 for every pair of servers s 1 ∈ P (f ) and s 2 ∈ C(f ) (which implies P (f ) = C(f )), is analogous. We thus conclude from Proposition 11 that E is 1-bounded. Then, from the generality of the argument it follows that H is 1-bounded.
(Only If) Let R be an arbitrary consumable intensional predicate name in P . We argue that R is pivoting by showing conditions (1), (2), and (3) from Definition 6 by contraposition. Notice that all facts over R are both P -consumable and P -derivable, due to pureness of P and because R is intensional.
First suppose that (1) fails for some τ 1 , i, and atom A 1 over R. If A 1 is a body atom it follows immediately that all facts f over R are replicated in the construction for C over dimension i, which implies |C(f )| ≥ 2 and thus contradicts 1-boundedness due to Proposition 11. Now assume A 1 is the head of τ 1 . If ρ τ 1 (i) = ∅ it follows that all rule instantiations for τ 1 are replicated over dimension i, and thus that |P * (f )| > 1 for all facts f matching the head of τ 1 . Since such a fact f is C-consumable and p i > 1 (which implies |C(f )| ≥ 2), this again contradicts 1-boundedness. For the case where ρ τ 1 (i) ≠ ∅, a similar argument holds: Take x ∈ ρ τ 1 (i) \ vars(A 1 ) and consider two valuations that agree on all variables except x. We can now choose the hash functions for ρ τ 1 so that both rule instantiations are fired on different servers (due to p 1 > 1), and thus again |P * (f )| > 1, for some C-consumable fact f , which contradicts 1-boundedness.
For condition (2), χ(τ 1 ) = χ(τ 2 ) allows choosing valuations for τ 1 and τ 2 that agree on A 1 and A 2 (due to pureness), and then hash functions can be chosen so that both are fired on different servers. Since all matching facts are C-consumable, this would contradict 1-boundedness (since |P (f )| > 1 implies P (f ) ≠ C(f )).
We remark that Proposition 12 cannot be easily generalized. For example, one cannot replace GHP families by strongly supporting policies, since then facts f that are not P -consumable may still be C-consumable (i.e., C(f ) ≠ ∅). Reasoning about the latter requires a concrete representation mechanism for policies. (See also [7] for a discussion on this matter.) Further, it is unclear what the complexity of testing 1-boundedness becomes under general (not necessarily pure) Datalog, since one then has to reason about P -derivability of facts.

Example 8
For an example showing that not every 1-bounded GHP is weakly pivoting, consider the following non-pure Datalog program P : T (x, y) ← R(x, y). T (x, y) ← T (z, x), R(z, y).
and GHP family H over a single one-dimensional cube 1. Let map 1 be the identity mapping, χ(τ ) = 1 and ρ τ (1) = {x} for all rules τ . Clearly, H is not weakly pivoting. Nevertheless, it can be shown that H is 1-bounded, which follows from the observation that only single-valued rule instantiations can satisfy under P .

Weakly Pivoting Datalog
We have so far looked at whether a given GHP family is 1-bounded. In this section, we ask: which Datalog programs admit a 1-bounded policy?
, x 2 , x 3 ) and b = (1, 3)
Definition 8 (Pivot Base) Let P be a Datalog program, and let σ ⊆ IDB(P ). Let β be a function that takes as input some relation name R ∈ σ and outputs a non-empty tuple with values in [ar(R)]. We say that β is a pivot base for σ if:
- For every rule τ ∈ P and for every pair of atoms
A Datalog program P is pivoting (weakly pivoting, resp.) if it has a pivot base for all relations in IDB(P ) (for all relations in IDB(P ) that occur in the body of some rule in P ).
Here, there are two intensional database relations, but only T occurs in the body of a rule. The pivot base β from before is still a pivot base for {T }; hence the program is weakly pivoting. However, there is no pivot base for {T , U}, which means that the program is not pivoting.
The concept of pivoting Datalog was first introduced for single-rule programs [35] and then generalized to full Datalog [28], where it is called generalized pivoting. The definition in [28] is based on a rather complex argument over fractional weight mappings, but relates to pivoting in that every generalized pivoting Datalog program is pivoting for all intensional database relations. For pure Datalog these notions are equivalent. The proposition below shows that for pure Datalog, a weakly pivoting program admits a weakly pivoting (and thus 1-bounded) GHP family.

Proposition 13
Let P be a pure Datalog program and p ≥ 2. There is a 1-bounded GHP family for P if and only if P is weakly pivoting.
In the propositions below, we show slightly stronger results. We start with the if-direction of Proposition 13, which follows from Proposition 12 and Proposition 14 below. Henceforth, for an atom A with relation symbol R and a tuple b of integers from [ar(R)], we write vars A [b] to denote the set of variables in A on the positions defined by b.

Proposition 14
Let P be a pure and weakly pivoting Datalog program. For every p there is a weakly pivoting GHP family for P over [p].
Proof Take a weak pivot base β for P . We construct GHP E over network [p] by considering a single cube, cube 1, with only one dimension. We choose χ(τ ) = 1 for every τ ∈ P , and map 1 as the mapping from the single-point coordinates to servers in [p] that expresses identity. Now for rules τ ∈ P having no atom with associated pivot base, we define ρ τ (1) = ∅; for all other rules we define ρ τ (1) = vars A [β(R)], where A is an atom of τ with associated pivot base and R is the relation symbol of A. It is easy to see that the resulting GHP is indeed weakly pivoting.
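The construction in this proof can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the rule encoding, the helper names `vars_at`, `rho` and `servers`, the toy hash (sum of values modulo p), the example rules, and the choice β(T) = (1), which is merely assumed to be a pivot base for the example. Replication for an empty ρ τ (1) follows the reading used in the proof of Proposition 12.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]   # (relation symbol, variables by position)
Rule = List[Atom]                    # head atom first, then the body atoms


def vars_at(atom: Atom, positions: Tuple[int, ...]) -> FrozenSet[str]:
    """vars_A[b]: the variables of atom A on the 1-based positions in b."""
    _, args = atom
    return frozenset(args[i - 1] for i in positions)


def rho(rule: Rule, beta: Dict[str, Tuple[int, ...]]) -> FrozenSet[str]:
    """rho_tau(1): pivot variables of the first atom covered by beta, else empty."""
    for atom in rule:
        if atom[0] in beta:
            return vars_at(atom, beta[atom[0]])
    return frozenset()


def servers(valuation: Dict[str, int], hash_vars: FrozenSet[str], p: int) -> Set[int]:
    """Servers in [p] = {1, ..., p} on which a rule instantiation is fired."""
    if not hash_vars:
        # Empty rho: the instantiation is replicated over the whole dimension.
        return set(range(1, p + 1))
    # Toy hash (an assumption): symmetric in its arguments, integer domain only.
    return {sum(valuation[x] for x in hash_vars) % p + 1}


# Hypothetical example: left-linear transitive closure with beta(T) = (1).
beta = {"T": (1,)}
rule1: Rule = [("T", ("x", "y")), ("R", ("x", "y"))]
rule2: Rule = [("T", ("x", "y")), ("T", ("x", "z")), ("R", ("z", "y"))]

v = {"x": 4, "z": 5, "y": 7}
print(servers(v, rho(rule2, beta), 3))   # {2}
```

In this sketch both rules hash on x only, so the instantiation that consumes T(4, 5) and the one that produces T(4, 7) are fired on the same server, which is the behaviour a one-dimensional pivoting policy is meant to achieve.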
For the only-if direction, we introduce the concept of straggler. Let P be a Datalog program and E a strongly supporting economic policy for P over [p]. We call s ∈ [p] a straggler for an intensional relation R if either s ∈ C(f ) for all facts f of R or s ∈ P * (f ) for all facts f of R. A straggler thus is a server that consumes or produces an entire relation.
The only-if direction of Proposition 13 then follows from Proposition 15 and Proposition 16, which are given below.
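As an illustration of the straggler definition, the following Python sketch checks it on a finite sample of facts. The function name `stragglers` and the dictionary encoding of C and P * are assumptions made for this example. Since the definition quantifies over all facts of R, a check on a finite sample can only refute that a server is a straggler; it cannot confirm it.

```python
from typing import Dict, Hashable, Iterable, Set

Fact = Hashable


def stragglers(
    facts: Iterable[Fact],
    network: Iterable[int],
    consume: Dict[Fact, Set[int]],   # finite sample of C
    produce: Dict[Fact, Set[int]],   # finite sample of P*
) -> Set[int]:
    """Servers that consume every sampled fact or produce every sampled fact."""
    sample = list(facts)
    return {
        s
        for s in network
        if all(s in consume.get(f, set()) for f in sample)
        or all(s in produce.get(f, set()) for f in sample)
    }


# Hypothetical facts over a binary relation T on a network [3].
t_facts = [("T", 1, 2), ("T", 2, 3), ("T", 3, 1)]

# Facts spread by their first coordinate: no server sees the whole relation.
spread = {("T", 1, 2): {2}, ("T", 2, 3): {3}, ("T", 3, 1): {1}}
print(stragglers(t_facts, [1, 2, 3], spread, spread))   # set()

# Everything produced on server 1: server 1 is a straggler candidate.
central = {f: {1} for f in t_facts}
print(stragglers(t_facts, [1, 2, 3], spread, central))  # {1}
```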

Proposition 15
Let P be a pure Datalog program. If H is a weakly pivoting GHP for P over a network where p ≥ 2, then every E ∈ H is without stragglers for P -consumable intensional database relations.
Proof To show that E has no stragglers, we recall that for a weakly pivoting policy (a) |ρ τ (i)| > 0 for every rule τ and dimension of the cube χ(τ ), and that (b) for every atom A over a P -consumable intensional relation, ρ τ (i) ⊆ vars(A). Condition (b) implies that the facts from P -consumable intensional relations are consumed and produced at just one server. Condition (a) implies that the hash functions used by E are surjective and thus that, for every intensional P -consumable relation R, for at least some pair of facts f and g over R we have C(f ) ≠ C(g), and analogously, for some pair we have P * (f ) ≠ P * (g). In other words, not all facts over R are consumed or produced on the same server, which proves the desired result.

Proposition 16
Let P be a pure and not weakly pivoting Datalog program. Then, every strongly supporting economic policy that is 1-bounded has a straggler for some consumable intensional database relation in P .
In the remainder of this section, we prove Proposition 16. For this, we introduce the notion of policy key for economic policy E = (C, P ) and Datalog program P . Let R be some intensional database relation and γ a tuple of integers in [ar(R)]. Then γ is called a policy key for R in E, if for all facts f , g over R, f [γ ] = g[γ ] implies P * (f ) = P * (g) = {s}, for some server s; and C(f ) = C(g) = ∅ or C(f ) = C(g) = {s} (with s denoting the same server as before). When E is clear from the context we omit mentioning E and say that γ is a policy key for R. We call γ empty if γ = ().
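The policy key condition can be made concrete with a small Python sketch over a finite set of facts. The names `project` and `is_policy_key`, the encoding of facts as value tuples, and the example policy are assumptions made for illustration; the definition itself ranges over all facts of R, so a finite check is only indicative.

```python
from typing import Dict, List, Set, Tuple

Fact = Tuple[int, ...]   # values of a fact over the fixed relation R


def project(fact: Fact, gamma: Tuple[int, ...]) -> Tuple[int, ...]:
    """f[gamma]: the values of fact f on the 1-based positions in gamma."""
    return tuple(fact[i - 1] for i in gamma)


def is_policy_key(
    gamma: Tuple[int, ...],
    facts: List[Fact],
    produce: Dict[Fact, Set[int]],   # finite sample of P*
    consume: Dict[Fact, Set[int]],   # finite sample of C
) -> bool:
    """Check the policy key condition for gamma on the sampled facts."""
    for f in facts:
        for g in facts:
            if project(f, gamma) != project(g, gamma):
                continue
            # P*(f) = P*(g) = {s} for a single server s.
            if len(produce[f]) != 1 or produce[f] != produce[g]:
                return False
            # C(f) = C(g), and this set is either empty or the same {s}.
            if consume[f] != consume[g]:
                return False
            if consume[f] and consume[f] != produce[f]:
                return False
    return True


# Hypothetical policy on network [2]: hash the first coordinate.
r_facts = [(1, 2), (1, 3), (2, 3)]
place = {f: {f[0] % 2 + 1} for f in r_facts}

print(is_policy_key((1,), r_facts, place, place))    # True
print(is_policy_key((2,), r_facts, place, place))    # False
print(is_policy_key((), r_facts, place, place))      # False
```

In this example the empty tuple is not a policy key because the facts are spread over two servers; it becomes one exactly when a single server produces all sampled facts, which matches the intuition that an empty policy key corresponds to a straggler.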
For 1-bounded and strongly supporting economic policies, all C-consumable intensional database relations have a policy key (namely the tuple containing all positions), which follows immediately from Proposition 11.

Lemma 5
For a pure Datalog program P and a 1-bounded strongly supporting economic policy E = (P , C) for P , the following are equivalent for each intensional predicate R: 1. R has an empty policy key in E; 2. E has a straggler for R.
Proof For (1) ⇒ (2): By definition of policy key, there is a server s that belongs to P * (f ) for every fact f with predicate R. Hence, s is a straggler for R.
We can also show the following technical results regarding policy keys.

Lemma 6
Let E be an economic policy and R a relation in the schema of E. If γ 1 and γ 2 are policy keys for R, then every tuple γ having all integers that are in both γ 1 and γ 2 is a policy key for R.
vars A 1 [γ 1 ] = vars A 2 [γ 2 ] follows from Lemma 6 and the assumption that γ 1 and γ 2 are minimal.
To show that γ is indeed a key for R 1 , let f and g be two arbitrarily chosen facts over R 1 , with f [γ ] = g[γ ]. Let v 1 be a valuation for τ , with v 1 (A 1 ) = f ; and let v 2 be a valuation for τ with v 2 (A 1 ) = g. We consider also a valuation v for rule τ , with v(x) = v 1 (x) for all x ∈ vars(A 1 ) \ vars A 2 [γ 2 ] and v(x) = v 2 (x) for all other variables. (Recall that all these valuations exist because P is pure.) Since R 1 and R 2 have a key, every fact h with relation name R 1 or R 2 is associated to a unique server s, with P * (h) = {s}, and C(h) = ∅ (if it is not consumed) or C(h) = {s} (if it is consumed). Therefore, and since E is strongly supporting, the server associated to v 2 (A 1 ) is the same server as associated to v 2 (A 2 ). Call this server s 2 . For the same reason, the server associated to v(A 1 ) is the same server as associated to v(A 2 ). Call this server s 3 . Now since v(A 2 ) and v 2 (A 2 ) agree on their key (that is, v(vars A 2 [γ 2 ]) = v 2 (vars A 2 [γ 2 ])), it follows that s 2 = s 3 .
To conclude the proof, we observe that v 1 (A 1 ) agrees with v(A 1 ) on its key (particularly because v 1 and v 2 agree on the values for variables in vars A 1 [γ 1 ] ∩ vars A 2 [γ 2 ]), and thus that the server s 2 is associated also to v 1 (A 1 ). In other words, P * (f ) = P * (g) and C(f ) = C(g), which proves that γ is indeed a key for R 1 .

Proof of Proposition 16
Suppose E is a strongly supporting economic policy that is 1-bounded. For the sake of contradiction, assume that E has no stragglers for any C-consumable intensional database relations. Then, by Lemma 5 all P -consumable relations have only non-empty policy keys, which implies existence of minimal non-empty policy keys for the C-consumable intensional relations. In turn, due to Lemma 7 we can take these minimal keys as pivot base. However, this contradicts the fact that P is not weakly pivoting.

Bounded and Disjoint Evaluation
Sometimes we want to guarantee that, at the end of a computation, no two copies of the same fact have been derived at different servers. We call this property disjointness.
Definition 9 (Disjointness) Let P be a Datalog program, and R an intensional relation name of P . We call an economic policy E for P R-disjoint if for every instance, every fact of R is produced in at most one server.
We study economic policies that are both 1-bounded and disjoint.
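To illustrate Definition 9, the following Python sketch checks R-disjointness on the trace of a single run. The event format (server, fact), the function names, and the sample trace are assumptions made for this example. The definition quantifies over every instance, so a check on one trace can only reveal a violation; it cannot establish disjointness.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Fact = Tuple            # (relation name, value, value, ...)
Event = Tuple[int, Fact]  # (server, fact derived on that server)


def production_map(events: Iterable[Event]) -> Dict[Fact, Set[int]]:
    """Collect, for every derived fact, the set of servers that produced it."""
    produced: Dict[Fact, Set[int]] = defaultdict(set)
    for server, fact in events:
        produced[fact].add(server)
    return dict(produced)


def is_disjoint_on_trace(events: Iterable[Event], relation: str) -> bool:
    """True if no fact of `relation` was produced on two different servers."""
    produced = production_map(events)
    return all(
        len(where) <= 1
        for fact, where in produced.items()
        if fact[0] == relation
    )


# Hypothetical trace: T(1, 2) is derived twice, but on the same server.
trace = [(1, ("T", 1, 2)), (2, ("T", 2, 3)), (1, ("T", 1, 2))]
print(is_disjoint_on_trace(trace, "T"))                         # True

# Deriving T(1, 2) on a second server violates T-disjointness.
print(is_disjoint_on_trace(trace + [(2, ("T", 1, 2))], "T"))    # False
```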

Proposition 17
Let P ∈ PureDatalog and H a GHP family for P . Then, H is 1-bounded, disjoint for P , and without stragglers for intensional database relations if and only if H is pivoting.
For the proof, we use the next auxiliary result.

Lemma 8
An economic policy E = (C, P ) that is 1-bounded, disjoint, strongly supporting and without stragglers for intensional database relations of the associated Datalog program has non-empty minimal keys for these relations.
Proof Due to 1-boundedness and the absence of stragglers for intensional database relations, it follows from Lemma 5 that only non-empty keys (and thus non-empty minimal keys) for C-consumable intensional database relations exist.
We can now show the proof for Proposition 17.
Proof for Proposition 17 (If.) Since a pivoting GHP is also weakly pivoting, it follows from Proposition 12 and Proposition 15 that E is 1-bounded and without stragglers for P -consumable intensional relations.
For the remainder of the proof we observe that there exists s ∈ P * (f ) iff there is a rule τ ∈ P and valuation v such that v(head τ ) = f . Due to Definition 6 for pivoting GHPs, s is identified uniquely per rule τ by the combination χ(τ ) and pos head τ (ρ τ (i)) for all i with p i ≥ 1.
For s 1 , s 2 ∈ P * (f ), it follows again from Definition 6 for pivoting GHPs, that s 1 = s 2 , thus |P * (f )| = 1. Hence, E is indeed disjoint. Due to surjectivity of the considered hash functions, it follows that E has no stragglers for intensional database relations.
(Only if.) 1-boundedness implies that H is weakly pivoting due to Lemma 8. It remains to show that conditions (1) and (2) also hold for relations that are not consumable. The proof is again by contraposition and completely analogous to the proof of Proposition 12. Only now A 1 and A 2 must be head atoms, and we use disjointness to argue |P * (f )| = 1 for all facts f over R 1 .
Next, we show which programs admit a 1-bounded, disjoint policy.

Proposition 18
Let P ∈ PureDatalog. Then P is pivoting if, and only if, P admits a 1-bounded, strongly supporting, disjoint economic policy without stragglers for intensional database relations.
Proof (If) The result follows immediately from Lemma 8 and Lemma 7.
(Only If) The proof is analogous to the proof of Proposition 14 and takes the pivot base β for P to obtain a pivoting GHP for P . The result then follows from Proposition 17, which shows that this pivoting GHP is 1-bounded, strongly supporting, disjoint, and without stragglers for intensional database relations.

Remark 1
The reader may wonder how the above concepts relate to the class of decomposable programs [36,37]. A decomposable program is a (single rule) Datalog program that admits an evaluation strategy (via predicate restrictions) that is parallel-correct, 1-bounded, disjoint, and non-trivial. (Here non-triviality means that all servers do part of the work.) We did not consider the non-triviality property, but instead require the absence of stragglers. Nevertheless, for GHPs, non-triviality is implied, at least for pure Datalog, by the use of surjective hash functions.

Conclusion
We have introduced a theoretical framework to reason about multi-round Datalog evaluation in a distributed setting. In this framework we study three properties: parallel-correctness, boundedness, and disjointness. There are many interesting questions left open. For example, it would be interesting to come up with restrictions on Datalog programs and economic policies, for which the mentioned properties are decidable. In fact, recent work by Neven et al. [27] extends our work in that direction. Among other results, they show that parallel-correctness is already undecidable even for heavily restricted fragments of Datalog, including monadic Datalog (for which the containment problem is decidable).
Another interesting direction for future work would be to study the problem of finding economic distribution policies with desired properties, which should not necessarily be harder than deciding properties over given policies. A related question then is which properties, besides the ones studied in the paper, are relevant in a practical context. One interesting option would be to define a fairness condition for economic policies, e.g., an instance-independent notion of load-balancing; another option is to study bounds on the amount of communication needed to evaluate Datalog programs. Yet another direction is to consider smarter algorithms for local Datalog evaluation than semi-naive, for example by allowing unique-decomposition conditions (cf. [5]) to be expressed in the economic policy.