Abstract 1 Introduction 2 Preliminaries 3 Model-checking 4 Coherence 5 Disjunctions of higher-arity dependence atoms 6 Outlook References Appendix A Appendix

Disjunctions of Two Dependence Atoms

Nicolas Fröhlich ORCID Leibniz Universität Hannover, Germany Phokion G. Kolaitis ORCID University of California Santa Cruz, CA, USA
IBM Research, San Jose, CA, USA
Arne Meier ORCID Leibniz Universität Hannover, Germany
Abstract

Dependence logic is a formalism that augments the syntax of first-order logic with dependence atoms asserting that the value of a variable is determined by the values of some other variables, i.e., dependence atoms express functional dependencies in relational databases. On finite structures, dependence logic captures NP, hence there are sentences of dependence logic whose model-checking problem is NP-complete. In fact, it is known that there are disjunctions of three dependence atoms whose model-checking problem is NP-complete. Motivated from considerations in database theory, we study the model-checking problem for disjunctions of two unary dependence atoms and establish a trichotomy theorem, namely, for every such formula, one of the following is true for the model-checking problem: (i) it is NL-complete; (ii) it is L-complete; (iii) it is first-order definable (hence, in AC0). Furthermore, we classify the complexity of the model-checking problem for disjunctions of two arbitrary dependence atoms, and also characterize when such a disjunction is coherent, i.e., when it satisfies a certain small-model property. Along the way, we identify a new class of 2CNF-formulas whose satisfiability problem is L-complete.

Keywords and phrases:
Dependence logic, coherence, model-checking, complexity, functional dependencies
Funding:
Nicolas Fröhlich: The author appreciates funding by the German Research Agency (DFG) under the grant ME2479/3-1 and project id 511769688.
Arne Meier: The author appreciates funding by the German Research Agency (DFG) under the grant ME2479/3-1 and project id 511769688.
Copyright and License:
[Uncaptioned image] © Nicolas Fröhlich, Phokion G. Kolaitis, and Arne Meier; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Problems, reductions and completeness
; Theory of computation Logic ; Theory of computation Logic and databases
Related Version:
Full Version: https://arxiv.org/abs/2508.16146 [6]
Acknowledgements:
We thank the anonymous reviewers for their valuable comments.
Editors:
Stefano Guerrini and Barbara König

1 Introduction

Notions of dependence and independence are ubiquitous in several different areas of mathematics and computer science, including linear algebra, probability theory, and database theory. Dependence logic 𝒟, developed by Väänänen in [14], is a formalism for expressing and studying such notions. The basic building blocks of dependence logic are the dependence atoms, which, in effect, express functional dependencies on database relations. Recall that functional dependencies are integrity constraints on databases asserting that the values of some attribute of a database relation is determined by the values of some of the other attributes of that relation (see [1]). For example, if R(course,dpt,inst) is a database relation with attributes for courses, departments, and instructors, then the functional dependency (coursedpt) asserts that each course is offered by only one department, while the functional dependency (instcourse) asserts that each instructor teaches only one course. In dependence logic 𝒟, these two functional dependencies are captured by the dependence atoms dep(course,inst) and dep(inst,course), respectively. The formulas of 𝒟 are obtained by combining dependence atoms, atomic formulas, and negated atomic formulas using conjunction , disjunction , existential quantification , and universal quantification .

Even though the syntax of 𝒟 resembles that of first-order logic 𝖥𝖮, the semantics of 𝒟 is second-order. Instead of single assignments of values to variables, the semantics of 𝒟 uses sets of assignments, called teams, that represent database relations on which the dependence atoms are interpreted. The second-order character of the semantics of 𝒟 is already manifested in the semantics of disjunction: a structure and a team T satisfy an 𝒟-formula of the form φ1φ2 (in symbols, (,T)φ1φ2) if there are two teams T1 and T2 such that T=T1T2, (,T1)φ1, and (,T2)φ2. In terms of expressive power and as regards sentences, 𝒟 is known to have the same expressive power as existential second-order logic 𝖤𝖲𝖮 [10, 14]. From this result and Fagin’s Theorem [4], it follows that, on the class of all finite structures, the sentences of dependence logic can express precisely all decision problems in NP. In particular, there are 𝒟-sentences that can express NP-complete problems; in other words, the model-checking problem for such sentences is NP-complete.

In view of the preceding state of affairs, it is natural to ask: are there syntactic restrictions such that the model-checking problem for 𝒟-formulas obeying these restrictions is tractable? Rather surprisingly, Kontinen [8] showed that the model-checking problem can be NP-complete even for disjunctions of three dependence atoms. Concretely, consider the 𝒟-formula dep(x,y)dep(u,v)dep(u,v). In [8], it is shown that the following problem is NP-complete: given a team T of arity 4, does Tdep(x,y)dep(u,v)dep(u,v)? In other words, are there three teams T1,T2,T3 such that T=T1T2T3, T1dep(x,y), T2dep(u,v), and T3dep(u,v)? On the tractability side, Kontinen [8] showed that the model-checking problem for disjunctions of two dependence atoms is always in NL, i.e., it is always solvable in non-deterministic logarithmic space. This was achieved via a logspace reduction of the model-checking problem for a disjunction of two dependence atoms to 2SAT, the satisfiability problem for 2-CNF formulas, which is well known to be in NL.

Contributions.

In this paper, we carry out a systematic investigation of the disjunctions of two dependence atoms. The motivation for this investigation is twofold. First, while the model-checking for disjunctions of two dependence atoms is always in NL, there are such disjunctions for which the model-checking is NL-complete, while for others it is 𝖥𝖮-definable (hence also in uniform AC0). So, one main aim is to pinpoint the exact complexity of the model-checking problem for a given disjunction of two dependence atoms. Second, disjunctions of dependence atoms capture natural database integrity constraints that have not been considered earlier. To make this point, let us consider the disjunction dep(course,dpt)dep(inst,dpt). The disjunct dep(course,dpt) expresses the functional dependency that each course is offered by only one department, while the disjunct dep(inst,dpt) expresses the functional dependence that each instructor is affiliated with only one department. It is unlikely that either of these two functional dependencies holds in any university, since it is typically the case that some courses are cross-listed by two departments and some instructors have joint appointments in two or more departments. Now, the disjunction dep(course,dpt)dep(inst,dpt) expresses the constraint that the database relation R(course,dpt,inst) can be split into two parts such that the functional dependency dep(course,dpt) holds in the first part and the functional dependency dep(inst,dpt) holds in the second part. This is a more relaxed integrity constraint, and it is quite plausible that it holds in many universities. Yet, this type of integrity constraint has not been investigated in database theory. The reason is that the largest collection of integrity constraints in databases studied in the past is the collection of embedded implicational dependencies [5], each of which is 𝖥𝖮-definable, while, as our result will imply, the disjunction dep(course,dpt)dep(inst,dpt) is not 𝖥𝖮-definable. We believe that disjunctions of dependence atoms deserve to be studied as database integrity constraints in their own right; the work reported here makes a first step in this direction by focusing on the model-checking problem for such formulas.

We classify the complexity of the model-checking problem for disjunctions of two dependence atoms by establishing a trichotomy theorem: for every 𝒟-formula φ that is the disjunction of two dependence atoms, one of the following three statements holds: (i) the model-checking problem for φ is NL-complete; (ii) the model-checking problem for φ is L-complete, where L is the class of all decision problems solvable in deterministic logarithmic space; (iii) the model-checking problem for φ is 𝖥𝖮-definable, hence in uniform AC0, where AC0 is the class of all decision problems solvable by constant-depth, polynomial-size circuits.

As an illustration of this trichotomy theorem, consider again the database relation R(course,dpt,inst) with data about courses, departments, and instructors. Consider also the dependence atoms dep(course,inst), dep(course,dpt), dep(inst,dpt), dep(inst,course) that express, respectively, the functional dependencies (courseinst), (coursedpt), (instdpt), (instcourse). Our trichotomy theorem classifies the complexity of the model-checking for all possible disjunctions of these four dependence atoms. In particular, it implies that the following statements are true:

  • The model-checking problem for dep(course,dpt)dep(inst,dpt) is NL-complete.

  • The model-checking problem for dep(inst,course)dep(course,inst) is L-complete.

  • The model-checking problem for dep(course,dpt)dep(course,inst) is 𝖥𝖮-definable, hence it is in uniform AC0.

As a byproduct of the preceding complexity-theoretic classification, we also identify a collection of 2-CNF formulas for which the satisfiability problem is L-complete.

The preceding results are about unary dependence atoms dep(x,y), where x and y are variables. The syntax of dependence logic allows also for higher-arity dependence atoms, i.e., for dependence atoms of the form dep(x1,,xn,y) with n>1; such atoms assert that the value of the variable y is determined by the values of the variables x1,,xn. We leverage out results about the model-checking problem for disjunctions of two unary dependence atoms to establish a dichotomy theorem about the model-checking problem for disjunctions of two higher-arity dependence atoms: this problem is either NL-complete or 𝖥𝖮-definable.

We complement these complexity-theoretic classifications of the model-checking problem with a structural classification of disjunctions of two dependence atoms. Specifically, we determine whether a given disjunction of two dependence atoms is a coherent or an incoherent 𝒟-formula. As defined in [8], a 𝒟-formula φ is coherent if there is an integer k such that a team T satisfies φ if and only if every sub-team T of T of size k satisfies φ. If no such k exists, then φ is incoherent. We show that a disjunction of two dependence atoms is coherent if and only if the model-checking for this formula is 𝖥𝖮-definable. Furthermore, for each coherent disjunction of two dependence atoms, we determine its coherence level, i.e., the smallest k that establishes the coherence of that disjunction. Our findings for disjunctions of two unary dependence atoms are summarized in Table 1.

The work reported here paves the way for further exploration of the model-checking problem for the quantifier-free fragment of dependence logic and, perhaps more importantly, it paves the way for further interaction between dependence logic and database theory.

Table 1: Complexity and coherence of disjunctions of two unary dependence atoms.
Formula Coherence Model-checking
dep(x,y)dep(z,u) incoherent (Prop. 8) NL-complete (Prop. 10)
dep(x,z)dep(y,z) incoherent (Thm. 25) NL-complete (Thm. 13)
dep(x,y)dep(y,z) incoherent (Thm. 25) NL-complete (Thm. 13)
dep(x,y)dep(y,x) incoherent (Thm. 25) L-complete (Thm. 16)
dep(x,y)dep(x,y) coherent (level 3) (Prop. 8) 𝖥𝖮 (Prop. 9)
dep(x,y)dep(x,z) coherent (level 4) (Thm. 22) 𝖥𝖮 (Prop. 9)

Due to space constraints, full proofs of results marked with a “” in the body of the paper are given in the Appendix.

2 Preliminaries

This section contains some basic material about dependence logic. For additional material, we refer the reader to the monograph [14].

A vocabulary is a tuple τ=(R1,,Rm) of relation symbols, each of which has a specified natural number ri as its arity. A τ-structure is a tuple =(A,R1,,Rm), where A is a set, called the domain of , and each Ri is an ri-ary relation on A.

Let Var be a countably infinite set of variables. The syntax of dependence logic 𝒟 over a vocabulary τ extends the syntax of first-order logic 𝖥𝖮 in negation normal form with dependence atoms as additional atomic formulas. A dependence atom of arity n is an expression of the form dep(y1,,yn,z), where y1,,yn,z are distinct variables in Var and n0. If n=1, we say that the dependence atom is unary. Intuitively, a dependence atom asserts that the value of the variable z depends on the values of the variables y1,,yn. In particular, if n=0, we have a dependence atom of the form dep(z), which asserts that z is a constant value, since it depends on no other variables; such atoms are called constancy atoms.

The formulas of dependence logic 𝒟 are defined by the following Backus-Naur form:

ψxk=xRi(x1,,xri)¬Ri(x1,,xri)dep(y1,,yn,z)ψψψψxψxψ,

where xk,x,x1,,xri are (not necessarily distinct) variables in Var, Ri is a relation symbol in τ of arity ri, and y1,,yn,z are distinct variables in Var.

Recall that the semantics of 𝖥𝖮-formulas are given using a structure and an assignment, i.e., a mapping from Var to the domain of . Due to the presence of dependence atoms, however, the semantics of 𝒟-formulas are given using a structure and a set of assignments.

Definition 1.

Let X be a set of variables (i.e., XVar) and let A be a set.

  • An assignment with domain X and range A is a mapping s:XA.

  • A team with domain X and range A is a set T of assignments with domain X and range A. We will write Dom(T) to denote the domain X of the team T. Similar Rng(T) denotes the range A of the team T.

  • If T is a team with domain X and YX, then the restriction of T on Y is the team TY={sYsT}, where sY is the restriction of the assignment s to Y.

Before giving the semantics of 𝒟-formulas, we need to introduce two operations on teams.

Definition 2.

Let Var be the set of all variables and let A be a set.

  • If s:VarA is an assignment with domain Var and range A, x is a variable, and a is an element of A, then sax:VarA is the assignment such that sax(x)=a and sax(y)=s(y), for every variable yx.

  • If T is a team with domain Var and range A, f:TA is a function, and x is a variable, then the supplement team Tfx is the team {sf(s)xsT}.

  • If T is a team with domain Var and range A, and x is a variable, then the duplicate team TAx is the team {saxxVarandaA}.

We are now ready to define the semantics of dependence logic 𝒟. In doing so, the supplement team will be used to define the semantics of existential quantification, while the duplicate team will be used to define the semantics of universal quantification.

Definition 3.

Let τ=(R1,,Rm) be a vocabulary and let =(A,R1,,Rm) be a τ-structure. The satisfaction relation (,T)φ, where T is a team with domain Var and range A, and φ is a formula of dependence logic 𝒟, is defined by induction on the construction of φ and simultaneously for all teams with domain Var and range A as follows:

  • (,T)xk=xl if for all sT, we have s(xk)=s(xl);

  • (,T)Ri(x1,,xri) if for all sT, we have (s(x1),,s(xri))Ri;

  • (,T)¬Ri(x1,,xri) if for all sT, we have (s(x1),,s(xri))Ri;

  • (,T)dep(y1,,yn,z) if for all s1,s2T with s1(yj)=s2(yj) for 1jn, we have that s1(z)=s2(z);

  • (,T)φ1φ2 if (,T)φ1 and (,T)φ2;

  • (,T)φ1φ2 if there are teams T1, T2 with T=T1T2 and (,Ti)φi, i=1,2;

  • (,T)xφ if there is a function f:TA such that (,Tfx)φ, where Tfx is the supplement team associated with T, x, and f;

  • (,T)xφ if (,TAx)φ, where TAx is the duplicate team associated with T and x.

The semantics of disjunction in dependence logic are quite different from those in first-order logic. In particular, unlike first-order logic, a disjunction of the form φφ need not be equivalent to φ. The following examples illustrates this state of affairs.

Example 4.

Consider the 𝒟-formulas dep(x,y) and dep(x,y)dep(x,y). Let =({0,1}) be a structure over the empty vocabulary and let T={s,t} be a team with two s and t assignments such that s(x)=0,s(y)=0 and t(x)=0,t(y)=1. Clearly (,T)⊧̸dep(x,y). Nonetheless, (,T)dep(x,y)dep(x,y), since (,{s})dep(x,y) and (,{t})dep(x,y).

The next proposition, whose proof can be found in [14], states two important properties of dependence logic. In what follows, if φ is a 𝒟-formula, then free(φ) is the (finite) set of all free variables of φ.

Proposition 5 ([14, Lemma 3.27, Prop. 3.10]).

Let =(A,R1,,Rm) be a τ-structure and let φ be a 𝒟-formula.

(Locality)

Let X be a set such that free(φ)XVar. If T1, T2 are teams with domain Var and range A such that T1X=T2X, then (,T1)φ if and only if (,T2)φ.

(Downward Closure)

Let P and T be teams with domain Var and range A such that PT. If (,T)φ, then also (,P)φ.

The Locality property tells that the satisfaction relation (,T)φ depends only on the restriction of the team T on the set of the free variables of φ. In particular, if ψ is a 𝒟-sentence (i.e., a 𝒟-formula with no free variables), then (,T)ψ if and only ,{})ψ, where {} is the team consisting of the empty assignment.

Väänänen [14] showed that 𝒟-sentences have the same expressive power as the sentences of existential second-order logic 𝖤𝖲𝖮. Combined with Fagin’s theorem [4], this result implies that, on the class of all finite structures, 𝒟-sentences express precisely all NP problems. Kontinen and Väänänen [11] showed that 𝒟-formulas (with free variables) have the same expressive power as the downward closed formulas of existential second-order logic 𝖤𝖲𝖮 (see [11] for the precise statement of this result).

In view of the Locality property, from now on we will only consider teams on finite domains, as long as, in each case, the domain of the team considered contains the free variables of the 𝒟-formula at hand.

Coherence.

In the following, we present the definition of a small-model property for dependence logic.

Definition 6 (k-coherence, [8, Def. 3.1]).

Let φ(x1,,xn) be a quantifier-free 𝒟-formula. We say that φ is k-coherent if for all structures and teams T of range such that free(φ)Dom(T), the following are equivalent:

  1. (1.)

    (,T)φ.

  2. (2.)

    For all k-element sub-teams ST it holds that (,S)φ.

Notice that for the above equivalence, the direction “(1.) (2.)” is always true due to 𝒟 being downwards closed (see Proposition 5).

Definition 7 (coherence-level, incoherence).

The coherence-level of φ is the least natural number k such that φ is k-coherent; if such a k does not exist, we call φ incoherent.

Clearly, first-order atomic formulas have a coherence-level of 1 and dependence atoms a coherence-level of 2.

Proposition 8.

The following statements are true:

  1. (1.)

    The formula dep(x,y)dep(x,y) has coherence-level 3; [8, Prop. 3.8]

  2. (2.)

    The formula dep(x,y)dep(z,u) is incoherent. [8, Thm. 3.11]

We will use the shortcut Rel(T){(s(x1),,s(xn))sT}, when the team T is given as a relation. The following proposition shows that every quantifier-free k-coherent formula of dependence logic is first-order definable (and, hence, is in uniform AC0).

Proposition 9 ([8, Thm. 4.9]).

Suppose φ(x1,,xn)𝒟 is a quantifier-free k-coherent formula over a vocabulary τ. Then there is a sentence φ𝖥𝖮(τ{R}), where R is n-ary, such that for all τ-structures and for all teams T of domain {x1,,xn}, we have that (,T)φ(x1,,xn) if and only if Rel(T)φ(R).

3 Model-checking

The model-checking problem is a fundamental decision problem arising in every logic. Informally, this problem asks if a finite structure satisfies a formula of the logic at hand. Actually, according to the taxonomy introduced by Vardi [15], there are two versions of the model-checking problem, the combined complexity version and the data complexity version. In the first version, the input consists of a formula and a finite structure, while in the second version the formula is fixed and the input consists of just a finite structure.

Here, we focus on the data complexity version of the model-checking problem for dependence logic 𝒟. Specifically, every 𝒟-formula φ gives rise to the following decision problem.

Problem: MC(φ) – the model-checking problem for φ𝒟
Input: A finite τ-structure =(A,R1,,Rm) and a team T with domain a finite set X and range A such that free(φ)XVar.
Question: Does (,T)φ?

As mentioned in the introduction, the results in [9, 14] imply that MC(φ) is in NP for every 𝒟-formula φ, and that there are 𝒟-formulas ψ for which MC(ψ) is an NP-complete problem. Furthermore, the intractability of the model-checking problem is even true for some disjunctions of three dependence atoms; for example, MC(dep(x,y)dep(u,v)dep(u,v)) is NP-complete, as shown by Kontinen [8].

Note that if φ is a 𝒟-formula in which no relation symbols from the vocabulary τ occur, then the satisfaction relation (,T)φ depends only on the team T and not on the τ-structure . From this point on, we will focus on disjunctions of two dependence atoms, hence no relation symbols from the vocabulary τ occur in such dependence logic formulas. For this reason, we will drop from the satisfaction relation and, instead, write Tφ, where φ contains no relation symbols from τ. Thus, if φ is such a 𝒟-formula, then the model-checking problem MC(φ) asks: given a team T with domain a finite set X such that free(φ)XVar, does Tφ? The following result yields a sufficient condition for the model-checking problem to be in NL, the class of all decision problems solvable in non-deterministic logarithmic space.

Proposition 10 ([8, Thm. 4.12]).

Suppose φ and ψ are 2-coherent quantifier-free 𝒟-formulas with no relation symbols. Then there is a logarithmic-space reduction from the model-checking problem MC(φψ) to 2SAT. Consequently, MC(φψ) is in NL.

We now describe the reduction of MC(φψ) to 2SAT in some detail, since it will be relevant later on (Section 3.2) in the present paper.

Given a team T={s1,,sk}, go through all two-element subsets {si,sj} of T, and construct a set C of two-variable clauses, as follows:

  • If {si,sj}⊧̸φ, then (xixj)C.

  • If {si,sj}⊧̸ψ, then (¬xi¬xj)C.

Inuitively, these two clauses encode the property that the team {s,t} satisfies neither φ nor ψ. Then let ΘT be the 2CNF-formula θCθ. It can be shown that Tφψ if and only if ΘT is satisfiable (see [8] for a detailed proof).

Since every dependence atom is a 2-coherent formula, we obtain the following result.

Corollary 11.

If φ and ψ are dependence atoms, then MC(φψ) is in NL.

3.1 NL-Completeness

This section will cover the cases in which the model-checking problem for disjunctions of two unary dependence atoms is complete for nondeterministic logarithmic space.

Proposition 12 ([8, Thm. 4.14]).

The problem MC(dep(x,y)dep(z,w)) is NL-complete.

We will outline the reduction, constructed in the proof of this result, because the next result (Theorem 13) is based on this construction. The full proof of Proposition 12 can be found in [8]. The intuition of the construction is that the team encodes the clauses of the formula. The split ensures a consistent assignment on the left side, while the right side allows to “buffer” one unsatisfied literal per clause. Consider a 2SAT instance θ(p1,,pm)=iIEi, with Ei=(Ai1Ai2),iI, where Ai1,Ai2 are literals. For each conjunct Ei, create a team TEi={si1,si2} with domain {x,y,z,w}, where variable x ranges over {p1,,pm}; variable y ranges over the truth values {0,1}; variable z ranges over the indices iI, thus z denotes the clause Ei; and variable w ranges over the values {0,1}. Variables z and w ensure that at least one of the assignments from each TEi go into the subset of T that eventually encodes an assignment satisfying θ. For example, the team TEi for a clause (pk¬pj) is the one on the left in Figure 1. Finally T=iITEi.

TEixyzwsi1pk1i0si2pj0i1TEixyzsi1pki1ksi2pji0jTEixyzsi1ipk1si2ipj0

Figure 1: Teams for (pk¬pj) in Proposition 12 (left) and Theorem 13 (middle/right).
Theorem 13 ().

The following statements are true:

  1. (1.)

    The problem MC(dep(x,z)dep(y,z)) is NL-complete.

  2. (2.)

    The problem MC(dep(x,y)dep(y,z)) is NL-complete.

Proof sketch.

Membership follows from Proposition 10. For hardness, we proceed as in the proof of Proposition 12, that is, we transform a 2CNF formula into a team, additionally, we require that the clauses do not contain the same variable twice. For the first case, dep(x,z)dep(y,z), we use TEi shown in the middle of Figure 1. For the second case, dep(x,y)dep(y,z), we use TEi depicted in the right of Figure 1. Furthermore, let again I be the set of clause indices and m be the number of propositions in the 2CNF formula θ.

(1.) Variable x ranges over {p1,,pm}, variable y ranges over I, and variable z ranges over { 0k,1k1km}, that is, each proposition has its own truth values. The dependence atom dep(x,z) ensures the existence of a consistent assignment that satisfies θ, while dep(y,z) allows only one of si1 and si2 to be on its “side of the split.”

(2.) Variable x ranges over I, variables y ranges over {p1,,pm}, and variable z ranges over {0,1}. Here, the atom dep(y,z) ensures the existence of a consistent assignment that satisfies θ, while dep(x,y) ensures only one of si1 and si2 is on its “side of the split.”

3.2 L-Completeness

This section will cover the cases in which the model-checking problem for disjunctions of two unary dependence atoms is complete for logarithmic space.

If T is a team with domain X and x is a variable in X, then we will use the notation Rngx(T)Rng(T{x}) to denote the set of values in the range of T that x takes.

Definition 14.

Let T be a team with domain {x,y} such that Rngx(T)Rngy(T)=. The undirected graph of T is GT(VT,ET), where

VT =Rng(T),ET={{s(x),s(y)}|sT}.

Notice that, by definition, GT is a bipartite graph. The next lemma provides a graph-based interpretation of the condition under which a team T satisfies the formula dep(x,y)dep(y,x).

Lemma 15 ().

Let T be a team with domain {x,y} such that Rngx(T)Rngy(T)=. The following statements are equivalent:

  1. (1.)

    Tdep(x,y)dep(y,x).

  2. (2.)

    GT can be transformed into a directed graph GT=(VT,ET) such that

    1. (a)

      each node has at most one out-going edge, and

    2. (b)

      for each {u,v}ET either (u,v)ET or (v,u)ET.

  3. (3.)

    For each connected component C=(VC,EC) of GT, we have that |EC||VC|.

A visual example for GT and the equivalence of (1.) and (2.) are depicted in Figure 2.

Txys100^s201^s310^s411^Txys100^s411^s201^s310^

Figure 2: Example depicting the construction of GT from team T on the left. Followed by the transformation into directed graph GT, where each node has at most one out-going edge. This induces the split {s1,s4}dep(x,y) and {s2,s3}dep(y,x) of T shown on the right.
Theorem 16.

The problem MC(dep(x,y)dep(y,x)) is in L.

Proof.

Algorithm 1 decides Tdep(x,y)dep(y,x) requiring only logarithmic space.

We assume the input to the algorithm to be the graph GT, which is possible since its edge set is, depending on the used encoding, the same as the team T. The correctness follows immediately from Lemma 15, i.e., Algorithm 1 returns true if and only if |EC||VC| for all connected components C of GT. The algorithm iterates over all nodes v in GT and checks that the size of the corresponding connected component (so its number of nodes) is not smaller than its number of edges. The size count, c, is achieved by counting the number of reachable nodes. In parallel, the sizes of their respective neighborhoods are added up in a different counter, d. Since this counts each edge twice, halving d and comparing to c suffices.

The amount of space required by the algorithm is as follows. Computing the number of nodes in a connected component can be done in logarithmic space in |GT| (which is in O(|T|)) using undirected reachability (which is well-known to be in L by Reingold’s theorem [13]). The algorithm stores an index of the current node v and an index of an auxiliary node u, hence both are, by binary encoding, in O(log|T|). Furthermore, two counters are used: one to count the size of the connected component, c, and another to count the number of edges in the connected component, d. Both counters are, again, in O(log|T|). Storing these values together requires only logspace in the input size. The claimed result applies, as LL=L (that is to say, L is low for itself).

Algorithm 1 Model-checking dep(x,y)dep(y,x).
Lemma 17 ().

Let T be a team with domain {x,y}. If the size |T|>|Rngx(T)|+|Rngy(T)|, then T⊧̸dep(x,y)dep(y,x).

In the following, we will utilize the definition of first-order reductions as defined by Immerman [7] and Dahlhaus [3].

Theorem 18.

The problem MC(dep(x,y)dep(y,x)) is L-hard under first-order reductions.

Proof.

We reduce from the complement of UFA (which is known to be L-complete [2]):

Problem: UFA – Undirected Forest Accessibility
Input: acyclic undirected graph G=(V,E), nodes u and v.
Question: is there is a path between u and v?

T{a,b}xys1abs2baT{,u}xys1us2uT{,v}xys1vs2vT{,u}xys1us2uT{,v}xys1vs2v

Figure 3: Team T{a,b} for edge {a,b}E and teams T{,u},T{,v},T{,u},T{,v}.

Suppose G=(V,E),u,v is an instance of UFA. We will construct a team T, such that Tdep(x,y)dep(y,x) if and only if there is no path between u and v in G.

For each undirected edge {a,b}E, create a team T{a,b} with two assignments s and t, such that s(x)=t(y)=a and s(y)=t(x)=b. Next, let and be two elements such that {,}V=. Create four more teams T{,u},T{,v},T{,u}, and T{,v} as depicted in Figure 3. Now, T is the union {a,b}ET{a,b}{T{,u},T{,v},T{,u},T{,v}}. The team T can be constructed using an 𝖥𝖮 query I:STRUC[τg]STRUC[τ2t], where τg=(E2) is the vocabulary of graphs with edge relation E and τ2t=(T2) is the vocabulary of teams with domain of size two, encoded via the binary relation Rel(T) (see, Proposition 9).

Now, we will show that G=(V,E),u,vUFA¯ if and only if Tdep(x,y)dep(y,x).

”: Assume that there is no path between u and v in G. Then take u and v as roots for trees corresponding to their acyclic connected components. For all other connected components choose any node as root. We partition T into two teams as follows:

T1 ={sTs(y) is the parent node of s(x)}
{sT(s(x),s(y))=(u,)}{sT(s(x),s(y))=(v,)}
{sT(s(x),s(y))=(,v)}{sT(s(x),s(y))=(,u)}
T2 =TT1={sT(s(x),s(y))=(t(y),t(x)),tT1}

Observe that T2 contains the “reverse directions” of T1 (see Figure 4).

Figure 4: (Left): Example UFA instance. (Right): Visualization of partition into T1 (solid lines) and T2 (dashed lines).

Now, s(x)t(x) for all s,tT1, because each s(x)V only has one parent s(y)V in their corresponding trees (the cases containing or satisfy this condition by construction of T1, as they appear only once for x; and u,v are roots, so do not have parents themselves). Thus, T1 trivially satisfies dep(x,y). Since the construction and definition of T2 are symmetric, s(x)t(x) for all s,tT2. This yields T2dep(y,x).

”: For the other direction, assume the contraposition, i.e., that there is a path between u and v in G. Let C=(VC,EC) be the connected component with {u,v}VC. We will focus now on the sub-team for C together with the special assignments (shown in Figure 3) and argue why this sub-team cannot be split to satisfy dep(x,y)dep(y,x) (this would imply that the full team cannot be split as well). For that purpose, let TC be the union {a,b}CT{a,b} of the teams created by the edges in C. Observe that

|TC|=2|EC|=2(|VC|1) and |Rngx(T)|=|Rngy(T)|=|VC|,

i.e., the size of the team is twice the number of edges in C, because each edge adds two assignments by definition. The number of edges in C is then one less than the number of vertices in C. The sizes of the domains of x and y are the same as the number of nodes in C.

Next, let TC=TC{T{,u},T{,v},T{,u},T{,v}}, then

|TC|=2|EC|+8=2(|VC|1)+8 and |Rngx(T)|=|Rngy(T)|=|VC|+2,

because eight new assignments are added to the team and two new elements to the domains. Thus, we get

|TC|=2(|VC|1)+8=2|VC|+6>2|VC|+4=|Rngx(T)|+|Rngy(T)|

and, by Lemma 17, we have that TC⊧̸dep(x,y)dep(y,x). Therefore T⊧̸dep(x,y)dep(y,x) is true, as TCT and due to downwards closure.

A restricted case of Krom-satisfiability.

In the sequel, we will show how our results also establish the L-completeness of the satisfiability problem restricted to a particular type of 2CNF formulas, namely, those that are monotone, transitive, and dual-free.

Definition 19.

Let ϕ be a 2CNF formula with variables Var(ϕ)={x1,,xn}, such that:

  1. (1.)

    Every clause is monotone, i.e., it is of the form (xixj) or of the form (x¯ix¯j). The first type is called positive, the second type negative.

  2. (2.)

    No two “dual” clauses (xiyj), (x¯ix¯j) appear in ϕ.

  3. (3.)

    Transitivity holds: if (xixj) and (xjxk) are clauses of ϕ, then so is (xixk)ϕ; furthermore, if (x¯ix¯j) and (x¯jx¯k) are clauses of ϕ, then so is (x¯ix¯k)ϕ.

We call such 2CNF formulas monotone transitive dual-free and abbreviate the corresponding satisfiability problem by mtdf-2SAT.

Lemma 20 ().

The problems MC(dep(x,y)dep(y,x)) and mtdf-2SAT are equivalent via first-order reductions.

Proof sketch.

Notice how the 2CNF formulas in the reduction of Proposition 10 are monotone, transitive and dual-free, if we have that φ=dep(x,y) and ψ=dep(y,x). This gives a one-to-one correspondence, thus the other directions is simply the inverse.

Lemma 20 together with Theorem 16 and Theorem 18 yield the following result.

Theorem 21.

Satisfiability of monotone transitive dual-free 2CNF formula is L-complete under first-order reductions.

4 Coherence

This section is devoted to identifying the coherent disjunctions of two unary dependence atoms and determining their coherence-level.

dep(x,y)dep(x,z)Txyzs1111s2112s3121s4122(a)dep(x)dep(y,z)Txyzs1111s2112s3221s4222(b)dep(x)dep(x,y)Txys111s212s321s422(c)dep(y)dep(x,y)Txys111s221s332s442(d)

Figure 5: Counterexamples of 3-coherence for formulas used in the proofs of Thms. 22 and 24.
Theorem 22.

The formula dep(x,y)dep(x,z) has coherence-level 4.

Proof.

We first show that dep(x,y)dep(x,z) is not 3-coherent. Consider the team T in Figure 5 (a). This team does not satisfy dep(x,y)dep(x,z). To see this, consider the two maximal subsets, T1={s1,s2} and T2={s3,s4}, that satisfy dep(x,y). We cannot split T=T1T2, because T1⊧̸dep(x,z) and T2⊧̸dep(x,z).

Every 3-element subset of T satisfies dep(x,y)dep(x,z), because either T1 or T2 is now a singleton (overlapping splits not need to be considered), thus trivially satisfying the dependence atom dep(x,z). Therefore dep(x,y)dep(x,z) is not 3-coherent.

Next, we show that if T does not satisfies dep(x,y)dep(x,z), then there is a sub-team TT of size |T|4 that does not satisfies dep(x,y)dep(x,z) either. Let T be a team with T⊧̸dep(x,y)dep(x,z). If |T|4, then T=T. Otherwise assume without loss of generality that all assignments in T agree on x. Two assignments with different values for x trivially satisfy both atoms. Thus, we can make this assumption by splitting T into sub-teams that agree on x, and then consider each sub-team separately.

Now, let T={s,u,v,t}T, such that the following constraints hold: s(y)t(y) and s(z)t(z); u(y)s(y) and u(z)t(z); v(z)s(z) and v(y)t(y). We first show that T⊧̸dep(x,y)dep(x,z) is true and then that such a TT always exists (see Figure 6). By the constraints on y, we have that {s,v} and {u,t} are the two maximal sub-teams that satisfy dep(x,y). Therefore at least one of the two sub-teams must also satisfy dep(x,z) for the disjunction to be true. But {s,v}⊧̸dep(x,z), because s(x)=v(x) and s(z)v(z); analogously {u,t}⊧̸dep(x,z). Thus T⊧̸dep(x,y)dep(x,z) is true.

Let sT and assume no tT meets the constraints above. Then T can be split via T1={wTw(y)=s(y)}dep(x,y) and T2={wTw(z)=s(z)}dep(x,z). Next, assume there are s,t and u, but no v. Then we split T via T1={wTw(y)=t(y)}dep(x,y) and T2={wTw(z)=s(z)}dep(x,z). The last case is analogous. This is a contradiction to T⊧̸dep(x,y)dep(x,z), hence such a sub-team must always exists.

Txyzs1000s2001s3012s4020s5021   Txyzs1000s2002s3011s4021s5120

Figure 6: Two examples for the construction in Theorem 22. Edge label denote the variable values where the assignments from the vertices differ, e.g., edge {s1,s3} with label y means that s1(y)s3(y). On the left we have that s1 corresponds to s, s3 to u and v, and s5 to t from the construction.

As a special case of the previous lemma (ignore the x column), we deduce the following corollary.

Corollary 23.

The formula dep(y)dep(z) has coherence-level 4.

Theorem 24 ().

Each of the following formulas has coherence-level 4:
(1.) dep(x)dep(y,z); (2.) dep(x)dep(x,y); (3.) dep(y)dep(x,y).

Proof sketch.

For the lower bound consider the teams (b)–(d) in Figure 5. The upper bound works with a similar construction as in the proof of Theorem 22.

Finally, we turn towards the three incoherent cases. For these formulas, there exists no k, such that the respective formula is k-coherent. Notice that the incoherence result can here be deduced from the fact that 𝖥𝖮L. Nevertheless, we provide independent proofs for these results in the appendix.

Theorem 25 ().

The following three formulas are incoherent:
(1.) dep(x,y)dep(y,x); (2.) dep(x,y)dep(y,z); (3.) dep(x,z)dep(y,z).

By combining the results from Section 3 and 4, we obtain the following complete classification for the complexity of the model-checking of disjunctions of two dependence or constancy atoms.

Theorem 26.

The following statements are true for the model-checking problem of disjunctions of two unary dependence or constancy atoms:

  1. (1.)

    If φ is one of the formulas dep(x,y)dep(z,u), dep(x,z)dep(y,z), dep(x,y)dep(y,z), then the problem MC(φ) is NL-complete.

  2. (2.)

    The problem MC(dep(x,y)dep(y,x)) is L-complete.

  3. (3.)

    In all other cases, the problem MC(φ) is in 𝖥𝖮, hence also in uniform AC0.

5 Disjunctions of higher-arity dependence atoms

In this section, we consider disjunctions of two dependence atoms whose arity may be higher than one. We leverage the preceding results about disjunctions of two unary dependence atoms to classify the complexity of disjunctions of two dependence atoms of higher arity.

Before stating and proving the main result of this section, we present two lemmas. The idea in the next lemma is to encode a subset of the domain by a tuple and also change the range appropriately.

Lemma 27.

Consider a 𝒟-formula of the form dep(x1,,xn,y)dep(u1,,um,v). For every team T={s1,,sk} with domain {x1,,xn,y,u1,,um,v} and range A, there is a team T with domain {x,y,u1,,um,v} and range AnA, such that

Tdep(x1,,xn,y)dep(u1,,um,v)iffTdep(x,y)dep(u1,,um,v).

Proof.

Consider the assignments si, 1ik, with domain {x,y,u1,,um,y} and range AnA defined as follows: si(x)=(s(x1),,s(xn)) and si(w)=si(w), if w is one of the variables y,u1,,um,v. Furthermore, consider the team T={s1,,sk} be the team, which has domain {x,y,u1,,um,v} and range AnA.

Assume first that Tdep(x1,,xn,y)dep(u1,,um,v) via T=T1T2, that is, T1dep(x1,,xn,y) and T2dep(u1,,um,v). Let T1={siTsiT1} and T2={siTsiT2}. Then the following hold:
T1dep(x1,,xn,y) iffs,tT1:s(x1)=t(x1),,s(xn)=t(xn)s(y)=t(y) iffs,tT1:(s(x1),,s(xn))=(t(x1),,t(xn))s(y)=t(y) iffs,tT1:s(x)=t(x)s(y)=t(y)iffT1dep(x,y).

Furthermore, T2dep(u1,,um,v), since T2{u1,,um,v}=T2{u1,,um,v}. Thus, if Tdep(x1,,xn,y)dep(u1,,um,v), then Tdep(x,y)dep(u1,,um,v). The converse is established using a similar argument.

By applying Lemma 27 twice, we obtain the following result.

Corollary 28.

Consider a 𝒟-formula of the form dep(x1,,xn,y)dep(u1,,um,v). For every team T={s1,,sk} with domain {x1,,xn,y,u1,,um,v} and range A, there is a team T with domain {x,y,u,v} and range AnAmA, such that

Tdep(x1,,xn,y)dep(u1,,um,v)iffTdep(x,y)dep(u,v).

The next lemma establishes the coherence of one of the simplest higher-arity disjunctions.

Lemma 29.

The formula dep(x,z)dep(x,y,u) has coherence-level 4.

Proof.

To show that dep(x,z)dep(x,y,u) is not 3-coherent, notice that this formula is equivalent to dep(x,z)dep(x,u) on every team T that is constant on y, i.e., s(y)=t(y) holds, for all s,tT. Thus, the supplement team Tfy given by T in Figure 5 (a) (and with suitable variable renamings) and constant function f(s)=a is a counterexample to the 3-coherence of dep(x,z)dep(x,y,u).

To show that dep(x,z)dep(x,y,u) is 4-coherent, consider an arbitrary team T with domain {x,z,y,u}. Without loss of generality, we may assume that T is constant on x. This is so because T can be split into sub-teams Tai={sTs(x)=aiA} that are constant on x. If T falsifies dep(x,z)dep(x,y,u), then at least one of these sub-teams also falsifies dep(x,z)dep(x,y,u). Indeed, if every sub-team Tai satisfied dep(x,z)dep(x,y,u), then their union T would also satisfy dep(x,z)dep(x,y,u), since every team T consisting of two assignments sTaj and tTak with ajak satisfies both dependence atoms (because s(x)=ajak=t(x)).

Now, since T is constant on x, whether or not Tdep(x,z) depends only on the values of z and so dep(x,z) acts like the constancy atom dep(z). Similarly, whether or not Tdep(x,y,u), depends only on the values of y and u, and so dep(x,y,u) acts like the unary dependence atom into dep(y,u). Therefore, we have that Tdep(x,z)dep(x,y,u) if and only if Tdep(z)dep(y,u). As the latter disjunction was shown to be 4-coherent in Theorem 24, we have that dep(x,z)dep(x,y,u) is 4-coherent as well.

We are now ready to state and prove a classification theorem for the complexity of the model-checking problem for disjunctions of dependence atoms of higher arities. Note that, when a dependence atom dep(y1,,yn,u) of higher arity is considered, then, by the semantics of dependence logic 𝒟, the order y1,,yn of the variables is immaterial.

Theorem 30.

Let dep(x1,,xm,z) and dep(y1,,yn,u) be two dependence atoms such that nm and n>1 (i.e., at least one of them is non-unary), and some of the variables of the first atom may coincide with some of the variables of the second. Then the following hold:

  1. (1.)

    If neither the set {x1,,xm} is contained in the set {y1,,yn} nor the set {y1,,yn} is contained in the set {x1,,xm}, then the problem

    MC(dep(x1,,xm,z)dep(y1,,yn,u))

    is NL-complete.

  2. (2.)

    Otherwise (i.e., if {x1,,xm}{y1,,yn}), the problem

    MC(dep(x1,,xm,z)dep(y1,,yn,u))

    is in 𝖥𝖮, hence also in uniform AC0.

Proof.

For the first part, membership in NL follows from Proposition 10. We show NL-hardness for the special case z=u, by exhibiting a logarithmic-space reduction from the problem MC(dep(x,z)dep(y,z)) (see Theorem 13). Let x^ be a variable in {x1,,xm} but not in {y1,,yn}; similarly, let y^ be a variable in {y1,,yn} but not in {x1,,xm}. Given a team T, we construct in logarithmic space a team T as follows: for every sT, we put in T an assignment s, such that s(z)=s(z), s(xi)=s(xi) if xi=x^ and a otherwise, likewise s(yi)=s(yi) if yi=y^ and a otherwise, where a is some fixed element in the range of T. Then the following equivalences hold:

Tdep(x,z)dep(y,z)
iffT1,T2:T=T1T2,T1dep(x,z) and T2dep(y,z)
iffT1,T2:T=T1T2,s1,t1T1:s1(x)=t1(x)s1(z)=t1(z) and
s2,t2T2:s2(y)=t2(y)s2(z)=t2(z)
iffT1,T2:T=T1T2,s1,t1T1:s1(x^)=t1(x^)s1(z)=t1(z) and
s2,t2T2:s2(y^)=t2(y^)s2(z)=t2(z)
iffT1,T2:T=T1T2,T1dep(x1,,xm,z) and T2dep(y1,,yn,z) ()
iffTdep(x1,,xm,z)dep(y1,,yn,z).

Equivalence () holds, because if xix^, then s(xi)=t(xi)=a (and similarly for the yi’s).

For the second part, we distinguish two cases, namely whether m=n or m<n.

Case “m=n”:

The formula dep(x1,,xm,z)dep(x1,,xm,u) has coherence-level 4, because, via Corollary 28, the model-checking problem of dep(x1,,xm,z)dep(x1,,xm,u) is equivalent to that of dep(x,y)dep(x,z). Thus, by Proposition 9 and Theorem 22, we have that MC(dep(x1,,xm,z)dep(y1,,yn,u)) is in 𝖥𝖮.

Case “m<n”:

Assume ym+1,,yn are the variables that do not occur in the first dependence atom dep(x1,,xm,z). Let x=(x1,,xm) and y=(ym+1,,yn). Then, by Corollary 28, the disjunction dep(x1,,xm,z)dep(x1,,xm,ym+1,,yn,u) is equivalent to the disjunction dep(x,z)dep(x,y,u), which is 4-coherent by Lemma 29. Thus by Proposition 9, we have that MC(dep(x1,,xm,z)dep(y1,,yn,u)) is in 𝖥𝖮.

Observe that, whenever MC(φ) was shown to be in 𝖥𝖮, the proof of Theorem 30 actually showed that φ is coherent. Combined with the earlier results about unary dependence atoms, we obtain the following characterization of coherence of disjunctions of dependence atoms.

Corollary 31.

Let φ be a disjunction of two dependence atoms of arbitrary arities. Then φ is coherent if and only if MC(φ) is first-order definable.

6 Outlook

In this paper, we carried out a systematic investigation of the model-checking problem for disjunctions of two dependence atoms. The work reported here suggests several different directions for future research, including the following:

  1. 1.

    Is it possible to classify the computational complexity of the model-checking problem for all quantifier-free formulas of dependence logic 𝒟? In particular, is there a dichotomy theorem between NP-completeness and polynomial-time solvability for this problem? Note that it is conceivable that every problem in NP is polynomial-time equivalent to the model-checking problem of a quantifier-free 𝒟-formula. In that case, such a dichotomy theorem would be impossible (unless P=NP), since by Ladner’s theorem [12], if PNP, then there are problems in NP that are neither NP-complete nor are in P.

  2. 2.

    By Proposition 9, if φ is a coherent, quantifier-free 𝒟-formula, then φ is equivalent to an 𝖥𝖮-formula φ(T) that includes a relation symbol T for the team. We conjecture that the converse is true, which would imply that coherence coincides with first-order definability for quantifier-free 𝒟-formulas. Our results confirm this conjecture for the case in which φ is a disjunction of two dependence atoms.

  3. 3.

    Study the implication problem for disjunctions of two dependence atoms, i.e., given a finite set Σ of disjunctions of two dependence atoms and a disjunction ψ of two dependence atoms, does Σ logically imply ψ? For functional dependencies (i.e., single dependence atoms), the implication problem is solvable in polynomial time; moreover, there is a set of simple axioms, known as Armstrong’s axioms [1, p. 186], for reasoning about functional dependencies. As argued in the introduction, disjunctions of two dependence atoms form a natural class of database dependencies that have not been studied in their own right thus far. Investigating the implication problem for disjunctions of two dependence atoms will enhance the interaction between dependence logic and database theory.

References

  • [1] Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. URL: http://webdam.inria.fr/Alice/.
  • [2] Stephen A. Cook and Pierre McKenzie. Problems complete for deterministic logarithmic space. J. Algorithms, 8(3):385–394, 1987. doi:10.1016/0196-6774(87)90018-6.
  • [3] Elias Dahlhaus. Reduction to NP-complete problems by interpretations. In Logic and Machines, volume 171 of Lecture Notes in Computer Science, pages 357–365. Springer, 1983. doi:10.1007/3-540-13331-3_51.
  • [4] Ronald Fagin. Generalized first-order spectra and polynomial-time recognizable sets. In Richard Karp, editor, Complexity of Computation, number 7 in SIAM-AMS Proceedings, pages 43–73. SIAM-AMS, 1974.
  • [5] Ronald Fagin and Moshe Y. Vardi. The Theory of Data Dependencies - A Survey. Symposia in Applied Mathematics, pages 19–71, 1986. doi:10.1090/psapm/034/846853.
  • [6] Nicolas Fröhlich, Phokion G. Kolaitis, and Arne Meier. Disjunctions of two dependence atoms, 2025. arXiv:2508.16146.
  • [7] Neil Immerman. Descriptive complexity. Graduate texts in computer science. Springer, 1999. doi:10.1007/978-1-4612-0539-5.
  • [8] Jarmo Kontinen. Coherence and computational complexity of quantifier-free dependence logic formulas. Studia Logica, 101(2):267–291, 2013. doi:10.1007/s11225-013-9481-8.
  • [9] Juha Kontinen, Sebastian Link, and Jouko A. Väänänen. Independence in database relations. In WoLLIC, volume 8071 of Lecture Notes in Computer Science, pages 179–193. Springer, 2013. doi:10.1007/978-3-642-39992-3_17.
  • [10] Juha Kontinen and Jouko A. Väänänen. On definability in dependence logic. Journal of Logic, Language and Information, 18(3):317–332, 2009. doi:10.1007/s10849-009-9082-0.
  • [11] Juha Kontinen and Jouko A. Väänänen. Axiomatizing first-order consequences in dependence logic. Ann. Pure Appl. Logic, 164(11):1101–1117, 2013. doi:10.1016/j.apal.2013.05.006.
  • [12] Richard E. Ladner. On the structure of polynomial time reducibility. J. ACM, 22(1):155–171, 1975. doi:10.1145/321864.321877.
  • [13] Omer Reingold. Undirected connectivity in log-space. J. ACM, 55(4):17:1–17:24, 2008. doi:10.1145/1391289.1391291.
  • [14] Jouko A. Väänänen. Dependence Logic – A New Approach to Independence Friendly Logic, volume 70 of London Mathematical Society student texts. Cambridge University Press, 2007. URL: http://www.cambridge.org/de/knowledge/isbn/item1164246/?site_locale=de_DE.
  • [15] Moshe Y. Vardi. The complexity of relational query languages (extended abstract). In STOC, pages 137–146. ACM, 1982. doi:10.1145/800070.802186.

Appendix A Appendix

Theorem 13 (). [Restated, see original statement.]

The following statements are true:

  1. (1.)

    The problem MC(dep(x,z)dep(y,z)) is NL-complete.

  2. (2.)

    The problem MC(dep(x,y)dep(y,z)) is NL-complete.

Proof.

Membership follows directly from Proposition 10. We now give the proof of NL-hardness for (1.). The proof works analogous for (2.).

We show 2SATmlogMC(dep(x,z)dep(y,z)). Suppose θ(p1,,pm) is an instance of 2SAT of the form iIEi, with conjuncts Ei=(i1i2), where i1,i2 are different literals, i. e., i1i2. This can easily be archived by replacing conjuncts of the form () with (q) and (¬q), where q{p1,,pm} is a new proposition.

We will construct a team T, such that the following are equivalent:

  • Tdep(x,z)dep(y,z).

  • θ(p1,,pm) is satisfiable.

For each conjunct Ei, we create a team TEi with two assignments si1 and si2 of domain {x,y,z}, that encode the literals of Ei. To this end, variable x ranges over the propositions {p1,,pm}, variable y ranges over clause (index) I, and variable z ranges over { 0k,1k1km}, such that each proposition has its own two truth values. This is necessary to ensure that z has two different values in each TEi. Thus, the dependence atom dep(y,z) makes sure we have to choose at least one of the assignments from each TEi into the subset of T, that will satisfy dep(x,z), thereby encoding the assignment that satisfies θ.

Each literal ij in Ei, gives rise to one assignment. For example, the team TEi for a clause (pkpj) is the one in the middle in Figure 1.

Now T is the union iITEi. Suppose θ(p1,,pm) is satisfiable. Then there exists an assignment F:{p1,,pm}{0,1}, that satisfies θ(p1,,pm). We write F(pk)k to denote the truth values {0k,1k} used by variable z. For example, if F(pk)=0, then F(pk)k=0k. Define the partition of the team T into two sets in the following way:

T1 ={sTs(x)=pk and s(z)=F(pk)k},T2=TT1

The assignments in T that agree with the assignment F are chosen to T1. Since F evaluates iIEi true, it satisfies every conjunct Ei. Thus T1 contains at least one assignment from each TEi. Thus there will be at most one tuple from each TEi left to T2. This T2 trivially satisfies dep(y,z) since all tuples in T2 disagree on y, i. e., for all s,tT2 it is true that s(y)t(y), if st.

Next we will show that T1 satisfies dep(x,z). Let s,tT1, such that s(x)=t(x)=pk. Then by the definition of T1 it holds that s(y)=F(pk)k=t(y) holds. Thus T1dep(x,z).

The other direction: Suppose Tdep(x,z)dep(y,z). Then there is a partition of T into T1 and T2, such that T1dep(x,z) and T2dep(y,z). We will define the assignment F:{p1,,pm}{0,1} in the following way:

  • If there exists sT1, such that s(x)=pk, then F(pk)k=s(z).

  • Otherwise, if for all sT1 it holds s(x)pk, then F(pk)=1.

Let us show that F is a function, which satisfies θ(p1,,pm):

  1. 1.

    Clearly, Dom(F)={p1,,pm} and Rng(F)={0,1}.

  2. 2.

    F is a function: Let pk{p1,,pm}. Suppose there exists s,tT1, such that s(x)=t(x)=pk holds. Since T1dep(x,z) holds, it follows that s(z)=t(z) holds. Suppose there are no sT1, such that s(x)=pk. Then by definition of F it holds that F(pk)=1.

  3. 3.

    F satisfies θ(p1,,pm): Note that y is constant and z is assigned different values by each tuple in each TEi. Thus T1 contains at least one of the tuples from each TEi. Let sTEi, such that sT1. Recall, that in every assignment the values s(z) encodes the truth value of s(x) such that Ei is satisfied. Since s agrees with F, it holds that F(ij)ij=s(z), which implies that F(Ei)=1.

Each conjunct Ei, of θ gives rise to a constant size team of two assignments with domain {x,y,z}. Thus the team T can be constructed in L for each θ.

Lemma 15 (). [Restated, see original statement.]

Let T be a team with domain {x,y} such that Rngx(T)Rngy(T)=. The following statements are equivalent:

  1. (1.)

    Tdep(x,y)dep(y,x).

  2. (2.)

    GT can be transformed into a directed graph GT=(VT,ET) such that

    1. (a)

      each node has at most one out-going edge, and

    2. (b)

      for each {u,v}ET either (u,v)ET or (v,u)ET.

  3. (3.)

    For each connected component C=(VC,EC) of GT, we have that |EC||VC|.

Proof.

We will prove the equivalences as follows.

(1.) (2.)

Assume, w.l.o.g., T=T1T2, such that T1dep(x,y) and T2dep(y,x). Then the directed graph GT=(VT,ET) with ET={(s(x),s(y))sT1}{(s(y),s(x))sT2} has properties (a) and (b).

For (a), wrongly assume, w.l.o.g., that there are two edges e1=(a,b) and e2=(a,b) with bb and a,b,b values in Rng(T) (the same argument could be made for the second component being a and the first components being different). Now, these edges correspond to assignments s(x)=a, s(y)=b, and s(x)=a, s(x)=b. Both s,sT1 and would violate dep(x,y) which is a contradiction.

For (b), notice that each {u,v} corresponds to a single assignment sT. By assumption, s is either in T1 or T2, whence it has to appear in ET in either direction.

(2.) (1.)

Let GT=(VT,ET) be a directed graph, where each node has at most one out-going edge. Define T1={sT(s(x),s(y))ET} and T2={sT(s(y),s(x))ET}. Then, T1dep(x,y), because each s(x) appears only once by assumption, so the dependency is not violated. By analogous arguments we have that T2dep(y,x).

(2.) (3.)

We make a case distinction. If each node in GT has exactly one out-going edge, then |VT|=|ET|=(b)|ET| and likewise |EC|=|VC| for each connected component C of GT. Otherwise, if some nodes have no out-going edge, then |EC|<|VC|.

(3.) (2.)

A connected component in GT has at least |VC|1 edges (otherwise it would not be connected). So there are only two cases:

  1. (i)

    If |EC|=|VC|1, then C is a (spanning-)tree. Define GT by orienting the edges from the leaves to the root, i.e., bottom-up. Since each node has one parent, each node has one outgoing edge (except for the root which has no outgoing edge).

  2. (ii)

    If |EC|=|VC|, then C contains exactly one cycle 𝒞, because C is connected and has one more edge than its spanning-tree. Choose any orientation of 𝒞 (clockwise or counter-clockwise). Afterwards, all nodes in C𝒞 (hence outside of the cycle 𝒞) form tree-like connected components and we can proceed as in (i): There is one node directly connecting to 𝒞 which will become the root, and its outgoing edge connects to a node in the cycle 𝒞.

See 17

Proof.

Assume Tdep(x,y)dep(y,x) and |T|>|Rngx(T)|+|Rngy(T)|, then there exists a split T=T1T2 such that T1dep(x,y) and T2dep(y,x). Clearly, |T1||Rngy(T)| and |T2||Rngy(T)|, because otherwise each of T1 and T2 would not satisfy the respective dependence atom. That is true, because then there would be a pair s,tT1 with s(x)=t(x) and s(y)t(y), respectively s,tT2 with s(y)=t(y) and s(x)t(x). Furthermore, we have that |T||T1|+|T2|. As a result, we get |T||T1|+|T2||Rngx(T)|+|Rngy(T)|, which is a contradiction to |T|>|Rngx(T)|+|Rngy(T)|.

Definition 19. [Restated, see original statement.]

Let ϕ be a 2CNF formula with variables Var(ϕ)={x1,,xn}, such that:

  1. (1.)

    Every clause is monotone, i.e., it is of the form (xixj) or of the form (x¯ix¯j). The first type is called positive, the second type negative.

  2. (2.)

    No two “dual” clauses (xiyj), (x¯ix¯j) appear in ϕ.

  3. (3.)

    Transitivity holds: if (xixj) and (xjxk) are clauses of ϕ, then so is (xixk)ϕ; furthermore, if (x¯ix¯j) and (x¯jx¯k) are clauses of ϕ, then so is (x¯ix¯k)ϕ.

We call such 2CNF formulas monotone transitive dual-free and abbreviate the corresponding satisfiability problem by mtdf-2SAT.

We call a set {v1,,vk} of variables in ϕ a (variable-)clique, if for each pair vi,vj,ij it is true that either (vivj) or (¬vi¬vj) is a clause in ϕ. Note that, by the transitivity (3.), the variables of a mtdf-2SAT instance appear in at most two cliques, once positive and once negative. Thus we can represent ϕ, by listing its positive and negative cliques. See 20

Proof.

We present reductions between MC(dep(x,y)dep(y,x)) and mtdf-2SAT (for illustrations, see Figure 7). Let τt=(T3) be the vocabulary of structures where T encodes a team such that T(i,j,k) is true if and only if si(xj)=ak, where i,j,k are natural numbers. Note that this differs from the encoding Rel(T), used in the proof of Theorem 18, in that it explicitly contains the index of assignments. This additional information is necessary for this reduction. Then MC(dep(x,y)dep(y,x)):STRUC[τt]{0,1} is the Boolean query that is true if and only if T is a valid team and Tdep(x,y)dep(y,x).

Further let τpn=(P2,N2) be the vocabulary of structures for mtdf-2SAT, where P(v,c) encodes positive variables that occur in the same clique c and N(v,c) encodes negative variables. Then mtdf-2SAT:STRUC[τpn]{0,1} is the Boolean query that is true if and only if P and N encode a mtdf-2CNF formula ϕ and there is some assignment such that ϕ.

The first-order reduction Itpn:STRUC[τt]STRUC[τpn] is as follows:

Itpnλvctrue,ψP,ψN,ψP(v,c)T(v,0,c),ψN(v,c)T(v,1,c),

where true means the universe is the same. We show that AMC(dep(x,y)dep(y,x))Itpn(A)mtdf-2SAT. Assume A encodes a team T such that T=T1T2, where T1dep(x,y) and T1dep(y,x). Then the assignment with (xi)=1, if siT1 and (xi)=0, if siT2 satisfies the formula encodes by Itpn(A). This is because, in a positive clique, at most one variable is false. Thus, in all clauses, at least one variable is true. The opposite holds for negative cliques.

Now for the reduction in the other direction. Consider the following first-order reduction Ipnt:STRUC[τpn]STRUC[τt] given by:

Ipntλsxatrue,ψT,ψT(s,x,a)(x=0N(s,a))(x=1P(s,a)).

Notice that Ipnt is simply the inverse of Itpn, i. e., Ipnt=Itpn1.

Txys000s101s202s311s421s522

T(0,0,0), T(0,1,0), T(1,0,0), T(1,1,1), T(2,0,0), T(2,1,2), T(3,0,1), T(3,1,1), T(4,0,2), T(4,1,1), T(5,0,2), T(5,1,2), P(0,0), N(0,0), P(1,0), N(1,1), P(2,0), N(2,2), P(3,1), N(3,1), P(4,2), N(4,1), P(5,2), N(5,2), (x0x1), (x1x0), (x1x2), (x2x1), (x0x2), (x2x0), (x4x5), (x5x4), (¬x1¬x3), (¬x3¬x1), (¬x3¬x4), (¬x4¬x3), (¬x1¬x4), (¬x4¬x1), (¬x2¬x5), (¬x5¬x2),

Figure 7: Example reduction for Lemma 20. Team T gets encodes via structure {0,1,2,3,4,5},T3 that gets mapped to the structure {0,1,2,3,4,5},P2,N2 which encodes a mtdf-2CNF formula ϕ; and vice versa. Notice how the split {s1,s3,s5}dep(x,y) and {s0,s2,s4}dep(y,x) corresponds to the assignment (x1)=(x3)=(x5)=1 and (x0)=(x2)=(x4)=0.
Theorem 24 (). [Restated, see original statement.]

Each of the following formulas has coherence-level 4:
(1.) dep(x)dep(y,z); (2.) dep(x)dep(x,y); (3.) dep(y)dep(x,y).

Proof.

We first demonstrate that dep(x)dep(y,z), dep(x)dep(x,y) and dep(y)dep(x,y) are not 3-coherent.

(1.) Consider the team T in Figure 5 (b). We have that T⊧̸dep(x)dep(y,z) which is straightforward to see; consider for example the splits {s1,s2}dep(x) or {s3,s4}dep(x) and notice that in both cases dep(y,z) is falsified by {s3,s4} and {s1,s2} respectively.

Now, all 3-element sub-teams have a split T=T1T2, where T1 is of size two and T1dep(x) and T2 is of size one and therefore satisfies any dependence atom.

(2.) Not 3-coherent, because of the team in Figure 5 (c).

(3.) Not 3-coherent, because of the team in Figure 5 (d).

For coherence it suffices to only consider dep(x)dep(y,z). The other two cases directly follow from the first.

We show that if T⊧̸dep(x)dep(y,z), then there must be a sub-team TT with |T|4 and T⊧̸dep(x)dep(y,z). Start with a pair of assignments s1,s2T for which {s1,s2}⊧̸dep(y,z) holds, i. e., s1(y)=s2(y) and s2(z)s2(z). Such a pair has to exist in T to falsify the disjunction. Now, there are two cases for the assignment of x.

Case 1:

The assignments s1 and s2 have the same value for x, i. e., s1(x)=s2(x). Then for T to falsify the disjunction, there must be a second pair {s3,s4}⊧̸dep(y,z) with s3(x)s1(x) and s4(x)s1(x). but then T={s1,s2,s3,s4} and T⊧̸dep(x)dep(y,z). To see this, assume there is a split T1T2={s1,s2,s3,s4} with T1dep(x) and T2dep(y,z). If s1T1, then s3T2, but s4 cannot be in T1 or T2. If s1T2, then s2T1, therefore s3 and s4 must be in T2 which is not possible. Therefore no split exists.

Case 2:

The assignments s1 and s2 have different values for x, i. .e, s1(x)s2(x). Then the pair satisfies neither dep(x) nor dep(y,z), so in any potential split s1T1 implies s2T2 and vice versa. Since the original team falsified the disjunction, there must be an assignment s3T that can neither be in T1 nor T2 if s1T1 and s2T2. In the same way there must be an assignment s4T that is not in T1 or T2 for s1T2 and s2T1. Therefore T={s1,s2,s3,s4} also has no split satisfying dep(x)dep(y,z).

Theorem 25 (). [Restated, see original statement.]

The following three formulas are incoherent:
(1.) dep(x,y)dep(y,x); (2.) dep(x,y)dep(y,z); (3.) dep(x,z)dep(y,z).

Proof.

(1.) We show that for all even n, there is a team T of size |T|=n+1 with T⊧̸dep(x,y)dep(y,x), but for all n-element sub-teams Tdep(x,y)dep(y,x) holds.

Tnxys01n2s111s212s322s423sn1n2n2snn21Txys01n2s322snn21s111s212sn1n2n2Tnxyzs01n23s1111s2122s3221s4232sn1n2n21snn212Txyzs01n23s3221snn212s1111s2122sn1n2n21

Figure 8: Team Tn in the proof of Theorem 25, and constructed split of an example sub-team T=Tn{s4}. On the left for dep(x,y)dep(y,x) and on the right for dep(x,z)dep(y,z).

Let the team Tn be as depicted in Figure 8. We show that it is impossible to split Tn=T1T2 such that T1dep(x,y) and T2dep(y,x). Start with assignment s1 and choose s1T1. Next, s2T2 is the only choice, because T1={s1,s2}⊧̸dep(x,y). Then s3T1, because T2={s2,s3}⊧̸dep(y,x), but T1={s1,s3}dep(x,y). Continue this procedure until all assignments but s0 are either in T1 or T2. Now, s0 cannot be in T1, because {s0,s1}⊧̸dep(x,y) and it cannot be in T2, because {s0,sn2}⊧̸dep(y,x). If instead we chose s1T2 and repeated the process backwards, that is continued with sn, then {s0,s2}⊧̸dep(x,y) and {s0,sn1}⊧̸dep(y,x). Since s1 has to be in either T1 or T2, we can conclude that no split exists and Tn⊧̸dep(x,y)dep(y,x).

Next, we show that all sub-teams of size n satisfy the disjunction. The sub-team Tn{s0}dep(x,y)dep(y,x) given the split described above. Otherwise, let T=Tn{si} and consider the split above where s1T1. We define a new split for T as follows

T1 ={sjT11<j<i}{sjT2j>i}{s0}
T2 ={sjT21<j<i}{sjT1j>i}{s1}

Now, T1dep(x,y) and T2dep(y,x).

We will present the case that siT1; the case siT2 follows analogously. If siT1, then si+1 has to be in T2 as shown above. Now, since siT, it is possible that si+1T1. From this si+2T2,si+3T1, follows immediately. This chain leads to snT1, which allows s1T2, which in turn makes s0T1 possible. Therefore we have that T1dep(x,y) and T2dep(y,x).

We have shown that for every even n we can construct a team such that dep(x,y)dep(y,x) is not n-coherent, that is dep(x,y)dep(y,x) is incoherent.

(2.) Extend the team Tn in (1.) with variable z, such that s(z)=s(x) for all sTn. Then incoherence of dep(x,y)dep(y,z) follows immediately from the incoherence of dep(x,y)dep(y,x).

(3.) For dep(x,z)dep(y,z) extend Tn with variable z such that if s(x)=s(x) or s(y)=s(y), then s(z)s(z) for all s,sT. For example see team Tn in Figure 8. It is easy to check, that the arguments of (1.) also hold true in such a team for dep(x,z)dep(y,z).