Abstract 1 Introduction 2 Preliminaries 3 FOC𝟐 has Unbounded VC Dimension 4 Bound on the Number of Types 5 VC Density and VC Dimension 6 Stability 7 Final Remarks References

On the VC Dimension of First-Order Logic
with Counting and Weight Aggregation

Steffen van Bergerem ORCID Humboldt-Universität zu Berlin, Germany Nicole Schweikardt ORCID Humboldt-Universität zu Berlin, Germany
Abstract

We prove optimal upper bounds on the Vapnik–Chervonenkis density of formulas in the extensions of first-order logic with counting (FOC1) and with weight aggregation (FOWA1) on nowhere dense classes of (vertex- and edge-)weighted finite graphs. This lifts a result of Pilipczuk, Siebertz, and Toruńczyk [14] from first-order logic on ordinary finite graphs to substantially more expressive logics on weighted finite graphs. Moreover, this proves that every FOC1 formula and every FOWA1 formula has bounded Vapnik–Chervonenkis dimension on nowhere dense classes of weighted finite graphs; thereby, it lifts a result of Adler and Adler [1] from first-order logic to FOC1 and FOWA1.

Generalising another result of Pilipczuk, Siebertz, and Toruńczyk [14], we also provide an explicit upper bound on the ladder index of FOC1 and FOWA1 formulas on nowhere dense classes. This shows that nowhere dense classes of weighted finite graphs are FOC1-stable and FOWA1-stable.

Keywords and phrases:
VC dimension, VC density, stability, nowhere dense graphs, first-order logic with weight aggregation, first-order logic with counting
Copyright and License:
[Uncaptioned image] © Steffen van Bergerem and Nicole Schweikardt; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Finite Model Theory
Acknowledgements:
We thank the anonymous reviewers for their valuable comments that helped to improve the presentation of this paper.
Funding:
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project number 541000908 (gefördert durch die Deutsche Forschungsgemeinschaft (DFG) – Projektnummer 541000908).
Editors:
Jörg Endrullis and Sylvain Schmitz

1 Introduction

The Vapnik–Chervonenkis dimension (for short: VC dimension) is a measure for the complexity of set systems; it was introduced in the 1970s [19, 17, 16] and has been widely studied since then. It is formally defined as follows. Let X be a set and let 2X be a family of subsets of X. A set YX is shattered by if every subset of Y can be obtained as the intersection of Y with some F, i. e., {YF:F}=2Y. The VC dimension of is the maximum size of a set YX that is shattered by (or , if this maximum does not exist).

Given a logical formula φ(x¯,y¯) with its free variables partitioned into a k-tuple x¯ and an -tuple y¯, the VC dimension of φ(x¯,y¯) on a graph G=(V(G),E(G)) is defined as the VC dimension of the family Sφ(G/V(G))SGφ(V(G)/V(G)), where for V,WV(G) we let

SGφ(V/W){tpGφ(v¯/W):v¯Vk},wheretpGφ(v¯/W){w¯W:Gφ[v¯,w¯]}.

We say that φ(x¯,y¯) has bounded VC dimension on a class 𝒞 of graphs if there is a number c such that for every G𝒞 the VC dimension of φ(x¯,y¯) on G is at most c. In the following, all graphs considered in this paper are finite.

Motivated by applications on the learnability of concept classes in the model of Probably Approximately Correct (PAC) learning, Grohe and Turán [9] showed that every first-order formula φ(x¯,y¯) has bounded VC dimension on classes of graphs of bounded local clique-width (this, in particular, includes planar graphs). Adler and Adler [1] generalised this to all nowhere dense classes of graphs. The notion of nowhere dense classes was introduced by Nešetřil and Ossona de Mendez [12, 11] as a formalisation of classes of “sparse” graphs. It subsumes and extends many well-known classes of sparse graphs, including planar graphs, trees, classes of graphs of bounded tree-width or bounded degree, and all classes that exclude a fixed topological minor. It is a robust notion that has numerous equivalent characterisations; for details we refer to the book [13].

The goal of the present paper is to lift Adler and Adler’s result [1] from first-order logic FO to the substantially more expressive logics FOC1 and FOWA1 (introduced in [8, 5]) that enrich FO by mechanisms for counting and for weight aggregation. An obstacle in achieving this is that the proof in [1] relies on model-theoretic results of [15] based on the compactness of FO – and these are not available for FOC1 or FOWA1. Fortunately, Pilipczuk, Siebertz and Toruńczyk [14] presented a different, constructive proof of Adler and Adler’s result. Their proof is based on Gaifman locality and Feferman–Vaught decompositions of FO. Similar locality results and decompositions were achieved for FOC1 and FOWA1 in [8, 5].

The logic FOC (first-order logic with counting terms) was introduced in [10] and further studied in [8, 3]. This logic extends FO by the ability to formulate counting terms that evaluate to integers, and by numerical predicates that allow to compare counting terms. If φ is a formula with free variables x¯=(x1,,xk) and y¯=(y1,,y), then #y¯.φ is a counting term with free variables x¯ that specifies the number of tuples y¯ that satisfy the formula φ. Apart from this, every fixed integer is a counting term; and if t1 and t2 are counting terms, then so are (t1+t2) and (t1t2). The results of terms can be combined into a formula by means of numerical predicates: an m-ary numerical predicate 𝖯 is an m-ary relation on the integers (e. g. 𝖯 is the binary relation consisting of all pairs (i,j) of integers where ij). The logic FOC allows formulas of the form 𝖯(t1,,tm) that evaluate to “true” if and only if the m-tuple of integers obtained by evaluating the counting terms t1,,tm belongs to the relation 𝖯.

The logic FOWA (first-order logic with weight aggregation) was introduced in [5]. Formulas and terms of this logic are evaluated on weighted graphs, which extend ordinary undirected graphs by assigning weights (i. e., elements from particular rings or abelian groups) to vertices or edges present in the graph. Pairs that do not occur as edges of the graph receive the weight 0, i. e., the neutral element of the ring or abelian group. FOWA extends FO by the ability to formulate (weight aggregation) terms that evaluate to elements in the given ring (or abelian group), and by predicates that allow to compare these terms. Every fixed element of the ring or abelian group is a term, as well as every expression of the form 𝚠(x) or 𝚠(x,y); the latter yields the weight of vertex x and edge (x,y), respectively. If φ is a formula with free variables x¯=(x1,,xk) and y¯=(y1,,y), then 𝚠(y¯).φ is a (weight aggregation) term with free variables x¯ that specifies the sum (w.r.t. the ring or abelian group) of the weights of all tuples y¯ for which the formula φ is satisfied. More generally, instead of a single expression 𝚠(y¯), the term may also refer to a product (w.r.t. the given ring) of such expressions and fixed elements of the ring. Analogously as for FOC, terms can be combined using the operations present in the ring or abelian group; and the results of terms can be combined into a formula by means of predicates on the ring or abelian group: a formula of the form 𝖯(t1,,tm) expresses that the m-tuple of elements in the ring or abelian group obtained by evaluating the terms t1,,tm belongs to the relation 𝖯.

FOC can be viewed as a special case of FOWA where the ring is the ring of integers, and every vertex of the graph is equipped with the weight 1. Thus, all results that are available for (fragments of) FOWA immediately translate into analogous results on (the corresponding fragment of) FOC (but not necessarily vice versa).

For each number n, the fragments FOCn and FOWAn of FOC and FOWA restrict subformulas of the form 𝖯(t1,,tm) to have at most n free variables.

In this paper, we follow the approach of Pilipczuk, Siebertz and Toruńczyk [14] and extend it to FOC and FOWA by utilising results of van Bergerem and Schweikardt [5] and Grohe and Schweikardt [8]. Our main results are as follows.

  1. (1)

    There is a formula φ(x,y) of FOC2 that has unbounded VC dimension on the class 𝒯3 of unranked trees of height 3 (note that 𝒯3 is nowhere dense). (Theorem 3.1)

  2. (2)

    Every formula φ(x¯,y¯) of FOC1 or FOWA1 has bounded VC dimension on every nowhere dense class 𝒞 of weighted graphs. (Corollary 5.3)

Result (1) is obtained by representing arbitrary graphs G via unranked trees TG of height 3 in the same way as in [8]. Then, arbitrary FO formulas on G can be translated into corresponding FOC2 formulas on TG. By applying this translation to the formula E(x,y), which has unbounded VC dimension on the class of all graphs, one obtains Result (1).

For obtaining Result (2), we combine the approach of [14] with the locality results of [8, 5]. This allows us to lift the following key result of [14] from FO to FOC1 and FOWA1.

  1. (3)

    For every nowhere dense class 𝒞 of weighted graphs, for every formula φ(x¯,y¯) of FOWA1 or FOC1, and for and every ε>0, there exists a number c such that for every G𝒞 and every non-empty WV(G), we have |Sφ(G/W)|c|W||x¯|+ε, where Sφ(G/W)SGφ(V(G),W). (Theorem 5.1)

As an immediate consequence of this, by definition, we obtain the following result.

  1. (4)

    Every formula φ(x¯,y¯) of FOWA1 or FOC1 has VC density at most |x¯| on every nowhere dense class 𝒞 of weighted graphs. (Corollary 5.2)

Here, the VC density of φ(x¯,y¯) on 𝒞 is defined as the infimum of all reals α>0 such that |Sφ(G/W)|𝒪(|W|α), for all G𝒞 and all WV(G) (where constants hidden in the 𝒪-notation may depend on α). We want to remark that Result (4) implies Result (2), because the VC dimension is finite if and only if the VC density is finite (see, e. g., [2]).

For proving Result (3), we rely on a technical main lemma (see Lemma 4.1). The same statement was proven in [14] for FO instead of FOWA1. Lifting this from FO to FOWA1 (and FOC1) was one of the main technical obstacles we had to overcome in this paper.

From [14], we know that the bounds provided by Results (3) and (4) are optimal (since FO is included in FOC1 and FOWA1) and, furthermore, that Results (2)–(4) cannot be extended to classes that are not nowhere dense but closed under taking subgraphs.

As another application of our main technical lemma (Lemma 4.1), we provide upper bounds (Theorem 6.1) on the ladder index, which is defined as follows. For a FOWA1 formula φ(x¯,y¯), a φ-ladder of length L in a weighted graph G is a sequence v¯1,,v¯L,w¯1,,w¯L such that v¯i(V(G))|x¯| and w¯i(V(G))|y¯| for all i[L], and, for all i,j[L], it holds that Gφ[v¯i,w¯j] if and only if ij. The smallest L for which there is no φ-ladder of length L in G is called the ladder index of φ in G.

A class 𝒞 of graphs is called stable if the ladder index of every first-order formula φ in every graph from 𝒞 is bounded by a constant depending only on φ and 𝒞 [18]. Adler and Adler [1] showed that every nowhere dense class of graphs is stable. Using our bound on the ladder index (Theorem 6.1), we obtain the following result, which also implies Result (2).

  1. (5)

    Every nowhere dense class 𝒞 of weighted graphs is FOC1-stable and FOWA1-stable, that is, the ladder index of every FOWA1 formula (and therefore also of every FOC1 formula) φ in every weighted graph from 𝒞 is bounded by a constant depending only on φ and 𝒞. (Corollary 6.2)

The remainder of the paper is structured as follows. Section 2 provides the necessary background on graphs, nowhere dense classes, the logics FOC and FOWA, and the locality results that are known for these logics and used in our proofs. Section 3 presents the proof of Result (1). Section 4 is devoted to the main technical lemma (Lemma 4.1). In Section 5, we utilise this lemma to prove our Results (2)–(4). Section 6 proves Result (5) based on Lemma 4.1. We conclude in Section 7.

2 Preliminaries

We let , , 1, >0 denote the sets of integers, non-negative integers, positive integers, and positive rationals, respectively. For m,n, we let [m,n]{:mn} and [n][1,n]. For a k-tuple v¯=(v1,,vk), we write |v¯| to denote its length k. We denote the power set of a set S by 2S.

A group (G,) is a set G equipped with a binary operator :G×GG that is associative (i. e. (ab)c=a(bc) for all a,b,cG) and has a neutral element eGG (i. e. aeG=eGa=a for all aG) such that each aG has an inverse aG (i. e. aa=aa=eG); we write a1 for this a. A group is abelian if is commutative (i. e. ab=ba for all a,bG). A ring (R,+,) is a set R equipped with two binary operators + (addition) and (multiplication) such that (R,+) is an abelian group with neutral element 0RR, is associative and has a neutral element 1RR, and multiplication is distributive with respect to addition, i. e. a(b+c)=(ab)+(ac) and (a+b)c=(ac)+(bc) for all a,b,cR. A ring is commutative if is commutative.

When referring to an abelian group (or ring), we will usually write (S,+S) (or (S,+S,S)), we denote the neutral element of the group by 0S, and a denotes the inverse of an element a in (S,+S) (and we denote the neutral element of the ring for (S,S) by 1S).

𝝈-Graphs

A (simple, undirected and finite) graph G=(V(G),E(G)) consists of a finite set V(G) (the vertices of G) and a set E(G) of subsets of V(G) of size 2 (the edges of G).

A graph signature σ is a finite set consisting of a symbol E and a finite number of further symbols. The symbol E as arity ar(E)=2, while all other symbols Rσ{E} have arity ar(R){0,1}. Let σ be a graph signature. A σ-graph G consists of a graph (V(G),E(G)), and a relation R(G)(V(G))ar(R) for every Rσ{E}. Note that relations of arity 1 are subsets of V(G), and since S0={()} for every set S, there exist only two relations of arity 0, namely and {()}. We identify the latter with true and the former with false.

The order of a σ-graph G is |G||V(G)|.

Weighted 𝝈-Graphs

Let σ be a graph signature. Let 𝕊 be a collection of rings and/or abelian groups. Let 𝐖 be a finite set of weight symbols such that each 𝚠𝐖 has an associated arity ar(𝚠){1,2} and a type type(𝚠)𝕊. A (σ,𝐖)-graph (or, 𝐖-weighted σ-graph) is a σ-graph G that is enriched, for every 𝚠𝐖, by an interpretation 𝚠G:(V(G))ar(𝚠)type(𝚠), which satisfies the following edge condition for all 𝚠𝐖 with ar(𝚠)=2: if 𝚠G(v1,v2)0S for Stype(𝚠), and v1,v2V(G), then {v1,v2}E(G).

Standard notions used for graphs are defined for (𝐖,σ)-graphs G by referring to their Gaifman graph (V(G),E(G)). In particular, a path between two vertices u and v in G is a path between u and v in the graph (V(G),E(G)), and the distance distG(u,v) between vertices u and v is their distance in the graph (V(G),E(G)). The degree deg(G) is the maximum degree of (V(G),E(G)).

For a set XV(G), the induced subgraph of G on X is the (σ,𝐖)-graph G[X] with vertex set V(G[X])=X, edge set E(G[X])={eE(G):eX}, relations R(G[X])=R(G)Xar(R) for every Rσ{E}, and weights 𝚠G[X](v¯)=𝚠G(v¯) for every 𝚠𝐖 and every v¯Xar(𝚠). For a (σ,𝐖)-graph G and a set SV(G), we let GSG[V(G)S].

For a number r0, the r-ball around a vertex vV(G) is NrG(v){uV(G):distG(v,u)r}, and the r-ball around a set SV(G) is NrG(S)vSNrG(v). The r-neighbourhood around S is the (σ,𝐖)-graph 𝒩rG(S)G[NrG(S)]. For a tuple a¯=(a1,,ak)V(G)k we let 𝒩rG(a¯)𝒩rG(S) and NrG(a¯)NrG(S) for S{a1,,ak}.

Let σ be a graph signature with σσ, and let 𝐖 be a finite set of weight symbols with 𝐖𝐖. A (σ,𝐖)-graph G is a (σ,𝐖)-expansion of a (σ,𝐖)-graph G if V(G)=V(G), R(G)=R(G) for all Rσ, and 𝚠G=𝚠G for every 𝚠𝐖. If G is a (σ,𝐖)-expansion of the (σ,𝐖)-graph G, then G is the (σ,𝐖)-reduct of G.

Let G and H be two (σ,𝐖)-graphs with V(G)V(H)=. The disjoint union of G and H is the (σ,𝐖)-graph GH with vertex set V(GH)=V(G)V(H), and R(GH)=R(G)R(H) for all Rσ, and weight functions as follows: For all unary 𝚠𝐖 we have 𝚠GH(v)=𝚠G(v) for all vV(G) and 𝚠GH(v)=𝚠H(v) for all vV(H). For all binary 𝚠𝐖 we have 𝚠GH(u,v)=𝚠G(u,v) for all (u,v)V(G)2, 𝚠GH(u,v)=𝚠H(u,v) for all (u,v)V(H)2, and 𝚠GH(u,v)=0S for all (u,v)(V(G)×V(H))(V(H)×V(G)), where S=type(𝚠).

Nowhere Dense Classes

For n, we write Kn for the complete graph on n vertices. A depth-n minor of a graph G=(V(G),E(G)) is a subgraph of a graph obtained from G by contracting mutually vertex-disjoint connected subgraphs of radius at most n to single vertices.

As mentioned in Section 1, the notion of nowhere dense classes of graphs is a robust notion that has numerous equivalent characterisations; for an overview we refer to the introduction of [14]; details can be found in the book [13]. For the purpose of this paper, the following characterisation serves as our definition of the notion.

Definition 2.1.

A class 𝒞 of graphs is nowhere dense if there is a function t: such that for every r, no graph G𝒞 contains the complete graph Kt(r) as a depth-r minor. A class 𝒞 of (σ,𝐖)-graphs is nowhere dense if and only if the class {(V(G),E(G)):G𝒞} is nowhere dense.

The following theorem was proved in [14] (there, it was formulated for classes of graphs; here we adapted the formulation to classes of (σ,𝐖)-graphs). We will use this result for proving our results on VC density in Section 5. The result uses the following notion. Let G be a (σ,𝐖)-graph, let r, and let V,W,SV(G). We say that V and W are r-separated by S (in G) if every path of length at most r in G from a vertex in V to a vertex in W contains a vertex from S. This notion naturally extends to tuples v¯=(v1,,vk) and w¯=(w1,,w) for any k,1 by considering the sets {v1,,vk} and {w1,,w}, and it thereby also naturally extends to sets of tuples V and W.

Theorem 2.2 (Uniform quasi-wideness for tuples [14, Theorem 2.9]).

Let r,t, and let 𝒞 be a class of (σ,𝐖)-graphs G whose Gaifman graph (V(G),E(G)) does not include Kt as a depth-18r minor. For every d, there is a number s and a polynomial N: computable from r,t, and d with the following property.

For every G𝒞, every m, and every set X(V(G))d with |X|N(m), there are sets SV(G) and YX with |S|s and |Y|m such that all distinct v¯,v¯Y are r-separated by S in G.

The Weight Aggregation Logic FOWA

Fix a countably infinite set vars of variables. A (σ,𝐖)-interpretation =(G,β) consists of a (σ,𝐖)-graph G and an assignment β:varsV(G). For k1, elements a1,,akV(G), and k distinct variables y1,,yk, we write a1,,aky1,,yk for the interpretation (G,βa1,,aky1,,yk), where βa1,,aky1,,yk is the assignment β with β(yi)=ai for every i[k] and β(z)=β(z) for all zvars{y1,,yk}.

Recall that 𝕊 is a collection of rings and/or abelian groups. An 𝕊-predicate collection is a 4-tuple (,ar,type,), where is a countable set of predicate names and, to each 𝖯, ar assigns an arity ar(𝖯)1, type assigns a type type(𝖯)𝕊ar(𝖯), and assigns a semantics 𝖯type(𝖯). For the remainder of this paper, fix an 𝕊-predicate collection (,ar,type,).

For every S𝕊 that is not a ring but just an abelian group, a 𝐖-product of type S is either an element in S or an expression of the form 𝚠(z¯), where 𝚠𝐖 is of type S and either ar(𝚠)=1 and z¯ is a single variable, or ar(𝚠)=2 and z¯=(z1,z2) for distinct variables z1,z2.

For every ring S𝕊, a 𝐖-product of type S is an expression of the form  t1t, where 1, and for each i[], either tiS or there exists a 𝚠𝐖 with type(𝚠)=S and either ar(𝚠)=1 and ti is of the form 𝚠(z) for a variable z or ar(𝚠)=2 and ti is of the form 𝚠(z1,z2) for distinct variables z1,z2. By vars(p), we denote the set of all variables that occur in a 𝐖-product p. The syntax and semantics of first-order logic with weight aggregation FOWA is defined as follows.

Definition 2.3.

For FOWA()[σ,𝕊,𝐖], the set of formulas and 𝕊-terms is built according to the following rules.

  1. (1)

    x1=x2 and R(x1,,xk) are formulas for x1,,xkvars and Rσ with ar(R)=k.

  2. (2)

    If 𝚠𝐖, S=type(𝚠), sS, k=ar(𝚠), and x¯=(x1,,xk) is a tuple of k pairwise distinct variables, then (s=𝚠(x¯)) is a formula.

  3. (3)

    If φ and ψ are formulas, then ¬φ and (φψ) are also formulas.

  4. (4)

    If φ is a formula and xvars, then xφ is a formula.

  5. (5)

    If φ is a formula, 𝚠𝐖, S=type(𝚠), sS, k=ar(𝚠), and x¯=(x1,,xk) is a tuple of k pairwise distinct variables, then (s=𝚠(x¯).φ) is a formula.

  6. (6)

    If 𝖯, m=ar(𝖯), and t1,,tm are 𝕊-terms with type(𝖯)=(type(t1),,type(tm)), then 𝖯(t1,,tm) is a formula.

  7. (7)

    For every S𝕊 and every sS, s is an 𝕊-term of type S.

  8. (8)

    For every S𝕊, every 𝚠𝐖 of type S, and every tuple (x1,,xk) of kar(𝚠) pairwise distinct variables in vars, 𝚠(x1,,xk) is an 𝕊-term of type S.

  9. (9)

    If t1 and t2 are 𝕊-terms of the same type S, then (t1+t2) and (t1t2) are also 𝕊-terms of type S; furthermore, if S is a ring (and not just an abelian group), then also (t1t2) is an 𝕊-term of type S.

  10. (10)

    If φ is a formula, S𝕊, and p is a 𝐖-product of type S, then p.φ is an 𝕊-term of type S.

Let =(G,β) be a (σ,𝐖)-interpretation. For a formula or 𝕊-term ξ from FOWA()[σ,𝕊,𝐖], the semantics ξ is defined as follows.

  1. (1)

    x1=x2=1 if β(x1)=β(x2), and x1=x2=0 otherwise; E(x1,x2)=1 if {β(x1),β(x2)}E(G), and E(x1,x2)=0 otherwise; for all Rσ with ar(R)=1, we have R(x1)=1 if β(x1)R(G), and R(x1)=0 otherwise; for all Rσ with ar(R)=0, we have R()=1 if ()R(G), and R()=0 otherwise.

  2. (2)

    (s=𝚠(x¯))=1 if s=𝚠G(β(x1),,β(xk)), and (s=𝚠(x¯))=0 otherwise.

  3. (3)

    ¬φ=1φ  and  (φψ)=max{φ,ψ}.

  4. (4)

    xφ=max{φvx:vV(G)}.

  5. (5)

    (s=𝚠(x¯).φ)=1 if s=S{𝚠G(v¯):v¯=(v1,,vk)(V(G))k with φv1,,vkx1,,xk=1}, and (s=𝚠(x¯).φ)=0 otherwise. As usual, SX=0S if X=.

  6. (6)

    𝖯(t1,,tm)=1 if (t1,,tm)𝖯, and 𝖯(t1,,tm)=0 otherwise.

  7. (7)

    s=s for sS for some S𝕊.

  8. (8)

    𝚠(x1,,xk)=𝚠G(β(x1),,β(xk)).

  9. (9)

    (t1t2)=t1St2, for {+,,}.

  10. (10)

    p.φ=S{pv1,,vkx1,,xk:v1,,vkV(G),φv1,,vkx1,,xk=1}, where vars(p)={x1,,xk}, k=|vars(p)| and p=t1SSt if p=t1t is of type S.

An expression is a formula or an 𝕊-term. The set vars(ξ) of an expression ξ is defined as the set of all variables in vars that occur in ξ. The free variables free(ξ) of ξ are inductively defined as follows.

  1. (1)

    free(x1=x2)={x1,x2} and free(R(x1,,xk))={x1,,xk} for Rσ.

  2. (2)

    free((s=𝚠(x1,,xk)))={x1,,xk}.

  3. (3)

    free(¬φ)=free(φ) and free(φψ)=free(φ)free(ψ).

  4. (4)

    free(xφ)=free(φ){x}.

  5. (5)

    free((s=𝚠(x1,,xk).φ))=free(φ){x1,,xk},

  6. (6)

    free(𝖯(t1,,tm))=i=1mfree(ti).

  7. (7)

    free(s)= for sS for some S𝕊.

  8. (8)

    free(𝚠(x1,,xk))={x1,,xk}.

  9. (9)

    free((t1t2))=free(t1)free(t2) for {+,,}.

  10. (10)

    free(p.φ)=free(φ)vars(p).

We write ξ(x1,,xk) to indicate that free(ξ){x1,,xk}. A sentence is a formula without free variables, and a ground 𝕊-term is an 𝕊-term without free variables.

For a formula φ and a (σ,𝐖)-interpretation , we write φ to indicate that φ=1. Likewise, ⊧̸φ indicates that φ=0. For a formula φ, a (σ,𝐖)-graph G, and a tuple v¯=(v1,,vk)(V(G))k, we write Gφ[v¯] or (G,v¯)φ to indicate that (G,β)φ for one (and hence every) assignment β with β(xi)=vi for all i[k]. Furthermore, we set φ(v¯)G1 if Gφ[v¯], and φ(v¯)G0 otherwise. Similarly, for an 𝕊-term t(x¯), we write tG[v¯] to denote t. The fragments FOWAn and FOW1 of FOWA are defined as follows.

Definition 2.4.

For every n, the set of expressions of FOWAn()[σ,𝕊,𝐖] is built according to the same rules as for the logic FOWA()[σ,𝕊,𝐖], with the following restrictions:

  • rule (5) can only be applied if S is finite,

  • rule (6) can only be applied if |free(t1)free(tm)|n.

FOW1()[σ,𝕊,𝐖] is the restriction of FOWA1()[σ,𝕊,𝐖] where rule (10) cannot be applied.

As pointed out in [5], FOW1 can be viewed as an extension of first-order logic with modulo-counting quantifiers, and FOWA and FOWA1 can be viewed as extensions of the counting logics FOC and FOC1 of [10] and [8]. In fact, every formula in FOC can be viewed as a formula in FOWA.

Note that first-order logic FO is the restriction of FOW1 where only rules (1), (3), and (4) can be applied. As usual, we write (φψ) and xφ as shorthands for ¬(¬φ¬ψ) and ¬x¬φ. The quantifier rank qr(ξ) of an FOWA()[σ,𝕊,𝐖] expression ξ is defined as the maximum nesting depth of constructs using rules (4) and (5) in order to construct ξ. The aggregation depth dag(ξ) of ξ is defined as the maximum nesting depth of term constructions using rule (10) in order to construct ξ.

Example 2.5.

Consider the following setting. 𝕊 consists of a single ring, the ring (,+,) of integers with the natural addition and multiplication. consists of a single predicate, the binary equality predicate 𝖯= with 𝖯=={(i,i):i}. 𝐖 consists of a single weight symbol 𝚠, and ar(𝚠)=2. Furthermore, σ={E}. We interpret a (σ,𝐖)-graph G=(V(G),E(G),𝚠G) as a flow network, where 𝚠G(u,v) indicates the flow through edge {u,v} in the direction from u to v, and 𝚠G(v,u) indicates the flow through edge {u,v} in the direction from v to u.

The fact that a node x is a source node, i.e., all edges incident with x have weight 0 in the direction into x, can be described by the FOW1()[σ,𝕊,𝐖] formula source(x)z(0=𝚠(z,x)). Similarly, target(y)z(0=𝚠(y,z)) is an FOW1()[σ,𝕊,𝐖] formula expressing that node y is a target node, i.e., all edges incident with y have weight 0 in the direction outgoing from y. Furthermore, tin(z)𝚠(u,z).(z=zE(u,z)) is a term of FOWA1()[σ,𝕊,𝐖] which specifies the total flow through edges incoming into node z. Moreover, tout(z)𝚠(z,u).(z=zE(z,u)) is a term of FOWA1()[σ,𝕊,𝐖] which specifies the total flow through edges going out of node z. Thus, ψ(z)𝖯=(tin(z),tout(z)) is a FOWA1()[σ,𝕊,𝐖] formula expressing that for node z, the incoming flow is equal to its outgoing flow. Finally, φ(x,y)((source(x)target(y))z((z=xz=y)ψ(z))) is a FOWA1()[σ,𝕊,𝐖] formula expressing the following: Gφ[s,t] for nodes s,tV(G) if and only if 𝚠G is a feasible flow for the flow network G with source and sink nodes s and t, i.e., for all vertices vV(G){s,t} the incoming flow is equal to its outgoing flow.

Locality Results

For proving the main results (2)–(5) stated in Section 1, we heavily rely on the following two locality results achieved in [5].

Theorem 2.6 (Feferman–Vaught decompositions for FOW1 [5, Theorem 4.3]).

Let k,, and let x¯=(x1,,xk), y¯=(y1,,y) be tuples of k+ pairwise distinct variables. For every FOW1()[σ,𝕊,𝐖] formula φ with free variables among {x1,,xk,y1,,y}, there is a finite, non-empty set Δ of pairs (α,β) of FOW1()[σ,𝕊,𝐖] formulas with free(α){x1,,xk} and free(β){y1,,y} such that the following holds. For all (σ,𝐖)-graphs G and H with V(G)V(H)= and all v¯(V(G))k and w¯(V(H)), we have GHφ[v¯,w¯] if and only if there is a pair (α,β)Δ with Gα[v¯] and Hβ[w¯].

Furthermore, all formulas occurring in Δ have quantifier rank at most qr(φ), and they only use those 𝖯 and S𝕊 that occur in φ and only those 𝕊-terms that occur in φ or that are of the form s for an sS with S𝕊 where S is finite and occurs in φ.

Moreover, there is an algorithm that computes Δ upon input of φ, x¯, and y¯.

For stating the second locality result, we need the following notation of local formulas. Let r. A FOWA formula φ(x¯) with free variables x¯=(x1,,xd) is r-local (around x¯) if for every (σ,𝐖)-graph G and all a¯V(G)d, we have  Gφ[a¯]𝒩rG(a¯)φ[a¯] . A formula is local if it is r-local for some r.

Theorem 2.7 (Localisation Theorem for FOWA1 [5, Theorem 4.7]).

Let d. For every formula φ(x1,,xd) of FOWA1()[σ,𝕊,𝐖], there is an r, an extension σ of σ with relation symbols of arity 1, and an FOW1()[σ,𝕊,𝐖] formula φ(x1,,xd) that is a Boolean combination of r-local formulas and statements of the form R() for a 0-ary relation symbol Rσ such that the following holds. There is an algorithm that, upon input of a (σ,𝐖)-graph G, computes in time |V(G)|(deg(G))𝒪(1) a (σ,𝐖)-expansion G of G such that, for all v¯V(G)d, it holds that Gφ[v¯] if and only if Gφ[v¯]. Furthermore, r, σ, and φ are computable from φ.

3 FOC𝟐 has Unbounded VC Dimension

This section proves main result (1) stated in Section 1. Let σ{E}. Let 𝕊 consist of the integer ring (,+,), and let 𝐖 consist of a unary weight symbol 𝚘𝚗𝚎. We identify a graph G=(V(G),E(G)) with a (σ,𝐖)-graph by letting 𝚘𝚗𝚎G(v)=1 for all vV(G). For a formula φ, we write #(y1,,yj).φ for the weight aggregation term p.φ for p𝚘𝚗𝚎(y1)𝚘𝚗𝚎(yj). Note that this term evaluates to the number of tuples (a1,,aj)V(G)j for which the formula φ is satisfied when assigning the variables y1,,yj the vertices a1,,aj. Let be the predicate collection consisting only of the equality predicate 𝖯=, where 𝖯=={(i,i):i}. The logic FOC()[σ] considered in [8] precisely corresponds to the logic FOWA()[σ,𝕊,𝐖], and FOCn()[σ] corresponds to FOWAn()[σ,𝕊,𝐖], for n.

Theorem 3.1.

Let 𝒯3 be the class of undirected, unranked trees of height at most 3. There is an FOC2()[σ] formula ψ(x,y) such that, for every n, there exist H𝒯3 and WV(H) with |W|=n and |SHφ(V(H)/W)|=2|W|. In particular, this implies that ψ(x,y) has unbounded VC dimension on 𝒯3.

Proof.

Recall the notions introduced at the beginning of Section 1. In particular, we write Sφ(G/W) as a shorthand for SGφ(V(G)/W).

Let 𝒞all be the class of all graphs. The proof of [8, Theorem 4.1] associates with every G𝒞all a tree HG𝒯3 and an injective mapping πG from V(G) to V(HG). Furthermore, the construction presented there allows associating with every FO[σ] formula φ(x,y) an FOC2()[σ] formula φ^(x,y) such that the following is true for every G𝒞all:

  1. 1.

    For all v,wV(G), we have: Gφ[v,w] HGφ^[πG(v),πG(w)].

  2. 2.

    For all v,wV(HG) with vimg(πG) or wimg(πG), we have: HG⊧̸φ^[v,w].

This implies that for all WV(G) and all vV(G) we have:
πG(tpGφ(v/W))={wπG(W):HGφ^[πG(v),w]}=tpHGφ^(πG(v)/πG(W)).
Hence, πG(Sφ(G/W))Sφ^(HG/πG(W)), and thus

|Sφ(G/W)||Sφ^(HG/πG(W))|. (1)

Consider the FO formula φ(x,y)E(x,y). For every n, there is a graph G𝒞all and a set WV(G) with |W|=n and |Sφ(G/W)|=2|W|. For example, we could use the graph G with V(G)[n]{0,1}n, E(G){{i,w¯}:i,w¯{0,1}n,wi=1}, and W[n]. Let WπG(W), and note that |W|=|W|=n. From Equation 1, we obtain that 2|W|=|Sφ(G/W)||Sφ^(HG/W)|2|W|=2|W|. Therefore, |Sφ^(HG/W)|=2|W|. Choosing ψ(x,y) to be the formula φ^(x,y) thus proves the first statement of Theorem 3.1. The second statement of the theorem is an immediate consequence of its first statement and the definition of the notion of VC dimension.

4 Bound on the Number of Types

In this section, we prove the main technical tool for this paper. For that, we use the following notation. For every k, I={i1,,i}[k] with i1<i2<<i, and for a tuple v¯=(v1,,vk), we let v¯I(vi1,vi2,,vi) be the tuple obtained from v¯ by keeping only entries at positions contained in I.

Lemma 4.1.

There are computable functions T:FOWA1()[σ,𝕊,𝐖]× and r:FOWA1()[σ,𝕊,𝐖] such that, for every FOWA1()[σ,𝕊,𝐖] formula φ(x¯,y¯), every m, every (σ,𝐖)-graph G, and all V,WV(G) that are r(φ)-separated by a set of size at most m, we have |SGφ(V/W)|T(φ,m).

Proof.

Let φ(x¯,y¯)FOWA1()[σ,𝕊,𝐖], k|x¯|, and |y¯|. W.l.o.g., we assume that , σ, 𝕊, and 𝐖 only contain elements that occur in φ. Using Theorem 2.7, from φ, we can compute an r, an extension σ of σ with relation symbols of arity 1, and an FOW1()[σ,𝕊,𝐖] formula φ(x¯,y¯) that is a Boolean combination of r-local formulas and statements of the form R() for a 0-ary relation symbol Rσ such that the following holds. For every (σ,𝐖)-graph G, there is a (σ,𝐖)-expansion G of G such that for all v¯(V(G))k and w¯(V(G)), it holds that Gφ[v¯,w¯] if and only if Gφ[v¯,w¯]. We set r(φ)2r+1. Note that, for all V,WV(G), we have that SφG(V/W)=SφG(V/W).

Let m. We extend σ and 𝐖 to be able to remove a set of vertices of size at most m from G and encode the missing information in the remaining graph. For that, for every i,j[m], we introduce a new 0-ary relation symbol Ri for every unary relation symbol Rσ, we introduce the new unary relation symbol Ei, and we introduce the new 0-ary relation symbol Ei,j. Analogously, for every i[m], we introduce two new unary weight symbols 𝚠i,1,𝚠i,2 for every binary weight symbol 𝚠𝐖. In addition, for all i,j[m], for all weight symbols 𝚠𝐖, for all stype(𝚠) that occur in φ (and type(𝚠) may be infinite) and all stype(𝚠) if type(𝚠) is finite, we add the new 0-ary relation symbol R𝚠,i,s if 𝚠 is a unary weight symbol, and we add the new 0-ary relation symbol R𝚠,i,j,s if 𝚠 is a binary weight symbol. Let σm and 𝐖m denote the resulting signature and the resulting set of weight symbols, respectively. Note that both σm and 𝐖m are finite.

Claim 4.2.

Let H be a (σ,𝐖)-graph, let z1,,ztV(H) be pairwise distinct vertices with tm, let Z{z1,,zt}, and let ψ(x1,,xp)FOW1()[σ,𝕊,𝐖] for some p.

There is a (σm,𝐖m)-expansion Hz¯,ψ of HZ such that for every mapping f:[p][0,t], there is a FOW1()[σm,𝕊,𝐖m] formula ψH,z¯,f(x¯′′), where x¯′′ is obtained from x¯ by dropping all variables xi with f(i)0, with the following properties.

For all v¯(V(H))p, we have that Hψ[v¯] if and only if Hz¯,ψψH,z¯,f[v¯], where v¯ is obtained from v¯ by dropping all elements that are contained in Z, and f:[p][0,t] maps i[p] to j[t] if vi=zj, and it maps i[p] to 0 if viZ.

Further, for a fixed formula ψ and a fixed mapping f, the formulas ψH,z¯,f are structurally identical. That is, the syntax trees of all the formulas ψH,z¯,f have the same inner nodes, and the leaf nodes that do not represent constants from rule (7) coincide. Hence, the dependence on H and z¯ is only reflected in the use of different constants for rule (7).

Proof.

Let H be a (σ,𝐖)-graph, let z1,,ztV(H) be pairwise distinct vertices with tm, let Z{z1,,zt}, and let ψ(x1,,xp)FOWA1()[σ,𝕊,𝐖] for some p.

We use the new relation symbols Ri and Ei,j to encode whether ziR(H) and {zi,zj}E(H), and we let Ei include all vertices v such that {zi,v}E(H). The relation symbols R𝚠,i,s and R𝚠,i,j,s are used to encode whether 𝚠H(zi)=s and (𝚠)H(zi,zj)=s. Finally, the unary weight symbols 𝚠i,1 and 𝚠i,2 are used to encode the weights 𝚠H(zi,v) and 𝚠H(v,zi) for all vV(H)Z. Formally, we let Hz¯,ψ be the (σm,𝐖m)-expansion of HZ with

  • Ri(Hz¯,ψ) if and only if i[t] and ziR, for all unary Rσ and i[m],

  • Ei(Hz¯,ψ)N1H(zi), for all i[t],

  • Ei(Hz¯,ψ), for all i[m][t],

  • Ei,j(Hz¯,ψ) if and only if i,j[t] and {zi,zj}E(H), for all i,j[m],

  • R𝚠,i,s(Hz¯,ψ) if and only if i[t] and 𝚠H(zi)=s, for all unary 𝚠𝐖 and all stype(𝚠) that occur in ψ and all stype(𝚠) if type(𝚠) is finite,

  • R𝚠,i,j,s(Hz¯,ψ) if and only if i,j[t] and 𝚠H(zi,zj)=s, for all binary 𝚠𝐖 and all stype(𝚠) that occur in ψ and all stype(𝚠) if type(𝚠) is finite,

  • 𝚠i,1Hz¯,ψ:V(Hz¯,ψ)type(𝚠),v𝚠H(zi,v), for all binary 𝚠𝐖 and i[t],

  • 𝚠i,2Hz¯,ψ:V(Hz¯,ψ)type(𝚠),v𝚠H(v,zi), for all binary 𝚠𝐖 and i[t], and

  • 𝚠i,jHz¯,ψ:V(Hz¯,ψ)type(𝚠),v0, for all binary 𝚠𝐖, j[2], and i[m][t].

Next, for every mapping f:[p][0,t], we recursively construct a FOW1()[σm,𝕊,𝐖m] formula ψH,z¯,f(x¯′′), where x¯′′ is obtained from x¯ by dropping all variables xi with f(i)0. Intuitively, if f(i)0, then this indicates that the variable xi should be replaced by the vertex zf(i). For all i[p] and j[0,t], we let fij:[p][0,t] be the mapping with fij(i)f(i) for all ii and fij(i)j. Moreover, for i,i[p] and j,j[0,t], we analogously define fij,ij:[p][0,t].

  1. (1)

    If ψ is of the form xi=xj, then we let ψH,z¯,fψ if f(i)=f(j)=0, ψH,z¯,f if f(i)=f(j)0, and ψH,z¯,f else. If ψ is of the form R(), or ψ is of the form R(xi) and f(i)=0, or ψ is of the form E(xi,xj) and f(i)=f(j)=0, then we let ψH,z¯,fψ. If ψ is of the form R(xi) and f(i)0, then we let ψH,z¯,fRf(i)(). If ψ is of the form E(xi,xj) and f(i)0 and f(j)=0, then we let ψH,z¯,fEf(i)(xj). If ψ is of the form E(xi,xj) and f(i)=0 and f(j)0, then we let ψH,z¯,fEf(j)(xi). If ψ is of the form E(xi,xj) and f(i),f(j)0, then we let ψH,z¯,fEf(i),f(j)().

  2. (2)

    If ψ is of the form (s=𝚠(xi)) and f(i)=0, or ψ is of the form (s=𝚠(xi,xj)) and f(i)=f(j)=0, then we let ψH,z¯,fψ. If ψ is of the form (s=𝚠(xi)) and f(i)0, then we let ψH,z¯,fR𝚠,f(i),s. If ψ is of the form (s=𝚠(xi,xj)) and f(i),f(j)0, then we let ψH,z¯,fR𝚠,f(i),f(j),s. If ψ is of the form (s=𝚠(xi,xj)) and f(i)0 and f(j)=0, then we let ψH,z¯,f(s=𝚠f(i),1(xj)). If ψ is of the form (s=𝚠(xi,xj)) and f(i)=0 and f(j)0, then we let ψH,z¯,f(s=𝚠f(j),2(xi)).

  3. (3)

    If ψ is of the form (ψψ′′), then we recursively construct ψH,z¯,f and ψH,z¯,f′′, and we let ψH,z¯,f(ψH,z¯,fψH,z¯,f′′). If ψ is of the form ¬ψ, then we recursively construct ψH,z¯,f, and we let ψH,z¯,f¬ψH,z¯,f.

  4. (4)

    If ψ is of the form xiψ, then we recursively construct ψH,z¯,fij for all j[0,t]. We let ψH,z¯,f(xiψH,z¯,fi0j=1tψH,z¯,fij).

  5. (5)

    If ψ is of the form (s=𝚠(xi).ψ) for a unary weight symbol 𝚠𝐖 of finite type Stype(𝚠), then we recursively construct ψH,z¯,fij for all j[0,t]. We let

    ψH,z¯,fs0,s1,,stSs0+s1++st=s(s0=𝚠(xi).(ψH,z¯,fi0j=1t(R𝚠,j,sjψH,z¯,fij))).

    If ψ is of the form (s=𝚠(xi,xi).ψ) for a binary weight symbol 𝚠𝐖 of finite type Stype(𝚠), then we recursively construct ψH,z¯,fij,ij for all j,j[0,t]. We let

    ψH,z¯,fs0,0,s0,1,,s0,t,s10,,st,tSs00+s0,1++st,t=s (s0=𝚠(xi,xi).(ψH,z¯,fi0,i0
    j=1tj=1t(R𝚠,j,j,sj,jψH,z¯,fij,ij))).
  6. (6)

    Finally, if ψ is of the form 𝖯(t1,,tj), then t1,,tj are terms according to rules (7)–(9), and they have at most one free variable, say xi. If f(i)=0, then we let ψH,z¯,fψ. Otherwise, we let t1,,tj be the terms obtained from t1,,tj by replacing every occurrence of a term of the form 𝚠(xi) by the constant 𝚠H(zf(i)) and every occurrence of a term of the form 𝚠(xi,xi) by the constant 𝚠H(zf(i),zf(i)). We set ψH,z¯,f𝖯(t1,,tj).

It follows from the construction that, for all v¯(V(H))p, we have Hψ[v¯] if and only if Hz¯,ψψH,z¯,f[v¯], where v¯ is obtained from v¯ by dropping all elements that are contained in Z, and f:[p][0,t] maps i[p] to j[t] if vi=zj, and it maps i[p] to 0 if viZ.

Moreover, for a fixed formula ψ and a fixed mapping f, the formulas ψH,z¯,f are structurally identical. That is, the syntax trees of all the formulas ψH,z¯,f have the same inner nodes, and even the leaf nodes that do not represent constants from some abelian group or ring (rule (7)) coincide. Hence, the dependence on H and z¯ is only reflected in the use of different constants for rule (7).

Let V,WV(G), let z1,,ztV(G) be pairwise distinct vertices with tm such that V and W are r(φ)-separated in G (and thus also in G) by the set Z{z1,,zt}, and let z¯(z1,,zt). W.l.o.g., we may assume that every vertex from Z is contained in some path from V to W in G of length at most r(φ)=2r+1, so ZV(𝒩rG(VW)).

By applying 4.2 to H𝒩rG(VW), z1,,zt, and φ, we obtain a (σm,𝐖m)-expansion Hz¯,φ of HZ and, for every mapping f:[k+][0,t], a FOW1()[σm,𝕊,𝐖m] formula φH,z¯,f. Since V is (2r+1)-separated from W by Z, there is no path from VZ to WZ in HZ=𝒩rG(VW)Z. Hence, there are (σm,𝐖m)-graphs HV and HW such that VZV(HV), WZV(HW), and Hz¯,φ=HVHW.

Let v¯Vk and w¯W. We have Gφ[v¯,w¯] if and only if Gφ[v¯,w¯]. Moreover, since φ is a Boolean combination of r-local formulas and statements of the form R() for a 0-ary relation symbol Rσ, we have that Gφ[v¯,w¯] if and only if 𝒩rG(v¯w¯)φ[v¯,w¯] if and only if 𝒩rG(VW)φ[v¯,w¯]. Furthermore, by 4.2, it holds that 𝒩rG(VW)φ[v¯,w¯] if and only if Hz¯,φφH,z¯,f[v¯,w¯], where v¯ and w¯ are obtained from v¯ and w¯, respectively, by dropping all entries that are contained in Z, and f:[k+][0,t] is defined by f(i)j if ik and vi=zj or i>k and wik=zj, and f(i)0 if ik and viZ or i>k and wikZ. Let x¯ and y¯ be the tuples of variables obtained analogously from x¯ and y¯, respectively.

Using Theorem 2.6, we obtain a Feferman–Vaught decomposition ΔφH,z¯,f of φH,z¯,f(x¯,y¯) w.r.t. (x¯;y¯), that is, a set of pairs (α(x¯),β(y¯)) of FOW1()[σm,𝕊,𝐖m] formulas such that Hz¯,φφH,z¯,f[v¯,w¯] if and only if there is a pair (α(x¯),β(y¯)) in ΔφH,z¯,f such that HVα[v¯] and HWβ[w¯]. Since the structure of φH,z¯,f is independent of H and z¯, and only the used constants might differ, it is easy to see from the proof of Theorem 2.6 (see [5] for details) that the size of ΔφH,z¯,f only depends on φ and f, and that it is independent of H and z¯. Furthermore, the number of mappings f:[k+][0,t] only depends on φ and m (recall that tm), so we can let T(φ,m):FOWA1()[σ,𝕊,𝐖]× be an upper bound on the number of pairs in the decomposition ΔφH,z¯,f for all H, z¯, and f.

All in all, we have Gφ[v¯,w¯] if and only if there is a pair (α(x¯),β(y¯)) in ΔφH,z¯,f such that HVα[v¯] and HWβ[w¯]. Hence, for every v¯Vk, tpGφ(v¯/W) only depends on

  • which vertices of v¯ are contained in Z and

  • which formulas α of pairs (α,β) in any of the ΔφH,z¯,f are satisfied by v¯, where v¯ is obtained from v¯ by dropping all entries that are contained in Z, and f ranges over all mappings f:[k+][0,t] with, for all i[k], f(i)=j if vi=zj, and f(i)=0 if viZ.

Since the number of possibilities for both can be bounded in terms of φ and m, there is a function T:FOWA1()[σ,𝕊,𝐖]× such that |SGφ(V/W)|=|{tpGφ(v¯/W):v¯Vk}|T(φ,m). This is the statement of Lemma 4.1.

5 VC Density and VC Dimension

In this section, we prove Results (2)–(4) stated in Section 1. Our main result of this section is the following.

Theorem 5.1.

Let 𝒞 be a nowhere dense class of (σ,𝐖)-graphs, and let φ(x¯,y¯) be a FOWA1()[σ,𝕊,𝐖] formula. For every ε>0, there exists a constant c such that for every G𝒞 and every non-empty WV(G), we have |Sφ(G/W)|c|W||x¯|+ε.

As discussed in the introduction, this immediately implies the following bound on the VC density of FOWA1 formulas.

Corollary 5.2.

Let 𝒞 be a nowhere dense class of (σ,𝐖)-graphs, and let φ(x¯,y¯) be a FOWA1()[σ,𝕊,𝐖] formula. The VC density of φ(x¯,y¯) on 𝒞 is at most |x¯|.

Moreover, this implies that the VC dimension of FOWA1 formulas on nowhere dense classes is bounded.

Corollary 5.3.

Let 𝒞 be a nowhere dense class of (σ,𝐖)-graphs, and let φ(x¯,y¯) be a FOWA1()[σ,𝕊,𝐖] formula. It holds that φ(x¯,y¯) has bounded VC dimension on 𝒞.

Proof.

As described in the introduction, Corollary 5.2 already implies Corollary 5.3, since the VC dimension is finite if and only if the VC density is finite (see, e. g., [2]). However, since we find it short and instructive, we also give a proof of Corollary 5.3 based on Theorem 5.1.

Let k|x¯|, |y¯|, let ε>0, and let c be the constant from Theorem 5.1 applied to 𝒞, φ(x¯,y¯), and ε. Moreover, let m0 be such that c(m)k+ε<2m for all mm0.

Let G𝒞 and Y(V(G)) such that |Y|mm0. Let WV(G) be the set of vertices appearing in any tuple in Y. We have |W||Y|=m. Moreover, we have {YF:FSφ(G/V(G))}Sφ(G/W). Hence, by Theorem 5.1, we have |{YF:FSφ(G/V(G))}||Sφ(G/W)|c(m)k+ε<2m. This shows that {YF:FSφ(G/V(G))}2Y, so Y is not shattered by Sφ(G/V(G)). Thus, the VC dimension of Sφ(G/V(G)) is less than m0. Since m0 does not depend on G, this proves that φ(x¯,y¯) has bounded VC dimension on 𝒞.

For the proof of Theorem 5.1, we rely on the following lemma on the neighbourhood complexity in nowhere dense graph classes. Let G be a (σ,𝐖)-graph, and let XV(G). For vertices vX and wV(G), a path P from v to w in G is called X-avoiding if all vertices on the path except for v are not contained in X. For an r and wV(G), the r-projection of w on X, denoted by MrG(w,X), is the set of all vertices vX that are connected to w by an X-avoiding path of length at most r.

Lemma 5.4 ([6, Lemmas 21 and 22]).

Let 𝒞 be a nowhere dense class of graphs. There is a function fcl:×>0 and an algorithm111In [6], the authors even show that this can be computed by a polynomial-time algorithm. However, running-time bounds are not relevant for our purposes. that, given a graph G𝒞, XV(G), r, and δ>0, computes a set clr,δ(X), called the r-closure of X w.r.t. δ, with the following properties.

  1. 1.

    Xclr,δ(X)V(G),

  2. 2.

    |clr,δ(X)|fcl(r,δ)|X|1+δ, and

  3. 3.

    |MrG(u,clr,δ(X))|fcl(r,δ)|X|δ for all uV(G)clr,δ(X).

Moreover, for all XV(G), it holds that

  1. 4.

    |{MrG(u,X):uV(G)}|fcl(r,δ)|X|1+δ.

We can now prove Theorem 5.1.

Proof of Theorem 5.1.

The proof is similar to the proof of the analogous result for first-order logic in [14], using Lemma 4.1 instead of the corresponding result for FO.

Let 𝒞 be a nowhere dense class of (σ,𝐖)-graphs, let φ(x¯,y¯) be a FOWA1 formula, and let ε>0. Let k|x¯|, |y¯|, let r:FOWA1 and T:FOWA1× be the functions from Lemma 4.1, let t: be the function from Definition 2.1, and let rr(φ) and tt(36r). We have that no graph G𝒞 contains Kt as a depth-36r minor.

By Theorem 2.2, there is a number s and a polynomial N: such that, for every graph G𝒞, every m, and every set X(V(G))k with |X|N(m), there are sets SV(G) and YX with |S|s and |Y|m such that all distinct v¯,v¯Y are 2r-separated by S in G. Let d be the degree of N.

Let G𝒞, and let WV(G) be a non-empty set of vertices. We set δε4k+4d, and we let Wclr,δ(W) be the r-closure of W w.r.t. δ, obtained via Lemma 5.4. We shall prove that

|Sφ(G/W)|𝒪ε,φ(|W|k+ε)for εε/2>0, ()

where 𝒪ε,φ() omits factors depending only on ε and φ. Since WW, we have |Sφ(G/W)||Sφ(G/W)|. Moreover, by Lemma 5.4, we have |W|=|clr,δ(W)|fcl(r,δ)|W|1+δ, and we have (1+δ)(k+ε)=(1+δ)(k+ε/2)k+ε by the choice of δ, so

|Sφ(G/W)|𝒪ε,φ((fcl(r,δ)|W|1+δ)k+ε)𝒪ε,φ(|W|k+ε),

which is the statement of Theorem 5.1.

It remains to prove (5). Recall that Sφ(G/W)={tpGφ(v¯/W):v¯(V(G))k}. We partition the tuples v¯=(v1,,vk)(V(G))k based on their projection MrG(v¯,W)i=1kMr(vi,W) into sets V1,,Vp. That is, two tuples v¯,v¯(V(G))k are contained in the same set Vj for some j[p] if and only if MrG(v¯,W)=MrG(v¯,W). By Item 4 of Lemma 5.4, there are at most fcl(r,δ)|W|1+δ different projections of vertices in V(G) on W, so we have p𝒪ε,φ(|W|(1+δ)k). Hence, to prove (5), it suffices to show that

|{tpGφ(v¯/W):v¯Vj}|𝒪ε,φ(|W|ε′′)for ε′′εkδ>0, ()

for all j[p], since then |Sφ(G/W)|𝒪ε,φ(|W|(1+δ)k|W|εkδ)=𝒪ε,φ(|W|k+ε).

Let j[p], and let XMrG(v¯,W) be the r-projection of v¯ on W for any (and, due to the definition of Vj, for all) v¯Vj. By Item 3 of Lemma 5.4, we have |X|kfcl(r,δ)|W|δ𝒪ε,φ(|W|δ).

Let Vj be a maximal subset of Vj such that all pairwise distinct tuples v¯,v¯ from Vj have different types tpGφ(v¯/W)tpGφ(v¯/W). Note that |{tpGφ(v¯/W):v¯Vj}|=|Vj|. Now let m be the maximum number with |Vj|N(m). Then |Vj|<N(m+1)𝒪ε,φ(md).

By Theorem 2.2, as described above, there are sets SV(G) and YVj with |S|s and |Y|m such that all distinct v¯,v¯Y are 2r-separated by S in G.

We partition Y into two sets Y1Y2, where Y1 contains all tuples that are r-separated by S from W, and Y2 contains the remaining tuples. By Lemma 4.1, since all tuples in Y1 are r-separated by S from W, and all tuples in Y1 have distinct types, we know that |Y1|T(φ,s)𝒪ε,φ(1). Moreover, for every tuple v¯Y2, there is a vertex wW such that v¯ and w are not r-separated by S in G. Note that we can choose w to be contained in X. Moreover, since all tuples in Y2 are mutually 2r-separated by S in G, we know that for two distinct tuples v¯,v¯Y2, the vertices in C connected to them by paths of length at most r avoiding S must also be distinct. This shows that |Y2||X|. Combined, we obtain that |Y|𝒪ε,δ(|X|). Furthermore, since |Y|m, we have

|Vj|𝒪ε,φ(md)𝒪ε,φ(|Y|d)𝒪ε,φ(|X|d)𝒪ε,φ(|W|dδ)𝒪ε,φ(|W|ε′′),

where the last inclusion holds because ε′′=ε/2kδε/4dδ by the choice of δ. This proves (5), which, as discussed above, implies the statement of Theorem 5.1.

6 Stability

In this section, we provide the following bound on the ladder index of FOC1 formulas and FOWA1 formulas on nowhere dense classes of weighted graphs. Based on this, we prove Result (5) stated in Section 1.

Theorem 6.1.

There are computable functions f:FOWA1()[σ,𝕊,𝐖]× and g:FOWA1()[σ,𝕊,𝐖] such that, for every FOWA1()[σ,𝕊,𝐖] formula φ, for every t, and for every (σ,𝐖)-graph G excluding Kt as a depth-g(φ) minor, the ladder index of φ in G is at most f(φ,t).

Proof.

The proof is similar to the proof of the analogous statement in [14] for first-order formulas. Let r:FOWA1()[σ,𝕊,𝐖] and T:FOWA1()[σ,𝕊,𝐖]× be the functions from Lemma 4.1. We set g:FOWA1()[σ,𝕊,𝐖],φ18r(φ).

Let φ(x¯,y¯) be a FOWA1()[σ,𝕊,𝐖] formula, let t, and let 𝒞 be the class of (σ,𝐖)-graphs excluding Kt as a depth-g(φ) minor. Let d|x¯|+|y¯|, and let s be the number and N: be the polynomial computed from r(φ), t, and d using Theorem 2.2. Moreover, let Lf(φ,t)N(2T(φ,s)+1). (Note that N and s can be computed from φ and t.) We show that every φ-ladder in every graph G𝒞 has length less than L.

Towards a contradiction, suppose there are a graph G𝒞 and tuples v¯1,,v¯L(V(G))|x| and w¯1,,w¯L(V(G))|y| that form a φ-ladder in G, that is, Gφ[v¯i,w¯j] if and only if ij. In particular, the tuples v¯1,,v¯L are pairwise distinct, and the same holds for the tuples w¯1,,w¯L. Let X{v¯iw¯i:i[L]}(V(G))d. By Theorem 2.2, for m2T(φ,s)+1, since |X|N(m), there are sets SV(G) and YX with |S|s and |Y|m such that all distinct u¯,u¯Y are r(φ)-separated by S in G. Let I{i[]:v¯iw¯iY}. Let I1,I2 be an alternating partition of I, that is, for all successive i,jI1, there is exactly one kI2 with i<k<j. Note that |I1|T(φ,s)+1. Let VV(G) be the set of vertices appearing in a tuple v¯iw¯i with iI1, and let WV(G) be the set of vertices appearing in a tuple v¯iw¯i with iI2. Since all distinct u¯,u¯Y are r(φ)-separated by S in G, it also holds that the sets V and W are r(φ)-separated by S in G.

Now we can apply Lemma 4.1 to V and W, and we obtain |SGφ(V/W)|T(φ,s)<|I1|. Hence, there are two indices i,jI1 with i<j such that tpGφ(v¯i/W)=tpGφ(v¯j/W). Let kI2 with i<k<j. Then w¯ktpGφ(v¯i/W) if and only if w¯ktpGφ(v¯j/W), so Gφ[v¯i,w¯k] if and only if Gφ[v¯j,w¯k]. However, this contradicts v¯1,,v¯ and w¯1,,w¯ being a φ-ladder, because we need to have Gφ[v¯i,w¯k] (since i<k) and G⊧̸φ[v¯j,w¯k] (since j>k). This shows that there is no φ-ladder in G of size at least L=f(φ,t), so the ladder index of φ in G is at most f(φ,t).

We call a class 𝒞 of weighted graphs FOWA1-stable (FOC1-stable) if the ladder index of every FOWA1 (FOC1) formula φ in every weighted graph from 𝒞 is bounded by a constant depending only on φ and 𝒞.

Corollary 6.2.

Every nowhere dense class of weighted graphs is FOC1-stable and FOWA1-stable.

Proof.

Let 𝒞 be a nowhere dense class of (σ,𝐖)-graphs, let φ(x¯,y¯) be a formula in FOWA1()[σ,𝕊,𝐖], and let k|x¯| and |y¯|. By Definition 2.1, there is a function t: such that for all r and G𝒞, it holds that G does not contain Kt(r) as a depth-r minor.

Let f:FOWA1()[σ,𝕊,𝐖]× and g:FOWA1()[σ,𝕊,𝐖] be the functions from Theorem 6.1. For all G𝒞, we have that G does not contain Kt(g(φ)) as a depth-g(φ) minor. Thus, by Theorem 6.1, for every G𝒞, the ladder index of φ in G is at most Lf(φ,t(g(φ))), which only depends on φ and 𝒞.

7 Final Remarks

In this paper, we have presented upper bounds on the VC dimension and the ladder index as well as optimal bounds on the VC density of formulas in the first-order logic with counting FOC1 and the first-order logic with weight aggregation FOWA1 on nowhere dense classes of vertex- and edge-weighted graphs. This lifts results of Adler and Adler [1] and results of Pilipczuk, Siebertz, and Toruńczyk [14] from first-order logic to substantially more expressive logics.

In [4], van Bergerem, Grohe, and Ritzert combined the result by Adler and Adler with the fixed-parameter tractable (fpt) model-checking result for FO on nowhere dense graph classes [7] to prove learnability results for FO on nowhere dense graph classes in the Probably Approximately Correct (PAC) learning framework. We remark that, by combining our results on the VC dimension for FOC1 formulas with the fpt model-checking result for FOC1 by Grohe and Schweikardt [8], we also obtain fpt PAC learnability for FOC1-definable concepts over nowhere dense graph classes. We are currently working on lifting these model-checking and learnability results from FOC1 to FOWA1.

References

  • [1] Hans Adler and Isolde Adler. Interpreting nowhere dense graph classes as a classical notion of model theory. Eur. J. Comb., 36:322–330, 2014. doi:10.1016/j.ejc.2013.06.048.
  • [2] Matthias Aschenbrenner, Alf Dolich, Deirdre Haskell, Dugald Macpherson, and Sergei Starchenko. Vapnik–Chervonenkis density in some theories without the independence property, I. Transactions of the American Mathematical Society, 368(8):5889–5949, August 2016. doi:10.1090/tran/6659.
  • [3] Steffen van Bergerem. Learning concepts definable in first-order logic with counting. In 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, Vancouver, BC, Canada, June 24–27, 2019, pages 1–13. IEEE, 2019. doi:10.1109/LICS.2019.8785811.
  • [4] Steffen van Bergerem, Martin Grohe, and Martin Ritzert. On the parameterized complexity of learning first-order logic. In PODS 2022: International Conference on Management of Data, Philadelphia, PA, USA, June 12–17, 2022, pages 337–346. ACM, 2022. doi:10.1145/3517804.3524151.
  • [5] Steffen van Bergerem and Nicole Schweikardt. Learning concepts described by weight aggregation logic. In 29th EACSL Annual Conference on Computer Science Logic, CSL 2021, Ljubljana, Slovenia (Virtual Conference), January 25–28, 2021, volume 183 of LIPIcs, pages 10:1–10:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPIcs.CSL.2021.10.
  • [6] Kord Eickmeyer, Archontia C. Giannopoulou, Stephan Kreutzer, O-joung Kwon, Michał Pilipczuk, Roman Rabinovich, and Sebastian Siebertz. Neighborhood complexity and kernelization for nowhere dense classes of graphs. In 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, July 10–14, 2017, Warsaw, Poland, volume 80 of LIPIcs, pages 63:1–63:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPICS.ICALP.2017.63.
  • [7] Martin Grohe, Stephan Kreutzer, and Sebastian Siebertz. Deciding first-order properties of nowhere dense graphs. J. ACM, 64(3):17:1–17:32, 2017. doi:10.1145/3051095.
  • [8] Martin Grohe and Nicole Schweikardt. First-order query evaluation with cardinality conditions. In Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2018, Houston, TX, USA, June 10–15, 2018, pages 253–266. ACM, 2018. doi:10.1145/3196959.3196970.
  • [9] Martin Grohe and György Turán. Learnability and definability in trees and similar structures. Theory Comput. Syst., 37(1):193–220, 2004. doi:10.1007/s00224-003-1112-8.
  • [10] Dietrich Kuske and Nicole Schweikardt. First-order logic with counting. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavík, Iceland, June 20–23, 2017, pages 1–12. IEEE Computer Society, 2017. doi:10.1109/LICS.2017.8005133.
  • [11] Jaroslav Nešetřil and Patrice Ossona de Mendez. First order properties on nowhere dense structures. J. Symb. Log., 75(3):868–887, 2010. doi:10.2178/jsl/1278682204.
  • [12] Jaroslav Nešetřil and Patrice Ossona de Mendez. On nowhere dense graphs. Eur. J. Comb., 32(4):600–617, 2011. doi:10.1016/j.ejc.2011.01.006.
  • [13] Jaroslav Nešetřil and Patrice Ossona de Mendez. Sparsity – Graphs, Structures, and Algorithms, volume 28 of Algorithms and combinatorics. Springer, 2012. doi:10.1007/978-3-642-27875-4.
  • [14] Michał Pilipczuk, Sebastian Siebertz, and Szymon Toruńczyk. On the number of types in sparse graphs. In Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, Oxford, UK, July 09–12, 2018, pages 799–808. ACM, 2018. doi:10.1145/3209108.3209178.
  • [15] Klaus-Peter Podewski and Martin Ziegler. Stable graphs. Fundamenta Mathematicae, 100(2):101–107, 1978. URL: http://eudml.org/doc/210953.
  • [16] Norbert Sauer. On the density of families of sets. J. Comb. Theory A, 13(1):145–147, 1972. doi:10.1016/0097-3165(72)90019-2.
  • [17] Saharon Shelah. A combinatorial problem: stability and order for models and theories in infinitary languages. Pacific Journal of Mathematics, 41(1):247–261, 1972. doi:10.2140/pjm.1972.41.247.
  • [18] Katrin Tent and Martin Ziegler. A Course in Model Theory. Lecture Notes in Logic. Cambridge University Press, 2012. doi:10.1017/CBO9781139015417.
  • [19] Vladimir Naumovich Vapnik and Alexey Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16:264–280, 1971. doi:10.1137/1116025.