On the Beck-Fiala Conjecture for Random Set Systems

Motivated by the Beck-Fiala conjecture, we study discrepancy bounds for random sparse set systems. Concretely, these are set systems $(X,\Sigma)$, where each element $x \in X$ lies in $t$ randomly selected sets of $\Sigma$, where $t$ is an integer parameter. We provide new bounds in two regimes of parameters. We show that when $|\Sigma| \ge |X|$ the hereditary discrepancy of $(X,\Sigma)$ is with high probability $O(\sqrt{t \log t})$; and when $|X| \gg |\Sigma|^t$ the hereditary discrepancy of $(X,\Sigma)$ is with high probability $O(1)$. The first bound combines the Lovász Local Lemma with a new argument based on partial matchings; the second follows from an analysis of the lattice spanned by sparse vectors.


Introduction
Let (X, Σ) be a finite set system, with X a finite set and Σ a collection of subsets of X. A two-coloring of X is a function $\chi : X \to \{-1, +1\}$; for a subset $S \subseteq X$ we write $\chi(S) = \sum_{x \in S} \chi(x)$. The discrepancy of Σ is defined as $\mathrm{disc}(\Sigma) = \min_{\chi} \max_{S \in \Sigma} |\chi(S)|$. In other words, the discrepancy of the set system (X, Σ) is the minimum over all colorings χ of the largest deviation from an even split, over all subsets in Σ. For background on discrepancy theory, we refer the reader to the books of Chazelle [Cha00] and Matoušek [Mat09].
In this paper, our interest is in the discrepancy of sparse set systems. The set system (X, Σ) is said to be t-sparse if any element x ∈ X belongs to at most t sets S ∈ Σ. A well-known result of Beck and Fiala [BF81] is that sparse set systems have discrepancy bounded only in terms of their sparsity.
Theorem 1.1 [BF81]. If (X, Σ) is t-sparse then $\mathrm{disc}(\Sigma) \le 2t - 1$.
Beck and Fiala conjectured that in fact the bound can be improved to $O(\sqrt{t})$, analogous to Spencer's theorem for non-sparse set systems [Spe85]. This is a long-standing open problem in discrepancy theory. The best result to date is by Banaszczyk [Ban98], who proved a bound of $O(\sqrt{t \log n})$ for t-sparse set systems on n elements.
Our results. In this paper, we study random sparse set systems. To sample a random t-sparse set system (X, Σ) with |X| = n, |Σ| = m, for each x ∈ X choose uniformly and independently a subset $T_x \subset [m]$ of size $|T_x| = t$. Then set $S_i = \{x \in X : i \in T_x\}$ and $\Sigma = \{S_1, \ldots, S_m\}$. Letting $\mathbb{E}[\cdot]$ denote expectation, our main quantity of interest is $\mathbb{E}[\mathrm{disc}(\Sigma)]$. We show that when m ≥ n, this is close to the conjectured bound of Beck and Fiala. Specifically, we show $\mathbb{E}[\mathrm{disc}(\Sigma)] = O(\sqrt{t \log t})$. In particular, the bound does not depend on n.
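The sampling procedure above can be illustrated with a small sketch. The function names `sample_set_system` and `disc` are hypothetical and not part of the paper, and the discrepancy is computed by exhaustive search, which is feasible only for very small n.

```python
import itertools
import random

def sample_set_system(n, m, t, rng):
    """Each element x in range(n) picks a uniform t-subset T_x of range(m);
    set S_i collects the elements x with i in T_x."""
    sets = [set() for _ in range(m)]
    for x in range(n):
        for i in rng.sample(range(m), t):
            sets[i].add(x)
    return sets

def disc(sets, n):
    """Brute-force discrepancy: min over colorings of max_S |chi(S)|.
    Exponential in n, so only meant for toy instances."""
    best = n
    for chi in itertools.product((-1, 1), repeat=n):
        worst = max(abs(sum(chi[x] for x in s)) for s in sets)
        best = min(best, worst)
    return best

rng = random.Random(0)
n, m, t = 8, 10, 3
sets = sample_set_system(n, m, t, rng)
d = disc(sets, n)
```

Every element lies in exactly t sets by construction, so the Beck-Fiala bound of 2t − 1 applies to `d`.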
In fact, we obtain such a bound for the hereditary discrepancy of the set system. For Y ⊂ X let $\Sigma|_Y = \{S \cap Y : S \in \Sigma\}$ be the set system restricted to Y. The hereditary discrepancy of a set system (X, Σ) is defined as $\mathrm{herdisc}(\Sigma) = \max_{Y \subseteq X} \mathrm{disc}(\Sigma|_Y)$. Our main result is the following.
Theorem 1.3. Assume m ≥ n. Let (X, Σ) be a random t-sparse set system with |X| = n, |Σ| = m. Then $\mathbb{E}[\mathrm{herdisc}(\Sigma)] = O(\sqrt{t \log t})$.
We note that our technique can be extended to the case where m ≥ cn for any absolute constant c > 0, but fails whenever m ≪ n. The main reason is that in this regime, most sets are large. Nevertheless, when n is considerably larger than m, we use a different approach and show that the discrepancy is small in this case as well. Specifically, when n is somewhat larger than $\binom{m}{t}$ we show that the discrepancy is only O(1).
Theorem 1.4. Fix m ≥ t and let $N = \binom{m}{t}$. Assume that $n \ge \Omega(N \log N)$. Let (X, Σ) be a random t-sparse set system with |X| = n, |Σ| = m. Then $\mathbb{E}[\mathrm{disc}(\Sigma)] = O(1)$. In fact, the bound holds with probability $1 - N^{-\Omega(1)}$.
To summarize, the work in this paper was motivated by the elusive Beck-Fiala conjecture. We considered a natural setting of random t-sparse set systems, and showed that in this case, in some regimes of parameters, the conjecture holds (with the bound of $O(\sqrt{t})$ replaced by the slightly weaker bound of $O(\sqrt{t \log t})$ in our first result). We hope that the techniques developed in this work will be useful for the study of random sparse set systems in the full spectrum of parameters, as well as for the original Beck-Fiala conjecture.

Preliminaries and Proof Overview
The Lovász Local Lemma. The Lovász Local Lemma [EL75] is a powerful probabilistic tool. In this paper we only need its symmetric version.
Theorem 2.1. Let $E_1, E_2, \ldots, E_k$ be a series of events such that each event occurs with probability at most p and such that each event is independent of all the other events except for at most d of them. If $e p (d + 1) \le 1$ then $\Pr[\wedge_{i=1}^{k} \overline{E_i}] > 0$.
Tail bounds. In our analysis we exploit a few standard tail bounds for the sum of independent random variables (Chernoff-Hoeffding bounds, see, e.g., [AS00]).

Proof Overview for Theorem 1.3
We next present an overview of our proof for Theorem 1.3. For simplicity of exposition, we present the overview only for the derivation of the discrepancy bound. In Section 3 we present the actual analysis and show a bound on the hereditary discrepancy. First, we classify each set as being either "small" if its cardinality is O(t), or "large" otherwise. Then we proceed in several steps:
• (i) Making large sets pairwise disjoint: Initially, we show that with high probability over the choice of the set system, it is possible to delete at most one element from each large set, such that they become pairwise disjoint after the deletion. This property is proved in Lemma 3.1.
• (ii) Partial matching: For each large set resulting after step (i), we pair its elements, leaving at most, say, two unpaired elements. Since each pair appears in a unique set, this process results in a partial matching $M = \{(a_1, b_1), \ldots, (a_k, b_k)\}$ on X. We observe that as soon as we have such a matching, we can restrict the two-coloring function χ on X to assign alternating signs on each pair of M. Since each large set S has at most two unpaired elements, we immediately conclude that |χ(S)| ≤ 2.
• (iii) Applying the Lovász Local Lemma on the small sets: We are thus left to handle the small sets. In this case, we observe that a random coloring χ, with alternating signs on M as above, satisfies with positive probability that $|\chi(S)| \le O(\sqrt{t \log t})$ for all small sets S ∈ Σ. This is a consequence of the Lovász Local Lemma, as each small set S contains only O(t) elements, and each of these elements participates in t sets of Σ. The fact that some of these elements appear in the partial matching implies that S can "influence" (w.r.t. the random coloring χ) at most $2|S|t = O(t^2)$ other small sets; see Section 3 for the details.
We point out that as soon as we have a partial matching M as above, we can "neutralize" the deviation that might be caused by the large sets, and only need to keep the deviation, caused by the small sets, small. The latter is fairly standard to do, and so the main effort in the analysis is to show that we can indeed make large sets disjoint as in step (i).
We note that our proof technique is constructive. Our arguments for steps (i) and (ii) (see Lemma 3.1 and our charging scheme in Claim 4.2) give an efficient algorithm that finds an element to delete in each large set, thereby making the large sets disjoint, and builds the partial matching, or, alternatively, reports (with small probability) that a partial matching of the above kind does not exist and halts. In step (iii) we can apply the algorithmic Lovász Local Lemma of Moser and Tardos [Mos09,MT10], since the colors are assigned independently among the pairs in M as well as the unpaired elements. Thus, we obtain an expected polynomial time algorithm, which, with high probability over the choice of the set system, constructs a coloring with discrepancy $O(\sqrt{t \log t})$.
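Steps (ii) and (iii) of the overview can be sketched as follows. This is a toy illustration under the assumption that step (i) has already produced, for each large set, an element whose deletion makes the large sets pairwise disjoint; the helper names `build_matching` and `consistent_coloring` are hypothetical, and the Lovász Local Lemma resampling for the small sets is not implemented here.

```python
import random

def build_matching(large_sets, removed):
    """Pair up the elements of each large set after deleting its removed element.
    Assumes the reduced sets are pairwise disjoint, so the pairs form a matching.
    At most one element per reduced set is left unpaired."""
    pairs = []
    for s, x in zip(large_sets, removed):
        rest = sorted(s - {x})
        pairs += [(rest[j], rest[j + 1]) for j in range(0, len(rest) - 1, 2)]
    return pairs

def consistent_coloring(universe, pairs, rng):
    """Uniform random coloring with opposite signs on each matched pair."""
    chi = {x: rng.choice((-1, 1)) for x in universe}
    for a, b in pairs:
        chi[b] = -chi[a]
    return chi

# Toy instance: the two sets share only element 4, which is deleted from both.
large_sets = [{0, 1, 2, 3, 4}, {4, 5, 6, 7, 8, 9}]
removed = [4, 4]
pairs = build_matching(large_sets, removed)
chi = consistent_coloring(range(10), pairs, random.Random(1))
large_disc = [abs(sum(chi[x] for x in s)) for s in large_sets]
```

Whatever the random choices are, each large set has at most two unmatched elements (the deleted one and possibly one leftover), so every entry of `large_disc` is at most 2.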

A Low Hereditary Discrepancy Bound: The Analysis
We now proceed with the proof of Theorem 1.3. We classify the sets in Σ based on their size. A set S ∈ Σ is said to be large if |S| ≥ 6t and small otherwise. Note that as m ≥ n, most sets in Σ are small. Let $I = \{i : S_i \text{ is large}\}$ be a random variable capturing the indices of the large sets. To construct a coloring, we proceed in several steps. First, we show that with high probability the large sets are nearly disjoint. We will assume throughout that t is sufficiently large (concretely t ≥ 55).
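Lemma 3.1. Fix t ≥ 55. Let E denote the following event: "there exists a choice of $x_i \in S_i$ for $i \in I$ such that the sets $\{S_i \setminus \{x_i\} : i \in I\}$ are pairwise disjoint". Then $\Pr[E] \ge 1 - 2^{-t}$.
We defer the proof of Lemma 3.1 to Section 4 and prove Theorem 1.3 based on it, in the remainder of this section. Decompose
$$\mathbb{E}[\mathrm{herdisc}(\Sigma)] = \mathbb{E}[\mathrm{herdisc}(\Sigma) \mid E] \Pr[E] + \mathbb{E}[\mathrm{herdisc}(\Sigma) \mid \overline{E}] \Pr[\overline{E}] \le \mathbb{E}[\mathrm{herdisc}(\Sigma) \mid E] + (2t - 1) 2^{-t},$$
where we bounded $\mathbb{E}[\mathrm{herdisc}(\Sigma) \mid \overline{E}]$ by the Beck-Fiala bound of 2t − 1 (Theorem 1.1), which holds for any t-sparse set system, and bounded $\Pr[\overline{E}]$ by $2^{-t}$ according to Lemma 3.1. To conclude the proof we will show that when E holds then $\mathrm{herdisc}(\Sigma) \le O(\sqrt{t \log t})$.
Thus, we assume from now on that the event E holds. Fix a subset Y ⊂ X, where we will construct a two-coloring for $\Sigma' = \Sigma|_Y$ of low discrepancy. Partition $S_i \cap Y = A_i \cup B_i$ for each $i \in I$, where $|A_i|$ is even, $|B_i| \le 2$ and the sets $\{A_i : i \in I\}$ are pairwise disjoint. Partition each $A_i$ arbitrarily into $|A_i|/2$ pairs, and let M be the union of these pairs. That is, M is a partial matching on Y given by $M = \{(a_1, b_1), \ldots, (a_k, b_k)\}$ where $a_1, b_1, \ldots, a_k, b_k \in Y$ are distinct, each $A_i$ is a union of a subset of M, and each pair $a_j, b_j$ appears in a unique set $A_i$ due to the fact that these sets are pairwise disjoint (they thus form a partition of M). We say that a coloring $\chi : Y \to \{-1, +1\}$ is consistent with M if $\chi(a_j) = -\chi(b_j)$ for all $j \in [k]$. Note that if $S_i$ is a large set, then for any coloring χ consistent with M, $|\chi(S_i \cap Y)| = |\chi(B_i)| \le 2$.
Thus, we only need to minimize the discrepancy of χ over the small sets in Σ. To do so, we choose χ uniformly from all two-colorings consistent with M. These are given by choosing uniformly and independently $\chi(a_i) \in \{-1, +1\}$ for $i \in [k]$, setting $\chi(b_i) = -\chi(a_i)$ and choosing $\chi(x) \in \{-1, +1\}$ uniformly and independently for all $x \in Y \setminus \{a_1, b_1, \ldots, a_k, b_k\}$. Consider a small set $S_i$, restricted to Y. Each pair $\{a_j, b_j\}$ contained in $S_i$ contributes 0 to the discrepancy, and all other elements obtain independent colors. Hence $\chi(S_i)$ is the sum of $t' \le 6t$ independent signs. By Lemma 2.2, for an appropriate constant c we have $\Pr[|\chi(S_i)| > c\sqrt{t \log t}] \le 1/(100 t^2)$. Let $E_i$ denote the event that $|\chi(S_i)| > c\sqrt{t \log t}$.
We next claim that each event $E_i$ depends on at most $d = 12t^2$ other events $\{E_j : j \ne i\}$. Indeed, let $S'_i$ be the set $S_i$ together with the partners in M of the matched elements of $S_i$, so that $|S'_i| \le 12t$. If $E_i$ depends on $E_j$, it must be the case that $S_j$ intersects $S'_i$. However, as each $x \in S'_i$ is contained in at most t sets, there are at most $12t^2$ such events $E_j$.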
We are now in a position to apply the Lovász Local Lemma (Theorem 2.1) to the events $E_i = [|\chi(S_i)| > c\sqrt{t \log t}]$, taken over the small sets $S_i$. Its conditions are satisfied as we have $p = 1/(100t^2)$ and $d = 12t^2$. Hence $\Pr[\wedge \overline{E_i}] > 0$, that is, there exists a coloring χ consistent with M for which $|\chi(S_i)| \le c\sqrt{t \log t}$ for all small sets $S_i$. This coloring shows that $\mathrm{disc}(\Sigma') \le \max(c\sqrt{t \log t}, 2)$ as claimed.
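As a quick numerical sanity check of the parameters used here, the following sketch evaluates the symmetric Local Lemma condition. It assumes the standard form $e p (d + 1) \le 1$ of the condition; the function name `lll_condition_holds` is hypothetical.

```python
import math

def lll_condition_holds(p, d):
    """Symmetric Lovász Local Lemma condition, assumed here as e * p * (d + 1) <= 1."""
    return math.e * p * (d + 1) <= 1

# p = 1/(100 t^2) and d = 12 t^2, as in the argument above.
checks = {t: lll_condition_holds(1 / (100 * t**2), 12 * t**2) for t in (55, 100, 1000)}
slack = math.e * (12 * 55**2 + 1) / (100 * 55**2)  # left-hand side at t = 55
```

For these parameters the left-hand side is roughly 12e/100, comfortably below 1, so the condition holds independently of t.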

Proof of Lemma 3.1
Let (X, Σ) be a t-sparse set system with |X| = n, |Σ| = m. It will be convenient to identify it with a bipartite graph G = (X, V, E) where |V| = m and $E = \{(x, i) : x \in S_i\}$. Then, a random t-sparse set system is the same as a random left t-regular bipartite graph, that is, a uniform graph satisfying deg(x) = t for all x ∈ X.
Large sets in Σ correspond to the subset of the vertices $V' = \{v \in V : \deg(v) \ge 6t\}$. For a vertex v ∈ V let Γ(v) ⊂ X denote its neighbors. Lemma 3.1 is equivalent to the following lemma, which we prove in this section.
Lemma 4.1. Fix t ≥ 55. With probability at least $1 - 2^{-t}$ over the choice of G, there exists a choice of $x_v \in \Gamma(v)$ for $v \in V'$ such that the sets $\{\Gamma(v) \setminus \{x_v\} : v \in V'\}$ are pairwise disjoint.
Let G′ be the induced (bipartite) subgraph on (X, V′). We will show that with high probability G′ has no cycles. In such a case Lemma 4.1 follows from the straightforward scheme described below.
Claim 4.2. Assume that G′ has no cycles. Then there exists a choice of $x_v \in \Gamma(v)$ for $v \in V'$ such that the sets $\{\Gamma(v) \setminus \{x_v\} : v \in V'\}$ are pairwise disjoint.
Proof. We present a charging scheme of the vertices $x_v \in \Gamma(v)$, for each v ∈ V′. If G′ has no cycles then it is a forest. Fix a tree T in G′ and an arbitrary root $v_T \in V'$ of T. Orient the edges of T from $v_T$ to the leaves. For each v ∈ T other than the root, choose $x_v$ to be the parent of v in the tree, and choose $x_{v_T} \in \Gamma(v_T)$ arbitrarily. Suppose towards a contradiction that some x ∈ X belongs to both $\Gamma(v_1) \setminus \{x_{v_1}\}$ and $\Gamma(v_2) \setminus \{x_{v_2}\}$ for distinct $v_1, v_2 \in V'$. Then $v_1, x, v_2$ is a path in G′ and hence $v_1, v_2$ must belong to the same tree T. As T is a tree and x has at most one parent in T, x must be the parent of at least one of $v_1, v_2$, say of $v_1$. However, by construction in this case $x = x_{v_1}$, a contradiction.
In the remainder of the proof we show that with high probability G′ has no cycles. The girth of G′, denoted girth(G′), is the minimal length of a cycle in G′ if such exists, and otherwise it is ∞. Note that as G′ is bipartite, girth(G′) is (if finite) the minimal 2ℓ such that there exists a cycle $x_1, v_1, x_2, v_2, \ldots, x_\ell, v_\ell, x_1$ in G′ with $x_i \in X$ and $v_i \in V'$.
We first bound the probability that G′ contains a cycle of length 4. Fix $x_1, x_2 \in X$ and $v_1, v_2 \in V$. They form a cycle of length 4 if $v_1, v_2 \in \Gamma(x_1) \cap \Gamma(x_2)$. As each $\Gamma(x_i)$ is a uniformly chosen set of size t we have that $\Pr[v_1, v_2 \in \Gamma(x_1) \cap \Gamma(x_2)] = \left(\frac{t(t-1)}{m(m-1)}\right)^2 \le (t/m)^4$. Next, conditioned on the event that $v_1, v_2 \in \Gamma(x_1) \cap \Gamma(x_2)$, we still need to have $v_1, v_2 \in V'$ (that is, $v_1, v_2$ represent large sets of Σ). We will only require that $v_1 \in V'$ for the bound. Note that so far we only fixed $\Gamma(x_1), \Gamma(x_2)$, and hence the neighborhoods Γ(x) for $x \ne x_1, x_2$ are still uniform. Then $v_1 \in V'$ if at least 6t − 2 other nodes x ∈ X have $v_1$ as their neighbor.
Lemma 2.3 bounds the probability of this event. To bound $\Pr[\mathrm{girth}(G') = 4]$ we take a union bound over all $n^2 m^2$ choices of $x_1, x_2, v_1, v_2$ and use our assumption that m ≥ n.
We next consider cycles of length 2ℓ with ℓ ≥ 3. Let $x_1, v_1, \ldots, x_\ell, v_\ell$ denote a potential cycle of length 2ℓ. As it is a minimal cycle and ℓ ≥ 3, the vertices $v_i, v_j$ have no common neighbors, unless j = i + 1, in which case $x_i$ is the only common neighbor of $v_i, v_{i+1}$ (where indices are taken modulo ℓ). Thus there exist sets $X_1, \ldots, X_\ell \subset X$, with $X_i$ consisting of additional neighbors of $v_i$ that witness that $v_i$ is large. Let $E(x_1, v_1, \ldots, x_\ell, v_\ell, X_1, \ldots, X_\ell)$ denote the event described above, for a fixed choice of $x_1, v_1, \ldots, x_\ell, v_\ell, X_1, \ldots, X_\ell$. The event is a conjunction of conditions on the individual neighborhoods Γ(x); these are independent events, as Γ(x) is independently chosen for each x ∈ X.
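The charging scheme of Claim 4.2 can be sketched as a breadth-first traversal of the forest G′. This is an illustrative sketch only: the function name `charge` and the input encoding (a dictionary mapping each large vertex v to its neighbor set Γ(v)) are assumptions, and the code presumes that G′ is acyclic and that every Γ(v) is nonempty.

```python
from collections import deque

def charge(neighbors):
    """neighbors: dict v -> set of x, encoding the graph G' (assumed acyclic).
    Returns dict v -> x_v such that the sets neighbors[v] - {x_v} are pairwise disjoint."""
    owners = {}
    for v, xs in neighbors.items():
        for x in xs:
            owners.setdefault(x, []).append(v)
    chosen = {}
    for root in neighbors:
        if root in chosen:
            continue
        chosen[root] = min(neighbors[root])  # arbitrary choice at the root
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for x in neighbors[v]:
                for u in owners[x]:
                    if u not in chosen:
                        chosen[u] = x  # x is the parent of u in the rooted tree
                        queue.append(u)
    return chosen

# Toy forest: A - 3 - B - 5 - C is one tree, D is a separate tree.
neighbors = {"A": {1, 2, 3}, "B": {3, 4, 5}, "C": {5, 6}, "D": {7, 8}}
chosen = charge(neighbors)
reduced = [neighbors[v] - {chosen[v]} for v in neighbors]
```

Each shared element x is kept by at most one set, namely the one whose vertex is the parent of x in the rooted tree, which is what makes the reduced sets disjoint.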

The regime of large sets
We next prove Theorem 1.4. Let (X, Σ) be a t-sparse set system with |X| = n, |Σ| = m, and consider its m × n incidence matrix. In this setting, we consider the case of fixed m, t and n → ∞. The columns are t-sparse vectors in $\{0, 1\}^m$, and hence have $N = \binom{m}{t}$ possible values. When n ≫ N, there will be many repeated columns. We show that in this case, the discrepancy of the set system is low. Setting notations, let $v_1, \ldots, v_N \in \{0, 1\}^m$ be all the possible t-sparse vectors, and let $r_1, \ldots, r_N$ denote their multiplicity in the set system. Note that they define the set system uniquely (up to permutation of the columns, which does not affect the discrepancy).
Our main result in this section is the following.
Theorem 5.1. Assume that every column of the incidence matrix of (X, Σ) has exactly t ones and that $r_1, \ldots, r_N \ge 7$. Then $\mathrm{disc}(\Sigma) \le 2$.
We will assume throughout that m is large enough and that 4 ≤ t ≤ m − 4. We note that if t ≤ 3 or t ≥ m − 3 then the result immediately follows from the Beck-Fiala theorem (Theorem 1.1), for any set system. The first case follows by a direct application, and the second case by first partitioning the columns into pairs and subtracting one vector from the next in each pair, which gives a 6-sparse {−1, 0, 1} matrix, to which we apply the Beck-Fiala theorem.
Note that the statement in Theorem 5.1 is somewhat stronger than that in Theorem 1.4, as it only assumes that all possible t-sparse column vectors appear in the incidence matrix of (X, Σ), each with multiplicity 7 or higher. In fact, Theorem 1.4 follows from Theorem 5.1 using a straightforward coupon-collector argument [EA61]. In this regime, with high probability (say, with probability at least 1 − 1/N), a random sample of Θ(N log N) columns guarantees that each t-sparse column appears with multiplicity 7 (or higher). Therefore, we obtain Theorem 1.4. We are thus left to prove Theorem 5.1. First, we present an overview of the proof.
Proof overview. Every column $v_i$ is repeated $r_i$ times. As we may choose arbitrary signs for each occurrence of a vector, the aggregate total would be $\sum_i c_i v_i$, where $c_i \in \mathbb{Z}$, $|c_i| \le r_i$ and $c_i \equiv r_i \pmod 2$. Our goal is to show that such a solution $\{c_i\}$ always exists, for which $\|\sum_i c_i v_i\|_\infty$ is bounded, for any initial settings of $r_1, \ldots, r_N$, as long as they are all large enough.
We show that such a solution always exists, with $|c_i| \le 7$. In order to show it, we first fix some solution with the correct parity, and then correct it to a low discrepancy solution, by adding an even number of copies of each vector. In order to do that, we study the integer lattice L spanned by the vectors $v_1, \ldots, v_N$, as our correction comes from 2L. We show that $L = \{x \in \mathbb{Z}^m : \sum_i x_i \equiv 0 \pmod t\}$, which was already proved by Wilson [Wil90] in a more general scenario. However, we need an additional property: vectors in L are efficiently spanned by $v_1, \ldots, v_N$. This allows us to perform the above correction efficiently, keeping the number of times that each $v_i$ is repeated bounded. Putting that together, we obtain the result.

Proof of Theorem 5.1
Initially, we investigate the lattice spanned by the vectors $v_1, \ldots, v_N$. As the sum of the coordinates of each of them is t, they sit within the lattice $L = \{x \in \mathbb{Z}^m : \sum_i x_i \equiv 0 \pmod t\}$. We first show that they span this lattice, and moreover, they do so effectively.
Lemma 5.2. For any w ∈ L there exist $a_1, \ldots, a_N \in \mathbb{Z}$ such that $w = \sum_i a_i v_i$ and $|a_i| \le 2\|w\|_1 / \binom{m-2}{t-1} + 2$ for all i.
Proof. Assume first that $\sum_i w_i = 0$. We will later show how to reduce to this case. Pair the positive and negative coordinates of w. For $L = \|w\|_1 / 2$ let $(i_1, j_1), \ldots, (i_L, j_L)$ be pairs of elements of [m] such that: if (i, j) is a pair then $w_i > 0$, $w_j < 0$; each i ∈ [m] with $w_i > 0$ appears $w_i$ times as the first element in a pair; and each j ∈ [m] with $w_j < 0$ appears $-w_j$ times as the second element in a pair. For any ℓ ∈ [L] choose $S_\ell \subset [m] \setminus \{i_\ell, j_\ell\}$ of size t − 1, and let $I_\ell = S_\ell \cup \{i_\ell\}$ and $J_\ell = S_\ell \cup \{j_\ell\}$. Identifying each t-subset of [m] with its indicator vector among $v_1, \ldots, v_N$, we have $w = \sum_{\ell=1}^{L} (\mathbf{1}_{I_\ell} - \mathbf{1}_{J_\ell})$.
We choose the sets $S_1, \ldots, S_L$ to minimize the maximum number of times that each vector from $\{v_1, \ldots, v_N\}$ is repeated in the decomposition. When we choose $S_\ell$, we can choose one of $M = \binom{m-2}{t-1}$ many choices. There is a choice for $S_\ell$ such that both $I_\ell$ and $J_\ell$ appeared thus far less than 2ℓ/M times. Choosing such a set, we maintain the invariant that after choosing $S_1, \ldots, S_\ell$, each vector is repeated at most 2ℓ/M + 1 times. Thus, at the end each vector is repeated at most 2L/M + 1 times.
In the general case, we have $\sum_i w_i = st$, where we may assume s > 0. We apply the previous argument to $w - (v_{i_1} + \ldots + v_{i_s})$, whose coordinates sum to zero. We choose $i_1, \ldots, i_s \in [N]$ (potentially with repetitions) so as to minimize the maximum number of times that each vector participates; this number is $\lceil s/N \rceil \le \|w\|_1 / M + 1$. Combining the two estimates, we obtain that at the end each vector is repeated at most $4L/M + 2 = 2\|w\|_1 / M + 2$ times.
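The zero-sum case of this decomposition can be sketched in code. This is a toy illustration under stated assumptions: the function names `decompose_zero_sum` and `combine` are hypothetical, t-subsets are represented as frozensets rather than by their index in $v_1, \ldots, v_N$, and the greedy load-balancing rule below is a simple stand-in for the choice of $S_\ell$ in the proof, without any claim that it attains the stated bound.

```python
import itertools
from collections import Counter

def decompose_zero_sum(w, t):
    """Write an integer vector w with sum 0 as a signed combination of
    indicator vectors of t-subsets of range(len(w)).
    Returns a Counter mapping frozenset (t-subset) -> integer coefficient."""
    m = len(w)
    pos = [i for i, wi in enumerate(w) for _ in range(max(wi, 0))]
    neg = [j for j, wj in enumerate(w) for _ in range(max(-wj, 0))]
    coeff, usage = Counter(), Counter()
    for i, j in zip(pos, neg):
        others = [k for k in range(m) if k not in (i, j)]
        # Greedily pick S of size t - 1 whose two supersets were used least so far.
        best = min(
            itertools.combinations(others, t - 1),
            key=lambda S: max(usage[frozenset(S) | {i}], usage[frozenset(S) | {j}]),
        )
        I, J = frozenset(best) | {i}, frozenset(best) | {j}
        coeff[I] += 1  # e_i - e_j = 1_I - 1_J
        coeff[J] -= 1
        usage[I] += 1
        usage[J] += 1
    return coeff

def combine(coeff, m):
    """Evaluate the signed combination of indicator vectors."""
    vec = [0] * m
    for subset, c in coeff.items():
        for k in subset:
            vec[k] += c
    return vec

w = [2, -1, 0, -1, 0, 0]
coeff = decompose_zero_sum(w, t=3)
```

Here `combine(coeff, len(w))` recovers `w`, reflecting the identity $e_i - e_j = \mathbf{1}_{S \cup \{i\}} - \mathbf{1}_{S \cup \{j\}}$ used for each pair.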
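Lemma 5.3. For any $b_1, \ldots, b_N \in \{0, 1\}$ there exist $c_1, \ldots, c_N \in \mathbb{Z}$ such that $c_i \equiv b_i \pmod 2$, $|c_i| \le 7$ and $\|\sum_i c_i v_i\|_\infty \le 2$.
Proof. Choose $z_i \in \{-1, 0, 1\}$ independently with $z_i \equiv b_i \pmod 2$, uniformly when $b_i = 1$, and let $u = \sum_i z_i v_i$. Then $\mathbb{E}[\|u\|_2^2] = \sum_i b_i \|v_i\|_2^2 \le Nt$. Thus, with probability at least 1/2, $\|u\|_2 \le \sqrt{2Nt}$ and hence $\|u\|_1 \le \sqrt{2Ntm}$. Fix such a u. Next, we choose w ∈ L such that $\|u - 2w\|_\infty$ is bounded. If we only wanted that $w \in \mathbb{Z}^m$ we could simply choose $q \in \{0, 1\}^m$ with $q_i \equiv u_i \pmod 2$ and take w = (u − q)/2. In order to guarantee that w ∈ L, namely that $\sum_i w_i \equiv 0 \pmod t$, we change at most t coordinates in q by adding or subtracting 2. Thus, we obtain $q \in \{-2, -1, 0, 1, 2\}^m$ where $q_i \equiv u_i \pmod 2$ and set w = (u − q)/2 ∈ L. We have $\|u - 2w\|_\infty \le 2$.
Next, we apply Lemma 5.2 to w. We obtain a decomposition $w = \sum_i a_i v_i$. This implies that if we set $c_i = z_i - 2a_i$ then indeed $c_i \equiv b_i \pmod 2$ and $\|\sum_i c_i v_i\|_\infty = \|u - 2w\|_\infty \le 2$. To bound $|c_i|$, note that $\|w\|_1 \le \|u\|_1 / 2 + m$. We have by Lemma 5.2 that $|a_i| \le A$ for $A = 2\|w\|_1 / M + 2 \le (\sqrt{2Ntm} + 2m)/M + 2$, where $M = \binom{m-2}{t-1}$. For m large enough and 4 ≤ t ≤ m − 4 we have A < 4, hence $|a_i| \le 3$ and $|c_i| \le |z_i| + 2|a_i| \le 7$.
Proof of Theorem 5.1. Assume that $r_1, \ldots, r_N \ge 7$. By Lemma 5.3, there exist $c_i \in \mathbb{Z}$ such that $c_i \equiv r_i \pmod 2$, $|c_i| \le 7$ and $\|\sum_i c_i v_i\|_\infty \le 2$. For each i ∈ [N], we color $|c_i|$ of the copies of $v_i$ with $\mathrm{sign}(c_i) \in \{-1, +1\}$ and the remaining $r_i - |c_i|$ copies with alternating +1 and −1 colors (so that their contribution cancels, since $r_i - |c_i|$ is even). The total coloring produces exactly the vector $\sum_i c_i v_i$, which as guaranteed has discrepancy bounded by 2.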
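The final coloring step can be made concrete with a short sketch. The function name `signs_for_column` is hypothetical; it handles a single column value with multiplicity r and target coefficient c, under the assumptions $|c| \le r$ and $c \equiv r \pmod 2$ from the proof.

```python
def signs_for_column(r, c):
    """Signs for the r copies of one column so that the signs sum to c.
    Requires |c| <= r and c congruent to r modulo 2."""
    if abs(c) > r or (r - c) % 2 != 0:
        raise ValueError("need |c| <= r and c = r (mod 2)")
    head = [1 if c > 0 else -1] * abs(c)    # |c| copies colored sign(c)
    tail = [1, -1] * ((r - abs(c)) // 2)    # remaining copies cancel in pairs
    return head + tail

signs = signs_for_column(7, -3)
```

Summing the signs over all copies of $v_i$ gives $c_i$, so the colored columns add up to $\sum_i c_i v_i$.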