On the Probabilistic Degrees of Symmetric Boolean Functions

The probabilistic degree of a Boolean function $f:\{0,1\}^n\rightarrow \{0,1\}$ is defined to be the smallest $d$ such that there is a random polynomial $\mathbf{P}$ of degree at most $d$ that agrees with $f$ at each point with high probability. Introduced by Razborov (1987), upper and lower bounds on probabilistic degrees of Boolean functions --- specifically symmetric Boolean functions --- have been used to prove explicit lower bounds, design pseudorandom generators, and devise algorithms for combinatorial problems. In this paper, we characterize the probabilistic degrees of all symmetric Boolean functions up to polylogarithmic factors over all fields of fixed characteristic (positive or zero).

Our result. In this paper, we give an almost-complete understanding of the probabilistic degrees of all symmetric Boolean functions over all fields of fixed positive characteristic and characteristic 0. For each Boolean function f on n variables, our upper and lower bounds on pdeg(f) are separated only by polylogarithmic factors in n. We now introduce some notation and give a formal statement of our result. We shall use the notation [a, b] to denote an interval in R as well as an interval in Z; the distinction will be clear from the context. Throughout, fix some field F of characteristic p, which is either a fixed positive constant or 0. Let n be a growing integer parameter which will always be the number of input variables. We use sB_n to denote the set of all symmetric Boolean functions on n variables. Note that each symmetric Boolean function f : {0,1}^n → {0,1} is uniquely specified by a string Spec_f : [0, n] → {0, 1}, which we call the spectrum of f, in the sense that for any a ∈ {0,1}^n, we have f(a) = Spec_f(|a|).
Given f ∈ sB_n, we define the period of f, denoted per(f), to be the smallest positive integer b such that Spec_f(i) = Spec_f(i + b) for all i ∈ [0, n − b]. We say f is k-bounded if Spec_f is constant on the interval [k, n − k]; let B(f) denote the smallest k such that f is k-bounded.
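To make the two definitions concrete, here is a small sketch (our own illustration, not code from the paper) computing per(f) and B(f) from the spectrum, represented as a 0/1 list of length n + 1:

```python
# Illustrative sketch: per(f) and B(f) from Spec_f (a 0/1 list of length n+1).

def period(spec):
    """Smallest b >= 1 with spec[i] == spec[i+b] for all i in [0, n-b]."""
    n = len(spec) - 1
    for b in range(1, n + 2):
        if all(spec[i] == spec[i + b] for i in range(0, n - b + 1)):
            return b

def bounded_param(spec):
    """Smallest k such that spec is constant on the interval [k, n-k]."""
    n = len(spec) - 1
    for k in range(0, n // 2 + 2):
        window = spec[k:n - k + 1]
        if len(set(window)) <= 1:  # empty or constant window
            return k

# The parity spectrum 0,1,0,1,... on n = 6 has period 2,
# while the spectrum of Thr^2_6 is 2-bounded.
print(period([0, 1, 0, 1, 0, 1, 0]))         # 2
print(bounded_param([0, 0, 1, 1, 1, 1, 1]))  # 2
```

The search over b (resp. k) mirrors the definitions directly; it is quadratic time, which is fine for illustration.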
Standard decomposition of a symmetric Boolean function [Lu01]. Fix any f ∈ sB_n. Among all symmetric Boolean functions f′ ∈ sB_n such that Spec_{f′}(i) = Spec_f(i) for all i ∈ [⌈n/3⌉, ⌊2n/3⌋], we choose a function g such that per(g) is as small as possible. We call g the periodic part of f. Define h ∈ sB_n by h = f ⊕ g. We call h the bounded part of f.
We will refer to the pair (g, h) as a standard decomposition of the function f . Note that we have f = g ⊕ h.
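The decomposition above is effective; the following sketch (our illustration, with the convention that residues not seen in the middle third are set to 0, which the paper does not specify) finds a minimum-period g consistent with Spec_f on [⌈n/3⌉, ⌊2n/3⌋] and sets h = f ⊕ g:

```python
# Illustrative sketch of the standard decomposition (g, h) of f.
from math import ceil, floor

def standard_decomposition(spec):
    n = len(spec) - 1
    lo, hi = ceil(n / 3), floor(2 * n / 3)
    for b in range(1, n + 2):
        # b is feasible iff the middle-third values are consistent modulo b
        classes, ok = {}, True
        for i in range(lo, hi + 1):
            if classes.setdefault(i % b, spec[i]) != spec[i]:
                ok = False
                break
        if ok:
            # extend periodically to [0, n]; unseen residues default to 0
            g = [classes.get(i % b, 0) for i in range(n + 1)]
            h = [spec[i] ^ g[i] for i in range(n + 1)]
            return g, h

# Example on n = 9: f agrees with parity in the middle but not near the ends,
# so g is the parity spectrum and h is supported near the two ends.
g, h = standard_decomposition([1, 1, 0, 1, 0, 1, 0, 1, 0, 0])
print(g)  # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
print(h)  # [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
```

Note that h is indeed "bounded": its spectrum is nonzero only near weights 0 and n.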
In this paper, we prove the following upper and lower bounds for the probabilistic degrees of symmetric Boolean functions.
Theorem 3 (Upper bounds on probabilistic degree). Let F be a field of constant characteristic p (possibly 0) and n ∈ N be a growing parameter. Let f ∈ sB n be arbitrary and let (g, h) be a standard decomposition of f . Then we have the following for any ε > 0.
1. If per(g) = 1, then pdeg^F_ε(g) = 0. If per(g) is a power of p, then pdeg^F_ε(g) ≤ per(g) [Lu01]. Here, the Õ(·) hides polylogarithmic factors in n (and is independent of ε). When p is positive, we can replace the Õ(·) with O(·) in all the above bounds.
We obtain almost (up to polylogarithmic factors) matching lower bounds for all symmetric Boolean functions over all fields and all errors.
Theorem 4 (Lower bounds on probabilistic degree). Let F be a field of constant characteristic p (possibly 0) and let n ∈ N be a growing parameter. Let f ∈ sB_n be arbitrary and let (g, h) be a standard decomposition of f. Then for any ε ∈ [1/2^n, 1/3], we have
1. pdeg^F_ε(g) = Ω̃(√(n log(1/ε))) if per(g) > 1 and is not a power of p, and Ω̃(min{√(n log(1/ε)), per(g)}) otherwise,
where the Ω̃(·) hides poly(log n) factors (independent of ε).

Remark 5.
A natural open question following our results is to remove the polylogarithmic factors separating our upper and lower bounds. We remark that in characteristic 0, such gaps exist even for the very simple OR function despite much effort [MNV16,HS16,BHMS18]. Over positive characteristic, there is no obvious barrier, but our techniques fall short of proving tight lower bounds for natural families of functions such as the Exact Threshold functions (defined in Section 2).

Proof Outline
For the outline below, we assume that the field is of fixed positive characteristic p.
Upper bounds. Given a symmetric Boolean function f on n variables with standard decomposition (g, h), it is easy to check that pdeg_ε(f) = O(pdeg_ε(g) + pdeg_ε(h)). So it suffices to upper bound the probabilistic degrees of periodic and bounded functions respectively.
For periodic functions g whose period is a power of p, Lu [Lu01] showed that the exact degree of g is at most per(g). If the period is not a power of p, then we use the upper bound of Alman and Williams [AW15] that holds for all symmetric Boolean functions (as we show below, this is nearly the best possible).
For a t-constant function h (defined in Section 3), we use the observation that any t-constant function is essentially a linear combination of the threshold functions Thr^0_n, . . . , Thr^t_n (defined in Section 2), and so it suffices to construct probabilistic polynomials for Thr^i_n for i ∈ [0, t]. Our main technical upper bound is a new probabilistic degree upper bound of Õ(√(t log(1/ε)) + log(1/ε)) for any threshold function Thr^t_n. This upper bound interpolates smoothly between a classical upper bound of O(log(1/ε)) due to Razborov [Raz87] for t = 1 and a recent result of Alman and Williams [AW15] that yields O(√(n log(1/ε))) for t = Ω(n).
The proof of our upper bound is based on the beautiful inductive construction of Alman and Williams [AW15] that gives their above-mentioned result. The key difference between our proof and that of [AW15] is that we need to handle separately the case when the error ε ≤ 2^{−Ω(t)}. In [AW15], this is a trivial case, since any function on n Boolean variables has an exact polynomial of degree n, which is at most O(√(n log(1/ε))) when ε ≤ 2^{−n}. In our setting, the correct bound in this case is Õ(log(1/ε)), which is non-obvious. We obtain this bound by a suitable modification of Razborov's technique (for t = 1) to handle larger thresholds.
Lower bounds. Here, our proof follows a result of Lu [Lu01], who gave a characterization of the symmetric Boolean functions that have quasipolynomial-sized AC^0[p] circuits. To show circuit lower bounds for a symmetric Boolean function h, Lu showed how to convert a circuit C computing h into a circuit C′ computing either the Majority function or a MOD_q function (where q and p are relatively prime). Since both of these are known to be hard for AC^0[p] [Raz87, Smo87a], we get the lower bound.
Lu's basic idea was to use a few restrictions of h along with some additional circuitry to compute either Majority or MOD_q. These functions are also known to have large probabilistic degree (in fact, this is the source of the AC^0[p] lower bound), and so this high-level idea seems applicable to our setting as well. Indeed, we do use this strategy, but our proofs differ when it comes down to the details. As Lu's aim was to derive optimal circuit lower bounds for h, his reductions were tailored towards using as small an amount of additional circuitry as possible. Our focus, however, is to prove the best possible probabilistic degree lower bounds, so we would like our reductions to be computable by polynomials of small degree. This makes the actual reductions quite different.

Preliminaries
Some Boolean functions. Fix some positive n ∈ N. The Majority function Maj_n on n Boolean variables accepts exactly the inputs of Hamming weight greater than n/2. For t ∈ [0, n], the Threshold function Thr^t_n accepts exactly the inputs of Hamming weight at least t; similarly, the Exact Threshold function EThr^t_n accepts exactly the inputs of Hamming weight exactly t. Finally, for b ∈ [2, n] and i ∈ [0, b − 1], the function MOD^{b,i}_n accepts exactly those inputs a such that |a| ≡ i (mod b). In the special case that i = 0, we also write MOD^b_n.

Footnotes from the previous section:
5. This case comes up naturally in the inductive construction, even if one is ultimately only interested in the case when ε is a constant.
6. Recall that an AC^0[p] circuit is a constant-depth circuit made up of gates that can compute the Boolean functions AND, OR, NOT and MOD_p (defined above).
7. A restriction of a Boolean function is obtained by setting some of its input variables to constants in {0, 1}.
8. In an earlier version of this paper, we actually used Lu's reductions (and variants thereof) directly in the setting of probabilistic polynomials. This still works in certain parameter regimes because the additional circuitry itself has low probabilistic degree. However, in the setting of small error, this strategy seems to yield suboptimal results.
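All of these functions depend only on the Hamming weight; a minimal reference sketch (ours, for illustration):

```python
# Reference implementations, by Hamming weight, of Maj, Thr, EThr and MOD.

def maj(x):        return int(sum(x) > len(x) / 2)   # Maj_n
def thr(t, x):     return int(sum(x) >= t)           # Thr^t_n
def ethr(t, x):    return int(sum(x) == t)           # EThr^t_n
def mod(b, i, x):  return int(sum(x) % b == i)       # MOD^{b,i}_n

x = (1, 0, 1, 1, 0)  # Hamming weight 3 on n = 5
print(maj(x), thr(3, x), ethr(2, x), mod(3, 0, x))  # 1 1 0 1
```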
Fact 6. We have the following simple facts about probabilistic degrees. Let F be any field.

(Composition) For any Boolean function f on k variables and any Boolean functions g_1, . . . , g_k on a common set of m variables, let h denote the natural composed function f(g_1, . . . , g_k) on m variables. Then, for any ε, δ > 0, we have pdeg^F_{ε+kδ}(h) ≤ pdeg^F_ε(f) · max_{i∈[k]} pdeg^F_δ(g_i).

(Sum) Assume that f, g_1, . . . , g_k are Boolean functions on a common set of m variables such that f = Σ_{i∈[k]} g_i. Then, for any δ > 0, we have pdeg^F_{kδ}(f) ≤ max_{i∈[k]} pdeg^F_δ(g_i).

Some previous results on probabilistic degree
The following upper bounds on probabilistic degrees of OR and AND functions were proved by Razborov [Raz87] and Smolensky [Smo87a] in the case of positive characteristic and Tarui [Tar93] and Beigel, Reingold and Spielman [BRS91] in the general case. For the latter, we state a slightly tighter result that follows from [Bra10, Lemma 8].
Lemma 7 (Razborov's upper bound on probabilistic degrees of OR and AND). Let F be a field of characteristic p. For p > 0, we have pdeg^F_ε(OR_n) = pdeg^F_ε(AND_n) = O(log(1/ε)). (1) For any p, we have pdeg^F_ε(OR_n) = pdeg^F_ε(AND_n) ≤ 4⌈log n⌉ · ⌈log(1/ε)⌉. (2) Further, the probabilistic polynomials have one-sided error in the sense that on the all-0 input, they output 0 with probability 1.
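For concreteness, here is a Monte Carlo sketch of the standard form of Razborov's one-sided construction over F_2 (this is the textbook construction; the exact polynomial and constants behind Lemma 7 may differ). Pick k random subsets S_1, . . . , S_k of [n] and set P(x) = 1 − ∏_j (1 − Σ_{i∈S_j} x_i) over F_2; on the all-0 input every factor is 1, so P outputs 0 with probability 1, while on any x ≠ 0 each factor vanishes independently with probability 1/2, giving error at most 2^{−k} at degree k.

```python
# Sketch of Razborov's degree-k, one-sided probabilistic polynomial
# for OR over F_2 (textbook form; an illustration, not the paper's exact poly).
import random

def razborov_or(x, k, rng):
    n = len(x)
    prod = 1
    for _ in range(k):
        s = [i for i in range(n) if rng.random() < 0.5]  # random subset S_j
        prod = (prod * ((1 - sum(x[i] for i in s)) % 2)) % 2
    return (1 - prod) % 2

rng = random.Random(0)
n, k, trials = 20, 10, 2000
x = [1] + [0] * (n - 1)                       # a nonzero input; OR(x) = 1
errs = sum(razborov_or(x, k, rng) != 1 for _ in range(trials))
print(errs / trials <= 2 ** -k + 0.01)        # empirical error ≈ 2^{-10}
print(razborov_or([0] * n, k, rng))           # always 0 on the all-0 input
```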
We now recall two probabilistic degree lower bounds due to Smolensky [Smo87b,Smo93b], building on the work of Razborov [Raz87].
Lemma 8 (Smolensky's lower bound for close-to-Majority functions). For any field F, any ε ∈ (1/2^n, 1/5), and any Boolean function g on n variables that agrees with Maj_n on a (1 − ε) fraction of its inputs, we have pdeg^F_ε(g) = Ω(√(n log(1/ε))).

Lemma 9 (Smolensky's lower bound for MOD functions). For 2 ≤ b ≤ n/2, any F such that char(F) = p is coprime to b, and any ε ∈ (1/2^n, 1/(3b)), there exists an i ∈ [0, b − 1] such that pdeg^F_ε(MOD^{b,i}_n) = Ω(√(n log(1/(bε)))).

Remark 10. From the above lemma, it also easily follows that if b ≤ n/4, then a similar lower bound holds for every i ∈ [0, b − 1]. This is the usual form in which Smolensky's lower bound is stated. The above form is slightly more useful to us because it holds for b up to n/2.
We will also need the following result of Alman and Williams [AW15].

A string lemma
Given a function w : I → {0, 1}, where I ⊆ N is an interval, we think of w as a string from the set {0,1}^{|I|} in the natural way. For an interval J ⊆ I, we denote by w|_J the substring of w obtained by restriction to J.
The following simple lemma can be found, e.g. as a consequence of [JJJ80, Chapter I, Section 2, Theorem 1].
Corollary 13. Let g ∈ sB_n be arbitrary with . Then u = w, and the assumption uv = vw implies uv = vu. By Lemma 12, there exists a string z such that uv = z^k for some k ≥ 2, and therefore per(g) < b. This contradicts our assumption on b.

Upper bounds
In this section, we will first prove upper bounds on the probabilistic degree of a special class of symmetric Boolean functions that we call t-constant functions, and then use it to prove Theorem 3.

Upper bound on probabilistic degree of t-constant functions
Definition 14 (t-constant function). For any positive n ∈ N and t ∈ [0, n], a Boolean function f ∈ sB_n is said to be t-constant if Spec_f is constant on the interval [t, n]. The following observation is immediate.
We will prove an upper bound on the probabilistic degree of t-constant Boolean functions. For this, we first generalize the notion of probabilistic polynomial and probabilistic degree to a tuple of Boolean functions. This generalization was implicit in [AW15].
Definition 16 (Probabilistic poly-tuple and probabilistic degree). Let f = (f_1, . . . , f_m) : {0,1}^n → {0,1}^m be an m-tuple of Boolean functions and ε ∈ (0, 1). An ε-error probabilistic poly-tuple for f is a random m-tuple of polynomials P (with some distribution having finite support) from F[X_1, . . . , X_n]^m such that for every a ∈ {0,1}^n, Pr_P[P(a) = f(a)] ≥ 1 − ε. We say that the degree of P is at most d if P is supported on m-tuples of polynomials P = (P_1, . . . , P_m) where each P_i has degree at most d. Finally, we define the ε-error probabilistic degree of f, denoted pdeg^F_ε(f), to be the least d such that f has an ε-error probabilistic poly-tuple of degree at most d.
We make a definition for convenience.
The main theorem of this subsection is the following.
Theorem 18. For any positive n ∈ N and t ∈ [0, n], if T is an (n, t)-threshold tuple and ε ∈ (0, 1/3), then pdeg^F_ε(T) = Õ(√(t log(1/ε)) + log(1/ε)). As a corollary to the above theorem, we get an upper bound for the probabilistic degree of t-constant functions.
By the observation above, f can be written as g(Thr^0_n, . . . , Thr^t_n) for some polynomial g(Y_0, . . . , Y_t). We note that deg g = 1, so by Theorem 18 we get the claimed upper bound.

High-level outline of the proof. The basic strategy behind the inductive construction of probabilistic poly-tuples is due to Alman and Williams [AW15]. We describe the construction of an ε-error probabilistic polynomial for a single threshold Thr^t_n (the construction for a tuple is similar). Assume that, by induction, we already have probabilistic polynomials T_{m,t,ε} for Thr^t_m where m < n. The idea is to use T_{m,t,ε} for m < n to compute Thr^t_n(x). We do this by sampling: we sample a random subvector x̂ of length n/10 of x by sampling uniformly random entries of x with replacement. If the Hamming weight of x is "sufficiently far" from the threshold t, then the weight of x̂ is on the "same side" of t/10 as x is of t w.h.p. (say at least 1 − ε/4); in particular, in this case T_{n/10,t/10,ε/4} gives the right answer with probability 1 − ε/4 and we are done. However, if |x| is "not sufficiently far" from t, then we need to do something else: here, we simply interpolate a polynomial that outputs the right answer on these values (see Theorem 21 below). Finally, to check which of the "far" or "not far" cases we are in, we again use the inductive hypothesis on the subvector x̂, which again gives the right answer with probability 1 − ε/4. Putting these pieces together yields the ε-error probabilistic polynomial.
In the analysis of the construction above, the distance parameter (say θ) that determines "far" vs. "not far" comes from the concentration properties of Bernoulli random variables (see Lemma 20 below). In our setting, θ is roughly √(t log(1/ε)). In particular, to check that |x| is not much larger than t, we need to apply a probabilistic polynomial for the threshold function Thr^{t/10+θ}_{n/10} to the random vector x̂. Here, to keep the threshold parameter bounded by t, we need that log(1/ε) is not much larger than t, or equivalently that ε is not much smaller than 2^{−t}.
When ε does fall below 2^{−t}, we need to do something different, as the above inductive strategy fails. This case does not occur in [AW15], since there t = Ω(n), and when ε ≤ 2^{−n}, we can always use an exact polynomial representation of the threshold function (which has degree n = O(√(n log(1/ε)))). In our setting, though, we aim for a bound of Õ(log(1/ε)) in this case, which is non-trivial. To handle this case, we use a different construction, which is a modification of Razborov's probabilistic polynomial construction for the OR function (Lemma 7 above). This changes the base case of the induction and certain elements of the inductive analysis. Overall, though, we are able to use these ideas to obtain a probabilistic polynomial of degree Õ(√(t log(1/ε)) + log(1/ε)) (and only a constant-factor loss when the characteristic p > 0).
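The sampling step can be sanity-checked numerically. The following Monte Carlo sketch (our own illustration, with arbitrarily chosen parameters) verifies that when |x| exceeds t by a couple of multiples of θ ≈ √(t log(1/ε)), a uniform sample of n/10 coordinates (with replacement) lands above t/10 almost always:

```python
# Monte Carlo illustration of the "far from threshold" sampling step.
import math, random

def sample_weight(x, m, rng):
    """Weight of a random length-m subvector sampled with replacement."""
    return sum(x[rng.randrange(len(x))] for _ in range(m))

rng = random.Random(42)
n, t, eps = 1000, 100, 0.01
theta = 10 * math.sqrt(t * math.log(1 / eps))
w = int(t + 2 * theta)                 # weight well above the threshold t
x = [1] * w + [0] * (n - w)
good = sum(sample_weight(x, n // 10, rng) >= t / 10 for _ in range(500))
print(good / 500)                      # close to 1
```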

Proof of Theorem 18
Before we prove Theorem 18, we will gather a few results that we require. The following lemma is a particular case of Bernstein's inequality (Theorem 1.4, [DP09]).
Lemma 20. Let X_1, . . . , X_m be independent and identically distributed Bernoulli random variables with mean q, and let X = Σ_{i=1}^m X_i. Then for any θ > 0, Pr[|X − qm| > θ] ≤ 2 exp(−θ²/(2(mq(1 − q) + θ/3))).

We will also need the following polynomial construction.

Remark 22. In particular, the polynomial EX_{[a,b]} f may be interpreted as a polynomial over any field F satisfying the above property.
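An empirical illustration of a Bernstein-type tail bound (we use the standard form Pr[|X − qm| > θ] ≤ 2 exp(−θ²/(2(mq(1 − q) + θ/3))); the exact constants in Lemma 20 may differ):

```python
# Empirical check that the Bernoulli tail respects a Bernstein-type bound
# (standard textbook form; an illustration, not the paper's exact statement).
import math, random

rng = random.Random(7)
m, q, theta, trials = 1000, 0.3, 60, 4000
tail = sum(abs(sum(rng.random() < q for _ in range(m)) - q * m) > theta
           for _ in range(trials)) / trials
bound = 2 * math.exp(-theta**2 / (2 * (m * q * (1 - q) + theta / 3)))
print(tail <= bound)  # the empirical tail respects the bound
```

Here θ = 60 is about four standard deviations (√(mq(1 − q)) ≈ 14.5), so deviations are very rare.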
We will now prove Theorem 18.
For positive characteristic p, we prove that for any positive n ∈ N, t ∈ [0, n] and ε ∈ (0, 2^{−100}), any (n, t)-threshold tuple T has an ε-error probabilistic poly-tuple T of degree at most A_p √(t log(1/ε)) + B_p log(1/ε), for constants A_p = B_p = 6,400,000·p (we make no effort to optimize the constants). For p = 0, we prove a similar result with a degree bound of A_0 log n · √(t log(1/ε)) + B_0 log n · log(1/ε), for A_0 = B_0 = 64,000,000. This proves the theorem for ε ≤ 2^{−100}. To prove the theorem for all ε ≤ 1/3, we use error reduction (Fact 6) to reduce the error to 2^{−100} and then apply the result for small error.
The proof is by induction on the parameters n, t and ε. At any stage of the induction, given an (n, t)-threshold tuple with error parameter ε, we construct the required probabilistic poly-tuple by using the probabilistic poly-tuples (guaranteed by the inductive hypothesis) for suitable threshold tuples on n/10 inputs with error parameter ε/4. Thus, the base cases of the induction are as follows.
Suppose n ≤ r. Let Q_1, . . . , Q_m be the unique multilinear representations of T_1, . . . , T_m respectively. Then Q = (Q_1, . . . , Q_m) is an ε-error probabilistic poly-tuple with deg Q ≤ n ≤ r. This proves the claim in this case. Now suppose n > r. We first describe how to construct the probabilistic poly-tuple P in this case. Assume for now that char(F) = p > 0.
Further, we have |a| ≥ t_i for all i ∈ N_a. We will now show that P^(i)(a) = 1 with probability at least 1 − ε for all i ∈ N_a simultaneously. If |a| ≤ r, then again P^(i)_1(a) = 1 for all i ∈ N_a, and so P^(i)(a) = 1 with probability 1. Now suppose |a| ≥ r (note that in this case, N_a = [m]). Without loss of generality, assume t_1 ≤ · · · ≤ t_m = t. For the final inequality below, we use the fact that ℓ ≥ r/10 ≥ 640000 log(1/ε). Here, P_r is the unique multilinear polynomial representation of Thr^{t_i}_r, and for j ∈ [r], O_j is the 1/4-error probabilistic polynomial for OR_{S_j}, the OR function on the variables (X_k : k ∈ S_j), given to us by Lemma 7. One can verify that the degree in this case is bounded by 10r log n ≤ B_0 log n · log(1/ε). The rest of the analysis follows similarly, proving the base case when char(F) = 0.
We now consider an (n, t)-threshold tuple T = (T_1, . . . , T_m) = (Thr^{t_1}_n, . . . , Thr^{t_m}_n). Assume that ε > 2^{−t/160000}, since otherwise we can use the construction from the base case. We will now prove that T is an ε-error probabilistic poly-tuple for T.
Correctness of the Inductive Construction. We now check that the construction above gives an ε-error probabilistic poly-tuple for T. Fix any a ∈ {0,1}^n. Let â ∈ {0,1}^{n/10} be chosen as in the inductive construction.
Suppose |a| ≤ 2t. Let θ = 10√(t log(1/ε)). Applying Lemma 20 (where, for the third inequality, we use the fact that log(1/ε) ≤ t/160000), we get that ||â| − (|a|/10)| ≤ θ except with probability at most ε/4. By the induction hypothesis, the probability that the poly-tuple T′ disagrees with T′ at â is at most ε/4, and similarly for T′′_+ and T′′_−. Let G_a be the event that none of these error events occurs and that ||â| − (|a|/10)| ≤ θ; by a union bound, the event G_a occurs with probability at least 1 − ε. In this case, we show that T(a) = T(a), which will prove the correctness of the construction in the case that |a| ≤ 2t.
To see this, observe the following for each i ∈ [m].
. This is because the probabilistic polynomial T′_i agrees with the function T′_i at â, by our assumption that the event G_a has occurred. Further, we also have T′_i(â) = T_i(a), since ||â| − |a|/10| ≤ θ (by the occurrence of G_a), and hence |a| ≥ t_i if and only if |â| ≥ t_i/10.
To see this, we proceed as follows.

This implies that
Hence, when G_a occurs, we have T(a) = T(a), which proves the correctness of the construction.
Correctness of Degree. We need to argue that deg(T) satisfies the inductive claim. Suppose char(F) = p > 0, and recall that A_p = B_p = 6,400,000·p. We get the claimed bound, where the third inequality uses ε ≤ 2^{−100} and the final inequality uses t > 160000 log(1/ε). If char(F) = 0, then we get a similar degree bound with A_0 = B_0 = 64,000,000. This completes the argument for correctness of degree.

Upper bound on pdeg_ε(g)
This result is due to Lu [Lu01], but we sketch a proof here for completeness.
Recall that char(F) = p. When per(g) = 1, g is a constant function and the result is trivial. So assume that per(g) = p^t for some t ≥ 1. In this case, we show that g can be represented exactly as a linear combination of elementary symmetric polynomials of degree at most D = p^t − 1, which clearly proves the upper bound stated in the theorem. To see that every such g has such a representation, we proceed as follows.
Let V be the vector space of all functions f : {0, 1}^n → F that can be written as linear combinations of elementary symmetric polynomials of degree at most D. Since there is a 1-1 correspondence between multilinear polynomials and functions from {0,1}^n to F, the vector space V has dimension exactly D + 1 = p^t. Each function in V is a symmetric (not necessarily Boolean) function on {0,1}^n. Further, a standard application of Lucas' theorem (see [Luc78]) shows that each f ∈ V satisfies Spec_f(i) = Spec_f(i + p^t) for each i ≤ n − p^t; call this property (3). Now, consider the vector space W of all symmetric functions f′ : {0,1}^n → F that satisfy property (3). Clearly, V ⊆ W. Furthermore, the functions MOD^{p^t,i}_n (i ∈ [0, p^t − 1]) form a set of p^t linearly independent functions in W. Hence, the dimension of W is exactly p^t, and therefore W = V.
Since g ∈ W , we immediately see that g ∈ V and is hence a linear combination of elementary symmetric polynomials of degree at most D.

Lower Bounds
We now prove the lower bounds given in Theorem 4. Throughout this section, let F be any field. We write pdeg_ε(·) instead of pdeg^F_ε(·).
High-level outline of the proof. We use a proof strategy similar to that of Lu [Lu01], who gave a characterization of the symmetric functions computable by quasipolynomial-sized AC^0[p] circuits. To prove a lower bound on the probabilistic degree of a symmetric function f ∈ sB_n, we will use the known lower bounds for Majority (Lemma 8) and MOD_q functions (Lemma 9), where q is relatively prime to the characteristic p. The basic idea is to use a few "restrictions" of f to compute either a Majority function or a MOD_q function.
Here, a restriction of f is a function h ∈ sB_m obtained by setting a few inputs of f to 0s and 1s; that is, h(x) = f(x 0^a 1^{n−m−a}) for some a. From the definition of such an h, it is clear that for any δ > 0, pdeg_δ(h) ≤ pdeg_δ(f). We design a small number of such restrictions h_1, . . . , h_ℓ and a "combining function" P : {0,1}^ℓ → {0,1} such that either Majority or MOD_q can be written as P(h_1, . . . , h_ℓ).
The main restriction we will place on P is that it should be a low-degree polynomial. Given this, using Fact 6, we can bound the probabilistic degree of Majority or MOD_q in terms of deg(P) and the probabilistic degree of f. Using the known lower bounds on the probabilistic degrees of Majority and MOD_q, we then get lower bounds on the probabilistic degree of f.
The non-trivial part is to determine the hard function to use in the reduction and how to carry out the reduction with a polynomial P of low degree. Both of these depend on the structure of Spec_f. We give the details below.
We start with a preliminary lemma.
Then there exists S ⊆ F such that |S| ≤ log n and f_S = ∏_{f∈S} f has support size 1.

Proof. It is enough to prove that for every positive integer k ≤ ⌊log n⌋, there exists S_k ⊆ F such that |S_k| ≤ k and 1 ≤ |supp(f_{S_k})| ≤ n/2^k. We do so by induction on k.
If k = 1, consider any f ∈ F. If |supp(f)| ≤ n/2, then we choose S_1 = {f}; otherwise we have |supp(1 − f)| ≤ n/2 and we choose S_1 = {1 − f}. Now consider any k ∈ [⌊log n⌋ − 1] and assume the existence of S_k. If |supp(f_{S_k})| = 1, then we choose S_{k+1} = S_k, which satisfies the required conditions. Now suppose |supp(f_{S_k})| > 1. Choose any i, j ∈ supp(f_{S_k}) with i ≠ j. By the given condition, there exists f_{i,j} ∈ F such that f_{i,j}(i) = 1 and f_{i,j}(j) = 0. At the end of this process, we have a set S such that 1 ≤ |supp(f_S)| ≤ n/2^{⌊log n⌋} < 2, and hence |supp(f_S)| = 1. To prove the latter, we use Lemma 23.
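The halving process in the proof of Lemma 23 can be illustrated on a toy instance (ours, not from the paper): take F to be the "bit" functions on [0, 8) together with their complements, which is closed under complement and separates points, and greedily multiply in a function with the smallest nonempty intersection with the current support.

```python
# Illustrative sketch of the support-halving argument of Lemma 23.

def halve_support(F, n):
    supp = set(range(n))
    S = []
    while len(supp) > 1:
        # closure under complement guarantees some f whose support meets
        # supp in at most half of it (and in at least one point)
        best = min((f for f in F if any(f[i] for i in supp)),
                   key=lambda f: sum(f[i] for i in supp))
        S.append(best)
        supp = {i for i in supp if best[i] == 1}
    return S, supp

n = 8
F = [tuple((i >> k) & 1 for i in range(n)) for k in range(3)]  # bit functions
F += [tuple(1 - v for v in f) for f in F]                      # complements
S, supp = halve_support(F, n)
print(len(supp), len(S))  # support size 1, |S| <= log n = 3
```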
We first note that F satisfies the hypotheses of Lemma 23. Clearly, v ∈ F implies 1 − v ∈ F. Now consider any i, j ∈ [0, m − 1] with i ≠ j. If v(i) = v(j) for all v ∈ F, then in particular u_k(i) = u_k(j) for all k ∈ [0, m − 1]. This implies that u is periodic, which is a contradiction. Thus there exists v ∈ F such that v(i) = 0 and v(j) = 1. So by Lemma 23, there exists S ⊆ F with |S| ≤ log m such that ∏_{v∈S} v has support size 1. We now prove the lower bound on pdeg_ε(g).
First, suppose b is not a power of p. Define the function u : [0, b − 1] → {0, 1} by u(i) = Spec_g(i). Note that for u_j = u ∘ τ^j_b (j ∈ [0, b − 1]), as defined in the statement of Lemma 25, we have u_j(i) = Spec_g(j + i) (the latter is well defined as j + i < 2b < n). Further, as b is the period of g, Corollary 13 implies that u ≠ u_j for all j ∈ [b − 1]. This means that u is aperiodic (as defined above).
Let q be any prime divisor of b distinct from p. For each i ∈ [0, q − 1], define a function v_i : [0, b − 1] → {0, 1} by v_i(j) = 1 iff j ≡ i (mod q). Lemma 25 implies that for each i, there is a polynomial P_i(Y_0, . . . , Y_{b−1}) of degree at most log b such that P_i(u_0, . . . , u_{b−1}) = v_i.
Fix any i ∈ [0, q − 1] and consider the function G_i : {0,1}^{n−b} → {0,1} given by G_i(x) = P_i(g(x 0^b), g(x 0^{b−1} 1), . . . , g(x 0 1^{b−1})). Clearly, as all the inputs to P_i are b-periodic symmetric functions, the same holds for the function G_i. Further, for any j ∈ [0, b − 1], we see that G_i accepts x exactly when |x| ≡ i (mod q). This implies that G_i is in fact the MOD^{q,i}_{n−b} function. Note also that q ≤ b ≤ (n − b)/2. This will be relevant below, as we will apply Lemma 9 to one of the functions G_0, . . . , G_{q−1}.
We will show later how to find m, δ satisfying these properties. Assuming this for now, we first prove the lower bound on pdeg_ε(g). First, we fix some b-periodic symmetric function G : {0,1}^{n−b} → {0, 1} such that G agrees with Maj_{n−b} on all inputs of weight a ∈ ((n − b)/2 − 2√(m log(1/δ)), (n − b)/2 + 2√(m log(1/δ))) (it is possible to define such a b-periodic G because of property (P3) above).
While the above is possibly folklore, we do not know of a reference with a proof, so we give one here. Fix m, ε as above and let D denote pdeg_ε(OR_m). By setting some bits to 0, we also get pdeg_ε(OR_{m_1}) ≤ D, where m_1 = ⌊log(1/ε) − 1⌋. Fix a probabilistic polynomial P of degree at most D for OR_{m_1}. For every x ∈ {0,1}^{m_1}, we have Pr_P[P(x) ≠ OR_{m_1}(x)] ≤ ε < 1/2^{m_1}.
By a union bound, there is some polynomial P in the support of the probability distribution underlying P that agrees with the function OR_{m_1} everywhere. However, the unique multilinear polynomial representing OR_{m_1} has degree m_1. Hence, deg(P) ≥ m_1, and so D ≥ m_1 = Ω(log(1/ε)), concluding the proof of (4).
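The fact that the multilinear representation of OR_m has degree exactly m can be verified by computing its top coefficient via the Möbius transform coef(S) = Σ_{T⊆S} (−1)^{|S|−|T|} f(T); for OR, the coefficient of the full monomial is (−1)^{m−1} ≠ 0. A small sketch (ours, for illustration):

```python
# Top multilinear coefficient of a Boolean function via the Möbius transform.
from itertools import combinations

def top_coefficient(f, m):
    """Coefficient of x_1 * ... * x_m in the multilinear representation of f."""
    total = 0
    for k in range(m + 1):
        for T in combinations(range(m), k):
            x = [0] * m
            for i in T:
                x[i] = 1
            total += (-1) ** (m - k) * f(x)
    return total

m = 5
print(top_coefficient(lambda x: int(any(x)), m))  # (-1)^(m-1) = 1
```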
We now prove the lower bound on pdeg ε (h) from Theorem 4.
Lemma 27 now implies the lower bound.

Lower bound on pdeg_ε(f)
We start with a slightly weaker lower bound on pdeg_ε(f) that is independent of h. We will now use G to construct the Majority function on Θ(m) inputs. We assume that a ≤ 3m/2 (the other case is similar). Let m_1 = ⌊m/2⌋ and define G_i ∈ sB_{m_1} for i ∈ [0, m_1] by G_i(x) = G(x 1^{a−i} 0^{3m−a+i−m_1}).