On the Spectral Expansion of Monotone Subsets of the Hypercube
Abstract
We study the spectral gap of subgraphs of the hypercube induced by monotone subsets of vertices. For a monotone subset of density , the previous best lower bound on the spectral gap, due to Cohen [14], was , improving upon the earlier bound established by Ding and Mossel [16]. In this paper, we prove the optimal lower bound . As a corollary, we improve the mixing time upper bound of the random walk on constant-density monotone sets from , as shown by Ding and Mossel, to . Along the way, we develop two new inequalities that may be of independent interest: (1) a directed -Poincaré inequality on the hypercube, and (2) an “approximate” FKG inequality for monotone sets.
Keywords and phrases:
Random walks, mixing time, FKG inequality, Poincaré inequality, directed isoperimetryCategory:
RANDOMFunding:
Renato Ferreira Pinto Jr.: Supported by an NSERC Canada Graduate Scholarship.Copyright and License:
2012 ACM Subject Classification:
Mathematics of computing Markov processes ; Theory of computation Random walks and Markov chainsAcknowledgements:
We thank Lap Chi Lau for many helpful insights and discussions on spectral theory for directed graphs, and Esty Kelman for early discussions in the development of this project.Editors:
Alina Ene and Eshan ChattopadhyaySeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Suppose is a good expander graph, so that a random walk on the vertices of is fast mixing, i. e. converges quickly to its stationary distribution. When is a random walk on a subgraph of also fast mixing? More precisely, what kinds of subgraph restrictions preserve good expansion?
This paper studies the case of the hypercube graph , where vertices are connected by an edge if and only if they differ in exactly one coordinate. Recall that the lazy random walk on has mixing time . Given a subset of vertices, we consider the random walk on censored to .
Definition 1 (Censored random walk, [16]).
Given , the random walk on censored to is defined as follows. On state , sample uniformly at random and let be the vertex obtained by flipping the -th bit of . Then
-
1.
If , flip a coin and either stay at or move to (each with probability ).
-
2.
If , stay at (in which case we call this a censored step).
Without further guarantees on , the censored random walk may mix well or extremely poorly even when is large and connected, as the following two examples illustrate:
Example 2 (Subcube).
Let be a set of indices, and let be the subcube given by the vertices satisfying for all . Then the censored random walk is essentially a random walk on the smaller cube , where , except that only an -fraction of the transitions are not censored. Thus the censored random walk has mixing time .
Example 3 (Middle slice bridge).
Let be an arbitrary vertex with Hamming weight , and consider the set . A spectral argument shows that the mixing time of the censored random walk is exponential in .
Thus, it is natural to ask: what properties of ensure fast mixing? In [16], Ding & Mossel initiated the study of random walks censored to monotone sets111A set is called monotone if implies whenever , where the latter denotes the natural partial order on the hypercube: if for every . and showed that, when is not too small, monotonicity implies fast mixing. Concretely, letting denote the uniform distribution on , they proved
Theorem 4 ([16, Corollary 1.2]).
Let be a non-empty monotone set. Then the random walk on censored to has mixing time
When the density is a constant, the above implies a mixing time bound of . In particular, for the uncensored special case , this result only yields an upper bound of on the mixing time, versus the optimal ; this suggests the potential for improving upon Theorem 4, and indeed [16] asked the following question.
Question 5 ([16, Question 1.1]).
Suppose for some constant . Is it true that ?
Our main result makes progress on this question by showing an mixing time bound for monotone sets of constant density.
Theorem 6.
Let be a non-empty monotone set. Then the random walk on censored to has mixing time
1.1 Spectral gap
It is well-known that the mixing time of a Markov chain is related to the spectral gap of its generator (see e.g. [35]), or equivalently the spectral expansion of the underlying graph. The Poincaré inequality (see e.g. [39]) for the hypercube states that the spectral expansion of the (lazy) hypercube is exactly , which implies the (non-tight) mixing time bound for the lazy random walk on .
Thus, a natural question related to mixing under censoring is the robustness of the classical Poincaré inequality under vertex removal from . Specifically, does the spectral expansion of remain on the order of if only a small fraction of the vertices are removed, or does it exhibit a significant deviation? Example 3 demonstrates that the spectral expansion can shrink to exponentially small values if the removed set is arbitrary. Our goal is to show that when the removed set of vertices is monotone (and not too large), the spectral expansion remains at least on the order of .
For the purpose of clearer comparison with the classical Poincaré inequality, we introduce the Dirichlet form of a function on . Note that in the case , the following definition is exactly the “influence” [39, Definition 2.27] of the function .
Definition 7.
Fix a monotone set . For all , we define222The energy functional defined here differs from the standard Dirichlet form of the censored random walk on by a factor of . This normalization is chosen to better align with the notion of total influence in the analysis of Boolean functions.
Here denotes the binary string obtained by flipping the -th bit of .
We can now state our “robust” version of the Poincaré inequality.
Theorem 8.
Let be a non-empty monotone set. We have for all
Here stands for the variance of where is a uniformly random element of .
Note that in the case , the above theorem recovers the Poincaré inequality on the hypercube. We remark that Theorem 6 follows directly from Theorem 8 due to standard Markov chain theory (e.g. [35, Theorem 12.4]), so the rest of the paper focuses mainly on proving Theorem 8.
Theorem 8 can also be stated as a lower bound on the spectral gap of – the subgraph of induced by , with a self-loop added to vertex for each edge of the hypercube with and (which counts as toward the degree of ). For convenience of reference, in this paper we define the spectral gap using the language of Theorem 8.
Definition 9.
Let be a monotone set with at least 2 elements. We define
where ranges over all non-constant functions from to .
Now Theorem 8 can be stated as , for monotone sets with .
1.2 Proof overview: previous work
We begin by briefly describing the proof of Theorem 4 in [16]. The proof in [16] also analyzes the spectral expansion of , achieving the lower bound .
By Cheeger’s inequality, to obtain a lower bound on , it suffices to lower bound the bottleneck ratio
where denotes the set of edges of with and . By an isoperimetric inequality of the hypercube, there is good lower bound on the number of boundary edges connecting a vertex in to a vertex in . However, it is not immediately clear how many of these edges actually lead to . The crucial observation of [16] is that if an edge from a vertex goes “upward” – that is, if its other endpoint satisfies – then by the monotonicity of , we must have .
Coincidentally, there is a “directed isopetrimetric inequality” [25, Theorem 2], developed by the property testing community, which provides a lower bound on exactly the number of such upward boundary edges. Specifically, it gives a lower bound on the number of edges connecting a vertex to a vertex with in terms of the distance of to monotonicity. The precise notion of distance is less important than the key fact that, when is not too small, at least one of and must be far from any monotone set – i.e., it must have a large distance to monotonicity – due to the FKG inequality [22]. As a result, we can lower bound the number of upward boundary edges from either or . Since both sets of upward edges are subsets , we can thus arrive at a lower bound on and hence .
The work [16] actually achieves the optimal bound on the bottleneck ratio. However, this only translates to a quadratically worse bound for the spectral expansion, due to the loss incurred by applying Cheeger’s inequality. One natural idea is to avoid using Cheeger’s inequality by directly bounding the spectral gap . In this direction, [14] used a canonical path argument to show the bound , which improves upon Theorem 4 by a factor of . However, this improvement is only effective when and the dependence on remains suboptimal.
1.3 Proof overview: our work
At first glance, the proof in [16], as described in the previous subsection, appears to heavily depend on the discrete nature of the bottleneck ratio. In our view, a key conceptual contribution of this work is that the arguments in [16] can be adapted to the setting. While the discrete setting leads to the bottleneck ratio, in the setting, the corresponding arguments directly lead to the spectral expansion as stated in Theorem 8. For a full set of analogies, see Table 1.
| The discrete setting | The setting |
|---|---|
| subset | function |
| the complement set | the function |
| bottleneck ratio | spectral gap |
| directed isoperimetric inequality [25] | directed -Poincaré inequality (Theorem 12) |
| classical FKG inequality [22] | approximate FKG inequality (Theorem 14) |
In contrast to [16], where the two main inequalities used in the proof – the directed isoperimetric inequality from [25] and the FKG inequality from [22] – are classical results, in our settings we have to formulate and prove new versions of these inequalities, which may be of interest on their own.
Directed Poincaré inequality
As indicated in Section 1.2, directed isoperimetric inequalities aim to lower bound the number of “upward boundary edges” from a set to in terms of the “distance” of to monotonicity; see Section 1.6 for more background.
For our application, we require a directed isoperimetric inequality in the setting, which is the setting associated with the classical Poincaré inequality and the spectral gap. The first step is to define, for any , its -distance to monotonicity and its “upward boundary edges”.
Definition 10 (Distance to monotonicity).
For a function , we define
where ranges over all monotone increasing functions from to .
Definition 11 (Upward boundary).
For all , we define
Here for and , stands for the string .
We are now ready to state our directed -Poincaré inequality.
Theorem 12 (Directed Poincaré inequality).
For all functions , we have
Approximate FKG inequality
The classical FKG inequality of [22] states that if are monotone increasing functions and is a uniformly random element of , then the random variables and are nonnegatively correlated. It is well-known that this statement holds for increasing functions over a broader class of partially ordered sets (posets). In our proof, we crucially need a lower bound on the correlation ratio of any two increasing functions , where the set is partially ordered by the natural partial order of the hypercube. However, it is easy to see that the FKG inequality does not generally hold on this poset. Thus, we seek an “approximate” version of the FKG inequality, where we are content with a correlation ratio bounded away from , rather than necessarily nonnegative.
Definition 13 (Approximate FKG ratio).
Fix a monotone set with at least 2 elements. We define the approximate FKG ratio of the poset to be
where and range over all non-constant monotone increasing functions from to . Here, stands for the covariance of the random variable pair where is a uniformly random element of .
Theorem 14 (Approximate FKG inequality).
For any monotone set with at least 2 elements, we have .
1.4 The case of small : fast mixing requires good FKG ratio
The bound in Theorem 8 gives only a spectral gap bound for the random walk censored to . The dependence on is clearly optimal: even in the case , the spectral gap is exactly . The next example shows that the asymptotic dependence on is also optimal.
Example 15 ([16, Example 1.3]).
Assume and consider , the union of two subcubes. In this case, . Let be defined by if and otherwise. Then and , so . Standard Markov chain theory (e.g. [35, Theorem 7.4]) shows that the mixing time of the random walk censored to is exponentially large in .
Remarkably, Example 15 is also where the approximate FKG inequality fails badly – if we let be the union of two subcubes as in Example 15 and consider the indicator functions of the two subcubes, it is easy to see that they are both increasing functions on the poset but are very anti-correlated (i. e. the approximate FKG ratio is very close to ).
Our results actually reveal that, when is a monotone set, torpid mixing happens if and only if the approximate FKG ratio of is close to .
Theorem 16.
Let be a monotone set with at least 2 elements. Then for all functions , we have .
Theorem 17.
Let be a monotone set with at least 2 elements. Then for some non-constant function , we have .
The two theorems above imply that , which means the approximate FKG ratio of characterizes the spectral gap of up to a factor of .
Remark 18.
The case demonstrates that the lower bound is tight. Moreover, there are examples indicating that the upper bound is tight up to a constant factor – for instance, when .
1.5 Open problems
Our work leaves two avenues for potential improvement, roughly corresponding to two regimes in the size of the set . We discuss each of these directions in turn.
Large
When for some fixed constant , we establish the tight asymptotic bound . However, this only yields the mixing time bound , which does not resolve Question 5. One way to establish an mixing time bound would be to prove a log-Sobolev inequality instead of an -Poincaré inequality. It is plausible that our techniques could be further adapted to establish a log-Sobolev inequality, similar to how we extended the argument of [16] from the discrete setting to the setting. Are there analogous versions of the directed isoperimetric inequality and the approximate FKG inequality in the log-Sobolev setting? We leave these as open questions.
Small
When no additional structure is imposed, the random walk censored to may mix very slowly if (e. g. Example 15). In many problems of interest, however, possesses some form of structure, and the goal is to obtain a good bound on the mixing time, aiming for an efficient approximate sampling algorithm. For example, when is a halfspace, i. e. defined by for nonnegative numbers , [38] proves a mixing time bound of which yields a sampling algorithm for 0-1 knapsack solutions.
Our work (Theorems 16 and 17) reveals that the deciding factor for whether rapid mixing holds is the approximate FKG ratio of rather than the size of . However, we do not know how to leverage additional structure of in a direct study of its approximate FKG ratio, and we leave the development of new tools for this purpose as a direction for future work.
1.6 Related work
Mixing time of censored random walks
Markov Chain Monte Carlo methods are the object of extensive study in mathematics, statistical physics and theoretical computer science, and the question of mixing time of random walks lies at the core of algorithms for approximate sampling and counting; see e. g. [29, 27, 37, 35]. In settings featuring combinatorial structure such as in sampling matchings, independent sets, or spanning forests of a graph, or their natural generalizations to the more algebraic setting of matroids, a basis exchange or down-up random walk is usually employed, and spectral arguments are used to bound the mixing time; see e. g. [30, 2, 15, 1].
Our work focuses on the setting where the set may not enjoy such rich structure, and instead is only guaranteed to be monotone. As discussed above, our results improve upon the spectral gap and mixing time bounds shown by the previous works of [16, 14]. In the special case where the monotone set is additionally promised to contain every with Hamming weight at least , i. e. the middle layers of the hypercube, one may expect the censored and uncensored random walks to behave similarly, and indeed in this case [36] gave the optimal bound for the spectral gap and log-Sobolev constant of the censored random walk, which implies the optimal mixing time bound.
Besides combinatorial or algebraic structure, one may also ask what geometric structure affords fast mixing. As mentioned above, [38] studied the censored random walk when the set is a halfspace, i. e. for nonnegative numbers , which corresponds to 0-1 knapsack solutions. Intuitively, such set should not contain bottlenecks even if it is very small, and indeed [38] showed a mixing time bound of . By our Theorem 17, this also gives an inverse polynomial lower bound on the quantity . On the other hand, a lower bound of holds for the mixing time [38], and closing this gap is an interesting open problem333In a different line of investigation, a series of works has explored approaches for approximately sampling and counting knapsack solutions using dynamic programming [17, 26, 42, 24, 41, 18]; the best results in this direction are and -time algorithms for approximately sampling (depending on the model of computation) [17, 24, 41], and a (subquadratic) -time algorithm for approximately counting knapsack solutions [18]. However, these results do not directly say anything about the mixing time of the random walk on knapsack solutions..
Directed isoperimetric inequalities
Directed versions of isoperimetric inequalities such as the classical Poincaré inequality and Talagrand’s inequality [43] have emerged over the last couple of decades as a key tool in the field of property testing. Since the introduction of the problem of monotonicity testing [25] and especially since the work of [11], directed isoperimetric inequalities have unlocked new results on the query complexity of testing monotonicity of Boolean functions over discrete domains such as the hypercube and hypergrid [31, 7, 40, 5, 6, 10], and, more recently, real-valued functions over the hypercube [8] and the continuous cube [20]. In recent work, [12] proved such an inequality toward testing monotonicity of probability distributions over the hypercube, using certain conditional samples from that distribution. To the best of our knowledge, the work of [16] was the first application of a directed isoperimetric inequality (i. e. the inequality of [25] for Boolean functions) outside of property testing, and the present work seems to be the first application of a real-valued directed isoperimetric inequality outside of property testing.
The specific Poincaré inequality we prove in Theorem 12 is most closely related to the following previous developments. The work of [19] introduced the systematic study of directed -Poincaré inequalities for monotonicity testing, and proved an version of our Theorem 12; in the language of that paper, Theorem 12 is a directed -Poincaré inequality, whereas [19] proved an inequality. Another similar inequality for real-valued functions was proved by [8], who considered the Hamming distance as opposed to distance. An inequality (i. e. the same flavor as ours) was proved for functions over the continuous cube in [20], and our proof via the study of a dynamical system is directly inspired by theirs. Most recently, [12] proved a directed -Poincaré inequality444Compared to Theorem 12, that inequality uses the as opposed to distance, and takes a square-root inside the expectation operator in our definition of in Definition 11. By Jensen’s inequality, the square root of the left- and right-hand sides of our inequality are respectively larger than the left- and right-hand sides of the inequality of [12], so the two results are not immediately comparable. for real-valued functions on the hypercube, by extending the result of [31] for Boolean functions via a thresholding argument of [4]. There does not seem to be a trivial reduction between our inequality and the foregoing results.
Organization of the paper
It is clear that Theorems 14 and 16 together imply Theorem 8, and we recall that standard Markov chain theory ([35, Theorem 12.4]) derives Theorem 6 from Theorem 8.
Section 2 presents a proof of the approximate FKG inequality for large monotone sets (Theorem 14). Section 3 is where the heart of the argument of [16] is carried out in the setting. Section 3 demonstrates that the directed isoperimetric inequality of (Theorem 12) implies the undirected isoperimetric inequality of the monotone subset (Theorem 16), and why the approximate FKG ratio of is important for this implication. Sections 4 and 5 are devoted to proving the directed -Poincaré inequality (Theorem 12).
We defer the proof of Theorem 17, which is logically independent from the proof of the main result Theorem 6, to the full version of the paper.
2 Approximate FKG inequality
The goal of this section is to prove Theorem 14, where we need to lower bound the correlation ratio between two monotone increasing functions on . We first note that the case where the functions take values in is easy. Indeed, we have the following simple lemma.
Lemma 19.
Assume that are monotone increasing functions. Then we have .
Proof.
Let and . By the monotonicity of and , the sets and are both monotone subsets of the hypercube . By the classical FKG inequality [22] we know that . Therefore,
It is straightforward to deduce from Lemma 19 that for monotone increasing functions , the desired approximate FKG inequality
holds.
The main challenge in Theorem 14 lies in extending this idea to real-valued functions. In fact, the problem can be reduced to proving the following statement, which involves purely random variables rather than any structural property of the partially ordered set .
Theorem 20.
Let be a pair of real-valued random variables with bounded second moment. Suppose there is a constant such that for all ,
| (1) |
then we must have
| (2) |
Proof of Theorem 14 assuming Theorem 20.
Let be a uniformly random element of and let and . Thus , , and .
Now for each pair of , if we define by
since they are clearly monotone increasing 0/1-valued functions, we can apply Lemma 19 to and to deduce that . If , then and the conclusion follows from the classical FKG inequality. If , we apply Theorem 20 to the random variable pair with constant , which yields exactly the desired conclusion.
The remainder of this section is devoted to proving Theorem 20, which is surprisingly nontrivial. To illustrate the complexity of this inequality, we note that equality in (2) holds for a wide range of joint distributions of beyond the case captured by Lemma 19, i.e. where and take only two possible values.
2.1 A symmetric model
A key challenge in Theorem 20 lies in its lack of “centrosymmetry” with respect to . While the assumption (1) does not remain invariant under the substitution and , the conclusion is unaffected by such substitutions. This raises an intriguing question: what is the “symmetric” information inherent in (1) that leads to the conclusion?
In this subsection, we present an approach that effectively extracts the “symmetric” information from (1). To this end, we first define two Borel measures on induced by and .
Definition 22.
Let be the Borel-measurable map defined by , and then for each Borel set , let be the Lebesgue measure of the inverse image . The countable additivity of easily follows from the countable additivity of the Lebesgue measure.
The measure is referred to as the push-forward of the Lebesgue measure on by the map . Similarly define the Borel measure on to be the push-forward of the Lebesgue measure by the map .
The definition of push-forward measures naturally leads to the following “change of variable” formula, which is a standard fact in measure theory.
Proposition 23 ([9, Theorem 3.6.1]).
Suppose is a Borel-measurable map from to , and suppose is the push-forward of the Lebesgue measure under . Then for any Borel-measurable function , we have
We then define the reverse of a measure, which corresponds to the substitution .
Definition 24.
If is a Borel measure on , we let be the Borel measure on defined by , for Borel subsets of .
The following definition is the crucial tool in our proof of Theorem 20.
Definition 25.
Fix a constant . We define the operator by
for Borel measures on . For , we omit the subscript and write .
The next two propositions demonstrate that the operator is able to capture the variances and covariance of and . Proposition 27 is the key place where “symmetric” information is extracted from the condition (1).
Proposition 26.
We have . Similarly, .
Proof.
Writing expected values as integrations of cumulative distribution functions (the “layer cake representation”), we have
where the fourth equality above follows from Proposition 23.
Proposition 27.
Assuming (1), we have .
Proof.
In a similar way to the proof of Proposition 26, we have
| (3) |
Note that on one hand, by (1) we have
| (4) |
On the other hand, by union bound we have
| (5) |
Plugging (4) and (5) into (3), we have
where the first equality above follows from Proposition 23.
We can now reduce Theorem 20 to the following more “symmetric” lemma, whose proof we defer to the full version.
Lemma 28.
For any two Borel measures , and any constant we have
Proof of Theorem 20 assuming Lemma 28.
Using Proposition 27, Lemma 28 and Proposition 26 successively, we have
3 From directed to undirected isoperimetry
In this section, we provide a proof of Theorem 16 assuming Theorem 12. Throughout the section, we fix a monotone set with at least 2 elements.
3.1 Domain extension
Since the target result, Theorem 16, focuses solely on the subset of the hypercube, while Theorem 12 applies only to functions defined on the entire hypercube, we first introduce a simple method for extending function domains to the whole hypercube.
Definition 29.
We define an operator that extends any function to the function defined by
By defining the value of the function outside of the original domain to be sufficiently small, the extension operator enjoys the following two useful properties that allow us to access the power of Theorem 12.
Proposition 30.
For every function we have
Proof.
See the full version.
Proposition 31.
For every function , there exists a monotone increasing function such that , where the -norm is the norm in the inner product space .
Proof.
Since the collection of all monotone increasing real-valued functions on form a closed set in the Euclidean space , there exists a monotone increasing function such that , where the -norm is the norm in the space . Now note that the restriction is a monotone increasing function on . Therefore,
3.2 Correlation analysis
In this subsection, we lay some groundwork about correlation of functions (or equivalently, random variables) that will help prove Theorem 16. We begin with the following natural definition of correlation ratios.
Definition 32.
For non-constant functions , we define
The following triangle-inequality-type lemma is going to be important in the proof of Theorem 16. Conceptually, the lemma says that if functions and on are not very correlated with each other (that is, is bounded away from 1), then cannot be very correlated with both and at the same time. In particular, we will later use the lemma in the case where is a monotone increasing function and is a monotone decreasing function, which cannot be very correlated if is bounded away from .
Proposition 33.
Consider three non-constant functions . We have
Proof.
We may without loss generality assume that . In this case, , and .
If then the conclusion trivially holds since . Similarly if , the conclusion is also trivial. In the following, we assume that and .
Consider the matrix
For each vector , we know . So is a positive semi-definite matrix. This means , and we can expand it into
| (6) |
If , then (6) implies and we arrive at the conclusion. In the following we assume .
Expanding the Cauchy-Schwarz inequality yields
| (7) |
Multiplying both sides of (7) by and then adding it to (6), we get the desired conclusion .
The following definition serves to interpret correlation ratios in terms of distances.
Definition 34.
For functions , we define
where the -norm is the norm in the inner product space .
Proposition 35.
Consider two non-constant functions . We have
Proof.
Note that
| (8) |
If , then , and the quadratic polynomial in the right hand side of (8) is minimized at . Therefore , as desired.
3.3 Proof of Theorem 16
We are now ready to prove Theorem 16 assuming Theorem 12.
Proof of Theorem 16 assuming Theorem 12.
We have
| (9) |
for some monotone increasing functions . If is non-constant, we pick to be . If is constant, we pick an arbitrary non-constant increasing function . In either case, we trivially have
Similarly we pick a non-constant increasing function such that . We can then continue from (9) and have
4 Spectral theory and heat flow for directed graphs
In this section, as a first step toward proving our directed Poincaré inequality for the hypercube (Theorem 12), we first set up a framework that applies to the more general case of directed weighted graphs. Specifically, we revisit and extend the study of directed analogues of classical concepts from spectral graph theory such as the Laplacian operator, the Dirichlet energy, and the heat flow; define a directed notion of spectral gap for weighed directed graphs; and show that bounding this dynamical spectral gap suffices for proving a directed Poincaré inequality. Then, in the next section, Theorem 12 will follow as an application once we establish a bound on the directed spectral graph of the directed hypercube graph.
Prior work on spectral theory for directed graphs
There has been extensive prior work developing spectral graph theory beyond the classical setting of undirected graphs, toward capturing directed graphs and hypergraphs. Early work of [21, 13] associated a certain Hermitian matrix with each directed graph, and showed a Cheeger-type inequality based on the eigenvalues of that matrix, and many subsequent works have built upon that foundation; we refer to the recent thesis [45] for a thorough review, and here we mention two recent lines of work that are closest to our setting. One line of works [33, 34, 45] has developed a theory of reweighted eigenvalues capturing expansion properties of directed graphs and hypergraphs, proved Cheeger inequalities for these settings, and devised efficient algorithms for graph partitioning. Another line of works [47, 48, 23, 28] has pursued similar goals by analyzing a nonlinear Laplacian operator and the heat equation associated with it.
While our interest in a spectral theory for directed graphs is related to these previous works (and indeed we will build upon the approach of [47]), our focus is slightly different. In a nutshell, while prior works have focused on spectral characterizations of good expansion of a directed graph as captured by directed versions of Cheeger inequalities (for edge conductance or vertex expansion) and mixing time of random walks, our focus will be on the quality of as the “substrate” for a dynamical process; we will consider a good directed spectral expander if it affords fast convergence for that process. In particular, our perspective allows for directed acyclic graphs to be considered good expanders, which is a stark departure from prior perspectives – as we briefly explain next.
Indeed, a central focus of prior works has been to establish Cheeger inequalities of the type , where denotes a relevant second eigenvalue related to the directed weighted graph , and the edge conductance of is
where denotes the outgoing edge boundary of and denotes the total weighted degree of all vertices in . Now, if is not strongly connected, then in general there exists a set with positive volume but no outgoing edges, which makes and thus zero. In particular, this is the case for the directed hypercube graph which we are interested in, so if we hope to show a non-trivial directed Poincaré inequality via a spectral argument, such a quantity will not do.
Our approach
The type of spectral theory for directed graphs we study in this section was first developed by [47] in the context of network analysis. In that work, [47] defined a nonlinear Laplacian operator acting on real-valued functions defined on the vertices of a directed graph, showed that this operator induces a dynamical process that is a directed analogue of the classical heat flow on graphs, proved that this operator has nontrivial eigenvalues, and established a Cheeger inequality for this setting. While [47] focused on the implications of directed spectral theory for graph partitioning and related problems in network analysis, we focus on the dynamical properties of the heat flow on directed graphs – namely its convergence to a monotone limit, and its connection to the directed Poincaré inequality.
Let us briefly motivate and preview the main ideas in our argument. In classical spectral graph theory, given a graph , the following four concepts play a central role:
-
1.
The Laplacian operator , which acts on function by outputting another function .
-
2.
The Dirichlet energy functional , which associates with each an energy measuring the “local variance” of along edges of .
-
3.
The heat flow semigroup , which captures a dynamical process which starts at some initial state and has its rate of change governed by the Laplacian: . The heat flow informally “sends mass” along each edge of from the vertex with higher -value to the vertex with lower -value, causing the system to converge to an equilibrium state.
-
4.
The Poincaré inequality, which states that . The best constant is called the spectral gap of .
The appearance of the Poincaré inequality above hints at the relevance of this theory to our goal of proving a directed Poincaré inequality, and we mentioned above that our strategy toward this goal will be to define a directed version of the spectral gap. As a motivation for this strategy, we recall that the spectral gap ties together essentially all of the elements of the list above; indeed, as summarized in [46, Theorem 2.18], the following are equivalent given a constant :
-
1.
Poincaré inequality: for all .
-
2.
Variance decay: for all .
-
3.
Energy decay: for all .
To prove a directed Poincaré inequality, we replace the Laplacian operator with an operator which, intuitively, only “sends mass” from vertex to vertex if is a directed edge and , i. e. violates monotonicity along edge ; the dynamical system induced by is the directed heat flow on , which was studied by [47]. The directed heat flow is precisely the gradient system for the directed energy functional , i. e. the upward boundary from Definition 11. Thus, this system intuitively “corrects” the local violations of monotonicity as quickly as possible, and indeed it converges to a monotone function as .
This directed theory cannot fully analogize the classical situation above; for example, variance decay fails to hold, because non-constant monotone functions are (non-unique!) stationary solutions. Instead, we will define the dynamical spectral gap of as the best constant characterizing the (directed) energy decay, as in Item 3 above, and then show that (the directed version of) Item 3 implies (the directed version of) Item 1.
As mentioned in the introduction, recent work of [20] also proved a directed Poincaré inequality – for functions defined on the continuous cube – using a dynamical argument. Indeed, that work also took as its starting point the connections between the heat flow and the Poincaré inequality, and showed that the natural directed version of the heat flow in continuous space enjoys exponential energy decay, which implies a directed Poincaré inequality for that setting. Our proof is conceptually similar to the proof of [20], but our techniques differ in at least two ways: 1) [20] required analytical arguments from the theory of partial differential equations (PDEs), while we are able to study our dynamical process as an ordinary differential equation (ODE) thanks to the finite-dimensional nature of our problem; and 2) [20] used tools from optimal transport theory to tensorize their one-dimensional result, while we obtain a multidimensional inequality directly by studying the directed heat flow as a gradient system. In this last regard, our proof also bears resemblance to, and is inspired by prior work of [32] on the so-called Paulsen problem from operator theory, where a “movement decay” property of a suitable dynamical system was used to bound the distance between the initial and equilibrium states of that system.
The rest of this section is organized as follows. Sections 4.1, 4.2, and 4.3 present the directed versions of the Laplacian operator, the Dirichlet energy functional, and the heat flow, respectively. These subsections are largely an alternative exposition of the ideas in [47], but with a different emphasis tailored to our goals555In particular, [47] defined both normalized and unnormalized versions of their Laplacian operator, and focused on the normalized one. We study a single definition for weighted graphs.. In Section 4.4, we define the dynamical spectral gap, which mediates a directed Poincaré inequality for each directed weighted graph.
Notation
Given a discrete set , we denote by the Hilbert space obtained by endowing the set of functions with the inner product , where the expectation is taken with respect to the uniform distribution over . This inner product induces the norm .
For each , we write for the standard basis vector given by .
4.1 The directed Laplacian
Let be a directed weighted graph, where is function specifying the weights of edges in . By convention, we say that is an edge in when . We say is undirected is is a symmetric function.
Definition 36 (Directed Laplacian [47]).
The directed Laplacian operator of is the operator given by
| (10) |
for each . In this paper, we use the notation , for .
Given , we say an edge is -monotone if , and we say it is -antimonotone if . We say is monotone if every edge of is -monotone. If we think of as the distribution of “mass” over the vertices and of as the rate of change of over time , then Definition 36 posits that mass flows along the -antimonotone edges, from the heavier vertex to the lighter one. When is undirected, this process is the standard heat flow on , and indeed Definition 36 recovers the standard graph Laplacian in this case:
Observation 37.
If is undirected, then is (half of) the standard (unnormalized) Laplacian operator of . Indeed, we can see the action of on as follows: 1) remove the -monotone edges from ; 2) view the resulting graph as undirected; and 3) apply the standard Laplacian operator to .
Remark 38.
In spectral graph theory, one typically defines the Laplacian operator of an undirected graph as in our notation, i. e. by replacing with in Definition 36. Our notation follows instead the tradition from probability theory (see e. g. [3, 46]), which has the advantages 1) that itself, rather than , will be the generator of the heat semigroup – our main object of interest; and 2) of consistency with the analytic setting, where we have the Laplacian operator for smooth functions in Euclidean space.
We note that unlike the standard Laplacian operator, the operator is nonlinear and not self-adjoint in general. Instead, one may think of as a “piecewise linear” operator on , which is in particular Lipschitz continuous. In the full version we show
Lemma 39.
The operator is Lipschitz continuous.
4.2 The energy functional
As in the case of the standard Laplacian operator, the directed Laplacian naturally induces an energy functional (or Dirichlet form). The following definition corresponds to the Rayleigh quotient defined in [47].
Definition 40 (Energy functional).
The directed Dirichlet energy functional is given by .
The directed Dirichlet energy measures the local violations of monotonicity along edges of , and it is indeed always non-negative, as shown in the following proposition, which is similar to Lemma 4.3 of [47] for the normalized nonlinear Laplacian and is proved in the full version.
Proposition 41 (Energy functional measures local violations).
For each , it holds that
Observe that since is “piecewise linear”, the energy functional is a “piecewise quadratic” functional on . Furthermore, at each point the Laplacian points at the direction opposite to the gradient of . Using the appropriate analytic formalism, we show the following lemma in the full version.
Lemma 42.
For any , we have .
4.3 Directed heat flow
As previewed in the Section 4.1, the directed Laplacian operator can be thought of as the rate of mass transfer along -antimonotone edges of in a dynamical process. Let us make this notion precise.
Given any , we define the directed heat flow on with initial state as the dynamical system given by the initial value problem (IVP)
| (11) |
Since the operator is Lipschitz, a standard existence and uniqueness theorem for ordinary differential equations (ODEs) implies that this IVP has a unique solution ; see e. g. [44, Corollary 2.6]. This can also be shown using the theory of maximal monotone operators, as done by [28] for the heat flow on hypergraphs, and by [48] in a study that generalizes both the directed graph setting of [47] and the hypergraph setting of [28].
Moreover, the directed heat flow enjoys the following semigroup structure. Define the operator family , with for each , as follows: for each , let be the solution to the IVP (11), and let . Then it immediately follows that satisfies the properties of a semigroup, namely
-
1.
is the identity operator.
-
2.
for all .
-
3.
for all (follows from the differentiability of the solution ).
We call the directed heat semigroup operator. Note that is a nonlinear operator.
Observation 43 (Monotone functions are stationary solutions).
If is monotone, then and hence for all .
Since monotone functions are stationary solutions to the directed heat flow (Observation 43), while any non-monotone has non-zero Laplacian (see Proposition 41), it is natural to expect that always converges to a monotone function as . This is indeed the case, because the directed heat flow is a gradient system for the convex energy functional .
Proposition 44 (Directed heat flow is gradient system).
For all and , we have .
Proof.
By the definition of we have . The claim follows from Lemma 42.
Using basic facts about gradient systems, in the full version we show
Corollary 45 (Convergence to monotone equilibrium).
For every , there exists a (unique) monotone such that as .
In light of Corollary 45, we may define the following limit operator.
Definition 46 (Monotone equilibrium).
We define the operator by for each , and call the monotone equilibrium of .
4.4 Dynamical spectral gap
In this subsection, we associate with the directed Laplacian operator a quantity , the dynamical spectral gap of , as a natural directed generalization of the classical (undirected) case. In particular, characterizes the rate of energy decay in the directed heat flow as converges to its monotone equilibrium, and this also implies a directed Poincaré inequality linking the distance between the initial and equilibrium states to the energy of the initial state (i. e. its violations of monotonicity). We defer to the full version a thorough discussion of the motivation behind our definition. Here, we directly give the following definition, which accomplishes the goals described in the beginning of this section.
Definition 47.
The dynamical spectral gap of is the quantity given by
In the full version, we show that characterizes the rate of exponential decay of , and this suffices to obtain, via a calculus argument, our directed Poincaré inequality:
Theorem 48 (Directed Poincaré inequality).
For all , it holds that
5 The dynamical spectral gap of the directed hypercube
Let denote the unweighted directed hypercube in dimension , i. e. where the weight function is as follows: for each , if with , and otherwise. For simplicity of notation, in this section we also let .
This section studies the spectral gap of endowed with directed Laplacian operator and associated directed Dirichlet energy functional . We show
Theorem 49 (Dynamical spectral gap of the directed hypercube).
satisfies .
Here we sketch the main idea behind the proof of Theorem 49, which is given in the full version. The main point is that and enjoy a useful coordinate-wise decomposition: we write , where each is given by
for each and . It is straightforward to check that this decomposition agrees with Definition 36. Similarly, from Proposition 41 we also obtain the decomposition , where each is given by
for each . (The extra factor of compared to Proposition 41 appears because the summation above counts each edge of twice.)
Now, the upper bound of Theorem 49 is easy and attained by anti-dictator functions, so the main point is to show the lower bound . To this end, we fix any function , consider the fraction in Definition 47, and expand the operators and according to the coordinate-wise decompositions above. The definitions of and readily yield , so the main step is to control the correlation terms . The main conceptual ingredient of the proof shows that this correlation is nonnegative, which is established by an argument reminiscent of the analysis of the “edge tester” for monotonicity of functions on the hypercube [25]666[25] define a switch operator which fixes all violations of monotonicity of a Boolean function along direction , by switching the values of along violating edges. A key lemma in that paper shows that the application of can only make the number of violations of monotonicity along a different direction smaller – informally, the work along direction “helps” toward the work along direction . In our setting, this is captured by the positive correlation between the contributions of directions and to the action of .. We prove this fact below.
Lemma 50.
For every with , we have .
Proof.
Suppose without loss of generality that and . We first observe that it suffices to consider each “square” obtained by fixing all but the first two coordinates, since the inner product decomposes along these squares. Concretely, for each , let be given by for each , where we write for the value of at the input obtained by concatenating and . Then
We will show that for any , the sum is nonnegative, which will complete the proof. Define
Then we have
where the last inequality is proved in Lemma 51 below.
Lemma 51.
For any , we have .
Proof.
Let and , so that our goal is to show that , or equivalently that and . By switching the roles of and , it suffices to prove the first implication, i. e. that .
Suppose . Then , since otherwise we would have and hence . Therefore . Moreover, if then and we are done, so we may assume that . We conclude that
where the first inequality holds since while , and the second inequality holds since .
Combining Theorems 49 and 48, we conclude
Corollary 52 (Directed Poincaré inequality for the hypercube; refinement of Theorem 12).
For all , it holds that .
References
- [1] Vedat Levi Alev and Lap Chi Lau. Improved analysis of higher order random walks and applications. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 1198–1211, 2020. doi:10.1145/3357713.3384317.
- [2] Nima Anari, Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant. Log-concave polynomials ii: high-dimensional walks and an fpras for counting bases of a matroid. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 1–12, 2019. doi:10.1145/3313276.3316385.
- [3] Dominique Bakry, Ivan Gentil, and Michel Ledoux. Analysis and Geometry of Markov Diffusion Operators. Springer Cham, 2014.
- [4] Piotr Berman, Sofya Raskhodnikova, and Grigory Yaroslavtsev. -testing. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 164–173, 2014. doi:10.1145/2591796.2591887.
- [5] Hadley Black, Deeparnab Chakrabarty, and C Seshadhri. A monotonicity tester for boolean functions on -dimensional hypergrids. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), pages 1796–1821, 2023. doi:10.1109/FOCS57990.2023.00110.
- [6] Hadley Black, Deeparnab Chakrabarty, and C Seshadhri. Directed isoperimetric theorems for boolean functions on the hypergrid and an monotonicity tester. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 233–241, 2023.
- [7] Hadley Black, Deeparnab Chakrabarty, and Comandur Seshadhri. A monotonicity tester for boolean functions over the hypergrid . In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2133–2151, 2018.
- [8] Hadley Black, Iden Kalemaj, and Sofya Raskhodnikova. Isoperimetric inequalities for real-valued functions with applications to monotonicity testing. Random Structures & Algorithms, 2024.
- [9] Vladimir Igorevich Bogachev. Measure theory, volume 1. Springer, 2007.
- [10] Mark Braverman, Subhash Khot, Guy Kindler, and Dor Minzer. Improved monotonicity testers via hypercube embeddings. In Proceedings of the Innovations in Theoretical Computer Science Conference (ITCS), 2023.
- [11] D Chakrabarty and C Seshadhri. An monotonicity tester for boolean functions over the hypercube. SIAM Journal on Computing, 45(2):461–472, 2016. doi:10.1137/13092770X.
- [12] Deeparnab Chakrabarty, Xi Chen, Simeon Ristic, C Seshadhri, and Erik Waingarten. Monotonicity testing of high-dimensional distributions with subcube conditioning, 2025. doi:10.48550/arXiv.2502.16355.
- [13] Fan Chung. Laplacians and the cheeger inequality for directed graphs. Annals of Combinatorics, 9:1–19, 2005.
- [14] Emma Cohen. Problems in catalan mixing and matchings in regular hypergraphs. Georgia Tech PhD thesis, 2016.
- [15] Mary Cryan, Heng Guo, and Giorgos Mousa. Modified log-sobolev inequalities for strongly log-concave distributions. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), pages 1358–1370, 2019. doi:10.1109/FOCS.2019.00083.
- [16] Jian Ding and Elchanan Mossel. Mixing under monotone censoring. Electronic Communications in Probability, 19:1–6, 2014.
- [17] Martin Dyer. Approximate counting by dynamic programming. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 693–699, 2003. doi:10.1145/780542.780643.
- [18] Weiming Feng and Ce Jin. Approximately counting knapsack solutions in subquadratic time. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1094–1135, 2025. doi:10.1137/1.9781611978322.32.
- [19] Renato Ferreira Pinto Jr. Directed poincaré inequalities and monotonicity testing of lipschitz functions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), 2023. doi:LIPIcs.APPROX/RANDOM.2023.61.
- [20] Renato Ferreira Pinto Jr. Directed isoperimetry and monotonicity testing: A dynamical approach. In 2024 IEEE 65th Annual Symposium on Foundations of Computer Science (FOCS), pages 2295–2305. IEEE, 2024. doi:10.1109/FOCS61266.2024.00134.
- [21] James Allen Fill. Eigenvalue bounds on convergence to stationarity for nonreversible markov chains, with an application to the exclusion process. The annals of applied probability, pages 62–87, 1991.
- [22] Cees M Fortuin, Pieter W Kasteleyn, and Jean Ginibre. Correlation inequalities on some partially ordered sets. Communications in Mathematical Physics, 22:89–103, 1971.
- [23] Kaito Fujii, Tasuku Soma, and Yuichi Yoshida. Polynomial-time algorithms for submodular laplacian systems. Theoretical Computer Science, 892:170–186, 2021. doi:10.1016/J.TCS.2021.09.019.
- [24] Pawel Gawrychowski, Liran Markin, and Oren Weimann. A faster FPTAS for #knapsack. In Proceedings of the International Colloquium on Automata, Languages, and Programming (ICALP), pages 64–1, 2018. doi:10.4230/LIPIcs.ICALP.2018.64.
- [25] Oded Goldreich, Shafi Goldwasser, Eric Lehman, Dana Ron, and Alex Samorodnitsky. Testing monotonicity. Combinatorica, 3(20):301–337, 2000. doi:10.1007/S004930070011.
- [26] Parikshit Gopalan, Adam Klivans, Raghu Meka, Daniel Štefankovic, Santosh Vempala, and Eric Vigoda. An fptas for #knapsack and related counting problems. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), pages 817–826, 2011. doi:10.1109/FOCS.2011.32.
- [27] Venkatesan Guruswami. Rapidly mixing markov chains: A comparison of techniques (a survey), 2016. arXiv:1603.01512.
- [28] Masahiro Ikeda, Atsushi Miyauchi, Yuuki Takai, and Yuichi Yoshida. Finding cheeger cuts in hypergraphs via heat equation. Theoretical Computer Science, 930:1–23, 2022. doi:10.1016/J.TCS.2022.07.006.
- [29] Mark Jerrum. Mathematical foundations of the markov chain monte carlo method. In Probabilistic methods for algorithmic discrete mathematics, pages 116–165. Springer, 1998.
- [30] Mark Jerrum. Counting, sampling and integrating: algorithms and complexity. Springer Science & Business Media, 2003.
- [31] Subhash Khot, Dor Minzer, and Muli Safra. On monotonicity testing and boolean isoperimetric-type theorems. SIAM Journal on Computing, 47(6):2238–2276, 2018. doi:10.1137/16M1065872.
- [32] Tsz Chiu Kwok, Lap Chi Lau, Yin Tat Lee, and Akshay Ramachandran. The paulsen problem, continuous operator scaling, and smoothed analysis. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 182–189, 2018. doi:10.1145/3188745.3188794.
- [33] Lap Chi Lau, Kam Chuen Tung, and Robert Wang. Cheeger inequalities for directed graphs and hypergraphs using reweighted eigenvalues. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 1834–1847, 2023. doi:10.1145/3564246.3585139.
- [34] Lap Chi Lau, Kam Chuen Tung, and Robert Wang. Fast algorithms for directed graph partitioning using flows and reweighted eigenvalues. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 591–624, 2024. doi:10.1137/1.9781611977912.22.
- [35] David A Levin and Yuval Peres. Markov chains and mixing times, volume 107. American Mathematical Soc., 2017.
- [36] Pierre Mathieu. Log-sobolev and spectral gap inequalities for the knapsack markov chain. Markov Process. Relat. Fields, 8:595–610, 2002.
- [37] Ravi Montenegro and Prasad Tetali. Mathematical aspects of mixing times in markov chains. Foundations and Trends® in Theoretical Computer Science, 1(3):237–354, 2006. doi:10.1561/0400000003.
- [38] Ben Morris and Alistair Sinclair. Random walks on truncated cubes and sampling 0-1 knapsack solutions. SIAM journal on computing, 34(1):195–226, 2004. doi:10.1137/S0097539702411915.
- [39] Ryan O’Donnell. Analysis of boolean functions. Cambridge University Press, 2014.
- [40] Ramesh Krishnan S Pallavoor, Sofya Raskhodnikova, and Erik Waingarten. Approximating the distance to monotonicity of boolean functions. Random Structures & Algorithms, 60(2):233–260, 2022. doi:10.1002/RSA.21029.
- [41] Romeo Rizzi and Alexandru I Tomescu. Faster fptases for counting and random generation of knapsack solutions. Information and Computation, 267:135–144, 2019. doi:10.1016/J.IC.2019.04.001.
- [42] Daniel Štefankovič, Santosh Vempala, and Eric Vigoda. A deterministic polynomial-time approximation scheme for counting knapsack solutions. SIAM Journal on Computing, 41(2):356–366, 2012. doi:10.1137/11083976X.
- [43] Michel Talagrand. Isoperimetry, logarithmic sobolev inequalities on the discrete cube, and margulis’ graph connectivity theorem. Geometric & Functional Analysis, 3(3):295–314, 1993.
- [44] Gerald Teschl. Ordinary differential equations and dynamical systems, volume 140. American Mathematical Soc., 2012.
- [45] Kam Chuen Tung. Reweighted Eigenvalues: A New Approach to Spectral Theory beyond Undirected Graphs. PhD thesis, University of Waterloo, 2025.
- [46] Ramon van Handel. Probability in high dimension (lecture notes), 2014.
- [47] Yuichi Yoshida. Nonlinear laplacian for digraphs and its applications to network analysis. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pages 483–492, 2016. doi:10.1145/2835776.2835785.
- [48] Yuichi Yoshida. Cheeger inequalities for submodular transformations. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2582–2601, 2019. doi:10.1137/1.9781611975482.160.
