Markov Chain Robustness
Abstract
When a Markov chain models nature or social interactions, it is likely not followed exactly, but only approximately. We therefore introduce several notions of robustness for a Markov chain . Our standard adversary can dynamically change transition probabilities of by , and our strong adversary can completely control each transition independently with probability , as in a model by Azar, Broder, Karlin, Linial, and Philips [4]. These adversaries are equivalent up to constant factors if the degrees are constant. Our adversarial chains need not converge.
We define and prove various robustness properties of a reversible chain , i.e., a random walk on a connected undirected graph . Let be the maximum degree, the diameter, the stationary distribution, and the mixing time.
-
1.
We define a natural analogue that upper bounds limiting frequencies in a set in the adversarial chain. We show that if , where is a variant of the mixing time, then for any .
-
2.
We define the mixing time robustness as the largest such that the approximate mixing time increases by only a constant factor, and prove that it is .
-
3.
We define the hitting time robustness as the largest such that the maximum hitting time increases by only a constant factor, and show that it is . For trees, we show it is .
-
4.
We define the cover time robustness as the largest such that the cover time increases by only a constant factor. We show that in most graphs it’s at least the hitting time robustness.
-
5.
We characterize the mixing, hitting, and cover time robustnesses for constant-degree regular expander graphs up to constant factors. They are , , and , respectively.
Keywords and phrases:
Markov chain, random walk, mixing time, hitting time, cover time, robustness, expander graph2012 ACM Subject Classification:
Theory of computation Random walks and Markov chainsAcknowledgements:
We thank Kuikui Liu and Salil Vadhan for helpful discussions, the anonymous referees for very useful suggestions, and Zhiyang Xun and Dean Doron for helpful comments.Funding:
Supported in part by NSF Grant CCF-2312573 and a Simons Investigator Award (#409864).Editor:
Shubhangi SarafSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Markov chains are used to model many scenarios, and there is a huge theory about them. In many situations, such as a Markov chain modeling nature or social behavior, it is likely that the Markov chain is not followed exactly, but only approximately. We therefore study how behavior in Markov chains is affected by small perturbations.
For example, Cannon, Daymude, Randall, and Richa [6] showed that a collection of computational elements can compress – gather tightly as in a sphere – by asynchronously each executing a simple program. (See also [17].) They analyzed this by modeling it as a Markov chain. Such compression and other self-organizing behavior also occurs in nature. In nature, one may expect that the algorithm is not followed exactly but only approximately. Would the same compressive behavior arise in this case? For a Markov chain modeling social behavior, it’s even more compelling that the chain wouldn’t be followed exactly but could have adversarial influences.
Another situation where errors arise is in simulating huge physical systems. Such a system may exactly follow a Markov chain; however, researchers simulate such a system by coarsening the exact chain into a discrete approximation. This coarsening necessarily leads to error. How badly does this error affect the simulation? In this case, Kania, Aristoff, and Zuckerman [15] recently developed a randomized algorithm to remove the error, but analyzing the error remains interesting.
There has been some work exploring the power of an adversary on certain Markov chains. Chin, Moitra, Mossel, and Sandon [7] studied the power of an adversary in Glauber dynamics. Glauber dynamics is a type of random walk on , where is a set of nodes in an underlying graph and is a finite alphabet. They proposed an adversary that controls any changes to a subset with . They studied how such an adversary can affect global features of the system, such as the approximate mixing time.
Doron, Moshkovitz, Oh, and Zuckerman [9, 10] studied the power of an extremely strong adversary for random walks on lossless expanders. They showed that even in the presence of such a strong adversary, a random walk will approximately mix in some sense. However, their methods don’t appear to extend beyond lossless expanders.
We study a model that can be applied to any random walk. Related questions have been studied before, and we adopt some of those frameworks, but we study different questions.
To study errors in Markov chain behavior, Hartfiel [11, 12] proposed a natural model of Markov set-chains defined as follows. Assume the states are , and let denote the set of stochastic matrices. The Markov set-chain is a stochastic process defined by a set .111Hartfiel’s definition has being compact, but let’s ignore that for now. At each time step , a transition matrix is chosen arbitrarily. Then determines the transitions from to : . Thus a Markov chain is a special case of a Markov set-chain where the set consists of one matrix, the transition matrix of the Markov chain. In general, the transition matrices in may have different stationary distributions.
Azar, Broder, Karlin, Linial, and Phillips [4] studied a model similar in spirit. More recently, researchers have studied a model very similar to Markov set-chains under the name of dynamic graphs or evolving graphs, e.g., [3, 22].
1.1 The adversary
There appear to be three reasonable choices for the power of an adversary choosing .222Note that we give the adversary more power by allowing to depend on ; otherwise we could get three more choices for the adversary’s power.
-
1.
Oblivious: is chosen independently of ;
-
2.
Markovian: may depend on but not for ;
-
3.
Non-Markovian: may depend on for all .
Note that an oblivious adversary corresponds to a non-homogeneous (or time-dependent) Markov chain. Oblivious and Markovian adversaries are equivalent if is closed under “row replacement”: replacing a row in any by the corresponding row in another . The equivalence follows because the adversary can choose the th row assuming the Markov chain is in state . The Markovian and non-Markovian adversaries are equivalent if an adversary is trying to affect the hitting time, but not equivalent for cover time. Hartfiel, [4], and recent work on dynamic graphs implicitly use an oblivious adversary (and don’t discuss the others).
We focus on the most powerful, non-Markovian adversary, although we make some observations about the other adversaries.
Can we understand a Markov set chains (MSC) in terms of a related Markov chain? Although reversibility is not required for all of our theorems, we usually assume the related chain is reversible, i.e., a random walk on an undirected graph. We further assume the graph is connected; otherwise we could restrict attention to the appropriate connected component. The most important probability distribution related to a Markov chain is the stationary distribution . The most important quantities of interest in Markov chains are the mixing time , the time it takes to approach ; the max hitting time , the maximum expected time to go from one vertex to another; and the cover time , the expected time to visit all nodes from a worst-case starting node.
First, we discuss the stationary distribution. Suppose all chains in an MSC have the same stationary distribution . Will this MSC converge to ? It depends on the adversary. For an oblivious adversary, the MSC will converge to – the usual proof works. For a Markovian adversary, the MSC need not converge, even if all underlying chains are reversible; see Section 3.1.
There are two natural special cases of MSCs that we consider. First, an interval chain is an interval of matrices , where denotes the set of matrices such that for all states , we have . For interval chains, oblivious and Markovian adversaries are equivalent, since is closed under row replacement.
For our standard adversary, we consider interval chains of the following form. For a Markov chain333We identify a Markov chain with its transition matrix. and , we consider the interval chain . When convenient, we consider the essentially equivalent interval chain .
Second, [4] introduced a model of biased random walks where a controller biases the random walk. That is, given a Markov chain and a parameter , a controller chooses an arbitrary stochastic matrix with and the biased random walk follows . That is, with probability the controller biases the walk, but has to respect the graph structure. Haslegrave, Sauerwald, and Sylvester [13] studied this in the time dependent setting. The focus of these papers is on a friendly controller who tries to improve the situation from the simple random walk. Our focus is on an adversarial controller who seeks to make matters worse. This will be our strong adversary.
In other words, in both cases, we have that is left alone. A strong adversary may distribute the remaining probability arbitrarily on its support. A standard adversary is limited to change each probability by a bounded relative amount. However, a strong adversary with parameter is at most as powerful as a -standard adversary, where denotes the maximum degree. Since we are thinking of as small or constant, we don’t dwell on the difference between the adversaries.
One might suggest an even stronger adversary that can take an arbitrary probability and distribute it arbitrarily. In other words, it can choose such that every row of and differs by at most in total variation distance. This adversary generalizes the adversary of [7]. However, if for some node , then such an adversary can force the MSC to avoid hitting . This renders some of our questions impossible, so we don’t consider this adversary.
Main question.
For how large an does the adversarial chain have similar behavior as the original chain?
In order to make sense of this question, we need to define suitable notions for MSCs. We study several notions: the robustness of the stationary distribution, and the robustness of three times: the mixing time, the maximum hitting time, and the cover time. We will use strong adversaries to define stationary and mixing time robustness, and standard adversaries for hitting time and cover time robustness. This is because our proofs work for strong adversaries for the first two robustnesses, but strong adversaries are too powerful for the last two robustnesses, as explained below.
1.2 Stationary robustness
We begin with the robustness of the stationary distribution . Although MSCs may not converge to a distribution (see Section 3.1), we can still study the maximum and minimum limiting probabilities of sets. We can define this as follows. For , let denote the number of visits to up to time .
Definition 1.1.
For , the upper stationary probability
where the outer is over the choice of a strong -adversary.
We could define a lower limit analogously, but we focus on .
For our first theorem, we use a variant of the mixing time that we call (see Section 2.3). Like mixing time, it satisfies , where is the spectral gap. We prove the following.
Theorem 1.2.
Let be a graph with all degrees at most . Let be a graph with all degrees at most . For any -strong adversary, any starting distribution, and any with , we have
In particular, for , we have .
Here the term goes to 0 as , so in particular we have for any .
The dependence of on is essentially best possible, as the path graph shows.
To interpret this theorem, consider the case when the state space is exponentially large. Then any event with negligible stationary probability in the original chain has negligible stationary probability in this MSC with a nontrivial adversary. For example, [6] show that compression in their model happens with negligible probability; we deduce that this conclusion holds even if their chain is not followed exactly.
We could define stationary robustness as the largest such that the stationary probability analogues increase from to at most for some . The choice of or quantification over could a priori change the definition. However, Theorem 1.2 implies a robust stationarity of , and at least for the path the choice of doesn’t matter.
We compare Theorem 1.2 to the work of [4]. They only consider a time-independent adversary, although it wouldn’t matter for this theorem. The main difference between their work and ours is that they show a lower bound on the adversarial power, whereas we show an upper bound. Specifically, they showed that for any bounded-degree graph and any node , some strong -adversary can achieve , and that this is tight for an expander. Haslegrave, Sauerwald, and Sylvester [13] generalized this to more settings. Our Theorem 1.2 upper bounds for every graph and every strong -adversary.
1.3 Mixing time robustness
We now turn to the robustness of the fundamental times associated with a random walk: mixing time, max hitting time, and cover time. We begin with mixing time, which requires some set up. Let denote variation distance. We will generalize the following for a standard random walk. For a node , let denote the probability for a random walk starting at . Let
Definition 1.3.
Fix a small constant , say . The mixing time .
To generalize this to MSCs, we define
Here is the stationary distribution of the original Markov chain, and the sup is over strong -adversaries. Thus . We can now define the robust mixing time, which will be a workable definition despite the MSC possibly not converging.
Definition 1.4.
Fix a small constant , say . Define the robust mixing time as
where is the adversary parameter.
Note that , and that may be infinite. This leads to the following definition.
Definition 1.5.
The mixing time robustness is
When is omitted, we allow to be an arbitrarily large constant.444This only makes sense when we have a family of graphs, but this applies to big Oh notation generally.
It is not hard to show that the mixing time robustness is , where is the maximum degree; see Proposition 2.12. We achieve the square root of this quantity for constant-degree graphs.
Theorem 1.6.
In a graph with all degrees at most , the mixing time robustness .
This is tight for the path and the cycle. However, for expander graphs, this gives . We prove a constant lower bound for constant-degree expanders.
Theorem 1.7.
A -regular -spectral expander has mixing time robustness .
1.4 Hitting time robustness
We now turn to some expected stopping times. Let denote the expectation for a Markov (set) chain starting at . Let denote the hitting time of : the time to first visit after time 0. The standard hitting times in an unbiased random walk are , and the max hitting time is .
Before defining the robust version, we briefly discuss the adversary. Note that if we have a strong -adversary, the robustness for hitting and covering is less than , where denotes the minimum degree. That’s because a strong -adversary can always avoid a particular node. Due to this and our proofs working better with standard adversaries, we define the robust versions for standard adversaries.
Definition 1.8.
Fix an . The robust hitting time , where the sup is over strategies for standard -adversaries. The robust max hitting time is .
Generalizing an observation by [4], the quantities are the same for oblivious, Markovian, and non-Markovian adversaries. This is because an adversary’s best strategy at a given node does not depend on the previous history. This basically follows from Markov decision theory.
We focus on so we have one quantity rather than quantities. We can now define:
Definition 1.9.
The hitting time robustness is
When is omitted, we allow to be an arbitrarily large constant.
It is not hard to show that the hitting time robustness is ; see Proposition 2.12. We improve this bound dramatically.
Theorem 1.10.
The hitting time robustness .
We can improve this bound for trees. Let denote the diameter.
Theorem 1.11.
The hitting time robustness for any tree is .
Note that , and is often significantly less. For example, for the line and cycle on nodes, but . Even more dramatically, for the balanced binary tree but .
For constant degree expanders, Theorem 1.10 gives a bound of . We show that this is tight for expanders by giving a matching upper bound, which is a negative result.
Theorem 1.12.
The hitting time robustness of any constant degree expander is .
1.5 Cover time robustness
We now turn to the cover time. Let the random variable denote the first time the random walk has visited all nodes. We define the cover time .
Definition 1.13.
Fix an . The robust cover time , where the sup is over strategies for standard -adversaries.
Note that Markovian and non-Markovian adversaries do not appear to be equivalent, since the prior history is very relevant for the cover time. We now define:
Definition 1.14.
The cover time robustness is
When is omitted, we allow to be an arbitrarily large constant.
It is not hard to show that the cover time robustness is ; see Proposition 2.12. We improve this in many cases. We always have ; we can improve the cover time robustness whenever this is tight up to a constant factor.
Theorem 1.15.
In any graph with , we have .
1.6 Examples
It’s instructive to see how our bounds apply to natural examples.
1.6.1 Path/cycle
The path is the graph on with edges ; the cycle contains the additional edge . In the non-adversarial setting, the main quantities are similar for the two graphs. The mixing time, max hitting time, and cover time are all .
To understand the adversarial setting, we focus on the path. The worst a strong adversary can do is set all and all . In this case, the hitting times will increase by a factor of about . Therefore, the hitting time robustness is . All other robustnesses have the same order of magnitude. Thus, Theorem 1.6 is tight for mixing time robustness, but Theorem 1.10 is not tight for hitting time robustness. See Section 4.4 for details.
1.6.2 Expanders
Here we consider -regular expander graphs, where we think of . In the non-adversarial setting, the mixing time , the max hitting time [21], and the cover time [5, 21]. We actually prove these last two theorems in what we believe is a simpler way.
Our theorems imply tight bounds on the robustness notions for constant-degree regular expanders. Specifically, Theorem 1.7 show that the mixing time robustness is a constant, while Theorem 1.12 and Theorem 1.15 show that the hitting and cover time robustnesses are .
We remark that previous work shows that lossless expanders have robustness properties beyond what we consider here. A -regular graph is a lossless expander if sets that are not too large expand by a factor of where . Doron, Moshkovitz, Oh, and Zuckerman [9, 10] studied random walks where a non-Markovian adversary chooses steps according to a Chor-Goldreich (CG) source, and even generalizations of this. In a CG-source, in a -regular graph, an adversary can choose any distribution for the next step such that no edge is chosen with probability more than for some parameter . They show that even this biased walk mixes in the sense that the final node has entropy . However, their methods do not seem to extend to standard expander graphs.
1.7 Weak randomness
As another perspective, there has been a lot of research investigating the use of weak randomness in computing. In those situations, it is natural to allow computation on the weak randomness, for example to purify it into high-quality randomness. This work disallows such preprocessing, which is particularly natural when the random process models nature or social interactions.
1.8 Techniques
We prove Theorem 1.2 and Theorem 1.6 about stationarity and mixing time robustness by generalizing a potential function argument by Haslegrave, Sauerwald, and Sylvester [13]. Their potential function uses squares to prove a lower bound. We use a potential function based on smaller powers to prove an upper bound. In particular, we need an inequality going in the opposite direction. Specifically, for define the -power mean , , and the -max-avg operator . We use Holder’s inequality and other ideas to bound in terms of for some just slightly larger than 1. We also simplify the [13] method somewhat, avoiding the use of trajectory trees.
We prove Theorem 1.7 about mixing time robustness of expanders by adapting the proof that the second largest eigenvalue determines convergence.
We prove Theorem 1.10 about hitting time robustness by showing that in a standard random walk, there’s the “right” probability of hitting a node within twice the separation time . Specifically, for , we show that for all , we have . We then show that this probability doesn’t decline much even in the robust setting. Dividing the random walk into epochs of length , we conclude that the hitting time doesn’t increase much. We further use this technique to give a simpler proof of the max hitting time for expanders, which will be in the final version of this paper. Given its utility for both of these theorems, we believe that this technique could be useful elsewhere.
We prove Theorem 1.11 by first observing that we can take the adversary to be oblivious, so the corresponding adversarial walk has a stationary distribution. We then compare the stationary distributions, and apply the “essential edge lemma” from Aldous-Fill about hitting times across cut edges.
We prove Theorem 1.12, the negative result giving a tight bound for expanders, by giving a target distribution and modifying the Metropolis algorithm. Much of this argument is similar to a lower bound on by [4].
Theorem 1.15 about cover time robustness follows from Matthews’ technique for bounding the cover time in terms of the hitting time.
1.9 Related work
We have not found any previous work giving good bounds on our notions of robustness, except for lossless expanders. Doron, Moshkovitz, Oh, and Zuckerman [9, 10] analyzed extremely strong non-Markovian adversaries for lossless expanders, but their methods don’t apply more generally. They also didn’t analyze hitting or cover times.
In the model introduced by [4] and in follow-up works, the focus is on a controller seeking to minimize hitting and cover times, and they show lower bounds on increasing the probability of being in sets. For example, Haslegrave, Sauerwald, and Sylvester [13] improved a result of [4] and proved that an -controller can increase the probability of being in a set from to approximately . They used this to minimize hitting and cover times, for example showing how to reduce the cover time of an expander to . This is different than our work where we view the controller as an adversary, so hitting times and cover times increase rather than decrease. Moreover, the quantification is different: they show that a suitable adversary exists, whereas we show that all adversaries are not too bad.
Chin, Moitra, Mossel, and Sandon [7] studied the power of an adversary in Glauber dynamics. Glauber dynamics is a type of random walk on , where is a set of nodes in an underlying graph and is a finite alphabet. They proposed an adversary that controls any changes to a subset with . They studied how such an adversary can affect global features of the system, such as the approximate mixing time. Their methods seem tailored to Glauber dynamics and not generalizable.
There is also research analyzing how the stationary distribution of a Markov chain is affected by perturbations. For example, Liu ([18], Theorem 2.2) showed that if is perturbed to , which has stationary distribution , then
where for a matrix is the maximum -norm of any row. Thus, the best bound on robustness that this could give is , which is much weaker than what we obtain. Moreover, our robustness is for Markov set chains, which are much more general than Markov chains.
Hartfiel [12] studies when Markov set-chains converge, and movement between transient states and absorbing states and the like. He doesn’t address the types of questions we ask.
Avin, Koucky, and Lotker [3] study random walks on evolving graphs. Here an adversary is allowed to completely change the graph, i.e., transition matrix, at every time step. However, the adversary doesn’t know the location of the random walker; in other words, the adversary decides the graphs in advance. They show that the cover time and mixing time could be exponentially larger than the cover times and mixing times of the individual graphs, but are only polynomially larger if the random walk is made lazy. This model is quite different from ours.
Hunter [14] studies a related model, what in our terminology is a strong oblivious adversary. He defines to be the “random target time” – the time for a random walk to hit a random node chosen according to – but confusingly calls this the mixing time. Really, is more closely related to ; see [16] for more on the random target time. He then shows that the adversarial walk has a stationary distribution at most -far from in variation distance. This is much weaker than our results.
There are other adversarial variants of Markov chains that don’t appear relevant for us. For example, the PageRank algorithm [20] and variants (e.g., [23]) allow perturbations where the adversary need not respect the graph structure, which leads to very different results. For another example, [8] define a model that they call “adversarial Markov chains” on an infinite metric space. In their model, transitions are followed exactly except when the chain is in some prespecified bounded subset, in which case an adversary can make arbitrary bounded jumps. They study whether such an adversarial chain remains bounded in probability.
1.10 Conjectures and Open Problems
We conjecture the following.
Conjecture 1.16.
The hitting time robustness is , where denotes the diameter.
Since , this would improve Theorem 1.10. Theorem 1.11 establishes this conjecture for trees, which is tight for the path and cycle.
Conjecture 1.17.
For all graphs, the cover time robustness is the hitting time robustness, up to a constant factor.
Conjecture 1.18.
We can replace in Theorem 1.2 with .
What do we need to know about a graph to determine its robustnesses? Are the different robustnesses efficiently computable, perhaps up to constant factors?
Another class of interesting open problems is to determine the various robustnesses for natural graphs, such as different dimensional grids, hypercubes, and balanced trees.
2 Preliminaries
2.1 Random walk basics
We will assume throughout that any undirected graph is connected. We denote , , the set of neighbors of node , and the degree of . A random walk on is a sequence of random variables where each and is uniform on . We use to denote the transition matrix of the random walk. Letting denote the probability distribution of , viewed as a row vector, so we have . If is not bipartite, then converges to the stationary distribution given by . Even if is bipartite, converges to .
Let and denote the expectation and probability, respectively, for a random walk starting at . Let denote the hitting time of : the time to first visit after time 0. The standard hitting times in an unbiased random walk are , and the max hitting time is .
The following lemma is well known.
Lemma 2.1.
The expected return time .
2.2 Expander graphs
Expanders are graphs where every subset of nodes has many neighbors. We will use the standard spectral characterization, which is equivalent for constant expansion. For simplicity, assume that is regular. Denote the eigenvalues of by . The last equality holds only when is bipartite. Let us assume that is not bipartite, e.g., by adding self loops. Let
denote the second largest eigenvalue in absolute value. Let denote the spectral gap. We call a -spectral expander.
2.3 Mixing time variants
Recall the definition of mixing time, Definition 1.3. The choice of only affects the mixing time up to a constant factor. Specifically:
Lemma 2.2.
In any graph, .
For this and much more on the mixing time see, for example, the book by Levin, Peres, and Wilmer [16].
Up to a logarithmic factor, the reciprocal of the spectral gap characterizes the mixing time. Specifically, it is known (e.g., [16] Theorems 12.3 and 12.4) that
We will use a variant of the mixing time called the separation time .
Definition 2.3.
The separation distance is:
In other words, it’s the minimum such that all nodes have .
We can now define separation time in the natural way.
Definition 2.4.
Fix a small constant , say . Define the separation time
Lemma 2.5.
In any graph, .
For the proof, see Aldous-Fill [1]. Briefly, it’s immediate that . Aldous-Fill (Lemma 4.7 in [1]) show that .
We further define a variant of the separation distance briefly discussed in Aldous-Fill in Equation 4.14. They left it unnamed, so we name it.
Definition 2.6.
The upper separation distance is:
In other words, it’s the minimum such that all nodes have .
We can now define upper separation time analogously.
Definition 2.7.
Fix a small constant , say . Define the upper separation time
While we have , we can’t bound in terms of ; see Aldous-Fill’s Example 4.9. Nevertheless, the upper bounds of in terms of the spectral gap also hold for and , just by examining the proof.
Lemma 2.8.
We have .
Remark 2.9.
Since standard random walks on bipartite graphs don’t converge, the above definitions aren’t interesting for bipartite graphs. The usual way of circumventing this is to study lazy random walks on bipartite graphs. Alternatively, we can get meaningful statements for bipartite graphs by replacing with in the above definitions, understanding that even though the random walks don’t converge, the average of pairs of steps converge.
2.4 Easy bounds on robustness
We will use the following notation.
Definition 2.10.
Let denote the probability that an unbiased walk, starting at node , is in the set at time . For a given adversary , let denote the probability that an -biased walk, starting at node , is in the set at time . When the start node is understood, we often omit it and write and .
We have the following simple lemma.
Lemma 2.11.
For any standard -adversary, any time , and any set , we have .
Note that this does not hold for a strong -adversary, since such an adversary could move probability to a set with tiny .
It’s not hard to show that each robustness is at least the reciprocal of the corresponding quantity.
Proposition 2.12.
For any graph , the following bounds hold:
-
1.
The mixing time robustness, if defined by a standard adversary, is . Since we defined mixing time robustness using a strong adversary, this implies for graphs with maximum degree .
-
2.
The hitting time robustness .
-
3.
The cover time robustness .
Proof.
First we show the mixing time robustness. Using Lemma 2.2, choose such that . Set . Then a standard -adversary can’t make the biased walk probabilities much smaller than the unbiased probabilities . Specifically, for any set , we have
Therefore, , and the triangle inequality gives .
We prove the hitting time and cover time robustness results together. Let be a stopping time such as a hitting time or cover time. Let . Then by Markov’s inequality, for any node , we have , so . Set . Fix a strong -adversary, and let denote with respect to this adversary. Then for any node , we have
But now dividing time into epochs of length , the chance of in each epoch is at least 1/2, so by Wald’s identity we have .
3 Stationary and Mixing Time Robustness
In this section, we first show that for a Markovian adversary, a Markov set-chain will not necessarily converge to , even if all underlying chains are reversible and have as their stationary distribution. We only consider a Markovian adversary in Section 3.1; in the rest of this section we consider standard and strong adversaries. Next we show how to save a square root from the easy mixing time robustness of , proving Theorem 1.6. We add one more ingredient to prove Theorem 1.2. Finally, we establish that constant-degree expanders have constant mixing time robustness, namely .
3.1 MSCs non-convergence
First we show that for a Markovian adversary, the MSC will not necessarily converge to , even if all underlying chains are reversible. To see this, we define two weighted undirected graphs based on the path graph. Denote , and . Graph assigns edges in Odd weight 2, and edges in Even weight 1, and adds weighted self loops at the two endpoints to make all nodes have weighted degree 3. Graph is the same but swaps even and odd: it assigns edges in Even weight 2, and edges in Odd weight 1. Let denote a random walk on , and similarly for . Since and are both regular, both have uniform stationary distributions.
However, consider a Markovian MSC that chooses if is odd and otherwise. If is not an endpoint, then . We thus have a biased random walk which converges to a very biased distribution very far from uniform.
3.2 Bounding the increase in probability
We prove that if a set has probability , then a strong -adversary cannot increase the probability to more than as long as . This quadratic dependence on improves on the easy linear dependence given by Lemma 2.11. We use a potential function argument that generalizes one by Haslegrave, Sauerwald, and Sylvester [13]. They used a potential function based on squares to prove a lower bound. We use a potential function based on smaller powers to prove an upper bound. In particular, we need an inequality going in the opposite direction as [13].
For define the -power mean , , and the -max-avg operator . We now develop the inequality we need.
Lemma 3.1.
For and , we have .
Proof.
Lemma 3.2.
For and satisfying and , we have .
Proof.
Let , and assume without loss of generality that . Using Holder’s inequality and Lemma 3.1 with ,
Recall Definition 2.10 for the definitions of and .
Theorem 3.3.
Let be a graph with all degrees at most , and fix a starting node. Let satisfy and . For any -strong adversary, any start node, and any , we have .
Proof.
We follow the structure of [13] but both simplify it somewhat, avoiding the use of the trajectory tree, and generalize it. Fix the transition matrix , a start node , target set , and stopping time . Define the vectors
When intermixed with matrices, we will view as a row vector and as a column vector, for reasons that will become apparent. Note that . Define the potential function
Since if and 0 otherwise, we have that
Since if and 0 otherwise, we have that
If the random walk is at node with steps remaining, the adversary’s strategy to maximize is to use the probability it controls to move to a neighbor maximizing . Letting denote the degree of , this strategy gives
using Lemma 3.2. Raising both sides to the th power gives
Using the fact that all degrees are at most , we can deduce an inequality with vectors and matrices:
Now we can bound the potential function by
We conclude that , and
Corollary 3.4.
The conclusion of Theorem 3.3 holds for any starting distribution.
Proof.
This follows from the concavity of the function .
Corollary 3.5.
For any -strong adversary, any starting distribution, and any , we have
where . In particular, for , we have .
Proof.
We choose to optimize the upper bound for . Let , , and . Since , the upper bound is
By the arithmetic-geometric mean inequality, this is minimized when and achieves the bound .
We now restate and prove Theorem 1.6, which is tight for the path and the cycle. See 1.6
Proof.
While we can deduce this from Corollary 3.5, it’s more straightforward to apply Theorem 3.3 directly. We will choose suitable constants and that satisfy the assumptions of Theorem 3.3 and use the conclusion. Letting , we can bound the variation distance
The derivative of is . There are two cases. If , then the derivative is nonnegative on , so the sup is achieved at and the sup is . If , then the sup is achieved when . This value of evaluates to:
using that for .
We conclude that . By choosing and , and using that for , we get that this maximum is at most . By Lemma 2.2, after steps we have . Therefore, by the triangle inequality, . Thus the robustness is bounded below by
We now use Corollary 3.5 to upper bound and prove Theorem 1.2.
Theorem 1.2. [Restated, see original statement.]
Let be a graph with all degrees at most . Let be a graph with all degrees at most . For any -strong adversary, any starting distribution, and any with , we have
In particular, for , we have .
Proof.
As usual, let and denote unbiased and adversarially-biased probabilities, respectively. Let be arbitrary, and fix a starting node for the random walk. First assume that is bipartite. Note that to show , it suffices to show that for all large enough we have . Let and . To bound for , set , and condition on . Recall that by the definition of , we have . Using for and , and applying Corollary 3.5, we have
as required for the nonbipartite case.
The bipartite case follows similarly by comparing with .
3.3 Mixing time robustness for expanders
We now establish robustness in terms of spectral gap. See Section 2.2 for the definition of a spectral expander. We first restate our theorem
Theorem 1.7. [Restated, see original statement.]
A -regular -spectral expander has mixing time robustness .
Proof.
Our strong adversary chooses a stochastic matrix with , and the transition matrix . Let denote the vector of probabilities at time . We can write , where can be viewed as an error term. Therefore,
Subtracting gives
Using that for stochastic with row sums at most , we have , , and for yields:
| (1) |
We will write (1) as to simplify our calculation. Our calculation will show that we can choose small enough to make converge quickly. For to be chosen later, set
Then .
Let . Note that for with , and for , we have
Plugging into (1), we deduce that for , we have
Consequently, after steps, we have
Here we used that for , we have .
Thus, by choosing a small enough constant, we can make the variation distance to , which is , an arbitrarily small constant.
4 Hitting Time Robustness
4.1 Inverse of mixing time
We now show that the hitting time robustness is . First recall the separation time and that (Lemma 2.5). We first prove that short unbiased walks hit nodes with at least the “right” probability. Let .
Lemma 4.1.
For all , we have .
Proof.
Let
Since for any starting , after steps every node has probability at least , we have
We now show that , which will prove the lemma. Suppose that . We will show that , which contradicts Markov’s inequality, proving the lemma.
To show this, let . We prove by induction that for all , we have
This is true for . Suppose it’s true for a given . Then
To analyze the conditional probability, we bound the conditional distribution of given that , where . Specifically, for any node ,
Thus, conditioning on can only increase probabilities by at most . Therefore,
and we are done.
We are now ready to prove our theorem about hitting time robustness.
Theorem 4.2.
For all and all nodes , we have .
Proof.
We prove the theorem with in place of ; this suffices since . By Lemma 4.1, even with an -adversary, we have
By Wald’s identity, we have .
Theorem 1.10, stating that the hitting time robustness , follows.
4.2 Hitting time robustness for trees
We now show that the hitting time robustness is for trees, confirming ˜1.16 in this case. We first restate the theorem.
Theorem 1.11. [Restated, see original statement.]
The hitting time robustness for any tree is .
Fix a tree with at least two edges (otherwise it’s trivial), so . Let , and fix a standard -adversary. For the hitting time, we can assume without loss of generality that the adversary is oblivious, and hence the adversarial random walk has a stationary distribution . We first show that is not too different than the stationary distribution of the uniform random walk .
Lemma 4.3.
For any node , we have .
Proof.
First consider two neighbors and . Let denote the stationary probability of the directed edge . Since is a cut edge, we must have . Therefore,
For that may not be neighbors, an inductive argument gives a bound in terms of the distance :
because for . The lemma will follow because is exactly proportional to the degree. To elaborate, define the ratio . Then
Since some and some , we have for all , as claimed.
Proof of Theorem 1.11.
As in the above lemmas, we set and fix a standard -adversary. We now show that , which will prove the theorem. We consider the oblivious -adversary that maximizes the hitting time from to . It suffices to show for neighbors and , since for arbitrary and in a tree the hitting time is the sum of the hitting times between neighbors.
We now use the idea behind the “essential edge lemma” from Aldous-Fill (Lemma 5.1 in [1]. For neighbors and , consider the graph obtained from by removing all nodes and edges that are strictly closer to than to . Thus, in the node has degree 1, and and are the same in as in . Let and denote the corresponding stationary distributions in , and let and denote hitting times in . Then since a random walk in starting at is forced to go to , we have the following:
| (since ) | ||||
| (using Lemma 4.3) |
The theorem follows.
4.3 Negative result
Definition 4.4.
Fix a simple undirected graph . Let denote the distance between and in . Define the average distance .
Theorem 4.5.
For every regular graph and , there is a strong oblivious -adversary with .
Proof.
Much of this proof is similar to a lower bound on by [4].
The idea is to construct a “lazy” adversary – allowing self loops – where some node has stationary probability at most , and then apply Lemma 2.1. The lazy adversary won’t be a valid adversary because the original graph doesn’t have self loops, so we then convert it into a valid adversary.
Let be -regular. Fix a node such that . We first define a lazy adversary using a Metropolis algorithm. The target stationary distribution will be proportional to the weight defined as . That is, , where . Since for an edge we have , the lazy adversary corresponds to the Metropolis Markov chain where
Since corresponds to a Metropolis algorithm for , the stationary distribution of is . Observe that
Let denote the hitting times for . Since some neighbor of has , by Lemma 2.1 we have
However, is not a valid adversary, since a valid adversary must only place weight on edges of , but has no self loops. To circumvent this issue, distribute all the self loop probabilities proportional to the other edges. That is, let , and define
In other words is a sped-up version of . A random trajectory of can be converted into a random trajectory of by simply deleting consecutive repetitions. Thus, expected hitting times of are at least times those of . Since is a strong oblivious -adversary, we have
Corollary 4.6.
For every regular graph and , there is a strong oblivious -adversary with .
Proof.
Every -regular graph has .
This enables us to conclude:
Theorem 1.12. [Restated, see original statement.]
The hitting time robustness of any constant degree expander is .
Proof.
The unbiased max hit time for any expander is [21]. In order for to be , we need . Also, suffices because of Theorem 1.10 and the fact that constant degree expanders have .
4.4 Tight analysis of path/cycle
Notice that Theorem 1.11 is mainly useful when the unbiased . We now give a tight analysis of the robustnesses of the path and cycle, which have . We focus on the hitting time robustness, as the others will be the same. We now show:
Theorem 4.7.
The hitting time robustness of the path and cycle are .
Proof.
From Theorem 1.11, we have . It therefore suffices to show the upper bound and negative result that .
The idea is to use the essential edge lemma and its weighted version. It says that if is a cut edge, then , where is the connected component of if is removed. The weighted version simply counts the weights of the edges, and normalizes the edge to have weight 1 (or divide by .
For simplicity we analyze the path on nodes labeled to and edges . Consider the hitting time . Let’s see how often edge appears in , where the +1 gets charged to . The contribution of edge to is .
Now compare it when we bias the random walk by . We can do this by assigning . The contribution of edge to the biased hitting time is
Thus the ratio of this to the contribution in the unbiased random walk is at least . Now, over half the contribution comes from , and for such we can lower bound this ratio by . Therefore, if , then the ratio , as we wanted.
5 Cover Time Robustness
In the non-adversarial setting, Matthews [19] upper bounded the cover time in terms of the max hitting time:
Theorem 5.1 ([19]).
.
There are many examples where this is tight up to a constant factor: expanders (and hence most graphs) [5], the complete graph (coupon collecting), two and higher dimensional grids and tori [2, 25], and the balanced binary tree [24].
We observe that in any graph with , we have . This is because Matthews’ proof works the same in the adversarial setting.
Theorem 5.2.
For any adversary, .
Proof.
Matthews [19] proof works in the adversarial setting, so
Corollary 5.3.
In any graph with , we have .
We conjecture that more generally, in any graph, .
References
- [1] D. Aldous and J. A. Fill. Reversible Markov chains and random walks on graphs, 2002. Unfinished monograph, recompiled 2014, available at http://www.stat.berkeley.edu/˜aldous/RWG/book.html.
- [2] D.J. Aldous. On the time taken by random walks on finite groups to visit every state. Z. Wahrscheinlichkeitstheorie verw Gebiete, 62:361–374, 1983.
- [3] C. Avin, M. Koucky, and Z. Lotker. Cover time and mixing time of random walks on dynamic graphs. Random Structures and Algorithms, 52:576–596, 2018. doi:10.1002/RSA.20752.
- [4] Y. Azar, A. Z. Broder, A. R. Karlin, N. Linial, and S. J. Phillips. Biased random walks. Combinatorica, 16:1–18, 1996. doi:10.1007/BF01300124.
- [5] A.Z. Broder and A.R. Karlin. Bounds on the cover time. Journal of Theoretical Probability, 2:101–120, 1989.
- [6] S. Cannon, J. J. Daymude, D. Randall, and A.W. Richa. A Markov chain algorithm for compression in self-organizing particle systems. In Proceedings of the 2016 Symposium on Principles of Distributed Computing, pages 279–288, 2016.
- [7] B. Chin, A. Moitra, E. Mossel, and C. Sandon. The power of an adversary in Glauber dynamics. In The Thirty Seventh Annual Conference on Learning Theory, volume 247 of Proceedings of Machine Learning Research, pages 1102–1124, 2024. URL: https://proceedings.mlr.press/v247/chin24a.html.
- [8] R.V. Craiu, L. Gray, K. Latuszynski, N. Madras, G.O. Roberts, and J.S. Rosenthal. Stability of adversarial Markov chains, with an application to adaptive MCMC algorithms. Annals of Applied Probability, 25:3592–3623, 2015.
- [9] D. Doron, D. Moshkovitz, J. Oh, and D. Zuckerman. Almost Chor-Goldreich sources and adversarial random walks. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, pages 1–9, 2023.
- [10] D. Doron, D. Moshkovitz, J. Oh, and D. Zuckerman. Online condensing of unpredictable sources via random walks. In Proceedings of the 40th Computational Complexity Conference, volume 339 of LIPIcs, pages 30:1–30:17, 2025. doi:10.4230/LIPIcs.CCC.2025.30.
- [11] D.J. Hartfiel. On the limiting set of stochastic products . Proceedings of the American Mathematical Society, 81:201–206, 1981.
- [12] D.J. Hartfiel. Markov Set-Chains, volume 1695 of Lecture Notes in Mathematics. Springer-Verlag, 1998.
- [13] J. Haslegrave, T. Sauerwald, and J. Sylvester. Time dependent biased random walks. ACM Transactions on Algorithms, 18, 2022.
- [14] J.J. Hunter. Mixing times with applications to perturbed markov chains. Linear Algebra and Its Applications, 417:108–123, 2006.
- [15] S. Kania, D. Aristoff, and D.M. Zuckerman. Riteweight: Randomized iterative trajectory reweighting for steady-state distributions without discretization error. Technical report, Arxiv, 2024. arxiv:2401.05597.
- [16] D.A. Levin and Y. Peres. Markov Chains and Mixing Times. American Mathematical Society, 2nd edition, 2017.
- [17] S. Li, B. Dutta, S. Cannon, J. J. Daymude, R. Avinery, E. Aydin, A. W. Richa, D. I. Goldman, and D. Randall. Programming active cohesive granular matter with mechanically induced phase changes. Science Advances, 7:eabe8494, 2021.
- [18] Y. Liu. Perturbation bounds for the stationary distributions of Markov chains. SIAM Journal on Matrix Analysis and Applications, 33:1057–1074, 2012. doi:10.1137/110838753.
- [19] P. Matthews. Covering problems for Markov chains. Annals of Probability, 16:1215–1228, 1988.
- [20] L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical Report Technical Report 1999-66, Stanford InfoLab, 1999.
- [21] R. Rubinfeld. The cover time of a regular expander is O(n log n). Information Processing Letters, 35:49–51, 1990. doi:10.1016/0020-0190(90)90173-U.
- [22] T. Sauerwald and L. Zanetti. Random walks on dynamic graphs: Mixing times, hitting times, and return probabilities. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming, volume 132 of LIPIcs, pages 93:1–93:15, 2019. doi:10.4230/LIPIcs.ICALP.2019.93.
- [23] D. Vial and V.G. Subramanian. Restart perturbations for reversible markov chains: Trichotomy and pre-cutoff equivalence. Random Structures and Algorithms, 66, 2025.
- [24] D. Zuckerman. Covering times of random walks on bounded degree trees and other graphs. Journal of Theoretical Probability, 2:147–157, 1989.
- [25] D. Zuckerman. A technique for lower bounding the cover time. SIAM Journal on Discrete Mathematics, 5:254–259, 1992.
