Markov Chain Robustness

Zuckerman, David

doi:10.4230/LIPIcs.ITCS.2026.118

University of Texas at Austin, TX, USA

Abstract

When a Markov chain models nature or social interactions, it is likely not followed exactly, but only approximately. We therefore introduce several notions of robustness for a Markov chain $P$. Our standard adversary can dynamically change transition probabilities of $P$ by a factor of $1\pm\varepsilon$, and our strong adversary can completely control each transition independently with probability $\varepsilon$, as in a model by Azar, Broder, Karlin, Linial, and Phillips [4]. These adversaries are equivalent up to constant factors if the degrees are constant. Our adversarial chains need not converge.

We define and prove various robustness properties of a reversible chain $P$ , i.e., a random walk on a connected undirected graph $G$ . Let $d$ be the maximum degree, $\Delta$ the diameter, $\pi$ the stationary distribution, and $t_{\mathrm{mix}}$ the mixing time.

1. We define a natural analogue $\pi^{+}(S)$ that upper bounds limiting frequencies in a set $S$ in the adversarial chain. We show that if $\varepsilon=O(1/\sqrt{dt_{\mathrm{up}}})$, where $t_{\mathrm{up}}$ is a variant of the mixing time, then $\pi^{+}(S)=O(\pi(S)^{1-\alpha})$ for any $\alpha>0$.

2. We define the mixing time robustness as the largest $\varepsilon$ such that the approximate mixing time increases by only a constant factor, and prove that it is $\Omega(1/\sqrt{dt_{\mathrm{mix}}})$.

3. We define the hitting time robustness as the largest $\varepsilon$ such that the maximum hitting time increases by only a constant factor, and show that it is $\Omega(1/t_{\mathrm{mix}})$. For trees, we show it is $\Omega(1/\Delta)$.

4. We define the cover time robustness as the largest $\varepsilon$ such that the cover time increases by only a constant factor. We show that in most graphs it’s at least the hitting time robustness.

5. We characterize the mixing, hitting, and cover time robustnesses for constant-degree regular expander graphs up to constant factors. They are $\Theta(1)$, $\Theta(1/\log n)$, and $\Theta(1/\log n)$, respectively.

Keywords and phrases:

Markov chain, random walk, mixing time, hitting time, cover time, robustness, expander graph

2012 ACM Subject Classification:

Theory of computation $\rightarrow$ Random walks and Markov chains

Acknowledgements:

We thank Kuikui Liu and Salil Vadhan for helpful discussions, the anonymous referees for very useful suggestions, and Zhiyang Xun and Dean Doron for helpful comments.

Funding:

Supported in part by NSF Grant CCF-2312573 and a Simons Investigator Award (#409864).

DOI:

10.4230/LIPIcs.ITCS.2026.118

Event:

17th Innovations in Theoretical Computer Science Conference (ITCS 2026)

Editor:

Shubhangi Saraf

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Markov chains are used to model many scenarios, and there is a huge theory about them. In many situations, such as a Markov chain modeling nature or social behavior, it is likely that the Markov chain is not followed exactly, but only approximately. We therefore study how behavior in Markov chains is affected by small perturbations.

For example, Cannon, Daymude, Randall, and Richa [6] showed that a collection of computational elements can compress – gather tightly as in a sphere – by asynchronously each executing a simple program. (See also [17].) They analyzed this by modeling it as a Markov chain. Such compression and other self-organizing behavior also occurs in nature. In nature, one may expect that the algorithm is not followed exactly but only approximately. Would the same compressive behavior arise in this case? For a Markov chain modeling social behavior, it’s even more compelling that the chain wouldn’t be followed exactly but could have adversarial influences.

Another situation where errors arise is in simulating huge physical systems. Such a system may exactly follow a Markov chain; however, researchers simulate such a system by coarsening the exact chain into a discrete approximation. This coarsening necessarily leads to error. How badly does this error affect the simulation? In this case, Kania, Aristoff, and Zuckerman [15] recently developed a randomized algorithm to remove the error, but analyzing the error remains interesting.

There has been some work exploring the power of an adversary on certain Markov chains. Chin, Moitra, Mossel, and Sandon [7] studied the power of an adversary in Glauber dynamics. Glauber dynamics is a type of random walk on $\Sigma^{V}$ , where $V$ is a set of nodes in an underlying graph and $\Sigma$ is a finite alphabet. They proposed an adversary that controls any changes to a subset $A\subseteq V$ with $|A|\leq\varepsilon|V|$ . They studied how such an adversary can affect global features of the system, such as the approximate mixing time.

Doron, Moshkovitz, Oh, and Zuckerman [9, 10] studied the power of an extremely strong adversary for random walks on lossless expanders. They showed that even in the presence of such a strong adversary, a random walk will approximately mix in some sense. However, their methods don’t appear to extend beyond lossless expanders.

We study a model that can be applied to any random walk. Related questions have been studied before, and we adopt some of those frameworks, but we study different questions.

To study errors in Markov chain behavior, Hartfiel [11, 12] proposed a natural model of Markov set-chains defined as follows. Assume the states are $[n]=\{1,2,\dots,n\}$, and let $\mathcal{M}_{n}$ denote the set of $n\times n$ stochastic matrices. The Markov set-chain is a stochastic process $\{X_{t}\}$ defined by a set $\mathcal{P}\subseteq\mathcal{M}_{n}$. (Hartfiel’s definition has $\mathcal{P}$ being compact, but let’s ignore that for now.) At each time step $t$, a transition matrix $P_{t}\in\mathcal{P}$ is chosen arbitrarily. Then $P_{t}$ determines the transitions from $X_{t}$ to $X_{t+1}$: $\Pr[X_{t+1}=j\mid X_{t}=i]=P_{t}(i,j)$. Thus a Markov chain is a special case of a Markov set-chain where the set $\mathcal{P}$ consists of one matrix, the transition matrix of the Markov chain. In general, the transition matrices in $\mathcal{P}$ may have different stationary distributions.
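The definition can be made concrete with a short simulation. The following Python sketch is illustrative only: the function name `simulate_set_chain`, the `adversary(t, history)` callback interface, and the two-matrix set are assumptions introduced here, not notation from Hartfiel or from this paper. The callback stands in for the arbitrary choice of $P_{t}\in\mathcal{P}$ at each step.

```python
import random

def simulate_set_chain(adversary, start, steps, rng=None):
    """Run a Markov set-chain for `steps` transitions.

    adversary(t, history) must return a row-stochastic matrix (a list of
    rows); it plays the role of the arbitrary choice of P_t from the set.
    """
    rng = rng or random.Random()
    history = [start]
    for t in range(steps):
        P_t = adversary(t, history)      # arbitrary choice of P_t
        row = P_t[history[-1]]           # law of X_{t+1} given X_t
        nxt = rng.choices(range(len(row)), weights=row)[0]
        history.append(nxt)
    return history

# Hypothetical set of two stochastic matrices on the states {0, 1}.
SWAP = [[0.0, 1.0], [1.0, 0.0]]
STAY = [[1.0, 0.0], [0.0, 1.0]]

def alternate(t, history):
    # Chooses SWAP at even times and STAY at odd times.
    return SWAP if t % 2 == 0 else STAY

trajectory = simulate_set_chain(alternate, start=0, steps=4,
                                rng=random.Random(0))
```

A callback that always returns the same matrix recovers an ordinary Markov chain, matching the special case noted above.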

Azar, Broder, Karlin, Linial, and Phillips [4] studied a model similar in spirit. More recently, researchers have studied a model very similar to Markov set-chains under the name of dynamic graphs or evolving graphs, e.g., [3, 22].

1.1 The adversary

There appear to be three reasonable choices for the power of an adversary choosing $P_{t}$. (Note that we give the adversary more power by allowing $P_{t}$ to depend on $t$; otherwise we could get three more choices for the adversary’s power.)

1. Oblivious: $P_{t}$ is chosen independently of $\{X_{s}\}$;

2. Markovian: $P_{t}$ may depend on $X_{t}$ but not $X_{s}$ for $s<t$;

3. Non-Markovian: $P_{t}$ may depend on $X_{s}$ for all $s\leq t$.

Note that an oblivious adversary corresponds to a non-homogeneous (or time-dependent) Markov chain. Oblivious and Markovian adversaries are equivalent if $\mathcal{P}$ is closed under “row replacement”: replacing a row in any $P\in\mathcal{P}$ by the corresponding row in another $P^{\prime}\in\mathcal{P}$ . The equivalence follows because the adversary can choose the $i$ th row assuming the Markov chain is in state $i$ . The Markovian and non-Markovian adversaries are equivalent if an adversary is trying to affect the hitting time, but not equivalent for cover time. Hartfiel, [4], and recent work on dynamic graphs implicitly use an oblivious adversary (and don’t discuss the others).

We focus on the most powerful, non-Markovian adversary, although we make some observations about the other adversaries.

Can we understand a Markov set-chain (MSC) in terms of a related Markov chain? Although reversibility is not required for all of our theorems, we usually assume the related chain is reversible, i.e., a random walk on an undirected graph. We further assume the graph is connected; otherwise we could restrict attention to the appropriate connected component. The most important probability distribution related to a Markov chain is the stationary distribution $\pi$. The most important quantities of interest in Markov chains are the mixing time $t_{\mathrm{mix}}$, the time it takes to approach $\pi$; the max hitting time $t_{\mathrm{hit}}$, the maximum expected time to go from one vertex to another; and the cover time $t_{\mathrm{cov}}$, the expected time to visit all nodes from a worst-case starting node.

First, we discuss the stationary distribution. Suppose all chains in an MSC have the same stationary distribution $\pi$ . Will this MSC converge to $\pi$ ? It depends on the adversary. For an oblivious adversary, the MSC will converge to $\pi$ – the usual proof works. For a Markovian adversary, the MSC need not converge, even if all underlying chains are reversible; see Section 3.1.

There are two natural special cases of MSCs that we consider. First, an interval chain is an interval of matrices $\mathcal{P}=[P^{-},P^{+}]\cap\mathcal{M}_{n}$ , where $[P^{-},P^{+}]$ denotes the set of matrices $P$ such that for all states $i, j$ , we have $P^{-}(i,j)\leq P(i,j)\leq P^{+}(i,j)$ . For interval chains, oblivious and Markovian adversaries are equivalent, since $\mathcal{P}$ is closed under row replacement.

For our standard adversary, we consider interval chains of the following form. For $P$ a Markov chain (we identify a Markov chain with its transition matrix) and $\varepsilon>0$, we consider the interval chain $[P^{-}=(1-\varepsilon)P,P^{+}=(1+\varepsilon)P]$. When convenient, we consider the essentially equivalent interval chain $[P^{-}=e^{-\varepsilon}P,P^{+}=e^{\varepsilon}P]$.

Second, [4] introduced a model of biased random walks where a controller biases the random walk. That is, given a Markov chain $P$ and a parameter $\varepsilon>0$ , a controller chooses an arbitrary stochastic matrix $B$ with $\operatorname{supp}(B)\subseteq\operatorname{supp}(P)$ and the biased random walk follows $(1-\varepsilon)P+\varepsilon B$ . That is, with probability $\varepsilon$ the controller biases the walk, but has to respect the graph structure. Haslegrave, Sauerwald, and Sylvester [13] studied this in the time dependent setting. The focus of these papers is on a friendly controller who tries to improve the situation from the simple random walk. Our focus is on an adversarial controller who seeks to make matters worse. This will be our strong adversary.
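As a small illustration of the biased walk $(1-\varepsilon)P+\varepsilon B$, the sketch below builds the perturbed matrix and rejects any $B$ that leaves the support of $P$. The helper names `random_walk_matrix` and `biased_matrix`, the adjacency-list representation, and the three-node path are assumptions made for this example only.

```python
def random_walk_matrix(adj):
    """Transition matrix of the simple random walk on an adjacency list."""
    n = len(adj)
    P = [[0.0] * n for _ in range(n)]
    for u, nbrs in enumerate(adj):
        for v in nbrs:
            P[u][v] += 1.0 / len(nbrs)
    return P

def biased_matrix(P, B, eps):
    """Return (1 - eps) * P + eps * B, checking supp(B) is inside supp(P)."""
    n = len(P)
    for i in range(n):
        if abs(sum(B[i]) - 1.0) > 1e-12:
            raise ValueError("B must be row-stochastic")
        for j in range(n):
            if B[i][j] < 0 or (B[i][j] > 0 and P[i][j] == 0):
                raise ValueError("B must respect the graph structure")
    return [[(1 - eps) * P[i][j] + eps * B[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical example: path 0 - 1 - 2, controller pushes right when it can.
path = [[1], [0, 2], [1]]
P = random_walk_matrix(path)
B = [[0, 1, 0], [0, 0, 1], [0, 1, 0]]
Q = biased_matrix(P, B, 0.2)
```

Here the middle row of `Q` puts mass $(1-\varepsilon)/2$ on the left neighbor and $(1+\varepsilon)/2$ on the right neighbor, while the endpoint rows are unchanged because each endpoint has a single neighbor.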

In other words, in both cases, we have that $(1-\varepsilon)P$ is left alone. A strong adversary may distribute the remaining $\varepsilon$ probability arbitrarily on its support. A standard adversary is limited to changing each probability by a bounded relative amount. However, a strong adversary with parameter $\varepsilon$ is at most as powerful as a $(d_{\mathrm{max}}\varepsilon)$-standard adversary, where $d_{\mathrm{max}}$ denotes the maximum degree. Since we are thinking of $d_{\mathrm{max}}$ as small or constant, we don’t dwell on the difference between the adversaries.
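The comparison between the two adversaries can be checked in one line when $P$ is the simple random walk, so that $P(i,j)=1/d_{i}$ for each neighbor $j$ of $i$. The following worked inequality is a sketch under that assumption; the symbol $Q$ for the strong adversary's matrix is introduced here for illustration.

```latex
% Q = (1-\varepsilon)P + \varepsilon B with supp(B) \subseteq supp(P),
% and P(i,j) = 1/d_i for every neighbor j of i.
\begin{align*}
Q(i,j) &\ge (1-\varepsilon)P(i,j) \;\ge\; (1-d_{\mathrm{max}}\varepsilon)P(i,j),\\
Q(i,j) &\le (1-\varepsilon)P(i,j)+\varepsilon
        \;=\; (1-\varepsilon)P(i,j)+\varepsilon d_{i}P(i,j)
        \;\le\; (1+d_{\mathrm{max}}\varepsilon)P(i,j).
\end{align*}
```

Both bounds hold entrywise, and $Q(i,j)=0$ whenever $P(i,j)=0$, so every move available to a strong $\varepsilon$-adversary lies in the interval chain of a standard adversary with parameter $d_{\mathrm{max}}\varepsilon$.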

One might suggest an even stronger adversary that can take an arbitrary $\varepsilon$ probability and distribute it arbitrarily. In other words, it can choose $P_{t}$ such that each row of $P_{t}$ differs from the corresponding row of $P$ by at most $\varepsilon$ in total variation distance. This adversary generalizes the adversary of [7]. However, if $\varepsilon\geq 1/\deg(v)$ for some node $v$, then such an adversary can force the MSC to avoid hitting $v$. This renders some of our questions impossible, so we don’t consider this adversary.

Main question.

For how large an $\varepsilon$ does the adversarial chain behave similarly to the original chain?

In order to make sense of this question, we need to define suitable notions for MSCs. We study several notions: the robustness of the stationary distribution, and the robustness of three times: the mixing time, the maximum hitting time, and the cover time. We will use strong adversaries to define stationary and mixing time robustness, and standard adversaries for hitting time and cover time robustness. This is because our proofs work for strong adversaries for the first two robustnesses, but strong adversaries are too powerful for the last two robustnesses, as explained below.

1.2 Stationary robustness

We begin with the robustness of the stationary distribution $\pi$ . Although MSCs may not converge to a distribution (see Section 3.1), we can still study the maximum and minimum limiting probabilities of sets. We can define this as follows. For $S\subseteq\Omega$ , let $N_{t}(S)$ denote the number of visits to $S$ up to time $t$ .

Definition 1.1.

For $\varepsilon>0$ , the upper stationary probability

\pi^{+}(S)=\pi_{\varepsilon}^{+}(S)=\sup(\limsup_{t\to\infty}N_{t}(S)/t),

where the outer $\sup$ is over the choice of a strong $\varepsilon$ -adversary.

We could define a lower limit $\pi^{-}$ analogously, but we focus on $\pi^{+}$ .
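To get a feel for $\pi^{+}$, one can look at a single fixed (time-independent) strong adversary. Such an adversary yields an ordinary irreducible Markov chain, whose long-run visit frequencies equal its stationary probabilities, so its stationary mass on $S$ is a lower bound on $\pi^{+}(S)$. The sketch below does this exactly on a path, with an adversary that always pushes right. The function name, the choice of the path on nodes $0,\dots,n-1$, and the pushing strategy are assumptions for illustration; the computation uses detailed balance for birth-death chains and requires $0\leq\varepsilon<1$.

```python
from fractions import Fraction

def pushed_path_stationary(n, eps):
    """Stationary distribution on the path 0..n-1 when a fixed strong
    eps-adversary always pushes right, i.e. Q = (1 - eps) * P + eps * B.
    Requires 0 <= eps < 1."""
    eps = Fraction(eps)

    def right(i):              # Q(i, i+1)
        return Fraction(1) if i == 0 else (1 + eps) / 2

    def left(i):               # Q(i, i-1)
        return Fraction(1) if i == n - 1 else (1 - eps) / 2

    # Detailed balance: w(i+1) * Q(i+1, i) = w(i) * Q(i, i+1).
    w = [Fraction(1)]
    for i in range(n - 1):
        w.append(w[-1] * right(i) / left(i + 1))
    total = sum(w)
    return [x / total for x in w]
```

With `eps = 0` this returns the usual degree-proportional distribution; with `eps > 0` the mass shifts toward the right endpoint, and the right endpoint's probability gives a lower bound on $\pi^{+}$ of that node.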

For our first theorem, we use a variant of the mixing time that we call $t_{\mathrm{up}}$ (see Section 2.3). Like mixing time, it satisfies $t_{\mathrm{up}}=O(\frac{\log(1/\pi_{\mathrm{min}})}{\gamma})$ , where $\gamma$ is the spectral gap. We prove the following.

Theorem 1.2.

Let $G=(V,E)$ be a graph with all degrees at most $d$. For any strong $\varepsilon$-adversary, any starting distribution, and any $S\subseteq V$ with $\pi(S)\leq 1-\alpha$, we have

\pi^{+}(S)<(1+\alpha)\exp(2\varepsilon\sqrt{2dt_{\mathrm{up}}\ln(1/\pi(S))})\pi(S).

In particular, for $\varepsilon=O(1/\sqrt{dt_{\mathrm{up}}})$, we have $\pi^{+}(S)=\pi(S)\cdot\exp(O(\sqrt{\log(1/\pi(S))}))=O(\pi(S)^{1-o(1)})$.

Here the $o(1)$ term goes to 0 as $\pi(S)\to 0$ , so in particular we have $\pi^{+}(S)=O(\pi(S)^{1-\alpha})$ for any $\alpha>0$ .

The dependence of $\varepsilon$ on $t_{\mathrm{up}}$ is essentially best possible, as the path graph shows.

To interpret this theorem, consider the case when the state space $V$ is exponentially large. Then any event with negligible stationary probability in the original chain has negligible stationary probability in this MSC with a nontrivial adversary. For example, [6] show that compression in their model happens with negligible probability; we deduce that this conclusion holds even if their chain is not followed exactly.

We could define stationary robustness as the largest $\varepsilon$ such that the stationary probability analogues increase from $p$ to at most $O(p^{\beta})$ for some $\beta<1$. The choice of $\beta$ or quantification over $\beta$ could a priori change the definition. However, Theorem 1.2 implies a stationary robustness of $\Omega(1/\sqrt{dt_{\mathrm{up}}})$, and at least for the path the choice of $\beta$ doesn’t matter.

We compare Theorem 1.2 to the work of [4]. They only consider a time-independent adversary, although it wouldn’t matter for this theorem. The main difference between their work and ours is that they show a lower bound on the adversarial power, whereas we show an upper bound. Specifically, they showed that for any bounded-degree graph and any node $v$ , some strong $\varepsilon$ -adversary can achieve $\pi^{+}(v)\geq\pi(v)^{1-\Omega(\varepsilon)}$ , and that this is tight for an expander. Haslegrave, Sauerwald, and Sylvester [13] generalized this to more settings. Our Theorem 1.2 upper bounds $\pi^{+}(S)$ for every graph and every strong $\varepsilon$ -adversary.

1.3 Mixing time robustness

We now turn to the robustness of the fundamental times associated with a random walk: mixing time, max hitting time, and cover time. We begin with mixing time, which requires some setup. Let $|\cdot|$ denote variation distance. We will generalize the following for a standard random walk. For a node $i$, let $\Pr_{i}$ denote the probability for a random walk starting at $i$. Let

\delta(t)=\max_{i}\left|\Pr_{i}[X_{t}\in\cdot]-\pi(\cdot)\right|.

Definition 1.3.

Fix a small constant $\alpha$ , say $\alpha=1/4$ . The mixing time $t_{\mathrm{mix}}=t_{\mathrm{mix}}(\alpha)=\min\{t:\delta(t)\leq\alpha\}$ .
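For small graphs, $\delta(t)$ and $t_{\mathrm{mix}}$ can be computed directly from the definition. The sketch below does so with exact rational arithmetic for the simple random walk on an adjacency list. The function names and the `max_t` cutoff are assumptions made for this example; the cutoff exists because on a bipartite graph $\delta(t)$ never drops below the threshold.

```python
from fractions import Fraction

def tv_distance(p, q):
    """Total variation distance between two distributions."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2

def step(p, adj):
    """One step of the simple random walk: returns the row vector p P."""
    out = [Fraction(0)] * len(adj)
    for u, nbrs in enumerate(adj):
        for v in nbrs:
            out[v] += p[u] / len(nbrs)
    return out

def mixing_time(adj, alpha=Fraction(1, 4), max_t=1000):
    """Smallest t with delta(t) <= alpha, or None if not reached by max_t."""
    n = len(adj)
    two_m = sum(len(nbrs) for nbrs in adj)
    pi = [Fraction(len(nbrs), two_m) for nbrs in adj]
    # One point-mass starting distribution per node.
    dists = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for t in range(max_t + 1):
        delta = max(tv_distance(p, pi) for p in dists)
        if delta <= alpha:
            return t
        dists = [step(p, adj) for p in dists]
    return None

# Hypothetical example: the complete graph on 4 nodes.
K4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
t_mix_K4 = mixing_time(K4)
```

On the complete graph $K_{4}$ one has $\delta(t)=\tfrac{3}{4}\cdot 3^{-t}$, so the threshold $\alpha=1/4$ is met at $t=1$.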

To generalize this to MSCs, we define

\delta^{+}(t)=\sup\left(\max_{i}\left|\Pr_{i}[X_{t}\in\cdot]-\pi(\cdot)\right|\right).

Here $\pi$ is the stationary distribution of the original Markov chain, and the sup is over strong $\varepsilon$ -adversaries. Thus $\delta(t)\leq\delta^{+}(t)$ . We can now define the robust mixing time, which will be a workable definition despite the MSC possibly not converging.

Definition 1.4.

Fix a small constant $\alpha$ , say $\alpha=1/4$ . Define the robust mixing time as

t_{\mathrm{mix}}^{+}(\varepsilon)=\min\{t:\delta^{+}(t)\leq\alpha\},

where $\varepsilon$ is the adversary parameter.

Note that $t_{\mathrm{mix}}\leq t_{\mathrm{mix}}^{+}(\varepsilon)$ , and that $t_{\mathrm{mix}}^{+}(\varepsilon)$ may be infinite. This leads to the following definition.

Definition 1.5.

The mixing time robustness is

\rho_{\mathrm{mix}}(C)=\sup\{\varepsilon:t_{\mathrm{mix}}^{+}(\varepsilon)\leq Ct_{\mathrm{mix}}\}.

When $C$ is omitted, we allow $C$ to be an arbitrarily large constant. (This only makes sense when we have a family of graphs, but this applies to big-O notation generally.)

It is not hard to show that the mixing time robustness is $\Omega(1/(t_{\mathrm{mix}}d))$ , where $d$ is the maximum degree; see Proposition 2.12. We achieve the square root of this quantity for constant-degree graphs.

Theorem 1.6.

In a graph with all degrees at most $d$ , the mixing time robustness $\rho_{\mathrm{mix}}=\Omega(1/\sqrt{t_{\mathrm{mix}}d})$ .

This is tight for the path and the cycle. However, for expander graphs, this gives $\Omega(1/\sqrt{\log n})$ . We prove a constant lower bound for constant-degree expanders.

Theorem 1.7.

A $d$ -regular $\gamma$ -spectral expander has mixing time robustness $\Omega(\gamma/\sqrt{d})$ .

1.4 Hitting time robustness

We now turn to some expected stopping times. Let $\operatorname{\mathbb{E}}_{u}$ denote the expectation for a Markov (set) chain starting at $u\in V$ . Let $H_{v}$ denote the hitting time of $v$ : the time to first visit $v$ after time 0. The standard hitting times in an unbiased random walk are $h(u,v)=\operatorname{\mathbb{E}}_{u}H_{v}$ , and the max hitting time is $t_{\mathrm{hit}}=\max_{u\neq v}h(u,v)$ .

Before defining the robust version, we briefly discuss the adversary. Note that if we have a strong $\varepsilon$ -adversary, the robustness for hitting and covering is less than $1/d_{\mathrm{min}}$ , where $d_{\mathrm{min}}$ denotes the minimum degree. That’s because a strong $1/d_{\mathrm{min}}$ -adversary can always avoid a particular node. Due to this and our proofs working better with standard adversaries, we define the robust versions for standard adversaries.

Definition 1.8.

Fix an $\varepsilon>0$ . The robust hitting time $h^{+}(u,v)=\sup(\operatorname{\mathbb{E}}_{u}H_{v})$ , where the sup is over strategies for standard $\varepsilon$ -adversaries. The robust max hitting time is $t_{\mathrm{hit}}^{+}=\max_{u,v}h^{+}(u,v)$ .

Generalizing an observation by [4], the quantities $h^{+}$ are the same for oblivious, Markovian, and non-Markovian adversaries. This is because an adversary’s best strategy at a given node does not depend on the previous history. This basically follows from Markov decision theory.
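Because the best strategy at a node does not depend on the history, the robust hitting times to a fixed target can be approximated by value iteration on the recursion $h^{+}(u)=1+\max_{q}\sum_{j}q(j)h^{+}(j)$, where $q$ ranges over rows allowed to a standard $\varepsilon$-adversary at $u$. The sketch below is a minimal illustration under that assumption, for the simple random walk with $0\leq\varepsilon<1$; the function name, the adjacency-list input, and the fixed iteration count are choices made here, not part of the paper. The inner maximization is solved greedily: start from $(1-\varepsilon)P(u,\cdot)$ and hand the remaining $\varepsilon$ of mass to the neighbors with the largest current values, at most $2\varepsilon P(u,j)$ each.

```python
def robust_hitting_times(adj, target, eps, iters=5000):
    """Approximate h^+(u, target) for every u under a standard eps-adversary.

    adj is an adjacency list of a connected graph; the base chain is the
    simple random walk. Value iteration starts from the all-zero vector.
    """
    n = len(adj)
    h = [0.0] * n
    for _ in range(iters):
        new_h = [0.0] * n
        for u in range(n):
            if u == target:
                continue                      # hitting time of target is 0
            p = 1.0 / len(adj[u])
            # Neighbors with the largest current value get extra mass first.
            order = sorted(adj[u], key=lambda j: h[j], reverse=True)
            budget = eps                      # mass left after (1 - eps) * P
            total = 0.0
            for j in order:
                extra = min(budget, 2 * eps * p)   # cap is (1 + eps) * P
                budget -= extra
                total += ((1 - eps) * p + extra) * h[j]
            new_h[u] = 1.0 + total
        h = new_h
    return h

# Hypothetical example: path 0 - 1 - 2 with target node 2.
path3 = [[1], [0, 2], [1]]
plain = robust_hitting_times(path3, target=2, eps=0.0)
robust = robust_hitting_times(path3, target=2, eps=0.5)
```

On this three-node path the recursion can be solved by hand: from the middle node $h^{+}=(3+\varepsilon)/(1-\varepsilon)$, which is 3 for $\varepsilon=0$ and 7 for $\varepsilon=1/2$, and the far endpoint adds one more step.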

We focus on $t_{\mathrm{hit}}^{+}$ so we have one quantity rather than $n^{2}$ quantities. We can now define:

Definition 1.9.

The hitting time robustness is

\rho_{\mathrm{hit}}(C)=\sup\{\varepsilon:t_{\mathrm{hit}}^{+}(\varepsilon)\leq Ct_{\mathrm{hit}}\}.

When $C$ is omitted, we allow $C$ to be an arbitrarily large constant.

It is not hard to show that the hitting time robustness is $\Omega(1/t_{\mathrm{hit}})$ ; see Proposition 2.12. We improve this bound dramatically.

Theorem 1.10.

The hitting time robustness $\rho_{\mathrm{hit}}=\Omega(1/t_{\mathrm{mix}})$ .

We can improve this bound for trees. Let $\Delta$ denote the diameter.

Theorem 1.11.

The hitting time robustness for any tree is $\Omega(1/\Delta)$ .

Note that $\Delta=O(t_{\mathrm{mix}})$ , and is often significantly less. For example, for the line and cycle on $n$ nodes, $\Delta=\Theta(n)$ but $t_{\mathrm{mix}}=\Theta(n^{2})$ . Even more dramatically, for the balanced binary tree $t_{\mathrm{mix}}=\Theta(n\log n)$ but $\Delta=\Theta(\log n)$ .

For constant degree expanders, Theorem 1.10 gives a bound of $\Omega(1/\log n)$ . We show that this is tight for expanders by giving a matching upper bound, which is a negative result.

Theorem 1.12.

The hitting time robustness of any constant degree expander is $\rho_{\mathrm{hit}}=\Theta(1/\log n)$ .

1.5 Cover time robustness

We now turn to the cover time. Let the random variable $T$ denote the first time the random walk has visited all nodes. We define the cover time $t_{\mathrm{cov}}=\max_{u}\operatorname{\mathbb{E}}_{u}T$ .

Definition 1.13.

Fix an $\varepsilon>0$ . The robust cover time $t_{\mathrm{cov}}^{+}=\sup(\operatorname{\mathbb{E}}_{u}T)$ , where the sup is over strategies for standard $\varepsilon$ -adversaries.

Note that Markovian and non-Markovian adversaries do not appear to be equivalent, since the prior history is very relevant for the cover time. We now define:

Definition 1.14.

The cover time robustness is

\rho_{\mathrm{cov}}(C)=\sup\{\varepsilon:t_{\mathrm{cov}}^{+}(\varepsilon)\leq Ct_{\mathrm{cov}}\}.

When $C$ is omitted, we allow $C$ to be an arbitrarily large constant.

It is not hard to show that the cover time robustness is $\Omega(1/t_{\mathrm{cov}})$ ; see Proposition 2.12. We improve this in many cases. We always have $t_{\mathrm{cov}}=O(t_{\mathrm{hit}}\log n)$ ; we can improve the cover time robustness whenever this is tight up to a constant factor.

Theorem 1.15.

In any graph with $t_{\mathrm{cov}}=\Theta(t_{\mathrm{hit}}\log n)$ , we have $\rho_{\mathrm{cov}}=\Omega(\rho_{\mathrm{hit}})$ .

There are many graphs where $t_{\mathrm{cov}}=\Theta(t_{\mathrm{hit}}\log n)$ , including expanders (and hence most graphs) [5], the complete graph (coupon collecting), two and higher dimensional grids and tori [2, 25], and the balanced binary tree [24].

1.6 Examples

It’s instructive to see how our bounds apply to natural examples.

1.6.1 Path/cycle

The path is the graph on $[n]$ with $n-1$ edges $\{i,i+1\}$ ; the cycle contains the additional edge $\{1,n\}$ . In the non-adversarial setting, the main quantities are similar for the two graphs. The mixing time, max hitting time, and cover time are all $\Theta(n^{2})$ .

To understand the adversarial setting, we focus on the path. The worst a strong adversary can do is set all $P_{t}(i,i+1)=(1+\varepsilon)/2$ and all $P_{t}(i,i-1)=(1-\varepsilon)/2$ . In this case, the hitting times will increase by a factor of about $((1+\varepsilon)/(1-\varepsilon))^{n}$ . Therefore, the hitting time robustness is $\Theta(1/n)=\Theta(1/\sqrt{t_{\mathrm{mix}}})$ . All other robustnesses have the same order of magnitude. Thus, Theorem 1.6 is tight for mixing time robustness, but Theorem 1.10 is not tight for hitting time robustness. See Section 4.4 for details.
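The growth of hitting times on the biased path can be computed exactly. The sketch below uses the standard edge-crossing recurrence for birth-death chains: if $e_{i}$ is the expected time to advance one edge toward the target from distance $i$ to the far endpoint, then $e_{0}=1$ and $e_{i}=(1+q\,e_{i-1})/p$, where $p=(1-\varepsilon)/2$ is the probability of stepping toward the target and $q=(1+\varepsilon)/2$ the probability of stepping away. Up to relabeling the nodes, this is the pushing strategy described above. The function name and the exact-arithmetic choice are assumptions for illustration, and $0\leq\varepsilon<1$ is required.

```python
from fractions import Fraction

def biased_path_hitting_time(n, eps):
    """Expected time to cross an n-node path from the far endpoint to the
    target endpoint when every interior step moves away from the target
    with probability (1 + eps) / 2. Requires 0 <= eps < 1."""
    eps = Fraction(eps)
    toward = (1 - eps) / 2
    away = (1 + eps) / 2
    e = Fraction(1)          # the far endpoint has only one neighbor
    total = e
    for _ in range(n - 2):   # remaining n - 2 edges
        e = (1 + away * e) / toward
        total += e
    return total

# Slowdown relative to the unbiased walk when eps is of order 1/n.
n = 50
slowdown = biased_path_hitting_time(n, Fraction(1, n)) / biased_path_hitting_time(n, 0)
```

For `eps = 0` the function returns $(n-1)^{2}$. Taking $\varepsilon$ of order $1/n$ keeps the slowdown bounded by a constant, in line with the $\Theta(1/n)$ hitting time robustness stated above, whereas a constant $\varepsilon$ makes the result grow geometrically in $n$.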

1.6.2 Expanders

Here we consider $d$ -regular expander graphs, where we think of $d=O(1)$ . In the non-adversarial setting, the mixing time $t_{\mathrm{mix}}=\Theta(\log n)$ , the max hitting time $t_{\mathrm{hit}}=\Theta(n)$ [21], and the cover time $t_{\mathrm{cov}}=\Theta(n\log n)$ [5, 21]. We actually prove these last two theorems in what we believe is a simpler way.

Our theorems imply tight bounds on the robustness notions for constant-degree regular expanders. Specifically, Theorem 1.7 shows that the mixing time robustness is a constant, while Theorem 1.12 and Theorem 1.15 show that the hitting and cover time robustnesses are $\Theta(1/\log n)$.

We remark that previous work shows that lossless expanders have robustness properties beyond what we consider here. A $d$ -regular graph is a lossless expander if sets that are not too large expand by a factor of $(1-\eta)d$ where $\eta<1/2$ . Doron, Moshkovitz, Oh, and Zuckerman [9, 10] studied random walks where a non-Markovian adversary chooses steps according to a Chor-Goldreich (CG) source, and even generalizations of this. In a CG-source, in a $d$ -regular graph, an adversary can choose any distribution for the next step such that no edge is chosen with probability more than $d^{-\delta}$ for some parameter $\delta>0$ . They show that even this biased walk mixes in the sense that the final node has entropy $\log n-O(1)$ . However, their methods do not seem to extend to standard expander graphs.

1.7 Weak randomness

As another perspective, there has been a lot of research investigating the use of weak randomness in computing. In those situations, it is natural to allow computation on the weak randomness, for example to purify it into high-quality randomness. This work disallows such preprocessing, which is particularly natural when the random process models nature or social interactions.

1.8 Techniques

We prove Theorem 1.2 and Theorem 1.6 about stationarity and mixing time robustness by generalizing a potential function argument by Haslegrave, Sauerwald, and Sylvester [13]. Their potential function uses squares to prove a lower bound. We use a potential function based on smaller powers to prove an upper bound. In particular, we need an inequality going in the opposite direction. Specifically, for $x=(x_{1},\dots,x_{d})\in[0,\infty)^{d}$ define the $\ell$-power mean $M_{\ell}(x)=((\sum_{i}x_{i}^{\ell})/d)^{1/\ell}$, $M_{\infty}(x)=\max_{i}x_{i}$, and the $\varepsilon$-max-avg operator $\operatorname{MA}_{\varepsilon}(x)=\varepsilon M_{\infty}(x)+(1-\varepsilon)M_{1}(x)$. We use Hölder’s inequality and other ideas to bound $\operatorname{MA}_{\varepsilon}(x)$ in terms of $M_{\ell}(x)$ for some $\ell$ just slightly larger than 1. We also simplify the [13] method somewhat, avoiding the use of trajectory trees.
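The operators in this paragraph are easy to experiment with numerically. The sketch below implements $M_{\ell}$ and $\operatorname{MA}_{\varepsilon}$ and evaluates them on a sample vector. The function names and the sample values are assumptions for illustration. The relations noted afterwards are elementary facts about power means, not the paper's inequality, whose precise form is not stated in this section.

```python
def power_mean(x, ell):
    """The ell-power mean M_ell(x) of a nonnegative vector x, for ell >= 1."""
    return (sum(v ** ell for v in x) / len(x)) ** (1.0 / ell)

def max_avg(x, eps):
    """The eps-max-avg operator: eps * max(x) + (1 - eps) * mean(x)."""
    return eps * max(x) + (1 - eps) * (sum(x) / len(x))

# Hypothetical sample vector with d = 4 entries.
x = [1.0, 2.0, 3.0, 4.0]
d = len(x)
ell = 1.5
m1 = power_mean(x, 1)        # arithmetic mean
m_ell = power_mean(x, ell)
m_inf = max(x)
ma = max_avg(x, 0.1)
```

For any nonnegative vector, $M_{1}\leq M_{\ell}\leq M_{\infty}\leq d^{1/\ell}M_{\ell}$, and $\operatorname{MA}_{\varepsilon}$ always lies between $M_{1}$ and $M_{\infty}$. Comparing `ma` with `m_ell` for various `eps` and `ell` shows how small $\varepsilon$ must be for a power mean with $\ell$ close to 1 to dominate.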

We prove Theorem 1.7 about mixing time robustness of expanders by adapting the proof that the second largest eigenvalue determines convergence.

We prove Theorem 1.10 about hitting time robustness by showing that in a standard random walk, there’s the “right” probability of hitting a node within twice the separation time $t_{\mathrm{sep}}=O(t_{\mathrm{mix}})$ . Specifically, for $h_{v}=\operatorname{\mathbb{E}}_{\pi}[H_{v}]$ , we show that for all $u,v\in V$ , we have $\Pr_{u}[H_{v}\leq 2t_{\mathrm{sep}}]=\Omega(t_{\mathrm{sep}}/h_{v})$ . We then show that this probability doesn’t decline much even in the robust setting. Dividing the random walk into epochs of length $2t_{\mathrm{sep}}$ , we conclude that the hitting time doesn’t increase much. We further use this technique to give a simpler proof of the max hitting time for expanders, which will be in the final version of this paper. Given its utility for both of these theorems, we believe that this technique could be useful elsewhere.

We prove Theorem 1.11 by first observing that we can take the adversary to be oblivious, so the corresponding adversarial walk has a stationary distribution. We then compare the stationary distributions, and apply the “essential edge lemma” from Aldous-Fill about hitting times across cut edges.

We prove Theorem 1.12, the negative result giving a tight bound for expanders, by giving a target distribution and modifying the Metropolis algorithm. Much of this argument is similar to a lower bound on $\pi^{+}(S)$ by [4].

Theorem 1.15 about cover time robustness follows from Matthews’ technique for bounding the cover time in terms of the hitting time.
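For orientation, Matthews' upper bound for an ordinary random walk states that $t_{\mathrm{cov}}\leq t_{\mathrm{hit}}\,(1+\tfrac{1}{2}+\dots+\tfrac{1}{n-1})$. The following is only a sketch of how a deduction of this kind can go; it assumes that the same bound carries over to the adversarial walk, which is an assumption made here and is not checked against the paper's proof.

```latex
% H_{n-1} = 1 + 1/2 + ... + 1/(n-1) = \Theta(\log n).
% Assumption: Matthews' argument also gives t_cov^+ <= H_{n-1} t_hit^+.
\begin{align*}
t_{\mathrm{cov}}^{+}(\varepsilon)
  &\le H_{n-1}\,t_{\mathrm{hit}}^{+}(\varepsilon)
  && \text{(assumed adversarial Matthews bound)}\\
  &\le H_{n-1}\,C\,t_{\mathrm{hit}}
  && \text{(for } \varepsilon<\rho_{\mathrm{hit}}(C)\text{)}\\
  &= O(C\,t_{\mathrm{cov}})
  && \text{(when } t_{\mathrm{cov}}=\Theta(t_{\mathrm{hit}}\log n)\text{)}.
\end{align*}
```

Under these assumptions, any $\varepsilon$ below the hitting time robustness also keeps the cover time within a constant factor, which is the shape of the statement in Theorem 1.15.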

1.9 Related work

We have not found any previous work giving good bounds on our notions of robustness, except for lossless expanders. Doron, Moshkovitz, Oh, and Zuckerman [9, 10] analyzed extremely strong non-Markovian adversaries for lossless expanders, but their methods don’t apply more generally. They also didn’t analyze hitting or cover times.

In the model introduced by [4] and in follow-up works, the focus is on a controller seeking to minimize hitting and cover times, and they show lower bounds on increasing the probability of being in sets. For example, Haslegrave, Sauerwald, and Sylvester [13] improved a result of [4] and proved that an $\varepsilon$-controller can increase the probability of being in a set from $p$ to approximately $p^{1-\varepsilon}$. They used this to minimize hitting and cover times, for example showing how to reduce the cover time of an expander to $O(n\log\log n)$. This is different from our work, where we view the controller as an adversary, so hitting times and cover times increase rather than decrease. Moreover, the quantification is different: they show that a suitable adversary exists, whereas we show that all adversaries are not too bad.

Chin, Moitra, Mossel, and Sandon [7] studied the power of an adversary in Glauber dynamics. Glauber dynamics is a type of random walk on $\Sigma^{V}$ , where $V$ is a set of nodes in an underlying graph and $\Sigma$ is a finite alphabet. They proposed an adversary that controls any changes to a subset $A\subseteq V$ with $|A|\leq\varepsilon|V|$ . They studied how such an adversary can affect global features of the system, such as the approximate mixing time. Their methods seem tailored to Glauber dynamics and not generalizable.

There is also research analyzing how the stationary distribution of a Markov chain is affected by perturbations. For example, Liu ([18], Theorem 2.2) showed that if $P$ is perturbed to $\tilde{P}$ , which has stationary distribution $\tilde{\pi}$ , then

|\pi-\tilde{\pi}|\leq t_{\mathrm{hit}}^{2}|P-\tilde{P}|,

where $|M|$ for a matrix $M$ is the maximum $\ell_{1}$ -norm of any row. Thus, the best bound on robustness that this could give is $\Omega(1/t_{\mathrm{hit}}^{2})$ , which is much weaker than what we obtain. Moreover, our robustness is for Markov set chains, which are much more general than Markov chains.

Hartfiel [12] studies when Markov set-chains converge, and movement between transient states and absorbing states and the like. He doesn’t address the types of questions we ask.

Avin, Koucky, and Lotker [3] study random walks on evolving graphs. Here an adversary is allowed to completely change the graph, i.e., transition matrix, at every time step. However, the adversary doesn’t know the location of the random walker; in other words, the adversary decides the graphs in advance. They show that the cover time and mixing time could be exponentially larger than the cover times and mixing times of the individual graphs, but are only polynomially larger if the random walk is made lazy. This model is quite different from ours.

Hunter [14] studies a related model, what in our terminology is a strong oblivious adversary. He defines $\eta$ to be the “random target time” – the time for a random walk to hit a random node chosen according to $\pi$ – but confusingly calls this the mixing time. Really, $\eta$ is more closely related to $t_{\mathrm{hit}}$ ; see [16] for more on the random target time. He then shows that the adversarial walk has a stationary distribution at most $\eta\varepsilon$ -far from $\pi$ in variation distance. This is much weaker than our results.

There are other adversarial variants of Markov chains that don’t appear relevant for us. For example, the PageRank algorithm [20] and variants (e.g., [23]) allow perturbations where the adversary need not respect the graph structure, which leads to very different results. For another example, [8] define a model that they call “adversarial Markov chains” on an infinite metric space. In their model, transitions are followed exactly except when the chain is in some prespecified bounded subset, in which case an adversary can make arbitrary bounded jumps. They study whether such an adversarial chain remains bounded in probability.

1.10 Conjectures and Open Problems

We conjecture the following.

Conjecture 1.16.

The hitting time robustness is $\Omega(1/\Delta)$ , where $\Delta$ denotes the diameter.

Since $\Delta=O(t_{\mathrm{mix}})$ , this would improve Theorem 1.10. Theorem 1.11 establishes this conjecture for trees, which is tight for the path and cycle.

Conjecture 1.17.

For all graphs, the cover time robustness is the hitting time robustness, up to a constant factor.

Conjecture 1.18.

We can replace $t_{\mathrm{up}}$ in Theorem 1.2 with $t_{\mathrm{mix}}$ .

What do we need to know about a graph to determine its robustnesses? Are the different robustnesses efficiently computable, perhaps up to constant factors?

Another class of interesting open problems is to determine the various robustnesses for natural graphs, such as different dimensional grids, hypercubes, and balanced trees.

2 Preliminaries

2.1 Random walk basics

We will assume throughout that any undirected graph $G=(V,E)$ is connected. We write $n=|V|$, $m=|E|$, $\Gamma(v)$ for the set of neighbors of node $v\in V$, and $d_{v}=|\Gamma(v)|$ for the degree of $v$. A random walk on $G$ is a sequence of random variables $(X_{0},X_{1},\dots)$ where each $X_{t}\in V$ and $X_{t+1}$ is uniform on $\Gamma(X_{t})$. We use $P$ to denote the transition matrix of the random walk. Letting $p_{t}$ denote the probability distribution of $X_{t}$, viewed as a row vector, we have $p_{t+1}=p_{t}P$. If $G$ is not bipartite, then $p_{t}$ converges to the stationary distribution $\pi$ given by $\pi(v)=d_{v}/(2m)$. Even if $G$ is bipartite, $(p_{t}+p_{t+1})/2$ converges to $\pi$.
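As a small sanity check of these definitions, the following Python sketch builds the transition matrix of the simple random walk on a tiny graph and confirms that $\pi(v)=d_{v}/(2m)$ is fixed by one step of the walk. The example graph and the helper name `step` are hypothetical choices made for illustration only.

```python
from fractions import Fraction

# Hypothetical example graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
n = 4

neighbors = {v: [] for v in range(n)}
for a, b in edges:
    neighbors[a].append(b)
    neighbors[b].append(a)

# Transition matrix P of the simple random walk: P[v][w] = 1/d_v for w in Gamma(v).
P = [[Fraction(0)] * n for _ in range(n)]
for v in range(n):
    for w in neighbors[v]:
        P[v][w] = Fraction(1, len(neighbors[v]))

def step(p, P):
    """One step of the walk on a row vector: p_{t+1} = p_t P."""
    return [sum(p[v] * P[v][w] for v in range(len(p))) for w in range(len(p))]

m = len(edges)
pi = [Fraction(len(neighbors[v]), 2 * m) for v in range(n)]
pi_next = step(pi, P)  # equals pi when pi is stationary
```

Exact rational arithmetic is used so that stationarity can be checked with equality rather than a tolerance.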

Let $\operatorname{\mathbb{E}}_{u}$ and $\Pr_{u}$ denote the expectation and probability, respectively, for a random walk starting at $u\in V$ . Let $H_{v}$ denote the hitting time of $v$ : the time to first visit $v$ after time 0. The standard hitting times in an unbiased random walk are $h(u,v)=\operatorname{\mathbb{E}}_{u}H_{v}$ , and the max hitting time is $t_{\mathrm{hit}}=\max_{u\neq v}h(u,v)$ .

The following lemma is well known.

Lemma 2.1.

The expected return time $h(v,v)=1/\pi(v)$ .
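Lemma 2.1 can be checked numerically on a small example. The sketch below solves the first-step equations $h(u,v)=1+\frac{1}{d_{u}}\sum_{w\in\Gamma(u)}h(w,v)$ exactly and compares the resulting return time with $1/\pi(v)=2m/d_{v}$. The graph and the function names are hypothetical, and the linear solve is a plain Gauss-Jordan elimination rather than anything taken from the paper.

```python
from fractions import Fraction

def hitting_times_to(target, neighbors):
    """Solve h(u) = 1 + average of h over Gamma(u) for u != target, with h(target) = 0."""
    nodes = [u for u in neighbors if u != target]
    idx = {u: i for i, u in enumerate(nodes)}
    k = len(nodes)
    # Augmented matrix for (I - P restricted to non-target nodes) h = 1.
    A = [[Fraction(0)] * (k + 1) for _ in range(k)]
    for u in nodes:
        i = idx[u]
        A[i][i] += 1
        A[i][k] = Fraction(1)
        for w in neighbors[u]:
            if w != target:
                A[i][idx[w]] -= Fraction(1, len(neighbors[u]))
    # Gauss-Jordan elimination over exact rationals.
    for c in range(k):
        piv = next(r for r in range(c, k) if A[r][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        A[c] = [x / A[c][c] for x in A[c]]
        for r in range(k):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    h = {u: A[idx[u]][k] for u in nodes}
    h[target] = Fraction(0)
    return h

def return_time(v, neighbors):
    """h(v,v): one step to a uniform neighbor w, then the hitting time h(w,v)."""
    h = hitting_times_to(v, neighbors)
    return 1 + sum(h[w] for w in neighbors[v]) / len(neighbors[v])

# Hypothetical example: a triangle 0-1-2 with a pendant node 3 attached to node 2.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
m = sum(len(ns) for ns in neighbors.values()) // 2
```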

2.2 Expander graphs

Expanders are graphs where every subset of nodes has many neighbors. We will use the standard spectral characterization, which is equivalent for constant expansion. For simplicity, assume that $G$ is regular. Denote the eigenvalues of $P$ by $1={\lambda}_{1}>{\lambda}_{2}\geq\ldots\geq{\lambda}_{n}\geq-1$. The last inequality holds with equality only when $G$ is bipartite. Let us assume that $G$ is not bipartite, e.g., by adding self loops. Let

\lambda=\max({\lambda}_{2},-{\lambda}_{n})=\max_{i\neq 1}|{\lambda}_{i}|

denote the second largest eigenvalue in absolute value. Let $\gamma\stackrel{{\scriptstyle\rm def}}{{=}}1-\lambda$ denote the spectral gap. We call $G$ a $\gamma$ -spectral expander.

2.3 Mixing time variants

Recall the definition of mixing time, Definition 1.3. The choice of $\alpha$ only affects the mixing time up to a constant factor. Specifically:

Lemma 2.2.

In any graph, $t_{\mathrm{mix}}(\alpha)\leq\lceil\log_{2}1/\alpha\rceil t_{\mathrm{mix}}(1/4)$ .

For this and much more on the mixing time see, for example, the book by Levin, Peres, and Wilmer [16].

Up to a logarithmic factor, the reciprocal of the spectral gap $\gamma$ characterizes the mixing time. Specifically, it is known (e.g., [16] Theorems 12.3 and 12.4) that

	$\displaystyle t_{\mathrm{mix}}(\alpha)$	$\displaystyle\leq$	$\displaystyle(1/\gamma)\log(1/(\alpha\pi_{\mathrm{min}}))$
	$\displaystyle t_{\mathrm{mix}}(\alpha)$	$\displaystyle\geq$	$\displaystyle(1/\gamma-1)\log(1/(2\alpha))$

We will use a variant of the mixing time called the separation time $t_{\mathrm{sep}}$ .

Definition 2.3.

The separation distance is:

s(t)=\max_{v\in V}(1-p_{t}(v)/\pi(v)).

In other words, it’s the minimum $\beta$ such that all nodes $v$ have $p_{t}(v)\geq(1-\beta)\pi(v)$ .

We can now define separation time in the natural way.

Definition 2.4.

Fix a small constant $\alpha$ , say $\alpha=1/4$ . Define the separation time

t_{\mathrm{sep}}=t_{\mathrm{sep}}(\alpha)=\min\{t:s(t)\leq\alpha\}.

Lemma 2.5.

In any graph, $t_{\mathrm{sep}}=\Theta(t_{\mathrm{mix}})$ .

For the proof, see Aldous-Fill [1]. Briefly, it’s immediate that $\delta(t)\leq s(t)$ . Aldous-Fill (Lemma 4.7 in [1]) show that $s(2t)\leq 1-(1-\delta(t))^{2}$ .

We further define a variant of the separation distance briefly discussed in Aldous-Fill in Equation 4.14. They left it unnamed, so we name it.

Definition 2.6.

The upper separation distance is:

s_{\mathrm{up}}(t)=\max_{v\in V}(p_{t}(v)/\pi(v)-1).

In other words, it’s the minimum $\beta$ such that all nodes $v$ have $p_{t}(v)\leq(1+\beta)\pi(v)$ .

We can now define upper separation time analogously.

Definition 2.7.

Fix a small constant $\alpha$ , say $\alpha=1/4$ . Define the upper separation time

t_{\mathrm{up}}=t_{\mathrm{up}}(\alpha)=\min\{t:s_{\mathrm{up}}(t)\leq\alpha\}.

While we have $\delta(t)\leq s_{\mathrm{up}}(t)$ , we can’t bound $s_{\mathrm{up}}(2t)$ in terms of $\delta(t)$ ; see Aldous-Fill’s Example 4.9. Nevertheless, the upper bounds of $t_{\mathrm{mix}}$ in terms of the spectral gap $\gamma$ also hold for $t_{\mathrm{sep}}$ and $t_{\mathrm{up}}$ , just by examining the proof.

Lemma 2.8.

We have $t_{\mathrm{sep}}(\alpha),t_{\mathrm{up}}(\alpha)\leq\frac{-\log(\alpha\pi_{\mathrm{min}})}{\gamma}$.
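To make the separation distance, the upper separation distance, and the two associated times concrete, here is a minimal Python sketch that computes them by brute force on a small non-bipartite graph. It takes the worst case over start nodes, which is an assumption on our part since the definitions above leave the start node implicit; the graph and all function names are hypothetical.

```python
def step(p, P):
    """One step of the walk on a row vector: p_{t+1} = p_t P."""
    n = len(p)
    return [sum(p[v] * P[v][w] for v in range(n)) for w in range(n)]

def distances(t, P, pi):
    """Worst case over start nodes of (variation distance, s(t), s_up(t))."""
    n = len(pi)
    tv = sep = up = 0.0
    for u in range(n):
        p = [1.0 if v == u else 0.0 for v in range(n)]
        for _ in range(t):
            p = step(p, P)
        tv = max(tv, 0.5 * sum(abs(p[v] - pi[v]) for v in range(n)))
        sep = max(sep, max(1 - p[v] / pi[v] for v in range(n)))
        up = max(up, max(p[v] / pi[v] - 1 for v in range(n)))
    return tv, sep, up

def first_time(which, alpha, P, pi, limit=1000):
    """Smallest t <= limit at which the chosen distance (1 = s, 2 = s_up) is <= alpha."""
    for t in range(limit + 1):
        if distances(t, P, pi)[which] <= alpha:
            return t
    raise ValueError("no such t within the limit")

# Hypothetical non-bipartite example: a triangle 0-1-2 with a pendant node 3 on node 2.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
n = len(neighbors)
m = sum(len(ns) for ns in neighbors.values()) // 2
P = [[1.0 / len(neighbors[v]) if w in neighbors[v] else 0.0 for w in range(n)]
     for v in range(n)]
pi = [len(neighbors[v]) / (2 * m) for v in range(n)]

t_sep = first_time(1, 0.25, P, pi)  # separation time with alpha = 1/4
t_up = first_time(2, 0.25, P, pi)   # upper separation time with alpha = 1/4
```

On any example, both distances returned by `distances` should dominate the variation distance, matching $\delta(t)\leq s(t)$ and $\delta(t)\leq s_{\mathrm{up}}(t)$.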

$\blacktriangleright$ Remark 2.9.

Since standard random walks on bipartite graphs don’t converge, the above definitions aren’t interesting for bipartite graphs. The usual way of circumventing this is to study lazy random walks on bipartite graphs. Alternatively, we can get meaningful statements for bipartite graphs by replacing $p_{t}$ with $(p_{t}+p_{t+1})/2$ in the above definitions, understanding that even though the random walks don’t converge, the average of pairs of steps converges.

2.4 Easy bounds on robustness

We will use the following notation.

Definition 2.10.

Let $p_{t}(v,S)$ denote the probability that an unbiased walk, starting at node $v$ , is in the set $S$ at time $t$ . For a given adversary $A$ , let $q_{t}(v,S)$ denote the probability that an $A$ -biased walk, starting at node $v$ , is in the set $S$ at time $t$ . When the start node $v$ is understood, we often omit it and write $p_{t}(S)$ and $q_{t}(S)$ .

We have the following simple lemma.

Lemma 2.11.

For any standard $\varepsilon$ -adversary, any time $t$ , and any set $S$ , we have $q_{t}(S)\leq e^{t\varepsilon}p_{t}(S)$ .

Note that this does not hold for a strong $\varepsilon$ -adversary, since such an adversary could move $\varepsilon$ probability to a set $S$ with tiny $p_{t}(S)$ .

It’s not hard to show that each robustness is at least the reciprocal of the corresponding quantity.

Proposition 2.12.

For any graph $G$ , the following bounds hold:

1.

The mixing time robustness, if defined by a standard adversary, is $\Omega(1/t_{\mathrm{mix}})$ . Since we defined mixing time robustness $\rho_{\mathrm{mix}}$ using a strong adversary, this implies $\rho_{\mathrm{mix}}=\Omega(1/(dt_{\mathrm{mix}}))$ for graphs with maximum degree $d$ .
2.

The hitting time robustness $\rho_{\mathrm{hit}}=\Omega(1/t_{\mathrm{hit}})$ .
3.

The cover time robustness $\rho_{\mathrm{cov}}=\Omega(1/t_{\mathrm{cov}})$ .

Proof.

First we show the mixing time robustness. Using Lemma 2.2, choose $t=O(t_{\mathrm{mix}})$ such that $|p_{t}-\pi|\leq\alpha/2$ . Set $\varepsilon=\alpha/(2t)=\Omega(1/t_{\mathrm{mix}})$ . Then a standard $\varepsilon$ -adversary can’t make the biased walk probabilities $q_{t}$ much smaller than the unbiased probabilities $p_{t}$ . Specifically, for any set $S$ , we have

q_{t}(S)\geq(1-\varepsilon)^{t}p_{t}(S)\geq(1-t\varepsilon)p_{t}(S)=(1-\alpha/2)p_{t}(S).

Therefore, $|q_{t}-p_{t}|\leq\alpha/2$ , and the triangle inequality gives $|q_{t}-\pi|\leq\alpha$ .

We prove the hitting time and cover time robustness results together. Let $R$ be a stopping time such as a hitting time or cover time. Let $t=4\max_{u}\operatorname{\mathbb{E}}_{u}[R]$ . Then by Markov’s inequality, for any node $v$ , we have $\Pr_{v}[R\geq t]\leq 1/4$ , so $\Pr_{v}[R\leq t]\geq 3/4$ . Set $\varepsilon=1/(4t)$ . Fix a strong $\varepsilon$ -adversary, and let $R^{+}$ denote $R$ with respect to this adversary. Then for any node $v$ , we have

\Pr_{v}[R^{+}\leq t]\geq(1-t\varepsilon)3/4\geq 1/2.

But now dividing time into epochs of length $t$, the chance that $R^{+}$ occurs in each epoch is at least 1/2, so by Wald’s identity we have $\operatorname{\mathbb{E}}_{u}[R^{+}]\leq 2t=8\max_{u}\operatorname{\mathbb{E}}_{u}[R]$. $\hfill\blacktriangleleft$

3 Stationary and Mixing Time Robustness

In this section, we first show that for a Markovian adversary, a Markov set-chain will not necessarily converge to $\pi$, even if all underlying chains are reversible and have $\pi$ as their stationary distribution. We only consider a Markovian adversary in Section 3.1; in the rest of this section we consider standard and strong adversaries. Next we show how to save a square root from the easy mixing time robustness of $\Omega(1/t_{\mathrm{mix}})$, proving Theorem 1.6. We add one more ingredient to prove Theorem 1.2. Finally, we establish that constant-degree expanders have constant mixing time robustness, namely $\Omega(\gamma)$.

3.1 MSCs non-convergence

First we show that for a Markovian adversary, the MSC will not necessarily converge to $\pi$ , even if all underlying chains are reversible. To see this, we define two weighted undirected graphs based on the path graph. Denote $\text{Odd}=\{\{i,i+1\}\mid\text{$i$ is odd}\}$ , and $\text{Even}=\{\{i,i+1\}\mid\text{$i$ is even}\}$ . Graph $G_{odd}$ assigns edges in Odd weight 2, and edges in Even weight 1, and adds weighted self loops at the two endpoints to make all nodes have weighted degree 3. Graph $G_{even}$ is the same but swaps even and odd: it assigns edges in Even weight 2, and edges in Odd weight 1. Let $P_{odd}$ denote a random walk on $G_{odd}$ , and similarly for $P_{even}$ . Since $G_{odd}$ and $G_{even}$ are both regular, both have uniform stationary distributions.

However, consider a Markovian MSC that chooses $P_{odd}$ if $X_{t}$ is odd and $P_{even}$ otherwise. If $X_{t}$ is not an endpoint, then $\Pr[X_{t+1}=X_{t}+1]=2/3$ . We thus have a biased random walk which converges to a very biased distribution very far from uniform.
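The construction above can be verified exactly on a short path. The following Python sketch builds $P_{odd}$ and $P_{even}$ on the path with nodes $1,\dots,n$, combines them according to the parity of the current node, and checks that the combined chain has stationary distribution proportional to $2^{i}$, far from uniform. The choice $n=8$ and the names `weighted_walk`, `M`, and `mu` are hypothetical.

```python
from fractions import Fraction

def weighted_walk(n, heavy_parity):
    """Walk on the path 1..n where edge {i,i+1} has weight 2 if i % 2 == heavy_parity,
    else 1. Self loops at the two endpoints bring every weighted degree up to 3."""
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]  # index 0 is unused
    for i in range(1, n):
        w = 2 if i % 2 == heavy_parity else 1
        P[i][i + 1] += Fraction(w, 3)
        P[i + 1][i] += Fraction(w, 3)
    for end in (1, n):
        P[end][end] = 1 - sum(P[end])
    return P

n = 8
P_odd = weighted_walk(n, 1)
P_even = weighted_walk(n, 0)

# Markovian adversary: use row X_t of P_odd when X_t is odd, and of P_even otherwise.
M = [None] + [(P_odd if i % 2 == 1 else P_even)[i] for i in range(1, n + 1)]

# Candidate stationary distribution of the combined chain: proportional to 2^i.
total = sum(2 ** i for i in range(1, n + 1))
mu = [Fraction(0)] + [Fraction(2 ** i, total) for i in range(1, n + 1)]
mu_next = [sum(mu[v] * M[v][w] for v in range(1, n + 1)) for w in range(n + 1)]

# Uniform distribution, which is stationary for P_odd and P_even individually.
unif = [Fraction(0)] + [Fraction(1, n)] * n
unif_odd = [sum(unif[v] * P_odd[v][w] for v in range(1, n + 1)) for w in range(n + 1)]
```

Under these assumptions the last node carries more than half of the stationary mass, whereas the uniform distribution gives it only $1/n$.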

3.2 Bounding the increase in probability

We prove that if a set $S$ has probability $p$ , then a strong $\varepsilon$ -adversary cannot increase the probability to more than $O(p^{1-o(1)})$ as long as $td\varepsilon^{2}=O(1)$ . This quadratic dependence on $\varepsilon$ improves on the easy linear dependence given by Lemma 2.11. We use a potential function argument that generalizes one by Haslegrave, Sauerwald, and Sylvester [13]. They used a potential function based on squares to prove a lower bound. We use a potential function based on smaller powers to prove an upper bound. In particular, we need an inequality going in the opposite direction as [13].

For $x=(x_{1},\dots,x_{d})\in[0,\infty)^{d}$ define the $\ell$ -power mean $M_{\ell}(x)=((\sum_{i}x_{i}^{\ell})/d)^{1/\ell}$ , $M_{\infty}=\max_{i}x_{i}$ , and the $\varepsilon$ -max-avg operator $\operatorname{MA}_{\varepsilon}=\varepsilon M_{\infty}+(1-\varepsilon)M_{1}$ . We now develop the inequality we need.

Lemma 3.1.

For $k,r\geq 1$ and $kr\varepsilon\leq 1$ , we have $(1+r\varepsilon)^{k}+r(1-\varepsilon)^{k}\leq r+1+2k^{2}r^{2}\varepsilon^{2}$ .

Proof.

	$\displaystyle(1+r\varepsilon)^{k}+r(1-\varepsilon)^{k}$	$\displaystyle=1+kr\varepsilon+\binom{k}{2}(r\varepsilon)^{2}+\binom{k}{3}(r\varepsilon)^{3}+\dots$
		$\displaystyle+r-kr\varepsilon+\binom{k}{2}r\varepsilon^{2}-\binom{k}{3}r\varepsilon^{3}+\dots$
		$\displaystyle\leq r+1+2\left(\binom{k}{2}(r\varepsilon)^{2}+\binom{k}{3}(r\varepsilon)^{3}+\dots\right)$
		$\displaystyle\leq r+1+2\binom{k}{2}(r\varepsilon)^{2}\left(1+(kr\varepsilon/2)+(kr\varepsilon/2)^{2}+\dots\right)$
		$\displaystyle\leq r+1+4\binom{k}{2}(r\varepsilon)^{2}\leq r+1+2k^{2}r^{2}\varepsilon^{2}.$

$\hfill\blacktriangleleft$

Lemma 3.2.

For $x\in[0,\infty)^{d}$ and $k,\ell>1$ satisfying $1/k+1/\ell=1$ and $k(d-1)\varepsilon\leq 1$ , we have $\operatorname{MA}_{\varepsilon}(x)\leq e^{2k(d-1)\varepsilon^{2}}M_{\ell}(x)$ .

Proof.

Let $x=(x_{1},\dots,x_{d})$, and assume without loss of generality that $x_{1}\geq x_{2}\geq\dots\geq x_{d}\geq 0$. Using Hölder’s inequality and Lemma 3.1 with $r=d-1$,

	$\displaystyle\operatorname{MA}_{\varepsilon}(x)$	$\displaystyle=\left(\varepsilon+\frac{1-\varepsilon}{d}\right)x_{1}+\frac{1-\varepsilon}{d}x_{2}+\dots+\frac{1-\varepsilon}{d}x_{d}$
		$\displaystyle\leq\frac{1}{d}\left((1+(d-1)\varepsilon)^{k}+(d-1)(1-\varepsilon)^{k}\right)^{1/k}\left(x_{1}^{\ell}+\dots+x_{d}^{\ell}\right)^{1/\ell}$
		$\displaystyle\leq\frac{1}{d}\left(d+2k^{2}(d-1)^{2}\varepsilon^{2}\right)^{1/k}\left(x_{1}^{\ell}+\dots+x_{d}^{\ell}\right)^{1/\ell}$
		$\displaystyle\leq\left(1+2k^{2}(d-1)\varepsilon^{2}\right)^{1/k}M_{\ell}(x)$
		$\displaystyle\leq e^{2k(d-1)\varepsilon^{2}}M_{\ell}(x).$

$\hfill\blacktriangleleft$
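The inequality of Lemma 3.2 is easy to probe numerically. The sketch below evaluates both sides for a few hand-picked vectors; the vectors, the parameter values, and the function names are hypothetical, and a numerical check on a handful of points is of course no substitute for the proof.

```python
import math

def power_mean(x, ell):
    """The ell-power mean M_ell(x)."""
    return (sum(v ** ell for v in x) / len(x)) ** (1.0 / ell)

def max_avg(x, eps):
    """The eps-max-avg operator MA_eps(x) = eps * max + (1 - eps) * mean."""
    return eps * max(x) + (1 - eps) * sum(x) / len(x)

def lemma_3_2_bound(x, eps, k):
    """Right-hand side of Lemma 3.2, with ell determined by 1/k + 1/ell = 1."""
    d = len(x)
    assert k > 1 and k * (d - 1) * eps <= 1
    ell = k / (k - 1)
    return math.exp(2 * k * (d - 1) * eps ** 2) * power_mean(x, ell)

# Hypothetical test vectors with d = 3 entries each.
samples = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [5.0, 2.0, 0.5]]
checks = [
    (max_avg(x, eps), lemma_3_2_bound(x, eps, k))
    for x in samples
    for eps in (0.05, 0.1)
    for k in (2, 3)
]
```

Each pair in `checks` holds the left-hand and right-hand sides for one choice of $x$, $\varepsilon$, and $k$.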

Recall Definition 2.10 for the definitions of $p_{t}$ and $q_{t}$ .

Theorem 3.3.

Let $G=(V,E)$ be a graph with all degrees at most $d$ , and fix a starting node. Let $k,\ell>1$ satisfy $1/k+1/\ell=1$ and $kd\varepsilon\leq 1$ . For any $\varepsilon$ -strong adversary, any start node, and any $S\subseteq V$ , we have $q_{t}(S)\leq e^{2tkd\varepsilon^{2}}p_{t}(S)^{1/\ell}$ .

Proof.

We follow the structure of [13] but both simplify it somewhat, avoiding the use of the trajectory tree, and generalize it. Fix the transition matrix $P$ , a start node $u$ , target set $S$ , and stopping time $t$ . Define the vectors

	$\displaystyle p_{t}$	$\displaystyle=$	$\displaystyle(p_{t}(v))_{v\in V}$
	$\displaystyle s_{t}$	$\displaystyle=$	$\displaystyle(q_{t}(v,S)^{\ell})_{v\in V}.$

When intermixed with matrices, we will view $p_{t}$ as a row vector and $s_{t}$ as a column vector, for reasons that will become apparent. Note that $p_{t+1}=p_{t}P$ . Define the potential function

\Phi_{i}=\langle p_{t-i},s_{i}\rangle=\sum_{v\in V}p_{t-i}(v)q_{i}(v,S)^{\ell}.

Since $q_{0}(v,S)=1$ if $v\in S$ and 0 otherwise, we have that

\Phi_{0}=\sum_{v\in V}p_{t}(v)q_{0}(v,S)^{\ell}=p_{t}(S).

Since $p_{0}(v)=1$ if $v=u$ and 0 otherwise, we have that

\Phi_{t}=\sum_{v\in V}p_{0}(v)q_{t}(v,S)^{\ell}=q_{t}(u,S)^{\ell}=q_{t}(S)^{\ell}.

If the random walk is at node $v$ with $i+1$ steps remaining, the adversary’s strategy to maximize $q_{i+1}(v,S)$ is to use the $\varepsilon$ probability it controls to move to a neighbor $w$ maximizing $q_{i}(w,S)$ . Letting $d_{v}$ denote the degree of $v$ , this strategy gives

q_{i+1}(v,S)=\operatorname{MA}_{\varepsilon}((q_{i}(w,S))_{w\in\Gamma(v)})\leq e^{2kd_{v}\varepsilon^{2}}M_{\ell}((q_{i}(w,S))_{w\in\Gamma(v)}),

using Lemma 3.2. Raising both sides to the $\ell$ th power gives

q_{i+1}(v,S)^{\ell}\leq e^{2k\ell d_{v}\varepsilon^{2}}\frac{1}{d_{v}}\sum_{w\in\Gamma(v)}q_{i}(w,S)^{\ell}.

Using the fact that all degrees are at most $d$ , we can deduce an inequality with vectors and matrices:

s_{i+1}\leq e^{2k\ell d\varepsilon^{2}}Ps_{i}.

Now we can bound the potential function by

\Phi_{i+1}=\langle p_{t-i-1},s_{i+1}\rangle\leq e^{2k\ell d\varepsilon^{2}}p_{t-i-1}Ps_{i}=e^{2k\ell d\varepsilon^{2}}\langle p_{t-i},s_{i}\rangle=e^{2k\ell d\varepsilon^{2}}\Phi_{i}.

We conclude that $\Phi_{t}\leq\exp(2tk\ell d\varepsilon^{2})\Phi_{0}$ , and

q_{t}(S)=\Phi_{t}^{1/\ell}\leq e^{2tkd\varepsilon^{2}}\Phi_{0}^{1/\ell}=e^{2tkd\varepsilon^{2}}p_{t}(S)^{1/\ell}.

$\hfill\blacktriangleleft$
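The recursion $q_{i+1}(v,S)=\operatorname{MA}_{\varepsilon}((q_{i}(w,S))_{w\in\Gamma(v)})$ used in the proof can be run directly as a dynamic program. The following Python sketch does so on a cycle and compares the result with the bound of Theorem 3.3 for $k=\ell=2$. The cycle length, target set, horizon, and $\varepsilon$ are hypothetical values chosen so that $kd\varepsilon\leq 1$.

```python
import math

def cycle_neighbors(n):
    """Adjacency lists of the n-cycle."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

def reach_probs(neighbors, S, t, eps):
    """q_t(v, S) for every start v under the greedy strong eps-adversary from the proof.
    With eps = 0 this is the unbiased probability p_t(v, S)."""
    q = {v: 1.0 if v in S else 0.0 for v in neighbors}
    for _ in range(t):
        q = {
            v: eps * max(q[w] for w in neighbors[v])
            + (1 - eps) * sum(q[w] for w in neighbors[v]) / len(neighbors[v])
            for v in neighbors
        }
    return q

# Hypothetical parameters: 11-cycle (d = 2), target S = {0}, t = 12 steps, eps = 0.1.
neighbors = cycle_neighbors(11)
S, t, eps, d = {0}, 12, 0.1, 2
k = ell = 2.0                       # conjugate exponents: 1/k + 1/ell = 1
assert k * d * eps <= 1             # hypothesis of Theorem 3.3

p = reach_probs(neighbors, S, t, 0.0)
q = reach_probs(neighbors, S, t, eps)
factor = math.exp(2 * t * k * d * eps ** 2)
```

For every start node $v$, the theorem predicts `q[v] <= factor * p[v] ** (1 / ell)`, while trivially `q[v] >= p[v]`.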

Corollary 3.4.

The conclusion of Theorem 3.3 holds for any starting distribution.

Proof.

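This follows from the concavity of the function $x^{1/\ell}$: averaging the bound of Theorem 3.3 over the starting node and applying Jensen’s inequality gives the same bound for the mixture. $\hfill\blacktriangleleft$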

Corollary 3.5.

For any $\varepsilon$ -strong adversary, any starting distribution, and any $S\subseteq V$ , we have

q_{t}(S)\leq\exp(2\varepsilon\sqrt{2td\ln(1/p)})\cdot p,

where $p=p_{t}(S)$ . In particular, for $\varepsilon=O(1/\sqrt{td})$ , we have $q_{t}(S)=O(p^{1-o(1)})$ .

Proof.

We choose $k$ to optimize the upper bound for $q_{t}(S)$ . Let $c=2td\varepsilon^{2}$ , $p=p_{t}(S)$ , and $r=\ln(1/p)$ . Since $1/\ell=1-1/k$ , the upper bound is

e^{ck}p^{1-1/k}=e^{ck+r/k}p.

By the arithmetic-geometric mean inequality, this is minimized when $ck=r/k$ and achieves the bound $e^{2\sqrt{cr}}p=\exp(2\varepsilon\sqrt{2tdr})\cdot p$ . $\hfill\blacktriangleleft$
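The optimization over $k$ in this proof can be illustrated numerically. The sketch below evaluates the bound $e^{ck+r/k}p$ at the balancing point $k=\sqrt{r/c}$ and at a few other values of $k$; the numbers chosen for $t$, $d$, $\varepsilon$, and $p$ are hypothetical.

```python
import math

# Hypothetical parameters.
t, d, eps, p = 100, 3, 0.01, 1e-3

c = 2 * t * d * eps ** 2   # c = 2 t d eps^2
r = math.log(1 / p)        # r = ln(1/p)

def bound(k):
    """Upper bound e^{ck} p^{1 - 1/k} = e^{ck + r/k} p from Theorem 3.3."""
    return math.exp(c * k + r / k) * p

k_star = math.sqrt(r / c)  # balances the two exponent terms: c k = r / k
optimized = bound(k_star)
closed_form = math.exp(2 * eps * math.sqrt(2 * t * d * r)) * p
```

With these values `k_star` is roughly 10.7, so the hypothesis $kd\varepsilon\leq 1$ of Theorem 3.3 is satisfied at the optimum.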

We now prove Theorem 1.6 (see its original statement), which is tight for the path and the cycle.

Proof.

While we can deduce this from Corollary 3.5, it’s more straightforward to apply Theorem 3.3 directly. We will choose suitable constants $k$ and $\ell$ that satisfy the assumptions of Theorem 3.3 and use the conclusion. Letting $c\stackrel{{\scriptstyle\rm def}}{{=}}\exp(2tkd\varepsilon^{2})$ , we can bound the variation distance

|q_{t}-p_{t}|\leq\sup_{p\in[0,1]}(cp^{1/\ell}-p).

The derivative of $cp^{1/\ell}-p$ is $(c/\ell)p^{1/\ell-1}-1=(c/\ell)p^{-1/k}-1$ . There are two cases. If $c\geq\ell$ , then the derivative is nonnegative on $[0,1]$ , so the sup is achieved at $p=1$ and the sup is $c-1$ . If $c<\ell$ , then the sup is achieved when $cp^{-1/k}=\ell$ . This value of $p$ evaluates to:

cp^{1/\ell}-p=p(cp^{-1/k}-1)=p(\ell-1)\leq 2p/k\leq 2/k,

using that $\ell=1/(1-1/k)\leq 1+2/k$ for $k\geq 2$ .

We conclude that $|q_{t}-p_{t}|\leq\max(c-1,2/k)$ . By choosing $k=4/\alpha$ and $\varepsilon=\sqrt{\alpha/(8tkd)}$ , and using that $e^{x}\leq 1+2x$ for $x\leq 1/2$ , we get that this maximum is at most $\alpha/2$ . By Lemma 2.2, after $t=O(t_{\mathrm{mix}})$ steps we have $|p_{t}-\pi|\leq\alpha/2$ . Therefore, by the triangle inequality, $|q_{t}-\pi|\leq\alpha$ . Thus the robustness is bounded below by

\varepsilon=\sqrt{\alpha/(8tkd)}=\Omega(1/\sqrt{t_{\mathrm{mix}}d}).

$\hfill\blacktriangleleft$

We now use Corollary 3.5 to upper bound $\pi^{+}(S)$ and prove Theorem 1.2.

Theorem 1.2. [Restated, see original statement.]

Let $G=(V,E)$ be a graph with all degrees at most $d$. For any $\varepsilon$-strong adversary, any starting distribution, and any $S\subseteq V$ with $\pi(S)\leq 1-\alpha$, we have

\pi^{+}(S)<(1+\alpha)\exp(2\varepsilon\sqrt{2dt_{\mathrm{up}}\ln(1/\pi(S))})\pi(S).

In particular, for $\varepsilon=O(1/\sqrt{dt_{\mathrm{up}}})$, we have $\pi^{+}(S)=\pi(S)\cdot\exp(O(\sqrt{\log(1/\pi(S))}))=O(\pi(S)^{1-o(1)})$.

Proof.

As usual, let $p_{t}(\cdot)$ and $q_{t}(\cdot)$ denote unbiased and adversarially-biased probabilities, respectively. Let $S\subseteq V$ be arbitrary, and fix a starting node $v$ for the random walk. First assume that $G$ is not bipartite. Note that to show $\pi^{+}(S)\leq B$, it suffices to show that for all large enough $t$ we have $q_{t}(S)\leq B$. Let $c=2\varepsilon\sqrt{2dt_{\mathrm{up}}}$ and $f(x)=xe^{c\sqrt{\ln(1/x)}}$. To bound $q_{t}(S)$ for $t\geq t_{\mathrm{up}}$, set $t^{\prime}=t-t_{\mathrm{up}}$, and condition on $X_{t^{\prime}}$. Recall that by the definition of $t_{\mathrm{up}}$, we have $p_{t_{\mathrm{up}}}(S)\leq(1+\alpha)\pi(S)$. Using $f(rx)\leq rf(x)$ for $rx<1$ and $r>1$, and applying Corollary 3.5, we have

q_{t}(v,S)=\operatorname{\mathbb{E}}_{X_{t^{\prime}}}[q_{t_{\mathrm{up}}}(X_{t^{\prime}},S)]\leq\operatorname{\mathbb{E}}_{X_{t^{\prime}}}\left[f(p_{t_{\mathrm{up}}}(X_{t^{\prime}},S))\right]\leq(1+\alpha)f(\pi(S)),

as required for the nonbipartite case.

The bipartite case follows similarly by comparing $(q_{t}+q_{t+1})/2$ with $(p_{t}+p_{t+1})/2$ . $\hfill\blacktriangleleft$

3.3 Mixing time robustness for expanders

We now establish robustness in terms of the spectral gap. See Section 2.2 for the definition of a spectral expander. We first restate our theorem.

Theorem 1.7. [Restated, see original statement.]

A $d$ -regular $\gamma$ -spectral expander has mixing time robustness $\Omega(\gamma/\sqrt{d})$ .

Proof.

Our strong adversary chooses a stochastic matrix $B_{t}$ with $\operatorname{supp}(B_{t})\subseteq\operatorname{supp}(P)$ , and the transition matrix $P_{t}=(1-\varepsilon)P+\varepsilon B_{t}$ . Let $q_{t}$ denote the vector of probabilities at time $t$ . We can write $q_{t}=\pi+r_{t}$ , where $r_{t}\perp\pi$ can be viewed as an error term. Therefore,

	$\displaystyle q_{t+1}$	$\displaystyle=$	$\displaystyle(1-\varepsilon)Pq_{t}+\varepsilon B_{t}q_{t}$
		$\displaystyle=$	$\displaystyle(1-\varepsilon)\pi+(1-\varepsilon)Pr_{t}+\varepsilon B_{t}\pi+\varepsilon B_{t}r_{t}$

Subtracting $\pi$ gives

r_{t+1}=(1-\varepsilon)Pr_{t}+\varepsilon B_{t}r_{t}+\varepsilon(B_{t}\pi-\pi)

Using that for stochastic $B$ with row sums at most $d$ , we have $\|Bv\|_{2}\leq\sqrt{d}\|v\|_{2}$ , $\|(B-I)v\|_{2}\leq\sqrt{d}\|v\|_{2}$ , and $\|Pw\|_{2}\leq\lambda\|w\|_{2}$ for $w\perp\pi$ yields:

\|r_{t+1}\|_{2}\leq((1-\varepsilon)\lambda+\varepsilon\sqrt{d})\|r_{t}\|_{2}+\varepsilon\sqrt{d/n}

(1)

We will write (1) as $\|r_{t+1}\|_{2}\leq b\|r_{t}\|_{2}+\delta$ to simplify our calculation. Our calculation will show that we can choose $\varepsilon$ small enough to make $r_{t}$ converge quickly. For $\varepsilon_{0}\leq 1$ to be chosen later, set

\varepsilon=\frac{(1-\lambda)\varepsilon_{0}}{2(\sqrt{d}-\lambda)}.

Then $b\stackrel{{\scriptstyle\rm def}}{{=}}(1-\varepsilon)\lambda+\varepsilon\sqrt{d}=\lambda+\varepsilon(\sqrt{d}-\lambda)\leq\lambda+(1-\lambda)/2=(1+\lambda)/2<1$.

Let $\delta\stackrel{{\scriptstyle\rm def}}{{=}}\varepsilon\sqrt{d/n}$ . Note that for $x^{\prime}\leq bx+\delta$ with $b<1$ , and for $x\geq 2\delta/(1-b)$ , we have

x^{\prime}\leq bx+(1-b)x/2=((1+b)/2)x.

Plugging into (1), we deduce that for $\|r_{t}\|_{2}\geq 4\varepsilon\sqrt{d/n}/(1-\lambda)$ , we have

\|r_{t+1}\|_{2}\leq\frac{1+b}{2}\|r_{t}\|_{2}.

Consequently, after $t=O(\log_{1/b}n)$ steps, we have

\|r_{t}\|_{1}\leq\sqrt{n}\|r_{t}\|_{2}\leq\frac{4\varepsilon\sqrt{d}}{1-\lambda}=\frac{2\varepsilon_{0}\sqrt{d}}{\sqrt{d}-\lambda}\leq\frac{2\varepsilon_{0}\sqrt{d}}{\sqrt{d}/4}=8\varepsilon_{0}.

Here we used that for $d\geq 2$ , we have $\sqrt{d}-1\geq\sqrt{d}/4$ .

Thus, by choosing $\varepsilon_{0}$ a small enough constant, we can make the variation distance to $\pi$ , which is $\|r_{t}\|_{1}/2$ , an arbitrarily small constant. $\hfill\blacktriangleleft$
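The behavior of the recurrence $x^{\prime}\leq bx+\delta$ in this proof can be seen by iterating its worst case, $x^{\prime}=bx+\delta$. The Python sketch below does this for hypothetical values of $\lambda$, $d$, $n$, and $\varepsilon_{0}$, with $\varepsilon$, $b$, and $\delta$ set as in the proof.

```python
import math

# Hypothetical parameters: second eigenvalue, degree, number of nodes, and eps_0.
lam, d, n, eps0 = 0.5, 4, 1024, 0.1

eps = (1 - lam) * eps0 / (2 * (math.sqrt(d) - lam))
b = (1 - eps) * lam + eps * math.sqrt(d)   # contraction coefficient in (1)
delta = eps * math.sqrt(d / n)             # additive term in (1)
threshold = 2 * delta / (1 - b)            # above this, x shrinks by a factor (1 + b) / 2

def iterate(b, delta, x0, steps):
    """Worst case of the recurrence: x' = b x + delta."""
    xs = [x0]
    for _ in range(steps):
        xs.append(b * xs[-1] + delta)
    return xs

xs = iterate(b, delta, 1.0, 60)            # ||r_0||_2 is at most a constant; we start at 1
final_l1 = math.sqrt(n) * xs[-1]           # bound on ||r_t||_1 via Cauchy-Schwarz
```

Under these assumptions the sequence settles near the fixed point $\delta/(1-b)$, and `final_l1` stays below $8\varepsilon_{0}$.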

4 Hitting Time Robustness

4.1 Inverse of mixing time

We now show that the hitting time robustness is $\Omega(1/t_{\mathrm{mix}})$ . First recall the separation time $t_{\mathrm{sep}}$ and that $t_{\mathrm{sep}}=\Theta(t_{\mathrm{mix}})$ (Lemma 2.5). We first prove that short unbiased walks hit nodes with at least the “right” probability. Let $h_{v}=\operatorname{\mathbb{E}}_{\pi}[H_{v}]$ .

Lemma 4.1.

For all $u,v\in V$ , we have $\Pr_{u}[H_{v}\leq 2t_{\mathrm{sep}}]\geq t_{\mathrm{sep}}/(16h_{v})$ .

Proof.

Let

p\stackrel{{\scriptstyle\rm def}}{{=}}\Pr_{\pi}[H_{v}\leq t_{\mathrm{sep}}]=\sum_{w\in V}\pi(w)\Pr_{w}[H_{v}\leq t_{\mathrm{sep}}].

Since for any starting $u$ , after $t_{\mathrm{sep}}$ steps every node $w$ has probability at least $\pi(w)/2$ , we have

\Pr_{u}[H_{v}\leq 2t_{\mathrm{sep}}]\geq\sum_{w\in V}\frac{\pi(w)}{2}\Pr_{w}[H_{v}\leq t_{\mathrm{sep}}]=\frac{p}{2}.

We now show that $p\geq t_{\mathrm{sep}}/(8h_{v})$, which will prove the lemma. Suppose for contradiction that $p<t_{\mathrm{sep}}/(8h_{v})$. We will show that $\Pr_{\pi}[H_{v}\leq 2h_{v}]<1/2$, which contradicts Markov’s inequality.

To show this, let $k=2h_{v}/t_{\mathrm{sep}}$ . We prove by induction that for all $i\leq k$ , we have

\Pr_{\pi}[H_{v}\leq it_{\mathrm{sep}}]<2ip\leq 1/2.

This is true for $i=1$ . Suppose it’s true for a given $i$ . Then

\Pr_{\pi}[H_{v}\leq(i+1)t_{\mathrm{sep}}]\leq\Pr_{\pi}[H_{v}\leq it_{\mathrm{sep}}]+\Pr_{\pi}[H_{v}\leq(i+1)t_{\mathrm{sep}}\mid H_{v}>it_{\mathrm{sep}}].

To analyze the conditional probability, we bound the conditional distribution of $X_{t}$ given that $H_{v}>t$ , where $t=it_{\mathrm{sep}}$ . Specifically, for any node $w$ ,

\Pr_{\pi}[X_{t}=w\mid H_{v}>t]=\frac{\Pr_{\pi}[X_{t}=w\wedge H_{v}>t]}{\Pr_{\pi}[H_{v}>t]}\leq\frac{\Pr_{\pi}[X_{t}=w]}{\Pr_{\pi}[H_{v}>t]}=\frac{\pi(w)}{\Pr_{\pi}[H_{v}>t]}.

Thus, conditioning on $H_{v}>t$ can increase probabilities by a factor of at most $1/\Pr_{\pi}[H_{v}>t]\leq 2$. Therefore,

\Pr_{\pi}[H_{v}\leq(i+1)t_{\mathrm{sep}}\mid H_{v}>it_{\mathrm{sep}}]\leq 2p,

and we are done. $\hfill\blacktriangleleft$

We are now ready to prove our theorem about hitting time robustness.

Theorem 4.2.

For all $\varepsilon$ and all nodes $u, v$ , we have $h_{u,v}^{+}=O(h_{v}\exp(O(\varepsilon t_{\mathrm{mix}})))$ .

Proof.

We prove the theorem with $t_{\mathrm{sep}}$ in place of $t_{\mathrm{mix}}$ ; this suffices since $t_{\mathrm{sep}}=O(t_{\mathrm{mix}})$ . By Lemma 4.1, even with an $\varepsilon$ -adversary, we have

q\stackrel{{\scriptstyle\rm def}}{{=}}\Pr_{u}[H_{v}^{+}\leq 2t_{\mathrm{sep}}]\geq\exp(-2t_{\mathrm{sep}}\varepsilon)t_{\mathrm{sep}}/(16h_{v}).

By Wald’s identity, we have $h_{u,v}^{+}\leq 2t_{\mathrm{sep}}/q=O(h_{v}\exp(O(\varepsilon t_{\mathrm{mix}})))$. $\hfill\blacktriangleleft$

Theorem 1.10, stating that the hitting time robustness $\rho_{\mathrm{hit}}=\Omega(1/t_{\mathrm{mix}})$ , follows.

4.2 Hitting time robustness for trees

We now show that the hitting time robustness is $\Omega(1/\Delta)$ for trees, confirming Conjecture 1.16 in this case. We first restate the theorem.

Theorem 1.11. [Restated, see original statement.]

The hitting time robustness for any tree is $\Omega(1/\Delta)$ .

Fix a tree $G$ with at least two edges (otherwise it’s trivial), so $\Delta\geq 2$. Let $\varepsilon=1/(2\Delta)$, and fix a standard $\varepsilon$-adversary. For the hitting time, we can assume without loss of generality that the adversary is oblivious, and hence the adversarial random walk $P^{\prime}$ has a stationary distribution $\pi^{\prime}$. We first show that $\pi^{\prime}$ is not too different from the stationary distribution $\pi$ of the uniform random walk $P$.

Lemma 4.3.

For any node $v$ , we have $\pi(v)/4\leq\pi^{\prime}(v)\leq 4\pi(v)$ .

Proof.

First consider two neighbors $v$ and $w$ . Let $\pi^{\prime}(v,w)=\pi^{\prime}(v)P^{\prime}(v,w)$ denote the stationary probability of the directed edge $(v,w)$ . Since $\{v,w\}$ is a cut edge, we must have $\pi^{\prime}(v,w)=\pi^{\prime}(w,v)$ . Therefore,

\frac{\pi^{\prime}(v)}{\pi^{\prime}(w)}=\frac{P^{\prime}(w,v)}{P^{\prime}(v,w)}\geq\frac{(1-\varepsilon)d_{v}}{(1+\varepsilon)d_{w}}\geq(1-2\varepsilon)\frac{d_{v}}{d_{w}}.

For $v, w$ that may not be neighbors, an inductive argument gives a bound in terms of the distance $\Delta(v,w)$ :

\frac{\pi^{\prime}(v)}{\pi^{\prime}(w)}\geq(1-2\varepsilon)^{\Delta(v,w)}\frac{d_{v}}{d_{w}}\geq\frac{d_{v}}{4d_{w}},

because $(1-1/x)^{x}\geq 1/4$ for $x\geq 2$ . The lemma will follow because $\pi(v)=d_{v}/(2m)$ is exactly proportional to the degree. To elaborate, define the ratio $r(v)=\pi^{\prime}(v)/\pi(v)$ . Then

\frac{r(v)}{r(w)}=\frac{\pi^{\prime}(v)}{\pi^{\prime}(w)}\cdot\frac{\pi(w)}{\pi(v)}=\frac{\pi^{\prime}(v)}{\pi^{\prime}(w)}\frac{d_{w}}{d_{v}}\geq\frac{1}{4}.

Since some $r(u)\geq 1$ and some $r(u^{\prime})\leq 1$ , we have $1/4\leq r(v)\leq 4$ for all $v$ , as claimed. $\hfill\blacktriangleleft$

Proof of Theorem 1.11.

As in Lemma 4.3, we set $\varepsilon=1/(2\Delta)$ and fix a standard $\varepsilon$-adversary. We now show that $h^{+}(v,w)<8h(v,w)$, which will prove the theorem. We consider the oblivious $\varepsilon$-adversary that maximizes the hitting time from $v$ to $w$. It suffices to show $h^{+}(v,w)<8h(v,w)$ for neighbors $v$ and $w$, since for arbitrary $v$ and $w$ in a tree the hitting time is the sum of the hitting times between neighbors.

We now use the idea behind the “essential edge lemma” from Aldous-Fill (Lemma 5.1 in [1]). For neighbors $v$ and $w$, consider the graph $H$ obtained from $G$ by removing all nodes other than $w$ that are closer to $w$ than to $v$, together with their incident edges. Thus, in $H$ the node $w$ has degree 1, and $h(v,w)$ and $h^{+}(v,w)$ are the same in $H$ as in $G$. Let $\pi$ and $\pi^{\prime}$ denote the corresponding stationary distributions in $H$, and let $h(\cdot,\cdot)$ and $h^{+}(\cdot,\cdot)$ denote hitting times in $H$. Then since a random walk in $H$ starting at $w$ is forced to go to $v$, we have the following:

	$\displaystyle h(v,w)$	$\displaystyle=h(w,w)-1=1/\pi(w)-1\geq 1/(2\pi(w))$	(since $\pi(w)\leq 1/2$ )
	$\displaystyle h^{+}(v,w)$	$\displaystyle=h^{+}(w,w)-1=1/\pi^{\prime}(w)-1<4/\pi(w)\leq 8h(v,w)$	(using Lemma 4.3)

The theorem follows. $\hfill\blacktriangleleft$

4.3 Negative result

Definition 4.4.

Fix a simple undirected graph $G=(V,E)$ . Let $\Delta(v,w)$ denote the distance between $v$ and $w$ in $G$ . Define the average distance $\bar{\Delta}=\operatorname{\mathbb{E}}_{v,w}[\Delta(v,w)]$ .

Theorem 4.5.

For every regular graph $G$ and $\varepsilon\in[0,1]$ , there is a strong oblivious $\varepsilon$ -adversary with $t_{\mathrm{hit}}^{+}=\Omega(ne^{\varepsilon\bar{\Delta}})$ .

Proof.

Much of this proof is similar to a lower bound on $\pi^{+}(S)$ by [4].

The idea is to construct a “lazy” adversary – allowing self loops – where some node has stationary probability at most $e^{-\varepsilon\bar{\Delta}}/n$ , and then apply Lemma 2.1. The lazy adversary won’t be a valid adversary because the original graph doesn’t have self loops, so we then convert it into a valid adversary.

Let $G$ be $d$ -regular. Fix a node $u$ such that $\operatorname{\mathbb{E}}_{w}[\Delta(u,w)]\geq\bar{\Delta}$ . We first define a lazy adversary using a Metropolis algorithm. The target stationary distribution $\rho$ will be proportional to the weight defined as $\operatorname{wt}(v)=(1-\varepsilon)^{-\Delta(u,v)}$ . That is, $\rho_{v}=\operatorname{wt}(v)/z$ , where $z=\sum_{v}\operatorname{wt}(v)$ . Since for an edge $\{v,w\}$ we have $|\Delta(u,v)-\Delta(u,w)|\leq 1$ , the lazy adversary corresponds to the Metropolis Markov chain $Q$ where

Q_{v,w}=\left\{\begin{array}{ll}1/d&\text{if $\{v,w\}\in E$ and $\Delta(u,w)\geq\Delta(u,v)$}\\ (1-\varepsilon)/d&\text{if $\{v,w\}\in E$ and $\Delta(u,w)=\Delta(u,v)-1$}\\ 1-\sum_{x\in\Gamma(v)}Q_{v,x}&\text{if $w=v$}\end{array}\right.

Since $Q$ corresponds to a Metropolis algorithm for $\rho$ , the stationary distribution of $Q$ is $\rho$ . Observe that

z=\sum_{v}(1-\varepsilon)^{-\Delta(u,v)}=n\operatorname{\mathbb{E}}_{v}[(1-\varepsilon)^{-\Delta(u,v)}]\geq n(1-\varepsilon)^{-\operatorname{\mathbb{E}}_{v}[\Delta(u,v)]}\geq n(1-\varepsilon)^{-\bar{\Delta}}.

Let $h^{*}(\cdot)$ denote the hitting times for $Q$ . Since some neighbor $v$ of $u$ has $h^{*}(v,u)\geq h^{*}(u,u)-1$ , by Lemma 2.1 we have

t_{\mathrm{hit}}^{*}\geq 1/\rho_{u}-1=z-1\geq n(1-\varepsilon)^{-\bar{\Delta}}-1.

However, $Q$ is not a valid adversary, since a valid adversary must only place weight on edges of $G$ , but $G$ has no self loops. To circumvent this issue, distribute all the self loop probabilities proportional to the other edges. That is, let $Q_{v}=\sum_{x\in\Gamma(v)}Q_{v,x}$ , and define

R_{v,w}=\left\{\begin{array}{ll}1/(dQ_{v})&\text{if $\{v,w\}\in E$ and $\Delta(u,w)\geq\Delta(u,v)$}\\ (1-\varepsilon)/(dQ_{v})&\text{if $\{v,w\}\in E$ and $\Delta(u,w)=\Delta(u,v)-1$}\end{array}\right.

In other words $R$ is a sped-up version of $Q$ . A random trajectory of $Q$ can be converted into a random trajectory of $R$ by simply deleting consecutive repetitions. Thus, expected hitting times of $R$ are at least $\min_{v}Q_{v}\geq 1-\varepsilon$ times those of $Q$ . Since $R$ is a strong oblivious $\varepsilon$ -adversary, we have

t_{\mathrm{hit}}^{+}\geq(1-\varepsilon)t_{\mathrm{hit}}^{*}\geq n(1-\varepsilon)^{1-\bar{\Delta}}-1=\Omega(ne^{\varepsilon\bar{\Delta}}).

$\hfill\blacktriangleleft$
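The construction in this proof can be checked on a small example. The following is a minimal Python sketch, not taken from the paper: the 3-dimensional hypercube, the value $\varepsilon=1/10$, and the helper names `hypercube` and `metropolis_adversary` are illustrative assumptions. It treats a step from $v$ to $w$ as a step toward $u$ when $\Delta(u,w)=\Delta(u,v)-1$, builds the lazy chain $Q$ and the loop-free chain $R$ with exact rational arithmetic, and checks detailed balance for $\rho$ together with the requirement that every entry of $R$ is at least $(1-\varepsilon)/d$.

```python
from fractions import Fraction


def hypercube(k):
    """Adjacency lists of the k-dimensional hypercube: k-regular, n = 2**k nodes."""
    return {v: [v ^ (1 << b) for b in range(k)] for v in range(2 ** k)}


def metropolis_adversary(nbrs, dist, eps):
    """Build the lazy chain Q, the loop-free chain R, and the target rho.

    nbrs: adjacency lists of a d-regular graph
    dist: dist[v] = graph distance from the fixed node u to v
    eps:  bias parameter, 0 <= eps < 1 (a Fraction keeps the arithmetic exact)
    """
    d = len(next(iter(nbrs.values())))
    wt = {v: (1 - eps) ** (-dist[v]) for v in nbrs}
    z = sum(wt.values())
    rho = {v: wt[v] / z for v in nbrs}
    Q, R = {}, {}
    for v in nbrs:
        # A step toward u is accepted with probability 1 - eps; any other step always.
        moves = {w: (1 - eps if dist[w] == dist[v] - 1 else 1) * Fraction(1, d)
                 for w in nbrs[v]}
        q_v = sum(moves.values())                      # probability of leaving v
        R[v] = {w: p / q_v for w, p in moves.items()}  # self-loop mass redistributed
        Q[v] = dict(moves)
        Q[v][v] = 1 - q_v                              # remaining mass is a self loop
    return Q, R, rho


# Illustrative instance (an assumption of this sketch): hypercube with u = 0.
k, eps = 3, Fraction(1, 10)
nbrs = hypercube(k)
dist = {v: bin(v).count("1") for v in nbrs}  # distance from u = 0 is the popcount
Q, R, rho = metropolis_adversary(nbrs, dist, eps)

# Detailed balance certifies that rho is stationary for Q.
balanced = all(rho[v] * Q[v][w] == rho[w] * Q[w][v] for v in nbrs for w in nbrs[v])
# Each row of R sums to 1 and every entry is at least (1 - eps)/d.
valid = all(sum(R[v].values()) == 1 and min(R[v].values()) >= (1 - eps) / k
            for v in nbrs)
print(balanced, valid, 1 / rho[0] - 1)  # the last value is z - 1
```

Using `Fraction` rather than floats lets the detailed-balance and row-sum checks be exact equalities; for larger graphs floats with a tolerance would be the more practical choice.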

Corollary 4.6.

For every regular graph $G$ and $\varepsilon\in[0,1]$ , there is a strong oblivious $\varepsilon$ -adversary with $t_{\mathrm{hit}}^{+}=\Omega(ne^{(\varepsilon\log_{d}n)/2})$ .

Proof.

Every $d$ -regular graph has $\bar{\Delta}\geq(\log_{d}n)/2$ . $\hfill\blacktriangleleft$

This enables us to conclude:

Theorem 1.12. [Restated, see original statement.]

The hitting time robustness of any constant degree expander is $\rho_{\mathrm{hit}}=\Theta(1/\log n)$ .

Proof.

The unbiased maximum hitting time for any expander is $O(n)$ [21]. By Corollary 4.6, in order for $t_{\mathrm{hit}}^{+}$ to be $O(n)$, we need $\varepsilon=O(1/\log n)$. Conversely, some $\varepsilon=\Theta(1/\log n)$ suffices because of Theorem 1.10 and the fact that constant degree expanders have $t_{\mathrm{mix}}=\Theta(\log n)$. $\hfill\blacktriangleleft$

4.4 Tight analysis of path/cycle

Notice that Theorem 1.11 is mainly useful when the unbiased $t_{\mathrm{hit}}=\Theta(n)$. We now give a tight analysis of the robustnesses of the path and cycle, which have $t_{\mathrm{hit}}=\Theta(n^{2})$. We focus on the hitting time robustness, as the others will be the same.

Theorem 4.7.

The hitting time robustness of the path and of the cycle is $\Theta(1/n)$.

Proof.

From Theorem 1.11, we have $\rho_{\mathrm{hit}}=\Omega(1/n)$. It therefore suffices to show the matching upper bound $\rho_{\mathrm{hit}}=O(1/n)$, which is a negative result.

The idea is to use the essential edge lemma and its weighted version. The lemma says that if $\{v,w\}$ is a cut edge, then $h(v,w)=2e(G_{v})+1$, where $G_{v}$ is the connected component of $v$ when $\{v,w\}$ is removed and $e(G_{v})$ is its number of edges. The weighted version simply counts the weights of the edges, and normalizes the edge $\{v,w\}$ to have weight 1 (equivalently, divides by $\operatorname{wt}(v,w)$).

For simplicity we analyze the path on $n+1$ nodes labeled $0$ to $n$, with edges $e_{i}=\{i,i+1\}$. Consider the hitting time $h(n,0)=h(n,n-1)+h(n-1,n-2)+\dots+h(1,0)$. Each term has the form $2e(G_{v})+1$; we count how often edge $e_{i}$ appears across these terms, charging the $+1$ to the cut edge $\{v,w\}$ itself. Edge $e_{i}$ lies in $G_{v}$ for the $i$ steps that cross $e_{0},\dots,e_{i-1}$ and is the cut edge once, so the contribution of edge $e_{i}$ to $h(n,0)$ is $1+2i$.

Now compare this with the random walk biased by $\varepsilon$, which we obtain by assigning $\operatorname{wt}(e_{i})=(1+\varepsilon)^{i}$. The contribution of edge $e_{i}$ to the biased hitting time $h^{+}(n,0)$ is

1+2((1+\varepsilon)+(1+\varepsilon)^{2}+\dots+(1+\varepsilon)^{i})\approx 2(1+\varepsilon)^{i}/\varepsilon>2\binom{i}{2}\varepsilon.

Thus the ratio of this contribution to the corresponding contribution in the unbiased random walk is $\Omega(i\varepsilon)$. Over half of the unbiased hitting time comes from edges with $i\geq n/2$, and for such $i$ the ratio is $\Omega(n\varepsilon)$. Therefore, if $\varepsilon=\omega(1/n)$, then $h^{+}(n,0)/h(n,0)=\omega(1)$, as we wanted. $\hfill\blacktriangleleft$
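The edge-by-edge accounting in this proof is easy to reproduce numerically. The sketch below is an illustration rather than the paper's own code; the function names `edge_contributions` and `hitting_time` and the choice $n=200$ are assumptions. It applies the weighted essential edge lemma to each step $j+1\to j$ of the path with $\operatorname{wt}(e_{i})=(1+\varepsilon)^{i}$, recovers $h(n,0)=n^{2}$ when $\varepsilon=0$, and prints the ratio $h^{+}(n,0)/h(n,0)$ for $\varepsilon$ below, at, and above the $1/n$ scale.

```python
def edge_contributions(n, eps=0.0):
    """Contribution of each edge e_i = {i, i+1} to h(n, 0) on the path 0..n.

    Edge weights are wt(e_i) = (1 + eps) ** i; eps = 0 gives the unbiased walk.
    Uses the weighted essential edge lemma on every step j+1 -> j.
    """
    wt = [(1.0 + eps) ** i for i in range(n)]
    contrib = [0.0] * n
    for j in range(n):                 # the step j+1 -> j crosses the cut edge e_j
        contrib[j] += 1.0              # the "+1" is charged to the cut edge itself
        for i in range(j + 1, n):      # edges e_i in the component containing j+1
            contrib[i] += 2.0 * wt[i] / wt[j]
    return contrib


def hitting_time(n, eps=0.0):
    """Expected time h(n, 0) to reach node 0 from node n."""
    return sum(edge_contributions(n, eps))


n = 200
base = hitting_time(n)                 # unbiased walk, equals n ** 2
for c in (0.1, 1.0, 10.0):             # eps = c / n
    eps = c / n
    print(f"eps = {c}/n: ratio = {hitting_time(n, eps) / base:.2f}")
```

The double loop costs $O(n^{2})$ operations, which is fine for a sanity check; a closed-form geometric sum would be faster but hides the per-edge charging that the proof relies on.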

5 Cover Time Robustness

In the non-adversarial setting, Matthews [19] upper bounded the cover time in terms of the max hitting time:

Theorem 5.1 ([19]).

$t_{\mathrm{cov}}\leq t_{\mathrm{hit}}(1+1/2+\dots+1/n)\leq t_{\mathrm{hit}}(1+\ln n)$.

There are many examples where this is tight up to a constant factor: expanders (and hence most graphs) [5], the complete graph (coupon collecting), two and higher dimensional grids and tori [2, 25], and the balanced binary tree [24].
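For the complete graph the tightness can be seen by direct calculation. The following Python sketch is illustrative only, with assumed helper names (`harmonic`, `matthews_bound`, `log_bound`, `complete_graph_cover_time`). It uses two standard facts about $K_{n}$ without self loops: the hitting time of a fixed node is $n-1$, and the expected cover time is $(n-1)H_{n-1}$ by coupon collecting.

```python
import math


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def matthews_bound(t_hit, n):
    """Upper bound t_hit * H_n on the cover time from Theorem 5.1."""
    return t_hit * harmonic(n)


def log_bound(t_hit, n):
    """The weaker closed form t_hit * (1 + ln n)."""
    return t_hit * (1.0 + math.log(n))


def complete_graph_cover_time(n):
    """Expected cover time of K_n (no self loops): (n - 1) * H_{n-1}.

    With k nodes still unvisited, each step finds a new one with probability k/(n-1).
    """
    return (n - 1) * harmonic(n - 1)


for n in (10, 100, 1000):
    t_hit = n - 1  # hitting a fixed node of K_n is geometric with success rate 1/(n-1)
    print(n, complete_graph_cover_time(n), matthews_bound(t_hit, n), log_bound(t_hit, n))
```

In this example the exact cover time and Matthews' bound differ only by the factor $H_{n}/H_{n-1}$, which tends to 1.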

We observe that in any graph with $t_{\mathrm{cov}}=\Theta(t_{\mathrm{hit}}\log n)$, we have $\rho_{\mathrm{cov}}=\Omega(\rho_{\mathrm{hit}})$. This is because Matthews' proof works the same in the adversarial setting.

Theorem 5.2.

For any adversary, $t_{\mathrm{cov}}^{+}\leq t_{\mathrm{hit}}^{+}(1+\ln n)$ .

Proof.

Matthews' proof [19] works in the adversarial setting, so

t_{\mathrm{cov}}^{+}\leq t_{\mathrm{hit}}^{+}(1+1/2+\dots+1/n)\leq t_{\mathrm{hit}}^{+}(1+\ln n).

$\hfill\blacktriangleleft$

Corollary 5.3.

In any graph with $t_{\mathrm{cov}}=\Theta(t_{\mathrm{hit}}\log n)$ , we have $\rho_{\mathrm{cov}}=\Omega(\rho_{\mathrm{hit}})$ .

We conjecture that more generally, in any graph, $\rho_{\mathrm{cov}}=\Theta(\rho_{\mathrm{hit}})$ .

References

[1] D. Aldous and J. A. Fill. Reversible Markov chains and random walks on graphs, 2002. Unfinished monograph, recompiled 2014, available at http://www.stat.berkeley.edu/~aldous/RWG/book.html.
[2] D.J. Aldous. On the time taken by random walks on finite groups to visit every state. Z. Wahrscheinlichkeitstheorie verw Gebiete, 62:361–374, 1983.
[3] C. Avin, M. Koucky, and Z. Lotker. Cover time and mixing time of random walks on dynamic graphs. Random Structures and Algorithms, 52:576–596, 2018. doi:10.1002/RSA.20752.
[4] Y. Azar, A. Z. Broder, A. R. Karlin, N. Linial, and S. J. Phillips. Biased random walks. Combinatorica, 16:1–18, 1996. doi:10.1007/BF01300124.
[5] A.Z. Broder and A.R. Karlin. Bounds on the cover time. Journal of Theoretical Probability, 2:101–120, 1989.
[6] S. Cannon, J. J. Daymude, D. Randall, and A.W. Richa. A Markov chain algorithm for compression in self-organizing particle systems. In Proceedings of the 2016 Symposium on Principles of Distributed Computing, pages 279–288, 2016.
[7] B. Chin, A. Moitra, E. Mossel, and C. Sandon. The power of an adversary in Glauber dynamics. In The Thirty Seventh Annual Conference on Learning Theory, volume 247 of Proceedings of Machine Learning Research, pages 1102–1124, 2024. URL: https://proceedings.mlr.press/v247/chin24a.html.
[8] R.V. Craiu, L. Gray, K. Latuszynski, N. Madras, G.O. Roberts, and J.S. Rosenthal. Stability of adversarial Markov chains, with an application to adaptive MCMC algorithms. Annals of Applied Probability, 25:3592–3623, 2015.
[9] D. Doron, D. Moshkovitz, J. Oh, and D. Zuckerman. Almost Chor-Goldreich sources and adversarial random walks. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, pages 1–9, 2023.
[10] D. Doron, D. Moshkovitz, J. Oh, and D. Zuckerman. Online condensing of unpredictable sources via random walks. In Proceedings of the 40th Computational Complexity Conference, volume 339 of LIPIcs, pages 30:1–30:17, 2025. doi:10.4230/LIPIcs.CCC.2025.30.
[11] D.J. Hartfiel. On the limiting set of stochastic products $xA_{1}\ldots A_{k}$ . Proceedings of the American Mathematical Society, 81:201–206, 1981.
[12] D.J. Hartfiel. Markov Set-Chains, volume 1695 of Lecture Notes in Mathematics. Springer-Verlag, 1998.
[13] J. Haslegrave, T. Sauerwald, and J. Sylvester. Time dependent biased random walks. ACM Transactions on Algorithms, 18, 2022.
[14] J.J. Hunter. Mixing times with applications to perturbed Markov chains. Linear Algebra and Its Applications, 417:108–123, 2006.
[15] S. Kania, D. Aristoff, and D.M. Zuckerman. Riteweight: Randomized iterative trajectory reweighting for steady-state distributions without discretization error. Technical report, arXiv, 2024. arXiv:2401.05597.
[16] D.A. Levin and Y. Peres. Markov Chains and Mixing Times. American Mathematical Society, 2nd edition, 2017.
[17] S. Li, B. Dutta, S. Cannon, J. J. Daymude, R. Avinery, E. Aydin, A. W. Richa, D. I. Goldman, and D. Randall. Programming active cohesive granular matter with mechanically induced phase changes. Science Advances, 7:eabe8494, 2021.
[18] Y. Liu. Perturbation bounds for the stationary distributions of Markov chains. SIAM Journal on Matrix Analysis and Applications, 33:1057–1074, 2012. doi:10.1137/110838753.
[19] P. Matthews. Covering problems for Markov chains. Annals of Probability, 16:1215–1228, 1988.
[20] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical Report 1999-66, Stanford InfoLab, 1999.
[21] R. Rubinfeld. The cover time of a regular expander is O(n log n). Information Processing Letters, 35:49–51, 1990. doi:10.1016/0020-0190(90)90173-U.
[22] T. Sauerwald and L. Zanetti. Random walks on dynamic graphs: Mixing times, hitting times, and return probabilities. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming, volume 132 of LIPIcs, pages 93:1–93:15, 2019. doi:10.4230/LIPIcs.ICALP.2019.93.
[23] D. Vial and V.G. Subramanian. Restart perturbations for reversible Markov chains: Trichotomy and pre-cutoff equivalence. Random Structures and Algorithms, 66, 2025.
[24] D. Zuckerman. Covering times of random walks on bounded degree trees and other graphs. Journal of Theoretical Probability, 2:147–157, 1989.
[25] D. Zuckerman. A technique for lower bounding the cover time. SIAM Journal on Discrete Mathematics, 5:254–259, 1992.

