Phase Transitions of Best-of-Two and Best-of-Three on Stochastic Block Models

This paper is concerned with voting processes on graphs where each vertex holds one of two different opinions. In particular, we study the \emph{Best-of-two} and the \emph{Best-of-three}. Here at each synchronous and discrete time step, each vertex updates its opinion to match the majority among the opinions of two random neighbors and itself (the Best-of-two) or the opinions of three random neighbors (the Best-of-three). Previous studies have explored these processes on complete graphs and expander graphs, but we understand significantly less about their properties on graphs with more complicated structures. In this paper, we study the Best-of-two and the Best-of-three on the stochastic block model $G(2n,p,q)$, which is a random graph consisting of two distinct Erd\H{o}s-R\'enyi graphs $G(n,p)$ joined by random edges with density $q\leq p$. We obtain two main results. First, if $p=\omega(\log n/n)$ and $r=q/p$ is a constant, we show that there is a phase transition in $r$ with threshold $r^*$ (specifically, $r^*=\sqrt{5}-2$ for the Best-of-two, and $r^*=1/7$ for the Best-of-three). If $r>r^*$, the process reaches consensus within $O(\log \log n+\log n/\log (np))$ steps for any initial opinion configuration with a bias of $\Omega(n)$. By contrast, if $r<r^*$, then there exists an initial opinion configuration with a bias of $\Omega(n)$ from which the process requires at least $2^{\Omega(n)}$ steps to reach consensus. Second, if $p$ is a constant and $r>r^*$, we show that, for any initial opinion configuration, the process reaches consensus within $O(\log n)$ steps. To the best of our knowledge, this is the first result concerning multiple-choice voting for arbitrary initial opinion configurations on non-complete graphs.


Introduction
This paper is concerned with voting processes on distributed networks. Consider an undirected connected graph G = (V, E) where each vertex v ∈ V initially holds an opinion from a finite set. A voting process is defined by a local updating rule: Each vertex updates its opinion according to the rule. Voting processes appear as simple mathematical models in a wide range of fields, e.g. social behavior, physical phenomena and biological systems [37,35,4]. In distributed computing, voting processes are known as a simple approach for consensus problems [24,27].

Our results
This paper considers the stochastic block model, a well-known random graph model that forms multiple communities. This model has been well explored in a wide range of fields, including biology [11,36], network analysis [5,28] and machine learning [2,1], where it serves as a benchmark for community detection algorithms. The study of voting processes on the stochastic block model has a potential application in distributed community detection algorithms [6,9,19]. In this paper, we focus on the following model, which admits two communities of equal size.
Definition 1.1 (Stochastic block model). For $n \in \mathbb{N}$ and $p, q \in [0, 1]$ with $q \leq p$, the stochastic block model $G(2n, p, q)$ is a graph on a vertex set $V = V_1 \cup V_2$, where $|V_1| = |V_2| = n$ and $V_1 \cap V_2 = \emptyset$. In addition, each pair $\{u, v\}$ of distinct vertices $u \in V_i$ and $v \in V_j$ forms an edge with probability $\theta$, independently of any other edges, where $\theta = p$ if $i = j$ and $\theta = q$ otherwise.
Note that $G(2n, p, q)$ is not connected w.h.p. if $p = o(\log n/n)$ [25]. Throughout this paper, we assume $p = \omega(\log n/n)$, in which regime each community is connected w.h.p.
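As an illustration of Definition 1.1, the following minimal sketch samples one instance of $G(2n, p, q)$ as adjacency lists. The function name `sample_sbm`, the vertex labelling (vertices 0 to n-1 for $V_1$, the rest for $V_2$) and the parameter values are assumptions made for this example only.

```python
import random

def sample_sbm(n, p, q, seed=0):
    """Return adjacency lists of one sample of G(2n, p, q).

    Labelling assumed here: vertices 0..n-1 form V_1, vertices n..2n-1 form V_2.
    """
    rng = random.Random(seed)
    neighbors = [[] for _ in range(2 * n)]
    for u in range(2 * n):
        for v in range(u + 1, 2 * n):
            # theta = p inside a community, q across the two communities
            theta = p if (u < n) == (v < n) else q
            if rng.random() < theta:
                neighbors[u].append(v)
                neighbors[v].append(u)
    return neighbors

n, p, q = 150, 0.5, 0.1
graph = sample_sbm(n, p, q)
avg_degree = sum(len(nb) for nb in graph) / (2 * n)
expected_degree = (n - 1) * p + n * q
print(avg_degree, expected_degree)  # empirical versus expected degree
```

The quadratic loop over all pairs is chosen for clarity rather than speed; it decides each pair exactly once, so the resulting lists are symmetric.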
In this paper, we first generate a random graph $G(2n, p, q)$, and then set an initial opinion configuration from $\{1, 2\}$. Let $A^{(0)}, A^{(1)}, \ldots$ be a sequence of random vertex subsets, where $A^{(t)}$ is the set of vertices of opinion 1 at step $t$. For any $A \subseteq V$, the consensus time $T_{\mathrm{cons}}(A)$ is defined as $T_{\mathrm{cons}}(A) := \min\{t \geq 0 : A^{(t)} \in \{\emptyset, V\},\ A^{(0)} = A\}$. We obtain two main results, described below.
Result I: phase transition. Observe that, if $p = q = 1$, then $G(2n, 1, 1)$ is a complete graph and the consensus time of the Best-of-two is $O(\log n)$, from the results of [22]. On the other hand, the graph $G(2n, 1, 0)$ consists of two disjoint complete graphs, each of size $n$, meaning that, depending on the initial state, the process may never reach consensus. This naturally raises the following question: Where is the boundary between these two phenomena? This motivated us to study the consensus times of the Best-of-two and the Best-of-three on $G(2n, p, q)$ for a wide range of $r := q/p$, and led us to the following answers.
Theorem 1.2 (Phase transition of the Best-of-three on $G(2n, p, q)$). Consider the Best-of-three on $G(2n, p, q)$ such that $r := q/p$ is a constant.
(ii) If $r < 1/7$, then $G(2n, p, q)$ w.h.p. satisfies the following property: There exist a set $A \subseteq V$ with $|A| - |V \setminus A| = \Omega(n)$ and two positive constants $C, C' > 0$ such that
Theorem 1.3 (Phase transition of the Best-of-two on $G(2n, p, q)$). Consider the Best-of-two on $G(2n, p, q)$ such that $r := q/p$ is a constant.
(i) If $r > \sqrt{5} - 2$, then $G(2n, p, q)$ w.h.p. satisfies the following property: There exist two positive constants $C, C' > 0$ such that $\forall A \subseteq V$ of $|A| - |V \setminus A| = \Omega(n)$ : $\Pr[T_{\mathrm{cons}}(A) \leq C(\log\log n + \log n/\log(np))]$
(ii) If $r < \sqrt{5} - 2$, then $G(2n, p, q)$ w.h.p. satisfies the following property: There exist a set $A \subseteq V$ with $|A| - |V \setminus A| = \Omega(n)$ and two positive constants $C, C' > 0$ such that
Note that the upper bound $T_{\mathrm{cons}}(A) = O(\log\log n + \log n/\log(np))$ is tight up to a constant factor if $\log n/\log(np) \geq \log\log n$. To see this, observe that there exists an $A \subseteq V$ such that $T_{\mathrm{cons}}(A)$ is at least half of the diameter. In addition, it is easy to see that the diameter of $G(2n, p, q)$ is $\Theta(\log n/\log(np))$ w.h.p. [25].
We also note that the consensus time of the pull voting is O(poly(n)) for any non-bipartite graph [29]. To the best of our knowledge, Theorem 1.2 and Theorem 1.3 provide the first nontrivial graphs where the consensus time of a multiple-choice voting process is exponentially slower than that of the pull voting.
Result II: worst-case analysis. The most central topic in voting processes is symmetry breaking, i.e. the number of iterations required to create a small bias starting from the half-and-half state. Here, we are interested in the worst-case consensus time with respect to initial opinion configurations. To the best of our knowledge, all current results on the worst-case consensus time of multiple-choice voting processes deal with complete graphs [22,7,10,26]. All previous work on non-complete graphs has involved some special bias setting (e.g. an initial bias [14,15,16], or a random initial opinion configuration [3,19,32]). In this paper, we present the following first worst-case analysis on non-complete graphs.
Theorem 1.4 (Worst-case analysis of the Best-of-three on $G(2n, p, q)$). Consider the Best-of-three on $G(2n, p, q)$ such that $p$ and $q$ are positive constants. If $q/p > 1/7$, then $G(2n, p, q)$ w.h.p. satisfies the following property: There exist two positive constants $C, C' > 0$ such that
Theorem 1.5 (Worst-case analysis of the Best-of-two on $G(2n, p, q)$). Consider the Best-of-two on $G(2n, p, q)$ such that $p$ and $q$ are positive constants. If $q/p > \sqrt{5} - 2$, then $G(2n, p, q)$ w.h.p. satisfies the following property: There exist two positive constants $C, C' > 0$ such that
Based on these theorems, an immediate but important corollary follows.
Corollary 1.6. For any constant $p > 0$, the Best-of-two and the Best-of-three on the Erdős-Rényi graph $G(n, p)$ reach consensus within $O(\log n)$ steps w.h.p. for all initial opinion configurations.
Recall that the Best-of-two and the Best-of-three on $G(n, p)$ have been extensively studied in previous works, but these works place the aforementioned assumptions on the initial bias. The authors of [22] exploited this idea for the Best-of-two and obtained the worst-case analysis for the consensus time on complete graphs. Somewhat interestingly, we also have $E[|A'| \mid A] = f(|A|)$ in the Best-of-three.
Cooper et al. [14] extended this approach to the Best-of-two on regular expander graphs. Specifically, they proved that $E[|A'| \mid A] = f(|A|) \pm O(\epsilon)$ for all $A \subseteq V$, where $\epsilon = \epsilon(n, \lambda_2) = o(n)$ is some function obtained using the expander mixing lemma. This argument assumes an initial bias of size $\Omega(\epsilon)$. In another paper, Cooper et al. [15] improved this technique and proved more sophisticated results that hold for general (i.e. not necessarily regular) expander graphs.
In this paper, we consider G(2n, p, q) on the vertex set (2) for details. We show the same result for the Best-of-two (51). Here, our key tool is the concentration method, specifically the Janson inequality (Lemma A.15) and the Kim-Vu concentration (Lemma A.16).
High-level proof sketch. Consider the Best-of-three on $G(2n, p, q)$, and let $A^{(0)}, A^{(1)}, \ldots$ be a sequence of random vertex subsets determined by $A^{(t+1)} := (A^{(t)})'$ for each $t \geq 0$. Consider a stochastic process $\alpha^{(t)} = (\alpha^{(t)}_1, \alpha^{(t)}_2)$, where $\alpha^{(t)}_i := |A^{(t)} \cap V_i|/n$. Our technical result in the previous paragraph approximates the stochastic process $\alpha^{(t)}$ by the deterministic process $a^{(t)}$ defined as $a^{(t+1)} = H(a^{(t)})$ and $a^{(0)} = \alpha^{(0)}$ for some function $H : [0, 1]^2 \to [0, 1]^2$ (see (4) and Figure 2). The function $H$ induces a two-dimensional dynamical system, which we call the induced dynamical system. Using this, we obtain two results concerning $\alpha^{(t)}$.
First, we show that, for any initial configuration, the process reaches one of the zero areas (a neighborhood of a fixed point of H) within a constant number of steps. To show this, in addition to the approximation result, we use the theory of competitive dynamical systems [30].
Second, we characterize the behavior of $\alpha^{(t)}$ in zero areas. The zero areas depend only on $r = q/p$, and are classified into four types using the Jacobian matrix: consensus, sink, saddle and source areas (see Figure 1 for a description). In consensus areas, we show that the process reaches consensus within $O(\log\log n + \log n/\log(np))$ steps. In sink areas, we show that the process remains there for at least $2^{\Omega(n)}$ steps, and also that sink areas only appear if $r < 1/7$. In saddle and source areas, we show that the process escapes within $O(\log n)$ steps if $p$ is a constant, using techniques of [22]. Intuitively speaking, in these two kinds of areas, there is a drift towards the outside. To apply the techniques of [22], we show that $\mathrm{Var}[|A'_i|] = \Omega(n)$ in the area if $p$ is constant, which leads to our worst-case analysis result. Indeed, no previous work on expander graphs investigated the worst case, owing to the lack of a variance estimate.
These arguments also enable us to study the Best-of-two process, which implies Theorem 1.3.

Related work
The consensus time of the pull voting process is investigated via its dual process, known as coalescing random walks [29,13,17]. Recently coalescing random walks have been extensively studied, including the relationship with properties of random walks such as the hitting time and the mixing time [31,39].
Other studies have focused on voting processes with more general updating rules. Cooper and Rivera [17] studied the linear voting model, whose updating rule is characterized by a set of n × n binary matrices. This model covers the synchronous pull and the asynchronous push/pull voting processes. However, it does not cover the Best-of-two and the Best-of-three. Schoenebeck and Yu [40] studied asynchronous voting processes whose updating functions are majority-like (including the asynchronous Best-of-(2k + 1) voting processes). They gave upper bounds on the consensus times of such models on dense Erdős-Rényi random graphs using a potential technique.
Organization. First, we fix notation and give a precise definition of the Best-of-three in Section 2. After explaining key properties of the stochastic block model in Section 3, we show some auxiliary results on the induced dynamical system in Section 4. Then we derive Theorems 1.2 and 1.4 in Section 5. Our general framework of voting processes and results on general induced dynamical systems are given in Sections 6 and 7, respectively. Then we give proofs of the key properties of the stochastic block model in Section 8 and of the results on general induced dynamical systems in Section 9. In Section 10, we show Theorems 1.3 and 1.5, and we conclude this paper in Section 11.

Best-of-three voting process
Here, we study the Best-of-three with two possible opinions from $\{1, 2\}$.
For the set $A$ of vertices holding opinion 1, let $A'$ denote the set of vertices that hold opinion 1 after an update. In the Best-of-three, $A' = \{v \in V : X_v = 1\}$, where $(X_v)_{v \in V}$ are independent binary random variables satisfying $\Pr[X_v = 1] = f^{\mathrm{Bo3}}(\deg_A(v)/\deg(v))$ with $f^{\mathrm{Bo3}}(x) := 3x^2 - 2x^3$, and $\deg_A(v)$ denotes the number of neighbors of $v$ in $A$. For a given vertex subset $A^{(0)} \subseteq V$, we are interested in the behavior of the Markov chain $(A^{(t)})_{t=0}^{\infty}$, i.e. the sequence of random vertex subsets determined by $A^{(t+1)} := (A^{(t)})'$ for each $t \geq 0$.
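The update rule can be made concrete with a small simulation. The sketch below performs one synchronous Best-of-three round and compares the resulting fraction of opinion-1 vertices with the polynomial $3x^2 - 2x^3$. It assumes that the three neighbors are sampled uniformly and independently with replacement; the complete-graph test instance and all names are hypothetical choices for this example.

```python
import random

def f_bo3(x):
    """Probability that at least two of three independent samples hold opinion 1,
    when each sample holds opinion 1 with probability x."""
    return 3 * x**2 * (1 - x) + x**3  # equals 3x^2 - 2x^3

def best_of_three_step(neighbors, in_A, rng):
    """One synchronous round: returns the indicator list of A' given that of A.

    Assumption of this sketch: three neighbors are drawn uniformly at random,
    independently and with replacement.
    """
    new_in_A = []
    for v in range(len(neighbors)):
        hits = sum(in_A[rng.choice(neighbors[v])] for _ in range(3))
        new_in_A.append(hits >= 2)  # majority of the three sampled opinions
    return new_in_A

# Hypothetical test instance: complete graph on 600 vertices, 40% of them in A.
rng = random.Random(1)
m = 600
neighbors = [[u for u in range(m) if u != v] for v in range(m)]
in_A = [v < 240 for v in range(m)]
next_A = best_of_three_step(neighbors, in_A, rng)
fraction = sum(next_A) / m
print(fraction, f_bo3(0.4))  # empirical |A'|/m against the polynomial value
```

On a single round the empirical fraction fluctuates around the polynomial value, which is the kind of concentration that the Hoeffding bound quantifies.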
the Hoeffding bound (Lemma A.9) implies that the following holds w.h.p. for $i = 1, 2$:

Concentration result for the stochastic block model
In this paper, we consider the Best-of-three on the stochastic block model $G(2n, p, q)$ (Definition 1.1). The quantity in (1) is a random variable since $G(2n, p, q)$ is a random graph. Here, our key ingredient is the following general concentration result for $G(2n, p, q)$.
satisfies the following properties.
(P1) It is connected and non-bipartite.
Note that the proof of (P1) is not difficult since $p = \omega(\log n/n)$ and $q \geq \log n/n^2$ [25]. Proving (P2) and (P3), however, is more challenging: we show these in Section 8.
From Theorem 3.2, $G(2n, p, q)$ is $f^{\mathrm{Bo3}}$-good w.h.p. Hence, we consider the Best-of-three on an $f^{\mathrm{Bo3}}$-good $G(2n, p, q)$. From (P2) and (P3), we have for all $A \subseteq V$ and $i = 1, 2$. Here, we remark that (P3) is stronger than (P2) if $|A|$ is sufficiently small. This property will play a key role in the proof of Proposition 4.5.
Idea of the proof of Theorem 3.2. We consider the property (P2). Note that we may assume $f(x) = x^k$ for some constant $k$ w.l.o.g. since it suffices to obtain the concentration result for each term of $f$. For simplicity, let us exemplify our idea on the special case of $k = 3$. It is known that $\deg(v) = n(p + q) \pm O(\sqrt{np \log n})$ holds for all $v \in V$ w.h.p. (see, e.g. [25]). This implies that The core of the proof is the concentration of

Induced dynamical system
Let $\alpha_i := |A_i|/n$, $\alpha'_i := |A'_i|/n$ and $r := q/p$. Suppose that $r$ is a constant. Then, for an $f^{\mathrm{Bo3}}$-good $G(2n, p, q)$, it holds w.h.p. that for all $A \subseteq V$ and $i = 1, 2$ since (1) and (2) hold. Throughout this paper, we use $\alpha = (\alpha_1, \alpha_2)$ and $\alpha' = (\alpha'_1, \alpha'_2)$ as vector-valued random variables. Equation (3) leads us to the dynamical system $H$, where we define $H : [0, 1]^2 \to [0, 1]^2$ by $H(a_1, a_2) := (H_1(a_1, a_2), H_2(a_1, a_2))$ and $H_i(a_1, a_2) := f^{\mathrm{Bo3}}\left(\frac{a_i + r a_{3-i}}{1 + r}\right)$.
Figure 2: The induced dynamical system $H$ of (4), shown in two panels, (a) and (b) $r = 1/9$. The points $d^*_i$ are the fixed points given in (8). Here, the horizontal and vertical axes correspond to $\alpha_1$ and $\alpha_2$, respectively. We can observe two sink points in (b), but none in (a).
Let $H$ be the mapping (4) and define $(a^{(t)})_{t=0}^{\infty}$ as $a^{(0)} = \alpha^{(0)}$ and $a^{(t+1)} = H(a^{(t)})$ for each $t \geq 0$. Then there exists a positive constant $C > 0$ such that Broadly speaking, Theorem 4.1 approximates the behavior of $\alpha^{(t)}$ by the orbit $a^{(t)}$ of the corresponding dynamical system $H$. We call the mapping $H$ the induced dynamical system. Indeed, the same results as (2) hold for the Best-of-two voting. Therefore, analogous results of Theorem 4.1 hold, which enable us to analyze the Best-of-two on $G(2n, p, q)$ via its induced dynamical system. The dynamical system $H$ of (4) is illustrated in Figure 2.
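The deterministic orbit is easy to explore numerically. The following minimal sketch iterates the map $H_i(a_1, a_2) = f^{\mathrm{Bo3}}((a_i + r a_{3-i})/(1+r))$ from one biased starting point for a value of $r$ above $1/7$ and one below it. The starting point, the two values of $r$ and the number of steps are arbitrary choices made for this illustration.

```python
import math

def f_bo3(x):
    return 3 * x**2 - 2 * x**3

def H(a, r):
    """Induced map of the Best-of-three: H_i(a) = f((a_i + r*a_{3-i}) / (1 + r))."""
    a1, a2 = a
    return (f_bo3((a1 + r * a2) / (1 + r)), f_bo3((a2 + r * a1) / (1 + r)))

def orbit_end(a0, r, steps):
    """Return a^(steps) for the deterministic orbit a^(t+1) = H(a^(t))."""
    a = a0
    for _ in range(steps):
        a = H(a, r)
    return a

a0 = (0.95, 0.15)                  # opinion 1 is held by 55% of all vertices
above = orbit_end(a0, 1 / 3, 200)  # r > 1/7
below = orbit_end(a0, 1 / 9, 200)  # r < 1/7
print(above)
print(below)
```

In this example the orbit for $r = 1/3$ approaches the consensus point $(1, 1)$, whereas the orbit for $r = 1/9$ settles at a non-consensus fixed point in which the two communities disagree.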
To make the calculations more convenient, we change the coordinate of $H$ by Note that the $\delta_1$ and $\delta_2$ axes correspond to the dotted lines of Figure 1. Let $u := \frac{1-r}{1+r}$. Then we have where This suggests another dynamical system $T(d) = (T_1(d), T_2(d))$. Here, we use $d = (d_1, d_2)$ as a specific point and $\delta = (\delta_1, \delta_2)$ as a vector-valued random variable. Consider $\delta^{(t)} = (\delta^{(t)}_1, \delta^{(t)}_2)$ for each $t \geq 0$. From Theorem 4.1, it holds w.h.p. that for a sufficiently large constant $C > 0$, any $0 \leq t \leq n^{o(1)}$ and any initial configuration $A^{(0)} \subseteq V$. For notational convenience, we use $\delta' := \delta^{(t+1)}$ for $\delta = \delta^{(t)}$. Similarly, we write $d'$ for $T(d)$.
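To illustrate the change of coordinates, the sketch below assumes $\delta_1 = \alpha_1 - \alpha_2$ and $\delta_2 = \alpha_1 + \alpha_2 - 1$, which is an assumption of this example (it is consistent with the constraint $|\delta_1| + |\delta_2| \leq 1$). Under this assumption, $T$ is obtained by conjugating $H$ with the coordinate change. The closed form in `T_closed_form` is an expansion worked out for this sketch and is checked numerically against the conjugated map.

```python
def f_bo3(x):
    return 3 * x**2 - 2 * x**3

def H(a, r):
    a1, a2 = a
    return (f_bo3((a1 + r * a2) / (1 + r)), f_bo3((a2 + r * a1) / (1 + r)))

def to_delta(a):
    # Assumed coordinates: delta_1 = a_1 - a_2, delta_2 = a_1 + a_2 - 1
    return (a[0] - a[1], a[0] + a[1] - 1)

def from_delta(d):
    return ((1 + d[1] + d[0]) / 2, (1 + d[1] - d[0]) / 2)

def T(d, r):
    """The map H expressed in the assumed delta coordinates."""
    return to_delta(H(from_delta(d), r))

def T_closed_form(d, r):
    """Hand expansion of T for this sketch, with u = (1 - r) / (1 + r)."""
    u = (1 - r) / (1 + r)
    d1, d2 = d
    t1 = (u * d1 / 2) * (3 - 3 * d2**2 - u**2 * d1**2)
    t2 = (d2 / 2) * (3 - d2**2 - 3 * u**2 * d1**2)
    return (t1, t2)

r = 1 / 3
d = (0.8, 0.1)
print(T(d, r), T_closed_form(d, r))
print(T((-0.8, 0.1), r), T((0.8, -0.1), r))  # sign flips of one coordinate
```

Flipping the sign of one input coordinate flips the sign of the corresponding output coordinate only, which can be read off from the printed values.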
Note that $\delta$ satisfies $|\delta_1| + |\delta_2| \leq 1$. In addition, the dynamical system $T$ is symmetric: Precisely, $T_1(\pm d_1, \mp d_2) = \pm T_1(d_1, d_2)$ and $T_2(\pm d_1, \mp d_2) = \mp T_2(d_1, d_2)$ hold. In Lemma 4.2, we assert that the sequence $(d^{(t)})_{t=0}^{\infty}$ is closed in From now on, we focus on $S$ and consider the behavior of $\delta$ around fixed points. A straightforward calculation shows that Here, we provide auxiliary results needed for the proofs of Theorems 1.2 and 1.4.
Proposition 4.4 (Dynamics around $d^*_2$). Consider the Best-of-three on an $f^{\mathrm{Bo3}}$-good $G(2n, p, q)$ such that $r = q/p < 1/7$ is a constant. Then there exists a positive constant $\epsilon = \epsilon(r)$ satisfying In particular, $T_{\mathrm{cons}}(A) = \exp(\Omega(n))$ w.h.p. for any $A$ satisfying $\delta_+ \in B(d^*_2, \epsilon)$.
Intuitive explanations for Propositions 4.4 to 4.6. In Propositions 4.4 to 4.6, we consider the behavior of $\alpha^{(t)}$ around the fixed points (8). Let $H$ be the induced dynamical system and let $J$ be the Jacobian matrix of $H$ at a fixed point $a^*$ with two eigenvalues $\lambda_1, \lambda_2$. If the eigenvectors are linearly independent, we can rewrite $J$ as $J = U^{-1} \Lambda U$, where $\Lambda := \mathrm{diag}(\lambda_1, \lambda_2)$ and $U$ is some nonsingular matrix. Let $\beta := U(\alpha - a^*)$ and $\beta' := U(\alpha' - a^*)$. Roughly speaking, if $\alpha$ is close to $a^*$, the Taylor expansion at $a^*$ (i.e. $H(\alpha) \approx a^* + J(\alpha - a^*)$) yields $\beta' \approx U J (\alpha - a^*) = \Lambda U (\alpha - a^*) = \Lambda \beta$. In other words, $\beta'_i \approx \lambda_i \beta_i$. If $\max\{|\lambda_1|, |\lambda_2|\} < 1 - c$ for some constant $c > 0$, we might expect that $\|\beta\| = \Theta(\|\alpha - a^*\|)$ is likely to remain small. Here, we do not restrict this argument to the Best-of-three. We will prove Proposition 7.2, which is a generalized version of Proposition 4.4. If $\max\{|\lambda_1|, |\lambda_2|\} > 1 + c$ for some constant $c > 0$, the norm $\|\beta\|$ seems to become large in a small number of steps. We will exploit this insight and prove Proposition 7.8, which immediately implies Proposition 4.6. Indeed, for consensus areas (i.e. $a^* \in \{(0, 0), (1, 1)\}$), the induced dynamical systems of the Best-of-three and the Best-of-two satisfy $\lambda_1 = \lambda_2 = 0$. Then, the Taylor expansion yields $\|\alpha' - a^*\| \approx O(\|\alpha - a^*\|^2)$. This observation and the property (P3) lead to the proof of Proposition 7.3 as well as Proposition 4.5.

Derive Theorems 1.2 and 1.4
Here, we prove Theorems 1.2 and 1.4 using Propositions 4.3 to 4.6.

Polynomial voting processes
Using Theorem 3.2, we can prove the same results as Theorem 4.1 for various models including the Best-of-two. Hence, in this paper, we do not restrict our interest to the Best-of-three: Instead, we prove general results that hold for polynomial voting processes on $G(2n, p, q)$.
For the set $A$ of vertices with opinion 1, let $A'$ denote the set of vertices with opinion 1 after an update. In the $(f_1, f_2)$-polynomial voting process, $A' = \{v \in V : X_v = 1\}$, where $(X_v)_{v \in V}$ are independent binary random variables satisfying
In other words, for $i = 1, 2$. The polynomial voting process includes several known voting models, including the Best-of-two and the Best-of-three. For example, We can define the induced dynamical system for any polynomial voting process on $G(2n, p, q)$ via the following result:
Theorem 6.2 (Theorem 4.1 for polynomial voting processes). Let $f_1$ and $f_2$ be polynomials with constant degree. Consider an $(f_1, f_2)$-polynomial voting process on an $f_1$-good and $f_2$-good $G(2n, p, q)$ starting with vertex set Define $(a^{(t)})_{t=0}^{\infty}$ as $a^{(0)} = \alpha^{(0)}$ and $a^{(t+1)} = H(a^{(t)})$ for each $t \geq 0$. Then, there exists a constant $C > 0$ such that Remark that the mapping $H$ of Theorem 6.2 is the induced dynamical system.
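The following sketch illustrates how an induced map could be built for a pair $(f_1, f_2)$. It rests on assumptions that are ours and not taken from the statements above: a vertex currently holding opinion 1 adopts opinion 1 with probability $f_1(x)$ and any other vertex with probability $f_2(x)$, where $x$ is the fraction of its neighbors holding opinion 1, so that $H_i(a) = a_i f_1(x_i) + (1 - a_i) f_2(x_i)$ with $x_i = (a_i + r a_{3-i})/(1+r)$. For the Best-of-two we take $f_1(x) = 1 - (1-x)^2$ and $f_2(x) = x^2$, which follows from taking the majority of two sampled neighbors and the vertex itself.

```python
def bo2_f1(x):
    # Vertex holds opinion 1: it keeps it unless both sampled neighbors hold opinion 2.
    return 1 - (1 - x) ** 2

def bo2_f2(x):
    # Vertex holds opinion 2: it switches only if both sampled neighbors hold opinion 1.
    return x ** 2

def bo3_f(x):
    return 3 * x**2 - 2 * x**3

def induced_map(f1, f2, r):
    """Assumed induced map: H_i(a) = a_i*f1(x_i) + (1 - a_i)*f2(x_i)."""
    def H(a):
        out = []
        for i in (0, 1):
            x = (a[i] + r * a[1 - i]) / (1 + r)
            out.append(a[i] * f1(x) + (1 - a[i]) * f2(x))
        return tuple(out)
    return H

def iterate(H, a, steps):
    for _ in range(steps):
        a = H(a)
    return a

start = (0.95, 0.10)
# sqrt(5) - 2 is roughly 0.236; we pick one value of r on each side of it.
above = iterate(induced_map(bo2_f1, bo2_f2, 0.30), start, 500)
below = iterate(induced_map(bo2_f1, bo2_f2, 0.20), start, 500)
print(above)
print(below)
```

With $f_1 = f_2 = f^{\mathrm{Bo3}}$ the same construction reduces to the Best-of-three map, since the two branches then coincide.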
Proof. For any polynomial voting process, the cardinality $|A'_i|$ can be written as the sum of independent random variables: Thus, if we fix $A \subseteq V$, the Hoeffding bound (Lemma A.9) implies that (1) holds w.h.p. Since the property (P2) and (1) lead to Note that the function $H$ satisfies the Lipschitz condition. Hence, a positive constant $C_2$ exists such that Let $\alpha^{(t)} = (\alpha^{(t)}_1, \alpha^{(t)}_2)$ be the vector-valued stochastic process and $a^{(t)} = (a^{(t)}_1, a^{(t)}_2)$ be the vector sequence given in (5). Then, we have where $C$ is a sufficiently large constant.

Results of general induced dynamical systems with applications to the Best-of-three
Now let us focus on the orbit $(\alpha^{(t)})_{t=1}^{\infty}$ such that $H(\alpha^{(0)}) = \alpha^{(0)}$ holds, where $H$ is the induced dynamical system. In this case, Theorem 6.2 does not provide enough information about the dynamics. In dynamical system theory, a natural approach to the local behavior around fixed points is to consider the Jacobian matrix. Recall that the Jacobian matrix $J$ of a function $H : \mathbb{R}^2 \to \mathbb{R}^2$ is $J = (\partial H_i/\partial a_j)_{i,j \in [2]}$.
In the following subsections, we will investigate the local dynamics from the viewpoint of the maximum singular value and eigenvalue of the Jacobian matrix. In contrast to the local dynamics, it is quite difficult to predict the orbit of general dynamical systems since some of them exhibit so-called chaotic behavior. Therefore, the proof of the orbit convergence (e.g. Proposition 4.3) is not trivial. Fortunately, the induced dynamical system of the Best-of-three on $G(2n, p, q)$ is competitive, a well-known property that makes the future orbit predictable [30] (see Appendix A.4 for the definition). In Section 7.4, we use known results on competitive dynamical systems to show Proposition 4.3. It should be noted that the same argument leads to the orbit convergence for the Best-of-two, as we shall discuss in Section 10.

Sink point
We begin by defining the notion of sink points. Recall that a singular value of a matrix $A$ is the nonnegative square root of an eigenvalue of $A^{\top} A$ (see Appendix A.1 for the formal definition and basic properties).
Consider an (f 1 , f 2 )-polynomial voting process on an f 1 -good and f 2 -good G(2n, p, q) such that r = q p is a constant. Let H be the induced dynamical system. Then, for any sink point a * and any sufficiently small = ω( 1/np), holds.

Fast consensus
Suppose that the Jacobian matrix at the consensus point (i.e. $\alpha \in \{(0, 0), (1, 1)\}$) is the all-zero matrix. Then, we claim that the polynomial voting process reaches consensus within a small number of iterations if the initial set $A^{(0)}$ has small size.
Proposition 7.3. Consider an $(f_1, f_2)$-polynomial voting process on an $f_1$-good and $f_2$-good $G(2n, p, q)$ such that $p/q$ is a constant. Suppose that the Jacobian matrix at the point $\alpha = (0, 0)$ is the all-zero matrix.
Then, there exist constants $C_1, C_2, \delta > 0$ such that To show Proposition 7.3, we prove the following result, which may be of independent interest:
Proposition 7.4. Consider a polynomial voting process on a graph $G$ of $n$ vertices. Suppose that there exist absolute constants $C, \delta > 0$ and a function $\epsilon = \epsilon(n) = o(1)$ such that $E[|A'|] \leq C\frac{|A|^2}{n} + \epsilon|A|$ holds for all $A \subseteq V$ of $|A| \leq \delta n$. Then, there exist positive constants $\delta', C', C''$ such that It should be noted that in Proposition 7.4, we do not restrict the underlying graph $G$ to be a random graph.
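To see why a bound of the form $E[|A'|] \leq C|A|^2/n + \epsilon|A|$ suggests fast consensus, one can iterate its deterministic analogue on the fraction $a = |A|/n$. The sketch below is a heuristic illustration only, not part of the proof: it replaces the expectation bound by the recursion $a \leftarrow Ca^2 + \epsilon a$ and counts the steps until $a < 1/n$. The constants `C`, `a0` and the choice of `eps` are hypothetical.

```python
import math

def steps_to_extinction(n, C, eps, a0):
    """Iterate a <- C*a^2 + eps*a until a < 1/n and return the number of steps.

    This is the fraction-scale, deterministic analogue of
    E[|A'|] <= C|A|^2/n + eps|A|; it requires C*a0 + eps < 1 to contract.
    """
    a, steps = a0, 0
    while a >= 1 / n:
        a = C * a * a + eps * a
        steps += 1
    return steps

for n in (10**3, 10**6, 10**12):
    eps = 1 / math.sqrt(n)  # hypothetical choice of eps = o(1)
    print(n, steps_to_extinction(n, C=3.0, eps=eps, a0=0.1))
```

While the quadratic term dominates, the fraction is roughly squared in each step, which accounts for a doubly logarithmic number of steps; once the linear term dominates, each step gains a factor of $\epsilon$.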

Escape from a fixed point
Consider an $(f_1, f_2)$-polynomial voting process on an $f_1$-good and $f_2$-good $G(2n, p, q)$ such that $p$ and $q$ are constants. Let $a^* \in \mathbb{R}^2$ be a fixed point of the induced dynamical system $H$. Let $J$ be the Jacobian matrix of $H$ at $a^*$ and $\lambda_1, \lambda_2$ be its eigenvalues. Let $u_i$ be the eigenvector of $J$ corresponding to $\lambda_i$. Suppose that $u_1, u_2$ are linearly independent. Then, we can rewrite $J$ as $J = U^{-1} \Lambda U$, where $\Lambda := \mathrm{diag}(\lambda_1, \lambda_2)$ and $U$ is some nonsingular matrix. Roughly speaking, from the Taylor expansion of $H$ at $a^*$, we have , one may expect that $\alpha^{(\tau)} \in B(a^*, \epsilon_0)$ holds for any $A^{(0)} \subseteq V$ and for some constant $\epsilon_0 > 0$. We aim to prove this under some assumptions.
Assumption 7.5 (Basic assumptions). We consider an $(f_1, f_2)$-polynomial voting process on an $f_1$-good and $f_2$-good $G(2n, p, q)$ for constants $p \geq q \geq 0$. Let $a^*$ be a fixed point and $J$ be the corresponding Jacobian matrix satisfying (A1) The eigenvectors $u_1$ and $u_2$ are linearly independent.
Under Assumption 7.5, we can define the random variable $\beta$ of (9). Further, we assume the following.
Assumption 7.6. In addition to Assumption 7.5, we assume that there exists a positive constant $\epsilon^*$ satisfying the following: (A4) There exist two positive constants $\epsilon_1, C$ such that Sometimes, it might not be easy to check the conditions of Assumption 7.6. In this paper, we provide the following alternative condition, which is easy to check:
Assumption 7.7. In addition to Assumption 7.5, we assume the following: Based on the assumptions, we prove the following result:
Proposition 7.8 (Escape from source and saddle areas). Let $a^*$ be a fixed point satisfying either Assumption 7.6 or 7.7. Then, there exist $\tau = O(\log n)$ and a constant $\epsilon > 0$ such that the following holds w.h.p.:

Application to the Best-of-three
In this section, we will prove the results of Section 4. Consider the Best-of-three. The Jacobian matrix of the dynamical system of (6) is Let $J_i$ be the Jacobian matrix at $d^*_i$, where the $d^*_i$ are the fixed points given in (8). A straightforward calculation yields Depending on the eigenvalues $\lambda_1 \geq \lambda_2$ of $J_i$, the property of $d^*_i$ changes as shown in Table 1.
Table 1: Each cell $(c_1, c_2)$ represents the property of the eigenvalues $\lambda_1 \geq \lambda_2$ of the corresponding Jacobian matrix. Precisely, the sign $c_i$ represents whether $\lambda_i$ is larger than 1 or not. For example, $(+, 1)$ indicates that $\lambda_1 > \lambda_2 = 1$. For cells with $(+, -)$ or $(+, +)$, we may apply Proposition 7.8. Indeed, cells with $(-, -)$ correspond to sink points in this model.
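The eigenvalue-based classification can be reproduced numerically. The sketch below works in the original coordinates $(a_1, a_2)$ and considers the fixed point of the form $(1/2 + s, 1/2 - s)$ with $s^2 = (3u - 2)/(4u^3)$ and $u = (1-r)/(1+r)$. This closed form was derived by hand for the present example (it exists only for $u > 2/3$) and is verified in the code by checking $H(a^*) = a^*$. The Jacobian is approximated by central differences, and the function names are hypothetical.

```python
import math

def f_bo3(x):
    return 3 * x**2 - 2 * x**3

def H(a, r):
    a1, a2 = a
    return (f_bo3((a1 + r * a2) / (1 + r)), f_bo3((a2 + r * a1) / (1 + r)))

def antisymmetric_fixed_point(r):
    """Fixed point (1/2 + s, 1/2 - s); hand-derived, valid for u > 2/3 only."""
    u = (1 - r) / (1 + r)
    s = math.sqrt((3 * u - 2) / (4 * u**3))
    return (0.5 + s, 0.5 - s)

def jacobian(a, r, h=1e-6):
    """Central-difference approximation of the 2x2 Jacobian of H at a."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in (0, 1):
        up, dn = list(a), list(a)
        up[j] += h
        dn[j] -= h
        Hu, Hd = H(tuple(up), r), H(tuple(dn), r)
        for i in (0, 1):
            J[i][j] = (Hu[i] - Hd[i]) / (2 * h)
    return J

def eigenvalues_2x2(J):
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))  # real eigenvalues expected here
    return (tr / 2 + root, tr / 2 - root)

def classify(r):
    lam1, lam2 = eigenvalues_2x2(jacobian(antisymmetric_fixed_point(r), r))
    if abs(lam1) < 1 and abs(lam2) < 1:
        kind = "sink"
    elif abs(lam1) > 1 and abs(lam2) > 1:
        kind = "source"
    else:
        kind = "saddle"
    return kind, lam1, lam2

print(classify(1 / 9))  # r below 1/7
print(classify(1 / 6))  # r between 1/7 and 1/5
```

In this example the fixed point is a sink for $r = 1/9$ and a saddle for $r = 1/6$, in line with the statement that sink areas only appear if $r < 1/7$.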
Then, $(d_1, d_2) \in S$ implies $\frac{1}{2} \leq x \leq 1$ and $0 \leq y \leq 1$. In addition, a simple calculation yields Proof of Proposition 4.4. It is straightforward to check that the points $d^*_2$ and $-d^*_2$ are sink points. Therefore, Proposition 4.4 immediately follows from Proposition 7.2.
Proof of Proposition 4.5. Note that $J_4$ is the all-zero matrix and the same holds at $-d^*_4$. Let $\epsilon > 0$ be a sufficiently small constant. If $A$ satisfies $|A| \leq \epsilon n$, apply Proposition 7.3. If $A$ satisfies $|A| \geq (2 - \epsilon)n$, apply Proposition 7.3 for $V \setminus A$.
Proof of Proposition 4.6. Suppose that $u < \frac{3}{4}$ (or equivalently, $r > \frac{1}{7}$) and that $p \geq q \geq 0$ are constants. We prove that for any constant $\epsilon_2 > 0$. We check the condition (A2) of Assumption 7.5. To this end, we show the following result: Theorem 7.9 (Concentration of the variance for the Best-of-three). Consider the Best-of-three on an $f^{\mathrm{Bo3}}$-good $G(2n, p, q)$. Then, two constants $C_1, C_2 > 0$ exist such that Proof. The variance $\mathrm{Var}[|A'_i| \mid A]$ can be represented as Therefore, Theorem 7.9 immediately follows from property (P2).
Let $\epsilon_2 > 0$ be the sufficiently small constant mentioned in (12). The Jacobian matrix $J_1 = J_2$ has eigenvalues $1$ and $\frac{3}{2}$. Suppose that $\|\delta^{(0)}\|_\infty \leq \epsilon_2$ for a sufficiently small constant $\epsilon_2 > 0$. Then, we have This verifies the assumption (A4). On the other hand, for any $\delta$ of $|\delta_1| \leq \epsilon_2$, we have By applying the Hoeffding bound (Lemma A.9) to the random variables $\delta'_1 = \alpha'_1 - \alpha'_2$ and $\delta'_2 = \alpha'_1 + \alpha'_2 - 1$, we obtain In particular, letting $t = \sqrt{\log n/n}$, we have holds w.h.p. We claim that holds, which is equivalent to (A5). From (13) and (14), if $|\delta_1| \leq \epsilon_2$, we have Suppose that $0 \leq u < 1$. We use basic results of competitive dynamics (see Appendix A.4). We first claim that the map $T : S \to S$ is competitive and that it satisfies the conditions (C1) to (C4) described in Appendix A.4. Then, we apply Theorem A.17 and complete the proof of the first statement. To this end, we consider the Jacobian matrix $J$ given in (10).
The condition (C1) follows from Lemma A.19: it is straightforward to check that the Jacobian matrix (10) satisfies the condition of Lemma A.19. To verify the condition (C2), we use the Inverse Function Theorem (Theorem A.3). We claim that $\det J > 0$ for any $d \in S \setminus \{(0, 1)\}$. Indeed, from (10), we have (1)). This follows from a simple calculation

Proof of the f-goodness of the stochastic block model (Theorem 3.2)
In this section we show Theorem 3.2. In Section 8.1, we show that the property (P2) is obtained from Lemma 8.1.
Then two positive constants C 1 , C 2 exist such that the following holds with probability 1 − N −C 1 :

Reduction to W
Proof of (P2) of Theorem 3.2 via Lemma 8.1.
Then from the triangle inequality, it holds that For the second term of the right-hand side of (16), two positive constants $C_1, C_2$ exist such that The second inequality follows from the Lipschitz condition of $f$ (cf. Appendix A.2). The third inequality holds since $E[\deg(v)] = (n - 1)p + nq$ and ( For the first term of the right-hand side of (16), since for any $j$ and $v \in V$, we have A), we obtain the claim from Lemma 8.1. Note that, for any $S \subseteq V$, $a \in \mathbb{R}^V$ and $x \in [0, 1]^V$, $\left|\sum_{s \in S} a_s x_s\right| \leq \max_{U \subseteq V} \left|\sum_{u \in U} a_u\right|$ since $\sum_{s \in S : a_s \leq 0} a_s \leq \sum_{s \in S} a_s x_s \leq \sum_{s \in S : a_s \geq 0} a_s$. Now we introduce the following lemma, which we will use in Sections 8.2 and 8.3.

Discrepancy between the expected value and the ideal value
The first inequality follows directly from the FKG inequality (Lemma A.14) since $\deg_{S_i}(s)$ is a monotone increasing function of $(I_e)_{e \in \binom{V}{2}}$ for every $i$. Now we show the second inequality. We write each element $s \in S$ as $s = (s_0, s_1, \ldots, s_\ell)$.
For the first term, since $s_i \neq s_j$ for any $i, j \in [\ell]$ ($i \neq j$) if $|F(s)| = \ell$, we obtain Note that $G(s) = (U(s), F(s))$ is a connected graph for any $s \in S$.

Proof of key lemma (Lemma 8.2)
To complete the proof of Lemma 8.1, we show Lemma 8.2 in this section.
To estimate the above, we introduce the following notation. For any $(l + 1)$-dimensional vector $s = (s_0, s_1, \ldots, s_l) \in S_0 \times S_1 \times \cdots \times S_l$, let Note that $|R_l| = B_{l+1}$. Then we have From the definition of $R(s)$, for any $r \in R(s)$, $s_i = s_j$ for any $i, j \in r$. Thus For an index $i \in \{0\} \cup [l]$, let $r_i$ be the element of $R$ such that $r_i \ni i$. Now let us consider the set $L$ described in the statement (of Lemma 8.2). First we assume that there exist $i, j \in L$ with $i \neq j$ such that both $i$ and $j$ are in the same $r^* = r_i = r_j \in R$. In this case, since $S_i \cap S_j = \emptyset$ from the definition of $L$, we have Now we assume that $r_i \neq r_j$ for any distinct $i, j \in L$. Then since $|\{r_i : i \in L\}| = |L|$ and $R = \{r_i : i \in L\} \cup (R \setminus \{r_i : i \in L\})$, we have Finally, by combining (29) to (33), we obtain Note that the third inequality follows since $Np \geq 1$ from the assumption.
Proof of Lemma 8.1. Combining Lemmas 8.3 and 8.4, we obtain the proof.

Proof of (P3) of Theorem 3.2
The property (P3) is obtained by the following two lemmas.
Lemma 8.5. Suppose that 0 ≤ q ≤ p = ω(log n/n). Then two positive constants C 1 , C 2 exist such that G(2n, p, q) satisfies the following with probability 1 − O(n −C 1 ): Proof. Applying the Chernoff bound (Lemma A.8), Thus we obtain the claim letting t = C 2 √ np log n since t = C 2 √ np log n ≥ C log n for some constant C. Proof.
Upper bound. Now we show the following claim: Two positive constants $C_7, C_8$ exist such that the following holds with probability $1 - n^{-C_7}$: From the same discussion as in (24), Thus we consider an upper bound on $W(S_0; S_1, \ldots, S_{\ell-1}, A)$.
Proof of (P3) of Theorem 3.2. Let $d_{\min} := \min_{v \in V} \deg(v)$. Then for any $j \in [\ell]$, From Lemma 8.5, it holds with high probability that The second equality holds since $(\log n)/(np) = o(1)$ and $j \in [\ell]$ is a constant. Thus from Lemma 8.6, we have Thus we obtain the claim.

Proofs of general results of dynamical systems
In this section, we consider a polynomial voting process according to $f_1$ and $f_2$ on $G(2n, p, q)$ that is both $f_1$-good and $f_2$-good. Moreover, we assume that $q/p$ is a constant. Let $H = (H_1, H_2) : [0, 1]^2 \to [0, 1]^2$ be the induced dynamical system. Throughout this section, probability and expectation are taken over the voting process unless otherwise noted.

Proof of stationary dynamics around a sink point (Proposition 7.2)
We begin with establishing two auxiliary results. Proof. From (2) and = ω( 1/np), we have for sufficiently large n. Hence, from the Hoeffding inequality (Lemma A.9), we have Proof. Let x * be a sink point and J be the Jacobian matrix of H at x * . From the Taylor expansion (see, e.g. Theorem 12.15 of [34]), we have By the property of singular value (Proposition A.2), there exist constants , K > 0 such that, for any x ∈ B(x * , ), it holds that Consequently, for any y ∈ B(x * , ), we have Equivalently, we obtain H(x) − y ∞ > K .
Proof of Proposition 7.2. Let a * be a sink point of H. From Lemma 9.2, we can take a constant holds if α ≤ δ for sufficiently small constant δ.
Proof of Proposition 7.4
We prove Proposition 7.4. Let $n = |V(G)|$. Suppose that there exist absolute constants $C, \delta > 0$ and a function $\epsilon = \epsilon(n)$ such that $\epsilon(n) = o(1)$ and $E[|A'|] \le C|A|^2/n + \epsilon|A|$ holds for all $A \subseteq V$ with $|A| \le \delta n$. Note that we may assume $\epsilon(n) = \Omega(\sqrt{\log n/n})$: if $\epsilon = o(\sqrt{\log n/n})$, we have $\log n/\log \epsilon^{-1} = O(1)$, and we obtain the claim by applying Proposition 7.4 with $\epsilon = \sqrt{\log n/n}$. Take a positive constant $\delta$ such that Then, for $|A| \le \delta n$, it holds with probability $1 - O(n^{-3})$ that Here, we used (40) Thus let us consider the following two cases.
To show the claim, we exploit the property that $E[|A'| \mid A] \le 2\epsilon|A|$ if $|A| \le C\epsilon n$. Before using this, we show that $|A^{(t)}| \le C\epsilon n$ holds for all $t = 1, \ldots, n^{o(1)}$. Conditioned on $|A| \le C\epsilon n$, we have $E[|A'|] \le 2\epsilon|A| \le O(\epsilon^2 n)$, and thus the Chernoff bound ((iii) of Lemma A.7) yields Therefore, $|A^{(t)}| \le C\epsilon n$ holds for all $t = 0, \ldots, n^{o(1)}$. Let $\mathcal{C}^{(t)}$ be the event that $|A^{(i)}| \le C\epsilon n$ holds for all $i = 0, \ldots, t$. Then, from the tower property of the conditional expectation, we have for some $\tau_2 = O(\log n/\log \epsilon^{-1})$. This proves the aforementioned claim and completes the proof of Proposition 7.4.

Proof of the escape result (Proposition 7.8)
In this section, we prove Proposition 7.8. Let $a^*$ be a fixed point satisfying Assumption 7.5. The proof of Proposition 7.8 consists of two parts: we derive Proposition 7.8 first under Assumption 7.6 and then under Assumption 7.7.
Recall the random variable $\beta$ defined in (9). From the definition (9), each element $\beta_i$ of $\beta$ can be rewritten as where we let $U = (u_{ij})$. Each element $u_{ij}$ of the matrix $U$ does not depend on $n$. Hence, the Hoeffding bound (Lemma A.9) implies From Theorem 4.1 and the Taylor expansion (39), we have Hence, the $i$-th element $\beta_i$ of $\beta = (\beta_1, \beta_2)$ satisfies It is convenient to consider the behavior of $\beta$ instead of $\alpha$. Note that $\alpha \to a^*$ implies $\beta \to 0$ and vice versa since the matrix $U$ is nonsingular. By substituting $t = \Theta(\sqrt{\log n/n})$ into (45), for a sufficiently large constant $C > 0$, it holds w.h.p. that
Derive Proposition 7.8 from Assumption 7.6
Suppose that the fixed point $a^*$ satisfies Assumption 7.6. Let $I_{>1} := \{i \in [2] : |\lambda_i| > 1\}$ and $I_{\le 1} := [2] \setminus I_{>1}$. Fix a sufficiently large constant $K > 0$ and let $\epsilon^*$ be the constant mentioned in Assumption 7.6. Define We claim that, for each $i = 1, 2$ and any $A \in \mathcal{A}_i$, there exists $\tau = O(\log n)$ satisfying (1). This completes the proof of Proposition 7.8 under Assumption 7.6.
Here, note that, for every $i \in [2]$, there exists $j \in [2]$ such that $u_{ij} \ne 0$, since otherwise the matrix $U$ would be singular. Thus, from Corollary A.11, it holds that, for any constant $h > 0$, there exists a positive constant This verifies the condition (1') of Corollary A.13. Now we check the condition (2'). Let $z \in [2]$ be the least index satisfying $|\beta_z| = \max\{|\beta_i| : i \in$ (recall that the constant $\epsilon_1$ is mentioned in (A4)). Then, from (A4), we have Thus, from the Hoeffding inequality (Lemma A.9), we obtain This verifies the condition (2'). Finally, we check the condition (3') of Corollary A.13. From (A5), it holds that Therefore, from Corollary A.13, $f(A^{(\tau)}) \ge m = K\sqrt{n}\log n$ (i.e. $A^{(\tau)} \in \mathcal{A}_2$) holds w.h.p. for some $\tau = O(\log n)$.
Case II: $A^{(0)} \in \mathcal{B}_2$. Suppose $A^{(0)} \in \mathcal{B}_2$. Our strategy is to apply Corollary A.13. We will prove the following result in the last part of this subsection.
Lemma 9.3. Conditioned on $A \in \mathcal{B}_2$, the following hold w.h.p.:
(i) For every $i \in I_{<1}$, it holds that $|\beta_i| \le K\sqrt{\log n/n}$.
(ii) There exists a constant $h > 0$ such that, for every $i \in I_{>1}$,
We check the condition (2') of Corollary A.13. For every $i \in I_{>1}$, Lemma 9.3 yields In view of (44), from the Hoeffding inequality (Lemma A.9), it holds for any set $A^{(t)} \in \mathcal{B}_2$, any index $i \in I_{>1}$ and any constant $\epsilon > 0$ that From (49) and (50), by letting = 2(1+ ) , we obtain In other words, for any $A \in \mathcal{B}_2$ satisfying $f(A) \ge h\sqrt{n}$ for some constant $h > 0$, we have Finally, we check the condition (3') of Corollary A.13. From Lemma 9.3, we have Now we are ready to apply Corollary A.13. Thus, there exists $\tau = O(\log n)$ such that $f(A^{(\tau)}) \ge K\sqrt{\log n/n}$ and $|\beta_j^{(\tau)}| \le K\sqrt{\log n/n}$ hold w.h.p. for every $j \in I_{<1}$. Consequently, $A^{(\tau)} \in \mathcal{B}_3 \cup \mathcal{B}_4$ holds w.h.p.
Case III: $A^{(0)} \in \mathcal{B}_3$. Suppose that $A^{(0)} \in \mathcal{B}_3$. From (47), it holds w.h.p. that Moreover, for any $j \in I_{<1}$, it holds w.h.p. that These imply that $A^{(t+1)} \in \mathcal{B}_1 \cup \mathcal{B}_2$ holds w.h.p. whenever $A^{(t)} \in \mathcal{B}_3$. Let $\tau$ be the stopping time given by $\tau := \min\{t :$
Proof of Lemma 9.3. Suppose $A \in \mathcal{B}_2$ and recall the definition K = C . For any $i \in I_{<1}$, the bound (47) yields that log n n − 2C log n n + K 2 log n n ≤ K log n n holds w.h.p. This completes the proof of statement (i). Now we consider statement (ii). Suppose that $A \in \mathcal{B}_2$ and $|\beta_i| \ge C \cdot \frac{1}{\sqrt{n}}$ (we expect h = C ).
To obtain the second claim, we show that $d_2' > 0$ whenever $(d_1, d_2) \in S$ satisfies $d_2 > 0$. This follows from a simple calculation In particular, $T_{\mathrm{cons}}(A) = \exp(\Omega(n))$ holds w.h.p. for any $A \subseteq V$ satisfying $\delta^{+} \in B(d_2^*, \epsilon)$.
Proof. From (55), it is easy to check that both $d_2^*$ and $-d_2^*$ are sink points. Then, Proposition 10.3 follows from Proposition 7.2.
Proposition 10.4 (Towards consensus). Consider the Best-of-two on $G(2n, p, q)$ that is both $f_1^{\mathrm{Bo2}}$-good and $f_2^{\mathrm{Bo2}}$-good. Suppose that $r = q/p$ is a constant. Then, there exists a universal constant $\epsilon = \epsilon(r) > 0$ satisfying the following: $T_{\mathrm{cons}}(A) \le O(\log\log n + \log n/\log(np))$ holds w.h.p. for all $A \subseteq V$ with $\min\{|A|, 2n - |A|\} \le \epsilon n$.
Proof. Since $J_4$ is the all-zero matrix, Proposition 10.4 immediately follows from Proposition 7.3.
Proof sketch. The proof is the same as that of Proposition 4.6, so we only present a sketch. We first show the following claim:
Claim 10.6 (Concentration of the variance for the Best-of-two). Consider the Best-of-two on $G(2n, p, q)$ that is both $f_1^{\mathrm{Bo2}}$-good and $f_2^{\mathrm{Bo2}}$-good. Then, two constants $C_1, C_2 > 0$ exist such that $\forall A \subseteq V, \forall i \in \{1, 2\}$:
Proof. The variance $\mathrm{Var}[|A_i'| \mid A]$ can be represented as Therefore, Claim 10.6 immediately follows from the property (P2) and Theorem 3.2.

Concluding remark
In this paper, we studied the Best-of-two and the Best-of-three voting processes on the stochastic block model $G(2n, p, q)$. Here, we first generate $G(2n, p, q)$, then set an initial opinion configuration and observe the voting process. We presented phase transition results on $r = q/p$ for both processes. In addition, if $p \ge q > 0$ are constants, we proved that the consensus time is $O(\log n)$ for arbitrary initial opinion configurations. In the proof, we combined the theory of dynamical systems with our technical result, Theorem 6.2, which approximates the stochastic processes by the corresponding deterministic processes. For an application to distributed community detection algorithms, it is important to estimate the probability that the voting process reaches the sink areas (in particular, when starting from the source area). This is a possible direction for future work.
Note that Theorem 6.2 applies to any polynomial function of constant degree. For example, consider the Best-of-$(2k+1)$ voting process for a positive constant $k$. This process is defined by ∞ ≤ C t ( 1/np + log n/n) for this process. Moreover, using Lemma 8.1, it is not difficult to extend Theorem 6.2 to voting processes on general stochastic block models that have $c_1$ communities, each of size $\Omega(n)$, and initially involve $c_2$ opinions, where $c_1, c_2$ denote arbitrary positive constants. This setting yields induced dynamical systems of dimension greater than two. The Jacobian matrix would be helpful for investigating several properties, including the exponential time lower bound (Proposition 7.2), the fast consensus (Proposition 7.3) and the escape result (Proposition 7.8), since the proofs of Sections 9.1 to 9.3 work for induced dynamical systems of higher dimension. Unfortunately, it may not be easy to specify other properties (e.g. zero areas, convergence properties, ...) of $(a^{(t)})_{t \in \mathbb{N}}$ corresponding to such processes. This problem is left for future work. Also, the worst-case analysis of the consensus time for sparse random graphs remains open.
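The experimental setup described in this section (generate $G(2n, p, q)$, fix an initial opinion configuration, then run the process) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names are hypothetical, the three neighbors are assumed to be sampled uniformly with replacement, and an isolated vertex is assumed to keep its opinion.

```python
import random

def sample_sbm(n, p, q, rng):
    """Adjacency lists of G(2n, p, q).
    Vertices 0..n-1 form one community and n..2n-1 the other."""
    adj = [[] for _ in range(2 * n)]
    for u in range(2 * n):
        for v in range(u + 1, 2 * n):
            same_community = (u < n) == (v < n)
            if rng.random() < (p if same_community else q):
                adj[u].append(v)
                adj[v].append(u)
    return adj

def best_of_three_step(adj, opinions, rng):
    """One synchronous round: every vertex adopts the majority opinion among
    three neighbors sampled uniformly at random (with replacement, assumed)."""
    new = list(opinions)
    for v, nbrs in enumerate(adj):
        if not nbrs:
            continue  # convention of this sketch: isolated vertices do not change
        votes = sum(opinions[rng.choice(nbrs)] for _ in range(3))
        new[v] = 1 if votes >= 2 else 0
    return new

def consensus_time(adj, opinions, rng, max_steps=1000):
    """Number of rounds until all opinions agree, or None if not reached."""
    for t in range(max_steps + 1):
        if len(set(opinions)) == 1:
            return t
        opinions = best_of_three_step(adj, opinions, rng)
    return None

if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    adj = sample_sbm(n, 0.3, 0.2, rng)
    # Polarized start: community 1 holds opinion 1, community 2 holds opinion 0.
    start = [1] * n + [0] * n
    print(consensus_time(adj, start, rng))
```

Varying the ratio q/p in the usage example above and below the thresholds stated in the abstract gives an informal picture of the phase transition; such small experiments do not replace the asymptotic statements.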

A.2 Real analysis tools
The Inverse Function Theorem. The Inverse Function Theorem is a fundamental result in real analysis and can be found in many textbooks [23,34].
Theorem A.3 (The Inverse Function Theorem (see, e.g. Theorem 12.17 of [34] and Theorem 1A.1 of [23])). Let $f$ be a continuously differentiable function from an open set $U \subseteq \mathbb{R}^k$ into $\mathbb{R}^k$. Suppose that the Jacobian matrix $J$ of $f$ at $p \in U$ is invertible. Then there is a neighborhood $V$ of $p$ such that the restriction of $f$ to $V$ is invertible. Moreover, the Jacobian matrix of $f^{-1}$ at $f(p)$ is given by $J^{-1}$.
It should be noted that the definition of the Lipschitz condition does not depend on the choice of the norm $\|\cdot\|$ on $\mathbb{R}^n$. The following is a well-known result in real analysis.
(ii) for any $\delta \in [0, 1]$, (iii) for any $k \ge 2e\,E[X]$, Here, $e$ denotes Napier's constant.
Lemma A.9 (Hoeffding bound (see, e.g. Theorem 10.9 of [21])). Let $X_1, X_2, \ldots, X_n$ be independent random variables. Assume that each $X_i$ takes values in a real interval $[a_i, b_i]$ of length $c_i := b_i - a_i$. Let $X = \sum_{i=1}^{n} X_i$. Then for any $\delta > 0$,
Lemma A.10 (Berry-Esseen (see, e.g. [41])). Let $X_1, X_2, \ldots, X_n$ be independent random variables such that E[ $\int_{-\infty}^{x} e^{-y^2/2}\,dy$ (the cumulative distribution function of the standard normal distribution). Then
Corollary A.11. Let $X_1, X_2, \ldots, X_n$ be $n$ independent random variables such that $\mathrm{Var}[X] \ne 0$ and $|X_i - E[X_i]| \le C < \infty$ for all $i \in [n]$, where $X = \sum_{i=1}^{n} X_i$. Then for any $x \in \mathbb{R}$,
Proof. For all $i \in [n]$, let
Lemma A.12 (Lemma 4.5 of [12]). Consider a Markov chain $(X_t)_{t=1}^{\infty}$ with finite state space $\Omega$ and a function $f : \Omega \to \{0, \ldots, n\}$. Let $C_3$ be an arbitrary constant and $m = C_3\sqrt{n}\log n$. Suppose that $\Omega$, $f$ and $m$ satisfy the following conditions:
(1) For any positive constant $h$, there exists a positive constant $C_1 < 1$ such that
(2) Three positive constants $\epsilon$, $C_2$ and $h$ exist such that, for any $x \in \Omega$ satisfying $h\sqrt{n} \le f(x) < m$,
Then $f(X_\tau) \ge m$ holds for some $\tau = O(\log n)$.
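The displayed inequality of Lemma A.9 is not reproduced above. As a sanity check of how such a bound is used, the following Python sketch evaluates the textbook one-sided Hoeffding bound $\Pr[X - E[X] \ge \delta] \le \exp(-2\delta^2/\sum_i c_i^2)$ and compares it with an exact binomial tail. Using this particular form is an assumption (the source's statement may be two-sided or differ in constants), and the function names are hypothetical.

```python
import math

def hoeffding_upper_tail(delta, lengths):
    """Textbook one-sided Hoeffding bound on Pr[X - E[X] >= delta],
    where lengths[i] = c_i is the length of the interval containing X_i."""
    return math.exp(-2.0 * delta**2 / sum(c * c for c in lengths))

def fair_coin_upper_tail(n, k):
    """Exact Pr[Bin(n, 1/2) >= k], computed from binomial coefficients."""
    return sum(math.comb(n, j) for j in range(k, n + 1)) / 2**n

if __name__ == "__main__":
    n, delta = 100, 10
    # Each fair coin takes values in [0, 1], so every c_i equals 1.
    bound = hoeffding_upper_tail(delta, [1.0] * n)
    exact = fair_coin_upper_tail(n, n // 2 + delta)
    print(exact, bound)
```

For 100 fair coins and a deviation of 10, the bound evaluates to $e^{-2}$ and the exact tail is smaller, as the inequality requires.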
Suppose that $\Omega$, $f$, $m$ and $B$ satisfy the following conditions:
(1') For any positive constant $h$, there exists a positive constant $C_1 < 1$ such that
(2') Three positive constants $\epsilon$, $C_2$, $h$ exist such that, for any $x \in B$ satisfying $h\sqrt{n} \le f(x) < m$,
(3') For some constant $C_4 > 0$, $\Pr[X_{t+1} \notin B \text{ and } f(X_{t+1}) < m \mid X_t \in B] \le O(n^{-C_2})$.
Proof. Let $\Omega' = B \cup \{a, b\}$ be the state space with two special states $a$ and $b$. We consider a Markov chain $(X_t')_{t=1}^{\infty}$ on $\Omega'$ by In other words, the special state $a$ corresponds to the event "$f(x) < m$ and $x \notin B$", and $b$ corresponds to "$f(x) \ge m$".
Suppose that $X_0 \in B$ and let $\tau = \min\{t : X_t \notin B\} > 0$ be the stopping time. Then, the above definition of $X_t'$ naturally yields a coupling $(X_t, X_t')_{t < \tau}$ satisfying $X_t = X_t'$ for $t < \tau$.
Let $f' : \Omega' \to \{0, \ldots, n\}$ be a function given by Then, the Markov chain $(X_t')$ on $\Omega'$ and the function $f'$ satisfy the conditions (1) and (2) of Lemma A.12. Hence, for some $\tau = O(\log n)$, it holds that $X_\tau' \in \{a, b\}$. We claim that $X_\tau' = b$, that is, $f(X_\tau) \ge m$. Indeed, from the condition (3')
If the polynomial $Y$ has degree at most $k$ (i.e. $\max_{F \in E} |F| \le k$), then for any $\lambda > 1$, it holds that

A.4 Competitive dynamical systems on $\mathbb{R}^2$
In this paper, we consider discrete-time dynamical systems given by a map: for a map $T : S \to S$ and $x \in S$, we discuss whether the orbit $\{T^n(x)\}_{n \ge 0}$ converges to a fixed point or not. For general dynamical systems, it is typically difficult to predict the asymptotic behavior of an orbit since it sometimes exhibits chaos. This section is devoted to introducing planar competitive dynamical systems, which include the induced dynamical systems explored in this paper. Our dynamical systems are defined on $\mathbb{R}^2$. The definitions and results in this section follow [30]. For two points $x = (x_1, x_2)$ and $y = (y_1, y_2)$, we write $x \le_K y$ if both $x_1 \le y_1$ and $y_2 \le x_2$ hold. For $S \subseteq \mathbb{R}^2$, a map $T : S \to S$ is competitive if $T$ is monotone with respect to the relation $\le_K$ (i.e. $T(x) \le_K T(y)$ whenever $x \le_K y$). We use $x \le y$ for the usual component-wise comparison (i.e. $x \le y$ if $x_i \le y_i$ for $i = 1, 2$). Throughout this section, we let $S = \{(x, y) \in \mathbb{R}^2 : x \ge 0, y \ge 0, x + y \le 1\}$, and our map $T : S \to S$ is supposed to satisfy the following conditions:
(C1) $T$ is competitive.