The Condensation Phase Transition in Random Graph Coloring

Based on a non-rigorous formalism called the "cavity method", physicists have put forward intriguing predictions on phase transitions in diluted mean-field models, in which the geometry of interactions is induced by a sparse random graph or hypergraph. One example of such a model is the graph coloring problem on the Erdős–Rényi random graph G(n, d/n), which can be viewed as the zero-temperature case of the Potts antiferromagnet. The cavity method predicts that in addition to the k-colorability phase transition studied intensively in combinatorics, there exists a second phase transition called the condensation phase transition (Krzakala et al. in Proc Natl Acad Sci 104:10318–10323, 2007). In fact, there is a conjecture as to the precise location of this phase transition in terms of a certain distributional fixed point problem. In this paper we prove this conjecture for k exceeding a certain constant k_0.


Background and motivation.
Since the early 2000s physicists have developed a systematic but non-rigorous formalism called the cavity method for the study of diluted mean-field models [23]. These are models of disordered systems, such as glasses or spin glasses, where the geometry of interactions is given by a sparse random graph or hypergraph. Apart from cases of immediate physical interest, such as the diluted Potts or Ising model, the cavity method has been applied to long-standing problems in combinatorics and information theory, as well as, more recently, to problems in computer science and compressive sensing [19]. The predictions obtained in this way have a very significant potential impact on all of these areas. Hence the importance of providing a rigorous mathematical foundation for the cavity method, an effort that the present work contributes to.
The specific model that we deal with is the Potts antiferromagnet on the Erdős–Rényi random graph at zero temperature, also known as the random graph coloring problem. This problem has played a central role in combinatorics since the seminal 1960 paper of Erdős and Rényi that started the theory of random graphs [14]. In this model, the geometry is defined by the random graph G(n, d/n) on n vertices V = {1, . . . , n}, any two of which are connected by an edge with probability d/n independently. For an integer k ≥ 3 we call a map σ : V → {1, . . . , k} a k-coloring of G(n, d/n) if σ(v) ≠ σ(w) for any two vertices v, w that are connected by an edge. Let Z_k be the number of k-colorings of G(n, d/n). How does the (appropriately scaled) partition function Z_k vary as a function of the average degree d of the random graph in the thermodynamic limit n → ∞?
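To make the object Z_k concrete, the following Python sketch samples G(n, d/n) and counts its k-colorings by brute force. The function names are ours, and the exhaustive enumeration is of course only feasible for tiny n:

```python
import itertools
import random

def sample_gnp(n, p, rng):
    """Erdos-Renyi graph G(n, p): each vertex pair is an edge independently."""
    return [(v, w) for v in range(n) for w in range(v + 1, n) if rng.random() < p]

def count_k_colorings(n, edges, k):
    """Brute-force Z_k: count maps sigma with sigma(v) != sigma(w) on every edge."""
    return sum(
        all(sigma[v] != sigma[w] for v, w in edges)
        for sigma in itertools.product(range(k), repeat=n)
    )

rng = random.Random(0)
n, d, k = 8, 2.5, 3
edges = sample_gnp(n, d / n, rng)
z_k = count_k_colorings(n, edges, k)
```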
The cavity method predicts that for any fixed k ≥ 3 there occur two phase transitions as d increases [20,21,33]. The first of these is the condensation phase transition. This phase transition is ubiquitous in physics, and is believed to hold the key to a variety of problems. For instance, the role of condensation in the context of structural glasses is a major open problem, going back to the work of Kauzmann in the 1940s [17]. In the context of diluted mean-field models, the existence of the condensation phase transition has been proved in the hypergraph 2-coloring problem [9] and the Potts antiferromagnet (at positive temperature) [10]. But thus far its location has not been determined exactly in a rigorous way. The contribution of the present paper is to establish that in the random graph coloring problem, condensation occurs at the exact point predicted by the cavity method. This is the first rigorous result of this kind in a diluted mean-field model.
As most predictions based on the cavity method, the one on condensation comes in the form of a distributional fixed point problem. Apart from studying this fixed point analytically, the key contribution of the present work is to establish an explicit link between the fixed point problem and the combinatorics of the random graph coloring problem. We expect that the technique that we develop for this purpose generalises to a variety of other models. Immediate examples that spring to mind include the random hypergraph 2-coloring problem and the k-NAESAT problem, which is of interest in computer science.
The second conjectured phase transition is the k-colorability threshold. This is the point where the random graph G(n, d/n) ceases to possess a k-coloring. Establishing the existence and location of the k-colorability threshold is a major open problem in combinatorics [1], and our main result implies a slightly improved lower bound on this conjectured threshold. However, the k-colorability phase transition is an artefact of the zero-temperature case: it is not expected to persist in the Potts antiferromagnet at positive temperature, in contrast to the condensation phase transition.

Pinning down the condensation phase transition.
Letting Z_k(G) be the number of k-colorings of a graph G, we consider

Φ_k(d) = lim_{n→∞} E[Z_k(G(n, d/n))^{1/n}].

In the case of the graph coloring problem, Φ_k(d) is the natural scaling of the partition function Z_k(G(n, d/n)). According to physics conventions, a "phase transition" would be a point d_0 where the function d → Φ_k(d) is non-analytic. However, the limit Φ_k(d) is not currently known to exist for all d, k. Hence, we need to tread carefully: for a fixed k ≥ 3 we call d_0 ∈ (0, ∞) smooth if there exists ε > 0 such that
- for any d ∈ (d_0 − ε, d_0 + ε) the limit Φ_k(d) exists, and
- the map d ∈ (d_0 − ε, d_0 + ε) → Φ_k(d) has an expansion as an absolutely convergent power series around d_0.
If d 0 fails to be smooth, we say that a phase transition occurs at d 0 .
For a smooth d 0 the sequence of random variables (Z k (G(n, d 0 /n)) 1/n ) n converges to Φ k (d 0 ) in probability. This follows from a concentration result for the number of k-colorings from [2].
As a next step, we state (an equivalent but slightly streamlined version of) the physics prediction from [33] as to the location of the condensation phase transition. As with most predictions based on the "cavity method", this one comes in terms of a distributional fixed point problem. To be specific, let Ω be the set of probability measures on the set [k] = {1, . . . , k}. We identify Ω with the k-simplex, i.e., the set of maps μ : [k] → [0, 1] such that ∑_{h=1}^k μ(h) = 1, equipped with the topology and Borel algebra induced by R^k. Moreover, we define a map B : ⋃_{γ=1}^∞ Ω^γ → Ω, (μ_1, . . . , μ_γ) → B[μ_1, . . . , μ_γ] by letting, for any i ∈ [k],

B[μ_1, . . . , μ_γ](i) = ∏_{j=1}^γ (1 − μ_j(i)) / ∑_{h=1}^k ∏_{j=1}^γ (1 − μ_j(h)).   (1.1)

Further, let P be the set of all probability measures on Ω. For each μ ∈ Ω let δ_μ ∈ P denote the Dirac measure that puts mass one on the single point μ. In particular, δ_{k^{-1}1} ∈ P signifies the measure that puts mass one on the uniform distribution k^{-1}1 = (1/k, . . . , 1/k). For π ∈ P and γ ≥ 0 let

Z_γ(π) = ∫_{Ω^γ} ∑_{h=1}^k ∏_{j=1}^γ (1 − μ_j(h)) ∏_{j=1}^γ dπ(μ_j).   (1.2)

Further, define a map F_{d,k} : P → P, π → F_{d,k}[π] by letting

F_{d,k}[π] = ∑_{γ=0}^∞ (P[Po(d) = γ] / Z_γ(π)) ∫_{Ω^γ} δ_{B[μ_1,...,μ_γ]} ∑_{h=1}^k ∏_{j=1}^γ (1 − μ_j(h)) ∏_{j=1}^γ dπ(μ_j).   (1.3)
Thus, in (1.3) we integrate a function with values in P, viewed as a subset of the Banach space of signed measures on Ω. The normalising term Z_γ(π) ensures that F_{d,k}[π] really is a probability measure on Ω. (It seems natural to conjecture that the limit Φ_k(d) exists for all d, k, but proving this might be difficult. In fact, the existence of the limit for all d, k would imply that d_{k-col}(n) converges, which is a major open problem in the theory of random graphs [1].)
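Numerically, distributional fixed point equations of this kind are commonly approximated by "population dynamics". The sketch below is a deliberately simplified, hypothetical version: it iterates the recursion from (1.1) with a Po(d) number of inputs drawn from the current population, but omits the reweighting and normalisation by Z_γ(π) appearing in (1.3), so it approximates the plain Belief Propagation fixed point rather than F_{d,k} itself; all identifiers are ours.

```python
import math
import random

def sample_poisson(lam, rng):
    """Sample Po(lam) via Knuth's inversion method."""
    threshold, p, value = math.exp(-lam), 1.0, 0
    while True:
        p *= rng.random()
        if p <= threshold:
            return value
        value += 1

def B(mus, k):
    """The recursion (1.1): combine messages mu_1, ..., mu_gamma."""
    weights = [1.0] * k
    for mu in mus:
        for i in range(k):
            weights[i] *= 1.0 - mu[i]
    total = sum(weights)
    if total == 0.0:
        # degenerate case (all colors blocked): fall back to the uniform measure
        return [1.0 / k] * k
    return [w / total for w in weights]

def population_dynamics(d, k, pop_size=500, steps=20000, rng=None):
    """Evolve a population of messages; its empirical distribution
    approximates a fixed point of the (unweighted) update."""
    rng = rng or random.Random(0)
    pop = [[1.0 / k] * k for _ in range(pop_size)]
    for _ in range(steps):
        gamma = sample_poisson(d, rng)
        mus = [pop[rng.randrange(pop_size)] for _ in range(gamma)]
        pop[rng.randrange(pop_size)] = B(mus, k)
    return pop
```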
The main theorem is in terms of a fixed point of the map F_{d,k}, i.e., a point π* ∈ P such that F_{d,k}[π*] = π*. In general, the map F_{d,k} has several fixed points. Hence, we need to single out the correct one. For h ∈ [k] let δ_h ∈ Ω denote the vector whose hth coordinate is one and whose other coordinates are 0 (i.e., the Dirac measure on h). We call a measure π ∈ P frozen if π({δ_1, . . . , δ_k}) ≥ 2/3; in words, the total probability mass concentrated on the k vertices of the simplex Ω is at least 2/3.
As a final ingredient, we need a function φ_{d,k} : P → R. To streamline the notation, for π ∈ P and h ∈ [k] we write π_h for the measure dπ_h(μ) = kμ(h)dπ(μ). With this notation, φ_{d,k} is defined in Fig. 1. The integrals in (1.4) and (1.5) are well-defined because the set where the argument of the logarithm vanishes has measure zero.

Theorem 1.1. There exists a constant k_0 ≥ 3 such that for any k ≥ k_0 the following holds. If d ≥ (2k − 1) ln k − 2, then F_{d,k} has precisely one frozen fixed point π*_{d,k}. Further, the function

d → φ_{d,k}(π*_{d,k})   (1.6)

has a unique zero d_{k,cond} in the interval [(2k − 1) ln k − 2, (2k − 1) ln k − 1]. For this number d_{k,cond} the following three statements are true.
(i) Any 0 < d < d_{k,cond} is smooth and Φ_k(d) = k(1 − 1/k)^{d/2}.
(ii) There occurs a phase transition at d_{k,cond}.

We emphasise that our conditioning on Z_k(G(n, d/n)) > 0 is necessary to speak of a random k-coloring τ but otherwise harmless, as Theorem 1.1 implies that G(n, d/n) is k-colorable with probability tending to one as n → ∞ for any d < d_{k,cond}. In other words, Corollary 1.1 shows that there is a certain function Σ_k > 0 such that the total number of k-colorings exceeds the number of k-colorings in the cluster of a randomly chosen k-coloring by at least a factor of exp[n(Σ_k(d) + o(1))] with probability tending to one. However, as d approaches d_{k,cond}, Σ_k(d) tends to 0, and with a non-vanishing probability the gap between the total number of k-colorings and the size of a single cluster is upper-bounded by exp[n(Σ_k(d) + o(1))].

1.4. Discussion and related work.
In this section we discuss some relevant related work and also explain the impact of Theorem 1.1 on various questions that have come up in the literature.

The physics perspective.
The original physics motivation for the study of systems in which the interactions are induced by a probability distribution was the study of materials with peculiar magnetic properties, the so-called spin glasses. A wide variety of models have been put forward, ranging from models on lattices to the well-known Sherrington-Kirkpatrick model [28], whose free energy is captured by the "Parisi formula" [26,29]. The Sherrington-Kirkpatrick model is a fully-connected mean-field model, i.e., each variable interacts with any other (via randomly chosen couplings). By comparison, in diluted mean-field models the interactions are determined by the edges of a sparse random graph or hypergraph, rather than by a complete graph. These models thus possess a non-trivial geometry (as opposed to fully-connected models where every pair of vertices interacts in the same way), while having only a bounded number of short cycles a.a.s. (as opposed to a finite-dimensional lattice). This makes diluted mean-field models amenable to analytic albeit non-rigorous study via the "cavity method" [23].
The condensation phase transition (which is sometimes also referred to as the "static one-step replica symmetry breaking transition" or the "Kauzmann transition") has been established in a variety of models, ranging from the random energy model [12] and the fully-connected p-spin glass [18,30] to disordered polymers on trees [13].
With respect to diluted mean-field models, Coja-Oghlan and Zdeborová [9] showed that a condensation phase transition occurs in random r-uniform hypergraph 2-coloring. Furthermore, [9] determines the location of the condensation phase transition up to an error ε_r that tends to zero as the uniformity r of the hypergraph becomes large. Moreover, Contucci, Dommers, Giardinà and Starr proved that a condensation phase transition occurs in the diluted mean-field k-spin Potts antiferromagnet at positive temperature [10], and determined the value of d where it starts to occur up to an additive error of about ln k.
Yet the present work is the first to fully verify the prediction of the cavity method on condensation in a diluted mean-field model; the physics prediction was derived in [20,33]. The core of the proof of Theorem 1.1 is to establish an explicit link between the combinatorics of the graph coloring problem and the cavity formalism. In effect, in our analysis of the distributional fixed point problem we can directly incorporate some of the physics calculations from [33,Appendix C].
Finally, the problem of coloring random graphs algorithmically has received quite a bit of interest in computer science (e.g., [15]). The cavity method has inspired new "message passing" algorithms for this problem by the name of Belief/Survey Propagation Guided Decimation [6,24]. Experiments on random graph k-coloring instances for small values of k indicate an excellent performance of these algorithms [6,32,33]. While a rigorous analysis remains elusive, the physics prediction is that the performance of Belief Propagation guided decimation hinges on the location of the "condensation line" in a two-dimensional phase diagram parametrised by d and a value t that measures the progress of the algorithm [27]. In this notation, Theorem 1.1 identifies the location of the condensation point in the case t = 0. Thus, it would be interesting to extend the present techniques to t ∈ (0, 1), and to turn this into a rigorous analysis of the algorithm.

The combinatorics perspective.
Graph coloring is one of the most fundamental problems in combinatorics, as witnessed by the famous "four color problem". Thus, it is unsurprising that the problem of coloring random graphs has attracted a great deal of attention since it was first posed by Erdős and Rényi [14]; see [16] for a comprehensive overview. In the case that p = d/n for a fixed real d > 0, it is known that there exists a sharp threshold sequence d_{k-col}(n) such that for any fixed ε > 0, the random graph G(n, d(n)/n) has a k-coloring a.a.s. if d(n) < (1 − ε)d_{k-col}(n), and fails to have a k-coloring a.a.s. if d(n) > (1 + ε)d_{k-col}(n) [1]. It is widely conjectured but as of yet unproven that the sequence d_{k-col}(n) converges to a limit d_{k-col} as n → ∞. If so, then d_{k-col} would mark a second phase transition in the random graph coloring problem. The best current bounds on the threshold sequence d_{k-col}(n) are

(2k − 1) ln k − 2 ln 2 − ε_k ≤ d_{k-col}(n) ≤ (2k − 1) ln k − 1 + δ_k,   (1.7)

where ε_k, δ_k → 0 as k → ∞. The upper bound is by the first moment method [7]. The lower bound rests on a second moment argument [8], which improves a landmark result of Achlioptas and Naor [4]. While Theorem 1.1 allows for the possibility that d_{k,cond} is equal to the k-colorability threshold d_{k-col} (if it exists), the physics prediction is that these two are different. More specifically, the cavity method yields a prediction as to the precise value of d_{k-col} in terms of another distributional fixed point problem. An asymptotic expansion in terms of k leads to the conjecture d_{k-col} = (2k − 1) ln k − 1 + η_k with η_k → 0 as k → ∞. Thus, the upper bound in (1.7) is conjectured to be asymptotically tight in the limit k → ∞.
The present work builds upon the second moment argument from [8]. Conversely, Theorem 1.1 yields a small improvement over the lower bound in (1.7). Indeed, as we saw above, Theorem 1.1 implies that lim inf_{n→∞} d_{k-col}(n) ≥ d_{k,cond}, thereby determining the precise "error term" ε_k in the lower bound in (1.7). In fact, d_{k,cond} is the best-possible lower bound that can be obtained via the kind of second moment argument developed in [4,8]. This is because a necessary condition for the success of the second moment argument is that the second moment exceeds the square of the first moment by no more than a constant factor, and this condition fails for d beyond d_{k,cond}.

The proofs in this paper build upon some of the techniques that have been developed to study the "geometry" of the set of k-colorings of the random graph, and add to this machinery. Among the techniques that we harness are the "planting trick" from [2] (which, in a sense, we are going to "put into reverse"), the notion of a core [2,8,25], techniques for proving the existence of "frozen variables" (or "hard fields" in physics jargon) [25], and a concentration argument from [9]. That said, the cornerstone of the present work is a novel argument that allows us to connect the distributional fixed point problem from [33] rigorously with the geometry of the set of k-colorings.
1.5. Preliminaries and notation.
Throughout the paper we tacitly assume that k ≥ k_0 for some constant k_0 that is large enough for the various estimates to hold. We also implicitly assume that n is sufficiently large. We use the standard O-notation when referring to the limit n → ∞. Thus, f(n) = O(g(n)) means that there exist C > 0, n_0 > 0 such that for all n > n_0 we have |f(n)| ≤ C · |g(n)|. In addition, we use the standard symbols o(·), Ω(·), Θ(·). In particular, o(1) stands for a term that tends to 0 as n → ∞.
Additionally, we use asymptotic notation with respect to the limit of large k. To make this explicit, we insert k as an index. Thus, f(k) = O_k(g(k)) means that there exist C > 0, k_0 > 0 such that for all k > k_0 we have |f(k)| ≤ C · |g(k)|. If L is an integer, then we let [L] = {1, . . . , L}. Finally, we always set m = dn/2 and we let G(n, m) denote a random graph with vertex set V = [n] = {1, . . . , n} and with precisely m edges chosen uniformly at random.

Outline
The proof of Theorem 1.1 is composed of two parallel threads. The first thread is to identify an "obvious" point where a phase transition occurs or, more specifically, a critical degree d k,crit where statements (i)-(iii) of the theorem are met. The second thread is to identify the frozen fixed point π * d,k of F d,k and to interpret it combinatorially. Finally, the two threads intertwine to show that d k,crit = d k,cond , i.e. that the "obvious" phase transition d k,crit is indeed the unique zero of Eq. (1.6). The first thread is an extension of ideas developed in [9] for random hypergraph 2-coloring to the (technically more involved) random graph coloring problem. The second thread and the intertwining of the two require novel arguments.

The first thread.
With the nth root sitting inside the expectation, the quantity E[Z_k(G(n, d/n))^{1/n}] is difficult to calculate for general values of d. However, for d ∈ [0, 1), Φ_k(d) is easily understood. In fact, for d ∈ [0, 1) the random graph G(n, d/n) decomposes into tree components and a bounded number of connected components with precisely one cycle a.a.s. [14]. Moreover, the number of k-colorings of a tree with ν vertices and μ edges is well-known to be k^ν(1 − 1/k)^μ. Since G(n, d/n) has m ∼ dn/2 edges a.a.s., we obtain

Φ_k(d) = k(1 − 1/k)^{d/2} for all d ∈ [0, 1).   (2.1)

As Z_k(G)^{1/n} ≤ k for any graph G on n vertices, E[Z_k(G(n, d/n))^{1/n}] = o(1) whenever Z_k(G(n, d/n)) = 0 a.a.s. Since d → k(1 − 1/k)^{d/2} is analytic, the least d > 0 where the limit Φ_k(d) either fails to exist or strays away from k(1 − 1/k)^{d/2} is going to be a phase transition. Hence, we let

d_{k,crit} = sup{d > 0 : Φ_k(d') exists and Φ_k(d') = k(1 − 1/k)^{d'/2} for all 0 < d' < d}.   (2.2)

In fact, d_{k,crit} < ∞.

Proof. The upper bound (1.7) on the k-colorability threshold implies that Z_k(G(n, d/n)) = 0 a.a.s. for d > (2k − 1) ln k. By contrast, k(1 − 1/k)^{d/2} > 0 for any d > 0.
Thus, d k,crit is a well-defined finite number, and there occurs a phase transition at d k,crit . Moreover, the following proposition, which we prove in Sect. 3, yields a lower bound on d k,crit and implies that d k,crit satisfies the first condition in Theorem 1.1.
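The tree-counting formula quoted above is easy to confirm by brute force on a small example: a tree with ν vertices and μ = ν − 1 edges has k^ν(1 − 1/k)^μ = k(k − 1)^{ν−1} proper k-colorings (color the root in k ways, then each child in k − 1 ways). A minimal check, with names of our choosing:

```python
import itertools

def count_k_colorings(num_vertices, edges, k):
    """Brute-force count of proper k-colorings."""
    return sum(
        all(c[v] != c[w] for v, w in edges)
        for c in itertools.product(range(k), repeat=num_vertices)
    )

# A path on nu = 4 vertices (mu = 3 edges), k = 3 colors.
nu, k = 4, 3
path = [(0, 1), (1, 2), (2, 3)]
assert count_k_colorings(nu, path, k) == round(k**nu * (1 - 1/k)**(nu - 1))  # 24
```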
Thus, we know that there exists a number d k,crit that satisfies conditions (i)-(ii) in Theorem 1.1. Of course, to actually calculate this number we need to unearth its combinatorial "meaning". As we saw in Sect. 1.3, if d k,crit really is the condensation phase transition, then the combinatorial interpretation should be as follows. For d < d k,crit , the size of the cluster that a randomly chosen k-coloring τ belongs to is smaller than Z k (G(n, d/n)) by an exponential factor exp(Ω(n)) a.a.s. But as d approaches d k,crit , the gap between the cluster size and Z k (G(n, d/n)) diminishes. Hence, d k,crit should mark the point where the cluster size has the same order of magnitude as Z k (G(n, d/n)).
But how can we possibly get a handle on the size of the cluster that a randomly chosen k-coloring τ of G(n, d/n) belongs to? No "constructive" method is known for obtaining a single k-coloring of G(n, d/n) for d anywhere close to d k-col , let alone for sampling one uniformly at random. Nevertheless, as observed in [2], in the case that Φ k (d) = k(1 − 1/k) d/2 , i.e., for d < d k,crit , it is possible to capture the experiment of first choosing the random graph G(n, d/n) and then sampling a k-coloring τ uniformly at random by means of a different, much more innocent experiment.
In this latter experiment, we first choose a map σ : [n] → [k] uniformly at random. Then, we generate a graph G(n, p′, σ) on [n] by connecting any two vertices v, w ∈ [n] such that σ(v) ≠ σ(w) with probability p′ independently. If p′ = dk/(n(k − 1)) is chosen so that the expected number of edges is the same as in G(n, d/n) and if Φ_k(d) = k(1 − 1/k)^{d/2}, then this so-called planted model should be a good approximation to the "difficult" experiment of first choosing G(n, d/n) and then picking a random k-coloring τ. In particular, with respect to the cluster size we expect that

E[|C(G(n, p′, σ), σ)|^{1/n}] ∼ E[|C(G(n, d/n), τ)|^{1/n}],

i.e., that the suitably scaled cluster size in the planted model is about the same as the cluster size in G(n, d/n). Hence, d_{k,crit} should mark the point where E[|C(G(n, p′, σ), σ)|^{1/n}] equals k(1 − 1/k)^{d/2}. The following proposition verifies that this is indeed so. Let us write G = G(n, p′, σ) for the sake of brevity.
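The planted experiment is straightforward to simulate; the following is a minimal sketch under the conventions above (all identifier names are ours):

```python
import random

def planted_model(n, d, k, rng):
    """Draw sigma uniformly at random, then join bichromatic pairs independently
    with probability p' = dk/(n(k-1)), matching the expected edge count of G(n, d/n)."""
    sigma = [rng.randrange(k) for _ in range(n)]
    p = d * k / (n * (k - 1))
    edges = [(v, w) for v in range(n) for w in range(v + 1, n)
             if sigma[v] != sigma[w] and rng.random() < p]
    return sigma, edges

rng = random.Random(1)
sigma, edges = planted_model(100, 4.0, 3, rng)
# By construction, sigma is a k-coloring of the planted graph.
assert all(sigma[v] != sigma[w] for v, w in edges)
```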

2.2. The second thread.
Our next aim is to "solve" the fixed point problem for F_{d,k} to an extent that gives the fixed point an explicit combinatorial interpretation. This combinatorial interpretation is in terms of a certain random tree process, associated with a concept of "legal colorings". Specifically, we consider a multi-type Galton–Watson branching process. Its set of types is

T = {(i, ℓ) : i ∈ [k], ℓ ⊆ [k], i ∈ ℓ}.

The intuition is that i is a "distinguished color" and that ℓ is a set of "available colors". The branching process is further parameterized by a vector q = (q_1, . . . , q_k) ∈ [0, 1]^k such that q_1 + · · · + q_k ≤ 1. Let d′ = dk/(k − 1). Further, for each (i, ℓ) ∈ T such that |ℓ| > 1 we define T_{i,ℓ} as the set of all (i′, ℓ′) ∈ T such that ℓ ∩ ℓ′ ≠ ∅ and |ℓ′| > 1. In addition, for (i, ℓ) ∈ T such that |ℓ| = 1 we set T_{i,ℓ} = ∅. The branching process GW(d, k, q) starts with a single individual, whose type (i, ℓ) ∈ T is chosen from the probability distribution (q_{i,ℓ})_{(i,ℓ)∈T}. In the course of the process, each individual of type (i, ℓ) ∈ T spawns a Poisson number Po(d′ q_{i′,ℓ′}) of offspring of type (i′, ℓ′) for each (i′, ℓ′) ∈ T_{i,ℓ}. In particular, only the initial individual may have a type (i, ℓ) with |ℓ| = 1, in which case it does not have any offspring. Let 1 ≤ N ≤ ∞ be the progeny of the process (i.e., the total number of individuals created).
We are going to view GW(d, k, q) as a distribution over trees endowed with some extra information. Let us define a decorated graph as a graph T = (V, E) together with a map ϑ : V → T such that for each edge e = {v, w} ∈ E we have ϑ(w) ∈ T_{ϑ(v)}. Moreover, a rooted decorated graph is a decorated graph (T, ϑ) together with a distinguished vertex v_0, the root. Further, an isomorphism between two rooted decorated graphs T and T′ is an isomorphism of the underlying graphs that preserves the root and the types of the vertices.
Given that N < ∞, the branching process GW(d, k, q) canonically induces a probability distribution over isomorphism classes of rooted decorated trees. Indeed, we obtain a tree whose vertices are all the individuals created in the course of the branching process and where there is an edge between each individual and its offspring. The individual from which the process starts is the root. Moreover, by construction each individual v comes with a type ϑ(v). We denote the (random) isomorphism class of this tree by T_{d,k,q}. (It is most natural to view the branching process as a probability distribution over isomorphism classes, as the process does not specify the order in which offspring are created.) To proceed, we define a legal coloring of a decorated graph (G, ϑ) as a map τ : V → [k] such that τ is a k-coloring of G and such that for any type (i, ℓ) ∈ T and for any vertex v with ϑ(v) = (i, ℓ) we have τ(v) ∈ ℓ. Let Z(G, ϑ) denote the number of legal colorings.
Since Z(G, ϑ) is isomorphism-invariant, we obtain the integer-valued random variable Z(T_{d,k,q}). We have Z(T_{d,k,q}) ≥ 1 with certainty because a legal coloring τ can be constructed by coloring each vertex with its distinguished color (i.e., setting τ(v) = i if v has type (i, ℓ)). Hence, ln Z(T_{d,k,q}) is a well-defined non-negative random variable. Additionally, we write |T_{d,k,q}| for the number of vertices in T_{d,k,q}.
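For a concrete handle on Z(T, ϑ), the number of legal colorings of a finite decorated tree can be computed exactly by dynamic programming from the leaves up. The sketch below is ours (avail[v] plays the role of the available-color set ℓ in the type of v); the tiny example also has Z = 1, i.e., a frozen tree whose unique legal coloring is forced:

```python
def count_legal_colorings(children, avail, root):
    """Z(T, theta): count maps tau with tau(v) in avail[v] that properly
    color the tree. counts[c] = #legal colorings of v's subtree with tau(v) = c."""
    def subtree(v):
        counts = {c: 1 for c in avail[v]}
        for w in children[v]:
            child_counts = subtree(w)
            for c in list(counts):
                # the child may take any of its allowed colors except the parent's
                counts[c] *= sum(m for cw, m in child_counts.items() if cw != c)
        return counts
    return sum(subtree(root).values())

# Root 0 with children 1 and 2; available color sets as below.
children = {0: [1, 2], 1: [], 2: []}
avail = {0: {0, 1}, 1: {1}, 2: {0, 2}}
assert count_legal_colorings(children, avail, 0) == 1  # only tau = (0, 1, 2) is legal
```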
Finally, consider a rooted, decorated tree (T, ϑ, v_0) and let τ be a legal coloring of (T, ϑ, v_0) chosen uniformly at random. Then the color τ(v_0) of the root is a random variable with values in [k]. Let μ_{T,ϑ,v_0} ∈ Ω denote its distribution. Clearly, μ_{T,ϑ,v_0} is invariant under isomorphisms. Consequently, the distribution μ_{T_{d,k,q}} of the color of the root of a tree in the random isomorphism class T_{d,k,q} is a well-defined Ω-valued random variable. Let π_{d,k,q} ∈ P denote its distribution. Then we can characterise the frozen fixed point of F_{d,k} as follows.
Proposition 2.3. 1. The function (2.6) has a unique fixed point q* in the interval [2/3, 1]. Moreover, with q* as in (2.7) the branching process GW(d, k, q*) is sub-critical. Thus, P[N < ∞] = 1.
2. The map F_{d,k} has precisely one frozen fixed point, namely π_{d,k,q*}.

The function (2.6) and its fixed point explicitly occur in the physics work [33]. The proof of Proposition 2.3 can be found in Sect. 5.

Tying up the threads.
To prove that d k,cond = d k,crit , we establish a connection between the random tree T d,k,q * and the random graph G with planted coloring σ . We start by giving a recipe for computing the cluster size |C(G, σ )|, and then show that the random tree process "cooks" it.
Computing the cluster size hinges on a close understanding of its combinatorial structure. As hypothesised in physics work [23] and established rigorously in [2,7,25], typically many vertices v are "frozen" in C(G, σ), i.e., τ(v) = τ′(v) for any two colorings τ, τ′ ∈ C(G, σ). More generally, we consider for each vertex v the set ℓ(v) of colors that v may take in colorings τ that belong to the cluster. Together with the "planted" color σ(v), we can thus assign each vertex v a type ϑ(v) = (σ(v), ℓ(v)). This turns G into a decorated graph (G, ϑ).
By construction, each coloring τ ∈ C(G, σ) is a legal coloring of the decorated graph (G, ϑ). Conversely, we will see that a.a.s. any legal coloring of (G, ϑ) belongs to the cluster C(G, σ). Hence, computing the cluster size |C(G, σ)| amounts to calculating the number Z(G, ϑ) of legal colorings of (G, ϑ). This calculation is facilitated by the following observation. Let Ĝ be the graph obtained from G by deleting all edges e = {v, w} that join two vertices such that ℓ(v) ∩ ℓ(w) = ∅. Then any legal coloring τ of Ĝ is a legal coloring of G, because τ(v) ∈ ℓ(v) for any vertex v. Hence, Z(G, ϑ) = Z(Ĝ, ϑ).
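The identity Z(G, ϑ) = Z(Ĝ, ϑ) can be checked by brute force on small instances: an edge whose endpoints have disjoint available-color sets can never be violated by a legal coloring, so deleting it leaves the count unchanged. A hypothetical check (names are ours):

```python
import itertools

def legal_count(n, edges, avail):
    """Count maps tau with tau(v) in avail[v] and tau(v) != tau(w) on each edge."""
    return sum(
        all(tau[v] != tau[w] for v, w in edges)
        for tau in itertools.product(*(sorted(avail[v]) for v in range(n)))
    )

n = 4
avail = [{0, 1}, {2}, {0, 2}, {1, 2}]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
# Prune the edges whose endpoints can never receive the same color anyway.
pruned = [(v, w) for v, w in edges if avail[v] & avail[w]]
assert len(pruned) < len(edges)
assert legal_count(n, edges, avail) == legal_count(n, pruned, avail)
```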
Thus, we just need to compute Z(Ĝ, ϑ). This task is much easier than computing Z(G, ϑ) directly because Ĝ turns out to have significantly fewer edges than G a.a.s. More precisely, a.a.s. Ĝ (mostly) consists of connected components that are trees of bounded size. In fact, we shall see that in an appropriate sense the distribution of the tree components converges to that of the decorated random tree T_{d,k,q*}. In effect, we obtain

Proposition 2.4. Suppose that d ≥ (2k − 1) ln k − 2 and let p′ be as in (2.3). Let q* be as in (2.7). Then the sequence {(1/n) ln |C(G, σ)|}_n converges in probability to E[ln Z(T_{d,k,q*})/|T_{d,k,q*}|].

The proof of Proposition 2.4, which can be found in Sect. 6, is based on the precise analysis of a further, combinatorial fixed point problem called Warning Propagation.
Proof of Theorem 1.1. Combining Propositions 2.2 and 2.4, we see that d_{k,crit} is equal to d_{k,cond}, which is well-defined by Proposition 2.3. Further, (2.2) implies that d_{k,crit} > 0. Assume for contradiction that d_{k,crit} is smooth. Then there is ε > 0 such that the limit Φ_k(d) exists for all d ∈ (d_{k,crit} − ε, d_{k,crit} + ε) and such that the function d → Φ_k(d) is given by an absolutely convergent power series on this interval. Moreover, Φ_k(d) = k(1 − 1/k)^{d/2} for all d ∈ (d_{k,crit} − ε, d_{k,crit}). Consequently, the uniqueness of analytic continuations implies that Φ_k(d) = k(1 − 1/k)^{d/2} for all d ∈ (d_{k,crit} − ε, d_{k,crit} + ε), in contradiction to the definition of d_{k,crit}. Thus, d_{k,crit} is a phase transition.

2.4. Proof of Corollary 1.1.
Corollary 1.1 follows rather easily from the above and the following lemma, which establishes a connection between the planted model and the Boltzmann distribution on G(n, d/n). As in Corollary 1.1, we let τ denote a random k-coloring of G(n, d/n).

Lemma 2.1 [3]. Assume that d < d_{k,cond}. Let E be a set of pairs (G, σ), where G is a graph and σ is a k-coloring of G. Further, given that Z_k(G(n, d/n)) > 0, let τ be a uniformly random k-coloring of G(n, d/n).

Groundwork: The First and the Second Moment Method
In this section we prove Proposition 2.1 and also lay the foundations for the proof of Proposition 2.2.
3.1. The first moment upper bound. We start by deriving an upper bound on Φ k (d) by computing the expected number of k-colorings. To avoid fluctuations of the total number of edges, we work with the G(n, m) model.
Lemma 3.1 is folklore. We carry the proof out regardless to make a few observations that will be important later. For a map σ : [n] → [k], let Forb(σ) be the number of "forbidden pairs" of vertices that are colored the same under σ. By convexity, Forb(σ) ≥ (1 + o(1)) n²/(2k), with equality up to lower-order terms precisely for balanced maps σ, i.e., maps all of whose color classes have size n/k + O(1); let Bal denote the set of such maps. Hence, using Stirling's formula, we find the estimate asserted in Lemma 3.1. As |Bal| = Ω(k^n) by Stirling, the linearity of expectation and (3.1) yield the assertion.
Letting Z_{k,bal} denote the number of balanced k-colorings, we obtain from the above argument a corresponding bound on E[Z_{k,bal}(G(n, m))]. As a further consequence of Lemma 3.1, we obtain Corollary 3.2.

Proof. Lemma 3.1 and Jensen's inequality yield
Now, let c > 0 and set d = c − ε for some ε > 0. The number of edges in G(n, c/n) is binomially distributed with mean (1 + o(1))cn/2 = m + Ω(n). Hence, by the Chernoff bound the probability of the event A that G(n, c/n) has at least m edges tends to 1 as n → ∞. Because adding further edges can only decrease the number of k-colorings and since the number of k-colorings is trivially bounded by k n , we obtain from (3.4) that

The second moment lower bound.
The main technical step in the article [8] that yields the lower bound (1.7) on d_{k-col} is a second moment argument for a random variable Z_{k,tame} related to the number of k-colorings. We are going to employ this second moment estimate to bound Z_k(G(n, d/n)) from below. The random variable Z_{k,tame} counts k-colorings with some additional properties. Suppose that σ is a balanced k-coloring of a graph G on V = [n]. We call σ separable if for any balanced τ ∈ C(G, σ) and any i ∈ [k] we have

|σ^{-1}(i) ∩ τ^{-1}(i)| ≥ (1 − κ + o(1)) n/k.

Thus, if σ is a balanced, separable k-coloring, then for any color i and for any other balanced k-coloring τ in the cluster of σ, a (1 − κ + o(1))-fraction of the vertices colored i under σ are colored i under τ as well. In particular, the clusters of any two such colorings are either disjoint or identical.

Definition 3.1. Let G be a graph with n vertices and m edges.
As fleshed out in [8], Lemma 3.2 together with the sharp threshold result from [1] yields the lower bound in (1.7). Here we are going to combine Lemma 3.2 with the following variant of that sharp threshold result to obtain a lower bound on the number of k-colorings.

Lemma 3.3 [2]. For any k ≥ 3 and for any real ξ > 0 there is a sequence d_{k,ξ}(n) such that for any ε > 0 the following holds.
Further, pick and fix d_* < d̃ < d^*. We are going to use Lemmas 3.2 and 3.3 to establish a lower bound on Z_k(G(n, d^*/n)) that contradicts (3.6). By the Paley–Zygmund inequality and because (3.5) holds for any d_* − ε < d < d^*, we obtain (3.9). Further, because (3.5) is true for any d_* − ε < d < d^* and ξ < k(1 − 1/k)^{d/2} for any d̃ < d < d^*, we see that (3.9) implies

lim inf_{n→∞} P[Z_{k,tame}(G(n, m)) ≥ ξ^n] > 0 for any d < d̃.   (3.10)

Since the number of edges in G(n, d/n) has a binomial distribution with mean m, with probability at least 1/3 the number of edges in G(n, d/n) does not exceed m. Therefore, (3.10) implies that the corresponding bound holds for G(n, d/n) as well. Combining (3.6), (3.7) and (3.13) yields a contradiction, which refutes our assumption that d_{k,crit} < d^*.
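The Paley–Zygmund step above rests on the second moment inequality P[Z > 0] ≥ E[Z]²/E[Z²] for a non-negative random variable Z. A toy numerical check, with a distribution of our own choosing:

```python
# Z is uniform on the multiset below (a non-negative random variable).
values = [0, 0, 0, 1, 4, 10]
first_moment = sum(values) / len(values)                  # E[Z]
second_moment = sum(z * z for z in values) / len(values)  # E[Z^2]
p_positive = sum(z > 0 for z in values) / len(values)     # P[Z > 0]
# Second moment method: P[Z > 0] >= E[Z]^2 / E[Z^2].
assert p_positive >= first_moment**2 / second_moment
```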

Proof of Proposition 2.1.
We start with the following observation.

Lemma 3.5. Let
Let us denote the random graph G(n, d_1/n) by G_1. Furthermore, let G_2 be a random graph obtained from G_1 by joining any two vertices that are not already adjacent in G_1 with probability q independently. Then G_2 is identical to G(n, d_2/n), because in G_2 any two vertices are adjacent with probability

d_1/n + (1 − d_1/n)q = d_2/n.   (3.14)

Suppose that we condition on e(G_1) and e(G_2). What is the probability that σ remains a k-coloring of G_2? For this to happen, none of the e(G_2) − e(G_1) additional edges may fall among the Forb(σ) pairs of vertices with the same color under σ. Using Stirling's formula, we see that the probability of σ remaining a k-coloring of G_2 is bounded accordingly.
Proof of Proposition 2.1. The first assertion follows from Corollary 3.2 which additionally implies that Hence, the second assertion is immediate from Lemma 3.5. The third assertion follows from Lemmas 3.2 and 3.4.

4.1. Overview.

The aim in this section is to prove Proposition 2.2. The proof of the first part is fairly straightforward. More precisely, in Sect. 4.2 we are going to establish

The more challenging claim is that d ≥ d_{k,crit} if typically the cluster in the planted model is "too big". To prove this, we consider a variant of the planted model in which the number of edges is fixed. More precisely, for a map σ : [n] → [k] we let G(n, m, σ) denote a graph on the vertex set V = [n], with precisely m edges none of which joins two vertices v, w with σ(v) = σ(w), chosen uniformly at random. In other words, G(n, m, σ) is just the random graph G(n, m) conditioned on the event that σ is a k-coloring. The following lemma, which is a variant of the "planting trick" from [2], establishes a general relationship between G(n, m) and G(n, m, σ).
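The planted model G(n, m, σ) just described can be sampled directly: draw m edges uniformly among the pairs that σ colors differently. A brute-force sketch (suitable only for small n; the function name and parameters are ours):

```python
import itertools, random

def planted_graph(n, m, sigma, seed=0):
    """Sample G(n, m, sigma): m uniformly random edges among the pairs
    {v, w} that are bichromatic under the map sigma : [n] -> [k]."""
    rng = random.Random(seed)
    allowed = [(v, w) for v, w in itertools.combinations(range(n), 2)
               if sigma[v] != sigma[w]]
    return rng.sample(allowed, m)

# By construction sigma is a k-coloring of the sampled graph.
sigma = [v % 3 for v in range(30)]   # a balanced 3-coloring of 30 vertices
edges = planted_graph(30, 40, sigma)
assert len(edges) == 40 and all(sigma[v] != sigma[w] for v, w in edges)
```

This makes the conditioning explicit: the sample is exactly G(n, m) conditioned on σ being a proper coloring.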
We prove Lemma 4.2 in Sect. 4.6. Hence, assuming that the typical cluster size in the planted model is "too big" a.a.s., we need to exhibit events E_n such that (4.1) holds. An obvious choice seems to be

But (4.1) requires that the probability that E_n occurs in G(n, m, σ) is exponentially small, and neither the cluster size nor Z_k is known to be sufficiently concentrated to obtain such an exponentially small probability. Therefore, we define the events E_n by means of another random variable. For a graph G = (V, E) and a map τ : V → [k] let m(G, τ) denote the number of edges of G that are monochromatic under τ. Then

Z_{β,k}(G) = Σ_{τ : V → [k]} exp(−β · m(G, τ))

is the partition function of the k-spin Potts antiferromagnet on G at inverse temperature β.
For large β there is a stiff "penalty factor" of exp(−β) for each monochromatic edge. Thus, we expect that Z_{β,k} becomes a good proxy for Z_k as β → ∞. At the same time, ln Z_{β,k} enjoys a Lipschitz property. Namely, suppose that we obtain a graph G′ from G by either adding or removing a single edge. Then

|ln Z_{β,k}(G′) − ln Z_{β,k}(G)| ≤ β,

because altering one edge changes each summand of the partition function by a factor between exp(−β) and 1. Due to this Lipschitz property, one can easily show that ln Z_{β,k} is tightly concentrated. More precisely, the following holds for all large enough n.

Proof. This is immediate from the Lipschitz property.
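Both the β → ∞ behavior and the Lipschitz property can be verified by brute force on tiny instances. In the sketch below we take Z_{β,k}(G) = Σ_σ exp(−β · #monochromatic edges), which is our reading of the definition; the example graph (a path on 5 vertices) is an arbitrary illustration:

```python
import itertools, math

def potts_partition(n, edges, k, beta):
    """Brute-force Z_{beta,k}(G): sum over all k^n maps sigma of
    exp(-beta * number of monochromatic edges)."""
    z = 0.0
    for sigma in itertools.product(range(k), repeat=n):
        mono = sum(1 for v, w in edges if sigma[v] == sigma[w])
        z += math.exp(-beta * mono)
    return z

# Adding one edge changes ln Z_{beta,k} by at most beta.
n, k, beta = 5, 3, 2.0
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
z0 = potts_partition(n, edges, k, beta)
z1 = potts_partition(n, edges + [(0, 4)], k, beta)
assert abs(math.log(z1) - math.log(z0)) <= beta + 1e-12
```

As β grows, the monochromatic terms die out and Z_{β,k} approaches the number of proper k-colorings; for the path above with k = 3 that number is 3 · 2⁴ = 48.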

Proof of Lemma 4.1.
We use the following observation from [8].
Pick a number d^* > d such that with m^* = d^*n/2 we have

We claim that if we choose σ : [n] → [k] uniformly at random and independently a random graph G(n, m^*), then

To see this, let E be the event that the random graph G(n, p, σ) has no more than m^* edges. Because the number of edges in G(n, p, σ) is binomially distributed, E occurs a.a.s. Further, set d′ = kd^*/(k − 1) and let p′ = d′/n > p. Then we can think of G(n, p′, σ) as being obtained from G(n, p, σ) by adding further random edges. More precisely, let A be the event that G(n, p′, σ) contains precisely m^* edges. Since adding edges can only decrease the cluster size, (4.5) entails

Similarly, let p′_n = P[σ is separable in G(n, p′, σ) | A]. Then Lemma 4.5 implies

Further, consider p′′_n = P[σ is balanced]. Then by Stirling's formula,

Finally, let p′′′_n = P[σ is a tame k-coloring of G(n, p′, σ) | A]. Given the event A, G(n, p′, σ) is just a uniformly random graph with m^* edges in which σ is a k-coloring. Hence,

As (4.6)–(4.7) yield lim inf_{n→∞} p′′′_n > 0, we obtain (4.4).
The estimate (4.4) enables us to bound E[Z_{k,tame}(G(n, m^*))] from below. Indeed, by the linearity of expectation

Thus, Lemma 3.1 and (4.4) yield

As this holds for any d^* > d, we obtain (4.8). Indeed, the number e(G(n, d^*/n)) of edges of G(n, d^*/n) is binomially distributed with mean (1 + o(1))d^*n/2. Since d, d^* are independent of n and d^* > d, the Chernoff bound implies that

Further, if we condition on the event that m^* = e(G(n, d^*/n)) > m, then we can think of G(n, d^*/n) as follows: first, create a random graph G(n, m); then, add another m^* − m random edges. Since the addition of further random edges cannot increase the number of k-colorings, by (4.9) we find that

Taking n → ∞, and assuming that d^* > d is sufficiently close to d, we conclude that

Then the assertion follows from Lemma 4.6. Furthermore, by the linearity of expectation,

To estimate the last factor, we use (3.1) and Stirling's formula, which yield

Plugging this estimate into (4.11) and recalling that σ is a random map [n] → [k], we obtain

Finally, using our assumption that lim sup P[G(n, m, σ) ∈ E_n]^{1/n} < 1 and combining (4.11) and (4.12), we see that

thereby completing the proof of (4.10).
Proof. Let A be the event that the number of edges in the random graph G(n, p, σ) differs from dn/2 by at most √n. Let N = n(n − 1)/2. For any balanced σ,

Since the number of edges in G(n, p, σ) is a binomial random variable, (4.15) together with the central limit theorem shows that there exists a fixed γ > 0 such that for sufficiently large n

Furthermore, by Stirling's formula there is an n-independent number δ > 0 such that for sufficiently large n we have

Combining (4.16) and (4.17), we see that

Thus, pick σ_n ∈ Bal and μ_n ∈ [dn/2 − √n, dn/2 + √n] that maximize

Then (2.5) and (4.18) imply that lim_{n→∞} p(σ_n, μ_n) = 1.
Thus, the assertion is immediate from the Chernoff bound.
Let Vol_G(S) be the sum of the degrees of the vertices of S in the graph G.

Proof. Let (X_v)_{v∈[n]} be a family of independent random variables with distribution Bin(n, p). Then for any set S the volume Vol(S) in G(n, p, σ) is stochastically dominated by X_S = 2 Σ_{v∈S} X_v. Indeed, for each vertex v ∈ S the degree is a binomial random variable with mean at most np, and the only correlation amongst the degrees of the vertices in S is that each edge joining two vertices in S contributes two to Vol(S). Furthermore, E[X_S] ≤ 2d|S|. Thus, for any γ > 0 we can choose an n-independent α > 0 such that for any S ⊂ [n] of size |S| ≤ αn we have E[X_S] ≤ γn/2. In fact, the Chernoff bound shows that by picking α > 0 sufficiently small, we can ensure that

as desired.
Let γ = ε/(4β) > 0. By Lemma 4.10 there exists α > 0 such that, for large enough n, for any set S ⊂ V of size |S| ≤ αn and any σ : [n] → [k],   (4.20)

Pick and fix a small 0 < η < α/3 and let A be the event that Σ_{i=1}^{k} ||σ^{−1}(i)| − n/k| ≤ ηn. Then by Lemma 4.9 there exists an (n-independent) number δ = δ(β, ε, η) > 0 such that for n large enough

Therefore, if A occurs, then it is possible to obtain from σ a map τ_σ ∈ T by changing the colors of at most 2ηn vertices. If A occurs, we let G_1 = G(n, p, τ_σ). Further, let G_2 be the random graph obtained by removing from G_1 all edges that are monochromatic under σ. Finally, let G_3 be the random graph obtained from G_2 by inserting an edge between any two vertices v, w with σ(v) ≠ σ(w) but τ_σ(v) = τ_σ(w), with probability p independently. Thus, the bottom line is that in G_3 we connect any two vertices that are colored differently under σ with probability p independently. That is, G_3 = G(n, p, σ).
Let S_σ be the set of vertices v with σ(v) ≠ τ_σ(v) and let Δ be the number of edges we removed to obtain G_2 from G_1. Then Δ is bounded by the volume of S_σ in G_1 = G(n, p, τ_σ). Hence, (4.20) implies that

Since removing a single edge can reduce Y by at most β/n, we obtain

Finally, the assertion follows from Lemma 4.3.
1. The function F_{d,k} has a unique fixed point q^* = (q^*_1, . . . , q^*_k) such that Σ_{j∈[k]} q^*_j ≥ 2/3. This fixed point has the property that q^*_1 = · · · = q^*_k. Moreover, q^* = kq^*_1 is the unique fixed point of the function (2.6) in the interval [2/3, 1].

The proof of Lemma 5.1 requires several steps. We begin by studying the fixed points of F_{d,k}.
Lemma 5.2. The function F_{d,k} maps the compact set I = [2/(3k), 1/k]^k into itself and has a unique fixed point q^* in this set. Moreover, the function from (2.6) has a unique fixed point q^* in the set [2/3, 1] and q^* = (q^*/k, . . . , q^*/k). Furthermore,

In addition, if q ∈ [0, 1]^k is a fixed point of F_{d,k}, then

As a first step, we show that F_{d,k}(I) ⊂ I. Indeed, let q ∈ I. Then for any i ∈ [k]

On the other hand, as d ≥ (2k − 1) ln k − 2 we see that d ≥ 1.99k ln k. Hence,

Thus, F_{d,k}(I) ⊂ I. In addition, we claim that F_{d,k} is contracting on I. In fact, for any i, j ∈ [k]

[as d ≥ 1.99k ln k and kq_l ≥ 2/3 for all l] ≤ k^{−1.3} [for the same reason].
Therefore, for q ∈ I the Jacobi matrix DF_{d,k}(q) satisfies

Thus, F_{d,k} is a contraction on the compact set I. Consequently, Banach's fixed point theorem implies that there is a unique fixed point q^* ∈ I. To establish (5.3), assume without loss of generality that q = (q_1, . . . , q_k) ∈ [0, 1]^k is a fixed point such that q_1 ≤ · · · ≤ q_k. For the trivial fixed point q_1 = · · · = q_k = 0, Eq. (5.3) obviously holds. So we assume q_1 > 0. Because q is a fixed point, we find

Comparing the expressions f_{d,k}(q) and F_{d,k}(q), we see that (q^*/k, . . . , q^*/k) is a fixed point of F_{d,k}. Consequently, q^* = (q^*/k, . . . , q^*/k).
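The Banach fixed-point argument can be illustrated numerically. The map g below is a hypothetical stand-in, NOT the paper's (2.6): it is a heuristic "frozen colors" recursion with a similar shape, chosen by us only to show the iteration converging to a unique fixed point in [2/3, 1]:

```python
import math

def banach_iterate(f, x0, tol=1e-12, max_iter=10_000):
    """Iterate a contraction f from x0 until successive iterates are within
    tol; by Banach's fixed point theorem the limit is the unique fixed point."""
    x = x0
    for _ in range(max_iter):
        y = f(x)
        if abs(y - x) < tol:
            return y
        x = y
    raise RuntimeError("no convergence; f may not be a contraction here")

# Hypothetical stand-in map (ours, not the paper's (2.6)).
k = 10
d = 2 * k * math.log(k)
g = lambda q: (1.0 - math.exp(-d * q / (k - 1))) ** (k - 1)
q_star = banach_iterate(g, 1.0)
assert abs(g(q_star) - q_star) < 1e-9 and 2 / 3 <= q_star <= 1.0
```

The point of the lemma is that F_{d,k} contracts on the compact box I, so the same iteration applied to F_{d,k} would converge to its unique fixed point there.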
Finally, since f′_{d,k}(q) > 0 for all q, the function f_{d,k} is strictly increasing. Therefore, as d = (2 − o_k(1))k ln k,

Next, we revisit the calculations from [33] that predicted the existence and location of d_{k,cond}. We redo these calculations here in detail to be self-contained and because not all steps are carried out in full detail in [33].
From here on out, we let q * denote the fixed point of F d,k in [2/(3k), 1/k] k and we denote the fixed point of the function (2.6) in the interval [2/3, 1] by q * . Hence, q * = (q * /k, . . . , q * /k). If we keep k fixed, how does q * vary with d?
Proof. The map d → q * is differentiable by the implicit function theorem. Moreover, differentiating (2.6) while keeping in mind that q * = q * (d) is a fixed point, we find Rearranging the above using d = 2k ln k + O k (ln k) and (5.2) yields the assertion.
Proof. Lemma 5.2 shows that q^*_j = q^*/k for all j ∈ [k]. Hence, due to (5.2) and because

Furthermore, applying Corollary 5.1, we get

Due to the symmetry of the fixed point q^* (i.e., q^* = (q^*/k, . . . , q^*/k)), M_{22} is precisely the expected number of offspring of type (i, ℓ) with |ℓ| = 2 that an individual of type

To show that this is the case, we need to estimate the entries M_{ij}. Estimating the q^*_{i,ℓ} via Corollary 5.2, we obtain

Moreover, let us introduce the shorthands T = T_{d,k,q^*} and T̂ = T_{d,k,q̂^*}. We aim to bound

To this end, we couple T and T̂ as follows.
• Initially, there is one individual. Its type is (i_0, ℓ_0).
• Each individual of type (i, ℓ) spawns a Po(Λ_{i,ℓ}) number of offspring of each type
• Given that the total progeny is finite, we obtain T by linking each individual to its offspring.
– For each type (i, ℓ) let
For every vertex v of T let s_v be a random variable with distribution Be(
Further, since |T|^{−1} ln Z(T), |T̂|^{−1} ln Z(T̂) ≤ ln k with certainty, we obtain
Because (i_0, ℓ_0) and (î_0, ℓ̂_0) are coupled optimally and P[A] = kq^*_1, P[Â] = kq̂^*_1,
Now, let E be the event that ℓ_0 = {i_0}, ℓ̂_0 = {î_0} and (i_0, ℓ_0) = (î_0, ℓ̂_0). Due to Corollary 5.2 and because (i_0, ℓ_0), (î_0, ℓ̂_0) are coupled optimally, we see that
Combining (5.6) and (5.7), we conclude that
Thus, we are left to estimate the probability that T ≠ T̂, given that both trees have a root of the same type (i_0, ℓ_0) with |ℓ_0| > 1. Our coupling ensures that this event occurs iff s_v = 1 for some vertex v of T. To estimate the probability of this event, we observe that by Corollary 5.2
Now, let N_1 be the number of vertices v ≠ v_0 of T such that |ℓ_v| = 2, and let N_2 be the number of v ≠ v_0 such that |ℓ_v| > 2. Then (5.9), (5.10) and the construction of the coupling yield
To complete the proof, we claim that
Then Corollary 5.2 entails that
Then Corollary 5.2 shows that ξ_2 = Õ_k(k^{−2}). Furthermore, by the construction of the branching process and (5.2)
which implies (5.12). Finally, (5.11) and (5.12) imply that Δ ≤ ε Õ_k(k^{−2}). Taking ε → 0 completes the proof.
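The objects being coupled are multi-type Galton–Watson trees with Poisson offspring, which are finite almost surely in the subcritical regime. A single-type sketch of sampling the total progeny (all parameters illustrative, not from the paper):

```python
import math, random

def poisson(lam, rng):
    """Sample Po(lam) by inverting the CDF (fine for small lam)."""
    x, p = 0, math.exp(-lam)
    cdf, u = p, rng.random()
    while u > cdf:
        x += 1
        p *= lam / x
        cdf += p
    return x

def total_progeny(lam, rng, cap=100_000):
    """Total progeny of a Galton-Watson tree with Po(lam) offspring.
    For lam < 1 the tree is finite almost surely and E[size] = 1/(1 - lam)."""
    alive, size = 1, 1
    while alive > 0 and size < cap:
        alive = sum(poisson(lam, rng) for _ in range(alive))
        size += alive
    return size
```

In the proof the offspring intensities are the Λ_{i,ℓ} of the multi-type process, and the coupling compares two such trees generation by generation.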
Proof of Lemma 5.1. The first assertion is immediate from Lemma 5.2. The second claim follows from Lemma 5.3, and the third one from Lemma 5.4.

The "hard fields".
In this section we make the first step towards proving that π_{d,k,q^*} is the unique frozen fixed point of F_{d,k}. More specifically, identifying the set Ω with the k-simplex, we show that every face of Ω carries the same probability mass under any frozen fixed point of F_{d,k} as under the measure π_{d,k,q^*}. Formally, let us denote the extremal points of Ω by δ_h = (1_{i=h})_{i∈[k]}, i.e., δ_h is the probability measure on [k] that puts mass 1 on the single point h ∈ [k]. In addition, let Ω_ℓ be the set of all μ ∈ Ω with support ℓ (i.e., μ(i) > 0 for all i ∈ ℓ and μ(i) = 0 for all i ∈ [k] \ ℓ). Further, for a probability measure π ∈ P we let ρ_h(π) = π({δ_h}) denote the probability mass of δ_h under π. In physics jargon, the numbers ρ_h(π) are called the "hard fields" of π. In addition, recalling that dπ_i(μ) = kμ(i)dπ(μ), we set ρ_{i,ℓ}(π) = π_i(Ω_ℓ) for any (i, ℓ) ∈ T. The main result of this section is

Lemma 5.5. Suppose that d ≥ (2k − 1) ln k − 2. Let q^* ∈ [2/3, 1] be the fixed point of (2.6). If π ∈ P is a frozen fixed point of F_{d,k}, then ρ_i(π) = q^*/k and ρ_{i,ℓ}(π) = kq^*_{i,ℓ}.

To avoid many case distinctions, we introduce the following convention when working with product measures. Let us agree that Ω^0 = {∅}. Hence, if B : Ω^0 → Ω is a map, then B(∅) ∈ Ω. Furthermore, there is precisely one probability measure π^0 on Ω^0, namely the measure that puts mass one on the point ∅ ∈ Ω^0. Thus, the integral ∫_{Ω^0} B(μ) dπ^0(μ) is simply equal to B(∅). If π_1, π_2, . . . are probability measures on Ω, what we mean by the empty product measure ⊗_{γ=1}^{0} π_γ is just the measure π^0 on Ω^0. Further, for a real λ ≥ 0 and an integer y ≥ 1 we let p_λ(y) = λ^y exp(−λ)/y!.
Moreover, for i ∈ [k] we let Γ_i be the set of all non-negative integer vectors γ = (γ_j)_{j∈[k]\{i}}, and for γ ∈ Γ_i we set

Thus, with the convention from the previous paragraph, in the case γ = 0 the set Ω^γ = {∅} contains only one element, namely μ^0 = ∅. Moreover, π_{i,γ} is the probability measure on Ω^0 that gives mass one to the point ∅. We recall the map B : ⋃_{γ≥1} Ω^γ → Ω from (1.1) and extend this map to Ω^0 by letting B(∅) = k^{−1}·1 be the uniform distribution. We start the proof of Lemma 5.5 by establishing the following identity.

Lemma 5.6. If π is a fixed point of F_{d,k}, then for any i ∈ [k] we have

To establish Lemma 5.6 we need to calculate the normalising quantities Z_γ(π).

Lemma 5.7. If π is a fixed point of F_{d,k}, then Z_γ(π) = (k − 1)^γ / k^{γ−1}.
To prove the second assertion, let (i, ℓ) ∈ T. Then Lemma 5.6 yields

Given γ, the distributions μ_{h,j} are chosen independently from π_h for all h ≠ i and j ∈ [γ_h]. Hence, for a given γ the probability that (1) and (2) occur is precisely

Thus, combining (5.17) and (5.18), we see that

Finally, as we already know from the first paragraph that ρ_h(π) = q^*/k, (5.19) implies that ρ_{i,ℓ}(π) = kq^*_{i,ℓ}.

The fixed point. The objective in this section is to establish
Lemma 5.8. Suppose that d ≥ (2k − 1) ln k − 2. Then π d,k,q * is the unique frozen fixed point of F d,k .
To prove Lemma 5.8, let P_ℓ be the set of all probability measures π ∈ P whose support is contained in Ω_ℓ (i.e., π(Ω_ℓ) = 1). For each π ∈ P and any (i, ℓ) ∈ T we define a measure π_{i,ℓ} by letting

In addition, let P′ = ∏_{(i,ℓ)∈T} P_ℓ be the set of all families (π_{i,ℓ})_{(i,ℓ)∈T} such that π_{i,ℓ} ∈ P_ℓ for all (i, ℓ).
Proof. Let (i, ℓ) ∈ T. By construction, the support of π_{i,ℓ} is contained in Ω_ℓ. Furthermore, Lemma 5.5 implies that

Thus, π_{i,ℓ} is a probability measure.
Let Γ_{i,ℓ} be the set of all non-negative integer vectors γ = (γ_{i′,ℓ′})_{(i′,ℓ′)∈T_{i,ℓ}}. For γ ∈ Γ_{i,ℓ}, we let Ω^γ denote the corresponding product space and denote its points by μ^γ = (μ_{i′,ℓ′,j}). In addition, if π is a probability measure on Ω and γ ∈ Γ_{i,ℓ}, we set

Further, we define for any non-empty set ℓ ⊂ [k] a map

Then the map π ∈ X → π′ = (π_{i,ℓ})_{(i,ℓ)∈T} induces a bijection between X and X′.
Proof. Suppose that π ∈ X. Let (i, ℓ) ∈ T. Then Lemma 5.6 yields

Now let us fix a pair (i, ℓ) ∈ T and (γ, μ^γ). We denote, for h ≠ i, by γ_h = γ_h(μ^γ) the number of occurrences of δ_h in the tuple μ^γ. The event B[μ^γ] ∈ Ω_ℓ occurs iff
1. for each h ∈ [k] \ ℓ there is j ∈ [γ_h] such that μ_{h,j} = δ_h, i.e. γ_h > 0,
2. for each h ∈ ℓ \ {i} and all j we have μ_{h,j} ≠ δ_h, i.e. γ_h = 0.
Thus, Lemma 5.5 implies that we obtain from (5.21) and (5.22)

Thus, if π is a frozen fixed point of F_{d,k}, then π′ = (π_{i,ℓ}) is a fixed point of the induced map. Conversely, if π′ = (π_{i,ℓ}) is a fixed point of the induced map, then the measure π defined by

is easily verified to be a fixed point of F_{d,k}. Moreover, for i ∈ [k], ρ_i(π) = q^*_{i,{i}} = q^*/k ≥ 2/(3k), and π is thus a frozen fixed point of F_{d,k}.
Proof. To unclutter the notation we write π = π_{d,k,q^*}. Moreover, we let T = T_{d,k,q^*}; by Lemma 5.1 we may always assume that T is a finite tree. Recall that π is the distribution of μ_T, the distribution of the color of the root under a random legal coloring of T. In light of Lemma 5.10 it suffices to show that (π_{i,ℓ}) is a fixed point. Thus, we need to show that for all (i, ℓ) ∈ T,

Let us denote by T_{i,ℓ} the random tree T given that the root has type (i, ℓ). We claim that π_{i,ℓ} is the distribution of μ_{T_{i,ℓ}}. Indeed, let ℓ ⊂ [k]. If the root v_0 of T has type (i, ℓ) for some i ∈ ℓ, then the support of the measure μ_T is contained in ℓ (because under any legal coloring, v_0 receives a color from ℓ). Moreover, all children of v_0 have types in T_{i,ℓ}, and if (i′, ℓ′) ∈ T_{i,ℓ}, then |ℓ′| ≥ 2. Hence, inductively we see that if v_0 has type (i, ℓ), then for any color h ∈ ℓ there is a legal coloring under which v_0 receives color h. Consequently, the support of μ_T is precisely ℓ. Furthermore, the distribution of μ_T is invariant under the following operation: obtain a random tree T′ by choosing a legal coloring τ of T randomly and then changing the types ϑ(v) = (i_v, ℓ_v) of the vertices to ϑ′(v) = (τ(i_v), ℓ_v); this is because the trees T and T′ have the same set of legal colorings. These observations imply that for any measurable set A we have

To prove the fixed point property, we observe that the random tree T_{i,ℓ} can be described by the following recurrence. There is a root v_0 of type (i, ℓ). For each (i′, ℓ′), v_0 has a random number of children of type (i′, ℓ′). Moreover, each v_{i′,ℓ′,j} is the root of a random tree T_{i′,ℓ′,j}. Of course, the random variables (γ_{i′,ℓ′})_{(i′,ℓ′)∈T_{i,ℓ}} and the random trees T_{i′,ℓ′,j} are chosen independently.
This recursive description of the random tree T_{i,ℓ} leads to a recurrence for the distribution π_{i,ℓ}. Indeed, given the numbers (γ_{i′,ℓ′}), the distribution μ_{T_{i′,ℓ′,j}} of the color of the root of the random tree T_{i′,ℓ′,j} is an Ω-valued random variable with distribution π_{i′,ℓ′} for each j = 1, . . . , γ_{i′,ℓ′}. Moreover, the random variables (μ_{T_{i′,ℓ′,j}}) are mutually independent. In addition, we claim that given the distributions (μ_{T_{i′,ℓ′,j}}), the color of the root v_0 of the entire tree T_{i,ℓ} has distribution

Indeed, given that v_0 has type (i, ℓ), v_0 receives a color from ℓ under any legal coloring. Further, for any h ∈ ℓ the probability that v_0 takes color h under a random coloring of T_{i,ℓ} is proportional to the probability that none of its children v_{i′,ℓ′,j} takes color h in a random coloring of the tree T_{i′,ℓ′,j} of which v_{i′,ℓ′,j} is the root. Finally, we recall that π_{i,ℓ} is the distribution of μ_{T_{i,ℓ}}. Hence, (5.24) together with the fact that the γ_{i′,ℓ′} are independent Poisson variables implies that π_{i,ℓ} satisfies (5.23).
Lemma 5.11. The map F d,k has at most one fixed point.
Proof. As before, we let T denote the random tree T_{d,k,q^*}. Moreover, T_{i,ℓ} is the random tree T given that the root has type (i, ℓ).
Let t ≥ 0 be an integer and let π = (π_{i,ℓ}) ∈ P. We define a distribution π_t = (π_{i,ℓ,t}) ∈ P by means of the following experiment. Let (i, ℓ) ∈ T. Let v_0 denote the root of T_{i,ℓ} and let ϑ(v) signify the type of each vertex v. TR1 Let T_{i,ℓ,t} be the tree obtained from T_{i,ℓ} by deleting all vertices at distance greater than t from v_0. TR2 Let V_t be the set of all vertices at distance exactly t from v_0. For each v ∈ V_t independently, choose μ_v ∈ Ω from the distribution π_{ϑ(v)}. TR3 Let μ_{i,ℓ,t} be the distribution of the color of v_0 under a random coloring τ chosen as follows.
– Independently for each vertex v ∈ V_t choose a color τ_t(v) from the distribution μ_v.
– Let τ be a uniformly random legal coloring of T_{i,ℓ,t} such that τ(v) = τ_t(v) for all v ∈ V_t; if there is no such coloring, discard the experiment.
Step TR3 of the above experiment yields a distribution μ_{i,ℓ,t} ∈ Ω. Clearly μ_{i,ℓ,t} is determined by the random choices in steps TR1–TR2. Thus, we let π_{i,ℓ,t} be the distribution of μ_{i,ℓ,t} with respect to TR1–TR2. We now claim that for any integer t ≥ 0 the following is true.
If π is a fixed point of F_{d,k}, then π = π_t. (5.25)

The proof of (5.25) is by induction on t. It is immediate from the construction that π_{i,ℓ,0} = π_{i,ℓ} for all (i, ℓ) ∈ T. Thus, assume that t ≥ 1. By induction, it suffices to show that π_t = π_{t−1}. To this end, let us condition on the random tree T_{i,ℓ,t−1}. We obtain the random tree T_{i,ℓ,t} from T_{i,ℓ,t−1} by attaching to each v ∈ V_{t−1} a random number γ_{i′,ℓ′,v} = Po(dq^*_{i′,ℓ′}) of children of each type (i′, ℓ′) ∈ T_{i_v,ℓ_v}, where, of course, the random variables γ_{i′,ℓ′,v} are mutually independent. Further, in step TR2 of the above experiment we choose μ_{i′,ℓ′,v,j} ∈ Ω independently from π_{i′,ℓ′} for each v ∈ V_{t−1}, (i′, ℓ′) ∈ T_{i_v,ℓ_v} and j = 1, . . . , γ_{i′,ℓ′,v}. Given the distributions μ_{i′,ℓ′,v,j}, suppose that we choose a legal coloring τ_v of the sub-tree consisting of v ∈ V_{t−1} and its children only from the following distribution.
– Independently choose the color of v uniformly from the set of all colors h ∈ ℓ_v that are not already assigned to a child of v, if possible.
Let μ_v denote the distribution of the color τ_v(v). Then by construction,

Hence, the distribution of μ_v with respect to the choice of the numbers γ_{i′,ℓ′,v} and the distributions μ_{i′,ℓ′,v,j} is given by

because π is a fixed point of F_{d,k}. Therefore, the experiment of first choosing T_{i,ℓ,t}, then choosing distributions μ_u independently from π_{ϑ(u)} for the vertices u at distance t, and then choosing a random legal coloring τ as in TR3 is equivalent to performing the same experiment with t − 1 instead. Hence, π_t = π_{t−1}.
To complete the proof, assume that π, π′ are fixed points of F_{d,k}. Then for any integer t ≥ 0 we have π = π_t, π′ = π′_t. Furthermore, as π_t, π′_t result from the experiment TR1–TR3, whose first step TR1 can be coupled, we see that for any (i, ℓ) ∈ T,

The condition (5.27) ensures that these quantities are well-defined (i.e., the argument of the logarithm is positive in both instances). Additionally, to cover the case γ = 0 we set φ_ℓ(∅) = ln |ℓ|. Further, suppose that T, ϑ, v is a rooted decorated tree that has at least one legal coloring σ. Let v_1, . . . , v_γ be the neighbors of the root vertex v and suppose that ϑ(v) = (i, ℓ) and ϑ(v_j) = (i_j, ℓ_j) for j = 1, . . . , γ. If we remove the root v from T, then each of the vertices v_1, . . . , v_γ lies in a connected component T_j of the resulting forest. By considering the restrictions ϑ_j of ϑ to the vertex set of T_j, we obtain decorated trees T_j, ϑ_j. Recall that μ_{T_j,ϑ_j,v_j} denotes the distribution of the color of the root in a random legal coloring of T_j, ϑ_j, v_j. Since σ is a legal coloring,

Proof. This follows from [11, Proposition 3.7]. More specifically, let (i_v, ℓ_v) = ϑ(v) be the type of vertex v. In the terminology of [11] (and of the physicists' "cavity method"), φ(T, ϑ, v) is the Bethe free entropy of the Boltzmann distribution

Thus, ν is simply the uniform distribution over legal k-colorings of T, ϑ, and Z(T, ϑ) is its partition function.
Let T denote the random rooted decorated tree T d,k,q * . Moreover, for (i, ) ∈ T we let T i, denote the random tree T given that the root has type (i, ). The starting point of the proof is the following key observation. Furthermore, if (T, ϑ, v) is a rooted decorated tree, then we let (T, ϑ, v) signify the isomorphism class of the random rooted decorated tree (T, ϑ, u) obtained from (T, ϑ, v) by choosing a vertex u of T uniformly at random and rooting the tree at u. In other words, (T, ϑ, v) is obtained by re-rooting (T, ϑ, v) at a random vertex. Proof. This follows from the general fact that Galton-Watson trees are unimodular in the sense of [5].
Proof. Letting (T, ϑ, v) range over rooted decorated trees, we find as claimed.

Lemma 5.13.
We have

Proof. Writing π = π_{d,k,q^*} for the distribution of μ_T, we know from Corollary 5.4 that π_{i,ℓ} is the distribution of μ_{T_{i,ℓ}} for any type (i, ℓ). Furthermore, the distribution of T_{i,ℓ} can be described by the following recurrence: there is a root v_0 of type (i, ℓ), to which we attach for each (i′, ℓ′) ∈ T_{i,ℓ} independently a number γ_{i′,ℓ′} = Po(dq^*_{i′,ℓ′}) of trees (T_{i′,ℓ′,j})_{j=1,...,γ_{i′,ℓ′}} that are chosen independently from the distribution of T_{i′,ℓ′}. By independence, the distribution of the color of the root of each T_{i′,ℓ′,j} is just an independent sample from the distribution π_{i′,ℓ′}. Therefore, we obtain the expansion

Substituting in the definition of φ_ℓ, we obtain

Further, by the definition of φ_e we have

To simplify this, we use the following elementary relation: if X : Z → R_{≥0} is a function and g is a Poisson random variable with mean λ, then E[gX(g)] = λE[X(g + 1)]. Applying this observation, we obtain

Now, since π is a fixed point of F_{d,k}, the distribution of the measure B[μ^γ] is just π_{i,ℓ}. Hence,

Thus, we obtain the assertion.
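The elementary relation invoked here is presumably the standard Poisson identity E[gX(g)] = λE[X(g + 1)] for g = Po(λ), which follows by shifting the summation index. The sketch below checks it numerically for one arbitrary test function, using the p_λ(y) = λ^y exp(−λ)/y! notation introduced earlier:

```python
import math

def poisson_pmf(lam, y):
    """p_lam(y) = lam^y * exp(-lam) / y!"""
    return lam ** y * math.exp(-lam) / math.factorial(y)

def check_poisson_identity(lam, X, terms=80):
    """Compare E[g * X(g)] with lam * E[X(g + 1)] for g ~ Po(lam),
    truncating the sums at `terms` (the tail is negligible for small lam)."""
    lhs = sum(y * X(y) * poisson_pmf(lam, y) for y in range(terms))
    rhs = lam * sum(X(y + 1) * poisson_pmf(lam, y) for y in range(terms))
    return lhs, rhs

lhs, rhs = check_poisson_identity(3.0, lambda y: 1.0 / (1 + y))
assert abs(lhs - rhs) < 1e-9
```

The test function 1/(1 + y) is our arbitrary choice; the identity holds for any nonnegative X.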
Proof. Summing over all (i, ℓ) ∈ T, we obtain from Lemma 5.13 that

Recalling that dπ_{i,ℓ}(μ) =

It finally remains to simplify the expression for I. To do so, we introduce the following notation. We note that if γ ∈ Γ_{i,ℓ} and γ′ ∈ Γ_i are such that:

and that μ^γ, μ^{γ′} satisfy

Moreover, choosing the γ_{i′,ℓ′} from a Poisson distribution with parameter q^*_{i′,ℓ′}d, the event "(a) and (b)" happens with probability exactly kq^*_{i,ℓ}. This allows us to write:

We used (5.28) to go from the first to the second line, and summed over i to go from the second to the third. Re-indexing the vector μ^γ as a vector μ^{γ′}, γ′ ∈ Γ_i (with γ′_{i′} = Σ_{ℓ′:(i′,ℓ′)∈T} γ_{i′,ℓ′}), we obtain with Lemma 5.5:

Proof of Proposition 2.3. The first assertion is immediate from Lemma 5.1, while the second assertion follows from Lemma 5.8. The third claim follows by combining Corollary 5.5 with Lemma 5.14. With respect to the last assertion, we observe that for d = (2k − 1) ln k − 2 ln 2 + o_k(1) we have

Moreover, as q^* = 1 − 1/k + o_k(1/k) by Lemma 5.1, one checks easily that

Further, by Lemma 5.1

Combining (5.29) and (5.30) and using the third part of Proposition 2.3, we conclude that Σ_k has a unique zero d_{k,cond}, as claimed.

The Cluster Size
The objective in this section is to prove Proposition 2.4. For technical reasons, we consider a variant of the "planted model" G(n, p, σ) in which the number of vertices is not exactly n but n − o(n). This is necessary because we are going to perform inductive arguments in which small parts of the random graph get removed. Thus, let η = η(n) = o(n) be a non-negative integer sequence. Throughout the section, we write n′ = n − η(n).

By a slight abuse of notation we do not distinguish between σ and its restriction to the vertices in [n′]. Unless specified otherwise, all statements in this section are understood to hold for any sequence η = o(n).

Preliminaries.
Let G = (V, E) be a graph, let σ be a k-coloring of G, let v ∈ V and let ω ≥ 1 be an integer. We write ∂^ω_G(v) for the subgraph of G consisting of all vertices at distance at most ω from v. Moreover, |∂^ω_G(v)| signifies the number of vertices of ∂^ω_G(v). Where the reference to G is clear from the context, we omit it. We begin with the following standard fact about the random graph G. Lemma 6.1. Let ω = 10 ln ln ln n.

A.a.s. all but o(n) vertices
In addition, we need to know that the "local structure" of the random graph G endowed with the coloring σ enjoys the following concentration property.

Lemma 6.2. Let S be a set of triples (G_0, σ_0, v_0) such that G_0 is a graph, σ_0 is a k-coloring of G_0, and v_0 is a vertex of G_0. Let ω = 10 ln ln ln n and define a random variable S_v = S_v(G, σ) by letting

The proof of Lemma 6.2 is based on standard arguments. The full details can be found in Sect. 6.5.

Warning propagation.
The goal in this section is to prove Proposition 2.4, i.e., to determine the cluster size |C(G, σ)|. A key step in this endeavor will be to determine the sets ℓ(v) of colors that the individual vertices v may take under a k-coloring in C(G, σ); in particular, recall that we called a vertex frozen in C(G, σ) if this set is a singleton. To establish Proposition 2.4, we will first show that the sets ℓ(v) can be determined by means of a process called Warning Propagation, which hails from the physics literature (see [23] and the references therein). More precisely, we will see that Warning Propagation yields color sets L(v) such that L(v) = ℓ(v) for all but o(n) vertices a.a.s. Crucially, by tracing Warning Propagation we will be able to determine for any given type (i, ℓ) how many vertices of that type there are. Moreover, we will show that the cluster C(σ) essentially consists of all k-colorings τ of G such that τ(v) ∈ L(v) for all v. In addition, the number of such colorings τ can be calculated by considering a certain reduced graph G_WP(σ). This graph turns out to be a forest (possibly after the removal of o(n) vertices), and the final step of the proof consists in arguing that, informally speaking, a.a.s. the statistics of the trees in this forest are given by the distribution of the multi-type branching process from Sect. 2.
Let us begin by describing Warning Propagation on a general graph G endowed with a k-coloring σ. For each edge e = {v, w} of G and any color i we define a sequence (μ_{v→w}(i, t|G, σ))_{t≥0} such that μ_{v→w}(i, t|G, σ) ∈ {0, 1} for all i, v, w. The idea is that μ_{v→w}(i, t|G, σ) = 1 indicates that in the tth step of the process vertex v "warns" vertex w that the other neighbors u ≠ w of v force v to take color i. We initialize this process by having each vertex v emit a warning about its original color σ(v) at t = 0, i.e.,

Thus, L(v, t|G, σ) is the set of colors that vertex v receives no warnings about at step t. To unclutter the notation, we omit the reference to G, σ where it is apparent from the context. To understand the semantics of this process, observe that by construction the list L(v, t|G, σ) only depends on the vertices at distance at most t + 1 from v. Further, if we assume that the tth neighborhood ∂^t(v) in G is a tree, then L(v, t|G, σ) is precisely the set of colors that v may take in k-colorings τ of G such that τ(w) = σ(w) for all vertices w at distance greater than t from v, as can be verified by a straightforward induction on t. As we will see, this observation together with the fact that the random graph G contains only few short cycles (cf. Lemma 6.1) allows us to show that for most vertices v we have ℓ(v) = L(v|G, σ) a.a.s. In effect, the number of k-colorings τ of G with τ(v) ∈ L(v|G, σ) for all v will emerge to be a very good approximation to the cluster size |C(G, σ)|.
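The message updates and the lists L(v, t) can be implemented directly. The code below is one plausible formalization of the rules (the exact convention of the displayed equations (6.1)–(6.2) is not reproduced in this excerpt, so the update rule is our reading): v warns w about color i at step t + 1 iff the colors not warned against by the neighbors u ≠ w of v reduce to the singleton {i}.

```python
def warning_propagation(adj, sigma, k, t_max):
    """Warning Propagation for graph coloring (a plausible reading of the
    update rule; the paper's exact convention may differ in details).
    adj must be a symmetric adjacency dict.  mu[(v, w)][i] == 1 means:
    v warns w that v is forced to take color i.
      t = 0:      mu_{v->w}(i, 0) = 1 iff i == sigma(v)
      t -> t+1:   mu_{v->w}(i, t+1) = 1 iff the colors NOT warned against
                  by the neighbors u != w of v reduce to {i}.
    Returns L(v) = set of colors v receives no warnings about at t_max."""
    mu = {(v, w): [1 if i == sigma[v] else 0 for i in range(k)]
          for v in adj for w in adj[v]}
    for _ in range(t_max):
        new = {}
        for (v, w) in mu:
            free = [i for i in range(k)
                    if not any(mu[(u, v)][i] for u in adj[v] if u != w)]
            new[(v, w)] = [1 if free == [i] else 0 for i in range(k)]
        mu = new
    return {v: {i for i in range(k)
                if not any(mu[(u, v)][i] for u in adj[v])}
            for v in adj}
```

On a star with two colors this reproduces the stated semantics: at t = 1 each leaf's list is the singleton {σ(leaf)}, while for larger t all lists grow back to {0, 1}, matching the fact that L(v, t) only constrains colorings that agree with σ at distance greater than t.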
Counting these k-colorings τ is greatly facilitated by the following observation. For a graph G together with a k-coloring σ, let us denote by G_WP(t|σ) the graph obtained from G by removing all edges {v, w} such that either |L(v, t)| < 2, |L(w, t)| < 2 or L(v, t) ∩ L(w, t) = ∅. Furthermore, obtain G_WP(σ) from G by removing all edges {v, w} such that L(v) ∩ L(w) = ∅. We view G_WP(t|σ) and G_WP(σ) as decorated graphs in which each vertex v is endowed with the color list L(v, t) and L(v), respectively. As before, we let Z denote the number of legal colorings of a decorated graph. Thus,

The key statement in this section is

We begin by proving that Z(G_WP(σ)) is a lower bound on the cluster size a.a.s. To this end, let us highlight a few elementary facts.
Fact 6.1. The following statements hold for any G, σ, all v, w, i and all t ≥ 0.

(1) μ_{v→w}(i, t + 1) ≤ μ_{v→w}(i, t).
(2) If μ_{v→w}(i, t) = 1, then i = σ(v).
(3) L(v, t) ⊂ L(v, t + 1) and σ(v) ∈ L(v, t).
Proof. We prove (1) and (2) by induction on t. In the case t = 0 both statements are immediate from (6.1). Now, assume that t ≥ 1 and μ_{v→w}(i, t) = 0. Then there is a color j ≠ i such that μ_{u→v}(j, t − 1) = 0 for all neighbors u ≠ w of v. By induction, we have μ_{u→v}(j, t) = 0 for all these u. Hence, (6.2) implies that μ_{v→w}(i, t + 1) = 0. Furthermore, if μ_{v→w}(i, t + 1) = 1 for some i ≠ σ(v), then v has a neighbor u ≠ w such that μ_{u→v}(σ(v), t) = 1. But σ(u) ≠ σ(v) because σ is a k-coloring, so this contradicts the induction hypothesis (2). Thus, we have established (1) and (2). Finally, (3) is immediate from (1) and (2): by (1) the lists are non-decreasing, and by (2) no warning about σ(v) is ever sent to v, whence σ(v) ∈ L(v, t). In particular, since G is finite, (1) implies that there is a finite t such that L(v, t) = L(v) for all v.
To turn Fact 6.2 into a lower bound on the cluster size, we are going to argue that a.a.s. G contains a large number of frozen vertices. In fact, the number of frozen vertices will turn out to be so large that a.a.s. all colorings τ as in Fact 6.2 belong to the cluster C(G, σ).
To exhibit frozen vertices, we consider an appropriate notion of a "core". More precisely, assume that σ is a k-coloring of a graph G. We denote by core(G, σ) the largest set V′ of vertices with the following property: every v ∈ V′ has at least 100 neighbors w ∈ V′ with σ(w) = j, for each color j ≠ σ(v).
In words, any vertex in the core has at least 100 neighbors of any color j ≠ σ(v) that also belong to the core. The core is well-defined; for if V′, V″ are two sets with this property, then so is V′ ∪ V″. The following is immediate from the definition of the core.
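The core can be computed by greedily peeling off violating vertices; because the union of two sets with the defining property again has the property, the result does not depend on the peeling order. The sketch below is ours, with the constant 100 made a parameter so the idea can be exercised on tiny graphs.

```python
def core(adj, sigma, k, threshold=100):
    """Largest vertex set V' such that each v in V' has at least
    `threshold` neighbors in V' of every color j != sigma[v].
    Computed by repeatedly deleting violating vertices."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            # count surviving neighbors of each color
            cnt = {}
            for u in adj[v]:
                if u in alive:
                    cnt[sigma[u]] = cnt.get(sigma[u], 0) + 1
            if any(cnt.get(j, 0) < threshold for j in range(k) if j != sigma[v]):
                alive.remove(v)
                changed = True
    return alive
```

With threshold 1, a properly 3-colored triangle is its own core, while a 3-colored path has an empty core: once an endpoint is peeled, the deletions cascade.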
The core has become a standard tool in the theory of random structures in general and in random graph coloring in particular. Indeed, standard arguments show that G has a very large core a.a.s.; this is the content of Proposition 6.2. As before, we drop G, σ from the notation where possible.
Similarly as before, we can use the lists L′(v, t) resulting from the Warning Propagation process initialized from the core to construct a decorated reduced graph. Indeed, let G′_WP(t|σ) be the graph obtained from G by removing all edges {v, w} such that |L′(v, t)| < 2 or |L′(w, t)| < 2 or L′(v, t) ∩ L′(w, t) = ∅. We decorate each vertex v in this graph with the list L′(v, t). In addition, let G′_WP(σ) be the graph obtained from G by removing all edges {v, w} such that L′(v) ∩ L′(w) = ∅, endowed with the lists L′(v).

Fact 6.4. The following statements hold for any G, σ, all v, w, i and all t ≥ 0.

(1) μ′_{v→w}(i, t) ≤ μ′_{v→w}(i, t + 1).
(2) If μ′_{v→w}(i, t) = 1, then i = σ(v).
(3) L′(v, t + 1) ⊂ L′(v, t).

Proof. This follows by induction on t (cf. the proof of Fact 6.1).
Assuming (6.4), we are going to prove by induction on t that

τ(v) ∈ L′(v, t) for all v and all τ ∈ C(G, σ). (6.5)

By construction, for any vertex v and any color j we have j ∈ L′(v, 0), unless v has a neighbor w ∈ core(G, σ) such that σ(w) = j. Moreover, if such a neighbor w exists, (6.4) implies that a.a.s. τ(w) = j and thus τ(v) ≠ j for all τ ∈ C(G, σ). Hence, (6.5) is true for t = 0. Now, assume that (6.5) holds for t. Suppose that j ∉ L′(v, t + 1). Then v has a neighbor u such that μ′_{u→v}(j, t + 1) = 1. Therefore, for each l ≠ j there is w_l ≠ v such that μ′_{w_l→u}(l, t) = 1. Consequently, L′(u, t) = {j}. Hence, by induction we have τ(u) = j and thus τ(v) ≠ j for all τ ∈ C(G, σ).
To this end, we need one more general construction. Let G be a graph and let σ be a k-coloring of G. Let t ≥ 0 be an integer. For each vertex v of G we define a rooted, decorated graph T(v, t|G, σ) as follows: T(v, t|G, σ) is the connected component of v in G′_WP(t|σ), rooted at v, in which each vertex w is decorated with its list L′(w, t|G, σ). Analogously, T(v|G, σ) denotes the component of v in G′_WP(σ), decorated with the lists L′(w|G, σ).
Of course, the total number Z(G_WP(σ)) of legal colorings of G_WP(σ) is just the product of the numbers of legal colorings of the connected components of G_WP(σ). The following lemma shows that a.a.s. for all but o(n) vertices v the component of v in G_WP(σ) coincides with the component of v in G′_WP(σ).
The main technical step towards the proof of Lemma 6.4 is to show that a.a.s. most of the components T(v|G, σ) are "small" by comparison to n. Technically, it is easier to establish this statement for T(v, 0|G, σ), which contains T(v|G, σ) as a subgraph due to the monotonicity property Fact 6.4(3). The proof of Lemma 6.5, which we defer to Sect. 6.4, is a bit technical but based on known arguments. Lemma 6.1 shows that a.a.s. for most vertices v such that T(v, 0|G, σ) contains at most, say, ω = ln ln ln n vertices, the graph T(v, 0|G, σ) is a tree. In this case, the following observation applies.
As a first step we are going to show that for each edge {x, y} in T(v, 0|G, σ) we have

μ_{x→y}(i, t) = μ′_{x→y}(i, t) = 0 for all t > ω and all i ∈ [k]. (6.6)

To do so, pick and fix an arbitrary vertex y in T(v, 0). We define the y-height h_y(x) of a vertex x ≠ y in T(v, 0) as follows. Since T(v, 0) is a tree, there is a unique path from x to y in T(v, 0). Let P_y(x) be the neighbor of x on this path. Then h_y(x) is the maximum distance from x to a leaf of T(v, 0) that belongs to the component of x in the subgraph of T(v, 0) obtained by removing the edge {x, P_y(x)}.

Let U′ be the set of all neighbors u′ of x that do not belong to T(v, 0), and let U be the set of all neighbors u ≠ P_y(x) of x in T(v, 0). We compute the warnings that x receives, where we may omit the vertices in U′ outside the core, since by construction of core(G, σ) we get μ′_{u′→x}(i, 0) = 0 for all such u′ and all i ∈ [k]. For each j ∈ [k]\L′(x, 0) there exists a neighbor u ∈ U′ such that σ(u) = j and μ′_{u→x}(i, 0) = 1{i = j}; let U_C be the set of all such neighbors. By Fact 6.4 and P3, for all u ∈ U_C we find that

μ_{u→x}(j, t) = μ′_{u→x}(j, t) = 1 for all t ≥ 0. (6.7)

By construction of T(v, 0), for all u ∈ U the lists L′(x, 0) and L′(u, 0) intersect, and by P1, P2 and (6.7) we obtain that for any j ∈ L′(x, 0) no vertex u ∉ T(v, 0) adjacent to x in G satisfies μ_{u→x}(j, t) = μ′_{u→x}(j, t) = 1 for any t ≥ 0. (6.8)

To prove (6.6) we show by induction on h_y(x) that for all i ∈ [k]

μ_{x→P_y(x)}(i, t) = μ′_{x→P_y(x)}(i, t) = 0 for all t ≥ h_y(x) + 1. (6.10)

To get started, suppose that h_y(x) = 0. Then x is a leaf of T(v, 0), and its only neighbors besides P_y(x) lie outside T(v, 0). Hence, by (6.8), the at least two colors of L′(x, 0) are never warned to x by these neighbors, and by Fact 6.1 and P3 we conclude that μ_{x→P_y(x)}(i, t) = μ′_{x→P_y(x)}(i, t) = 0 for all i ∈ [k] and all t ≥ 1. Now, assume that h_y(x) > 0. Then all u ∈ U satisfy h_y(u) < h_y(x). Moreover, P_y(u) = x. Therefore, by induction,

μ_{u→x}(i, t) = μ′_{u→x}(i, t) = 0 for all u ∈ U, all i ∈ [k] and all t ≥ h_y(x). (6.11)

Combining (6.8) and (6.11), we see that for t ≥ h_y(x) at least one color of L′(x, 0) other than i is never warned to x by a neighbor distinct from P_y(x), whence μ_{x→P_y(x)}(i, t) = μ′_{x→P_y(x)}(i, t) = 0 for all i ∈ [k] and all t ≥ h_y(x) + 1. This completes the induction. Finally, since T(v, 0) contains at most ω vertices, we have h_y(x) ≤ ω for all x. Hence, applying (6.10) to the neighbors x of y in T(v, 0), we obtain μ_{x→y}(i, t) = μ′_{x→y}(i, t) = 0 for all i ∈ [k] and all t > ω, which is (6.6).
Together with (6.7), which states that for any x ∈ T(v, 0) and any j ∈ [k]\L′(x, 0) there exists a vertex u ∉ T(v, 0) that is adjacent to x in G such that μ_{u→x}(j, t) = μ′_{u→x}(j, t) = 1 for all t ≥ 0, and with (6.8), which states that for any j ∈ L′(x, 0) there exists no vertex u ∉ T(v, 0) that is adjacent to x in G such that μ_{u→x}(j, t) = μ′_{u→x}(j, t) = 1 for any t ≥ 0, we conclude that L(x) = L(x, ω + 1) = L′(x, ω + 1) = L′(x), as desired.
Proof of Lemma 6.4. Lemma 6.5 implies that for all but o(n) vertices v we have |T (v, 0)| ≤ ln ln ln n a.a.s. Together with Lemma 6.1, this implies that a.a.s. T (v, 0) is a tree for all but o(n) vertices v. Thus, assume in the following that v is such that T (v, 0) is a tree.
6.3. Counting legal colorings. Proposition 6.1 reduces the proof of Proposition 2.4 to the problem of counting the legal colorings of the reduced graph G_WP(σ). Lemma 6.5 implies that a.a.s. G_WP(σ) is a forest consisting mostly of trees of size at most, say, ln ln ln n. In this section we are going to show that a.a.s. the "statistics" of these trees follows the distribution of the random tree generated by the branching process from Sect. 2. To formalise this, let T = T_{d,k,q*} with q* from (2.7) denote the random isomorphism class of rooted, decorated trees produced by the process GW(d, k, q*). Moreover, for a rooted, decorated tree T let H_T be the number of vertices v in G_WP(σ) such that T(v|G, σ) ≅ T. In this section we prove Proposition 6.3, which asserts that a.a.s. H_T = n · P[T = T] + o(n) for every T. We begin by showing that the fixed point problem q* = F(q*) with F from (5.1) provides a good approximation to the number of vertices v such that L(v|G, σ) = {i} for any i. To this end, we let q^0 = (1/k, …, 1/k) and q^t = F(q^{t−1}) for t ≥ 1.
In addition, let Q_i(t|G, σ) be the set of vertices v of G such that L(v, t|G, σ) = {i}.

Proof. We proceed by induction on t. To get started, we set Q_i(−1|G, σ) = σ^{−1}(i) and q_i^{−1} = 1/k. Then a.a.s. n^{−1}|Q_i(−1|G, σ)| = q_i^{−1} + o(1). Now, assuming that t ≥ 0 and that the assertion holds for t − 1, we are going to argue that it holds for t as well. Indeed, let v = n be the last vertex of the random graph, and let us condition on the event that σ(v) = i. By symmetry and the linearity of expectation, it suffices to establish the corresponding statement (6.16) for this single vertex v. To show (6.16), let G′ signify the subgraph obtained from G by removing v. Moreover, let Q_{t−1}(ε) be the event that n^{−1}|Q_j(t − 1|G′, σ)| = q_j^{t−1} ± ε for all j ∈ [k]. Since G′ is nothing but a random graph with one vertex less, and as n − 1 = n − o(n), by induction we have

P[Q_{t−1}(ε)] = 1 − o(1) for any ε > 0. (6.18)

Let A(i) be the event that for each j ∈ [k]\{i} there is w ∈ ∂_G v such that L(w, t − 1|G′, σ) = {j}. Given σ(v) = i, we can obtain G from G′ by connecting v with each vertex w ∈ [n − 1] such that σ(w) ≠ i with probability p′ independently. Therefore, the conditional probability of A(i) given G′ is determined by the sizes of the sets Q_j(t − 1|G′, σ). Furthermore, for any fixed δ > 0 there is an (n-independent) ε > 0 such that, given that Q_{t−1}(ε) occurs, the conditional probability of A(i) is within δ of its limit (6.19). Combining (6.18) and (6.19), we see that (6.20) holds for any fixed δ > 0. If the neighborhood of v is acyclic, σ(v) = i and A(i) occurs, then L(v, t|G, σ) = {i}. Therefore, (6.16) follows from (6.20) and Lemma 6.1. Finally, the random variable |Q_i(t|G, σ)| satisfies the assumptions of Lemma 6.2. Indeed, the event v ∈ Q_i(t|G, σ) is determined solely by the subgraph of G spanned by the vertices at distance at most t from v. Thus, (6.15) and Lemma 6.2 imply that n^{−1}|Q_i(t|G, σ)| = q_i^t + o(1) a.a.s., as desired.
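The iteration q^t = F(q^{t−1}) can be explored numerically. Since (5.1) is not reproduced here, the sketch below substitutes the standard scalar "frozen-color" map F(q) = (1 − e^{−dq/(k−1)})^{k−1}, which has the shape such fixed point problems typically take; this is our assumption for illustration, and the actual vector-valued F of the paper must be read off from Sect. 5.

```python
import math

def F(q, d, k):
    # Assumed illustrative map: a vertex is "frozen" if, for each of the
    # k-1 other colors, at least one of its Po(d/(k-1)) candidate
    # neighbors of that color is itself frozen.
    return (1.0 - math.exp(-d * q / (k - 1))) ** (k - 1)

def fixed_point(d, k, t_max=200, q0=1.0):
    """Iterate q -> F(q) starting from q0."""
    q = q0
    for _ in range(t_max):
        q = F(q, d, k)
    return q
```

For k = 3 the iteration started at q = 1 converges to a positive fixed point once d is large enough, while for small d it collapses to 0, mirroring the appearance and disappearance of frozen vertices.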
As a next step, we consider the statistics of the trees T(v, ω|G, σ) with ω ≥ 0 large but fixed as n → ∞. Thus, for an isomorphism class T of rooted, decorated graphs we let H_{T,ω} be the number of vertices v in G_WP(ω|σ) such that T(v, ω|G, σ) ∈ T.
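The trees T(v, ω|G, σ) are locally governed by a Galton-Watson process. The following sketch is ours and drops the (color, list) decorations of GW(d, k, q*), which are specified in Sect. 2; it merely samples the undecorated Po(d) skeleton truncated at a given depth, whose expected size at depth h is 1 + d + … + d^h.

```python
import math
import random

def poisson(lam, rng):
    # Knuth's method; adequate for small lam.
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def gw_tree_size(d, depth, rng):
    """Size of a Galton-Watson tree with Po(d) offspring, truncated at
    `depth`. This is the undecorated skeleton of the local structure of
    G(n, d/n); GW(d, k, q*) additionally attaches a type to each vertex."""
    size = 1
    if depth > 0:
        for _ in range(poisson(d, rng)):
            size += gw_tree_size(d, depth - 1, rng)
    return size
```

Averaging many samples at d = 0.5 and depth 2 recovers the expected size 1 + 0.5 + 0.25 up to sampling error.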

Lemma 6.8. Assume that T is an isomorphism class of rooted decorated trees such that
Proof. We observe that P[T = T] is a number that depends on T but not on n; furthermore, if T* is the isomorphism class of a rooted sub-tree of T, then the corresponding statement applies to T* as well. The proof is by induction on the height of the trees in T. In the case that T consists of a single vertex v of type (i, {i}) for some i ∈ [k], the assertion readily follows from Lemma 6.7.
Let (i_0, ℓ_0) be the type of the root and let v = n. To this end, consider the graph G′ obtained from G by removing v. By Lemma 6.7, the number of vertices w of G′ with L(w, ω|G′, σ) = {j} is n(q_j + o_ω(1)) a.a.s. for all j, where o_ω(1) signifies a term that tends to 0 in the limit of large ω. Let A be the event that this is indeed the case. Moreover, let B be the following event:
- for each color j ∈ [k]\ℓ_0, vertex v has a neighbor w in G′ such that L(w, ω|G′, σ) = {j};
- v does not have a neighbor w with L(w, ω|G′, σ) = {h} for any h ∈ ℓ_0.
Let T^{v_0} be the unique tree of the isomorphism class of rooted decorated trees consisting only of the root v_0. Let Y_v be the event that v has no neighbor of any type (i′, ℓ′) ∈ T_{i_0,ℓ_0}, and accordingly let q_∅ = Σ_{(i′,ℓ′)∈T_{i_0,ℓ_0}} q_{i′,ℓ′}. Combining (6.21) and (6.22), we find that the assertion holds in this base case. As for the inductive step, pick and fix one representative T_0 ∈ T. If we remove the root v_0 from T_0, then we obtain a decorated forest T_0 − v_0. Each tree T′ in this forest contains precisely one neighbor of the root of T_0, which we designate as the root of T′. Let V(T) be the set of all isomorphism classes of rooted decorated trees T̂ obtained in this way. Furthermore, for each T̂ ∈ V(T) let y(T̂) be the number of components of the forest T_0 − v_0 that belong to the isomorphism class T̂.
We are going to show that the assertion holds for v = n provided that ω = ω(T, ε) is sufficiently large. Furthermore, for each tree T̂ ∈ V(T) let Q(T̂) be the set of all vertices w of G′ such that T(w, ω|G′, σ) ≅ T̂. In addition, let Q_∅ be the set of all vertices w of G′ that belong to none of the sets Q(T̂), T̂ ∈ V(T). Further, let q(T̂) = P[T = T̂] and let q_∅(T) = 1 − Σ_{T̂∈V(T)} q(T̂). Let Q be the event that |Q(T̂)|/n = q(T̂) + o_ω(1) for all T̂ ∈ V(T) and that |Q_∅|/n = q_∅(T) + o_ω(1). Then (6.26) holds. Finally, because the event T(v, ω|G, σ) ∈ T is governed by the vertices at distance at most |T| + ω from v, Lemma 6.2 together with (6.26) implies that for any ε > 0 there is ω such that the assertion holds. This completes the induction.

Proof of Lemma 6.9. Lemma 6.6 implies that T(v|G, σ) = T(v, ω + 2|G, σ), unless T(v, 0|G, σ) contains at least ω vertices. Furthermore, Lemma 6.5 implies that for any fixed ε > 0 there is ω = ω(ε) such that a.a.s. this happens for no more than εn vertices.
Finally, Proposition 6.3 is immediate from Lemmas 6.8 and 6.9 and Proposition 2.4 follows from Propositions 6.1 and 6.3.
6.4. Proof of Lemma 6.5. Set θ = ln ln n. Moreover, for a set S ⊂ V let C_S denote the σ-core of the subgraph of G obtained by removing the vertices in S. Further, for any vertex w ∈ S let Λ(w, S) be the set of colors j ∈ [k] such that in G vertex w does not have a neighbor in σ^{−1}(j) ∩ C_S. In addition, let us call S wobbly in G if the following conditions are satisfied.
W1. |S| = θ.
W2. |Λ(w, S)| ≥ 2 for all w ∈ S.
W3. The subgraph of G induced on S has a spanning tree T such that Λ(u, S) ∩ Λ(w, S) ≠ ∅ for each edge {u, w} of T.
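Conditions W1-W3 are easy to check mechanically. In the helper below (our own naming, not the paper's), W3 is tested via the equivalent statement that the subgraph of G induced on S, restricted to edges whose Λ-sets intersect, is connected; a spanning tree as in W3 exists precisely in that case.

```python
def is_wobbly(S, edges, Lam, theta):
    """Check W1-W3 for a candidate set S. Lam[w] is the color set
    Lambda(w, S); `edges` lists the edges of G."""
    S = set(S)
    if len(S) != theta:                      # W1
        return False
    if any(len(Lam[w]) < 2 for w in S):      # W2
        return False
    # W3: connectivity of the Lambda-compatible induced subgraph
    good = {v: set() for v in S}
    for (u, w) in edges:
        if u in S and w in S and Lam[u] & Lam[w]:
            good[u].add(w)
            good[w].add(u)
    seen, stack = set(), [next(iter(S))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(good[v] - seen)
    return seen == S
```

On a path 0-1-2 whose consecutive Λ-sets overlap, all three conditions hold; breaking one overlap disconnects the compatible subgraph and destroys W3.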
Assume that T(v, 0|G, σ) contains at least θ vertices. If T = (S, E_T) is a sub-tree on θ vertices contained in T(v, 0|G, σ), then S is wobbly. Therefore, it suffices to prove that the total number W of vertices that are contained in a wobbly set S satisfies

E[W] ≤ Σ_{S⊂V: |S|=θ} θ · P[S is wobbly] = o(n). (6.27)

To prove (6.27), we need a bit of notation. For a set S, let E_S be the event that the core C_S is large (cf. Proposition 6.2). Then Proposition 6.2 implies that for any set S of size θ we have

P[E_S] ≥ 1 − exp(−Ω(n)). (6.28)

Further, for a vertex w ∈ S and a set J_w ⊂ [k]\{σ(w)} let L(w, J_w) be the event that Λ(w, S) ⊃ J_w. Crucially, the core C_S of the subgraph of G obtained by removing S is independent of the edges between S and C_S. Therefore, w is adjacent to each vertex x ∈ C_S with σ(x) ≠ σ(w) with probability p′, independently for all such vertices x. Consequently, we obtain the bound (6.29) on P[L(w, J_w)]. Moreover, due to the independence of the edges in G, the events L(w, J_w) are independent for all w ∈ S. Let S ⊂ V be a set of size θ. Let us call a vertex w ∈ S rich if |Λ(w, S)| ≥ √k. Further, let R_S be the set of rich vertices in S. To estimate the probability that S is wobbly, we consider the following events.
- Let A_S be the event that |R_S| ≥ k^{−1/3}θ and that G contains a tree with vertex set S.
- Let A′_S be the event that G contains a tree T with vertex set S such that Σ_{w∈R_S} |∂^1_T(w)| ≥ θ/2. (In words, the sum of the degrees of the rich vertices in T is at least θ/2.)
- Let A″_S be the event that G contains a tree T with vertex set S such that Σ_{w∈R_S} |∂^1_T(w)| < θ/2.
- Let W_S be the event that condition W2 is satisfied.
- For a given tree T with vertex set S, let W_{S,T} be the event that condition W3 is satisfied with the spanning tree T.
If S is wobbly, then the event A_S ∪ (W_S ∩ A′_S) ∪ (W_S ∩ W_{S,T} ∩ A″_S) occurs for some tree T. Therefore,

P[S is wobbly] ≤ P[A_S] + P[W_S ∩ A′_S\A_S] + Σ_T P[W_{S,T} ∩ A″_S\(A_S ∪ A′_S)].

In the following, we are going to estimate the three probabilities on the r.h.s. separately.
With respect to the probability of A_S, (6.28) and (6.29) yield

P[|R_S| ≥ k^{−1/3}θ] ≤ P[¬E_S] + P[∃R ⊂ S, |R| = k^{−1/3}θ : ∀w ∈ R : |Λ(w, S)| ≥ √k | E_S] ≤ exp(−Ω(n)) + exp(−√k θ).

Furthermore, by Cayley's formula there are θ^{θ−2} possible trees with vertex set S. Since any two vertices in S are connected in G with probability at most p′, and because edges occur independently, we obtain

P[A_S] ≤ θ^{θ−2} p′^{θ−1} · P[|R_S| ≥ k^{−1/3}θ] ≤ θ^{θ−2} p′^{θ−1} exp(−√k θ). (6.31)

To bound the probability of W_S ∩ A′_S\A_S, let R ⊂ S. Moreover, let e(S) denote the total number of edges spanned by S in G, and let e(R, S) denote the number of edges that join a vertex in R with another vertex in S. Let A_S(R, t) be the event that e(S) ≥ θ − 1 and e(R, S) = t. If A′_S\A_S occurs, then there exist R ⊂ S, |R| ≤ r = k^{−1/3}θ, and t ≥ θ/4 such that A_S(R, t) occurs. Therefore, by the union bound,

P[A′_S\A_S] ≤ Σ_{R⊂S: |R|≤r} Σ_{t≥θ/4} P[A_S(R, t)]. (6.33)

Because any two vertices in S are connected with probability at most p′ independently, the random variable e(R, S) is stochastically dominated by a binomial random variable with distribution Bin(rθ, p′). Therefore,

P[e(R, S) = t] ≤ P[Bin(rθ, p′) = t] ≤ (rθ choose t) p′^t. (6.34)

Similarly, we can bound P[e(S) ≥ θ − 1]. To bound the probability of A″_S, suppose that T is a tree with vertex set S, let U ⊂ S, and denote by A_S(T, U) the event that the following statements are true.
(i) T is contained as a subgraph in G.
(ii) Let s_0 = min S and regard s_0 as the root of T. Then for each u ∈ U the parent P(u) of u in T satisfies P(u) ∉ R_S.
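Cayley's formula θ^{θ−2}, invoked above to count the candidate trees on S, can be verified by brute force for small θ. The check below (illustrative only) counts connected subgraphs with θ − 1 edges, which are exactly the labeled trees.

```python
from itertools import combinations

def labeled_tree_count(n):
    """Count labeled trees on {0..n-1} by enumerating (n-1)-edge subsets
    and testing connectivity (exponential; for sanity checks only)."""
    verts = range(n)
    all_edges = list(combinations(verts, 2))
    count = 0
    for tree_edges in combinations(all_edges, n - 1):
        adj = {v: [] for v in verts}
        for u, w in tree_edges:
            adj[u].append(w)
            adj[w].append(u)
        seen, stack = set(), [0]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v])
        if len(seen) == n:
            count += 1
    return count
```

A connected graph on n vertices with n − 1 edges is necessarily a tree, so the connectivity test suffices.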
If the event A″_S\(A_S ∪ A′_S) occurs, then there exist a tree T and a set U of size |U| ≥ θ/3 such that A_S(T, U) occurs. Therefore,

P[A″_S\(A_S ∪ A′_S)] ≤ Σ_T Σ_{U⊂S: |U|≥θ/3} P[A_S(T, U)].

Fix a tree T on S and a set U ⊂ S, |U| ≥ θ/3. Since any two vertices are connected in G with probability at most p′ independently, the probability that (i) occurs is bounded by p′^{θ−1}. Furthermore, if (ii) occurs and u ∈ U, then |Λ(P(u), S)| ≤ √k because P(u) is not rich. In addition, W3 requires that Λ(P(u), S) ∩ Λ(u, S) ≠ ∅. Alternatively, it could be that σ(u) ∈ Λ(P(u), S). Given that Λ(P(u), S) has size at most √k, the probability of this event is bounded by k^{−1/2} because σ(u) is random. Additionally, by W2 there is another color j ∈ Λ(u, S), j ≠ σ(u). Hence, the event L(u, {j}) occurs, and (6.29) yields the required bound on P[A_S(T, U)].

Proof of Lemma 6.2. The proof is based on Lemma 6.10. Of course, we can view (G, σ) as chosen from a product space X_2, …, X_N with N = 2n, where X_i is a 0/1 vector of length i − 1 whose components are independent Be(p′) variables for 2 ≤ i ≤ n, and where X_i ∈ [k] is uniformly distributed for i > n ("vertex exposure"). Let Γ be the event that |∂^ω_G(v)| ≤ λ = n^{0.01} for all vertices v. Then by Lemma 6.1 we have P[Γ] ≥ 1 − exp(−Ω(ln² n)).
(6.49) Furthermore, let G″ be the graph obtained from G by removing all edges e that are incident with a vertex v such that |∂^ω_G(v)| > λ, and let S′ be defined like S, but with respect to G″. If Γ occurs, then S = S′. Hence, (6.49) implies that

P[S ≠ S′] ≤ exp(−Ω(ln² n)). (6.50)

Moreover, the random variable S′ = f(X_2, …, X_N) satisfies (6.48) with c = λ and c′ = n. Indeed, altering either the color of one vertex u or its set of neighbors can only affect those vertices v that are at distance at most ω from u, and in G″ there are no more than λ such vertices. Thus, Lemma 6.10 applied with, say, t = n^{2/3} and γ = 1/n, together with (6.49), yields

P[|S′ − E[S′]| > t] ≤ exp(−Ω(ln² n)) = o(1). (6.51)

Finally, the assertion follows from (6.50) and (6.51).
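Lemma 6.10 is a bounded-differences inequality; we do not restate it, but the standard McDiarmid exponent 2t²/(N c²) already shows why the parameters above work. The sketch below is ours and merely evaluates that exponent: with influence c = λ = n^{0.01}, N = 2n coordinates and deviation t = n^{2/3} it grows like n^{1/3 − 0.02}, which eventually dwarfs the ln² n failure exponent of the event Γ.

```python
def mcdiarmid_exponent(n, c, t):
    """Exponent in P[|f - E f| > t] <= 2 exp(-2 t^2 / (N c^2)) for a
    function of N = 2n independent coordinates (vertex exposure), each of
    influence at most c. Illustrative only: Lemma 6.10 in the paper is a
    variant that additionally tolerates a bad event of probability gamma."""
    N = 2 * n
    return 2.0 * t * t / (N * c * c)
```

For n = 10^6 the exponent is already on the order of 10^2 and it keeps growing polynomially in n, so the tail probability in (6.51) is indeed o(1).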