Polynomial mixing of the edge-flip Markov chain for unbiased dyadic tilings

We give the first polynomial upper bound on the mixing time of the edge-flip Markov chain for unbiased dyadic tilings, resolving an open problem originally posed by Janson, Randall, and Spencer in 2002. A dyadic tiling of size n is a tiling of the unit square by n non-overlapping dyadic rectangles, each of area 1/n, where a dyadic rectangle is any rectangle that can be written in the form [a2^{-s}, (a+1)2^{-s}] \times [b2^{-t}, (b+1)2^{-t}] for non-negative integers a,b,s,t. The edge-flip Markov chain selects a random edge of the tiling and replaces it with its perpendicular bisector if doing so yields a valid dyadic tiling. Specifically, we show that the relaxation time of the edge-flip Markov chain for dyadic tilings is at most O(n^{4.09}), which implies that the mixing time is at most O(n^{5.09}). We complement this by showing that the relaxation time is at least \Omega(n^{1.38}), improving upon the previously best lower bound of \Omega(n\log n) coming from the diameter of the chain.


Introduction
We study the edge-flip Markov chain for dyadic tilings. An interval is dyadic if it can be written in the form [a2^{-s}, (a+1)2^{-s}] for non-negative integers a and s with 0 ≤ a < 2^s. A rectangle is dyadic if it is the Cartesian product of two dyadic intervals. A dyadic tiling of size n is a tiling of the unit square by n non-overlapping dyadic rectangles with the same area 1/n; see Figure 1. Less formally, work of Lagarias, Spencer, and Vinson [11] showed that in two dimensions, dyadic tilings are precisely those tilings that can be constructed by bisecting the unit square, either horizontally or vertically; bisecting each half again, either horizontally or vertically; and repeatedly bisecting all remaining rectangular regions until there are n total dyadic rectangles, each of equal area. We necessarily assume n is a power of 2. There is a natural Markov chain which connects the state space of all dyadic tilings of size n by moves we refer to as edge-flips.
We analyze this edge-flip Markov chain over the set of dyadic tilings of size n. Given any dyadic tiling, this chain evolves by selecting an edge of the tiling uniformly at random and replacing it by its perpendicular bisector, if doing so yields a valid dyadic tiling of size n; an illustration is given in Figure 2. Our first main result, Theorem 1.1, states that the relaxation time of this chain is O(n^{log 17}) = O(n^{4.09}) and that its mixing time is O(n^{1+log 17}) = O(n^{5.09}).
In terms of lower bounds, the best previously known lower bound for the mixing time is Ω(n log n), which is a simple consequence of the fact that the diameter of the Markov chain is of order n log n [10]. In Theorem 1.2 we improve upon this bound, showing that even the relaxation time is much larger than n log n: it is at least Ω(n^{1.38}).
Related work. The edge-flip Markov chain for dyadic tilings was first considered by Janson, Randall, and Spencer in 2002 [10]. They showed that this Markov chain is irreducible, but left as an open problem to derive that the mixing time is polynomial in n. Instead, they presented another Markov chain, which has additional global moves consisting of rotations at all scales, and showed that this chain mixes in polynomial time. However, applications of the comparison technique of Diaconis and Saloff-Coste [8] have failed to extend this polynomial mixing bound to the more natural edge-flip Markov chain (which, in fact, corresponds to only performing rotations at the smallest scale). Cannon, Miracle, and Randall considered the mixing time of the edge-flip Markov chain for a weighted version of dyadic tilings [2]. In this version, given a parameter λ > 0, the stationary probability of a dyadic tiling x is proportional to λ^{|x|}, where |x| is the sum of the lengths of the edges of x. The Metropolis rule [16] is incorporated into the edge-flip Markov chain so that the chain has the desired stationary distribution. They showed the mixing time of this chain is at least exponential in n^2 for any λ > 1, and at most O(n^2 log n) for any λ < 1. This establishes a phase transition at critical point λ = 1, which corresponds to the unweighted case considered here. However, their techniques did not extend to the critical point, and they left as an open problem bounding the mixing time when λ = 1. Our main result, Theorem 1.1, uses a different, non-local approach to finally answer the question of [10] and [2] by showing the mixing time of the edge-flip Markov chain at critical point λ = 1 is at most polynomial in n, substantially less than the mixing time when λ > 1. Furthermore, our Theorem 1.2 combined with the result for the weighted case in [2] shows that the behavior at the (unweighted) critical point λ = 1 is also substantially different than when λ < 1.
While it follows from the path coupling analysis in [2] that the relaxation time is O(n) for all fixed λ < 1, Theorem 1.2 establishes a super-linear lower bound on the relaxation time when λ = 1.
Variants of the edge-flip Markov chain offer a natural way to sample from many structures, but establishing rigorous polynomial upper bounds on the mixing time has often proven difficult, even in simple cases. Perhaps the most studied case is that of triangulations of a given point set, as efficiently generating uniformly random triangulations of general planar point sets has been a problem of great interest in computer graphics and computational geometry. However, the mixing time of the edge-flip Markov chain for triangulations remains open in the general case, and no polynomial upper bound is known. The only known exception is for n points in convex position, which corresponds to triangulations of a convex polygon. In this case, the edge-flip Markov chain is known to mix in at most O(n^5) steps [15], but the correct order of the mixing time is still unknown. For the case of lattice triangulations, which are triangulations of an m × n grid of points, no polynomial upper bound on the mixing time is known even when m ≥ 2 is kept fixed as n → ∞. The only known results in this case are limited to the weighted case [3,4,19]. Another example of a related Markov chain that uses natural edge-flip type moves is the switch Markov chain for sampling from graphs with a given degree sequence. In this chain, at each iteration two random non-adjacent edges are removed and their four endpoints are randomly rematched; the move is rejected if it results in a multiple edge. Again, in the general case the mixing time of this Markov chain is unknown, though polynomial upper bounds exist when certain restrictions are placed on the degree sequence [7,9].
For the case of rectangular tilings, results for the mixing time of the edge-flip Markov chain have been quite rare. One important result was obtained for domino tilings, which are tilings of an n × n square by rectangles of dimensions 1 × 2 or 2 × 1. In this case, the edge-flip Markov chain is known to mix in time polynomial in the number of dominoes, a result that heavily relies on the connection between domino tilings and random lattice paths [13,17].
The case of dyadic tilings exhibits interesting asymptotic properties that have been studied by combinatorialists [11,10]. Tilings in which all rectangles are dyadic, but may have different areas, have been used as a basis for subdivision algorithms to solve problems such as approximating singular algebraic curves [1] and classifying data using decision trees [18]. In both of these examples, the unit square is repeatedly subdivided into smaller and smaller dyadic rectangles until the desired approximation or classification is achieved, with more subdivisions in the areas of the most interest (e.g., near the algebraic curve or where data classified differently is close together).
Proof ideas. We identify a certain block structure on dyadic tilings that allows us to relate the spectral gap of the edge-flip Markov chain to that of another, simpler Markov chain. In the simpler Markov chain, which we refer to as the block dynamics, for each transition a large region of the tiling is selected and retiled uniformly at random, if possible. At the smallest scale, n = 4, these correspond to exactly the moves of the (lazy) edge-flip Markov chain. The structure of these block moves allows us to set up a recursion that relates the spectral gap of the edge-flip Markov chain for tilings of size n with that of sizes smaller than n and that of the block dynamics. This produces an inverse polynomial lower bound on the spectral gap of the edge-flip Markov chain.
Specifically, we adapt a bisection approach inspired by spin system analysis [14,5]. We bound the spectral gap γ_k of the Markov chain M_k for dyadic tilings of size n = 2^k from below by the product of the spectral gap γ_block of the block dynamics Markov chain and the spectral gap γ_{k-1} of M_{k-1}, and then use recursion to obtain γ_k ≥ (γ_block)^k = (γ_block)^{log n}, where log denotes the base-2 logarithm. As γ_block is bounded below by a positive constant, this implies a polynomial relaxation time and thus a polynomial mixing time.
To establish the explicit upper bound in Theorem 1.1, we use a coupling argument to bound γ_block; see, e.g., Chapter 13 of [12]. The distance metric we use is a carefully weighted average of two different notions of distance between tilings. We do a case analysis and show this distance metric always contracts by a factor of at least 1 − 1/17 in each step, implying the spectral gap γ_block is at least 1/17.
We use a distinguishing statistic to show the mixing time and relaxation time of the edge-flip Markov chain for dyadic tilings are at least Ω(n^{1.38}); again, see Chapter 13 of [12]. That is, we define a specific function f on the state space of all dyadic tilings of size n = 2^k. By considering the variance and Dirichlet form of f, and using combinatorial properties of dyadic tilings, we can give an upper bound on the spectral gap and thus a lower bound on the relaxation and mixing times.

Background
Here we present some necessary information on dyadic tilings, including their asymptotic behavior, and on Markov chains, including mixing time and local variance.

Dyadic Tilings
A dyadic interval is an interval that can be written in the form [a2^{-s}, (a+1)2^{-s}] for non-negative integers a and s with 0 ≤ a < 2^s. A dyadic rectangle is the product of two dyadic intervals. A dyadic tiling of size n = 2^k is a tiling of the unit square by n dyadic rectangles of equal area 1/n = 2^{-k} that do not overlap except on their boundaries; see Figure 1. Let Ω_k be the set of all dyadic tilings of size n = 2^k.
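The definition of a dyadic interval can be checked mechanically. The following is a minimal sketch, assuming endpoints are given as exact rationals; the function names `is_dyadic_interval` and `is_dyadic_rectangle` are illustrative and do not come from the paper.

```python
from fractions import Fraction

def is_dyadic_interval(lo, hi):
    """True if [lo, hi] = [a 2^-s, (a+1) 2^-s] for integers s >= 0, 0 <= a < 2^s."""
    lo, hi = Fraction(lo), Fraction(hi)
    length = hi - lo
    if lo < 0 or hi > 1 or length <= 0:
        return False
    # The length must equal 2^-s: numerator 1 and a power-of-two denominator.
    d = length.denominator
    if length.numerator != 1 or d & (d - 1) != 0:
        return False
    # The left endpoint must be an integer multiple a of the length.
    return (lo / length).denominator == 1

def is_dyadic_rectangle(x_lo, x_hi, y_lo, y_hi):
    """A rectangle is dyadic if both of its projections are dyadic intervals."""
    return is_dyadic_interval(x_lo, x_hi) and is_dyadic_interval(y_lo, y_hi)
```

For instance, [1/4, 1/2] passes the check while [1/4, 3/4] does not, because its left endpoint is not a multiple of its length 1/2.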
We say a dyadic tiling has a vertical bisector if the line x = 1/2 does not intersect the interior of any dyadic rectangle in the tiling. We say it has a horizontal bisector if the same is true of the line y = 1/2. It is easy to prove that every dyadic tiling of size n > 1 has a horizontal bisector or a vertical bisector.
The asymptotics of dyadic tilings were first explored by Lagarias, Spencer, and Vinson [11], and we present a summary of their results. Let A_k = |Ω_k| denote the number of dyadic tilings of size n = 2^k. The unit square is the unique dyadic tiling consisting of one dyadic rectangle, so A_0 = 1. There are two dyadic tilings of size 2, since the unit square may be divided by either a horizontal or vertical bisector, so A_1 = 2. One can also observe that A_2 = 7, A_3 = 82, A_4 = 11047, ... . In fact, the values A_k can be shown to satisfy the recurrence A_k = 2A_{k-1}^2 − A_{k-2}^4 for k ≥ 2 (Proposition 2.1); we include a proof of this fact as presented in [10], because we will use these ideas later.
Proof. A dyadic tiling of size 2^k has a horizontal bisector, a vertical bisector, or both. If it has a vertical bisector, the number of ways to tile the left half of the unit square is A_{k-1}; by mapping x → 2x, we can see that the left half of a dyadic tiling of size 2^k is equivalent to a dyadic tiling of the unit square of size 2^{k-1} because dyadic rectangles scaled by factors of two remain dyadic. Similarly, mapping x → 2x − 1, the right half of a dyadic tiling of size 2^k is equivalent to a dyadic tiling of size 2^{k-1}. We conclude the number of dyadic tilings of size 2^k with a vertical bisector is A_{k-1}^2. Similarly, by appealing to the maps y → 2y and y → 2y − 1, we conclude the number of dyadic tilings of size 2^k with a horizontal bisector is A_{k-1}^2. The number of dyadic tilings of size 2^k with both a horizontal and a vertical bisector is A_{k-2}^4, as each quadrant of any such tiling is equivalent to a dyadic tiling of the unit square of size 2^{k-2}. This follows from appealing to the map (x, y) → (2x, 2y) for the lower left quadrant, and appropriate translations of this for the other three quadrants. Altogether, by inclusion-exclusion, we see
    A_k = 2A_{k-1}^2 − A_{k-2}^4.
It is believed this recurrence does not have a closed form solution. We note that, as proved in [11], A_k ∼ φ^{-1} ω^{2^k} = φ^{-1} ω^n, where φ = (1 + √5)/2 is the golden ratio and ω = 1.84454757...; an exact value for ω is not known.
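As a quick sanity check, the recurrence A_k = 2A_{k-1}^2 − A_{k-2}^4 with A_0 = 1 and A_1 = 2 can be iterated directly. The sketch below uses a hypothetical helper name `tiling_counts`; it only evaluates the recurrence and does not enumerate tilings.

```python
def tiling_counts(k_max):
    """Return [A_0, ..., A_{k_max}] using A_k = 2*A_{k-1}^2 - A_{k-2}^4."""
    counts = [1, 2]  # A_0 = 1 (the unit square), A_1 = 2 (one bisector, two orientations)
    for _ in range(2, k_max + 1):
        # Tilings with a vertical bisector plus tilings with a horizontal bisector,
        # minus those with both, which would otherwise be counted twice.
        counts.append(2 * counts[-1] ** 2 - counts[-2] ** 4)
    return counts[: k_max + 1]
```

Calling `tiling_counts(4)` reproduces the values 1, 2, 7, 82, 11047 quoted above. Because A_k grows doubly exponentially in k, only small values of k are practical.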
We now define a recurrence for another useful statistic. We say that a dyadic tiling has a left half-bisector if the straight line segment from (0, 1/2) to (1/2, 1/2) doesn't intersect the interior of any dyadic rectangles. Figure 1(a) does not have a left half-bisector, while Figure 1(b) does. We are interested in the number of ways to tile the left half of a vertically-bisected dyadic tiling of size 2^k such that it has a left half-bisector. Appealing to the dilation maps defined in the proof of Proposition 2.1, this number is A_{k-2}^2. Among all possible ways to tile the left half of a vertically-bisected tiling σ ∈ Ω_k, we define f_k to be the fraction with a left half-bisector. We see
    f_k = A_{k-2}^2 / A_{k-1}.
We can similarly define right half-bisectors, top half-bisectors, and bottom half-bisectors by considering the straight line segments between (1/2, 1/2) and, respectively, (1, 1/2), (1/2, 1), and (1/2, 0). Then f_k is also the fraction of tilings of the right half of a vertically-bisected tiling σ with a right half-bisector, or the fraction of tilings of the top or bottom halves of a horizontally-bisected tiling σ with a top or bottom half-bisector, respectively. Note f_2 = 0.5, f_3 = 4/7 ≈ 0.571, and f_4 = 49/82 ≈ 0.598. We now examine the asymptotic behavior of f_k.
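The fractions f_k satisfy the recurrence f_k = 1/(2 − f_{k-1}^2) for all k ≥ 3.
Proof. This follows from the recurrence for A_k given in Proposition 2.1:
    f_k = A_{k-2}^2 / A_{k-1} = A_{k-2}^2 / (2A_{k-2}^2 − A_{k-3}^4) = 1 / (2 − (A_{k-3}^2 / A_{k-2})^2) = 1 / (2 − f_{k-1}^2).
We can use this recurrence to study the asymptotic behavior of the sequence {f_k}_{k=2}^∞.
Lemma 2.3. The sequence {f_k}_{k=2}^∞ is strictly increasing and bounded above by (√5 − 1)/2; moreover, lim_{k→∞} f_k = (√5 − 1)/2.
Proof. Note f_2 = 0.5 < (√5 − 1)/2. If f_k < (√5 − 1)/2, then, because 1/(2 − x^2) is increasing on [0, √2), f_{k+1} = 1/(2 − f_k^2) < 1/(2 − ((√5 − 1)/2)^2) = (√5 − 1)/2, so by induction the upper bound holds for all k ≥ 2. To show f_k < f_{k+1} for all k ≥ 2, it suffices to show x < 1/(2 − x^2) for all x ∈ [0.5, (√5 − 1)/2). This is equivalent to showing the polynomial x^3 − 2x + 1 is positive in that range. Factoring shows this polynomial has roots at 1, (√5 − 1)/2, and −(√5 + 1)/2, and is positive in the range (−(√5 + 1)/2, (√5 − 1)/2), which contains [0.5, (√5 − 1)/2).
The sequence {f_k}_{k=2}^∞ is bounded and monotone, so it must converge to some limit β. To find β, we consider the function g(x) = 1/(2 − x^2), which is the recurrence for the f_k. This function is continuous away from √2 and −√2, and thus certainly is continuous on [0.5, (√5 − 1)/2], the range of possible values for the f_k and their limit β. This continuity implies
    β = lim_{k→∞} f_{k+1} = lim_{k→∞} g(f_k) = g(lim_{k→∞} f_k) = g(β).
Thus the limit β is necessarily a fixed point of g(x). The fixed points of g(x) are exactly the three roots of x^3 − 2x + 1 found above, and the only one in [0.5, (√5 − 1)/2] is (√5 − 1)/2. We conclude lim_{k→∞} f_k = (√5 − 1)/2, as desired.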
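The behavior described in Lemma 2.3 can be observed numerically. The sketch below iterates the map g(x) = 1/(2 − x^2) from f_2 = 1/2 with exact rational arithmetic; the helper name `half_bisector_fractions` is an assumption made for illustration. Exact fractions grow doubly exponentially in size, so the sketch is only meant for small k.

```python
from fractions import Fraction
from math import sqrt

def half_bisector_fractions(k_max):
    """Return {k: f_k} for 2 <= k <= k_max, with f_2 = 1/2 and f_k = 1/(2 - f_{k-1}^2)."""
    f = {2: Fraction(1, 2)}
    for k in range(3, k_max + 1):
        f[k] = 1 / (2 - f[k - 1] ** 2)
    return f

LIMIT = (sqrt(5) - 1) / 2  # the claimed limit, the reciprocal of the golden ratio
```

With `f = half_bisector_fractions(12)`, the values f[3] and f[4] agree with 4/7 and 49/82 from the text, the sequence increases, and every term stays below `LIMIT`.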

Markov Chains
We will consider only discrete time Markov chains in this paper, though identical results also hold for the analogous continuous time Markov chains. Any finite ergodic Markov chain is known to converge to a unique stationary distribution π. The time a Markov chain with transition matrix P takes to converge to its stationary distribution is measured by the total variation distance, which captures how far the distribution after t steps is from the stationary distribution given a worst case starting configuration:
    ‖P^t − π‖_TV = max_{x∈Ω} (1/2) Σ_{y∈Ω} |P^t(x, y) − π(y)|.
The mixing time of a Markov chain M is defined to be
    t_mix(ε) = min{t : ‖P^{t'} − π‖_TV ≤ ε for all t' ≥ t}.
For convenience, as is standard we define t_mix = t_mix(1/4). We will bound the mixing time of the edge-flip Markov chain for dyadic tilings by studying its relaxation time and spectral gap. The spectral gap γ of a Markov chain M with transition matrix P is 1 − λ_2, where λ_2 is the second largest eigenvalue of P. For a lazy Markov chain M, the relaxation time, denoted by t_rel, is then the inverse of this spectral gap; we will see in the next section that the edge-flip Markov chain for dyadic tilings, as we've defined it, is lazy. The following well-known proposition relates the relaxation time and mixing time for Markov chains; for a proof, see, e.g., [12, Theorem 12.3 and Theorem 12.4].
Proposition 2.4. Let M be an ergodic Markov chain on state space Ω with reversible transition matrix P and stationary distribution π. Let π_min = min_{x∈Ω} π(x). Then:
    (t_rel − 1) log(1/(2ε)) ≤ t_mix(ε) ≤ log(1/(ε π_min)) t_rel.
We will bound the spectral gap, and thus the relaxation and mixing times, of the edge-flip Markov chain for dyadic tilings by considering functions on the chain's state space. For f : Ω → R, the variance of f with respect to a distribution π on Ω can be expressed as:
    var_π(f) = (1/2) Σ_{x,y∈Ω} π(x) π(y) (f(x) − f(y))^2.
We will only be considering the variance with respect to the uniform distribution on Ω, so the subscript π will be omitted. For a given reversible transition matrix P on state space Ω with stationary distribution π, the Dirichlet form, also known as the local variance, associated to the pair (P, π) is, for any function f : Ω → R,
    E(f) = (1/2) Σ_{x,y∈Ω} π(x) P(x, y) (f(x) − f(y))^2.
As we see in the following well-known proposition, the Dirichlet form and variance of a function f can be used to bound the spectral gap of a transition matrix, and therefore the relaxation time and mixing time of a Markov chain; see, e.g., [12, Lemma 13.12].
Proposition 2.5. Given a Markov chain with reversible transition matrix P and stationary distribution π, the spectral gap γ = 1 − λ_2 of P satisfies
    γ = min_{f : var_π(f) ≠ 0} E(f) / var_π(f).
We will use this proposition in both our upper bound and lower bound proofs.
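To make these quantities concrete, here is a minimal sketch on a toy two-state lazy chain rather than on dyadic tilings. It assumes the standard definitions: total variation distance as half the L1 distance, variance (1/2) Σ π(x)π(y)(f(x) − f(y))^2, and Dirichlet form (1/2) Σ π(x)P(x, y)(f(x) − f(y))^2. All function names are illustrative.

```python
from fractions import Fraction

def step(dist, P):
    """One step of the chain: the row vector dist multiplied by the matrix P."""
    n = len(P)
    return [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]

def tv_distance(mu, nu):
    """Total variation distance between two distributions on the same finite set."""
    return sum(abs(m - v) for m, v in zip(mu, nu)) / 2

def mixing_time(P, pi, eps):
    """Smallest t with max_x ||P^t(x, .) - pi||_TV <= eps (the distance never increases in t)."""
    n = len(P)
    rows = [[Fraction(int(x == y)) for y in range(n)] for x in range(n)]  # rows of P^0
    t = 0
    while max(tv_distance(row, pi) for row in rows) > eps:
        rows = [step(row, P) for row in rows]
        t += 1
    return t

def variance(pi, f):
    n = len(pi)
    return sum(pi[x] * pi[y] * (f[x] - f[y]) ** 2 for x in range(n) for y in range(n)) / 2

def dirichlet_form(P, pi, f):
    n = len(pi)
    return sum(pi[x] * P[x][y] * (f[x] - f[y]) ** 2 for x in range(n) for y in range(n)) / 2

# Toy example: a symmetric lazy chain on two states with eigenvalues 1 and 1/2.
q = Fraction(1, 4)
P = [[1 - q, q], [q, 1 - q]]
pi = [Fraction(1, 2), Fraction(1, 2)]
f = [Fraction(0), Fraction(1)]
gap_ratio = dirichlet_form(P, pi, f) / variance(pi, f)
```

On a two-state chain every non-constant f attains the minimum in Proposition 2.5, so `gap_ratio` equals the spectral gap 1/2 here; on larger chains the ratio for a single test function only gives an upper bound on the gap, which is how the lower bound on relaxation time is obtained later.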

The Edge-Flip Markov Chain M k
Let n = 2^k. For k ≥ 1, the edge-flip Markov chain M_k on the state space Ω_k of all dyadic tilings of size 2^k is given by the following rules.
Beginning at any σ_0 ∈ Ω_k, repeat:
• Choose a rectangle R of σ_i uniformly at random.
• Choose left, right, top, or bottom uniformly at random; let e be the corresponding side of R.
• If e bisects a rectangle of area 2^{-k+1}, remove e and replace it with its perpendicular bisector to obtain σ_{i+1} if the result is a valid dyadic tiling; else, set σ_{i+1} = σ_i.
An example of an edge-flip move of M_k is shown in Figure 2(a); two selections of R and e that do not yield valid moves are shown in (b) and (c). Let P_{k,edge} denote the transition matrix of this edge-flip Markov chain and γ_k its spectral gap. For every valid edge flip, there are two choices of (R, e) that result in that move. This implies every move between two tilings differing by an edge flip occurs with probability 2 · (1/n) · (1/4) = 1/(2n) = 2^{-k-1}, so all off-diagonal entries of P_{k,edge} are either 2^{-k-1} or 0. The Markov chain M_k, in a slightly different form, was introduced by Janson, Randall and Spencer [10]. Note that M_k is lazy, as for any rectangle R of a dyadic tiling at most one of its left and right edges can be flipped to produce another valid dyadic tiling. This is because if R's projection onto the x-axis is the dyadic interval [a2^{-s}, (a+1)2^{-s}] for a, s ∈ Z_{≥0}, then flipping its left edge yields a rectangle with x-projection [(a−1)2^{-s}, (a+1)2^{-s}] and flipping its right edge yields a rectangle with x-projection [a2^{-s}, (a+2)2^{-s}]. If a is even, the first of these intervals is not dyadic, while if a is odd, the second is not, so at most one of these edge flips produces a valid dyadic tiling. Similarly, at most one of R's top and bottom edges yields a valid edge flip. This implies in each iteration with probability at least 1/2 a pair (R, e) is selected that does not yield a valid edge flip move.
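The transition rule can be sketched in code under one assumed encoding, which is not taken from the paper: a rectangle [a2^{-s}, (a+1)2^{-s}] × [b2^{-t}, (b+1)2^{-t}] is stored as the tuple (a, s, b, t), and a tiling is a frozenset of such tuples with s + t = k. In this encoding a flip is valid exactly when the neighbor across the chosen side has the same shape and the union of the two rectangles is itself dyadic, which is the parity condition on a (or b) discussed above. The names `flip` and `reachable` are illustrative.

```python
from collections import deque

SIDES = ("left", "right", "top", "bottom")

def flip(tiling, rect, side):
    """Return the tiling after trying to flip `side` of `rect`; return `tiling` if invalid."""
    a, s, b, t = rect
    if side in ("left", "right"):
        # The union with the horizontal neighbour is dyadic only for the right
        # edge when a is even and for the left edge when a is odd.
        if s == 0 or (a % 2 == 0) != (side == "right"):
            return tiling
        other = (a ^ 1, s, b, t)
        if other not in tiling:
            return tiling
        # Split the union horizontally instead of vertically.
        new = {(a // 2, s - 1, 2 * b, t + 1), (a // 2, s - 1, 2 * b + 1, t + 1)}
    else:
        if t == 0 or (b % 2 == 0) != (side == "top"):
            return tiling
        other = (a, s, b ^ 1, t)
        if other not in tiling:
            return tiling
        # Split the union vertically instead of horizontally.
        new = {(2 * a, s + 1, b // 2, t - 1), (2 * a + 1, s + 1, b // 2, t - 1)}
    return (tiling - {rect, other}) | new

def reachable(k):
    """All tilings reachable by edge flips from the tiling by 2^k vertical strips."""
    start = frozenset((a, k, 0, 0) for a in range(2 ** k))
    seen, queue = {start}, deque([start])
    while queue:
        cur = queue.popleft()
        for rect in cur:
            for side in SIDES:
                nxt = flip(cur, rect, side)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen
```

Since the chain is irreducible [10], the breadth-first search should visit all of Ω_k, so `len(reachable(2))` and `len(reachable(3))` can be compared with A_2 = 7 and A_3 = 82. One step of M_k then amounts to picking `rect` and `side` uniformly at random and applying `flip`.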
It was previously shown that this Markov chain is irreducible in [10], so M k is ergodic and thus has a unique stationary distribution. The uniform distribution satisfies the detailed balance equation, implying both that M k is reversible and that its stationary distribution is uniform on Ω k .
While we index this edge-flip Markov chain for dyadic tilings of size n = 2 k by k instead of by n, it is important to keep in mind that we wish to show the mixing time of M k is polynomial in n, not polynomial in k.

The Block Dynamics Markov Chain M block k
To analyze the mixing time of Markov chain M k , we will appeal to a similar Markov chain that uses larger block moves instead of single edge flips. We use in a crucial way the bijection between tilings in Ω k−1 and the left or right (resp. top or bottom) half of a tiling in Ω k that has a vertical (resp. horizontal) bisector, as discussed in the proof of Proposition 2.1.
For k ≥ 2, the block dynamics Markov chain M_k^block on the state space Ω_k of all dyadic tilings of size 2^k is given by the following rules.
Beginning at any σ_0 ∈ Ω_k, repeat:
• Uniformly at random choose a tiling ρ ∈ Ω_{k-1}.
• Uniformly at random choose Left, Right, Top, or Bottom.
• To obtain σ_{i+1} from σ = σ_i:
  - If Left was chosen and σ has a vertical bisector, retile σ's left half with ρ, under the mapping x → x/2.
  - If Right was chosen and σ has a vertical bisector, retile σ's right half with ρ, under the mapping x → (x + 1)/2.
  - If Top was chosen and σ has a horizontal bisector, retile σ's top half with ρ, under the mapping y → (y + 1)/2.
  - If Bottom was chosen and σ has a horizontal bisector, retile σ's bottom half with ρ, under the mapping y → y/2.
  - Otherwise, set σ_{i+1} = σ_i.
Let P_{k,block} be the transition matrix of this Markov chain and let γ_{k,block} be its spectral gap. Any valid nonstationary transition of M_k^block occurs with probability 1/(4|Ω_{k-1}|). This Markov chain is not lazy, but it is aperiodic, irreducible, and reversible. This implies it is ergodic and thus has a unique stationary distribution, which by detailed balance is uniform on Ω_k.
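A single block move can be sketched as follows, under an assumed encoding that is not from the paper: the rectangle [a2^{-s}, (a+1)2^{-s}] × [b2^{-t}, (b+1)2^{-t}] is the tuple (a, s, b, t) and a tiling is a frozenset of tuples. The maps x → x/2 and x → (x + 1)/2 (and their y-analogues) used to place ρ into a half are taken from the surrounding text; the function names are illustrative.

```python
def has_bisector(sigma, half):
    """Vertical bisector for left/right moves, horizontal bisector for top/bottom moves."""
    idx = 1 if half in ("left", "right") else 3  # position of s or t in (a, s, b, t)
    # A rectangle crosses the line x = 1/2 (resp. y = 1/2) exactly when s == 0 (resp. t == 0).
    return all(rect[idx] >= 1 for rect in sigma)

def in_half(rect, half):
    """Whether rect lies in the given half; only called when the bisector exists."""
    a, s, b, t = rect
    if half == "left":
        return a < 2 ** (s - 1)
    if half == "right":
        return a >= 2 ** (s - 1)
    if half == "bottom":
        return b < 2 ** (t - 1)
    return b >= 2 ** (t - 1)  # top

def embed(rect, half):
    """Image of a rectangle of rho (a tiling of size 2^(k-1)) inside the chosen half."""
    a, s, b, t = rect
    if half == "left":
        return (a, s + 1, b, t)           # x -> x/2
    if half == "right":
        return (a + 2 ** s, s + 1, b, t)  # x -> (x + 1)/2
    if half == "bottom":
        return (a, s, b, t + 1)           # y -> y/2
    return (a, s, b + 2 ** t, t + 1)      # y -> (y + 1)/2

def block_step(sigma, rho, half):
    """Retile the chosen half of sigma with rho if the required bisector is present."""
    if not has_bisector(sigma, half):
        return sigma
    kept = {r for r in sigma if not in_half(r, half)}
    return frozenset(kept | {embed(r, half) for r in rho})
```

A full step of the block dynamics would draw `rho` uniformly from Ω_{k-1} and `half` uniformly from the four options before calling `block_step`; sampling ρ uniformly is not shown here.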

A Polynomial upper bound on the mixing time of M k
Recall we wish to show the mixing time of M_k is polynomial in n = 2^k, not polynomial in k. We show the spectral gap γ_k of M_k and the spectral gap γ_{k-1} of M_{k-1} differ by at most a multiplicative constant (specifically, γ_k ≥ γ_{k-1}/17 for large enough k) by appealing to the Dirichlet forms of both of these Markov chains as well as the block dynamics Markov chain M_k^block. We can then use recursion to show γ_k is bounded below by a constant multiple of (1/17)^k, which, because k = log n (with the logarithm taken base 2), gives a polynomial upper bound on the relaxation time and thus on the mixing time of M_k.
For any function f : Ω k → R, we will denote the Dirichlet form of f with respect to transition matrix P k,edge and the uniform stationary distribution as E k,edge (f ). The Dirichlet form of f with respect to transition matrix P k,block and the uniform stationary distribution will be E k,block (f ). We will let the variance of function f on Ω k with respect to the uniform stationary distribution be var k (f ). Here the k indicates which state space Ω k we are considering, rather than which distribution on Ω k the variance is taken with respect to; all variances we consider will be with respect to the uniform distribution.
Because we consider two different Markov chains on the same state space Ω k , there are two different notions of adjacencies on this state space, each corresponding to the moves of one of these Markov chains. For x, y in Ω k , we say x ∼ e y if x and y differ by a single edge flip move of M k and x ∼ b y if x and y differ by a single move of the block dynamics chain M block k . More specifically, if x and y differ by a retiling of their left half (implying x and y both have a vertical bisector and are the same on their right half), we say x ∼ L y; then x ∼ R y, x ∼ T y, and x ∼ B y are defined similarly for the right, top, and bottom halves.
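Theorem 4.1. For all k ≥ 2, the spectral gaps of the edge-flip and block dynamics chains satisfy γ_k ≥ γ_{k,block} · γ_{k-1}.
Proof. We begin by computing the Dirichlet forms for the block dynamics and then for the edge-flip dynamics, which will allow comparison of their spectral gaps. Recall that for any function f : Ω_k → R,
    E_{k,block}(f) = (1/2) Σ_{x ∼_b y} (1/|Ω_k|) (1/(4|Ω_{k-1}|)) (f(x) − f(y))^2.
This sum can be split up into four terms, depending on whether x and y differ by a retiling of their left, right, top, or bottom halves. We now analyze the first of these terms, containing all pairs x, y differing by a retiling of their left halves. For x_L, x_R ∈ Ω_{k-1}, by x_L x_R below we mean the tiling in Ω_k with a vertical bisector whose left half is x_L under the map x → x/2 and whose right half is x_R under the map x → (x + 1)/2. Then
    (1/2) Σ_{x ∼_L y} (1/|Ω_k|) (1/(4|Ω_{k-1}|)) (f(x) − f(y))^2 = (|Ω_{k-1}| / (4|Ω_k|)) Σ_{x_R∈Ω_{k-1}} ( (1/2) Σ_{x_L,y_L∈Ω_{k-1}} (1/|Ω_{k-1}|^2) (f(x_L x_R) − f(y_L x_R))^2 ).
For each x_R ∈ Ω_{k-1}, the function f|x_R : Ω_{k-1} → R given by f|x_R(z) = f(z x_R) has variance var_{k-1}(f|x_R) (with respect to the uniform distribution) that is exactly equal to the term in parentheses above. By appealing to Proposition 2.5, we can bound this variance using both the Dirichlet form of the function f|x_R associated to the transition matrix P_{k-1,edge} and the spectral gap γ_{k-1} of the Markov chain M_{k-1}. Thus,
    var_{k-1}(f|x_R) ≤ E_{k-1,edge}(f|x_R) / γ_{k-1}.
We now see that the Dirichlet form for the edge-flip Markov chain on Ω_{k-1} is
    E_{k-1,edge}(f|x_R) = (1/2) Σ_{x_L ∼_e y_L} (1/|Ω_{k-1}|) 2^{-k} (f(x_L x_R) − f(y_L x_R))^2.
Using this expression, we see that
    (1/2) Σ_{x ∼_L y} (1/|Ω_k|) (1/(4|Ω_{k-1}|)) (f(x) − f(y))^2 ≤ (1/γ_{k-1}) · (1/4) · (1/2) Σ_{x ∼_e y, x ∼_L y} (1/|Ω_k|) 2^{-k} (f(x) − f(y))^2,
and the analogous bounds hold for the right, top, and bottom terms. We now compare this to the Dirichlet form for the edge-flip Markov chain on Ω_k, which we recall is
    E_{k,edge}(f) = (1/2) Σ_{x ∼_e y} (1/|Ω_k|) 2^{-k-1} (f(x) − f(y))^2.
We note for every x, y ∈ Ω_k such that x ∼_e y, at least one of and at most two of x ∼_L y, x ∼_R y, x ∼_T y, and x ∼_B y hold. Thus each summand of E_{k,edge}(f) appears at most twice as a summand of the sum of the four terms above. Since 2^{-k} = 2 · 2^{-k-1}, it follows that
    E_{k,block}(f) ≤ (1/γ_{k-1}) · (1/4) · 2 · 2 · E_{k,edge}(f) = E_{k,edge}(f) / γ_{k-1}.
Note this implies that for any f,
    E_{k,edge}(f) / var_k(f) ≥ γ_{k-1} · E_{k,block}(f) / var_k(f) ≥ γ_{k-1} · γ_{k,block}.
Let f be chosen to be the function achieving equality in γ_k = min_f E_{k,edge}(f) / var_k(f) from Proposition 2.5. We conclude
    γ_k ≥ γ_{k,block} · γ_{k-1}.
We will prove in Section 6 that the spectral gap of the block dynamics Markov chain is at least 1/17 for sufficiently large k. This can be used to bound the spectral gap, the relaxation time, and finally the mixing time of M_k.
Theorem 4.2. There exists a positive integer k_0 such that for all k ≥ k_0, the spectral gap γ_{k,block} is at least 1/17.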
Proof. See Section 6. We introduce a distance metric on dyadic tilings, and then give a coupling where the distance between two tilings decreases in expectation after one iteration by a multiplicative factor of 1 − 1/17 for all k sufficiently large. By a result of Chen [6], this implies the theorem.
We are now ready to prove our first main theorem, Theorem 1.1, which states that the relaxation time of M_k for n = 2^k is O(n^{log 17}) and its mixing time is O(n^{1+log 17}).
Proof of Theorem 1.1. By Theorems 4.1 and 4.2, the spectral gap of M_k satisfies, for all k > k_0,
    γ_k ≥ γ_{k-1}/17 ≥ ... ≥ γ_{k_0} (1/17)^{k−k_0},
where k_0 is the value from Theorem 4.2. Since γ_{k_0} is a constant that does not depend on n, we obtain γ_k = Ω(17^{-k}) = Ω(n^{-log 17}) = Ω(n^{-4.09}).
Because M_k is a lazy Markov chain, its relaxation time satisfies
    t_rel = 1/γ_k = O(n^{log 17}) = O(n^{4.09}).
To use this to bound the mixing time of M_k, we appeal to Proposition 2.4, though we first must calculate π_min. For π the uniform distribution, min_{x∈Ω_k} π(x) = 1/|Ω_k|. By the asymptotics of dyadic tilings, a loose bound is 1/π_min = |Ω_k| < 2^n, so log(4/π_min) = O(n). This implies t_mix = O(n^{1+log 17}).
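The exponent 4.09 comes from rewriting 17^k as a power of n = 2^k, with logarithms taken base 2. A small numerical sketch (the helper name `relaxation_exponent` is illustrative, and the default 17 is the constant from Theorem 4.2):

```python
from math import log2

def relaxation_exponent(inverse_block_gap=17):
    """Exponent c such that inverse_block_gap**k == n**c when n = 2**k."""
    return log2(inverse_block_gap)

k = 10
n = 2 ** k
# 17^k and n^(log2 17) agree up to floating-point rounding.
relative_error = abs(17 ** k - n ** relaxation_exponent()) / 17 ** k
```

The value of `relaxation_exponent()` lies between 4.08 and 4.09, which is why the relaxation time bound is stated as O(n^{4.09}) and the mixing time bound, after multiplying by the O(n) factor from log(1/π_min), as O(n^{5.09}).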

Lower bound on the mixing time of M n
In this section we give the proof of Theorem 1.2. For this, we define the following subsets of Ω k : x has both a horizontal and a vertical bisector} , Ω | k = {x ∈ Ω k : x has a vertical bisector} , and Ω − k = {x ∈ Ω k : x has a horizontal bisector} .

By definition, we have Ω
We start with the following simple lemma.
We will also require the following technical estimate.
Proof. We will show how to estimate ∏_{i=0}^{k-2} |Ω_i|^2 via the construction of a tiling in Ω_k. We start with a tiling with both a horizontal and a vertical bisector, as in Figure 3(a). Then we inductively do the following. Both quadrants of the left half are tiled independently with a uniformly random tiling from Ω_{k-2}. In the top-right quadrant, we add a vertical bisector and complete the two halves of this quadrant with independent, uniformly random tilings from Ω_{k-3}. Finally, in the bottom-right quadrant, we create a horizontal and a vertical bisector, reaching the tiling in Figure 3(b). Then we take this bottom-right quadrant, and iterate the procedure above; see Figure 3(c,d) for the configurations after one and two more iterations. This iteration continues until creating a bisector would result in rectangles of area less than 2^{-k}. In the case where an attempt is made to divide a rectangle of area 2^{-k+1} into four rectangles of equal area by adding both a horizontal and vertical bisector, we instead add just a horizontal bisector, resulting in two rectangles each of area 2^{-k}.
Let Υ k ⊂ Ω k be the set of tilings obtained in this way. Note that the number of tilings in Υ k is where the first expression is exactly the value we wish to bound. Using the construction above until Figure 3(b), we obtain that where the second factor stands for the fact that the top-right quadrant must contain a vertical bisector. Iterating this in the bottom-right quadrant, we obtain Proposition 2.1 gives that where the inequality follows from Lemma 5.1. For even k, because |Ω | 0 | = 0 the last term we can obtain in (1) is where the last expressions come from, respectively, identities for φ and the easily-checked inequality 2φ + 1 > φ 2 . When k is odd, the last term in (1) is construction: among the two halves of the top half, one must contain the pivotal edge, say the bottom one, while the other contains a vertical bisector, each side of which being completed with a tiling from Ω k−3 , which gives the configuration in Figure 5(c). Continuing this for k − 2 steps concludes the construction.
To estimate the cardinality of ∂Ω^|_k, note that in each step of the construction we have two choices for where the pivotal edge is: either in the top half or the bottom half of the corresponding region. Therefore, the number of tilings in ∂Ω^|_k is Hence, where the last step follows from Lemma 5.2. Therefore, there exists a constant c > 0 such that This implies that the relaxation time and mixing time satisfy This completes the proof of the theorem.

The spectral gap of the block dynamics
We now present the proof of Theorem 4.2, which states that there exists a positive integer k 0 such that for all k ≥ k 0 , the spectral gap γ k,block is at least 1/17.
Proof of Theorem 4.2. We start by defining the distance between two dyadic tilings x, y ∈ Ω_k. In order to do this, we recall the notion of half-bisectors. We say that a tiling x has a left half-bisector if the line segment from (0, 1/2) to (1/2, 1/2) does not intersect the interior of any dyadic rectangle. In an analogous way we can define a right half-bisector using the line segment from (1/2, 1/2) to (1, 1/2), a top half-bisector using the line segment from (1/2, 1) to (1/2, 1/2), and a bottom half-bisector using the line segment from (1/2, 1/2) to (1/2, 0). Note that if x has a horizontal bisector, then it has both a left half-bisector and a right half-bisector. However, x may have a left or right half-bisector but no horizontal bisector. For example, the dyadic tiling in Figure 1(a) has top, right and bottom half-bisectors, but no left half-bisector.
Now we define the distance between x and y as follows. Let ℓ_1 be the number of the four possible half-bisectors that are present in either x or y, but not in both of them. Also, let ℓ_2 denote the number of the four possible quadrants (top-left, top-right, bottom-left and bottom-right) for which the rectangles in x intersecting that quadrant are not the same as the rectangles in y intersecting that quadrant. Then, introducing a parameter b > 0 that we will take to be sufficiently large later, we define the distance between x and y as d(x, y) = b ℓ_1 + ℓ_2.
For instance, consider the two dyadic tilings in Figure 1(a,b). In this case we have ℓ_1 = 1 due to the left half-bisector that is present in (b) but not in (a), and ℓ_2 = 3 for the top-left, top-right and bottom-left quadrants. The distance between these two tilings is then b + 3.
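The distance can be computed mechanically once a tiling encoding is fixed. The sketch below assumes, for illustration only, that a rectangle [a2^{-s}, (a+1)2^{-s}] × [b2^{-t}, (b+1)2^{-t}] is stored as the tuple (a, s, b, t) and a tiling as a set of such tuples; the names `touches`, `has_half_bisector`, `distance`, `l1` and `l2` are not from the paper. Here `l1` counts half-bisectors present in exactly one tiling and `l2` counts quadrants whose intersecting rectangles differ.

```python
HALF_BISECTORS = ("left", "right", "top", "bottom")
QUADRANTS = (("left", "top"), ("right", "top"), ("left", "bottom"), ("right", "bottom"))

def touches(rect, side):
    """True if the interior of rect meets the given open half of the unit square."""
    a, s, b, t = rect
    if side == "left":
        return s == 0 or a < 2 ** (s - 1)
    if side == "right":
        return s == 0 or a >= 2 ** (s - 1)
    if side == "bottom":
        return t == 0 or b < 2 ** (t - 1)
    return t == 0 or b >= 2 ** (t - 1)  # top

def has_half_bisector(tiling, side):
    # Left/right half-bisectors lie on y = 1/2, crossed only by rectangles with t == 0;
    # top/bottom half-bisectors lie on x = 1/2, crossed only by rectangles with s == 0.
    crossing = 3 if side in ("left", "right") else 1
    return not any(r[crossing] == 0 and touches(r, side) for r in tiling)

def distance(x, y, b):
    """d(x, y) = b * l1 + l2 for tilings x, y and weight b."""
    l1 = sum(has_half_bisector(x, h) != has_half_bisector(y, h) for h in HALF_BISECTORS)
    l2 = 0
    for horiz, vert in QUADRANTS:
        in_x = {r for r in x if touches(r, horiz) and touches(r, vert)}
        in_y = {r for r in y if touches(r, horiz) and touches(r, vert)}
        l2 += in_x != in_y
    return b * l1 + l2
```

As an example of our own (not one of the paper's figures), the tiling of size 4 by vertical strips and the tiling by horizontal strips share no half-bisector and differ in every quadrant, so their distance is 4b + 4.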
Our goal is to couple two instances of the block dynamics $M^{\mathrm{block}}_k$, one starting from a state $x \in \Omega_k$ and the other from a state $y \in \Omega_k$, such that the distance between $x$ and $y$ contracts after one step of the chains. More precisely, letting $E_{x,y}$ denote the expectation with respect to the coupling, and letting $x'$ and $y'$ be the dyadic tilings obtained after one step of each chain, respectively, we want to obtain a coupling and a value $\Delta > 0$ such that
$$E_{x,y}[d(x', y')] \le (1 - \Delta)\, d(x, y). \tag{6}$$
Once we have the above inequality, a result of Chen [6] (see also [12, Theorem 13.1]) implies that $\gamma_{k,\mathrm{block}} \ge \Delta$. We will use the following simple coupling between $x'$ and $y'$:
• Uniformly at random choose a tiling $\rho \in \Omega_{k-1}$.
• Uniformly at random choose Left, Right, Top or Bottom.
• Retile the chosen half (left, right, top or bottom) of $x$ with $\rho$, if possible.
• Retile the chosen half (left, right, top or bottom) of $y$ with $\rho$, if possible.
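The four steps of the coupling can be sketched as follows. This is only an illustration under stated assumptions: rectangles are encoded as tuples `(a, s, c, t)` for $[a2^{-s}, (a+1)2^{-s}] \times [c2^{-t}, (c+1)2^{-t}]$, "if possible" is read as "if the tiling has the bisector separating the chosen half from the rest" (the precise rule is the one in Section 3.1), and $\Omega_k$ is enumerated by brute force, which is feasible only for very small $k$. All names are hypothetical.

```python
import random
from functools import lru_cache

HALVES = ("left", "right", "top", "bottom")

def embed(rho, half):
    """Scale a tiling of the unit square into the given half of the unit square."""
    if half in ("left", "right"):
        shift = 1 if half == "right" else 0
        return frozenset((a + shift * 2 ** s, s + 1, c, t) for (a, s, c, t) in rho)
    shift = 1 if half == "top" else 0
    return frozenset((a, s, c + shift * 2 ** t, t + 1) for (a, s, c, t) in rho)

def retile(tiling, rho, half):
    """Replace one half of `tiling` by `rho`; do nothing if the bisector is absent."""
    axis = 1 if half in ("left", "right") else 3      # index of s or of t
    if any(r[axis] == 0 for r in tiling):
        return tiling                                 # some rectangle crosses the bisector
    pos = axis - 1                                    # index of a or of c
    keep_upper = half in ("left", "bottom")           # keep the half NOT being retiled
    kept = frozenset(r for r in tiling
                     if (2 * r[pos] >= 2 ** r[axis]) == keep_upper)
    return kept | embed(rho, half)

@lru_cache(maxsize=None)
def all_tilings(k):
    """Brute-force enumeration of Omega_k by bisecting and recursing."""
    if k == 0:
        return frozenset({frozenset({(0, 0, 0, 0)})})
    smaller = all_tilings(k - 1)
    result = set()
    for first in smaller:
        for second in smaller:
            result.add(embed(first, "left") | embed(second, "right"))
            result.add(embed(first, "bottom") | embed(second, "top"))
    return frozenset(result)

def coupled_step(x, y, k, rng=random):
    """One step of the coupling: the same rho and the same half for both chains."""
    rho = rng.choice(sorted(all_tilings(k - 1), key=sorted))
    half = rng.choice(HALVES)
    return retile(x, rho, half), retile(y, rho, half)

strips = frozenset((a, 2, 0, 0) for a in range(4))
grid = frozenset((a, 1, c, 1) for a in range(2) for c in range(2))
x1, y1 = coupled_step(strips, grid, k=2, rng=random.Random(0))
```

Because both chains use the same $\rho$ and the same half, any half that is retiled in both tilings becomes identical in $x'$ and $y'$, which is the mechanism the case analysis relies on.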
For a more detailed description of the retiling step, see the definition of the transition rule of $M^{\mathrm{block}}_k$ in Section 3.1. When we update the left (resp., right) half of $x$ and $\rho$ contains a horizontal bisector, note that $x'$ will contain a left (resp., right) half-bisector. Similarly, if we update the top (resp., bottom) half of $x$ and $\rho$ contains a vertical bisector, then $x'$ will contain a top (resp., bottom) half-bisector. In any of these cases, we say that the retiling yields a half-bisector of $x'$. The remainder of the proof is devoted to showing that we can set $b$ large enough so that (6) holds with $\Delta = \frac{1}{17}$. To see this, we split into three cases and show that (6) holds with $\Delta = \frac{1}{17}$ in each case.
Case 1: $x$ and $y$ have no common bisector. The maximum number of common half-bisectors of $x$ and $y$ in this case is two. Figure 6 illustrates the three possible configurations for the number of common half-bisectors of $x$ and $y$.
Consider first the situation where $x$ and $y$ have no common half-bisector, which is illustrated in Figure 6(a) and has $d(x, y) = 4b + 4$. Then, whichever half (left, right, top or bottom) is chosen to be retiled, either $x$ or $y$ is actually retiled, but never both. With probability $f_k$, the fraction of tilings in $\Omega_{k-1}$ that contain a given (horizontal or vertical) bisector, the retiling yields a half-bisector, which increases the number of common half-bisectors between $x$ and $y$, and thus decreases their distance by $b$. Hence, using that $f_k \ge 1/2$, we have
$$E_{x,y}[d(x', y')] \le 4b + 4 - f_k b \le 4b + 4 - \tfrac{b}{2} \le \left(1 - \tfrac{1}{17}\right)(4b + 4),$$
where the last step is true by setting $b$ large enough (in this case, $b \ge 1$ suffices).
Now suppose that $x$ and $y$ have one common half-bisector, and use Figure 6(b) as a reference, with $x$ being the left tiling and $y$ being the right tiling. We have $d(x, y) = 3b + 4$. If we retile the left or right half, then only $x$ gets retiled, and if the retiling yields a half-bisector, then the number of common half-bisectors of $x$ and $y$ increases by 1. A similar behavior happens if we retile the top half. However, if we retile the bottom half, and the retiling does not yield a half-bisector, then the number of common half-bisectors decreases by 1. Hence, using that $f_k \ge 1/2$, we obtain
$$E_{x,y}[d(x', y')] \le 3b + 4 - \tfrac{3}{4} f_k b + \tfrac{1}{4}(1 - f_k) b \le 3b + 4 - \tfrac{b}{4} \le \left(1 - \tfrac{1}{17}\right)(3b + 4),$$
where the last step is true by setting $b$ large enough (in this case, $b \ge 4$ suffices).
Finally, suppose $x$ and $y$ have two common half-bisectors, as illustrated in Figure 6(c), where they may or may not be tiled the same in the quadrant bounded by these common half-bisectors. In this case $d(x, y) = 2b + 4 - i$, where $i = 1$ if they agree on this quadrant and $i = 0$ otherwise. Retiling the left and top halves can yield a new common half-bisector, while retiling the right and bottom halves may remove a common half-bisector.
Moreover, if $i = 1$ and we retile the right or bottom halves, the tilings of the bottom-right quadrant of $x$ and of $y$ may become different, increasing the distance between $x$ and $y$ by 1. Putting these together, we obtain an upper bound on $E_{x,y}[d(x', y')]$ that is linear in $b$, in which the coefficient of $b$ is $\frac{5 - 2f_k}{2}$. As $k \to \infty$, this coefficient tends to $\frac{6 - \sqrt{5}}{2}$. In particular, for $k \ge 10$, the coefficient of $b$ satisfies $\frac{5 - 2f_k}{2} < 2\left(1 - \frac{1}{17}\right)$, and so we can set $b$ large enough so that $E_{x,y}[d(x', y')] \le \left(1 - \frac{1}{17}\right)(2b + 4 - i)$. We note this is the tight case, as $\frac{6 - \sqrt{5}}{2} > 2\left(1 - \frac{1}{16}\right)$, so this particular coupling and distance metric cannot be used to show the spectral gap is at least $1/16$. This concludes the first case.
Case 2: $x$ and $y$ have a common vertical bisector, but neither has a horizontal bisector.
Each of $x$ and $y$ has at least 2 and at most 3 half-bisectors. Figure 7 illustrates the four possible configurations for the number of half-bisectors of $x$ and $y$; the shaded quadrants are those where $x$ and $y$ could have the same tiling. In all the situations of Figure 7, if we retile the left or right half, then we match up the configuration of $x$ and $y$ in that half. In particular, if $x$ and $y$ do not agree on the presence of a left half-bisector, then they also do not have the same tiling of the top-left or bottom-left quadrants, so the decrease in distance due to a retiling of the left half, a move that occurs with probability $1/4$, is $b + 2$. If $x$ and $y$ agree on the presence of a left half-bisector and have the same tiling on $i \in \{0, 1, 2\}$ of the two left quadrants, then the decrease in distance due to a retiling of the left half is $2 - i$. The same holds for right half-bisectors and retilings of the right half. As there are no moves of the coupling that can increase the distance between $x$ and $y$, it can be shown that in all of the cases shown in Figure 7 the distance decreases by a $1/4$ fraction of its value in expectation. Hence,
$$E_{x,y}[d(x', y')] \le \left(1 - \tfrac{1}{4}\right) d(x, y) \le \left(1 - \tfrac{1}{17}\right) d(x, y),$$
which concludes the second case.
Case 3: y has both vertical and horizontal bisectors.
Here there are three situations, depending on whether $x$ has two, three or four half-bisectors; see Figure 8. In the situation of Figure 8(a), if the left or right half is retiled, then we match up $x$ and $y$ in that half, decreasing the distance by $b + 2$. But if we retile the top or bottom half, then we may increase the distance by $b$ if the retiling does not yield a half-bisector. Hence,
$$E_{x,y}[d(x', y')] \le 2b + 4 - \tfrac{1}{2}(b + 2) + \tfrac{1}{2}(1 - f_k) b = \tfrac{4 - f_k}{2}\, b + 3.$$
Since $\frac{4 - f_k}{2} \to \frac{9 - \sqrt{5}}{4} < 2\left(1 - \frac{1}{17}\right)$, the right-hand side above is smaller than $\left(1 - \frac{1}{17}\right)(2b + 4)$ when $k$ and $b$ are large enough.
A similar situation occurs in Figure 8(b), but the distance increases a bit more when the top or bottom half is retiled, as quadrants that were equal in $x$ and $y$ may become different. In this case, the same accounting gives an upper bound on $E_{x,y}[d(x', y')]$ in which the coefficient of $b$ is $\frac{5 - 2f_k}{4}$. Since $\frac{5 - 2f_k}{4} \to \frac{6 - \sqrt{5}}{4} < 1 - \frac{1}{17}$, this bound is smaller than $\left(1 - \frac{1}{17}\right)(b + 4 - i)$ when $k$ and $b$ are large enough; this is the second tight case, where we see contraction by a factor of $1 - \frac{1}{17}$ but not by $1 - \frac{1}{16}$.
Finally, for the situation in Figure 8(c), regardless of which half we choose to retile, the distance will not increase; if we choose a half containing a quadrant on which $x$ and $y$ differ, the distance will decrease. Each quadrant on which $x$ and $y$ differ is contained in two halves and thus is retiled so that $x$ and $y$ agree there with probability $1/2$. That is,
$$E_{x,y}[d(x', y')] \le \tfrac{1}{2}\, d(x, y) \le \left(1 - \tfrac{1}{17}\right) d(x, y).$$
This concludes the third case. We have shown that for all possible tilings $x$ and $y$, it holds that $E_{x,y}[d(x', y')] \le \left(1 - \frac{1}{17}\right) d(x, y)$. This implies $\gamma_{k,\mathrm{block}} \ge \frac{1}{17}$ for all $k$ sufficiently large, as desired.
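The numerical thresholds used in the proof can be checked with exact rational arithmetic. The sketch below rests on two assumptions that are not stated in this section: that the number $A_k = |\Omega_k|$ of dyadic tilings satisfies the known recurrence $A_k = 2A_{k-1}^2 - A_{k-2}^4$ with $A_0 = 1$, $A_1 = 2$, and that $f_k = |\Omega_{k-2}|^2 / |\Omega_{k-1}|$, the fraction of tilings in $\Omega_{k-1}$ with a given bisector. Function names are hypothetical.

```python
from fractions import Fraction
from math import sqrt

def tiling_counts(k_max):
    """A_0..A_{k_max} via the assumed recurrence A_k = 2*A_{k-1}^2 - A_{k-2}^4."""
    counts = [1, 2]
    for k in range(2, k_max + 1):
        counts.append(2 * counts[k - 1] ** 2 - counts[k - 2] ** 4)
    return counts

def f(k, counts):
    """Assumed definition: f_k = |Omega_{k-2}|^2 / |Omega_{k-1}|, for k >= 2."""
    return Fraction(counts[k - 2] ** 2, counts[k - 1])

counts = tiling_counts(13)
ks = range(2, 15)

# f_k >= 1/2 is used in the first two sub-cases of Case 1.
always_at_least_half = all(f(k, counts) >= Fraction(1, 2) for k in ks)

# Tight sub-case of Case 1: need (5 - 2 f_k)/2 < 2 (1 - 1/17).
target = 2 * (1 - Fraction(1, 17))
first_k = next(k for k in ks if (5 - 2 * f(k, counts)) / 2 < target)

# Limiting coefficient (6 - sqrt(5))/2 lies between 2(1 - 1/16) and 2(1 - 1/17).
limit = (6 - sqrt(5)) / 2
tight = 2 * (1 - 1 / 16) < limit < 2 * (1 - 1 / 17)

print(always_at_least_half, first_k, tight)  # True 10 True
```

Under these assumptions the condition on the coefficient of $b$ first holds at $k = 10$, in agreement with the threshold stated in Case 1. The same inequality, divided by 2, is the one needed for the coefficient $\frac{5 - 2f_k}{4}$ in the situation of Figure 8(b).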