On Long Words Avoiding Zimin Patterns

A pattern is encountered in a word if some infix of the word is the image of the pattern under some non-erasing morphism. A pattern p is unavoidable if, over every finite alphabet, every sufficiently long word encounters p. A theorem by Zimin and, independently, by Bean, Ehrenfeucht and McNulty states that a pattern p over n distinct variables is unavoidable if, and only if, p itself is encountered in the n-th Zimin pattern. Given an alphabet size k, we study the minimal length f(n, k) such that every word of length f(n, k) encounters the n-th Zimin pattern. It is known that f is upper-bounded by a tower of exponentials. Our main result states that f(n, k) is lower-bounded by a tower of n − 3 exponentials, even for k = 2. To the best of our knowledge, this improves upon the previously best-known doubly-exponential lower bound. As a further result, we prove a doubly-exponential upper bound for encountering Zimin patterns in the abelian sense.


Introduction
A pattern is a finite word over some set of pattern variables. A pattern matches a word if the word can be obtained by substituting each variable appearing in the pattern by a non-empty word. For example, the pattern xx matches the word nana when x is replaced by the word na. A word encounters a pattern if the pattern matches some infix of the word. For example, the word banana encounters the pattern xx (as the word nana is one of its infixes). The pattern xyx is encountered in precisely those words that contain two non-consecutive occurrences of the same letter, such as the word abca.
A pattern is unavoidable if, over every finite alphabet, every sufficiently long word encounters the pattern. Equivalently, by Kőnig's Lemma, a pattern is unavoidable if, over every finite alphabet, all infinite words encounter the pattern. If this is not the case, the pattern is said to be avoidable.
The pattern xyx is easily seen to be unavoidable since every sufficiently long word over a finite alphabet must contain two non-consecutive occurrences of the same letter. On the other hand, the pattern xx is avoidable as Thue [23] gave an infinite word over a ternary alphabet that does not encounter the pattern xx.
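Thue's avoidance result can be checked empirically. The sketch below uses a classical square-free morphism over the ternary alphabet {a, b, c} (a → abc, b → ac, c → b; a standard example from the combinatorics-on-words literature, not necessarily Thue's original word) and verifies that a long prefix of its fixed point contains no infix matching xx:

```python
# Iterate a classical square-free morphism over the ternary alphabet:
# a -> abc, b -> ac, c -> b. The fixed point starting from 'a' is
# square-free, i.e. it encounters no infix of the form uu (pattern xx).
MORPHISM = {'a': 'abc', 'b': 'ac', 'c': 'b'}

def iterate_morphism(steps: int) -> str:
    """Prefix of the fixed point obtained by iterating the morphism."""
    w = 'a'
    for _ in range(steps):
        w = ''.join(MORPHISM[c] for c in w)
    return w

def has_square(w: str) -> bool:
    """Return True iff w contains a non-empty infix of the form uu."""
    return any(w[i:i + l] == w[i + l:i + 2 * l]
               for l in range(1, len(w) // 2 + 1)
               for i in range(len(w) - 2 * l + 1))
```

For instance, has_square("banana") holds (witness nana), while every prefix obtained by iterating the morphism avoids squares.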
A precise characterization of unavoidable patterns was found by Zimin [24] and independently by Bean et al. [10]; see also [17] for a more recent proof. This characterization is based on a family (Z_n)_{n≥1} of unavoidable patterns, called the Zimin patterns, where Z_1 = x_1 and Z_{n+1} = Z_n x_{n+1} Z_n for all n ≥ 1.
A pattern over n distinct pattern variables is unavoidable if, and only if, the pattern itself is encountered in the n-th Zimin pattern Z n . Zimin patterns can therefore be viewed as the canonical patterns for unavoidability.
Due to the canonical status of Zimin patterns it is natural to investigate the question "what is the smallest word length f(n, k) that guarantees that every word of this length over a k-letter alphabet encounters the n-th Zimin pattern Z_n?".
Computing the exact value of f (n, k) for n ≥ 1 and k ≥ 2, or at least giving upper and lower bounds on its value, has been the topic of several articles in recent years [6,7,16,22].
For small values of n and k, known results from [15,16] are summarized in the following table. In general, Cooper and Rorabaugh [6, Theorem 1.1] showed that the value of f(n, k) is upper-bounded by a tower of exponentials of height n − 1. To make this more precise, let us define the tower function Tower : N × N → N inductively as follows: Tower(0, k) = 1 and Tower(n + 1, k) = k^{Tower(n, k)} for all n, k ∈ N.
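The smallest values of f(n, k) can be recomputed by exhaustive search. The sketch below decides whether a word matches Z_n via the factorization Z_{n+1} = Z_n x_{n+1} Z_n (a word matches Z_{n+1} iff it has a border u with a non-empty middle part such that u matches Z_n), then searches for the smallest length at which no avoiding word survives; all function names are ours:

```python
from itertools import product

def tower(n: int, k: int) -> int:
    """Tower(0, k) = 1 and Tower(n + 1, k) = k ** Tower(n, k)."""
    return 1 if n == 0 else k ** tower(n - 1, k)

def matches_zimin(w: str, n: int) -> bool:
    """Does w match Z_n, i.e. w = psi(Z_n) for a non-erasing morphism psi?"""
    if n == 0:
        return True
    if not w:
        return False
    if n == 1:
        return True  # Z_1 = x_1 matches every non-empty word
    # Z_n = Z_{n-1} x_n Z_{n-1}: look for a border matching Z_{n-1},
    # leaving a non-empty middle part (hence 2k < len(w)).
    return any(w[:k] == w[-k:] and matches_zimin(w[:k], n - 1)
               for k in range(1, (len(w) - 1) // 2 + 1))

def encounters_zimin(w: str, n: int) -> bool:
    """Does some infix of w match Z_n?"""
    return any(matches_zimin(w[i:j], n)
               for i in range(len(w)) for j in range(i + 1, len(w) + 1))

def f(n: int, k: int) -> int:
    """Smallest L such that every length-L word over k letters encounters Z_n."""
    alphabet = 'abcdefghij'[:k]
    L = 1
    while True:
        if all(encounters_zimin(''.join(w), n)
               for w in product(alphabet, repeat=L)):
            return L
        L += 1
```

For example, the search yields f(2, 2) = 5: the word aabb of length 4 avoids Z_2 = x_1 x_2 x_1, while every binary word of length 5 encounters it.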
In stark contrast with this upper bound, Cooper and Rorabaugh showed that f(n, k) is lower-bounded doubly-exponentially in n for every fixed k ≥ 2. To our knowledge, this is the best known lower bound for f. Theorem 2 (Cooper/Rorabaugh [6]) f(n, k) ≥ k^{2^{n−1}(1+o(1))}.
This lower bound is obtained by estimating the expected number of occurrences of Z n in long words over a k-letter alphabet using the first moment method.

Our Contributions
Our main contribution is to prove a lower bound for f(n, k) that is non-elementary in n, even for k = 2. We use Stockmeyer's yardstick construction [21] to construct, for each n ≥ 1, a family of words of length at least Tower(n − 1, 2) (that we call higher-order counters here). We then show that a counter of order n does not encounter Z_n (for n ≥ 3). As these words are over an alphabet of size 2n − 1, this immediately establishes that f(n, 2n − 1) > Tower(n − 1, 2).
Stockmeyer's yardstick construction is a well-known technique to prove non-elementary lower bounds in computer science; for instance, it is used to show that the first-order theory of binary words with order is non-elementary, see [14] for a proof.
By using a carefully chosen encoding we are able to prove a lower bound for f over a binary alphabet. Namely, for all n ≥ 4, it holds that f(n, 2) > Tower(n − 3, 2).
As a spin-off result, we also consider the abelian setting. Matching a pattern in the abelian sense is a weaker condition: the infixes substituted for a pattern variable are only required to have the same number of occurrences of each letter (instead of being identical words). This gives rise to the notions of patterns that are avoidable and unavoidable in the abelian sense. We note that every pattern that is unavoidable is in particular unavoidable in the abelian sense. However, the converse does not hold in general, as witnessed by the pattern xyzxyxuxyxzyx, as shown in [9]. Even though Zimin patterns lose their canonical status in the abelian setting, the function g(n, k), an abelian analogue of the function f(n, k), has been studied [22]. For this function, Tao [22] establishes a lower bound that turns out to be doubly-exponential by the estimations in [13]. The upper bound is inherited from the non-abelian setting and is hence non-elementary. We improve this upper bound to doubly-exponential. We also provide a simple proof, using the first moment method, that g admits a doubly-exponential lower bound; this proof does not require the elaborate estimations of [13].
Comparison with [5] While finalizing the present article, we became aware of the preprint [5]. Its authors determine the value of f(3, k) up to a multiplicative constant [5, Theorem 1.3]. For the general case, they prove a lower bound for any fixed n ≥ 3 [5, Theorem 1.1], for which they provide two proofs. The first proof is based on the probabilistic method and positively answers a question we asked in the conclusion of our conference paper [3]. The second proof uses a counting argument. For the case of the binary alphabet, they prove a further lower bound [5, Theorem 1.2]. In this last case, our bound is slightly better and has the extra advantage of providing a concrete family of words witnessing the bound.

Applications to the Equivalence Problem of Deterministic Pushdown Automata
The equivalence problem for deterministic pushdown automata (dpda) is a famous problem in theoretical computer science. Its decidability was established by Sénizergues in 1997, and Stirling proved in 2001 the first complexity-theoretic upper bound, namely a tower of exponentials of elementary height [20] (in F_3 in terms of Schmitz' classification [18]); see also [12] for a more recent presentation.
In [19], Sénizergues generalizes Stirling's approach via the so-called "subwords lemma", allowing him both to prove a coNP upper bound for the equivalence problem of finite-turn dpda and to explicitly link the complexity of dpda equivalence with the growth of the function f: he shows that if f is elementary, then the complexity of dpda equivalence is elementary.
Inspired by this insight, a closer look reveals that the above-mentioned function f plays the same role in all complexity upper bound proofs [12,19,20] for dpda equivalence. However, by Theorem 7, one cannot hope to improve the computational complexity of dpda equivalence along these lines by proving an elementary upper bound on f, since f(n, k) grows non-elementarily even for k = 2.

Organization of the Paper
We introduce necessary notations in Section 2. We show that f(n, 2n − 1) > Tower(n − 1, 2) in Section 3. We lift this result to unavoidability over a binary alphabet in Section 4, where we show that f(n, 2) > Tower(n − 3, 2) for all n ≥ 4. Our doubly-exponential bounds on abelian avoidability are presented in Section 5. We conclude in Section 6.
If A is a finite set of symbols, we denote by A* the set of all words over A and by A+ the set of all non-empty words over A. We write ε for the empty word. For a word u ∈ A*, we denote by |u| its length. For two words u and v, we denote by u · v (or simply uv) their concatenation. A word v is a prefix of a word u, denoted by v ⊑ u, if there exists a word z such that u = vz. If z is non-empty, we say that v is a strict prefix of u. A word v is a suffix of a word u if there exists a word z such that u = zv. If z is non-empty, we say that v is a strict suffix of u.
A word v is an infix of a word u if there exist words z_1 and z_2 such that u = z_1 v z_2. If both z_1 and z_2 are non-empty, v is a strict infix of u. If v is an infix of u and u can be written as z_1 v z_2, the integer |z_1| is called an occurrence of v in u. For a ∈ A, we denote by |u|_a the number of occurrences of the symbol a in u.
Given two non-empty sets A and B, a morphism is a function ψ : A* → B* that satisfies ψ(uv) = ψ(u)ψ(v) for all u, v ∈ A*. Thus, every morphism can simply be given by a function from A to B*. A morphism ψ is said to be non-erasing if ψ(a) ≠ ε for all a ∈ A, and ψ is alphabetic if ψ(a) ∈ B for all a ∈ A.
Let us fix a countable set X = {x 1 , x 2 , . . .} of pattern variables. A pattern is a finite word over X . Let ρ = ρ 1 · · · ρ n be a pattern of length n. A finite word w matches ρ if w = ψ(ρ) for some non-erasing morphism ψ. A finite or infinite word w encounters ρ if some infix of w matches ρ.
A pattern ρ is said to be unavoidable if for all k ≥ 1 all but finitely many finite words (equivalently every infinite word, by Kőnig's Lemma) over the alphabet [k] encounter ρ. Otherwise we say ρ is avoidable.
Unavoidable patterns are characterized by the so-called Zimin patterns. Recall that for all n ≥ 1, the n-th Zimin pattern Z_n is given by Z_1 = x_1 and Z_{n+1} = Z_n x_{n+1} Z_n. For instance, we have Z_2 = x_1 x_2 x_1 and Z_3 = x_1 x_2 x_1 x_3 x_1 x_2 x_1. The following statement gives a decidable characterization of unavoidable patterns.

Theorem 3 (Bean/Ehrenfeucht/McNulty [10], Zimin [24]) A pattern ρ containing n different variables is unavoidable if, and only if, Z_n encounters ρ.

For instance, the pattern x_1 x_2 x_1 x_2 is avoidable because it is not encountered in Z_2 (not even in Z_n for any n ∈ N).
Theorem 3 justifies the study of the following Ramsey-like function.
As we mainly work with Zimin patterns, we introduce the notions of Zimin type (i.e. the maximal Zimin pattern that matches a word) and Zimin index (i.e. the maximal Zimin pattern that a word encounters) and their basic properties.

Definition 2
The Zimin type ZType(w) of a word w is the largest n such that w = ϕ(Z_n) for some non-erasing morphism ϕ. The Zimin index Zimin(w) of a word w is the largest n such that w encounters Z_n.

For instance, we have ZType(aaab) = 1, ZType(aba) = 2 and ZType(a^7 b a^7) = 4. Remark that the Zimin type of any non-empty word is greater than or equal to 1 and that the Zimin type of the empty word is 0.
Proof The first two points directly follow from the definition. For the last point, remark that for a word w to encounter the n-th Zimin pattern Z_n, it must be of length at least |Z_n|. As Z_n has length 2^n − 1, we have 2^{Zimin(w)} − 1 ≤ |w|, which implies the announced bound.
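Definition 2 and the length bound above can be made executable. The recursive test below relies on the factorization Z_{n+1} = Z_n x_{n+1} Z_n: a word matches Z_{n+1} iff it can be written as u v u with v non-empty and u matching Z_n (the helper names are ours):

```python
def ztype(w: str) -> int:
    """Largest n with w = phi(Z_n) for some non-erasing morphism phi."""
    if not w:
        return 0
    best = 1  # every non-empty word matches Z_1 = x_1
    # Z_{n+1} = Z_n x_{n+1} Z_n: try each border u = w[:k] that leaves
    # a non-empty middle part, i.e. 2k < len(w).
    for k in range(1, (len(w) - 1) // 2 + 1):
        if w[:k] == w[-k:]:
            best = max(best, 1 + ztype(w[:k]))
    return best

def zimin_index(w: str) -> int:
    """Largest n such that w encounters Z_n (maximum ztype over infixes)."""
    return max((ztype(w[i:j])
                for i in range(len(w)) for j in range(i + 1, len(w) + 1)),
               default=0)
```

The examples above are reproduced: ztype('aaab') = 1, ztype('aba') = 2 and ztype of a^7 b a^7 is 4; the bound 2^{Zimin(w)} − 1 ≤ |w| can be spot-checked on the same words.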

The Zimin Index of Higher-Order Counters
In this section we show that there is a family of words, which we refer to as "higher-order counters", whose length is non-elementary in n and whose Zimin index is n − 1, allowing us to show that f(n, 2n − 1) > Tower(n − 1, 2). In Section 3.1 we introduce higher-order counters, and in Section 3.2 we show that their Zimin index is precisely n − 1 and derive the mentioned lower bound on f.

Higher-Order Counters à la Stockmeyer
In this section we introduce counters that encode values ranging from 0 to a tower of exponentials. To the best of our knowledge, this construction was introduced by Stockmeyer to show non-elementary complexity lower bounds and is often referred to as the "yardstick construction" [21]. We refer to such counters as "higher-order counters" in the following.
We define the (unary) tower function τ : N → N by τ(0) = 1 and τ(n + 1) = 2^{τ(n)} for all n ≥ 0.
Equivalently, τ(n) = Tower(n, 2) for all n ∈ N. For all n ≥ 1, we define an alphabet Σ_n by taking Σ_1 = {0_1, 1_1} and, for all n > 1, Σ_n = Σ_{n−1} ∪ {0_n, 1_n}. We say that the symbols 0_n and 1_n have order n. We define Σ = ∪_{n≥1} Σ_n to be the set of all these symbols. For n ≥ 1 and i ∈ [0, τ(n + 1) − 1], we define the order-(n + 1) counter [[i]]_{n+1}. For instance, there are τ(2) = 4 counters of order 2. For [[11]]_3, we have 11 = 1 · 2^0 + 1 · 2^1 + 0 · 2^2 + 1 · 2^3. The following lemma is easily proven by induction on n. The next lemma expresses that the order of a symbol in a counter of order n only depends on the distance of this symbol to an order-n symbol; it is proven by induction on n, making use of the previous lemma. The length of an order-n counter, denoted by L_n, is defined inductively. Note that in particular, for all n ≥ 1, we have L_n ≥ τ(n − 1).
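The inductive definition of [[i]]_n did not survive typesetting in our copy. The sketch below implements one standard variant of the yardstick construction, consistent with the surrounding facts (order-n counters take values in [0, τ(n) − 1], the bits of 11 in [[11]]_3 are listed least significant first, and L_n ≥ τ(n − 1)): the order-(n + 1) counter for value i lists the τ(n) bits of i, each bit, a symbol of order n + 1, preceded by an order-n counter giving its position. Symbols are modelled as (bit, order) pairs; treat the exact layout as an assumption:

```python
def tau(n: int) -> int:
    """tau(0) = 1 and tau(n + 1) = 2 ** tau(n)."""
    return 1 if n == 0 else 2 ** tau(n - 1)

def counter(i: int, n: int) -> list:
    """Order-n counter for a value i in [0, tau(n) - 1], as a list of
    (bit, order) pairs; 0_n and 1_n are modelled as (0, n) and (1, n)."""
    assert n >= 1 and 0 <= i < tau(n)
    if n == 1:
        return [(i, 1)]
    w = []
    for j in range(tau(n - 1)):       # one position per bit of i
        w += counter(j, n - 1)        # position j as an order-(n-1) counter
        w.append(((i >> j) & 1, n))   # j-th bit of i, least significant first
    return w
```

With this variant there are exactly τ(n) distinct counters of order n, and the length L_n satisfies L_1 = 1 and L_{n+1} = τ(n)(L_n + 1) ≥ τ(n), matching the non-elementary growth used in the lower bound.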

Higher-Order Counters have Small Zimin Index
The aim of this section is to give an upper bound on the Zimin index of counters of order n. A first simple remark is that the Zimin index of any counter of order n is upper-bounded by the index of

Lemma 4 For all n ≥ 1 and for all
Proof Let n ≥ 1 and let i ∈ [0, τ(n) − 1]. By definition of higher-order counters, we have where ψ is the alphabetic morphism defined by ψ(0_n) = ψ(1_n) = 0_n and ψ(x) = x for all x ∈ Σ_{n−1}. Assume that [[i]]_n contains an infix of the form ϕ(Z_ℓ) for some non-erasing morphism ϕ and ℓ ≥ 0. By (1), This leads us to the main result of this section. All these words have Zimin type 1, which concludes the base case. Assume that the property holds for some n ≥ 3. By Lemma 4 and the induction hypothesis, we have that for all i ∈ [0, τ(n) − 1], Let us show that Zimin( for some non-empty words α and β. It is enough to show that ZType(α) ≤ n − 1. We distinguish the following cases depending on the number of occurrences of 0_{n+1} in α.
By the induction hypothesis (i.e. (2)) and Lemma 1, Consider the morphism ψ that erases all symbols in Σ_{n−1} and replaces 0_n and 1_n by 0 and 1, respectively. Let us assume that Let us start by showing that By definition of counters, b_{τ(n−1)−1} is the most significant bit of the binary representation (of length τ(n − 1)) of i and j, and c_0 is the least significant bit of the binary representation of both i + 1 and j + 1. More formally, there exist Assume towards a contradiction that ℓ_0 + ℓ_1 ≥ τ(n − 1). In particular, this implies 2^{ℓ_1} ≥ 2^{τ(n−1)−ℓ_0}. And hence, A similar reasoning shows that x_j = C − 1 mod 2^{τ(n−1)−ℓ_0}. Hence x_i = x_j, and hence i = j, which brings the contradiction.
Having just shown That is, the binary representation of i_0 of length τ(n − 1) has c_0 · · · c_{ℓ_1−1} as its ℓ_1 least significant bits and b_{τ(n−1)−ℓ_0} · · · b_{τ(n−1)−1} as its ℓ_0 most significant bits. In particular, as ℓ_0 + ℓ_1 < τ(n − 1), we have that: We claim that ZType

by Lemma 4 and induction hypothesis.
Assume that α = γδγ for non-empty γ and δ. Using Fact 1, it is enough to show that ZType(γ) Recall that α = u 0_{n+1} v and that α contains only one occurrence of 0_{n+1}. It follows that γ must be a prefix of u and a suffix of v. In particular, using (4), [[i_0]]_n contains γrγ as an infix.
The upper bound on the Zimin index of higher-order counters established in the previous theorem is tight. We proceed by induction on n ≥ 3.

Clearly ZType(α_3) = 2 and, as α_3 is an infix of [[3]]_2, it is also an infix of [[2]]_3 and [[3]]_3 (and in fact of all order-3 counters).
Assume that the property holds for some n ≥ 3 and let us show that it holds for n + 1. From the definitions, we have: Therefore there exist x_2, y_2, x_3 and y_3 such that: In particular, the common infix can be written as: We can take α_{n+1} = α_n y_2 0_{n+1} x_3 α_n, which is therefore an infix of both counters. The Zimin type of α_{n+1} = α_n β α_n with β = y_2 0_{n+1} x_3 is at least 1 more than that of α_n. By the induction hypothesis, the Zimin type of α_{n+1} is therefore at least n, which concludes the proof.

Reduction to the Binary Alphabet
In this section, we show how to encode the higher-order counters of Section 3 over the binary alphabet {0, 1} while still preserving a relatively low upper bound on the Zimin index. For this, we apply to counters the morphism ψ defined as follows.

Definition 4 For all n ≥ 1 and i ∈ [0, τ(n) − 1], we define the coded counter {{i}}_n = ψ([[i]]_n).

The set of images of the letters by this morphism forms what is known as an infix code, i.e. ψ(a) is not an infix of ψ(b) for any two letters a, b ∈ Σ with a ≠ b. In addition to being an infix code, the morphism was designed so that:
- we are able to attribute a non-ambiguous partial decoding to most infixes of an encoded word (cf. Lemma 8),
- the encodings of 0_n and 1_n differ on their first and last symbols, inter alia,
- the Zimin index of the encoding of an order-n symbol is relatively low (we will show that it is at most ⌈log_2(3 + 2n)⌉).
Applying a non-erasing morphism to a word can only increase its Zimin index. We will see in the remainder of this section that the Zimin index of higher-order counters increases by at most 2 when the morphism ψ is applied. It is possible that another choice of morphism would yield a better upper bound. However, remark that the proof we present is tightly linked to the above-mentioned properties of ψ, which are decisive for the construction to work.
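The displayed definition of ψ is garbled in our copy. A reconstruction consistent with the examples that do survive (ψ(0_1) = 0000 in Remark 1, ψ(0_2 0_3) = 00010000010100, and codes whose suffixes have the shape (01)^{k−1}00 or (01)^{k−1}11) is ψ(0_k) = 00(01)^{k−1}00 and ψ(1_k) = 11(01)^{k−1}11; treat this as a guess. The sketch below checks the infix-code property for that guess:

```python
def psi(bit: int, order: int) -> str:
    """Conjectured code of the symbol bit_order, e.g. psi(0, 1) = '0000'
    and psi(0, 2) = '000100'. The first and last symbols reveal the bit."""
    edge = '00' if bit == 0 else '11'
    return edge + '01' * (order - 1) + edge

def is_infix_code(codes: list) -> bool:
    """No code may occur as an infix of a distinct code."""
    return all(c not in d for c in codes for d in codes if c != d)
```

Under this guess the coded word of Remark 1 is reproduced: psi(0, 2) + psi(0, 3) equals 00010000010100, and no code of order up to 5 is an infix of another.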
This section is devoted to establishing the following theorem.
Theorem 6 For all n ≥ 2 and for all i ∈ [0, τ(n) − 1], The proof, which is essentially an extension of the proof of Theorem 4, is given in Section 4.2. To perform this reduction, we first establish basic properties of the morphism ψ and its decoding in Section 4.1.
Recall that an order-n counter has length at least τ(n − 1); in particular, so does its code, which is its image under the non-erasing morphism ψ.
Theorem 6 immediately implies the following non-elementary lower bound for f (n, 2) whenever n ≥ 4.

Parsing the Code ψ
A word w ∈ {0, 1}* is coded by ψ (or simply a coded word) if it is the image under ψ of some word v ∈ Σ*. As the image of ψ is an infix code, the word v ∈ Σ* is unique. However, in our proof we need to take into consideration all infixes of a coded word. To be able to reuse the proof techniques of Theorem 4, it is necessary to associate to an infix of a coded word a partial decoding called a parse.
We collect in the following lemma some observations on these sets which will be used throughout the proofs in this section.

Lemma 5
The sets C, R, L and F satisfy the following equations:

Lemma 6
For every n ≥ 1, every infix of a word in C*_{≤n} belongs to F_{≤n} ∪ L_{≤n} C*_{≤n} R_{≤n}, where F_{≤n}, L_{≤n}, C_{≤n} and R_{≤n} respectively denote the restrictions of F, L, C and R to symbols of order at most n.
Proof Let us first remark that ε ∈ L_{≤n} and ε ∈ R_{≤n}. We show, by induction on m, that every infix α of a word in C^m_{≤n} belongs to F_{≤n} ∪ L_{≤n} C*_{≤n} R_{≤n}. For m = 0, the property is immediate as ε ∈ F_{≤n} (and belongs to L_{≤n} C*_{≤n} R_{≤n}). For m = 1, if α is an infix of c ∈ C_{≤n}, then c = xαy for some words x and y. If x ≠ ε and y ≠ ε then α belongs to F_{≤n}. If x ≠ ε and y = ε then α belongs to L_{≤n}. If x = ε and y ≠ ε then α belongs to R_{≤n}. Finally, if x = y = ε, then α = c ∈ C_{≤n} ⊆ L_{≤n} C*_{≤n} R_{≤n}. For the induction step, let α be an infix of some c_1 · · · c_{m+1} ∈ C^{m+1}_{≤n}, where m ≥ 1. There are the following cases: either α is an infix of c_1 · · · c_m, and we can conclude using the induction hypothesis, or α is an infix of c_{m+1}, and we can conclude using the case m = 1. Finally, there remains the case where α = xy with x a suffix of c_1 · · · c_m and y a prefix of c_{m+1}. Clearly x belongs to L_{≤n} C*_{≤n} and y belongs to R_{≤n} ∪ C_{≤n}. This implies that α belongs to L_{≤n} C*_{≤n} R_{≤n}.
This leads us to define a parse p as a triple (ℓ, u, r) in L × Σ* × R. The word u will be called the center of the parse p. The value of the parse (ℓ, u, r) is the word ℓψ(u)r ∈ {0, 1}*. We say that α admits p if α is the value of p.
However, we will provide sufficient conditions on an infix to admit a unique parse.
On the one hand, the term "simple" is justified in the context of this proof as simple infixes of {{i}}_n can rather easily be shown to have Zimin index at most n − 1 for all n ≥ 4 and all i ∈ [0, τ(n) − 1] (cf. Lemma 7). It follows that {{i}}_n cannot contain more than 9 consecutive zeros or 9 consecutive ones. Hence α is simple either because it belongs to F or because |α| < 11.
On the other hand, the term "simple" will be justified by the fact that for non-simple infixes there is exactly one possible parse, as shown in the following lemma.

Lemma 8 Any non-simple infix of a coded word admits a unique parse.
Proof The existence of the parse is immediate from Lemma 6. We will show that unicity of the parse can be reduced to testing the functionality of a certain rational relation (i.e., a relation accepted by a word transducer), a property which can be decided in polynomial time [2].
The relation R_parse containing all pairs (α, w_p) such that p is a parse of α is rational. Indeed, R_parse is the restriction of the rational relation π^{−1} to the regular image L(C)*R. As rational relations are closed under restriction to a regular image (or domain), it follows that R_parse is a rational relation.
To check unicity of the parse, it is enough to show that R_parse is functional when its domain is restricted to non-simple words. By Definition 5, the set of simple words is a regular set, and hence so is the set of non-simple words. As rational relations are closed under restriction to a regular domain, we can construct a transducer for R_parse restricted to non-simple words and check its functionality with an automata framework.
A direct but rather tedious proof of this statement can be found in [4].
Thus, we will refer to the unique parse of a non-simple infix α of a coded word as the parse of α.
The next lemma states that occurrences of codings of symbols of order strictly greater than one in a coded word can be related to occurrences of these symbols in the word that was coded.

Lemma 9
Let α be a non-simple infix of some coded word and let ( , u, r) be its parse.
If α contains n > 1 occurrences of ψ(x) for some x ∈ Σ \ Σ_1, then u contains n occurrences of x.
Proof Let α be a non-simple infix of some coded word and let p = (ℓ, u, r) be its parse. Let x be a letter in Σ \ Σ_1 such that α contains n > 1 occurrences of ψ(x).
By definition of the parse p, we have α = ℓψ(u)r. If we write u = u_1 · · · u_m with m ≥ 0 and u_i ∈ Σ for all i ∈ [1, m], we can write α as α_0 α_1 · · · α_m α_{m+1}, where α_0 = ℓ, α_i = ψ(u_i) for all i ∈ [m], and α_{m+1} = r.
Let m_1 < · · · < m_n be an enumeration of the n occurrences of ψ(x) in α. For all i ∈ [n], we denote by q_i the maximal integer satisfying m_i ≥ Σ_{j=0}^{q_i−1} |α_j|.

Claim For all i ∈ [n], we have 0 < q_i ≤ m and m_i = Σ_{j=0}^{q_i−1} |α_j|.

Proof of the claim Let i ∈ [1, n].
Let us first show that q_i ≠ 0. Assume towards a contradiction that q_i = 0. By maximality of q_i, α_0 = ℓ cannot be empty. This implies that ψ(x) can be written as ℓ′α_1 · · · α_k r′ with ℓ′ a non-empty suffix of ℓ, k ≥ 0, and r′ a prefix of α_{k+1}. As C is an infix code, k is necessarily equal to 0. Hence ψ(x) = ℓ′r′. In particular, ψ(x) ∈ LR ∩ C. In Lemma 5, we remarked that LR ∩ C = {0000, 1111}, which brings a contradiction with the fact that x ∈ Σ \ Σ_1.
Let us next show that q_i ≠ m + 1. Assume towards a contradiction that q_i = m + 1. In this case, r ∈ R would contain ψ(x) as an infix, which contradicts the fact that C is an infix code.
Let us finally show that m_i ≤ Σ_{j=0}^{q_i−1} |α_j| (and thus m_i = Σ_{j=0}^{q_i−1} |α_j|). Assume towards a contradiction that m_i > Σ_{j=0}^{q_i−1} |α_j|. By definition of q_i, this implies that ψ(x) can be written as ℓ′α_{q_i+1} · · · α_{q_i+k} r′ with ℓ′ a non-empty suffix of α_{q_i}, k ≥ 0, and r′ a prefix of α_{q_i+k+1}. As C is an infix code, k is necessarily equal to 0. Hence ψ(x) = ℓ′r′. In particular, ψ(x) ∈ LR ∩ C. As above, we have from Lemma 5 that LR ∩ C = {0000, 1111}, which brings a contradiction with the fact that x is not of order 1.

End of the proof of the claim
Using the claim, it follows that either ψ(x) is a prefix of ψ(u_{q_i}) or, conversely, ψ(u_{q_i}) is a prefix of ψ(x). As C is an infix code, this is only possible if ψ(x) = ψ(u_{q_i}), and hence u_{q_i} = x.
Again using the claim, we have that q 1 < · · · < q n (as m 1 < · · · < m n ). Hence we have shown that u contains at least n occurrences of x. Clearly, u cannot contain more than n occurrences of x as each occurrence of x in u induces an occurrence of ψ(x) in α.
Let w = w_0 · · · w_{|w|−1} ∈ Σ* and let p = (ℓ, u = u_0 · · · u_{|u|−1}, r) be a parse. An occurrence of p in w is an occurrence m of u in w such that, whenever ℓ is non-empty, we have m ≠ 0 and ℓ is a suffix of ψ(w_{m−1}), and, similarly, whenever r is non-empty, we have m + |u| < |w| and r is a prefix of ψ(w_{m+|u|}).

Remark 1
In the previous lemma, the requirement that the order of the symbol be strictly greater than 1 is necessary. For instance, consider the coded word w = ψ(0_2 0_3) = 00010000010100. If we take α to be w, which is non-simple, then α contains ψ(0_1) = 0000 as an infix, but 0_1 does not occur in the center of its unique parse (ε, 0_2 0_3, ε).
The next lemma shows that for a word w ∈ Σ* and a non-simple infix α of ψ(w), there is a one-to-one correspondence between the occurrences of α in ψ(w) and the occurrences of its parse p_α in w.

Lemma 10 For any word w ∈ * and any non-simple infix α of ψ(w), there is a unique order-preserving bijection between the occurrences of α in ψ(w) and the occurrences of its parse p in w.
Proof Let w = w_0 w_1 · · · w_{n−1}, n ≥ 1, be a non-empty word over Σ and let α be a non-simple infix of the word ψ(w). Consider the unique parse p = (ℓ, u, r) of the infix α. To each occurrence m of the parse p in w, we associate the occurrence ρ(m) = (Σ_{i=0}^{m−1} |ψ(w_i)|) − |ℓ| of the word α in ψ(w). The mapping ρ, from the set of occurrences of p in w to the set of occurrences of α in ψ(w), is order-preserving and injective. It remains to show that it is surjective.
Let h be an occurrence of α in ψ(w). By definition, there exist two words x, y ∈ {0, 1}* such that ψ(w) = xαy and |x| = h. Consider the greatest integer m_0 ∈ [0, n − 1] such that We will show that m_0 is an occurrence of the parse p in w, and hence that ρ(m_0) = h. Remarking that ψ(w) is equal to both xαy and ψ(w_0) · · · ψ(w_{n−1}), there must exist k ≥ 0 such that α = ℓ′ψ(w_{m_0}) · · · ψ(w_{m_0+k−1})r′, where It follows that (ℓ′, w_{m_0} w_{m_0+1} · · · w_{m_0+k−1}, r′) is a parse of α occurring at m_0 in w. The lemma now follows from the unicity of the parse.
By definition, the context c of some occurrence of a parse p = (ℓ, u, r) in w is an infix of w that itself contains u as an infix. Moreover, the value α of p is an infix of ψ(c).

Upper Bound on the Zimin Index
We are now ready to upper-bound the Zimin index of the codes of higher-order counters. Due to the nature of our coding ψ, we need to prove a slightly stronger inductive statement that takes into account the code of a symbol of order n + 1 directly before or directly after the code of a counter of order n.

Theorem 8 For all n ≥ 2 and for all
Proof We proceed by induction on n. For the cases n = 2 and n = 3, the property is checked using a computer program.
Remark that the reason we start the induction at n = 3 is to be able to apply the upper bound from Lemma 7.
For the induction step, assume that the property holds for some n ≥ 3 and let us show that it holds for n + 1. Let i ∈ [0, τ(n + 1) − 1]; we have to show that We start by showing that Zimin( for some non-empty α and β. It is enough to show that ZType(α) ≤ n + 1. By Lemma 7, we only need to consider the case when α is non-simple. Let p = (ℓ, u, r) be the parse of α, whose uniqueness is guaranteed by Lemma 8. We distinguish cases depending on the number of occurrences of a symbol of order n + 1 in c_1.

If c 1 does not contain any symbol of order n +1
where the last inequality follows from the induction hypothesis. From now on, we assume that both x and y are non-empty. In particular, the center u of the parse p = (ℓ, u, r) contains b and can therefore be uniquely written as u = xby. In summary, we have c_1 = sxbyt for some s and t such that: s = ε if ℓ = ε, and otherwise s ∈ Σ with ℓ a suffix of ψ(s).
t = ε if r = ε, and otherwise t ∈ Σ with r a prefix of ψ(t).

Claim 1
The context c_2 (of the second occurrence of α) is equal to c_1.
Proof of Claim 1 Similarly as for c_1, the context c_2 can be written as s′xbyt′ for some s′ and t′ such that: s′ = ε if ℓ = ε, and otherwise s′ ∈ Σ with ℓ a suffix of ψ(s′).
t′ = ε if r = ε, and otherwise t′ ∈ Σ with r a prefix of ψ(t′).
Towards a contradiction, assume that c_1 and c_2 are different. Then either s ≠ s′ or t ≠ t′. As both cases can be shown analogously, we only consider the first one and assume that s ≠ s′. In particular, without loss of generality, we may assume that ℓ is non-empty.
The symbols s and s′ occur in [[i]]_{n+1} at the same distance from an order-(n + 1) symbol and, by Lemma 3, must have the same order. Furthermore, the last symbol of their encodings by ψ is the same (it is the last symbol of ℓ). By the definition of ψ, s and s′ are then either both from {0_k | k ≥ 1} or both from {1_k | k ≥ 1}. This proves that s and s′ are equal, which brings the contradiction.

End of the proof of Claim 1
As c_1 = c_2 and as b belongs to the center u of the parse, the infix α can be written as x̃ψ(b)ỹ, where x̃ is a suffix of ψ(x) and ỹ is a prefix of ψ(y).

Proof of Claim 2
We proceed along the same lines as in the proof of Theorem 4. Consider the morphism ϕ that erases all symbols in Σ_{n−1} and replaces 0_n and 1_n by 0 and 1, respectively. That is, we can write ϕ(x) and ϕ(y) as follows. With the same proof as in Theorem 4, we show that By the same reasoning as in the proof of Theorem 4, there exist j_0 ∈ [0, τ(n) − 1] and a non-empty ξ such that [[j_0]]_n = yξx. By applying ψ and recalling that x̃ is a suffix of ψ(x) and ỹ is a prefix of ψ(y), we can conclude.

End of the proof of Claim 2
Let us now consider an arbitrary decomposition of α as δγ δ for non-empty δ and γ . Recall that it is enough to show that ZType(α) ≤ n + 1 or that ZType(δ) ≤ n.
There are several cases to consider depending on how the two decompositions x̃ψ(b)ỹ and δγδ overlap.

Case 1 |x̃ψ(b)| ≤ |δ|.
This situation cannot occur under our hypothesis. Indeed, α would contain two occurrences of ψ(b), which by Lemma 9 implies that the center of its parse contains two occurrences of the order-(n + 1) symbol b. This brings a contradiction with the fact that the context c_1 contains exactly one symbol of order n + 1.
As ỹχx̃z_1 is equal to γ_2 δχδ, we have that δχδ is an infix of a word (namely {{j_0}}_n z_1) of Zimin index at most n + 1. By Fact 1, this implies that ZType(δ) ≤ n, which concludes the case.
We will now show that x = y is empty. Towards a contradiction, assume that x is not empty. We recall that c_1 = xby, where x ∈ Σ*_n is a suffix of [[k_0]]_n and y ∈ Σ*_n is a prefix of [[k_0 + 1]]_n. Since x = sx, it follows that x is a suffix of [[k_0]]_n. By definition of [[k_0]]_n, we have that x ends with an order-n symbol. But x = y is also a prefix of [[k_0 + 1]]_n (which contains an order-n symbol) and hence starts with which brings the contradiction.
By the induction hypothesis, ψ(b){{j_0}}_n has Zimin index at most n + 1. In particular, z_2ỹχx̃, which is a suffix, also has Zimin index at most n + 1.
As z_2ỹχx̃ is equal to δχδγ_1, we have that δχδ is an infix of the word z_2ỹχx̃ = z_2{{j_0}}_n of Zimin index at most n + 1. By Fact 1, this implies that ZType(δ) ≤ n, which concludes the case.

Case 6: |x̃| > |δγ|.
This situation cannot occur under our hypothesis. Indeed, α would contain two occurrences of ψ(b), which by Lemma 9 implies that the center of its parse contains two occurrences of the order n + 1 symbol b. This contradicts the fact that the context c_1 contains exactly one symbol of order n + 1.
We have shown that for all i ∈ [0, τ(n + 1) − 1] the desired bound holds. Let us now show that it also holds for all b ∈ {0_{n+2}, 1_{n+2}}. We first consider the case of an infix αβα of ψ(b){{i}}_{n+1} for some non-empty α and β. By Fact 1, it is enough to show that ZType(α) ≤ n + 1. By Lemma 7, it is enough to consider the case when α is non-simple, and hence, by Lemma 8, α admits a unique parse p = (ℓ, u, r).
As αβα is an infix of ψ(b){{i}}_{n+1}, there exist z_1 and z_2 such that ψ(b){{i}}_{n+1} = z_1αβαz_2. We distinguish the different possible lengths of z_1.
In that case, the word would contain two occurrences of the order n + 2 symbol b, which brings the contradiction.
Note that this will be sufficient since we can then apply the induction hypothesis to conclude that ZType(α) ≤ n + 1.
As 1 ≤ |z_1| < |ψ(b)|, the parse p = (ℓ, u, r) is such that ℓ is a non-empty suffix of ψ(b) and u is a prefix of {{i}}_{n+1}. It remains to show that ℓ is a suffix of ψ(b′) for some b′ ∈ {0_{n+1}, 1_{n+1}}. As there are two occurrences of the parse p in ψ(b){{i}}_{n+1}, ℓ is a suffix of both ψ(b) and of some ψ(b′) for a symbol b′ of order k ≤ n + 1. From the definition of ψ, it follows that ℓ is a suffix of (01)^{k−1}00 or of (01)^{k−1}11. Hence, as announced, ℓ is a suffix of the code of an order n + 1 symbol.
This establishes the bound for infixes of ψ(b){{i}}_{n+1}. Remark that, as the definition of higher-order counters is not symmetrical with respect to left-right and right-left, this case is not identical to the previous one. Let αβα be an infix of {{i}}_{n+1}ψ(b) for some non-empty α and β. By Fact 1, it is enough to show that ZType(α) ≤ n + 1. By Lemma 7, it is enough to consider the case when α is non-simple, and hence, by Lemma 8, α has a unique parse p = (ℓ, u, r).
As αβα is an infix of {{i}}_{n+1}ψ(b), there exist z_1 and z_2 such that {{i}}_{n+1}ψ(b) = z_1αβαz_2. We distinguish cases on the length of z_2.
We now distinguish cases on the length of αz_2.
In this case, the parse p = (ℓ, u, r) of α is such that r is a non-empty prefix of ψ(b) and u ends with the order n + 1 symbol b′. We have established that α is a suffix of {{max_n − 1}}_n ψ(b′)r with |r| < 4. It remains to prove that ZType(α) ≤ n + 1.
Consider a decomposition of α as δγδ for some non-empty δ and γ. Assume towards a contradiction that |δ| ≥ |ψ(b′)r|. In this case, ψ(b′) is an infix of δ, and hence α would have two occurrences of ψ(b′). By Lemma 9, the center of α's parse would contain two order n + 1 symbols, which contradicts the fact that α is a suffix of {{max_n − 1}}_n ψ(b′)r, which has precisely one occurrence of the code of an order n + 1 symbol.
This case cannot occur under our assumptions. Indeed, this would imply that the center u of the parse p = (ℓ, u, r) of α contains [[max_n − 1]]_n. As the parse p has at least two occurrences, [[max_n − 1]]_n would occur twice, which contradicts Lemma 2.

Avoiding Zimin Patterns in the Abelian Sense
Matching a pattern in the abelian sense is a weaker condition, where one only requires that all infixes matching the same pattern variable have the same number of occurrences of each letter (instead of being the same word). Hence, for two words x, y ∈ A*, we write x ≡ y if |x|_a = |y|_a for all a ∈ A. Let ρ = ρ_1 ··· ρ_n be a pattern, where ρ_i ∈ X is a pattern variable for all i ∈ [n]. An abelian factorization of a word w ∈ A* for the pattern ρ is a factorization w = w_1 ··· w_n such that w_i ≠ ε for all i ∈ [n] and ρ_i = ρ_j implies w_i ≡ w_j for all i, j ∈ [n]. A word w ∈ A* matches the pattern ρ in the abelian sense if there is an abelian factorization of w for ρ. The definitions of when a word encounters a pattern in the abelian sense and when a pattern is unavoidable in the abelian sense are as expected.
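These definitions are easy to make concrete. The following sketch (an illustration of ours, not from the text; the function names are hypothetical) checks abelian matching by comparing Parikh images of candidate blocks:

```python
from collections import Counter

def abelian_equiv(x, y):
    """x ≡ y iff both words have the same number of occurrences of each letter."""
    return Counter(x) == Counter(y)

def matches_abelian(word, pattern):
    """Check whether `word` matches `pattern` (a list of variable names) in the
    abelian sense: a factorization into non-empty blocks where equal pattern
    variables receive abelian-equivalent blocks."""
    n = len(pattern)

    def go(pos, i, assigned):
        # assigned maps each variable to the Parikh image of its blocks
        if i == n:
            return pos == len(word)
        for end in range(pos + 1, len(word) + 1):
            block = Counter(word[pos:end])
            v = pattern[i]
            if v in assigned and assigned[v] != block:
                continue  # conflicting Parikh image for variable v
            prev = assigned.get(v)
            assigned[v] = block
            if go(end, i + 1, assigned):
                return True
            # backtrack
            if prev is None:
                del assigned[v]
            else:
                assigned[v] = prev
        return False

    return go(0, 0, {})
```

For example, `abba` matches `xx` in the abelian sense (via the blocks `ab` and `ba`, which have the same Parikh image) although it does not match `xx` in the usual sense.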
We note that every pattern that is unavoidable is in particular unavoidable in the abelian sense. However, the converse does not hold in general, as witnessed by the pattern xyzxyxuxyxzyx, as shown in [9].
To the best of the authors' knowledge, abelian unavoidability still lacks a characterization in the style of general unavoidability in terms of Zimin patterns; we refer to [8] for some open problems and conjectures. Although possibly less meaningful than for general unavoidability, the analogous Ramsey-like function for abelian unavoidability has been studied: let g(n, k) denote the minimal length such that every word of length g(n, k) over a k-letter alphabet encounters the n-th Zimin pattern in the abelian sense.
Clearly, g(n, k) ≤ f(n, k), and to the best of the authors' knowledge no elementary upper bound has been shown for g so far. By applying a combination of the probabilistic method [1] and analytic combinatorics [11], Tao showed the following lower bound for g.
Theorem 9 (Tao [22], Corollary 3) Let k ≥ 4. Then g(n, k) is bounded from below by an explicit expression. Unfortunately, it was not clear to the authors what the asymptotic behavior of this lower bound is. However, Jugé [13] provided us with an estimate of its asymptotic behavior.
Corollary 2 (Jugé [13]) Let k ≥ 4. The expression in Theorem 9, and hence g(n, k), is lower-bounded doubly-exponentially in n. In Section 5.1 we prove another doubly-exponential lower bound on g by applying the first moment method [1]. Our lower bound on g is not as good as the one obtained by combining Theorem 9 with Corollary 2, but its proof seems more direct (already more direct than the proof of Theorem 9 itself). The proof follows a similar strategy as the (slightly better) doubly-exponential lower bound for f from [6] but, again, seems more direct.
Our novel contribution is to provide a doubly-exponential upper bound on g in Section 5.2. Note that Tao in [22] only provides a non-elementary upper bound for the non-abelian case.

A Simple Lower Bound via the First-Moment Method
For all n ≥ 1, let X_n = {x_1, …, x_n} denote the set of the first n pattern variables. We note that the variable x_i appears precisely 2^{n−i} times in Z_n and that its first occurrence is at position 2^{i−1} for all i ∈ [1, n]. An abelian occurrence of Z_n in a word w is a pair (j, λ) ∈ [0, |w| − 1] × ℕ_+^{X_n} for which there is a factorization w = uvz with |u| = j and an abelian factorization v_1 ··· v_{2^n−1} of v for Z_n satisfying λ(x_i) = |v_{2^{i−1}}| for all i ∈ [n].
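These structural facts about Z_n are easy to verify computationally. The following sketch (our own illustration; `zimin` is a hypothetical helper) builds Z_n as a list of variable indices and checks the occurrence counts and first positions stated above:

```python
def zimin(n):
    """Return the n-th Zimin pattern Z_n as a list of variable indices:
    Z_1 = [1] and Z_{m+1} = Z_m + [m+1] + Z_m."""
    z = [1]
    for i in range(2, n + 1):
        z = z + [i] + z
    return z

z = zimin(4)
assert len(z) == 2**4 - 1
for i in range(1, 5):
    # x_i appears exactly 2^(n-i) times ...
    assert z.count(i) == 2**(4 - i)
    # ... and first occurs at position 2^(i-1) (1-indexed)
    assert z.index(i) + 1 == 2**(i - 1)
```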
By applying the probabilistic method [1], we show a lower bound for g(n, k) that is doubly-exponential in n for every fixed k ≥ 2. The proof is similar to the lower bound proof from [6]. Proof For n, k, ℓ ≥ 1, let μ_{n,k,ℓ} denote the expected number of abelian occurrences of Z_n in a random word in the set [k]^ℓ. Remark that we always consider the uniform distribution over words. If μ_{n,k,ℓ} < 1, then by the probabilistic method [1] there exists a word of length ℓ over the alphabet [k] that does not encounter Z_n in the abelian sense; hence we can conclude g(n, k) > ℓ. Therefore we investigate those ℓ = ℓ(n, k) for which we can guarantee μ_{n,k,ℓ} < 1. We need two intermediate claims.

Proof of Claim 1
We show the claim only for m = 2; the case m > 2 can be shown analogously. Let A_{k,h} denote the event that two independent random words u and v in [k]^h satisfy u ≡ v. We claim that Pr(A_{k,h}) ≤ 1/k for all h ≥ 1. For every word w = w_1 ··· w_h ∈ [k]^h, write ⊕w for the sum w_1 + ··· + w_h modulo k. Let us fix any j ∈ [k]. Then we clearly have Pr[⊕w = j] = 1/k for a random word w ∈ [k]^h. Since u ≡ v implies ⊕u = ⊕v, we obtain Pr(A_{k,h}) ≤ Pr(⊕u = ⊕v) = 1/k. End of the proof of Claim 1.
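The bound Pr(A_{k,h}) ≤ 1/k can be checked exhaustively for small parameters. The following sketch (an illustrative check of ours, not part of the proof) counts abelian-equivalent pairs directly:

```python
from itertools import product
from collections import Counter

def count_equiv_pairs(k, h):
    """Exhaustively count pairs (u, v) of words in [k]^h with u ≡ v,
    together with the total number of pairs, so that the ratio can be
    compared against the bound 1/k."""
    words = list(product(range(k), repeat=h))
    equiv_pairs = sum(1 for u in words for v in words
                      if Counter(u) == Counter(v))
    total_pairs = len(words) ** 2
    return equiv_pairs, total_pairs

# For k = 3, h = 2: abelian classes have sizes 1, 1, 1, 2, 2, 2,
# so the number of equivalent pairs is 3·1 + 3·4 = 15 out of 81.
eq, tot = count_equiv_pairs(3, 2)
assert eq * 3 <= tot  # Pr(u ≡ v) <= 1/k
```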
Recall that Z_n = y_1 ··· y_{2^n−1}, where y_i ∈ {x_1, …, x_n} for all i ∈ [2^n − 1], and that the variable x_i appears precisely 2^{n−i} times in Z_n. We recall that we would like to bound the expected number of occurrences (in the abelian sense) of Z_n in a random word of length ℓ over the alphabet [k]. To account for this, we define for each mapping λ : X_n → ℕ_+ its width as width(λ) = Σ_{i=1}^{n} 2^{n−i} · λ(x_i). For every word v of length width(λ), its (unique) decomposition with respect to λ is the unique factorization v = v_1 ··· v_{2^n−1} such that y_j = x_i implies |v_j| = λ(x_i) for all j ∈ [2^n − 1] and all i ∈ [n].
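The width of λ and the induced decomposition can be sketched as follows (illustrative code of ours; the names are hypothetical and λ is represented as a dict from variable index to block length):

```python
def zimin(n):
    """Z_1 = [1], Z_{m+1} = Z_m + [m+1] + Z_m, as a list of variable indices."""
    z = [1]
    for i in range(2, n + 1):
        z = z + [i] + z
    return z

def width(lam, n):
    """width(λ) = Σ_{i=1}^{n} 2^(n-i) · λ(x_i)."""
    return sum(2**(n - i) * lam[i] for i in range(1, n + 1))

def decompose(v, lam, n):
    """Cut v (with |v| = width(λ)) into the unique factorization where the
    j-th block has length λ(x_i) whenever the j-th letter of Z_n is x_i."""
    blocks, pos = [], 0
    for var in zimin(n):
        blocks.append(v[pos:pos + lam[var]])
        pos += lam[var]
    assert pos == len(v)  # the block lengths exactly exhaust v
    return blocks
```

For instance, with n = 3 and λ = (2, 1, 3) the width is 4·2 + 2·1 + 1·3 = 13, and any word of length 13 splits into blocks of lengths 2, 1, 2, 3, 2, 1, 2 following Z_3 = x_1x_2x_1x_3x_1x_2x_1.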
Claim 2 Let λ : X_n → ℕ_+ with d = width(λ), and let B_λ denote the event that for a random word from [k]^d, the pair (0, λ) is an occurrence of Z_n in the abelian sense. Then Pr(B_λ) ≤ k^{n−2^n+1}.
Proof of Claim 2 Let λ : X_n → ℕ_+ with d = width(λ). For i ∈ [n], let j_1^{(i)} < ··· < j_{2^{n−i}}^{(i)} be an enumeration of the 2^{n−i} indices corresponding to occurrences of x_i in Z_n. For all i ∈ [n], consider the event B_λ^{(i)} that a random word of [k]^d has its decomposition with respect to λ of the form v_1 ··· v_{2^n−1} such that the words v_{j_1^{(i)}}, …, v_{j_{2^{n−i}}^{(i)}} are pairwise abelian-equivalent. By Claim 1, Pr(B_λ^{(i)}) ≤ (1/k)^{2^{n−i}−1}. As these events concern disjoint blocks, they are independent, and hence Pr(B_λ) ≤ Π_{i=1}^{n} (1/k)^{2^{n−i}−1} = k^{−Σ_{i=1}^{n} 2^{n−i} + n} = k^{n−2^n+1}. (6) End of the proof of Claim 2.
It is clear that for every (j, λ) with d = width(λ) and j + d ≤ ℓ, the probability that (j, λ) is an occurrence in a random word from [k]^ℓ equals the probability that (0, λ) is such an occurrence, and therefore equals Pr(B_λ). Thus, this probability does not depend on j.
We are now ready to prove an upper bound for μ_{n,k,ℓ}, where we note that any occurrence (j, λ) of Z_n in a random word of length ℓ must satisfy width(λ) ≥ 2^n − 1.
We finally determine the largest value of ℓ that still guarantees μ_{n,k,ℓ} < 1.

A Doubly-Exponential Upper Bound
Let us finally prove an upper bound for g(n, k) that is doubly-exponential in n.
For the induction step, let n ≥ 1 and assume the induction hypothesis for g(n, k).
To determine an upper bound on g(n + 1, k), we consider any sufficiently long word w ∈ [k]^+ that we can factorize as w = w_1 a_1 w_2 a_2 ··· w_m a_m z, where |w_j| = g(n, k) and a_j ∈ [k] for all j ∈ [m], and z ∈ [k]^*, where m is assumed sufficiently large for the following arguments to work. By induction hypothesis, for all j ∈ [m], the word w_j encounters Z_n in the abelian sense, witnessed by some infix v_j and some abelian factorization v_j = v_j^{(1)} ··· v_j^{(2^n−1)} for Z_n. To each such abelian factorization we can assign the Parikh image describing how the word v_j matches each variable x_i (with i ∈ [n]) that appears in Z_n. Formally, each of the above abelian factorizations v_j = v_j^{(1)} ··· v_j^{(2^n−1)} induces a mapping ψ_j : X_n → ℕ^{[k]} such that ψ_j(x_i)(t) = |v_j^{(2^{i−1})}|_t for all j ∈ [m], all i ∈ [n] and all t ∈ [k]. As expected, we write ψ_j ≡ ψ_h if ψ_j(x_i) = ψ_h(x_i) for all i ∈ [n]. Note that if there are distinct j, h ∈ [1, m] with ψ_j ≡ ψ_h, then clearly w encounters Z_{n+1} = Z_n x_{n+1} Z_n in the abelian sense.
Let us therefore estimate a sufficiently large bound on m such that there are always two distinct indices j, h ∈ [1, m] that satisfy ψ_j ≡ ψ_h.
It is easy to see that there are at most g(n, k)^{kn} different equivalence classes for the ψ_j with respect to ≡. Therefore, by setting m = g(n, k)^{kn} + 1, we have shown g(n + 1, k) ≤ (g(n, k) + 1)(g(n, k)^{kn} + 1).
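Iterating this recurrence makes the doubly-exponential growth of the resulting bound visible. A minimal sketch, assuming the base case g(1, k) = 1 (every non-empty word matches Z_1 = x_1):

```python
def g_bound(n, k):
    """Upper bound on g(n, k) obtained by iterating the recurrence
    g(m+1, k) <= (g(m, k) + 1) * (g(m, k)**(k*m) + 1),
    starting from g(1, k) = 1."""
    g = 1
    for m in range(1, n):
        g = (g + 1) * (g**(k * m) + 1)
    return g

# For k = 2 the first values of the bound are 1, 4, 1285, ...
print([g_bound(n, 2) for n in range(1, 4)])
```

The bound g_bound(n, k) grows roughly like a power tower of height two in n, consistent with the doubly-exponential upper bound claimed for g.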

Conclusion
We have established a lower bound for f(n, k) that is already non-elementary when k = 2. A first element of an answer is that the first moment method used in [6] cannot be used to obtain a lower bound that is asymptotically above doubly-exponential. Indeed, for a length ℓ ≥ k^{2^n−n−1} + 2^n, the expected number μ_{n,k,ℓ} of occurrences of Z_n in a random word in [k]^ℓ is greater than 1.
To see this, recall that |Z_n| = 2^n − 1, and hence there is at most one possible occurrence of Z_n in any word of length 2^n − 1. Let A_n denote the event that Z_n is encountered in a random word in [k]^{2^n−1}. We have Pr(A_n) = Π_{i=1}^{n} (1/k)^{2^{n−i}−1} = k^{−2^n+n+1}.
Assume that ℓ ≥ k^{2^n−n−1} + 2^n. For each i ∈ [0, k^{2^n−n−1}], let X_i be the indicator random variable marking that the infix of a random word in [k]^ℓ occurring at position i and of length 2^n − 1 matches Z_n. By linearity of expectation, it follows that μ_{n,k,ℓ} ≥ Σ_{i=0}^{k^{2^n−n−1}} E(X_i) = (k^{2^n−n−1} + 1) Pr(A_n) = 1 + k^{−(2^n−n−1)} > 1.
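This computation can be verified with exact rational arithmetic for small n and k (an illustrative check using our own helper names):

```python
from fractions import Fraction

def pr_An(n, k):
    """Pr(A_n) = Π_{i=1}^{n} (1/k)^(2^(n-i)-1), which equals k^(-2^n + n + 1)."""
    p = Fraction(1)
    for i in range(1, n + 1):
        p *= Fraction(1, k) ** (2**(n - i) - 1)
    return p

for n in range(1, 6):
    for k in range(2, 5):
        # the closed form k^(-2^n + n + 1) matches the product
        assert pr_An(n, k) == Fraction(1, k**(2**n - n - 1))
        # with ℓ ≥ k^(2^n - n - 1) + 2^n, the expected number of
        # occurrences is at least (k^(2^n - n - 1) + 1) · Pr(A_n) > 1
        assert (k**(2**n - n - 1) + 1) * pr_An(n, k) > 1
```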
Thus, more advanced probabilistic techniques are necessary. Indeed, very recently [5], Condon, Fox and Sudakov have applied the local lemma to obtain non-elementary lower bounds on f(n, k).
For the abelian case, an explicit family of words witnessing the doubly-exponential lower bound seems worth investigating.