Word Equations in Nondeterministic Linear Space

Satisfiability of word equations is an important problem in the intersection of formal languages and algebra: given two sequences consisting of letters and variables, we are to decide whether there is a substitution for the variables that turns this equation into a true equality of strings. The computational complexity of this problem remains unknown, with the best lower and upper bounds being, respectively, NP and PSPACE. Recently, the novel technique of recompression was applied to this problem, simplifying the known proofs and lowering the space complexity to (nondeterministic) $O(n \log n)$. In this paper we show that satisfiability of word equations is in nondeterministic linear space, thus the language of satisfiable word equations is context-sensitive. We use the known recompression-based algorithm and additionally employ Huffman coding for letters. The proof, however, uses an analysis of how the fragments of the equation depend on each other as well as a new strategy for the nondeterministic choices of the algorithm, which uses several new ideas to limit the space occupied by the letters.


Introduction
Solving word equations has been an intriguing problem since the dawn of computer science, motivated first by its ties to Hilbert's 10th problem. Initially it was conjectured that this problem is undecidable, which was disproved in a seminal work of Makanin [10]. At first little attention was given to the computational complexity of Makanin's algorithm and of the problem itself; these questions were reinvestigated in the 1990s [6,18,9], culminating in the EXPSPACE implementation of Makanin's algorithm by Gutiérrez [5].
The connection to compression was first observed by Plandowski [16], who showed that a length-minimal solution of size $N$ has a compressed representation of size $\mathrm{poly}(n, \log N)$. Plandowski further explored this approach [14] and proposed a PSPACE algorithm [13], which is the best bound to date; a simpler PSPACE solution, also based on compression, was proposed by Jeż [8]. On the other hand, this problem is only known to be NP-hard, and it is conjectured that it is in NP.

While the computational complexity of word equations remains unknown, its exact space complexity is intriguing as well: Makanin's algorithm uses exponential space [5]; Plandowski [13] gave no explicit bound on the space usage of his algorithm, and a rough estimate is $\mathrm{NSPACE}(n^5)$; the recent solution of Jeż [8] yields $\mathrm{NSPACE}(n \log n)$. Moreover, for $O(1)$ variables a linear bound on space complexity was shown [8]; recall that languages recognisable in nondeterministic linear space are exactly the context-sensitive languages.
In this paper we show that satisfiability of word equations can be tested in nondeterministic linear space in terms of the number of bits of the input, thus showing that the language of satisfiable word equations is context-sensitive (and, by the famous Immerman-Szelepcsényi theorem, so is the language of unsatisfiable word equations). The employed algorithm is a variant of the algorithm of Jeż [8], which additionally uses Huffman coding for letters in the equation. On the other hand, the actual proof uses a different encoding of letters, which extends the ideas used in a (much simpler) proof in the case of $O(1)$ variables [8, Section 5]. The other new ingredient is a different strategy of compression: roughly speaking, previously a strategy that minimised the length of the equation was used. Here, a more refined strategy is used: it simultaneously minimises the size of a particular bit encoding, enforces that changes in the equation (during the algorithm) are local, and limits the number of new letters that are introduced to the equation.
The bound holds when letters and variables in the input are encoded using an arbitrary encoding, in particular, the Huffman coding (so the most efficient one) is allowed.

The (known) algorithm
We first present a slight variation of the algorithm of Jeż [8] and the notions necessary to understand how it works. The proofs are omitted, yet they should be intuitively clear.

Notions.
A word equation is a pair $(U, V)$, written as $U = V$, where $U, V \in (\Gamma \cup \mathcal{X})^*$ and $\Gamma$ and $\mathcal{X}$ are disjoint alphabets of letters and variables; both are collectively called symbols. By $n_X$ we denote the number of occurrences of $X$ in the (current) equation; in the algorithm $n_X$ does not change until $X$ is removed from the equation, in which case $n_X$ becomes 0. A substitution is a morphism $S : \mathcal{X} \cup \Gamma \to \Gamma'^*$, where $\Gamma' \supseteq \Gamma$ and $S(a) = a$ for every $a \in \Gamma$; a substitution naturally extends to $(\mathcal{X} \cup \Gamma)^*$. A solution of an equation $U = V$ is a substitution $S$ such that $S(U) = S(V)$; given a solution $S$ of an equation $U = V$ we call $S(U)$ the solution word. We allow the solution to use letters that are not present in the equation; this does not change satisfiability: all such letters can be changed to a fixed letter from $\Gamma$, and the obtained substitution is still a solution. Yet the proofs become easier when we allow the usage of such letters. The alphabet $\Gamma'$ is usually given implicitly, as the set of letters used by the substitution. A block is a string $a^\ell$ with $\ell \ge 1$ that cannot be extended to the left nor to the right with $a$.
As we deal with linear space, the encoding used by the input equation matters. We assume only that the input is given by a fixed (uniquely decodable) coding: each symbol in the input is always given by the same bitstring, and given the bitstrings representing the sides of the equation there is only one pair of strings (over $\Gamma \cup \mathcal{X}$) that is encoded in this way. It is folklore that among such codes the Huffman code yields the smallest space consumption (counted in bits); moreover, the Huffman coding can be efficiently computed, also in linear space. As we focus on space counted in bits and use encodings, by $\|\alpha\|$ we denote the space consumption of the encoding of $\alpha$; the encoding shall always be clear from the context. Furthermore, whenever we talk about space complexity, it is counted in bits.
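The notions above can be illustrated by a minimal Python sketch. The representation (sides as lists of one-character symbols, a substitution as a dictionary) and the names `apply_substitution`, `is_solution` and `occurrences` are assumptions made only for this illustration; they are not part of the algorithm.

```python
from typing import Dict, List

# Hypothetical convention for this sketch: the keys of S are the variables,
# every other symbol is a letter and is mapped to itself.


def apply_substitution(side: List[str], S: Dict[str, str]) -> str:
    """Extend the substitution S to one side of the equation."""
    return "".join(S[sym] if sym in S else sym for sym in side)


def is_solution(U: List[str], V: List[str], S: Dict[str, str]) -> bool:
    """S is a solution of U = V when S(U) and S(V) are the same string."""
    return apply_substitution(U, S) == apply_substitution(V, S)


def occurrences(U: List[str], V: List[str], X: str) -> int:
    """The quantity n_X: the number of occurrences of X in the equation."""
    return (U + V).count(X)


# The equation  X a b = a b X  over letters {a, b} and one variable X.
U = ["X", "a", "b"]
V = ["a", "b", "X"]
good = {"X": "ab"}   # S(U) = S(V) = "abab", so "abab" is the solution word
bad = {"X": "b"}     # S(U) = "bab" differs from S(V) = "abb"
```

Here `is_solution(U, V, good)` holds while `is_solution(U, V, bad)` does not, and `occurrences(U, V, "X")` is 2.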

Nondeterministic Linear Space. We recall some basic facts about nondeterministic space-bounded computation. A nondeterministic procedure is sound when, given an unsatisfiable word equation $U = V$, it cannot transform it to a satisfiable one, regardless of the nondeterministic choices; a procedure is complete if, given a satisfiable equation $U = V$, for some nondeterministic choices it returns a satisfiable equation $U' = V'$. A composition of sound (complete) procedures is sound (complete, respectively). It is enough that we show the linear-space bound for one particular computation: as the bound is known, we limit the space available to the algorithm and reject the computations exceeding it.
The algorithm. We use (a variant of) the recompression algorithm [8], which conceptually applies the following two operations on $S(U)$ and $S(V)$: given a string $w$ and alphabet $\Gamma$, the $\Gamma$ block compression of $w$ is a string $w'$ obtained by replacing every block $a^\ell$, where $a \in \Gamma$ and $\ell \ge 2$, with a fresh letter $a_\ell$; the $(\Gamma_\ell, \Gamma_r)$ pair compression of $w$, where $\Gamma_\ell, \Gamma_r$ is a partition of $\Gamma$, is a string $w'$ obtained by replacing every occurrence of a pair $ab \in \Gamma_\ell\Gamma_r$ with a fresh letter $c_{ab}$. A fresh letter means that it is not currently used in the equation, nor in $\Gamma$, yet each occurrence of a fixed $ab$ is replaced with the same letter. The $a_\ell$ and $c_{ab}$ are just notational conventions; the actual letters in $w'$ do not store the information on how they were obtained. For shortness, we call the $\Gamma$ block compression the $\Gamma$ compression or block compression, when $\Gamma$ is clear from the context; a similar convention applies to the $(\Gamma_\ell, \Gamma_r)$ pair compression, called $(\Gamma_\ell, \Gamma_r)$ compression or pair compression, when $(\Gamma_\ell, \Gamma_r)$ is clear from the context. We say that a pair $ab \in \Gamma_\ell\Gamma_r$ is covered by the partition $\Gamma_\ell, \Gamma_r$.
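The two compression operations on a plain string can be sketched as follows. This is only an illustration: words are Python lists, and the fresh letters $a_\ell$ and $c_{ab}$ are modelled by tuples `("blk", a, l)` and `("pair", a, b)`, which is an assumption of the sketch (the text stresses that actual letters do not store how they were obtained).

```python
from itertools import groupby


def block_compress(w, gamma):
    """Gamma block compression: replace each maximal block a^l (a in gamma, l >= 2)."""
    out = []
    for a, run in groupby(w):
        length = len(list(run))
        if a in gamma and length >= 2:
            out.append(("blk", a, length))   # stands for the fresh letter a_l
        else:
            out.extend([a] * length)
    return out


def pair_compress(w, left, right):
    """(left, right) pair compression: replace each pair ab with a in left, b in right.

    Since left and right are disjoint, occurrences of such pairs never overlap,
    so a single left-to-right scan finds all of them.
    """
    out, i = [], 0
    while i < len(w):
        if i + 1 < len(w) and w[i] in left and w[i + 1] in right:
            out.append(("pair", w[i], w[i + 1]))   # stands for c_ab
            i += 2
        else:
            out.append(w[i])
            i += 1
    return out


blocks = block_compress(list("aabcbbb"), {"a", "b", "c"})
pairs = pair_compress(list("abcab"), {"a"}, {"b"})
```

In this example `blocks` is `[("blk", "a", 2), "b", "c", ("blk", "b", 3)]` and `pairs` is `[("pair", "a", "b"), "c", ("pair", "a", "b")]`; equal pairs are mapped to equal fresh letters, as required.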
The intuition is that the algorithm aims at performing those compression operations on the solution word; to this end it modifies the equation a bit and then performs the compression operations on $U$ and $V$ (and conceptually also on the solution, i.e. on $S(X)$ for each variable $X$). Below we describe how this is performed on the equation.
BlockComp: For the equation $U = V$ and the alphabet $\Gamma$ of letters in this equation, for each variable $X$ we first guess the first and last letter of $S(X)$ as well as the lengths $\ell$, $r$ of the longest prefix consisting only of $a$, called the $a$-prefix, and of the $b$-suffix (defined similarly) of $S(X)$. Then we replace $X$ with $a^\ell X b^r$ (or $a^\ell b^r$ or $a^\ell$ when $S(X) = a^\ell b^r$ or $S(X) = a^\ell$); this operation is called popping the $a$-prefix and $b$-suffix. Then we perform the $\Gamma$ block compression on the equation (this is well defined, as we can treat variables as symbols from outside $\Gamma$).
PairComp: For the alphabet $\Gamma$, which will always be the alphabet of letters in the equation right before the block compression, we partition $\Gamma$ into $\Gamma_\ell$ and $\Gamma_r$ (in a way described in Section 3.2) and then for each variable $X$ guess whether $S(X)$ begins with a letter $b \in \Gamma_r$ and if so, replace $X$ with $bX$ or with $b$, when $S(X) = b$, and then do a symmetric action for the last letter and $\Gamma_\ell$; this operation is later referred to as popping letters. Then we perform the $(\Gamma_\ell, \Gamma_r)$ compression on the equation.
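A single PairComp step can be sketched in Python under a strong simplifying assumption: instead of guessing, the sketch is handed a solution $S$ and makes exactly the choices corresponding to it. The function names, the tuple model of fresh letters, and the dictionary representation of $S$ are hypothetical and serve only the illustration.

```python
def pair_compress(w, left, right):
    """Replace each pair ab with a in left and b in right by one fresh letter."""
    out, i = [], 0
    while i < len(w):
        if i + 1 < len(w) and w[i] in left and w[i + 1] in right:
            out.append(("pair", w[i], w[i + 1]))   # stands for c_ab
            i += 2
        else:
            out.append(w[i])
            i += 1
    return out


def pair_comp(U, V, S, left, right):
    """One PairComp step whose guesses correspond to the solution S.

    U, V are lists of symbols; S maps every variable to a list of letters.
    Returns the new equation together with the corresponding solution.
    """
    S = {X: list(w) for X, w in S.items()}
    prefix = {X: [] for X in S}
    suffix = {X: [] for X in S}
    for X in S:
        if S[X] and S[X][0] in right:     # S(X) begins with b in Gamma_r: pop it
            prefix[X] = [S[X].pop(0)]
        if S[X] and S[X][-1] in left:     # S(X) ends with a in Gamma_l: pop it
            suffix[X] = [S[X].pop()]

    def pop_letters(side):
        out = []
        for sym in side:
            if sym in S:                  # a variable; it is dropped once S(X) is empty
                out += prefix[sym] + ([sym] if S[sym] else []) + suffix[sym]
            else:
                out.append(sym)
        return out

    U2 = pair_compress(pop_letters(U), left, right)
    V2 = pair_compress(pop_letters(V), left, right)
    S2 = {X: pair_compress(w, left, right) for X, w in S.items() if w}
    return U2, V2, S2


def solution_word(side, S):
    out = []
    for sym in side:
        out += S[sym] if sym in S else [sym]
    return out


# X a b = a b X with S(X) = abab, compressed with Gamma_l = {b}, Gamma_r = {a}.
U, V = ["X", "a", "b"], ["a", "b", "X"]
S = {"X": list("abab")}
U2, V2, S2 = pair_comp(U, V, S, left={"b"}, right={"a"})
```

After popping, no pair from $\Gamma_\ell\Gamma_r$ straddles the boundary of a variable, which is why compressing the explicit letters and the substitution separately gives the same word as compressing $S(U)$ directly.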
LinWordEq works in phases, until an equation with both sides of length 1 is obtained: in a single phase it establishes the alphabet $\Gamma$ of letters in the equation, performs the $\Gamma$ compression and then repeats: guess the partition of $\Gamma$ into $\Gamma_\ell$ and $\Gamma_r$ and perform the $(\Gamma_\ell, \Gamma_r)$ compression, until each pair $ab \in \Gamma^2$ was covered by some partition.
Correctness. Given a solution $S$ we say that some nondeterministic choices correspond to $S$ if they are made as if LinWordEq knew $S$. For instance, it guesses correctly the first letter of $S(X)$ or whether $S(X) = \epsilon$. (The choice of a partition does not fall under this category.)
Lemma 1 ([8, Lemma 2.8 and Lemma 2.10]). BlockComp is sound and complete; to be more precise, for any solution $S$ of an equation $U = V$, for the nondeterministic choices corresponding to $S$ the returned equation $U' = V'$ has a solution $S'$ such that $S'(U')$ is the $\Gamma$ compression of $S(U)$ and $S'(X)$ is obtained from $S(X)$ by removing the $a$-prefix and $b$-suffix, where $a$ is the first letter of $S(X)$ and $b$ the last, and then performing the $\Gamma$ compression.
When $\Gamma_\ell$ and $\Gamma_r$ are disjoint, PairComp$(\Gamma_\ell, \Gamma_r)$ is sound and complete; to be more precise, for any solution $S$ of an equation $U = V$, for the nondeterministic choices corresponding to $S$ the returned equation $U' = V'$ has a solution $S'$ such that $S'(U')$ is the $(\Gamma_\ell, \Gamma_r)$ compression of $S(U)$ and $S'(X)$ is obtained from $S(X)$ by removing the first letter of $S(X)$, if it is in $\Gamma_r$, and the last, if it is in $\Gamma_\ell$, and then performing the $(\Gamma_\ell, \Gamma_r)$ compression.
ICALP 2017
The solution $S'$ from Lemma 1 is called a solution corresponding to $S$ after $(\Gamma_\ell, \Gamma_r)$ compression ($\Gamma$ compression, respectively); we also talk about a solution corresponding to $S$ when the compression operation is clear from the context, and extend this notion to a solution corresponding to $S$ after a phase. What is important later on is how $S'$ is obtained from $S$: it is modified as if the subprocedures knew the first/last letter of $S(X)$ and popped appropriate letters from the variables and then compressed pairs/blocks in the substitution for variables.
Lemma 1 yields the soundness and completeness of LinWordEq. For the termination we observe that iterating the compression operations shortens the string by a constant fraction, thus the length of a solution word shortens by a constant fraction in each phase.
Lemma 2. Let $w$ be a string over an alphabet $\Gamma$ and $w'$ a string obtained from $w$ by a $\Gamma$ compression followed by a sequence of $(\Gamma_\ell, \Gamma_r)$ compressions (where $\Gamma_\ell, \Gamma_r$ is a partition of $\Gamma$) such that each pair $ab \in \Gamma^2$ is covered by some partition. Then $|w'| \le \frac{2|w|+1}{3}$.
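The bound of Lemma 2 can be checked experimentally with a short Python sketch of one phase applied to a plain string. The particular family of partitions used here (splitting letters by one bit of their index, in both orientations) is an assumption of the sketch; it is merely one simple way of covering every pair of distinct letters, and is not the strategy analysed later in the paper. Pairs $aa$ need no covering, since they disappear in the block compression.

```python
from itertools import groupby


def block_compress(w, gamma):
    out = []
    for a, run in groupby(w):
        length = len(list(run))
        if a in gamma and length >= 2:
            out.append(("blk", a, length))        # fresh letter for a^l
        else:
            out.extend([a] * length)
    return out


def pair_compress(w, left, right):
    out, i = [], 0
    while i < len(w):
        if i + 1 < len(w) and w[i] in left and w[i + 1] in right:
            out.append(("pair", w[i], w[i + 1]))  # fresh letter for ab
            i += 2
        else:
            out.append(w[i])
            i += 1
    return out


def covering_partitions(gamma):
    """Yield partitions of gamma such that every pair of distinct letters is covered."""
    letters = sorted(gamma)
    index = {a: k for k, a in enumerate(letters)}
    bits = max(1, (len(letters) - 1).bit_length())
    for b in range(bits):
        zeros = {a for a in letters if not (index[a] >> b) & 1}
        ones = set(letters) - zeros
        yield zeros, ones      # covers ab when bit b is 0 in a and 1 in b
        yield ones, zeros      # and the symmetric case


def phase(w):
    """Block compression followed by pair compressions for a covering family."""
    gamma = set(w)             # the alphabet is fixed at the start of the phase
    w = block_compress(w, gamma)
    for left, right in covering_partitions(gamma):
        w = pair_compress(w, left, right)
    return w


samples = ["a", "ab", "abcabc", "aaaa", "abababab", "abcdabcd", "aabbccddee"]
lengths = {s: len(phase(list(s))) for s in samples}
```

For every sample string `s` the resulting length satisfies `3 * lengths[s] <= 2 * len(s) + 1`, in line with the lemma; letters created during the phase are not in `gamma`, so they are never compressed again within the same phase.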

Theorem 3. LinWordEq is sound, complete and terminates (for appropriate nondeterministic choices) for satisfiable equations. It runs in linear (bit) space.
In the following, we will also need one more technical property of block compression.

Lemma 4. Consider a solution $S$ during a phase with nondeterministic choices corresponding to $S$ and the corresponding solution $S'$ of $U' = V'$ after the block compression. Then $S'(U')$ has no two consecutive letters $aa$ with $a \in \Gamma$.
This is true after block compression and afterwards no letters from Γ are introduced.
Compressing blocks in small space. Storing, even in a concise way, the lengths of popped prefixes and suffixes in the $\Gamma$ compression makes attaining linear space difficult. This was already observed in [8], where a linear-space implementation of BlockComp was given. It performs a different set of operations, yet the effect is the same as for BlockComp. Instead of explicitly naming the lengths of blocks, we treat them as integer parameters; then we declare which maximal blocks are of the same length (those lengths depend linearly on the parameters). Verifying the validity of such a guess is done by writing a system of (linear) Diophantine equations that formalise those equalities and checking its satisfiability. This procedure is described in detail in [8, Section 4]. In the end, it can be implemented in linear bitspace.
Lemma 5 ([8, Lemma 4.7]). BlockComp can be implemented in space linear in the bit-size of the equation.
Huffman coding. At each step of the algorithm we encode letters (though not variables) in the equation using Huffman coding. This may mean that when going from $U = V$ to $U' = V'$ the encoding of letters changes, and in fact using the former encoding in the latter equation may lead to super-linear space (imagine that we pop from each variable a letter that has a very long code). Using standard methods changing the encoding during a transition from
Lemma 6. Given a string (encoded using some uniquely decodable code), its Huffman coding can be computed in linear bitspace.
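For readers unfamiliar with Huffman coding, the following Python sketch builds a Huffman code for the letters of a string and measures the resulting bit-size. It uses the textbook heap-based construction and makes no attempt to respect the linear bitspace bound of Lemma 6; the function names are assumptions of this illustration.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return a prefix-free code (symbol -> bitstring) minimising the encoded length."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}    # a single symbol still needs one bit
    # Heap entries: (weight, tie-breaker, partial code table of the subtree).
    heap = [(f, k, {sym: ""}) for k, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # the two lightest subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]


def encoded_bits(text, code):
    """Bit-size of text under the given code."""
    return sum(len(code[s]) for s in text)


code = huffman_code("aaaabbc")
size = encoded_bits("aaaabbc", code)
```

In this example the frequent letter `a` receives a 1-bit code and `b`, `c` receive 2-bit codes, so `size` is 10 bits. If a letter with a long code is popped many times, re-running the construction on the new equation shortens its code, which is the effect the paragraph above relies on.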
Each subprocedure of LinWordEq that transforms an equation

Space consumption
In order to bound the space consumption, we will use a bit-encoding of letters that depends on the current equation. We use the term 'encoding' even though it may assign different codes to different occurrences of the same letter, but two different letters never have the same code. Since we are interested in linear space only, we do not care about multiplicative $O(1)$ factors in the space consumption and can assume that our code is prefix-free, say by terminating each encoding with a special symbol \$. We show that such an encoding uses linear space, which also shows that the Huffman encoding of the letters in the equation uses linear space, as Huffman encoding uses the smallest space among the prefix codes. The idea of our 'encoding' is: for each letter in the current equation we establish an interval $I$ of indices in the original equation (viewed as a string) on which it 'depends' (this has to be formalised) and encode this letter as $U_0V_0[I]\#i$, when it is the $i$-th in the sequence of letters assigned $I$ and $U_0V_0[I]$ is the original equation restricted to indices in $I$. The dependency is formalised in Section 3.1, while Section 3.2 first gives the high-level intuition and then an upper bound on the used space.
For technical reasons we insert ending markers into the equation at the beginning and end of $U$ and $V$, i.e. we write them as @$U$@, @$V$@ for some special symbol @. Those markers are ignored by the algorithm, yet they are needed for the encoding.

Dependency intervals
First, we need some notation. The input equation is denoted by $U_0 = V_0$, while $U = V$ and $U' = V'$ are used for the current equation and the equation after performing some operation. We treat the input equation as a single string $U_0V_0$ and consider its indices, i.e. numbers from 1 to $|U_0V_0|$, denoted by letters $i$, $i'$, $j$, and intervals of such indices, denoted by letters $I$, $I'$. In the current equation, i.e. the one stored by LinWordEq, we do not consider indices but rather positions and denote them by letters $p$, $q$. We do not think of them as numbers but rather as pointers: when $U = V$ is transformed by some operation to $U' = V'$ but the letter/variable at position $p$ was not affected by this transformation, we still say that this letter/variable is at position $p$. On the other hand, the affected letters are at positions that were not present in $U = V$. In the same spirit we denote by $p$ the positions in $U = V$ and the corresponding positions in $S(U) = S(V)$. We still use the left-to-right ordering on positions and use $p - 1$ and $p + 1$ to denote the previous and next position; we also consider intervals of positions, yet they are used rarely, so that they are not confused with intervals of indices, on which we focus mostly. Given an equation $U = V$ and an interval of positions $P$, by $UV[P]$ we denote the string of letters and variables at positions in $P$; again, this notation is used rarely. In the input equation the index and the position are the same.
With each position $p$ in the (current) equation (including the endmarkers) we associate a dependency interval $\mathrm{dep}(p)$, called depint; if the depint is a single index $\{i\}$, we denote it by $i$. The idea is that the letter at position $p$ is uniquely determined by $U_0V_0[\mathrm{dep}(p)]$ (and the nondeterministic choices of the algorithm); note that it may include both variables and letters. We use the notions of $\subseteq$ and $\supseteq$ for the depints with the usual meaning; we take unions of them, denoted by $\cup$, but only when the result is an interval. We say that $I$ and $I'$ are similar, denoted as $I \sim I'$.
Assigning depints to letters. When $X$ at position $p$ pops a letter into position $p'$ then $\mathrm{dep}(p') \leftarrow \mathrm{dep}(p)$ (which is the position of this occurrence of $X$ in the input equation). Whenever we perform the $(\Gamma_\ell, \Gamma_r)$ compression then in parallel for each position $p$ such that $UV[p] \in \Gamma_\ell$ we assign $\mathrm{dep}(p) \leftarrow \mathrm{dep}(p) \cup \mathrm{dep}(p + 1)$ ($p + 1$ may be the position of a variable or an endmarker). Then we perform a symmetric action for positions whose letters are in $\Gamma_r$ (so for $p - 1$).
For the $\Gamma$ compression, we perform in parallel the following operation for each block (perhaps of length 1) of a letter in $\Gamma$: given a maximal block $a^\ell$ at positions $p, p + 1, \ldots, p + \ell - 1$ we set the depints of those positions to $\bigcup_{i=-1}^{\ell} \mathrm{dep}(p + i)$ (note that $p - 1$ and $p + \ell$ are included).
In the following we mostly focus on $\mathrm{Pos}_{\supseteq}(i)$. As this is an interval of positions, we visualise that $\mathrm{Pos}_{\supseteq}(I)$ extends to the neighbouring positions. Thus we will refer to the operations of changing the depints before the block compression and pair compression as extending of $\mathrm{Pos}_{\supseteq}(I)$ to new positions; those positions get their depints extended. Note that this notion does not apply to the case when we pop letters from variables.
Depints defined in this way satisfy the conditions (I1-I3).
Proof. We first show (I1) for $\mathrm{Pos}_{\supseteq}(i)$. The proof is by induction; this is true at the beginning. If we make a union of depints, a position adjacent to a position in $\mathrm{Pos}_{\supseteq}(i)$ can become part of $\mathrm{Pos}_{\supseteq}(i)$ (this can be iterated when the depints are changed before the block compression), which is fine. During the compression, we compress symbols on positions with the same depints, so this is fine. When we pop a letter from a variable at position $p$ to position $p'$ then $\mathrm{dep}(p') = \mathrm{dep}(p)$, so $p'$ joins $\mathrm{Pos}_{\supseteq}(i)$ next to $p$, and by the inductive assumption $\mathrm{Pos}_{\supseteq}(i)$ was an interval, which shows the claim. We now show by induction that $i \le i'$ implies $\mathrm{Pos}_{\supseteq}(i) \le \mathrm{Pos}_{\supseteq}(i')$. Clearly this holds at the beginning, as then $\mathrm{Pos}_{\supseteq}(i) = \mathrm{Pos}(i) = \{i\}$ and $\mathrm{Pos}_{\supseteq}(i') = \mathrm{Pos}(i') = \{i'\}$. Consider the moment in which the condition $\mathrm{Pos}_{\supseteq}(i) \le \mathrm{Pos}_{\supseteq}(i')$ is first violated; by symmetry it is enough to consider the case in which the first position in $\mathrm{Pos}_{\supseteq}(i')$ is smaller than the first in $\mathrm{Pos}_{\supseteq}(i)$. If this position was just popped then it cannot be popped to the right, as the position of the popping variable is in $\mathrm{Pos}_{\supseteq}(i')$. So it was popped to the left. But then the variable that popped it was on a position in $\mathrm{Pos}_{\supseteq}(i')$ and by the induction assumption $\mathrm{Pos}_{\supseteq}(i') \ge \mathrm{Pos}_{\supseteq}(i)$, so it had a position from $\mathrm{Pos}_{\supseteq}(i)$ to its left, a contradiction. The other option is that this happened when a depint of a position was changed so that it got into $\mathrm{Pos}_{\supseteq}(i')$. But then the position to its right was in $\mathrm{Pos}_{\supseteq}(i')$ and by the induction assumption either this position was in $\mathrm{Pos}_{\supseteq}(i)$ or some position to the left of it was; in both cases the position also got into $\mathrm{Pos}_{\supseteq}(i)$. The proof follows by a simple, yet tedious induction and it is omitted.
Concerning (I2), we show a stronger statement: given positions $p$, $p + 1$ it holds that $\mathrm{dep}(p) \le \mathrm{dep}(p+1)$. Let $i$, $i'$ be the leftmost indices in $\mathrm{dep}(p)$, $\mathrm{dep}(p+1)$, respectively. Assume for the sake of contradiction that $i > i'$. We already showed that then $\mathrm{Pos}_{\supseteq}(i) \ge \mathrm{Pos}_{\supseteq}(i')$. So if $p + 1 \in \mathrm{Pos}_{\supseteq}(i') \le \mathrm{Pos}_{\supseteq}(i) \ni p$ then also $p \in \mathrm{Pos}_{\supseteq}(i')$, i.e. $i' \in \mathrm{dep}(p)$. As $i' < i$, the leftmost index in $\mathrm{dep}(p)$ cannot be $i$. The proof for the rightmost index is similar.

Encoding of letters. Letters in $\mathrm{Pos}(I)$ are encoded as $U_0V_0[I]\#1$, $U_0V_0[I]\#2$, etc. Note that there is no a priori bound on the size of such numbers. Furthermore, if $I \sim I'$ then the encodings $I\#i$ and $I'\#i$ are the same (these are the same symbols by (I3)).

Pair compression strategy
We assume that LinWordEq makes the nondeterministic choices according to the solution, thus the space consumption of a run depends only on the choices of the partitions during pair compression, called a strategy. We describe a linear-space strategy.
Idea. Imagine we ensured that during one phase each variable popped $O(1)$ letters and each $\mathrm{Pos}_{\supseteq}(i)$ expanded by $O(1)$ letters. Then $|\mathrm{Pos}_{\supseteq}(i)| = O(1)$: we introduced $O(1)$ positions to $\mathrm{Pos}(i)$, say at most $k$, and by Lemma 2 among the positions in $\mathrm{Pos}_{\supseteq}(i)$ at the beginning of the phase at least 2/3 took part in compression, so their number dropped by 1/3; thus $|\mathrm{Pos}_{\supseteq}(i)| \le 3k$. As a result, $|\mathrm{Pos}(I)| \le 3k$ for each depint $I$, as $\mathrm{Pos}(I) \subseteq \mathrm{Pos}_{\supseteq}(i)$ for $i \in I$. This would yield that the whole bit-space used for the encoding is linear: each number $m$ used in $U_0V_0[I]\#m$ is at most $3k = O(1)$, so they increase the size by at most a constant fraction. On the other hand, the depints consume: (a simple proof is given later) and the right hand side is linear in terms of the input equation: $\|(U_0, V_0)\|$ is the bit-size of the input equation.
It remains to ensure that the $\mathrm{Pos}_{\supseteq}(i)$ do not extend too much and variables do not pop too many letters. Given a phase, we call a letter new if it was introduced during this phase. New letters cannot be popped, nor can $\mathrm{Pos}_{\supseteq}(i)$ be extended to positions with new letters. Thus they are used to prevent extending $\mathrm{Pos}_{\supseteq}(i)$ and popping: it is enough to ensure that the first/last letter of a variable is new and that a letter on the position to the left/right of $\mathrm{Pos}_{\supseteq}(I)$ is new.
Unfortunately, we cannot ensure this for all variables and all $\mathrm{Pos}_{\supseteq}(i)$. We can make this true in expectation: given a random partition there is a 1/4 probability that a fixed pair is compressed (and the resulting letter is new). This requires formalisation and calculations.
Strategy. Given a solution $S$ of an equation we say that a variable $X$ is left blocked if $S(X)$ has at most one letter or the first or second letter in $S(X)$ is new; otherwise a variable is left unblocked. Right blocked and right unblocked variables are defined similarly. An index $i$ is left blocked if in $S(U)$ (or $S(V)$, respectively) there is at most one position to the left of $\mathrm{Pos}_{\supseteq}(i)$ or one of the letters on the positions one and two to the left of $\mathrm{Pos}_{\supseteq}(i)$ is new; otherwise $i$ is left unblocked. Right blocked and right unblocked indices are defined similarly.
Lemma 9. Consider a solution $S = S_0$ and consecutive solutions $S_1, S_2, \ldots$ corresponding to it during a phase. If a variable $X$ becomes left (right) blocked for some $S_k$, then it is left (right, respectively) blocked for each $S_\ell$ for $\ell \ge k$ and it pops to the left (right, respectively) at most 1 letter after it became left (right, respectively) blocked. If an index $i$ becomes left (right) blocked for some $S_k$ then it is left (right, respectively) blocked for each $S_\ell$ for $\ell \ge k$ and at most one letter to the left (right, respectively) will have its depint extended by $i$ after $i$ became left (right, respectively) blocked.
The proof follows by a simple case inspection and it is omitted.
The strategy iterates steps 1, 2, 3 and 4. In step $i$ it chooses a partition so that the corresponding $i$-th sum below is at least halved, unless this sum is already 0:
The idea of the steps is: (1) upper-bounds the increase of the bit-size of depints in the equation after popping letters. So by iteratively halving it we ensure that the total encoding increase caused by popping letters is small. Similarly, (2) upper-bounds the increase due to expansion of indices to new depints. The following (3) is connected (in a more complex way) to an increase, after popping, of the number of bits used for numbers in the encoding. Similarly (4) is connected to an increase after the extension of depints.

Lemma 10. During the pair compression LinWordEq can always choose a partition that at least halves the value of a chosen non-zero sum among (1)-(4).
Proof. Consider (1) and take a random partition, in the sense that each letter $a \in \Gamma$ goes to $\Gamma_\ell$ with probability 1/2 and to $\Gamma_r$ with probability 1/2. Let us fix a variable $X$ and its side, say left. What happens with $n_X \cdot \|X\|$ in (1) in the sum corresponding to left unblocked variables? If $X$ is left blocked then, by Lemma 9, it will stay left blocked and so the contribution is and will be 0. If it is left unblocked, then its two first letters $a$, $b$ are not new, so they are in $\Gamma$. If $S(X)$ has only those two letters, then with probability 1/2 the letter $a$ will be in $\Gamma_r$, it will be popped and $X$ will become left blocked (as $S(X)$ has only one letter); the same analysis applies when the third leftmost letter is new. The remaining case is that the three leftmost letters in $S(X)$ are not new; let them be $a, b, c \in \Gamma$. By Lemma 4, $a \ne b \ne c$. With probability 1/4 we have $ab \in \Gamma_\ell\Gamma_r$ and with probability 1/4 we have $bc \in \Gamma_\ell\Gamma_r$. Those events are disjoint (as in one $b \in \Gamma_r$ and in the other $b \in \Gamma_\ell$) and so their union happens with probability 1/2. In both cases $X$ will become left blocked, as a new letter is first or second in $S(X)$. In all uninvestigated cases the contribution of $n_X \cdot \|X\|$ cannot rise, which shows the claim in this case. The case of (3) is shown in the same way.
For (2), the analysis for an index $i$ that is left unblocked is similar, but this time we consider the positions to the left of $\mathrm{Pos}_{\supseteq}(i)$, and $\mathrm{Pos}_{\supseteq}(i)$ can extend to them (instead of letters being popped from variables in the case of (1)) and some of them may be compressed to one. Note that if there are no letters to the left/right then this index is blocked from this side. The case of (4) is shown in the same way.
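The probabilities used in this proof can be verified by exhaustive enumeration. The Python sketch below lists all assignments of the letters to the two sides of a uniformly random partition; the helper names and the encoding of sides as `"L"` and `"R"` are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import product


def probability(letters, event):
    """Exact probability of `event` when every letter picks a side uniformly at random."""
    assignments = list(product("LR", repeat=len(letters)))
    hits = sum(1 for sides in assignments if event(dict(zip(letters, sides))))
    return Fraction(hits, len(assignments))


def compressed(side, x, y):
    """The pair xy is covered: x went to Gamma_l and y went to Gamma_r."""
    return side[x] == "L" and side[y] == "R"


# Three leftmost letters a, b, c of S(X), pairwise distinct.
p_ab = probability("abc", lambda s: compressed(s, "a", "b"))
p_bc = probability("abc", lambda s: compressed(s, "b", "c"))
p_both = probability("abc", lambda s: compressed(s, "a", "b") and compressed(s, "b", "c"))
p_either = probability("abc", lambda s: compressed(s, "a", "b") or compressed(s, "b", "c"))

# Lemma 4 only gives a != b and b != c, so a = c is possible (leftmost letters a, b, a).
p_either_aba = probability("ab", lambda s: compressed(s, "a", "b") or compressed(s, "b", "a"))
```

The enumeration gives `p_ab = p_bc = 1/4`, `p_both = 0` and `p_either = 1/2`, also when the first and third letters coincide. Since the expected value of the sum at most halves under a random partition, some partition must achieve at least this much, which is how the probabilistic argument yields the deterministic existence claim of Lemma 10.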
Space consumption. We now give the linear space bound on the size of the equation. This formalises the intuition from the beginning of Section 3.2. As a first step, we show an upper bound on the encoding size of the equation; we define $H_d$, which accounts for the size of the depints in the encoding, and $H_n$, which accounts for the size of the numbers in the encoding.

Lemma 11. Given the equation
The proof follows by simple symbolic transformation of the definitions.
Instead of showing a linear bound on $\|(U, V)\|$ we give a linear bound on $H(U, V)$. Recall that $(U_0, V_0)$ denotes the input equation.

Lemma 12.
Consider an equation $U = V$, its solution $S$, and a phase of LinWordEq which makes the nondeterministic choices according to $S$ and partitions according to the strategy. Let the returned equation be $(U', V')$. Then $H(U', V') \le \frac{5}{6} H(U, V) + \alpha\|(U_0, V_0)\|$ and in a phase $H$ on intermediate equations is at most $\beta H(U, V) + \gamma\|(U_0, V_0)\|$ for some constants $\alpha$, $\beta$, $\gamma$.
Proof. We separately estimate $H_d$ and $H_n$. Concerning $H_d$, let us first estimate $\|U_0V_0[\mathrm{dep}(p)]\|$ summed over positions $p$ of letters popped into the equation during a phase (note that this does not include the size of numbers used in the encoding). For each variable we pop perhaps several letters to the left and right before block compression, but those letters are immediately replaced with single letters, so we count each as 1; also, when this side of a variable becomes blocked, it can pop at most one letter. Otherwise, a side of a variable pops at most 1 letter per pair compression in which it is unblocked from this side. Note that the depint is the same as for the variable, so the encoding size is $\|X\|$. So in total the bit-size of popped letters is at most: Observe that the third sum (the one summed over all partitions) at the beginning of the phase is equal to $\sum_X 2n_X \cdot \|X\|$, as no side of the variable is blocked, and by strategy point (1) its value at least halves every 4th pair compression (and it cannot increase, as by Lemma 9 no side of the variable can cease to be blocked). Thus (5) is at most We now similarly estimate how many positions got into $\mathrm{Pos}_{\supseteq}(i)$ due to expansion of $\mathrm{Pos}_{\supseteq}(i)$: $\mathrm{Pos}_{\supseteq}(i)$ can expand to two letters during the block compression (to be more precise: to positions that are inside a block and to the positions to the left/right of it, but positions in a block are replaced with a single letter and one of them was in $\mathrm{Pos}_{\supseteq}(i)$), to one position at each side after $i$ becomes blocked, and by one position for each partition $P$ in which this side of $i$ is not blocked. So the increase in the bit-size is (6) and, as in (5), at the beginning of the phase the second sum (so the one summed over partitions) is $\sum_{i:\,\text{index}} 2\|U_0V_0[i]\| = 2\|(U_0, V_0)\|$ and it at least halves every 4th partition, by strategy point (2). Thus similar calculations show that (6) is at most $20\|(U_0, V_0)\|$.
On the other hand, the number of positions in $\mathrm{Pos}_{\supseteq}(i)$ drops till the end of the phase by at least $\frac{|\mathrm{Pos}_{\supseteq}(i)|}{3} - 1$. If $U_0V_0[i]$ is a letter, then $\mathrm{Pos}_{\supseteq}(i)$ are all positions of letters and Lemma 2 yields that $\mathrm{Pos}_{\supseteq}(i)$ loses at least $\frac{|\mathrm{Pos}_{\supseteq}(i)| - 1}{3}$ positions. If $U_0V_0[i]$ is an ending marker, then the marker itself is unchanged and the remaining positions in $\mathrm{Pos}_{\supseteq}(i)$ are letter-positions and Lemma 2 applies to them, so $\mathrm{Pos}_{\supseteq}(i)$ loses at least $\frac{|\mathrm{Pos}_{\supseteq}(i)|}{3} - 1$ positions. We also estimate the maximal value of $H_d$ during the phase, as for intermediate equations we cannot guarantee that the compression reduced the length of all letters. We already showed that in a phase we increase $H_d$ by at most $40\|(U_0, V_0)\|$. This yields a bound of $H_d(U, V) + 40\|(U_0, V_0)\|$, which shows the part of the claim of the Lemma for $H_d$.
Concerning $H_n$, for an index $i$ let $k_i$, $o_i$, $e_i$ denote, respectively: $|\mathrm{Pos}_{\supseteq}(i)|$ at the beginning of the phase, the number of positions of letters popped from a variable with depint $i$, and the number of positions to whose depint $i$ extended. First we estimate $\sum_{i:\,\text{index}} h(o_i)$ and $\sum_{i:\,\text{index}} h(e_i)$ and then use those estimations to calculate the bound on $H_n(U', V')$. We first inspect the case of $o_i$; let $P_1, P_2, \ldots$ denote the consecutive partitions in the phase. We show that The inequality follows as: if (one occurrence of) $X$ popped $o_X$ letters, then it was not blocked on the left/right side for $o_1$/$o_2$ partitions, where $o_1 + o_2 \ge o_X - 4$ (note that one sequence can be popped to the left and right during block compression but it is immediately replaced with a single letter, so we treat them as one letter; also one letter can be popped to the left/right after $X$ became blocked). Then in the right hand side of (7) the contribution from (one occurrence of) $X$ is at least where the first inequality follows as $o_1 + o_2 \ge o_X - 4$ and the second can be checked by a simple numerical calculation. Lastly, in (7) each $o_i$ is equal to an appropriate $o_X$. The sum in braces on the right hand side of (7) initially is at most $2|U_0V_0| \le 2\|(U_0, V_0)\|$ and by strategy choice (3) it is at least halved every 4th step. So this sum is at most: The analysis for $e_i$ is similar: for a single index $i$ the estimation of the number of positions by which $\mathrm{Pos}_{\supseteq}(i)$ extends is the same as the estimation of the number of letters popped from an occurrence of a variable, thus We now estimate how many positions were lost due to compression; recall that $k_i$ is the size of $\mathrm{Pos}_{\supseteq}(i)$ at the beginning of the phase. Using the same analysis as in the case of $H_d$, from Lemma 2 it follows that at least $\frac{k_i}{3} - 1$ positions were lost in the phase due to compression. Thus Consider two subcases: if Using simple properties of $h$ as well as (8)-(9) we upper-bound the right hand side by We should estimate the maximal $H_n$ value during the phase, as inside a phase we cannot guarantee that letters get compressed, i.e. estimate $\sum_{i:\,\text{index}} h(k_i + o_i + e_i)$. Using a similar calculation as in the case of (10) and properties of $h$ we obtain: $\sum_{i:\,\text{index}} h(k_i + o_i + e_i) \le 8H_n(U, V) + 2064\|(U_0, V_0)\|$, which shows the claim of the Lemma in the case of $H_n$ and so also in the case of $H$.
A. Jeż

Proof of Theorem 3
By Lemma 1 all our subprocedures are sound, so we never accept an unsatisfiable equation.
We now analyse the nondeterministic choices that yield termination, completeness and linear space consumption. Consider an equation $U = V$ at the beginning of the phase, and let $\Gamma$ be the set of letters in this equation. If it has a solution $S'$, then it also has a solution $S$ over $\Gamma$ such that $|S(X)| = |S'(X)|$ for each variable: we can replace letters outside $\Gamma$ with a fixed letter from $\Gamma$. During the phase we will make nondeterministic choices according to this $S$.
Let the equation obtained at the end of the phase be $U' = V'$ and $S'$ be the corresponding solution. Then $|S'(U')| \le \frac{2|S(U)|+1}{3}$ by Lemma 2 and we begin the next phase with $S'$. Hence we terminate after $O(\log N)$ phases, where $N$ is the length of some solution of the input equation.
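The logarithmic number of phases can be made concrete with a small Python sketch that iterates the worst-case bound of Lemma 2 on the length of the solution word. Treating "length 1" as the stopping point and the name `phases_until_trivial` are simplifying assumptions of this illustration; the actual algorithm stops when both sides of the equation have length 1.

```python
import math


def phases_until_trivial(n: int) -> int:
    """Number of phases until a solution word of length n shrinks to length 1,
    assuming each phase shortens it only as much as Lemma 2 guarantees."""
    phases = 0
    while n > 1:
        n = (2 * n + 1) // 3   # largest integer length allowed by |w'| <= (2|w| + 1) / 3
        phases += 1
    return phases


counts = {N: phases_until_trivial(N) for N in (1, 2, 10, 1000, 10**6)}
```

Since $n' - 1 \le \frac{2}{3}(n - 1)$ in each step, the count never exceeds $\log_{3/2} N + 1$; for instance a word of length 10 goes through lengths 7, 5, 3, 2, 1, that is, 5 phases.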
To upper-bound the space consumption, we also estimate other stored information: we also store the alphabet from the beginning of the phase (this is linear in the size of the equation at the beginning of the phase) and the mapping of this alphabet to the current symbols (linear in the equation at the beginning of the phase plus the size of the current equation). The terminating condition that some pair of letters in $\Gamma^2$ was not covered is guessed nondeterministically; we do not store $\Gamma^2$. The pair compression and block compression can be performed in linear space, see Lemma 6. Note that this includes the change of Huffman coding.
Given an interval $I$ of indices in $U_0V_0$, by $\mathrm{Pos}(I)$ we denote the positions in the current equation whose depint is $I$, i.e. $\mathrm{Pos}(I) = \{p \mid \mathrm{dep}(p) = I\}$. In the analysis it is also convenient to look at positions whose depint is a superset of $I$: $\mathrm{Pos}_{\supseteq}(I) = \{p \mid \mathrm{dep}(p) \supseteq I\}$; this is usually used for $I = \{i\}$. We shall ensure the following properties:
(I1) Given a depint $I$, $\mathrm{Pos}(I)$ is an interval of positions, and similarly $\mathrm{Pos}_{\supseteq}(I)$.
(I2) Given depints $I$, $I'$ such that $\mathrm{Pos}(I) \ne \emptyset \ne \mathrm{Pos}(I')$, either $I \le I'$ or $I \ge I'$.
(I3) For depints

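If $U_0V_0[i]$ is a variable, then $\mathrm{Pos}_{\supseteq}(i)$ includes the position of a variable and Lemma 2 applies to the strings of letters to the left and right, say of lengths $\ell$, $r$, where $\ell + r = |\mathrm{Pos}_{\supseteq}(i)| - 1$. Then due to compressions $\mathrm{Pos}_{\supseteq}(i)$ loses at least $\frac{\ell - 1}{3} + \frac{r - 1}{3} = \frac{|\mathrm{Pos}_{\supseteq}(i)|}{3} - 1$ positions. Thus: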
If $\frac{2}{3}k_i + 1 + o_i + e_i \le \frac{5}{6}k_i$, then the summand can be estimated as $h(\frac{5}{6}k_i) \le \frac{5}{6}h(k_i)$ and we can upper-bound the sum over those cases by $\frac{5}{6}\sum_{i:\,\text{index}} h(k_i)$. If $\frac{2}{3}k_i + 1 + o_i + e_i > \frac{5}{6}k_i$, then $1 + o_i + e_i > \frac{1}{6}k_i$ and so $\frac{2}{3}k_i + 1 + o_i + e_i < 5(1 + o_i + e_i)$. Thus (10) is upper-bounded by $\sum_{i:\,\text{index}} h(5(1 + o_i + e_i))$.