Nondeterministic Tree-Walking Automata Are Not Closed Under Complementation
Abstract
It is proved that the family of tree languages recognized by nondeterministic tree-walking automata is not closed under complementation, solving a problem raised by Bojańczyk and Colcombet (“Tree-walking automata do not recognize all regular languages”, SIAM J. Comp. 38 (2008) 658–701). In addition, it is shown that nondeterministic tree-walking automata are stronger than unambiguous tree-walking automata.
Keywords and phrases:
Finite automata, tree-walking automata, complementationCategory:
Track B: Automata, Logic, Semantics, and Theory of ProgrammingCopyright and License:
![[Uncaptioned image]](x1.png)
2012 ACM Subject Classification:
Theory of computation Formal languages and automata theory ; Theory of computation Models of computationFunding:
Research supported by the Russian Science Foundation, project 23-11-00133.Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele PuppisSeries and Publisher:

1 Introduction
Tree-walking automata, first studied by Aho and Ullman [1], are among the fundamental models in automata theory. A tree-walking automaton walks over a labelled input tree of bounded degree, following the edges; at each moment, an automaton is at some node and is in one of finitely many states, and it uses its transition function to decide which edge to follow and which state to enter. The main questions about the expressive power of tree-walking automata were open for several decades, until the breakthrough results of Bojańczyk and Colcombet [4, 5], who proved that deterministic tree-walking automata (DTWA) are weaker than nondeterministic tree-walking automata (NTWA), which are in turn weaker than bottom-up tree automata.
In these papers, Bojańczyk and Colcombet considered three main problems on tree-walking automata stated by Neven [15]: two of them are the problems they solved, and the third problem is whether the class of tree languages recognized by NTWA is closed under complementation. This paper gives a negative solution to this problem, presenting a tree language recognized by an NTWA, such that its complement cannot be recognized by any NTWA.
The common idea of the two papers of Bojańczyk and Colcombet [4, 5] is using trees made out of specifically constructed elements, called patterns, and then showing that tree-walking automata cannot detect some rotations in the branching structure of such trees. They prove that DTWA cannot detect any rotations at all, whereas NTWA cannot detect some rotations that a bottom-up tree automaton can detect. In our paper, we reuse the tools of Bojańczyk and Colcombet [4, 5], and construct a new separating language defined by a certain condition on the branching structure. Our proof ultimately uses two trees that differ only by a single rotation.
Research on tree-walking automata and related models has been conducted in several directions. In particular, tree-walking automata with pebbles were introduced by Engelfriet and Hoogeboom [8], who proved them to be at most as powerful as bottom-up tree automata. Later, Bojańczyk et al. [6] proved strict hierarchies in the number of pebbles for both deterministic and nondeterministic pebble tree-walking automata. They also proved that no number of pebbles can help deterministic automata to simulate NTWA without pebbles. Logical characterizations of pebble tree-walking automata were given by Engelfriet and Hoogeboom [8, 9] and by Neven and Schwentick [16].
For deterministic tree-walking automata, every automaton can be transformed to one that halts on every tree: this was first done by Muscholl et al. [14] using Sipser’s [18] method of traversing the tree of all computations ending in the accepting configuration. This result also implies the closure of DTWA under complementation. Later, Kunc and Okhotin [12] presented a generalized construction applicable to graph-walking automata and producing reversible automata, and also reduced the number of states in the resulting automata from quadratic to linear in the size of the given deterministic automaton.
Much is known about the complexity of decision problems for tree-walking automata. The emptiness and the inclusion problems for both DTWA and NTWA are EXPTIME-complete, see Bojańczyk [2], in contrast to the P-complete emptiness problem for the more powerful bottom-up tree automata, see Veanes [19]. Samuelides and Segoufin [17] determined the complexity of the emptiness and the inclusion problems for -pebble tree-walking automata: they are -EXPTIME-complete. Among the recent results, Bojańczyk [3] proved that it is undecidable whether two regular tree languages can be separated by a deterministic tree-walking automaton. For graph-walking automata, both deterministic and nondeterministic, Martynova [13] proved that their non-emptiness problem is NEXPTIME-complete.
The second result of this paper is about unambiguous tree-walking automata (UTWA): these are NTWA, which, for every tree they accept, have a unique accepting computation, while the number of rejecting computations is unrestricted. The unambiguous case has been studied for different models of automata and for different complexity classes, see the survey by Colcombet [7]. Unambiguous automata are an intermediate model between deterministic and nondeterministic ones. In particular, for finite automata on strings, all three types of automata are equal in power, and their relative succinctness has been a subject of much research: see, e.g., the most recent contributions by Indzhev and Kiefer [11] and Göös et al. [10]. For tree-walking automata, DTWA are weaker than NTWA [4], and unambiguous tree-walking automata may theoretically coincide in power with either DTWA or NTWA, or they may be strictly between them. In this paper, we prove that UTWA are weaker than NTWA, while the question of whether they are stronger than DTWA or not remains open.
2 Trees and tree-walking automata
The notion of tree-walking automata is standard, even though it can be presented in various notation. This paper generally adopts the notation used by Bojańczyk and Colcombet [5], with insignificant modifications.
Definition 1.
Let be an alphabet of labels. A binary tree over is a partial mapping , where is a finite non-empty and prefix-closed set of nodes, and defines the label of each node. The empty string is the root node, and for each node , if the node is in , it is called the left child of , and is its parent; similarly, if is in , it is the right child of . A node either has both children or none; in the latter case it is called a leaf.
The edge between a parent and a child is defined by a function , where is the set of direction labels. For every two nodes and , such that is the -th child of , let and . The function is undefined on all other pairs.
A node is said to be above a node if for some ; in this case, is said to be below . A node is to the left of if none of them is above the other, and is lexicographically less than ; in this case, is to the right of .
For each node , the subtree of rooted at is a tree defined by for all with defined.
Some further notation for the trees turns out to be useful. The following function describes the local position of a node in a tree.
Definition 2.
Let be a tree with the set of nodes . Each node is assigned a type, drawn from the set . Define a function by , where defines whether is a left child, a right child or the root, and specifies if is a leaf or not.
A (nondeterministic) tree-walking automaton is typically defined as follows. It starts at the root in one of the initial states. At each step it knows the current state and sees the label and the type of the current node. Then, according to the transition function, it nondeterministically decides to proceed to its parent or to any of its children, and changes its state. If the automaton ever comes to the root in an accepting state, it accepts.
The definition assumed in this paper follows Bojańczyk and Colcombet [5]. Accordingly, the automaton is invested with the knowledge of the label and the type of the destination node, which are also part of a transition. If the destination node does not have the specified type, that transition cannot be applied. This ability does not give an automaton any extra power. This way, transitions become symmetric and can be reversed.
Definition 3.
A nondeterministic tree-walking automaton (NTWA) is a quintuple , where
-
is a finite alphabet of labels,
-
is a finite set of states,
-
is the set of initial states,
-
is the transition relation,
-
and is the set of accepting states.
Configurations of the automaton on a tree with a set of nodes are pairs , with and . A computation from a configuration to a configuration is any sequence of the form , where , and for all , , , and for every , with , the configurations and are connected by a transition, that is,
An accepting computation is a computation from an initial configuration , where is an initial state and is the root node, to any accepting configuration of the form , with . Thus, the automaton accepts only at the root.
A tree is accepted by the automaton if there is at least one accepting computation on . The language recognized by , denoted by , is the set of all trees the automaton accepts.
3 Separating language
We consider binary trees, with nodes labelled with or , where is the blank symbol. Let all leaves labelled with in a tree be enumerated from left to right, starting with , and denoted by
The language is defined as the set of all trees in which there is a triple of leaves labelled with , with consecutive numbers , , , satisfying the condition , where lca denotes the lowest common ancestor. The form of trees satisfying this condition is illustrated in Figure 1(left), whereas trees that are not in have the form shown in Figure 1(right). In the figure, blank leaves are omitted along with paths leading to them.
The following theorem is the main result of this paper.
Theorem 4.
There is a nondeterministic tree-walking automaton that recognizes the language . No nondeterministic tree-walking automaton can recognize the complement of .
Sketch of a proof of the first part..
An NTWA recognizing the language works using the following algorithm, illustrated in Figure 2. It begins with the traversal of the tree using depth-first search from left to right, in which it counts modulo the number of leaves with label . At some point, while making a turn (that is, while climbing from a left child to a parent node and then immediately descending into its right child), the automaton nondeterministically decides that this node must be the lowest common ancestor of a suitable triple of leaves , and that the last -labelled leaf encountered was . At this moment the automaton holds the residue in its state. Next, the automaton, having descended into the right subtree, nondeterministically guesses the path to the leaf . It ends its descent in some leaf labelled with (rejecting otherwise), still remembering the residue . And now the automaton wants to check that the number of the current leaf is equivalent to modulo . To this end, the automaton starts another depth-first search, now from right to left, initially keeping in memory the number , and decrementing it by one modulo at each leaf labelled with . If it finishes the traversal with residue zero in memory, then it accepts.
If the tree is not in , then the automaton cannot accept regardless of which node it believes to be the least common ancestor (the proof is omitted).
4 Tools from the paper by Bojańczyk and Colcombet and their further properties
So the language is recognized by a nondeterministic tree-walking automaton. Now it will be shown that no NTWA recognizes the complement of . Let be an NTWA, let be its set of states. The plan is to construct two trees, one not in and the other in , so that if the automaton accepts the former tree, then it also accepts the latter tree. Then the automaton cannot recognize the complement of the language .
In constructing these trees and proving that the automaton operates on them as desired, we use tools developed by Bojańczyk and Colcombet [5]. Namely, the desired two trees are constructed out of elements defined in their paper, and we also use basic properties of those elements in our proofs. In this section, the required definitions and lemmata by Bojańczyk and Colcombet [5] are presented, along with several new lemmata addressing some further basic properties of those elements.
Following Bojańczyk and Colcombet [5], we consider trees with designated holes (ports) and call them patterns.
Definition 5.
A pattern is a binary tree with labels , in which the labels and may only be used in leaves, and all leaves labelled with are left children. A pattern must have at least two nodes.
The root of the tree is called the root port or port . All leaves labelled with are leaf ports, enumerated from left to right starting with one. The number of leaf ports in a pattern is called its rank. A pattern of rank , with , thus has the set of ports , and it is called a -ary pattern.
Patterns can be attached to each other by substituting one pattern for a leaf labelled with in another pattern. This operation is called composition of patterns, and is denoted by : here patterns are attached to the leaf ports of a -ary pattern . If patterns are attached not to all leaf ports, then is written instead of a pattern to be substituted.
Consider how an automaton can move through a pattern if this pattern is a part of some tree.
Definition 6 (Bojańczyk and Colcombet [4, Defn. 3], [5, Defn. 3]).
Let be an NTWA with a set of states , and let be a pattern of rank . Let be two states and let be two ports. A computation of on that begins in state in port and ends in state in port , without visiting any ports on the way, and treating ports and as left non-leaf children labelled with , is said to be a run of type .
The automaton’s transition relation over is , and it contains the types of all runs of over (and no other quadruples).
Two patterns and are called equivalent (with respect to ) if they are of the same rank and their transition relations coincide: .
A computation of zero length is considered as a run, that is, , for all and .
The equivalence relation on patterns is defined so that it respects composition: if some pattern is obtained as a composition of other patterns, and if one of those subpatterns is replaced with an equivalent pattern, then the resulting pattern will be equivalent to .
Most trees constructed in the papers by Bojańczyk and Colcombet [4, 5] are obtained by combining several specifically constructed patterns, defined with respect to an automaton , which have the following remarkable properties.
Lemma 7 (Bojańczyk and Colcombet [4, Lemma 9], [5, Lemma 3.1]).
Let be a tree-walking automaton. Then there are patterns , , of rank , and , respectively, which have no labels , such that every pattern of rank at most obtained as a composition of any number of patterns , , is equivalent to one of , , (the one of the same rank as ).
In the following, , , are patterns constructed for the automaton by Lemma 4. An additional basic pattern is , defined as , where is a tree with three nodes: the root and the right leaf are labelled with , whereas the left leaf has a label , as in the paper by Bojańczyk and Colcombet [5].
In this paper, the patterns , , and , as well as some combinations of a few such patterns, shall be called elements, as all trees considered in this paper shall be constructed out of them.
In patterns composed of elements , , , one can attach an element to any port and get an equivalent pattern [5, Fact 3.2].
Consider a computation of the automaton on a pattern composed of elements and . This computation naturally splits into runs over the constituent elements. The automaton possibly returns to the same ports of some elements multiple times, and this complicates the analysis of such a computation. In order to handle the returns to the same ports, Bojańczyk and Colcombet [5, §3.2] introduced the notion of inner loop. An inner loop from a state to a state is a computation in the pattern starting in the state at the unique common node of the two elements (the junction node), and ending in the state at the same junction node, without visiting either port of the pattern on the way. The existence of such an inner loop is denoted by . The computation in this definition is allowed to be empty, and hence for all .
Some computations in patterns that return to their point of origin can be shrunk to just inner loops.
Lemma 8 (Bojańczyk and Colcombet [5, Lemma 3.3]).
Let be an NTWA. Let be patterns of nonzero rank obtained as compositions of any number of elements , , . Let the pattern be attached to the -th leaf port of . Assume that the automaton begins at the junction node between and in some state , moves over these patterns without visiting their ports except the junction node, and returns to the junction node in some state . Then .
Every inner loop can be executed at the junction node between any two elements , , or , because attaching to any port always produces an equivalent element by Lemma 4. Then, runs between ports of these elements can be regarded as runs between junction nodes, each located at the junction of two elements . Hence, one can consider runs extended with loops, which begin with an inner loop at the port of departure, then make a run to the destination port, and finally execute another inner loop there. For brevity, such runs shall be called transfers.
Definition 9 (Bojańczyk and Colcombet [5, Defn. 5]).
Let be an NTWA with a set of states . Let be a pattern of rank , let and . A transfer of type over is a computation that begins with an inner loop , continues with a run of type , and ends with another inner loop , where are some states.
The relation of transfers over is , and it contains the types of all transfers of over .
If two patterns and , composed of , , and , are equivalent, then their relations and coincide: indeed, the definition of a transfer depends only on the relation , whereas inner loops can be executed at any junction nodes.
A run through a pattern made of elements naturally splits into runs through these elements. Such a partition also can be made for transfers, as follows.
Definition 10.
Let a pattern be obtained as a composition of any number of elements and . For every transfer over , its partition into elementary transfers over constituent elements and is obtained by first splitting this transfer into inner loops and runs between neighbouring junction nodes, and then attaching every inner loop to the preceding run to obtain a transfer crossing this element (inner loops at the beginning are attached to the following run).
A transfer is called simple if, in the above partition, at most one elementary transfer crosses each element.
This notion of a simple transfer is analogous to a simple path in a tree.
Lemma 11.
Let be an NTWA, and let be a pattern obtained by a composition of any number of elements and . Then, for every transfer over there is a simple transfer of the same type.
The following notation for transfers of the automaton through elements , , and is introduced (see Bojańczyk and Colcombet [5, Fig. 5.1]).
Note that the notation , with , always refers to a transfer from the state to the state , regardless of the direction of the arrow. For example, denotes that there is a transfer from at port to at port in the element .
The next lemma asserts that, as long as an automaton can move from the left leaf port of to its right leaf port, starting in a state and ending in a state , it can either similarly move in every pattern constructed from elements , and from an arbitrary leaf port to the next leaf port in order, or there is an inner loop from state to state .
Lemma 12.
Let be an NTWA with a set of states . Let be a pattern of rank , with , made of any number of elements , and . Let be some states with . Then either for all with , or .
The proof is omitted due to space constraints.
The following lemma states that if the automaton can move through upwards, then it can move upwards through from at least one of the leaf ports using the same states; and the same result holds for downward motion.
Lemma 13 (Bojańczyk and Colcombet [5, Prop. 5.6]).
For all states and , if and only if or . Similarly, if and only if or .
5 Patterns and
Let be an NTWA, it should be proved that it does not recognize the complement of . Let be its set of states, let , and assume, without loss of generality, that is even. This automaton is fixed for the rest of the paper.
The goal of the proof is to construct two trees: a tree not in and a modified tree in , such that if the automaton accepts the first tree, then it can be lured to accept the second tree. These two trees are constructed in the form of two large patterns of rank , which are completed into trees by attaching a root with at the top and at the bottom.
Let be a number with residue modulo , such that .
The desired patterns are denoted by and and are illustrated in Figure 3.
The pattern , called the small correct pattern, is a chain of elements of rank . The other pattern , the faulty pattern, is constructed by attaching to the element two chains of elements each, one to the leaf port and the other to the root port of the central element. We shall call this central part the fault, since the resulting tree will be in due to the different structure of -leaves in this element.
In both patterns and , all leaves labelled with are enumerated from left to right; denote these leaves by in the small correct pattern, and by in the faulty pattern. For convenience, the root ports of elements containing these leaves are denoted by in the small correct pattern and by in the faulty pattern, whereas the root ports of the elements , are denoted by in the small correct pattern and in the faulty pattern. Let the leaf port be in both patterns. Note that, in the faulty pattern, is the root port of , and is the root port of : these two nodes are enumerated out of order of traversal.
In the small correct pattern , all triples of consecutive -leaves , , do not satisfy , whereas in , there is a triple with . Therefore, if the small correct pattern is replaced with in the tree that is not in , then the resulting tree will be in . The goal is to prove that if the automaton accepts the tree , then it also accepts the tree . This will show that the automaton does not recognize the complement of , which is enough to prove Theorem 4.
Consider any accepting computation of the automaton on the tree . This computation is split into segments between visits to the ports of the small correct pattern . Segments that lie outside can be exactly replicated in the tree . The rest of the computation is formed by runs through the small correct pattern , which have to be reproduced on the faulty pattern .
Main Lemma.
Let be an NTWA with the set of states of even size , operating on binary trees with labels , let , , and be the elements constructed for this automaton as in Section 4, and let the patterns and be as defined above. Then .
Any run through that begins and ends in the same port is exactly reproduced in the faulty pattern , since begins and ends with two subpatterns . Since runs going from the root port of to its leaf port are symmetric to runs from the leaf port to the root port, it is sufficient to prove the latter case. Accordingly, some states and with are fixed for the rest of the proof. And it should be proved that .
6 The correct pattern and a run of type
The ultimate goal is to construct a run of type through the faulty pattern , which is much larger than the small correct pattern . In this section, a run of this type is constructed for another pattern: the correct pattern , which is an inflated version of . It is composed of elements forming a chain. In the pattern , there are nodes , with , and , which are defined analogously to the nodes of .
First, consider a run through the small correct pattern . Since has two ports, and , and junction nodes, , some two of these nodes are visited by the automaton for the first time in the same state. Then, in the original computation, the part between the first visits to these nodes can be repeated periodically to form a computation on . This is possible, because . The goal is to take this computation on the correct pattern and to reproduce it on the faulty pattern , so that the periodically repeated sequence will pass by the fault, paying no attention to the difference.
In order to distinguish the faulty pattern from the correct pattern, the periodically repeated sequence should visit some of the elements attached from the right; assume that at least one is visited in the periodic part. The plan is to represent this periodic part as a sequence of segments between visits to . It will be proved that such segments can be executed by simple transfers. These segments shall be called proper steps.
Definition 14.
Let be an integer such that , and let be two states. Consider the pattern obtained by attaching elements into a chain, with every next element attached to the left leaf port of the previous one, as illustrated in Figure 4(left). Then a proper step of type is a simple transfer through this pattern, of type if , or of type if .
For a proper step of type is a transfer of type on .
The number is called the pace of the proper step.
A proper step of type can be used anywhere in the pattern : that is, for all integers satisfying and , there is a sequence of transfers on elements , that moves the automaton on the pattern from the configuration to the configuration without visiting root ports of any on the way, and never traversing any twice.
Next, the run through constructed above is modified so that its periodic part consists of proper steps.
Claim 15.
There is a run of type on the pattern that first comes to some node , with , in some state , having visited at most bottom elements of by that time. Next, the computation periodically repeats some sequence of proper steps, with each proper step having pace strictly between and . Every iteration of the period has pace at most , that is, it shifts the automaton by at most elements . Furthermore, every iteration involves at most consecutive elements . The periodic part of the computation leads the automaton to some node , with , in the same state , and after that the automaton finishes the run visiting at most top elements .
Sketch of a proof..
Consider the run through constructed above, which contains a periodic part. First, the periodically repeated sequence is shifted in order to begin with a visit to the root port of some (and accordingly to end with such a visit). This new sequence can be split into segments by the visits to the root ports of any . These segments, however, are not necessarily proper steps in the sense of Definition 6; but, using Lemma 4, each of these segments can be replaced with a proper step. Since each iteration of the period was initially made on consecutive elements , after the shift it fits into consecutive elements, and hence each step is of pace between and .
The next question is how proper steps of different types pass by the fault, that is, the element , which distinguishes the faulty pattern from the correct pattern .
7 Shrinking and stretching proper steps to bypass the fault
In this section it will be shown that proper steps that pay attention to the fault in the faulty pattern can be shrunk, or sometimes stretched, so that they bypass the fault.
The possibility of shrinking is established in the following form: if the automaton can move forward or backward by elements on the correct pattern , then it can either do the same on the faulty pattern, or it can move by elements on the correct pattern.
Claim 16.
Assume that a proper step of type exists for , and . Then at least one of two conditions holds:
-
I.
for every integer , such that and , the automaton can move on the faulty pattern from configuration to configuration ;
-
II.
there is a proper step of type for , and of type for .
Proof.
The proof is given only for the case of (the case of is proved symmetrically). The cases and are considered separately.
-
In the case it will be proved that a proper step of type exists.
Consider the partition of any proper step of type into transfers through elements . Let be the intermediate states in this partition, as in Figure 4(left). Then . By Lemma 4, implies , that is, if the element is replaced with the element , as in Figure 4(middle), then the automaton can make a computation of the form , without noticing the difference between the two patterns. Attaching an element to any port of an element results in an element equivalent to by Lemma 4. Then , and this is a proper step of type , see Figure 4(right).
-
In the case of , the goal of the proof is to show, for each number , that either this satisfies Condition I in the claim, or Condition II holds in general. Let be any integer with . It will be proved that either the automaton can move on the faulty pattern as stated, or there exists a proper step of type .
A proper step of type splits into two transfers, that is, for some state . Then, Lemma 4 asserts that , and since attaching produces an equivalent element, . A new pattern of rank is obtained out of the faulty pattern by removing all elements . Then the nodes and are consecutive ports of the pattern . Since , by Lemma 4, either for all , or . In the former case, the automaton can move from configuration to configuration on the faulty pattern , and the number satisfies Condition I. And in the latter case there is a proper step of type , and Condition II holds.
The next claim is that sometimes proper steps can be not only shrunk, but also stretched: if the automaton can shift by elements upwards on the correct pattern , then on the faulty pattern the automaton can either move upwards by the same distance without noticing the fault, or it can move one element further, or it can jump from anywhere to anywhere.
Claim 17.
Assume that there is a proper step of type , for some integer with , and for some two states . Let be an integer bounded as . Then at least one of the following three conditions holds:
-
I.
the automaton can move on the faulty pattern from configuration to configuration ;
-
II.
the automaton can move on from configuration to configuration ;
-
III.
for all , such that and , the automaton can move on from configuration to configuration .
The proof is by a longer case analysis; due to space constraints, only one case is presented.
Sketch of a proof for the case , ..
Let , for some intermediate states in the partition of some proper step of type into transfers through elements . Then implies , for some intermediate state . Next, by Lemma 4, there is at least one of the transfers and . If , then the sequence of transfers on the faulty pattern leads from configuration to configuration , and Condition I holds. And if , then, similarly, , and the automaton moves on from to ; this is Condition II.
8 How to move through the faulty pattern without noticing the fault
In this section, the proof of the Main Lemma will be completed, along with the whole proof of Theorem 4 stating that the nondeterministic tree-walking automata are not closed under complementation. It remains to prove that , using Claims 6, 7 and 7.
The desired computation from state in the leaf port of the faulty pattern to state in its root port is constructed as follows. Consider the computation on the correct pattern from Claim 6; it is fixed for the rest of this section. And now the goal is to modify this computation to reproduce it on the faulty pattern .
Since , the faulty pattern begins and ends with elements , just like the correct pattern . Therefore, the automaton , working on , can repeat the first and the last parts of the original computation defined in Claim 6: it can move up to configuration , and it can finish the computation from configuration . It remains to prove that the automaton can also move from configuration to configuration on the faulty pattern .
If any proper step satisfying Condition III from Claim 7 is used in the periodically repeated part of the computation, then this proper step can be stretched to almost the entire faulty pattern , and it can be used to skip the fault, so that the automaton moves on the faulty pattern from configuration to configuration . In the rest of the proof, it is assumed that there are no such proper steps in the periodically repeated sequence.
The next claim is that if the automaton works on the faulty pattern , then it may bypass the fault and get to some node after the fault in the state . It will come not necessarily to one of the nodes to which the original computation on the correct pattern arrives in the state ; what is important is that it comes in this state to some node far beyond the fault.
Claim 18.
The automaton operating on the faulty pattern may move from configuration to some configuration , where is a number with .
Sketch of a proof..
The idea is to take each proper step used in the periodic part of the computation on the correct pattern, and to reproduce it in the computation on the faulty pattern. When this is impossible, that proper step will be modified: a proper step downward will be shrunk using Claim 7, and an upward proper step will be stretched by Claim 7. This way, the sequence of modified proper steps on the faulty pattern will lead the automaton upward faster than the original sequence of proper steps on the correct pattern. Finally, once the modified sequence passes by the fault, the desired configuration can be obtained by just finishing the current iteration.
Now everything is prepared for the final step of the proof of the Main Lemma. It has been proved that on the faulty pattern the automaton passes by the fault and arrives in the state to some node , with . The original computation on the correct pattern regularly visits nodes in this region in the state , but the node need not be one of those nodes. The idea is to continue the computation on from , so that it gets back on the track of the periodic computation on . For that, the automaton should compensate for the shift of relative to the nearest node visited in the state on .
Proof of the Main Lemma..
If it is possible to pass by the fault with zero shift, then this is all, the Main Lemma is proved. Assume that this is impossible. Then, some proper step of some type from the periodic part of the computation on the correct pattern cannot be reproduced on the faulty pattern. In this case, Condition I in Claim 7 does not hold for this step. Then the claim asserts that there is a proper step of type , where and . Let a single iteration of the periodically repeated sequence move the automaton up by elements ; and from Claim 6 it is known that . Furthermore, , because otherwise all nodes from to would be visited in the state , and the automaton would pass by the fault without a shift. Then, if one uses the shrunken proper step of type in the iteration instead of the original proper step of type , then the resulting iteration moves the automaton upwards either by or by elements .
The idea is to use this iteration with a shrunken proper step several times to compensate for the shift. By Claim 8, the automaton can move on the faulty pattern from to , with . In the computation on the correct pattern the automaton comes to nodes with numbers , , , …, in the state . And to get back on the track of this computation, the automaton should compensate for the wrong residue of the number modulo . This can be done by applying at most iterations with the shrunken proper step. It remains to prove that the automaton cannot reach the fault and cannot move upward too far while compensating for the wrong residue modulo .
Since the original iteration has pace at least and involves at most consecutive elements , the iteration with the shrunken proper step has positive pace and involves at most consecutive elements . Thus, the automaton cannot reach the fault by applying such iterations starting in the node , because .
Now consider how far up the automaton can move while correcting the shift. It applies at most iterations with the shrunken proper step; such iterations have pace at most . Therefore, applying these iterations leads the automaton to a node with number at most , and , which is less than , because and . And since each iteration involves at most consecutive elements , during these iterations the automaton can visit nodes with numbers at most , which satisfies .
Therefore, the automaton , operating on the faulty pattern , can return to the track of the computation on the correct pattern, and come to the root in the state . This proves the Main Lemma, as well as the entire Theorem 4.
9 UTWA are weaker than NTWA
An unambiguous tree-walking automaton (UTWA) is a nondeterministic tree-walking automaton that has at most one accepting computation on each input tree. In this section, it is proved that UTWA are strictly weaker than NTWA.
The separating language , defined in Section 3, is the same as in the rest of this paper. By Theorem 4, it is recognized by an NTWA .
Theorem 19.
The class of tree languages recognized by unambiguous tree-walking automata is strictly smaller than the class of languages recognized by nondeterministic tree-walking automata. In particular, no UTWA recognizes the language .
Proof.
Let be any NTWA recognizing the language . It should be proved that the automaton is not unambiguous. Let be the number of its states. Let and , both of rank , be the patterns defined for the automaton as in Section 5. By the Main Lemma, .
Consider the following four trees.
Of these, the three trees are in the language , and hence accepted by the automaton . The tree is not in , and must be rejected by .
Let be an accepting computation on the tree . By replacing the small correct pattern in the tree with , and by using the inclusion , the computation is transformed into an accepting computation on the tree , such that for every run through the upper pattern made in there is a run of the same type through the small correct pattern (as it was made in ).
Similarly, an accepting computation on the tree is transformed into an accepting computation on the tree . All runs through the lower pattern made in can be reproduced on .
Suppose that the accepting computations and on the tree are the same. Then the computation , while passing through each pattern , makes only such runs that can be executed on . Replacing both patterns with , one obtains an accepting computation of the automaton on the tree , which is not in . This is a contradiction, and therefore the computations and are distinct, and thus the automaton is not unambiguous.
10 A deterministic one-pebble tree-walking automaton recognizing
It is worth noting that the language used in this paper can be recognized by a deterministic tree-walking automaton with one pebble (see Bojańczyk et al. [6] for a precise definition of this model).
Theorem 20.
There is a deterministic tree-walking automaton with one pebble recognizing the language .
Proof.
The automaton moves its pebble in the order of depth-first tree traversal. When it puts the pebble at some node , it proceeds with checking the following two conditions: first, that the left subtree of contains at least one -labelled leaf, and secondly, that the right subtree of contains at least two -labelled leaves. With the pebble in place, both searches can be done deterministically. If both conditions hold for some node , then the automaton reports that the tree is in . Otherwise, it moves the pebble to the next position and begins the next check.
If no node in the tree has both conditions satisfied at the same time, the automaton eventually completes moving its pebble around and reports that the tree is not in .
Since the family of one-pebble deterministic tree-walking automata is closed under complementation [14], the complement of the language is recognized by such an automaton as well. Thus, the complement of is a tree language recognized by a one-pebble deterministic tree-walking automaton, but not by any NTWA.
11 Conclusion
The second result of this paper, that is, that unambiguous tree-walking automata (UTWA) are weaker than nondeterministic ones, leaves a few related questions to investigate. Most importantly, it remains unknown whether UTWA are any more powerful than DTWA. If these families turn out to be different, then another question will arise, whether UTWA are closed under complementation. Furthermore, the same questions can be asked about unambiguous graph-walking automata (UGWA): if they can be determinized, then so can be UTWA, but it could also be possible that UTWA can be determinized whereas UGWA cannot.
References
- [1] Alfred V. Aho and Jeffrey D. Ullman. Translations on a context-free grammar. Inf. Control., 19(5):439–475, 1971. doi:10.1016/S0019-9958(71)90706-6.
- [2] Mikołaj Bojańczyk. Tree-walking automata. In Carlos Martín-Vide, Friedrich Otto, and Henning Fernau, editors, Language and Automata Theory and Applications, Second International Conference, LATA 2008, Tarragona, Spain, March 13-19, 2008. Revised Papers, volume 5196 of Lecture Notes in Computer Science, pages 1–2. Springer, 2008. doi:10.1007/978-3-540-88282-4_1.
- [3] Mikołaj Bojańczyk. It is undecidable if two regular tree languages can be separated by a deterministic tree-walking automaton. Fundam. Informaticae, 154(1-4):37–46, 2017. doi:10.3233/FI-2017-1551.
- [4] Mikołaj Bojańczyk and Thomas Colcombet. Tree-walking automata cannot be determinized. Theor. Comput. Sci., 350(2-3):164–173, 2006. doi:10.1016/j.tcs.2005.10.031.
- [5] Mikołaj Bojańczyk and Thomas Colcombet. Tree-walking automata do not recognize all regular languages. SIAM J. Comput., 38(2):658–701, 2008. doi:10.1137/050645427.
- [6] Mikołaj Bojańczyk, Mathias Samuelides, Thomas Schwentick, and Luc Segoufin. Expressive power of pebble automata. In Michele Bugliesi, Bart Preneel, Vladimiro Sassone, and Ingo Wegener, editors, Automata, Languages and Programming, 33rd International Colloquium, ICALP 2006, Venice, Italy, July 10-14, 2006, Proceedings, Part I, volume 4051 of Lecture Notes in Computer Science, pages 157–168. Springer, 2006. doi:10.1007/11786986_15.
- [7] Thomas Colcombet. Unambiguity in automata theory. In Jeffrey O. Shallit and Alexander Okhotin, editors, Descriptional Complexity of Formal Systems - 17th International Workshop, DCFS 2015, Waterloo, ON, Canada, June 25-27, 2015. Proceedings, volume 9118 of Lecture Notes in Computer Science, pages 3–18. Springer, 2015. doi:10.1007/978-3-319-19225-3_1.
- [8] Joost Engelfriet and Hendrik Jan Hoogeboom. Tree-walking pebble automata. In Juhani Karhumäki, Hermann A. Maurer, Gheorghe Păun, and Grzegorz Rozenberg, editors, Jewels are Forever, Contributions on Theoretical Computer Science in Honor of Arto Salomaa, pages 72–83. Springer, 1999. doi:10.1007/978-3-642-60207-8_7.
- [9] Joost Engelfriet and Hendrik Jan Hoogeboom. Automata with nested pebbles capture first-order logic with transitive closure. Log. Methods Comput. Sci., 3(2), 2007. doi:10.2168/LMCS-3(2:3)2007.
- [10] Mika Göös, Stefan Kiefer, and Weiqiang Yuan. Lower bounds for unambiguous automata via communication complexity. In Mikołaj Bojańczyk, Emanuela Merelli, and David P. Woodruff, editors, 49th International Colloquium on Automata, Languages, and Programming, ICALP 2022, July 4-8, 2022, Paris, France, volume 229 of LIPIcs, pages 126:1–126:13. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.ICALP.2022.126.
- [11] Emil Indzhev and Stefan Kiefer. On complementing unambiguous automata and graphs with many cliques and cocliques. Inf. Process. Lett., 177:106270, 2022. doi:10.1016/J.IPL.2022.106270.
- [12] Michal Kunc and Alexander Okhotin. Reversibility of computations in graph-walking automata. Inf. Comput., 275:104631, 2020. doi:10.1016/j.ic.2020.104631.
- [13] Olga Martynova. Complexity of the emptiness problem for graph-walking automata and for tilings with star subgraphs. Inf. Comput., 296:105127, 2024. doi:10.1016/J.IC.2023.105127.
- [14] Anca Muscholl, Mathias Samuelides, and Luc Segoufin. Complementing deterministic tree-walking automata. Inf. Process. Lett., 99(1):33–39, 2006. doi:10.1016/j.ipl.2005.09.017.
- [15] Frank Neven. Automata, logic, and XML. In Julian C. Bradfield, editor, Computer Science Logic, 16th International Workshop, CSL 2002, 11th Annual Conference of the EACSL, Edinburgh, Scotland, UK, September 22-25, 2002, Proceedings, volume 2471 of Lecture Notes in Computer Science, pages 2–26. Springer, 2002. doi:10.1007/3-540-45793-3_2.
- [16] Frank Neven and Thomas Schwentick. On the power of tree-walking automata. Inf. Comput., 183(1):86–103, 2003. doi:10.1016/S0890-5401(03)00013-0.
- [17] Mathias Samuelides and Luc Segoufin. Complexity of pebble tree-walking automata. In Erzsébet Csuhaj-Varjú and Zoltán Ésik, editors, Fundamentals of Computation Theory, 16th International Symposium, FCT 2007, Budapest, Hungary, August 27-30, 2007, Proceedings, volume 4639 of Lecture Notes in Computer Science, pages 458–469. Springer, 2007. doi:10.1007/978-3-540-74240-1_40.
- [18] Michael Sipser. Halting space-bounded computations. Theor. Comput. Sci., 10:335–338, 1980. doi:10.1016/0304-3975(80)90053-5.
- [19] Margus Veanes. On computational complexity of basic decision problems of finite tree automata. UPMAIL Technical Report 133, Computing Science Department, Uppsala University, January 1997.