Games for Active XML Revisited

The paper studies the rewriting mechanisms for intensional documents in the Active XML framework, abstracted in the form of active context-free games. The safe rewriting problem studied in this paper is to decide whether the first player, Juliet, has a winning strategy for a given game and (nested) word; this corresponds to a successful rewriting strategy for a given intensional document. The paper examines several extensions to active context-free games. The primary extension allows more expressive schemas (namely XML schemas and regular nested word languages) for both target and replacement languages and has the effect that games are played on nested words instead of (flat) words as in previous studies. Other extensions consider validation of input parameters of web services, and an alternative semantics based on insertion of service call results. In general, the complexity of the safe rewriting problem is highly intractable (doubly exponential time), but the paper identifies interesting tractable cases.


Introduction
Scientific context This paper contributes to the theoretical foundations of intensional documents, in the framework of Active XML [1]. It studies game-based abstractions of the mechanism transforming intensional documents into documents of a desired form by calling web services. One form of such games has been introduced under the name active context-free games in [14] as an abstraction of a problem studied in [12]. The setting in [12] is as follows: an Active XML document is given, where some elements consist of functions representing web services that can be called. The goal is to rewrite the document by a series of web service calls into a document matching a given target schema.
Towards an intuition of Active XML document rewriting, consider the example in Figure 1 of an online local news site dynamically loading information about weather and local events (adapted from [12] and [14]). Figure 1a shows the initial Active XML document for such a site, containing function nodes which refer to a weather and an event service, respectively, instead of concrete weather and event data. After a single function call to each of these services has been materialised, the resulting document may look like the one depicted in Figure 1b. Note that the rewritten document now contains new function nodes; further rewriting might be necessary to reach a document in a given target schema (which could, for instance, require that the document contains at least one indoor event if the weather is rainy).
Modelling this rewriting problem as a game follows the approach of dealing with uncertainty by playing a "game against nature": we model the process that rewrites a given document into a target schema by performing function calls as a player (Juliet). As her moves, she chooses which function nodes to call, and her goal is to reach a document in the target schema. Returns of function calls, on the other hand, are chosen (in accordance with some schema for each called service) by an antagonistic second player (Romeo), whose goal is to foil Juliet. The question whether a given document can always be rewritten into the target schema may then be solved by deciding whether Juliet has a winning strategy. More specifically, given an input document, target schema and return schemas for function calls, there should exist a safe rewriting algorithm that always rewrites the input document into the target schema, no matter the concrete returns of function calls, if and only if Juliet has a winning strategy in the corresponding game.
In [12], the target schema is represented by an XML document type definition (DTD). It was argued that, due to the restricted nature of DTDs, the problem can be reduced to a rewriting game on strings where, in each move, a single symbol is replaced by a string, the set of allowed replacement strings for each symbol is a regular language, and the target language is regular as well.
In [14] the complexity of the problem to determine the winner in such games (mainly with finite replacement languages) was studied. Whereas this problem is undecidable in general, there are important cases in which it can be solved, particularly if Juliet chooses the symbols to be replaced in a left-to-right fashion. In and after [14,12], research very much concentrated on games on strings (and thus on the setting with DTDs). Furthermore, to achieve tractability, a special emphasis was given to the restriction to bounded strategies, in which the recursion depth with respect to web service calls is bounded by some constant.

                                       No replay   Bounded     Unbounded
Regular target language
    Regular replacement                PSPACE      2-EXPTIME   2-EXPTIME
    Finite replacement                 PSPACE      PSPACE      EXPTIME
DTD or XML Schema target language
    Regular replacement                PTIME       PSPACE      EXPTIME
    Finite replacement                 PTIME       PTIME       EXPTIME

Table 1: Summary of complexity results. All results are completeness results.

Our approach The aim of this paper is to broaden the scope and extend the investigation of games for Active XML in several aspects. First of all, we consider stronger schema languages (compared to DTDs) such as XML Schema and Relax NG, due to their practical importance. To allow for this extension, our games are played on nested words [3]. Furthermore, we study the impact of the validation of input parameters for web service calls (partly considered already in [12]), and investigate an alternative semantics, where results of web service calls are inserted next to the node representing the web service, as opposed to replacing that node.
As we are particularly interested in the identification of tractable cases, we follow the previous line of research by concentrating on strategies in document order (left-to-right strategies) and by considering bounded strategies (bounded replay) and strategies in which no calls in results from previous web service calls are allowed (no replay). However, we also pinpoint the complexity of the general setting.
As a basic intuition for the concept of replay, consider again the online news site example from Figure 1, and assume that the schema for the event service's returns is (partially) given by @event_svc → (Sports | Movie) @event_svc, i.e. the event service allows for dynamic loading of additional results. A strategy with no replay would not be allowed to fetch any additional results in the situation of Figure 1b, while a strategy with bounded replay k (for some constant k) could load up to k more events after the first. A strategy with unbounded replay would be able to fetch an arbitrary number of results, but might lead to a rewriting process that does not terminate if unsuccessful.
Our contributions Our complexity results with respect to stronger schema languages are summarised in Table 1. In the general setting, the complexity is very high: doubly exponential time. However, there are tractable cases for XML Schema: replay-free strategies in general and strategies with bounded replay in the case of finite replacement languages (that is, when there are only finitely many possible answers for each web service). It should be noted that the PSPACE-hardness result for the case with DTDs, bounded replay and infinite replacement languages indicates that the respective PTIME claim in [12] is wrong.
In the setting where web services come with an input schema that restricts the parameters of web service calls, we only study replay-free strategies. It turns out that this case is tractable if all schemas are specified by DTDs and the number of web services is bounded. On the other hand, if the desired document structure is specified by an XML Schema or the number of function symbols is unbounded, the task becomes PSPACE-hard.
For insertion-based semantics, we identify an undecidable setting and, in all other settings, establish a correspondence with the standard "replacement" semantics.
As a side result of independent interest, we show that the word problem for alternating nested word automata is PSPACE-complete.
Related Work We note that the results on flat strings in this paper do not directly follow from the results in [14], as [14] assumed target languages given by DFAs as opposed to deterministic regular expressions, which are integral to both DTDs and more expressive XML schema languages. However, the techniques from [14] can be adapted.
More related work for active context-free games than the papers mentioned so far is discussed in [14]. Further results on active context-free games in the "flat strings" setting can be found in [2,4]. A different form of 2-player rewrite games is studied in [18]. More general structure rewriting games are defined in [9].
Organisation We give basic definitions in Section 2. Games with regular schema languages (given by nested word automata) are studied in Section 3; games in which the schemas are given as DTDs or XML Schemas are investigated in Section 4. Validation of parameters and insertion of web service results are considered in Section 5. Most proofs are deferred to the appendix for brevity.

Preliminaries
For any natural number $n \in \mathbb{N}$, we denote by $[n]$ the set $\{1, \ldots, n\}$. Where $M$ is a (finite) set, $\mathcal{P}(M)$ denotes the powerset of $M$, i.e. the set of all subsets of $M$. For an alphabet $\Sigma$, we denote the set of finite strings over $\Sigma$ by $\Sigma^*$, and $\epsilon$ denotes the empty string.
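Nested words We use nested words as an abstraction of XML documents [3]. For a finite alphabet $\Sigma$, $\langle\Sigma\rangle \overset{\text{def}}{=} \{\langle a\rangle \mid a \in \Sigma\}$ denotes the set of all opening $\Sigma$-tags and $\langle/\Sigma\rangle \overset{\text{def}}{=} \{\langle/a\rangle \mid a \in \Sigma\}$ the set of all closing $\Sigma$-tags. The set $\mathrm{WF}(\Sigma) \subseteq (\langle\Sigma\rangle \cup \langle/\Sigma\rangle)^*$ of (well-)nested words over $\Sigma$ is the smallest set such that $\epsilon \in \mathrm{WF}(\Sigma)$, and if $u, v \in \mathrm{WF}(\Sigma)$ and $a \in \Sigma$, then also $u\langle a\rangle v\langle/a\rangle \in \mathrm{WF}(\Sigma)$. We (informally) associate with every nested word $w$ its canonical forest representation, such that words $\langle a\rangle\langle/a\rangle$, $\langle a\rangle v\langle/a\rangle$ and $uv$ correspond to an $a$-labelled leaf, a tree with root $a$ (and subforest corresponding to $v$), and the forest of $u$ followed by the forest of $v$, respectively. A nested string $w$ is rooted if its corresponding forest is a tree. In a nested string $w = w_1 \ldots w_n \in \mathrm{WF}(\Sigma)$, two tags $w_i \in \langle\Sigma\rangle$ and $w_j \in \langle/\Sigma\rangle$ with $i < j$ are associated if the substring $w_i \ldots w_j$ of $w$ is rooted. To stress the distinction from nested strings in $\mathrm{WF}(\Sigma)$, we refer to strings in $\Sigma^*$ as flat strings (over $\Sigma$).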
What we describe as opening and closing tags is often referred to as call symbols and return symbols in the literature on nested words; we do not use these terms, to prevent confusion with the Read and Call moves used in context-free games (see below).
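As an informal illustration of the definition of well-nested and rooted words, the following Python sketch checks both properties. The encoding of tags as strings of the form "<a>" and "</a>" and the function names are assumptions made for this example only; they are not part of the formal development.

```python
from typing import List


def is_well_nested(word: List[str]) -> bool:
    """Check membership in WF(Sigma) using a stack of pending opening tags."""
    stack: List[str] = []
    for tag in word:
        if tag.startswith("</"):
            # A closing tag must match the most recent unmatched opening tag.
            if not stack or stack.pop() != tag[2:-1]:
                return False
        else:
            stack.append(tag[1:-1])
    return not stack


def is_rooted(word: List[str]) -> bool:
    """A nested word is rooted if its forest representation is a single tree."""
    if not word or not is_well_nested(word):
        return False
    depth = 0
    for i, tag in enumerate(word):
        depth += -1 if tag.startswith("</") else 1
        # Reaching depth 0 before the last tag means a second tree follows.
        if depth == 0 and i < len(word) - 1:
            return False
    return True
```

For instance, the word "<a>", "<b>", "</b>", "</a>" is rooted, whereas "<a>", "</a>", "<b>", "</b>" is well-nested but corresponds to a forest of two trees.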
Context-free games A context-free game on nested words (cfG) $G = (\Sigma, \Gamma, R, T)$ consists of a finite alphabet $\Sigma$, a set $\Gamma \subseteq \Sigma$ of function symbols, a rule set $R \subseteq \Gamma \times \mathrm{WF}(\Sigma)$ and a target language $T \subseteq \mathrm{WF}(\Sigma)$. We will only consider the case where $T$ and, for each symbol $a \in \Gamma$, the set $R_a \overset{\text{def}}{=} \{u \mid (a, u) \in R\}$ is a non-empty regular nested word language, to be defined in the next subsection.
A play of $G$ is played by two players, Juliet and Romeo, on a word $w \in \mathrm{WF}(\Sigma)$. In a nutshell, Juliet moves the focus along $w$ in a left-to-right manner and decides, for every closing tag $\langle/a\rangle$, whether she plays a Read or, in case $a \in \Gamma$, a Call move. In the latter case, Romeo then replaces the rooted word ending at the position of $\langle/a\rangle$ with some word $v \in R_a$ and the focus is set on the first symbol of $v$. In case of a Read move (or an opening tag) the focus just moves further on. Juliet wins a play if the word obtained at its end is in $T$.
Towards a formal definition, a configuration is a tuple $\kappa = (p, u, v) \in \{J, R\} \times (\langle\Sigma\rangle \cup \langle/\Sigma\rangle)^* \times (\langle\Sigma\rangle \cup \langle/\Sigma\rangle)^*$ where $p$ is the player to move, $uv \in \mathrm{WF}(\Sigma)$ is the current word, and the first symbol of $v$ is the current position. A winning configuration for Juliet is a configuration $\kappa = (J, u, \epsilon)$ with $u \in T$. The configuration $\kappa' = (p', u', v')$ is a successor configuration of $\kappa = (p, u, v)$ (Notation: $\kappa \to \kappa'$) if one of the following holds:
(1) $p' = p = J$, $u' = us$, and $sv' = v$ for some $s \in \langle\Sigma\rangle \cup \langle/\Sigma\rangle$ (Juliet plays Read);
(2) $p = J$, $p' = R$, $u = u'$, $v = v' = \langle/a\rangle z$ for $z \in (\langle\Sigma\rangle \cup \langle/\Sigma\rangle)^*$, $a \in \Gamma$ (Juliet plays Call);
(3) $p = R$, $p' = J$, $u = x\langle a\rangle y$, $v = \langle/a\rangle z$ for $x, z \in (\langle\Sigma\rangle \cup \langle/\Sigma\rangle)^*$, $y \in \mathrm{WF}(\Sigma)$, $u' = x$ and $v' = y'z$ for some $y' \in R_a$ (Romeo plays $y'$).
We note that a Call move on $\langle/a\rangle$ in a substring of the form $\langle a\rangle y\langle/a\rangle$ actually deletes the substring $y$ along with the opening and closing $a$-tags. This is consistent with the AXML intuition of the subtree rooted at a function node getting replaced when the function node is called.
The initial configuration of game $G$ for string $w$ is $\kappa_0(w) \overset{\text{def}}{=} (J, \epsilon, w)$. A play of $G$ is either an infinite sequence $\Pi = \kappa_0, \kappa_1, \ldots$ or a finite sequence $\Pi = \kappa_0, \kappa_1, \ldots, \kappa_k$ of configurations, where, for each $i > 0$, $\kappa_{i-1} \to \kappa_i$ and, in the finite case, $\kappa_k$ has no successor configuration. In the latter case, Juliet wins the play if $\kappa_k$ is of the form $(J, u, \epsilon)$ with $u \in T$; in all other cases, Romeo wins.
It is easy to see that the winning chances of the game do not change if we allow Juliet to play Call moves at opening tags: if Juliet wants to play Call at an opening tag she can simply play Read until the focus reaches the corresponding closing tag and play Call then. On the other hand, if she can win a game by calling a closing tag, she can also win it by calling the corresponding opening tag, thanks to the fact that she has full information.
Some of the definitions in this subsection are taken from [4].
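The successor relation can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions that are not made in the text: replacement languages are finite and given explicitly as lists, tags are encoded as strings "<a>" and "</a>", and the names `label` and `successors` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Word = Tuple[str, ...]
Config = Tuple[str, Word, Word]  # (player, u, v) with player "J" or "R"


def label(tag: str) -> str:
    """Strip the brackets from '<a>' or '</a>'."""
    return tag[2:-1] if tag.startswith("</") else tag[1:-1]


def successors(cfg: Config, gamma: Set[str],
               rules: Dict[str, List[Word]]) -> List[Config]:
    """List all successor configurations of cfg (finite rules assumed)."""
    player, u, v = cfg
    if not v:
        return []  # the focus has passed the last symbol
    if player == "J":
        # (1) Read: the first symbol of v moves to the end of u.
        result: List[Config] = [("J", u + v[:1], v[1:])]
        # (2) Call: only on the closing tag of a function symbol.
        if v[0].startswith("</") and label(v[0]) in gamma:
            result.append(("R", u, v))
        return result
    # (3) Romeo: u = x <a> y and v = </a> z; locate the associated <a> in u.
    depth, i = 0, len(u) - 1
    while u[i].startswith("</") or depth > 0:
        depth += 1 if u[i].startswith("</") else -1
        i -= 1
    # Romeo replaces <a> y </a> by a reply; the focus moves to its first symbol.
    return [("J", u[:i], reply + v[1:]) for reply in rules[label(v[0])]]
```

A configuration of the form (J, u, ε) has no successors in this sketch, which matches the end of a finite play.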
Proposition 1. Let $G$ be a context-free game, and $w$ a string. Then either Juliet or Romeo has a winning strategy on $w$, which is actually memoryless.

Therefore, in the following, strategies $\sigma$ for Juliet map configurations $\kappa$ to moves $\sigma(\kappa) \in \{\mathit{Call}, \mathit{Read}\}$ and strategies $\tau$ for Romeo map configurations $\kappa$ to moves $\tau(\kappa) \in \mathrm{WF}(\Sigma)$.
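For configurations $\kappa, \kappa'$ and strategies $\sigma, \tau$ we write $\kappa \xrightarrow{\sigma,\tau} \kappa'$ if $\kappa'$ is the unique successor configuration of $\kappa$ determined by strategies $\sigma$ and $\tau$. Given an initial word $w$ and strategies $\sigma, \tau$, the play $\Pi(\sigma, \tau, w) \overset{\text{def}}{=} \kappa_0(w) \xrightarrow{\sigma,\tau} \kappa_1 \xrightarrow{\sigma,\tau} \cdots$ is uniquely determined. If $\Pi(\sigma, \tau, w)$ is finite, we denote the word represented by its final configuration by $\mathrm{word}_G(w, \sigma, \tau)$.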
A strategy $\sigma$ for Juliet is finite on string $w$ if the play $\Pi(\sigma, \tau, w)$ is finite for every strategy $\tau$ of Romeo. It is a winning strategy on $w$ if Juliet wins the play $\Pi(\sigma, \tau, w)$ for every strategy $\tau$ of Romeo. A strategy $\tau$ for Romeo is a winning strategy for $w$ if Romeo wins $\Pi(\sigma, \tau, w)$ for every strategy $\sigma$ of Juliet. We only consider finite strategies for Juliet, due to Juliet's winning condition. We denote the set of all finite strategies for Juliet in the game $G$ by $\mathrm{STRAT}_J(G)$, and the set of all strategies for Romeo by $\mathrm{STRAT}_R(G)$.
The Call depth of a play $\Pi$ is the maximum nesting depth of Call moves in $\Pi$, if this maximum exists. That is, the Call depth of a play is zero if no Call is played at all, and one if some Call is played but no Call is played inside a string yielded by a replacement move. For a strategy $\sigma$ of Juliet and a string $w \in \mathrm{WF}(\Sigma)$, the Call depth $\mathrm{Depth}_G(\sigma, w)$ of $\sigma$ on $w$ is the maximum Call depth in any play $\Pi(\sigma, \tau, w)$. A strategy $\sigma$ has $k$-bounded Call depth if $\mathrm{Depth}_G(\sigma, w) \le k$ for all $w \in \mathrm{WF}(\Sigma)$. We denote by $\mathrm{STRAT}^k_J(G)$ the set of all strategies with $k$-bounded Call depth for Juliet on $G$. As a more intuitive formulation, we use the concept of replay, which is defined as Call depth (if it exists) minus one: strategies for Juliet of Call depth one are called replay-free, and strategies of $k$-bounded Call depth, for any $k$, have bounded replay. For technical reasons, we need to use Call depth for some formal proofs and definitions, but we will stick with the more intuitive concept of replay wherever possible.
By $\mathrm{JWin}(G)$ we denote the set of all words for which Juliet has a winning strategy in $\mathrm{STRAT}_J(G)$ (likewise for $\mathrm{JWin}^k(G)$ and $\mathrm{STRAT}^k_J(G)$).
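To make the notions of Call depth and $\mathrm{JWin}^k(G)$ more tangible, the following Python sketch decides by exhaustive search whether Juliet wins with Call depth at most k. It is a naive illustration, not one of the algorithms developed in this paper, and it rests on assumptions of our own: replacement languages are finite and listed explicitly, the target language is passed as a Python predicate `in_target`, and every symbol carries the number of nested Calls that produced it.

```python
from typing import Callable, Dict, List, Set, Tuple

Word = Tuple[str, ...]


def label(tag: str) -> str:
    """Strip the brackets from '<a>' or '</a>'."""
    return tag[2:-1] if tag.startswith("</") else tag[1:-1]


def juliet_wins(word: Word, gamma: Set[str], rules: Dict[str, List[Word]],
                in_target: Callable[[Word], bool], k: int) -> bool:
    """Exhaustive search: does Juliet win on `word` with Call depth <= k?"""

    def win(u: Word, v: Tuple[Tuple[str, int], ...]) -> bool:
        if not v:
            return in_target(u)
        (tag, depth), rest = v[0], v[1:]
        if win(u + (tag,), rest):  # Juliet plays Read
            return True
        if tag.startswith("</") and label(tag) in gamma and depth < k:
            # Juliet plays Call: find the associated opening tag in u ...
            nesting, i = 0, len(u) - 1
            while u[i].startswith("</") or nesting > 0:
                nesting += 1 if u[i].startswith("</") else -1
                i -= 1
            replies = rules[label(tag)]
            # ... and she must win against every reply of Romeo.
            return bool(replies) and all(
                win(u[:i], tuple((t, depth + 1) for t in reply) + rest)
                for reply in replies)
        return False

    return win((), tuple((t, 0) for t in word))


# Hypothetical toy game: the target language forbids the function symbol a.
def no_a(w: Word) -> bool:
    return all(label(t) != "a" for t in w)


harmless = {"a": [("<b>", "</b>"), ("<c>", "</c>")]}
looping = {"a": [("<b>", "</b>"), ("<a>", "</a>")]}
```

In the toy game, Juliet wins on "<a>", "</a>" with the rules `harmless` as soon as one Call is allowed, while with the rules `looping` Romeo can answer every Call with a fresh a-element, so no bound k helps her.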
Nested word automata A nested word automaton (NWA) $A = (Q, \Sigma, \delta, q_0, F)$ [3] is basically a pushdown automaton which performs a push operation on every opening tag and a pop operation on every closing tag, and in which the pushdown symbols are just states. More formally, $A$ consists of a set $Q$ of states, an alphabet $\Sigma$, a transition function $\delta$, an initial state $q_0 \in Q$ and a set $F \subseteq Q$ of accepting states. The function $\delta$ is the union of a function $(Q \times \langle\Sigma\rangle) \to \mathcal{P}(Q \times Q)$ and a function $(Q \times Q \times \langle/\Sigma\rangle) \to \mathcal{P}(Q)$.
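A configuration $\kappa$ of $A$ is a tuple $(q, \alpha) \in Q \times Q^*$, with a linear state $q$ and a sequence $\alpha$ of hierarchical states, reflecting the pushdown store. A run of $A$ on $w = w_1 \ldots w_n \in \mathrm{WF}(\Sigma)$ is a sequence $\kappa_0, \ldots, \kappa_n$ of configurations $\kappa_i = (q_i, \alpha_i)$ of $A$ such that for each $i \in [n]$ and $a \in \Sigma$ it holds that
• if $w_i = \langle a\rangle$, then $(q_i, p) \in \delta(q_{i-1}, \langle a\rangle)$ (for some $p \in Q$), and $\alpha_i = p\alpha_{i-1}$, or
• if $w_i = \langle/a\rangle$, then $q_i \in \delta(q_{i-1}, p, \langle/a\rangle)$ (for some $p \in Q$), and $p\alpha_i = \alpha_{i-1}$.
In this case, we also write $\kappa_0 \overset{w}{\leadsto}_A \kappa_n$. We say that $A$ accepts $w$ if $(q_0, \epsilon) \overset{w}{\leadsto}_A (q', \epsilon)$ for some $q' \in F$. The language $L(A) \subseteq \mathrm{WF}(\Sigma)$ is defined as the set of all strings accepted by $A$ and is called a regular language (of nested words).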
An NWA is deterministic (or DNWA) if $|\delta(q, \langle a\rangle)| = 1 = |\delta(q, p, \langle/a\rangle)|$ for all $p, q \in Q$ and $a \in \Sigma$. In this case, we simply write $\delta(q, \langle a\rangle) = (q', p')$ instead of $\delta(q, \langle a\rangle) = \{(q', p')\}$ (and accordingly for $\delta(q, p, \langle/a\rangle)$), and $\delta^*(p, w) = q$ if $q$ is the unique state for which $(p, \epsilon) \overset{w}{\leadsto}_A (q, \epsilon)$. An NWA is in normal form if every transition $\delta(p, \langle a\rangle)$ only uses pairs of the form $(q, p)$. Informally, when $A$ reads an opening tag it always pushes its current state (before the opening tag) and therefore can see this state when it reads the corresponding closing tag. As in this case the hierarchical state is just the origin state $p$ of the transition, we write $\delta(p, \langle a\rangle) = q$ as an abbreviation of $\delta(p, \langle a\rangle) = (q, p)$, for DNWAs in normal form.
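The behaviour of a DNWA in normal form on a well-nested word can be sketched in a few lines of Python. The dictionary-based encoding of $\delta$, the tag encoding "<a>" / "</a>" and the example automaton are assumptions chosen for illustration; the sketch further assumes that the input is well-nested and that the transition tables are total.

```python
from typing import Dict, List, Set, Tuple


def dnwa_accepts(word: List[str],
                 delta_open: Dict[Tuple[str, str], str],
                 delta_close: Dict[Tuple[str, str, str], str],
                 q0: str, final: Set[str]) -> bool:
    """Run a DNWA in normal form on a well-nested word."""
    state, stack = q0, []
    for tag in word:
        if tag.startswith("</"):
            p = stack.pop()  # hierarchical state pushed at the associated opening tag
            state = delta_close[(state, p, tag[2:-1])]
        else:
            stack.append(state)  # normal form: push the state before the opening tag
            state = delta_open[(state, tag[1:-1])]
    return not stack and state in final


# Hypothetical example automaton over {a, b}: accept iff no b-tag occurs.
STATES, SIGMA = ["ok", "bad"], ["a", "b"]
DELTA_OPEN = {(p, x): ("ok" if p == "ok" and x == "a" else "bad")
              for p in STATES for x in SIGMA}
DELTA_CLOSE = {(q, p, x): ("ok" if q == "ok" and p == "ok" else "bad")
               for q in STATES for p in STATES for x in SIGMA}
```

Since the automaton is deterministic, the final linear state computed by the loop is exactly $\delta^*(q_0, w)$.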

Lemma 2.
There is a polynomial-time algorithm that computes for every deterministic NWA an equivalent deterministic NWA in normal form.

Algorithmic Problems
In this paper, we study the following algorithmic problem $\mathrm{JWin}(\mathcal{G})$ for various classes $\mathcal{G}$ of context-free games.

$\mathrm{JWin}(\mathcal{G})$
Given: A context-free game $G \in \mathcal{G}$ and a string $w$.
Question: Is $w \in \mathrm{JWin}(G)$?

A class $\mathcal{G}$ of context-free games in $\mathrm{JWin}(\mathcal{G})$ comes with three parameters:
• the representation of the target language $T$,
• the representation of the replacement languages $R_a$, and
• to which extent replay is restricted.
It is a fair assumption that the representations of the target language and the replacement languages are of the same kind, but we will always discuss the impact of the replacement language representations separately. In our most general setting, investigated in Section 3, target languages are represented by deterministic nested word automata, and replacement languages by (not necessarily deterministic) nested word automata. We do not consider the representation of target languages by non-deterministic NWAs, as (1) already for DNWAs the complexity is very high in general, and (2) we can show that even in the replay-free case the complexity would become EXPTIME-complete. We usually denote the automata representing the target and replacement languages by $A(T)$ and $A(R_a)$, respectively.
In Section 4 we study the cases where T is given as an XML Schema or a DTD. In each setting, we consider the cases of unrestricted replay, bounded replay (Call depth k, for some k), and no replay (Call depth 1). We note that replay depth is formally not an actual game parameter, but the algorithmic problem can be restricted to strategies of Juliet of the stated kind.
If the class $\mathcal{G}$ of games is clear from the context, we often simply write JWin instead of $\mathrm{JWin}(\mathcal{G})$.
We denote by $|R|$ the combined size of all $A(R_a)$, $a \in \Gamma$, and by $|G|$ the size of (a sensible representation of) $G$, i.e. $|G| = |\Sigma| + |R| + |A(T)|$.

Games with regular target languages
We first consider our most general case, where target languages are given by DNWAs, replacement languages by NWAs and replay is unrestricted, because the algorithm that we develop for this case can be adapted (and sped up) for many of the more restricted cases. It is important to note that our results do not rely on the presentation of schemas as nested word automata. In fact, in Section 4, we will assume that the target schema is given as an XML Schema or a DTD. However, for our algorithms nested word automata are handy to represent (linearisations of) regular tree languages and therefore in this section target languages are represented by NWAs. We emphasise that deterministic bottom-up tree automata can be translated into deterministic NWAs in polynomial time [3].
This generic algorithm works in two main stages for a given cfG G and word w. It first analyses the game G and aggregates all necessary information in a so-called call effect C. Then it uses C to decide whether Juliet has a winning strategy in the game G on w.
The call effect $C$ only depends on $G$ and contains, for every function symbol $f$ and every state $q$ of $A(T)$, all possible effects of the subgame starting with a Call move of Juliet on some symbol $\langle/f\rangle$ on the target language $T$, under the assumption that the sub-computation of $A(T)$ on the word yielded by the game from $\langle/f\rangle$ starts in state $q$. More precisely, it summarises which sets $S$ of states Juliet can enforce by some strategy $\sigma$, where each $S$ is a set of states of $A(T)$ that Romeo might enforce with a counter strategy against $\sigma$.
The first stage of the algorithm consists of an inductive computation in which successive approximations $C^1, C^2, \ldots$ of $C$ are computed, where $C^i$ is the restriction of $C$ to strategies of Juliet of Call depth $i$. The size of call effects and the number of iterations are at most exponential in $|G|$. However, the first stage cannot be performed in exponential time, as a single iteration might take doubly exponential time in $|G|$. Our corresponding lower bound shows that single iterations cannot be done faster.
At the end of the first stage, the algorithm computes an alternating NWA $A_G$ (of exponential size) from $C$ that decides the set $\mathrm{JWin}(G)$. In the second stage, $A_G$ is evaluated on $w$, taking at most polynomial space in $|A_G|$ and $|w|$.
A restriction of games to bounded replay does not improve the general complexity of the problem, as this is dominated by the doubly exponential effort of a single iteration. However, for replay-free games, no iterations are needed; the initial call effect $C^1$ is of polynomial size and can easily be computed. Therefore, in this case, the overall complexity is dominated by the second stage, yielding a polynomial-space algorithm.
Altogether we prove the following theorem in this section.

The rest of this section gives a proof sketch for Theorem 3. Before we describe the generic algorithm in more detail, we discuss the very natural and more direct approach by alternating algorithms, in which a strategy for Juliet is nondeterministically guessed and the possible moves of Romeo are taken care of by universal branching. In our setting of context-free games, there are the following obstacles to this approach: (1) Romeo can, in general, choose from an infinite number of (and thus arbitrarily long) strings in $R_a$, for the current $a$, and (2) it is not a priori clear that such algorithms terminate on all branches. Whereas the latter obstacle is not too serious (if Juliet has a winning strategy, termination on all branches is guaranteed), the former requires a more refined approach. We basically deal with it in two ways: in some cases it is possible to show that it does not help Romeo to choose strings of length beyond some bound; in the remaining cases (in particular in those cases considered in this section), the algorithms use abstracted moves instead of the actual replacement moves of the game. The two stages that were sketched above then come very naturally: first, the abstraction has to be computed, then it can be used for the actual alternating computation.
Our abstraction from actual cfGs is based on the simple observation that instead of knowing the final word $\mathrm{word}_G(w, \sigma, \tau)$ that is reached in a play $\Pi(\sigma, \tau, w)$, it suffices to know whether $\delta^*(q_0, \mathrm{word}_G(w, \sigma, \tau)) \in F$ to tell the winner. If we fix a strategy $\sigma$ of Juliet in a game on $w$, the possible outcomes of the game (for the different strategies of Romeo) can thus be summarised by $\mathrm{states}_G(q_0, w, \sigma) \overset{\text{def}}{=} \{\delta^*(q_0, \mathrm{word}_G(w, \sigma, \tau)) \mid \tau \in \mathrm{STRAT}_R(G)\}$.

We next describe how to compute $C[G]$ from a given cfG $G$. As already mentioned, our algorithm follows a fixpoint-based approach. It computes inductively, for $k = 1, 2, \ldots$, the call effect of the restricted game of maximum Call depth $k$. We show that the fixpoint reached by this process is the actual call effect $C[G]$.
To this end, let, for every cfG $G$, $a \in \Sigma$, $q \in Q$, and $k \ge 1$, $C^k[G](a, q)$ denote the call effect, for $a$ and $q$, of the restricted game of maximum Call depth $k$. As an important special case, the call effect of replay-free games (the basis for the inductive computation) consists of only one set.
Lemma 5. For every $q \in Q$ and $a \in \Sigma$, $C^1[G](a, q)$ consists of a single set of states. In particular, $C^1[G]$ can be computed from $G$ in polynomial time.
This just follows from the definitions, as Romeo can choose any string from $R_a$. We next describe how each $C^{k+1}[G]$ can be computed from $C^k[G]$. The algorithm uses alternating nested word automata (ANWAs) which we will now define.
An alternating nested word automaton (ANWA) $A = (Q, \Sigma, \delta, q_0, F)$ is defined like an NWA, except that the two parts of $\delta$ map $(Q \times \langle\Sigma\rangle)$ into $\mathcal{B}^+(Q \times Q)$ and $(Q \times Q \times \langle/\Sigma\rangle)$ into $\mathcal{B}^+(Q)$, respectively, where $\mathcal{B}^+(Q)$ denotes the set of all positive boolean combinations over elements of $Q$ using the binary operators $\wedge$ and $\vee$ (and likewise for $\mathcal{B}^+(Q \times Q)$).
The semantics of ANWAs is defined via runs, which require the notion of tree domains. A tree domain is a prefix-closed language $D \subseteq \mathbb{N}^*$ of words over $\mathbb{N}$ such that, if $wk \in D$ for some $w \in D$, $k \in \mathbb{N}$, then also $wj \in D$ for all $j < k$. Strings in a tree domain are interpreted as node addresses for ordered trees in the standard way: $\epsilon$ addresses the root, and if $w \in D$ addresses some node $v$ with $k$ children, then $w1, \ldots, wk \in D$ address those children.
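As a small illustration of tree domains, the following Python sketch checks the two closure conditions for a finite set of node addresses. Addresses are encoded as tuples of positive integers, with the empty tuple for the root; this encoding, the convention that children are numbered from 1, and the function name are assumptions of the example.

```python
from typing import Set, Tuple


def is_tree_domain(domain: Set[Tuple[int, ...]]) -> bool:
    """Check prefix-closure and closure under smaller sibling indices."""
    for node in domain:
        if not node:
            continue  # the root has neither a parent nor a left sibling
        parent, k = node[:-1], node[-1]
        # Prefix-closure: checking the direct parent of every node suffices.
        if k < 1 or parent not in domain:
            return False
        # Sibling condition: checking the immediate left sibling suffices.
        if k > 1 and parent + (k - 1,) not in domain:
            return False
    return True
```

Checking only the direct parent and the immediate left sibling is enough, because the same check is applied to those nodes in turn.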
For any function $\lambda : D \to (Q \cup (Q \times Q))$ and node address $x \in D$, we denote by $\bar{\lambda}(x)$ the linear state component of $\lambda(x)$, i.e. if $\lambda(x) = q$ or $\lambda(x) = (q, p)$ for some $p, q \in Q$, then $\bar{\lambda}(x) = q$.
A run $r = (D, \lambda)$ of an ANWA $A$ over a nested word $w = w_1 \ldots w_n$ is a finite tree of depth $n$, represented by a tree domain $D$ and a labelling function $\lambda : D \to (Q \cup (Q \times Q))$ such that $\lambda(\epsilon) = q_0$ and, for every $x \in D$ of length $i$ with $\ell$ children, it holds that
• if $w_{i+1} \in \langle\Sigma\rangle$, then $\{\lambda(x \cdot 1), \ldots, \lambda(x \cdot \ell)\} \models \delta(\bar{\lambda}(x), w_{i+1})$, and
• if $w_{i+1} \in \langle/\Sigma\rangle$ with associated opening tag $w_j$, and $\lambda(y) = (q, p)$ for some $p, q \in Q$ (where $y$ is the prefix of $x$ of length $j$), then $\{\lambda(x \cdot 1), \ldots, \lambda(x \cdot \ell)\} \models \delta(\bar{\lambda}(x), p, w_{i+1})$.
An ANWA $A$ accepts a nested word $w$ if there is a run $(D, \lambda)$ over $w$ such that $\lambda(x) \in F$ for every $x \in D$ of length $|w|$.
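The satisfaction relation between a set of child labels and a positive boolean combination, as used in the definition of runs, can be sketched as follows. The encoding of formulas as nested tuples ("and", f, g) and ("or", f, g) over atoms is a hypothetical choice for this example; atoms may be states or pairs of states.

```python
from typing import Any, Set


def satisfies(atoms: Set[Any], formula: Any) -> bool:
    """Decide whether the set `atoms` satisfies a positive boolean formula."""
    is_compound = (isinstance(formula, tuple) and len(formula) == 3
                   and formula[0] in ("and", "or"))
    if not is_compound:
        # An atom holds exactly if it is among the chosen labels.
        return formula in atoms
    op, left, right = formula
    if op == "and":
        return satisfies(atoms, left) and satisfies(atoms, right)
    return satisfies(atoms, left) or satisfies(atoms, right)
```

As the formulas are positive, satisfaction is monotone: adding further labels to a satisfying set never falsifies the formula.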
ANWAs are used twice in the generic algorithm: first, to inductively compute $C^{k+1}[G]$ from $C^k[G]$, and second, to actually decide $\mathrm{JWin}(G)$, given $C[G]$. The following proposition will be crucial in both cases.

Proposition 6. There is an algorithm that computes from the call effect $C[G]$ of a game $G$, in polynomial time in $|C[G]|$ and $|G|$, an ANWA $A_{C[G]}$ such that $L(A_{C[G]}) = \mathrm{JWin}(G)$.

The computation of $C^{k+1}[G]$ from $C^k[G]$ involves a non-emptiness test for ANWAs, the second stage a test whether $w \in L(A_{C[G]})$. Therefore, both of the following complexity results for ANWAs influence the complexity of our algorithms.

Proposition 7.
(a) The non-emptiness problem for ANWAs is 2-EXPTIME-complete.
(b) The membership problem for ANWAs is PSPACE-complete.
Statement (a) follows immediately from the corresponding result for visibly pushdown automata in [5], statement (b) is new, to the best of our knowledge, and seems to be interesting in its own right. It is shown in the appendix. Now we continue describing the ingredients of the first stage of the generic algorithm.
Lemma 8. Given a state $q \in Q$, an alphabet symbol $a \in \Gamma$, and $C^k[G]$, for some $k \ge 1$, the call effect $C^{k+1}[G](a, q)$ can be computed in doubly exponential time in $|G|$.

By Lemmas 5 and 8, one can compute $C^k[G]$ inductively, for every $k \ge 1$. By definition it holds, for every $q$ and $a$, that $C^k[G](a, q)$ is contained in the closure of $C^{k+1}[G](a, q)$ under supersets. As there are at most $2^{|Q|}$ sets in each $C^k[G](a, q)$ (for $a \in \Gamma$, $q \in Q$), the computation reaches a fixed point after at most exponentially many iterations. We denote this fixed point by $C^*[G]$. In particular, for each game $G$, there is a number $\ell \le |\Gamma| \times |Q| \times 2^{|Q|}$ such that $C^*[G] = C^\ell[G]$ and $C^m[G] = C^\ell[G]$ for every $m \ge \ell$. However, it is not self-evident that this process actually constructs $C[G]$, i.e., that $C^*[G] = C[G]$. The following result shows that this is actually the case.
Proposition 9. For every cfG $G$ it holds: $C^*[G] = C[G]$.

Now we can give a (high-level) proof for Theorem 3.

Proof of Theorem 3. We first justify the upper bounds. Let $G$ be a cfG and $w$ a word. By Lemma 5, $C^1[G]$ can be computed in polynomial time from $G$. For the replay-free case, we can immediately construct an ANWA for $\mathrm{JWin}(G)$ and evaluate it on $w$, yielding a PSPACE upper bound by Proposition 7.
For (a) and (b), $C[G]$ ($C^k[G]$, respectively) can be computed in doubly exponential time, $A_C$ can be computed in exponential time (in the size of $G$), and whether $w \in L(A_C)$ can then be tested in polynomial space in $|A_C|$ and $|w|$, that is, in at most exponential space in $|G|$ and $|w|$.
That these upper bounds cannot be considerably improved is stated in the following proposition, thereby completing the proof of Theorem 3.

Claims (a) and (b) of Proposition 10 follow from the corresponding parts of Proposition 7; in the proof, we construct from an ANWA $A$ a replay-free cfG simulating $A$ on any input word $w$ (yielding claim (b)) and explain how replay can be added to that game to find and verify a witness for the non-emptiness of $A$, if one exists (yielding claim (a)).
For finite (and explicitly given) replacement languages the complexity changes considerably in the cases with replay, but not in the replay-free case.

The upper bound in (a) follows as for finite replacement languages $C^{k+1}[G](a, q)$ can be computed from $C^k[G](a, q)$ in polynomial space. The PSPACE upper bound in (b) can then be achieved by the usual "recomputation technique" of space-bounded computations.
The lower bound in (a) already holds for flat words (see Theorem 4.3 in [14]). The lower bound in (b) follows as the proof of Proposition 10 only uses finite replacement languages.
As our algorithms generally construct ANWAs deciding $\mathrm{JWin}(G)$, the data complexity for JWin is in PSPACE for all cases considered in this section due to Proposition 7.

Games with XML Schema target languages
The results of Section 3 provide a solid foundation for our further studies, but the setting studied there suffers from two problems: (1) the complexities are far too high (at least for games with replay) and (2) the assumption that target and replacement languages are specified by (D)NWAs is not very realistic. In this section, we address both issues at the same time: when we require that target languages are specified by typical XML schema languages (DTD or XML Schema), we get considerably better complexities.
The better complexities basically all have the same reason: XML Schema target languages can be described by a restriction of nested word automata, which we call simple below. This restriction translates to the alternating NWAs corresponding to call effects. For simple ANWAs, however, the two basic algorithmic problems, Non-emptiness and Membership have dramatically better complexities: PSPACE and PTIME as opposed to 2-EXPTIME and PSPACE, respectively. We emphasise that, in accordance with the official standards, our definitions for DTDs and XML Schema require deterministic regular expressions.
Altogether, we prove the following complexity results. Here, the lower bounds are proven for DTDs, and the upper bounds for XML Schemas.
The lower bound in Theorem 12 (b) for the case of games with DTDs contradicts the statement of a PTIME algorithm in Section 4.3 of [12] (unless PTIME " PSPACE). 11 Before we describe the proof of Theorem 12, we first define single-type tree grammars and local tree grammars as well-established abstractions of XML Schema and DTDs, respectively (see, e.g., [13]). However, we will refer to grammars of these types as XML Schemas and DTDs, respectively. Definition 13. A (regular) tree grammar is a tuple T " pΣ, ∆, S, P, λq, where • Σ is a finite alphabet of labels, • ∆ is a finite alphabet of types, • S P ∆ is the root or starting type, • P is a set of productions of the form X Ñ r X mapping each type X P ∆ to a deterministic regular expression r X over ∆, called the content model of X, and • λ : ∆ Ñ Σ is a labelling function assigning a label from Σ to each type in ∆.
$T$ is single-type if, for each $X \in \Delta$, the content model $r_X$ contains no competing types, i.e. if $r_X$ contains no two types $Y \neq Z$ with $\lambda(Y) = \lambda(Z)$. $T$ is local if it has exactly one type for every label.
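The two restrictions can be illustrated by a minimal Python sketch. It rests on simplifying assumptions that are not part of the paper: content models are given as plain strings in which type names are identifiers, determinism of the expressions is not checked, and the names `types_in`, `is_single_type` and `is_local` as well as the example grammar are purely illustrative.

```python
import re
from collections import Counter

def types_in(content_model, types):
    """Types of the grammar that occur in a content model given as a string."""
    return set(re.findall(r"[A-Za-z_]\w*", content_model)) & set(types)

def is_single_type(productions, label):
    """No content model contains two distinct types carrying the same label."""
    for content_model in productions.values():
        occurring = types_in(content_model, productions)
        labels = [label[t] for t in occurring]
        if len(labels) != len(set(labels)):
            return False
    return True

def is_local(productions, label, sigma):
    """Every label of sigma is carried by exactly one type."""
    counts = Counter(label[t] for t in productions)
    return all(counts[a] == 1 for a in sigma)

# Hypothetical grammar: sections nested two levels deep.
label = {"Doc": "doc", "SecTop": "section", "SecSub": "section"}
productions = {"Doc": "SecTop*", "SecTop": "SecSub*", "SecSub": ""}
sigma = {"doc", "section"}
```

In this example the grammar is single-type, since no content model mentions both `SecTop` and `SecSub`, but it is not local, since the label `section` is carried by two types.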
We omit the definition of the formal semantics of regular tree grammars. The nested word language $L(T)$ described by $T$ is just the set of linearisations of trees of the tree language that is defined in the standard way.
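As an illustration of linearisation, the following Python sketch turns a tree into the corresponding nested word. The tree encoding as `(label, children)` pairs and the textual tag format are assumptions made for this example only.

```python
def linearise(tree):
    """Return the nested word (list of tags) of a tree given as (label, children)."""
    label, children = tree
    word = [f"<{label}>"]          # opening tag of the node
    for child in children:         # children in document order
        word.extend(linearise(child))
    word.append(f"</{label}>")     # matching closing tag
    return word
```

For instance, a node `a` with two leaf children `b` and `c` is linearised to the tag sequence `<a> <b> </b> <c> </c> </a>`.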
We next define simple DNWAs, a restriction of DNWAs that captures all languages specified by single-type tree grammars. In simple DNWAs, states are typed, i.e. each state has a component in some type alphabet $\Delta$. Informally, when a simple DNWA $A$ reads a subword $w = \langle a \rangle v \langle /a \rangle$ in state $q$, it determines already on reading $\langle a \rangle$ which state $q'$ it will take after processing $w$, and this state will be of the same type as $q$. After reading $\langle a \rangle$, the linear state of $A$ only depends on the type of $q$, not the exact state; this models the single-type restriction. After reading $\langle a \rangle$, $A$ goes on to validate $v$, and if this validation fails, $A$ enters a failure state $\bot$ instead of $q'$. Thus, the state of $A$ at a position basically only depends on its ancestor positions (in the tree view of the document) and their left siblings. The only way in which other nodes in subtrees of these nodes can influence the state is by assuming the sink state $\bot$. Thus, in the spirit of [11], we could call such DNWAs ancestor-sibling-based, but we prefer the term simple for simplicity.

Definition 14. A deterministic NWA $A(T) = (Q, \Sigma, \delta, q_0, F)$ in normal form is simple (SNWA) if there exist a type alphabet $\Delta$ and a state set $P$ with $Q \subseteq P \times \Delta$, a local acceptance function $F_{loc} : \Sigma \to \mathcal{P}(Q)$, a target state function $t : Q \times \Sigma \to Q$ and a failure state $\bot \in Q \setminus F$, such that the following conditions are satisfied for every $a \in \Sigma$:
• for every $p, p' \in P$, $X \in \Delta$: $\delta((p, X), \langle a \rangle) = \delta((p', X), \langle a \rangle)$;
• for every $q \in F_{loc}(a)$: $\delta(q, p, \langle /a \rangle) = t(p, a)$;
• for every $q \in Q \setminus F_{loc}(a)$: $\delta(q, p, \langle /a \rangle) = \bot$; and
• for every $q \in Q$: $\delta(\bot, \langle a \rangle) = \delta(\bot, q, \langle /a \rangle) = \bot$.
A cfG is called simple if its target DNWA is simple.
Proposition 15. From every single-type tree grammar $T$, a simple DNWA $A$ can be computed in polynomial time, such that $L(A) = L(T)$.
The following adaptation of the notion of simplicity to ANWAs is a bit technical. It will guarantee however that the ANWAs obtained from simple games are simple and have reasonable complexity properties.
Definition 16. An ANWA $A = (Q, \Sigma, \delta, q_0, F)$ with $Q \subseteq P \times \Delta$ (for some state set $P$ and type alphabet $\Delta$) is simple (SANWA) if it has the following two properties.
• (Horizontal simplicity) There are a local acceptance function $F_{loc} : \Sigma \to \mathcal{P}(Q)$, a test state $q_? \in Q$, and a target state function $t : Q \times \Sigma \to Q$, such that the transition function $\delta$ of $A$ satisfies the following conditions: $\delta(q, q', \langle /a \rangle) = t(q', a)$ for all $q \in Q$ and $q' \neq q_?$; furthermore, for each $(p, X) \in Q$ and $a \in \Sigma$, it holds that $t((p, X), a) = (p', X)$ for some $p' \in P$.
• (Vertical simplicity) For each $X \in \Delta$ and $a \in \Sigma$, there is a $q \in Q$ such that for all $p \in P$ it holds that $\delta((p, X), \langle a \rangle) \in \mathcal{B}^+(\{q\} \times ((P \times \{X\}) \cup \{q_?\}))$.
Essentially, horizontal simplicity states that $A$ has two kinds of computations on a well-nested subword: (1) computations starting from a pair $(q, q_?)$ test a property of the subword and can either succeed or fail at the end of the subword (and thus influence the overall computation); (2) computations starting from a pair $(q, q')$ for $q' \neq q_?$ basically ignore the subword. Even though they may branch in an alternating fashion, the state after the closing tag $\langle /a \rangle$ is the same in all subruns, is determined by $t(q', a)$ and has the same type as $q'$.
Vertical simplicity, on the other hand, states that all alternation in $A$ happens in the choice of hierarchical states: while, on an opening tag, $A$ may branch into sub-runs pushing different hierarchical states onto the stack, the choice of linear follow-up state is "locally deterministic", depending only on the type of the previous state of $A$ and the label of the tag being read, and the current type is preserved in all hierarchical states except for $q_?$. Together, these two conditions also guarantee that SNWAs may be interpreted as SANWAs.
Proposition 17.
(a) Non-emptiness for SANWAs is PSPACE-complete.
(b) The membership problem for SANWAs is decidable in polynomial time.
Proof of Theorem 12. The generic algorithm from the previous section can be adapted for simple cfGs, but with better complexity thanks to Proposition 17, to yield the upper bounds stated in Theorem 12.
More precisely, Proposition 17 (b) and Lemma 5 yield a polynomial time bound for replay-free games. Proposition 17 (a) guarantees that the inductive step in the computation of $C[G]$ can be carried out in polynomial space (as opposed to doubly exponential time).$^{12}$ The upper bounds for games with unrestricted replay follow immediately, and the upper bound for bounded replay can be shown similarly to Proposition 11 (b).
The lower bounds are given by the following proposition. They mostly follow from careful adaptation of lower bound proofs of [14] for games on flat strings. For finite (and explicitly given) replacement languages we get feasibility even for bounded replay, but no improvement for unbounded replay.
Proposition 19. For the class of games with target languages specified by XML Schemas and explicitly enumerated finite replacement languages, JWin is (a) EXPTIME-complete with unrestricted replay, and (b) PTIME-complete (under logspace-reductions) with bounded replay or without replay.
The same results hold for DTDs in place of XML Schemas.
Once again, as our algorithm generally computes a SANWA deciding JWin($G$), the data complexity for JWin is in PTIME for all cases considered here, due to Proposition 17.

Validation of parameters and Insertion
In this section, we focus on two features that have not been addressed in the previous two sections: validation of the parameters of a function call with respect to a given schema, and a semantics which allows that returned trees do not replace their call nodes but are inserted next to them.

Validation of parameters
As pointed out in [12], in Active XML, parameters of function calls should be valid with respect to some schema. Transferred to the setting of cfGs, this means that Juliet should only be able to play a Call move in a configuration $(J, u \langle a \rangle v, \langle /a \rangle w)$ if $\langle a \rangle v \langle /a \rangle$ is in $V_a$ for some set $V_a$ of words that are valid for calls of $\langle /a \rangle$. Our definition of cfGs and the previous ones studied in the literature mostly ignore this aspect.$^{13}$ We do not investigate all possible game types in combination with parameter validation but rather concentrate on the most promising setting with respect to tractable algorithms. It turns out that games without replay and with DTDs to specify target, replacement and validation languages have a tractable winning problem as long as the number of different validation DTDs is bounded by some constant.$^{14}$ It becomes intractable if the number of validation schemas can be unbounded and (already) with target and validation languages specified by XML Schemas, even with only one validation schema.
More precisely, we prove the following results.
Theorem 20. For the class of games with validation with a bounded number of validation DTDs and target languages specified by DTDs, JWin is in PTIME without replay.
The algorithm uses a bottom-up approach. The basic idea is that, starting from the leaves, at each level of the tree (that is, for some node $v$ and its leaf children) all relevant information about the game in the subtree $t_v$ is computed with the help of flat replay-free games and aggregated in $v$. Then the children of $v$ are discarded and the algorithm continues until only the root remains.
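The control flow of such a bottom-up pass can be sketched in Python as a post-order fold over the tree. This is only a skeleton under stated assumptions: the callback `summarise` is a hypothetical placeholder for the step that solves a flat replay-free game at a node whose children have already been reduced to summaries, and the tree encoding as `(label, children)` pairs is chosen for the example.

```python
def bottom_up(tree, summarise):
    """Fold a tree (label, children) bottom-up.

    summarise(label, child_summaries) is a hypothetical callback that
    aggregates the information computed for the children into the node.
    """
    label, children = tree
    # Children are processed first; afterwards only their summaries are kept.
    child_summaries = [bottom_up(child, summarise) for child in children]
    return summarise(label, child_summaries)
```

With a callback that returns `1 + sum(child_summaries)`, for example, the fold simply counts the nodes of the tree; an actual instantiation would return whatever information about the subgame the flat game at that node requires.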
The following result shows that for slightly stronger games, parameter validation worsens the complexity.$^{15}$

Theorem 21. For the class of games with validation, JWin (without replay) is (a) EXPTIME-hard, if target and validation languages are specified by DNWAs (even with only one function symbol); (b) PSPACE-hard, for games with only one function symbol, if the validation language is given by an XML schema, the target language by a DTD and a finite replacement language; and (c) PSPACE-hard, for games with an unbounded number of validation DTDs and replacement and target languages specified by DTDs.
Part (a) is proven by reduction from the intersection emptiness problem for DNWAs, while parts (b) and (c) use similar reductions from the problem of determining whether a quantified Boolean formula in disjunctive normal form is true.
Due to time constraints and as we are mainly interested in finding tractable cases, we have not looked for matching upper bounds.

Insertion rules
In our definition of Call moves, we define the successor configuration of a configuration $(R, u \langle a \rangle v, \langle /a \rangle w)$ to be $(J, u, v'w)$, that is, $\langle a \rangle v \langle /a \rangle$ is replaced by a string $v' \in R_a$. However, Active XML also offers an "append" option, where results of function calls are inserted as siblings after the calling function node (cf. [1]). There are (at least) three possible semantics of a Call move for insertion-based (as opposed to replacement-based) games, which differ in the next configuration depending on "how much replay" we allow for Juliet. We consider (1) as the general setting, (2) as the setting with weak replay and (3) as the setting without replay. It turns out that the weak replay setting basically corresponds to the (unrestricted) setting with replacement rules and that (3) corresponds to the replay-free setting with replacement rules. Setting (1), however, gives Juliet a lot of power and makes JWin($G$) undecidable.
Theorem 22. For the class of games with insertion semantics, target DNWAs and replacement NWAs, JWin is (a) undecidable in general; (b) 2-EXPTIME-complete for games with weak replay; and (c) PSPACE-complete for games without replay.
The proof idea for Theorem 22 is to simulate insertion-based games by replacement-based games and vice versa; part (a) additionally uses the undecidability of JWin for arbitrary (i.e. not necessarily left-to-right) strategies on games with flat strings, which was proven in [14].

Conclusion
The complexity of context-free games on nested words differs considerably from that on flat words (2-EXPTIME vs. EXPTIME), but there are still interesting tractable cases. One of the main insights of this paper is that the main tractable cases remain tractable if one allows XML Schema instead of DTDs for the specification of schemas.
Another result is that adding validation of input parameters can worsen the complexity, but tractability can be maintained by a careful choice of the setting. However, here the step from DTDs to XML Schema may considerably worsen the complexity.

Insertion semantics with unlimited replay yields undecidability.
We leave open some corresponding upper bounds in the setting with validation of input parameters. In future work, we plan to study the impact of parameters of function calls more thoroughly.

A Appendix
For easier reference, we restate the results that were already stated in the body of the paper. Definitions and results not stated in the body can be identified by their number of the type A.xxx. At the end of the appendix there is another bibliography which contains references for all work mentioned in the appendix.

Proofs for Section 2
Lemma 2 (restated). There is a polynomial-time algorithm that computes for every deterministic NWA an equivalent deterministic NWA in normal form.

Proofs for Section 3
In this section, we give proofs for the upper and lower bounds on the complexity of JWin for unrestricted games stated in Section 3.

Upper bounds for Theorem 3
The proof of the upper bounds in Theorem 3 consists technically of three main parts:
• the first part describes how to compute an ANWA for a cfG from its call effect (Proposition 6),
• the second part establishes the complexity of emptiness and membership for ANWAs (Proposition 7), and
• the third part shows that the fixed-point process sketched after Lemma 8 in Section 3 indeed computes the call effect of a game.

Transforming call effects into ANWAs
The proof of Proposition 6 requires a considerable amount of preparation. As mentioned in Section 3, our main tool for proving upper bounds on general cfGs is abstracting from subgames to the effects they induce on the target automaton $A(T)$. To facilitate the proof of Proposition 6, we extend this abstraction from the call effects of subgames on rooted strings as defined in Section 3 to effects of arbitrary nested strings. Formally, a (word) effect maps states $q$ of $A(T)$ to sets of sets of states of $A(T)$. The effect of a game $G$ on a word $w$ relative to state $q$ is basically the set of all state sets $X$ for which Juliet has a strategy that guarantees that every play on $w$ yields some word $v$ with $\delta^*(q, v) \in X$. For ease of reference, we restate some definitions from Section 2 needed for word effects.
In the following, we sometimes consider subgames on a certain part of a string and talk about strategies for subgames. From a configuration $(u, vw)$, Juliet can use a strategy $\sigma$ on the subgame on $v$. This means that she follows $\sigma$ until a configuration $(uv', w)$ is reached.
Definition A.1. For a cfG $G = (\Sigma, \Gamma, R, T)$ with a deterministic target NWA $A(T) = (Q, \Sigma, \delta, q_0, F)$, we define the following notation.
• $\mathrm{word}_G(w, \sigma, \tau)$ denotes the unique final word that is reached in the game on $w$ with strategies $\sigma \in \mathrm{STRAT}_J(G)$ and $\tau \in \mathrm{STRAT}_R(G)$.
• $\mathrm{words}_G(w, \sigma) \overset{\mathrm{def}}{=} \{\mathrm{word}_G(w, \sigma, \tau) \mid \tau \in \mathrm{STRAT}_R\}$ denotes the set of final words that can be reached through strategies of Romeo, for a fixed strategy $\sigma \in \mathrm{STRAT}_J(G)$.
• $\mathrm{states}_G(q, w, \sigma) \overset{\mathrm{def}}{=} \{\delta^*(q, v) \mid v \in \mathrm{words}_G(w, \sigma)\}$ denotes the set of states that $A(T)$ can take at the end of final words that can be reached through strategies of Romeo, for a fixed strategy $\sigma \in \mathrm{STRAT}_J(G)$.
Finally, we define the word effect, $E[G, w] : Q \to \mathcal{P}(\mathcal{P}(Q))$, of $G$ on $w$ by $E[G, w](q) \overset{\mathrm{def}}{=} [\{\mathrm{states}_G(q, w, \sigma) \mid \sigma \in \mathrm{STRAT}_J(G)\}]_{\min}$, for every $q \in Q$, where the operator $[\cdot]_{\min}$ removes all non-minimal sets from a set of sets as before.
To simplify notation, the subscript G will often be omitted if the game G is clear from the context.
The intuition behind word effects is the following abstraction of cfGs into single-round games: On an input string $w$, Juliet first chooses a strategy $\sigma$, then Romeo chooses a strategy $\tau$; the outcome of the game on $w$ is uniquely determined by $\sigma$ and $\tau$. In terms of effects, this corresponds to Juliet picking a set $X = \mathrm{states}_G(q_0, w, \sigma) \in E[G, w](q_0)$ and Romeo then choosing a final state $q = \delta^*(q_0, \mathrm{word}_G(w, \sigma, \tau)) \in X$. This intuition also explains our use of the $[\cdot]_{\min}$ operator, as it makes no sense for Juliet to offer Romeo a choice from a set $X \subseteq Q$ if she can instead offer him the more limited options in some $X' \subsetneq X$.$^{16}$ It is easy to see that Juliet has a winning strategy in $G$ on $w$ if and only if there is some $X \in E[G, w](q_0)$ such that $X \subseteq F$; to determine whether Juliet has a winning strategy, it therefore suffices to compute $E[G, w]$.
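The winning criterion stated above can be written down directly once an effect is available. The following Python sketch assumes that the value of the effect at the initial state has already been computed and is passed in as a collection of state sets; the function name `juliet_wins` and the integer state names are illustrative only.

```python
def juliet_wins(effect_at_q0, accepting):
    """Juliet wins iff she can offer Romeo some set X that lies inside F.

    effect_at_q0: iterable of sets of states (the effect at the initial state)
    accepting:    set of accepting states F of the target automaton
    """
    accepting = frozenset(accepting)
    return any(frozenset(x) <= accepting for x in effect_at_q0)
```

Since any set that is contained in $F$ has all of its subsets contained in $F$ as well, the answer does not change when non-minimal sets are removed from the effect first.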
It is natural to reason about effects for nested words in an inductive fashion. We first consider sequential composition. From Juliet's point of view, the game on a nested word $uv$ (with $u, v \in \mathrm{WF}(\Sigma)$) starting from a state $q$ proceeds as follows. Juliet fixes a strategy $\sigma$ on $u$. The set of states that Romeo can reach at the end of the subgame on $u$ is just $\mathrm{states}_G(q, u, \sigma)$. For each state $p \in \mathrm{states}_G(q, u, \sigma)$, Juliet can choose a strategy $\sigma_p$ for $v$, and the result set is then the union of all sets that can be reached by Romeo against any $\sigma_p$ on $v$.
To express the set of all combinations of outcomes for the second part, we use the following operator.
Definition A.2. Let $D = \{D_1, \dots, D_n\}$ be a set of sets of sets. Then $\mathrm{Mix}(D)$ is the set $[\{d_1 \cup \dots \cup d_n \mid d_1 \in D_1 \wedge \dots \wedge d_n \in D_n\}]_{\min}$.
In other words, the Mix operation yields every way of taking the union of one element from each of $D_1, \dots, D_n$ and then removes non-minimal sets.
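Let $E_1, E_2$ be mappings from $Q$ into $\mathcal{P}(\mathcal{P}(Q))$. Then the composition of $E_1$ and $E_2$ is defined as the mapping $E_1 \circ E_2 : Q \to \mathcal{P}(\mathcal{P}(Q))$ with
$$(E_1 \circ E_2)(q) = \Big[ \bigcup_{X \in E_1(q)} \mathrm{Mix}(\{E_2(p) \mid p \in X\}) \Big]_{\min}.$$
Not surprisingly, effect composition commutes with word concatenation.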
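The Mix operator and the composition of effects can be sketched in Python as follows. The sketch assumes the reading of composition that is used in the proof of Lemma A.3: for every set $X$ offered by the first effect, one set of the second effect is chosen for each state in $X$, the chosen sets are united, and non-minimal results are discarded. Effects are represented as dictionaries from states to lists of state sets; this encoding and all function names are illustrative.

```python
from itertools import product

def normalise(sets):
    """Remove all non-minimal sets from a collection of sets."""
    family = {frozenset(s) for s in sets}
    return {x for x in family if not any(y < x for y in family)}

def mix(family):
    """All unions of one set from each member of the family, normalised."""
    choices = product(*[[frozenset(d) for d in D] for D in family])
    return normalise(frozenset().union(*choice) for choice in choices)

def compose(e1, e2, q):
    """Value of the composed effect at state q (assumed definition, see above)."""
    result = set()
    for x in e1[q]:
        # One outcome set of the second effect per state Romeo can reach in x.
        result |= mix([e2[p] for p in x])
    return normalise(result)
```

For example, mixing `[{1}, {2}]` with `[{2}, {3}]` produces the unions `{1, 2}`, `{1, 3}`, `{2}` and `{2, 3}`, of which only `{2}` and `{1, 3}` are minimal.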
Before proving Lemma A.3, we give an auxiliary result that will greatly simplify proofs about effects and similar functions. To that end, we call a set $D$ of sets normalised if it contains no two sets $X, Y$ such that $X \subsetneq Y$ (or, equivalently, if $D = [D]_{\min}$). For two sets of sets $E_1, E_2$, we write $E_1 \supseteq E_2$ if and only if every $X \in E_1$ has a subset in $E_2$.
Proof. We prove only $E_1 \subseteq E_2$; inclusion in the other direction then follows by symmetry. Let $X_1 \in E_1$, and let $X_2 \in E_2$ with $X_2 \subseteq X_1$. By assumption, there also exists $X_1' \in E_1$ with $X_1' \subseteq X_2$, and therefore $X_1' \subseteq X_2 \subseteq X_1$. Since both $X_1$ and $X_1'$ are in $E_1$, and $E_1$ is normalised by assumption, this inclusion cannot be proper, and it follows that $X_1' = X_2 = X_1$, and therefore $X_1 = X_2$ and $X_1 \in E_2$.
Proof of Lemma A.3. Let $q \in Q$. This proof uses Lemma A.4 to prove the equality of the two normalised sets $E[G, uv](q)$ and $(E[G, u] \circ E[G, v])(q)$.
(Ě): Let X P E rG, uvspqq. Then there exists some strategy σ uv P STRATpGq such that X " states G pq, uv, σ uv q. Let σ u be the restriction of σ uv to the subgame on u, let X u P E rG, uspqq with X u Ď states G pq, u, σ u q and tp 1 , . . . , p k u " X u . For each i P rks, let σ i v be a restriction of σ uv to the subgame on v in case Romeo chooses a strategy τ with state G pq, u, σ, τ q " p i , and let By definition of˝, and because of normalisation, there exists some X 2 P pE rG, us˝E rG, vsqpqq with X 2 Ď X 1 . So, to show the desired inclusion, it suffices to prove that X 1 Ď X. Let By the definition of σ u and σ i v , this implies that p 1 P states G pq, uv, σ uv q " X.
($\subseteq$): Let $X \in (E[G, u] \circ E[G, v])(q)$. By definition of $\circ$, there are sets $X_u = \{q_1, \dots, q_k\} \in E[G, u](q)$ and $X_v^i \in E[G, v](q_i)$ for each $i \in [k]$ such that $X = X_v^1 \cup \dots \cup X_v^k$. Define a strategy $\sigma_{uv}$ on $uv$ as follows. On $u$, Juliet plays according to $\sigma_u$; if this play yields some string $u_i \in \mathrm{words}_G(u, \sigma_u)$ with $\delta^*(q, u_i) = q_i$, Juliet then plays according to $\sigma_v^i$ on $v$. Denote $\mathrm{states}_G(q, uv, \sigma_{uv})$ by $X'$ for short. Due to normalisation, there exists some $X'' \in E[G, uv](q)$ with $X'' \subseteq X'$. What needs to be shown is therefore only that $X' \subseteq X$.
Let $q' \in X'$. Then there exist a strategy $\tau$ for Romeo and strings $u', v' \in \mathrm{WF}(\Sigma)$ such that $u'v' = \mathrm{word}_G(uv, \sigma_{uv}, \tau)$, $u' = \mathrm{word}_G(u, \sigma_{uv}, \tau)$ and $\delta^*(q, u'v') = q'$. Let $q_u = \delta^*(q, u')$; then it holds that $q' = \delta^*(q_u, v')$. By the definition of $\sigma_{uv}$, it follows that $q_u \in X_u$ and $q' \in X_v^i$ for some $i$, so $q' \in X$, which concludes the proof.
It follows directly from Lemma A.3 that the sequential composition of effects is associative.
The word effect of a word of the form $\langle a \rangle v \langle /a \rangle$ is induced by the word effect of $v$ and the possible moves of the players on $\langle a \rangle$ and $\langle /a \rangle$. In particular, as Juliet may choose Call on $\langle /a \rangle$, the possible outcomes of a subgame on a subword of the form $\langle a \rangle \langle /a \rangle$ become crucial. As in the main part of this paper, we summarise the possible outcomes of subgames on "two-letter words" of the form $\langle a \rangle \langle /a \rangle$ by call effects as defined in Definition 4. For ease of reference, we restate that $C[G](a, q) \overset{\mathrm{def}}{=} [\{\mathrm{states}_G(q, \langle a \rangle \langle /a \rangle, \sigma) \mid \sigma \in \mathrm{STRAT}_{J,Call}(G)\}]_{\min}$, for every $a \in \Sigma$ and $q \in Q$, where $\mathrm{STRAT}_{J,Call}(G)$ contains all strategies of Juliet that start by playing Read on $\langle a \rangle$ and Call on $\langle /a \rangle$.
To describe hierarchical composition of word effects we define, for every a P Σ the following operator H a : Q Ñ PpPpQqq For every two functions E : Q Ñ PpPpQqq and C : ΣˆQ Ñ PpPpQqq and q, q 1 P Q such that δpq, a q " q 1 , let Informally, interpreting E as a word effect and C as a call effect, the first set inside the Mix operator accounts for Call moves and the second for Read moves of Juliet. Now we can formulate how effects behave hierarchically.
Lemma A.5. For every cfG $G = (\Sigma, \Gamma, R, T)$, $v \in \mathrm{WF}(\Sigma)$, and $a \in \Sigma$, it holds that $E[G, \langle a \rangle v \langle /a \rangle] = H_a[E[G, v], C[G]]$.
Proof. We show, once again using Lemma A.4, that for every $q \in Q$, it holds that $E[G, \langle a \rangle v \langle /a \rangle](q) = H_a[E[G, v], C[G]](q)$.
Clearly, X 1 " X 1 Y . . . Y X k has a subset in MixptCrGspa, qqY tδpr, p, {a qu | r P Xuq and therefore in H a rE rvs, CrGsspqq. It remains to be proven that X 1 Ď X.
Let q 1 P X 1 . Then q 1 P X i for some i P rks. If X i " tδpq i , p, {a u, then clearly q 1 " δ˚pq, a v i {a q P states G pq, a v {a , σq " X. Otherwise, q 1 P states G pq, a {a , σ i q and σ i coincides on a {a with σ on a v i {a , it follows again that q 1 P X.
(Ď): Let X P H a rE rG, vs, CrGsspqq and let δpq, a q " pq 1 , pq. Then, there exists some where each X i is either in CrGspa, qq or of the form tδpq i , p, {a qu. By the definition of E rG, vs, there exists some strategy σ v P STRAT on v such that states G pq 1 , v, σ v q " X v , and by the definition of CrGspa, qq, for each i with X i P CrGspa, qq there exists a strategy σ i P STRAT J,Call such that X i " states G pq, a {a , σ i q. We extend σ v to a strategy σ on a v {a as follows: Juliet reads the initial a , then plays on v according to σ v . The string v 1 resulting from this play on v has to fulfil δ˚pq 1 , v 1 q " q i for some i P rks; if, for this i it holds that X i " tδpq i , p, {a qu, then Juliet plays Read on {a , otherwise she plays Call on {a and plays according to σ i in the resulting sub-game. Let X 1 " states G pq, a v {a , σq; it is easy to see that X 1 Ď X, and since X 1 has a subset in E rG, a v {a spqq by definition, this proves the claim.
$^{17}$ As, in this case, $\sigma_i \in \mathrm{STRAT}_{J,Call}$ is a strategy playing Call on $\langle /a \rangle$, we can omit $v$ here.

We are now ready to define the ANWA $A_C$ from Proposition 6. The intuition behind it is that $A_C$ uses alternation to guess strategy choices for Juliet and Romeo in the above abstraction of $G$ on $w$ using call effects and tracks a current state $q$ in the target language DNWA $A(T)$. On opening tags, as well as on closing tags for which $A_C$ existentially guesses Juliet's move to be Read, $A_C$ simply simulates $A(T)$; on closing tags $\langle /a \rangle$ where $A_C$ decides for Juliet to play Call, $A_C$ then chooses existentially a set $X \in C[G](a, q)$ (corresponding to a substrategy for Juliet after the Call on $\langle /a \rangle$) and branches universally into all states $q' \in X$ (corresponding to Romeo's choice of a counter-strategy and a corresponding resulting state).
Formally, A C " pQ, Σ, δ C , q 0 , F q is an ANWA in normal form , where δ C is defined as follows. (Recall that ApT q " pQ, Σ, δ, q 0 , F q is the target language DNWA in normal form.) • For a P Σ, q P Q: • For a P Σ, q, p P Q: We go on to prove the correctness of A C . To that end, we call a run ρ of an ANWA A on a string w minimal if no proper subtree of ρ is a run of A on w (i.e. if each set of states chosen to follow up some state on reading some symbol is inclusion-minimal among the sets of states fulfilling the corresponding transition formula).
Lemma A.6. Let $q \in Q$, $w \in \mathrm{WF}(\Sigma)$ and $X \subseteq Q$. Then $X \in E[G, w](q)$ if and only if there is a minimal run of $A_C$ on $w$ starting at $q$ and ending in states from $X$.
Proof. Let $q \in Q$, $X \subseteq Q$ and $w \in \mathrm{WF}(\Sigma)$. The proof is by induction on the structure of $w$.
For $w = \epsilon$, the claim is trivially fulfilled, as $E[G, \epsilon](q) = \{\{q\}\}$ by the definition of word effects.
Let $w = uv$ for $u, v \in \mathrm{WF}(\Sigma)$. For the "only if" direction, it follows from Lemma A.3 that there are sets $X_u = \{q_1, \dots, q_k\} \in E[G, u](q)$ and $X_v^i \in E[G, v](q_i)$ for each $i \in [k]$ such that $X = X_v^1 \cup \dots \cup X_v^k$. By induction, there exist a minimal run $\rho_u$ of $A_C$ on $u$ starting at $q$ and ending inside $X_u$ and, for each $i \in [k]$, a minimal run $\rho_v^i$ on $v$ starting at $q_i$ and ending inside $X_v^i$. From these, we can construct a run $\rho$ of $A_C$ on $w$ by replacing each leaf labelled $q_i$ in $\rho_u$ with the entire run $\rho_v^i$ rooted at $q_i$. Obviously, $\rho$ is a run of $A_C$ starting at $q$ and ending inside $X$, and $\rho$ is minimal because $\rho_u$ and all $\rho_v^i$ are. The "if" direction is proven analogously.
Let $w = \langle a \rangle v \langle /a \rangle$ for $a \in \Sigma$, $v \in \mathrm{WF}(\Sigma)$. Let further $\delta(q, \langle a \rangle) = q'$. For "only if", Lemma A.5 implies that $X \in H_a[E[G, v], C[G]](q)$. This means that there are a set $X_v = \{q_1, \dots, q_k\} \in E[G, v](q')$ and sets $X_w^1, \dots, X_w^k$ such that $X = X_w^1 \cup \dots \cup X_w^k$ and for each $i \in [k]$ either $X_w^i \in C[G](a, q)$ or $X_w^i = \{\delta(q_i, q, \langle /a \rangle)\}$. By induction, there exists a minimal run $\rho_v$ of $A_C$ on $v$ starting at $q'$ and ending inside $X_v$. We extend $\rho_v$ to a run $\rho$ on $w$ as follows: The root of $\rho$ is labelled $q$ and has as its only child the root of a copy of $\rho_v$; each leaf of this copy labelled $q_i$ has as its children exactly the states in $X_w^i$. Using the definition of $A_C$, it is easy to verify that $\rho$ is indeed a run of $A_C$ on $w$, and it is also clear that $\rho$ starts at $q$ and ends inside $X$. Finally, $\rho$ is minimal because its subrun on $v$ is minimal, and for each $q_i$, the set $X_w^i$ is an inclusion-minimal set fulfilling $\delta_C(q_i, q, \langle /a \rangle)$ (for $X_w^i \in C[G](a, q)$, this follows from $C[G](a, q)$ being normalised). Again, the "if" part is proven analogously.

Now we are in the position to prove Proposition 6:

Proposition 6 (restated). There is an algorithm that computes from the call effect $C[G]$ of a game $G$, in polynomial time in $|C[G]|$ and $|G|$, an ANWA $A_C$ such that $L(A_C)$ = JWin($G$).
Proof. The statement follows from Lemma A.6, as $A_C$ has an accepting run on any string $w \in \mathrm{WF}(\Sigma)$ if and only if it has a minimal such run. Obviously, $A_C$ is of polynomial size in the size of $G$ and $C[G]$ and can be constructed from these in polynomial time.

The complexity of ANWAs
Proposition 7 (restated).
(a) Non-emptiness for ANWAs is 2-EXPTIME-complete.
(b) The membership problem for ANWAs is PSPACE-complete.
Proof. Statement (a) follows easily from [5] where 2-EXPTIME-completeness of Emptiness for alternating visibly pushdown automata was shown. The lower bound in that paper only requires finite well-nested words.
Towards the upper bound in (b), it is easy to see that an ANWA $A = (Q, \Sigma, \delta, q_0, F)$ on some nested word $w$ can be simulated by an alternating Turing machine with polynomial time bound, hence the classical results from [6] yield a polynomial space upper bound.
For future reference, we note that this computation can actually be done in polynomial space in $|w|$ and the size $|Q|$ of $A$'s set of states, if it can be tested in polynomial space
• for a given set $X \subseteq Q \times Q$ of pairs of states, a symbol $a$ and a state $q$, whether $X \models \delta(q, a)$, and
• for a given set $X \subseteq Q$ of states, a symbol $a$ and states $p, q$, whether $X \models \delta(q, p, a)$.
The proof of this statement is along the same lines as the proof that alternating polynomial time is contained in polynomial space: the tree of all possible computations has polynomial depth and can be analysed with polynomial space.

The lower bound in (b) is shown by a reduction from QBF, that is, the problem to decide whether a quantified Boolean formula evaluates to true. We assume that the input formula for QBF is of the form $\Phi = Q_1 x_1 \dots Q_n x_n \, \varphi(x_1, \dots, x_n)$ with $Q_i \in \{\exists, \forall\}$ and a Boolean formula $\varphi$ with $m$ clauses in conjunctive normal form.
The idea behind this reduction is to transform $\Phi$ into an ANWA $A$ and a nested string $w$ such that $\Phi$ is true if and only if $A$ accepts $w$. Actually, $w$ is of a very simple form: $w = \langle v_1 \rangle \cdots \langle v_n \rangle \langle X \rangle \langle /X \rangle \langle /v_n \rangle \cdots \langle /v_1 \rangle$.
If the automaton $A$ reads an opening tag $\langle v_i \rangle$, it branches existentially if $x_i$ is existentially quantified, and universally if $x_i$ is universally quantified, thus choosing a truth assignment $\alpha$ for the variables. Finally, when it reads $\langle X \rangle$, $A$ branches universally, picking one of the $m$ clauses of $\varphi$ in every branch. When it reads the suffix $\langle /X \rangle \langle /v_n \rangle \cdots \langle /v_1 \rangle$ of $w$, $A$ tests that $\alpha$ makes the chosen clause true.
To this end, the automaton $A$ uses three kinds of states:
• assignment states, $q_+$ and $q_-$, corresponding to true and false, respectively,
• clause states $q_j$, for $j \in [m]$, representing the clause chosen from $\varphi$ to be tested for truth, and
• a starting state $q_0$ and an accepting state $q_F$.
For the formal construction, let $\Phi = Q_1 x_1 \dots Q_n x_n \, \varphi$ be the input formula for QBF with $Q_i \in \{\exists, \forall\}$ for all $i \in [n]$ and a quantifier-free Boolean formula $\varphi = C_1 \wedge \dots \wedge C_m$ with clauses $C_j$. Let $w$ be constructed as above.
The ANWA A " pQ, Σ, δ, q 0 , tq F uq in normal form is defined as follows: • Q " tq 0 , q`, q´, q F u Y tq j | j P rmsu; • Σ " tv i | i P rnsu Y tXu; • For q P tq 0 , q`, q´u and i P rns, • δpq, X q " q 1^¨¨¨^qm ; • For q P tq`, q´u and j P rms, δpq j , q, {X q " $ & % q F if x n occurs in C j and q " q`or x n occurs in C j and q " q´, q j otherwise. ; • For q P tq`, q´u, j P rms, and 2 ď i ď n, if x i´1 occurs in C j and q " q`or x i´1 occurs in C j and q " q´, q j otherwise.
• For all $q \in Q$, $\delta(q, q_0, \langle /v_1 \rangle) = q$, and
• For all $q \in Q$ and $i \in [n]$, $\delta(q_F, q, \langle /v_i \rangle) = q_F$.

It remains to be shown that $\Phi$ evaluates to true if and only if $w \in L(A)$. We first note that $A$ is deterministic on the suffix $\langle /X \rangle \langle /v_n \rangle \cdots \langle /v_1 \rangle$ of $w$. It is not hard to show that, on this suffix, $A$ reaches the accepting state from state $q_j$ if and only if the truth assignment $\alpha$ induced by the choices on $\langle v_1 \rangle \cdots \langle v_n \rangle$ makes $C_j$ true. Thus, the subrun on the suffix $\langle X \rangle \langle /X \rangle \langle /v_n \rangle \cdots \langle /v_1 \rangle$ of $w$ is accepting if and only if $\alpha$ makes all $m$ clauses true. Finally, the existential and universal branching of $A$ on $\langle v_1 \rangle \cdots \langle v_n \rangle$ corresponds to the quantification of the variables of $\Phi$ in the obvious and correct way.
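The shape of the reduction can be illustrated by a small Python sketch. It does not implement the ANWA itself; it only builds the nested word of the reduction and evaluates the formula with the same branching structure (existential or universal choice per variable, then a universal check over all clauses). The encoding of clauses as lists of non-zero integers, the quantifier string over `E` and `A`, and the function names are assumptions made for this example.

```python
def reduction_word(n):
    """Nested word <v1> ... <vn> <X> </X> </vn> ... </v1> as a list of tags."""
    opens = [f"<v{i}>" for i in range(1, n + 1)]
    closes = [f"</v{i}>" for i in range(n, 0, -1)]
    return opens + ["<X>", "</X>"] + closes

def qbf_true(quantifiers, clauses, assignment=()):
    """Evaluate a prenex CNF formula.

    quantifiers: string over 'E' (exists) and 'A' (for all), one per variable
    clauses:     list of clauses; literal i means x_i, literal -i means not x_i
    """
    i = len(assignment)
    if i == len(quantifiers):
        # Universal branching over clauses: every clause must be satisfied.
        return all(any(assignment[abs(l) - 1] == (l > 0) for l in clause)
                   for clause in clauses)
    branches = (qbf_true(quantifiers, clauses, assignment + (value,))
                for value in (True, False))
    # Branching mode on the opening tag of variable i+1 follows its quantifier.
    return any(branches) if quantifiers[i] == "E" else all(branches)
```

As a usage example, the formula $\forall x_1 \exists x_2 \, (x_1 \vee x_2) \wedge (\neg x_1 \vee \neg x_2)$ corresponds to the call `qbf_true("AE", [[1, 2], [-1, -2]])`.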
Lemma 8 (restated). Given a state $q \in Q$, an alphabet symbol $a \in \Gamma$, and $C^k[G]$, for some $k \geq 1$, the call effect $C^{k+1}[G](a, q)$ can be computed in doubly exponential time in $|G|$.
Proof. Let $a \in \Sigma$, $q \in Q$, $X \subseteq Q$, and $k \geq 0$. We show that, given $C^k[G]$ and a set $X \subseteq Q$, it can be decided in doubly exponential time in $|Q|$ and polynomial time in $|R|$ whether a subset of $X$ is in $C^{k+1}[G](a, q)$.
Let $A_C$ be as defined for the proof of Lemma A.6 with $C^k[G]$ as its basic call effect, and let $A$ be its modification with initial state $q$ and set $X$ of accepting states. $A$ accepts all nested strings $w$ on which there exists a strategy $\sigma$ for Juliet of Call depth at most $k$ such that $\mathrm{states}_G(q, w, \sigma) \subseteq X$.
Let, for each $a \in \Gamma$, $A_a = (Q_a, \Sigma_a, \delta_a, q_{0,a}, F_a)$ be an NWA for $R_a$. By definition, $X$ has a subset in $C^{k+1}[G](a, q)$ if Juliet has a strategy $\sigma$ of Call depth $k+1$ on $\langle a \rangle \langle /a \rangle$ that plays Call on $\langle /a \rangle$ and fulfils $\mathrm{states}_G(q, \langle a \rangle \langle /a \rangle, \sigma) \subseteq X$. Such a strategy $\sigma$ for Juliet exists if and only if for every word $w \in R_a$ there is a strategy $\sigma_w$ of Juliet on $w$ with $\mathrm{states}_G(q, w, \sigma_w) \subseteq X$, thus if and only if $R_a \subseteq L(A)$, equivalently $R_a \cap \overline{L(A)} = \emptyset$. By using a standard product construction and a complementation of an ANWA, the test boils down to a non-emptiness test for an ANWA with a state set of polynomial size in $|G|$ and can thus be done in doubly exponential time thanks to Proposition 7.$^{18}$

Adequacy of the fixed-point process
The following lemma will be used in the proof of Proposition 9.
Lemma A.7. For a cfG G " pΣ, Γ, R, T q with a deterministic target NWA ApT q " pQ, Σ, δ, q 0 , F q, it holds, for every a P Γ and q P Q: Proof. Since both sides of the claimed equation are minimal sets, it suffices by Lemma A.4 to show that each element of a set on one side of the equation has a subset on the other side.
(Ě): Let a P Γ, q P Q and X P CrGspa, qq. By definition of CrGs, there exists a strategy σ P STRAT J,Call such that X " states G pq, a {a , σq. Again by definition, Juliet plays Call on {a according to σ.
For every choice w P R a with which Romeo might respond to Juliet's initial Call move on {a , there is a sub-strategy σ w of σ on w. For each w P R a , let X w " states G pq, w, σq. Obviously, each X w has a subset in E rG, wspqq, and therefore the set X 1 " Ť wPRa X w has a subset in MixptE rG, wspqq | w P R a uq. It only remains to be proven that X 1 Ď X, so let q 1 P X 1 . Then, by the definition of X 1 , there is some w P R a , strategy τ w P STRAT R and w 1 P WFpΣq such that w 1 " word G pw, σ w , τ w q and δ˚pq, w 1 q " q 1 . From the way σ w was defined from σ, it follows that w 1 P words G p a {a , σq, and therefore q 1 P X.
($\subseteq$): Let $a \in \Gamma$, $q \in Q$ and $X \in \mathrm{Mix}(\{E[G, w](q) \mid w \in R_a\})$. Then, for each $w \in R_a$ there exists some set $X_w \in E[G, w](q)$ such that $X = \bigcup_{w \in R_a} X_w$. By the definition of $E[G, \cdot]$, this means that for every $w \in R_a$ there is some strategy $\sigma_w \in \mathrm{STRAT}$ such that $\mathrm{states}_G(q, w, \sigma_w) = X_w$. Let $\sigma \in \mathrm{STRAT}_{J,\mathrm{Call}}$ be the strategy on $\langle a\rangle\langle/a\rangle$ where Juliet plays Call on $\langle/a\rangle$ and then, if Romeo picks $w \in R_a$ as a replacement, keeps playing according to $\sigma_w$ on $w$. By definition of $C[G]$, the set $X' = \mathrm{states}_G(q, \langle a\rangle\langle/a\rangle, \sigma)$ has a subset in $C[G](a, q)$, and it only remains to be proven that $X' \subseteq X$. Let therefore $q' \in X'$. Then, there is some strategy $\tau \in \mathrm{STRAT}_R$ and string $w' \in WF(\Sigma)$ such that $w' = \mathrm{word}_G(\langle a\rangle\langle/a\rangle, \sigma, \tau)$ and $\delta^*(q, w') = q'$. Since Juliet's first move according to $\sigma$ is a Call on $\langle/a\rangle$, there is some string $w \in R_a$ which Romeo chooses as a replacement according to $\tau$; by definition of $\sigma$, it then holds that $q' \in \mathrm{states}_G(q, w, \sigma_w) = X_w \subseteq X$, as was to be proven.
For the following proof, the width of a nested word is the maximum number of children of any node in its corresponding forest. Its root width is just the number of trees in its forest. The (nesting) depth of a nested word is the depth of its canonical forest representation.

Proposition 9 (restated). For every cfG $G$ it holds: $C^*[G] = C[G]$.
Proof. For the proof we construct from a cfG $G = (\Sigma, \Gamma, R, T)$ a game $G' = (\Sigma, \Gamma, R', T)$, where $R'$ consists of particular finite sublanguages $R'_a \subseteq R_a$, for every $a \in \Gamma$. Then we show (a) $C^*[G] = C^*[G']$, (b) $C^*[G'] = C[G']$, and finally (c) $C[G'] = C[G]$.
To construct $G'$, we first examine the algorithm from the proof of Lemma 8 more closely. For a given state $q \in Q$, alphabet symbol $a \in \Sigma$, state set $X \subseteq Q$ and effect $C^k[G]$, the output of that algorithm depends only on the existence of a single string from $R_a$: for $a \in \Gamma$, the algorithm rejects if and only if there is a string in $R_a$ that is not accepted by $A$. For each $q \in Q$, $a \in \Sigma$, $X \subseteq Q$ and $k \ge 1$, let $w(q, a, k, X)$ be one such witness string of minimum length, if such a string exists. Obviously, the output of the algorithm from Lemma 8 for input $q$, $a$, $X$ and $C^k[G]$ does not change if we replace $R_a$ by any subset of $R_a$ containing $w(q, a, k, X)$.
Let $k^*$ be the smallest number with $C^*[G] = C^{k^*}[G]$, and let $W_a$ be the set containing all $w(q, a, k, X)$ for all $q \in Q$, $k \le k^*$ and $X \subseteq Q$. Furthermore, for each $w \in WF(\Sigma)$, let $v(a, w)$ be a string of minimum length such that $E[G, w] = E[G, v(a, w)]$ and $v(a, w) \in R_a$, and let $V_a = \{v(a, w) \mid w \in WF(\Sigma)\}$. Since there are only finitely many different string effects, each set $V_a$ for $a \in \Sigma$ must be finite as well.
The replacement rules $R'$ for $G'$ are now constructed as follows: For each $a \in \Gamma$, let $R'_a \overset{\mathrm{def}}{=} W_a \cup V_a$. By construction, it holds that $R'_a$ is a finite subset of $R_a$, and an easy induction argument (along with the above considerations) shows that $C^k[G'] = C^k[G]$ for each $k \ge 1$. Along with the definition of $C^*[\cdot]$, this proves (a).
For (b) it is sufficient to show that each finite strategy $\sigma \in \mathrm{STRAT}_J[G']$ on a word $w$ has bounded Call depth. This can be easily established with the help of Kőnig's Lemma. To this end, we consider the strategy tree $T_{\sigma,w}$ for $\sigma$ on $w$, where each node is a game position of the form $(p, u, v)$ with a player index $p \in \{J, R\}$ and strings $u, v \in (\langle\Sigma\rangle \cup \langle/\Sigma\rangle)^*$, and each node corresponding to a game position $\kappa$ has as children the possible follow-up positions $\kappa'$ such that $\kappa \xrightarrow{\sigma,\tau} \kappa'$ for $\sigma$ and some counter-strategy $\tau \in \mathrm{STRAT}_R$. Each node of this tree has a finite number of children: nodes corresponding to positions belonging to Juliet have only a single child each (as $\sigma$ is fixed), and positions in which Romeo is to replace some $a \in \Sigma$ have one child for each string in $R'_a$. Thus, the Call depth of nodes is bounded, as otherwise $T_{\sigma,w}$ would be a finitely branching tree with branches of arbitrary length, which by Kőnig's Lemma would yield that $T_{\sigma,w}$ has an infinite branch, contradicting the finiteness assumption for $\sigma$.
Towards (c), we prove the slightly stronger claim that $E[G, w](q) = E[G', w](q)$ for all $q \in Q$ and $w \in WF(\Sigma)$. Lemma A.7 then implies (c). To this end, we prove that each set in $E[G', w](q)$ has a subset in $E[G, w](q)$ and vice versa, which proves the desired equality by Lemma A.4.
One of these directions is almost trivial, as Romeo simply has no more possible moves in $G'$ than in $G$. Thus, any strategy $\sigma \in \mathrm{STRAT}_{J,\mathrm{Call}}(G)$ induces a sub-strategy $\sigma' \in \mathrm{STRAT}_{J,\mathrm{Call}}(G')$ with $\mathrm{words}_{G'}(w, \sigma') \subseteq \mathrm{words}_G(w, \sigma)$ and therefore also $\mathrm{states}_{G'}(q, w, \sigma') \subseteq \mathrm{states}_G(q, w, \sigma)$.
For the other direction, let $q \in Q$, $w \in WF(\Sigma)$ and let $\sigma' \in \mathrm{STRAT}_{J,\mathrm{Call}}(G')$ with $X = \mathrm{states}_{G'}(q, w, \sigma') \in E[G', w](q)$. Let $d \overset{\mathrm{def}}{=} \mathrm{Depth}_{G'}(\sigma', w)$. This is well-defined as $\sigma'$ is finite. We prove by nested induction over ($d$, nesting depth of $w$, root width of $w$) that there exists a strategy $\sigma$ in $G$ with $\mathrm{states}_G(q, w, \sigma) \subseteq X$, which implies that $X$ has a subset in $E[G, w](q)$.
If $d = 0$, then Juliet only plays Read on the entirety of $w$; obviously, this strategy is feasible in $G$ as well and yields the same result.
If $d > 0$, Juliet must play Call on $w$ at some point, and therefore it holds that $w \neq \epsilon$.
If $w = uv$ for $u, v \in WF(\Sigma)$, let $\sigma'_u$ be the sub-strategy of $\sigma'$ on $u$, and let $\{q_1, \ldots, q_k\} = \mathrm{states}_{G'}(q, u, \sigma'_u)$. For each $i \in [k]$, let further $\sigma'_{v,i}$ be a sub-strategy of $\sigma'$ on $v$ in case the play on $u$ yields some string $u'$ with $\delta^*(q, u') = q_i$. By induction (as $u$ and $v$ have smaller root width than $w$), there exist strategies $\sigma_u$ on $u$ and $\sigma_{v,i}$ on $v$ in $G$ such that $\mathrm{states}_G(q, u, \sigma_u) \subseteq \{q_1, \ldots, q_k\}$ and $\mathrm{states}_G(q_i, v, \sigma_{v,i}) \subseteq \mathrm{states}_{G'}(q_i, v, \sigma'_{v,i})$. Let $\sigma$ be the strategy on $uv$ in $G$ where Juliet plays according to $\sigma_u$ on $u$ and according to $\sigma_{v,i}$ if the play on $u$ yielded a string $u'$ with $\delta^*(q, u') = q_i$. Then, it holds that $\mathrm{states}_G(q, w, \sigma) \subseteq \bigcup_{i \in [k]} \mathrm{states}_{G'}(q_i, v, \sigma'_{v,i}) \subseteq X$.

If $w = \langle a\rangle v \langle/a\rangle$ for some $a \in \Gamma$, $v \in WF(\Sigma)$, let $\delta(q, \langle a\rangle) = (q', p)$, let $\sigma'_v$ be the sub-strategy of $\sigma'$ on $v$, and let $\{q_1, \ldots, q_k\} = \mathrm{states}_{G'}(q', v, \sigma'_v)$. By induction (as the depth of $v$ is smaller than the depth of $w$), there exists a strategy $\sigma_v$ on $v$ in $G$ with $\mathrm{states}_G(q', v, \sigma_v) \subseteq \{q_1, \ldots, q_k\}$. In the strategy $\sigma$ on $w$, Juliet plays according to $\sigma_v$ on $v$. The play on $v$ from $q'$ according to $\sigma$ is bound to reach some state $q_i$ for $i \in [k]$. If there is some string $v_i \in \mathrm{words}_{G'}(v, \sigma'_v)$ with $\delta^*(q', v_i) = q_i$ such that Juliet would play Read on $\langle/a\rangle$ according to $\sigma'$ in $G'$ if the play on $v$ yields $v_i$, then Juliet also plays Read on $\langle/a\rangle$ according to $\sigma$; obviously, in this case, the resulting state from the play according to $\sigma$ is in $X$. Otherwise, Juliet plays Call on $\langle/a\rangle$ in $\sigma$. Let $z \in R_a$ be some arbitrary response for Romeo to this Call move in $G$; we now explain how Juliet plays on $z$ according to $\sigma$.
By construction, the replacement language $R'_a$ in $G'$ contains the string $v(a, z)$, so this string is a valid response for Romeo to the Call by Juliet on $\langle/a\rangle$ in $G'$. Let $\sigma'_{v(a,z)}$ be the sub-strategy of $\sigma'$ if Romeo chooses this response. As $\sigma'_{v(a,z)}$ has a Call depth of at most $d-1$, by induction there exists a strategy $\sigma_{v(a,z)}$ for Juliet on $v(a, z)$ in $G$ with $\mathrm{states}_G(q, v(a, z), \sigma_{v(a,z)}) \subseteq \mathrm{states}_{G'}(q, v(a, z), \sigma'_{v(a,z)})$. By the definition of $v(a, z)$, it holds that $E[G, z] = E[G, v(a, z)]$, which implies that there is a strategy $\sigma_z$ for Juliet on $z$ in $G$ such that $\mathrm{states}_G(q, z, \sigma_z) \subseteq \mathrm{states}_G(q, v(a, z), \sigma_{v(a,z)})$. In $\sigma$, Juliet then plays on $z$ according to $\sigma_z$, and the above set inclusions show that all states resulting from this play are in $X$ as well, which completes the case $w = \langle a\rangle v \langle/a\rangle$ for $a \in \Gamma$ and concludes the proof.

Lower bounds
Similar to Lemma A.6, where we constructed ANWA from given cfGs to obtain upper complexity bounds, we prove matching lower bounds by transforming ANWA into cfGs.
Lemma A.8. There is a polynomial time algorithm that computes, given an ANWA $A$ and a nested word $w$, a cfG $G = (\Sigma, \emptyset, \Gamma, R, T)$ and a nested word $w'$ such that $w \in L(A)$ if and only if Romeo has a winning strategy on $w'$ in $G$, against all replay-free strategies of Juliet. Furthermore, $G$ only depends on $A$ (not on $w$) and can be computed in polynomial time in the size of $A$.
Proof. Let $A = (Q, \Sigma, q_0, \delta, \{q_F\})$ be an ANWA and $w \in WF(\Sigma)$ a nested word. The idea is to simulate the alternation of $A$ in the game $G$ on $w'$. We design $G$ to only admit replay-free strategies for Juliet. To make this possible, we construct $w'$ from $w$ by adding substrings that offer enough "space" for this simulation.
Let $q_1, \ldots, q_m$ be an enumeration of the states in $Q$. Let $\Sigma_1$ and $\Sigma_2$ be two distinct copies of $\Sigma$ with symbols of the form $a_1$ and $a_2$, respectively, for every $a \in \Sigma$.
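For each $a \in \Sigma$, we define

We note that in $v(\langle a\rangle)$, for each $i \le m$, there is a subword $\mathrm{Enc}(\delta(q_i, \langle a\rangle))$, whereas in $v(\langle/a\rangle)$ there is a subword $\mathrm{Enc}(\delta(q_i, q_j, \langle/a\rangle))$, for every $i, j \le m$. The string $w'$ is defined as the nested word $v(w)$ that results from $w$ by replacing every tag $\sigma \in \langle\Sigma\rangle \cup \langle/\Sigma\rangle$ with $v(\sigma)$.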
As explained above, the purpose of the game $G$ is to simulate the alternation of $A$. We associate existential branching with Romeo and universal branching with Juliet.$^{19}$ To this end, the replacement languages for $\vee$ and $\wedge$ are as follows.
All other symbols should not be replaced in the game, so we set $\Gamma = \{\vee, \wedge\}$.
The intention of the construction is that the behaviour of $A$ on $w$ is simulated as follows in the game on $v(w)$ in $G$. Choices corresponding to $\vee$-gates in transitions are taken by Romeo (and we force Juliet to call every symbol $\vee$, as strings containing $\vee$-tags will not be accepted by the target NWA).

It remains to show that indeed $w$ is accepted by $A$ if and only if Romeo has a winning strategy on $v(w)$ in $G$.
We call a strategy for Juliet on $v(w)$ valid if Juliet plays Call on every $\vee$ symbol. Since Juliet can never win with a strategy that is not valid, we restrict our attention to valid strategies for Juliet on $v(w)$.
We will now show that each run of $A$ on $w$ corresponds to some strategy $\tau$ of Romeo on $v(w)$ in $G$, and that an accepting run on $w$ induces a winning strategy on $v(w)$ and vice versa.
Let $\tau$ be a strategy for Romeo on $v(w)$, and let $\sigma$ be some tag in $w$. We say that a subformula $\varphi'$ encoded in $v(\sigma)$ is enabled according to $\tau$ and some counter-strategy for Juliet if the resulting sub-play on $v(\sigma)$ yields a substring of the form $\langle 1\rangle\langle/1\rangle\langle b\rangle \mathrm{Enc}(\varphi')\mathrm{Enc}(\psi) \langle/b\rangle$ or $\langle 2\rangle\langle/2\rangle\langle b\rangle \mathrm{Enc}(\psi)\mathrm{Enc}(\varphi') \langle/b\rangle$ (for some formula $\psi$). By the construction of $v(\sigma)$, for each $q \in Q$ (and $\gamma \in \Gamma$, if $\sigma \in \langle/\Sigma\rangle$), the set of all states $q' \in Q$ such that $q'$ might be enabled in the sub-play on $\mathrm{Enc}(\delta(q, \sigma))$ (resp. $\mathrm{Enc}(\delta(q, \gamma, \sigma))$) according to $\tau$ and some valid counter-strategy for Juliet satisfies the formula $\delta(q, \sigma)$ (resp. $\delta(q, \gamma, \sigma)$). In this way, the strategy $\tau$ induces a run $\rho$ of $A$ on $w$ such that for each valid counter-strategy of Juliet, the resulting rewriting of $v(w)$ corresponds to one path in $\rho$.
Similarly, a run $\rho$ of $A$ on $w$ induces a strategy $\tau$ for Romeo on $v(w)$; if, for some tag $\sigma$ in $w$ and state $q \in Q$, $P \subseteq Q \times Q$ (resp. $P \subseteq Q$) is the follow-up state set satisfying $\delta(q, \sigma)$ (resp. $\delta(q, p, \sigma)$ for some appropriate $p \in Q$), $\tau$ can be constructed to enable exactly the states from $P$ for all counter-strategies of Juliet.
As the target automaton in $G$ accepts a rewriting of $v(w)$ if and only if it encodes a path in a run of $A$ on $w$ ending in an accepting state, the correspondence between runs of $A$ on $w$ and strategies of Romeo on $v(w)$ in $G$ implies that there exists a winning strategy for Romeo on $v(w)$ in $G$ if and only if there is an accepting run of $A$ on $w$.
Using Lemma A.8, it is easy to prove our lower bounds.

Proposition 10 (restated). For the class of unrestricted games JWin is (a) 2-EXPTIME-hard with bounded replay, and (b) PSPACE-hard with no replay.
Proof. The proof that $\mathrm{JWin}^k(G_{\mathrm{all}})$ is 2-EXPTIME-hard for all $k \ge 2$ is by a reduction from the emptiness problem for ANWA, which is 2-EXPTIME-hard according to Proposition 7(a).
Given an ANWA $A$, let $G'$ be the VP-cfG constructed by the algorithm of Lemma A.8. Let $G$ be the game with an additional new function symbol $s$ which Romeo is allowed to rewrite by any string of the form $v(w)$ as defined in the proof of Lemma A.8. Then $L(A)$ is non-empty if and only if Romeo has a winning strategy on $\langle s\rangle\langle/s\rangle$ in $G$. This yields the desired reduction from emptiness for ANWA to $\mathrm{JWin}^2(G_{\mathrm{all}})$.
PSPACE-hardness of $\mathrm{JWin}^1(G_{\mathrm{all}})$ follows directly from the corresponding hardness result for the ANWA membership problem (Proposition 7(b)) along with the existence of a polynomial-time reduction proven in Lemma A.8.

Finite replacement languages
Proposition 11 (restated). For the class of unrestricted games with finite replacement languages, $\mathrm{JWin}(G)$ is (a) EXPTIME-complete with unbounded replay, and (b) PSPACE-complete with bounded or without replay.
Proof. As already mentioned in the body of the paper, the lower bounds follow from Theorem 4.3 in [14] and the proof of Proposition 10. Thus, only the upper bounds need to be established.
For (a), the non-emptiness test for $R_a \cap \overline{L(A)}$ can be replaced by a membership test $v \in \overline{L(A)}$, for each of the finitely many strings $v \in R_a$. This can be done in polynomial space by Proposition 7. The exponential time upper bound then immediately follows because the number of iterations of the fixpoint process is at most exponential and the final test whether $w$ is accepted by $A_{C[G]}$ needs only exponential time.
For (b), a polynomial space algorithm for a bounded number $k$ of replays works basically just as in the general case, by first computing the call effect $C^k[G]$ from the input game $G$, then computing from it the ANWA $A^k_C$ from Proposition 6 and finally simulating $A^k_C$ on the input string $w$. The initial call effect, $C^1[G]$, can again be computed in polynomial time. For each $i$, $C^{i+1}[G]$ can be computed from $C^i[G]$ in polynomial space and finally, whether $A^k_C$ accepts $w$ can be tested in polynomial space in $|w|$ and the number of states of $A^k_C$, that is, in the number of states of the target automaton of $G$.
Some care is needed though, as the (representation of the) intermediate automata and the resulting automaton $A^k_C$ can be of superpolynomial (at most exponential) size. However, as usual for space bounded computations, the information about $A^k_C$ and the intermediate automata can be recomputed whenever it is needed. The composition of these constantly many polynomial space computations then yields an overall polynomial space bound. It is crucial here that, as observed in the proof of Proposition 7, the evaluation of $A^k_C$ is possible in polynomial space in $|w|$ and the number of states of $A^k_C$. By a more complicated argument, the upper bound of Proposition 11 (a) can even be established in the case where the finite replacement languages are not given explicitly but by NWAs.

Proofs for Section 4
For our upper bounds, we formalise XML Schema$^{20}$ target languages by way of simple NWA (as defined in Section 4) and use similar techniques as in the upper bound proofs for Section 3. Lower bounds, on the other hand, will generally follow from lower bounds for context-free games on flat strings, as defined in [14].

Upper bounds
The general structure of the algorithms is the same as in Section 3. Technically, the two main parts of the proof are to show that SANWAs are suitable (SNWAs can be computed from XML Schemas, Proposition 15, and SANWAs from (simple) game effects, Proposition A.9) and to establish the complexity of SANWAs (Propositions A.11 and 17).

Suitability of simple NWAs
First off, we prove that simple NWA are at least as expressive as single-type tree grammars. The idea behind this is rather straightforward, as we only need to combine DFAs for each type's content model and, on reading some opening tag $\langle a\rangle$, start some DFA in a sub-computation to check the nested string between $\langle a\rangle$ and the associated $\langle/a\rangle$ for compliance with the content model of some type $X$. Thanks to the single-type property, the type $X$ is uniquely defined by $a$ and the context from which $\langle a\rangle$ was read, so we obtain a deterministic automaton as desired.

Proposition 15 (restated). From every single-type tree grammar $T$, a simple DNWA $A$ can be computed in polynomial time, such that $L(A) = L(T)$.
Proof. Let $T = (\Sigma, \Delta, S, P, \lambda)$ be a single-type tree grammar. We will construct a SNWA $A$ such that $L(T) = L(A)$.
Due to the single-type property, for each type $X \in \Delta$ and each $a \in \Sigma$, there is at most one type $X'$ in the content model of $X$ with $\lambda(X') = a$. Without loss of generality, assume that there is exactly one such type for each $X$ and $a$ (which can be done by adding a "dummy type" $X_\bot$ with $r_{X_\bot} = \emptyset$ to $T$), and denote this type by $\nu(X, a)$.
For each $X \in \Delta$, let $A_X = (P_X, \Delta, \delta_X, p_{0,X}, F_X)$ be a DFA deciding $L(r_X)$ (which can be computed from the deterministic regular expression $r_X$ in polynomial time). Assume w.l.o.g. that all $P_X$, $P_Y$ are disjoint for $X \neq Y$. Then, the SNWA $A = (Q, \Sigma, \delta, (p_0, 0), \{(p_f, 0)\})$ is defined as follows: $\delta((p, X), \langle a\rangle) = (p_{0,\nu(X,a)}, \nu(X, a))$ for each $a \in \Sigma$, $p \in P$, $X \in \Delta$,

$^{20}$ For more background on formalisations of XML Schema we refer the reader to [10].

To show that $L(T) = L(A)$, it suffices to show that for every $w \in WF(\Sigma)$ and $X \in \Delta$, it holds that $\langle\lambda(X)\rangle w \langle/\lambda(X)\rangle \in L(X)$ if and only if $\delta^*((p_{0,X}, X), w) \in F_{loc}(\lambda(X))$, where $L(X)$ is defined like $L(T)$ with root type $X$. The claimed equality then follows with $L(T) = L(S)$.
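The idea behind Proposition 15 can be illustrated by a small sketch. It is not the automaton $A$ from the proof (whose full transition function is not reproduced here) but a direct stack-based check under the same single-type assumption: on an opening tag, the child type is looked up from the current type and the label; on a closing tag, the content-model DFA of the finished type must be in a final state, and the parent's DFA advances on the child type. All names (`DFA`, `accepts`, `LABEL`, `CONTENT`) and the toy grammar are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Tag = Tuple[str, str]  # ("open", label) or ("close", label)


@dataclass
class DFA:
    """Content model of one type: a DFA over child *types* (assumed encoding)."""
    start: int
    finals: Set[int]
    trans: Dict[Tuple[int, str], int]  # (state, child type) -> state


def accepts(word: List[Tag], start_type: str,
            label: Dict[str, str], content: Dict[str, DFA]) -> bool:
    # nu[X][a]: the unique child type of X labelled a (single-type property).
    nu: Dict[str, Dict[str, str]] = {}
    for X, dfa in content.items():
        nu[X] = {}
        for (_, Y) in dfa.trans:
            if nu[X].setdefault(label[Y], Y) != Y:
                raise ValueError("grammar is not single-type")

    stack: List[Tuple[Optional[str], Optional[int]]] = []
    cur_type: Optional[str] = None   # type of the innermost open element
    cur_state: Optional[int] = None  # state of that element's content DFA
    done = False                     # True once the root element is closed
    for kind, a in word:
        if done:
            return False  # this sketch accepts a single tree only
        if kind == "open":
            if cur_type is None:
                if a != label[start_type]:
                    return False
                child = start_type
            else:
                child = nu[cur_type].get(a)
                if child is None:
                    return False
            stack.append((cur_type, cur_state))
            cur_type, cur_state = child, content[child].start
        else:
            if cur_type is None or label[cur_type] != a:
                return False
            if cur_state not in content[cur_type].finals:
                return False
            parent_type, parent_state = stack.pop()
            if parent_type is None:
                done = True
            else:
                nxt = content[parent_type].trans.get((parent_state, cur_type))
                if nxt is None:
                    return False
                parent_state = nxt
            cur_type, cur_state = parent_type, parent_state
    return done


def o(a: str) -> Tag:
    return ("open", a)


def c(a: str) -> Tag:
    return ("close", a)


# Toy grammar: Doc -> Item*, Item -> Text?, Text -> (empty).
LABEL = {"Doc": "doc", "Item": "item", "Text": "text"}
CONTENT = {
    "Doc": DFA(0, {0}, {(0, "Item"): 0}),
    "Item": DFA(0, {0, 1}, {(0, "Text"): 1}),
    "Text": DFA(0, {0}, {}),
}
```

Because the child type is determined by the parent type and the label alone, no guessing is needed at opening tags, which is what makes the check deterministic.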
Proposition A.9. There is an algorithm that computes from the call effect $C[G]$ of a simple game $G$ in polynomial time in $|C[G]|$ and $|G|$ a SANWA $A_C$ such that $L(A_C) = \mathrm{JWin}(G)$.
Proof. We construct $A^\ell_C$ almost as the automaton $A_C$ in the proof of Proposition 6. However, as they are mimicking games, the alternating transitions in $A_C$ occur at closing tags, whereas the definition of simple ANWAs requires that alternating transitions occur only at opening tags. Thus we slightly adapt the construction as follows.
• For every $q, q' \in Q$ and $a \in \Sigma$, $\delta^\ell_C(q, q', \langle/a\rangle)$ is defined via a target state function $t_C$ as per the definition of SANWA.
The target state function $t_C$ and final state function $F_{loc,C}$ witnessing the simplicity of $A^\ell_C$ are defined by $t_C(q, a) \overset{\mathrm{def}}{=} q$ and $F_{loc,C}(a) \overset{\mathrm{def}}{=} F_{loc}(a)$, respectively. This automaton obviously fulfils both simplicity conditions, by construction and the simplicity of $A(T)$. The correctness of the automaton is proven analogously to the proof of Lemma A.6.

Complexity of simple ANWAs
To prove the upper bound in Proposition 17 (a), i.e., that non-emptiness for SANWAs is in PSPACE, we start off by proving a somewhat stronger result: the problem of determining, given an NWA $A$ and a SANWA $B$, whether there is a nested word accepted by both $A$ and $B$, is in PSPACE. The standard approach for proving a result of this sort (a product construction between two NWA or two SANWA) is generally not feasible here, as SANWA are less expressive than NWA (so $A$ cannot in general be transformed into a SANWA) and transforming $B$ into an NWA might incur a doubly exponential blow-up in size. Therefore, a PSPACE algorithm has to be constructed especially for this problem; it uses the following pumping property for strings in $L(A) \cap L(B)$.
As in the previous section, the width of a nested word is the maximum number of children of any node in its corresponding forest. Its root width is just the number of trees in its forest. The depth of a nested word is the depth of its canonical forest representation.
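The three measures can be computed in a single pass over the tag sequence, as the following sketch shows. The representation of nested words as lists of `("open", label)` / `("close", label)` pairs and the function name `measures` are assumptions made for illustration. The sketch follows the definition literally, so the number of roots is reported as root width and is not counted towards the width, and it assumes that a forest consisting only of roots (as well as the empty word) has depth 0.

```python
from typing import List, Tuple

Tag = Tuple[str, str]  # ("open", label) or ("close", label)


def measures(word: List[Tag]) -> Tuple[int, int, int]:
    """Return (root_width, width, depth) of a well-nested word."""
    root_width = width = depth = 0
    open_labels: List[str] = []   # labels of the currently open nodes
    child_counts: List[int] = []  # children seen so far for each open node
    for kind, label in word:
        if kind == "open":
            if child_counts:
                # The new node is a child of the innermost open node.
                child_counts[-1] += 1
                width = max(width, child_counts[-1])
            else:
                # The new node is the root of a new tree of the forest.
                root_width += 1
            open_labels.append(label)
            child_counts.append(0)
            # Assumed convention: roots are at depth 0.
            depth = max(depth, len(open_labels) - 1)
        elif kind == "close":
            if not open_labels or open_labels.pop() != label:
                raise ValueError("word is not well-nested")
            child_counts.pop()
        else:
            raise ValueError(f"unknown tag kind: {kind!r}")
    if open_labels:
        raise ValueError("word is not well-nested")
    return root_width, width, depth


# <a><b></b><c></c><b></b></a><d></d>
example = [("open", "a"), ("open", "b"), ("close", "b"),
           ("open", "c"), ("close", "c"), ("open", "b"), ("close", "b"),
           ("close", "a"), ("open", "d"), ("close", "d")]
```

On `example`, the forest has two trees, the node labelled `a` has three children, and the deepest nodes are one level below a root.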
Lemma A.10. Let $A = (Q_A, \Sigma, \delta_A, q_{0,A}, F_A)$ be an NWA and let $B = (Q_B, \Sigma, \delta_B, q_{0,B}, F_B)$ be a SANWA with type alphabet $\Delta$, final state function $F_{loc}$, target state function $t$ and test state $q_? \in Q$. Then $L(A) \cap L(B) \neq \emptyset$ if and only if there exists a string in $L(A) \cap L(B)$ of width at most $2^{|Q_B|} \cdot |\Sigma| \cdot |Q_A|$ and depth at most $3(|\Sigma|+1)|Q_A|^2|\Delta|$.
Proof. The "if" direction is trivial. For "only if", assume for the sake of contradiction that $L(A) \cap L(B) \neq \emptyset$, but there is no string in $L(A) \cap L(B)$ fulfilling the claimed upper bounds on both width and depth.
First, we observe that for all words $w \in L(B)$, all nodes of any depth $i$ in an arbitrary accepting run of $B$ on $w$ contain only linear states of the same type, i.e. if $\rho = (D, \lambda)$ is an accepting run of $B$ on $w$, and $x, y \in D$ with $|x| = |y| = i$ for any $i \in \mathbb{N}$, and if $\lambda(x) = (p, X)$ and $\lambda(y) = (p', Y)$, then $X = Y$. This can be proven by a simple induction on $i$.
In the remainder of this proof, if $\rho$ is a run of $B$ on some string $w$ and $\rho'$ is a sub-run of $\rho$ on a nested substring $w'$ of $w$, we call $\rho'$ successful if all leaves of $\rho'$ are accepting with respect to the context of $w'$, i.e. if all leaves of $\rho'$ are in $F$ in case $w' = w$, or if all leaves of $\rho'$ are in $F_{loc}(a)$ in case $\langle a\rangle w' \langle/a\rangle$ is a substring of $w$. Note that due to the definition of runs, all test subruns of $\rho'$ (i.e. subruns starting with horizontal state $q_?$) have to accept. Furthermore, by the above observation, all subtrees of $\rho'$ immediately below its root start from the same state, as that state is uniquely given by the tags enclosing $w'$ and the root type of $\rho'$.
First off, let $w \in L(A) \cap L(B)$ be a string of width greater than $2^{|Q_B|} \cdot |\Sigma| \cdot |Q_A|$ and minimal length. We now prove that $L(A) \cap L(B)$ contains a string shorter than $w$, in contradiction to the assumed minimality.
Let $w'$ be a maximum-length nested substring of $w$ with root width greater than $2^{|Q_B|} \cdot |\Sigma| \cdot |Q_A|$. Let $\rho$ be an accepting run of $B$ on $w$ and $\rho'$ its sub-run on $w'$. Similarly, since $A$ may also be viewed as an ANWA, there is an accepting run $\pi$ of $A$ on $w$ in which each non-leaf node has only a single child. Let $\pi'$ be the sub-run of $\pi$ on $w'$. For $k = 1, \ldots, |w'|$, let $k\text{-layer}(w') \in \Sigma \times ((\mathcal{P}(Q_B) \times Q_A) \cup (\mathcal{P}(Q_B^2) \times Q_A^2))$ be defined as follows: if $w'_k = \langle a\rangle$, $(q_1, p_1), \ldots, (q_\ell, p_\ell)$ are all pairs of states at depth $k$ in $\rho'$ and $(q, p)$ is the state pair of depth $k$ in $\pi'$, then $k\text{-layer}(w') = (a, \{(q_1, p_1), \ldots, (q_\ell, p_\ell)\}, (q, p))$; and if $w'_k = \langle/a\rangle$, $q_1, \ldots, q_\ell$ are all states at depth $k$ in $\rho'$ and $q$ is the state at depth $k$ in $\pi'$, then $k\text{-layer}(w') = (a, \{q_1, \ldots, q_\ell\}, q)$. As the root width of $w'$ is greater than $|\Sigma \times \mathcal{P}(Q_B) \times Q_A|$, there are numbers $i < j < |w'|$ such that $i\text{-layer}(w') = j\text{-layer}(w')$ and the substrings $w'_1 \cdots w'_i$ and $w'_1 \cdots w'_j$ (and therefore also $w'_{i+1} \cdots w'_j$) are well-nested. The claim, then, is that there are accepting runs of $A$ and $B$ on the string obtained from $w$ by deleting $w'_{i+1} \cdots w'_j$ from $w'$.

Assume now, again for the sake of contradiction, that there is no string in $L(A) \cap L(B)$ of width at most $2^{|Q_B|} \cdot |\Sigma| \cdot |Q_A|$ and depth at most $3(|\Sigma|+1)|Q_A|^2|\Delta|$. By the above part of the proof, this means that all strings fulfilling the requirement on width must be of a depth exceeding $3(|\Sigma|+1)|Q_A|^2|\Delta|$. Let $w$ be such a string of minimal length, and let $\rho$ be an accepting run of $B$ and $\pi$ an accepting run of $A$ on $w$.
As the nesting depth of $w$ is greater than $3(|\Sigma|+1)|Q_A|^2|\Delta|$, there exist well-nested strings $w_1$ and $w_2$ such that for some $a \in \Sigma$,
• $\langle a\rangle w_1 \langle/a\rangle$ is a substring of $w$,
• $\langle a\rangle w_2 \langle/a\rangle$ is a substring of $w_1$,
• all sub-runs of $\rho$ on $w_1$ and $w_2$ start from the same state $q_a \in Q_B$,
• either all sub-runs of $\rho$ on $w_1$ and $w_2$ are unsuccessful or there exist successful runs in $\rho$ on both $w_1$ and $w_2$, and
• the states of $A$ according to $\pi$ before and after reading $\langle a\rangle w_1 \langle/a\rangle$ are the same as those before and after reading $\langle a\rangle w_2 \langle/a\rangle$.
The claim is that both $A$ and $B$ have accepting runs on the string obtained from $w$ by replacing $w_1$ with $w_2$. As $w_2$ is a proper substring of $w_1$, proving this claim yields the desired contradiction to the minimal length of $w$ and thus the claim of Lemma A.10.

Proposition A.11. There is an alternating algorithm that tests in polynomial time whether, for an NWA $A$ and a SANWA $B$, it holds that $L(A) \cap L(B) = \emptyset$.
Proof. We formulate the claimed algorithm as a game for two players, whom we will call Adam and Eve to avoid confusion with the players for context-free games. This game will always terminate after at most polynomially many rounds, so an alternating polynomial-time algorithm can easily be constructed from it by branching nondeterministically (resp. universally) for the moves for Eve (resp. Adam) and accepting the input if and only if Eve wins.
We will construct the game such that Eve has a winning strategy on input NWA $A = (Q_A, \Sigma, \delta_A, q_{0,A}, F_A)$ and SANWA $B = (Q_B, \Sigma, \delta_B, q_{0,B}, F_B)$ with final state function $F_{loc}$, test state $q_?$ and target state function $t$ if and only if $L(A) \cap L(B) \neq \emptyset$. Eve's goal in this game is to prove that there is a string that is accepted by both $A$ and $B$ without writing down that string explicitly; by Lemma A.10, it suffices to examine strings of at most exponential width and polynomial depth, but such a string still cannot be explicitly spelled out using only polynomial space. We therefore represent a string implicitly by the behaviour it induces in $A$ and $B$.
Game positions for Eve consist of two states $q_1, q_2 \in Q_A$, a function $S : Q_B \to \mathcal{P}(Q_B)$ and two numbers $c, n \ge 0$, and the game is constructed in such a way that Eve has a winning strategy from position $(q_1, q_2, S, c, n)$ if and only if there is a string $w$ of root width at most $2^c$ and nesting depth at most $n$ such that $q_1 \overset{w}{\leadsto}_A q_2$, and for every $q \in Q_B$ there is a run of $B$ on $w$ beginning in $q$ and ending inside $S(q)$. We write $c_0$ for $|Q_B| + \log(|\Sigma| \cdot |Q_A|)$ and $n_0$ for $3(|\Sigma|+1)|Q_A|^2|\Delta|$. Lemma A.10 then guarantees that $L(A) \cap L(B)$ is nonempty if and only if there is a state $q_f \in F_A$ and a function $S$ with $S(q_{0,B}) \subseteq F_B$ such that Eve has a winning strategy from position $(q_{0,A}, q_f, S, c_0, n_0)$.
In any position $(q_1, q_2, S, c, n)$, Eve has the following options:
• If $c > 0$, she may choose to play a concatenation round, asserting that $w = v_1 v_2$ for strings $v_1, v_2 \in WF(\Sigma)$ whose root width is at most half that of $w$. In this case, she chooses two functions $S_1, S_2 : Q_B \to \mathcal{P}(Q_B)$, corresponding to strings $v_1, v_2$ as above, and an "in-between" state $q' \in Q_A$. The functions $S_1$ and $S_2$ have to fulfil the condition that for each $q \in Q_B$ it holds that $S(q) = \bigcup_{p \in S_1(q)} S_2(p)$; if $S_1$ and $S_2$ do not fulfil this condition, Adam wins. Otherwise, Adam has a choice of which part of Eve's assertion he wants to contest, so he may choose as a follow-up position either $(q_1, q', S_1, c-1, n)$ or $(q', q_2, S_2, c-1, n)$.
• If $n > 0$, Eve may choose to play a nesting round (with $a \in \Sigma$), asserting that $w = \langle a\rangle v \langle/a\rangle$ for some $v \in WF(\Sigma)$. To this end, she first chooses an alphabet symbol $a$ and a function $S'$ corresponding to $v$ as above, as well as states $q'_1, q'_2, p \in Q_A$ such that $\delta_A(q_1, \langle a\rangle) = (q'_1, p)$ and $\delta_A(q'_2, p, \langle/a\rangle) = q_2$ (if no such states exist, Adam wins immediately). Next, Adam chooses some $q \in Q_B$ on which to contest Eve's claim. In response, Eve picks a state $p' \in Q_B$ and a set of states $P = \{p_1, \ldots, p_k\} \subseteq Q_B$ such that $(\{p'\} \times P) \models \delta_B(q, \langle a\rangle)$ and $S(q) = \bigcup_{p \in P \setminus \{q_?\}} \{t(p, a)\}$. If she cannot choose such a set, Adam wins.
If Adam has not won by this point, he has to contest Eve's claim that there is a string $v$ such that $B$ has a successful run on $v$. If $q_? \in P$ and $S'(p') \not\subseteq F(a)$, the string $v$ claimed by Eve fails the test subrun mandated by $B$ branching with $q_?$, so in this case, Adam wins. Otherwise, the game continues from position $(q'_1, q'_2, S', c_0, n-1)$, as the root width of the substring $v$ is bounded by $2^{c_0}$.
• Eve may choose to solve (with $a \in \Sigma$), asserting that $w = \langle a\rangle\langle/a\rangle$. In this case, she chooses a symbol $a \in \Sigma$. Similar to a nesting round, Adam then picks a state $q \in Q_B$ on which to contest Eve's claim, to which Eve responds by choosing a state $p \in Q_B$ and a set $P \subseteq Q_B$. The game then ends and a winner is determined. Eve wins if and only if all of the following conditions are fulfilled: (a) There are states $p', q' \in Q_A$ such that $\delta_A(q_1, \langle a\rangle) = (q', p')$ and $\delta_A(q', p', \langle/a\rangle) = q_2$; (b) $(\{p\} \times P) \models \delta_B(q, \langle a\rangle)$; (c) $S(q) = \bigcup_{p \in P \setminus \{q_?\}} \{t(p, a)\}$; (d) If $q_? \in P$, then $p \in F(a)$.
• Eve may choose to solve with $\epsilon$, asserting that $w = \epsilon$. In this case, the game ends and Eve wins if and only if $q_1 = q_2$ and for each $q \in Q_B$ it holds that $S(q) = \{q\}$.
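The bookkeeping on the functions $S : Q_B \to \mathcal{P}(Q_B)$ in the concatenation round and in the $\epsilon$ case amounts to composing and comparing state-set summaries. The following sketch illustrates only this part of the game; the dictionary encoding and the names `compose`, `concatenation_ok` and `epsilon_summary` are assumptions for illustration, not part of the construction above.

```python
from typing import Dict, FrozenSet, Hashable, Iterable

State = Hashable
Summary = Dict[State, FrozenSet[State]]  # q -> S(q), a subset of Q_B


def compose(s1: Summary, s2: Summary) -> Summary:
    """Summary of v1 v2: q -> union of S2(p) over all p in S1(q)."""
    return {q: frozenset().union(*(s2[p] for p in s1[q])) for q in s1}


def concatenation_ok(s: Summary, s1: Summary, s2: Summary) -> bool:
    """Check Eve's obligation S(q) = U_{p in S1(q)} S2(p) for every q."""
    return s == compose(s1, s2)


def epsilon_summary(states: Iterable[State]) -> Summary:
    """Summary required when Eve solves with the empty word: S(q) = {q}."""
    return {q: frozenset({q}) for q in states}


# Toy summaries over Q_B = {0, 1, 2} (hypothetical values).
S1: Summary = {0: frozenset({1, 2}), 1: frozenset({1}), 2: frozenset()}
S2: Summary = {0: frozenset({0}), 1: frozenset({2}), 2: frozenset({0, 1})}
```

If Eve proposes `S1` and `S2` for the two halves, Adam wins immediately unless the current `S` equals `compose(S1, S2)`; the summary of the empty word acts as a neutral element for this composition.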
Since each round that does not end the game decreases the number of remaining nesting rounds or the number of remaining concatenation rounds, and the number of remaining concatenation rounds only increases at the end of a nesting round, the total number of rounds starting from $(q_{0,A}, q_f, S, c_0, n_0)$ is bounded by $c_0 n_0$, which is polynomial in the size of $A$ and $B$. It is easy to see that each choice by Eve or Adam requires only a polynomial-size certificate, and that each check for winning conditions is computable in polynomial time. Therefore, an alternating algorithm checking whether Eve has a winning strategy on this game (as described above) has a polynomial upper bound on its running time. It remains to be shown that this algorithm indeed tests $A$ and $B$ for intersection emptiness, i.e. that Eve has a winning strategy from $(q_{0,A}, q_f, S, c_0, n_0)$ for some $q_f \in F_A$ if and only if $L(A) \cap L(B) \neq \emptyset$.
To prove this claim, we show that the following statements are equivalent: (1) Eve has a winning strategy from position $(q_1, q_2, S, c, n)$; (2) There is a string $w \in WF(\Sigma)$ of width at most $2^{|Q_B|} |\Sigma| |Q_A|$, root width at most $2^c$ and depth at most $n$ such that there is a run of $A$ on $w$ from $q_1$ to $q_2$, and for each $q \in Q_B$, there is a successful run of $B$ on $w$ from $q$ ending inside $S(q)$.
$(1) \Rightarrow (2)$: Assume Eve has a winning strategy $\sigma$ from position $(q_1, q_2, S, c, n)$. We prove (2) by induction on the structure of $\sigma$.
If Eve solves with $\epsilon$ as her first move according to $\sigma$, the string $w = \epsilon$ obviously fulfils the claim of (2).
If Eve's first move according to $\sigma$ is to solve with some $a \in \Sigma$, then $w = \langle a\rangle\langle/a\rangle$ fulfils the claim of (2). Since $c, n \ge 0$, $w$ fulfils the desired upper bounds on nesting depth and width; winning condition (a) ensures the existence of a run of $A$; and as for each $q \in Q_B$ that Adam chooses, Eve can respond with a set of horizontal states compliant with the transition formulae of $B$ according to winning conditions (b) to (d), the desired runs of $B$ on $w$ exist as well.
If Eve begins with a concatenation round according to $\sigma$, it follows that there exist a state $q' \in Q_A$ and functions $S_1, S_2 : Q_B \to \mathcal{P}(Q_B)$ such that for each $q \in Q_B$ it holds that $S(q) = \bigcup_{p \in S_1(q)} S_2(p)$ and Eve has a winning strategy on both $(q_1, q', S_1, c-1, n)$ and $(q', q_2, S_2, c-1, n)$. By induction, this implies that there are strings $v_1, v_2 \in WF(\Sigma)$ of width at most $2^{|Q_B|} |\Sigma| |Q_A|$, root width at most $2^{c-1}$ and depth at most $n$ for which there exist appropriate runs of $A$ and $B$; it is easy to see that $w = v_1 v_2$ fulfils the width and depth requirements of the claim, and that the claimed runs of $A$ and $B$ on $w$ can be constructed by combining those on $v_1$ and $v_2$.
If Eve starts by playing a nesting round with some $a \in \Sigma$, there exists a function $S'$ as well as states $q'_1, q'_2, p \in Q_A$ such that $\delta_A(q_1, \langle a\rangle) = (q'_1, p)$ and $\delta_A(q'_2, p, \langle/a\rangle) = q_2$. Furthermore, for each $q \in Q_B$, there is a state $p \in Q_B$ and a set $P \subseteq Q_B$ such that $(\{p\} \times P) \models \delta_B(q, \langle a\rangle)$ and $S(q) = \bigcup_{p' \in P \setminus \{q_?\}} \{t(p', a)\}$, and if $q_? \in P$ then $S'(p) \subseteq F(a)$. Finally, Eve has a winning strategy starting from position $(q'_1, q'_2, S', c_0, n-1)$. By induction, there exists a string $v$ of width at most $2^{|Q_B|} |\Sigma| |Q_A|$ and depth at most $n-1$ for which there exist appropriate runs of $A$ and $B$; the string $w = \langle a\rangle v \langle/a\rangle$ therefore fulfils the claimed restrictions on depth and width. Again, it is easy to see that a run of $A$ on $w$ can be constructed from the one on $v$. To construct the desired runs of $B$ on $w$, denote the run on $v$ starting at $p$ and ending in $S'(p)$ by $\rho$ and let $q \in Q_B$. A run on $w$ starting at $q$ is then constructed as follows: The root node, labelled $q$, has $\{p\} \times P$ as the set of labels of its children. Each of these nodes $(p, p')$ is the root of a copy of $\rho$, whose leaves are all inside $S'(p)$; if $p' = q_?$, the leaves of the corresponding copy of $\rho$ have no further children; otherwise, their only child is labelled with the state $t(p', a)$. Using the above properties and the definition for SANWA semantics, it is easy to verify that the tree thus constructed is indeed a run of $B$ on $w$ starting at $q$ and ending inside $S(q)$.
$(2) \Rightarrow (1)$: This part of the proof is by an induction on the structure of $w$ analogous to the above proof of $(1) \Rightarrow (2)$.
(b) The membership problem for SANWA is decidable in polynomial time.
Proof. That non-emptiness for SANWAs is in PSPACE follows directly from Proposition A.11, as alternating polynomial time equals polynomial space.
PSPACE-hardness can be proven by a simple reduction (with a constant-sized NWA $A$ accepting $WF(\Sigma)$) from the nonemptiness problem for SANWA, which in turn is PSPACE-hard by reduction from the nonemptiness problem for alternating finite automata, interpreting flat strings $w_1 \ldots w_n \in \Sigma^*$ as nested strings $\langle w_1\rangle\langle/w_1\rangle \ldots \langle w_n\rangle\langle/w_n\rangle \in WF(\Sigma)$ of nesting depth 0 (and vice versa). It is then quite easy to construct from an AFA $B'$ a SANWA $B$ such that $L(B') \neq \emptyset$ if and only if $B$ accepts some nested string of depth 0.
To show (b), that the membership problem for SANWAs can be decided in polynomial time, it suffices to show that the problem can be decided by an alternating Turing machine with logarithmic space. The computation of a SANWA $A$ can be easily simulated by an alternating Turing machine $M$. To this end, the TM $M$ could branch existentially and universally, just as $A$ does. In particular, on a word $w$ it would have exactly one run for each run of $A$ on $w$. However, such a naive simulation would need to remember the stack contents to compute $t(p, a)$ at the next closing tag $\langle/a\rangle$, and thus the space required would be proportional to the nesting depth of the input word.
To achieve a logarithmic space bound, we can modify $M$ as follows. Whenever a transition at an opening tag $\langle a\rangle$ yields a pair $(q, p)$ with $p \neq q_?$, the computation branches universally into two subcomputations: one moves directly to the corresponding closing tag $\langle/a\rangle$ and continues after that from state $t(p, a)$. The other proceeds as $A$ on the current subword but does not need to remember $p$. Whenever such a computation reaches a closing tag, it accepts. Test subruns, starting from a pair $(q, q_?)$, are simulated slightly differently: they remember the nesting depth of the opening tag $\langle a\rangle$ and behave at the corresponding closing tag just as $A$ would. However, if a test subrun starts a test subsubrun, the latter only needs to remember the new nesting depth, as it can stop when the subsubrun has finished.
The correspondence between runs of A and the ATM can be shown by induction on the nesting depth of the input word w. In particular, the ATM accepts just if A does.
More formally, we claim that Algorithm 1 evaluates a SANWA $A = (Q, \Sigma, \delta, q_0, F)$ with local acceptance function $F_{loc}$, test state $q_?$ and target state function $t$ on a nested word $w = w_1 \ldots w_n \in WF(\Sigma)$ (with $w_i \in \langle\Sigma\rangle \cup \langle/\Sigma\rangle$ for each $i \in [n]$). To this end, it keeps track of a current state $q \in Q$ of $A$, two indices $i$ and $j$ denoting the starting and ending position in $w$ of the substring to be verified in its current run, and an index $f \in \Sigma \uplus \{0\}$ that tracks whether the current string is to be verified against the accepting states of $A$ ($f = 0$) or some $F_{loc}(a)$ ($f = a$). To simplify notation for the former case, we let $F_{loc}(0) \overset{def}{=} F$.

Algorithm 1 (excerpt, lines 7-16):
     7: Choose alternatingly $(q', p)$ according to $\delta(q, w_i)$
     8: if $p \neq q_?$ then
     9:     $i \leftarrow$ (position of closing tag associated with $w_i$) $+ 1$
    10:     $q \leftarrow t(p, w_i)$
    11: else
    12:     // $p = q_?$; start test subrun:
    13:     $q \leftarrow q'$
    14:     if $w_{i+1} \in \langle\Sigma\rangle$ then
    15:         $i \leftarrow i + 1$
    16:         $j \leftarrow$ (position of closing tag associated with $w_i$)
        ...
        Reject

We first elaborate on how to execute line 7 of Algorithm 1 in alternating logarithmic space. Assume that each transition function $\delta(q, \langle a\rangle)$ in $A$ is given in prefix notation, i.e. formulas are of the form (i) $\wedge(\varphi_1, \varphi_2)$ or (ii) $\vee(\varphi_1, \varphi_2)$ or (iii) $(q', p)$. In case (i), the algorithm guesses universally whether to branch into $\varphi_1$ or $\varphi_2$, in case (ii) this choice is existential, and in case (iii), a result is fixed. Clearly, this is feasible in alternating logarithmic space and equivalent to first choosing existentially a set $P \subseteq Q^2$ with $P \models \delta(q, \langle a\rangle)$ and then universally picking a tuple $(q', p) \in P$.
It is also easy to see that Algorithm 1 terminates (as the value of i increases in each iteration of the loop in line 5 while j only ever decreases) and requires only logarithmic space.
It remains to be proven that Algorithm 1 is correct. We do this by proving that, for any nested word $w$, state $q \in Q$, $f \in \Sigma \cup \{0\}$ and indices $i, j$ such that $w_i \ldots w_j$ is a well-nested string, lines 5-24 of Algorithm 1 accept in an alternating fashion if and only if there is a run of $A$ on $w_i \ldots w_j$ starting at $q$ and ending inside $F_{loc}(f)$. The proof is by induction on the structure of $w$ and uses as its crucial component the above insight that picking a follow-up state tuple from $\delta(q, w_i)$ in line 7 is equivalent to universally selecting a child of a depth $i$ node labelled $q$ in some run of $A$, and that each existential strategy for the alternating execution of Algorithm 1 corresponds to a single run of $A$ in this way.
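Lines 9 and 16 of Algorithm 1 both rely on locating the closing tag associated with an opening tag. The following Python sketch illustrates that one step only; it is not the authors' algorithm. The list-of-strings representation of nested words (`"<a>"`, `"</a>"`), the 0-based indices, and the names `is_open` and `matching_close` are assumptions made for illustration. The only state kept during the scan is a depth counter, which fits the logarithmic space bound.

```python
def is_open(tag: str) -> bool:
    """True for an opening tag such as "<a>", False for a closing tag such as "</a>"."""
    return tag.startswith("<") and not tag.startswith("</")


def matching_close(w: list[str], i: int) -> int:
    """Return the (0-based) index of the closing tag associated with the opening tag w[i]."""
    if not is_open(w[i]):
        raise ValueError("w[i] must be an opening tag")
    depth = 0  # number of currently unmatched opening tags; at most len(w)
    for j in range(i, len(w)):
        depth += 1 if is_open(w[j]) else -1
        if depth == 0:
            return j
    raise ValueError("word is not well-nested")


# Hypothetical example word: <a><b></b><c></c></a><d></d>
w = ["<a>", "<b>", "</b>", "<c>", "</c>", "</a>", "<d>", "</d>"]
print(matching_close(w, 0), matching_close(w, 1), matching_close(w, 6))
```
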

Lower bounds
All of our lower bounds for simple games follow from lower bounds for cfGs on flat strings, that is, games on strings as defined in [14], with target and replacement languages represented by deterministic regular expressions. Lower bound results for replay-free games and bounded replay with finite replacement languages and target languages represented as DFAs were already proven in [14]. They can be transferred to games with target languages described by deterministic regular expressions. As an entirely new result compared to [14], we prove here a lower bound for bounded replay (actually, Call depth 2 suffices) and later sketch how these results carry over to nested word cfGs.

Intuitively, a regular expression is deterministic if, when a string is read from left to right, each of its positions can be matched uniquely with a symbol of the regular expression, without lookahead. Formally, for a regular expression $r$, let $D(r)$ be the expression in which the $i$-th symbol $\sigma$ of $r$ is replaced by $(\sigma, i)$, e.g. $D((a+b)^*a) = ((a,1)+(b,2))^*(a,3)$. We call $r$ deterministic if there do not exist strings $w, v, v'$, a symbol $\sigma$ and numbers $i, j$ such that $w(\sigma, i)v \in L(D(r))$, $w(\sigma, j)v' \in L(D(r))$ and $i \neq j$.

Proof. We prove this by reduction from the complement of the problem Corridor Tiling: Given a set $U$ of tiles, relations $V, H \subseteq U \times U$ of vertical and horizontal constraints, initial and final tiles $u_i, u_f \in U$ and a number $n$ (represented in unary), is there a correct tiling of width $n$ and arbitrary height that starts with $u_i$, ends with $u_f$ and violates none of the (vertical or horizontal) constraints?
Formally, a tiling of width $n$ and height $m$ is a mapping $t : [n] \times [m] \to U$.
Corridor Tiling asks whether an instance $I = (U, u_i, u_f, V, H, n)$ has a valid tiling of width $n$. It is well known that this problem is PSPACE-complete (see, e.g., [7] for a slightly different definition of tilings). Since PSPACE is closed under complementation, the complement of Corridor Tiling is complete for PSPACE as well.
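To make the constraints of Corridor Tiling concrete, the following Python sketch checks whether a given tiling is valid. It is an illustration under stated assumptions, not part of the reduction: a tiling is given as a list of rows (each a list of tiles), the initial tile is taken to be the first tile of the first row, the final tile the last tile of the last row, and pairs in `H` and `V` are read as (left, right) and (upper, lower). The function name `is_valid_tiling` is hypothetical.

```python
def is_valid_tiling(rows, V, H, u_i, u_f, n):
    """Check a candidate tiling against an instance (U, u_i, u_f, V, H, n)."""
    # Every row must have width exactly n, and there must be at least one row.
    if not rows or any(len(row) != n for row in rows):
        return False
    # Assumption: the tiling starts at the top-left and ends at the bottom-right tile.
    if rows[0][0] != u_i or rows[-1][-1] != u_f:
        return False
    # Horizontal constraints: each pair of neighbours within a row must be in H.
    for row in rows:
        if any((a, b) not in H for a, b in zip(row, row[1:])):
            return False
    # Vertical constraints: each pair of neighbours in consecutive rows must be in V.
    for upper, lower in zip(rows, rows[1:]):
        if any((a, b) not in V for a, b in zip(upper, lower)):
            return False
    return True


# Hypothetical toy instance with two tiles "a" and "b" and width 2.
H = {("a", "b")}
V = {("a", "a"), ("b", "b")}
good = [["a", "b"], ["a", "b"]]
bad = [["a", "b"], ["b", "b"]]
print(is_valid_tiling(good, V, H, "a", "b", 2), is_valid_tiling(bad, V, H, "a", "b", 2))
```
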
We give here a reduction from the complement of Corridor Tiling to JWin.
The reduction constructs, given an instance $I = (U, u_i, u_f, V, H, n)$ for Corridor Tiling, a game $G = (\Sigma, \Gamma, R, T)$ and a symbol $s$ from $\Gamma$ such that Juliet has a winning strategy on $s$ if and only if $I$ does not have a valid corridor tiling. The basic idea is that, after Juliet's first Call move on $s$, Romeo will answer with an encoding $w$ of a valid corridor tiling, if one exists. With her Call moves of depth 2, Juliet may then try to flag inconsistencies (i.e. constraint violations) in the tiling given by Romeo; finally, the target automaton should accept a tiling if Juliet did indeed point out an actual inconsistency.
The game G is over an alphabet Σ which is obtained by the union of U with a setÛ of disjoint copiesû of all elements u P U and the set ts, The replacement and target languages are described below.
A tiling candidate (for $I$) is a string of the form $((U\,?_v\,?_h)^n\,\#)^*$, whose length-$n$ blocks of elements from $U$ are to be interpreted as lines of a tiling, with protest symbols $?_v$, $?_h$ after each tile and a line separator symbol $\#$ at the end of each line. The replacement language $R_s$ consists of all tiling candidates $v$ such that $u_i$ is the first symbol of $v$. It is easy to see that $R_s$ can be described by a DRE of polynomial size in $|I|$. The other replacement languages are very simple: $R_u = \{\hat{u}\}$ for each $u \in U$, $R_{?_h} = \{!_h\}$ and $R_{?_v} = \{!_v\}$.
The construction of the target language $T$ is best motivated by sketching how plays can proceed on the input string $s$. First, Juliet should be forced to play Call on $s$ and allow Romeo to actually give a candidate for a valid tiling. Therefore, $s \notin T$.
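The shape of tiling candidates can be illustrated with a short Python sketch. It encodes a tiling, given as a list of rows, into a sequence of symbols and checks the form described above. The symbol spellings `"?v"`, `"?h"`, `"#"` and the function names are assumptions chosen for illustration.

```python
def encode_candidate(rows):
    """Encode rows of tiles as a tiling candidate: tile, ?v, ?h per tile, then # per line."""
    out = []
    for row in rows:
        for tile in row:
            out += [tile, "?v", "?h"]
        out.append("#")
    return out


def is_tiling_candidate(symbols, U, n):
    """Check that symbols consists of zero or more blocks of n (tile, ?v, ?h) triples followed by #."""
    block = 3 * n + 1  # length of one encoded line
    if len(symbols) % block != 0:
        return False
    for start in range(0, len(symbols), block):
        line = symbols[start:start + block]
        for k in range(n):
            if line[3 * k] not in U or line[3 * k + 1] != "?v" or line[3 * k + 2] != "?h":
                return False
        if line[-1] != "#":
            return False
    return True


def in_R_s(symbols, U, n, u_i):
    """Membership in R_s: a tiling candidate whose first symbol is the initial tile u_i."""
    return bool(symbols) and symbols[0] == u_i and is_tiling_candidate(symbols, U, n)


candidate = encode_candidate([["a", "b"], ["a", "b"]])
print(candidate)
```

In this encoding, vertically adjacent tiles are exactly $3n + 1$ positions apart, which is what makes a fixed-distance check in the target expression possible.
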
By the definition of $R_s$, Romeo responds to this Call with a tiling candidate which already begins with the correct tile. It is now Juliet's task to flag an error in this tiling, i.e. either
• two tiles separated by $(n-1)$ tiles with corresponding protest symbols and one line separator symbol (a potential vertical error), or
• two tiles separated by exactly 2 protest symbols (a potential horizontal error), or
• a single tile at the end of $v$ (a potential incorrect final tile).
To flag any tiles, Juliet plays Call on them, forcing Romeo to replace any called tile $x$ by a marked tile $\hat{x}$. If the marked tiles indeed make up an error, we want Juliet to win, so the DRE for $T$ should describe such strings. If, on the other hand, Juliet tries to cheat by marking too few or too many tiles, or tiles that do not make up an error, she should lose the game.
To allow easy DRE-based checking of the three types of errors mentioned above, Juliet also has to specify the type of error right after the first tile she flagged; in case of a horizontal (vertical) error, she has to Call $?_h$ ($?_v$) to have it replaced with $!_h$ ($!_v$). This basically "tells" the target DRE what sort of error to check for. An incorrect final tile does not need its own type of protest symbol, because (as we will see) a flagged inconsistency of this sort can be recognised by a DRE "as is".
To construct the target language DRE, we first define some abbreviations:
• For any set $S = \{s_1, \ldots, s_k\}$ and REs $\alpha_s$ for each $s \in S$, $\bigoplus_{s \in S} \alpha_s$ stands for the RE $\alpha_{s_1} + \ldots + \alpha_{s_k}$;
• $U'$ denotes the DRE $\bigoplus_{u \in U} u\,?_v\,?_h$, and $(U' + \#)^k$ the $k$-fold repetition of $(U' + \#)$;
• for each $u \in U$, $V_u \overset{def}{=}\; !_v\,?_h\,(U' + \#)^n \bigl(\bigoplus_{(u,u') \notin V} \hat{u}' \ldots$
It is easy to verify that for each $u \in U$, $V_u$ and $H_u$ are DREs of polynomial size in $|I|$. Intuitively, $V_u$ ($H_u$) describes all suffixes immediately to the right of $\hat{u}$ ($\hat{u}\,?_v$) in tilings where Juliet has correctly flagged a vertical (horizontal) error starting with $u$.
The target language T can now be described by the DRE which is also of polynomial size in |I|. It is easy to see that if a valid tiling exists, Romeo can simply win the game by providing it in the first move. Therefore, in this case, Juliet does not have a winning strategy. On the other hand, if no tiling exists, Romeo can only give a tiling candidate with at least one (vertical, horizontal or final tile) error in his first move and Juliet can win by marking one such error.
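The reduction depends on all languages being given by deterministic regular expressions, as defined at the start of this subsection. One standard way to test that condition, sketched here in Python, is to compute the first and follow sets of the marked expression $D(r)$ (the Glushkov construction) and check that no such set contains two distinct positions carrying the same symbol. This is a swapped-in textbook technique, not a procedure taken from this paper; the AST classes `Sym`, `Cat`, `Alt`, `Star` and the function `is_deterministic` are hypothetical names.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Sym:
    label: str


@dataclass(frozen=True)
class Cat:
    left: object
    right: object


@dataclass(frozen=True)
class Alt:
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    inner: object


def is_deterministic(r) -> bool:
    """Test determinism of r via first/follow sets of its marked version D(r)."""
    numbering = count(1)
    follow = {}  # marked position (symbol, i) -> set of positions that may come next

    def walk(e):
        """Return (nullable, first positions, last positions) of subexpression e."""
        if isinstance(e, Sym):
            p = (e.label, next(numbering))
            follow[p] = set()
            return False, {p}, {p}
        if isinstance(e, Alt):
            n1, f1, l1 = walk(e.left)
            n2, f2, l2 = walk(e.right)
            return n1 or n2, f1 | f2, l1 | l2
        if isinstance(e, Cat):
            n1, f1, l1 = walk(e.left)
            n2, f2, l2 = walk(e.right)
            for p in l1:
                follow[p] |= f2
            return n1 and n2, (f1 | f2 if n1 else f1), (l1 | l2 if n2 else l2)
        if isinstance(e, Star):
            _, f, l = walk(e.inner)
            for p in l:
                follow[p] |= f
            return True, f, l
        raise TypeError(f"unknown node {e!r}")

    def clash(positions):
        """True if two distinct positions in the set carry the same symbol."""
        labels = [label for label, _ in positions]
        return len(labels) != len(set(labels))

    _, first, _ = walk(r)
    return not clash(first) and not any(clash(s) for s in follow.values())


# (a+b)*a from the definition above, and the equivalent expression b*a(b*a)*.
not_det = Cat(Star(Alt(Sym("a"), Sym("b"))), Sym("a"))
det = Cat(Cat(Star(Sym("b")), Sym("a")), Star(Cat(Star(Sym("b")), Sym("a"))))
print(is_deterministic(not_det), is_deterministic(det))
```
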
The following two results can be shown by careful adaptation of the corresponding lower bound proofs in [14].
Lemma A.13. For the class of games on flat strings with target and replacement languages specified by deterministic regular expressions, JWin is PTIME-hard (under logspace reductions) without replay.
Lemma A.14. For the class of games on flat strings with target and replacement languages specified by deterministic regular expressions, JWin is EXPTIME-hard with unlimited replay.

Proof. All lower bounds follow by the same reduction from corresponding lower bounds for flat cfGs, which were just given as Lemma A.14, Lemma A.12 and Lemma A.13.
The idea for the reduction from flat cfGs to simple (nested) cfGs is as follows: All input and replacement strings $w = w_1 \ldots w_n \in \Sigma^*$ are replaced by $\hat{w} = \langle w_1\rangle\langle/w_1\rangle \ldots \langle w_n\rangle\langle/w_n\rangle \in WF(\Sigma)$; to this end, a target DFA $A(T)$ is simulated by an SNWA in normal form with an extra state $q_n$ such that $\delta(q, \langle a\rangle) = q_n$ for each $q$ and $a$, $F_{loc}(a) = q_n$ for each $a$, and $t(q, a)$ is the transition function of $A(T)$. Replacement NFAs are similarly transformed into NWAs.
Using the reduction from the proof of Proposition 18, Theorem 12 also yields the following result, which we will need in later proofs:

The proof is quite similar to the proof of the upper bound in Proposition 17 (b). It combines an alternating logspace computation that simulates all plays of the game on the input string with universal branching to divide, at each opening tag $\langle a\rangle$, the processing of the remaining word into the processing of the subword until the corresponding closing tag $\langle/a\rangle$ and the processing of the remaining word after that $\langle/a\rangle$.
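The encoding of flat strings as nested strings used in this reduction can be written down directly. The following Python sketch shows the mapping and its inverse; the tag representation (`"<a>"`, `"</a>"`) and the function names are assumptions made for illustration.

```python
def flat_to_nested(w):
    """Map a flat string w_1 ... w_n to <w_1></w_1> ... <w_n></w_n>."""
    out = []
    for a in w:
        out += [f"<{a}>", f"</{a}>"]
    return out


def nested_to_flat(tags):
    """Inverse mapping; rejects nested words that are not of the flat form above."""
    if len(tags) % 2 != 0:
        raise ValueError("not an encoding of a flat string")
    out = []
    for opening, closing in zip(tags[0::2], tags[1::2]):
        label = opening[1:-1]
        if opening != f"<{label}>" or closing != f"</{label}>":
            raise ValueError("not an encoding of a flat string")
        out.append(label)
    return out


print(flat_to_nested(["a", "b", "a"]))
```
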
We first describe how the game on an input string $w$ can be simulated by an alternating logspace computation. This part of the proof is very similar to the proof of the upper bound of Theorem 5.8 in [14]. Let $k$ be the bound on the replay depth. We consider the equivalent version of cfGs in which Juliet decides already when she reads an opening tag $\langle a\rangle$ whether she wants Romeo to rewrite a subword $u = \langle a\rangle \cdots \langle/a\rangle$.
The idea is that the choices of Juliet and Romeo are simulated by existential and universal branching of the algorithm in the obvious fashion. However, if Juliet calls an opening tag $\langle a\rangle$ at some position $i$ and Romeo replaces the corresponding subword $u = \langle a\rangle \cdots \langle/a\rangle$ of $w$ by a word $v$ from $R_a$, then the algorithm does not actually replace $u$ but rather stores the information that $u$ has been replaced by a pointer to $i$ and another pointer to $v$ (which is stored in the representation of $G$). As the replay depth is bounded by $k$, at each time at most $k$ such pairs of pointers are active, consuming at most $O(\log(|G|))$ many bits. The test whether the resulting word (of each branch) is accepted by the target automaton $T$ is integrated into this branching process as follows. Each process maintains a current linear state $p$ reflecting the state of $T$ in the unique computation on the prefix of the current string, that is, if the current game configuration is $(J, w_1, w_2)$, the current state is the one obtained by $T$ after reading $w_1$. Whenever Juliet reads an opening tag $\langle a\rangle$, the computation universally branches into two subcomputations. The first subcomputation checks whether Juliet has a winning strategy in the game on the subword between $\langle a\rangle$ and its corresponding closing tag. The other subcomputation continues after that closing tag in the state determined by the target state function. Juliet can only win if both subcomputations accept. Each subcomputation may recursively branch in the same way. When a subcomputation reaches a closing tag $\langle/a\rangle$, it accepts if the current linear state is in $F_{loc}(a)$ and rejects otherwise. It is not hard to see that this algorithm has an accepting run on a word $w$ if and only if Juliet has a winning strategy on $w$. As the algorithm only uses logarithmic space, it witnesses the desired PTIME upper bound.
In this section, we give proofs for our results concerning parameter validation and games with insertion stated in Section 5.

Validation of parameters
In this subsection, we consider cfGs with parameter validation, i.e. games of the form $G = (\Sigma, \Gamma, R, V, T)$ which have an additional validity relation $V \subseteq \Gamma \times WF(\Sigma)$. We will generally assume each validation language $V_a \overset{def}{=} \{w \in WF(\Sigma) \mid (a, w) \in V\}$ (for $a \in \Gamma$) to be a nonempty nested word language conforming to some specification (e.g. NWA, DTD or XML Schema). The semantics of such games is similar to the general semantics for cfGs, except for the fact that, in a configuration $(J, u\langle a\rangle v, \langle/a\rangle w)$, Juliet is only able to play Call on $\langle/a\rangle$ if it holds that $\langle a\rangle v \langle/a\rangle \in V_a$. Note that, while it is not strictly necessary to pass the outermost $\langle a\rangle$ to $V_a$ along with $v$, we still do so in order to describe $V_a$ more easily as a language of trees with root node labelled $a$.
As mentioned in Section 5, we restrict our attention to games without replay, as we are seeking to identify tractable cases, and JWin is already PSPACE-hard for bounded-replay games with target languages specified by DTDs without parameter validation.

Upper bounds
First off, we prove tractability for a restricted class of validation cfGs. As notation used in the proof, we say that a function symbol g is "from V f ", if V g " V f . Theorem 20 (restated).
For the class of games with validation with a bounded number of validation DTDs and target languages specified by DTDs, JWin is in PTIME without replay.
Proof. (sketch) The basic proof idea for this result follows a similar approach to that used in [12]: Going through the input string (interpreted as a tree) in a bottom-up fashion, we check for each node's child string whether it (and the subtree below it) can be rewritten to fit the target and verification languages in a replay-free manner. This allows us to tell whether Juliet is able to safely play Read or Call on the node whose child string we just examined, and possibly on ancestor nodes as well. In this manner, deciding JWin($G$) basically boils down to performing a polynomial number of safe rewritability tests for replay-free games on flat strings, which are each feasible in polynomial time by Corollary A.15.
For the sake of simple presentation, we identify trees and their nested word linearisations throughout this proof.
As described above, our goal is to subsequently remove subtrees in a bottom-up manner and only consider flat strings of leaf node labels. More precisely, each removal step replaces a subtree of depth one, that is, a node $v$ whose children are all leaves, by a single node with a label that contains all relevant information about its (former) subtree with respect to the game. If, for instance, the subtree below a node $v$ with function symbol $f$ cannot be rewritten to conform to the corresponding part of some DTD $V_f$, this information will be encoded into the label of $v$ and Juliet will never be able to play Call on $v$ or any of its ancestors with a function symbol from $V_f$, no matter her rewriting capabilities on other parts of the input tree.
Let $t$ be the tree representing some well-nested rooted$^{22}$ word $w$. By $label(v)$ we denote the label of a node $v$. By $S$ we denote the set $\{T, V_1, \ldots, V_d\}$ of schemas of the game. The profile $P(t') \subseteq S$ of a tree $t'$ is the set of schemas for which $t'$ is valid. We first consider subgames on subtrees $t_v$ rooted at some node $v$ with label $a$. With each replay-free strategy $\sigma$ on $t_v$ that does not play Call on $v$ itself, we associate the profile set $\mathcal{P}_\sigma(t_v)$ of profiles $P$ for which Romeo has a counterstrategy yielding a tree $t'$ with $P = P(t')$. The dossier $D(v)$ of $v$ is the set of all sets $X$ for which there is a strategy $\sigma$ of Juliet such that $\mathcal{P}_\sigma(t_v) \subseteq X$. In other words, $D(v)$ is the closure of the set of all sets $\mathcal{P}_\sigma(t_v)$ under taking supersets.$^{23}$

In the bottom-up computation mentioned above, we plan to replace the subtree below each node $v$ with label $a$ and change $v$'s label to $(a, D(v))$. Once this process reaches the root $root(t)$ of the tree, it can be instantly decided whether Juliet has a winning strategy on $w$. Indeed, this is the case if and only if $D(root(t))$ contains a profile set $\mathcal{P}$ such that every profile $P \in \mathcal{P}$ contains the target schema $T$.
To illustrate the above definitions, we consider the special case $d = 1$, that is, besides the target schema $T$ there is only one validation schema $V$. In this case, there are four possible profiles of trees: $\{V, T\}$, $\{V\}$, $\{T\}$, $\emptyset$. As an example, a tree has profile $\{V\}$ if it is valid with respect to $V$ but not with respect to $T$.
The four different profiles yield $2^4 = 16$ possible profile sets and $2^{16} = 65536$ candidate dossiers. However, only the following six cases need to be distinguished:
• $\{\{V, T\}\} \in D$: Juliet has a strategy that guarantees to yield a tree $t'$ that is valid with respect to both schemas;
• $\{\{T\}\} \in D$ and $\{\{V\}\} \in D$: Juliet has a strategy that guarantees a tree $t_1$ in $T$ and a strategy that guarantees a tree $t_2$ in $V$, but neither $t_1$ nor $t_2$ is valid with respect to the other schema;
• $\{\{T\}\} \in D$, but $\{\{V\}\} \notin D$: Juliet has a strategy that guarantees a tree $t_1$ in $T$, but no strategy that guarantees a tree $t_2$ in $V$;
• $\{\{V\}\} \in D$, but $\{\{T\}\} \notin D$: Juliet has a strategy that guarantees a tree $t_2$ in $V$, but no strategy that guarantees a tree $t_1$ in $T$;
• $\{\{V\}, \{T\}\} \in D$: Juliet has a strategy that guarantees to yield a tree that is either in $T$ or in $V$, but she cannot enforce either of the two;
• $D = \{\{\emptyset\}\}$: no matter how Juliet plays, Romeo can always enforce a tree that is invalid for both $T$ and $V$.
In all lower cases, we assume that none of the upper cases applies.

We now start with the detailed description of the algorithm. We assume$^{24}$ that all content models of DTDs are given by DFAs.
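The counts in the case $d = 1$ can be reproduced with a few lines of Python. The sketch below enumerates profiles and profile sets as nested `frozenset`s and computes the closure under supersets used in the definition of a dossier. The helper names `powerset` and `upward_closure` are assumptions made for illustration.

```python
from itertools import chain, combinations


def powerset(items):
    """All subsets of items, each as a frozenset."""
    items = list(items)
    subsets = chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))
    return [frozenset(c) for c in subsets]


schemas = ["T", "V"]                 # target schema and the single validation schema
profiles = powerset(schemas)         # {V,T}, {V}, {T}, and the empty profile
profile_sets = powerset(profiles)    # sets of profiles
num_candidate_dossiers = 2 ** len(profile_sets)  # sets of profile sets


def upward_closure(generators, universe):
    """All profile sets in universe that are supersets of some generator."""
    return {X for X in universe if any(g <= X for g in generators)}


# Dossier generated by a single strategy that always yields profile {V, T}.
both = frozenset({frozenset({"T", "V"})})
dossier = upward_closure({both}, profile_sets)
print(len(profiles), len(profile_sets), num_candidate_dossiers, len(dossier))
```

Representing a dossier by its minimal elements, as in the six cases above, avoids ever materialising the closure.
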
As stated above, the algorithm works in a bottom-up fashion. First, for all leaf nodes, their dossier is computed. As there is no actual subgame on a leaf node v (that does not play Call on that node), each such dossier is just ttP pt v quu. In this case, P pt v q is just the set of schemas in which the (original) label of v is allowed at a leaf node.
The key step that the algorithm performs is to compute the dossier of a node v with children u 1 , . . . , u m all of whose dossiers are already given. The idea is to compute Dpvq with the help of replay-free games on flat strings, whose winning problem can be decided thanks to Corollary A.15.
For these flat games, the algorithm needs to compute, in a preprocessing phase that only depends on G, flat replacement sets R 1 f , for every function symbol f P Γ. As replacement strings represent strings in which no further Call moves are possible, the labels of their positions do not include dossiers but rather the profile of the actual tree that they represent.
Each set $R'_f$ can be computed as follows. Let $L_f$ denote the content model of $f$ in $V_f$ (represented by some DFA $A_f$). For each symbol $a$ occurring in $L_f$, let $\Sigma_{f,a}$ be the set of all pairs $(a, P)$ such that there is a tree $t'$ with profile $P$ and root label $a$ that is valid with respect to $R_f$. For each $f$, $a$ and $P$, it can be decided in polynomial time whether $(a, P) \in \Sigma_{f,a}$ by constructing a deterministic tree automaton that accepts all trees that are valid with respect to $R_f$ and the schemas in $P$, and invalid with respect to the schemas in $S \setminus P$. As $d$ is fixed, this amounts to an emptiness test for the polynomial-size product of $d+1$ deterministic tree automata. It follows that all sets $\Sigma_{f,a}$ can be computed in time polynomial in the size of $G$.$^{25}$ The set $R'_f$ consists of all strings over $\bigcup_{a \in \Sigma} \Sigma_{f,a}$ whose $\Sigma$-projection is in $L_f$. Given the sets $\Sigma_{f,a}$, a DFA for $R'_f$ can be easily (and efficiently) computed.

Now, with the schemas $R'_f$ at hand, we describe the computation of $D(v)$ from $u_1, \ldots, u_m$ and their dossiers in more detail.
For a dossier $D = \{\mathcal{P}_1, \ldots, \mathcal{P}_\ell\}$ and a symbol $a$, let $s(a, D)$ denote the string$^{26}$ $(a, D)(a, \mathcal{P}_1) \cdots (a, \mathcal{P}_\ell) \cdot \#_a$.
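The string $s(a, D)$ can be built mechanically from a dossier. The following Python sketch does so under stated assumptions: symbols are represented as tagged tuples, profile sets as `frozenset`s of `frozenset`s of schema names, and the order of the option symbols is fixed by sorting (the text does not prescribe an order). The tags `"dossier"`, `"option"`, `"#"` and the function name `s` are illustrative choices.

```python
def s(a, dossier):
    """Build the string (a, D)(a, P_1)...(a, P_l) #_a for label a and dossier D."""
    # Fix a deterministic order on the profile sets of the dossier (an assumption).
    options = sorted(dossier, key=lambda P: sorted(sorted(p) for p in P))
    head = [("dossier", a, frozenset(dossier))]   # the symbol (a, D)
    body = [("option", a, P) for P in options]    # one symbol (a, P_i) per profile set
    tail = [("#", a)]                             # the follow-up symbol #_a
    return head + body + tail


# Hypothetical dossier with two profile sets over the schemas T and V.
P1 = frozenset({frozenset({"T"})})
P2 = frozenset({frozenset({"T"}), frozenset({"V"})})
word = s("g", {P1, P2})
print(len(word), word[0][0], word[-1])
```
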
The idea behind the construction of the flat game is as follows.
The original game on a tree $t_z$ with root label $g$ (where $z$ is a child of the current root node $v$) can be viewed as follows: Juliet chooses a strategy for the first phase of the game before the closing tag $\langle/g\rangle$ of $z$ is reached. This strategy corresponds to some profile set $\mathcal{P}_i \in D(z)$. By choosing a counterstrategy for this subgame, Romeo basically picks a profile $P \in \mathcal{P}_i$. Then Juliet decides whether she plays Call at $\langle/g\rangle$ (subject to validity with respect to $V_g$) and Romeo replaces $z$, in case she plays Call.
In the flat game on $(g, D)(g, \mathcal{P}_1) \cdots (g, \mathcal{P}_\ell)\#_g$ this is mimicked as follows: Juliet chooses her strategy by playing Call at $(g, \mathcal{P}_i)$. Romeo replaces $(g, \mathcal{P}_i)$ by some pair $(g, P)$ with $P \in \mathcal{P}_i$. So far the game exactly mimics the original game before reaching $\langle/g\rangle$. If $P$ allows Juliet to play Call at $\langle/g\rangle$ (that is, if $V_g \in P$), she can call the follow-up symbol $\#_g$, which is then replaced by Romeo with a string from $R'_g$. The case that Juliet cheats by playing Call although $V_g \notin P$ can be easily detected by the target automaton (whose construction will be explained soon).
For each of the $2^{2^{d+1}}$ possible profile sets $Q$, the algorithm determines the winner for a particular replay-free game on the string $s(label(u_1), D(u_1)) \cdots s(label(u_m), D(u_m))$ with replacement sets
• $R'_a$, for every symbol $\#_a$, and
• $\{(a, P_1), \ldots, (a, P_j)\}$, for each symbol $(a, \mathcal{P})$ with $\mathcal{P} = \{(a, P_1), \ldots, (a, P_j)\}$.
It only remains to specify the target language of the game, which, of course, depends on Q. The DFA A Q for the target language for profile set Q has to determine whether Juliet has a winning strategy in the (original) subgame on t v that yields a tree with a profile in Q.
To this end, $A_Q$ ignores all symbols that do not represent actual subtrees in the original game, that is,
• all symbols $(g, D)$, as they only indicate the beginning of a substring for some node;
• all symbols $(a, \mathcal{P})$ with profile sets $\mathcal{P}$, as they correspond to strategy options for Juliet that she did not choose; and
• all symbols $\#_g$, as they represent cases in which Juliet played Read and the respective subtree is represented by the symbol $(g, P)$, chosen by Romeo.
We call all other symbols relevant.
Thus, $A_Q$ accepts all strings $y$ resulting from the game for which the subsequence $y'$ of relevant symbols is consistent with some profile $P \in Q$. That is, if$^{27}$
• for all symbols $(g, P')$ of $y'$ it holds that $P \subseteq P'$, and
• for each schema $D \in P$, the $\Sigma$-projection of $y'$ is in (the language of) $D$.
As d is fixed, A Q is of polynomial size.
This completes the construction of the flat game and thus of the algorithm. Each of the bottom-up reduction steps amounts to a (large but) constant number of tests whether Juliet has a winning strategy in a flat game without replay and therefore can be done in overall polynomial time.
It is not too difficult but tedious to verify that the algorithm is also correct.

Lower bounds
In this subsection, we prove lower bounds for less restricted classes of validation cfGs. We prove the lower bounds of Theorem 21 as single results in the order in which they were stated in Section 5: from most expressive to least expressive target, replacement and validation languages.
Theorem A.16. For the class of validation games with target, validation and replacement languages specified by DNWAs, JWin is EXPTIME-hard without replay. This lower bound already holds for games with one single function symbol.
Proof. We show EXPTIME-hardness by reduction from the intersection emptiness problem for deterministic nested word automata: Given $n$ DNWAs $A_1, \ldots, A_n$, does it hold that $L(A_1) \cap \ldots \cap L(A_n) = \emptyset$? That this problem is EXPTIME-hard follows directly from the EXPTIME-hardness of the intersection emptiness problem for deterministic top-down finite tree automata [16]. Given DNWAs $A_1, \ldots, A_n$ over an alphabet $\Sigma$, we construct a game $G$ and input string $w$ such that Juliet has a winning strategy on $w$ in $G$ if and only if there is no string $v \in WF(\Sigma)$ accepted by all $n$ automata. The game $G$ uses the alphabet $\Sigma \cup \{s, t\}$, with $s, t \notin \Sigma$, and $t$ being the only function symbol of $G$.
The input string is $w = \langle t\rangle^{n+1}\langle s\rangle\langle/s\rangle\langle/t\rangle^{n+1}$, i.e. the tree linearised by $w$ is simply a path of length $n+2$ whose $n+1$ non-leaf nodes are labelled $t$ and whose leaf is labelled $s$. According to $G$, play on $w$ should proceed as follows: First, Juliet plays Call on the first $\langle/t\rangle$ in $w$ (i.e. the innermost $t$). We emphasize that $t$ is the only function symbol and is therefore used for two different purposes in this proof.
Romeo replies to this call by providing some string $v \in WF(\Sigma)$; if possible, Romeo will want to choose as $v$ a string contained in the intersection of all $L(A_i)$ for $i \in [n]$. Juliet, in turn, will try to show that there is some $i \in [n]$ such that $v \notin L(A_i)$; she does so by playing Call on the $i$-th remaining $\langle/t\rangle$ in the rewritten string $\langle t\rangle^n v \langle/t\rangle^n$. The validation language for $t$ will ensure that this Call is only possible if $v$ is indeed not in $L(A_i)$. Romeo can reply to such a Call by Juliet with an arbitrary string in $WF(\Sigma)$. However, the actual choice of this string is inconsequential, as all that is needed for Juliet to win is that there are fewer than $n$ occurrences of $\langle/t\rangle$ in the resulting string.
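For concreteness, the input string of this reduction can be generated as follows. The Python sketch uses a list of tag strings to represent nested words; this representation and the names `input_word` and `max_depth` are assumptions made for illustration.

```python
def input_word(n):
    """Build <t>^(n+1) <s></s> </t>^(n+1) as a list of tags."""
    return ["<t>"] * (n + 1) + ["<s>", "</s>"] + ["</t>"] * (n + 1)


def max_depth(tags):
    """Maximum nesting depth of a well-nested word; raises if it is not well-nested."""
    depth = deepest = 0
    for tag in tags:
        depth += -1 if tag.startswith("</") else 1
        if depth < 0:
            raise ValueError("closing tag without matching opening tag")
        deepest = max(deepest, depth)
    if depth != 0:
        raise ValueError("unclosed opening tag")
    return deepest


w = input_word(2)
print(w, max_depth(w))
```

The maximum depth equals the number of nodes on the path, $n + 2$, and the first closing `</t>` in the list belongs to the innermost $t$ node.
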
More formally, the game $G$ over alphabet $\Sigma \cup \{s, t\}$ with $\Gamma = \{t\}$ is defined with the replacement language $R_t = WF(\Sigma)$ and validation language $V_t = \{\langle t\rangle\langle s\rangle\langle/s\rangle\langle/t\rangle\} \cup \{\langle t\rangle^i v \langle/t\rangle^i \mid v \in WF(\Sigma) \setminus L(A_i), i \in [n]\}$. The target language is ... $G$ can be efficiently computed from $A_1, \ldots, A_n$, as DNWAs can be complemented in polynomial time, and given DNWAs for $WF(\Sigma) \setminus L(A_i)$ for each $i \in [n]$, DNWAs for $V_t$ and $T$ can easily be constructed.
Clearly, any strategy $\sigma$ for Juliet on $w$ in $G$ that does not Call the first $\langle/t\rangle$ cannot be a winning strategy, as the target language does not contain any $s$ tags and the only part of the validation language containing $s$ tags only ever applies to the innermost $t$. From there, it is straightforward to prove that Juliet has a winning strategy on $w$ if and only if Romeo cannot respond to this first Call with a string that is contained in the intersection of all $L(A_i)$ for $i \in [n]$, i.e. iff $L(A_1) \cap \ldots \cap L(A_n) = \emptyset$.
Theorem A.17. For the class of validation games with target, validation and replacement languages specified by XML Schemas, JWin is PSPACE-hard. This lower bound already holds for games with one single function symbol, whose replacement and target language are given by DTDs and whose replacement language is finite.
Proof. (sketch) We prove this claim by giving a reduction from the problem QBF of determining, for a given quantified Boolean formula $\Phi$, whether $\Phi$ is true. We assume that the input formula is of the form $\Phi = Q_1 x_1 \ldots Q_n x_n\, \varphi(x_1, \ldots, x_n)$ with $Q_i \in \{\exists, \forall\}$ for all $i \in [n]$ and a Boolean formula $\varphi = C_1 \vee \ldots \vee C_m$ with $m$ clauses in disjunctive normal form. Without loss of generality, we further assume that no clause contains both $x_i$ and $\neg x_i$ for any $i \in [n]$.
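The truth of such a formula can be defined by a simple recursion that mirrors the game: existential variables are chosen by one player, universal variables by the other, and the matrix in disjunctive normal form is evaluated at the end. The Python sketch below is a brute-force reference evaluator for small instances, not part of the reduction. The input encoding (a string of `"E"`/`"A"` quantifiers, clauses as dictionaries from 0-based variable indices to required truth values) and the name `qbf_true` are assumptions.

```python
def qbf_true(quantifiers, clauses, assignment=()):
    """Evaluate Q_1 x_1 ... Q_n x_n (C_1 or ... or C_m) with conjunctive clauses C_j."""
    i = len(assignment)
    if i == len(quantifiers):
        # DNF matrix: true iff some clause has all of its literals satisfied.
        return any(all(assignment[v] == val for v, val in clause.items())
                   for clause in clauses)
    branches = (qbf_true(quantifiers, clauses, assignment + (b,)) for b in (True, False))
    # "E": some choice must work; "A": every choice must work.
    return any(branches) if quantifiers[i] == "E" else all(branches)


# exists x0 forall x1: (x0 and x1) or (x0 and not x1)
phi1 = [{0: True, 1: True}, {0: True, 1: False}]
# forall x0 exists x1: (x0 and x1)
phi2 = [{0: True, 1: True}]
print(qbf_true("EA", phi1), qbf_true("AE", phi2))
```
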
We construct from Φ a validation game G with a single function symbol f , and an input string w such that Juliet has a winning strategy on w in G if and only if Φ is true. We first sketch the construction and the manner in which play proceeds according to G before giving a formal construction. For the sake of simpler presentation, we identify trees and their nested word linearisations.
The input string consists of a path of $m$ clause nodes each labelled $f$. Below the final clause node, $w$ consists of a "spine" of backbone nodes with labels $b_1$ to $b_{n+1}$, where each $b_i$ has as its left child a variable node labelled $f$, and as its right child a node labelled $b_{i+1}$. The leaf node $b_{n+1}$ terminates this chain.
As should be obvious from this description, nodes labelled $f$ in this string serve different purposes, depending on their placement in $w$. This is reflected by the single-type tree grammar for the validation language $V_f$ having several different types for nodes which may be labelled $f$. In principle, the clause nodes may be assigned types from $\{C_1, \ldots, C_m\}$, while the variable node child of each node labelled $b_i$ for some $i$ will (usually) be typed as $x_i$. Additionally, the tree grammar for $V_f$ may assign to each node labelled $b_i$ a type from $\{b_i^1, \ldots, b_i^m\}$. The exact purpose of these types will become clear in the rest of the proof.
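The shape of this input tree can be made explicit by generating its nested word linearisation. The following Python sketch follows the description above; the tag representation and the function name `reduction_input` are assumptions made for illustration.

```python
def reduction_input(m, n):
    """Linearise the tree: m clause nodes f, then the spine b_1 .. b_(n+1) with variable nodes f."""
    clause_open = ["<f>"] * m
    clause_close = ["</f>"] * m
    spine_open, spine_close = [], []
    for i in range(1, n + 1):
        # b_i has a variable node labelled f as left child; its right child b_(i+1) follows.
        spine_open += [f"<b{i}>", "<f>", "</f>"]
        spine_close.insert(0, f"</b{i}>")
    leaf = [f"<b{n + 1}>", f"</b{n + 1}>"]  # the leaf that terminates the chain
    return clause_open + spine_open + leaf + spine_close + clause_close


w = reduction_input(2, 1)
print(w)
```

The resulting word has $2m + 4n + 2$ tags, so the input string is of linear size in the formula.
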
Play on the input string $w$ proceeds as follows: In a left-to-right order, the variable nodes of type $x_1$ to $x_n$ are the first to be played on (in the same order as the variables $x_1, \ldots, x_n$ are quantified in $\Phi$). A play rewriting these nodes establishes an assignment $\alpha$ of truth values to the variables $x_1, \ldots, x_n$, with Juliet choosing assignments for existentially quantified variables and Romeo choosing for universally quantified variables.
Afterwards, Juliet is supposed to select one clause $C_i$ that evaluates to "true" under $\alpha$ by playing Call on the $i$-th clause node from the bottom, with Romeo replacing it and thus truncating the input tree to end in a leaf after a path of $f$ nodes. The validation language will ensure that Juliet is only allowed to play Call on the $i$-th clause node from the bottom if $\alpha$ indeed satisfies $C_i$. If Juliet manages to play Call on any clause node, she wins the game, otherwise she loses.
We sketch in some more detail how Juliet and Romeo construct a variable assignment before giving formal details on the construction. For universally quantified variables $x_i$, the matter is simple: No validation (or target) language will be able to match type $x_i$ in this position, so Juliet is forced to play Call on it, giving Romeo the opportunity to replace it with 0 or 1 (which is then interpreted as setting $x_i$ to be true resp. false under $\alpha$). For existentially quantified variables $x_i$, the binary choice of setting $x_i$ to true or false is modelled by Juliet's choice whether or not to Call the symbol $f$ of type $x_i$: The replacement language for type $x_i$ also has to be $\{0, 1\}$ (as there is only the single function symbol $f$ used as a label for all function types), so in this case an uncalled $x_i$ is interpreted as setting $x_i$ to be true under $\alpha$, while both 0 and 1 will be interpreted as setting $x_i$ to be false. Note that the replacement language $R_f = \{\langle 0\rangle\langle/0\rangle, \langle 1\rangle\langle/1\rangle\}$ thus constructed is finite and definable by a DTD.
The target language consists of all strings linearising paths containing only non-leaf nodes labelled f and ending in a leaf node labelled 0 or 1. Again, it is easy to see that this language can be represented by a DTD.
Variable nodes should always allow Juliet to Call them on the input string described above, so $\varepsilon \in V_f$. Furthermore, the part of the validation language used at each clause node of distance $i$ from the bottom $b_1$ node should accept exactly those subtrees encoding satisfying assignments for $C_i$.
Altogether, we can give a tree grammar $T_f$ to define the schema for $V_f$. This grammar uses the label alphabet Σ (as above), type alphabet $\Delta = \{x_i, b_i^j, C_j \mid i \in [n], j \in [m]\} \cup \{b_{n+1}, 0, 1, x\}$ with a labelling function λ mapping all types that are also symbols in Σ to themselves, all $b_i^j$ (for $j \in [m]$) to $b_i$ and all other types to $f$, and the following productions (with $C_1$ being the start symbol):
• for all $i \in [n]$, $j \in [m]$:
It is easy to see that this grammar is indeed single-type, and that all of its content models are specified by deterministic regular expressions.
We can now explain in detail the exact purpose of the types defined above: When Juliet and Romeo construct a variable assignment, variable nodes can be matched to type $C_1$ in the above grammar and (as they are leaves) accepted with child string ε. After the variable assignment has been constructed, if Juliet calls the $j$-th clause node from the bottom, that node is matched to type $C_1$, with subsequent child clause nodes being matched to clause types with increasing clause numbers. The bottom clause node is matched to $C_j$. Since that node's child is labelled $b_1$, it has to be matched to $b_1^j$, and this upper index $j$ is "carried down" through the backbone nodes, making certain that each backbone node "knows" which clause is to be checked. The sub-grammars for each type $b_i^j$ then take care of checking whether the truth assignment constructed by Juliet and Romeo indeed fulfils clause $C_j$. The correctness of this construction is proven as follows: Each play on the variable nodes induces an assignment to the variables $x_1, \ldots, x_n$ compliant with their quantification in Φ (and vice versa), and the subtree starting at the $i$-th clause node from the bottom is in $V_f$ if and only if it has been rewritten to correspond to a variable assignment satisfying $C_i$. This directly implies that Juliet has a winning strategy on $w$ in $G$ if and only if Φ is true, which concludes the reduction.
As seen in the proof of Theorem 20, the running time of the algorithm we give for deciding JWin with target, replacement and validation DTDs grows superpolynomially in the parameter $d$, i.e. the number of function symbols. The following result shows that it is unlikely that one can avoid such behaviour.
Theorem A.18. For the class of games with validation, JWin (without replay) is PSPACE-hard, for games with an unbounded number of validation DTDs and replacement and target languages specified by DTDs.
Proof. This follows from the proof of Theorem A.17, with slight modifications. As in that proof, we show PSPACE-hardness by reduction from QBF, with the quantifier-free part of the input formula in disjunctive normal form.
First off, note that each single-type tree grammar may be seen as a DTD over its type alphabet. More precisely, if $T = (\Sigma, \Delta, S, P, \lambda)$ is a single-type tree grammar, then the tree grammar $T' = (\Delta, \Delta, S, P, \mathrm{id}_\Delta)$ (where $\mathrm{id}_\Delta$ is the identity function on Δ mapping each type to itself) is local. We use this fact to construct from the validation language $V_f$ given in the proof of Theorem A.17 several validation DTDs, with the number of function symbols (and corresponding validation languages) constructed in the reduction from QBF growing with the number of clauses and variables of the input formula.
The input string $w$ is similar to the one from the proof of Theorem A.17, consisting of a path of $m$ clause nodes and, below them, a subtree made up of variable and backbone nodes. Unlike in that proof, however, the clause nodes are already labelled $C_1$ to $C_m$ (with $C_1$ labelling the topmost node, directly below the root). The variable nodes are likewise labelled $x_1$ to $x_n$ right from the start.
The target language is almost the same as in the proof of Theorem A.17 (accounting, however, for clause node labels), and the replacement language is identical to the one given there. The validation language for each $x_i$ simply consists of a singleton node labelled $x_i$.
Validation languages for each $C_j$ are obtained from the tree grammar $T_f$ for $V_f$ given in the proof of Theorem A.17 as the sub-grammars of $T_f$ starting at $C_j$, with the only difference being that each $b_i^j$ is simply replaced by $b_i$. This is because the purpose of the upper index $j$ in that proof was carrying the clause number selected by Juliet down through the variable assignment subtree. This technique is no longer necessary here, due to the fact that the validation language for each $C_j$ is separate, which means that the clause to be checked is inherent to its corresponding validation DTD and thus already "known" to all of its variables.
The correctness of this construction is shown as in the proof of Theorem A.17.

Insertion rules
We consider here cfGs with insertion instead of replacement rules, i.e. games of the form $G = (\Sigma, \Gamma, I, T)$ where the insertion relation $I \subseteq \Gamma \times \mathrm{WF}(\Sigma)$ takes the place of the replacement relation $R$ from cfGs as defined in Section 2. The semantics is similar to standard cfGs, except for the definition of follow-up configurations after a Call move by Juliet. We recall that we consider three different semantics here: the general setting, where Juliet may play another subgame on the substring she just called; the weak replay setting, where Juliet only gets to play on the newly inserted substring after a Call; and the setting without replay, where the play proceeds to the right of a newly inserted substring without modifying it.
Our restriction to games having only insertion rules is primarily to simplify the presentation of our proofs. It is relatively easy (if tedious) to prove that games can be extended to contain both replacement and insertion rules without changing the complexity of JWin, as long as appropriate semantics for insertion and replacement rules are chosen.
We generally assume $G = (\Sigma, \Gamma, I, T)$ to be an insertion game with target language $T$ represented by a DNWA $A(T)$ and insertion languages $I_a$ represented by an arbitrary NWA for each $a \in \Gamma$.
We restate Proposition 22 for easier reference.
Proposition 22 (restated). For the class of games with insertion semantics, target DNWAs and replacement NWAs, JWin is (a) undecidable in general; (b) 2-EXPTIME-complete for games with weak replay; and (c) PSPACE-complete for games without replay.
Before proving Proposition 22, we prove two auxiliary results showing a strong correspondence between replacement games and insertion games (with appropriate semantics). Recall that for a replacement game $G$, $\mathrm{JWin}(G)$ denotes the set of all winning strings for Juliet in $G$ with unbounded replay and $\mathrm{JWin}^{1}(G)$ the corresponding set without replay; similarly, we denote the winning set for Juliet in an insertion game $G'$ by $\mathrm{JWin}(G')$ in the general setting, by $\mathrm{JWin}^{1+}(G')$ with weak replay, and by $\mathrm{JWin}^{1}(G')$ without replay.
Lemma A.19. There exists a polynomial-time algorithm that, given a replacement cfG $G = (\Sigma, \Gamma, R, T)$ and nested word $w \in \mathrm{WF}(\Sigma)$, outputs an insertion cfG $G' = (\Sigma', \Gamma, I, T')$ and word $w' \in \mathrm{WF}(\Sigma')$ such that
• $w \in \mathrm{JWin}(G) \Leftrightarrow w' \in \mathrm{JWin}^{1+}(G')$, and
• $w \in \mathrm{JWin}^{1}(G) \Leftrightarrow w' \in \mathrm{JWin}^{1}(G')$.
Proof. The main observation we need is that replacement in cfGs is generally very localised, i.e. a Call on $\langle /a \rangle$ in a string of the form $u \langle a \rangle v \langle /a \rangle$ only affects $\langle a \rangle v \langle /a \rangle$, the shortest well-nested suffix of the current string up to $\langle /a \rangle$.
The obvious idea behind the proof is to simulate replacement rules with insertion rules. The crucial insight for this simulation is that, while we cannot delete the rooted suffix $w = \langle a \rangle v \langle /a \rangle$ from a current string, the new target automaton $A(T')$ can "undo" the effect of $w$ on $A(T)$ by reverting it to the state it had before reading $w$. To this end, $A(T')$ simulates $A(T)$, all the while memorising (in its state) a "fallback state" that $A(T)$ was in before beginning to read $w$. That way, $A(T')$ can always revert its simulation of $A(T)$ to the point before $w$ was read, effectively making $A(T)$ "forget" $w$ and thus simulating a replacement of $w$.
In this way, it is easy to simulate deletion of suffixes that would be replaced in $G$, so we only need some way of knowing when such a deletion should take place. To this end, we encapsulate replacement strings $u$ for $G$ within backspace tags as $\langle b \rangle u \langle /b \rangle$ (with $b \notin \Sigma$). Now, when the automaton $A'$ reads a $\langle b \rangle$, it knows that what follows is supposed to be a replacement string, so it "forgets" the last rooted suffix of the current string, jumps back to the last fallback state and continues simulating $A(T)$ on $u$, re-setting its fallback state along the way as necessary.
Formally, let $A(T) = (Q, \Sigma, \delta, q_0, F)$ be a DNWA in normal form for $T$ and let $b \notin \Sigma$. We define $G'$ as follows:
• $I_a = \{\langle b \rangle u \langle /b \rangle \mid u \in R_a\}$ for all $a \in \Gamma$ and
• $T' = L(A')$ for the DNWA $A'$ defined below.
In keeping with the above intuition, $A'$ tracks in its state $(p, q)$ a current state $p$ and a fallback state $q$ of $A(T)$. When $A'$ reads an $\langle a \rangle$ (resp. $\langle /a \rangle$), it knows that the rooted string immediately to the left of $\langle a \rangle$ has not been replaced, so it simulates a step of $A(T)$ to obtain a new current state and sets the new fallback state to be the state $A(T)$ had immediately before reading $\langle a \rangle$ (respectively the $\langle a \rangle$ associated with the current $\langle /a \rangle$).
On reading $\langle b \rangle$, $A'$ knows that the last minimal nested string has been replaced in $G$, so it returns its simulation of $A(T)$ to the fallback state and simulates $A(T)$ on the replacement string following after $\langle b \rangle$ from there. On $\langle /b \rangle$, neither the current nor fallback state changes, as the last rooted string to the left of $\langle /b \rangle$ may be considered the last minimal suffix of the current word in the replacement game.
If, after reading a string and simulating $A(T)$ on it as described above, $A(T)$ accepts (i.e. the current state of $A'$ is in $F$), $A'$ accepts as $A(T)$ would.
Lemma A.20. There exists a polynomial-time algorithm that, given an insertion cfG $G = (\Sigma, \Gamma, I, T)$ and nested word $w \in \mathrm{WF}(\Sigma)$, outputs a replacement game $G' = (\Sigma', \Gamma, R, T')$ and word $w' \in \mathrm{WF}(\Sigma')$ such that
• $w \in \mathrm{JWin}^{1+}(G) \Leftrightarrow w' \in \mathrm{JWin}(G')$, and
• $w \in \mathrm{JWin}^{1}(G) \Leftrightarrow w' \in \mathrm{JWin}^{1}(G')$.
Proof. The basic idea behind simulating insertion games using replacement games is to replace every subword $\langle a \rangle v \langle /a \rangle$ of $w$ by $\langle a \rangle \mu(v) \langle /a \rangle \langle a' \rangle \langle /a' \rangle$ in $w'$ (where $a'$ is a new "copy" of $a$) and to simulate the insertion of a new substring to the right of $\langle a \rangle v \langle /a \rangle$ by the replacement of $\langle a' \rangle \langle /a' \rangle$. We refer to the additional substrings of the form $\langle a' \rangle \langle /a' \rangle$ as "anchors".
To this end, we need to ensure that (a) no non-anchor substring ever gets replaced, and (b) each replacement string contains new anchors for further insertions. For part (a), we add extra symbols to the input alphabet, while part (b) is done through the transformation from $w \in \mathrm{WF}(\Sigma)$ to $w' \in \mathrm{WF}(\Sigma')$ hinted at in the claim's statement.
More formally, we set $\Sigma' = \Sigma \cup \{a' \mid a \in \Sigma\}$, i.e. we add a second disjoint copy of Σ to itself. Strings will generally be transformed using a function $\mu : \mathrm{WF}(\Sigma) \to \mathrm{WF}(\Sigma')$ defined inductively by
• $\mu(\varepsilon) = \varepsilon$,
• $\mu(uv) = \mu(u)\mu(v)$ for all $u, v \in \mathrm{WF}(\Sigma)$ and
• $\mu(\langle a \rangle v \langle /a \rangle) = \langle a \rangle \mu(v) \langle /a \rangle \langle a' \rangle \langle /a' \rangle$ for all $a \in \Sigma$, $v \in \mathrm{WF}(\Sigma)$.
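The inductive definition of µ translates directly into a short recursive procedure. The following Python sketch assumes, purely for illustration, that well-nested words are encoded as forests, i.e. lists of `(label, children)` pairs, and that the copy $a'$ of a symbol $a$ is written by appending an apostrophe; the names `mu` and `linearise` are hypothetical.

```python
def mu(forest):
    """Add an anchor <a'></a'> directly after every subword <a> v </a>."""
    result = []
    for label, children in forest:
        # mu(<a> v </a>) = <a> mu(v) </a> <a'> </a'>
        result.append((label, mu(children)))
        result.append((label + "'", []))
    # The empty forest maps to itself, and concatenation is preserved
    # because every tree of the forest is processed independently.
    return result


def linearise(forest):
    """Turn a forest into its nested word, given as a list of tags."""
    tags = []
    for label, children in forest:
        tags.append(f"<{label}>")
        tags.extend(linearise(children))
        tags.append(f"</{label}>")
    return tags
```

For instance, the nested word `<a><c></c></a>` is mapped to `<a><c></c><c'></c'></a><a'></a'>`, so each original subword is followed by exactly one anchor at its own nesting level.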
The target language of $G'$ is defined as $T' = \{\mu(w) \mid w \in T\}$; it is easy to see that a DNWA for $T'$ can be constructed from $A(T)$ by simply ignoring symbols from $\Sigma' \setminus \Sigma$.
The set of function symbols in $G'$ is just $\{a' \mid a \in \Gamma\}$, and the replacement languages are defined by $R_{a'} = \{\mu(w) \mid w \in I_a\}$.
Again, it is easy to see that automata for each $R_{a'}$ can be computed from those for $I_a$ in polynomial time.
Finally, the input string gets transformed (in polynomial time) via µ as well: $w' = \mu(w)$.
Proof of Proposition 22. Parts (b) and (c) follow directly from Lemmas A.19 and A.20 as well as Proposition 10. All that remains to be proven is therefore the undecidability of JWin in the general setting.
Intuitively this holds because, on a string of the form $\langle a \rangle v \langle /a \rangle$, jumping back to the start after calling $\langle /a \rangle$ effectively allows Juliet to play arbitrarily many left-to-right passes on $v$, thereby enabling her to simulate any (not just L2R) strategy on $v$. We utilise this fact to give a reduction from the following problem: given a flat string $w$ and a context-free game $G$ with flat regular replacement and target languages in which Juliet may freely select positions (not only from left to right), decide whether Juliet has a winning strategy. It was shown in [14] that this problem is undecidable. For precise definitions of these games we refer to [14].
For the reduction, we construct a cfG $G_n$ from a given input flat cfG $G_r = (\Sigma, \Gamma, R, T)$ with target language DFA $A = (Q, \Sigma, \delta, q_0, F)$ for $T$. The idea is to simulate an arbitrary strategy for Juliet on some flat string $w \in \Sigma^*$ in $G_r$ by means of an L2R strategy on the nested word $\hat{w} \in \mathrm{WF}(\Sigma)$ derived from $w \in \Sigma^*$ by replacing each symbol $a$ in $w$ with $\langle a \rangle \langle /a \rangle$.
We make use of the relatively simple observation that an arbitrary strategy of Juliet on $w$ (in which Juliet may freely choose which position in the current string to Call next) can easily be simulated by an unbounded number of left-to-right passes over the current string using only the moves Read (which moves the current position within the string one step to the right), Call (which does not change the current position) and additional left-step (LS) moves, which reset the current position to 0 once the end of the current string has been reached (cf. [4]).
The idea for the reduction, now, is to transform the input string $w$ into a string of the form $\langle r \rangle \hat{w} \langle /r \rangle$ (for some $r \notin \Sigma$), simulate each left-to-right pass for Juliet on $w$ appropriately on $\hat{w}$ and then use a Call on $\langle /r \rangle$ to simulate an LS move, appending some irrelevant "tail" $\langle t \rangle \langle /t \rangle$ (for $t \notin \Sigma$) to the current nested string in the process.
The only minor conceptual difficulty is how to simulate a left-to-right pass of Juliet on $\hat{w}$ using insertion rules, as context-free games with non-nested regular languages are defined using only replacement in [14]. This can be done with a similar technique as described in the proof of Lemma A.19: replacement strings $v$ from some replacement language $R_a \subseteq \Sigma^*$ are transformed into nested strings as above and encapsulated in "backspace" tags as $\langle b \rangle \hat{v} \langle /b \rangle$ (for $b \notin \Sigma$); on reading an opening $\langle b \rangle$, the target DNWA for $G_n$ "forgets" the last nested string before the $\langle b \rangle$ by restoring a fallback state of $A$. The only difference to the proof of Lemma A.19 is that here, the target DNWA for $G_n$ merely has to simulate a DFA, not a DNWA.
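The fallback-state technique is easiest to see in this flat case. The following Python sketch is a simplified illustration under several assumptions that are not part of the construction: each pair $\langle a \rangle \langle /a \rangle$ is abbreviated to the single token `a`, the backspace tags are written `"<b>"` and `"</b>"`, the DFA is given as a dictionary, and the tags for $r$ and $t$ are ignored. The function name `accepts_with_backspace` is hypothetical.

```python
def accepts_with_backspace(delta, q0, finals, tokens):
    """Simulate a DFA on a token stream containing backspace tags.

    delta:  dict mapping (state, symbol) to the successor state
    tokens: flat symbols (each standing for a pair <a></a>),
            plus the backspace tags "<b>" and "</b>"
    """
    current, fallback = q0, q0
    for token in tokens:
        if token == "<b>":
            # The symbol read last counts as replaced: forget it by
            # returning to the state the DFA had before reading it.
            current = fallback
        elif token == "</b>":
            # Closing a backspace block changes neither state.
            continue
        else:
            # Ordinary symbol: remember the state before the step,
            # then simulate one transition of the DFA.
            fallback = current
            current = delta[(current, token)]
    return current in finals
```

With a DFA that rejects every word containing `a`, the token stream `c a <b> d </b>` is accepted, since the inserted block makes the automaton treat `a` as replaced by `d`, whereas the plain stream `c a d` is rejected.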