Lexicographic Transductions of Finite Words

Filiot, Emmanuel; Lhote, Nathan; Reynier, Pierre-Alain

doi:10.4230/LIPIcs.MFCS.2025.50

Emmanuel Filiot

Université libre de Bruxelles (ULB), Belgium

Nathan Lhote

Aix Marseille Univ, CNRS, LIS, Marseille, France

Pierre-Alain Reynier

Aix Marseille Univ, CNRS, LIS, Marseille, France

Abstract

Regular transductions over finite words have linear input-to-output growth. This class of transductions enjoys many characterizations, such as transductions computable by two-way transducers as well as transductions definable in MSO (in the sense of Courcelle). Recently, regular transductions have been extended by Bojańczyk to polyregular transductions, which have polynomial growth, and are characterized by pebble transducers and MSO interpretations. Another class of interest is that of transductions defined by streaming string transducers or marble transducers, which have exponential growth and are incomparable with polyregular transductions.

In this paper, we consider MSO set interpretations (MSOSI) over finite words, which were introduced by Colcombet and Loeding. MSOSI are a natural candidate for the class of “regular transductions with exponential growth”, and are rather well behaved. However, MSOSI currently lack two desirable properties that regular and polyregular transductions have. The first property is to have an automata description. This property is closely related to a second property, that of being regularity preserving, meaning preserving regular languages under inverse image.

We first show that if MSOSI are (effectively) regularity preserving then any automatic $\omega$ -word has a decidable MSO theory, an almost 20-year-old conjecture of Bárány.

Our main contribution is the introduction of a class of transductions of exponential growth, which we call lexicographic transductions. We provide three different presentations for this class: first, as the closure of simple transductions (recognizable transductions) under a single operator called maplex; second, as a syntactic fragment of MSOSI (but the regular languages are given by automata instead of formulas); and third, as an automaton-based model called nested marble transducers, which generalize both marble transducers and pebble transducers. We show that this class enjoys many nice properties including being regularity preserving.

Keywords and phrases:

Transducers, Automata, MSO, Logical interpretations, Automatic structures

Funding:

Emmanuel Filiot: research director at F.R.S.-FNRS. This work was partially funded by the FNRS project 40020726.

2012 ACM Subject Classification:

Theory of computation $\rightarrow$ Automata extensions

Related Version:

Full Version: https://arxiv.org/abs/2503.01746 [18]

Acknowledgements:

Nathan Lhote would like to warmly thank Mikołaj Bojańczyk, Rafał Stefański and Lê Thành Dũng (Tito) Nguyễn for many interesting discussions at different stages of this work.

Funding:

This work has been partly funded by the QuaSy project (ANR-23-CE48-0008).

Event:

50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025)

Editors:

Paweł Gawrychowski, Filip Mazowiecki, and Michał Skrzypczak

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Before discussing lexicographic transductions, the central notion of this article, we give some context on transduction classes and connections to automatic structures.

MSOSI and the connection to automatic structures.

MSO set interpretations (MSOSI) were introduced in [11], as a generalization of automatic structures (as well as $\omega$ -automatic, tree automatic and $\omega$ -tree automatic structures). Indeed an automatic structure can be seen as an MSOSI whose domain is a single structure with decidable MSO theory such as $(\mathbb{N},\leq)$ . Using a framework of transformations turns out to be very fruitful, and most of the properties of automatic structures already hold for set-interpretations over structures with decidable MSO theory. The core property of automatic structures (and their generalizations) is that they have a decidable FO theory. More generally, MSOSI have what we call the FO backward translation property, meaning that the inverse image of an FO definable set by an MSOSI is MSO definable. This property is obtained via simple, yet powerful, syntactic formula substitution. This technique actually allows one to show more generally that MSOSI are closed under post-composition by FO interpretations (FOI).

Generally speaking, automatic structures do not have a decidable MSO theory. This has motivated a line of research that looks for interesting structures with a decidable MSO theory. For instance, morphic $\omega$ -words, as well as two generalizations called $k$ -lexicographic $\omega$ -words [2] and toric $\omega$ -words [4], have been shown to have a decidable MSO theory. Morphic $\omega$ -words and $k$ -lexicographic $\omega$ -words are particular cases of automatic $\omega$ -words (not to be confused with automatic sequences). An automatic $\omega$ -word is an automatic structure with unary relations and a single binary relation that is a total order isomorphic to $(\mathbb{N},\leq)$ (it is crucial that the structure be given by its order relation and not by the successor). To the best of our knowledge, it is not known whether an automatic $\omega$ -word with an undecidable MSO theory exists, raising the next conjecture.

Conjecture 1 ([2], Section 9).

Any automatic $\omega$ -word has a decidable MSO theory. (Bárány actually conjectures that any automatic $\omega$ -word has a so-called canonical presentation.)

In [2, Corollary 5.6], the author even shows that $k$ -lexicographic $\omega$ -words are closed under sequential transductions. As we show in Proposition 11, this property is deeply connected to preserving MSO definable sets by inverse image (which we call regularity preserving, following the terminology of [3]; this property is sometimes called regular continuity [10]) and is stronger than having a decidable MSO theory.

A different setting where one can obtain regularity preserving transductions is provided in [8], where it is shown that MSO interpretations (MSOI) from finite words to finite words characterize the polyregular transductions. Once again, as for automatic $\omega$ -words, the output structure must be defined by its order and not by the successor.

This calls for a more unifying argument and systematic study of MSOSI whose output structures are linearly ordered, which we phrase as a conjecture (one could state stronger conjectures extending the structures to trees, $\omega$ -words or infinite trees):

Conjecture 2.

MSOSI from finite words to finite words are regularity preserving.

In this article, we focus on transductions from finite words to finite words for two main reasons: it is already challenging, and it captures part of the difficulty of $\omega$ -words. We show in Proposition 14 that a positive answer to Conjecture 2 entails that Conjecture 1 holds.

On regular transductions with exponential growth.

The theory of finite word transducers has a long history (in fact as long as automata theory) and is still actively studied.

Various classes of transductions have been introduced, most notably (and ordered inclusion-wise): sequential (Seq), rational (Rat), regular (Reg) and the more recent polyregular transductions (PolyReg) as well as transductions defined by streaming string transducers (SST), which subsume Reg but are incomparable to PolyReg. For a recent survey, see [23].

These classes are rather well-known and enjoy nice regularity properties, including being closed under composition (except for SST which is still closed under post-composition by Seq) which entails being regularity preserving (in Proposition 11 we see that the two are closely related). However some important questions remain open, such as equivalence of PolyReg transductions, which is not known to be decidable.

The two classes of Reg and PolyReg enjoy natural logical characterizations, namely word-to-word MSO transductions (MSOT) and MSOI, respectively. The fact that MSOT are regularity preserving is again obtained by simple formula substitution and holds for arbitrary structures. In contrast in the case of MSOI, the only known proof is via a translation into an automaton model called pebble transducers. This raises a natural question for MSOSI:

Question 1.

Can we get an automaton model corresponding to MSOSI over finite words?

A positive answer to this question would hopefully provide a proof of Conjecture 2, since natural automata models are usually closed under post-composition by Seq (see Proposition 11). While hope plays an important part in research, we have good reasons to think this is a hard problem: as mentioned above this would solve a long-standing open problem on automatic structures.

It is rather clear that PolyReg captures the “right” notion of regular transductions with polynomial growth. While MSOSI seems like a natural candidate, not enough is known about this class yet to say that it captures the “right” notion of regular transductions with exponential growth. Let us more humbly describe what should be, in our view, a nice class of regular transductions with exponential growth: this class should be characterized by different, somewhat natural (as opposed to an artificial model like the union of PolyReg with SST) computation models which subsume the well-behaved classes of PolyReg and SST. It should be regularity preserving and potentially (see Proposition 11) have extra closure properties by pre- or post-composition with smaller classes. In this article we introduce the class of lexicographic transductions (Lex) which meets all the above criteria.

Contributions.

The first contribution of the article is a hardness result: showing that word-to-word MSOSI are regularity preserving is at least as hard as showing that any automatic $\omega$ -word has a decidable MSO theory (Proposition 14). To obtain this result we define automatic transductions (AT), which are naturally equivalent to MSOSI but formulated in a way that makes the connection with automatic structures clearer. That way we obtain a one-to-one correspondence between automatic $\omega$ -words and automatic transductions over a unary alphabet which define a total function.

The main contribution of the article is the introduction of a new class of transductions, called lexicographic transductions (Lex). We give three different characterizations of this class and show that it enjoys many nice properties, including being regularity preserving.

The first definition of Lex is in the spirit of list functions of [7, 5]: we start with simple functions which are recognizable transductions whose range contains words of length at most $1$ only. Then we close the class under a single type of operator called maplex which works as follows: $\textsf{maplex}\ f$ maps a word $u$ to the concatenation $f(u_{1})f(u_{2})\dots f(u_{n})$ where $u_{1},\dots,u_{n}$ are all the labellings of $u$ over some fixed and totally ordered alphabet, enumerated in lexicographic order.

Secondly, we show that this class can be expressed as a syntactic restriction of AT, which we call lexicographic automatic transductions ( $\textsf{AT}_{\textsf{Lex}}$ ). These two characterizations are actually syntactically equivalent but quite different in spirit. We leverage the aforementioned correspondence between automatic $\omega$ -words and automatic transductions, as well as a result of Bárány, to show that the nesting of maplex operators generates a strict hierarchy of transduction classes (Proposition 27).

Thirdly, we introduce an automaton model called nested marble transducers (NMT). Nested marble transducers are quite expressive: they generalize marble transducers [15, 12], which are known to coincide with SST, and they also naturally generalize PolyReg. Informally, a level $k$ nested marble transducer can annotate its input as a marble transducer (i.e. it drops a marble whenever moving left and lifts a marble whenever moving right), and call a level $k{-}1$ nested marble transducer to run on this annotated configuration. This call returns both an output string and a state which the top-level transducer can use to take its next transition. This passing of information from the lower levels to the higher levels is what allows us to prove closure under post-composition with a sequential transducer. This is the key ingredient to show that NMT have regular domains, are regularity preserving, and more generally are closed under post-composition by PolyReg.

Regarding expressiveness, Lex can be expressed by NMT in a rather direct way (Theorem 36). In the other direction, transductions expressed in Lex do not have such a state-passing mechanism, hence showing that NMT is included in Lex constitutes the technical heart of this article (Theorem 37). An important step consists in showing that one can remove the state-passing mechanism in NMT (Theorem 33). On top of being technical, we show it is computationally costly: there is an unavoidable non-elementary blow-up when transforming a nested marble transducer into a nested marble transducer without state-passing. This allows us to prove the equivalence between the different models. In addition, as shown in Figure 1, this equivalence holds at each level of the aforementioned hierarchy.

Figure 1: Overview of the different equivalent models, with the transformations between them. The dotted arrow denotes an equivalence with a semantic restriction of $\textsf{AT}_{k\text{-}\textsf{Lex}}$ (see [18] for details).

2 Word languages and transductions

Words and languages.

Given an alphabet $\Sigma$ , a $\Sigma$ -word $u$ (or just word if $\Sigma$ is clear from the context) is a sequence of letters from $\Sigma$ . We denote by $\epsilon$ the empty word, and by $|u|$ the length of a word $u$ . In particular $|\epsilon|=0$ . For all integers $n\geq 0$ , we let $\Sigma^{n}$ (resp. $\Sigma^{\leq n}$ ) be the set of words of length $n$ (resp. at most $n$ ). We let $\text{Pos}(u)=\{1,\dots,|u|\}$ be the set of positions of $u$ , and for all $i\in\text{Pos}(u)$ , $u[i]\in\Sigma$ is the $i$ -th letter of $u$ . We write $\Sigma^{*}$ for the set of words over $\Sigma$ , and $\Sigma^{+}$ for the set of non-empty words. An $\omega$ -word is defined similarly, with a set of positions equal to $\mathbb{N}$ . A word language over $\Sigma$ is a subset of $\Sigma^{*}$ . In this paper, we let $|$ be a symbol called separator, assumed to be distinct from any alphabet symbol. Let $\Sigma_{1},\Sigma_{2}$ be two alphabets, $\ell\in\mathbb{N}$ and $u_{1}\in\Sigma_{1}^{\ell},u_{2}\in\Sigma_{2}^{\ell}$ be two words of length $\ell$ . The convolution $u_{1}\otimes u_{2}$ of $u_{1}$ and $u_{2}$ is the word in $(\Sigma_{1}\times\Sigma_{2})^{\ell}$ such that for all $1\leq i\leq\ell$ , $(u_{1}\otimes u_{2})[i]=(u_{1}[i],u_{2}[i])$ .
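As a small illustration of the convolution, here is a minimal Python sketch. It assumes words are represented as Python strings and letters of a product alphabet as pairs; the function name `convolution` is ours and does not come from the article.

```python
def convolution(u1, u2):
    """Return the convolution of two words of the same length as a list of pairs."""
    if len(u1) != len(u2):
        raise ValueError("the convolution is only defined for words of equal length")
    # The i-th letter of the result is the pair (u1[i], u2[i]).
    return list(zip(u1, u2))
```

For instance, `convolution("abb", "010")` pairs each letter of `abb` with the label at the same position.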

Finite automata.

A (non-deterministic) finite automaton (NFA) over an alphabet $\Sigma$ is denoted as a tuple $A=(Q,q_{0},F,\Delta)$ where $Q$ is the set of states, $q_{0}$ the initial state, $F\subseteq Q$ the final states, and $\Delta\subseteq Q\times\Sigma\times Q$ the transition relation. We write $q\xrightarrow{u}_{A}q^{\prime}$ when there exists a run of $A$ from state $q$ to state $q^{\prime}$ on $u$ , and denote by $L(A)=\{u{\in}\Sigma^{*}\mid q_{0}\xrightarrow{u}_{A}q_{f}\in F\}$ the language recognized by $A$ . When $A$ is a deterministic finite automaton (DFA), the transition relation is denoted by a (partial) function $\delta:Q\times\Sigma\rightharpoonup Q$ .
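To make the acceptance condition concrete, the following Python sketch decides membership in $L(A)$ by tracking the set of states reachable on the prefix read so far. The encoding (a set of triples for $\Delta$) and the example automaton `NFA_DELTA` are assumptions made for illustration only.

```python
def nfa_accepts(delta, q0, final, word):
    """Decide whether `word` is in L(A).

    delta: set of triples (q, letter, q2) encoding the transition relation.
    """
    current = {q0}
    for letter in word:
        # All states reachable from `current` by reading `letter`.
        current = {q2 for (q1, a, q2) in delta if q1 in current and a == letter}
    return bool(current & set(final))


# Hypothetical example: words over {a, b} whose second-to-last letter is a.
NFA_DELTA = {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "a", 2), (1, "b", 2)}
```

With initial state `0` and final states `{2}`, this automaton guesses the position of the second-to-last letter non-deterministically.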

Word transductions.

A word transduction (or just transduction for short) over two alphabets $\Sigma,\Gamma$ is a (partial) function $f\colon\Sigma^{*}\rightharpoonup\Gamma^{*}$ . We denote by $\textsf{dom}(f)$ its domain. Given two transductions $f_{1},f_{2}\colon\Sigma^{*}\rightharpoonup\Gamma^{*}$ with disjoint domains, we let $f_{1}+f_{2}$ be the transduction with domain $\textsf{dom}(f_{1})\cup\textsf{dom}(f_{2})$ such that $(f_{1}+f_{2})(u)=f_{i}(u)$ if $u\in\textsf{dom}(f_{i})$ , for $i\in\{1,2\}$ . Given $f\colon\Sigma^{*}\rightharpoonup\Gamma^{*}$ , $g\colon\Gamma^{*}\rightharpoonup\Lambda^{*}$ , we write $(g\ f)\colon\Sigma^{*}\rightharpoonup\Lambda^{*}$ for the composition $g\circ f$ . Given $h\colon\Lambda^{*}\rightharpoonup\Delta^{*}$ , $(h\ g\ f)$ stands for $(h\ (g\ f))$ . For $u\in\Sigma^{*}$ , we also write $(h\ g\ f\ u)$ for $(h\ g\ f)(u)$ .

A transduction $f$ has exponential growth if there exists $c\in\mathbb{N}$ such that for all $u\in\textsf{dom}(f)$ , $|f(u)|\leq 2^{c|u|}$ holds. A transduction $f$ has polynomial growth if there exist $c,k\in\mathbb{N}$ such that for all $u\in\textsf{dom}(f)$ , $|f(u)|\leq c|u|^{k}$ holds.

Example 3 (Reverse, copy and square).

Let $\Sigma$ be an alphabet. The transduction $\textsf{rev}:\Sigma^{*}\rightarrow\Sigma^{*}$ takes as input any word $u=\sigma_{1}\dots\sigma_{n}$ and outputs its reverse $\sigma_{n}\dots\sigma_{1}$ , for all $\sigma_{i}\in\Sigma$ . The transduction copy takes $u$ and returns $u u$ .

Let $\Sigma$ be an alphabet and $\underline{\Sigma}=\{\underline{\sigma}\mid\sigma\in\Sigma\}$ . Given a word $u=\sigma_{1}\dots\sigma_{n}$ and a position $i\in\text{Pos}(u)$ , we let $\textsf{under}_{i}(u)=\sigma_{1}\dots\sigma_{i{-}1}\underline{\sigma_{i}}\sigma_{i+1}\dots\sigma_{n}$ . The transduction $\textsf{square}\colon\Sigma^{*}\rightarrow(\Sigma\cup\underline{\Sigma})^{*}$ is defined as $\textsf{square}(u)=\textsf{under}_{1}(u)\dots\textsf{under}_{|u|}(u)$ . For example $\textsf{square}(abc)=\underline{a}bca\underline{b}cab\underline{c}$ .
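The three transductions of this example can be sketched in a few lines of Python. This is only an illustration under an encoding assumption: input words are lower-case strings, and an underlined letter is rendered as the corresponding upper-case letter.

```python
def rev(u):
    """Reverse of u."""
    return u[::-1]


def copy(u):
    """The word u followed by itself."""
    return u + u


def under(u, i):
    """Underline the i-th letter of u (1-based); underlining is shown as upper case."""
    return u[:i - 1] + u[i - 1].upper() + u[i:]


def square(u):
    """Concatenate under_1(u), ..., under_|u|(u)."""
    return "".join(under(u, i) for i in range(1, len(u) + 1))
```

Here `square("abc")` yields `"AbcaBcabC"`, matching the example above, and the output of `square` on a word of length $n$ has length $n^{2}$.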

Example 4 (Subwords).

Let $\textsf{sub}:\Sigma^{*}\rightarrow\Sigma^{*}$ be the transduction which enumerates all the subwords of a word in lexicographic order (with rightmost significant bit). For example $\textsf{sub}(abc)=a.b.ab.c.ac.bc.abc$ . Note that sub has exponential growth.
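A brute-force Python sketch of sub may help to fix the enumeration order. It assumes that a set of positions is encoded as an integer bit mask in which the rightmost position of the input is the most significant bit; the encoding and the function name are ours.

```python
def sub(u):
    """Concatenate the non-empty subwords of u, ordered by their position sets.

    Position sets are compared as binary numbers whose most significant bit
    is the rightmost position of u.
    """
    n = len(u)
    pieces = []
    for mask in range(1, 2 ** n):
        # Bit i of mask is set exactly when position i + 1 of u is selected.
        pieces.append("".join(u[i] for i in range(n) if (mask >> i) & 1))
    return "".join(pieces)
```

Each position belongs to exactly half of the $2^{n}$ subsets, so the output on a word of length $n$ has length $n2^{n-1}$, which is consistent with the exponential growth noted above.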

Example 5 (Map).

Let $\Sigma$ be some alphabet and $|\not\in\Sigma$ be some separator symbol. Let $\Sigma_{|}=\Sigma\cup\{|\}$ . Let $f\colon\Sigma^{*}\rightharpoonup\Gamma^{*}$ . The transduction $\textsf{map}\ f\colon\Sigma_{|}^{*}\rightharpoonup(\Gamma\cup\{|\})^{*}$ takes any input word of the form $u=u_{1}|u_{2}|\dots|u_{n}$ where $u_{i}\in\Sigma^{*}$ for all $i\in\{1,\dots,n\}$ , and returns $f(u_{1})|f(u_{2})|\dots|f(u_{n})$ if all the $f(u_{i})$ are defined; otherwise $(\textsf{map}\ f)(u)$ is undefined.
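The map operator can be sketched as a higher-order Python function. In this sketch, a partial function is assumed to return `None` where it is undefined, and the separator is the character `|`; both conventions are ours.

```python
def map_transduction(f, sep="|"):
    """Lift a partial function f on words (None means undefined) to separated blocks."""
    def mapped(u):
        images = [f(block) for block in u.split(sep)]
        if any(image is None for image in images):
            return None  # undefined as soon as one f(u_i) is undefined
        return sep.join(images)
    return mapped
```

For example, `map_transduction(lambda w: w[::-1])` reverses each block separately and leaves the separators in place.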

Sequential and rational transductions.

Sequential transductions are transductions recognized by sequential transducers. A sequential transducer over some alphabets $\Sigma$ and $\Gamma$ (not necessarily disjoint) is a pair $T=(A,\mu)$ where $A=(Q,q_{0},F,\delta)$ is a DFA over $\Sigma$ and $\mu\colon\textsf{dom}(\delta)\rightarrow\Gamma^{*}$ is a total function. We write $q\xrightarrow{u/v}_{T}q^{\prime}$ whenever there exists a sequence of states $q_{1}=q,q_{2},\dots,q_{n+1}=q^{\prime}$ such that $q_{1}\xrightarrow{u[1]}_{A}q_{2}\dots q_{n}\xrightarrow{u[n]}_{A}q_{n+1}$ where $n=|u|$ , and $v=\mu(q_{1},u[1])\dots\mu(q_{n},u[n])$ . The transduction $f_{T}$ recognized by $T$ is defined for all $u\in L(A)$ by $f_{T}(u)=v$ such that $q_{0}\xrightarrow{u/v}q_{f}\in F$ . Note that $\textsf{dom}(f_{T})=L(A)$ . We denote by Seq the class of sequential transductions. Like sequential transducers, a (non-deterministic, functional) finite state transducer is defined as a pair $T=(A,\mu)$ but $A$ can be non-deterministic, with the functional restriction: for all words $u\in L(A)$ , the outputs of all the accepting runs over $u$ are all equal. With this restriction, $T$ recognizes a transduction $f_{T}$ . A rational transduction is a transduction $f_{T}$ for some $T$ , and we denote by Rat the class of rational transductions [21].
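The semantics of a sequential transducer can be illustrated by the following Python sketch. The dictionary encoding of $\delta$ and $\mu$ , and the example transducer `SEQ_DELTA`, `SEQ_MU`, are assumptions made for illustration.

```python
def run_sequential(delta, mu, q0, final, u):
    """Return f_T(u), or None when u is not in L(A).

    delta: dict (state, letter) -> state, the partial transition function.
    mu:    dict (state, letter) -> output word, defined on dom(delta).
    """
    state, output = q0, []
    for letter in u:
        if (state, letter) not in delta:
            return None  # no run of A on u
        output.append(mu[(state, letter)])
        state = delta[(state, letter)]
    return "".join(output) if state in final else None


# Hypothetical transducer: erase every 'a' that immediately follows another 'a'.
SEQ_DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}
SEQ_MU = {(0, "a"): "a", (0, "b"): "b", (1, "a"): "", (1, "b"): "b"}
```

Since the underlying automaton is deterministic, there is at most one run on each input, and the output is produced letter by letter along that run.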

Regular and polyregular transductions.

The class of regular (resp. polyregular) transductions is the smallest class of transductions which is closed under composition of transductions and map, and contains the sequential transductions, copy and rev (resp. the sequential transductions, rev and square) [9, 5]. We denote by PolyReg the class of polyregular transductions.

3 MSO set interpretations, properties and limitations

MSO set interpretations

Signatures, formulas and structures.

A relational signature (or simply signature) is a set $\mathcal{S}$ of symbols together with a function $\mathsf{arity}:\mathcal{S}\rightarrow\mathbb{N}$ . We consider a set of first-order variables denoted by lower-case letters $x,y,z,\ldots$ as well as a set of second-order variables denoted by upper-case letters $X,Y,Z\ldots$ . The MSO-formulas over signature $\mathcal{S}$ , denoted by $\textsf{MSO}[\mathcal{S}]$ , are defined by the grammar $\phi::=\exists x\phi\mid\exists X\phi\mid\phi\wedge\phi\mid\neg\phi\mid X(x)\mid R(x_{1},\ldots,x_{r})$ , where $x,x_{1},\ldots,x_{r}$ are first-order variables, $X$ is a second-order variable and $R\in\mathcal{S}$ with $\mathsf{arity}(R)=r$ . We denote by $\textsf{FO}[\mathcal{S}]$ the formulas which do not use second-order variables.

A relational structure $u$ over signature $\mathcal{S}$ is a set $U$ called the universe of the structure, together with, for each symbol $R\in\mathcal{S}$ of arity $r$ , an interpretation $R^{u}\subseteq U^{r}$ .

Regularity preserving.

A function from $\mathcal{S}$ -structures to $\mathcal{T}$ -structures is called regularity preserving if the inverse image of an $\textsf{MSO}[\mathcal{T}]$ definable set is $\textsf{MSO}[\mathcal{S}]$ definable. We say that a class of functions is regularity preserving if all functions in the class are.

Word structures.

The word signature over $\Sigma$ is the tuple $S_{\Sigma}=((\sigma(x))_{\sigma\in\Sigma},\leq(x,y))$ where $\sigma(x)$ are unary predicate symbols and $\leq(x,y)$ , usually written $x\leq y$ , is a binary predicate symbol. Any word $u$ can be naturally associated with an $S_{\Sigma}$ -structure $\tilde{u}=(U,(\sigma^{\tilde{u}})_{\sigma\in\Sigma},\leq^{\tilde{u}})$ where $U=\text{Pos}(u)$ , $\sigma^{\tilde{u}}$ is a set of positions labeled $\sigma$ , for all $\sigma\in\Sigma$ , and $\leq^{\tilde{u}}$ is the natural (linear) order on $\text{Pos}(u)$ . We write $u$ instead of $\tilde{u}$ if it is clear from the context that $u$ is an $S_{\Sigma}$ -structure. A word structure over $\Sigma$ is an $S_{\Sigma}$ -structure isomorphic to some $\tilde{u}$ . Note that being a word structure is FO definable.

Definition 6 (MSO set interpretations [11]).

An MSO set interpretation (MSOSI) $T$ from $\mathcal{S}$ -structures to $\mathcal{T}$ -structures, is given by $k\in\mathbb{N}\setminus\left\{0\right\}$ called the dimension, a domain formula $\phi_{\textsf{dom}}\in\textsf{MSO}[\mathcal{S}]$ , an output universe formula $\phi_{\mathsf{univ}}(\overline{X})\in\textsf{MSO}[\mathcal{S}]$ , and for each symbol $R\in\mathcal{T}$ of arity $r$ a formula $\phi_{R}(\overline{X_{1}},\ldots,\overline{X_{r}})\in\textsf{MSO}[\mathcal{S}]$ , where $\overline{X},\overline{X_{1}},\ldots$ are $k$ -tuples of variables.

The semantics of $T$ is a partial transduction $f_{T}$ from $\mathcal{S}$ -structures to $\mathcal{T}$ -structures. The domain of $f_{T}$ is the set of structures $u$ such that $u\models\phi_{\textsf{dom}}$ . Given such a $u$ with universe $U$ , we define its image $v=f_{T}(u)$ as the structure with universe $V=\{\overline{P}\in(2^{U})^{k}\mid u\models\phi_{\mathsf{univ}}(\overline{P})\}$ , and for each $R\in\mathcal{T}$ of arity $r$ , the interpretation $R^{v}=\{(\overline{P_{1}},\ldots,\overline{P_{r}})\in V^{r}\mid u\models\phi_{R}(\overline{P_{1}},\ldots,\overline{P_{r}})\}$ . We say that an MSOSI is (finite) word-to-word if its domain and co-domain only contain word structures over some respective alphabets $\Sigma,\Gamma$ .

$\blacktriangleright$ Remark 7.

Given an MSOSI from $S_{\Sigma}$ -structures to $S_{\Gamma}$ -structures, one can restrict the domain formula to word structures whose images are word structures. This is because being a linear order is FO-definable.

Example 8.

The transduction sub of Example 4 is definable by the MSOSI $T=(k=2,\phi_{\textsf{dom}}=\top,\phi_{\mathsf{univ}}(X,Y),(\phi_{\sigma}(X,Y))_{\sigma\in\Sigma},\phi_{\leq}(X_{1},Y_{1},X_{2},Y_{2}))$ . The main idea is to let $X$ range over all possible subsets, and $Y$ range over all possible singletons $\{y\}$ such that $y\in X$ . Let ${\sf sing}(Y,y)=Y(y)\wedge\forall y^{\prime}(Y(y^{\prime})\rightarrow y^{\prime}=y)$ . It holds true iff $Y=\{y\}$ . Then, $\phi_{\mathsf{univ}}(X,Y)=\exists y({\sf sing}(Y,y)\wedge X(y))$ . The label of an output position $(X,\{y\})$ is the label of the input position $y$ , i.e. $\phi_{\sigma}(X,Y)=\exists y({\sf sing}(Y,y)\wedge\sigma(y))$ . Finally, any two output positions $(X_{1},\{y_{1}\})$ , $(X_{2},\{y_{2}\})$ are ordered lexicographically (with rightmost significant bit): $(X_{1},\{y_{1}\})$ precedes $(X_{2},\{y_{2}\})$ if either $X_{1}=X_{2}$ and $y_{1}\leq y_{2}$ , or $X_{1}\neq X_{2}$ and the largest mismatching position $x$ (i.e. the largest position in the symmetric difference of $X_{1}$ and $X_{2}$ ) is in $X_{2}$ . Those properties are easily expressible in MSO.
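The set interpretation of this example can be checked by brute force on short words. The Python sketch below is only an illustration: it enumerates the output universe explicitly instead of evaluating MSO formulas, and the function name is ours. Pairs $(X,\{y\})$ are compared with the rightmost input position as the most significant one, so that the result agrees with Example 4.

```python
from functools import cmp_to_key
from itertools import combinations


def sub_via_set_interpretation(u):
    positions = range(1, len(u) + 1)
    # Universe: pairs (X, y) where X is a set of positions and y belongs to X.
    universe = [(frozenset(X), y)
                for r in positions
                for X in combinations(positions, r)
                for y in X]

    def leq(p1, p2):
        (X1, y1), (X2, y2) = p1, p2
        if X1 == X2:
            return y1 <= y2
        # Rightmost position is most significant: the largest mismatch decides.
        return max(X1 ^ X2) in X2

    def compare(p1, p2):
        if p1 == p2:
            return 0
        return -1 if leq(p1, p2) else 1

    ordered = sorted(universe, key=cmp_to_key(compare))
    # The label of the output position (X, {y}) is the label of input position y.
    return "".join(u[y - 1] for (_, y) in ordered)
```

On the input `"abc"` this produces the concatenation of a, b, ab, c, ac, bc, abc, as in Example 4.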

MSO transductions, MSO and FO interpretations.

An MSO interpretation (MSOI) is an MSOSI whose free set variables are restricted to be singleton sets. This can be syntactically enforced in the universe formula $\phi_{\mathsf{univ}}$ , as being a singleton is an MSO definable property. Equivalently, MSOI are defined as MSOSI but instead the free variables are first-order. Note that transductions realized by MSOI have only polynomial growth. An FO interpretation (FOI) is an MSOI whose formulas are all FO-formulas. Finally an MSO transduction (MSOT) is, roughly speaking, an MSO interpretation of dimension $1$ (classically, one adds a bounded number of copies of the input to get the full class of MSOT). MSOT capture exactly the class of regular transductions [13, 1, 16, 23].

The following theorem is at the core of the theory of set interpretations, and automatic structures. It holds in all generality, and furthermore the compositions can be done by simple formula substitutions.

Theorem 9 ([11, Proposition 2.4]).

MSO set interpretations are effectively closed under pre-composition by MSOT and post-composition by FOI.

Exponential versus polynomial growth.

There is a dichotomy for the growth of set interpretations over words, between exponential and polynomial growth, which is deeply connected to the similar dichotomy for automata ambiguity. Moreover for polynomial growth transductions, the level of growth exactly coincides with the minimum dimension of an MSOSI defining the transduction. The result holds in the more general case of trees.

Theorem 10 ([20, Theorem 1.5],[6, Theorem 2.3]).

A set interpretation over words has growth either $2^{\Theta(n)}$ , or $\Theta(n^{k})$ for some $k\in\mathbb{N}$ , and this can be computed in PTime. In the latter case, one can compute an equivalent MSOI of dimension $k$ (to get this tight correspondence, one needs to allow a bounded number of copies of the input, see [20, Definition 4.3]).

Quite a lot is known about word-to-word set interpretations with polynomial growth, which are called polyregular transductions and enjoy many different characterizations [8, Theorem 7].

Regularity preserving.

An open question on word-to-word MSOSI is whether they are regularity preserving. This can actually be formulated in terms of closure properties.

Proposition 11.

The following are equivalent:

$\blacksquare$ Word-to-word MSOSI are regularity preserving,

$\blacksquare$ The class of word-to-word MSOSI is closed under post-composition with transductions computed by Mealy machines (see [24] for a definition of Mealy machines),

$\blacksquare$ The class of word-to-word MSOSI is closed under post-composition with polyregular transductions.

Automatic transductions

We describe an automata-based presentation of MSOSI, which we call automatic transductions. Algorithmically speaking, it is more amenable to efficient processing, as it is based on automata instead of MSO, and it makes the connection between automatic structures and set interpretations more obvious.

Definition 12.

An automatic transduction (AT for short) from $\Sigma^{*}$ to $\mathcal{T}$ -structures is described as a tuple $T=(\Sigma,B,A_{\textsf{dom}},A_{\mathsf{univ}},(A_{R})_{R\in\mathcal{T}})$ where:

$\blacksquare$ $B$ is a finite alphabet describing a work alphabet,

$\blacksquare$ $A_{\textsf{dom}}$ is an automaton over $\Sigma$ recognizing the domain of the transduction,

$\blacksquare$ $A_{\mathsf{univ}}$ is an automaton over $\Sigma\times B$ . Words accepted by $A_{\mathsf{univ}}$ are called configurations,

$\blacksquare$ for each $R\in\mathcal{T}$ of arity $r$ , $A_{R}$ is an automaton over $\Sigma\times B^{r}$ describing tuples of the relation $R$ .

Given a word $u\in\Sigma^{*}$ , the output $\mathcal{T}$ -structure $v=f_{T}(u)$ is defined, whenever $u\in L(A_{\textsf{dom}})$ , as follows: its universe is the set $V=\left\{x\in B^{*}\mid u\otimes x\in L(A_{\mathsf{univ}})\right\}$ ; a predicate symbol $R\in\mathcal{T}$ of arity $r$ is interpreted as $R^{v}=\left\{(x_{1},\ldots,x_{r})\in V^{r}\mid u\otimes x_{1}\otimes\ldots\otimes x_{r}\in L(A_{R})\right\}$ .
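The semantics above can be illustrated by a brute-force Python sketch for the word-to-word case. It is not an efficient evaluation procedure: the automata of the definition are replaced by Python predicates, the universe is enumerated exhaustively, and the domain check is omitted. All names (`evaluate_at`, `one_hot`, `REV_LABELS`, `rev_leq`) are hypothetical and only serve the example, which describes the reverse transduction with work alphabet $B=\{0,1\}$ .

```python
from itertools import product


def evaluate_at(u, work_alphabet, in_univ, labels, leq):
    """Brute-force output word of a word-to-word automatic transduction.

    in_univ(u, x), labels[s](u, x) and leq(u, x1, x2) are predicates standing
    in for the automata A_univ, A_s and A_<= (an assumption of this sketch).
    """
    universe = [x for x in product(work_alphabet, repeat=len(u)) if in_univ(u, x)]
    # leq is assumed to be a linear order on the universe, so the number of
    # elements below x (including x itself) gives the rank of x.
    ordered = sorted(universe, key=lambda x: sum(leq(u, y, x) for y in universe))
    return "".join(s for x in ordered for s in labels if labels[s](u, x))


# Hypothetical presentation of rev: a configuration marks exactly one position.
def one_hot(u, x):
    return sum(x) == 1


REV_LABELS = {s: (lambda u, x, s=s: u[x.index(1)] == s) for s in "ab"}


def rev_leq(u, x1, x2):
    # Output positions are ordered by decreasing input position.
    return x1.index(1) >= x2.index(1)
```

With these predicates, `evaluate_at("abb", (0, 1), one_hot, REV_LABELS, rev_leq)` returns `"bba"`.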

$\blacktriangleright$ Remark 13.

Automatic transductions are essentially identical to MSOSI, except restricted to input word structures, where one can leverage the classical equivalence between MSO and automata. They can be naturally generalized to work over input structures such as $\omega$ -words, trees and infinite trees, giving rise to the notions of $\omega$ -automatic, tree-automatic and $\omega$ -tree-automatic transductions. Note that an $\omega$ -automatic structure is precisely an $\omega$ -automatic transduction whose domain is a single infinite word $a^{\omega}$ .

Transduction/structure correspondence.

In [18], we show a correspondence between word-to-word automatic transductions and automatic $\omega$ -words (defined therein), thanks to which we obtain the following proposition.

Proposition 14.

If word-to-word MSOSI are effectively regularity preserving, then automatic $\omega$ -words have a decidable MSO theory. (One could actually prove the stronger implication that any automatic $\omega$ -word has a canonical presentation, as in [2].)

This entails that a positive answer to Conjecture 2 would provide a positive answer to Conjecture 1, which has been open since [2].

4 Lexicographic transductions

As explained in the previous section, we do not know whether MSOSI are regularity preserving, and as a consequence of Proposition 14, proving that they enjoy this property would prove a long-standing conjecture of the theory of automatic structures [2]. In this section, we introduce a subclass of MSOSI which enjoys this property, called lexicographic transductions.

Definition of lexicographic transductions

We first define this class in terms of closure of basic transductions, called simple transductions, under a lexicographic map operation. The connection with MSOSI is done at the end of this section (Subsection 4), via a corresponding subclass of automatic transductions.

Simple transductions.

A regular constant (partial) transduction of type $\Sigma^{*}\rightharpoonup\Gamma^{*}$ is an expression of the form $L{\triangleright}w$ , where $L$ is a regular language over $\Sigma$ and $w$ is a word in $\Gamma^{*}$ , such that for all $u\in\Sigma^{*}$ , $(L{\triangleright}w)(u)$ is defined only if $u\in L$ , by $(L{\triangleright}w)(u)=w$ . A simple transduction $f$ is a finite union of regular constant transductions whose codomain only contains words of length at most $1$ (it is a restriction of the known class of recognizable transductions to output words of length at most $1$ ). A simple transduction $f\colon\Sigma^{*}\rightharpoonup\Gamma^{*}$ is denoted by $f=\sum_{i=1}^{n}L_{i}{\triangleright}w_{i}$ such that $L_{1},\dots,L_{n}\subseteq\Sigma^{*}$ are pairwise disjoint regular languages, and $w_{1},\dots,w_{n}\in\Gamma^{\leq 1}$ .
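A simple transduction can be sketched in Python as a finite list of cases. In this sketch, regular expressions from the standard `re` module stand in for the regular languages $L_{i}$ , `None` encodes an undefined value, and disjointness of the languages is assumed rather than checked; the names are ours.

```python
import re


def simple_transduction(cases):
    """Build the sum of L_i |> w_i from pairs (regex for L_i, w_i) with |w_i| <= 1.

    The languages are assumed to be pairwise disjoint, as in the definition.
    """
    compiled = [(re.compile(pattern), w) for pattern, w in cases]
    if any(len(w) > 1 for _, w in compiled):
        raise ValueError("outputs must have length at most 1")

    def f(u):
        for language, w in compiled:
            if language.fullmatch(u):
                return w
        return None  # u lies outside every L_i, so f is undefined on u

    return f


# Hypothetical example over {a, b}: on words starting with a, output the last letter.
LAST_IF_STARTS_WITH_A = simple_transduction(
    [(r"a[ab]*a|a", "a"), (r"a[ab]*b", "b")]
)
```

The two languages of the example are disjoint because their words end with different letters.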

Lexicographic enumerators.

An ordered alphabet is a pair $\lambda=(B,\prec)$ such that $B$ is a finite set and $\prec$ is a linear order over $B$ . The order $\prec$ is extended lexicographically (using the same notation) to words of the same length over $B$ , with the most significant letter to the right: for all $n$ and all $u,v\in B^{n}$ , $u\prec v$ if there exists a position $i\leq n$ such that $u[i]\prec v[i]$ and for all $i<j\leq n$ , $u[j]=v[j]$ . Note that $\prec$ is a total order over $B^{n}$ , for all $n$ . We denote by $\textsf{succ}_{\lambda}:B^{*}\rightharpoonup B^{*}$ the successor function on $B^{*}$ induced by $\prec$ .

Recall that $|$ is a fixed separator symbol. The $\lambda$ -lexicographic enumerator is the function

\begin{array}[]{llllllll}\textsf{lex{-}enum}_{\lambda}\ :\ \bigcup_{\Sigma,\Gamma\text{ alphabets}}&\Sigma^{*}&\rightarrow&((\Sigma\times B)^{*}|)^{*}(\Sigma\times B)^{*}\\ &w&\mapsto&(w\otimes u_{1})|(w\otimes u_{2})|\dots|(w\otimes u_{k})\end{array}

where $|w|=|u_{1}|=\dots=|u_{k}|$ , $u_{1}$ is minimal for $\prec$ , $u_{k}$ is maximal for $\prec$ and for all $1\leq i<k$ , $u_{i+1}=\textsf{succ}_{\lambda}(u_{i})$ . Note that $k=|B|^{|w|}$ .

Example 15.

Let $\Sigma=\{a,b\}$ and let $\lambda=(B,\prec)$ be an ordered alphabet with $B=\{0,1\}$ and $0\prec 1$ . For all $\sigma\in\Sigma$ and $b\in B$ , we write $\begin{smallmatrix}\sigma\\ b\end{smallmatrix}$ instead of $(\sigma,b)$ and $\begin{smallmatrix}\Sigma\\ b\end{smallmatrix}$ to denote the set of pairs $(\sigma,b)$ for all $\sigma\in\Sigma$ . Then $\textsf{lex{-}enum}_{\lambda}(abb)=\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}|\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}|\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}|\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}|\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}|\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}|\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}|\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}$ .
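The enumeration of Example 15 can be reproduced with a few lines of Python. This is only an illustrative sketch: the names `lex_enum` and `show` are ours, words are represented as Python strings, and the ordered alphabet is assumed to be given as a list sorted from least to greatest.

```python
from itertools import product

def lex_enum(w, order):
    """Return the list of annotated words w ⊗ u, for u ranging over B^{|w|}.

    `order` lists the letters of B from least to greatest. The most
    significant position is the rightmost one, so the leftmost
    annotation changes fastest.
    """
    result = []
    # itertools.product varies its LAST coordinate fastest; reversing each
    # tuple makes the FIRST (leftmost) position the fastest one instead.
    for rev_u in product(order, repeat=len(w)):
        u = rev_u[::-1]
        result.append(list(zip(w, u)))
    return result

def show(annotated_words):
    """Render the enumeration with '|' as separator, e.g. 'a0b0|a1b0|...'."""
    return "|".join("".join(f"{s}{b}" for s, b in v) for v in annotated_words)

print(show(lex_enum("abb", [0, 1])))
```

On input `abb` the eight annotated words appear in the same order as in Example 15, and the empty word yields a single (empty) annotated word, in accordance with $k=|B|^{|w|}$ .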

MapLex combinator.

Let $\lambda=(B,\prec)$ be an ordered alphabet. We define the function

\begin{array}[]{llllll}\textsf{maplex}_{\lambda}&:&\bigcup_{\Sigma,\Gamma\text{ alphabets}}((\Sigma\times B)^{*}\rightarrow\Gamma^{*})\rightarrow\Sigma^{*}\rightarrow\Gamma^{*}\\ \end{array}

such that for all $\Sigma,\Gamma$ alphabets, all $f\colon(\Sigma\times B)^{*}\rightarrow\Gamma^{*}$ and $u\in\Sigma^{*}$ ,

\textsf{maplex}_{\lambda}\ f\ u\ =\ f(v_{1})f(v_{2})\dots f(v_{k})

where $\textsf{lex{-}enum}_{\lambda}(u)=v_{1}|v_{2}|\dots|v_{k}$ . Note that $u$ is in the domain of $\textsf{maplex}_{\lambda}\ f$ if and only if $v_{1},\ldots,v_{k}$ are all in the domain of $f$ . We write maplex when $\lambda$ is clear from the context.
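As a minimal sketch, maplex can be written as a higher-order function in Python. The conventions here are assumptions made for illustration: partial functions return `None` when undefined, output words are strings, and `first_bit` is a hypothetical argument function that is not taken from the paper.

```python
from itertools import product

def maplex(order, f):
    """Return the function maplex_λ f, where `order` lists B in increasing order.

    f maps an annotated word (a tuple of pairs) to a string, or to None
    when it is undefined on that annotated word.
    """
    def lifted(u):
        pieces = []
        # Reversed tuples make the leftmost position vary fastest, i.e. the
        # rightmost position is the most significant one.
        for rev in product(order, repeat=len(u)):
            v = tuple(zip(u, rev[::-1]))
            piece = f(v)
            if piece is None:        # some v_i lies outside dom(f)
                return None          # hence u lies outside dom(maplex f)
            pieces.append(piece)
        return "".join(pieces)
    return lifted

# Hypothetical example: write the annotation of the first letter,
# and be undefined as soon as a letter 'b' is annotated with 1.
def first_bit(v):
    if any(s == "b" and b == 1 for s, b in v):
        return None
    return str(v[0][1]) if v else ""

g = maplex([0, 1], first_bit)
print(g("aa"))   # annotations 00, 10, 01, 11
print(g("ab"))   # undefined: one annotation puts 1 on the letter b
```

The second call illustrates the remark on domains: a single annotated word outside the domain of the argument function makes the whole result undefined.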

Definition 16 (Lexicographic transductions).

Lexicographic transductions, denoted by Lex, are defined inductively: $\textsf{Lex}_{0}$ is the class of simple transductions, $\textsf{Lex}_{k+1}=\{\textsf{maplex}_{\lambda}\ f\mid f\in\textsf{Lex}_{k},\ \lambda\ \textup{ordered alphabet}\}$ for all $k\geq 0$ , and $\textsf{Lex}=\bigcup_{k\geq 0}\textsf{Lex}_{k}$ . Elements of $\textsf{Lex}_{k}$ are called $k$ -lexicographic transductions.

Lemma 17.

For all $f\in\textsf{Lex}$ , its domain $\textsf{dom}(f)$ is regular.

Proof.

Any Lex transduction $f\colon\Sigma^{*}\rightharpoonup\Gamma^{*}$ is equal to $\textsf{maplex}_{\lambda_{1}}\ (\textsf{maplex}_{\lambda_{2}}\dots\ (\textsf{maplex}_{\lambda_{k}}\ s)\dots)$ for some $k\geq 0$ , some ordered alphabets $(\lambda_{i}=(B_{i},\prec_{i}))_{i}$ and some simple transduction $s:(\Sigma\times B_{1}\times\dots\times B_{k})^{*}\rightharpoonup\Gamma^{*}$ . Then, $f$ is defined on $u\in\Sigma^{*}$ iff for all $1\leq i\leq k$ and all $b_{i}\in B_{i}^{|u|}$ , $s(u\otimes b_{1}\otimes\dots\otimes b_{k})$ is defined. Now, observe that $\textsf{dom}(f)$ is the complement of the $\Sigma$ -projection of the complement of $\textsf{dom}(s)$ . This entails the result as $\textsf{dom}(s)$ is regular and regular languages are closed under morphisms (and complement). $\hfill\blacktriangleleft$

$\blacktriangleright$ Remark 18.

A simple transduction $s\in\textsf{Lex}_{0}$ has growth $O(1)$ , and $\textsf{maplex}_{\lambda}\ s$ has growth $O(|B|^{n})$ , for $\lambda=(B,\prec)$ . One application of maplex can thus cause an exponential blowup. Note however that these exponentials do not compose into a tower: with extra applications of maplex, the bases multiply. Using the same notations as in the above proof, $f$ has growth $O(|B_{1}{\times}\cdots{\times}B_{k}|^{n})$ .
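The bound of Remark 18 can be spelled out as a short computation. We write $n=|u|$ and use only the facts that each $\textsf{maplex}_{\lambda_{i}}$ enumerates $|B_{i}|^{n}$ annotations of a word of length $n$ , that annotating does not change the length, and that $s$ outputs at most one letter per call:

```latex
% n = |u|; every call to the simple transduction s outputs at most one letter
|f(u)| \;\le\; \underbrace{|B_{1}|^{n}\cdots|B_{k}|^{n}}_{\text{number of calls to } s}\cdot 1
       \;=\; \bigl(|B_{1}|\cdots|B_{k}|\bigr)^{n}
       \;=\; |B_{1}\times\cdots\times B_{k}|^{n}.
```

In particular, the exponent stays $n$ regardless of $k$ ; only the base grows with the number of nested applications.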

Example 19 (Identity and Reverse).

Take $\lambda=(B,\prec)$ and $\lambda^{\prime}=(B,\prec^{\prime})$ with $B=\{0,1\}$ , $0\prec 1$ , and $1\prec^{\prime}0$ . For all $\sigma\in\Sigma$ , let $L_{\sigma}=\begin{pmatrix}\Sigma\\ 0\end{pmatrix}^{*}\begin{pmatrix}\sigma\\ 1\end{pmatrix}\begin{pmatrix}\Sigma\\ 0\end{pmatrix}^{*}$ and $L_{\epsilon}=(\Sigma\times B)^{*}\setminus(\bigcup_{\sigma}L_{\sigma})$ .

\begin{array}[]{rclcrcl}\textsf{id}&=&\textsf{maplex}_{\lambda}\ (L_{\epsilon}{\triangleright}\epsilon+\sum_{\sigma\in\Sigma}L_{\sigma}{\triangleright}\sigma)&&\textsf{rev}&=&\textsf{maplex}_{\lambda^{\prime}}\ (L_{\epsilon}{\triangleright}\epsilon+\sum_{\sigma\in\Sigma}L_{\sigma}{\triangleright}\sigma)\end{array}

This is illustrated below on input $a b c$ , with the output of the simple function below every word of the enumeration.

\begin{array}[]{ll}\textsf{id}&\underbrace{\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}c\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}c\\ 0\end{smallmatrix}}_{a}|\underbrace{\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}c\\ 0\end{smallmatrix}}_{b}|\underbrace{\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}c\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}c\\ 1\end{smallmatrix}}_{c}|\underbrace{\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}c\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}c\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}c\\ 1\end{smallmatrix}}_{\epsilon}\\ \textsf{rev}&\underbrace{\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}c\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}c\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}c\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}c\\ 1\end{smallmatrix}}_{c}|\underbrace{\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}c\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 
0\end{smallmatrix}\begin{smallmatrix}b\\ 1\end{smallmatrix}\begin{smallmatrix}c\\ 0\end{smallmatrix}}_{b}|\underbrace{\begin{smallmatrix}a\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}c\\ 0\end{smallmatrix}}_{a}|\underbrace{\begin{smallmatrix}a\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\end{smallmatrix}\begin{smallmatrix}c\\ 0\end{smallmatrix}}_{\epsilon}\end{array}
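Example 19 can be checked mechanically. The sketch below assumes words are Python strings and ordered alphabets are lists sorted increasingly; `maplex` and `pick_marked` are illustrative names. The same simple transduction is used twice, and only the order on $B$ changes.

```python
from itertools import product

def maplex(order, f):
    """maplex_λ f for total f; `order` lists B from least to greatest."""
    def lifted(u):
        # Leftmost annotation varies fastest (most significant letter on the right).
        return "".join(
            f(tuple(zip(u, rev[::-1]))) for rev in product(order, repeat=len(u))
        )
    return lifted

def pick_marked(v):
    """The simple transduction L_ε ▷ ε + Σ_σ L_σ ▷ σ of Example 19.

    Outputs σ when exactly one position is annotated by 1 and carries σ,
    and the empty word otherwise.
    """
    marked = [s for s, b in v if b == 1]
    return marked[0] if len(marked) == 1 else ""

identity = maplex([0, 1], pick_marked)   # order 0 ≺ 1
reverse = maplex([1, 0], pick_marked)    # order 1 ≺' 0

print(identity("abc"), reverse("abc"))
```

With $0\prec 1$ the annotations with a single $1$ are met from left to right, while with $1\prec^{\prime}0$ they are met from right to left, which is exactly what the illustration on input $abc$ shows.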

Example 20 (Morphisms).

Let $\phi_{a}:\{a,b\}^{*}\rightarrow a^{*}$ be the morphism defined by $\phi_{a}(a)=a$ and $\phi_{a}(b)=\epsilon$ . We have $\phi_{a}\in\textsf{Lex}_{1}$ . It suffices to take $B=\{0,1\}$ with $0\prec 1$ . Then, let $L_{a}=\begin{pmatrix}\Sigma\\ 0\end{pmatrix}^{*}\begin{pmatrix}a\\ 1\end{pmatrix}\begin{pmatrix}\Sigma\\ 0\end{pmatrix}^{*}$ , and $L_{\epsilon}=\overline{L_{a}}$ . Then $\phi_{a}\ =\ \textsf{maplex}\ (L_{a}{\triangleright}a+L_{\epsilon}{\triangleright}\epsilon).$ More generally, if $\psi:\Sigma^{*}\rightarrow\Gamma^{*}$ is an arbitrary morphism, we show that $\psi\in\textsf{Lex}_{1}$ . Note that $\psi$ may transform a single letter into several letters, while simple transductions output at most one letter. To overcome this difference, we consider a larger linearly ordered set. Let $M=\text{max}_{\sigma\in\Sigma}|\psi(\sigma)|$ . If $M=0$ , then $\psi$ is the constant transduction which outputs $\epsilon$ , so $\psi\in\textsf{Lex}_{0}$ . Otherwise, let $\lambda_{M}=(B_{M},<)$ with $B_{M}=\{0,1,\dots,M\}$ naturally ordered. Let $I:\Gamma\rightarrow 2^{\Sigma\times\mathbb{N}}$ be such that for all $\gamma\in\Gamma$ , $I(\gamma)$ is the set of pairs $(\sigma,i)$ such that $\psi(\sigma)[i]=\gamma$ . Note that for all $\gamma\in\Gamma$ , $I(\gamma)\subseteq\Sigma\times\{1,\dots,M\}$ . Define $L_{\gamma}$ as the set given by the regexp $\bigcup_{(\sigma,i)\in I(\gamma)}\begin{pmatrix}\Sigma\\ 0\end{pmatrix}^{*}\begin{pmatrix}\sigma\\ i\end{pmatrix}\begin{pmatrix}\Sigma\\ 0\end{pmatrix}^{*}$ and $L_{\epsilon}$ as the complement of the union of all $L_{\gamma}$ . Then $\psi\ =\ \textsf{maplex}_{\lambda_{M}}\ (L_{\epsilon}{\triangleright}\epsilon\ +\ \sum_{\gamma\in\Gamma}L_{\gamma}{\triangleright}\gamma).$
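The construction for an arbitrary morphism can be tested on small inputs. In this sketch the morphism is assumed to be given as a Python dictionary from letters to strings; `morphism_via_maplex` and `simple` are illustrative names, and the brute-force enumeration is exponential, so it is only meant for short words.

```python
from itertools import product

def maplex(order, f):
    """maplex_λ f for total f; `order` lists B from least to greatest."""
    def lifted(u):
        return "".join(
            f(tuple(zip(u, rev[::-1]))) for rev in product(order, repeat=len(u))
        )
    return lifted

def morphism_via_maplex(psi):
    """Build the morphism given by `psi` as maplex over B_M = {0, ..., M}."""
    M = max((len(image) for image in psi.values()), default=0)
    if M == 0:
        return lambda u: ""          # constant transduction, already in Lex_0

    def simple(v):
        # Annotated word in (Σ0)* (σ,i) (Σ0)*: output ψ(σ)[i] when it exists.
        nonzero = [(s, i) for s, i in v if i != 0]
        if len(nonzero) != 1:
            return ""
        s, i = nonzero[0]
        image = psi[s]
        return image[i - 1] if i <= len(image) else ""

    return maplex(list(range(M + 1)), simple)

psi = {"a": "xy", "b": "", "c": "z"}     # hypothetical morphism
f = morphism_via_maplex(psi)
print(f("acb"))
```

For a fixed position, the annotations $1,2,\dots,M$ at that position (and $0$ elsewhere) are enumerated consecutively, which is why the letters of $\psi(\sigma)$ come out in the right order.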

Example 20 can be generalized to sequential transductions.

Lemma 21.

$\textsf{Seq}\subseteq\textsf{Lex}_{1}$ .

Example 22 (Domain restriction).

Let $k\geq 0$ . Given $f\colon\Sigma^{*}\rightarrow\Gamma^{*}$ a transduction in $\textsf{Lex}_{k}$ and $L\subseteq\Sigma^{*}$ a regular language, the transduction $f_{|L}:u\mapsto f(u)$ if $u\in\textsf{dom}(f)\cap L$ is in $\textsf{Lex}_{k}$ . We show this inductively on $k$ : it is clear for $f\in\textsf{Lex}_{0}$ . Assume $f=\textsf{maplex}_{\lambda}\ g$ with $\lambda=(B,\prec)$ and let $\pi_{\Sigma}:(\Sigma\times B)^{*}\rightarrow\Sigma^{*}$ be the natural projection morphism. Then $f_{|L}=\textsf{maplex}_{\lambda}\ g_{|\pi_{\Sigma}^{{-}1}(L)}$ , which proves that $\textsf{Lex}_{k}$ is closed under domain restriction.

Example 23 (Subwords).

We show that $\textsf{sub}\in\textsf{Lex}_{2}$ (see Example 4 for the definition of sub). We take $\lambda=(B,<)$ , with $B=\{0,1\}$ and define the following morphism $\textsf{del}_{0}:(\Sigma\times B)^{*}\rightarrow\Sigma^{*}$ by $\textsf{del}_{0}(\sigma,0)=\epsilon$ and $\textsf{del}_{0}(\sigma,1)=\sigma$ . We can then show $\textsf{sub}\ =\ \textsf{maplex}_{\lambda}\ \textsf{del}_{0}.$ From Ex. 20, morphisms are in $\textsf{Lex}_{1}$ , so $\textsf{sub}\in\textsf{Lex}_{2}$ .
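The two-level structure of Example 23 can be made concrete as follows. This sketch computes $\textsf{maplex}_{\lambda}\ \textsf{del}_{0}$ in two ways: directly, and with $\textsf{del}_{0}$ itself expressed as maplex of a simple transduction (called `del0_simple` here, a name of ours). Words are Python strings and $B=\{0,1\}$ is the list `[0, 1]`.

```python
from itertools import product

def maplex(order, f):
    """maplex_λ f for total f; `order` lists B from least to greatest."""
    def lifted(u):
        return "".join(
            f(tuple(zip(u, rev[::-1]))) for rev in product(order, repeat=len(u))
        )
    return lifted

def del0(v):
    """The morphism del_0: erase letters annotated 0, keep those annotated 1."""
    return "".join(s for s, b in v if b == 1)

def del0_simple(v):
    """Simple transduction over ((σ, b), b'): when b' marks exactly one
    position, output its letter σ if b = 1, and ε in all other cases."""
    marked = [(s, b) for (s, b), b2 in v if b2 == 1]
    if len(marked) != 1:
        return ""
    s, b = marked[0]
    return s if b == 1 else ""

B = [0, 1]
sub_direct = maplex(B, del0)                      # maplex applied to a morphism
sub_nested = maplex(B, maplex(B, del0_simple))    # two nested maplex

print(sub_direct("ab"), sub_nested("ab"))
```

Both functions agree on the inputs we tried, which is consistent with the claim that $\textsf{sub}\in\textsf{Lex}_{2}$ .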

Example 24 (Square, illustrated on Fig. 2).

The transductions square and $\textsf{under}_{i}$ have been defined in Ex. 3. We show that $\textsf{square}\in\textsf{Lex}_{2}$ . Let $\lambda=(B,<)$ with $B=\{0,1\}$ and let $f:(\Sigma\times B)^{*}\rightarrow(\Sigma\cup\underline{\Sigma})^{*}$ such that for all $u\in\Sigma^{*}$ of length $n$ , for all $1\leq i\leq n$ , $f(u\otimes(0^{i{-}1}10^{n-i}))=\textsf{under}_{i}(u)$ , and for $b\not\in 0^{*}10^{*}$ , $f(u\otimes b)=\epsilon$ . It holds that $\textsf{square}=\textsf{maplex}_{\lambda}\ f$ , because $0^{i{-}1}10^{n-i}<0^{j{-}1}10^{n-j}$ for all $i<j$ . It remains to show that $f\in\textsf{Lex}_{1}$ . This holds because $f=\textsf{maplex}_{\lambda}\ g$ for $g:(\Sigma\times B^{2})^{*}\rightarrow(\Sigma\cup\underline{\Sigma})^{*}$ the following simple transduction: for all $u\otimes b_{1}\otimes b_{2}\in(\Sigma\times B^{2})^{*}$ , if $b_{1}\not\in 0^{*}10^{*}$ or $b_{2}\not\in 0^{*}10^{*}$ , $g(u\otimes b_{1}\otimes b_{2})=\epsilon$ , otherwise let $i$ be the unique position at which a $1$ occurs in $b_{1}$ and $j$ the unique position at which a $1$ occurs in $b_{2}$ . If $i=j$ , then $g(u\otimes b_{1}\otimes b_{2})=\underline{u[j]}$ , otherwise $g(u\otimes b_{1}\otimes b_{2})=u[j]$ . Since those properties are regular, $g$ is a simple transduction.

$\underbrace{\begin{smallmatrix}a\\ 0\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\\ 0\end{smallmatrix}}_{\underline{a}}|\underbrace{\begin{smallmatrix}a\\ 1\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 0\\ 1\end{smallmatrix}}_{b}|\underbrace{\begin{smallmatrix}a\\ 1\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 0\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 0\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\\ 0\end{smallmatrix}}_{a}|\underbrace{\begin{smallmatrix}a\\ 0\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\\ 1\end{smallmatrix}}_{\underline{b}}|\underbrace{\begin{smallmatrix}a\\ 0\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\\ 0\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\\ 0\end{smallmatrix}\begin{smallmatrix}b\\ 1\\ 1\end{smallmatrix}}_{\epsilon}|\underbrace{\begin{smallmatrix}a\\ 1\\ 1\end{smallmatrix}\begin{smallmatrix}b\\ 1\\ 1\end{smallmatrix}}_{\epsilon}$

Figure 2: Equality $\textsf{square}=(\textsf{maplex}_{\lambda}\ \textsf{maplex}_{\lambda}\ g)$ illustrated on input $ab$ , with the results of applying $g$ underneath.
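The computation of Figure 2 can be replayed in Python. This sketch makes two assumptions for the sake of illustration: an underlined letter $\underline{\sigma}$ is rendered as the uppercase letter, and input words use lowercase letters only. The function `g` follows the description of the simple transduction in Example 24.

```python
from itertools import product

def maplex(order, f):
    """maplex_λ f for total f; `order` lists B from least to greatest."""
    def lifted(u):
        return "".join(
            f(tuple(zip(u, rev[::-1]))) for rev in product(order, repeat=len(u))
        )
    return lifted

def g(v):
    """Simple transduction of Example 24 on words over (Σ × B) × B.

    Each letter of v has the shape ((σ, b1), b2). Uppercase stands for
    an underlined letter (a rendering choice of this sketch).
    """
    ones_b1 = [p for p, ((_, b1), _) in enumerate(v) if b1 == 1]
    ones_b2 = [p for p, (_, b2) in enumerate(v) if b2 == 1]
    if len(ones_b1) != 1 or len(ones_b2) != 1:
        return ""                      # b1 or b2 is not in 0*10*
    i, j = ones_b1[0], ones_b2[0]
    letter = v[j][0][0]                # u[j]
    return letter.upper() if i == j else letter

B = [0, 1]
square = maplex(B, maplex(B, g))

print(square("ab"))
```

On input `ab` the nonempty outputs of `g` appear in the order $\underline{a}$ , $b$ , $a$ , $\underline{b}$ , matching the labels under the braces in Figure 2.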

Presentation as automatic transductions

We give an alternative presentation in terms of automatic transductions. Let $k\geq 1$ be a positive integer, let $\overline{\lambda}=((B_{1},\prec_{1}),\ldots,(B_{k},\prec_{k}))$ be a $k$ -tuple of ordered alphabets and let $B=B_{1}\times\cdots\times B_{k}$ . We define the associated $k$ -lexicographic order for words of the same length over $B$ by $u\prec_{\overline{\lambda}}v$ if $u=u_{1}\otimes\cdots\otimes u_{k}$ , $v=v_{1}\otimes\cdots\otimes v_{k}$ , and there is $i\in\left\{1,\ldots,k\right\}$ such that $u_{i}\prec_{i}v_{i}$ and for all $j<i$ , $u_{j}=v_{j}$ .
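The $k$ -lexicographic order can be sketched as a comparison function. Here a word over $B=B_{1}\times\cdots\times B_{k}$ is assumed to be a list of $k$ -tuples, and each ordered alphabet is a list sorted increasingly; the names `lex_less` and `k_lex_less` are ours.

```python
def lex_less(order, u, v):
    """u ≺ v for same-length words over one ordered alphabet,
    with the most significant letter on the right."""
    rank = {b: r for r, b in enumerate(order)}
    # The rightmost position where u and v differ decides.
    for x, y in zip(reversed(u), reversed(v)):
        if x != y:
            return rank[x] < rank[y]
    return False                       # u == v

def k_lex_less(orders, u, v):
    """u ≺_λ̄ v for same-length words over B_1 × ... × B_k.

    The first component (in the order 1, ..., k) on which u and v
    differ decides, using the lexicographic order of that component.
    """
    for i, order in enumerate(orders):
        u_i = [letter[i] for letter in u]
        v_i = [letter[i] for letter in v]
        if u_i != v_i:
            return lex_less(order, u_i, v_i)
    return False                       # u == v

orders = [[0, 1], [0, 1]]
u = [(0, 1), (0, 0)]   # u_1 = 00, u_2 = 10
v = [(1, 0), (0, 0)]   # v_1 = 10, v_2 = 00
print(k_lex_less(orders, u, v))
```

The first component has priority over the second one, in the same way as the outermost maplex of a nested expression ranges over the slowest-changing annotation.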

Definition 25.

Let $\overline{\lambda}=((B_{1},\prec_{1}),\ldots,(B_{k},\prec_{k}))$ be a $k$ -tuple of ordered alphabets, let $B=B_{1}\times\cdots\times B_{k}$ and let $\prec_{\overline{\lambda}}$ be the associated $k$ -lexicographic order.

A $k$ -lexicographic automatic transducer over the alphabet $B$ is an automatic transducer with work alphabet $B$ such that the order is exactly $\prec_{\overline{\lambda}}$ . A transduction is said to be $k$ -lexicographic automatic if it can be defined by a $k$ -lexicographic automatic transducer. We denote by $\textsf{AT}_{k\text{-}\textsf{Lex}}$ the class of $k$ -lexicographic automatic transductions and by $\textsf{AT}_{\textsf{Lex}}$ the union of these, which we call lexicographic automatic transductions.

The next proposition is rather immediate.

Proposition 26 ( $\textsf{AT}_{k\text{-}\textsf{Lex}}=\textsf{Lex}_{k}$ ).

For all $k\geq 1$ , a transduction is $k$ -lexicographic iff it is $k$ -lexicographic automatic.

In [18], we make a connection between $k$ -lexicographic automatic transductions and $k$ -lexicographic automatic $\omega$ -words [2]. In [2, Theorem 6.1], the author shows that $k$ -lexicographic automatic $\omega$ -words form a strict hierarchy and provides explicit witnesses for each level of the hierarchy. As a consequence of Proposition 26, we obtain that $k$ -lexicographic transductions form a strict hierarchy, as stated by the following proposition.

Proposition 27.

For all $k\geq 1$ , $\textsf{Lex}_{k}\subsetneq\textsf{Lex}_{k+1}$ .

5 Nested marble transducers

We introduce in this section a transducer model, called nested marble transducers, and show that the class of transductions it recognizes is exactly the class of lexicographic transductions. Nested marble transducers generalize marble transducers [15, 12]. A marble transducer belongs to the family of transducers with an unbounded number of pebbles (of finitely many colours), with the following restriction: whenever it moves left, it has to drop a pebble, and whenever it moves right, it has to lift a pebble. The term marble is meant to emphasize this restriction. A nested marble transducer of level $k\geq 1$ behaves like a marble transducer which can call, when reading the leftmarker $\vdash$ , a nested marble transducer of level $k{-}1$ . A nested marble transducer of level $0$ is what we call a simple transducer. It is just a DFA with an output function on its accepting states, so it realizes a transduction whose range is finite.

Definition 28 (Simple transducers).

Let $\Sigma,\Gamma$ be finite sets (not necessarily disjoint). A $(\Sigma,\Gamma)$ -simple transducer is a pair $T=(A,\mu)$ where $A=(Q,q_{0},F,\delta)$ is a DFA over $\Sigma\cup\{\vdash,\dashv\}$ and $\mu:F\rightarrow\Gamma^{\leq 1}$ is a total function.

We define two semantics for $T$ , an operational semantics $f_{T}^{op}:Q\times\Sigma^{*}\rightharpoonup\Gamma^{*}\times F$ which takes as input a word and also a state from which the computation starts, and returns a word and the state reached when the computation ends, if it is accepting. Otherwise $f_{T}^{op}$ is not defined. Formally, $f_{T}^{op}(q,u)$ is defined for all $u$ such that $q\xrightarrow{{\vdash}u{\dashv}}_{A}q_{f}$ for some $q_{f}\in F$ , by $f_{T}^{op}(q,u)=(\mu(q_{f}),q_{f})$ .

From the operational semantics, we also define the transduction $f_{T}:\Sigma^{*}\rightharpoonup\Gamma^{*}$ recognized by $T$ by applying the operational semantics from the initial state, and projecting away the final state, i.e. $f_{T}(u)=\pi_{1}(f_{T}^{op}(q_{0},u))$ , where $\pi_{1}$ is the projection on the first component.
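Definition 28 translates almost literally into code. The following sketch assumes the DFA is given as a Python dictionary from (state, symbol) pairs to states, treats a missing transition as rejection (a convenience of this sketch, since a DFA is normally total), and encodes $\mu$ as a dictionary whose keys are the accepting states. The parity transducer at the end is a hypothetical example.

```python
LEFT, RIGHT = "⊢", "⊣"   # end markers

class SimpleTransducer:
    def __init__(self, initial, delta, mu):
        self.initial = initial   # q_0
        self.delta = delta       # dict: (state, symbol) -> state
        self.mu = mu             # dict: accepting state -> word of length <= 1

    def run_from(self, q, u):
        """Operational semantics f_T^op(q, u): (output, final state) or None."""
        for symbol in [LEFT, *u, RIGHT]:
            q = self.delta.get((q, symbol))
            if q is None:        # missing transition: reject
                return None
        if q not in self.mu:     # final state is not accepting
            return None
        return (self.mu[q], q)

    def __call__(self, u):
        """The transduction f_T: run from q_0 and forget the final state."""
        result = self.run_from(self.initial, u)
        return None if result is None else result[0]

# Hypothetical example: defined only on words with an even number of a's.
delta = {}
for q, other in (("even", "odd"), ("odd", "even")):
    delta[(q, LEFT)] = q
    delta[(q, RIGHT)] = q
    delta[(q, "a")] = other
    delta[(q, "b")] = q
T = SimpleTransducer("even", delta, {"even": "e"})

print(T("abab"), T("ab"), T.run_from("odd", "a"))
```

Since the output only depends on the accepting state that is reached, the range of such a transduction is finite, as stated in the introduction of this section.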

Definition 29 (Nested marble transducers from $\Sigma$ to $\Gamma$ ).

A $(0,\Sigma,\Gamma)$ -nested marble transducer is a $(\Sigma,\Gamma)$ -simple transducer. For $k\geq 1$ , a $(k,\Sigma,\Gamma)$ -nested marble transducer is a tuple $T=(\Sigma,\Gamma,C,c_{0},Q_{T},q_{0},F_{T},\delta,\delta_{\textsf{call}},\delta_{\textsf{ret}},\mu,T^{\prime})$ where:

$\blacksquare$

$C$ is a finite set of (marble) colors, $c_{0}$ is an initial color;
$\blacksquare$

$Q_{T}$ is a finite set of states, $q_{0}$ is an initial state, and $F_{T}$ a set of accepting states;
$\blacksquare$

$T^{\prime}$ is a $(k{-}1,\Sigma\times C,\Gamma)$ -nested marble transducer with set of states $Q_{T^{\prime}}$ and set of accepting states $F_{T^{\prime}}$ ;
$\blacksquare$

$\delta:Q_{T}\times(\Sigma\cup\{\dashv\})\times C\rightarrow(C\cup\{\perp\})\times Q_{T}$ is a transition function;
$\blacksquare$

$\delta_{\textsf{call}}\colon Q_{T}\times C\rightarrow Q_{T^{\prime}}$ is a call function; $\delta_{\textsf{ret}}:Q_{T}\times C\times F_{T^{\prime}}\rightarrow Q_{T}$ is a return function;
$\blacksquare$

$\mu\colon\textsf{dom}(\delta)\rightarrow\Gamma^{*}$ is an output function.

We use $(k,\Sigma,\Gamma)$ -NMT (or just $k$ -NMT if $\Sigma$ , $\Gamma$ are clear from the context) as a shortcut for $(k,\Sigma,\Gamma)$ -nested marble transducer. $T^{\prime}$ is the assistant NMT and $k$ the level of $T$ . Finally, we often say marble instead of marble colour.

We now define the semantics informally. The reading head of $T$ is initially placed on the rightmost position, labeled $\dashv$ and marked with a marble of color $c_{0}$ , in state $q_{0}$ . Transitions work as follows: suppose the current state is $q$ and the reading head is on some position $i$ labeled by $\sigma\in\Sigma\cup\{\dashv,\vdash\}$ and by some marble of color $c\in C$ . Whenever a transition of $\delta$ is applied, some output word is produced by $T$ according to $\mu$ . Then there are three cases:

1.

if $\sigma\in\Sigma\cup\{\dashv\}$ and $\delta(q,\sigma,c)=(c^{\prime},q^{\prime})$ where $c^{\prime}\in C$ , then the reading head moves to position $(i{-}1)$ in state $q^{\prime}$ and a marble of color $c^{\prime}$ is placed (on position $i{-}1$ );
2.

if $\sigma\in\Sigma$ and $\delta(q,\sigma,c)=(\perp,q^{\prime})$ , then $T$ lifts the current marble and moves its reading head to position $i+1$ in state $q^{\prime}$ ;
3.

if $\sigma=\ \vdash$ then $T$ calls $T^{\prime}$ initialized with state $\delta_{\textsf{call}}(q,c)$ , on the input word annotated with marbles. When $T^{\prime}$ finishes its computation in some accepting state $q^{\prime}$ , $T$ lifts marble $c$ , moves its reading head to position $1$ and continues its computation from state $\delta_{\textsf{ret}}(q,c,q^{\prime})$ .

The (operational) semantics of $T$ is a function $f_{T}^{op}:Q_{T}\times\Sigma^{*}\rightarrow\Gamma^{*}\times F_{T}$ , that we define inductively. The case $k=0$ has been defined after Definition 28. If $k\geq 1$ and $T=(\Sigma,\Gamma,C,c_{0},Q_{T},q_{0},F_{T},\delta,\delta_{\textsf{call}},\delta_{\textsf{ret}},\mu,T^{\prime})$ then we assume $f_{T^{\prime}}^{op}:Q_{T^{\prime}}\times(\Sigma\times C)^{*}\rightarrow\Gamma^{*}\times F_{T^{\prime}}$ to be defined inductively. Let us now define $f_{T}^{op}$ . A configuration of $T$ over a word $u\in\Sigma^{*}$ is a triple $(q,i,v)$ such that $q$ is the current state, $i\in\text{Pos}(u)\cup\{0,n+1\}$ is the current position (where $n=|u|$ ), and $v\in C^{*}$ is an annotation of the suffix $({\vdash}u{\dashv})[i{:}n{+}1]$ . We define a labeled successor relation $(q,i,cv)\xrightarrow{w}_{T}(q^{\prime},i^{\prime},v^{\prime})$ , between any two configurations where $c\in C$ , labeled by $w\in\Gamma^{*}$ , whenever one of the following cases holds:

1.

$1\leq i\leq n+1$ , $\delta(q,\sigma,c)=(c^{\prime},q^{\prime})$ where $\sigma=({\vdash}u{\dashv})[i]$ , $i^{\prime}=i{-}1$ , $v^{\prime}=c^{\prime}cv$ and $w=\mu(q,\sigma,c)$ ;
2.

$1\leq i\leq n$ , $\delta(q,\sigma,c)=(\perp,q^{\prime})$ where $\sigma=({\vdash}u{\dashv})[i]$ , $i^{\prime}=i+1$ , $v^{\prime}=v$ and $w=\mu(q,\sigma,c)$ ;
3.

$i=0$ , $f_{T^{\prime}}^{op}(\delta_{\textsf{call}}(q,c),({\vdash}u{\dashv})\otimes cv)=(w,p)$ , $q^{\prime}=\delta_{\textsf{ret}}(q,c,p)$ , $i^{\prime}=1$ and $v^{\prime}=v$ .

The function $f_{T}^{op}:Q_{T}\times\Sigma^{*}\rightharpoonup\Gamma^{*}\times F_{T}$ recognized by $T$ is defined, for all $q\in Q_{T}$ and all $u\in\Sigma^{*}$ such that there exists a sequence of configurations over $u$ : $\nu_{0}=(q,n+1,c_{0})\xrightarrow{w_{1}}_{T}\nu_{1}\xrightarrow{w_{2}}_{T}\nu_{2}\dots\nu_{k{-}1}\xrightarrow{w_{k}}_{T}\nu_{k}$ where the state $q_{f}$ of $\nu_{k}$ is accepting (i.e. in $F_{T}$ ) and the states of configurations $\nu_{i}$ , $i<k$ , are non-accepting, by $f_{T}^{op}(q,u)=(w_{1}\dots w_{k},q_{f})$ .

The transduction $f_{T}:\Sigma^{*}\rightharpoonup\Gamma^{*}$ recognized by $T$ is defined as the projection of $f_{T}^{op}(q_{0},u)$ on its first component, i.e. if $f_{T}^{op}(q_{0},u)=(v,q_{f})$ then $f_{T}(u)=v$ . We denote by NMT the class of transductions recognizable by some $(k,\Sigma,\Gamma){-}\textsf{NMT}$ . The local size of an NMT is the number of its transitions, states and marbles. Its size is its local size plus the size of the lower-level NMT it calls. We similarly define the number of (resp. local number of) states/marbles/transitions.

Example 30.

We describe a $2$ -NMT $T_{2}$ computing the transduction sub of Example 23. We recall that $\textsf{sub}=\textsf{maplex}_{\lambda}\ \textsf{del}_{0}$ where $\lambda=(B=\{0,1\},<)$ and $\textsf{del}_{0}$ is a morphism. The transducer $T_{2}$ behaves as a marble transducer which computes the $\lambda$ -lexicographic enumerator $\textsf{lex{-}enum}_{\lambda}$ , and whenever a full annotation of its input has been computed, it calls a $1$ -NMT $T_{1}$ which computes $\textsf{del}_{0}$ . Let us explain how $T_{2}$ computes the next annotation in lexicographic order (which corresponds to binary addition with the most significant bit to the right). When the reading head of $T_{2}$ is on the left marker, all input positions are marked with some pebble in $B=\{0,1\}$ . Then, $T_{2}$ scans its input to the right (lifting all the pebbles it sees) until it reads a $0$ , replaces it by $1$ and moves again to the left marker, dropping pebble $0$ all the way back to the left marker. Only three states are needed. Initially, $T_{2}$ drops pebble $0$ on all positions, from right to left. We explain now how $T_{1}$ works: it scans its input from right to left, and, whenever it reads an input $(\sigma,b)$ with $b=1$ , outputs $\sigma$ . Whenever $T_{1}$ reaches the left marker, it calls a simple transducer $T_{0}$ which does nothing but output $\epsilon$ .
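The way $T_{2}$ steps from one annotation to the next can be simulated with a list of marbles. This is a simplified sketch rather than a faithful NMT interpreter: states and end markers are left implicit, and the morphism $\textsf{del}_{0}$ is applied directly instead of calling $T_{1}$ . In this sketch the marbles dropped on the way back to the left marker are $0$ , which is what binary increment with the least significant bit on the left requires.

```python
def enumerate_annotations(n):
    """Yield all annotations in {0,1}^n in lexicographic order
    (most significant position on the right), marble-style."""
    marbles = [0] * n            # initial pass: drop marble 0 everywhere
    yield tuple(marbles)
    while True:
        i = 0
        # Move right from the left marker, lifting marbles, until a 0 is read.
        while i < n and marbles[i] == 1:
            marbles[i] = None    # lifted
            i += 1
        if i == n:               # only 1's were seen: the enumeration is over
            return
        marbles[i] = 1           # replace the 0 by 1
        # Move back to the left marker, dropping marble 0 on the way.
        for j in range(i - 1, -1, -1):
            marbles[j] = 0
        yield tuple(marbles)

def sub_by_marbles(u):
    """Concatenate del_0(u ⊗ b) over all annotations b, in enumeration order."""
    return "".join(
        "".join(s for s, b in zip(u, annotation) if b == 1)
        for annotation in enumerate_annotations(len(u))
    )

print(list(enumerate_annotations(2)), sub_by_marbles("ab"))
```

The stack discipline is visible in the code: positions to the left of the flipped one are first lifted, then dropped again, while positions to its right are never touched.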

The following result states that NMT are closed under post-composition with Seq. To prove it, we strongly rely on the ability to pass state information through mappings $\delta_{\textsf{call}}$ and $\delta_{\textsf{ret}}$ to adapt a classical product construction of automata.

Lemma 31 ( $\textsf{Seq}\circ\textsf{NMT}\subseteq\textsf{NMT}$ ).

For all $k\geq 0$ , every $(k,\Sigma,\Gamma)$ -NMT $T$ and every sequential transducer $S$ over $\Gamma,\Lambda$ , one can construct, in polynomial time, a $(\text{max}(k,1),\Sigma,\Lambda)$ -NMT $T^{\prime}$ such that $f_{T^{\prime}}=f_{S}\circ f_{T}$ .

State-passing free nested marble transducers.

In the definition of NMT, there are two explicit forms of information-passing: state information can be passed from level $k$ to level $k{-}1$ through the function $\delta_{\textsf{call}}$ , and state information can be passed from level $k{-}1$ to level $k$ via the function $\delta_{\textsf{ret}}$ . In addition, there is an implicit one through the domain of assistant transducers: indeed, the definition of the semantics requires that all calls to assistant transducers do accept, hence the assistant transducer can influence the master transducer by rejecting a word. In this subsection, we prove that information-passing can be removed while preserving the computational power of $k$ -NMT, however at the cost of increasing the size by a tower of exponentials of height $k$ . While state-passing was useful to prove the closure under post-composition with sequential transductions (Lemma 31), it will be more convenient to consider state-passing free nested marble transducers in the sequel, in particular to prove that NMT recognize lexicographic transductions (Subsection 5).

Definition 32.

A state-passing free $(k,\Sigma,\Gamma)$ -nested marble transducer ( $(k,\Sigma,\Gamma)$ - $\textsf{NMT}_{\textsf{spf}}$ for short), is either a simple transducer if $k=0$ , or, if $k>0$ , a $(k,\Sigma,\Gamma)$ -NMT $T=(\Sigma,\Gamma,C,c_{0},Q_{T},q_{0},F,\delta,\delta_{\textsf{call}},\delta_{\textsf{ret}},\mu,T^{\prime})$ such that

1.

$T^{\prime}$ is a $(k{-}1,\Sigma\times C,\Gamma)$ - $\textsf{NMT}_{\textsf{spf}}$ with set of states $Q_{T^{\prime}}$ and initial state $q^{\prime}_{0}$
2.

$\delta_{\textsf{call}}(q,c)=q^{\prime}_{0}$ for all $q\in Q_{T}$ and $c\in C$
3.

$\delta_{\textsf{ret}}(q,c,q^{\prime})=q$ for all $q\in Q_{T},c\in C$ and $q^{\prime}\in Q_{T^{\prime}}$
4.

calls to the assistant transducer $T^{\prime}$ always accept.

Since the functions $\delta_{\textsf{call}}$ and $\delta_{\textsf{ret}}$ play no role, we often omit them in the tuple denoting $T$ .

Theorem 33 (State-passing removal).

For all $k$ -NMT $T$ , there exists an equivalent $k$ - $\textsf{NMT}_{\textsf{spf}}$ $T^{\prime}$ whose size is $k{-}\textsc{EXP}$ in that of $T$ . This non-elementary blow-up is unavoidable.

Before proving this result, we show a property on domains of NMT. A nested marble automaton $A$ of level $k$ is a nested marble transducer $T$ of level $k$ whose output function $\mu$ is the constant function that always returns $\epsilon$ . The language of $A$ is defined as $L(A)=\textsf{dom}(f_{T})$ .

Lemma 34.

A language is recognizable by a nested marble automaton of level $k$ and $n$ states iff it is recognizable by a finite automaton of size in $k$ -EXP( $n$ ). This non-elementary blow-up is unavoidable.

Proof sketch.

It is clear that any regular language is the domain of some simple transduction. Conversely, let $A$ be a nested marble automaton of level $k$ . If $k=0$ , it is obvious. If $k\geq 1$ , let $A=(\Sigma,C,c_{0},Q_{A},q_{0},F,\delta,\delta_{\textsf{call}},\delta_{\textsf{ret}},A^{\prime})$ where $A^{\prime}$ has level $k{-}1$ (in the tuple, we have omitted the output alphabet and function, since they play no role). By induction hypothesis, for all pairs $(q_{c},q_{r})$ of states of $A^{\prime}$ , the language of words accepted by $A^{\prime}$ by a run starting in $q_{c}$ and ending in $q_{r}$ is regular, and can be described by some finite automaton $D_{q_{c},q_{r}}$ of size in $(k{-}1)$ -EXP( $n$ ).

We turn $A$ into a marble automaton $B$ (of level $1$ ) such that $L(B)=L(A)$ . The marbles of $B$ are enriched with information on the states of automata $D_{q_{c},q_{r}}$ , for all pairs $(q_{c},q_{r})$ , ensuring that $B$ knows the state of all the automata $D_{q_{c},q_{r}}$ after reading the current suffix. $B$ exploits this information to simulate $A$ : whenever $A$ calls $A^{\prime}$ with some initial state $q_{c}$ , $B$ instead knows, if it exists, the state $q_{r}$ of $A^{\prime}$ such that the current marked input is accepted by $D_{q_{c},q_{r}}$ . If such a state exists, then it is unique as $A^{\prime}$ is deterministic, and $B$ can bypass calling $A^{\prime}$ and directly apply its return transition. Otherwise, $B$ stops.

The result follows as marble automata are known to recognize regular languages. The conversion of a marble automaton into a finite automaton is exponential both in the number of states and number of marbles (see e.g. [15, 12], as well as Theorem 5.4 of [22] for a detailed construction), yielding inductively a tower of exponentials of height $k$ .

It can be shown that this non-elementary blowup is unavoidable, because first-order sentences on word structures (with one successor) with quantifier rank $r$ can be converted into an exponentially bigger nested marble automaton, while it is known that converting such sentences into equivalent finite automata unavoidably yields a size that is a tower of exponentials of height $r$ [19]. $\hfill\blacktriangleleft$

Corollary 35.

Transductions recognized by nested marble transducers have regular domains.

We are now ready to sketch the proof of Theorem 33.

Proof sketch of Theorem 33.

There are two kinds of state-passing, through functions $\delta_{\textsf{call}}$ and $\delta_{\textsf{ret}}$ . We deal with them separately. First, removal of $\delta_{\textsf{call}}$ can be done by enriching marbles with the current state, so as to pass this information to the assistant transducer. Removal of $\delta_{\textsf{ret}}$ is more involved, but can be done by induction using a technique similar to the one used to prove Lemma 34. By induction hypothesis, the assistant transducer can be replaced by an equivalent state-passing free NMT. In addition, its domain is regular thanks to Lemma 34. Hence, one can enrich the marbles so as to precompute the final state reached by the assistant transducer, and in turn simulate the function $\delta_{\textsf{ret}}$ . This also ensures that all calls to the assistant transducer do accept.

Finally, we justify the fact that the non-elementary blow-up is unavoidable. The reason is that the domain of any state-passing free NMT $S$ is recognizable by a finite automaton of exponential size. Indeed, the calls to assistant transducers always terminate, so the domain of $S$ does not depend on the assistant transducers; hence it can be described by a marble automaton, and therefore by a finite automaton of size exponential in the number of local states and marbles of $S$. Thus, the existence of an elementary construction for state-passing removal would contradict the non-elementary blow-up stated in Lemma 34. $\hfill\blacktriangleleft$

Equivalence with lexicographic transductions.

In this subsection, we prove (Theorems 36 and 37) that a transduction is recognizable by a $k$-nested marble transducer iff it is $k$-lexicographic. Consider some $f\in\textsf{Lex}_{k}$. Then $f=\textsf{maplex}_{\lambda_{1}}(\textsf{maplex}_{\lambda_{2}}(\dots(\textsf{maplex}_{\lambda_{k}}\ s)\dots))$ for some $\lambda_{i}=(B_{i},\prec_{i})$ and $s$ a simple transduction. We call $B_{1},\dots,B_{k}$ the ordered alphabets of $f$ and $s$ the simple transduction of $f$. Given an ordered alphabet $(B,\prec)$, one can enumerate the annotations of a word according to the lexicographic extension using a marble automaton.
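The enumeration can be illustrated by a small Python sketch. It assumes that the leftmost position is the most significant one in the lexicographic extension, and that maplex concatenates the images of the successive annotations. The names `lex_enum`, `maplex` and `keep_marked` are ours, and `keep_marked` is only a toy stand-in for a simple transduction.

```python
from itertools import product

def lex_enum(u, ordered_alphabet):
    """Yield every annotation of u by letters of B, in lexicographic order.

    `ordered_alphabet` lists B in increasing order. itertools.product varies
    the last coordinate fastest, so the leftmost position is the most
    significant one (an assumption of this sketch).
    """
    for labels in product(ordered_alphabet, repeat=len(u)):
        yield list(zip(u, labels))

def maplex(f, ordered_alphabet):
    """Return the function u -> f(v_1) f(v_2) ... over the annotations v_i of u."""
    def g(u):
        return "".join(f(v) for v in lex_enum(u, ordered_alphabet))
    return g

def keep_marked(annotated):
    """Toy simple transduction: keep the letters annotated with 1."""
    return "".join(a for a, b in annotated if b == 1)

subwords = maplex(keep_marked, [0, 1])
print(subwords("ab"))  # baab, i.e. the subwords "", "b", "a", "ab" in enumeration order
```

On an input of length $n$ the enumeration ranges over $|B|^{n}$ annotations, which is where the exponential growth comes from.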

Theorem 36 ( $\textsf{Lex}\subseteq\textsf{NMT}$ ).

Any transduction $f\in\textsf{Lex}_{k}$ is recognizable by some NMT $T_{f}$ of level $k$ . If $B_{1},\dots,B_{k}$ are the ordered alphabets of $f$ and $s$ its simple transduction, represented by a sequential transducer with $m$ states, then $T_{f}$ has $O(k+m)$ states and $\sum_{i=1}^{k}|B_{i}|$ marbles.

Conversely, we prove that the transductions recognized by $\textsf{NMT}_{\textsf{spf}}$ are lexicographic.

Theorem 37 ( $\textsf{NMT}_{\textsf{spf}}\subseteq\textsf{Lex}$ ).

Any transduction $f$ recognizable by some $\textsf{NMT}_{\textsf{spf}}$ $T$ of level $k$ is $k$ -lexicographic.

The proof is rather involved. We provide high-level arguments. The result is shown by induction on $k$. It is trivial for $k=0$. For $k>0$, the main idea is to see the sequence of successive configurations of $T$ on some input as a lexicographic enumeration. This is possible due to the stack discipline of marbles. By extending the marble alphabet with sufficient information, we define a total order on marbles such that successive configurations of $T$, extended with this information, form a lexicographically increasing chain.

6 Expressiveness and closure properties of lexicographic transductions

In this section, we prove that Lex contains all the polyregular transductions [5], and all the transductions recognizable by (copyful) streaming string transducers [17, 1]. We also show that Lex is closed under post-composition by any polyregular transduction, and under pre-composition by any rational transduction. We start by showing that lexicographic transductions preserve regular languages under inverse image.

Proposition 38.

Transductions in Lex are regularity preserving.

Proof.

It is an immediate corollary of the inclusion $\textsf{Lex}\subseteq\textsf{NMT}$ (Theorem 36), of the fact that NMT are closed under post-composition by sequential transductions (Lemma 31), and of the fact that NMT have regular domains (Corollary 35). $\hfill\blacktriangleleft$

We show that Lex subsumes both SST and PolyReg. More precisely, we show that any transduction recognizable by a (copyful) streaming string transducer (SST) is $1$-lexicographic. We do not give the definition of SST and refer the reader to [12] for more details. We also show that NMT of level $k$ subsume $k$-pebble transducers. Again, we do not give precise definitions of pebble transducers and refer the reader to [5].

Theorem 39 (SST and PolyReg in Lex).

The following hold:

$\blacksquare$ $\textsf{SST}=\textsf{Lex}_{1}$,

$\blacksquare$ A transduction definable by a $k$-pebble transducer is in $\textsf{Lex}_{k}$. In particular $\textsf{PolyReg}\subseteq\textsf{Lex}$.

Proof.

It is already known that marble transducers (i.e. nested marble transducers of level $1$) capture exactly the class of SST-recognizable transductions [12]. The first statement then follows from Theorem 36 and Theorem 37. For the second statement, we note that a single pebble can be simulated by one level of marbles with colors $\left\{0,1\right\}$, with the restriction that at most one marble can have color $1$ per level. $\hfill\blacktriangleleft$
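The restriction on colors can be made concrete with a short Python sketch. It lists the colorings of $n$ positions by $\{0,1\}$ with at most one $1$; the function name and the choice of the leftmost position as the most significant one are assumptions of the sketch.

```python
from itertools import product

def pebble_colorings(n):
    """Colorings of n positions by {0, 1} with at most one 1, in lexicographic order."""
    return [c for c in product((0, 1), repeat=n) if sum(c) <= 1]

print(pebble_colorings(3))
# [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
```

There are $n+1$ such colorings, one without any $1$ and one per possible position of the $1$, so a level restricted in this way ranges over linearly many annotations rather than $2^{n}$.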

Any polyregular transduction can be expressed as a composition of sequential transductions, square, map and rev [5]. We show that Lex is closed by post-composition by these transductions (e.g. for sequential functions it has been shown in Lemma 31). As a consequence we obtain that Lex is closed under post-composition by polyregular transductions.
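For intuition, here is a Python sketch of the three non-sequential building blocks. It assumes the usual definitions: square outputs one copy of the input per position, with that position marked, and map applies a function to each block between separators. Marking a letter by upper-casing it and using `|` as the separator are conventions of the sketch only.

```python
def rev(u: str) -> str:
    """Reverse of u."""
    return u[::-1]

def square(u: str) -> str:
    """Concatenate |u| copies of u, the i-th copy having position i marked.

    Marking is rendered by upper-casing, so u is assumed to be lowercase.
    """
    return "".join(u[:i] + u[i].upper() + u[i + 1:] for i in range(len(u)))

def map_blocks(f, u: str, sep: str = "|") -> str:
    """Apply f to each block of u delimited by sep, keeping the separators."""
    return sep.join(f(block) for block in u.split(sep))

print(square("abc"))              # AbcaBcabC
print(map_blocks(rev, "ab|cde"))  # ba|edc
print(rev(square("ab")))          # BabA
```

The output of `square` has length $n^{2}$ on an input of length $n$, so composing these blocks yields polynomial growth.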

Theorem 40.

$\textsf{PolyReg}\circ\textsf{Lex}\subseteq\textsf{Lex}$ .

Finally, we show that lexicographic transductions are closed under pre-composition by any rational transduction.

Theorem 41 ( $\textsf{Lex}\circ\textsf{Rat}\subseteq\textsf{Lex}$ ).

Let $k\geq 1$ . For any rational transduction $g:\Gamma^{*}\rightharpoonup\Lambda^{*}$ and any $k$ -lexicographic transduction $f\colon\Lambda^{*}\rightharpoonup\Gamma^{*}$ , $f\circ g$ is $k$ -lexicographic.

Proof.

A rational transduction can be decomposed as a letter-to-letter rational transduction, followed by a morphism. Example 20 shows that morphisms are lexicographic. Similar ideas apply inductively to show that Lex is closed by pre-composition under morphisms and letter-to-letter rational transductions. $\hfill\blacktriangleleft$

7 Discussion

We have introduced lexicographic transductions, a subclass of MSOSI, and provided three characterizations: in terms of closure of simple functions by the maplex operator, as lexicographic automatic transductions that can be seen as a syntactic restriction of MSOSI, and through nested marble transducers. Thanks to these characterizations, this class is shown to subsume both PolyReg and SST. Moreover, it is actually closed under post-composition by PolyReg and thus is regularity preserving, which MSOSI is not known to be.

Open questions on MSOSI.

We leave open whether MSOSI is regularity preserving. A way to attack the problem is to try to obtain an equivalent automata-based model, as stated in Question 1. However, as we have shown, a positive answer would entail that all automatic $\omega$ -words have a decidable MSO theory, which has been open for almost 20 years.

Open questions on Lex.

We have shown that Lex is a rather well-behaved class, but some interesting questions remain open. Although we suspect that MSOSI strictly subsumes Lex, this remains unproven. For example, we could not show that Lex is closed under pre-composition with Reg; in particular, it is already unclear whether $\textsf{Lex}\circ\textsf{rev}\subseteq\textsf{Lex}$ holds.

The equivalence problem is central to transducer theory, but its decidability status is still unknown for PolyReg transductions. It is also open for Lex, which subsumes PolyReg.

One can decide whether a Lex transduction is in PolyReg (Theorem 10); however, we do not know whether the Lex hierarchy is decidable (already for its first level $\textsf{Lex}_{1}$).

Possible extensions of Lex.

While Lex has proven to be an interesting class, the zoo of word-to-word transductions with exponential growth is relatively unknown. We propose three possible extensions of Lex, in increasing expressiveness, all included in MSOSI.

The class $\textsf{Lex}\circ\textsf{Reg}$ may be an interesting class in itself and the first level $\textsf{Lex}_{1}\circ\textsf{Reg}$ coincides with the rather natural class of two-way streaming string transducers.

A more general way of extending Lex is to generalize the operation lex-enum to allow lexicographic orders where the significance of letters is given by an arbitrary MSO definable order. This class subsumes $\textsf{Lex}\circ\textsf{Reg}$; however, it is not clear that it is regularity preserving.

Another possible generalization is to replace marbles with the so-called invisible pebbles of Engelfriet et al. [14] and define nested invisible pebble transducers, where the nested levels can see the pebbles of the previous levels but not the ones of their own. We believe that the state-passing free version can be shown to be still included in MSOSI, but it is not clear that it is regularity preserving. However, we conjecture that the version with state-passing can recognize non-regular languages, and hence is too expressive.

References

[1] Rajeev Alur and Pavol Cerný. Expressiveness of streaming string transducers. In Kamal Lodaya and Meena Mahajan, editors, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2010, December 15-18, 2010, Chennai, India, volume 8 of LIPIcs, pages 1–12. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2010. doi:10.4230/LIPICS.FSTTCS.2010.1.
[2] Vince Bárány. A hierarchy of automatic $\omega$ -words having a decidable MSO theory. RAIRO Theor. Informatics Appl., 42(3):417–450, 2008. doi:10.1051/ITA:2008008.
[3] Jean Berstel, Luc Boasson, Olivier Carton, Bruno Petazzoni, and Jean-Eric Pin. Operations preserving regular languages. Theor. Comput. Sci., 354(3):405–420, 2006. doi:10.1016/J.TCS.2005.11.034.
[4] Valérie Berthé, Toghrul Karimov, Joris Nieuwveld, Joël Ouaknine, Mihir Vahanwala, and James Worrell. The monadic theory of toric words. Theor. Comput. Sci., 1025:114959, 2025. doi:10.1016/J.TCS.2024.114959.
[5] Mikolaj Bojańczyk. Polyregular functions. CoRR, abs/1810.08760, 2018. arXiv:1810.08760.
[6] Mikolaj Bojańczyk. On the growth rates of polyregular functions. In 38th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2023, Boston, MA, USA, June 26-29, 2023, pages 1–13. IEEE, 2023. doi:10.1109/LICS56636.2023.10175808.
[7] Mikolaj Bojańczyk, Laure Daviaud, and Shankara Narayanan Krishna. Regular and first-order list functions. In Anuj Dawar and Erich Grädel, editors, Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, Oxford, UK, July 09-12, 2018, pages 125–134. ACM, 2018. doi:10.1145/3209108.3209163.
[8] Mikolaj Bojańczyk, Sandra Kiefer, and Nathan Lhote. String-to-string interpretations with polynomial-size output. In Christel Baier, Ioannis Chatzigiannakis, Paola Flocchini, and Stefano Leonardi, editors, 46th International Colloquium on Automata, Languages, and Programming, ICALP 2019, July 9-12, 2019, Patras, Greece, volume 132 of LIPIcs, pages 106:1–106:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.ICALP.2019.106.
[9] Mikolaj Bojańczyk and Rafal Stefanski. Single-use automata and transducers for infinite alphabets. In Artur Czumaj, Anuj Dawar, and Emanuela Merelli, editors, 47th International Colloquium on Automata, Languages, and Programming, ICALP 2020, July 8-11, 2020, Saarbrücken, Germany (Virtual Conference), volume 168 of LIPIcs, pages 113:1–113:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.ICALP.2020.113.
[10] Michaël Cadilhac, Olivier Carton, and Charles Paperman. Continuity of functional transducers: A profinite study of rational functions. Log. Methods Comput. Sci., 16(1), 2020. doi:10.23638/LMCS-16(1:24)2020.
[11] Thomas Colcombet and Christof Löding. Transforming structures by set interpretations. Log. Methods Comput. Sci., 3(2), 2007. doi:10.2168/LMCS-3(2:4)2007.
[12] Gaëtan Douéneau-Tabot, Emmanuel Filiot, and Paul Gastin. Register transducers are marble transducers. In Javier Esparza and Daniel Král’, editors, 45th International Symposium on Mathematical Foundations of Computer Science, MFCS 2020, August 24-28, 2020, Prague, Czech Republic, volume 170 of LIPIcs, pages 29:1–29:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.MFCS.2020.29.
[13] Joost Engelfriet and Hendrik Jan Hoogeboom. Two-way finite state transducers and monadic second-order logic. In Jirí Wiedermann, Peter van Emde Boas, and Mogens Nielsen, editors, Automata, Languages and Programming, 26th International Colloquium, ICALP’99, Prague, Czech Republic, July 11-15, 1999, Proceedings, volume 1644 of Lecture Notes in Computer Science, pages 311–320. Springer, 1999. doi:10.1007/3-540-48523-6_28.
[14] Joost Engelfriet, Hendrik Jan Hoogeboom, and Bart Samwel. XML navigation and transformation by tree-walking automata and transducers with visible and invisible pebbles. Theor. Comput. Sci., 850:40–97, 2021. doi:10.1016/J.TCS.2020.10.030.
[15] Joost Engelfriet, Hendrik Jan Hoogeboom, and Jan-Pascal van Best. Trips on trees. Acta Cybern., 14(1):51–64, 1999. URL: https://cyber.bibl.u-szeged.hu/index.php/actcybern/article/view/3510.
[16] Emmanuel Filiot and Pierre-Alain Reynier. Transducers, logic and algebra for functions of finite words. ACM SIGLOG News, 3(3):4–19, 2016. doi:10.1145/2984450.2984453.
[17] Emmanuel Filiot and Pierre-Alain Reynier. Copyful streaming string transducers. Fundam. Informaticae, 178(1-2):59–76, 2021. doi:10.3233/FI-2021-1998.
[18] Emmanuel Filiot, Pierre-Alain Reynier, and Nathan Lhote. Lexicographic transductions of finite words. CoRR, abs/2503.01746, 2025. doi:10.48550/arXiv.2503.01746.
[19] Markus Frick and Martin Grohe. The complexity of first-order and monadic second-order logic revisited. Ann. Pure Appl. Log., 130(1-3):3–31, 2004. doi:10.1016/J.APAL.2004.01.007.
[20] Paul Gallot, Nathan Lhote, and Lê Thành Dũng Nguyên. The structure of polynomial growth for tree automata/transducers and MSO set queries, 2025. doi:10.48550/arXiv.2501.10270.
[21] Tero Harju and Juhani Karhumäki. Finite transducers and rational transductions. In Jean-Éric Pin, editor, Handbook of Automata Theory, pages 79–111. European Mathematical Society Publishing House, Zürich, Switzerland, 2021. doi:10.4171/AUTOMATA-1/3.
[22] Tsutomu Kamimura and Giora Slutzki. Parallel and two-way automata on directed ordered acyclic graphs. Inf. Control., 49(1):10–51, 1981. doi:10.1016/S0019-9958(81)90438-1.
[23] Anca Muscholl and Gabriele Puppis. The many facets of string transducers (invited talk). In Rolf Niedermeier and Christophe Paul, editors, 36th International Symposium on Theoretical Aspects of Computer Science, STACS 2019, March 13-16, 2019, Berlin, Germany, volume 126 of LIPIcs, pages 2:1–2:21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.STACS.2019.2.
[24] Jacques Sakarovitch. Elements of Automata Theory. Cambridge University Press, USA, 2009.

