Abstract 1 Introduction 2 Related Work 3 Background 4 Cascades of automata 5 Expressiveness results 6 Efficient closure properties of cascades of reset automata 7 Conclusions and Future Work References

On Cascades of Reset Automata

Roberto Borelli ORCID University of Udine, Italy Luca Geatti ORCID University of Udine, Italy Marco Montali ORCID Free University of Bozen-Bolzano, Italy Angelo Montanari ORCID University of Udine, Italy
Abstract

The Krohn-Rhodes decomposition theorem is a pivotal result in automata theory. It introduces the concept of cascade product, where two semiautomata, that is, automata devoid of initial and final states, are combined in a feed-forward fashion. The theorem states that any semiautomaton can be decomposed into a sequence of permutation-reset semiautomata. For the counter-free case, this decomposition consists entirely of reset components with two states each. This decomposition has significantly impacted recent research in various areas of computer science, including the identification of a class of transformer encoders equivalent to star-free languages and the conversion of Linear Temporal Logic formulas into past-only expressions (pastification).

The paper revisits the cascade product in the context of reset automata, thus considering each component of the cascade as a language acceptor. First, we give regular expression counterparts of cascades of reset automata. We then establish several expressiveness results, identifying hierarchies of languages based on the restriction of the height (number of components) of the cascade or of the number of states in each level. We also show that any cascade of reset automata can be transformed, with a quadratic increase in height, into a cascade that only includes two-state components. Finally, we show that some fundamental operations on cascades, like intersection, union, negation, and concatenation with a symbol to the left, can be directly and efficiently computed by adding a two-state component.

Keywords and phrases:
Automata, Cascade products, Regular expressions, Krohn-Rhodes theory
Copyright and License:
[Uncaptioned image] © Roberto Borelli, Luca Geatti, Marco Montali, and Angelo Montanari; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Regular languages
; Theory of computation Automata extensions
Acknowledgements:
The authors would like to thank Alessio Mansutti for his valuable comments during the preparation of this paper.
Funding:
Luca Geatti and Angelo Montanari acknowledge the support from the 2024 Italian INdAM-GNCS project “Certificazione, monitoraggio, ed interpretabilità in sistemi di intelligenza artificiale”, ref. no. CUP E53C23001670001 and the support from the Interconnected Nord-Est Innovation Ecosystem (iNEST), which received funding from the European Union Next-GenerationEU (PIANO NAZIONALE DI RIPRESA E RESILIENZA (PNRR) – MISSIONE 4 COMPONENTE 2, INVESTIMENTO 1.5 -– D.D. 1058 23/06/2022, ECS00000043). In addition, Marco Montali and Angelo Montanari acknowledges the support from the MUR PNRR project FAIR - Future AI Research (PE00000013) also funded by the European Union Next-GenerationEU.
Editors:
Olaf Beyersdorff, Michał Pilipczuk, Elaine Pimentel, and Nguyễn Kim Thắng

1 Introduction

The Krohn-Rhodes decomposition theorem is a fundamental result both in automata theory and in semigroup algebra [12]. It relies on the concept of cascade product of two semiautomata, i.e., automata devoid of initial and final states, and thus, ultimately, edge-labeled graphs. In this setup, the first semiautomaton operates on an alphabet Σ, while the second one reads symbols belonging to the Cartesian product of Σ and the set of states of the first semiautomaton. The key feature of the cascade product, which extends the notion of direct product, is that the second semiautomaton transitions from state s to state s by reading the pair (σ,q) if and only if the input symbol is σ and the first semiautomaton is in state q.

The Krohn-Rhodes theorem states that any semiautomaton can be decomposed into a cascade (i.e., a sequence of cascade products) of permutation-reset semiautomata.111Formally, the decomposition is guaranteed to preserve a homomorphism from the cascade to the initial semiautomaton.  In such semiautomata, each symbol of the alphabet induces a function on the set of states that is either a permutation, i.e., a bijective function, or a reset, that is, there is a specific state to which all other states are mapped into when reading that symbol. Crucially, if the semiautomaton is counter-free, that is, it does not contain non-trivial cycles [18], the Krohn-Rhodes theorem guarantees the existence of a decomposition that consists of reset automata only, i.e., automata where all symbols induce reset functions (as described above) or the identity function.

The Krohn-Rhodes theorem, in particular the decomposition of counter-free automata into reset automata, had a significant impact on some meaningful problems of current research in computer science. A notable example comes from Angluin et al. [1], who employ the Krohn-Rhodes decomposition theorem to prove that Linear Temporal Logic (𝖫𝖳𝖫 [19]) is equivalent to transformer encoders with hard attention and strict future masking (see also [13]). Specifically, they show how reset semiautomata can be encoded in B-RASP, a minimal programming language that compiles into transformers. Similarly, studies such as [21, 10, 11] utilized this theorem to analyze the sample complexity of cascades and the expressiveness of Recurrent Neural Networks without circular dependencies. Another example is provided by Maler [14, 15], who used the decomposition theorem to transform any formula of 𝖫𝖳𝖫, interpreted over finite words, into an equivalent formula using only past operators (see also [20]), a problem now known as pastification [2].

In this paper, we revisit the cascade product in the reset automata setting, i.e., language acceptors whose underlying semiautomaton is a reset. We address various expressiveness issues for cascade products by themselves and in relation to regular expressions. These results represent a necessary step towards a more efficient exploitation of Krohn-Rhodes decomposition in pastification, with the ultimate goal of lowering its current, triply exponential upper bound, which is far away from the know, singly exponential lower bound.

The paper consists of three main parts. In the first part, we address the question: given a cascade of reset automata, which is its corresponding regular expression? We begin by focusing on cascades of height 1, proving that the language corresponding to a reset automaton over the alphabet Σ is always of the form J(ΣRI), for some I,RΣ, such that IR= and either J=I or J=. Then, we extend the analysis to cascades of reset automata of arbitrary height. As a first step, we show that the last level can always be transformed into a two-state automaton, and then, by exploiting such a result, we derive the regular expression corresponding to a generic cascade of reset automata.

In the second part, we build on the previously obtained results and establish several expressiveness results about cascades of reset automata. We structure the analysis into three types of cascades:

  1. (i)

    short cascades (whose height is bounded by 2),

  2. (ii)

    narrow cascades (where each component has two states, but there is not a height limitation), and

  3. (iii)

    general cascades (with no limitations on the height or on the number of states per level).

As for short cascades, we prove that any language over an alphabet Σ of cardinality k that is definable by a reset cascade of height 2 can also be defined by one where the first component has at most k+1 states. Additionally, we show that increasing the number of states in the first component results in a strict increase in expressiveness: there exists a family of languages which are definable by a two-reset cascade whose first component has n states, but are not definable if the first component is restricted to n1 states. Similarly, for narrow cascades, we show that increasing the height results in an increase in expressiveness. These two results – the increase in the number of states in the first component for short cascades and the increase in height for narrow cascades – lead to two hierarchies (with infinitely many levels) of languages that are not definable at previous levels. Finally, we show that any general cascade can be transformed into one whose components have all 2 states, with an increase in height of at most a quadratic factor (relative to the height of the original cascade).

In the last part, we deal with closure properties of the languages recognized by reset cascades, and show that some fundamental operations can be computed in an efficient way. More precisely, we prove that the operations of intersection, union, negation, and concatenation with Σ to the left (“next operation”) all require the addition of one component with 2 states only.

The paper is structured as follows. Related work is discussed in Section 2. In Section 3, we provide some background knowledge. In Section 4, we introduce the cascades of reset automata, we state some basic results about them, and we provide a characterization of the languages that they recognize in terms of regular expressions. In Section 5, we present some expressiveness results for short, narrow, and general cascades of reset automata. Section 6 focus on closure properties and the efficient computation of some basic operations. Finally, Section 7 provides an assessment of the work done, and it outlines some directions for future research.

2 Related Work

The Krohn-Rhodes theorem and the cascade product turn out be quite useful in understanding the structure and the expressiveness of finite-state systems, in particular in the context of automata and neural networks, and their connection to logic. Various recent contributions have leveraged this foundational theory to explore the expressiveness, modularity, and learning potential of automata in such a context.

A pivotal contribution in this area is the work by Maler on the cascade decomposition of semiautomata [14, 15]. It revisits the Eilenberg’s variant of the Krohn-Rhodes theorem [6] and offers a constructive proof that any semiautomaton can be decomposed into a cascade of elementary (permutation and reset) semiautomata. The paper introduces the holonomy tree as a data structure to represent cascade decompositions and an algorithm to build such a tree. Crucially, the algorithm carefully maps the permutations of the obtained cascade product to non-trivial cycles of the starting semiautomaton: this guarantees that, whenever the starting semiautomaton is counter-free, that is, devoid of non-trivial cycles, the generated cascade decomposition only consists of reset components. An exponential bound on the size of the cascade decomposition in terms of the size of the starting semiautomaton is given. This algorithm can be used to actually translate counter-free automata to temporal logic. More precisely, Maler shows how to translate any cascade product of reset semiautomata into a pure past 𝖫𝖳𝖫 formula, that is, a formula featuring only past temporal modalities. Together with the transformation of the future fragment of 𝖫𝖳𝖫, interpreted over finite words, into counter-free automata, this leads to a triply exponential upper bound to the problem of transforming pure-future 𝖫𝖳𝖫 over finite words into pure-past 𝖫𝖳𝖫 (pastification problem). Equivalently, in the case of 𝖫𝖳𝖫 interpreted over infinite words, Maler shows how to use the proposed algorithm to normalize every 𝖫𝖳𝖫 formula by mapping it into one belonging to the Reactivity class [16], at a cost of a triply exponential blowup. For both problems, that is, pastification and normalization, the best known lower bounds are singly exponential [3, 17].

The Krohn-Rhodes theorem has also been applied to analyze the complexity of semigroups, as shown in [9]. This study examines semigroups of upper triangular matrices over finite fields and establishes that the Krohn-Rhodes complexity of these semigroups corresponds to n1, where n is the matrix dimension. These results underline the deep connection between the algebraic structure of semigroups and their matrix representations, providing a measure of how intricate the cascade product representation needs to be for such semigroups.

In [8], the Krohn-Rhodes theorem is used to characterize piecewise testable and commutative languages. The authors define biased reset semiautomata, where the current state changes at most once, and characterize cascades 𝒜, where is a biased reset semiautomaton. Theorem 4.12 in Section 4 can be seen as a simplification and a generalization (to cascades of unbounded height) of such a characterization. Finally, the authors propose the notion of scope of a cascade, which is used to analyze the dot-depth of star-free languages.

In [21], Ronca builds on the Krohn-Rhodes theorem, proposing automata cascades as a structured and modular framework to describe complex systems. The resulting framework allows automata to be decomposed into components with specific functionalities, enabling fine-grained control of their expressiveness. By focusing on component-based decomposition, the study demonstrates that the sample complexity of learning automata cascades is linear in the number of components and their individual complexities, up to logarithmic factors. This contrasts with traditional state-centric perspectives, where sample complexity scales with the number of states, often limiting the feasibility of learning large systems. The relationships between the cascade product and neural networks are investigated in [10]. Recurrent Neural Cascades (RNCs) are a class of networks with acyclic connections, which naturally align with the cascade product of automata. By exploiting the Krohn-Rhodes theorem, the authors prove that RNCs capture star-free regular languages.

The Krohn-Rhodes theorem also underpins the exploration of transformer models in [13]. While transformers lack recurrence, the paper demonstrates that their layered architecture can simulate the cascade decomposition of finite automata. Leveraging Krohn-Rhodes theory, the authors show that shallow transformers can hierarchically approximate automata computations, enabling polynomial-sized and constant-depth shortcuts for specific automata.

In [1], Angluin et al. draws direct parallels between the expressive power of masked hard-attention transformers and star-free regular languages. These models, constrained by strict future masking, are shown to be equivalent to 𝖫𝖳𝖫 and counter-free automata – both closely tied to the Krohn-Rhodes cascade framework. The study underscores how the structured limitations of these transformers, akin to a cascade decomposition, yield expressive yet computationally efficient models.

Together, these contributions extend the applicability of the Krohn-Rhodes theory to neural networks, transformers, and beyond, demonstrating the versatiliy of the cascade framework as a powerful principle in computation.

3 Background

A semiautomaton 𝒜 is a tuple (Σ,Q,δ) such that:

  1. (i)

    Σ is a (finite) alphabet;

  2. (ii)

    Q is a set of states;

  3. (iii)

    δ:Q×ΣQ is a transition function.

An automaton 𝒜=(Σ,Q,δ,q0,F) is a semiautomaton extended with an initial state q0Q and a set FQ of final states. With δ we denote the Kleene’s closure of δ. We say that 𝒜 is two-state iff |Q|=2.

Given an automaton 𝒜=(Σ,Q,δ,q0,F) and a (finite) word σσ0,,σnΣ, the run τQ+ induced by σ is a sequence q0,q1,,qn+1 such that δ(qi,σi)=qi+1, for all 0in. We say that τ is accepting iff qn+1F. A word σΣ is accepted by 𝒜 iff the run induced by σ is accepting. We define the language of 𝒜, denoted by (𝒜), as the set of accepted words. Given a state qQ, let q(𝒜) be the set of words inducing a run τq0,,qm with qm=q. The classic direct product of automata is defined as follows.

Definition 3.1 (Direct product of automata).

Let 𝒜=(Σ,Q,δ,q0,F) and 𝒜=(Σ,Q,δ,q0, F) be two automata. The direct product of 𝒜 and 𝒜, denoted by 𝒜×𝒜, is the automaton (Σ,Q×Q,δ′′,(q0,q0),F×F) such that, for all (q,q)Q×Q and for all aΣ, it holds that δ′′((q,q),a)=(δ(q,a),δ(q,a)).

The cascade product of semiautomata is defined as follows.

Definition 3.2 (Cascade Product of semiautomata [15, 22]).

Let Σ be a finite alphabet and let 𝒜=(Σ,Q,δ) and 𝒜=(Σ×Q,Q,δ) be two semiautomata over the alphabets Σ and Σ×Q, respectively. We define the cascade product between 𝒜 and 𝒜, denoted with 𝒜𝒜, as the semiautomaton (Σ,Q×Q,δ′′) such that, for all (q,q)Q×Q and for all aΣ:

δ′′((q,q),a)=(δ(q,a),δ(q,(a,q))) (2)

We will often simply use “cascade” for “cascade product”.

It is worth noticing that the cascade product of semiautomata is a generalization of the classic direct product: the latter can be recovered by imposing the alphabet of the second semiautomaton to be Σ (i.e., the alphabet of the first one) instead of Σ×Q.

The cascade product is an associative operation, meaning that (𝒜𝒜)𝒜′′ is the same semiautomaton as 𝒜(𝒜𝒜′′). We define the height of the product 𝒜1𝒜n as n.

We now introduce two classes of semiautomata, reset and permutation, depending on the form of their transitions. We first define the notion of function induced by a symbol.

Definition 3.3 (Function induced by a symbol).

Let 𝒜=(Σ,Q,δ) be a semiautomaton. For each symbol aΣ, we define the function induced by a in 𝒜, denoted by τa𝒜 (or simply τa when 𝒜 is clear from the context), as the transformation τa:QQ such that, for all qQ, it holds τa(q)=q iff δ(q,a)=q.

Reset and permutation functions are defined as follows.

Definition 3.4 (Reset and permutation functions).

Let τ:QQ. We say that τ is a reset function iff there exists qQ such that τ(q)=q, for all qQ. In this case, we say that τ is a reset on q. If τ:QQ is a bijection, then it is called a permutation.

On the basis of the functions induced by the symbols of their alphabet, we define the following classes of semiautomata.

Definition 3.5 (Classes of semiautomata).

Let 𝒜=(Σ,Q,δ) be a semiautomaton. We say that 𝒜 is:

  • a permutation-reset semiautomaton iff, for each aΣ, τa is either a permutation or a reset.

  • a permutation semiautomaton iff, for each aΣ, τa is a permutation;

  • a reset semiautomaton iff, for each aΣ, τa is either the identity function or a reset function;

  • a pure-reset semiautomaton iff, for each aΣ, τa is a reset function.

We now introduce counter-free semiautomata [18]. Let σΣ. From now on, we denote by (σ)i the word generated by concatenating i times the word σ to itself. A word σΣ, with σε, defines a nontrivial cycle in a semiautomaton 𝒜=(Σ,Q,δ) if there exists a state qQ such that:

  1. (i)

    δ(q,σ)q

  2. (ii)

    δ(q,(σ)i)=q, for some i>1.

We say that a semiautomaton 𝒜 is counter-free if there are no words that define a nontrivial cycle. Counter-free automata recognize exactly the set of languages definable by star-free regular expressions, i.e., expressions devoid of Kleene’star. We denote this set by 𝒮.

A fundamental result in the field is the Krohn-Rhodes Cascade Decomposition Theorem. The theorem’s initial formulation was expressed in the context of semigroups [12], and its automata-theoretic counterpart [14] can be articulated as follows.

Theorem 3.6 (The Krohn-Rhodes Cascade Decomposition Theorem [12, 14]).

For each semiautomaton 𝒜=(Σ,Q,δ), there exists a cascade product of semiautomata 𝒞𝒜1𝒜2𝒜n such that:

  1. (i)

    𝒜i is a permutation-reset semiautomaton, for each 1in;

  2. (ii)

    there is an homomorphism222We refer to [12, 14] for a formal definition of homomorphism between semiautomata. from 𝒞 to 𝒜;

  3. (iii)

    if 𝒜 is counter-free, then 𝒜i is a two-state reset semiautomaton, for each 1in.

4 Cascades of automata

In this section, we begin our study of the languages recognized by cascades of automata. We start by formally defining them and stating some basic properties. Then, we focus on cascades of reset automata, and provide a characterization of the languages they recognize in terms of regular expressions.

4.1 Definitions and basic properties

To begin with, we generalize the notion of cascade product of semiautomata (Definition 3.2) to automata.

Definition 4.1 (Cascade product of automata).

Let Σ be a finite alphabet and let 𝒜=(Σ,Q,δ,q0,F) and 𝒜=(Σ×Q,Q,δ,q0,F) be two automata over the alphabets Σ and Σ×Q, respectively. We define the cascade product of 𝒜 and 𝒜, denoted by 𝒜𝒜, as the automaton (Σ,Q×Q,δ′′,(q0,q0),F×F) where δ′′ is defined as in Definition 3.2.

We say that a language is definable by a cascade 𝒞 iff =(𝒞)Figure 1 shows the cascade product of two reset automata defining the language aΣ.

Figure 1: The reset automaton 𝒜1 with set of states Q={q0,q1} over the alphabet Σ={a,b} (left). The reset automaton 𝒜2 over the alphabet Σ×Q (middle). The cascade product 𝒜1𝒜2 over the alphabet Σ that recognizes the languages aΣ (right).

In the following, we will use the term cascade to refer both to the component automata and to the resulting automaton.

We now show how to compute the language recognized by a cascade of automata on the basis of the languages recognized by its components. Let Σ1 and Σ2 be two alphabets. Let σ1=σ11σn1(Σ1)n and σ2=σ12σn2(Σ2)n be two words of length n. We define 𝖺𝗎𝗀(σ1,σ2)(Σ1×Σ2)n as the word (σ11,σ12)(σn1,σn2).

Definition 4.2 (Language of at a state s over 𝒜).

Let 𝒜=Σ,Q,δA,q0,FA and =Σ×Q,S,δB,s0,FB be two automata. The language of at state sS over 𝒜, denoted by s()[𝒜], is defined as follows: the empty word only belongs to s0()[𝒜]; a word σ=σ1σk, with k1, belongs to s()[𝒜] if

  1. (i)

    σ1σk1 induces a run τ=q0,q1,qk1 on 𝒜; and

  2. (ii)

    𝖺𝗎𝗀(σ,τ)s().

The language of a cascade can be computed from those of its components as follows. Let 𝒞=𝒜 be a cascade. The words forcing 𝒞 to reach a state (q,s) are exactly those words such that:

  1. (i)

    they force 𝒜 to reach state q; and

  2. (ii)

    they force to reach state s, when augmented with the run of 𝒜.

Proposition 4.3 (Language of a cascade in terms of its components).

Let 𝒜=Σ,Q,δA,q0, FA and =Σ×Q,S,δB,s0,FB be two automata. It holds that:

  1. 1.

    (q,s)(𝒜)=q(𝒜)s()[𝒜], for all states qQ and sS;

  2. 2.

    (𝒜)=(q,s)F(q,s)(𝒜).

We now show that, as it happens with semiautomata, the direct product of automata is just a special case of the cascade product. To this end, we first define the notion of augmentation of an automaton.

Definition 4.4 (Augmentation).

Let 𝒜=Σ,Q,δ,q0,F and 𝒜=(Σ,Q,δ,q0,F) be two automata such that either Σ=Σ or Σ=Σ×S, for an arbitrary finite set S. We define the augmentation of 𝒜 relative to 𝒜, denoted by 𝖺𝗎𝗀(𝒜,𝒜), as the automaton (Σ′′,Q,δ′′,q0,F) such that:

  • if Σ=Σ, then Σ′′Σ×Q and for all qQ and all aΣ, δ′′(q,(a,))=δ(q,a);

  • if Σ=Σ×S, then Σ′′Σ×Q×S and, for all qQ and for all (a,s)Σ×S, it holds that δ′′(q,(a,,s))=δ(q,(a,s)).

Given a cascade 𝒞=𝒜1𝒜n over Σ, we define the augmentation of 𝒞 relative to 𝒜, denoted by 𝖺𝗎𝗀(𝒜,𝒞), as the cascade 𝖺𝗎𝗀(𝒜,𝒜1)𝖺𝗎𝗀(𝒜,𝒜n).

The notion of augmentation can be generalized to a pair of cascades 𝒞 and 𝒞 by treating 𝒞 as a single automaton: from now on, when we will refer to the cascade product of 𝒞 and 𝒞, we will interpret it as the cascade product of the automaton 𝒜 generated by 𝒞 and 𝒞.

The next proposition shows that direct product can be simulated by means of augmentation and cascade product.

Proposition 4.5 (Direct product by means of cascade product).

Let 𝒜=Σ,Q,δ,q0,F be an automaton and let 𝒞 be a cascade over Σ. It holds that 𝒜𝖺𝗎𝗀(𝒜,𝒞)=𝒜×𝒞.

Furthermore, augmenting an automaton does not affect its property of being reset (or permutation), as stated by the following  Proposition 4.6.

Proposition 4.6.

Let Σ be a finite alphabet and let 𝒜 be an automaton over Σ or Σ×S, for an arbitrary finite set S. If 𝒜 is a reset (resp., permutation) automaton, then, for any automaton 𝒜 over Σ, 𝖺𝗎𝗀(𝒜,𝒜) is a reset (resp., permutation) automaton.

It follows that, in particular, if 𝒞 is a cascade of reset (resp., permutation) automata, then 𝖺𝗎𝗀(𝒜,𝒞) is a cascade of reset (resp., permutation) automata. From Propositions 4.5 and 4.6, it follows directly that, given two cascades 𝒞 and 𝒞 of height m and n of reset (resp., permutation) automata, there exists a cascade of height m+n of reset (resp., permutation) automata for (𝒞)(𝒞). In Section 6, we will show how to compute other basic operations on cascades of resets.

4.2 Languages of cascades of resets

In this part, we characterize the language recognized by a cascade of reset automata in terms of regular expressions. We begin with the case of cascades of height 1 and then we move to cascades of unbounded height.

4.2.1 Cascades of height 1

The study of which regular expressions characterize height-1 cascades of resets coincides with the study of the languages recognized by reset automata. The following theorem gives a characterization of reset automata in terms of regular expressions.

Theorem 4.7 (The languages of reset automata).

Let Σ be a finite alphabet. A language Σ is recognized by a reset automaton if and only if =J(ΣRI) for some I,RΣ such that IR= and either J=I or J=.

Intuitively, an automaton reading a symbol that induces a reset function on a final state is forced to end up in that state, regardless of which state it was in before. Furthermore, it remains in that state if all subsequent symbols induce identity functions. In the case of words containing multiple resets on a final state, only the last of these symbols matters, resulting in words of the form ΣRI. The case of J=I arises when the initial state is also final. In this scenario, to accept a word, the automaton does not need to read a symbol that induces a reset on a final state (since it is already there), but only needs to stay in the initial state.

A by-product of Theorem 4.7 is that any reset automaton is equivalent to one with two states, only one of which is final. The rationale is as follows:

  1. (i)

    the symbols in R induce a reset on the single final state;

  2. (ii)

    the symbols in I act as identities; and

  3. (iii)

    the symbols neither in R nor in I induce resets on the single non-final state.

Moreover, the initial state is also the final state if and only if J=I. A graphical account is given in Figure 2.

Proposition 4.8.

For every reset automaton, there exists an equivalent one with two states, exactly one of which is final.

Figure 2: The reset automaton corresponding to a language of the form J(ΣRI) in the case J= (on the left) and in the case J=I (on the right).

Theorem 4.7 allows us to establish a first connection between Linear Temporal Logic on finite traces (𝖫𝖳𝖫𝖿 [5]) formulas and equivalent reset cascades. As highlighted in the introduction, the languages expressible in 𝖫𝖳𝖫𝖿 are exactly the star-free languages, that is, those languages that can be represented by regular expressions that do not use the Kleene star, or equivalently, by languages whose minimal automaton is counter-free [18]. By Krohn-Rhodes’ theorem (Theorem 3.6), it follows that the languages definable in 𝖫𝖳𝖫𝖿 are precisely those expressible through cascades of resets. Given the relevance of reset cascade decomposition in problems such as pastification [2, 4] and normalization [7] of temporal logic formulas, it is crucial to understand which 𝖫𝖳𝖫𝖿 formulas can be expressed with cascades of a specific height. The following result shows that even simple formulas like missing (the proposition letter “p” holds at the initial time point) or p𝖴missing (there is a future point where “q” holds, and until then, “p” remains true) cannot be expressed with cascades of resets of height 1. In fact, the languages they recognize333Here, assuming a set of atomic propositions 𝒜𝒫{p,q,r,}, the languages of formulas over 𝒜𝒫 are defined over the alphabet Σ2𝒜𝒫. Moreover, given any p𝒜𝒫, we indicate with p all the letters aΣ such that pa. are respectively pΣ and (p)qΣ, which are not of the form J(ΣRI), for any choice of R, I, and J. However, the formula 𝖥missing (there exists a point in the future where “p” holds) can be expressed with reset cascades of height 1, as its language is of the form J(ΣRI), choosing Rp, IΣp, and J=.

Corollary 4.9.

The languages defined by the 𝖫𝖳𝖫𝖿 formulas missing and p𝖴missing are not definable with height-1 cascades of resets.

4.2.2 Cascades of unbounded-height

In this section we derive regular expressions for cascades of arbitrary height. As a first step, we show that a cascade of height h of reset automata, say 𝒜1𝒜h, can be transformed into an equivalent cascade of the same height, still consisting of reset automata, where the last automaton (𝒜h) has exactly two states, one of which is the only accepting state. This result forms the basis for Theorem 4.12, which provides the characterization of cascades of arbitrary height.

Lemma 4.10.

Let 𝒜 be an automaton with set of states Q over the alphabet Σ, and let be a reset automaton over the alphabet Σ×Q. There exists a 2-states reset automaton , with exactly one final state, such that (𝒜)=(𝒜).

The proof of Lemma 4.10 heavily relies on the characterization of cascades of height 1. More precisely, since is a reset automaton over the alphabet Σ×Q, by Proposition 4.8, there exists an equivalent reset automaton with two states (one of which is the only accepting state) over the same alphabet. Since is at the bottom of the cascade, the language of the cascade 𝒜 is the same as the language of 𝒜. This is because there are no other automata below in the cascade that can exploit information about ’s current state. As a matter of fact, in Section 5, we will prove that this no longer true when applying the same procedure to 𝒜: there exist languages definable by a cascade 𝒜, where 𝒜 has 3 states and has 2 states, that cannot be expressed if the number of states of 𝒜 is limited to 2.

Let us now introduce the notion of filtered automaton, which is obtained from a given automaton by removing (filtering) certain outgoing transitions and possibly changing its initial state.

Definition 4.11 (Filtered Automaton).

Let 𝒜=Q,Σ,q0,δ,F be an automaton. A filter is pair (q,H), where qQ and HΣ×Q. The partial automaton 𝒜, filtered by (q,H), denoted as 𝒜Hq, is the automaton (Q,Σ,q,δ,F) where δ(q,σ)δ(q,σ) if (σ,q)H, or is undefined otherwise.

Before formally stating Theorem 4.12, that characterizes the languages of cascades of unbounded-height, we give an intuitive account of it. Let 𝒜 be a cascade, with 𝒜 an automaton and a reset automaton, where, w.l.o.g. (Lemma 4.10), has only two states and exactly one final state. Any word accepted by 𝒜 must drive both 𝒜 and to an accepting state. Its language can be captured by analyzing the symbols inducing a reset function that leads to a final state of , and the symbols inducing identities in . The words in the language of 𝒜 are precisely those consisting of

  1. (i)

    a prefix that, for any symbol (σ,q) inducing a reset on a final state of , drives automaton 𝒜 to state q;

  2. (ii)

    followed by the symbol σΣ (let qσ be the state reached by 𝒜 after reading it);

  3. (iii)

    a suffix that forces to remain in its accepting state through its identity functions IB, and forces 𝒜 to reach a final state starting from qσ.

In addition, 𝒜 cannot transition from state q when reading a symbol σ if the pair (σ,q) does not belong to ’s identity functions, as this would cause to leave its accepting state. Therefore, the suffix corresponds to the language of automaton 𝒜, filtered by (δA(q,σ),IB). This is formally expressed by the following theorem.

Theorem 4.12 (Languages of cascades of unbounded-height).

Let 𝒜=Σ,Q={q0,,qn},δA, q0,FA be an automaton and let =Σ×Q,{s0,s1},δ,s0,{sf}, with sf{s0,s1}, be a two-state reset automaton with one final state. It holds that:

(𝒜)=M(σ,q)Rsfq(𝒜)σ(𝒜IBδA(q,σ)) (3)

where Rsf is the set of symbols in Σ×Q that induce a reset function on state sf, and M(𝒜IBq0) if s0=sf or M otherwise.

Figure 3: On the left and on the center, the reset automata 𝒜 and , respectively, for the cascade 𝒜 recognizing the language ba+. On the right, the automaton 𝒜IBs1, where IB={(a,q1),(b,q0)} are the identities of automaton .

Figure 3 gives an example of application of Theorem 4.12 to a cascade over the alphabet Σ{a,b} of two reset automata, 𝒜 (on the left) and (on the center), with two states each, recognizing the language ba+. Using Theorem 4.12, we have that (𝒜)=q0(𝒜)a(𝒜IBq1), where 𝒜IBq1 is the automaton obtained from 𝒜 filtered by the identities IB={(a,q1),(b,q0)} of automaton . Since q0(𝒜)=b and (𝒜IBq1)=a, we obtain (𝒜)=ba+. The following is a corollary of Theorem 4.12 in the case in which is a pure-reset automaton.

Corollary 4.13.

Let 𝒜=Σ,{q0,,qn},δA,q0,FA be an automaton and let =Σ×Q,{s0,s1},δ,s0,{sf} be a two-state pure-reset automaton with one final state. It holds that

(𝒜)=M(σ,q)RsfδA(q,σ)FAq(𝒜)σ (4)

where Mϵ if s0=sf and q0FA, or M otherwise.

It is worth noticing that the regular expressions in Theorems 4.12 and 4.13 refer to states of the cascade under consideration. This is a major difference with the characterization of reset cascades of height 1 (Theorem 4.7). In order to prove some undefinability results in the next section, we provide a characterization of the languages recognized by cascades of two-states resets of height 2 where the second component is pure-reset, based on regular expressions that do not refer to states of the cascade.

Lemma 4.14.

Let Σ be a language. is definable by a cascade of height 2 in which the second component is pure-reset if and only if =Mi=1nKiσi for some M, n, σi and Ki such that:

  1. (i)

    M is either ϕ or ϵ;

  2. (ii)

    0n2|Σ|;

  3. (iii)

    for all i=1n it holds σiΣ;

  4. (iv)

    there exists a language L recognizable by a two-state reset automaton such that for all i=1n, either Ki is L or Ki is L¯=ΣL.

Notice that the Lemma above can be easily extended to the case in which the first automaton in the cascade has k states, for some k2. This is done by relaxing (iv) and imposing that Ki can be chosen between k languages K1,,Kk such that {K1,,Kk} is a partition of Σ and each Ki is definable by a cascade of resets of height 1. Constraint (ii) is also relaxed to 0nk|Σ|.

We will use Theorem 4.12, Corollary 4.13, and Lemma 4.14 in the next section to prove undefinability results of certain languages by cascades of a given height and with a specified number of states at each level.

5 Expressiveness results

In this section, we analyze the expressive power of various types of reset automaton cascades. We begin by defining several language classes, and subsequently structure our analysis into short cascades (where the height is constrained to at most two), narrow cascades (where the height is unbounded but each component contains two states), and general cascades (with no restrictions on either the height or the number of states).

Definition 5.1 (Classes and 𝒫).

Let h>0 and let k1,,kh>1. We denote by (k1,,kh) the class of languages definable by a cascade 𝒜1𝒜h of reset automata such that 𝒜i has ki states, for each 1ih. We denote by 𝒫(k1,,kh) the subclass of (k1,,kh) where the last automaton (𝒜h) is required to be pure-reset. For h>0 and k>1, we define kh2k1,,khk(k1,,kh) as the set of languages definable by a cascade of height h, where each component has at most k states. We define h>0,k>1kh as the set of languages definable by any cascade of reset automata. The classes 𝒫kh and 𝒫 are defined analogously.

Refer to caption
Figure 4: Summary of (some of) the results in Section 5.

Figure 4 provides an overview of (some of) the results presented in this section. Specifically, it illustrates that increasing the cascade height and increasing the number of states at the first level lead to two distinct language hierarchies.

5.1 Short Cascades

We begin by considering short cascades, i.e. cascades of reset automata of height 2. As a first step, we start by comparing the classes (2),𝒫(2,2) and (2,2), and then we focus on 𝒫(k,2) and (k,2) for every k>2.

We already know that with a single pure-reset automaton we can recognize the set of all words ending with a certain symbol of the alphabet (this follows from Theorem 4.7 in the special case in which I=). As an example, it holds that Σa𝒫(2). Now, if we introduce an additional pure-reset layer, we can effectively recognize the set of words ending with a two-character suffix. However, we also demonstrate that this is impossible using a single reset automaton.

Lemma 5.2.

Let L=Σaa. It holds that:

  1. (i)

    L𝒫(2,2);

  2. (ii)

    L(2).

Lemma 5.2 shows that increasing the height of a cascade, even of height 1 and even with a pure-reset automaton, results into a gain of expressive power.

In the upcoming lemma, we demonstrate that, at the same height, prohibiting identities in the final layer results in a loss of expressive power. To illustrate this, let us consider the language aΣ. As shown in Figure 1, this language can be defined using a cascade of two reset automata, with the last one specifically containing identities. Building upon Lemma 4.14, we further demonstrate that achieving the same language recognition is not possible when prohibiting identities in the final layer, no matter of the number of states of the first automaton.

Lemma 5.3.

Let Σ be an alphabet with at least two symbols, let L=aΣ. It holds that:

  1. (i)

    L(2,2);

  2. (ii)

    L𝒫(k,2), for every k2.

Figure 5: On the left, the reset automaton 𝒜 and on the right the reset automaton such that, for Σ={a,b,c}, the cascade C𝒜 accepts the language Σ(Σ2{aa,bb,cc}){b,c}, which precisely corresponds to the language L3 described in Lemma 5.4. When viewed as a single automaton, C is also the minimal automaton for the language L3.

In Lemma 4.10, we have shown that the final component of a cascade can always be restricted to two states. A natural question arises: Can the first component also be limited to just two states? The answer is negative, as illustrated by the following example. Consider an alphabet of three symbols and the language L consisting of all words that end with two distinct symbols (e.g. cbL but aaL). As demonstrated in Figure 5, this language can be recognized by a cascade of two reset automata where the first component has three states. However, we will prove that it cannot be recognized if the first component has only two states. Intuitively, the first component’s role is to remember the second-to-last symbol, but with an alphabet of three symbols and only two states, this task becomes impossible. The following Width-Hierarchy Lemma formalizes this intuition, demonstrating the existence of an infinite hierarchy of languages that can be defined using cascades of two resets where the first component contains k states, but cannot be defined when the first component is restricted to k1 states.

Lemma 5.4 (Width-Hierarchy Lemma).

For each k>2, let Σ={σ0,,σk1}. Let Lk=Σ(Σ20i<kσiσi)(Σσ0), it holds that:

  1. (i)

    Lk𝒫(k,2);

  2. (ii)

    Lk(k1,2).

The following corollary (depicted in Figure 4) follows from Lemmas 5.2, 5.3, and 5.4.

Corollary 5.5.

It holds that:

  • (2)𝒫(2,2)(2,2);

  • 𝒫(3,2)(2,2) and (2,2)𝒫(3,2).

Additionally, we prove that for a fixed alphabet Σ of cardinality k, any language over Σ expressible by a cascade of height 2 can also be expressed by a cascade of height 2 where the first component has at most k+1 states. The intuition behind this is that, due to the restriction of transitions to resets or identities, only a finite number of states can be reached from the initial state. This is formalized in the next lemma, which also establishes the optimality of the bound on the number of states of the first component.

Lemma 5.6.

Let Σ be a finite alphabet with size |Σ|=k. Let L be a language such that LΣ. For every m>0, if L(m,2), then L(k+1,2). Furthermore, there exists a language LkΣ such that Lk(k+1,2) but Lk(k,2).

Figure 6: The cascade C𝒜 accepts the language L3Σ(Σ2{aa,bb,cc}). When treated as a single automaton, C consists of 8 states, in contrast to the minimal automaton for L3, which has only 7 states.

The language Lk used to prove the optimality of the bound in Lemma 5.6 is defined as Σ(Σ20i<kσiσi). As an example, Figure 6 shows the case of L3.

5.2 Narrow Cascades

Thus far, our discussion has centered around cascades composed of one or two components. Now, we shift our focus to narrow cascades, i.e. cascades of greater height but in which each components is restricted to have two states (i.e. 2h). Just as we have seen that some languages cannot be expressed by cascades of height 1, we will demonstrate that for any given height, there exists a language that cannot be captured at that height, provided the components of the cascade are restricted to two states. We call this the Height-Hierarchy Lemma, and is a counterpart of the Width-Hierarchy Lemma (Lemma 5.4) focused on the height of cascades. It is based on the following family of languages: for each h2, we consider the language Lh=Σh2aΣ, that is all words that contain symbol “a” precisely at position h1. The Height-Hierarchy Lemma below proves that, for any h2, the language Lh is not definable by cascades of two states reset automata of height less than h.

Lemma 5.7 (Height-Hierarchy Lemma).

For each h2, let Lh=Σh2aΣ. It holds that:

  1. (i)

    Lh2h;

  2. (ii)

    Lh2h1.

Figure 7: Switch automaton.

We briefly explain the intuition behind Lemma 5.7. Regarding point (i) Lh2h, the construction of the two-state reset cascade proceeds as follows. The base case corresponds to Figure 1, while the inductive step for height h involves the use of the two-state reset automaton 𝒜𝑠𝑤𝑖𝑡𝑐ℎ (illustrated in Figure 7) in cascade with the augmentation of the cascade for the case h1. Intuitively, 𝒜𝑠𝑤𝑖𝑡𝑐ℎ recognizes all words containing at least one symbol. Using in cascade h2 copies of 𝒜𝑠𝑤𝑖𝑡𝑐ℎ together with the cascade in Figure 1, corresponds exactly to the language Σh2aΣ. In Figure 8, we provide an example of the construction for the case h=3.

Figure 8: A cascade C3=𝒜1𝒜2𝒜3 that recognizes the language ΣaΣ. The first two components enforce that any accepted word contains at least two symbols, as (𝒜1𝒜2)=ΣΣΣ.

The proof that Lh2h1 is more involved and proceeds by induction on h. For the base case (h=2), we have L2=aΣ. This case is verified by Lemma 5.3. For the inductive step, assume that the statement holds for every ih. We need to prove that it also holds for i=h+1. Suppose, by contradiction, that Lh+12h. By definition, this would imply the existence of a cascade C=𝒜1𝒜h of two-state reset automata such that (C)=Lh+1=Σh1aΣ. However, starting from C, the proof shows how to construct a new cascade C=1h1 of two-state reset automata such that (C)=Σh2aΣ=Lh. The existence of C contradicts the inductive hypothesis, which states that Lh2h1. Therefore, the assumption that Lh+12h must be false, and the cascade C cannot exist.

5.3 General Cascades

In this subsection, we examine cascades of reset automata without imposing restrictions on the number of states in each component or on the total number of components. In the previous part, Lemma 5.6 demonstrates that, for cascade of two resets over the alphabet Σ, the maximum expressiveness is achieved when the first component has |Σ|+1 states. Here, we extend this result to cascades of unbounded height in the Width-Collapse Lemma, that provides lower bounds on the number of states in each component, for which adding states at certain levels does not affect expressiveness.

Lemma 5.8 (Width-Collapse Lemma).

Let Σ be a finite alphabet with |Σ|=k2. Let LΣ be a language. For any positive integers h and k1,,kh, if L(k1,,kh,2), then L(f(1),,f(h),2), where f(i)=ki+11k1.

We now demonstrate how to transform general cascades into narrow cascades. Specifically, we show how any cascade of reset automata (of height h and with ki states at level i, for each i{1,,h}) can be transformed into an equivalent narrow cascade (i.e. made of two-state resets), at the cost of increasing its height at most by a factor of 2+i=1h1log2(ki). This result is based on two key points:

  1. 1.

    Given a general cascade, we can always append a pure-reset automaton at the end without altering its language;

  2. 2.

    the Narrowing Lemma, which we prove below, demonstrates that any cascade of reset automata, whose final component is pure-reset and containing a component 𝒜j with kj states (and kj>2), can be transformed into a new cascade where 𝒜j is replaced by two new automata, with 2 and kj2 states each.

Instrumental to the Narrowing Lemma, the following result demonstrates that, given a general cascade whose last component is a pure-reset automaton, we can modify this last component to make all the states of the preceding components final, without altering the recognized language.

Lemma 5.9.

Consider a cascade 𝒜 of automata, where is a two-state pure-reset automaton. Let 𝒜 be the automaton obtained from 𝒜 by making all states final. Then, there exists a two-state pure-reset automaton such that (𝒜)=(𝒜).

The Narrowing Lemma is stated as follows.

Lemma 5.10 (Narrowing Lemma).

Let C=𝒜1𝒜h be a cascade where 𝒜i is a reset automaton with ki states for each 1ih1, and 𝒜h is a pure-reset automaton. Let j be an index such that 1jh1. Then, there exists a cascade C of reset automata

C=𝒜1𝒜j112𝒜j+1𝒜h (5)

such that:

  1. (i)

    each 𝒜i has ki states for ij;

  2. (ii)

    if 𝒜i is pure-reset (resp., reset), then also 𝒜i is pure-reset (resp., reset), for ij;

  3. (iii)

    1 has 2 states and 2 has kj2 states; and

  4. (iv)

    (C)=(C).

By iteratively applying the Narrowing Lemma to every component with more than two states, we obtain a procedure that, given a cascade of reset automata, produces an equivalent cascade where all components are two-state reset automata. Moreover, it is worth noticing that:

  1. (i)

    by Lemma 4.10, w.l.o.g. the last component of any cascades of reset (or pure-resets) has two states, and therefore the Narrowing Lemma does not need to be applied at the last level;

  2. (ii)

    if the final component of a cascade is not a pure-reset, a new pure-reset level can always be added without affecting the language of the cascade.

This leads to the following inclusions.

Corollary 5.11.

For each positive h,k1,,kh it holds that

  1. 1.

    𝒫(k1,,kh)𝒫2H+1

  2. 2.

    (k1,,kh)2H+2

where H=log2k1++log2kh1.

Combining Lemma 5.8 and Corollary 5.11, we conclude that if a language L is recognized by a cascade of resets of height h, it can also be recognized by a cascade of height Θ(h2) composed entirely of two-state resets.

Corollary 5.12.

Let Σ be an alphabet such that |Σ|=k2 and let LΣ be a language. If L admits a cascade of reset automata of height h, then L2H where HΘ(h2). If k=2, then H=h2+h+22.

Exploiting the bound for the case |Σ|=2, we can prove undefinability of certain languages by general cascades, i.e. without any bound on their height nor on the number of states of its component. As an example, by Lemma 5.7, we know that the language L=Σ6aΣ over the alphabet Σ={a,b} does not belong to the class 27. If L could be recognized by a cascade of height h=3, then it would also be recognized by a two-state cascade of height H=h2+h+22=7, leading to the following conclusion: with Σ={a,b}, the language Σ6aΣ does not admit any cascade of resets of height 3.

Building upon this reasoning, we can formulate the Generalized Height-Hierarchy Lemma. Unlike the original Height-Hierarchy Lemma, which focuses solely on two-state cascades, the generalized version addresses the undefinability of cascades in a broader context, encompassing general cascades.

Lemma 5.13 (Generalized Height-Hierarchy Lemma).

Let h be a positive integer, and define H=h2+h+22+1. Consider the language LHΣ, where LH=ΣH2aΣ and Σ is a two-symbol alphabet. The language LH cannot be recognized by any cascade of reset automata of height h, but it holds that LH2H.

6 Efficient closure properties of cascades of reset automata

In this section, we present an efficient method for computing specific closure properties of reset cascades. For instance, for the case in which the operation is binary, given two cascades of resets 𝒞 and 𝒞 (made of only two-states components), we show how it is possible to compute a cascade of two-states resets that recognizes (𝒞)(𝒞) by adding at most one two-state reset automaton (that, in this context, we call brick). We show this for the following operations:

  1. (i)

    intersection;

  2. (ii)

    complementation;

  3. (iii)

    union; and

  4. (iv)

    left-concatenation of Σ, i.e. given a language to compute Σ.444It is worth noticing that this operation corresponds to compute the closure under the 𝖫𝖳𝖫 next modality.

Proposition 4.5 already shows that intersection can be implemented efficiently for cascades of resets: given two reset cascades 𝒞 and 𝒞 (with m and n two-states components, respectively), there exists a cascade for (𝒞)(𝒞) with m+n two-state resets.

Before showing the construction for the remaining operations, we give the following key definitions. We define the finalized version of an automaton 𝒜, denoted with 𝖿𝗂𝗇𝗏(𝒜), as the automaton obtained from 𝒜 by setting all its states as final. The definition naturally extends to cascades: the finalized version of 𝒞, denoted with 𝖿𝗂𝗇𝗏(𝒞), is defined as 𝖿𝗂𝗇𝗏(𝒜1)𝖿𝗂𝗇𝗏(𝒜n). Clearly, if 𝒞 is a cascade of reset automata, 𝖿𝗂𝗇𝗏(𝒞) is still a cascade of reset automata. We define the reachability set of an automaton relative to a set of states as follows.

Definition 6.1 (Reachability Set).

Let 𝒜=Σ,Q,δ,q0,F be an automaton. Let PQ be a set of states. The reachability set of 𝒜 with respect to P, denoted with 𝖱𝖲(𝒜,P), is the set {(σ,q)(Σ×Q):δ(q,σ)P}. We denote with 𝖱𝖲(𝒜,P)¯ the set (Σ×Q)𝖱𝖲(𝒜,P). We write 𝖱𝖲(𝒜) to refer to 𝖱𝖲(𝒜,F).

We now show how to efficiently compute the remaining closure properties.

Complementation

To compute complementation, we introduce the negation brick, whose structure is illustrated in Figure 9 and is formally defined here below.

Figure 9: The negation brick 𝗇𝖾𝗀𝖻(𝒜) in the two cases: (a) ϵ(𝒜) (b) ϵ(𝒜).
Definition 6.2 (Negation brick).

Let 𝒜=Σ,Q,δ,q0,F be an automaton. The negation brick for 𝒜, denoted with 𝗇𝖾𝗀𝖻(𝒜), is the two-state pure-reset automaton Σ×Q,{n0,n1},δ,n0,{nf} such that:

  1. (i)

    the final state nf is n1 if and only if ϵ(𝒜);

  2. (ii)

    the function τ induced by symbols in 𝖱𝖲(𝒜) maps all states in the non-final state, i.e. τ:{n0,n1}{n0,n1}{nf};

  3. (iii)

    the function τ induced by symbols in 𝖱𝖲(𝒜)¯ maps all states in the final one, i.e. τ:{n0,n1}{nf}.

The intuition is that the negation brick, when appended to the end of a cascade 𝒞, reaches its final state if and only if the underlying cascade 𝒞 is not in a final state. Consequently, by setting all the states of 𝒞 as final, we obtain a cascade that recognizes the complement of (𝒞), as proved by the following lemma.

Lemma 6.3.

Let 𝒞 be a cascade of automata. The cascade 𝒞𝖿𝗂𝗇𝗏(𝒞)𝗇𝖾𝗀𝖻(𝒞) recognizes the language (𝒞)¯. Moreover, if 𝒞 is a cascade of reset automata, then so is 𝒞.

Interestingly, if the cascade terminates with a pure-reset layer 𝒜, this automaton can itself serve the function of the negation brick, without the need of an additional component.

Lemma 6.4.

Let 𝒞=𝒜1𝒜n be a cascade of automata such that 𝒜n is a pure-reset automaton. There exists a cascade 𝒞=𝒜1𝒜n such that:

  1. (i)

    (𝒞)=(𝒞)¯;

  2. (ii)

    each automaton 𝒜i has the same number of states as 𝒜i;

  3. (iii)

    if 𝒜i is a reset (resp., pure-reset), then 𝒜i is also a reset (resp., pure-reset).

Union

Given two cascades 𝒞 and 𝒞 of height m and n, respectively, since (𝒞)(𝒞)=(𝒞)¯(𝒞)¯¯, it is possible to build cascade for (𝒞)(𝒞) of height m+n+3, using the previously discussed constructions. In this section, we present a more efficient construction that introduces only one additional component, referred to as the union brick, resulting in a cascade for (𝒞)(𝒞) of height n+m+1.

Definition 6.5 (Union brick).

Let 𝒜=Σ,QA,δA,q0A,FA and =Σ,QB,δB,q0B,FB be two automata. Let UQA×QB the set of states {(qA,qB):qAFAqBFB}. Let 𝒞=𝒜𝖺𝗎𝗀(𝒜,). The union brick of 𝒜 and , denoted with 𝗎𝗇𝗂𝗈𝗇𝖻(𝒜,), is the two-state pure-reset automaton Σ×Q,{u0,u1},δ,u0,{uf} such that:

  1. (i)

    the final state uf is u0 if and only if ϵ(𝒜)();

  2. (ii)

    the function τ induced by symbols in 𝖱𝖲(𝒞,U) maps all states in the final one, i.e. τ:{n0,n1}{nf};

  3. (iii)

    the function τ induced by symbols in 𝖱𝖲(𝒞,U)¯ maps all states in the non-final state, i.e. τ:{n0,n1}{n0,n1}{nf}.

Similarly to the case of complementation, when appended to the end of a cascade 𝒞𝖺𝗎𝗀(𝒞,𝒞), the union brick reaches its final state if and only if either 𝒞 is in a final state or 𝖺𝗎𝗀(𝒞,𝒞) is in a final state. This leads to the following lemma.

Lemma 6.6.

Let 𝒞 and 𝒞 be two cascade of automata. The cascade 𝒞′′𝖿𝗂𝗇𝗏(𝒞𝖺𝗎𝗀(𝒞,𝒞))𝗎𝗇𝗂𝗈𝗇𝖻(𝒞,𝒞) recognizes the language (𝒞)(𝒞). Moreover, if 𝒞 and 𝒞 are cascades of reset automata, then so is 𝒞′′.

Also in this case, if one of the two automata corresponds to a cascade terminating with a pure-reset component 𝒜, the union can be performed without the need for additional layers: the automaton 𝒜 effectively serves as the union brick.

Left-concatenation of 𝚺

Given a cascade 𝒞, we demonstrate how to construct a cascade that recognizes the language Σ(𝒞), adding only one brick and guaranteeing that the property of being a reset cascade is preserved. As a by-product of this construction, we obtain that, given a cascade of resets of height h equivalent to an 𝖫𝖳𝖫 formula ϕ (interpreted over finite words), it is possible to construct a cascades of resets for missing(ϕ) of height h+1, where missing is the next modality of 𝖫𝖳𝖫.

We first define the next version of an automaton. The next version of an automaton 𝒜, denoted as 𝗇𝖾𝗑𝗍𝗏(𝒜), is defined considering the Cartesian product between the alphabet of 𝒜 and the set {𝗈𝖿𝖿,𝗈𝗇}. Intuitively, if 𝒜 transitions from q to q with a symbols σ, so does 𝗇𝖾𝗑𝗍𝗏(𝒜) with the symbol (σ,𝗈𝗇). On the contrary, all symbols (σ,𝗈𝖿𝖿) force 𝗇𝖾𝗑𝗍𝗏(𝒜) to transition to the initial state. The formal definition of 𝗇𝖾𝗑𝗍𝗏(𝒜) is given here below.

Definition 6.7 (Next version of an automaton).

Let 𝒜=Σ,Q,δ,q0,F be an automaton such that either Σ=Σ or Σ=Σ×S, for an arbitrary finite set S. We define the next version of 𝒜, denoted as 𝗇𝖾𝗑𝗍𝗏(𝒜), as the automaton (Σ′′,Q,δ,q0,F) such that:

  • if Σ=Σ, then Σ′′Σ×{𝗈𝖿𝖿,𝗈𝗇} and, for all qQ and for all aΣ, it holds: δ(q,(a,𝗈𝗇))=δ(q,a) and δ(q,(,𝗈𝖿𝖿))=q0.

  • if Σ=Σ×S, then Σ′′Σ×{𝗈𝖿𝖿,𝗈𝗇}×S and, for all qQ and for all (a,s)Σ×S, it holds that: δ(q,(a,𝗈𝗇,s))=δ(q,(a,s)) and δ(q,(,𝗈𝖿𝖿,))=q0.

Given a cascade 𝒞=𝒜1𝒜n over Σ, we define the next version of 𝒞, denoted with 𝗇𝖾𝗑𝗍𝗏(C), as the cascade 𝗇𝖾𝗑𝗍𝗏(𝒜1)𝗇𝖾𝗑𝗍𝗏(𝒜n) over Σ×{𝗈𝖿𝖿,𝗈𝗇}.

Figure 10 shows the next versions of the automata in Figure 1. Crucially, computing the next version of an automaton does not alter its property of being a reset automaton.

Lemma 6.8.

Let Σ be a finite alphabet and let 𝒜 be an automaton over Σ or over Σ×S for an arbitrary finite set S. If 𝒜 is reset automaton, then also 𝗇𝖾𝗑𝗍𝗏(𝒜) is a reset automaton.

Figure 10: The next version of the two automata in the cascade of Figure 1.

Now, given any cascade 𝒞, to capture the language Σ(𝒞), it suffices to consider the automaton 𝒜𝑠𝑤𝑖𝑡𝑐ℎ (depicted in Figure 7) with the next version of 𝒞. In fact, considering that initially both 𝒜𝑠𝑤𝑖𝑡𝑐ℎ and 𝗇𝖾𝗑𝗍𝗏(𝒞) are in their initial states (which, for 𝒜𝑠𝑤𝑖𝑡𝑐ℎ, is state 𝗈𝖿𝖿), reading the first input symbol σ forces:

  1. (i)

    𝒜𝑠𝑤𝑖𝑡𝑐ℎ to transition to state 𝗈𝗇; and

  2. (ii)

    𝗇𝖾𝗑𝗍𝗏(𝒞) to remain in its initial state, because the symbol it reads is (σ,𝗈𝖿𝖿).

After the first symbol and for all the rest of the input word, 𝒜𝑠𝑤𝑖𝑡𝑐ℎ remains in state 𝗈𝗇, while 𝗇𝖾𝗑𝗍𝗏(𝒞) operates like 𝒞 because it reads symbols of the form (σ,𝗈𝗇). As shown by the following lemma, this captures exactly Σ(𝒞).

Lemma 6.9.

Let 𝒞 be a cascade of automata. The cascade 𝒞𝒜𝑠𝑤𝑖𝑡𝑐ℎ𝗇𝖾𝗑𝗍𝗏(𝒞) recognizes the language Σ(𝒞). Moreover, if 𝒞 is a cascade of reset automata, then so is 𝒞.

From Lemma 5.7, it follows the optimality of the construction outlined in Lemma 6.9.

7 Conclusions and Future Work

In this paper, we investigated some fundamental properties of cascades of reset automata. Unlike the approach commonly followed in the literature, where the cascade product is restricted to semi-automata, we focused on the case of automata. This allowed us to study the properties of the recognized languages. As an initial step, we showed how to compute regular expressions equivalent to a cascade. Then, on the basis of such a transformation, we established some meaningful expressiveness results, in particular lower bounds to the height and to the minimum number of states per level of a cascade of resets for specific families of languages. Finally, we showed how to compute the closure of reset cascades under certain basic operations by adding at most one brick to the end of the cascade.

As for the future developments of the work, finding an efficient construction for the closure of reset cascades under the concatenation operation is undoubtedly a crucial direction. This would enable the design of an efficient approach to handling the eventually and until operators of 𝖫𝖳𝖫, providing, together with the results given in the last section of the paper, an efficient decomposition into reset cascades for full 𝖫𝖳𝖫. This would improve the triply-exponential upper bound to such a decomposition achieved by Maler’s algorithm [4, 15]. Last but not least, giving analogous expressiveness and closure results for permutation automata appears to be another promising avenue for further investigation.

References

  • [1] Dana Angluin, David Chiang, and Andy Yang. Masked hard-attention transformers and boolean RASP recognize exactly the star-free languages. CoRR, abs/2310.13897, 2023. doi:10.48550/arXiv.2310.13897.
  • [2] Alessandro Artale, Luca Geatti, Nicola Gigante, Andrea Mazzullo, and Angelo Montanari. A Singly Exponential Transformation of LTL[X, F] into Pure Past LTL. In Pierre Marquis, Tran Cao Son, and Gabriele Kern-Isberner, editors, Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning, KR 2023, Rhodes, Greece, September 2-8, 2023, pages 65–74, 2023. doi:10.24963/KR.2023/7.
  • [3] Alessandro Artale, Luca Geatti, Nicola Gigante, Andrea Mazzullo, and Angelo Montanari. Succinctness issues for LTLf and safety and cosafety fragments of LTL. Information and Computation, 302:105262, 2025. doi:10.1016/j.ic.2024.105262.
  • [4] Giuseppe De Giacomo, Antonio Di Stasio, Francesco Fuggitti, and Sasha Rubin. Pure-past linear temporal and dynamic logic on finite traces. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence, pages 4959–4965, 2021.
  • [5] Giuseppe De Giacomo and Moshe Y. Vardi. Linear Temporal Logic and Linear Dynamic Logic on Finite Traces. In Francesca Rossi, editor, Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pages 854–860. IJCAI/AAAI, 2013. URL: http://www.aaai.org/ocs/index.php/IJCAI/IJCAI13/paper/view/6997.
  • [6] Samuel Eilenberg. Automata, languages, and machines., B. Pure and applied mathematics. Academic Press, 1976. URL: https://www.worldcat.org/oclc/310535259.
  • [7] Javier Esparza, Rubén Rubio, and Salomon Sickert. Efficient Normalization of Linear Temporal Logic. J. ACM, 71(2):16:1–16:42, 2024. doi:10.1145/3651152.
  • [8] Marcus Gelderie. Classifying regular languages via cascade products of automata. In Language and Automata Theory and Applications: 5th International Conference, LATA 2011, Tarragona, Spain, May 26-31, 2011. Proceedings 5, pages 286–297. Springer, 2011. doi:10.1007/978-3-642-21254-3_22.
  • [9] Mark Kambites. On the krohn-rhodes complexity of semigroups of upper triangular matrices. Int. J. Algebra Comput., 17(1):187–201, 2007. doi:10.1142/S0218196707003548.
  • [10] Nadezda Alexandrovna Knorozova and Alessandro Ronca. On the expressivity of recurrent neural cascades. In Michael J. Wooldridge, Jennifer G. Dy, and Sriraam Natarajan, editors, Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2014, February 20-27, 2024, Vancouver, Canada, pages 10589–10596. AAAI Press, 2024. doi:10.1609/AAAI.V38I9.28929.
  • [11] Nadezda Alexandrovna Knorozova and Alessandro Ronca. On the expressivity of recurrent neural cascades with identity. In Pierre Marquis, Magdalena Ortiz, and Maurice Pagnucco, editors, Proceedings of the 21st International Conference on Principles of Knowledge Representation and Reasoning, KR 2024, Hanoi, Vietnam. November 2-8, 2024, 2024. doi:10.24963/KR.2024/82.
  • [12] Kenneth Krohn and John Rhodes. Algebraic theory of machines. I. Prime decomposition theorem for finite semigroups and machines. Transactions of the American Mathematical Society, 116:450–464, 1965.
  • [13] Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, and Cyril Zhang. Transformers learn shortcuts to automata. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. URL: https://openreview.net/forum?id=De4FYqjFueZ.
  • [14] Oded Maler. On the Krohn-Rhodes Cascaded Decomposition Theorem. In Zohar Manna and Doron A. Peled, editors, Time for Verification, Essays in Memory of Amir Pnueli, volume 6200 of Lecture Notes in Computer Science, pages 260–278. Springer, 2010. doi:10.1007/978-3-642-13754-9_12.
  • [15] Oded Maler and Amir Pnueli. Tight Bounds on the Complexity of Cascaded Decomposition of Automata. In 31st Annual Symposium on Foundations of Computer Science, St. Louis, Missouri, USA, October 22-24, 1990, Volume II, pages 672–682. IEEE Computer Society, 1990. doi:10.1109/FSCS.1990.89589.
  • [16] Zohar Manna and Amir Pnueli. A hierarchy of temporal properties (invited paper, 1989). In Proceedings of the 9th annual ACM symposium on Principles of distributed computing, pages 377–410, 1990. doi:10.1145/93385.93442.
  • [17] Nicolas Markey. Temporal logic with past is exponentially more succinct, concurrency column. Bull. EATCS, 79:122–128, 2003.
  • [18] Robert McNaughton and Seymour A Papert. Counter-Free Automata (MIT research monograph no. 65). The MIT Press, 1971.
  • [19] Amir Pnueli. The temporal logic of programs. In 18th Annual Symposium on Foundations of Computer Science (sfcs 1977), pages 46–57. IEEE, 1977. doi:10.1109/SFCS.1977.32.
  • [20] Alessandro Ronca. The transformation logics. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, Jeju, South Korea, August 3-9, 2024, pages 3549–3557. ijcai.org, 2024. URL: https://www.ijcai.org/proceedings/2024/393.
  • [21] Alessandro Ronca, Nadezda Alexandrovna Knorozova, and Giuseppe De Giacomo. Automata cascades: Expressivity and sample complexity. In Brian Williams, Yiling Chen, and Jennifer Neville, editors, Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2023, Thirteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2023, Washington, DC, USA, February 7-14, 2023, pages 9588–9595. AAAI Press, 2023. doi:10.1609/AAAI.V37I8.26147.
  • [22] Karl-Heinz Zimmermann. On Krohn-Rhodes theory for semiautomata. CoRR, abs/2010.16235, 2020. arXiv:2010.16235.