Abstract 1 Introduction 2 Algebraic Recognition of Regular Languages 3 Algebraic Recognition of Probabilistic Languages 4 Algebraic Recognition of Effectful Languages 5 Conclusions and Future Work References

Algebraic Language Theory with Effects

Fabian Lenke ORCID Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany Stefan Milius ORCID Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany Henning Urbat ORCID Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany Thorsten Wißmann ORCID Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
Abstract

Regular languages – the languages accepted by deterministic finite automata – are known to be precisely the languages recognized by finite monoids. This characterization is the origin of algebraic language theory. In this paper, we generalize the correspondence between automata and monoids to automata with generic computational effects given by a monad, providing the foundations of an effectful algebraic language theory. We show that, under suitable conditions on the monad, a language is computable by an effectful automaton precisely when it is recognizable by (1) an effectful monoid morphism into an effect-free finite monoid, and (2) a monoid morphism into a monad-monoid bialgebra whose carrier is a finitely generated algebra for the monad, the former mode of recognition being conceptually completely new. Our prime application is a novel algebraic approach to languages computed by probabilistic finite automata. Additionally, we derive new algebraic characterizations for nondeterministic probabilistic finite automata and for weighted finite automata over unrestricted semirings, generalizing previous results on weighted algebraic recognition over commutative rings.

Keywords and phrases:
Automaton, Monoid, Monad, Effect, Algebraic language theory
Category:
Track B: Automata, Logic, Semantics, and Theory of Programming
Copyright and License:
[Uncaptioned image] © Fabian Lenke, Stefan Milius, Henning Urbat, and Thorsten Wißmann; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Formal languages and automata theory
; Theory of computation Regular languages
Related Version:
Full version with appendix: https://arxiv.org/abs/2410.12569 [38]
Funding:
Fabian Lenke and Henning Urbat are supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project number 470467389. Stefan Milius is supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project number 517924115.
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

1 Introduction

The algebraic approach to finite automata rests on the observation that regular languages (the languages accepted by finite automata) coincide with the languages recognized by finite monoids. This provides the basis for investigating complex properties of regular languages with semigroup-theoretic methods. A prime example is the celebrated result by McNaughton, Papert, and Schützenberger [56, 42] that a regular language is definable in first-order logic over finite words iff its syntactic monoid (the minimal monoid recognizing the language) is aperiodic. Since the latter property is easy to verify, this implies decidability of first-order definability. Similar characterizations and decidability results are known for numerous subclasses of regular languages [46, 25, 48]. The correspondence between automata and monoids has been generalized beyond regular languages, for example to ω-regular languages [66, 45], languages of words over linear orders [10], data languages [17], tree languages [6, 55, 18, 12], cost functions [23], and weighted languages over commutative rings [52].

In this paper, we show that the equivalence between automata and monoids can be established at the more general level of automata with generic computational effects given by a monad 𝕋 [59, 31]. This class of automata forms a common generalization of a wide range of automata models such as probabilistic automata [51], nondeterministic probabilistic automata [64], weighted automata [26], and even pushdown automata and Turing machines [31]. We introduce two modes of algebraic recognition for 𝕋-effectful languages, both of which are natural effectful generalizations of the classical recognition by finite monoids: recognition by (1) 𝕋-effectful monoid morphisms into finite effect-free monoids, and (2) ordinary monoid morphisms into finitely generated 𝕋-algebras equipped with an additional monoid structure (plus optional compatibility conditions). We subsequently identify suitable conditions on the monad 𝕋 ensuring that (1) and (2) capture precisely the languages computed by finite 𝕋-automata; this yields an effectful automata/monoid correspondence (Theorems 4.11 and 4.22). Our results fundamentally exploit the double role played by monads in computation, namely as abstractions of both computational effects [43] and algebraic theories [40].

The investigation of algebraic recognition at the present level of generality has several benefits. First and foremost, the abstract perspective provided by generic effects naturally motivates and isolates conceptual ideas that would be easily missed for concrete instantiations of the monad 𝕋. Notably, recognition mode (1) above is conceptually completely new, and mode (2) generalizes earlier work on algebraic recognition over commutative varieties [2] to arbitrary (non-commutative) effects. In this way, our theory leads to novel algebraic characterizations of several important automata models.

Our prime application is a characterization of languages computed by probabilistic finite automata (PFAs) [51]. PFAs extend classical deterministic finite automata by probabilistic effects, and thus serve as a natural model of state-based computations that involve uncertainty or randomization. These arise in many application domains [44, 8], as witnessed for instance by the wide range and success of probabilistic model checkers [37, 33]. On the theoretical side, PFAs share some remarkable similarities with finite automata; in particular, machine-independent characterizations of PFA-computable languages in terms of probabilistic regular expressions [19, 53] and probabilistic monadic second-order logic [65] are known. However, the fundamental algebraic perspective on finite automata has thus far withstood a probabilistic generalization. We fill this gap by establishing two different modes of probabilistic algebraic recognition (Theorems 3.10 and 3.18) for PFA-computable languages which instantiate the two above-mentioned modes, namely (1) recognition by finite monoids via probabilistic monoid morphisms; (2) recognition by convex monoids carried by a finitely generated convex set. The first mode emphasizes the effectful nature of probabilistic languages, while the second one relies on traditional universal algebra. These characterizations may serve as a starting point for the algebraic investigation of PFA-computable languages.

Further notable instances of the general theory include algebraic characterizations of nondeterministic probabilistic automata and weighted automata. For the latter, a weighted automata/monoid correspondence was only known for weights from a commutative ring [52]; our version applies to general semirings and thus captures additional types of weighted automata such as min-plus and max-plus automata [47, Ch. 5] and transducers [47, Ch. 3].

Related Work.

The use of category theory to unify results for different automata models has a long tradition[30, 7, 5]. 𝕋-automata were first studied by Goncharov et al. [31] as an instance of effectful coalgebras [54, 59]; they also fit into the framework of functor automata by Colcombet and Petrişan [24]. On the algebra side, Bojańczyk [14] used monads to unify notions of algebraic recognition. This abstract perspective on algebraic language theory has led to a series of works, including a uniform theory of Eilenberg-type correspondences [63] and an abstract account of logical definability of languages [13]. However, while each of the above works studies either automata-based or algebraic recognition individually, formal connections between both approaches are far less explored at categorical generality. The only results in this direction appear in the work of Adámek et al. [2] on the automata/monoid correspondence in commutative varieties. Our approach takes the step to non-commutative monads and on the way develops the entirely new concept of effectful language recognition.

2 Algebraic Recognition of Regular Languages

To set the scene for our algebraic approach to effectful languages, we recall the classical correspondence between finite automata and finite monoids as recognizers for regular languages [48, 50]. Let us settle some notation used in the sequel:

Notation 2.1.

Fix a finite alphabet Σ, and let Σ denote the set of finite words over Σ, with empty word εΣ. We put 1={} and 2={,}. A map x:1X is identified with the element x()X, which by abuse of notation is also denoted by xX. Maps p:X2 (predicates) are identified with subsets of X; in particular, languages are presented as predicates L:Σ2. We denote the composite of two maps f:XY, g:YZ by f;g:XZ (note the order!), and the identity map on X by 𝗂𝖽X:XX. We use to define anonymous functions. Lastly, XY denotes the set of all maps from X to Y.

Finite Automata.

A deterministic finite automaton (DFA) 𝒜=(Q,i,δ,o) consists of a finite set Q of states and maps representing an initial state, transitions, and final states:

i:δ:o:

Let δ¯:Σ(QQ), δ¯(a)=δ(,a), denote the curried form of δ, and define the iterated transition map δ¯:Σ(QQ) and the language L:Σ2 computed by 𝒜 by

δ¯(ε)=𝗂𝖽Q,δ¯(wa)=δ¯(w);δ¯(a) and L(w)=i;δ¯(w);o for wΣ. (2.1)

Monoids.

A monoid (M,,e) is a set M equipped with an associative multiplication M×MM and a neutral element eM; that is, the equations (xy)z=x(yz) and ex=x=xe hold for all x,y,zM. A map h:NM between monoids (N,,n) and (M,,e) is a monoid morphism if h(n)=e and h(xy)=h(x)h(y) for all x,yN.

Example 2.2.
  1. (1)

    For every set X, the set XX of endomaps forms a monoid with multiplication given by composition ; and neutral element 𝗂𝖽X:XX.

  2. (2)

    The set Σ of words, with concatenation as multiplication and neutral element ε, is the free monoid on Σ: for every monoid (M,,e) and every map h0:ΣM, there exists a unique monoid morphism h:ΣM such that h0(a)=h(a) for all aΣ. The morphism h, the free extension of h0, is given by h(a1an)=h0(a1)h0(an) for a1,,anΣ. For instance, the map δ¯:Σ(QQ) defined in (2.1) is the free extension of δ¯:Σ(QQ).

Monoids enable an algebraic notion of language recognition. A monoid M recognizes the language L:Σ2 if there exists a monoid morphism h:ΣM and a predicate p:M2 such that L=h;p. Monoid recognition captures precisely the regular languages:

Theorem 2.3 ([50, Thm. 1]).

For every language L:Σ2, there exists a DFA computing L iff there exists a finite monoid recognizing L.

Proof sketch.

If a DFA 𝒜=(Q,i,δ,o) computes L, the monoid QQ recognizes L via the morphism δ¯:Σ(QQ) and predicate p:(QQ)2 defined by p(f)=i;f;o2.

Conversely, given a finite monoid (M,,e) that recognizes L via a monoid morphism h:ΣM and a predicate p:M2, we can turn M into a DFA 𝒜=(M,e,δ,p) computing L with transitions defined by δ(m,a)=mh(a).

3 Algebraic Recognition of Probabilistic Languages

Before we present the correspondence of automata and monoids on the categorical level of general effectful automata in Section 4, we whet the reader’s appetite by illustrating the important special case of probabilistic languages. These are languages computed by probabilistic finite automata [51], whose computational effects are finite probability distributions. All results in this section are instances of those in Section 4; however, for the convenience of the reader we provide sketches of the concrete arguments our general proofs instantiate to.

3.1 Probability Distributions and Probabilistic Channels

A finite probability distribution on a set X is a map d:X[0,1] whose support supp(d):={xXd(x)0} is finite and which satisfies xXd(x)=1. A distribution can be represented as a finite formal sum iIrixi where xiX, ri[0,1],iIri=1 and d(x)=iI:xi=xri for xX. The set of all distributions on X is denoted by 𝒟X.

A map of the form f:X𝒟Y, denoted by f:X  Y, is a probabilistic channel or Markov kernel from X to Y. Intuitively, f is a map from X to Y that assigns to a given input x the output y with probability f(x)(y). We write X  Y for the set of all probabilistic channels from X to Y. Two probabilistic channels f:X  Y and g:Y  Z can be composed to yield a probabilistic channel fg:X  Z given by (fg)(x)=(zyYf(x)(y)g(y)(z)). The unit at X is the probabilistic channel ηX:X  X sending xX to the Dirac distribution δx𝒟X defined by δx=(y1 if x=y else 0). Using sum notation, composition of probabilistic channels is given by (fg)(x)=iIjJiririjzij, where f(x)=iIriyi and g(yi)=jJirijzij, and the unit is ηX(x)=1x. Composition of probabilistic channels is associative and has η as an identity: (fg)h=f(gh) and ηXf=fηY for f:X  Y,g:Y  Z and h:Z  W. A channel f:X  2 is a probabilistic predicate f:X[0,1] on X, where we identify 𝒟2 with the unit interval. A map f:XY induces the pure channel f;ηY:X  Y, which we also denote by f:XY by abuse of notation.

3.2 Probabilistic Automata

Probabilistic automata, due to Rabin [51], generalize deterministic automata. Transitions are no longer given by a unique successor state δ(q,a) for every state q and input a, but a probability distribution over possible successor states. Accordingly, probabilistic automata compute probabilistic languages, which are simply probabilistic predicates Σ  2. Formally:

Definition 3.1.

A probabilistic finite automaton (PFA) 𝒜=(Q,i,δ,o) consists of a finite set Q of states and the following channels for an initial distribution, the transition distributions, and an acceptance predicate, respectively:

i:δ:o:

We denote by δ¯:Σ(Q  Q) the curried form of δ and define the iterated transition map δ¯:Σ(Q  Q) and the language L:Σ  2 computed by 𝒜, respectively, by

δ¯(ε)=ηQ,δ¯(wa)=δ¯(w)δ¯(a) and L(w)=iδ¯(w)o for wΣ.

Comparing with the definition of DFAs in Section 2, we see that DFAs are precisely PFAs where i, δ, o are pure maps.

Remark 3.2.

We think of i(q) as the probability that 𝒜 starts in state q, of δ(q,a)(q) as the probability that 𝒜 transitions from q to q on input a, and of o(q)𝒟2[0,1] as the probability that q is accepting. Unravelling the above definition, the language L:Σ  2 computed by 𝒜 is given by the explicit formula

L(w)=qQn+1i(q0)(k=1nδ(qk1,ak)(qk))o(qn)for w=a1an. (3.1)

The summand for qQn+1 is the probability that, on input w, the automaton takes the path q and accepts w. Hence, L(w) is the total acceptance probability.

Remark 3.3.
  1. (1)

    Rabin’s original notion of PFA [51] features an initial state and a set of final states, which amounts to restricting i:1  Q and o:Q  2 in Definition 3.1 to pure channels. Except for the behaviour on the empty word, the two versions are expressively equivalent [21, Lemmas 3.1.1 and 3.1.2].

  2. (2)

    The PFA model used here is often called reactive in the literature, as opposed to generative PFAs, whose transition map is of type Q  1+Q×Σ, computing stochastic languages, which are (not necessarily finite!) distributions over Σ. Generative PFA are not instances of 𝕋-automata (Definition 4.3) and therefore not considered in this paper.

Our aim is to understand PFA-computable probabilistic languages in terms of recognition by algebraic structures, in the same way that finite monoids recognize regular languages. To this end, we introduce two modes of probabilistic algebraic recognition and prove that they capture precisely the PFA-computable languages.

3.3 Recognizing Probabilistic Languages by Finite Monoids

For our first mode of probabilistic algebraic recognition, we stick with finite monoids as recognizing structures, and only add probabilistic effects to the recognizing monoid morphisms. First we need some auxiliary machinery:

Definition 3.4.

For all finite sets X and all sets Y,Z we define the maps

ξX,Y :𝒟(XY)(X𝒟Y), ξX,Y(d)(x) =(yf:XY,f(x)=yd(f)), (3.2)
λX,Y :(X𝒟Y)𝒟(XY), λX,Y(g) =(fxXg(x)(f(x))), (3.3)
πY,Z :𝒟Y×𝒟Z𝒟(Y×Z), πY,Z(d,e) =((y,z)d(y)e(z)). (3.4)

Intuitively, ξX,Y(d)(x)(y) is the probability of picking some f according to d satisfying f(x)=y. For a channel g:X  Y, the probability of f:XY in the distribution λX,Y(g) provides a measure of how compatible f is to g. The map πY,Z sends two distributions on Y and Z to their product distribution on Y×Z. Well-definedness of ξ and λ and the next lemma can be shown by calculation; conceptually, they follow from 𝒟 being an affine monad.

Lemma 3.5.

We have λX,Y;ξX,Y=𝗂𝖽. In particular, the map ξX,Y is surjective.

Notation 3.6.

Given channels f1:X1  Y1 and f2:X2  Y2, we define the channel f1ׯf2:X1×X2  Y1×Y2 as the composite in diagram in Figure 3.

Figure 1: Definition of ׯ.
Figure 2: Probabilistic monoid morphism.
Figure 3: A PFA.
Definition 3.7.

A probabilistic monoid morphism from a monoid (N,,n) to a monoid (M,,e) is a channel h:N  M making the diagram in Figure 3 commute, where the maps , n, e are regarded as pure channels.

Remark 3.8.

The universal property of Σ extends to the probabilistic case: For every monoid (M,,e) and every channel h0:Σ  M, there exists a unique probabilistic monoid morphism h:Σ  M with h(a)=h0(a) for all aΣ. It is given by

h(w)=(mm=m1mni=1nh0(ai)(mi))for w=a1anΣ.
Definition 3.9.

A monoid M probabilistically recognizes a probabilistic language L:Σ  2 if there exists a probabilistic monoid morphism h:Σ  M and a probabilistic predicate p:M  2 such that L=hp.

We stress that, in Definition 3.9, probabilistic effects only appear in the channels h and p, while the monoid M itself is pure. At first sight, it may seem more natural to use “probabilistic monoids”, with proper channels as neutral element 1  M and multiplication M×M  M, as recognizers. Remarkably, we need not require this additional generality:

Theorem 3.10.

For every probabilistic language L:Σ  2, there exists a PFA computing L iff there exists a finite monoid probabilistically recognizing L.

Proof sketch.

Given a PFA 𝒜=(Q,δ,i,o) computing L, the finite monoid QQ probabilistically recognizes L via the probabilistic monoid morphism h:Σ  (QQ) that freely extends the probabilistic channel δ¯λQ,Q:Σ  (QQ), and the probabilistic predicate p:(QQ)  2 defined by p(f)=ifo𝒟2. Explicitly, the maps h and p are given by

h(w)=(ff=f1;;fni=1nqQδ(q,ai)(fi(q)))andp(f)=qQi(q)o(f(q)),

for w=a1an. A lengthy calculation using ˜3.5 shows that L=hp.

Conversely, if a finite monoid (M,,e) probabilistically recognizes L via h:Σ  M and p:M  2, then the PFA 𝒜=(M,δ,e,p) computes L, with δ:M×Σ  M defined by

δ(m,a)=(nm:mm=nh(a)(m)).

Example 3.11.

The probabilistic monoid morphism induced by the channel h0:{a}  ({0,1},,0) sending a120+121, together with the pure predicate 𝗂𝖽:{0,1}{0,1} recognizes the language L(an)=112n. The corresponding PFA due to Theorem 3.10 is shown in Figure 3.

3.4 Recognizing Probabilistic Languages by Convex Monoids

The presence of probabilistic effects in the recognizing morphisms places the above mode of probabilistic recognition outside standard universal algebra. In this section we develop an equivalent, purely algebraic approach based on the theory of convex sets.

A convex set [61] is a set X equipped with a family of binary operations +r:X×XX (r[0,1]) subject to following equations, where s=r+srs0 and r=rs:

x+rx=x,x+0y=y,x+ry=y+1rx,x+r(y+sz)=(x+ry)+sz

A map f:XY between convex sets is affine if f(x+rx)=f(x)+rf(x) for x,xX and r[0,1].

Example 3.12.
  1. (1)

    The prototypical convex sets are convex subsets Xκ (for a cardinal κ) with the operations x+ry:=rx+(1r)y. Up to affine isomorphism, these are precisely the cancellative convex sets [61], which are those satisfying

    x+ry=x+rzy=zfor all x,y,zX and r(0,1).
  2. (2)

    The set 𝒟X of distributions on a set X is a convex set with structure given by d+re=(xrd(x)+(1r)e(x)) for d,e𝒟X. This is the free convex set on X: every map h:XY to a convex set Y extends uniquely to an affine map h#:𝒟XY such that h=ηX;h#. Concretely, h# can be defined by h#(i=1nrixi)=h(x1)+r1h#(i=2nri1r1xi).

  3. (3)

    For all sets X and Y, the set X  Y forms a convex set with the operations f+rg=(xf(x)+rg(x)) for f,g:X  Y, xX and r[0,1].

Reutenauer [52] showed that finite-dimensional -algebras – real vector spaces equipped with a compatible monoid structure – precisely recognize rational power series Σ. For algebraic recognition of probabilistic languages, we generalize -algebras to convex monoids, which are monoids with an additional convex structure that is respected by the multiplication:

Definition 3.13.

A convex monoid is a convex set M equipped with a monoid structure (M,,e) whose multiplication :M×MM satisfies

(m+rm)n=mn+rmn and m(n+rn)=mn+rmn. (3.5)
Example 3.14.
  1. (1)

    The convex set 𝒟Σ with multiplication (iIrivi)(jJsjwj)=iI,jJrisjviwj and neutral element ηΣ(ε)=1ε is the free convex monoid on Σ: every map h0:ΣM to a convex monoid M extends to a unique affine monoid morphism h:𝒟ΣM such that h(1a)=h0(a).

  2. (2)

    For every set X, the convex set X  X from Example 3.123 forms a convex monoid with channel composition as multiplication and unit ηX as neutral element.

Definition 3.15.

A convex monoid M recognizes a language L:Σ  2 if there exists a monoid morphism h:ΣM and an affine map p:M𝒟2 such that L=h;p.

For a correspondence between PFA-computable languages and convex monoids, we need to impose a suitable finiteness restriction on the latter. Finite convex monoids are not sufficient; instead, we shall work with a more permissive notion of finiteness:

Definition 3.16.

A convex set X is finitely generated if there exists an affine surjection s:𝒟GX for some finite set G. A convex monoid is fg-carried if its underlying convex set is finitely generated.

Intuitively, this definition says that X is the convex hull of a finite subset s[G]X: every element in X is a convex combination of the elements s(g),gG.

Example 3.17.
  1. (1)

    A convex subset of n is finitely generated iff it is a bounded convex polytope [32], that is, it is compact and has finitely many extremal points.

  2. (2)

    For all finite sets X, the convex monoid X  X from Example 3.123 is fg-carried. This is witnessed by the map ξX,X:𝒟(XX)(X𝒟X) of (3.2), which is surjective by ˜3.5, and affine, since it is the free extension of the map ff;ηX.

Fg-carried convex monoids give rise to our second algebraic characterization of PFAs:

Theorem 3.18.

For every probabilistic language L:Σ  2, there exists a PFA computing L iff there exists an fg-carried convex monoid recognizing L.

Proof sketch.

Given a PFA 𝒜=(Q,δ,i,f) computing L, the fg-carried convex monoid Q  Q from Example 3.172 recognizes L via the morphism δ¯:Σ(Q  Q) and the affine map p:(Q  Q)𝒟2 given by p(f)=ifo𝒟2.

Conversely, suppose that (M,,e) is an fg-carried convex monoid (witnessed by an affine surjection s:𝒟QM) recognizing L via h:ΣM and p:M𝒟2. Define the PFA 𝒜Q=(Q,δ,i,o) where o=ηQ;s;p, the transition distribution δ is chosen such that s(δ(g,a))=s(g)h(a) for all gQ and aΣ, and the initial distribution i is chosen such that s(i)=e; such choices exist by surjectivity of s. Then 𝒜Q computes L.

A remarkable property of fg-carried convex monoids is that they always admit a finite presentation, despite not necessarily being finite themselves. To see this, let us recall some terminology from universal algebra. Given an equational class 𝒱 of algebras over a finitary signature Λ, a finite presentation of an algebra A𝒱 is given by (1) a finite set G of generators, (2) a finite set R of relations si=ti (i=1,,n) where si,tiTΛG are Λ-terms in variables from G, and (3) a surjective Λ-algebra morphism q:TΛGA satisfying q(si)=q(ti) for all i, subject to the universal property that every morphism h:TΛGB, where B𝒱 and h(si)=h(ti) for all i, factorizes through q. In particular, we have the notions of finitely presentable convex set and finitely presentable convex monoid.

By a non-trivial result due to Sokolova and Woracek [60], finitely generated convex sets are finitely presentable (the converse holds trivially). This is the key to the following theorem; a categorical version of it is later proved in Theorem 4.17.

Theorem 3.19.

Every fg-carried convex monoid is finitely presentable.

Proof sketch.

Let M be an fg-carried convex monoid. Choose a finite presentation (G,R,q) of M as a convex set. Since the multiplication of M is fully determined by its action on q[G] by (3.5), this extends to a finite presentation of the convex monoid M by adding a relation gg=tg,g for each g,gG, where tg,g is any term in the signature of convex sets such that q(tg,g)=q(g)q(g). We will see in Example 3.24 that the converse of Theorem 3.19 does not hold.

3.5 Syntactic Convex Monoids

Compared to ordinary monoids, convex monoids are fairly complex structures. The benefit of the additional complexity is the existence of canonical recognizers for probabilistic languages. Recall that every regular language L:Σ2 has a canonical recognizer, the syntactic monoid Syn(L). It is given by the quotient Σ/L of the free monoid Σ modulo the syntactic congruence LΣ×Σ, where vLw iff L(xvy)=L(xwy) for all x,yΣ. The projection hL:ΣSyn(L), sending wΣ to its congruence class [w], recognizes L via p:Syn(L)2 with p([w])=1 iff wL. Moreover, hL factorizes through every surjective morphism h:ΣM recognizing L, thus hL is the “smallest” surjective morphism recognizing L. For probabilistic languages, this notion generalizes as follows:

Definition 3.20.

Given a probabilistic language L:Σ  2, a syntactic convex monoid of L is a convex monoid Syn(L) with a surjective affine monoid morphism hL:𝒟ΣSyn(L) such that ηΣ;hL recognizes L and, moreover, hL factorizes through every surjective affine monoid morphism h such that ηΣ;h recognizes L:

Syntactic structures for formal languages are well-studied from a categorical perspective [14, 63, 2]. The following result is an instance of [2, Thm. 3.14]:

Theorem 3.21.

Every probabilistic language L:Σ  2 has a syntactic convex monoid, unique up to isomorphism. It is presented by generators Σ and relations given by the syntactic congruence L𝒟Σ×𝒟Σ defined in Equation 3.6. The maps hL:𝒟ΣSyn(L) and pL:Syn(L)[0,1] are given by hL(irivi)=[irivi] and pL([irivi])=iriL(vi).

iriviLjsjwjiffx,yΣ:iriL(xviy)=jsjL(xwjy) (3.6)

Alternatively, one can construct the syntactic convex monoid as the transition monoid of the minimal 𝒟-automaton (see the full version [38]), entailing a restriction on the convex structure of syntactic convex monoids:

Theorem 3.22.

For every language L:Σ  2, the convex set Syn(L) is cancellative.

Syntactic convex monoids are useful as a descriptional tool for characterizing (and potentially deciding) properties of languages in algebraic terms. Here is a simple illustration:

Example 3.23.

A probabilistic language L:Σ  2 is commutative if L(a1an)=L(aπ(1)aπ(n)) for all a1,,anΣ and all permutations π of {1,,n}. One easily verifies that L is commutative iff Syn(L) is a commutative convex monoid.

Let us note that while every PFA-computable language is recognized by some fg-carried convex monoid (Theorem 3.18), its syntactic convex monoid is generally not fg-carried:

Example 3.24.

Consider the PFA from Example 3.11 computing the probabilistic language L(an)=12n. Its syntactic convex monoid Syn(L) is isomorphic to the half-open interval (0,1] with the usual convex structure and multiplication of reals; indeed, the map i:𝒟Σ/L(0,1] given by [krkank]krk2nk is easily seen to be an isomorphism. Since the finitely generated convex subsets of are closed intervals, Syn(L) is not fg-carried. However, despite Theorem 3.19 not applying here, the convex monoid (0,1] can be shown to be finitely presentable, with the finite presentation given by a single generator a and a single relation e+13aa=a. The proof is somewhat intricate; see the full version [38] for details.

Open Problem.

Is Syn(L) finitely presentable for every PFA-computable language L?

4 Algebraic Recognition of Effectful Languages

We now turn to the main results of our paper: two novel modes of algebraic recognition for languages computed by effectful automata. All results are parametric in the computational effect, which is modelled by a monad satisfying a suitable condition. Our results instatiate to the characterizations of PFA-computable probabilistic languages from Section 3. Other instances of our results yield new algebraic characterizations for languages recognized by weighted automata and automata that combine nondeterministic and probabilistic branching.

4.1 Monads

In the following, familiarity with basic category theory is assumed; see Mac Lane [39] for a gentle introduction. We recall some concepts from the theory of monads [40] to fix our terminology and notation. We write 𝖲𝖾𝗍 for the category of sets and functions. A monad 𝕋=(T,η,μ) on 𝖲𝖾𝗍 is a triple consisting of an endofunctor T:𝖲𝖾𝗍𝖲𝖾𝗍 and two natural transformations η:𝖨𝖽T (the unit) and μ:TTT (the multiplication), satisfying the laws Tμ;μ=μT;μ and Tη;μ=𝖨𝖽T=ηT;μ.

The Kleisli category 𝒦(𝕋) has sets as objects, and a morphism from X to Y, denoted f:X  Y, is a map f:XTY. The composite of f:X  Y and g:Y  Z is denoted by fg:X  Z and defined by fg=f;Tg;μZ. The identity morphism on X is the component ηX:X  X of the unit. Intuitively, a Kleisli morphism is a function with computational effects given by the monad 𝕋 [43]. A map f:XY is identified with the Kleisli morphism f;ηY:X  Y; such Kleisli morphisms are said to be pure, since they correspond to effect-free computations. Note that there is no need for operator precedence between ; and since (f;g)h=f;(gh) and (gh);k=g(h;k) for all Kleisli morphisms g:XTY,h:YTZ and pure maps f:WX,k:TZU.

Further, a 𝕋-algebra (A,a) consists of set A (the carrier) and a map a:TAA (the structure) satisfying the associative law Ta;a=μA;a and the unit law ηA;a=𝗂𝖽A. A morphism from (A,a) to a 𝕋-algebra (B,b) (a 𝕋-morphism for short) is a map h:AB such that a;h=Th;b. We write 𝖠𝗅𝗀(𝕋) for the category of 𝕋-algebras and 𝕋-morphisms. Products in 𝖠𝗅𝗀(𝕋) are formed in 𝖲𝖾𝗍: the product algebra of (Ai,ai), iI, has the structure T(iAi)TpiiiTAiiaiiAi, where pi:iAiAi is the ith product projection.

The forgetful functor from 𝖠𝗅𝗀(𝕋) to 𝖲𝖾𝗍 has a left adjoint sending a set X to the free 𝕋-algebra 𝕋X=(TX,μX) over X. For every 𝕋-algebra (A,a) and every map h:XA, there exists a unique 𝕋-morphism h#:𝕋X(A,a) such that ηX;h#=h; that 𝕋-morphism h# is the free extension of h. The Kleisli category 𝒦(𝕋) is equivalent to a full subcategory of 𝖠𝗅𝗀(𝕋) via the embedding X𝕋X and hh#.

Monads provide a categorical view of universal algebra. Every finitary algebraic theory (Λ,E), given by a signature Λ of finitary operation symbols and a set E of equations between Λ-terms, induces a monad 𝕋 on 𝖲𝖾𝗍, where TX is the carrier of the free (Λ,E)-algebra (viz. the set of all Λ-terms over X modulo the equations in E), and ηX:XTX and μX:TTXTX are given by inclusion of variables and flattening of terms over terms. Then 𝕋-algebras bijectively correspond to Λ-algebras satisfying all equations in E, that is, algebras of the variety specified by (Λ,E). Monads 𝕋 induced by finitary algebraic theories are precisely the finitary monads, that is, those preserving directed colimits [3].

Every monad 𝕋 has a canonical left strength, the natural transformation

𝗅𝗌X,Y:X×TYT(X×Y)defined by𝗅𝗌X,Y(x,t)=T(y(x,y))(t).

Its right strength 𝗋𝗌X,Y:TX×YT(X×Y) is defined analogously. The monad 𝕋 is commutative if 𝗋𝗌X,TY𝗅𝗌X,Y=𝗅𝗌TX,Y𝗋𝗌X,Y in 𝒦(𝕋). For every monad (be it commutative or not), we denote the left-hand composite of this equation – a double strength – by

πX,Y:=( (4.1)

Commutative finitary monads are precisely those induced by a commutative algebraic theory (Λ,E). This means that all operations commute with each other; for example, for all binary operations α,βΛ, we have α(β(x1,2,x1,2),β(x2,1,x2,2))=β(α(x1,1,x2,1),α(x1,2,x2,2)), and similarly for every pair of operations of other (not necessarily equal) arities.

Example 4.1.

In our applications we shall encounter the following monads:

  1. (1)

    The distribution monad 𝒟 sends a set X to the set 𝒟X of all finite probability distributions on X (Section 3.1), and a map f:XY to the map 𝒟f:𝒟X𝒟Y defined by 𝒟f(d)=(yxX,f(x)=yd(x)); in sum notation, 𝒟f(irixi)=irif(xi). Its unit ηX:X𝒟X is given by ηX(x)=δx, and the multiplication μX:𝒟𝒟X𝒟X is defined by μX(e)=(xd𝒟Xe(d)d(x)). In sum notation, ηX(x)=1x and μX(iIri(jJirijxij))=iIjJiririjxij. The Kleisli category of 𝒟 is the category of sets and probabilistic channels, and algebras for 𝒟 are precisely the convex sets [62]. The monad 𝒟 is commutative; the natural transformation (4.1) is concretely given by (3.4).

  2. (2)

    The convex power set of distributions monad 𝒞 sends a set X to the set 𝒞X of non-empty, finitely generated convex subsets of 𝒟X, and a map f:XY to 𝒞f:𝒞X𝒞Y defined by 𝒞f(S)={𝒟f(d)dS}. Its unit ηX:X𝒞X is ηX(x)={δx}, and the multiplication μX:𝒞𝒞X𝒞X is defined by

    μX(S)=ΦS{Usupp(Φ)Φ(U)dUUsupp(Φ):dUU}for every S𝒞𝒞X.

    Algebras for 𝒞 are precisely convex semilattices [20], which are convex sets A carrying an additional semilattice (i.e. a commutative idempotent semigroup) structure (A,+) satisfying (x+y)+rz=(x+rz)+(y+rz) for x,y,zA,r[0,1]. The monad 𝒞 is not commutative.

  3. (3)

    A semiring is a set S equipped with both the structure of a monoid (S,,1) and of a commutative monoid (S,+,0) such that multiplication distributes over addition. Every semiring S induces a monad 𝒮 sending a set X to 𝒮X={f:XSf(x)0 for finitely many xX}, and a map f:XY to the map 𝒮f:𝒮X𝒮Y defined by 𝒮f(g)=(yf(x)=yg(x)). The unit ηX:X𝒮X is given by ηX(x)=(y1 if x=y else 0), and the multiplication μX:𝒮𝒮X𝒮X is defined by μX(e)=(xd𝒮Xe(d)d(x)). Algebras for 𝒮 correspond precisely to S-semimodules, that is, commutative monoids (M,+,0) with an associative scalar multiplication S×MM that distributes over the additive structures of S and M. The monad 𝒮 is commutative iff the multiplication of S is commutative. The distribution monad 𝒟 is a submonad of 𝒮 for the semiring S= of reals with the usual operations.

  4. (4)

    The monad sends a set X to X=X (finite words over X), and a map f:XY to f=f:XY defined by f(x1xn)=f(x1)f(xn). The unit ηX:XX is given by ηX(x)=x and the multiplication μX:(X)X by flattening (concatenation) of words. Algebras for correspond precisely to monoids.

4.2 Automata with Effects

Automata with computational effects in a monad were introduced in previous work [59, 31]. PFAs as in Definition 3.1 are the instance where the monad is 𝒟 and the output algebra is the free 𝒟-algebra 𝒟2.

Assumption 4.2.

We fix a monad 𝕋=(T,η,μ) on 𝖲𝖾𝗍 and a 𝕋-algebra O.

Definition 4.3.

A finite 𝕋-automaton (𝕋-FA) 𝒜=(Q,δ,i,f) consists of a finite set Q of states together with two 𝒦(𝕋)-morphisms and a map as shown below:

i:δ:o:

They represent an initial state, transitions, and outputs, respectively. We define the curried version δ¯:Σ(Q  Q) of δ and the extended transition map δ¯:Σ(Q  Q) just like in Definition 3.1. The language L:ΣO computed by 𝒜 is given by L(w)=iδ¯(w);o#.

Remark 4.4.
  1. (1)

    Earlier works [59, 31] model 𝕋-automata as coalgebras QO×(TQ)Σ, and their semantics is defined via the final coalgebra for the functor FX=O×XΣ, carried by the set OΣ of languages [54]. Definition 4.3 is equivalent to the coalgebraic one – modulo initial state and currying of transitions – yet better suited for our algebraic constructions.

  2. (2)

    𝕋-automata and their language semantics are also instances of the framework of functor automata by Colcombet and Petrişan [24] interpreted either in the Kleisli category for 𝕋 for free output algebras O=𝕋O0, or in the Eilenberg-Moore category for 𝕋 for general output algebras and the state object being the free algebra 𝕋Q.

  3. (3)

    Every 𝕋-FA is equivalent to a 𝕋-FA with a pure initial state: one may simply add a new pure initial state simulating the behaviour of a non-pure one [64, Rem. 1]. Hence, the essence of the effectful nature of 𝕋-FAs lies in the transitions.

Example 4.5.
  1. (1)

    𝒟-FAs with the convex output set O=𝒟2 are precisely PFAs.

  2. (2)

    𝒞-FAs are nondeterministic probabilistic finite automata (NPFAs[64]. They combine nondeterministic and probabilistic branching and as such are closely related to Segala systems [58] and Markov decision processes [47, Ch. 36]. We take the output convex semilattice O=[0,1]max given by the interval [0,1] with its usual convex structure and taking maxima as the semilattice operation. An NPFA 𝒜=(Q,δ,i,o) consists of a finite state set Q and maps i:1𝒞Q, δ:Q×Σ𝒞Q, and o:Q[0,1]. If 𝒜 is in state q and receives the input letter a, then it chooses a distribution dδ(q,a) and transitions to the state q with the probability d(q). The choices are made with the goal of maximizing the acceptance probability of the input. Formally, 𝒜 computes the probabilistic language Lmax:Σ[0,1] defined for w=a1an by

    wmax{qQn+1d0(q0)(k=1ndqk1,k(qk))o(qn)d0i,q.k.dq,kδ(q,ak)}.

    Under this semantics, NPFAs are more expressive than PFAs [64]. Two alternative semantics emerge by modifying the output convex semilattice: for O=[0,1]𝗆𝗂𝗇, the interval [0,1] with the minimum operation, an NPFA computes the language Lmin:Σ[0,1] of minimal acceptance probabilities. For O=𝒞2, the convex semilattice of closed subintervals of [0,1], it computes the language Lint:Σ𝒞2 sending wΣ to the interval [Lmin(w),Lmax(w)]. See van Heerdt et al. [64] for a detailed coalgebraic account of the different semantics.

  3. (3)

    𝒮-FAs with the output semimodule O=𝒮1S, are precisely weighted finite automata (WFAs[26] over the semiring S. A WFA computes a weighted language (or formal power series) L:ΣS, which is given by the formula (3.1), with sums and products formed in S. Interesting choices for the semiring S include:

    1. (1)

      the semiring of reals – note that PFAs are a special case of WFAs over ;

    2. (2)

      the min-plus or max-plus semiring of natural numbers with the operation min (resp. max) as addition and the operation + as multiplication; WFAs then correspond to min-plus and max-plus automata computing shortest (resp. longest) paths [47, Ch. 5].

    3. (3)

      the semiring of regular languages over an alphabet Γ w.r.t. union and concatenation; WFAs are equivalent to transducers computing relations RΣ×Γ [47, Ch. 3].

4.3 Recognizing Effectful Languages by Bialgebras

We now introduce two modes of algebraic recognition for languages computed by 𝕋-automata. In contrast to the exposition in Section 3 for the instance 𝕋=𝒟, we swap the order of presentation and first consider the recognition by algebraic structures and subsequently the recognition by effectful homomorphisms in Section 4.4, since from the categorical viewpoint the latter is best understood in the context of the former.

Definition 4.6.

A (𝕋,)-bialgebra is a set M equipped with both a 𝕋-algebra structure a:TMM and the structure of a monoid (M,,e). It is a 𝕋-monoid if the monoid multiplication M×MM is a 𝕋-bimorphism: for every mM the maps m(),()m:MM are 𝕋-endomorphisms on (M,a).

Remark 4.7.

Both (𝕋,)-bialgebras and 𝕋-monoids admit a categorical view:

  1. (1)

    (𝕋,)-bialgebras correspond to algebras for the coproduct [1] of the monads 𝕋 and .

  2. (2)

    If 𝕋 is commutative, then 𝕋-monoids correspond to algebras for the composite monad 𝕋 induced by the canonical distributive law of over 𝕋 [41, Thm 4.3.4]. They also correspond to monoid objects in the closed monoidal category (𝖠𝗅𝗀(𝕋),,𝕋1) whose tensor product represents 𝕋-bimorphisms (i.e. 𝕋-morphisms ABC correspond to 𝕋-bimorphisms A×BC[9, 36, 57]. The internal hom of A,B𝖠𝗅𝗀(𝕋) is the algebra [A,B] of all 𝕋-morphisms from A to B, viewed as a subalgebra of the product B|A|. In particular, composition [A,B]×[B,C];[A,C] is a 𝕋-bimorphism, so ([A,A],;,𝗂𝖽A) is a 𝕋-monoid.

Example 4.8.
  1. (1)

    A (𝒟,)-bialgebra is a convex set carrying an additional monoid structure (without any interaction between the two structures). A 𝒟-monoid is a convex monoid, that is, a (𝒟,)-bialgebra whose monoid multiplication satisfies (3.5).

  2. (2)

    For every set X, the set X  X is a (𝕋,)-bialgebra with monoid structure given by Kleisli composition and 𝕋-algebra structure given by the product (𝕋X)X of the free 𝕋-algebra 𝕋X. If 𝕋 is commutative, then X  X is a 𝕋-monoid; in fact, it is isomorphic to the 𝕋-monoid [𝕋X,𝕋X] (˜4.72).

Both the notion of recognition of probabilistic languages by convex monoids and the respective finiteness condition (Definitions 3.15 and 3.16) are instances of the following:

Definition 4.9.

A (𝕋,)-bialgebra M recognizes the language L:ΣO if L=h;p for some monoid morphism h:ΣM and some 𝕋-morphism p:MO.

Definition 4.10.

A 𝕋-algebra (A,a) is finitely generated if there exists a surjective 𝕋-morphism 𝕋G(A,a) for some finite set G. A (𝕋,)-bialgebra is fg-carried if its underlying 𝕋-algebra is finitely generated.

Theorem 4.11.

Suppose that X  X is a finitely generated 𝕋-algebra for every finite set X. Then for every language L:ΣO, the following are equivalent:

  1. (1)

    There exists a 𝕋-FA computing L.

  2. (2)

    There exists an fg-carried (𝕋,)-bialgebra recognizing L.

If the monad 𝕋 is commutative, then these statements are equivalent to:

  1. (3)

    There exists an fg-carried 𝕋-monoid recognizing L.

Remark 4.12.

The condition of the theorem is equivalent to finitely generated 𝕋-algebras being closed under finite products.

Proof sketch.

(1)(2) Let 𝒜=(Q,δ,i,o) be a 𝕋-FA computing L. Then L is recognized by the fg-carried (𝕋,)-bialgebra Q  Q via the monoid morphism δ¯:Σ(Q  Q) and the 𝕋-morphism p:(Q  Q)O which is given by p(f)=if;o#. The proof that p is a 𝕋-morphism requires the initial state i to be pure, which we may assume due to ˜4.4.
(2)(1) Let M be an fg-carried (𝕋,)-bialgebra recognizing L via the monoid morphism h:ΣM and the 𝕋-morphism p:MO. Since M is finitely generated as a 𝕋-algebra, there exists a surjective 𝕋-morphism s:𝕋QM for some finite set Q. Let (M,,e) denote the monoid structure, and let s0:QM and h0:ΣM be the domain restrictions of s and h. We construct a 𝕋-FA 𝒜=(Q,δ,i,o) computing L as follows: we choose i:1  Q and δ:Q×Σ  Q such that the first two diagrams below commute – these choices exist because s is surjective. Moreover, we define o:QO by o:=s0;p as in the third diagram.

    

One can prove that the 𝕋-FA 𝒜 computes the language L.
Now suppose that the monad 𝕋 is commutative. Then (1)(3) is shown like (1)(2), adding the observation that the recognizing (𝕋,)-bialgebra Q  Q is a 𝕋-monoid as in Example 4.82. The implication (3)(2) is trivial.

Example 4.13.

The condition of Theorem 4.11 holds for 𝕋{𝒟,𝒞,𝒮} from Example 4.1. We present for each finite set X a finite set M and a map ξ0:M(X  X) such that ξ=ξ0#:TM(X  X) is surjective. The respective witnesses are given in Figure 4, where XX denotes the set of all partial functions on X. For all three monads, the condition amounts to the observation that every effectful function f:X  X can be built from (partial or total) effect-free functions using 𝕋-operations.

𝕋 𝕋-FA M ξ0:M(X  X)
𝒟 PFA XX ξ0(f)=(xηX(f(x)))
𝒞 NPFA XX ξ0(f)=(xηX(f(x)))
𝒮 WFA XX ξ0(f)=(xηX(f(x)) if f(x) is defined, else 0𝒮X )
Figure 4: Witnesses for X  X being (monoidally) finitely generated.
Remark 4.14.

Using the abstract theory of syntactic structures [14, 63, 2], it is possible to associate a canonical algebraic recognizer to every language L:ΣO, namely its syntactic (𝕋,)-bialgebra and, in the commutative case, its syntactic 𝕋-monoid. The syntactic convex monoid (Theorem 3.21) is an instance of the latter.

Next, we show that, under conditions on 𝕋, fg-carried 𝕋-monoids are finitely presentable.

Definition 4.15.

Let 𝕊 be a monad. An 𝕊-algebra is finitely presentable if it is the coequalizer in 𝖠𝗅𝗀(𝕊) of some pair p,q:𝕊X𝕊Y of 𝕊-morphisms for finite sets X and Y.

Remark 4.16.

For the monad 𝕊 associated to an algebraic theory (Λ,E), an 𝕊-algebra is finitely presentable iff its corresponding Λ-algebra in the variety defined by (Λ,E) admits a finite presentation by generators and relations [4, Prop. 11.28]. Instantiating 𝕊 to the monad corresponding to the algebraic theory of 𝕋-monoids, where 𝕋 is finitary, we obtain a notion of finitely presentable 𝕋-monoid.

Theorem 4.17.

If 𝕋 is finitary and commutative, and finitely generated 𝕋-algebras are finitely presentable, then every fg-carried 𝕋-monoid is finitely presentable.

Note that Theorem 3.19 is the instance of this result for 𝕋=𝒟.

4.4 Recognizing Effectful Languages by Finite Monoids

Our second mode of recognition of 𝕋-FA-computable languages uses effectful monoid morphisms to finite monoids; recognition by probabilistic channels from Section 3.3 is an instance.

Notation 4.18.

Given 𝒦(𝕋)-morphisms fi:Xi  Yi (i=1,2), we define the Kleisli morphism f1ׯf2:X1×X2  Y1×Y2 as in ˜3.6, with 𝕋 in lieu of 𝒟.

Definition 4.19.

A (𝕋-)effectful monoid morphism from a monoid (N,,n) to a monoid (M,,e) is a 𝒦(𝕋)-morphism h:N  M such that the diagram in Figure 3 commutes. The monoid M effectfully recognizes the language L:ΣO if there exists an effectful monoid morphism h:Σ  M and a map p:MO such that L=h;p#.

The key condition on the monad 𝕋 that makes effectful recognition work is isolated in Theorem 4.22. This requires some technical preparation:

Remark 4.20.

The natural transformations πX,Y:TX×TYT(X×Y) of (4.1) and ηX:XTX make T a monoidal functor, that is, π and η satisfy coherence laws w.r.t. the natural isomorphisms (X×Y)×ZX×(Y×Z) and 1×XXX×1 [34, Thm. 2.1]. Consequently, T preserves monoid structures: for every monoid (M,,e), the set TM forms a monoid with neutral element and multiplication, respectively, defined by

1𝑒MηMTMandTM×TMπM,MT(M×M)TTM.

It is folklore that if 𝕋 is commutative, then, for every 𝕋-monoid N, the extension h#:TMN of every monoid morphism h:MN is also a monoid morphism. For non-commutative monads 𝕋, this is not true in general.

Definition 4.21.

A (𝕋,)-bialgebra N is monoidally finitely generated if there exists a finite monoid M and a monoid morphism ξ0:MN whose free extension ξ=ξ0#:TMN is a surjective monoid morphism.

We are now ready to state the desired general result on effectful recognition. Note that its condition implies that of Theorem 4.11; for a separating example see the full version [38].

Theorem 4.22.

Suppose that for every finite set X the (𝕋,)-bialgebra X  X from Example 4.82 is monoidally finitely generated. Then for every language L:ΣO, there exists a 𝕋-FA computing L iff there exists a finite monoid 𝕋-effectfully recognizing L.

Proof sketch.

() Suppose that 𝒜=(Q,δ,i,f) is a 𝕋-FA computing L. The proof of Theorem 4.11 shows that L is recognized by the (𝕋,)-bialgebra Q  Q; say L=g;p for some monoid morphism g:Σ(Q  Q) and some 𝕋-morphism p:(Q  Q)O. (The specific choices of g and p in that proof are not relevant here.) Since Q  Q is monoidally finitely generated, there exists a monoid morphism ξ0:M(Q  Q) such that M is finite and ξ=ξ0#:TM(Q  Q) is a surjective monoid morphism. By the universal property of the free monoid Σ and the surjectivity of ξ, we obtain a (not necessarily unique) monoid morphism h:ΣTM with g=h;ξ to fill the commutative diagram (4.2). Moreover, one can show that h:Σ  M is an effectful monoid morphism. Then the finite monoid M effectfully recognizes L via h and ξ0;p because L=g;p=h;(ξ0;p)#. (4.2) (4.3) () Suppose that the finite monoid (M,,e) effectfully recognizes L via h:Σ  M and p:MO. Then the 𝕋-FA 𝒜=(M,δ,e,p) whose transition map δ:M×Σ  M is given by the composite (4.3) (where h0(a)=h(a) for aΣ) computes L

Example 4.23.

The condition of Theorem 4.22 holds for 𝕋{𝒟,𝒞,𝒮}, as the maps ξ0:M(X  X) from Figure 4 witnessing that X  X is finitely generated also witness that X  X is monoidally finitely generated. We note that the verification that ξ:TM(X  X) is a monoid morphism is non-trivial unless 𝕋 is commutative (see ˜4.20). Proposition 4.24 below simplifies the computations for non-commutative 𝕋, and also explains conceptually why the non-commutative monads 𝒞 and 𝒮 satisfy the condition: for a finite set X, the (𝕋,)-bialgebra X  X is generated by central morphisms.

Recall that a Kleisli morphism f:X  Y is central [22, 49] if for all Kleisli morphisms f:X  Y, the following diagram commutes in 𝒦(𝕋).

(4.4)
Proposition 4.24.

If ξ0:M(X  X) is a monoid morphism whose uncurried form X×M  X is central, then its free extension ξ:TM(X  X) is a monoid morphism.

Remark 4.25.

The witness ξ0:(XX)(X  X) defined by ff;ηX for the monad 𝒟 from Definition 3.4 works more generally for all affine monads [35] (i.e. monads 𝕋 satisfying T11). Hence, our Theorems 4.11 and 4.22 apply to all affine monads 𝕋.

4.5 Applications

We proceed to instantiate Theorems 4.11 and 4.22 to derive algebraic characterizations for several effectful automata models. For 𝕋=𝒟, we recover Theorems 3.10 and 3.18 for PFAs:

Theorem 4.26.

A probabilistic language is PFA-computable iff it is 𝒟-effectfully recognized by a finite monoid iff it is recognized by an fg-carried convex monoid.

For 𝕋=𝒞, we obtain a new algebraic characterization of NPFAs. Note that a (𝒞,)-bialgebra is a convex semilattice with an additional monoid structure.

Theorem 4.27.

A probabilistic language is NPFA-computable iff it is 𝒞-effectfully recognized by a finite monoid iff it is recognized by an fg-carried (𝒞,)-bialgebra.

Finally, 𝕋=𝒮 gives a new algebraic characterization of WFAs. Note that an (𝒮,)-bialgebra is an S-semimodule with an additional monoid structure. For a commutative semiring S, the notion of an 𝒮-monoid is that of an (associative) S-algebra.

Theorem 4.28.

An S-weighted language is WFA-computable iff it is 𝒮-effectfully recognized by a finite monoid iff it is recognized by an fg-carried (𝒮,)-bialgebra, and, for a commutative semiring S, iff it is recognized by an fg-carried S-algebra.

For the case where S is a commutative ring (i.e. has additive inverses), we recover the correspondence between WFAs and fg-carried S-algebras due to Reutenauer [52]. However, Theorem 4.28 applies to every semiring, so it also yields novel algebraic characterizations of min- and max-plus automata and transducers (Example 4.53) as special instances.

5 Conclusions and Future Work

We have developed the foundations of an algebraic theory of automata with generic computational effects. Under suitable conditions on the effect monad 𝕋, we characterized 𝕋-FA-computable languages by algebraic modes of effectful recognition. As special cases, this entails the first algebraic characterizations of probabilistic automata and weighted automata over unrestricted semirings. We proceed to give some prospects for future work.

Proving finite presentability of syntactic convex monoids is challenging even for simple probabilistic languages, as Example 3.24 illustrates. Identifying conditions on the language ensuring this property is thus a natural question.

We aim to extend our theory beyond the category of sets. Of particular interest are nominal sets, which would allow us to capture effectful (e.g. probabilistic or weighted) register automata [15]. The main technical hurdle lies in the construction Q(QQ) of function spaces, which does not preserve orbit-finite (i.e. finitely presentable) nominal sets. This might be overcome via the recently proposed restriction to single-use functions [17, 16].

An orthogonal generalization of our theory concerns effectful languages beyond finite words. This requires switching from monoids to algebras for a monad 𝕊 [14]. For example, taking the monad 𝕊 corresponding to ω-semigroups [45] could lead to algebraic recognition of effectful languages over infinite words. For a generalization of the recognition by 𝕋-monoids, we expect that the interaction between the monads 𝕊 and 𝕋 be given by some form of distributive law similar to the interaction of  and 𝕋.

Lastly, our effectful automata/monoid correspondence could pave the way to a topological account of effectful languages based on effectful profinite monoids. A first glimpse in this direction is given by Fijalkow [27] who analyzes the value-1 problem for PFAs in terms of a notion of free prostochastic monoid. We aim to study effectful versions of the duality theory of profinite monoids [28, 11] and its applications, notably variety theorems [29, 63].

References

  • [1] Jiří Adámek, Stefan Milius, Nathan J. Bowler, and Paul Blain Levy. Coproducts of monads on set. In 27th Annual IEEE Symposium on Logic in Computer Science, (LICS 2012), pages 45–54. IEEE Computer Society, 2012. doi:10.1109/LICS.2012.16.
  • [2] Jiří Adámek, Stefan Milius, and Henning Urbat. A categorical approach to syntactic monoids. Log. Methods Comput. Sci., 14(2):9:1–9:34, 2018. doi:10.23638/LMCS-14(2:9)2018.
  • [3] Jiří Adámek and Jiří Rosický. Locally Presentable and Accessible Categories. London Mathematical Society Lecture Note Series. Cambridge University Press, 1994. doi:10.1017/CBO9780511600579.
  • [4] Jiří Adámek, Jiří Rosický, and E. M. Vitale. Algebraic Theories: A Categorical Introduction to General Algebra. Cambridge Tracts in Mathematics. Cambridge University Press, 2010. doi:10.1017/CBO9780511760754.
  • [5] Jiří Adámek and Vera Trnková. Automata and Algebras in Categories. Springer, 1990.
  • [6] Jorge Almeida. On pseudovarieties, varieties of languages, filters of congruences, pseudoidentities and related topics. Algebra Universalis, 27(3):333–350, 1990. doi:10.1007/BF01190713.
  • [7] Michael A. Arbib and Ernest G. Manes. Machines in a category: An expository introduction. SIAM Review, 16(2):163–192, 1974. doi:10.1137/1016026.
  • [8] Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008.
  • [9] Bernhard Banaschewski and Evelyn Nelson. Tensor products and bimorphisms. Canadian Mathematical Bulletin, 19(4):385–402, 1976. doi:10.4153/CMB-1976-060-2.
  • [10] Nicolas Bedon and Chloé Rispal. Schützenberger and Eilenberg theorems for words on linear orderings. J. Comput. Syst. Sci., 78(2):517–536, 2012. doi:10.1016/J.JCSS.2011.06.003.
  • [11] Fabian Birkmann, Henning Urbat, and Stefan Milius. Monoidal extended Stone duality. In 27th International Conference on Foundations of Software Science and Computation Structures (FoSSaCS 2024), volume 14574 of Lect. Notes Comput. Sci., pages 144–165. Springer, 2024. doi:10.1007/978-3-031-57228-9_8.
  • [12] Achim Blumensath. Regular tree algebras. Log. Methods Comput. Sci., 16(1), 2020. doi:10.23638/LMCS-16(1:16)2020.
  • [13] Achim Blumensath. Algebraic language theory for Eilenberg–Moore algebras. Log. Methods Comput. Sci., 17(2), 2021. doi:10.23638/LMCS-17(2:6)2021.
  • [14] Mikołaj Bojańczyk. Recognisable languages over monads. In Igor Potapov, editor, Proc. 19th International Conference on Developments in Language Theory (DLT), volume 9168 of Lect. Notes Comput. Sci., pages 1–13. Springer, 2015. doi:10.1007/978-3-319-21500-6_1.
  • [15] Mikołaj Bojańczyk, Bartek Klin, and Sławomir Lasota. Automata theory in nominal sets. Log. Methods Comput. Sci., 10(3), 2014. doi:10.2168/LMCS-10(3:4)2014.
  • [16] Mikołaj Bojańczyk, Lê Thành Dung Nguyên, and Rafal Stefański. Function spaces for orbit-finite sets. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024), volume 297 of LIPIcs, pages 130:1–130:20. Schloss Dagstuhl, 2024. doi:10.4230/LIPICS.ICALP.2024.130.
  • [17] Mikołaj Bojańczyk and Rafal Stefański. Single-use automata and transducers for infinite alphabets. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020), volume 168 of LIPIcs, pages 113:1–113:14. Schloss Dagstuhl, 2020. doi:10.4230/LIPIcs.ICALP.2020.113.
  • [18] Mikołaj Bojanczyk and Igor Walukiewicz. Forest algebras. In Jörg Flum, Erich Grädel, and Thomas Wilke, editors, Logic and Automata: History and Perspectives, volume 2 of Texts in Logic and Games, pages 107–132. Amsterdam University Press, 2008.
  • [19] Benedikt Bollig, Paul Gastin, Benjamin Monmege, and Marc Zeitoun. A probabilistic Kleene theorem. In 10th International Symposium on Automated Technology for Verification and Analysis (ATVA 2012), volume 7561 of Lect. Notes Comput. Sci., pages 400–415. Springer, 2012. doi:10.1007/978-3-642-33386-6_31.
  • [20] Filippo Bonchi, Ana Sokolova, and Valeria Vignudelli. The theory of traces for systems with nondeterminism, probability, and termination. Log. Methods Comput. Sci., 18(2), 2022. doi:10.46298/LMCS-18(2:21)2022.
  • [21] Rais G. Bukharaev. Theorie der stochastischen Automaten. Vieweg+Teubner, 1995.
  • [22] Titouan Carette, Louis Lemonnier, and Vladimir Zamdzhiev. Central submonads and notions of computation: Soundness, completeness and internal languages. In 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2023), pages 1–13. IEEE, 2023. doi:10.48550/arXiv.2207.09190.
  • [23] Thomas Colcombet. Regular cost functions, part i: Logic and algebra over words. Log. Methods Comput. Sci., 9(3), 2013. doi:10.2168/LMCS-9(3:3)2013.
  • [24] Thomas Colcombet and Daniela Petrişan. Automata minimization: a functorial approach. Log. Methods Comput. Sci., 16(1), 2020. doi:10.23638/LMCS-16(1:32)2020.
  • [25] Volker Diekert, Paul Gastin, and Manfred Kufleitner. A survey on small fragments of first-order logic over finite words. Int. J. Found. Comput. Sci., 19(3):513–548, 2008. doi:10.1142/S0129054108005802.
  • [26] Manfred Droste, Werner Kuich, and Heiko Vogler. Handbook of Weighted Automata. Springer, 2009. doi:10.1007/978-3-642-01492-5.
  • [27] Nathanaël Fijalkow. Profinite techniques for probabilistic automata and the Markov monoid algorithm. Theor. Comput. Sci., 680:1–14, 2017. doi:10.1016/j.tcs.2017.04.006.
  • [28] Mai Gehrke. Stone duality, topological algebra, and recognition. Journal of Pure and Applied Algebra, 220(7):2711–2747, 2016. doi:10.1016/j.jpaa.2015.12.007.
  • [29] Mai Gehrke, Serge Grigorieff, and Jean-Éric Pin. Duality and equational theory of regular languages. In 35th International Colloquium on Automata, Languages and Programming (ICALP 2008), volume 5126 of Lect. Notes Comput. Sci., pages 246–257. Springer, 2008. doi:10.1007/978-3-540-70583-3_21.
  • [30] Joseph A. Goguen. Minimal realization of machines in closed categories. Bulletin of the American Mathematical Society, 78(5):777–783, 1972. doi:10.1090/S0002-9904-1972-13032-5.
  • [31] Sergey Goncharov, Stefan Milius, and Alexandra Silva. Towards a coalgebraic Chomsky hierarchy. In 8th International Conference on Theoretical Computer Science (TCS 2014), volume 8705 of Lect. Notes Comput. Sci., pages 265–280. Springer, 2014. doi:10.1007/978-3-662-44602-7_21.
  • [32] Branko Grünbaum, Volker Kaibel, Victor Klee, and Günter M. Ziegler. Convex Polytopes. Springer, 2003. doi:10.1007/978-1-4613-0019-9.
  • [33] Christian Hensel, Sebastian Junges, Joost-Pieter Katoen, Tim Quatmann, and Matthias Volk. The probabilistic model checker Storm. Int. J. Softw. Tools Technol. Transf., 24(4):589–610, 2022. doi:10.1007/s10009-021-00633-z.
  • [34] Anders Kock. Monads on symmetric monoidal closed categories. Arch. Math, 21:1–10, 1970. doi:10.1007/BF01220868.
  • [35] Anders Kock. Bilinearity and cartesian closed monads. Mathematica Scandinavica, 29:161–174, 1971. doi:10.7146/math.scand.a-11042.
  • [36] Anders Kock. Closed categories generated by commutative monads. Journal of the Australian Mathematical Society, 12(4):405–424, 1971. doi:10.1017/S1446788700010272.
  • [37] Marta Z. Kwiatkowska, Gethin Norman, and David Parker. PRISM 4.0: Verification of probabilistic real-time systems. In Ganesh Gopalakrishnan and Shaz Qadeer, editors, Computer Aided Verification - 23rd International Conference, CAV 2011, Snowbird, UT, USA, July 14-20, 2011. Proceedings, volume 6806 of Lecture Notes in Computer Science, pages 585–591. Springer, 2011. doi:10.1007/978-3-642-22110-1_47.
  • [38] Fabian Lenke, Stefan Milius, Henning Urbat, and Thorsten Wißmann. Algebraic language theory with effects. CoRR, 2024. doi:10.48550/arXiv.2410.12569.
  • [39] Saunders Mac Lane. Categories for the Working Mathematician. Springer, 1998. doi:10.1007/978-1-4757-4721-8.
  • [40] Ernest G. Manes. Algebraic Theories, volume 26 of Graduate Texts in Mathematics. Springer, 1976. doi:10.1007/978-1-4612-9860-1.
  • [41] Ernest G. Manes and Philip Mulry. Monad compositions I: General constructions and recursive distributive laws. Theory and Applications of Categories, 18:172–208, 2007.
  • [42] Robert McNaughton and Seymour A. Papert. Counter-Free Automata (M.I.T. Research Monograph No. 65). The MIT Press, 1971. doi:10.5555/1097043.
  • [43] Eugenio Moggi. Notions of computation and monads. Information and Computation, 93(1):55–92, 1991. doi:10.1016/0890-5401(91)90052-4.
  • [44] Azaria Paz. Introduction to probabilistic automata. Elsevier, 1971. doi:10.1016/c2013-0-11297-4.
  • [45] Dominique Perrin and Jean-Éric Pin. Infinite words - automata, semigroups, logic and games, volume 141 of Pure and applied mathematics series. Elsevier Morgan Kaufmann, 2004.
  • [46] Jean-Éric Pin. Varieties Of Formal Languages. Plenum Publishing Co., 1986.
  • [47] Jean-Éric Pin, editor. Handbook of Automata Theory. European Mathematical Society Publishing House, Zürich, Switzerland, 2021. doi:10.4171/Automata.
  • [48] Jean-Eric Pin. Mathematical foundations of automata theory. https://www.irif.fr/ jep/PDF/MPRI/MPRI.pdf, February 2022.
  • [49] John Power. Premonoidal categories as categories with algebraic structure. Theor. Comput. Sci., 278(1-2):303–321, 2002. doi:10.1016/S0304-3975(00)00340-6.
  • [50] M. O. Rabin and D. Scott. Finite automata and their decision problems. IBM Journal of Research and Development, 3(2):114–125, 1959. doi:10.1147/rd.32.0114.
  • [51] Michael O. Rabin. Probabilistic automata. Inform. Control, 6(3):230–245, 1963. doi:10.1016/S0019-9958(63)90290-0.
  • [52] C. Reutenauer. Séries formelles et algèbres syntactiques. J. Algebra, 66:448–483, 1980. doi:10.1016/0021-8693(80)90097-6.
  • [53] Wojciech Różowski and Alexandra Silva. A completeness theorem for probabilistic regular expressions. In 39th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2024). ACM, 2024. doi:10.1145/3661814.3662084.
  • [54] Jan J. M. M. Rutten. Universal coalgebra: a theory of systems. Theor. Comput. Sci., 249(1):3–80, 2000. doi:10.1016/S0304-3975(00)00056-6.
  • [55] Saeed Salehi and Magnus Steinby. Tree algebras and varieties of tree languages. Theoret.  Comput. Sci., 377(1-3):1–24, 2007. doi:10.1016/j.tcs.2007.02.006.
  • [56] Marcel Paul Schützenberger. On finite monoids having only trivial subgroups. Inform. and Control, 8:190–194, 1965. doi:10.1016/S0019-9958(65)90108-7.
  • [57] Gavin J. Seal. Tensors, monads and actions. Theory and Applications of Categories, 28:403–433, 2012. doi:10.48550/arXiv.1205.0101.
  • [58] Roberto Segala. Modeling and Verification of Randomized Distributed Real-time Systems. PhD thesis, Massachusetts Institute of Technology, 1995. URL: https://hdl.handle.net/1721.1/36560.
  • [59] Alexandra Silva, Filippo Bonchi, Marcello Bonsangue, and Jan Rutten. Generalizing determinization from automata to coalgebras. Log. Methods Comput. Sci., 9(1), 2013. doi:10.2168/LMCS-9(1:9)2013.
  • [60] Ana Sokolova and Harald Woracek. Congruences of convex algebras. Journal of Pure and Applied Algebra, 219(8):3110–3148, 2015. doi:10.1016/j.jpaa.2014.10.005.
  • [61] Marshall H. Stone. Postulates for the barycentric calculus. Annali di Matematica Pura ed Applicata, 29(1):25–30, 1949. doi:10.1007/BF02413910.
  • [62] Tadeusz Swirszcz. Monadic functors and convexity. Polonaise des Sciences. Sér. des sciences math., astr. et phys., 22, 1974.
  • [63] Henning Urbat, Jiří Adámek, Liang-Ting Chen, and Stefan Milius. Eilenberg theorems for free. In 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017), volume 83 of LIPIcs, pages 43:1–43:14. Schloss Dagstuhl, 2017. doi:10.4230/LIPIcs.MFCS.2017.43.
  • [64] Gerco van Heerdt, Justin Hsu, Joël Ouaknine, and Alexandra Silva. Convex language semantics for nondeterministic probabilistic automata. In 15th International Colloquium on Theoretical Aspects of Computing (ICTAC 2018), volume 11187 of lncs, pages 472–492. Springer, 2018. doi:10.1016/j.tcs.2025.115191.
  • [65] Thomas Weidner. Probabilistic automata and probabilistic logic. In 37th International Symposium on Mathematical Foundations of Computer Science (MFCS 2012), volume 7464 of Lect. Notes Comput. Sci., pages 813–824. Springer, 2012. doi:10.1007/978-3-642-32589-2_70.
  • [66] Thomas Wilke. An Eilenberg theorem for -languages. In Proc. 18th International Colloquium on Automata, Languages and Programming (ICALP), volume 510 of Lect. Notes Comput. Sci., pages 588–599. Springer, 1991. doi:10.5555/646245.756766.