Categories for Automata and Language Theory
Abstract
Categorical methods have a long history in automata and language theory, but a coherent theory has started to emerge only in recent years. The abstract viewpoint of category theory can provide a unifying perspective on various forms of automata; it can make it easier to bootstrap a theory in a new setting; and it provides conceptual clarity regarding which aspects and properties are fundamental and which are only coincidental.
Due to being in its early stages, the field is currently still divided into several different communities with little connections between them. One of the central aims of the Dagstuhl Seminar “Categories for Automata and Language Theory” (25141) was to enhance communication between automata theory and category theory communities. To this end, the seminar brought together researchers from both areas and included introductory tutorials designed to provide common ground and help participants better understand each other’s approach and terminology.
The following key topics were explored during the seminar:
-
Categorical unification of language theory, either via the theory of monads, or via the category of -transductions and their composition;
-
Coalgebraic methods and their applications to automata theory, to infinite trace semantics and connection to bisimulation-invariant fragments of logics;
-
Functorial automata and generic algorithms therein;
-
Fibrational and monoidal perspectives on language theory;
-
Higher-order automata and profinite lambda-calculus.
Keywords and phrases:
categorical automata theory, automata theory, category theory, monadsSeminar:
March 30 – April 4, 2025 – https://www.dagstuhl.de/251412012 ACM Subject Classification:
Theory of computation Algebraic language theory ; Theory of computation Automata extensionsCopyright and License:
1 Executive Summary
Achim Blumensath (Masaryk University – Brno, CZ)
Mikołaj Bojańczyk (University of Warsaw, PL)
Bartek Klin (University of Oxford, GB)
Daniela Petrişan (Université Paris Cité, FR)
License:
Creative Commons BY 4.0 International license © Achim Blumensath, Mikołaj Bojańczyk, Bartek Klin, and Daniela Petrişan
Categorical methods have a long history in automata and language theory (see, e.g., [1, 2, 3, 4, 5] for early examples), but only in recent years a coherent theory has started to emerge. As an indication of the increasing popularity, note that papers on categories and automata have regularly appeared in the last three years, both at ICALP(B) and at LICS. These are the top conferences in the field of Track B of Theoretical Computer Science, but a similar pattern holds across other conferences in this field. There are several reasons why these abstract approaches have become popular in automata theory.
-
They provide a unifying perspective on various forms of automata. For example, minimisation and learning algorithms for deterministic automata, weighted automata and sequential transducers can be seen as instances of a generic algorithm given at an abstract category-theoretic level [6].
-
They provide conceptual clarity regarding which aspects and properties are fundamental and which are only coincidental. For example, the semiring based formalism to formal languages treats addition and multiplication symmetrically, while a more general approach [9] reveals that multiplication is specific to the shapes of the objects in question while addition is universal and related to the power-set operation.
The field is still in its early stages, and as a result it is still divided into several different communities with little connections between them. The purpose of this Dagstuhl Seminar was to connect researchers active in these communities; to make them aware of the work of other groups; to initiate collaborations; and to discuss recent developments and possible ways to go forward.
Organization of the seminar
The organisers designed the schedule to strike a balance between survey talks, focused presentations, and free time for informal discussions. Participants were encouraged to share their work and highlight recent advances in their respective fields. Given the diverse backgrounds of the attendees, the organisers invited several participants to deliver extended tutorial-style talks on specific topics to help bridge disciplinary gaps. Five such talks were presented, as listed below.
-
1.
Achim Blumensath: Monads and Recognisability. This tutorial aims to provide an accessible introduction and a broad overview of a recently developed framework for algebraic language theory, which is based on monads and Eilenberg–Moore algebras.
-
2.
Mikołaj Bojańczyk: The Composition Method. This talk explores the connection between algebra and logic through a categorical lens intended to support generalisations beyond word languages, but from a viewpoint dual to the one adopted in the first tutorial: logic serves as the primary notion, and algebraic structures are derived from it. The central topic of interest is the category of -transductions.
-
3.
Pawel Sobocinski: String Diagrams. The talk introduced string diagrams – a mathematical notation rooted in (monoidal) category theory – and its applications through computer science. In particular, it discussed monoidal automata and their languages of string diagrams.
-
4.
Yde Venema: Coalgebra. The talk offered a gentle introduction to universal coalgebra as a broad categorical framework for modeling state-based evolving systems. It discussed coinduction, behavioral equivalence and bisimilarity.
-
5.
Noam Zeilberger A tutorial on (generalized) fibrations for logic, automata and language theory. The talk provided an introduction to certain fibrational concepts from category theory and their relevance to addressing “lifting problems” in logic, automata, and language theory.
Apart from the tutorials, 30 other participants delivered short presentations on recent work related to the topics listed above. Two additional sessions were set aside to allow time for informal discussions and interactions.
Conclusions
The organizers considered the seminar to be a success. Most participants reported gaining new insights from other areas and many expressed interest in applying these ideas to advance their own research. Among the participants who filled in the survey, more than half evaluated the scientific quality of the seminar as outstanding. We have striven to integrate junior researchers and many of them gave talks. This came at the expense of having less time dedicated to personal interactions.
References
- [1] P. L. F. Rudolf E. Kalman and M. A. Arbib, Topics in Mathematical System Theory, McGraw-Hill, 1969.
- [2] S. Eilenberg, Automata, Languages, and Machines: volume A, Pure and applied mathematics, Academic Press, 1974.
- [3] M. A. Arbib and E. G. Manes, A categorist’s view of automata and systems, in Category Theory Applied to Computation and Control, Proceedings of the First International Symposium, San Francisco, CA, USA, February 25–26, 1974, Proceedings, E. G. Manes, ed., vol. 25 of Lecture Notes in Computer Science, Springer, 1974, pp. 51–64.
- [4] H. Ehrig, K. Kiermeier, H. Kreowski, and W. Kühnel, Universal theory of automata. A categorial approach, Teubner Studienbücher, Teubner, 1974.
- [5] B. Tilson, Categories as algebra: An essential ingredient in the theory of monoids, Journal of Pure and Applied Algebra, 48 (1987), pp. 83–198.
- [6] T. Colcombet, D. Petrişan, and R. Stabile, Learning automata and transducers: A categorical approach, in 29th EACSL Annual Conference on Computer Science Logic, CSL 2021, January 25–28, 2021, Ljubljana, Slovenia (Virtual Conference), C. Baier and J. Goubault-Larrecq, eds., vol. 183 of LIPIcs, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021, pp. 15:1–15:17.
- [7] M. Bojańczyk, Recognisable languages over monads. Unpublished note, arXiv:1502.04898v1.
- [8] Regular Tree Algebras, Logical Methods in Computer Science, 16 (2020), pp. 16:1–16:25.
- [9] The Power-Set Operation for Tree Algebras, Logical Methods in Computer Science, (to appear).
2 Table of Contents
3 Overview of Talks
3.1 Tutorial: Monads and Recognisability
Achim Blumensath (Masaryk University – Brno, CZ)
License:
Creative Commons BY 4.0 International license © Achim Blumensath
This tutorial intends to give an introduction to and an overview of the recently developed framework for algebraic language theory based on monads and Eilenberg-Moore algebras [3, 4, 5, 1, 2]. This framework was developed to support languages of various types, in particular those of words and trees (both finite and infinite ones). The main results concern the existence of syntactic algebras, an Eilenberg Variety Theorem, and a Reiterman Theorem.
References
- [1] A. Blumensath, Algebraic Language Theory for Eilenberg–Moore Algebras, Logical Methods in Computer Science, 17 (2021), pp. 6:1–6:60.
- [2] A. Blumensath, Abstract Algebraic Language Theory, book in preparation, https://www.fi.muni.cz/~blumens/ALT.pdf.
- [3] M. Bojańczyk, Recognisable languages over monads, unpublished note, arXiv:1502.04898v1.
- [4] M. Bojańczyk, Languages Recognised by Finite Semigroups and their generalisations to objects such as Trees and Graphs with an emphasis on definability in Monadic Second-Order Logic, lecture notes, arXiv:2008.11635, 2020.
- [5] H. Urbat, J. Adámek, L.-T. Chen, S. Milius, Eilenberg Theorems for Free, in 42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017, August 21–25, 2017 – Aalborg, Denmark, vol. 83, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017, pp. 43:1–43:15.
3.2 Tutorial: The Composition Method
Mikołaj Bojańczyk (University of Warsaw, PL)
License:
Creative Commons BY 4.0 International license © Mikołaj Bojańczyk
In this talk I would like to discuss the connection between algebra (e.g. semigroups) and logic (e.g. ) for defining languages, in a categorical perspective, which is meant to be helpful for generalisations beyond words.There are in fact two perspectives: one, using monads, focuses on algebras, and logic becomes a derived notion. This talk is about a dual perspective, where logic is the basic notion, and algebras are derived. The main topic of interest is -transductions.
3.3 Tutorial: String Diagrams
Pawel Sobocinski (Tallinn University of Technology, EE)
License:
Creative Commons BY 4.0 International license © Pawel Sobocinski
This tutorial introduced string diagrams, a mathematical notation that originates in (monoidal) category theory, through a computer science lens. In particular, the focus was on string diagrams *as syntax*, generalising classical syntax. Because string diagrams do not have an implicit assumption on the classicality of the underlying data, they are particularly useful for resource sensitive applications. After some discussion of the notion of monoidal theory, generalising the classical notion of algebraic theory, the tutorial finished with the notion of monoidal automaton (joint work with Matt Earnshaw) that accepts languages of string diagrams.
3.4 Tutorial: Coalgebra
Yde Venema (University of Amsterdam, NL)
License:
Creative Commons BY 4.0 International license © Yde Venema
In this tutorial I gave an introduction to universal coalgebra as a general categorical framework for state-based evolving systems. After presenting some motivating examples I formally introduced coalgebras for a set functor T, as well as their morphisms. I then discussed final coalgebras and the concept of coinduction, as a principle for giving both definitions and proofs. I finished with discussing the notions of behavioral equivalence and bisimilarity, and their relationship.
3.5 Tutorial: A tutorial on (generalized) fibrations for logic, automata and language theory
Noam Zeilberger (Ecole Polytechnique - Palaiseau, FR)
License:
Creative Commons BY 4.0 International license © Noam Zeilberger
Joint work of: Paul-André Melliès, Noam Zeilberger
The talk gave an introduction to some fibrational concepts from category theory and the relevance to studying “lifting problems” in logic, automata, and language theory. After presenting some general ideas on representing deductive systems as “bundles of categories”, the bulk of the talk focused on finite-state automata, starting from the classical idea of representing the transition graph of an NFA by a graph homomorphism into the bouquet graph with set of loops , and considering the corresponding functor between free categories. We discussed how to characterize determinism and codeterminism: represents the transition graph of a complete DFA (respectively coDFA) just in case is a discrete opfibration (respectively discrete fibration). Next, emphasizing the importance of the general case, we showed how to characterize functors representing arbitrary (potentially ambiguous) nondeterministic finite-state automata: a functor represents an NFA just in case satisfies the unique lifting of factorizations (ULF) and finitary fiber properties. Finally, we explained how to use this characterization to define NFA over arbitrary categories, recognizing regular languages of arrows.
The tutorial was based on joint work with Paul-André Melliès.
References
- [1] Paul-André Melliès and Noam Zeilberger. Functors are type refinement systems. POPL 2015
- [2] Paul-André Melliès and Noam Zeilberger. The categorical contours of the Chomsky-Schützenberger representation theorem. LMCS (to appear), 2024, arXiv:2405.14703
3.6 Learning automata weighted over number rings: (concretely and) categorically
Quentin Aristote (IRIF - Paris, FR)
License:
Creative Commons BY 4.0 International license © Quentin Aristote
Joint work of: Quentin Aristote, Sam van Gool, Daniela Petrişan, Mahsa Shirmohammadi
We study automata weighted over number rings, that is, rings of integers in an algebraic number field.
We show that number rings are what we call “almost strong Fatou”: if an n-state automaton weighted in a number field recognizes an integer-valued series, then it admits an equivalent n+1-state automaton weighted in the corresponding ring of integers.
We then explain how this fits in a bigger categorical picture: given a well-behaved functor F, we give a generic procedure for retrieving the minimal C-automaton from any D-automaton. This gives in particular a generic reduction of the problem of actively learning C-automata to the problem of actively learning D-automata, which instantiates in particular to a reduction from actively learning automata weighted in number rings to automata weighted in number fields.
3.7 On a Monadic Semantics for Circuit Description Languages
Ugo Dal Lago (University of Bologna, IT)
License:
Creative Commons BY 4.0 International license © Ugo Dal Lago
Joint work of: Andrea Colledan, Ugo Dal Lago, Ken Sakayori
A monad-based denotational model is introduced, which is shown to be adequate for the Proto-Quipper family of calculi, themselves being idealized versions of the Quipper programming language. The use of a monadic approach allows us to keep the value to which a term reduces distinct from the circuit that the term itself produces as a side-effect. In turn, this enables the denotational interpretation of rich type systems in which even the size of the produced circuit is controlled, at the same time justifying some of the design novelties present in such calculi.
3.8 Trace semantics of effectful Mealy machines
Elena Di Lavore (University of Pisa, IT)
License:
Creative Commons BY 4.0 International license © Elena Di Lavore
Joint work of: Elena Di Lavore, Filippo Bonchi, Mario Román
Main reference: Filippo Bonchi, Elena Di Lavore, Mario Román: “Effectful Mealy Machines: Bisimulation and Trace”, CoRR, Vol. abs/2410.10627, 2024.
The talk introduces effectful Mealy machines and gives them semantics both in terms of bisimilarity and traces. Bisimilarity is characterised syntactically, via free uniform feedback. Traces are a coinductive construction. Effectful Mealy machines, their bisimilarity and trace capture existing flavours of Mealy machines, bisimilarity and trace.
3.9 Context-free languages of string diagrams
Matthew David Earnshaw (Tallinn University of Technology, EE)
License:
Creative Commons BY 4.0 International license © Matthew David Earnshaw
Joint work of: Matthew David Earnshaw, Mario Román
Main reference: Matt Earnshaw, Mario Román: “Context-Free Languages of String Diagrams”, CoRR, Vol. abs/2404.10653, 2024.
We introduce context-free languages of morphisms in monoidal categories, extending recent work on the categorification of context-free languages, and regular languages of string diagrams. Context-free languages of string diagrams include classical context-free languages of words, trees, and hypergraphs, when instantiated over appropriate monoidal categories. We prove a representation theorem for context-free languages of string diagrams: every such language arises as the image under a monoidal functor of a regular language of string diagrams.
3.10 Elements of Higher-Dimensional Automata Theory
Uli Fahrenberg (EPITA - Cesson-Sévigné, FR)
License:
Creative Commons BY 4.0 International license © Uli Fahrenberg
Joint work of: Uli Fahrenberg, Amazigh Amrane, Hugo Bazille, Emily Clement, Krzysztof Ziemiański
We introduce higher-dimensional automata (HDAs) and their languages, which consist of interval pomsets with interfaces (ipomsets). We then show a decomposition property which allows to develop an isomorphism between the category of ipomsets and a category generated by special discrete ipomsets (“starters” and “terminators”) under the relation which allows to compose subsequent starters and subsequent terminators. This in turn allows us to introduce an operational semantics for HDAs as so-called ST-automata: finite automata over the graph alphabet of starters and terminators.
3.11 Systems of fixpoint equations categorically
Zeinab Galal (University of Bologna, IT)
License:
Creative Commons BY 4.0 International license © Zeinab Galal
Fixpoints play an important role in both denotational semantics where they are used to represent recursively defined programs and data types as well as in operational semantics where many behavioral equivalences are obtained as fixpoints of some relation transformers.
In the categorical theory of fixpoint operators, we usually consider one fixpoint operator at a time and little attention is given to the study of mixed fixpoint operators where we take a different fixpoint operator for each variable. Systems combining least and greatest fixpoints over lattices are an important example as they are the basis of many static analysis and model checking methods.
I will present in this talk an axiomatization of mixed fixpoint operators first in the 1-categorical setting and then briefly mention how to extend to 2-categories in order to capture the examples of initial algebras and coalgebras of accessible functors, analytic and polynomial functors.
3.12 Thin Coalgebraic Behaviours Are Inductive
Helle Hvid Hansen (University of Groningen, NL)
License:
Creative Commons BY 4.0 International license © Helle Hvid Hansen
Joint work of: Anton Chernev, Corina Cîrstea, Helle Hvid Hansen, Clemens Kupke
Main reference: Anton Chernev, Corina Cîrstea, Helle Hvid Hansen, Clemens Kupke: “Thin Coalgebraic Behaviours Are Inductive”, CoRR, Vol. abs/2504.07013, 2025.
F-coalgebra automata provide a unifying, categorical setting for studying automata-theoretic verification of a variety of system types. For certain applications in quantitative model checking [1], it is crucial that the property to be checked is given by an unambiguous automaton, i.e., there is at most one accepting run on each input. This leads to the question of when unambiguous F-coalgebra automata exist. This question is also of fundamental interest, beyond verification.
For infinite words, the situation is easy, since parity word automata can be determinised. For trees, it is known that deterministic automata are less expressive than nondeterministic ones, but one can recover unambiguous acceptance when restricting to thin trees, i.e., trees with only countably many infinite branches [2, 3].
Inspired by the results on thin trees, we aim to develop a similar theory for thin F-coalgebras. This talk presents the first part of this program. We show that for analytic functors F, we can define thin F-coalgebras as those coalgebras with only countably many infinite paths from each state. Our main result is an inductive characterisation of thinness via an initial algebra. To this end, we develop a syntax for thin behaviours and give a sound and complete axiomatisation of when two terms represent the same thin behaviour. Finally, for the special case of polynomial functors (the type of ranked ordered trees), we retrieve from our syntax the notion of Cantor-Bendixson rank of a thin tree.
References
- [1] Cîrstea, C. and Kupke, C.,“Measure-theoretic semantics for quantitative parity automata,” in 31st EACSL Annual Conference on Computer Science Logic (CSL 2023), B. Klin and E. Pimentel, Eds., ser. Leibniz International Proceedings in Informatics (LIPIcs), vol. 252, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023, 14:1–14:20.
- [2] Idziaszek, T., Skrzypczak, M., and Bojańczyk, M., “Regular languages of thin trees”, Theory Comput. Syst., vol. 58, no. 4, pp. 614–663, 2016.
- [3] Skrzypczak, M. . “Recognition by thin algebras,” in Descriptive Set Theoretic Methods in Automata Theory: Decidability and Topological Complexity. Springer Berlin Heidelberg, 2016, pp. 121–135.
3.13 Algebraic Language Theory with Effects
Henning Urbat (Universität Erlangen-Nürnberg, DE)
License:
Creative Commons BY 4.0 International license © Henning Urbat
Joint work of: Fabian Lenke, Henning Urbat, Stefan Milius, Thorsten Wißmann
Regular languages – the languages accepted by deterministic finite automata – are known to be precisely the languages recognized by finite monoids. This characterization is the origin of algebraic language theory. We generalize the correspondence between automata and monoids to automata with generic computational effects given by a monad, providing the foundations of an effectful algebraic language theory. We show that, under suitable conditions on the monad, a language is accepted by an effectful finite automaton precisely when it is recognizable by (1) an effectful monoid morphism into an effect-free finite monoid, and (2) a monoid morphism into a monad-monoid bialgebra whose carrier is a finitely generated algebra for the monad, the former mode of recognition being conceptually completely new. As applications we obtain novel algebraic characterizations of probabilistic finite automata, nondeterministic probabilistic finite automata, and for weighted finite automata over unrestricted semirings, generalizing previous work on weighted algebraic recognition over commutative rings.
3.14 Automata in W-toposes, and general Myhill-Nerode theorem
Victor Iwaniack (University of Côte d’Azur - Nice, FR)
License:
Creative Commons BY 4.0 International license © Victor Iwaniack
We enrich the functorial viewpoint on automata of Colcombet and Petrişan to work with automata in any topos. The whole minimisation framework adapts to enrichment with notably enriched Kan extensions and enriched orthogonal factorisation systems. We use this general minimisation to deduce a general Myhill-Nerode theorem. This theorem depends on a notion of “finiteness”. By instantiating this framework with the topos of nominal sets and the orbit-finiteness, we recover the Myhill-Nerode theorem of Bojańczyk, Klin and Lasota. But the theorem applies to other finiteness conditions, such as Kuratowski-finiteness or “stalkwise-finiteness”. Moreover, the whole enriched framework can work with other monoidal categories than toposes.
3.15 Graph Automata and Automaton Functors
Barbara König (Universität Duisburg-Essen, DE)
License:
Creative Commons BY 4.0 International license © Barbara König
Joint work of: Sander Bruggink, Christoph Blume, Barbara König
We generalize Courcelle’s recognizable graph languages and results on monadic second-order logic to more general structures.
We give a category-theoretical characterization of recognizability. A recognizable subset of arrows in a category is defined via a functor into the category of relations on finite sets. This can be seen as a straightforward generalization of finite automata and we show how to obtain graph automata - accepting recognizable graph languages – by applying the theory to the category of cospans of graphs.
We also introduce a simple logic that allows to quantify over the subobjects of a categorical object and we show that, for the category of graphs, this logic is equally expressive as monadic second-order graph logic (MSOGL). Furthermore, we explain that in the more general setting of hereditary pushout categories, a class of categories closely related to adhesive categories, we can recover Courcelle’s result that every MSOGL-expressible property is recognizable.
The talk concludes by reviewing a practical implementation of graph automata with applications to the verification of graph transformation systems.
3.16 The relative family construction, the Brzozowski derivatives, and the Myhill-Nerode theorem
Paul-Andre Mellies (Université Paris Cité, FR), Noam Zeilberger (Ecole Polytechnique - Palaiseau, FR)
License:
Creative Commons BY 4.0 International license © Paul-Andre Mellies and Noam Zeilberger
In this talk, I explained how to establish a “run-aware” version of the Myhill-Nerode theorem based on a fibrational / categorical account of Brzozowski derivatives. A non-deterministic finite state automaton is defined as a finitary functor satisfying the unique lifting of factorisation (ULF) property, where E is a category of states and runs, and B is a category of sorts and words.
A relative family construction is introduced, which provides a “categorified” powerset construction for any functor . One key observation is that the resulting functor is a fibration if and only if is a ULF functor.
The observation is applied to the finitary and ULF functor in order to obtain a fibration together with a functor over where the fibration is the pullback of the codomain fibration along the hom functor . The functor itself is obtained from the functor which transports every pair of states to the function associated to the functor .
The functor is shown to transport cartesian maps to cartesian maps in which appear as a “run-aware” form of Brzozowski derivatives in . From this follows a Myhill-Nerode theorem for “run-aware” languages, stating that a language is regular if and only if its class of derivatives is finitely generated (using sums or disjoint unions) at each fiber over an object of the category .
3.17 Algebraic Recognition of Probabilistic Languages
Stefan Milius (Universität Erlangen-Nürnberg, DE)
License:
Creative Commons BY 4.0 International license © Stefan Milius
Joint work of: Fabian Lenke, Stefan Milius, Henning Urbat, Thorsten Wißmann
Regular languages – the languages accepted by deterministic finite automata – are known to be precisely the languages recognized by finite monoids. This characterization is the origin of algebraic language theory. We generalize the correspondence between automata and monoids to probabilistic automata, providing the foundations of a probabilistic algebraic language theory. We show that a language is computable by a probabilistic finite automaton precisely when it is recognizable by (1) a probabilistic monoid morphism into an (ordinary) finite monoid, and (2) an (ordinary) monoid morphism into an fg-carried convex monoid: a finitely generated convex set equipped with a monoid operation that distributes over the convex structure. The former mode of recognition is conceptually completely new. Moreover, every probabilistic laguage has a syntactic monoid, that is, a minimal algebraic recognizer. However, the syntactic monoid is not fg-carried in general. As an open problem we leave the question whether the syntactiv monoid is finitely presentable as a convex monoid.
3.18 Higher-order regular languages and profinite lambda-terms
Vincent Moreau (Université Paris Cité, FR)
License:
Creative Commons BY 4.0 International license © Vincent Moreau
Joint work of: Vincent Moreau, Sam van Gool, Paul-André Melliès
A fundamental observation at the heart of the topological approach to language theory is the fact that the topological space of profinite words is the Stone dual of the Boolean algebra of regular languages Reg() over the alphabet . Using ideas coming from the seminal work of Salvati on languages of -terms, who introduced the family of Boolean algebras Reg(A) for any type A, we introduce the space of profinite -terms of type A as the Stone dual of Reg(A), which generalizes to the higher-order setting the notion of profinite word. Types and profinite -terms assemble into a Stone-enriched cartesian closed category ProLam, which is the free such category C that recognizes at most regular languages in the sense of Salvati. This demonstrates the compositional aspects of profinite -terms, which we think of as the terms of an extension of the -calculus with profinite operators.
3.19 Relative Membership of Regular Languages
Rémi Morvan (University of Bordeaux, FR)
License:
Creative Commons BY 4.0 International license © Rémi Morvan
Given a regular language Omega and a class of regular languages V, we want to understand when the class of languages that can be written as the intersection of Omega with a a language from V has decidable membership. We provide a sufficient condition on Omega such that whenever V has decidable membership (and has some mild closure properties, i.e. is a pseudovariety), then the relativization of V with respect to Omega also has decidable membership.
3.20 On the expressivity of linear recursion schemes
Andrzej Murawski (University of Oxford, GB)
License:
Creative Commons BY 4.0 International license © Andrzej Murawski
Joint work of: Pierre Clairambault, Andrzej S. Murawski
In the last decade or so, higher-order recursion schemes (HORS) have emerged as a promising technique for model-checking higher-order programs. I will discuss several results concerning the case when HORS are typed using linear logic (intuitionistic multiplicative additive linear logic, to be precise). It turns out that such schemes have an automata-theoretic counterpart, namely restricted tree-stack automata, which come from linguistics, where they were introduced to study the so-called multiple context-free languages. This leads to a new perspective on linear HORS and new decidability results. This is joint work with Pierre Clairambault (deterministic case, MFCS’19), Guanyan Li and Luke Ong (probabilistic case, LICS’22).
3.21 Algebraic Recognition of Regular Functions
Lê Thành Dung Nguyên (CNRS Aix-Marseille Univ., FR)
License:
Creative Commons BY 4.0 International license © Lê Thành Dung Nguyên
Joint work of: Mikołaj Bojańczyk, Lê Thành Dung Nguyên
A string-to-string function is regular (i.e. is an -transduction) if and only if it factors as
where is a finiteness-preserving endofunctor on semigroups, and is a component of a natural transformation
3.22 How are we secretly using category theory while proving hardness of satisfying constraints
Jakub Opršal (University of Birmingham, GB)
License:
Creative Commons BY 4.0 International license © Jakub Opršal
Joint work of: Maximilian Hadek, Tomas Jakl, Jakub Opršal
In the talk, I have provided a new categorical perspective on the algebraic approach to the constraint satisfaction problem (CSP). The approach has been a prevalent method of the study of the complexity of these problems since the early 2000s, and many breakthrough achievements can be either directly or indirectly attributed to it. A prime result is the Bulatov–Zhuk Dichotomy Theorem, which states that every finite-template CSP is either in P or NP-complete.
I have explained the gadget reductions used by the algebraic approach as a case of a well-known categorical construction (left Kan extension along Yoneda), and I have stated and proved the fundamental theorem of the algebraic approach in the categorical language. The theorem provides a condition for a CSP to be NP-complete that covers the hardness part of the dichotomy.
3.23 Functor automata - minimization and learning
Daniela Petrişan (Université Paris Cité, FR)
License:
Creative Commons BY 4.0 International license © Daniela Petrişan
Joint work of: Thomas Colcombet, Daniela Petrişan
Main reference: Thomas Colcombet, Daniela Petrişan: “Automata Minimization: a Functorial Approach”, Log. Methods Comput. Sci., Vol. 16(1), 2020.
In this talk I present a categorical approach to automata based on the categorical notion of functor. The basic idea is to see an automaton as a machine taking some input specified by an “input” category and producing some effect – such as non-determinism, words over an output alphabet or probability values – encoded by an “output” category. Usually, the output category is a Kleisli category for a monad specifying a given effect. We discuss how adjunctions between categories can be lifted to adjunctions between categories of automata, encompassing examples such as determinization or completion. We then present sufficient conditions on the output category such that minimization and learning algorithms exist.
References
- [1] Thomas Colcombet and Daniela Petrişan.Automata Minimization: a Functorial Approach. Log. Methods Comput. Sci., vol 16 (1), 2020
- [2] Thomas Colcombet, Daniela Petrişan and Riccardo Stabile.Learning Automata and Transducers: A Categorical Approach. CSL 2021
3.24 Equational theories of algebraic operators on Weihrauch problems
Cécilia Pradic (Swansea University, GB)
License:
Creative Commons BY 4.0 International license © Cécilia Pradic
Joint work of: Cécilia Pradic, Eike Neumann, Arno Pauly, Ian Price
Main reference: Cécilia Pradic: “The equational theory of the Weihrauch lattice with (iterated) composition”, CoRR, Vol. abs/2408.14999, 2024.
Weihrauch reducibility is a notion that has gained a lot of traction in computable analysis in the last decade. It may be regarded as a framework to compare the uncomputational strength of problems, much like reverse mathematics.
Weihrauch problems include natural mathematical problems such as WKL and RT, but the corresponding degrees also enjoy a rich algebraic structure induced by algebraic operation on problems. A great number of these operations correspond to the structure of the category problems and reductions, which is equivalent to (a full subcategory of) the category of containers/polynomials over represented spaces and computable maps, a well-known nice category which is bicartesian closed and monoidal closed among other things.
After introducing these notions, I will discuss the equational theory of the Weihrauch lattice equipped with the composition product and finite iterations thereof. Terms in this theory can be translated to alternating automata, and reductions regarded as a somewhat weird kind of simulation. This leads to decidability and a complete axiomatization.
Very little to the development is specific to Weihrauch problems and could potentially be adapted to handle containers over a range of extensive locally cartesian closed categories with dependent W-types.
3.25 Conformance Testing for Automata in a Category
Jurriaan Rot (Radboud University Nijmegen, NL)
License:
Creative Commons BY 4.0 International license © Jurriaan Rot
Conformance testing is often used to implement the equivalence query in active automata learning. In this talk, I will highlight this application, and go on to discuss the basic notions of -completeness and the classical -method. I will then discuss recent joint work with Bálint Kocsis ([1], to appear at FoSSaCS 2025) on generalising part of this theory to the abstract setting of automata in a category.
References
- [1] Bálint Kocsis and Jurriaan Rot, Complete Test Suites for Automata in Monoidal Closed Categories, FoSSaCS 2025
3.26 Measure-Theoretic Closure Operator on the Local Varieties of Regular Languages
Ryoma Sin’ya (Akita University, JP)
License:
Creative Commons BY 4.0 International license © Ryoma Sin’ya
Joint work of: Takao Yuyama, Yoshiki Nakamura, Yutaro Yamaguchi, Kazuhiro Inaba
A language L is said to be regular measurable if there exists an infinite sequence of regular languages that “converges” to L. This notion was introduced by the speaker in 2021 [1] and used for classifying non-regular languages by using regular languages. In this talk, we describe why this notion was introduced, and give a brief overview of decidability results relating to the measurability on subclasses (local subvarieties) of regular languages, eg., star-free, generalised definite, and group languages.
References
- [1] Ryoma Sin’ya. Asymptotic Approximation by Regular Languages, In Proceedings of the 47th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2021), 2021.
3.27 Name Allocation in Nominal Automata
Lutz Schröder (Universität Erlangen-Nürnberg, DE)
License:
Creative Commons BY 4.0 International license © Lutz Schröder
Joint work of: Florian Frank, Daniel Hausmann, Dexter Kozen Stefan Milius, Simon Prucker, Henning Urbat, Florian Frank, Daniel Hausmann, Dexter Kozen Stefan Milius, Simon Prucker, Lutz Schröder, Henning Urbat, Thorsten Wißmann
Formal languages over infinite alphabets serve the modelling and specification of structures and processes with data; here, infinite alphabets represent infinite data types equipped with very restricted interfaces, typically just checks for (in)equality. Such formal languages are thus typically referred to as data languages. There is a large variety of automata models for data languages; one of the most established models is the classical register automaton model, in which letters encountered in the input can be stored in a fixed number of registers for later comparison with other letters. Register automata are generally equivalent to automata models over nominal sets; for instance, nondeterministic register automata with nondeterministic update are equivalent to nondeterministic orbit-finite automata. In such equivalences, states of a nominal automaton correspond to configurations of a register automaton, consisting of a control state and the current register content.
In register-based models, expressiveness typically varies with the power of control; e.g., nondeterministic register automata are strictly more expressive than deterministic ones. While membership is typically decidable, inclusion of register automata tends to be either undecidable or of prohibitively high complexity unless stringent restrictions are imposed on either the power of control, e.g. requiring unambiguity, or on the number of registers. We introduce nominal automata models with explicit name allocation, which strike a compromise between expressive power and complexity. In such models, freshness of letters is modelled via alpha-equivalence on words with explicit allocation. Roughly speaking, this means that distinctness of a newly read letter with respect to a stored letter can only be enforced if the stored letter is expected to be seen again. This still allows for typical modes of expression requiring, for instance, explicit closure of named sessions; on the other hand, it allows for inclusion checking in elementary time under unrestricted nondeterminism and without bounding the number of registers. Indeed, the complexity is typically no worse than the complexity of the corresponding finite-alphabet model when the number of registers is fixed as a parameter. This has originally been established for non-deterministic finite-word automata with name allocation; similar results have subsequently been obtained for infinite-word automata, tree automata, and, very recently, for alternating finite-word automata.
3.28 Towards a Theory of Homomorphism Indistinguishability
Tim Seppelt (IT University of Copenhagen, DK)
License:
Creative Commons BY 4.0 International license © Tim Seppelt
Main reference: Tim Seppelt: “Homomorphism Indistinguishability”, Dissertation, RWTH Aachen University, Aachen, 2024.
In 1967, Lovász showed that two graphs and are isomorphic if, and only if, they admit the same number of homomorphisms from every graph . Subsequently, it emerged that many natural graph isomorphism relaxations from fields as diverse as finite model theory, category theory, optimization, quantum information, and algebraic graph theory can be characterized as homomorphism indistinguishability relations over natural restricted graph classes.
In this talk, we set out to abstract from these examples and develop a theory of homomorphism indistinguishability. We propose to answer, given a graph class , the following questions:
-
1.
What is the distinguishing power of the homomorphism indistinguishability relation of ?
-
2.
What is the complexity of deciding homomorphism indistinguishability over ?
We discuss progress on both questions covering Roberson’s conjecture on the homomorphism distinguishing closure, a recent result of myself on the complexity of homomorphism indistinguishability over recognizable of bounded treewidth, and the role of minor-closed graph classes in the emerging theory of homomorphism indistinguishability.
3.29 Arboreal Covers over Relational Structures
Nihil Shah (University of Cambridge, GB)
License:
Creative Commons BY 4.0 International license © Nihil Shah
Joint work of: Nihil Shah, Samson Abramsky, Luca Reggio, Dan Marsden, Yoav Montacute, Anuj Dawar, Tomáš Jakl, Adam O’Conghaile, Amin Karamlou
Main reference: Nihil A Shah: “Arboreal covers over relational structures”, University of Oxford, 2024.
Model-comparison games are an important tool in both finite and unrestricted model theory to prove when two structures satisfy the same sentences in a logic. Game comonads encode particular model comparison games as comonads on the categories of structures. This started with the work of Abramsky, Dawar, and Wang in 2017 on encoding pebble games for finite variable logics. Since this initial work, several examples of have comonads have been engineered to capture a wide-range of logics. Each game comonad provides a categorical characterisation of equivalence in a logic and its variants. The categorical constructions common to these comonads have proved to be a nice tool for organizing tacit connections between syntactic resources in logic, hierarchical approximations to constraint satisfaction and isomorphism, and well-known combinatorial parameters such as treewidth and tree-depth.
Determining the commonalities shared amongst game comonads that enable them to have these features lead to the axiomatic formulation of arboreal category and arboreal cover by Abramsky and Reggio 2021. Arboreal categories axiomatise the notion of a category with “tree-shaped” objects and provide a native setting for dynamic notions like simulation, bisimulation, and resource-indexing. Arboreal covers are comonadic adjunctions to any category that allow application of these dynamic notions to the static objects of the target category. Game comonads all arise as arboreal covers over categories of relational structures.
3.30 Model Completeness, MSOL and Temporal Logics
Silvio Ghilardi (University of Milan, IT)
License:
Creative Commons BY 4.0 International license © Silvio Ghilardi
Joint work of: Silvio Ghilardi, Luca Carai, Sam van Gool
We shall connect classical algebraic model theory with monadic second order logic (MSOL) on infinite structures (natural numbers, infinite binary trees, infinite trees of arbitrary degrees, etc.)
The idea comes from the well-known connection between MSOL and automata: this connection allows a conversion between MSOL formulae and suitable automata (and back). Applying such conversion, every MSOL formula turns out to be equivalent to a formula which is “almost existential”. The “almost” proviso (coming from acceptation conditions) can be removed if the language is enriched with some temporal operators in LTL/CTL style.
First-order theories whose formulae are equivalent to existential formulae are precisely model complete theories. MSOL interpreted on a structure can be viewed as first-order logic (FOL) interpreted on the corresponding power set Boolean first-order structures. From these facts we may formulate the slogan “MSOL is the model companion of temporal logic”. The aim of the talk is to supply results giving a precise formal meaning to such slogan.
3.31 Sound and Complete Axiomatizations of (Infinite) Traces for Probabilistic Transition Systems
Ana Sokolova (Paris Lodron Universität Salzburg, AT)
License:
Creative Commons BY 4.0 International license © Ana Sokolova
Joint work of: Corina Cirstea, Larry Moss, Todd Schmidt, Tori Noquez, Alexandra Silva, Ana Sokolova
This talk is about recent results on axiomatizing infinite trace semantics, also called stream semantics, for generative probabilistic transition systems, also called labelled Markov chains.
The talk is about trace semantics and its equational (sound and complete) axiomatization. Probabilistic transition systems can be represented by suitable probabilistic expressions, with the property that the expression and the transition system (state) have the same (in this case trace) semantics. We then provide equations on these expressions that fully characterize (infinite) trace semantics: two expressions are trace equivalent iff they are provably equivalent with the presented axioms. The axioms for finite traces have been given by Silva and myself in 2011. The soundness and completeness proof is coalgebraic.
In recent work with the group of coathors mentioned above, we present the first sound and complete axiomatization of infinite trace semantics for generative probabilistic transition systems. Our approach is categorical, and we build on recent results on proper functors over convex sets - in particular on a novel and simpler proof of properness of the involved functor. At the core of our proof is a characterization of infinite traces as the final coalgebra of a functor over convex algebras. Somewhat surprisingly, our axiomatization of infinite trace semantics coincides with that of finite trace semantics, even though the techniques used in the completeness proof are significantly different.
3.32 Tree algebras and bisimulation-invariant MSO on finite graphs
Thomas Colcombet (IRIF – CNRS – Université Paris Cité, FR)
License:
Creative Commons BY 4.0 International license © Thomas Colcombet
Joint work of: Amina Doumane, Denis Kuperberg, Thomas Colcombet
In this work with Amina Doumane and Denis Kuperberg, we establish that the bisimulation invariant fragment of MSO over finite transition systems is expressively equivalent over finite transition systems to modal -calculus, a question that had remained open for several decades. The proof goes by translating the question to an algebraic framework, and showing that the languages of regular trees that are recognised by finitary tree algebras whose sorts zero and one are finite are the regular ones. This corresponds for trees to a weak form of the key translation of Wilke algebras to -semigroup over infinite words, and was also a missing piece in the algebraic theory of regular languages of infinite trees for twenty years.
3.33 Profinite words and Stone duality for regular languages
Sam van Gool (Université Paris Cité, FR)
License:
Creative Commons BY 4.0 International license © Sam van Gool
Joint work of: Mai Gehrke, Sam van Gool, Paul-Anrdre Melliès, Vincent Moreau, Benjamin Steinberg
In this talk, we show how profinite words are types, in the model-theoretic sense, for monadic second-order logic. Starting from the Stone duality approach to regular languages developed in [1], we show how the free profinite monoid naturally arises as the dual of a colimit chain of finite Boolean algebras with operators. In the first-order setting, this leads to the model-theoretic view on proaperiodic monoids developed in [2]. We also mention links with the duality-theoretic notion of preserving joins at primes [3, Ch. 8]. These considerations were generalized to give profinite lambda-terms in [4], also see V. Moreau’s talk in the same seminar.
References
- [1] Gehrke, M., S. Grigorieff and J.-E. Pin, Duality and equational theory of regular languages, in: International Colloquium on Automata, Languages and Programming (ICALP), 246–257, 2008.
- [2] van Gool, S. and Steinberg, B. Pro-aperiodic monoids via saturated models, Israel Journal of Mathematics 234: 451-498, 2019.
- [3] Mai Gehrke and Sam van Gool, Topological duality for distributive lattices: Theory and Applications. Cambridge Tracts in Theoretical Computer Science 61. Cambridge University Press, 2024.
- [4] Sam van Gool, Paul-André Melliès and Vincent Moreau, Profinite lambda-terms and parametricity, Mathematical Foundations of Programming Semantics (MFPS), 2023.
3.34 Monadic second-order logic modulo bisimilarity, coalgebraically
Yde Venema (University of Amsterdam, NL)
License:
Creative Commons BY 4.0 International license © Yde Venema
Joint work of: Sebastian Enqvist, Fatemeh Seifan, Yde Venema
The Janin-Walukiewicz Theorem states that on Kripke models, the modal -calculus is the bisimulation-invariant fragment of monadic second-order logic. In the talk I showed how to generalise this result to T-coalgebras for an arbitrary set functor T (satisfying some conditions). In the proof I introduced the notion of coalgebra automata.
3.35 Presheaf automata
Krzysztof Ziemianski (University of Warsaw, PL)
License:
Creative Commons BY 4.0 International license © Krzysztof Ziemianski
Joint work of: Krzysztof Ziemianski, Georg Struth
Main reference: Georg Struth, Krzysztof Ziemianski: “Presheaf automata”, CoRR, Vol. abs/2409.04612, 2024.
I introduce presheaf automata as presheaves over categories with two distinguished families of morphisms. Presheaf automata are a generalisation of different variants of higher-dimensional automata and other automata-like formalisms, including Petri nets and pushdown automata. I develop the foundations of a language theory for them and define runs, languages, notions of regularity and rationality of languages, determinism and bisimulations.
4 Participants
-
Samson Abramsky – University College London, GB
-
Bahareh Afshari – University of Gothenburg, SE
-
Quentin Aristote – IRIF – Paris, FR
-
Achim Blumensath – Masaryk University – Brno, CZ
-
Mikołaj Bojańczyk – University of Warsaw, PL
-
Célia Borlido – University of Coimbra, PT
-
Thomas Colcombet – IRIF – CNRS – Université Paris Cité, FR
-
Ugo Dal Lago – University of Bologna, IT
-
Elena Di Lavore – University of Pisa, IT
-
Matthew David Earnshaw – Tallinn University of Technology, EE
-
Uli Fahrenberg – EPITA – Cesson-Sévigné, FR
-
Zeinab Galal – University of Bologna, IT
-
Silvio Ghilardi – University of Milan, IT
-
Helle Hvid Hansen – University of Groningen, NL
-
Victor Iwaniack – University of Côte d’Azur – Nice, FR
-
Tomas Jakl – Czech Technical University – Prague, CZ
-
Bartek Klin – University of Oxford, GB
-
Barbara König – Universität Duisburg-Essen, DE
-
Aliaume Lopez – University of Warsaw, PL
-
Fosco Loregian – Tallinn University of Technology, EE
-
Jérémie Marquès – University of Milan, IT
-
Paul-Andre Mellies – Université Paris Cité, FR
-
Karla Messing – Universität Duisburg-Essen, DE
-
Stefan Milius – Universität Erlangen- Nürnberg, DE
-
Vincent Moreau – Université Paris Cité, FR
-
Rémi Morvan – University of Bordeaux, FR
-
Andrzej Murawski – University of Oxford, GB
-
Lê Thành Dung Nguyen – CNRS & Aix-Marseille Univ., FR
-
Jakub Opršal – University of Birmingham, GB
-
Daniela Petrişan – Université Paris Cité, FR
-
Cécilia Pradic – Swansea University, GB
-
Jurriaan Rot – Radboud University Nijmegen, NL
-
Lutz Schröder – Universität Erlangen- Nürnberg, DE
-
Tim Seppelt – IT University of Copenhagen, DK
-
Nihil Shah – University of Cambridge, GB
-
Ryoma Sin’ya – Akita University, JP
-
Pawel Sobocinski – Tallinn University of Technology, EE
-
Ana Sokolova – Paris Lodron Universität Salzburg, AT
-
Rafal Stefanski – University of Warsaw, PL
-
Howard Straubing – Boston College, US
-
Henning Urbat – Universität Erlangen- Nürnberg, DE
-
Sam van Gool – Université Paris Cité, FR
-
Yde Venema – University of Amsterdam, NL
-
Jana Wagemaker – Radboud University Nijmegen, NL
-
Noam Zeilberger – Ecole Polytechnique – Palaiseau, FR
-
Krzysztof Ziemianski – University of Warsaw, PL