Abstract Introduction 1 Symbolic dynamics 2 Density of ideals 3 J-class of a shift space 4 Existence of the density 5 On algebraic properties of the density 6 Markov and sofic measures References

Density of Rational Languages Under Shift Invariant Measures

Valérie Berthé ORCID Université Paris Cité, IRIF, CNRS, France Herman Goulet-Ouellet ORCID Faculty of Information Technology, Czech Technical University in Prague, Czech Republic Dominique Perrin ORCID LIGM, Université Gustave Eiffel, Marne-la-Vallée, France
Abstract

We study density of rational languages under shift invariant probability measures on spaces of two-sided infinite words, which generalizes the classical notion of density studied in formal languages and automata theory. The density for a language is defined as the limit in average (if it exists) of the probability that a word of a given length belongs to the language. We establish the existence of densities for all rational languages under all shift invariant measures. We also give explicit formulas under certain conditions, in particular when the language is aperiodic. Our approach combines tools and ideas from semigroup theory and ergodic theory.

Keywords and phrases:
Automata theory, Symbolic dynamics, Semigroup theory, Ergodic theory
Category:
Track B: Automata, Logic, Semantics, and Theory of Programming
Funding:
Valérie Berthé: The first author was supported by the Agence Nationale de la Recherche through the projects “SymDynAr” (ANR-23-CE40-0024) and “IZES” (ANR-22-CE40-0011), as well as by the 2024 ERC Synergy Project DynAMiCs (101167561).
Herman Goulet-Ouellet: The second author was supported by the CTU Global Postdoc Fellowship program.
Dominique Perrin: The third author was supported by the Agence Nationale de la Recherche through the project “IZES” (ANR-22-CE40-0011).
Copyright and License:
[Uncaptioned image] © Valérie Berthé, Herman Goulet-Ouellet, and Dominique Perrin; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Regular languages
Related Version:
Full Version: https://arxiv.org/abs/2504.16708
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

Introduction

The natural density for a language L on a finite alphabet A is defined as the limit in average (if it exists) of the probability that a word of length n belongs to L, where letters are drawn independently with equal probabilities. This notion can be traced back to the work of Berstel [5], who took inspiration from Schützenberger [37]. It has been widely studied in the context of automata theory, logic and the theory of codes [7, 36, 26, 19, 20, 10, 39, 22, 23], and also appears in ergodic theory, for instance in the work of Veech [40, 41]. Densities have also been studied in the more general case where words are drawn with a Bernoulli measure, i.e. letters are drawn independently with possibly different probabilities [7].

This paper is related with recent work on the density of group languages [9] using a more general notion of density motivated by symbolic dynamics. Let μ be a probability measure on the space A of two-sided infinite words on the alphabet A. Then the density of a language L with respect to μ is the limit

δμ(L)=limn1ni=0n1μ({xAx0xi1L}), (1)

if it exists. The classical definition discussed above corresponds to the case where μ is a Bernoulli measure. Our main result states that the density of a rational language L under a shift invariant probability measure μ always exists (Theorem 4.1). A shift invariant measure is a probability measure that behaves well with respect to left extensions of words (see Equation (5)). The proof closely combines dynamical and algebraic methods.

On the dynamical side, the proof relies mainly on the construction of a skew product between the shift space (made of two-sided infinite words) that supports the measure μ and an -class of the finite monoid M defining the rational language L (the transition monoid). The skew product we use is closely related to the notion of wreath product used in semigroup and automata theory [16, Chapter 1, Section 10]. We construct a natural measure on the skew product, called the weighted counting measure, which leads to a closed formula for the density under the condition that this measure is ergodic (Theorem 4.8). This generalizes known results for the case of Bernoulli measures. We also consider in Theorem 3.6 aperiodic languages (also called star-free) for which densities are proved to exist in a strong sense.

On the algebraic side, this construction relies crucially on the theory of Green’s relations and on the key notion of the 𝒥-class associated with a shift. Let the monoid M be the image of A by a morphism φ. The 𝒥-class of M associated with X is the set of generators of the least ideal of M which meets the image by φ of the language of the shift X. Therefore, it can be called the minimal 𝒥-class of the monoid M with respect to the shift X. The use of such a 𝒥-class is a useful tool in automata theory, and appears in several different contexts. See [13] for a survey on the use of this idea, originating in the proof by Schützenberger of the characterization of aperiodic languages and [29] for an application close to the present paper.

Let us give a brief outline of the paper. In Section 1 we present preliminaries on symbolic dynamics. In Section 2 we show how to calculate the densities of ideals of A. In Section 3, we study the 𝒥-class associated with a shift space in a finite monoid and derive an explicit formula for the density of aperiodic languages (Theorem 3.6). We also discuss connections with the notion of degree of an automaton and decidability results concerning the 𝒥-class. In Section 4, we introduce the skew products and weighted counting measures, give the proof of our main result (Theorem 4.1), along with an explicit expression for the density when the weighted counting measure is ergodic (Theorem 4.8). In Section 5, we consider algebraic properties of the density (see Corollary 5.1). Lastly, Section 6 contains a discussion on the case of sofic measures, motivated by Corollary 5.1.

1 Symbolic dynamics

This section covers some necessary material from symbolic dynamics. We refer to [14] for a complete exposition on the topic, including proofs, and also to [17, 34] as alternative sources. For more specialized texts on topological dynamics and ergodic theory, we refer to [31, 42].

1.1 Topological dynamical systems

We first recall some terminology from topological dynamics. A topological dynamical system is a pair (X,T) of a compact metric space X and a continuous transformation T:XX. A subset YX is called invariant if T1(Y)Y. A nonempty topological dynamical system (X,T) is minimal if the only closed invariant subsets of X are X and . Equivalently, all xX have dense forward orbits {Tn(x)n0}. As a weaker notion, (X,T) is transitive if there exists xX with dense forward orbit.

Let μ be a Borel probability measure on X. The support of μ is the smallest closed set of measure 1. We say that μ is invariant if μ(T1U)=μ(U) for every Borel set UX. Note that the support of an invariant measure is a closed invariant subset.

The measure μ is called ergodic if it is invariant and every Borel set such that T1(U)=U has measure 0 or 1. By Birkhoff’s ergodic theorem, ergodicity can be interpreted as a form of asymptotic independence. More precisely, an invariant measure μ is ergodic if and only if

limn1ni=0n1μ(UTiV)=μ(U)μ(V) (2)

for every pair U,V of Borel sets (see [31]). In particular, note that the support of an ergodic measure is a transitive system. As a stronger property, the measure μ is mixing if

limnμ(UTnV)=μ(U)μ(V) (3)

for every pair U,V of Borel sets. Every mixing measure is ergodic but the converse is false.

Let =(X,T) be the set of invariant probability measures on (X,T) and =(X,T) be the subset of those measures which are ergodic. We can view as a subset of the dual 𝒞(X,) of the space of continuous maps X. If we equip 𝒞(X,) with the weak- topology, then is a compact convex subset and is its set of extreme points. In particular if the system (X,T) has only one invariant measure, it must be ergodic; and we call (X,T) uniquely ergodic. Standard results from functional analysis imply that any invariant measure can be decomposed in terms of ergodic measures. More precisely, for every invariant measure μ, there exists a Borel probability measure τ on such that

μ=ν𝑑τ(ν). (4)

More details can be found in e.g. [32, Chapter 12].

1.2 Shift spaces

We now turn to symbolic dynamics. Let A be a finite alphabet. We denote by A the set of two-sided infinite sequences over A equipped with the product topology, where A has the discrete topology. We denote by S the shift transformation on A, defined by y=S(x) if yn=xn+1 for all n. A shift space X on the alphabet A is, by definition, a closed subset of A which is invariant under the shift transformation. Note that (X,S) is a topological dynamical system.

Let A be the free monoid on A, ε be the empty word, and A+=A{ε}. We denote by (X) the set of finite words w which appear in the elements xX and by n(X) the set of words of length n in (X). For a shift space, transitivity and minimality can be characterized in terms of the language (X). A shift space X is transitive if and only if for every u,v(X), there is some w(X) such that uwv(X). Moreover X is minimal if and only if for every n0, there is an N0 such that every word of n(X) appears in every word of N(X). Shift spaces which are transitive are also called irreducible.

For words u,v of length m,n respectively, we define the right and two-sided cylinders by

[v]X={xXx0xn1=v},[uv]X={xXxmxn1=uv}.

The two-sided cylinders form a clopen basis for the topology of X. For L,KA, we define

[L]X=wL[w]X,[LK]X=uL,vK[uv]X.

Let μ be a Borel probability measure on a shift space X. By a slight abuse of notation, we also denote by μ the map which assigns to a word w(X) the number μ([w]X). With this notation, we have μ(ε)=1 and

aAμ(wa)=μ(w).

We extend this to subsets L(X) by letting μ(L)=wLμ(w). When the cylinders [u]X for uL are disjoint, we have μ(L)=μ([L]X). Moreover, if μ is an invariant measure, then

μ(w)=μ([wε]X)=aAμ(aw). (5)

When the cylinders [uε]X for uL are disjoint, we have μ(L)=μ([Lε]X).

Observe that the support of an invariant measure on A is a shift space X of measure 1, and when the measure is ergodic, X is irreducible.

A Bernoulli measure is the simplest case of an ergodic measure on A. The values μ(w) for wA are given by a morphism μ:A[0,1] such that aAμ(a)=1. It corresponds, in classical terms of probability theory, to a sequence (ζn)n of independent identically distributed random variables, where ζn:xxn. A Bernoulli measure is in fact mixing. Further mixing examples can be found among sofic measures, also called rational measures or hidden Markov measures [20, 11]. See Section 6 for a discussion on this topic.

Another important source of examples is the following. A substitution is a monoid morphism σ:AA. The substitution σ is primitive if there is an n0 such that every bA occurs in every σn(a). The shift X(σ), called a substitution shift, is the set of all xA such that all words in (x) appear in some σn(a), n0, aA. If σ is primitive then X(σ) is minimal, and uniquely ergodic [27].

Example 1.1.

The Fibonacci substitution is the morphism σ:{a,b}{a,b} defined by σ:aab,ba. It is primitive and the substitution shift X=X(σ) is called the Fibonacci shift. A description of its unique ergodic measure can be found e.g. in [14, Example 3.8.19].

The Fibonacci shift also belongs to the family of Sturmian shifts, whose definition may be recalled in [25]. More precisely, the Fibonacci shift is Sturmian of slope (35)/2. The following is an example of an automatic sequence (see [1]).

Example 1.2.

The primitive substitution σ:aab,bba is called the Thue–Morse substitution. The shift X(σ) is called the Thue–Morse shift. Its unique ergodic measure is described in [14, Example 3.8.20].

1.3 Density of a language

Let L be a language on A and let μ be an ergodic measure on A with support X. We define the density of L relative to μ as

δμ(L)=limn1ni=0n1μ(LAi),

whenever the limit exists. In other words δμ(L) is the Cesàro limit of (μ(LAn))n=0. When the classical limit of (μ(LAn))n=0 exists, we say that L has a density in the strong sense. It is known that when μ is a Bernoulli measure, the density of any rational language exists, and if μ has rational values on letters, all densities are rational numbers [5, Theorem 2.1].

Let us note some basic properties of the density. If L has a density, then 0δμ(L)1, as 0μ(LAi)1 for every i0. Moreover, since μ(w)=0 when w(X), we have δμ(L)=δμ(L(X)). Thus if L,K satisfy L(X)=K(X) then δμ(L)=δμ(K). The density is also finitely additive: if L,K have densities and LK=, then LK has density δμ(LK)=δμ(L)+δμ(K). Additionally, we have δμ(AL)=1δμ(L), and if LL=A then LL has density δμ(LL)=δμ(L)+δμ(L)1. However, the density of an intersection LK might not exist, even when the density of both L and K exist.

Example 1.3.

On a fixed finite alphabet A, consider the two languages:

L={wA:|w|1mod2},K={wA+:|w|log2|w|mod2}.

It is clear that δμ(L)=δμ(K)=1/2 (no matter the measure μ). However, it can be shown that the sequence of sums sn=1ni=1nμ(LKAi) has subsequences (s22n)n and (s22n+1)n converging to 1/3 and 1/6 respectively, thus δμ(LK) does not exist.

In the above example, the language K is not rational. When both languages are rational our main result (Theorem 4.1) implies that the density of the intersection also exists.

2 Density of ideals

In this section, we show how to calculate densities for languages that are ideals of A (and thus not necessarily rational). We start with right ideals, that is, languages L such that LA=L. The set D=LLA+ is the minimal generating set of L as a right ideal, which means that L=DA and D is contained in every other set with this property. Moreover the set D is a prefix code, meaning that no element of D is a proper prefix of another element of D. For example, for aA, the language L=aA is the set of words that have the letter a as a prefix. Then one has D={a}.

Proposition 2.1.

Let μ be a probability measure on A and L be a right ideal of A. Then L has a density in the strong sense and δμ(L)=μ(D) where D=LLA+.

Proof.

Let X be the support of μ. Since D is a prefix code,

μ(DAAn)=uDμ(uAAn)=uD,|u|nμ(uAn|u|)=uD,|u|nμ(u)

where the last equality uses the fact that μ(uAi)=μ(u) for all i. Clearly this tends to μ(D) when n.

Next we establish a similar result for left ideals of A, that is, languages L such that AL=L. Similar to the right-sided case, the set G=LA+L is the minimal generating set of L as a left ideal and also a suffix code, i.e., no element of G is proper suffix of another.

Proposition 2.2.

Let μ be an invariant probability measure of A and L be a left ideal of A. Then L has a density in the strong sense and δμ(L)=μ(G) where G=LA+L.

Proof.

Let X be the support of μ. Since G is a suffix code,

μ(AGAn)=uGμ(AuAn)=uG,|u|nμ(An|u|u)=uG,|u|nμ(u),

where the last equality uses the fact that since μ is invariant, μ(Aiu)=μ(u) for all i. By invariance of μ again, this tends to μ(G) when n.

The density of a left ideal may not exist if the measure is not invariant, as shown next.

Example 2.3.

Let xA and let μ be the Dirac measure of x, that is, the probability measure μ on A such that μ(U)=1 if xU and 0 otherwise. Then, for aA, the density of Aa is the frequency of a in the sequence x+=x0x1. If the frequency of a does not exist in x+, the language Aa does not have a density. For example, if x+=aba2b2a2nb2n, then the frequency of a does not exist. Indeed, the frequency of a is 1/2 in aba2b2a2nb2n, while it is close to 2/3 in aba2b2a2nb2na2n+1.

Next we consider the case of a quasi-ideal, that is, the intersection of a left ideal and a right ideal. In this case we need the stronger assumption that μ is ergodic.

Proposition 2.4.

Let μ be an ergodic measure on A, L be a right ideal of A, and K be a left ideal of A. Then LK has a density and δμ(LK)=μ(D)μ(G) where D=LLA+ and G=KA+K. Moreover, if μ is mixing, then LK has a density in the strong sense.

Proof.

Let X be the support of μ. Fix uD and vG and let m=max(|u|,|v|). Then μ(uAAvAn)=0 if n<m and otherwise

μ(uAAvAn)=μ([u]XS|v|n[v]X)

and thus using Equation (2),

δμ(uAAv)= limn1ni=0n1μ(uAAvAi)=limn1ni=mn1μ([u]XS|v|i[v]X)
=limn1ni=0n1μ([u]XSi[v]X)=μ([u]X)μ([v]X)=μ(u)μ(v).

This shows that

δμ(DAAG)=u,vδμ(uAAv)=u,vμ(u)μ(v)=μ(D)μ(G).

If μ is mixing, then with similar arguments

limnμ(uAAvAn)=limnμ([u]XSn[v]X)=μ(u)μ(v),

which shows that the density exists in the strong sense.

Example 2.5.

If X={(ab),(ba)}, then the language L=aAAb has density 1/2 with respect to the unique invariant measure μ on X. However μ is not mixing and the density does not exist in the strong sense as μ(aAAbAn) alternates between 0 and 1.

Next we turn to two-sided ideals, that is, languages L such that ALA=L.

Proposition 2.6.

Let μ be an ergodic measure on A with support X. Every two-sided ideal L that intersects (X) has density 1 in the strong sense.

Proof.

Let D=LLA+ and G=LA+L. Then L=DA=AG and by the formulas for right and left ideals, the density exists in the strong sense and δμ(L)=μ(D)=μ(G). But, using the formula for quasi-ideals, we also have δμ(L)=μ(D)μ(G)=δμ(L)2, so δμ(L)=0 or 1. The fact that L(X) implies δμ(L)>0.

Let X be a shift space. We derive from Proposition 2.6 the following property for X-thin codes, where the languages L such that there exists w(X) satisfying AwAL= are called X-thin by the terminology of [6]. A suffix code C(X) is X-maximal if it is not properly included in a suffix code D(X). A dual notion holds for prefix codes. For example, every n(X), for n1, is an X-maximal suffix code.

Proposition 2.7.

Let μ be an ergodic measure with support X and let C(X) be an X-maximal prefix or suffix code. If C is X-thin, then μ(C)=1.

Proof.

We assume that C is an X-maximal suffix code. Let w(X) be such that AwAC=. We claim that AwA(X)AC. Indeed, let u,v be such that uwv(X). Since C is X-maximal, either uwv has a suffix in C or is a suffix of some cC. The second case being impossible, we conclude that uwvAC, which proves the claim. Therefore, we have δμ(AwA)δμ(AC). Then by Propositions 2.6 and 2.2, we have μ(C)=δμ(AC)δμ(AwA)=1. The proof for the dual statement for prefix codes works analogously. In particular, a finite prefix or prefix code is X-maximal if and only if μ(C)=1 for an ergodic measure with support X.

3 J-class of a shift space

This section is devoted to 𝒥-classes of finite monoids associated with shift spaces. The reader who is not familiar with semigroup theory and Green’s relations might want to consult textbooks such as [7, 24, 18, 21, 15, 16]. The notion of a 𝒥-class discussed here also appeared in [28, 29], and is tangentially related with the work of Almeida on free profinite semigroups [2]. See also the survey [13] for more on the connections between Green’s relations and automata theory.

3.1 Definition and first properties

Let X be a shift space on A and let φ:AM be a morphism onto a finite monoid M. We first introduce two subsets of M naturally associated with X.

Definition 3.1.

Let KX(M) be the intersection of all two-sided ideals I of M such that Iφ((X)), called the X-minimal ideal of M. Let JX(M) be the set of elements of M which generate KX(M) as a two-sided ideal, i.e.,

JX(M)={mMMmM=KX(M)}.

It follows from the definition that JX(M) is a 𝒥-class. Consider the quasi-order defined on M by m𝒥nMmMMnM. Note that m𝒥n if and only if m𝒥n and n𝒥m. The next proposition establishes some basic properties of JX(M) when X is irreducible.

Proposition 3.2.

Let X be an irreducible shift space on A and let φ:AM be a morphism onto a finite monoid M. Then

  1. 1.

    KX(M) is an ideal of M which meets φ((X));

  2. 2.

    JX(M)={mKX(M)MmMφ((X))};

  3. 3.

    JX(M)φ((X)) is the non-empty set of 𝒥-minimal elements of φ((X));

  4. 4.

    Either JX(M) is the minimal ideal K(M) of M, or JX(M){0} is the unique 0-minimal ideal in the quotient of M by the largest ideal of M which does not meet φ((X)).

Proof.

1. Let I,J be two ideals which meet φ((X)). Let u,v(X) be such that φ(u)I and φ(v)J. Since X is irreducible, there is a word w such that uwv(X). Then φ(uwv)IJ. This proves that KX(M) is an ideal which meets φ((X)).

2. Let mJX(M) and let nKX(M)φ((X)). Then n is in MmM. This proves the inclusion from left to right. Conversely, let mKX(M) be such that MmMφ((X)). Then, the ideal MmM generated by m is contained in KX(M) and it meets φ((X)). This implies MmM=KX(M) and therefore m is in JX(M).

3. Let φ(u) be 𝒥-minimal in φ((X)). Fix v(X). By irreducibility there is w(X) such that uwv(X). It follows that φ(uwv)𝒥φ(u), which by minimality implies φ(u)𝒥φ(uwv)𝒥φ(v). This shows that φ(u) generates an ideal contained in every ideal which meets (X), thus φ(u)JX(M).

4. First assume that K(M)φ((X)). Since the elements of K(M) are 𝒥-minimal in all of M, it follows from part 3 that JX(M)=K(M).

Assume next that K(M)φ((X))=. Let I={sMMsMφ((X))=}. It is clear that I is the largest ideal of M which does not meet φ((X)). It is non-empty since it contains the minimal ideal K(M) by assumption. Let us show that M/I is a prime monoid (see [7, Section 1.12]). Take s,tM/I which are 0. Then there exist x1,x2,y1,y2 such that x1sx2=φ(u) and y1ty2=φ(v), where u,v(X). By irreducibility there is w such that uwv(X), hence x1sx2φ(w)y1ty2=φ(uvw), and in particular sx2φ(w)y1t0. This shows that M/I is prime, and thus it admits a unique 0-minimal ideal, by [7, Proposition 1.12.9]. Finally notice that any non-zero element s of the quotient M/I must satisfy xsy=φ(u) for some x,yM, thus it is 𝒥-above some element of JX(M). This implies that JX(M){0} is the 0-minimal ideal of M/I.

Example 3.3.

Let 𝒜 be the automaton depicted, alongside its transition monoid, in Figure 1 and X be the Fibonacci shift (Example 1.1). Let φ:AM be the transition morphism of 𝒜. Then φ1(0)=AbbA, and since (X) does not contain bb, it follows that JX(M) is the 𝒥-class of α=φ(a). In this example, the automaton 𝒜 is the minimal automaton of the X-maximal prefix code C={a,ba}.

Figure 1: An automaton 𝒜 and the eggbox picture of its transition monoid M, where α represents the transformation of the states induced by a, and β the one induced by b.

The following result links the 𝒥-class JX(M) with the densities of the languages recognized by M. Roughly speaking, it shows that the 𝒥-class JX(M) concentrates the density.

Proposition 3.4.

Let μ be an ergodic measure on A with support X. Let φ:AM be a morphism onto a finite monoid M. Then the density of L=φ1(JX(M)) exists in the strong sense and δμ(L)=1.

Proof.

It follows from Proposition 3.2 that KX(M)φ((X))=JX(M)φ((X)), hence δμ(φ1(JX(M)))=δμ(φ1(KX(M))). But the density of KX(M) exists in the strong sense and is 1 by Proposition 2.6.

Let us highlight a simple consequence of this.

Corollary 3.5.

Let μ be an ergodic measure with support a shift space X. Let φ:AM be a morphism onto a finite monoid M. For every mJX(M), the density of φ1(m) exists in the strong sense and is 0.

Proof.

Indeed, let L=φ1(m), L=φ1(JX(M)). Then as LAL and the density of L has density 1 in the strong sense,

0=1limnμ(LAn)=limnμ(AnL)limnμ(LAn)=δμ(L).

3.2 On aperiodic languages

The next theorem concerns the density of aperiodic languages. Recall that a language is aperiodic if it can be recognized by an aperiodic monoid. By Schützenberger’s theorem [38], aperiodic languages are precisely the star-free languages, as well as the languages defined by first-order formulas (see [30, Theorem 4.1]). The density of aperiodic languages was also studied, from a logic perspective, by Lynch [26]. Lynch’s definition of density involves sequences of Bernoulli measures, thus it is both more general than our definition (since we use only one measure) and much more particular, since we use more general measures than Bernoulli measures.

Theorem 3.6.

An aperiodic language L has a density with respect to an ergodic measure μ. More precisely, there exist a finite number of pairs (Di,Gi)1ik of a prefix code and a suffix code such that L(X)=i=1k(DiAAGi)(X) and

δμ(L)=i=1kμ(Di)μ(Gi). (6)

Moreover, if the measure is mixing, then the density exists in the strong sense.

Proof.

Let μ be an ergodic measure and φ:AM be a morphism onto an aperiodic monoid M. Let J=JX(M), let mM and let L=φ1(m). It is enough to prove that Equation (6) holds for such L. If mJ, then δμ(L)=0 in the strong sense by Corollary 3.5, so we may assume that mJ. We claim that

L(X)=LAAL(X).

Indeed, assume that u(LAAL)(X). Then φ(u)mMKX(M)M=KX(M), and thus φ(u)J by the second part of Proposition 3.2. But then φ(u)mMMmJ and since M is aperiodic, we have mMMmJ={m}. It follows that φ(u)=m and uL. This concludes the proof of the claim, as the other inclusion is trivial. Finally, we can use Proposition 2.4 to deduce that δμ(L)=μ(D)μ(G), where D=LLA+ and G=LA+L, and that the density exists in the strong sense if μ is mixing.

Let us give two examples to illustrate Theorem 3.6, starting with an aperiodic language.

Example 3.7.

Let L={ab,ba}, whose minimal automaton 𝒜 may be found in [33, Chapter 4, Example 2.1]. Let M stand for its transition monoid. Note that L is aperiodic and thus star-free, which is not obvious. A star-free expression of L may be found in [33, Chapter 4, Example 2.1]. Let X be Thue–Morse shift and let μ be the unique invariant measure on X (Example 1.2). The 𝒥-class JX(M) is the 𝒥-class of α2 in [33, Chapter 4, Example 2.1]. Combining Equation (6) with the expression of L in [33, Chapter 4, Example 2.1], we get

δμ(L)=μ((ab)+b(ba)+a)μ(a(ab)+b(ba)+).

With the values of μ from [14, Example 3.8.20], we find

μ((ab)+b(ba)+a)=μ({abb,ababb,baa,babaa})=1/2

and similarly μ(a(ab)+b(ab)+)=1/2. Thus δμ(L)=1/4.

The next example is a group language whose intersection with the shift space behaves like an aperiodic language.

Example 3.8.

Let X be the orbit of the periodic sequence x=(abc). Thus X consists of three elements and has for unique ergodic measure the uniform probability measure μ. Let φ:A/2 be defined by φ(a)=0 and φ(b)=φ(c)=1. Consider the rational language L=φ1(0). Then one sees

L(X)=(abc){ε,a}(bca){ε,bc}(cab){ε},

and thus δμ(L)=13(1+13+13)=59.

The same result can be obtained using Theorem 3.6. First, observe that L has the same intersection with (X) as the language B where B={a,bc,cab}. The language B is recognized by the automaton 𝒜 depicted in Figure 2. The transition monoid of 𝒜 is an aperiodic monoid M with 27 elements, hence B turns out to be star-free. The 𝒥-class JX(M), depicted in Figure 2, consists of 9 elements, 5 of which belong to the image of B (indicated in yellow). Taking for instance m=βγ=φ(bc), we find that φ1(m)A=bcA and Aφ1(m)=Abc, and thus δμ(φ1(m))=μ(bc)2=1/9 by Equation (6); likewise δμ(φ1(m))=1/9 for all other elements mJX(M) using Equation (6). Thus we recover the fact that δμ(B)=δμ(B(X))=δμ(L(X))=5/9.

Figure 2: The minimal automaton of B and the 𝒥-class JX(M) from Example 3.8, where α, β and γ are the transformations induced respectively by a, b and c. The -classes in yellow represent elements in the image of B.

3.3 Decidability and degree

Next we consider the question of whether membership in JX(M) is decidable. We say that a shift space X is rationally decidable if the emptiness of (X)L is decidable for every rational language L. Sofic shifts are obviously rationally decidable. The class of rationally decidable shift spaces also includes all substitution shifts by a result of [35, Lemma 3] (see also [4, Lemma 3.15] and [12]).

Proposition 3.9.

Let X be a rationally decidable shift space. Then for every morphism φ:AM, there is an algorithm which decides which elements mM are in JX(M).

Proof.

We have mKX(M) if and only if mMnM for every nM such that φ1(n)(X). Since X is rationally decidable, this is decidable for every nM. Therefore, we can decide whether mKX(M), and then whether mJX(M).

The 𝒥-class is also linked with the following notion of degree. Let 𝒜 be an automaton on A and φ:AM be its transition morphism. The minimal rank of the elements of φ((X)) viewed as partial mappings is called the X-degree of the automaton, denoted dX(𝒜). Note that [28, Proposition 3.2] states that JX(M) contains all elements of rank dX(𝒜). For instance the automaton 𝒜 from Example 3.8 has X-degree 2, while the one in Example 3.3 has X-degree 1. This result also implies that dX(𝒜) is computable if and only if membership in JX(M) is decidable (cf. Proposition 3.9 for cases where the latter is decidable).

The notion of X-degree has also been studied in relation with codes [6, 3]. Let X be an irreducible shift, let C(X) be a rational prefix code and let 𝒜 be the minimal automaton of C. If C is X-maximal, then its X-degree (that is, dX(𝒜)) is larger than or equal to 1. When C is a finite X-maximal bifix code (i.e. both prefix and suffix), then the X-degree of C can also be defined in terms of the number of parses of elements of (X) [6].

4 Existence of the density

The goal of this section is to establish the existence of density of regular languages under all invariant measures. More precisely, we will prove the following, which is our main theorem.

Theorem 4.1.

Let μ be an invariant measure on A. Then every rational language L has a density with respect to μ.

The main ingredient for the proof is the dynamical system defined as follows, called a skew product. Let X be an irreducible shift space and φ:AM a morphism onto a finite monoid. Fix R an -class of JX(M) such that Rφ((X)). Let M act on the right of R{0} by rm=rm if rmR and rm=0 otherwise. The specific choice of an -class class R plays no role here, as long as R satisfies Rφ((X)).

Definition 4.2.

The skew product R{0}X is the system ((R{0})×X,T) where

T(r,x)=(rφ(x0),Sx).

Note that the projection πX:(R{0})×XX satisfies πXT=SπX. It follows that if ν is an invariant (respectively ergodic) measure on (R{0})X, then νπX1 is an invariant (respectively ergodic) measure on X. When M=G is a group, then R=G and GX forms a subsystem of (G{0})X, which in effect means we can get rid of 0. In particular we recover the type of skew products studied in the preprint [9]. Next, we introduce a natural probability measure on (R{0})X induced by a measure μ on X. In the group case, we recover the product of μ with the normalized counting measure on the group considered in [9].

Definition 4.3.

Fix an invariant probability measure μ on X. The weighted counting measure on the skew product (R{0})X is the Borel measure ν defined by ν({0}×X)=0, and for every u,v(X) and rR,

ν({r}×[uv]X)=1ds,sφ(u)=rμ(Gsuv),

where d is the cardinality of the -classes of JX(M) and Gs, for sR, is the suffix code such that φ1(Ms)=AGs.

That ν is a well-defined Borel measure on (R{0})X follows from Carathéodory’s extension theorem, since the above formula defines a pre-measure on the Boolean algebra of clopen sets of (R{0})×X. Note that if rs, then Gr=Gs. For an -class HR, let GH be the common value of Gr for rH. The following technical lemma will be useful. In its proof, we use the following notion: we call a monoid M stable if for every s,tM, the implications s𝒥stsst and s𝒥tssts hold. Every finite monoid is stable [18, Lemma 1.1, Chapter V].

Lemma 4.4.

The union G=HRGH over all -classes of R is a suffix code such that μ(G)=1.

Proof.

The fact that G is a suffix code is a consequence of the fact that the monoid M is stable. Moreover, G is X-maximal since, for all w(X), there exists u such that φ(uw)JX(M), by irreducibility and the fact that Rφ((X))ε. Note that a word w(X) such that φ(w)JX(M) cannot be a proper factor of any element of G, since G is a suffix code. Thus there must exist w(X) such that AwAG= (simply fix uG and then take w(X)A+u). It follows that μ(G)=1 by Proposition 2.7.

Next we establish some properties of the weighted counting measure.

Proposition 4.5.

The weighted counting measure is an invariant probability measure on the skew product (R{0})X which satisfies νπX1=μ.

Proof.

First we show that ν is invariant under the map T. Note that it suffices to check invariance for sets of the form B={r}×[uv]X. First, assume that uε. If we let u=ua, aA, then T1({r}×[uv]X) can be expressed as the disjoint union

T1({r}×[uv]X)=s,sφ(a)=r{s}×[uav]X,

and from the definition of ν we find

ν(T1({r}×[uv]X)) =s,sφ(a)=rν({s}×[uav]X)=s,sφ(a)=r1ds,sφ(u)=sμ(Gsuav)
=1ds,sφ(u)=rμ(Gsuv)=ν({r}×[uv]X).

Next let us treat the case where u=ε, or in other words where B={r}×[v]X. Observe that

T1({r}×[v]X)=sφ(a)=r{s}×[av]X,[Grvε]X=sφ(a)=r[Gsavε]X,

where the unions are taken over pairs (s,a)R×A such that sφ(a)=r. Since G=sRGs is a suffix code by Lemma 4.4, the second union in the above equation is disjoint (as is the first, for obvious reasons). Thus we have

ν(T1({r}×[v]X)) =sφ(a)=rν({s}×[av]X)=sφ(a)=r1dμ(Gsav)
=1dμ(Grv)=ν({r}×[v]X).

Finally, let us show that ν(R×U)=μ(U) for every Borel set U. Using Lemma 4.4 together with the invariance of μ, we have

ν(R×U)=rRμ([Grε]XU)/d=HRμ([GHε]XU)=μ([Gε]XU)=μ(U),

where the last equality follows since μ([Gε]X)=μ(G)=1. Taking U=X, we find that ν(R×X)=μ(X)=1, which shows that ν is indeed a probability measure.

The following lemma will also be needed for the proof of Theorem 4.1. Its proof uses standard arguments from ergodic theory. We include it for the sake of completeness.

Lemma 4.6.

For every ergodic measure μ on X, there exists an ergodic measure μ¯ on (R{0})X such that μ¯({0}×X)=0 and μ¯πX1=μ.

Proof.

Let be the set of all invariant probability measures on (R{0})X and μ the subset of those measures ζ such that ζ({0}×X)=0 and ζπX1=μ. Note that μ contains the weighted counting measure by Proposition 4.5 and thus μ. Moreover, μ is a closed convex subspace of , so it must contain an extreme point μ¯ by the Krein–Milman theorem. Let us show that μ¯ is also an extreme point of . Suppose that μ¯=sζ+(1s)ζ′′ for some ζ,ζ′′ and 0<s<1. It is clear that both ζ and ζ′′ must give to {0}×X zero measure, and the fact that μ is an extreme point in the convex set of invariant measures on X implies that ζπX1=ζ′′πX1=μ. Thus ζ,ζ′′μ, contradicting the fact that μ¯ is an extreme point of μ. Since the extreme points of are precisely the ergodic measure on (R{0})X, we are done.

The next proposition is a more precise form of Theorem 4.1 for the case where μ is ergodic. Note that the statement uses Lemma 4.6 implicitly for the existence of μ¯.

Proposition 4.7.

Let μ be an ergodic measure on A with support a shift space X. Let φ:AM be a morphism onto a finite monoid and let L=φ1(m) for some mM. Fix an -class R of the 𝒥-class JX(M). We set the notation Ur,V={r}×V. Then δμ(L)=0 if mJX(M) and otherwise

δμ(L)=r,rmRμ¯(Ur,[L]X)μ¯(Urm,X) (7)

where μ¯ is any ergodic measure on (R{0})X such that μ¯({0}×X)=0 and μ¯πX1=μ.

Proof.

If mJX(M), then δμ(L)=0 by Corollary 3.5. We may now assume that mJX(M). Let C be the prefix code such that LA=CA. For i0, let

Ci={uC|u|i},C>i={uC|u|>i}.

We claim that for every rR such that rmR, one has

Ur,[LAi]X=Ur,[Ci]XTi(Urm,X).

The left-to-right inclusion is obvious. For the converse, we take x such that (r,x)Ur,[Ci]XTi(Urm,X). This means that there exists some j, with 0ji, such that φ(x0xj1)=m and furthermore rm=rm where m=φ(x0xi1). Clearly mm, hence mm by Proposition 3.2 and stability of M. Moreover our choice of r guarantees that rm𝒥m, thus rmm by stability. By Green’s lemma, xrx is a bijection between the -classes of m and rm. Since rm=rm it follows that m=m, i.e. φ(x0xi1)=m. Thus x[LAi]X which proves the claim.

As a result, we have

μ(LAi)=μ¯(R×[LAi]X)=r,rmRμ¯(Ur,[Ci]XTi(Urm,X)).

Next we claim that for ϵ>0, there is i00 such that μ¯(Ur,[C>i0]X)<ϵ for every rR. First observe that

rRμ¯(Ur,[C>i0]X)=μ¯(R×[C>i0]X)=μ([C>i0]X),

since μ¯ projects to μ. Moreover since C is a prefix code, the cylinders [c]X for cC are all disjoint, and thus we may write

μ([C>i0]X)=μ(C>i0)=i>i0cCAiμ(c).

But note that this is a tail of the series

i0cCAiμ(c)=μ(C)1,

and thus the claim immediately follows.

Using the above claim together with the ergodicity of μ¯, this gives

δμ(L) =limn1ni=i0n1μ(LAi)=r,rmRlimn1ni=i0n1μ¯(Ur,[Ci]XTi(Urm,X))
r,rmRlimn1ni=i0n1μ¯(Ur,[C]XTi(Urm,X))ϵ
r,rmRμ¯(Ur,[L]X)μ¯(Urm,X)ϵ.

On the other hand, we have

δμ(L) =r,rmRlimn1ni=i0n1μ¯(Ur,[Ci]XTi(Urm,X))
r,rmRlimn1ni=i0n1μ¯(Ur,[C]XTi(Urm,X))r,rmRμ¯(Ur,[L]X)μ¯(Urm,X),

concluding the proof.

In order to conclude the proof of Theorem 4.1, we need to deduce the existence of densities for a general invariant measure. This is done using the ergodic decomposition discussed at the end of Section 1.1.

Proof of Theorem 4.1.

Let L be a rational language and μ be an invariant measure. Let be the set of ergodic measure on the support of μ and consider the ergodic decomposition of μ as in Equation (4). By Proposition 4.7, δν(L) exists for every ν. Finally by the dominated convergence theorem

δμ(L) =limn1ni=0n1μ(LAi)=limn1ni=0n1ν(LAi)𝑑τ(ν)
=limn1ni=0n1ν(LAi)dτ(ν)=δν(L)𝑑τ(ν).

When the weighted counting measure is ergodic, we can apply Proposition 4.7 to obtain formulas based on the density of ideals, which are easily computable (Section 2). This generalizes a result of [7] where the same formula is proved for a Bernoulli measure.

Theorem 4.8.

Let μ be an ergodic measure on A with support a shift space X. Let φ:AM be a morphism onto a finite monoid. Fix R an -class of JX(M). If the weighted counting measure is an ergodic measure on the skew product (R{0})X, then, for every mJ, the density of L=φ1(m) is

δμ(L)=δμ(AL)δμ(LA)/d, (8)

where d is the cardinality of the -classes of JX(M).

Proof.

Let ν be the weighted counting measure on (R{0})X. Let C be the prefix code such that LA=CA. Then Equation (7) reduces to

δμ(L) =r,rmRν(Ur,[L]X)ν(Urm,X)=1d2r,rmRμ([GrC]X)μ(Grm)
=1d2μ(Gm)r,rmRμ([GrC]X)=1dμ(Gm)HRμ([GHC]X)

where H runs over the -classes of R and GH is the common value of Gr for the d elements rH. By Lemma 4.4 and since C is a prefix code,

HRμ([GHC]X)=μ([C]X)=μ(C)

and therefore by Propositions 2.1 and 2.2

δμ(L)=1dμ(Gm)μ(C)=1dδμ(AL)δμ(LA).

There are known examples where the skew product has more than one ergodic measure. In Example 3.8, the skew product of X with /2 has two orbits, each one being the support of an ergodic measure, and the weighted counting measure is non-ergodic.

5 On algebraic properties of the density

Let us highlight a corollary which generalizes the fact that, when μ is a Bernoulli measure with rational values, the density of a rational language is rational [5].

Corollary 5.1.

Let 𝕂 be an extension of such that μ(L)𝕂 for every rational language L. Then, for every rational language L on the alphabet A which satisfies the hypotheses of Theorem 4.8, the density of L belongs to 𝕂.

The hypothesis that μ(L)𝕂 is satisfied when μ is a Markov measure (or, more generally, a sofic measure) defined by a transition matrix and an initial vector with coefficents in 𝕂. More details can be found in Section 6. It is also satisfied when all the values of μ on A are in 𝕂 and the support X of μ is minimal. Indeed, in this case, consider a rational language L. If every word of (X) is a factor of some word of L, then δμ(L)>0 by Equation (8), which implies μ(L)=. Otherwise, the intersection L(X) is finite and thus the conclusion follows. Let us finish with an example.

Example 5.2.

Let X be the Fibonacci shift (Example 1.1) and 𝒜 be the automaton depicted in Figure 3. Let μ be the unique ergodic measure on X.

Figure 3: The automaton 𝒜 in Example 5.2 next to the 𝒥-class of JX(M), where M is the transition monoid of 𝒜. The image of a in M is denoted by α while the image of b is denoted by β.

Let φ:AM be the transition morphism of 𝒜 and α=φ(a), β=φ(b). Let R be the -class of α. Note that α2 is an idempotent which belongs to the 𝒥-class JX(M).

Let ν be the weighted counting measure on (R{0})X. We claim that ν is an ergodic measure. First observe that (R{0})X has two closed invariant subsets

Y0=({αβ,α2β}×[b]X)({0}×X),Y1=({αβ,α2β}×[a]X)({α,α2}×X).

Note that ν(Y0)=0 and ν(Y1)=1 (in fact Y1 is the support of ν). Consider the morphism ψ:A/2 defined by ψ(a)=1 and ψ(b)=0 and the corresponding skew product (/2×X,T~). The weighted counting measure ν~ on this skew product is ergodic by [9, Corollary 8.12]. Furthermore, one can show that

π(g,x)={(α2,x)if g=0 and x1=a(α,x)if g=1 and x1=a((αβ)2,x)if g=0 and x1=b(αβ,x)if g=1 and x1=b

is a continuous map /2×XY1 such that πT~=Tπ and ν~π1=ν. Therefore the weighted counting measure on (R{0})X is ergodic, and the hypotheses of Theorem 4.8 are satisfied.

The values μ(w) for wA are all in the quadratic extension 𝕂=(λ) where λ is the golden ratio [8, Theorem 2]. By Corollary 5.1 it follows that δμ(φ1(m))𝕂 for every mJX(M). For instance the language L=φ1(α) satisfies μ(LA)=μ(AL)=μ(a). Since μ(a)=λ (cf. [14, Example 3.8.19]), we get that δμ(L)=12λ2 from Theorem 4.8 (noting that the -classes of JX(M) have size 2). Notice also that the language ψ1(0) has the same intersection with (X) as the submonoid C generated by C={aa,aba,b}. The language C is recognized by 𝒜 with 1 as initial and terminal state. Since /2X is uniquely ergodic, we have δμ(ψ1(0))=1/2.

6 Markov and sofic measures

This section discusses how our results may be applied to the case of Markov, and more generally sofic measures. The main result, Proposition 6.4, shows that sofic measures satisfy the condition on field extensions from Corollary 5.1.

First, recall that a Markov measure μ (also called a Markov chain) is given by an A×A-stochastic matrix M (its transition matrix) and a stochastic A-vector v (its initial vector) which is a left eigenvector of M for the eigenvalue 1. Then, for w=a0a1an1, with aiA, we define

μ(w)=va0Ma0a1Man2,an1

The measure is invariant because vM=v. It is ergodic if the matrix M is irreducible and it is mixing if M is primitive (see [31]). A shift space X has finite type if (X)=AAFA for some finite set FA, called the set of forbidden blocks. The support of a Markov measure μ is the shift of finite type defined by the set of forbidden blocks ab where a,bA are such that Ma,b=0. In terms of probability theory, a Markov measure is defined by a sequence of random variables ζn, as for a Bernoulli measure, but this time ζn depends on ζn1. When ζn depends on ζnk,,ζn1 we say that μ is a Markov measure of order k.

A sofic measure ν (also called a hidden Markov chain) is given by a shift of finite type X on the alphabet B, a Markov measure μ on X, and a map ϕ:BA from B onto an alphabet A. We extend ϕ to a morphism from B to A. Its extension to a map from A to B is called a 1-block map. Then, for wA, we define

ν(w)=μ(ϕ1(w)).

The support of ν is contained in the sofic shift Y=ϕ(X) (a shift X is called sofic if (X) is rational). A sofic measure is invariant. It is ergodic if μ is ergodic (see [11]).

Example 6.1.

Consider the Markov measure on B={1,2,3} defined by the pair

v=[1/31/31/3],M=[02/31/32/31/301/302/3]

The support of μ is the shift of finite type X represented in Figure 4 on the left. Using the 1-block map ϕ(1)=a and ϕ(2)=ϕ(3)=b, we obtain an invariant sofic measure defined by the diagram of Figure 4 on the right.

Figure 4: A sofic measure.

It can be shown that this sofic measure is not a Markov measure of order k for any k (see [11]).

A map μ:A+ is +-rational if there is a morphism φ:AMn(+) from A into the monoid of n×n-matrices with coefficients in +, a row vector λ and a column vector γ such that

μ(w)=λμ(w)γ,

for every wA. The triple (λ,φ,γ) is called a linear representation of μ. The following is from [20] (see also the survey [11]).

Proposition 6.2.

A probability measure on A is sofic if and only if the associated probability distribution is +-rational.

The probability measure will be invariant if and only if the linear representation (λ,φ,γ) can be chosen such that the matrix P=aAφ(a) is stochastic, with λP=λ and γ=[111]t.

Example 6.3.

A linear representation for the sofic measure of Example 6.1 is given by

φ(a)=[0002/3001/300],φ(b)=[02/31/301/30002/3],

with

λ=[1/31/31/3],γ=[111].

Let LA be a language and μ be a probability measure on A. The generating series of L is the formal series

fL(z)=n0μ(LAn)zn.

Therefore, the density of L (if it exists) is the limit in average of the coefficients of fL(z).

A formal series f(z)=n0fnzn, with fn0, is +-rational if the map anfn is +-rational. In this case, we have f(z)=p(z)/q(z) for two polynomials p,q with real coefficients. The following statement, originally due to Berstel, is well known (see [15] for example).

Proposition 6.4.

Let L be a rational language. If μ is a sofic measure, then fL(z) is +-rational. Let 𝕂 be a subfield of containing the values of μ on A. Then μ(L)𝕂{} and δμ(L)𝕂.

Proof.

Let g:A+ be defined by

g(w)={μ(w)if wL,0otherwise.

Thus, g(w) is the product of the values of μ and of the characteristic function χL of L. The map wμ(w) is +-rational by Proposition 6.2 and χL is +-rational since L is rational. Therefore, by [15, Theorem 5.2], the map g is +-rational. Set hn=wAng(w). Then h=n0hnzn is +-rational. Since hn=μ(LAn), this proves that fL(z) is +-rational.

By [15, Theorem 7.2], there is an integer p0 such that ri=limnhnp+i exists for 0i<p. Moreover ri𝕂 whenever hn𝕂. This implies that δμ(L)=1n0i<pri exists and is in 𝕂. Finally, we have fL(z)=p(z)/q(z) for two polynomials p,q[z]. Since hn𝕂, we may assume p,q𝕂[z] by [15, Proposition 3.2]. Thus, if μ(L)=n0hn is finite, the radius of convergence of fL(z) is >1 and the sum is equal to p(1)/q(1), which is in 𝕂.

Example 6.5.

Let μ be the Bernoulli measure on {a,b} defined by μ(a)=p, μ(b)=q. Let L={a,b}ab. Then

fL(z)=n0pqzn+2=pqz21z

and consequently δμ(L)=pq.

References

  • [1] Jean-Paul Allouche and Jeffrey Shallit. Automatic Sequences. Cambridge University Press, 2003. doi:10.1017/cbo9780511546563.
  • [2] Jorge Almeida. Profinite semigroups and applications. In V. Kudryavtsev and I. G. Rosenberg, editors, Structural Theory of Automata, Semigroups, and Universal Algebra, volume 207 of NATO Science Series II: Mathematics, Physics and Chemistry, pages 1–45. Springer, 2005. doi:10.1007/1-4020-3817-8_1.
  • [3] Jorge Almeida, Alfredo Costa, Revekka Kyriakoglou, and Dominique Perrin. On the group of a rational maximal bifix code. Forum Math., 32(3):553–576, 2020. doi:10.1515/forum-2018-0270.
  • [4] Marie-Pierre Béal, Dominique Perrin, and Antonio Restivo. Decidable problems in substitution shifts. J. Comput. System Sci., 143, 2024. doi:10.1016/j.jcss.2024.103529.
  • [5] Jean Berstel. Sur la densité asymptotique de langages formels. In M. Nivat, editor, Automata Languages and Programming, pages 345–358. North-Holland, 1972.
  • [6] Jean Berstel, Clelia De Felice, Dominique Perrin, Christophe Reutenauer, and Giuseppina Rindone. Bifix codes and Sturmian words. J. Algebra, 369:146–202, 2012.
  • [7] Jean Berstel, Dominique Perrin, and Christophe Reutenauer. Codes and Automata. Cambridge University Press, 2009.
  • [8] Valérie Berthé. Fréquences des facteurs des suites sturmiennes. Theor. Comput. Sci., 165(2):295–309, 1996. doi:10.1016/0304-3975(95)00224-3.
  • [9] Valérie Berthé, Herman Goulet-Ouellet, Carl-Fredrik Nyberg-Brodda, Dominique Perrin, and Karl Petersen. Density of group languages in shift spaces, 2024. doi:10.48550/arXiv.2403.17892.
  • [10] Manuel Bodirsky, Tobias Gärtner, Timo von Oertzen, and Jan Schwinghammer. Efficiently computing the density of regular languages. In LATIN 2004: Theoretical Informatics, pages 262–270. Springer Berlin Heidelberg, 2004. doi:10.1007/978-3-540-24698-5_30.
  • [11] Mike Boyle and Karl Petersen. Hidden Markov processes in the context of symbolic dynamics, pages 5–71. Cambridge University Press, 2011. doi:10.1017/cbo9780511819407.002.
  • [12] Olivier Carton and Wolfgang Thomas. The monadic theory of morphic infinite words and generalizations. Inf. Comput., 176(1):51–65, 2002. doi:10.1006/inco.2001.3139.
  • [13] Thomas Colcombet. Green’s relations and their use in automata theory. In Adrian-Horia Dediu, Shunsuke Inenaga, and Carlos Martín-Vide, editors, Language and Automata Theory and Applications, volume 6638 of Lecture Notes in Computer Science, pages 1–21. Springer, 2011. doi:10.1007/978-3-642-21254-3_1.
  • [14] Fabien Durand and Dominique Perrin. Dimension Groups and Dynamical Systems. Cambridge University Press, 2022. doi:10.1017/9781108976039.
  • [15] Samuel Eilenberg. Automata, Languages and Machines, vol. A. Pure and applied mathematics. Academic Press, 1974.
  • [16] Samuel Eilenberg. Automata, Languages and Machines, vol. B. Pure and applied mathematics. Academic Press, 1976.
  • [17] N. Pytheas Fogg. Substitutions in dynamics, arithmetics and combinatorics, volume 1794 of Lecture Notes in Mathematics. Springer-Verlag, 2002.
  • [18] Pierre A. Grillet. Semigroups: An Introduction to the Structure Theory. Marcel Dekker, 1995. doi:10.4324/9780203739938.
  • [19] G. Hansel and D. Perrin. Codes and Bernoulli partitions. Math. Syst. Theory, 16(1):133–157, 1983. doi:10.1007/BF01744574.
  • [20] G. Hansel and D. Perrin. Rational probability measures. Theor. Comput. Sci., 65(2):171–188, 1989. doi:10.1016/0304-3975(89)90042-x.
  • [21] John M. Howie. Fundamentals of semigroup theory. Oxford science publications. Oxford University Press, 1995. doi:10.1093/oso/9780198511946.001.0001.
  • [22] Toshihiro Koga. On the density of regular languages. Fundam. Inform., 168(1):45–49, 2019. doi:10.3233/fi-2019-1823.
  • [23] Jakub Kozik. Conditional densities of regular languages. Electron. Notes Theor. Comput. Sci., 140:67–79, 2005. doi:10.1016/j.entcs.2005.06.023.
  • [24] G. Lallement. Semigroups and Combinatorial Applications. Wiley, 1979.
  • [25] M. Lothaire. Combinatorics on Words. Cambridge University Press, 1997. doi:10.1017/cbo9780511566097.
  • [26] James F. Lynch. Convergence laws for random words. Australas. J. Combin., 7:145–156, 1993. URL: http://ajc.maths.uq.edu.au/pdf/7/ocr-ajc-v7-p145.pdf.
  • [27] Pierre Michel. Stricte ergodicité d’ensembles minimaux de substitution. C. R. Acad. Sci., 278:811–813, 1974.
  • [28] Dominique Perrin. Codes and automata in minimal sets. In Florin Manea and Dirk Nowotka, editors, Combinatorics on Words, volume 9304 of Lecture Notes in Computer Science, pages 35–46. Springer, 2015. doi:10.1007/978-3-319-23660-5_4.
  • [29] Dominique Perrin and Paul E. Schupp. Automata on the integers, recurrence distinguishability, and the equivalence and decidability of monadic theories. In Proceedings of the First Annual IEEE Symposium on Logic in Computer Science (LICS 1986), pages 301–304. IEEE Computer Society Press, 1986.
  • [30] Dominique Perrin and Jean Éric Pin. Infinite Words: Automata, Semigroups, Logic and Games. Elsevier, 2004.
  • [31] Karl E. Petersen. Ergodic Theory. Cambridge University Press, 1983. doi:10.1017/cbo9780511608728.
  • [32] Robert R. Phelps. Lectures on Choquet’s theorem. Lecture Notes in Mathematics. Springer Berlin Heidelberg, 2001. doi:10.1007/b76887.
  • [33] Jean-Éric Pin. Varieties of formal languages. Springer New York, 1986.
  • [34] M. Queffélec. Substitution Dynamical Systems: Spectral Analysis, volume 1294 of Lecture Notes in Mathematics. Springer-Verlag, 2010. doi:10.1007/978-3-642-11212-6.
  • [35] Ville Salo. Decidability and universality of quasiminimal subshifts. J. Comput. Syst. Sci., 89:288–314, 2017. doi:10.1016/j.jcss.2017.05.017.
  • [36] Arto Salomaa and Matti Soittola. Automata-Theoretic Aspects of Formal Power Series. Springer New York, 1978. doi:10.1007/978-1-4612-6264-0.
  • [37] M. P. Schützenberger. Sur certains sous-monoïdes libres. Bull. Soc. Math. Fr., 93:209–223, 1965. doi:10.24033/bsmf.1623.
  • [38] Marcel Paul Schützenberger. On finite monoids having only trivial subgroups. Inform. Control, 8(2):190–194, 1965. doi:10.1016/s0019-9958(65)90108-7.
  • [39] Ryoma Sin’ya. An automata theoretic approach to the zero-one law for regular languages: Algorithmic and logical aspects. Electron. Proc. Theor. Comput. Sci., 193:172–185, 2015. doi:10.4204/eptcs.193.13.
  • [40] William A. Veech. Strict ergodicity in zero dimensional dynamical systems and the Kronecker–Weyl theorem mod 2. Trans. Amer. Math. Soc., 140:1, 1969. doi:10.2307/1995120.
  • [41] William A. Veech. Finite group extensions of irrational rotations. Israel J. Math., 21(2):240–259, 1975. doi:10.1007/BF02760801.
  • [42] Peter Walters. An Introduction to Ergodic Theory, volume 79 of Graduate Texts in Mathematics. Springer, 1982.