Simple Types for Probabilistic Termination

Willem Heijltjes, Department of Computer Science, University of Bath, UK
Georgina Majury, Department of Computer Science, University of Bath, UK
Abstract

We present a new typing discipline to guarantee the probability of termination in probabilistic lambda-calculi. The main contribution is a particular naturality and simplicity: our probabilistic types are as simple types, but generated from probabilities as base types, representing a least probability of termination. Simple types are recovered by restricting probabilities to one.

Our vehicle is the Probabilistic Event Lambda-Calculus by Dal Lago, Guerrieri, and Heijltjes, which presents a solution to the issue of confluence in probabilistic lambda-calculi. Our probabilistic type system provides an alternative solution to that using counting quantifiers by Antonelli, Dal Lago, and Pistone, for the same calculus.

The problem that both type systems address is to give a lower bound on the probability that terms head-normalize. Following the recent Functional Machine Calculus by Heijltjes, our development takes the (simplified) Krivine machine as primary, and proceeds via an extension of the calculus with sequential composition and identity on the machine. Our type system then gives a natural account of termination probability on the Krivine machine, reflected back onto head-normalization for the original calculus. In this way we are able to avoid the use of counting quantifiers, while improving on the termination bounds given by Antonelli, Dal Lago, and Pistone.

Keywords and phrases:
lambda-calculus, probabilistic termination, simple types
Copyright and License:
© Willem Heijltjes and Georgina Majury; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Lambda calculus; Theory of computation → Type theory; Theory of computation → Probabilistic computation
Acknowledgements:
We would like to thank Melissa Antonelli, Ugo Dal Lago, and Paolo Pistone for the constructive conversation about both our approaches, and the anonymous referees for their helpful commentary.
DOI:
10.4230/LIPIcs.CSL.2025.31
Editors:
Jörg Endrullis and Sylvain Schmitz

1 Introduction

While the study of probabilistic computation can be traced to the 1950s [11], the first study of the probabilistic λ-calculus in particular is considered to be by Saheb-Djahromi in the late 70s [30]. In the near half century since, many variations on higher-order probabilistic computation have been considered [10, 12, 19, 20, 26, 29]. In recent years, perhaps due to the potential for applications in machine learning and modelling of probabilistic systems, the area has seen a return to popularity [5, 6, 16, 17, 24, 31]. An important computational phenomenon in its own right, the study of probabilistic choice can also provide a “foot in the door” for understanding how more general effects might manifest, leading for instance to the recent Functional Machine Calculus (FMC) as a confluent λ-calculus with effects [3, 18] that will play a central role in our development.

In this paper we consider the problem of probabilistic termination, the probability that a given reduction mechanism reaches the normal form of a term, a key consideration for probabilistic computation and the subject of significant recent attention [4, 7, 8, 15, 22]. (Note that this objective is distinct from almost-sure termination: it considers exact probabilities, not those converging in the limit. Almost-sure termination is more commonly studied for iterative constructs [21, 27]; extending the lambda-calculus with an almost-surely terminating iterator is the subject of future work.) Termination being one of the prime considerations of type systems in general, the question of type systems for probabilistic termination is pertinent [1, 7]. Our contribution in this paper is a new typing discipline for probabilistic termination, following the recent line of work on the probabilistic event λ-calculus [9] and the FMC. Their prime features are confluence for probabilistic computation and other effects, and the encoding of multiple reduction strategies within a single calculus.

One of the greatest challenges facing the study of probabilistic computation is confluence. In most variants the outcome of duplicating a probabilistic term is strategy dependent. Consider the instruction “write down the result of flipping a coin twice”: it is ambiguous whether “twice” refers to “write” or “flipping”, and the choice of interpretation changes the possible outcomes. In most of the literature results are therefore either restricted to a single strategy, such as call–by–value or call–by–name, or rely on a modified definition of reduction. This raises the question whether it is possible to have confluence for probabilistic computation, while expressing different strategies, and rewriting without restriction on contexts, properties which lend elegance to the λ-calculus. The probabilistic event λ-calculus [9] proposes a solution to this problem by decomposing a probabilistic sum M ⊕ N into a choice operation M ⊕a N, understood as a conditional “if a then M else N”, and a probabilistic generator ⊕a.P, representing a coin toss whose outcome is bound to the boolean variable a in the term P, which then determines the choice for M ⊕a N. When in argument position ⊕a.M acts as a wrapper, similar to thunking in call-by-push-value [25] and the bang-calculus [14, 15], protecting the operator from evaluation. These factors combine to return confluence to the calculus, allowing for arbitrary evaluation strategies with an unrestricted β-reduction.

The probabilistic event λ-calculus (𝖯𝖤Λ) was introduced in [9] with a simple type system, corresponding to that on the lambda-calculus. This system ignored the probabilistic elements of the calculus, and thus gave little insight into the properties of this extension. In a deterministic calculus a type system provides various safety guarantees. These may be qualitative: termination, outputs, composability; or quantitative: run time, term size. In the probabilistic setting, however, it is natural to wonder what guarantees can be made, when the behaviour of a single term can vary between iterations. If, as in the type system mentioned, the probabilities are ignored, strong results can be obtained, albeit on a fairly uninteresting fragment of the calculus. Once probability is acknowledged by the type system the guarantees made become similarly qualified: probability of termination, almost-sure termination, expected run time. Here, we consider the first of these.

One proposed type system for the 𝖯𝖤Λ [1], labelled Cλ, introduces counting quantifiers as a Curry–Howard correspondent to an intuitionistic counting propositional logic. These provide a mechanism to express “proportion of truth” by quantifying the number of satisfying assignments to a formula within a given model, in the way that existential and universal quantification may be understood via the existence of a satisfying assignment, respectively the satisfaction of all assignments. By taking the assignments to a formula to describe the branches of a probabilistic computation, counting quantifiers may be used to describe probability bounds. By this mechanism, Cλ provides a lower bound on the probability of head normalisation. However, as illustrated in [1], the bounds provided by Cλ are not tight, even in situations where this would be expected (see Example 8).

In this paper we present an alternative approach to the same problem, for the same calculus, using a different inspiration, to provide improved termination bounds. Deriving from the 𝖯𝖤Λ, the FMC provides an additional feature that, to us, seemed crucial to a natural account of probabilistic termination via the type system. First, the “machine” in question is the (simplified) Krivine machine [23], whose evaluation is closely related to head normalisation. The FMC here adds an intriguing aspect: it introduces sequential composition and identity on the machine, where the latter, the imperative skip, provides a notion of successful termination, notably absent from the machine for standard λ-calculus. Moreover, this becomes the primary interpretation of types: a type derivation in the FMC is a proof that the machine successfully terminates. Our main question for this work was how to adapt this to capture probabilistic termination. The answer, described in Section 6, is an extended sequential probabilistic event λ-calculus 𝖲𝖯𝖤Λ, with a type system that naturally describes probabilistic termination of the machine, as well as probabilistic termination of head reduction.

The type system for the extended sequential calculus reflects back onto a natural type system for the original probabilistic event λ-calculus, 𝖯𝖤Λ, which gives probabilistic head normalization in the following way: types generalize simple types by replacing the base type with a probability. Formally,

A, B ::= p | A → B

where p ∈ [0,1] is a rational number between zero and one inclusive. The intuition is that the base type for simple types, o, may be understood as signifying successful evaluation, even if it is uninhabited. After all, a constant base type such as for booleans or integers signifies the successful return of a corresponding value. This further matches the interpretation in the FMC, where the type o is inhabited by skip, and corresponds to termination of the machine without producing a result. Our approach, then, is to replace certain termination with a probability of termination, replacing the base type o with a probability p. We consider the striking simplicity and natural intuition of this approach one of our main contributions.

2 The probabilistic event lambda-calculus

We recall the probabilistic event lambda-calculus 𝖯𝖤Λ from [9]. Assume countable sets of variables, ranged over by x, y, z, and of events, ranged over by a, b, c. The former are term variables, to be instantiated by terms of the calculus, and the latter are boolean variables, to be instantiated by ⊤ (true) or ⊥ (false). The 𝖯𝖤Λ extends the λ-calculus with a generator ⊕a.M, which flips a coin and binds the result (⊤ or ⊥) to a, and a choice or conditional M ⊕a N, which evaluates to M if a is true, and to N otherwise. A traditional probabilistic sum of terms, M ⊕ N, may be encoded as ⊕a.(M ⊕a N). Generators are normally fair, with equal probability for ⊤ or ⊥, though on occasion we may need a biased generator ⊕p which chooses ⊤ with probability p, and ⊥ with 1 − p.

Definition 1.

Terms are given by the following grammar,

M, N ::= x | λx.M | MN | ⊕a.M | M ⊕a N

with, from left to right: a variable; an abstraction, which binds x in M; an application; a generator, which binds a in M; and a choice.

The free variables and free events of a term M are written 𝖿𝗏(M) and 𝖿𝖾(M) respectively. Substitution is written prefix: {N/x}M is the capture-avoiding substitution of N for x in M. For an event a we define two projection functions π⊤a and π⊥a, which apply the effect of instantiating a with true respectively false to a term M.

Definition 2.

The projection functions πia for an event a and i ∈ {⊤, ⊥} are given as follows, where a ≠ b.

πia x = x
πia (λx.M) = λx.πia M
πia (MN) = (πia M)(πia N)
πia (⊕b.M) = ⊕b.πia M
πia (M ⊕b N) = (πia M) ⊕b (πia N)
π⊤a (M ⊕a N) = π⊤a M
π⊥a (M ⊕a N) = π⊥a N
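As a concrete illustration, the projection functions of Definition 2 can be implemented directly by recursion on terms. The sketch below uses a hypothetical tuple encoding of the grammar of Definition 1, with constructors ("var", x), ("lam", x, M), ("app", M, N), ("gen", a, M) and ("choice", a, M, N); the encoding and names are illustrative only.

```python
def project(i, a, t):
    """Apply the projection of event a with value i (True for ⊤, False for ⊥)."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "lam":
        _, x, m = t
        return ("lam", x, project(i, a, m))
    if tag == "app":
        _, m, n = t
        return ("app", project(i, a, m), project(i, a, n))
    if tag == "gen":
        _, b, m = t
        # a generator for the same event binds a: projection stops there
        return ("gen", b, project(i, a, m)) if b != a else t
    if tag == "choice":
        _, b, m, n = t
        if b == a:
            # resolve the choice: true selects the left branch, false the right
            return project(i, a, m) if i else project(i, a, n)
        return ("choice", b, project(i, a, m), project(i, a, n))
    raise ValueError(f"unknown term: {t!r}")
```

All cases are homomorphic except a choice on the projected event, which is resolved to one of its branches, and a generator rebinding the same event, which is left untouched.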

We use the standard notions of context, head context, and applicative context to define the reduction relations. Note that a head context is of the form λx1.⋯λxn.({} M1 ⋯ Mm) where m and n are potentially zero.

Definition 3.

Contexts C, head contexts H, and applicative contexts A are defined as follows. A context C with the hole replaced by M, capturing variables, is written C{M}.

C ::= {} | λx.C | CM | MC | ⊕a.C | C ⊕a M | M ⊕a C        H ::= λx.H | A        A ::= AM | {}

A probability p, q is a rational number between 0 and 1 inclusive. Probabilistic reduction will return a multi-(sub-)distribution 𝒟 [2], a finite multiset of weighted terms written as [p1 M1, …, pn Mn], whose weight |𝒟| = Σi≤n pi is (at most) one. (We use multi-distributions rather than distributions to accommodate the reduction measure for the proof of head normalization in Appendix A.2.) For simplicity, we will refer to these as distributions, and we convert implicitly between terms M and the singleton distribution [1M]. Multiset union is written 𝒮 + 𝒯, and the empty multiset as ∅. A distribution 𝒟 is scaled to p𝒟 by multiplying each weight in 𝒟 by p. The underlying probability (sub-)distribution of 𝒟 is the finite function from terms to probabilities obtained by collecting like terms pM and qM as (p + q)M.
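These multiset operations are simple enough to state as executable definitions. The following sketch, with illustrative names not taken from [2], represents a multi-distribution as a list of (probability, term) pairs:

```python
from fractions import Fraction

def weight(d):
    """The weight |D|: the total probability mass, at most one."""
    return sum(p for p, _ in d)

def scale(p, d):
    """pD: multiply every weight in D by the probability p."""
    return [(p * q, m) for q, m in d]

def underlying(d):
    """Collect like terms p·M and q·M into (p+q)·M, giving the underlying
    probability (sub-)distribution as a finite map from terms to weights."""
    out = {}
    for p, m in d:
        out[m] = out.get(m, Fraction(0)) + p
    return out
```

Exact rational arithmetic (Fraction) avoids floating-point drift when weights such as 1/2 are multiplied along a reduction sequence.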

Definition 4.

Beta-reduction →β and head β-reduction →β𝗁 are given by closing the beta-rule below under all contexts C, respectively under head contexts H, and implicitly return a singleton distribution.

(λx.M)N →β {N/x}M

Projective reduction →π is the following reduction relation from terms to distributions.

H{⊕a.M} →π [ 1/2 H{π⊤a M}, 1/2 H{π⊥a M} ]

Head reduction (→𝗁) = (→β𝗁) ∪ (→π) is the union of head β-reduction and projective reduction. Reduction is lifted to distributions of terms in the expected way: if M → 𝒩 then [pM] + 𝒟 → p𝒩 + 𝒟. We write →* for the reflexive-transitive closure of a reduction relation →.

The 𝖯𝖤Λ features a second notion of reduction, permutative reduction →𝗉 [9], which gives a more fine-grained evaluation of probabilistic sums. It is this reduction for which confluence is particularly significant. The effect of permutative reduction is to bridge the gap between the decomposed operators ⊕a.M and M ⊕a N and the standard probabilistic sum M ⊕ N, encoded as ⊕a.(M ⊕a N), by internalizing the reduction of ⊕a.M to the sum of its two projections [9, Proposition 29]:

⊕a.M →𝗉* ⊕a.((π⊤a M) ⊕a (π⊥a M))    (if a ∈ 𝖿𝖾(M)).

In this paper we will work with projective reduction, since we are interested in termination of head reduction. We will, on occasion, consider terms where probabilistic sums are of the traditional form M ⊕ N = ⊕a.(M ⊕a N) with a ∉ 𝖿𝖾(M), 𝖿𝖾(N). As per the above, these are effectively the normal forms of permutative reduction.

3 Probabilistic types

Before introducing our probabilistic type system, we recall simple types for the 𝖯𝖤Λ [9].

Definition 5.

Simple types are given by the following grammar.

A, B ::= o | A → B

A typing context Γ is a finite function from term variables to types, written as a sequence x1 : A1, …, xn : An. A typing judgment Γ ⊢ M : A assigns the type A to the term M in the context Γ. The simply-typed probabilistic event λ-calculus 𝖯𝖤Λ is given by the typing rules in Figure 1.

Figure 1: The simple type system 𝖯𝖤Λ.

Simple types ignore the probabilistic constructs of the calculus, generator and choice, requiring only that the branches of a choice have equal types. This gives the expected result: typed terms are strongly normalizing, since every possible branch of the computation is typed. For probabilistic termination, we wish to capture that a given fraction of all branches of a computation terminates, where our notion of termination is given by head normalization.

Our approach derives from the following idea: if the base type o of simple types, even if not inhabited, denotes certainty of termination, then we may generalise this to probability of termination by replacing base types with arbitrary probabilities. This yields the following notion of types.

Definition 6.

Probabilistic types are given by the following grammar, where p[0,1].

A, B ::= p | A → B

The intuitive meaning of, for example, assigning a term M a type A → B → p is: given inputs of type A and B, M terminates with probability at least p. The identity term I = λx.x may be assigned any type p → p: given an input N that terminates with probability p, the term IN does so as well. The type system will include an axiom that any term, in particular a non-terminating one such as Ω = (λx.xx)(λx.xx), may be assigned a termination probability of zero. This is expressed by a type A1 → ⋯ → An → 0: for any inputs A1 through An, the term will terminate with probability at least zero.

Note that these types generalise simple types with a single, uninhabited base type o. A probabilistic system with multiple base types, say integers and booleans, would pair the return type with a probability, for instance an integer with probability 1/2, or a boolean with probability 1/4. The extension of the calculus with sequencing, in Sections 5 and 6, will give a more concrete account of how probabilities interact with return values and return types.

The term ⊕a.(M ⊕a Ω) gives a fair probabilistic choice between M and Ω. For the computation to be well-typed regardless of the choice, M and Ω should have the same input types, so that they can be applied to the same arguments. Hence, if M has type A → B → p, we may choose A → B → 0 for Ω. Then the type for ⊕a.(M ⊕a Ω) should be A → B → (1/2)p: given arguments of type A and B, the computation chooses M with probability 1/2 and so terminates with probability at least (1/2)p.

These considerations motivate the shape of our probabilistic type system, with one further aspect to explain. An arbitrary generator term ⊕a.M reduces with a fair probability to either π⊤a M or π⊥a M, which may then terminate with different probabilities. However, using projections would mean the type system loses the property of being inductive on terms. We will instead record the assignment of a truth value to a in an additional context E, that we call an event valuation. The type derivation then projects a term M ⊕a N to M or to N according to the value of a in the event valuation E.

Definition 7.

The probabilistically typed probabilistic event λ-calculus 𝖯𝖤Λ is given by the typing rules in Figure 2, using the following definitions. An event valuation E is a finite function from events to {⊤, ⊥}, written as a sequence a1 ↦ i1, …, an ↦ in. A typing context Γ is a finite function from variables to types, written x1 : A1, …, xn : An. A probabilistic typing judgement E | Γ ⊢ M : A assigns a type A to the term M in the context of E and Γ. A sequence of antecedents A1 → ⋯ → An → p is abbreviated with vector notation as A̅ → p.

The typing rules are syntax-driven, except for the 𝗅𝗈𝗐 rule to lower a given bound. The rule is not essential to the type system, since it is already implied in the meaning of types as giving a lower bound; but for the same reason, since it is implied, including it brings more clarity than omitting it. Note further that the typing rule 𝗀𝖾𝗇 assumes a fair coin toss for the generator ⊕, which gives the resulting probability of (1/2)p + (1/2)q. An unfair toss ⊕r with ratio r would give a probability rp + (1 − r)q.

Figure 2: The probabilistic type system 𝖯𝖤Λ.
Example 8.

To demonstrate our type system we reprise the example t[a,b] from [1, Section 6.2], written here as the term ⊕a.⊕b.⊕c.T below. In Figure 3 we derive the following type.

⊢ ⊕a.⊕b.⊕c.T : 1 → 3/8    where    T = ((I ⊕c Ω) ⊕b Ω) ⊕a (Ω ⊕b I)
Figure 3: Typing derivations for Example 8, where T = ((I ⊕c Ω) ⊕b Ω) ⊕a (Ω ⊕b I).
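The bound 3/8 can be checked by a direct calculation that mirrors the branching of the derivation: a generator averages the bounds of its two projections, and a choice is resolved by the current event valuation. The sketch below uses a hypothetical term encoding, assigning bound 1 to the head-normal I and 0 to the diverging Ω:

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def prob(t, env):
    """Least termination probability of an event-closed term built from
    I, Ω, fair generators, and choices, under event valuation env."""
    tag = t[0]
    if tag == "I":        # head normal: terminates with probability 1
        return Fraction(1)
    if tag == "omega":    # Ω: typed with termination probability 0
        return Fraction(0)
    if tag == "gen":      # average the two projections of the body
        _, a, m = t
        return HALF * prob(m, {**env, a: True}) + HALF * prob(m, {**env, a: False})
    if tag == "choice":   # resolved by the event valuation
        _, a, m, n = t
        return prob(m, env) if env[a] else prob(n, env)
    raise ValueError(t)

# T = ((I ⊕c Ω) ⊕b Ω) ⊕a (Ω ⊕b I), under three fair generators for a, b, c
I, O = ("I",), ("omega",)
T = ("choice", "a",
     ("choice", "b", ("choice", "c", I, O), O),
     ("choice", "b", O, I))
term = ("gen", "a", ("gen", "b", ("gen", "c", T)))
```

Evaluating prob(term, {}) works out to 1/2 · (1/2 · 1/2 + 1/2 · 0) + 1/2 · 1/2 = 3/8, matching the bound derived in Figure 3.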

We will discuss a number of aspects of our type system. First, as a particularly natural feature, observe that the rules 𝗏𝖺𝗋, 𝖺𝖻𝗌 and 𝖺𝗉𝗉 make it conservative over standard simply-typed λ-calculus – which, interestingly, is agnostic to the choice of base types.

Second, a key difference with the simple type system of Figure 1 is that probabilistic branching occurs in the generator rule, whereas for simple types it is the choice rule. This is due to aiming for head normalization instead of strong normalization. Consider the example term ⊕a.(M ⊕a (Ω ⊕a N)). Head reduction projects it to M and N, removing Ω, which will not be reduced. This is reflected in the probabilistic type system, where the branches of the generator typing rule project to M and N via the event valuation. The simple type system, reflecting strong normalization, would require also Ω to be typed, which of course it cannot.

For terms of the form M ⊕ N = ⊕a.(M ⊕a N) this difference is moot: both type systems branch similarly for this construct, to M and N. This leaves, as the only distinction, the generalisation of the types themselves, from a single base type o to probabilities p. In accordance with the meaning of a base type as the probability of termination, simple types are then recovered by restricting to the base type p = 1. This rules out the rules 𝗓𝖾𝗋𝗈 and 𝗅𝗈𝗐, which assign a zero-weighted type A̅ → 0, respectively reduce the weight of a type. It is easy to observe that the remaining rules preserve the restriction p = 1, to give the following proposition.

Proposition 9.

For the fragment below left, the simply-typed 𝖯𝖤Λ coincides with the probabilistically-typed 𝖯𝖤Λ restricted to the types below right.

M, N ::= x | λx.M | MN | ⊕a.(M ⊕a N)        A, B ::= 1 | A → B

Our main result for the probabilistically-typed 𝖯𝖤Λ is that a type A̅ → p guarantees head normalization with probability at least p. Formally, this is stated by a head reduction to a distribution of which a proportion of at least p is in head-normal form.

Theorem 10.

For closed M, if ⊢ M : A̅ → p then M →𝗁* 𝒩0 + 𝒩1 where all terms in 𝒩0 are head normal and |𝒩0| ≥ p.

The result will follow directly from the corresponding Theorem 22 for the expanded probabilistic calculus with sequential composition, introduced in Section 5.

4 Comparison with counting quantifiers

In this section we will give a close comparison with the type system Cλ of Antonelli, Dal Lago, and Pistone [1], and demonstrate that our approach gives tighter bounds on the probability of termination.

The first distinction between Cλ and 𝖯𝖤Λ is that the former uses indexed event variables xai for i ∈ ℕ instead of events a. However, there is no formal need for this: since the indices i are static, we may replace xai and xaj for distinct i and j simply with distinct events a and b. This simplification extends to the semantics. Probabilities in Cλ are given by boolean formulas 𝒷 over the variables xai, indicating a subset of the space (2^ℕ)^X where X is a finite set of events. However, since the indices i are fixed and bounded, it is sufficient to consider finite spaces 2^X. The elements of this set are the event valuations E with domain X, and a boolean formula 𝒷 over X indicates the set of event valuations 𝒷X = {E ∈ 2^X | E ⊨ 𝒷}, where E ⊨ 𝒷 is characterized syntactically as expected.

For a set ℬ ⊆ 2^X the measure μX(ℬ) is given by

μX(ℬ) = |ℬ| / 2^|X|

where |S| denotes the size of a set S. Then μ(𝒷) is μX(𝒷X) where X is the domain of 𝒷. This coincides with the measure μ(𝒷) over (2^ℕ)^X in [1], as the latter makes no essential use of the infinity offered by ℕ.
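The measure μX is readily computed by enumerating valuations. In the sketch below, a boolean formula over a finite event set X is represented, purely for illustration, as a predicate on valuations (dicts from events to booleans):

```python
from fractions import Fraction
from itertools import product

def measure(formula, X):
    """μ_X: the fraction of the 2^|X| valuations over X satisfying formula."""
    X = sorted(X)
    satisfying = sum(1 for bits in product([True, False], repeat=len(X))
                     if formula(dict(zip(X, bits))))
    return Fraction(satisfying, 2 ** len(X))
```

For instance, the atom a has measure 1/2 over X = {a}, and the conjunction a ∧ ¬b has measure 1/4 over X = {a, b}.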

The second difference is the syntax of types: Cλ introduces probabilities through counting quantifiers Cp, where 𝖯𝖤Λ has probabilities as base types. Types are nevertheless isomorphic: Cλ types 𝔰 are of the form Cp(𝔰1 → ⋯ → 𝔰n → o), with a fixed outer counting quantifier, and map 1-to-1 onto 𝖯𝖤Λ types A of the form A1 → ⋯ → An → p by associating the probability p instead with the consequent o. Formally, we encode types 𝔰 and typing contexts Γ of Cλ into our setting as ⟨𝔰⟩ and ⟨Γ⟩, defined as follows.

⟨Cq(o)⟩ = q        ⟨Cq(𝔰 → τ)⟩ = ⟨𝔰⟩ → ⟨Cq(τ)⟩        ⟨x1 : 𝔰1, …, xn : 𝔰n⟩ = x1 : ⟨𝔰1⟩, …, xn : ⟨𝔰n⟩
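The type translation is a single right fold over the antecedents. A sketch, with a hypothetical tuple encoding ("C", q, [𝔰1, …, 𝔰n]) for a Cλ type Cq(𝔰1 → ⋯ → 𝔰n → o), and with a probability q or ("arrow", A, B) for 𝖯𝖤Λ types:

```python
def encode(t):
    """Translate a Cλ type into a probabilistic 𝖯𝖤Λ type."""
    _, q, args = t
    result = q                    # the consequent o carries the probability q
    for s in reversed(args):      # fold the antecedents back on, rightmost first
        result = ("arrow", encode(s), result)
    return result
```

So C1(o) becomes the bare probability 1, and C1/2(C1(o) → o) becomes the arrow type 1 → 1/2.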

Having connected boolean formulas 𝒷 to event valuations E, and Cλ types to 𝖯𝖤Λ types, we may state the following conservativity result of 𝖯𝖤Λ over Cλ.

Proposition 11.

If E ∈ 𝒷X then

Γ ⊢X M : 𝒷𝔰    implies    E | ⟨Γ⟩ ⊢ M : ⟨𝔰⟩.

Proof.

By induction on the typing derivation for Γ ⊢X M : 𝒷𝔰.

Corollary 12.

For a given closed term M the type system 𝖯𝖤Λ gives termination bounds at least as high as those given by Cλ.

In the reverse direction, Example 8 and its counterpart in [1, Section 6.2] show that the two type systems do not give the exact same bounds, and in some cases 𝖯𝖤Λ gives a strictly higher bound. The reason is that 𝖯𝖤Λ locates branching between alternatives at the 𝗀𝖾𝗇-rule for ⊕a.M, where Cλ branches for choice terms M ⊕a N or in a contraction rule. Crucially, the 𝗀𝖾𝗇-rule allows branches with different termination bounds. We illustrate this further by attempting to simulate the rule for ⊕a.M in Cλ.

The branching at this rule in Cλ is captured with a contraction on the two premisses, which requires the probabilities to be equal, i.e. p = q. The derivation is as follows, where 𝒹 is the trivially true formula, with μ(𝒹) = 1, for the counting rule.

This is the issue illustrated by Example 8 and its counterpart in [1, Section 6.2], for which 𝖯𝖤Λ gives the actual termination probability of 3/8, while Cλ gives a best approximation of 1/4. Reprising the example in Cλ, the two sub-derivations for ⊕b.⊕c.T, given in condensed form below, assign probabilities of 1/4 and 1/2. These may only be combined in a contraction by lowering the second probability to match the 1/4 of the first.

The above analysis suggests that this issue with Cλ may be fixed by adopting the generator typing rule of 𝖯𝖤Λ, adjusted appropriately as follows. The key here is that different branches feature the dual atoms a and ¬a, seen in neither the simple type system 𝖯𝖤Λ nor the intersection type system in [1].

5 Sequencing

Two observations about probabilistic λ-calculi motivate the developments in the remainder of this paper. The first is the primary role of head reduction. It is well known that head normalization corresponds closely to evaluation on the Krivine Machine [23], and sometimes the machine gives a more natural model of what is being studied. The second is that probabilistic evaluation in λ-calculi needs to account for the difference between call–by–value (cbv) and call–by–name (cbn), to which end additional constructions are introduced, sometimes ad-hoc. The 𝖯𝖤Λ is an example, as is the separate consideration of a cbv- and a cbn-probabilistic sum by Faggian and Ronchi Della Rocca [16], while Antonelli, Dal Lago, and Pistone [1] add to the standard cbn-application a second cbv-application.

These observations prompted us to consider probabilistic termination from the perspective of the Functional Machine Calculus (FMC) [18], a λ-calculus with computational effects. Firstly, the Krivine Machine plays a central role in the FMC (indeed it is the “M” in “FMC”), while the FMC provides the machine with a new notion of successful termination, absent from the standard λ-calculus. It is then a natural question if and how this may be used to capture the probability of successful termination. Secondly, the FMC may express both cbn and cbv behaviour, with the cbn λ-calculus a fragment and the cbv λ-calculus encoded in the syntax. The need for ad-hoc constructs to control reduction behaviour is thus avoided.

The Krivine Machine, simplified by replacing environments with substitution, evaluates a λ-term in the presence of a stack of input terms. An abstraction λx.M pops the top off the stack, say N, and continues as {N/x}M, while an application MN pushes its argument N and continues as M. A λ-term may thus be viewed as a language of instruction sequences for this machine: application–push, abstraction–pop, variable–execute.
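The instruction-sequence reading can be made concrete. The sketch below implements the simplified Krivine machine for the plain λ-calculus, with a hypothetical tuple encoding and naive (capture-unsafe) substitution, which suffices for the closed examples shown:

```python
def subst(n, x, t):
    """Naive substitution {n/x}t; adequate for closed arguments n."""
    tag = t[0]
    if tag == "var":
        return n if t[1] == x else t
    if tag == "lam":
        _, y, m = t
        return t if y == x else ("lam", y, subst(n, x, m))
    _, m, p = t
    return ("app", subst(n, x, m), subst(n, x, p))

def krivine(term):
    """Run the simplified Krivine machine: application pushes its argument,
    abstraction pops the stack and substitutes. Returns the final state."""
    stack = []
    while True:
        tag = term[0]
        if tag == "app":                 # application: push the argument
            stack.append(term[2])
            term = term[1]
        elif tag == "lam" and stack:     # abstraction: pop and substitute
            term = subst(stack.pop(), term[1], term[2])
        else:                            # variable, or abstraction on an empty stack
            return term, stack
```

Running (λx.x)(λy.y) pushes λy.y, pops it into x, and halts on λy.y with an empty stack, mirroring head reduction to head-normal form.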

The FMC then extends the λ-calculus with sequential composition M;N and its unit, the imperative skip ★, with the expected semantics: concatenation of machine instructions and the empty instruction. As in models of imperative languages, skip indicates successful termination of the machine. This gives a fragment of the FMC called the sequential λ-calculus.

We adopt these modifications in the 𝖯𝖤Λ to give the sequential probabilistic event λ-calculus (𝖲𝖯𝖤Λ), defined below. Following [18] we render abstraction as ⟨x⟩.M = λx.M and application as [N].M = MN to emphasise the machine behaviour of pop and push, retaining the standard syntax as a shorthand. In particular, the new notation clarifies the interaction between push and sequencing: the following three terms are equivalent, rendered first in standard notation and second with prefix application.

(★N);M ≃ (★;M)N ≃ MN        ([N].★);M ≃ [N].(★;M) ≃ [N].M

The full FMC further generalises to a machine with multiple independent stacks, addressed by a set of locations, in which pop and push are then parameterised to operate on the corresponding stack. This allows us to encode the effects of mutable higher-order store, input/output, and indeed probabilistic computation: the generator ⊕a.M of the 𝖯𝖤Λ is, in the FMC, an abstraction 𝗋𝗇𝖽⟨a⟩.M parameterised to draw from a stream of random values labelled 𝗋𝗇𝖽. The 𝖲𝖯𝖤Λ is thus a fragment of the FMC with two locations.

Definition 13.

The sequential probabilistic event λ-calculus 𝖲𝖯𝖤Λ is given as follows.

M, N, P ::= x | ⟨x⟩.M | [N].M | ⊕a.M | M ⊕a N | ★ | N;M

Prefixing binds tighter than sequencing, [N].M;P = ([N].M);P, and sequencing associates to the right, M;N;P = M;(N;P). Projections and contexts extend to the 𝖲𝖯𝖤Λ as below; head contexts and applicative contexts are as for the 𝖯𝖤Λ.

πia ★ = ★        πia (N;M) = πia N ; πia M        C ::= ⋯ | C;M | M;C

The interaction between sequentiality and the λ-calculus is governed by the following sequencing reduction rules. These make sequential composition right-associative, and let the prefixing of push, pop, and the generator propagate past it (as in a standard list concatenation algorithm). The result is to make the first such action on the abstract machine the leading construct.

★;P →σ P
(N;M);P →σ N;(M;P)
⟨x⟩.M;P →σ ⟨x⟩.(M;P)    (x ∉ 𝖿𝗏(P))
[N].M;P →σ [N].(M;P)
⊕a.M;P →σ ⊕a.(M;P)    (a ∉ 𝖿𝖾(P))

The sequencing relation →σ is given by closing these rules under all contexts C, and head sequencing →σ𝗁 by closing under head contexts H only. The β-reduction rule in 𝖲𝖯𝖤Λ notation is as below left, with →β given by closing under all contexts and →β𝗁 by closing under head contexts. Projective reduction, below right, is as previously.

[N].⟨x⟩.M →β {N/x}M        H{⊕a.M} →π [ 1/2 H{π⊤a M}, 1/2 H{π⊥a M} ]

Head reduction →𝗁 is the union of all three head relations:

(→𝗁) = (→β𝗁) ∪ (→σ𝗁) ∪ (→π).

We round off by observing the shape of head-normal forms, assuming no free event variables.

Proposition 14.

The head-normal forms of event-closed 𝖲𝖯𝖤Λ-terms are one of the three forms H{★}, H{x}, and H{x;M}.

5.1 Encoding call–by–value

Sequential composition provides an essential element that the λ-calculus lacks, and which is at the heart of the cbv/cbn dichotomy: control over execution. The cbv behaviour of an application MN is encoded almost as N;M: first evaluate the argument N, then evaluate the function M (the full encoding, below, includes an extra part to also execute M).

We demonstrate the encoding of the cbv-probabilistic λ-calculus Λcbv of Faggian and Ronchi Della Rocca [16]. The encoding of cbv λ-terms is standard: see [13, 18, 28]. Values V, W and terms M, N encode by the translations 𝗏 and 𝗍 respectively, below.

Values V, W:    x𝗏 = x        (λx.M)𝗏 = ⟨x⟩.M𝗍
Terms M, N:    V𝗍 = [V𝗏].★        (MN)𝗍 = N𝗍 ; M𝗍 ; ⟨x⟩.x        (M ⊕ N)𝗍 = ⊕a.(M𝗍 ⊕a N𝗍)

The operational intuition is that a push represents a return value: a term M𝗍 evaluates until it is of the form V𝗍 = [V𝗏].★, at which point V𝗏 is pushed to the stack and the machine terminates. Then β-reduction is simulated as follows.

((λx.M)V)𝗍 = [V𝗏].★ ; [⟨x⟩.M𝗍].★ ; ⟨y⟩.y →σ* [V𝗏].[⟨x⟩.M𝗍].⟨y⟩.y →β [V𝗏].⟨x⟩.M𝗍 →β {V𝗏/x}M𝗍 = ({V/x}M)𝗍

The probabilistic reduction rule of Λcbv is that of projective reduction, under the given encoding, but it applies in surface contexts:

S ::= {} | MS | SM

Then the translation of S{M ⊕ N} indeed does not place the probabilistic redex inside a push, which is the requirement for correct behaviour.

5.2 The abstract machine

The small-step operational semantics of the 𝖲𝖯𝖤Λ is given by the following abstract machine. A state is a triple (S,M,K), where M is a term and S and K are stacks of terms. S is the operand stack, with the head to the right as SN, and K is the continuation stack, with the head to the left as NK. In both cases the empty stack is written ε, and concatenation by juxtaposition, ST. Transitions or steps are probabilistic: a transition rule is written as below left, expressing that the machine transitions from a state (S,M,K) to (T,N,L) with probability p. We may omit p when p=1. A run is a sequence of steps, written as below centre, where probabilities are multiplied, i.e. p below is the product of the probabilities of all steps. A run is successful if it terminates with skip and an empty continuation stack, as below right; the stack T then holds the return values of the computation.

step:  (S, M, K) →p (T, N, L)        run:  (S, M, K) ↠p (T, N, L)        successful run:  (S, M, K) ↠p (T, ⋆, ε)
Definition 15.

The sequential probabilistic machine (SPM) is given by the following probabilistic transitions.

(S, [N].M, K)    →    (S N, M, K)
(S N, ⟨x⟩.M, K)  →    (S, {N/x}M, K)
(S, N;M, K)      →    (S, N, M K)
(S, ⋆, M K)      →    (S, M, K)
(S, ⊕a.M, K)     →½   (S, π₁ᵃM, K)
(S, ⊕a.M, K)     →½   (S, π₂ᵃM, K)
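As an illustration, the transitions above can be implemented as an enumerator of successful runs. This is a simplified sketch under our own assumptions: the term representation is invented for this example, and a binary `Choice` constructor stands in for the generator-with-labelled-sum pattern ⊕a.(M +ₐ N), so that the projection meta-operation need not be implemented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:           # variable x
    name: str

@dataclass(frozen=True)
class Push:          # [N].M
    arg: object
    body: object

@dataclass(frozen=True)
class Pop:           # <x>.M
    name: str
    body: object

@dataclass(frozen=True)
class Seq:           # N;M
    first: object
    then: object

@dataclass(frozen=True)
class Skip:          # the identity term
    pass

@dataclass(frozen=True)
class Choice:        # fair choice, standing in for the generator
    left: object
    right: object

def subst(m, x, n):
    """{n/x}m for closed n (so no capture can occur)."""
    if isinstance(m, Var):
        return n if m.name == x else m
    if isinstance(m, Push):
        return Push(subst(m.arg, x, n), subst(m.body, x, n))
    if isinstance(m, Pop):
        return m if m.name == x else Pop(m.name, subst(m.body, x, n))
    if isinstance(m, Seq):
        return Seq(subst(m.first, x, n), subst(m.then, x, n))
    if isinstance(m, Choice):
        return Choice(subst(m.left, x, n), subst(m.right, x, n))
    return m  # Skip

def run(stack, term, cont, p=1.0):
    """Enumerate the successful runs from state (stack, term, cont),
    returning a list of (probability, return stack) pairs."""
    if isinstance(term, Push):
        return run(stack + [term.arg], term.body, cont, p)
    if isinstance(term, Pop):
        if not stack:
            return []                      # stuck: no operand to pop
        return run(stack[:-1], subst(term.body, term.name, stack[-1]), cont, p)
    if isinstance(term, Seq):
        return run(stack, term.first, [term.then] + cont, p)
    if isinstance(term, Skip):
        if not cont:
            return [(p, stack)]            # successful run
        return run(stack, cont[0], cont[1:], p)
    if isinstance(term, Choice):
        return (run(stack, term.left, cont, p / 2)
                + run(stack, term.right, cont, p / 2))
    return []                              # stuck on a free variable
```

For example, `run([], Choice(Push(Skip(), Skip()), Skip()), [])` yields two successful runs, `[(0.5, [Skip()]), (0.5, [])]`. Note the sketch only enumerates runs of terminating terms; a diverging term would make it recurse forever.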

5.3 Big-step semantics

Running the machine for a given term and input stack gives a distribution of return stacks. We use the following notation, extending that for distributions over terms.

𝒯  =  [p₁·T₁, …, pₙ·Tₙ]  =  [pᵢ·Tᵢ]_{i∈n}
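In code, such a distribution can be represented as probability–outcome pairs; its weight |𝒯| is the total probability, which may fall below one for subdistributions (a sketch with assumed toy values):

```python
# A subdistribution of return stacks as (probability, outcome) pairs.
T = [(0.5, "T1"), (0.25, "T2")]   # toy values; weight |T| = 0.75 < 1
weight = sum(p for p, _ in T)
assert weight == 0.75             # the missing 0.25 accounts for divergence
```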
Definition 16.

The evaluation relation S,M ⇓ 𝒯 is defined inductively by the following rules.

We demonstrate that small-step and big-step semantics agree.

Proposition 17.

S,M ⇓ 𝒯 if and only if there is a finite collection of n distinct runs

(S, M, ε) ↠pᵢ (Tᵢ, ⋆, ε)  (i ∈ n)    such that    𝒯 = [pᵢ·Tᵢ]_{i∈n}.

Proof.

(⇒) By induction on S,M ⇓ 𝒯. (⇐) By induction on each run in the collection of n runs.

6 Sequential probabilistic types

Types for the sequential λ-calculus are of the form below, with the meaning: given an input stack of terms typed by A1 through An, the machine will terminate successfully and return a stack with types B1 through Bm.

Aₙ ⋯ A₁ → B₁ ⋯ Bₘ

For the 𝖲𝖯𝖤Λ, we parameterize this with the probability of successful termination.

Definition 18.

Sequential probabilistic types are given by the following grammars, where p is a probability.

A, B, C  ::=  A¯ →p C¯    (types)        A¯  ::=  A₁ ⋯ Aₙ    (type vectors)

A typing judgement E | Γ ⊢ M : A assigns a term M the type A in the context of E and Γ, and E | Γ ⊢ S : A¯ assigns a stack of terms S a type vector A¯. The sequential probabilistic type system 𝖲𝖯𝖤Λ is given by the typing rules in Figure 4. We may omit p when p = 1.

There are no base types: their rôle is subsumed by types with empty vectors (→p). Observe that because stacks are last-in-first-out, the identity term on two elements is ⟨x⟩.⟨y⟩.[y].[x].⋆, i.e. with the order of x and y reversed between popping and pushing. We match this reversal in types, and assign this term the type AB → BA. Since we want identity types to be of the form A¯ → A¯, in a type A¯ →p C¯ we consider the antecedent type vector A¯ to be reversed, i.e.

A¯ → A¯  =  Aₙ ⋯ A₁ → A₁ ⋯ Aₙ.

Probabilistic 𝖯𝖤Λ-types embed into sequential types by A¯ → p = A¯ →p ε. With this identification, for 𝖯𝖤Λ-terms and -types the two type systems coincide.

Proposition 19.

For M a 𝖯𝖤Λ-term, E | Γ ⊢ M : A¯ → p if and only if E | Γ ⊢ M : A¯ →p ε.

Figure 4: The sequential probabilistic type system 𝖲𝖯𝖤Λ.

Every type is inhabited by a closed term. For a type A, define the zero term 0_A as follows: for A = B₁ ⋯ Bₘ →p C₁ ⋯ Cₙ,

0_A  =  ⟨x₁⟩ ⋯ ⟨xₘ⟩.[0_{C₁}] ⋯ [0_{Cₙ}].⋆
Proposition 20 (Type inhabitation).

Every type A is inhabited by its zero term, ⊢ 0_A : A.

Proof.

By induction on the type A, using the 𝗅𝗈𝗐 rule to lower the termination probability from 1 to an arbitrary p.
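As a sketch, the zero term can be computed by structural recursion on a type, here with an assumed representation of types and the terms rendered as strings (`<x>` for pop, `*` for skip):

```python
from dataclasses import dataclass

# A type  B1 ... Bm ->p C1 ... Cn  (representation assumed for illustration).
@dataclass(frozen=True)
class Arrow:
    p: float
    args: tuple      # antecedents B1 ... Bm
    results: tuple   # consequents C1 ... Cn

def zero_term(a):
    """Render 0_A: pop one variable per antecedent, push a zero term
    per consequent, then skip."""
    pops = "".join(f"<x{i}>." for i in range(1, len(a.args) + 1))
    pushes = "".join(f"[{zero_term(c)}]." for c in a.results)
    return pops + pushes + "*"

unit = Arrow(1.0, (), ())                         # the empty type, inhabited by *
example = zero_term(Arrow(0.5, (unit,), (unit,))) # -> <x1>.[*].*
```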

Type inhabitation is an important justification for the interpretation of types as a guarantee for successful machine termination: it means that for a typed term M : A¯ →p C¯, a suitable input stack S : A¯ always exists. Our main theorem, below, formalizes this interpretation: for a term M : A¯ →p C¯ and stack S : A¯, the machine successfully returns a stack T : C¯ with probability at least p.

Theorem 21.

For a typed, closed term ⊢ M : A¯ →p C¯ and stack ⊢ S : A¯ there is a finite set of distinct successful runs

(S, M, ε) ↠pᵢ (Tᵢ, ⋆, ε)  (i ∈ n)

with sum probability Σ_{i∈n} pᵢ ≥ p.

Proof.

See Appendix A.1.

The lower bound given by the theorem is exact when two conditions are met: the typing derivation does not use the 𝗅𝗈𝗐 rule to lower the termination bound, and every use of the 𝗓𝖾𝗋𝗈 rule to assign a zero probability applies to a term that is in fact non-terminating. In such cases, as in Example 8, types give an exact bound.

Since head reduction as a strategy closely follows machine evaluation, it is no surprise that the termination bound for the machine is also a lower bound for probabilistic head reduction. This is formalised in the following theorem.

Theorem 22.

For a closed M, if ⊢ M : A¯ →p C¯ then M ↠𝗁 𝒩₀ + 𝒩₁ where all terms in 𝒩₀ are head normal and |𝒩₀| ≥ p.

Proof.

See Appendix A.2.

Observe that our main theorem on probabilistic termination for the 𝖯𝖤Λ, Theorem 10, follows immediately from the corresponding Theorem 22 above by conservativity of the 𝖲𝖯𝖤Λ over the 𝖯𝖤Λ. This is despite the fact that the proof of Theorem 22 makes essential use of the notion of successful termination made available by sequencing, absent from the 𝖯𝖤Λ.

7 Conclusions

To us, what stands out about our approach is the simplicity and natural intuition of our type system. This manifests in the transparent reasoning in our proofs, which, despite using deep techniques such as abstract reducibility, give a direct and clear connection between types, machine behaviour, and reduction.

Compared with the approach in [1], the simplicity of our approach manifests in several advantages. The first is that we avoid the need for counting quantifiers, associating probabilities instead with base types, and use simple event valuations rather than boolean formulas. Second, it is clear that the expression of both call–by–name and call–by–value behaviour is an essential ingredient for a probabilistic calculus. We eschew the introduction of ad-hoc constructs, relying instead on a principled interpretation of cbv via sequential composition. Finally, where the main example [1, Example 6.2] requires intersection types to produce the exact bound, our approach does so directly in Example 8, while morally remaining within the realm of simple types, and strictly improving on the bounds provided by Cλ.

References

  • [1] Melissa Antonelli, Ugo Dal Lago, and Paolo Pistone. Curry and Howard meet Borel. In Proceedings of the 37th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS‘22), pages 45:1–45:13. ACM, 2022. doi:10.1145/3531130.3533361.
  • [2] Martin Avanzini, Ugo Dal Lago, and Akihisa Yamada. On probabilistic term rewriting. In John P. Gallagher and Martin Sulzmann, editors, Proc. Functional and Logic Programming - 14th International Symposium, FLOPS 2018, volume 10818 of Lecture Notes in Computer Science, pages 132–148. Springer, 2018. doi:10.1007/978-3-319-90686-7_9.
  • [3] Chris Barrett, Willem Heijltjes, and Guy McCusker. The Functional Machine Calculus II: Semantics. In 31st EACSL Annual Conference on Computer Science Logic (CSL 2023), volume 252 of Leibniz International Proceedings in Informatics (LIPIcs), pages 10:1–10:18, 2023. doi:10.4230/LIPIcs.CSL.2023.10.
  • [4] Raven Beutner and Luke Ong. On probabilistic termination of functional programs with continuous distributions. In Stephen N. Freund and Eran Yahav, editors, Proceedings 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI), pages 1312–1326. ACM, 2021. doi:10.1145/3453483.3454111.
  • [5] Flavien Breuvart and Ugo Dal Lago. On intersection types and probabilistic lambda calculi. In Proceedings of the 20th International Symposium on Principles and Practice of Declarative Programming, PPDP 2018, pages 8:1–8:13. ACM, 2018. doi:10.1145/3236950.3236968.
  • [6] Fredrik Dahlqvist and Dexter Kozen. Semantics of higher-order probabilistic programs with conditioning. Proceedings of the ACM on Programming Languages, 4(POPL):57:1–57:29, 2020. doi:10.1145/3371125.
  • [7] Ugo Dal Lago, Claudia Faggian, and Simona Ronchi Della Rocca. Intersection types and (positive) almost-sure termination. Proceedings of the ACM on Programming Languages, 5(POPL):1–32, 2021. doi:10.1145/3434313.
  • [8] Ugo Dal Lago and Charles Grellois. Probabilistic termination by monadic affine sized typing. ACM Transactions on Programming Languages and Systems, 41(2):10:1–10:65, 2019. doi:10.1145/3293605.
  • [9] Ugo Dal Lago, Giulio Guerrieri, and Willem Heijltjes. Decomposing probabilistic lambda-calculi. In Proceedings of Foundations of Software Science and Computation Structures (FoSSaCS), volume 12077 of LNCS, pages 136–156, 2020. doi:10.1007/978-3-030-45231-5_8.
  • [10] Ugo Dal Lago and Margherita Zorzi. Probabilistic operational semantics for the lambda calculus. RAIRO - Theoretical Informatics and Applications, 46(3):413–450, 2012. doi:10.1051/ita/2012012.
  • [11] Karel de Leeuw, Edward F. Moore, Claude E. Shannon, and Norman Shapiro. Computability by probabilistic machines. Automata studies, 34:183–198, 1956.
  • [12] Ugo De’Liguoro and Adolfo Piperno. Non deterministic extensions of untyped lambda-calculus. Information and Computation, 122(2):149–177, 1995. doi:10.1006/inco.1995.1145.
  • [13] Rémi Douence and Pascal Fradet. A systematic study of functional language implementations. ACM Transactions on Programming Languages and Systems, 20(2):344–387, 1998. doi:10.1145/276393.276397.
  • [14] Thomas Ehrhard and Giulio Guerrieri. The bang calculus: An untyped lambda-calculus generalizing call-by-name and call-by-value. In Proceedings of the 18th International Symposium on Principles and Practice of Declarative Programming (PPDP’16), pages 174–187, 2016. doi:10.1145/2967973.2968608.
  • [15] Thomas Ehrhard and Christine Tasson. Probabilistic call by push value. Logical Methods in Computer Science, 15(1), 2019. doi:10.23638/LMCS-15(1:3)2019.
  • [16] Claudia Faggian and Simona Ronchi Della Rocca. Lambda calculus and probabilistic computation. In 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, pages 1–13, 2019. doi:10.1109/LICS.2019.8785699.
  • [17] Jean Goubault-Larrecq. A probabilistic and non-deterministic call-by-push-value language. In 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, pages 1–13. IEEE Computer Society, 2019. doi:10.1109/LICS.2019.8785809.
  • [18] Willem Heijltjes. The functional machine calculus. In Proceedings of the 38th Conference on the Mathematical Foundations of Programming Semantics, MFPS XXXVIII, volume 1 of ENTICS, 2022. doi:10.46298/ENTICS.10513.
  • [19] C. Jones and G. D. Plotkin. A probabilistic powerdomain of evaluations. In Proceedings of the Fourth Annual Symposium on Logic in Computer Science (LICS ’89), pages 186–195. IEEE Computer Society, 1989. doi:10.1109/LICS.1989.39173.
  • [20] Achim Jung and Regina Tix. The troublesome probabilistic powerdomain. Electronic Notes in Theoretical Computer Science, 13:70–91, 1998. doi:10.1016/S1571-0661(05)80216-6.
  • [21] Benjamin Kaminski. Advanced Weakest Precondition Calculi for Probabilistic Programs. PhD thesis, Fakultät für Mathematik, Informatik und Naturwissenschaften, RWTH Aachen University, 2019. doi:10.18154/RWTH-2019-01829.
  • [22] Naoki Kobayashi, Ugo Dal Lago, and Charles Grellois. On the termination problem for probabilistic higher-order recursive programs. In Proceedings of the 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, pages 1–14. IEEE, 2019. doi:10.1109/LICS.2019.8785679.
  • [23] Jean-Louis Krivine. A call-by-name lambda-calculus machine. Higher-Order and Symbolic Computation, 20(3):199–207, 2007. doi:10.1007/s10990-007-9018-9.
  • [24] Thomas Leventis. A deterministic rewrite system for the probabilistic λ-calculus. Mathematical Structures in Computer Science, 29(10):1479–1512, 2019. doi:10.1017/S0960129519000045.
  • [25] Paul Blain Levy. Call-by-push-value: A functional/imperative synthesis, volume 2 of Semantic Structures in Computation. Springer Netherlands, 2003. doi:10.1007/978-94-007-0954-6.
  • [26] Udi Manber and Martin Tompa. Probabilistic, nondeterministic, and alternating decision trees. In 14th Annual ACM Symposium on Theory of Computing, pages 234–244, 1982. doi:10.1145/800070.802197.
  • [27] Annabelle McIver and Carroll Morgan. Abstraction and refinement in probabilistic systems. SIGMETRICS Perform. Eval. Rev., 32(4):41–47, March 2005. doi:10.1145/1059816.1059824.
  • [28] A.J. Power and Hayo Thielecke. Closed Freyd- and κ-categories. In International Colloquium on Automata, Languages, and Programming (ICALP), volume 1644 of Lecture Notes in Computer Science, pages 625–634. Springer, 1999. doi:10.1007/3-540-48523-6_59.
  • [29] Norman Ramsey and Avi Pfeffer. Stochastic lambda calculus and monads of probability distributions. In Conference Record of POPL 2002: The 29th SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’02, pages 154–165, 2002. doi:10.1145/503272.503288.
  • [30] Nasser Saheb-Djahromi. Probabilistic LCF. In Mathematical Foundations of Computer Science 1978, Proceedings, 7th Symposium, volume 64 of Lecture Notes in Computer Science, pages 442–451. Springer, 1978. doi:10.1007/3-540-08921-7_92.
  • [31] Davide Sangiorgi and Valeria Vignudelli. Environmental bisimulations for probabilistic higher-order languages. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, pages 595–607, 2016. doi:10.1145/2837614.2837651.

Appendix A Proofs for Section 6

A.1 Probabilistic machine termination

Our main theorem for the typed 𝖲𝖯𝖤Λ will be that a type A¯ →p B¯ guarantees a probability of termination of the machine of at least p, given an input stack of type A¯, and returning a stack of type B¯. The proof is a direct application of abstract reducibility. For each type we define a set run(A) that holds the terms with the above property, and we proceed to prove that every term of type A belongs to this set.

We write 𝒟p(X) for the set of finite distributions 𝒳 of weight |𝒳| ≥ p over a set X.

Definition 23.

The set run(−) is defined by mutual induction on types A and type vectors A¯, as a set of closed terms, respectively closed stacks, as follows.

run(A¯ →p B¯)  =  { M ∣ ∀S ∈ run(A¯). ∃𝒯 ∈ 𝒟p(run(B¯)). S,M ⇓ 𝒯 }
run(A₁ ⋯ Aₙ)  =  { ε M₁ ⋯ Mₙ ∣ Mᵢ ∈ run(Aᵢ) }

For the proof of Lemma 25, the main reducibility lemma, we need the following notation, as well as an additional lemma. We write σ for a substitution map {Mᵢ/xᵢ}_{i∈n}, where σM is the application of σ to M. The set run(−) then extends to contexts Γ as follows. Note that the definition implies that σ is closed (i.e. each substituting term is closed).

run(x₁:A₁, …, xₙ:Aₙ)  =  { σ ∣ σxᵢ ∈ run(Aᵢ), 1 ≤ i ≤ n }

For an event valuation E = (aₖ ↦ iₖ)_{k∈n} we write πE M for the projection on each aₖ in E.

πE M  =  π_{i₁}^{a₁}( ⋯ (π_{iₙ}^{aₙ} M) ⋯ )

We expand the stacks in a distribution 𝒯 by prepending a stack S as S·𝒯, where S·[pᵢ·Tᵢ]_{i∈n} = [pᵢ·(S Tᵢ)]_{i∈n}.

Lemma 24.

If S,M ⇓ 𝒯 then R S, M ⇓ R·𝒯 for any stack R.

Proof.

By induction on the definition of ⇓.

The main reducibility lemma is then the following.

Lemma 25.

If E | Γ ⊢ M : A and σ ∈ run(Γ) then πE(σM) ∈ run(A).

Proof.

By induction on the typing derivation for M. Note that since σ is closed, πE(σM) = σ(πE(M)). We cover three key cases (sequencing, abstraction, and generator); the remaining are similar.

Sequencing case:

Let N′ = πE(σN) and M′ = πE(σM). Given R ∈ run(A¯), by induction we have R,N′ ⇓ [pᵢ·Sᵢ]_{i∈n} with each Sᵢ ∈ run(B¯) and Σ_{i∈n} pᵢ ≥ p. Again by induction, for each i ∈ n we have Sᵢ,M′ ⇓ 𝒯ᵢ with 𝒯ᵢ ∈ 𝒟q(run(C¯)). The definition of ⇓ gives R,(N′;M′) ⇓ 𝒯 for 𝒯 = Σ_{i∈n} pᵢ·𝒯ᵢ. Finally, observe that 𝒯 ∈ 𝒟pq(run(C¯)), since |𝒯| = Σ_{i∈n} pᵢ·|𝒯ᵢ| ≥ pq and since each 𝒯ᵢ is a distribution over run(C¯).
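The weight bookkeeping in the sequencing case is plain arithmetic; a numeric illustration with assumed toy values:

```python
# Sequencing case, numerically: N yields outcomes with weights p_i
# (summing to at least p), and M continues from each outcome with a
# distribution of weight |T_i| >= q. The composite weight is then >= p*q.
p_i = [0.5, 0.25]                  # toy weights for N; sum 0.75 >= p = 0.75
t_i = [0.8, 1.0]                   # toy weights for M; each >= q = 0.8
total = sum(p * t for p, t in zip(p_i, t_i))
assert total >= 0.75 * 0.8         # |T| >= p*q
```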

Abstraction case:

Let σ ∈ run(Γ) and S N ∈ run(B¯A). We need to demonstrate a distribution 𝒯 ∈ 𝒟p(run(C¯)) such that S N, πE(σ(⟨x⟩.M)) ⇓ 𝒯. Let M′ = πE(σM) and note that, since σ is not defined on x and σ and N are closed, πE(σ{N/x}M) = {N/x}M′ and πE(σ(⟨x⟩.M)) = ⟨x⟩.M′. The inductive hypothesis for the premise E | Γ, x:A ⊢ M : B¯ →p C¯, with σ{N/x} ∈ run(Γ, x:A), gives the desired 𝒯 ∈ 𝒟p(run(C¯)) with evaluation S, {N/x}M′ ⇓ 𝒯. Then by the definition of evaluation, S N, ⟨x⟩.M′ ⇓ 𝒯, and hence ⟨x⟩.M′ ∈ run(A B¯ →p C¯).

Generator case:

Let σ ∈ run(Γ) and S ∈ run(A¯). Let Mᵢ = π_{E, a↦i}(σM) for i ∈ {1, 2}. By the inductive hypothesis, M₁ ∈ run(A¯ →p C¯) and M₂ ∈ run(A¯ →q C¯), which gives S,M₁ ⇓ 𝒯₁ and S,M₂ ⇓ 𝒯₂ for some 𝒯₁ ∈ 𝒟p(run(C¯)) and 𝒯₂ ∈ 𝒟q(run(C¯)). Let r = ½(p+q); then 𝒯 = ½(𝒯₁ + 𝒯₂) ∈ 𝒟r(run(C¯)) and by the definition of evaluation, S, πE(σ(⊕a.M)) ⇓ 𝒯. We conclude that πE(σ(⊕a.M)) ∈ run(A¯ →r C¯).

Our main theorem, the probability of machine termination, is a direct consequence.

Theorem 21 (restatement).

For a typed, closed term ⊢ M : A¯ →p C¯ and stack ⊢ S : A¯ there is a finite set of distinct successful runs

(S, M, ε) ↠pᵢ (Tᵢ, ⋆, ε)  (i ∈ n)

with sum probability Σ_{i∈n} pᵢ ≥ p.

Proof.

By Lemma 25 we have M ∈ run(A¯ →p C¯) and S ∈ run(A¯). By the definition of run(−) we then have S,M ⇓ 𝒯 with |𝒯| ≥ p. Proposition 17 then gives the desired set of machine runs.

A.2 Probabilistic head normalization

Finally, we will relate machine termination to head reduction. For a redex in a head context λx₁⋯λxₙ.{ } M₁ ⋯ Mₘ the machine runs as follows: after the abstractions consume the top n elements of the stack and the applications push the terms Mᵢ onto it, the redex itself is the first part of the term to be evaluated on the machine. Machine evaluation thus corresponds tightly to head reduction, with the same order of evaluation of redexes.

However, in this correspondence the machine halts at a variable, while successful termination in our setting requires a skip ⋆ with an empty continuation stack. We will thus use a different approach.

The main idea is that reduction shortens the runs of the machine, by removing consecutive push and pop operations, or in the case of projective reduction, removing a generator. By annotating the evaluation relation to count abstractions, applications, and generators, we may then observe that this measure decreases under head reduction.

Definition 26.

The weighted evaluation relation S,M ⇓w 𝒯 is given in Figure 5, where the weight w is a natural number. We extend it to a distribution of terms ℳ by the following rule, carrying a multiset of weights 𝒲.

Figure 5: The weighted probabilistic evaluation relation.

The core lemma establishes that the weight in the evaluation relation decreases under →β𝗁 and →π reduction steps, and is stable under →σ𝗁. However, this does not apply to terms introduced by the zero-rule, as S,M ⇓₀ ∅, which are the potentially non-terminating terms. Crucially for our purpose, when the evaluation returns a non-empty distribution, head reduction on the term progresses towards a head-normal form.

Lemma 27.

Let S,M ⇓w 𝒯 where 𝒯 ≠ ∅.

  • If M →σ𝗁 N then S,N ⇓w 𝒯.

  • If M →β𝗁 N then S,N ⇓v 𝒯 with v < w.

  • If M →π 𝒩 then S,𝒩 ⇓𝒲 𝒯 with v < w for every v in 𝒲.

Proof.

We first consider the reduction steps themselves, and then their closure under head contexts. The proof is by induction on the definition of ⇓. We consider three cases; the remaining are similar.

Beta

[N].⟨x⟩.M →β {N/x}M  Since 𝒯 is non-empty, the derivation must be as below left, as none of the inferences may be replaced by a zero-rule. The case is then as follows.

Projection

⊕a.M →π [½·π₁ᵃM, ½·π₂ᵃM]  Since 𝒯 is non-empty, the derivation for the redex is the first below. The derivation for the reduct, second below, uses the evaluation rule for distributions.

Sequence–generator

⊕a.M ; P →σ ⊕a.(M;P)  (a ∉ 𝖿𝖾(P))  The derivations are as follows, with premises stacked for space. In the second derivation, w = w′ + Σ_{i∈n} uᵢ and v = v′ + Σ_{n<i≤m} uᵢ.

This concludes the reduction steps. The remaining cases consider the closure under head contexts. For sequencing reduction, which leaves the weight unchanged, this is immediate, and the cases for beta-steps are straightforward and omitted. For projective reduction, let M →π [½·N₁, ½·N₂], and consider the remaining cases.

Application

[P].M →π [½·[P].N₁, ½·[P].N₂]  The derivation for [P].M is as follows.

By induction, for the premise of this derivation we get one for the distribution 𝒩 = [½·N₁, ½·N₂], below, where it follows that 𝒯 = ½·𝒯₁ + ½·𝒯₂ and u, v < w.

Then for 𝒫 = [½·[P].N₁, ½·[P].N₂] we get the following derivation.

Abstraction

⟨x⟩.M →π [½·⟨x⟩.N₁, ½·⟨x⟩.N₂]  The derivation for ⟨x⟩.M is as follows.

Given the reduction for M, we also have the following, where we abbreviate the reduct as 𝒫.

{P/x}M  →π  [½·{P/x}N₁, ½·{P/x}N₂]  =  𝒫

By induction this gives us the following derivation for 𝒫, where again 𝒯 = ½·𝒯₁ + ½·𝒯₂ and u, v < w.

Then for 𝒩 = [½·⟨x⟩.N₁, ½·⟨x⟩.N₂] we get the required derivation.

We apply the above lemma to relate machine evaluation to head reduction in the following way: if evaluation returns a distribution of weight p, then head-reduction terminates with probability at least p.

Lemma 28.

If S,ℳ ⇓𝒲 𝒯 then ℳ ↠𝗁 𝒩₀ + 𝒩₁ where all terms in 𝒩₀ are head normal and |𝒩₀| ≥ |𝒯|.

Proof.

Let ℳ = [pᵢ·Mᵢ]_{i∈n} evaluate by the following derivation.

First, we head-reduce any Mᵢ where 𝒯ᵢ ≠ ∅. This process terminates, by induction on 𝒲 using Lemma 27: a β-step or projective step reduces 𝒲, while sequencing reduction alone is terminating and does not increase it.

Then the distribution 𝒩₀ = [pᵢ·Mᵢ ∣ i ∈ n, 𝒯ᵢ ≠ ∅] is head-normal. The weights of 𝒩₀ and 𝒯 are as follows.

|𝒩₀|  =  Σ { pᵢ ∣ i ∈ n, 𝒯ᵢ ≠ ∅ }        |𝒯|  =  Σ_{i∈n} pᵢ·|𝒯ᵢ|

Since each |𝒯ᵢ| ≤ 1, it follows that |𝒩₀| ≥ |𝒯|.
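Numerically, with assumed toy values: dropping the factor |𝒯ᵢ| ≤ 1 from each summand can only increase the sum.

```python
# Toy weights p_i and |T_i| for a three-component distribution;
# the last component has an empty T_i (a potentially diverging term).
p = [0.5, 0.3, 0.2]
t = [1.0, 0.6, 0.0]
n0 = sum(pi for pi, ti in zip(p, t) if ti > 0)   # |N0|: keep p_i where T_i nonempty
tw = sum(pi * ti for pi, ti in zip(p, t))        # |T|: p_i scaled down by |T_i| <= 1
assert n0 >= tw
```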

Our final theorem then ties everything together: typing gives successful evaluation on the machine, which in turn gives termination of head reduction.

Theorem 22 (restatement).

For a closed M, if ⊢ M : A¯ →p C¯ then M ↠𝗁 𝒩₀ + 𝒩₁ where all terms in 𝒩₀ are head normal and |𝒩₀| ≥ p.

Proof.

We provide M with an input stack consisting only of zero terms: let S be the stack of zero terms 0_A for the types A in A¯. By Proposition 20 zero terms inhabit their types, so that ⊢ S : A¯.

By Lemma 25 we have S ∈ run(A¯) and M ∈ run(A¯ →p C¯). By the definition of run(−) this gives an evaluation S,M ⇓ 𝒯 where |𝒯| ≥ p.

For the corresponding weighted derivation S,M ⇓w 𝒯, Lemma 28 gives a head reduction M ↠𝗁 𝒩₀ + 𝒩₁ where all terms in 𝒩₀ are head normal and |𝒩₀| ≥ |𝒯| ≥ p.