Simple Types for Probabilistic Termination
Abstract
We present a new typing discipline to guarantee the probability of termination in probabilistic lambda-calculi. The main contribution is its particular naturality and simplicity: our probabilistic types are simple types, but generated from probabilities as base types, each representing a least probability of termination. Simple types are recovered by restricting probabilities to one.
Our vehicle is the Probabilistic Event Lambda-Calculus by Dal Lago, Guerrieri, and Heijltjes, which presents a solution to the issue of confluence in probabilistic lambda-calculi. Our probabilistic type system provides an alternative solution to that using counting quantifiers by Antonelli, Dal Lago, and Pistone, for the same calculus.
The problem that both type systems address is to give a lower bound on the probability that terms head-normalize. Following the recent Functional Machine Calculus by Heijltjes, our development takes the (simplified) Krivine machine as primary, and proceeds via an extension of the calculus with sequential composition and identity on the machine. Our type system then gives a natural account of termination probability on the Krivine machine, reflected back onto head-normalization for the original calculus. In this way we are able to avoid the use of counting quantifiers, while improving on the termination bounds given by Antonelli, Dal Lago, and Pistone.
Keywords and phrases: lambda-calculus, probabilistic termination, simple types
2012 ACM Subject Classification: Theory of computation → Lambda calculus; Theory of computation → Type theory; Theory of computation → Probabilistic computation
Acknowledgements: We would like to thank Melissa Antonelli, Ugo Dal Lago, and Paolo Pistone for the constructive conversation about both our approaches, and the anonymous referees for their helpful commentary.
DOI: 10.4230/LIPIcs.CSL.2025.31
Editors: Jörg Endrullis and Sylvain Schmitz
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
While the study of probabilistic computation can be traced to the 1950s [11], the first study of the probabilistic λ-calculus in particular is considered to be by Saheb-Djahromi in the late 70s [30]. In the near half-century since, many variations on higher-order probabilistic computation have been considered [10, 12, 19, 20, 26, 29]. In recent years, perhaps due to the potential for applications in machine learning and the modelling of probabilistic systems, the area has seen a return to popularity [5, 6, 16, 17, 24, 31]. An important computational phenomenon in its own right, the study of probabilistic choice can also provide a “foot in the door” for understanding how more general effects might manifest, leading for instance to the recent Functional Machine Calculus (FMC), a confluent λ-calculus with effects [3, 18] that will play a central role in our development.
In this paper we consider the problem of probabilistic termination: the probability that a given reduction mechanism reaches the normal form of a term, a key consideration for probabilistic computation and the subject of significant recent attention [4, 7, 8, 15, 22]. (Note that this objective is distinct from almost-sure termination: it considers exact probabilities, not those converging in the limit. Almost-sure termination is more commonly studied for iterative constructs [21, 27]; extending the lambda-calculus with an almost-surely terminating iterator is the subject of future work.) Termination being one of the prime considerations of type systems in general, the question of type systems for probabilistic termination is pertinent [1, 7]. Our contribution in this paper is a new typing discipline for probabilistic termination, following the recent line of work on the probabilistic event λ-calculus [9] and the FMC. Their prime features are confluence for probabilistic computation and other effects, and the encoding of multiple reduction strategies within a single calculus.
One of the greatest challenges facing the study of probabilistic computation is confluence. In most variants the outcome of duplicating a probabilistic term is strategy-dependent. Consider the instruction “write down the result of flipping a coin twice”: it is ambiguous whether “twice” refers to “write” or “flipping”, and the choice of interpretation changes the possible outcomes. In most of the literature results are therefore either restricted to a single strategy, such as call–by–value or call–by–name, or rely on a modified definition of reduction. This raises the question whether it is possible to have confluence for probabilistic computation, while expressing different strategies, and rewriting without restriction on contexts, properties which lend elegance to the λ-calculus. The probabilistic event λ-calculus [9] proposes a solution to this problem by decomposing a probabilistic sum into a choice operation t ⊕_a u, understood as a conditional “if a then t else u” on a boolean event variable a, and a probabilistic generator, representing a coin toss whose outcome is bound to a, which then determines the choice. When in argument position the generator acts as a wrapper, similar to thunking in call–by–push–value [25] and the bang-calculus [14, 15], protecting it from evaluation. These factors combine to return confluence to the calculus, allowing for arbitrary evaluation strategies with an unrestricted β-reduction.
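The coin-duplication ambiguity can be made concrete. The following small Python sketch (a hypothetical illustration, not an encoding of the calculus) compares the two readings: duplicating the instruction before sampling, so that each copy flips its own coin, versus sampling once and duplicating the result.

```python
import itertools

# "Write down the result of flipping a coin twice": two readings.

def outcomes_call_by_name():
    # "flipping" is duplicated: two independent flips, four equal branches
    return {(c1, c2): 0.25 for c1, c2 in itertools.product("HT", repeat=2)}

def outcomes_call_by_value():
    # the flip happens once and its result is duplicated: two equal branches
    return {(c, c): 0.5 for c in "HT"}

print(sorted(outcomes_call_by_name()))   # four outcomes, incl. ('H', 'T')
print(sorted(outcomes_call_by_value()))  # only ('H', 'H') and ('T', 'T')
```

The two readings give genuinely different distributions, which is why an unrestricted duplication step breaks confluence in a naive probabilistic calculus.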
The probabilistic event λ-calculus was introduced in [9] with a simple type system, corresponding to that of the λ-calculus. This system ignored the probabilistic elements of the calculus, and thus gave little insight into the properties of this extension. In a deterministic calculus a type system provides various safety guarantees. These may be qualitative – termination, outputs, composability – or quantitative – run time, term size. In the probabilistic setting, however, it is natural to wonder what guarantees can be made when the behaviour of a single term can vary between iterations. If, as in the type system mentioned, the probabilities are ignored, strong results can be obtained, albeit for a fairly uninteresting fragment of the calculus. Once probability is acknowledged by the type system, the guarantees made become similarly qualified: probability of termination, almost-sure termination, expected run time. Here, we consider the first of these.
One proposed type system for this calculus [1] introduces counting quantifiers, as a Curry–Howard correspondent to an intuitionistic counting propositional logic. These provide a mechanism to express a “proportion of truth” by quantifying the number of satisfying assignments to a formula within a given model, in the way that existential and universal quantification may be understood via the existence of a satisfying assignment, respectively the satisfaction of all assignments. By taking the assignments to a formula to describe the branches of a probabilistic computation, counting quantifiers may be used to describe probability bounds. By this mechanism, the system of [1] provides a lower bound on the probability of head normalisation. However, as illustrated in [1], the bounds provided are not tight, even in situations where this would be expected (see Example 8).
In this paper we present an alternative approach to the same problem, for the same calculus, using a different inspiration, to provide improved termination bounds. Deriving from the λ-calculus, the FMC provides additional features that, to us, seemed crucial to a natural account of probabilistic termination via the type system. First, the “machine” in question is the (simplified) Krivine machine [23], whose evaluation is closely related to head normalisation. The FMC here adds an intriguing aspect: it introduces sequential composition and identity on the machine, where the latter, the imperative skip, provides a notion of successful termination, notably absent from the machine for the standard λ-calculus. Moreover, this becomes the primary interpretation of types: a type derivation in the FMC is a proof that the machine successfully terminates. Our main question for this work was how to adapt this to capture probabilistic termination. The answer, described in Section 6, is an extended sequential probabilistic event λ-calculus, with a type system that naturally describes probabilistic termination of the machine, as well as probabilistic termination of head reduction.
The type system for the extended sequential calculus reflects back onto a natural type system for the original probabilistic event λ-calculus, which gives probabilistic head normalization in the following way: types generalize simple types by replacing the base type with a probability. Formally,

    τ, σ ::= p | τ → σ

where p is a rational number between zero and one inclusive. The intuition is that the base type of simple types may be understood as signifying successful evaluation, even if it is uninhabited. After all, a constant base type such as that for booleans or integers signifies the successful return of a corresponding value. This further matches the interpretation in the FMC, where the corresponding type is inhabited by skip, and corresponds to termination of the machine without producing a result. Our approach, then, is to replace certain termination with a probability of termination, by replacing the base type with a probability p. We consider the striking simplicity and natural intuition of this approach one of our main contributions.
2 The probabilistic event lambda-calculus
We recall the probabilistic event lambda-calculus from [9]. Assume countable sets of variables, ranged over by x, y, z, and of events, ranged over by a, b, c. The former are term variables, to be instantiated by terms of the calculus, and the latter are boolean variables, to be instantiated by true or false. The calculus extends the λ-calculus with a generator, which flips a coin and binds the result (true or false) to an event a, and a choice or conditional t ⊕_a u, which evaluates to t if a is true, and to u otherwise. A traditional probabilistic sum of terms, t ⊕ u, may be encoded as a generator for a fresh event a immediately followed by the choice t ⊕_a u. Generators are normally fair, with equal probability for true and false, though on occasion we may need a biased generator, which chooses true with probability r and false with probability 1 − r.
Definition 1.
Terms are given by the following grammar:

    t, u ::= x | λx.t | t u | a.t | t ⊕_a u

with, from left to right: a variable; an abstraction, which binds x in t; an application; a generator a.t, which binds a in t; and a choice.
The free variables and free events of a term t are written fv(t) and fe(t) respectively. Substitution is written prefix: {u/x}t is the capture-avoiding substitution of u for x in t. For an event a we define two projection functions, which apply to a term t the effect of instantiating a with true and false respectively.
Definition 2.
The projection functions for an event a are given as follows.
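The effect of the projections can be sketched in code. The following Python sketch uses a hypothetical tuple encoding of terms (not part of the calculus): `('var', x)`, `('lam', x, t)`, `('app', t, u)`, `('gen', a, t)`, `('choice', a, t, u)`. Projection resolves every choice on the instantiated event and stops at a rebinding of the same event.

```python
def project(t, a, v):
    """Instantiate event a with boolean v in term t (sketch encoding)."""
    tag = t[0]
    if tag == 'var':
        return t
    if tag == 'lam':
        _, x, body = t
        return ('lam', x, project(body, a, v))
    if tag == 'app':
        _, f, u = t
        return ('app', project(f, a, v), project(u, a, v))
    if tag == 'gen':
        _, b, body = t
        # a generator for the same event shadows a: projection stops there
        return t if b == a else ('gen', b, project(body, a, v))
    if tag == 'choice':
        _, b, left, right = t
        if b == a:
            # resolve the choice according to the instantiated value
            return project(left if v else right, a, v)
        return ('choice', b, project(left, a, v), project(right, a, v))
    raise ValueError(tag)
```

For instance, projecting `x ⊕_a y` with a true yields `x`; projecting under a rebinding of a leaves the term unchanged.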
We use the standard notions of context, head context, and applicative context to define the reduction relations. Note that a head context is of the form λx1 … λxn.⟨·⟩ u1 … um, where n and m are potentially zero.
Definition 3.
Contexts C, head contexts H, and applicative contexts A are defined as follows. A context C with the hole replaced by t, capturing variables, is written C⟨t⟩.
A probability p is a rational number between 0 and 1 inclusive. Probabilistic reduction will return a multi-(sub-)distribution [2], a finite multiset of weighted terms, written p1·t1 + … + pn·tn, whose weight p1 + … + pn is (at most) one. (We use multi-distributions rather than distributions to accommodate the reduction measure for the proof of head normalization in Appendix A.2.) For simplicity, we will refer to these as distributions, and we convert implicitly between terms and the singleton distribution 1·t. Multiset union is written with +, and the empty multiset as ∅. A distribution D is scaled to p·D by multiplying each weight in D by p. The underlying probability (sub-)distribution of D is the finite function from terms to probabilities obtained by collecting like terms and summing their weights.
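The multi-distribution operations just described – scaling, union, and collecting the underlying distribution – can be sketched directly, representing a multi-distribution as a list of weight–term pairs:

```python
# Multi-distributions as lists of (weight, term) pairs.

def scale(p, dist):
    """Scale a multi-distribution by multiplying every weight by p."""
    return [(p * q, t) for q, t in dist]

def union(d1, d2):
    """Multiset union: concatenation of the weighted terms."""
    return d1 + d2

def underlying(dist):
    """The underlying (sub-)distribution: collect like terms, sum weights."""
    out = {}
    for q, t in dist:
        out[t] = out.get(t, 0) + q
    return out

# a fair split of 'x' against a fair split of {x, y}
d = union(scale(0.5, [(1.0, 'x')]), scale(0.5, [(0.5, 'x'), (0.5, 'y')]))
print(underlying(d))  # {'x': 0.75, 'y': 0.25}
```

Note that the multiset keeps duplicate occurrences of a term apart; only `underlying` merges them, matching the distinction drawn in the text.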
Definition 4.
β-reduction and head β-reduction are given by closing the β-rule below under all contexts, respectively under head contexts, and implicitly return a singleton distribution.
Projective reduction is the following reduction relation from terms to distributions.
Head reduction is the union of head β-reduction and projective reduction. Reduction is lifted to distributions of terms in the expected way: a step on one term in the distribution replaces it by its resulting distribution, scaled by its weight. We write →* for the reflexive–transitive closure of a reduction relation →.
The calculus features a second notion of reduction, permutative reduction [9], which gives a more fine-grained evaluation of probabilistic sums. It is this reduction for which confluence is particularly significant. The effect of permutative reduction is to bridge the gap between the decomposed operators, generator and choice, and a traditional probabilistic sum.
In this paper we will work with projective reduction, since we are interested in termination of head reduction. We will, on occasion, consider terms where probabilistic sums are of the traditional form, a generator immediately followed by a choice on the same event. As per the above, these are effectively the normal forms of permutative reduction.
3 Probabilistic types
Before introducing our probabilistic type system, we recall simple types for the calculus [9].
Definition 5.
Simple types are given by the following grammar, with a single base type ∘:

    σ, τ ::= ∘ | σ → τ
A typing context Γ is a finite function from term variables to types, written as a sequence x1 : τ1, …, xn : τn. A typing judgment Γ ⊢ t : τ assigns the type τ to the term t in the context Γ. The simply-typed probabilistic event λ-calculus is given by the typing rules in Figure 1.
Simple types ignore the probabilistic constructs of the calculus, generator and choice, requiring only that the branches of a choice have equal types. This gives the expected result: typed terms are strongly normalizing, since every possible branch of the computation is typed. For probabilistic termination, we wish to capture that a given fraction of all branches of a computation terminates – where our notion of termination is given by head normalization.
Our approach derives from the following idea: if the base type of simple types, even if not inhabited, denotes certainty of termination, then we may generalise this to probability of termination by replacing base types with arbitrary probabilities. This yields the following notion of types.
Definition 6.
Probabilistic types are given by the following grammar, where p is a probability:

    τ, σ ::= p | τ → σ
The intuitive meaning of, for example, assigning a term t a type τ → σ → p is: given inputs of type τ and σ, t terminates with probability at least p. The identity term λx.x may be assigned any type τ → τ: given an input that terminates with probability p, the term does so as well. The type system will include an axiom that any term, in particular a non-terminating one, may be assigned a termination probability of zero. This is expressed by a type τ1 → … → τn → 0: for any inputs of types τ1 through τn, the term will terminate with probability at least zero.
Note that these types generalise simple types with a single, uninhabited base type. A probabilistic system with multiple base types, say integers and booleans, would pair the return type with a probability, for instance an integer with probability p, or a boolean with probability q. The extension of the calculus with sequencing, in Sections 5 and 6, will give a more concrete account of how probabilities interact with return values and return types.
The term a.(t ⊕_a u) gives a fair probabilistic choice between t and u. For the computation to be well-typed regardless of the choice, t and u should have the same input types, so that they can be applied to the same arguments. Hence, if t has type τ → σ → p, we may choose τ → σ → q for u. Then the type for the whole term should be τ → σ → (½p + ½q): given arguments of type τ and σ, the computation chooses t with probability ½ and u with probability ½, and so terminates with probability at least ½p + ½q.
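The arithmetic of this bound is a simple weighted average, and extends to the biased generator mentioned in Section 2. A small sketch, using exact rationals:

```python
from fractions import Fraction

def choice_bound(p, q, r=Fraction(1, 2)):
    """Termination bound for a probabilistic choice between branches with
    bounds p and q, taken with bias r (fair by default): r*p + (1-r)*q."""
    return r * p + (1 - r) * q

# a branch that always terminates against one that does so half the time
print(choice_bound(Fraction(1), Fraction(1, 2)))  # 3/4
```

With p = 1 and q = ½, the fair choice terminates with probability at least ¾, which is the kind of bound the generator typing rule assigns.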
These considerations motivate the shape of our probabilistic type system, with one further aspect to explain. An arbitrary generator term a.t reduces with fair probability to either projection of t, and the projections may terminate with different probabilities. However, using projections in the typing rules would mean the type system loses the property of being inductive on terms. We will instead record the assignment of a truth value to a in an additional context ρ, which we call an event valuation. The type derivation then projects a choice t ⊕_a u to t or to u according to the value of a in the event valuation ρ.
Definition 7.
The probabilistically typed probabilistic event λ-calculus is given by the typing rules in Figure 2, using the following definitions. An event valuation ρ is a finite function from events to booleans, written as a sequence of assignments. A typing context Γ is a finite function from variables to types, written x1 : τ1, …, xn : τn. A probabilistic typing judgement assigns a type τ to the term t in the context of Γ and ρ. A sequence of antecedent types is abbreviated with vector notation.
The typing rules are syntax-driven, except for the rule to lower a given bound. This rule is not essential to the type system, since it is already implied in the meaning of types as giving a lower bound; but for the same reason, since it is implied, including it brings more clarity than omitting it. Note further that the typing rule for the generator assumes a fair coin toss.
Example 8.
We will discuss a number of aspects of our type system. First, as a particularly natural feature, observe that the rules for variables, abstractions, and applications make it conservative over the standard simply-typed λ-calculus – which, interestingly, is agnostic to the choice of base types.
Second, a key difference with the simple type system of Figure 1 is that probabilistic branching occurs in the generator rule, whereas for simple types it is in the choice rule. This is due to aiming for head normalization instead of strong normalization. Consider the example term of Example 8. Head reduction projects it to one branch or the other, discarding the subterm not chosen, which will not be reduced. This is reflected in the probabilistic type system, where the branches of the generator typing rule project to one branch or the other via the event valuation. The simple type system, reflecting strong normalization, would require the discarded subterm also to be typed – which in general it cannot be.
For terms of the form a.(t ⊕_a u) this difference is moot: both type systems branch similarly for this construct, to t and u. This leaves as the only distinction the generalisation of types themselves, from a single base type to probabilities p. In accordance with the meaning of a base type as the probability of termination, simple types are then recovered by restricting to the base type 1. This rules out the two rules assigning a zero-weighted type and reducing the weight of a type. It is easy to observe that the remaining rules preserve the restriction to probability 1, giving the following proposition.
Proposition 9.
For the fragment below left, the simply-typed calculus coincides with the probabilistically-typed calculus restricted to the types below right.
Our main result for the probabilistically-typed calculus is that a type with probability p guarantees head normalization with probability at least p. Formally, this is stated as a head reduction to a distribution of which a proportion of at least p is in head-normal form.
Theorem 10.
For closed t, if t is typed with probability p, then t head-reduces to a distribution in which the terms in head-normal form have total weight at least p.
4 Comparison with counting quantifiers
In this section we will give a close comparison with the type system of Antonelli, Dal Lago, and Pistone [1], and demonstrate that our approach gives tighter bounds on the probability of termination.
The first distinction between the two systems is that that of [1] uses indexed event variables instead of events. However, there is no formal need for this; since the indices are static, we may replace distinct indexed variables simply with distinct events a and b. This simplification extends to the semantics. Probabilities in [1] are given by boolean formulas over the indexed variables, indicating a subset of an infinite space of assignments. However, since the indices are fixed and bounded, it is sufficient to consider finite spaces. The elements of such a space are the event valuations ρ with a given finite domain, and a boolean formula indicates the set of event valuations satisfying it, characterized syntactically as expected.
For a set S of event valuations over n events, the measure μ(S) is given by

    μ(S) = |S| / 2ⁿ

where |S| denotes the size of the set S. This coincides with the measure in [1], as the latter makes no essential use of the infinite space of assignments.
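The measure of a boolean formula over finitely many events can be computed by brute-force enumeration of valuations. A sketch, representing a formula as a Python predicate on valuations (an illustrative encoding, not that of [1]):

```python
from fractions import Fraction
from itertools import product

def measure(formula, events):
    """mu(S) = |S| / 2^n for the set S of valuations satisfying `formula`,
    where valuations are dicts from the given events to booleans."""
    vals = [dict(zip(events, bs))
            for bs in product([True, False], repeat=len(events))]
    sat = sum(1 for v in vals if formula(v))
    return Fraction(sat, 2 ** len(events))

# a or b holds in 3 of the 4 valuations over {a, b}
print(measure(lambda v: v['a'] or v['b'], ['a', 'b']))  # 3/4
```

This is exactly the finite restriction of the semantics described above: each valuation carries weight 1/2ⁿ, and a formula measures the fraction it satisfies.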
The second difference is the syntax of types: the system of [1] introduces probabilities through counting quantifiers, where ours has probabilities as base types. Types are nevertheless isomorphic: those of [1] are of a form with a fixed outer counting quantifier, and map one-to-one onto our types by associating the probability instead with the consequent base type. Formally, we encode the types and typing contexts of [1] into our setting as follows.
Having connected boolean formulas to event valuations, and the types of [1] to ours, we may state the following conservativity result of our system over theirs.
Proposition 11.
If a judgement is derivable in the system of [1], then its encoding is derivable in ours.
Proof.
By induction on the typing derivation.
Corollary 12.
For a given closed term, our type system gives the same or higher termination bounds than that of [1].
In the reverse direction, Example 8 and its counterpart in [1, Section 6.2] show that the two type systems do not give the exact same bounds, and in some cases ours gives a strictly higher bound. The reason is that our system locates branching between alternatives at the typing rule for the generator, where that of [1] branches for choice terms in a contraction rule. Crucially, the generator rule allows branches with different termination bounds. We illustrate this further by attempting to simulate the generator rule of our system in the system of [1].
The branching at this rule is captured in [1] with a contraction on the two premisses, which requires the probabilities of both to be equal. The derivation is as follows.
This is the issue illustrated by Example 8 and its counterpart in [1, Section 6.2], for which our system gives the actual termination probability, while that of [1] gives only a strict under-approximation. Reprising the example, the two sub-derivations, given in condensed form below, assign different probabilities to the two branches. These may only be combined in a contraction by lowering the higher probability to match the lower.
The above analysis suggests that this issue may be fixed by adopting the generator typing rule of our system, adjusted appropriately as follows. The key here is that the two branches feature dual atoms for the event, seen in neither the simple type system nor the intersection type system of [1].
5 Sequencing
Two observations about probabilistic λ-calculi motivate the developments in the remainder of this paper. The first is the primary role of head reduction. It is well known that head normalization corresponds closely to evaluation on the Krivine machine [23], and sometimes the machine gives a more natural model of what is being studied. The second is that probabilistic evaluation in λ-calculi needs to account for the difference between call–by–value (cbv) and call–by–name (cbn), to which end additional constructions are introduced, sometimes ad hoc. The decomposition into generator and choice is an example, as is the separate consideration of a cbv- and a cbn-probabilistic sum by Faggian and Ronchi Della Rocca [16], while Antonelli, Dal Lago, and Pistone [1] add to the standard cbn-application a second cbv-application.
These observations prompted us to consider probabilistic termination from the perspective of the Functional Machine Calculus (FMC) [18], a λ-calculus with computational effects. Firstly, the Krivine machine plays a central role in the FMC (indeed it is the “M” in “FMC”), and the FMC provides the machine with a new notion of successful termination, absent from the standard λ-calculus. It is then a natural question whether and how this may be used to capture the probability of successful termination. Secondly, the FMC may express both cbn and cbv behaviour, with the cbn λ-calculus a fragment and the cbv λ-calculus encoded in the syntax. The need for ad-hoc constructs to control reduction behaviour is thus avoided.
The Krivine machine, simplified by replacing environments with substitution, evaluates a λ-term in the presence of a stack of input terms. An abstraction λx.t pops the top term u off the stack and continues as {u/x}t, while an application t u pushes its argument u and continues as t. A λ-term may thus be viewed as a language of instruction sequences for this machine: application is push, abstraction is pop, and a variable is execute.
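The instruction reading above can be sketched as a small substitution-based machine. The following Python sketch assumes the same hypothetical tuple encoding of λ-terms used for illustration earlier, and a naive (non-capture-avoiding) substitution, sufficient for closed examples:

```python
def subst(t, x, s):
    """Naive substitution of s for x in t (capture-avoidance omitted)."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        _, y, body = t
        return t if y == x else ('lam', y, subst(body, x, s))
    _, f, u = t  # application
    return ('app', subst(f, x, s), subst(u, x, s))

def krivine(t, stack=()):
    """Run the simplified Krivine machine: application pushes, abstraction pops."""
    while True:
        if t[0] == 'app':                 # push the argument, continue left
            stack = (t[2],) + stack
            t = t[1]
        elif t[0] == 'lam' and stack:     # pop the top, substitute, continue
            top, *rest = stack
            t = subst(t[2], t[1], top)
            stack = tuple(rest)
        else:                             # variable, or abstraction on empty stack
            return t, stack

# (lambda x. x) y  evaluates to the variable y with an empty stack
print(krivine(('app', ('lam', 'x', ('var', 'x')), ('var', 'y'))))
```

The loop terminates exactly when the machine reaches a head-normal situation: a variable in head position, or an abstraction facing an empty stack.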
The FMC then extends the λ-calculus with sequential composition and its unit, the imperative skip, with the expected semantics: concatenation of machine instructions and the empty instruction. As in models of imperative languages, skip indicates successful termination of the machine. This gives a fragment called the sequential λ-calculus.
We adopt these modifications to give the sequential probabilistic event λ-calculus, defined below. Following [18] we render abstraction as ⟨x⟩.t and application as [u].t, to emphasise the machine behaviour of pop and push, retaining the standard syntax as a shorthand. In particular, the new notation clarifies the interaction between push and sequencing: the following three terms are equivalent, rendered first in standard notation and then with prefix application.
The full FMC further generalises to a machine with multiple independent stacks, addressed by a set of locations, in which pop and push are then parameterised to operate on the corresponding stack. This allows the encoding of the effects of mutable higher-order store, input/output, and indeed probabilistic computation: the generator is, in the FMC, an abstraction parameterised to draw from a stream of random values at a dedicated location. The present calculus is thus a fragment of the FMC with two locations.
Definition 13.
The sequential probabilistic event λ-calculus is given as follows.
Prefixing binds tighter than sequencing, and sequencing associates to the right. Projections and contexts extend to the sequential calculus as below; head contexts and applicative contexts are as before.
The interaction between sequentiality and the λ-calculus is governed by the following sequencing reduction rules. These make sequential composition right-associative, and let the prefixing of push, pop, and the generator propagate past it (as in a standard list-concatenation algorithm). The result is to make the first such action on the abstract machine the leading construct.
The sequencing relation is given by closing these rules under all contexts, and head sequencing by closing under head contexts only. The β-reduction rule in the new notation is as below left, with β-reduction given by closing under all contexts and head β-reduction by closing under head contexts. Projective reduction, below right, is as previously.
Head reduction is the union of the three head relations: head sequencing, head β-reduction, and head projective reduction.
We round off by observing the shape of head-normal forms, assuming no free event variables.
Proposition 14.
The head-normal forms of event-closed terms are of one of three forms: headed by a variable, by an abstraction, or by a push.
5.1 Encoding call–by–value
Sequential composition provides an essential element that the λ-calculus lacks, and which is at the heart of the cbv/cbn dichotomy: control over execution. The cbv behaviour of an application t u is encoded almost as u; t: first evaluate the argument u, then evaluate the function t (the full encoding, below, includes an extra step to also execute the result).
We demonstrate the encoding of the cbv-probabilistic λ-calculus of Faggian and Ronchi Della Rocca [16]. The encoding of cbv λ-terms is standard: see [13, 18, 28]. Values and terms encode by two mutual translations, below.
The operational intuition is that a push represents a return value: a term evaluates until it is a push of a value, at which point that value is pushed to the stack and the machine terminates. Then cbv β-reduction is simulated as follows.
The probabilistic reduction rule of [16] is, under the given encoding, that of projective reduction, but applied in surface contexts only.
The translation then indeed does not place the probabilistic redex inside a push, which is the requirement for correct behaviour.
5.2 The abstract machine
The small-step operational semantics is given by the following abstract machine. A state is a triple consisting of a term and two stacks of terms: the operand stack, with its head to the right, and the continuation stack, with its head to the left. In both cases concatenation is written by juxtaposition. Transitions or steps are probabilistic: a transition rule is written as below left, expressing that the machine transitions from one state to another with probability p; we may omit p when p = 1. A run is a sequence of steps, written as below centre, whose probability is the product of the probabilities of all its steps. A run is successful if it terminates with skip and an empty continuation stack, as below right; the operand stack then holds the return values of the computation.
Definition 15.
The sequential probabilistic machine (SPM) is given by the following probabilistic transitions.
5.3 Big-step semantics
Running the machine for a given term and input stack gives a distribution of return stacks. We use the following notation, extending from that for distributions over terms.
Definition 16.
The evaluation relation is defined inductively by the following rules.
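The big-step semantics can be sketched for a small fragment of the calculus: skip, sequencing, push, generator, and choice. The Python encoding below is hypothetical (illustrative tuples, as in the earlier sketches); evaluation maps a term and an input stack, together with an event valuation, to a multi-distribution of return stacks.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def run(t, stack=(), rho=(), weight=Fraction(1)):
    """Big-step evaluation sketch: returns a list of (weight, stack) pairs."""
    tag = t[0]
    if tag == 'skip':                       # successful termination
        return [(weight, stack)]
    if tag == 'seq':                        # run t, then u on each result stack
        out = []
        for w, s in run(t[1], stack, rho, weight):
            out += run(t[2], s, rho, w)
        return out
    if tag == 'push':                       # push a value and continue
        return run(t[2], stack + (t[1],), rho, weight)
    if tag == 'gen':                        # fair coin toss for event t[1]
        return (run(t[2], stack, rho + ((t[1], True),), weight * HALF)
              + run(t[2], stack, rho + ((t[1], False),), weight * HALF))
    if tag == 'choice':                     # resolve by the event valuation
        return run(t[2] if dict(rho)[t[1]] else t[3], stack, rho, weight)
    raise ValueError(tag)

# a generator over a choice of two pushes: returns ('x',) or ('y',), each 1/2
term = ('gen', 'a', ('choice', 'a',
                     ('push', 'x', ('skip',)),
                     ('push', 'y', ('skip',))))
print(run(term))  # [(Fraction(1, 2), ('x',)), (Fraction(1, 2), ('y',))]
```

The weights thread through sequencing multiplicatively, matching the product of step probabilities in a machine run.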
We demonstrate that small-step and big-step semantics agree.
Proposition 17.
if and only if there is a finite collection of distinct runs
Proof.
In one direction, by induction on the evaluation derivation; in the other, by induction on the collection of runs.
6 Sequential probabilistic types
Types for the sequential λ-calculus are of the form below, with the meaning: given an input stack of terms typed by the antecedent type vector, the machine will terminate successfully and return a stack typed by the consequent type vector.
For the probabilistic calculus, we parameterize this with the probability p of successful termination.
Definition 18.
Sequential probabilistic types are given by the following grammars, where p is a probability.
A typing judgement assigns a term t the type τ in the context of Γ and an event valuation ρ, and assigns a stack of terms a type vector. The sequential probabilistic type system is given by the typing rules in Figure 4. We may omit p when p = 1.
There are no base types: their rôle is subsumed by types with empty vectors. Observe that because stacks are last-in-first-out, the identity term on two elements pops x and then y, and pushes back y and then x, i.e. with the order of x and y reversed between popping and pushing. We match this reversal in types: since we want identity types to relate equal vectors, in a type we consider the antecedent type vector to be reversed.
Probabilistic λ-types embed into sequential types in the evident way. With this identification, for λ-terms and λ-types the two type systems coincide.
Proposition 19.
For a λ-term, a probabilistic typing judgement holds if and only if its sequential counterpart does.
Every type is inhabited by a closed term: for each type we define its zero term as follows.
Proposition 20 (Type inhabitation).
Every type is inhabited by its zero term.
Proof.
By induction on the type, using the rule that lowers the termination probability to an arbitrary p.
Type inhabitation is an important justification for the interpretation of types as a guarantee of successful machine termination: it means that for a typed term, a suitable input stack always exists. Our main theorem, below, formalizes this interpretation: for a term typed with probability p and a stack of matching type, the machine successfully returns a stack with probability at least p.
Theorem 21.
For a typed, closed term and a stack of matching type, there is a finite set of distinct successful runs whose probabilities sum to at least p.
Proof.
See Appendix A.1.
The lower bound given by the theorem is exact when two conditions are met: the typing derivation does not use the rule that lowers the termination bound, and every use of the rule assigning probability zero applies to a term that is in fact non-terminating. In such cases, as in Example 8, types give an exact bound.
Since head reduction as a strategy closely follows machine evaluation, it is no surprise that the termination bound for the machine is also a lower bound for probabilistic head reduction. This is formalised in the following theorem.
Theorem 22.
For a closed term typed with probability p, head reduction reaches a distribution in which the terms in head-normal form have total weight at least p.
Proof.
See Appendix A.2.
Observe that our main theorem on probabilistic termination, Theorem 10, follows immediately from the corresponding Theorem 22 above by conservativity of the sequential calculus over the original one. This is despite the fact that the proof of Theorem 22 makes essential use of the notion of successful termination made available by sequencing, absent from the original calculus.
7 Conclusions
To us, what stands out about our approach is the simplicity and the natural intuition of our type system. This manifests in the transparent reasoning in our proofs, which, despite using deep techniques such as abstract reducibility, gives a direct and clear connection between types, machine behaviour, and reduction.
Compared with the approach in [1], the simplicity of our approach manifests in several advantages. The first is to avoid the need for counting quantifiers and the use of event valuations over boolean formulas, associating probabilities instead with base types. Second, it is clear that the expression of both call-by-name and call-by-value behaviour is an essential ingredient for a probabilistic calculus. We eschew the introduction of ad-hoc constructs, relying instead on a principled interpretation of call-by-value via sequential composition. Finally, where the main example [1, Example 6.2] requires intersection types to produce the exact bound, our approach does so directly in Example 8, while morally remaining within the realm of simple types, and strictly improving on the bounds provided by .
References
- [1] Melissa Antonelli, Ugo Dal Lago, and Paolo Pistone. Curry and Howard meet Borel. In Proceedings of the 37th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS‘22), pages 45:1–45:13. ACM, 2022. doi:10.1145/3531130.3533361.
- [2] Martin Avanzini, Ugo Dal Lago, and Akihisa Yamada. On probabilistic term rewriting. In John P. Gallagher and Martin Sulzmann, editors, Proc. Functional and Logic Programming - 14th International Symposium, FLOPS 2018, volume 10818 of Lecture Notes in Computer Science, pages 132–148. Springer, 2018. doi:10.1007/978-3-319-90686-7_9.
- [3] Chris Barrett, Willem Heijltjes, and Guy McCusker. The Functional Machine Calculus II: Semantics. In 31st EACSL Annual Conference on Computer Science Logic (CSL 2023), volume 252 of Leibniz International Proceedings in Informatics (LIPIcs), pages 10:1–10:18, 2023. doi:10.4230/LIPIcs.CSL.2023.10.
- [4] Raven Beutner and Luke Ong. On probabilistic termination of functional programs with continuous distributions. In Stephen N. Freund and Eran Yahav, editors, Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI), pages 1312–1326. ACM, 2021. doi:10.1145/3453483.3454111.
- [5] Flavien Breuvart and Ugo Dal Lago. On intersection types and probabilistic lambda calculi. In Proceedings of the 20th International Symposium on Principles and Practice of Declarative Programming, PPDP 2018, pages 8:1–8:13. ACM, 2018. doi:10.1145/3236950.3236968.
- [6] Fredrik Dahlqvist and Dexter Kozen. Semantics of higher-order probabilistic programs with conditioning. Proceedings of the ACM on Programming Languages, 4(POPL):57:1–57:29, 2020. doi:10.1145/3371125.
- [7] Ugo Dal Lago, Claudia Faggian, and Simona Ronchi Della Rocca. Intersection types and (positive) almost-sure termination. Proceedings of the ACM on Programming Languages, 5(POPL):1–32, 2021. doi:10.1145/3434313.
- [8] Ugo Dal Lago and Charles Grellois. Probabilistic termination by monadic affine sized typing. ACM Transactions on Programming Languages and Systems, 41(2):10:1–10:65, 2019. doi:10.1145/3293605.
- [9] Ugo Dal Lago, Giulio Guerrieri, and Willem Heijltjes. Decomposing probabilistic lambda-calculi. In Proceedings of Foundations of Software Science and Computation Structures (FoSSaCS), volume 12077 of LNCS, pages 136–156, 2020. doi:10.1007/978-3-030-45231-5_8.
- [10] Ugo Dal Lago and Margherita Zorzi. Probabilistic operational semantics for the lambda calculus. RAIRO - Theoretical Informatics and Applications, 46(3):413–450, 2012. doi:10.1051/ita/2012012.
- [11] Karel de Leeuw, Edward F. Moore, Claude E. Shannon, and Norman Shapiro. Computability by probabilistic machines. Automata Studies, 34:183–198, 1956.
- [12] Ugo De’Liguoro and Adolfo Piperno. Non deterministic extensions of untyped lambda-calculus. Information and Computation, 122(2):149–177, 1995. doi:10.1006/inco.1995.1145.
- [13] Rémi Douence and Pascal Fradet. A systematic study of functional language implementations. ACM Transactions on Programming Languages and Systems, 20(2):344–387, 1998. doi:10.1145/276393.276397.
- [14] Thomas Ehrhard and Giulio Guerrieri. The bang calculus: An untyped lambda-calculus generalizing call-by-name and call-by-value. In Proceedings of the 18th International Symposium on Principles and Practice of Declarative Programming (PPDP’16), pages 174–187, 2016. doi:10.1145/2967973.2968608.
- [15] Thomas Ehrhard and Christine Tasson. Probabilistic call by push value. Logical Methods in Computer Science, 15(1), 2019. doi:10.23638/LMCS-15(1:3)2019.
- [16] Claudia Faggian and Simona Ronchi Della Rocca. Lambda calculus and probabilistic computation. In 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, pages 1–13, 2019. doi:10.1109/LICS.2019.8785699.
- [17] Jean Goubault-Larrecq. A probabilistic and non-deterministic call-by-push-value language. In 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, pages 1–13. IEEE Computer Society, 2019. doi:10.1109/LICS.2019.8785809.
- [18] Willem Heijltjes. The functional machine calculus. In Proceedings of the 38th Conference on the Mathematical Foundations of Programming Semantics, MFPS XXXVIII, volume 1 of ENTICS, 2022. doi:10.46298/ENTICS.10513.
- [19] C. Jones and G. D. Plotkin. A probabilistic powerdomain of evaluations. In Proceedings of the Fourth Annual Symposium on Logic in Computer Science (LICS ’89), pages 186–195. IEEE Computer Society, 1989. doi:10.1109/LICS.1989.39173.
- [20] Achim Jung and Regina Tix. The troublesome probabilistic powerdomain. Electronic Notes in Theoretical Computer Science, 13:70–91, 1998. doi:10.1016/S1571-0661(05)80216-6.
- [21] Benjamin Kaminski. Advanced Weakest Precondition Calculi for Probabilistic Programs. PhD thesis, Fakultät für Mathematik, Informatik und Naturwissenschaften, RWTH Aachen University, 2019. doi:10.18154/RWTH-2019-01829.
- [22] Naoki Kobayashi, Ugo Dal Lago, and Charles Grellois. On the termination problem for probabilistic higher-order recursive programs. In Proceedings of the 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, pages 1–14. IEEE, 2019. doi:10.1109/LICS.2019.8785679.
- [23] Jean-Louis Krivine. A call-by-name lambda-calculus machine. Higher-Order and Symbolic Computation, 20(3):199–207, 2007. doi:10.1007/s10990-007-9018-9.
- [24] Thomas Leventis. A deterministic rewrite system for the probabilistic -calculus. Mathematical Structures in Computer Science, 29(10):1479–1512, 2019. doi:10.1017/S0960129519000045.
- [25] Paul Blain Levy. Call-by-push-value: A functional/imperative synthesis, volume 2 of Semantic Structures in Computation. Springer Netherlands, 2003. doi:10.1007/978-94-007-0954-6.
- [26] Udi Manber and Martin Tompa. Probabilistic, nondeterministic, and alternating decision trees. In 14th Annual ACM Symposium on Theory of Computing, pages 234–244, 1982. doi:10.1145/800070.802197.
- [27] Annabelle McIver and Carroll Morgan. Abstraction and refinement in probabilistic systems. SIGMETRICS Perform. Eval. Rev., 32(4):41–47, March 2005. doi:10.1145/1059816.1059824.
- [28] A.J. Power and Hayo Thielecke. Closed Freyd- and -categories. In International Colloquium on Automata, Languages, and Programming (ICALP), volume 1644 of Lecture Notes in Computer Science, pages 625–634. Springer, 1999. doi:10.1007/3-540-48523-6_59.
- [29] Norman Ramsey and Avi Pfeffer. Stochastic lambda calculus and monads of probability distributions. In Conference Record of POPL 2002: The 29th SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’02, pages 154–165, 2002. doi:10.1145/503272.503288.
- [30] Nasser Saheb-Djahromi. Probabilistic LCF. In Mathematical Foundations of Computer Science 1978, Proceedings, 7th Symposium, volume 64 of Lecture Notes in Computer Science, pages 442–451. Springer, 1978. doi:10.1007/3-540-08921-7_92.
- [31] Davide Sangiorgi and Valeria Vignudelli. Environmental bisimulations for probabilistic higher-order languages. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, pages 595–607, 2016. doi:10.1145/2837614.2837651.
Appendix A Proofs for Section 6
A.1 Probabilistic machine termination
Our main theorem for the typed will be that a type guarantees a probability of termination of the machine of at least , given an input stack of type , and returning a stack of type . The proof is a direct application of abstract reducibility. For each type we define a set that holds the terms with the above property, and we proceed to prove that every term of type belongs to this set.
We write for the set of finite distributions of weight over a set .
Definition 23.
The set is defined by mutual induction on types and type vectors as a set of closed terms respectively of closed stacks, as follows.
For the proof of Lemma 25, the main reducibility lemma, we need the following notation, as well as an additional lemma. We write for a substitution map , where is the application of to . The set then extends to contexts as follows. Note that the definition implies that is closed (i.e. each substituting term is closed).
For an event valuation we write for the projection on each in .
We expand the stacks in a distribution by prepending a stack as , where .
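The two notions just introduced, finite distributions of a given weight and the prepending of a stack to every stack in a distribution, can be sketched in Python as follows (class and method names are ours, for illustration only). A finite distribution of weight p is a finite list of probability–element pairs summing to p; prepending an element to every stack leaves the weight unchanged.

```python
from fractions import Fraction

class FiniteDist:
    """A finite (sub)distribution: probability-element pairs of total
    weight at most 1."""
    def __init__(self, pairs):
        self.pairs = list(pairs)              # [(probability, element), ...]
        assert sum(p for p, _ in self.pairs) <= 1

    @property
    def weight(self):
        return sum(p for p, _ in self.pairs)

    def map(self, f):
        """Apply f to every element, keeping probabilities; this is the
        operation used to prepend a stack to every stack in a distribution."""
        return FiniteDist((p, f(x)) for p, x in self.pairs)

# A distribution of weight 3/4 over two stacks (stacks modelled as tuples):
d = FiniteDist([(Fraction(1, 2), ("S",)), (Fraction(1, 4), ("T",))])
assert d.weight == Fraction(3, 4)

# Prepending an element V to every stack preserves the weight:
assert d.map(lambda s: ("V",) + s).weight == Fraction(3, 4)
```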
Lemma 24.
If then for any stack .
Proof.
By induction on the definition of .
The main reducibility lemma is then the following.
Lemma 25.
If and then .
Proof.
By induction on the typing derivation for . Note that since is closed, . We cover three key cases (sequencing, abstraction, and generator); the remaining cases are similar.
- Sequencing case:
-
Let and . Given , by induction we have with each and . Again by induction, for each we have with . The definition of gives for . Finally, observe that since and since each is a distribution over .
- Abstraction case:
-
Let and . We need to demonstrate a distribution such that . Let and note that since is not defined on and and are closed, and . The inductive hypothesis for the premise , with , gives the desired with evaluation . Then by the definition of evaluation, , and hence .
- Generator case:
-
Let and . Let for . By the inductive hypothesis, and , which gives and for some and . Let ; then and by definition of evaluation, . We conclude that .
Our main theorem, the probability of machine termination, is a direct consequence.
Theorem 21 (restatement).
For a typed, closed term and stack there is a finite set of distinct successful runs
with sum probability .
Proof.
A.2 Probabilistic head normalization
Finally, we will relate machine termination to head reduction. For a redex in a head context the machine runs as follows: after the abstractions have consumed the top elements off the stack and the applications have pushed the terms onto it, the redex itself is the first part of the term to be evaluated on the machine. Machine evaluation thus corresponds tightly to head reduction, with the same order of evaluation of redexes.
However, in this correspondence, the machine halts with a variable, while successful termination in our setting requires a skip with an empty continuation stack. We will thus use a different approach.
The main idea is that reduction shortens the runs of the machine, by removing consecutive push and pop operations, or in the case of projective reduction, removing a generator. By annotating the evaluation relation to count abstractions, applications, and generators we may then observe that this measure reduces under head reduction.
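The idea that reduction shortens machine runs can be illustrated by a simple Python sketch (term representation and constructor names are ours, assuming a redex that uses its argument once): counting the abstractions, applications, and generators in a term gives a measure that a beta step, which removes one push paired with one pop, strictly decreases.

```python
def weight(term):
    """Count abstractions, applications, and generators in a toy term
    representation: ("var", x), ("abs", body), ("app", fun, arg),
    ("gen", left, right)."""
    kind = term[0]
    if kind == "var":
        return 0
    if kind == "abs":                 # abstraction: pops from the stack
        return 1 + weight(term[1])
    if kind == "app":                 # application: pushes onto the stack
        return 1 + weight(term[1]) + weight(term[2])
    if kind == "gen":                 # probabilistic generator
        return 1 + weight(term[1]) + weight(term[2])
    raise ValueError(kind)

# (\x. x) y  --beta-->  y : the redex removes one application and one
# abstraction, so its weight exceeds that of the reduct by 2.
redex  = ("app", ("abs", ("var", "x")), ("var", "y"))
reduct = ("var", "y")
assert weight(redex) == weight(reduct) + 2
```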
Definition 26.
The weighted evaluation relation is given in Figure 5, where the weight is a natural number. We extend it to a distribution of terms by the following rule, carrying a multiset of weights .
The core lemma establishes that the weight in the evaluation relation decreases for and reduction steps, and is stable under . However, this does not apply to terms introduced by the zero-rule, as , which are the potentially non-terminating terms. Crucially for our purpose, when the evaluation returns a non-empty distribution, head-reduction on the term progresses towards a head-normal form.
Lemma 27.
Let where .
-
If then .
-
If then with .
-
If then with for every in .
Proof.
We first consider the reduction steps themselves, and then their closure under head contexts. The proof is by induction on . We consider three cases; the remaining are similar.
- Beta
-
Since is non-empty, the derivation must be as below left, as none of the inferences may be replaced by a zero-rule. The case is then as follows.
- Projection
-
Since is non-empty, the derivation for the redex is the first below. The derivation for the reduct, second below, uses the evaluation rule for distributions.
- Sequence–generator
-
The derivations are as follows, with premises stacked for space. In the second derivation and .
This concludes the reduction steps. The remaining cases consider the closure under head contexts. For sequencing reduction, which leaves the weight unchanged, this is immediate, and the cases for beta-steps are straightforward and omitted. For projective reduction, let , and consider the remaining cases.
- Application
-
The derivation for is as follows.
By induction, for the premise of this derivation we get one for the distribution , below, where it follows that and .
Then for we get the following derivation.
- Abstraction
-
The derivation for is as follows.
Given the reduction for , we also have the following, where we abbreviate the reduct as .
By induction this gives us the following derivation for , where again and .
Then for we get the required derivation.
We apply the above lemma to relate machine evaluation to head reduction in the following way: if evaluation returns a distribution of weight , then head-reduction terminates with probability at least .
Lemma 28.
If then where all terms in are head normal and .
Proof.
Let evaluate by the following derivation.
First, we head-reduce any where . This process terminates by induction on , using Lemma 27: a -step or projective step reduces , while sequencing reduction alone is terminating and does not increase it.
Then the distribution is head-normal. The weights of and are as follows.
It follows that .
Our final theorem then ties everything together: typing gives successful evaluation on the machine, which in turn gives termination of head reduction.
Theorem 22 (restatement).
For a closed , if then where all terms in are head normal and .
Proof.
We provide with an input stack consisting only of zero terms . Let be the stack of zero terms for . By Proposition 20 zero terms inhabit their types, so that .
By Lemma 25 we have and . By the definition of this gives an evaluation where .
For the corresponding weighted derivation , Lemma 28 gives a head reduction where all terms in are head normal and .
