Abstract 1 Introduction 2 Related Work 3 MrBNF in Action 4 MrBNF’s Internals: Construction of Datatypes with Bindings 5 Application I: Mazza’s Isomorphism 6 Application II: POPLmark Challenge 7 Conclusion References

Animating MRBNFs: Truly Modular Binding-Aware Datatypes in Isabelle/HOL

Jan van Brügge ORCID Heriot-Watt University, Edinburgh, UK Andrei Popescu ORCID University of Sheffield, UK Dmitriy Traytel ORCID University of Copenhagen, Denmark
Abstract

Nominal Isabelle provides powerful tools for meta-theoretic reasoning about syntax of logics or programming languages, in which variables are bound. It has been instrumental to major verification successes, such as Gödel’s incompleteness theorems. However, the existing tooling is not compositional. In particular, it does not support nested recursion, linear binding patterns, or infinitely branching syntax. These limitations are fundamental in the way nominal datatypes and functions on them are constructed within Nominal Isabelle. Taking advantage of recent theoretical advancements that overcome these limitations through a modular approach using the concept of map-restricted bounded natural functor (MRBNF), we develop and implement a new definitional package for binding-aware datatypes in Isabelle/HOL, called MrBNF. We describe the journey from the user specification to the end-product types, constants and theorems the tool generates. We validate MrBNF in two formalization case studies that so far were out of reach of nominal approaches: (1) Mazza’s isomorphism between the finitary and the infinitary affine λ-calculus, and (2) the POPLmark 2B challenge, which involves non-free binders for linear pattern matching.

Keywords and phrases:
syntax with bindings, datatypes, inductive predicates, Isabelle/HOL
Funding:
Andrei Popescu: Acknowledges support from the EPSRC grants EP/V039156/1 (Security of Digital Twins in Manufacturing) and and EP/X015114/1 Safe and secure COncurrent programming for adVancEd aRchiTectures (COVERT).
Dmitriy Traytel: Acknowledges support from the Novo Nordisk Foundation (start package grant NNF20OC0063462) and the Independent Research Fund Denmark (Sapere Aude grant 3120-00057B, titled DISCOVER: Distributed Streaming Computations Verified).
Copyright and License:
[Uncaptioned image] © Jan van Brügge, Andrei Popescu, and Dmitriy Traytel; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Security and privacy Logic and verification
Supplementary Material:
Software: https://doi.org/10.5281/zenodo.15756655
Acknowledgements:
We thank the ITP reviewers for their insightful comments. Martin Elsman challenged MrBNF with advanced syntaxes, which overall resulted in our tool becoming more robust. The authors are listed alphabetically regardless of individual contributions or seniority.
Editors:
Yannick Forster and Chantal Keller

1 Introduction

Most programming languages involve variable-binding constructs, or simply binders, such as the lambda abstraction or the recursive and non-recursive let operators. For the study of these languages’ metatheory, the specific choice of bound variable names in a language expression is immaterial. For example, it is customary to treat the lambda calculus expressions λx.x and λy.y as being syntactically equal and to choose bound variables in a way that avoids name clashes with surrounding free variables – this is known as Barendregt’s variable convention [6].

The mechanization of programming language metatheory in proof assistants struggles to keep up with this informal convention. The POPLmark challenge [4] initiated a flurry of approaches to mechanized binders, each with its own strengths and weaknesses (§2). The used approaches can be categorized into three main paradigms: (1) the nameless representation that replaces bound variables in terms with pointers to the binding position [17, 23]; (2) the nameful or nominal representation that includes bound variable names but identifies terms modulo alpha-equivalence, i.e., up to bound variable renaming [18]; and (3) the reductive representation that embeds the programming language’s binders into the metalogic’s binders [19, 30, 28].

The nominal representation faithfully encodes Barendregt’s variable convention in that the accompanying reasoning principles (e.g., nominal induction) allow their users to assume that bound variables do not clash with surrounding free variables whenever bound variables are introduced. Nominal Isabelle [21] implements the nominal representation in the Isabelle/HOL proof assistant with successful applications ranging from Gödel’s incompleteness theorems [29] to verifying the correctness of Haskell’s compiler optimizations [13] and the security of authenticated data structures [14]. All these developments use the expected binder constructs: lambda abstractions, existential quantifiers, and (parallel) recursive let constructs.

Nominal Isabelle is fundamentally restricted to syntaxes with finite support, i.e., expressions may only contain finitely many free and bound variables. This rules out applications like Mazza’s infinitary affine lambda calculus [22]. Moreover, nested recursion, which e.g., is needed to model function applications to multiple arguments, is not directly supported. Sometimes this limitation can be overcome by using mutual recursion instead, but this workaround is limited to cases where the nesting type is a datatype itself. Nesting through coinductive datatypes such as streams or lazy lists or non-free structures such as finite or countable sets remains problematic. Also nominal datatypes, even in their flexible variant provided by Nominal 2 [42], cannot directly incorporate linear patterns, which are required in most of the complex binder structures including those of POPLmark 2B.

Blanchette et al. [10] have proposed map-restricted bounded natural functors (MRBNFs) as a new modular foundation for binding-aware datatypes that overcomes the above limitations. MRBNFs generalize bounded natural functors (BNFs) [41], which underly Isabelle’s datatypes and codatatypes [11]. In this paper, we present the journey from the theoretical MRBNF framework to a practical package in Isabelle/HOL, called MrBNF (pronounced “Mister BNF”).

MrBNF’s heart is the binder_datatype command for declaring binding-aware datatypes. Behind the scenes, the command composes and takes least fixed points of MRBNFs, defines the new type as the quotient of a raw nameful datatype by alpha-equivalence, lifts the raw constructors to the quotient, and proves nominal induction principles as well as a wealth of constructor properties (§3). The command also provides a nominal recursor infrastructure, which is crucial for defining recursive functions. All constructions are carried out foundationally in Isabelle/HOL: no axioms have been introduced (§4). Our main contributions are twofold:

  • We extend Isabelle/HOL with a foundational package for defining binding-aware datatypes that supports nested recursion, complex inductive binders, and types that may have infinitely many free or bound variables. This involves proving that MRBNFs, the key notion underlying our approach, are closed under composition and least fixed points. We design and automate these mechanized proofs as Isabelle/ML tactics. Our implementation includes a user-friendly proof method for applying the nominal induction principles and a nominal recursor for defining binding-aware primitive recursive functions on datatypes with binders.

  • Two case studies illustrate our tool’s usefulness. First, we prove Mazza’s result [22] that the λ-calculus is isomorphic to an infinitary affine λ-calculus (§5). Second, we formalize the POPLmark challenge [4], i.e., type soundness of System F<:, including parts 1B and 2B, which extend the language with records and pattern matching (§6). To the best of our knowledge, this is the first formalization of these extensions using a nominal approach.

Our implementation and case studies are publicly available [44].

2 Related Work

We refer to Blanchette et al. [10, Section 9] for a broad overview of syntax with bindings approaches in programming languages and proof assistants. Here, we focus our attention on how these approaches manifest themselves in proof assistants and discuss strength and weaknesses.

The representation of variables as de Bruijn indices [17] is widely popular in proof assistants [8, 45, 37, 20, 16, 35, 40, 25] because it is readily available via standard datatypes. Thereby bound variables point to the respective binders using a simple indexing scheme: a number indicates how many binders to skip when traversing the syntax tree towards its root. Binders such as λ-abstractions do not need to mention the bound variable. Free variables are numbers, too, namely those larger than the number of binders above them. For example, 𝖫𝗆(𝖫𝗆(𝖠𝗉(𝖠𝗉 1 0) 2)) is the de Bruijn version of the λ-calculus term λx.λy.((yx)z). Working with indices frequently requires shifting when a term is moved under a binder, e.g., during substitution. Good automation as provided by Autosubst in Coq [36, 39] can eliminate much of the tedium of index shifting. Nonetheless the internal representation occasionally leaks: index shifts may pop up in induction proofs and sometimes even lemma statements.

The related locally nameless representation [15, 33] combines de Bruijn indices for bound variables with the named representation for free variables, which allows to use readable names for the free variables. Locally nameless replaces shifting by opening terms such that bound variables are turned into free ones. One downside of the locally nameless approach is that terms with loose bounds are malformed and need to be ruled out using a predicate (or a subtype).

Nominal Logic [18] provides a nameful alternative to the two above approaches: binders carry explicit bound variable names, but the syntax is quotiented by a notion of alpha-equivalence which makes the name choice immaterial. Still the explicit mention of the bound meta-variable in the binders allows us to refer to it explicitly and choose it to avoid other surrounding variables, which enforces Barendregt’s variable convention. Nominal Isabelle is the Isabelle/HOL implementation of nominal logic [21, 42], which has been used in several substantial formalizations efforts [7, 29, 14, 13]. Our contribution follows the nominal approach, while generalizing the support for nested recursion and allowing infinitely many variables in terms.

Higher-order abstract syntax (HOAS) [30] uses binding primitives available in the meta-logic to represent binders of the object of study. For example, abstraction in the λ-calculus becomes 𝖫𝗆:(varterm)term under weak HOAS and 𝖫𝗆:(termterm)term under HOAS, reusing the λ-abstraction available in the language when writing specific lambda terms, e.g., 𝖫𝗆(λx.𝖫𝗆(λy.𝖠𝗉(𝖠𝗉yx)(𝖵𝗋z))). A challenge with (weak) HOAS are so-called exotic terms, i.e., terms that do not constitute valid λ-calculus terms because they observe aspects of the meta-language that the object language should not see. HOAS is popular in logical frameworks pioneered by Twelf [31] and refined and extended in Beluga [32] and Abella [5].

Berghofer and Urban [9] provide a detailed comparison between the de Bruijn and nominal approaches; Momigliano et al. [24] perform a similar exercise for de Bruijn and (weak) HOAS. Ambal et al. [2] compare all above approaches in the context of a higher-order π-calculus. Norrish and Vestergaard [27] establish a formal connection between de Bruijn and nominal terms.

Solutions using different above techniques [1] target the POPLmark challenge [4]. However, only four cover all proof-related parts, in particular including complex binders for linear pattern matching: three using de Bruijn indices in Isabelle [8] and Coq [45, 38] and one using HOAS in Twelf [3]. We provide the first complete solution following the nominal approach.

3 MrBNF in Action

As users, what do we want to be the effect of specifying a datatype with bindings, such as those of λ- or π-calculus syntax? We want the following: (1) a type capturing the syntax fully abstractly, i.e., not distinguishing between alpha-equivalent terms and not including “junk”, i.e., invalid terms; (2) constants corresponding to the syntactic constructors and other syntactic operators such as renaming and free-variables; (3) propositions describing the basic properties of non-binding constructors), and quasi-injectivity for the binder constructors; (4) propositions describing the basic properties concerning the interaction of constructors and the renaming and free-variable operators; (5) a proposition stating a binding-aware structural induction principle; and (6) a proposition stating the characteristic equations of a binding-aware structural recursion principle.

Importantly, we would not care how such a type and constants have been defined internally, because (a subset of) the above properties characterize the type uniquely up to an isomorphism. This ensures that these internal definitions, however they proceed, give us the correct result.

In the remainder of this section, using a sequence of increasingly sophisticated syntaxes with bindings we will illustrate how our MrBNF definitional package achieves these goals.

3.1 Preliminaries on cardinals and permutations

Isabelle has a well-developed theory of ordinals and cardinals [12]. In a nutshell: an ordinal is just a well-order, while a cardinal is an ordinal that is minimal under the preorder relation o on ordinals defined as follows: ror iff there exists a well-order embedding between r and r; we also write <o for the strict counterpart of this preorder, and also =o for its induced equivalence relation. Given any set A:aset, we define |A| to be its cardinality; technically, this is a (necessarily unique up to an order isomorphism) choice of a cardinal on a whose domain is A and that forms a well-order on A. Instead of |𝖴𝖭𝖨𝖵:aset|, the cardinality of the set of all elements of type a, we will simply write |a|, and refer to it as the cardinality of the type a.

For a function σ:aa, we write 𝗌𝗎𝗉𝗉σ for its support, defined as the set of elements that σ modifies, {xσxx}. We call permutation any such function that (1) is bijective and (2) has the cardinality of its support strictly smaller than that of its underlying type. Formally, the (polymorphic) predicate 𝗉𝖾𝗋𝗆:(aa)bool reflects this as 𝗉𝖾𝗋𝗆σ𝖻𝗂𝗃σ|𝗌𝗎𝗉𝗉σ|<o|a|. When a is countably infinite, being a permutation amounts to being a bijection of finite support, so this generalizes the standard nominal logic assumption. We let σ range over permutations and write σ1 for the inverse of σ. We let ab denote the swapping permutation, which takes a to b, takes b to a, and leaves everything else unchanged.

3.2 𝝀-calculus terms

Let us start with the paradigmatic example of syntax with bindings, that of untyped λ-calculus. Using our package, this can be declared as the following datatype lterm  of λ-terms, which is polymorphic in the type of variables, i.e., depends on the Isabelle type-variable 𝑣𝑎𝑟:

binder_datatype ’var lterm = Vr ’var | Ap "’var lterm" "’var lterm"
| Lm x::’var t::"’var lterm" binds x in t

When using the type 𝑣𝑎𝑟lterm , we will always implicitly assume that 𝑣𝑎𝑟 has at least countably infinite cardinality. (This is achieved in practice via a type class 𝗅𝖺𝗋𝗀𝖾lterm , i.e., being “large enough”, which means having cardinality at least as large as 𝖻𝗈𝗎𝗇𝖽lterm , and 𝖻𝗈𝗎𝗇𝖽lterm  is a cardinal bound specific to each datatype – here, for lterm , it is a countable cardinal, i.e., 𝖻𝗈𝗎𝗇𝖽lterm =0, so smallness means “at least countably infinite” – see §4 for more details.)

The command produces the following constants, all polymorphic in 𝑣𝑎𝑟:

  • the constructors 𝖵𝗋:𝑣𝑎𝑟𝑣𝑎𝑟lterm , 𝖠𝗉:𝑣𝑎𝑟lterm 𝑣𝑎𝑟lterm 𝑣𝑎𝑟lterm  and 𝖫𝗆:𝑣𝑎𝑟𝑣𝑎𝑟lterm 𝑣𝑎𝑟lterm ;

  • the free-variable operator 𝖥𝖵lterm :𝑣𝑎𝑟lterm 𝑣𝑎𝑟set;

  • the permutation operator 𝖯𝖤𝖱𝖬lterm :(𝑣𝑎𝑟𝑣𝑎𝑟)𝑣𝑎𝑟lterm 𝑣𝑎𝑟lterm , where we write t[σ]lterm  instead of 𝖯𝖤𝖱𝖬lterm σt;

  • a cardinal bound, 𝖻𝗈𝗎𝗇𝖽lterm  (which, as explained above, in this case it is 0);

  • a binding-aware recursion combinator

    𝗋𝖾𝖼lterm :((𝑣𝑎𝑟𝑣𝑎𝑟)(pp))(p𝑣𝑎𝑟set)((𝑣𝑎𝑟𝑣𝑎𝑟)(aa))(a𝑣𝑎𝑟set)(𝑣𝑎𝑟(pa))((pa)(pa)(pa))(𝑣𝑎𝑟(pa)(pa))𝑣𝑎𝑟lterm (pa).

We write 𝖥𝖵 and _[_] instead of 𝖥𝖵lterm  and _[_]lterm  (and similarly for other examples).

The following properties are generated (stated and proved) by our command:

Prop 1.
  1. (I)

    Distinctness and (quasi-)injectivity of the constructors:
    (1) 𝖵𝗋x𝖠𝗉t1t2; (2) 𝖵𝗋x𝖫𝗆xt; (3) 𝖠𝗉t1t2𝖫𝗆xt;
    (4) 𝖵𝗋x=𝖵𝗋xx=x; (5) 𝖠𝗉t1t2=𝖠𝗉t1t2t1=t1t2=t2;
    (6) 𝖫𝗆xt=𝖫𝗆xt(x𝖥𝖵tx=x)t=t[xx];

  2. (II)

    Equivariance of the constructors:
    (1) 𝗉𝖾𝗋𝗆σ(𝖵𝗋x)[σ]=𝖵𝗋(σx); (2) 𝗉𝖾𝗋𝗆σ(𝖠𝗉t1t2)[σ]=𝖠𝗉(t1[σ])(t2[σ]);
    (3) 𝗉𝖾𝗋𝗆σ(𝖫𝗆xt)[σ]=𝖫𝗆(σx)(t[σ]);

  3. (III)

    Smallness (here, equivalently, finiteness) of the set of free variables:
    (1) |𝖥𝖵t|<o𝖻𝗈𝗎𝗇𝖽lterm ;

  4. (IV)

    Interaction between free variables and constructors:
    (1) 𝖥𝖵(𝖵𝗋x)={x}; (2) 𝖥𝖵(𝖠𝗉t1t2)=𝖥𝖵t1𝖥𝖵t2; (3) 𝖥𝖵(𝖫𝗆xt)=𝖥𝖵t{x}.

  5. (V)

    Permutation identity and compositionality:
    (1) t[𝗂𝖽]=t; (2) 𝗉𝖾𝗋𝗆σ𝗉𝖾𝗋𝗆σt[σ][σ]=t[σσ];

  6. (VI)

    Interaction between free variables and permutation (infix ` denotes image):
    (1) 𝗉𝖾𝗋𝗆σ𝖥𝖵(t[σ])=σ`𝖥𝖵t; (2) 𝗉𝖾𝗋𝗆σ(x𝖥𝖵t.σx=x)t[σ]=t.

Note that the constructors 𝖵𝗋 and 𝖠𝗉 are free, hence injective (points (4) and (5) in the above proposition). On the other hand, the λ-constructor 𝖫𝗆 is not free, since it introduces bindings – for example, 𝖫𝗆x(𝖵𝗋x)=𝖫𝗆y(𝖵𝗋y) for any variables x,y. Therefore, only a quasi-injectivity, i.e., injectivity up to a renaming, property holds for it (point (6)).

Points (I.6), (IV.3) and (VI.2) all reflect the fact that we work not with entirely free terms but with terms quotiented to alpha-equivalence. And so does the following proposition, expressing a strong version of structural induction, which is also generated by the binder_datatype command:

Prop 2 (Binding-aware structural induction).

Assume 𝖯𝗏𝖺𝗋𝗌:p𝑣𝑎𝑟set and φ:p𝑣𝑎𝑟lterm bool are such that (1) p.|𝖯𝗏𝖺𝗋𝗌p|<o𝖻𝗈𝗎𝗇𝖽lterm , i.e., p.𝖿𝗂𝗇𝗂𝗍𝖾(𝖯𝗏𝖺𝗋𝗌p); (2) p,x.φp(𝖵𝗋x); (3) p,t1,t2.(q.φqt1)(q.φqt2)φp(𝖠𝗉t1t2); and (4) p.x,t.x𝖯𝗏𝖺𝗋𝗌p(q.φqt)φp(𝖫𝗆xt). Then p,t.φpt.

The above resembles standard structural induction (as available for the standard datatypes), except for the highlighted part, which allows one to assume during the induction process that the bound variables are disjoint from the variables coming from a designated type p of parameters – this enables the rigorous application of Barendregt’s variable convention [6]. Taking p to be the unit type and 𝖯𝗏𝖺𝗋𝗌p=, we obtain standard structural induction.

Given a type a together with operators resembling the free-variable and permutation operators, namely 𝑎𝑓𝑣:a𝑣𝑎𝑟set and 𝑎𝑝𝑟𝑚:(𝑣𝑎𝑟𝑣𝑎𝑟)aa, we say that they form a loosely-supported pre-nominal structure, written 𝗅𝗌𝗉𝗇𝗈𝗆𝑎𝑓𝑣𝑎𝑝𝑟𝑚, when the following holds:

  • Compositionality (Prop. 1, V.2): 𝗉𝖾𝗋𝗆σ𝗉𝖾𝗋𝗆σ𝑎𝑝𝑟𝑚σ(𝑎𝑝𝑟𝑚σa)=𝑎𝑝𝑟𝑚(σσ)a.

  • Congruence (Prop. 1, VI.2): 𝗉𝖾𝗋𝗆σ(x𝑎𝑓𝑣a.σx=x)𝑎𝑝𝑟𝑚σa=a.

Moreover, for any cardinal κ, we say that they form a κ-loosely-supported nominal structure, written 𝗅𝗌𝗇𝗈𝗆κ𝑎𝑓𝑣𝑎𝑝𝑟𝑚, when 𝗅𝗌𝗉𝗇𝗈𝗆𝑎𝑓𝑣𝑎𝑝𝑟𝑚 holds and additionally the following holds:

  • Smallness (Prop. 1, III.1 in case κ=𝖻𝗈𝗎𝗇𝖽lterm ): |𝖥𝖵a|<oκ.

Finally, given two types p and a and operators on them

  • 𝑝𝑓𝑣:p𝑣𝑎𝑟set, 𝑝𝑝𝑟𝑚:(𝑣𝑎𝑟𝑣𝑎𝑟)pp,

  • 𝑎𝑓𝑣:a𝑣𝑎𝑟set, 𝑎𝑝𝑟𝑚:(𝑣𝑎𝑟𝑣𝑎𝑟)aa,

  • 𝑣𝑟:𝑣𝑎𝑟(pa), 𝑎𝑝:𝑣𝑎𝑟lterm (pa)𝑣𝑎𝑟lterm (pa)(pa), 𝑙𝑚:𝑣𝑎𝑟𝑣𝑎𝑟lterm (pa)(pa)

(where 𝑣𝑟, 𝑎𝑝, 𝑙𝑚 have types resembling those of lterm ’s constructors 𝖵𝗋, 𝖠𝗉 and 𝖫𝗆), we say that they form an lterm -model, written 𝗆𝗈𝖽𝖾𝗅lterm 𝑝𝑓𝑣𝑝𝑝𝑟𝑚𝑎𝑓𝑣𝑎𝑝𝑟𝑚𝑣𝑟𝑎𝑝𝑙𝑚, provided that: (1) 𝗅𝗌𝗇𝗈𝗆𝖻𝗈𝗎𝗇𝖽lterm 𝑝𝑓𝑣𝑝𝑝𝑟𝑚 holds; (2) 𝗅𝗌𝗉𝗇𝗈𝗆𝑎𝑓𝑣𝑎𝑝𝑟𝑚 holds; and (3) the following properties, corresponding to properties of lterm , hold, where 𝑝𝑎𝑝𝑟𝑚σf=𝑎𝑝𝑟𝑚σf𝑝𝑝𝑟𝑚(σ1):

  1. 1.

    equivariance of the constructors (Prop 1, II.1, II.2, II.3):

    • 𝗉𝖾𝗋𝗆σ𝑝𝑎𝑝𝑟𝑚σ(𝑣𝑟x)=𝑣𝑟(σx);

    • 𝗉𝖾𝗋𝗆σ𝑝𝑎𝑝𝑟𝑚σ(𝑎𝑝t1f1t2f2)=𝑎𝑝(t1[σ])(𝑝𝑎𝑝𝑟𝑚σf1)(t2[σ])(𝑝𝑎𝑝𝑟𝑚σf2);

    • 𝗉𝖾𝗋𝗆σ𝑝𝑎𝑝𝑟𝑚σ(𝑙𝑚xtf)=𝑙𝑚(σx)(t[σ])(𝑝𝑎𝑝𝑟𝑚σf);

  2. 2.

    free-variables sub-distributing under constructors (weaker versions of Prop 1, IV.1, IV.2, IV.3, with inclusions instead of equalities):

    • 𝑎𝑓𝑣(𝑣𝑟xp){x}𝑝𝑓𝑣p;

    • 𝑎𝑓𝑣(f1p)𝑎𝑓𝑣t1𝑝𝑓𝑣p𝑎𝑓𝑣(f2p)𝑎𝑓𝑣t2𝑝𝑓𝑣p𝑎𝑓𝑣(𝑎𝑝t1f1t2f2p)𝑎𝑓𝑣t1𝑎𝑓𝑣t2𝑝𝑓𝑣p;

    • x𝑝𝑓𝑣p𝑎𝑓𝑣(fp)𝑎𝑓𝑣t{x}𝑝𝑓𝑣p𝑎𝑓𝑣(𝑙𝑚xtfp)𝑎𝑓𝑣t{x}𝑝𝑓𝑣p.

We refer to the types p and a above as the parameter type and the carrier type of the lterm -model, respectively. Our binding-aware recursor operates on lterm -models, in that, given any lterm -model it returns a function from terms and parameters to carrier elements that (1) commutes with the constructors and permutation operators; and (2) preserves the free-variable operators. Moreover, commutation with the binding constructor happens in a binding-aware fashion, that is, avoiding clashes between the bound variables and the parameter variables – i.e., again obeying Barendregt’s variable convention. This is expressed in the following proposition:

Prop 3 (Binding-aware recursion).

Assume 𝗆𝗈𝖽𝖾𝗅lterm 𝑝𝑓𝑣𝑝𝑝𝑟𝑚𝑎𝑓𝑣𝑎𝑝𝑟𝑚𝑣𝑟𝑎𝑝𝑙𝑚 holds and let g:𝑣𝑎𝑟lterm pa denote 𝗋𝖾𝖼lterm 𝑝𝑓𝑣𝑝𝑝𝑟𝑚𝑎𝑓𝑣𝑎𝑝𝑟𝑚𝑣𝑟𝑎𝑝𝑙𝑚. The following properties hold: (1) g(𝖵𝗋x)p=𝑣𝑟xp; (2) g(𝖠𝗉t1t2)p=𝑎𝑝t1(gt1)t2(gt2)p; (3) x𝑝𝑓𝑣pg(𝖫𝗆xt)p=𝑙𝑚xt(gt)p; (4) 𝗉𝖾𝗋𝗆σg(a[σ])p=𝑝𝑎𝑝𝑟𝑚σ(ga)p; and (5) 𝑎𝑓𝑣(gtp)𝖥𝖵t𝑝𝑓𝑣p.

In the current implementation, we do not get a single recursor constant and the above recursion theorem, but rather given a model we define g and derive its properties on the fly. Our recursor definition follows Blanchette et al.’s design [10], which generalizes Norrish’s nominal recursor [26] and removes one of the unnecessary assumptions [34].

Here is an example of applying the recursor. For any ρ:varvarlterm , we let its support 𝖲𝗎𝗉𝗉ρ be {x:varρx𝖵𝗋x}, and its image-support 𝖨𝗆𝖲𝗎𝗉𝗉ρ be 𝖲𝗎𝗉𝗉ρt𝖲𝗎𝗉𝗉ρ𝖥𝖵t. We let the type of substitution-functions var𝗌𝗎𝖻𝗌𝗍𝖥𝗎𝗇 be the type of all functions ρ such that |𝖲𝗎𝗉𝗉ρ|<o𝖻𝗈𝗎𝗇𝖽lterm  (obtained as a subtype of ρ:varvarlterm ); function application and composition are inherited to var𝗌𝗎𝖻𝗌𝗍𝖥𝗎𝗇 from the function type and are denoted the same. To define term-for-variable substitution operator 𝗌𝗎𝖻𝗌𝗍:𝑣𝑎𝑟lterm var𝗌𝗎𝖻𝗌𝗍𝖥𝗎𝗇varlterm , we take p=var𝗌𝗎𝖻𝗌𝗍𝖥𝗎𝗇 and a=varlterm , and determine the model from the desired recursive clauses for the constructors and the desired behavior of substitution w.r.t. free variables and permutation:

  1. (1)

    𝗌𝗎𝖻𝗌𝗍(𝖵𝗋x)ρ=ρx;

  2. (2)

    𝗌𝗎𝖻𝗌𝗍(𝖠𝗉t1t2)ρ=𝖠𝗉(𝗌𝗎𝖻𝗌𝗍t1ρ)(𝗌𝗎𝖻𝗌𝗍t2ρ);

  3. (3)

    x𝖨𝗆𝖲𝗎𝗉𝗉ρ𝗌𝗎𝖻𝗌𝗍(𝖫𝗆xt)ρ=𝖫𝗆x(𝗌𝗎𝖻𝗌𝗍tρ);

  4. (4)

    𝗌𝗎𝖻𝗌𝗍(t[σ])ρ=𝗌𝗎𝖻𝗌𝗍t((_[σ])ρ);

  5. (5)

    𝖥𝖵(𝗌𝗎𝖻𝗌𝗍tρ)𝖥𝖵t𝖨𝗆𝖲𝗎𝗉𝗉ρ.

Namely, here is the lterm -model structure (𝑝𝑓𝑣,𝑝𝑝𝑟𝑚,𝑎𝑓𝑣,𝑎𝑝𝑟𝑚,𝑣𝑟,𝑎𝑝,𝑙𝑚) corresponding to (and unambiguously determined from) the above:

(M1)

𝑣𝑟xρ=ρx;

(M2)

𝑎𝑝t1f1t2f2ρ=𝖠𝗉(f1ρ)(f2ρ);

(M3)

𝑙𝑚xtfρ=𝖫𝗆x(fρ);

(M4)

𝑎𝑝𝑟𝑚tσ=t[σ] and 𝑝𝑝𝑟𝑚ρσ=(_[σ])ρ;

(M5)

𝑎𝑓𝑣t=𝖥𝖵t and 𝑝𝑓𝑣ρ=𝖨𝗆𝖲𝗎𝗉𝗉ρ.

Indeed, the (Mi) definitions are obtained by “fishing” the codomain operator behind the (i) recursive clause – e.g., (M2) turns (2) into 𝗌𝗎𝖻𝗌𝗍(𝖠𝗉t1t2)ρ=𝑎𝑝t1(𝗌𝗎𝖻𝗌𝗍ρt1)t2(𝗌𝗎𝖻𝗌𝗍ρt2)ρ. Currently this fishing process is not implemented in our package, so the user has to explicitly indicate these operators and then infer (1)–(5) from the recursion theorem.

3.3 Infinitary 𝝀-calculus terms

Let astream  and adstream  be the polymorphic types of streams (i.e., countable sequences) and distinct (i.e., non-repetitive) streams, respectively. While streams exist in Isabelle’s standard library, we introduce distinct streams as a subtype of streams that ensures that stream elements do not repeat. To simplify the exposition, we pretend that making a type non-repetitive (or linear) is performed automatically using the following command, while for now we are executing manually a uniform construction sketched by Blanchette et al. [10, §4].

linear_type adstream  = astream  on a

The type of infinitary λ-terms [22], where λ-abstraction binds a distinct stream of variables and application applies a term to a stream of terms, is introduced by the following command:

binder_datatype ’var iterm = iVr ’var | iAp "’var iterm" "’var iterm stream"
| iLm "(xs::’var) dstream" t::"’var iterm" binds xs in t

This time (employing the same type-class mechanism explained in §3.2) when using the type 𝑣𝑎𝑟iterm we will implicitly assume that 𝑣𝑎𝑟 has cardinality at least 1, i.e., is more than countable. Indeed, to accommodate the countable branching syntax while ensuring that no term can exhaust the entire supply of variables, we now have 𝖻𝗈𝗎𝗇𝖽iterm=1.

Our command produces again the familiar constants: the constructors 𝗂𝖵𝗋, 𝗂𝖠𝗉 and 𝗂𝖫𝗆, free-variable operator 𝗂𝖥𝖵, permutation operator 𝗂𝖯𝖤𝖱𝖬 (written _[_]), a cardinal bound 𝖻𝗈𝗎𝗇𝖽iterm (here, 1), and a binding-aware recursion combinator 𝗋𝖾𝖼iterm. Moreover, it generates similar properties as for lterm . We only show properties that differ in a major way from the lterm  case (while keeping the numbering). We use an auxiliary predicate for a function that behaves as identity on a given set: 𝗂𝖽_𝗈𝗇Af=xA.fx=x.

Prop 4.
  1. (I)

    Distinctness and (quasi-)injectivity of the constructors: (6) 𝗂𝖫𝗆𝑥𝑠t=𝗂𝖫𝗆𝑥𝑠t(σ.𝗉𝖾𝗋𝗆σ𝗂𝖽_𝗈𝗇(𝗂𝖥𝖵t𝖽𝗌𝗌𝖾𝗍𝑥𝑠)σ𝖽𝗌𝗆𝖺𝗉σ𝑥𝑠=𝑥𝑠t[σ]=t);

  2. (II)

    Equivariance of the constructors:
    (2) 𝗉𝖾𝗋𝗆σ(𝗂𝖠𝗉t𝑡𝑠)[σ]=𝗂𝖠𝗉(t[σ])(𝗌𝗆𝖺𝗉(λt.t[σ])𝑡𝑠);
    (3) 𝗉𝖾𝗋𝗆σ(𝗂𝖫𝗆𝑥𝑠t)[σ]=𝗂𝖫𝗆(𝖽𝗌𝗆𝖺𝗉σ𝑥𝑠)(t[σ]);

  3. (III)

    Smallness (here, equivalently, at most countability) of the set of free variables:
    (1) |𝗂𝖥𝖵t|<o𝖻𝗈𝗎𝗇𝖽iterm;

  4. (IV)

    Interaction between free variables and constructors:
    (2) 𝗂𝖥𝖵(𝗂𝖠𝗉t𝑡𝑠)=𝗂𝖥𝖵tt𝗌𝗌𝖾𝗍𝑡𝑠𝗂𝖥𝖵t; (3) 𝗂𝖥𝖵(𝗂𝖫𝗆𝑥𝑠t)=𝗂𝖥𝖵t𝖽𝗌𝗌𝖾𝗍𝑥𝑠;

  5. (V)

    Permutation identity and compositionality;

  6. (VI)

    Interaction between free variables and permutation.

Again, 𝗂𝖵𝗋 and 𝗂𝖠𝗉 are free constructors, hence injective, whereas the binding constructor 𝗂𝖫𝗆 only satisfies quasi-injectivity, i.e., injectivity up to a permutation of the bound variables which leaves the term’s free variables untouched (I.6) – note that the latter property uses the dstream -specific free variables (𝖽𝗌𝗌𝖾𝗍) and permutation operators (𝖽𝗌𝗆𝖺𝗉). Similarly the recursive occurrences of iterm  nested under stream  in the 𝗂𝖠𝗉 constructor are accessed via the stream’s 𝗌𝗆𝖺𝗉 and 𝗌𝗌𝖾𝗍 functions in II.2 and IV.2, respectively. We obtain binding-aware structural induction and recursion principles, too, and highlight the main differences to Props. 2 and 3 (for a corresponding notion of iterm-model):

Prop 5 (Binding-aware structural induction).

Assume 𝖯𝗏𝖺𝗋𝗌:p𝑣𝑎𝑟set and φ:p𝑣𝑎𝑟itermbool are such that (1) p.|𝖯𝗏𝖺𝗋𝗌p|<o𝖻𝗈𝗎𝗇𝖽iterm, i.e., p.𝖼𝗈𝗎𝗇𝗍𝖺𝖻𝗅𝖾(𝖯𝗏𝖺𝗋𝗌p); (2) p,x.φp(𝗂𝖵𝗋x); (3) p,t,𝑡𝑠.(q.φqt)(t𝗌𝗌𝖾𝗍𝑡𝑠.q.φqt)φp(𝗂𝖠𝗉t𝑡𝑠); and (4) p,𝑥𝑠,t.𝖽𝗌𝗌𝖾𝗍𝑥𝑠𝖯𝗏𝖺𝗋𝗌p=(q.φqt)φp(𝗂𝖫𝗆𝑥𝑠t). Then p,t.φpt.

Prop 6 (Binding-aware recursion).

Assume 𝗆𝗈𝖽𝖾𝗅iterm𝑝𝑓𝑣𝑝𝑝𝑟𝑚𝑎𝑓𝑣𝑎𝑝𝑟𝑚𝑖𝑣𝑟𝑖𝑎𝑝𝑖𝑙𝑚 holds, and let g:𝑣𝑎𝑟itermpa denote 𝗋𝖾𝖼iterm𝑝𝑓𝑣𝑝𝑝𝑟𝑚𝑎𝑓𝑣𝑎𝑝𝑟𝑚𝑖𝑣𝑟𝑖𝑎𝑝𝑖𝑙𝑚. Further let 𝑝𝑎𝑝𝑟𝑚σf=𝑎𝑝𝑟𝑚σf𝑝𝑝𝑟𝑚(σ1). The following hold: (1) g(𝗂𝖵𝗋x)p=𝑖𝑣𝑟xp; (2) g(𝗂𝖠𝗉t𝑡𝑠)p=𝑖𝑎𝑝t(gt)𝑡𝑠(𝗌𝗆𝖺𝗉g𝑡𝑠)p; (3) 𝖽𝗌𝗌𝖾𝗍xs𝑝𝑓𝑣p=g(𝗂𝖫𝗆𝑥𝑠t)p=𝑖𝑙𝑚𝑥𝑠t(gt)p; (4) 𝗉𝖾𝗋𝗆σg(a[σ])p=𝑝𝑎𝑝𝑟𝑚σ(ga)p; and (5) 𝑎𝑓𝑣(gtp)𝗂𝖥𝖵t𝑝𝑓𝑣p.

3.4 Types and terms for System F<:

We define the types and terms of System F<:, which we will use in our solution to the POPLmark challenge (§6). Because we aim for the challenge 2B, we directly introduce the syntax that incorporates nested types and pattern matching. Compared to the previous subsections we will be much briefer regarding the output of our binder_datatype commands: the previous examples already cover many of the arising ingredients and phenomena.

We start by introducing a non-repetitive (in the keys) type of finite sets of key-value pairs that will be used to represent records (where we use strings as keys).

type_synonym label = string
linear_type (a,b)lfset = (a×b)fset on a

The challenge description and all existing solutions favor ordered collections for records, and it would be easy for us to adjust our entire formalization to use lists instead of finite sets (fset). We chose to use fset as the basis for our records because in practical languages like Standard ML or JSON records are considered to be unordered collections. We also chose it because it displays the flexibility of our approach to work with nested recursion through non-datatypes:

binder_datatype ’tvar type = TVr ’tvar | Top | Arr "’tvar type" "’tvar type"
| All X::’tvar "’tvar type" T::"’tvar type" binds X in T
| TRec "(label, ’tvar type) lfset"

The above command defines POPLmark types. The only binding constructor is 𝖠𝗅𝗅 and we obtain the following quasi-injectivity property for it (where 𝖳𝖥𝖵:𝑡𝑣𝑎𝑟type𝑡𝑣𝑎𝑟set): 𝖠𝗅𝗅XT1T2=𝖠𝗅𝗅XT1T2(T1=T1(X𝖳𝖥𝖵T2X=X)T2=T2[XX]). Naturally, we also obtain binding-aware induction and recursion principles.

We continue with defining terms. For that purpose we introduce patterns as the non-repetitive subtype of the (non-binding) “pre-pattern” datatype that recurses through lfset.

datatype (𝑡𝑣𝑎𝑟,𝑣𝑎𝑟)ppat = 𝖯𝖯𝖵𝗋𝑣𝑎𝑟(𝑡𝑣𝑎𝑟type) 𝖯𝖯𝖱𝖾𝖼(label,(𝑡𝑣𝑎𝑟,𝑣𝑎𝑟)ppat)lfset
linear_type (𝑡𝑣𝑎𝑟,𝑣𝑎𝑟)pat = (𝑣𝑎𝑟,𝑡𝑣𝑎𝑟)ppat on 𝑣𝑎𝑟

We lift the pre-pattern constructors 𝖯𝖯𝖵𝗋 and 𝖯𝖯𝖱𝖾𝖼 to the pattern type as 𝖯𝖵𝗋:𝑣𝑎𝑟𝑡𝑣𝑎𝑟type(𝑡𝑣𝑎𝑟,𝑣𝑎𝑟)pat and 𝖯𝖱𝖾𝖼:(label,(𝑡𝑣𝑎𝑟,𝑣𝑎𝑟)pat)lfset(𝑡𝑣𝑎𝑟,𝑣𝑎𝑟)pat. The latter operator is not a free constructor: its argument must satisfy a non-repetitiveness predicate (nonrep𝖯𝖱𝖾𝖼:(label,(𝑡𝑣𝑎𝑟,𝑣𝑎𝑟)pat)lfsetbool). We are ready to define terms:

binder_datatype (’tvar, ’var) term = Vr ’var
| Ap "(’tvar, ’var) term" "(’tvar, ’var) term"
| Lm x::’var "’tvar typ" t::"(’tvar, ’var) term" binds x in t
| ApT "(’tvar, ’var) term" "’tvar typ"
| LmT X::’tvar "’tvar typ" t::"(’tvar, ’var) term" binds X in t
| Rec "(label, (’tvar, ’var) term) lfset" | Proj "(’tvar, ’var) term" label
| Let "(’tvar, P::’var) pat" "(’tvar, ’var) term" t::"(’tvar, ’var) term" binds P in t

Of the eight constructors, three are binding. We show their quasi-injectivity properties:

𝖫𝗆xTt=𝖫𝗆xTt(T=T(x𝖥𝖵tx=x)t=t[xx])𝖫𝗆𝖳XTt=𝖫𝗆𝖳XTt(T=T(X𝖥𝖳𝖵tx=x)t=t[XX])𝖫𝖾𝗍Pt1t2=𝖫𝖾𝗍Pt1t2(t1=t1(σ.𝗉𝖾𝗋𝗆σ𝗂𝖽_𝗈𝗇(𝖥𝖵t𝖯𝖵P)σP[σ]=Pt2[σ]=t2))

These hinge on our ability to refer to a term’s free variables (𝖥𝖵) and its free type variables (𝖥𝖳𝖵), as well as a pattern’s free type variables (𝖯𝖵). Similarly, we obtain and make use of infrastructure to permute a pattern’s variables, a term’s variables, and a term’s type variables. Again, we also obtain binding-aware induction and recursion principles, where e.g., the parameter p avoids a pattern P’s free variables in the 𝖫𝖾𝗍 case of the induction by providing the assumption 𝖯𝖵𝖺𝗋𝗌p𝖯𝖵P= to the user.

4 MrBNF’s Internals: Construction of Datatypes with Bindings

Isabelle’s definitional package for standard (co)datatypes [12], sometimes referred to as the BNF package, is based on bounded natural functors (BNFs) [41] – which are comprised of meta-information associated with well-behaved type constructors and are closed under composition and fixpoints (datatype and codatatype construction). The meta-information consists of a few constants and relations between them. Specifically, a BNF is an n-ary type constructor α¯T (here we use the overlined notation as a shorthand for (α1,,αn)T and similarly in the following for other types and terms) along with a mapper 𝗆𝖺𝗉T:(αβ)¯α¯Tβ¯T, the relator 𝗋𝖾𝗅T:(αβbool)¯α¯Tβ¯Tbool, several setters 𝗌𝖾𝗍Ti:α¯Tαiset, and the cardinal bound 𝖻𝗈𝗎𝗇𝖽T satisfying a number of properties, e.g., 𝗆𝖺𝗉T𝗂𝖽¯=𝗂𝖽 or 𝗌𝖾𝗍Ti(𝗆𝖺𝗉f¯x)=fi`𝗌𝖾𝗍Tix. For example, standard lists form a BNF with the standard 𝗆𝖺𝗉 function, the relator 𝗅𝗂𝗌𝗍_𝖺𝗅𝗅𝟤, which relates two lists of the same length provided that lists’ elements satisfy the given relation when zipped pair-wise, the 𝗌𝖾𝗍 function that returns all the list’s elements and the cardinality bound 0. Standard datatypes are least fixpoints of BNFs.

The BNF properties require 𝗆𝖺𝗉T to behave well when applied to arbitrary functions. Blanchette et al. [10] observed that this is too restrictive when dealing with syntax with bindings and generalized BNFs to map-restricted bounded natural functions (MRBNFs). MRBNFs thus resemble BNFs but distinguish between three modes of type arguments: bound variables which can be mapped by permutations, free variables which can be mapped by small-support (endo)functions, and live variables which correspond to BNF’s variables and can be mapped by arbitrary functions. (Both BNFs and MRBNFs also support variables that are ignored by 𝗆𝖺𝗉T, which are called dead.) We write (α¯,β¯)T for the MRBNF with m=|α¯| free and bound variables and |β¯| live variables (dead variables are left implicit) with the mapper 𝗆𝖺𝗉T:(αα)¯(βγ)¯(α¯,β¯)T(α¯,γ¯)T, the relator 𝗋𝖾𝗅T:(αα)¯(βγbool)¯(α¯,β¯)T(α¯,γ¯)Tbool, several setters 𝗌𝖾𝗍Ti:(α¯,β¯)Tγiset where γi=αi if i<m and γi=βim otherwise, and the cardinal bound 𝖻𝗈𝗎𝗇𝖽T. Note that 𝗋𝖾𝗅T only acts on the bound and free variables using permutations or small-support functions, respectively. The MRBNF properties clarify which of the α¯ are bound or free. For example, distinct streams adstream  are an MRBNF with bound variable a. Our MrBNF package implements Blanchette et al.’s [10] construction of binding-aware datatypes as least fixpoints of MRBNF type equations, and also customizes it to the high-level operators and theorems required by the users. In this section, we describe the construction’s main milestones.

4.1 From User-Specifications to Fixpoint Equations

The binder_datatype command’s syntax is inspired by Isabelle’s standard datatype command. Bound and free variables in binder datatypes are always left polymorphic. A custom name can be provided for the free variable operators. A type class on the type argument ensures that the type chosen is large enough for the size of the binder datatype, e.g., that it is at least countably infinite for finite syntax like lterm  and at least uncountably infinite for iterm. The “at least” makes nesting binder datatypes in other potentially larger (e.g. uncountable) types easier as the variable type can be increased to match the size of the surrounding type.

The other major addition to the command’s syntax compared with standard datatypes are the binding annotations (inspired by Nominal Isabelle) and the subterm selectors. Normal datatypes allow to automatically define accessor functions using the fun_name::type syntax. For binder datatypes this syntax is repurposed and generalized to define the binding structure. A selector can not only appear on the top level (i.e. on a field of a constructor as in the 𝖫𝗆 constructor in term) but also nested within other types (as in the 𝖫𝖾𝗍 constructor in term). Valid targets for the selectors are variable positions and (potentially mutually) recursive positions.

MrBNF translates the user specification into the pre-datatype, a non-recursive sum of products. Thereby, variable and recursive positions are separated based on whether they appear in a binding clause. Next to the free variables visible in the syntax (’var), the pre-datatype also has a bound variable position, and two recursive positions for recursive occurrences under a binder and not under a binder respectively. The iterm type’s pre-datatype is:

type_synonym (’var, ’bvar, ’rec, ’brec) pre_iterm = ’var (* free occurrence *)
+ (’rec * ’rec stream) (* recursive non-binding occurrences *)
+ (’bbar dstream * ’brec) (* bound and recursive bound occurrence *)

Next, MrBNF defines a “raw” standard datatype with a single constructor (which is completely free, i.e., not yet quotiented to α-equivalence):

datatype ’var raw_iterm =
ctorraw_iterm "(’var, ’var, ’var raw_iterm, ’var raw_iterm) pre_iterm"

For the above step as well as to prepare for the next steps in the construction of the binder datatype, the pre-datatype must form an MRBNF with free 𝑣𝑎𝑟 (setter 𝗌𝖾𝗍pre_iterm1), bound 𝑏𝑣𝑎𝑟 (setter 𝗌𝖾𝗍pre_iterm2), and live 𝑟𝑒𝑐 (setter 𝗌𝖾𝗍pre_iterm3) and 𝑏𝑟𝑒𝑐 (setter 𝗌𝖾𝗍pre_iterm4) positions – this is ensured by tracking the registered MRBNFs and automating their composition.

4.2 Composition of MRBNFs

The proof that a given type forms an MRBNF proceeds recursively via composition. For the individual components there are three cases to consider. If the type is a type-variable then return the identity MRBNF (live a, TID:=a, 𝗆𝖺𝗉ID:=𝗂𝖽:(aa)aa, 𝗋𝖾𝗅ID:=𝗂𝖽:(aabool)aabool, and 𝗌𝖾𝗍ID:=λx.{x}:aaset). If the topmost type constructor is not known to be a (MR)BNF then return the constant MRBNF (dead d, TCST:=d, 𝗆𝖺𝗉CST:=𝗂𝖽:dd, 𝗋𝖾𝗅CST:=(=):ddbool). Otherwise (i.e., the topmost type constructor is a MRBNF) recursively prove that its arguments are MRBNFs. Then do a composition step between the outer MRBNF and the inner MRBNFs.

Inspired by BNF composition [12], MRBNF composition is split into several phases. First step is demoting. All type-variables shared between the involved MRBNFs are demoted to the same mode. Given that modes can only become more specific (live > free > bound > dead), this will result in the lowest mode a variable is used at in any of the MRBNFs. If a type-variable appears under a type constructor that is not a (MR)BNF, it must be demoted to dead (using the constant MRBNF). The second step is lifting: new dummy type-variables are added to all MRBNFs to ensure that all involved MRBNFs have the same (modulo reordering) bound and free type-variables and all the inner MRBNFs have the same live type-variables (again modulo reordering). The third step is permuting: bring shared type variables into the same order in all involved MRBNFs. Finally, composition proceeds along the following definition:

Definition 1.

Given an outer MRBNF (α¯,β¯)G where |α¯|=m and |β¯|=n and inner MRBNFs (α¯,γ¯)F1(α¯,γ¯)Fn where |γ¯|=k, the composed MRBNF (α¯,γ¯)H is given by:

(α¯,γ¯)TH=(α¯,(α¯,γ¯)F1,,(α¯,γ¯)Fn)G𝗆𝖺𝗉H=λf¯g¯.𝗆𝖺𝗉Gf¯(𝗆𝖺𝗉F1f¯g¯)(𝗆𝖺𝗉Fnf¯g¯):(αα)¯(γγ)¯(α¯,γ¯)TH(α¯,γ¯)TH𝗌𝖾𝗍Him=λx.𝗌𝖾𝗍Gix(y𝗌𝖾𝗍Gm+1x.𝗌𝖾𝗍F1iy)(y𝗌𝖾𝗍Gm+n.𝗌𝖾𝗍Fniy):(α¯,γ¯)THαi𝗌𝖾𝗍𝗌𝖾𝗍Hi>m=λx.(y𝗌𝖾𝗍Gm+1x.𝗌𝖾𝗍F1iy)(y𝗌𝖾𝗍Gm+n.𝗌𝖾𝗍Fniy):(α¯,γ¯)THγim𝗌𝖾𝗍𝗋𝖾𝗅H=λf¯R¯.𝗋𝖾𝗅Gf¯(𝗋𝖾𝗅F1f¯R¯)(𝗋𝖾𝗅Fnf¯R¯):(αα)¯(γγbool)¯(α¯,γ¯)TH(α¯,γ¯)THbool

4.3 Fixpoint and Quotienting Constructions

Next, MrBNF automates Blanchette et al. [10] definition of free variables, permutation and α-equivalence on the raw datatype. Free variables are defined via an inductive predicate 𝖿𝗋𝖾𝖾 that specifies if a variable x is free in a term t, here shown on our running example iterm.

The 𝖥𝖵raw_iterm function is then defined as λt.{a.𝖿𝗋𝖾𝖾ax}. One also defines a primitive recursive function 𝗉𝖾𝗋𝗆𝗎𝗍𝖾 that takes an permutation on the variable position and applies it to all variables (bound and free). This function uses the map function of the pre-datatype: 𝗉𝖾𝗋𝗆𝗎𝗍𝖾σ(𝖼𝗍𝗈𝗋raw_itermx)=𝖼𝗍𝗈𝗋raw_iterm(𝗆𝖺𝗉pre_itermσσ(𝗉𝖾𝗋𝗆𝗎𝗍𝖾σ)(𝗉𝖾𝗋𝗆𝗎𝗍𝖾σ)x). Equipped with these two functions, alpha-equivalence is defined inductively as follows:

MrBNF proves 𝖺𝗅𝗉𝗁𝖺iterm to be an equivalence relation and uses it to define a quotient of the raw datatype. The constructor, free variable and permutation operators are lifted from the raw datatype to the quotient. The lifted constructor 𝖼𝗍𝗈𝗋iterm is used to define the high-level constructors 𝗂𝖵𝗋, 𝗂𝖠𝗉, and 𝗂𝖫𝗆, and to prove all the high-level theorems illustrated in §3.

MrBNF actually implements a mild generalization of Blanchette et al. [10]’s fixpoint construction, which only allows to bind variables in recursive subterms. However, sometimes it is necessary to bind variables that appear free in another (non-recursive) type occurrence. To solve this issue in the pre-datatype, instead of just having positions for free and bound type-variables, we also introduce a hybrid called bfree type-variables. Then, alpha-equivalence ensures that the portion of bfree variables that appear bound is appropriately renamed.

4.4 Recursion

MrBNF also implements Blanchette et al.’s binding-aware recursion principle [10]. To define a recursive function from a binding-aware datatype α¯T of free type-variables α¯ to some other α¯-type α¯U, one needs: (1) a parameter structure consisting of a type α¯P, a permutation Pmap:αα¯α¯Pα¯P and support operators PVarsi:α¯Pαi𝗌𝖾𝗍, and (2) a model consisting of a type α¯U, permutation, and support operators similar to the parameter structure as well as an algebra structure encoding the recursive behavior of the all the constructors, Uctor:(α¯,α¯T×(α¯Pα¯U)¯)pre_Tα¯Pα¯U, where pre_T is the pre-datatype of T.

We introduce a precursor of our recursor on raw terms, which applies Uctor recursively while suitably permuting bound variables “out of the way” with regard to the parameter structure. Suitably means that the “out of the way” function f returns a permutation that does not change the frees and makes bounds disjoint from the frees. For iterm, suitable is defined as:

𝗌𝗎𝗂𝗍𝖺𝖻𝗅𝖾itermf=xp.𝗉𝖾𝗋𝗆(fxp)𝗂𝗆𝗌𝗎𝗉𝗉(fxp)((𝖥𝖵iterm(𝖼𝗍𝗈𝗋itermx)𝖯𝖵𝖺𝗋𝗌p)𝗌𝖾𝗍pre_iterm2x)=fxp`𝗌𝖾𝗍pre_iterm2x(𝖥𝖵iterm(𝖼𝗍𝗈𝗋itermx)𝖯𝖵𝖺𝗋𝗌p)=

Here, 𝗂𝗆𝗌𝗎𝗉𝗉f=𝗌𝗎𝗉𝗉ff`𝗌𝗎𝗉𝗉f. As the precursor must permute the bound variables, it is not possible to define it using primitive recursion. Instead, we use well-founded recursion via an auxiliary 𝗌𝗎𝖻𝗌𝗁𝖺𝗉𝖾 relation, which provides the necessary wiggle room:

We obtain the definition of the precursor 𝗋𝖾𝖼U for a suitable “out of the way” function f:

𝗋𝖾𝖼Uf(𝖼𝗍𝗈𝗋raw_itermx)p=if¬𝗌𝗎𝗂𝗍𝖺𝖻𝗅𝖾itermfthen𝗎𝗇𝖽𝖾𝖿𝗂𝗇𝖾𝖽else𝑈𝑐𝑡𝑜𝑟(𝗆𝖺𝗉pre_itermid(fxp)((λt.(t,𝗋𝖾𝖼Uft))𝗉𝖾𝗋𝗆𝗎𝗍𝖾raw_iterm(fxp))(λt.(t,𝗋𝖾𝖼Uft))x)p

The main lemma that the package proves is that the precursor commutes with permutation, it returns the same result for alpha-equivalent terms, and that the specific choice of the “out of the way” function is irrelevant. These properties must be proved simultaneously by induction on the binder datatype using the induction scheme associated with the subshape relation:

𝗌𝗎𝗂𝗍𝖺𝖻𝗅𝖾itermf𝗌𝗎𝗂𝗍𝖺𝖻𝗅𝖾itermf𝗉𝖾𝗋𝗆σ𝖺𝗅𝗉𝗁𝖺itermtt𝗋𝖾𝖼Uf(𝗉𝖾𝗋𝗆𝗎𝗍𝖾raw_itermσt)p=𝑈𝑚𝑎𝑝σ(𝗋𝖾𝖼Uft(𝖯𝗆𝖺𝗉σ1p))𝗋𝖾𝖼Uftp=𝗋𝖾𝖼Uftp

To move towards the binding-aware recursor, we use Hilbert Choice to hide the “out of the way” function by choosing an arbitrary suitable one 𝗋𝗋𝖾𝖼U=𝗋𝖾𝖼U(εf.𝗌𝗎𝗂𝗍𝖺𝖻𝗅𝖾itermf).

The invariance of the precursor under alpha is needed to lift it from the raw type to the quotient type. The definition of the precursor is used to derive a better simplification rule that hides the permuting function. This rule requires that the top-level bound variables are disjoint from the parameter structure and from the free variables. We then use identity as the “out of the way” function on the top level and an arbitrary suitable function in the recursion.

𝗌𝖾𝗍pre_iterm2x(𝗌𝖾𝗍pre_iterm1x(z𝗌𝖾𝗍pre_iterm3x𝖥𝖵itermz)𝖯𝖵𝖺𝗋𝗌p)=𝗋𝗋𝖾𝖼U(𝖼𝗍𝗈𝗋raw_itermx)p=𝖴𝖼𝗍𝗈𝗋(𝗆𝖺𝗉pre_iterm𝗂𝖽𝗂𝖽(λt.(t,𝗋𝗋𝖾𝖼Ut))(λt.(t,𝗋𝗋𝖾𝖼Ut))x)p

Relativized Recursion.

The above is a slightly simplified version of our recursion facilities. To accommodate situations where the domain of parameters or the target domain for the intended recursion function do not make up the entire types but only certain subsets, MrBNF allows the user to optionally provide predicates 𝗏𝖺𝗅𝗂𝖽P:α¯Pbool and 𝗏𝖺𝗅𝗂𝖽U:α¯Ubool that restrict these domains – while producing proof obligations that the user-provided parameter-structure and recursor-model operators preserve these predicates, and producing recursor clauses relativized to these predicates. For normal datatypes such a feature would be useless, as there are no proof obligations incurred for recursion. But for binding datatypes this is useful, since organizing the entire type as a parameter structure or a recursor model (so that the proof obligations can be discharged) is often difficult or awkward. An example of leveraging the 𝗏𝖺𝗅𝗂𝖽P flexibility is if the user prefers to define substitution using actual functions subject to the small-support requirement, as opposed to defining a subtype corresponding to this requirement (𝗌𝗎𝖻𝗌𝗍𝖥𝗎𝗇 at the end of §3.2). In §5, we show an example that leverages the 𝗏𝖺𝗅𝗂𝖽U flexibility.

5 Application I: Mazza’s Isomorphism

In his work on connecting the meta-theory of λ-calculus with the notion of metric completion, Mazza [22] establishes an isomorphic translation between standard λ-calculus (using the lterm  syntax in §3.2) and a (uniform affine) infinitary λ-calculus (the iterm syntax from §3.3). We show our formalization using MrBNF of some key constructions in his development.

Recall that the type-variable for the lterm  type constructor must be infinite (since 𝖻𝗈𝗎𝗇𝖽lterm =0), and the one for iterm  must be uncountably infinite (since 𝖻𝗈𝗎𝗇𝖽iterm=1). In what follows, we fix these type-variables, namely fix a countable type var (a copy of nat) and an uncountable type ivar; we will refer to the elements of ivar as ivariables. We will simply write lterm  instead of varlterm  and iterm instead of ivariterm.

Following Mazza, we choose a countable set 𝖲𝗉𝗋:(vardstream )set of distinct streams of variables called supervariables, having the property that any two are mutually disjoint: 𝑥𝑠,𝑦𝑠𝖲𝗉𝗋.𝗌𝗌𝖾𝗍𝑥𝑠𝗌𝗌𝖾𝗍𝑦𝑠=. The intention is restricting the λ-iterms to only use these as bindings. Moreover, we choose a function 𝗌𝗉𝗋:var(vardstream )set for which 𝖻𝗂𝗃_𝖻𝖾𝗍𝗐𝗌𝗉𝗋(𝖴𝖭𝖨𝖵:varset)𝖲𝗉𝗋 holds, i.e., 𝗌𝗉𝗋 is a bijection between variables and supervariables; we write 𝗌𝗉𝗋1:(vardstream )setvar for its inverse. We refer to the elements of natlist as positions, and choose a bijection 𝗇𝖺𝗍𝖮𝖿:natlistnat. For p:natlist and n:nat, pn denotes the concatenation of p and [n]. According to Mazza’s definition, the finitary-to-infinitary translation should be a function __:lterm natlistiterm given by: (1) 𝖵𝗋xp=𝗂𝖵𝗋((𝗌𝗉𝗋x)𝗇𝖺𝗍𝖮𝖿p); (2) 𝖫𝗆xtp=𝗂𝖫𝗆(𝗌𝗉𝗋x)tp; and (3) 𝖠𝗉t1t2p=𝗂𝖠𝗉t1p0(t2p1,t2p2,t2p3,).

The intuition is that every variable x in the original term is duplicated in the translation into countably many ivariable “copies” of it sourced from its corresponding supervariable, 𝗌𝗉𝗋x. The positions make sure that the copies located in different parts of the resulting iterm are distinct, thus ensuring that the iterm is affine. Indeed, in the recursive case for application, we see that the position p grows with different numbers appended to the arguments of infinitary application, which ensures disjointness in conjunction with choosing the particular “copy” based on this position counter (𝗇𝖺𝗍𝖮𝖿p) when reaching the 𝖵𝗋-leaves. Correspondingly, abstraction over a variable is translated to abstraction over its supervariable, i.e., over all its “copies”.

Moreover, to describe the image of this translation, Mazza defines the notion of renaming equivalence expressed as the relation :itermitermbool defined inductively in Fig. 1. It relates two λ-iterms t and t just in case they (i) have the same (𝗂𝖵𝗋,𝗂𝖫𝗆,𝗂𝖠𝗉)-structure (as trees), (ii) only use supervariables in binders, (iii) at the leaves have variables appearing in the same supervariable, and (iv) for both t and t all the subterms that form the righthand side of an application are mutually renaming equivalent. Then uniformity of an iterm, which will characterize the translation’s image, is self-renaming-equivalence: 𝗎𝗇𝗂𝖿𝗈𝗋𝗆t=(tt).

𝑥𝑠𝖲𝗉𝗋{x,x}𝗌𝗌𝖾𝗍𝑥𝑠𝗂𝖵𝗋x𝗂𝖵𝗋xiVr 𝑥𝑠𝖲𝗉𝗋tt𝗂𝖫𝗆xst𝗂𝖫𝗆xstiLm

ttt1,t2.{t1,t2}𝗌𝗌𝖾𝗍𝑡𝑠𝗌𝗌𝖾𝗍𝑡𝑠t1t2𝗂𝖠𝗉t𝑡𝑠𝗂𝖠𝗉t𝑡𝑠iAp

Figure 1: Renaming equivalence relation.

We aim to define a function satisfying clauses (1)–(3) above, with (2) formally written as 𝗂𝖠𝗉t1p0(𝗌𝗆𝖺𝗉t2p_(𝗇𝖺𝗍𝗌𝖥𝗋𝗈𝗆 1)) where, for any n, 𝗇𝖺𝗍𝗌𝖥𝗋𝗈𝗆n denotes the stream of naturals starting from n. To turn these clauses into a formal definition, we deploy our recursor for lterm , which requires also indicating the desired interaction between the to-be-defined function with permutation and free-variables. Upon analysis, we converge to (4) t[σ]p=tp[𝗏𝟤𝗂𝗏σ] and (5) 𝗌𝗉𝗋1`𝗍𝗈𝗎𝖼𝗁𝖾𝖽tp𝖥𝖵t, where 𝗏𝟤𝗂𝗏σ (read “variable to ivariable”) converts σ:𝑣𝑎𝑟𝑣𝑎𝑟, via 𝗌𝗉𝗋, into a supervariable-preserving function; namely, for any y𝑖𝑣𝑎𝑟 such that y appears in some (necessarily unique) supervariable 𝑥𝑠, we define 𝗏𝟤𝗂𝗏σy:𝑖𝑣𝑎𝑟 as (𝗌𝗉𝗋(σ(𝗌𝗉𝗋1𝑥𝑠)))i for the unique i such that 𝑥𝑠i=y; and 𝗍𝗈𝗎𝖼𝗁𝖾𝖽t is the set of all supervariables that are touched by (the free variables of) t, namely {𝑥𝑠𝖲𝗉𝗋𝗌𝗌𝖾𝗍𝑥𝑠𝖥𝖵t}.

Equation (4) above is seen to be intuitive if we remember that the translation sends variables to supervariables, which means that bijections σ between variables naturally correspond to bijections between supervariables, hence (thanks to the supervariables being mutually disjoint) to supervariable-structure preserving bijections between ivariables; therefore indeed (A) applying a bijection on variables and then translating should be the same as (B) first translating and then applying this corresponding bijection of its ivariable “copies” in the translation. As for the above inclusion (5), we obtained it by adjunction from 𝗍𝗈𝗎𝖼𝗁𝖾𝖽tp𝗌𝗉𝗋`𝖥𝖵t, which is again intuitive if we think in terms of the variable-supervariable correspondence.

Clauses (1)–(5) give us a structure on the intended codomain of __, natlistiterm, using the recipe sketched at the end of §3.2. However, for these to give us an lterm -model, we must restrict the codomain to include only those functions f:natlistiterm whose image consists of renaming-equivalent items only – otherwise the model properties do not hold; this is not suprising, since Mazza’s translation’s goal is to produce uniform iterms. We therefore employ the codomain-relativized recursion discussed at the end of §4.4, obtaining:

Prop 7.

There exists a unique function __:lterm natlistiterm such that clauses (1)–(5) hold, and in addition p,q.tptq; in particular, p.𝗎𝗇𝗂𝖿𝗈𝗋𝗆tp.

For the opposite translation _ (from infinitary back to finitary terms), Mazza writes equations that in our notation look as follows, restricting the domain to uniform iterms:
(1) 𝗂𝖵𝗋𝑥𝑠i=𝖵𝗋(𝗌𝗉𝗋1𝑥𝑠); (2) 𝗂𝖫𝗆𝑥𝑠t=𝖫𝗆(𝗌𝗉𝗋1𝑥𝑠)t; (3) 𝗂𝖠𝗉t𝑡𝑠=𝖠𝗉t𝑡𝑠0.

With the help of a custom recursor for a suitable superset of the uniform iterms, again adding clauses for permuation and free variables, we are able to prove:

Prop 8.

There exists a function _:itermlterm  satisfying the above clauses (1)–(3) when resticted to uniform iterms (i.e., assuming 𝗂𝖵𝗋𝑥𝑠i, 𝗂𝖫𝗆𝑥𝑠t and 𝗂𝖠𝗉t𝑡𝑠 are uniform), and such that _’s restriction to uniform iterms is uniquely determined by these properties.

Mazza’s main result consists of a sequence of five statements, three of which refer to the syntactic component of the finitary-infinitary isomorphism.

Prop 9.

The following hold: (1) (Lemma 16 from [22]) ttt=t.
(2) (Thm. 19(1) from [22]) sp=s. (3) (Thm. 19(2) from [22]) 𝗎𝗇𝗂𝖿𝗈𝗋𝗆ttpt.

The theorem states that, for any position p, _ and _p give mutually inverse bijections between terms and equivalence classes of uniform iterms w.r.t. renaming equivalence. An additional lemma (omitted here) shows that the -representative produced by _p is affine (i.e., has no repeated variables). Thus the result establishes a syntactic isomorphism, up to renaming equivalence, between terms and uniform affine iterms. Mazza’s isomorphism also has an operational-semantics component, given by a theorem stating that _ and _p preserve β-reduction in both calculi in a manner that matches the number of reduction steps [22, Thm. 19(3,4)]. We omit this result here, but details can be found in our formalization [44].

6 Application II: POPLmark Challenge

We report on our solution to the Part 2 of the POPLmark challenge [4], which is concerned with the type soundness of System F<:; our formalization also solves Part 1, which is concerned with subtyping. We work with the System F<: types and terms we have introduced in §3.4. With our setup that enforces the variable convention in all induction proofs, the formalization becomes a routine exercise: we can follow the formalization document and transcribe auxiliary lemmas and their proofs. We show our core definitions of Part 2. Naturally, they rely on some definitions of Part 1 (notably the subtyping relation) and other basic infrastructure (notably contexts modeled as lists); we refer to our formalization for full details [44].

We start with the typing judgments for patterns and terms:

inductive pat_typing (" _ : _ _" [30,29,30] 30) where
PTPVr: " PVr x T : T , Inr x <: T"
| PTPRec: "nonrep_PRec PP labels PP = labels TT
(l P T. (l, P) PP (l, T) TT P : T Δ l)
PRec PP : TRec TT concat (map Δ (labelist TT))"
inductive typing ("_ _ : _" [30,29,30] 30) where
TVr: " Γ OK (Inr x, T) set Γ Γ Vr x : T"
| TLm: "Γ , Inr x <: T1 t : T2 Γ Lm x T1 t : Arr T1 T2"
| TAp: "Γ t1 : Arr T11 T12 Γ t2 : T11 Γ App t1 t2 : T12"
| TLmT: "Γ , Inl X <: T1 t : T2 Γ LmT X T1 t : All X T1 T2"
| TApT: "Γ t1 : All X T11 T12 proj_ctxt Γ T2 <: T11
Γ ApT t1 T2 : substT (TVr(X := T2)) T12"
| TSub: "Γ t : S proj_ctxt Γ S <: T Γ t : T"
| TRec: " Γ OK rel_lfset id (λt T. Γ t : T) XX TT Γ Rec XX : TRec TT"
| TProj: "Γ t : TRec TT (l, T) TT Γ Proj t l : T"
| TLet: "Γ t : T P : T Δ Γ , Δ u : U Γ Let P t u : U"

All rules follow closely the challenge description [4]. A few rules deserve some explanation. Rules TVr and TRec assume that the context is well-scoped (Γ𝙾𝙺); other rules preserve this invariant inductively. Rule TRec uses the relator rel_lfset to relate the values in two lfsets pairwise grouped by label. Rule TApT uses the parallel substitution function on System F<: types (substT), which we define using our recursor. Rule PTPRec assumes that the destructed record pattern is nonrepetitive (nonrep_PRec); it also writes for membership in lfset and constructs the resulting context by sorting the finite set of labels lexicographically (labelist).

We next define matching and the evaluation function for the terms.

inductive match for σ where
MPVr: "σ X = v match σ (PVr X T) v"
| MPRec: "nonrep_PRec PP labels PP labels VV
(l P v. (l, P) PP (l, v) VV match σ P v)
match σ (PRec PP) (Rec VV)"
definition "restrict σ A x = (if x A then σ x else Vr x)"
inductive step where
ApLm: "value v step (Ap (Lm x T t) v) (subst (Vr(x := v)) TVr t)"
| ApTLmT: "step (ApT (LmT X T t) T2) (subst Vr (TVr(X := T2)) t)"
| LetV: "value v match σ P v step (Let P v u) (subst (restrict σ (PV p)) TVr u)"
| ProjRec: "v values VV. value v (l, v) VV step (Proj (Rec VV) l) v"
| ApCong1: "step t t’ step (Ap t u) (Ap t’ u)"
| ApCong2: "value v step t t’ step (Ap v t) (Ap v t’)"
| ApTCong: "step t t’ step (ApT t T) (ApT t’ T)"
| ProjCong: "step t t’ step (Proj t l) (Proj t’ l)"
| RecCong: "step t t’ (l, t) XX step (Rec XX) (Rec (XXl := t’))"
| LetCong: "step t t’ step (Let P t u) (Let P t’ u)"

Similar to Berghofer’s solution [8], we use a matching predicate rather than a (partial) function that computes the matching substitution. The rules ApLm, ApTLmT, LetV, and ProjRec implement actual transitions; the remaining rules of step are congruence rules navigating to allowed redexes. We prefer this formulation over an equivalent context-based one, because the congruence steps are in all cases the easy cases of the involved induction proofs. The rules ApLm, ApTLmT, LetV use the parallel substitution subst, which we again define using our recursor. This substitution function acts both on term variables (first argument) and type variables (second argument). We then prove the main results: progress and preservation.

lemma progress: " t : T value t (t’. step t t’)"
lemma preservation: : "Γ t : T step t t’ Γ t’ : T"

The proofs are canonical following the challenge description [4]. We pervasively use binding-aware induction on our datatypes but also on the shown inductive predicates, which has recently been developed in Isabelle by van Brügge et al. [43]. Occasionally we use induction even in places where a case distinction would have sufficed: this is because our tool lacks case distinction theorems following the variable convention. One omission in the challenge’s pen-and-paper proof, which has been also noted by Berghofer [8], is the following lemma, crucial for progress, about the existence of matching substitutions for well-typed patterns.

lemma pat_typing_ex_match: P : T Δ v : T value v σ. match σ P v

7 Conclusion

MrBNF is a new definitional package in Isabelle/HOL for defining binding-aware datatypes. It follows a modular approach to datatypes relying on the notion of MRBNF as infrastructure to refer to free, bound, and recursive occurrences in a syntax declaration. It comprises 20 000 lines of Standard ML. While some of its usability edges are still rough, our case studies suggest that MrBNF can be a cornerstone in mechanized developments, pushing the boundaries of nominal techniques. We are currently proceeding to polish MrBNF’s rough edges, which involves providing a high-level interface to the recursor, automating the non-repetitiveness construction (linear_type), and providing variable-avoiding case distinction rules and a proof method to apply them effectively. At the same time, we are extending MrBNF’s scope to binding-aware codatatypes, which will have applications such as Böhm trees [6].

References

  • [1] Submitted solutions to the POPLmark challenge. https://www.seas.upenn.edu/˜plclub/poplmark/, accessed March 22, 2025, 2005.
  • [2] Guillaume Ambal, Sergueï Lenglet, and Alan Schmitt. HOπ in Coq. J. Autom. Reason., 65(1):75–124, 2021. doi:10.1007/S10817-020-09553-0.
  • [3] Michael Ashley-Rollman, Karl Crary, and Robert Harper. Group from CMU’s solution to the POPLmark challenge. https://www.seas.upenn.edu/˜plclub/poplmark/cmu.html, accessed March 22, 2025, 2005.
  • [4] Brian E. Aydemir, Aaron Bohannon, Matthew Fairbairn, J. Nathan Foster, Benjamin C. Pierce, Peter Sewell, Dimitrios Vytiniotis, Geoffrey Washburn, Stephanie Weirich, and Steve Zdancewic. Mechanized metatheory for the masses: The POPLMark challenge. In Joe Hurd and Thomas F. Melham, editors, TPHOLs 2005, volume 3603 of LNCS, pages 50–65. Springer, 2005. doi:10.1007/11541868_4.
  • [5] David Baelde, Kaustuv Chaudhuri, Andrew Gacek, Dale Miller, Gopalan Nadathur, Alwen Tiu, and Yuting Wang. Abella: A system for reasoning about relational specifications. J. Formaliz. Reason., 7(2):1–89, 2014. doi:10.6092/ISSN.1972-5787/4650.
  • [6] Hendrik Pieter Barendregt. The lambda calculus – its syntax and semantics, volume 103 of Studies in logic and the foundations of mathematics. North-Holland, 1985.
  • [7] Jesper Bengtson and Joachim Parrow. Formalising the pi-calculus using nominal logic. Log. Methods Comput. Sci., 5(2), 2009. URL: http://arxiv.org/abs/0809.3960.
  • [8] Stefan Berghofer. A solution to the POPLMark challenge using de Bruijn indices in Isabelle/HOL. J. Autom. Reason., 49(3):303–326, 2012. doi:10.1007/S10817-011-9231-4.
  • [9] Stefan Berghofer and Christian Urban. A head-to-head comparison of de Bruijn indices and names. In Alberto Momigliano and Brigitte Pientka, editors, LFMTP 2006, volume 174 of ENTCS, pages 53–67. Elsevier, 2006. doi:10.1016/J.ENTCS.2007.01.018.
  • [10] Jasmin Christian Blanchette, Lorenzo Gheri, Andrei Popescu, and Dmitriy Traytel. Bindings as bounded natural functors. Proc. ACM Program. Lang., 3(POPL):22:1–22:34, 2019. doi:10.1145/3290335.
  • [11] Jasmin Christian Blanchette, Johannes Hölzl, Andreas Lochbihler, Lorenz Panny, Andrei Popescu, and Dmitriy Traytel. Truly modular (co)datatypes for Isabelle/HOL. In Gerwin Klein and Ruben Gamboa, editors, ITP 2014, volume 8558 of LNCS, pages 93–110. Springer, 2014. doi:10.1007/978-3-319-08970-6_7.
  • [12] Jasmin Christian Blanchette, Andrei Popescu, and Dmitriy Traytel. Cardinals in Isabelle/HOL. In Gerwin Klein and Ruben Gamboa, editors, ITP 2014, volume 8558 of LNCS, pages 111–127. Springer, 2014. doi:10.1007/978-3-319-08970-6_8.
  • [13] Joachim Breitner. Formally proving a compiler transformation safe. In Ben Lippmeier, editor, Haskell Symposium 2015, pages 35–46. ACM, 2015. doi:10.1145/2804302.2804312.
  • [14] Matthias Brun and Dmitriy Traytel. Generic authenticated data structures, formally. In John Harrison, John O’Leary, and Andrew Tolmach, editors, ITP 2019, volume 141 of LIPIcs, pages 10:1–10:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.ITP.2019.10.
  • [15] Arthur Charguéraud. The locally nameless representation. J. Autom. Reason., 49(3):363–408, 2012. doi:10.1007/s10817-011-9225-2.
  • [16] Zaynah Dargaye and Xavier Leroy. Mechanized verification of CPS transformations. In Nachum Dershowitz and Andrei Voronkov, editors, LPAR 2007, volume 4790 of LNCS, pages 211–225. Springer, 2007. doi:10.1007/978-3-540-75560-9_17.
  • [17] N.G de Bruijn. Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem. Indagationes Mathematicae (Proceedings), 75(5):381–392, 1972. doi:10.1016/1385-7258(72)90034-0.
  • [18] Murdoch Gabbay and Andrew M. Pitts. A new approach to abstract syntax with variable binding. Formal Aspects Comput., 13(3-5):341–363, 2002. doi:10.1007/s001650200016.
  • [19] Robert Harper, Furio Honsell, and Gordon D. Plotkin. A framework for defining logics. In LICS 1987, pages 194–204. IEEE Computer Society, 1987.
  • [20] Gérard P. Huet. Residual theory in lambda-calculus: A formal development. J. Funct. Program., 4(3):371–394, 1994. doi:10.1017/S0956796800001106.
  • [21] Brian Huffman and Christian Urban. A new foundation for Nominal Isabelle. In Matt Kaufmann and Lawrence C. Paulson, editors, ITP 2010, volume 6172 of LNCS, pages 35–50. Springer, 2010. doi:10.1007/978-3-642-14052-5_5.
  • [22] Damiano Mazza. An infinitary affine lambda-calculus isomorphic to the full lambda-calculus. In LICS 2012, pages 471–480. IEEE Computer Society, 2012. doi:10.1109/LICS.2012.57.
  • [23] Conor McBride and James McKinna. Functional pearl: I am not a number—I am a free variable. In Henrik Nilsson, editor, Haskell Workshop 2004, pages 1–9. ACM, 2004. doi:10.1145/1017472.1017477.
  • [24] Alberto Momigliano, S.J. Ambler, and R.L. Crole. A comparison of formalizations of the meta-theory of a language with variable bindings in Isabelle. In TPHOLs 2001, Supplemental Proceedings, pages 267–282, 2001. URL: https://www.inf.ed.ac.uk/publications/online/0046/b267.pdf.
  • [25] Tobias Nipkow. More Church-Rosser proofs. J. Autom. Reason., 26(1):51–66, 2001. doi:10.1023/A:1006496715975.
  • [26] Michael Norrish. Recursive function definition for types with binders. In Konrad Slind, Annette Bunker, and Ganesh Gopalakrishnan, editors, TPHOLs 2004, volume 3223 of LNCS, pages 241–256. Springer, 2004. doi:10.1007/978-3-540-30142-4_18.
  • [27] Michael Norrish and René Vestergaard. Proof pearl: De Bruijn terms really do work. In Klaus Schneider and Jens Brandt, editors, TPHOLs 2007, volume 4732 of LNCS, pages 207–222. Springer, 2007. doi:10.1007/978-3-540-74591-4_16.
  • [28] Lawrence C. Paulson. The foundation of a generic theorem prover. J. Autom. Reason., 5(3):363–397, 1989. doi:10.1007/BF00248324.
  • [29] Lawrence C. Paulson. A mechanised proof of Gödel’s incompleteness theorems using Nominal Isabelle. J. Autom. Reason., 55(1):1–37, 2015. doi:10.1007/s10817-015-9322-8.
  • [30] Frank Pfenning and Conal Elliott. Higher-order abstract syntax. In Richard L. Wexelblat, editor, PLDI 1988, pages 199–208. ACM, 1988. doi:10.1145/53990.54010.
  • [31] Frank Pfenning and Carsten Schürmann. System description: Twelf – A meta-logical framework for deductive systems. In Harald Ganzinger, editor, CADE 1999, volume 1632 of LNCS, pages 202–206. Springer, 1999. doi:10.1007/3-540-48660-7_14.
  • [32] Brigitte Pientka and Jana Dunfield. Beluga: A framework for programming and reasoning with deductive systems (system description). In Jürgen Giesl and Reiner Hähnle, editors, IJCAR 2010, volume 6173 of LNCS, pages 15–21. Springer, 2010. doi:10.1007/978-3-642-14203-1_2.
  • [33] Andrew M. Pitts. Locally nameless sets. Proc. ACM Program. Lang., 7(POPL):488–514, 2023. doi:10.1145/3571210.
  • [34] Andrei Popescu. Nominal recursors as epi-recursors. Proc. ACM Program. Lang., 8(POPL):425–456, 2024. doi:10.1145/3632857.
  • [35] Tom Ridge and James Margetson. A mechanically verified, sound and complete theorem prover for first order logic. In Joe Hurd and Thomas F. Melham, editors, TPHOLs 2005, volume 3603 of LNCS, pages 294–309. Springer, 2005. doi:10.1007/11541868_19.
  • [36] Steven Schäfer, Tobias Tebbi, and Gert Smolka. Autosubst: Reasoning with de Bruijn terms and parallel substitutions. In Christian Urban and Xingyuan Zhang, editors, ITP 2015, volume 9236 of LNCS, pages 359–374. Springer, 2015. doi:10.1007/978-3-319-22102-1_24.
  • [37] Natarajan Shankar. A mechanical proof of the Church-Rosser theorem. J. ACM, 35(3):475–522, 1988. doi:10.1145/44483.44484.
  • [38] Kathrin Stark. Mechanising syntax with binders in Coq. PhD thesis, Saarland University, Saarbrücken, Germany, 2020. URL: https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/28822.
  • [39] Kathrin Stark, Steven Schäfer, and Jonas Kaiser. Autosubst 2: reasoning with multi-sorted de Bruijn terms and vector substitutions. In Assia Mahboubi and Magnus O. Myreen, editors, CPP 2019, pages 166–180. ACM, 2019. doi:10.1145/3293880.3294101.
  • [40] Dawit Legesse Tirore, Jesper Bengtson, and Marco Carbone. A sound and complete projection for global types. In Adam Naumowicz and René Thiemann, editors, ITP 2023, volume 268 of LIPIcs, pages 28:1–28:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.ITP.2023.28.
  • [41] Dmitriy Traytel, Andrei Popescu, and Jasmin Christian Blanchette. Foundational, compositional (co)datatypes for higher-order logic: Category theory applied to theorem proving. In LICS 2012, pages 596–605. IEEE Computer Society, 2012. doi:10.1109/LICS.2012.75.
  • [42] Christian Urban and Cezary Kaliszyk. General bindings and alpha-equivalence in Nominal Isabelle. Log. Methods Comput. Sci., 8(2), 2012. doi:10.2168/LMCS-8(2:14)2012.
  • [43] Jan van Brügge, James McKinna, Andrei Popescu, and Dmitriy Traytel. Barendregt convenes with Knaster and Tarski: Strong rule induction for syntax with bindings. Proc. ACM Program. Lang., 9(POPL):57:1–57:32, 2025. doi:10.1145/3704893.
  • [44] Jan van Brügge, Andrei Popescu, and Dmitriy Traytel. Implementation of MrBNF and case studies presented in this paper, 2025. doi:10.5281/zenodo.15756655.
  • [45] Jérôme Vouillon. A solution to the POPLMark challenge based on de Bruijn indices. J. Autom. Reason., 49(3):327–362, 2012. doi:10.1007/S10817-011-9230-5.