A Natural Language Formalization of Perfectoid Rings in ℕaproche

Koepke, Peter

doi:10.4230/LIPIcs.ITP.2025.6

A Natural Language Formalization of Perfectoid Rings in $\mathbb{N}$ aproche

Peter Koepke

University of Bonn, Germany

Abstract

This paper describes an experiment to formalize sophisticated mathematics in the $\mathbb{N}$ aproche proof assistant which uses natural language input and a first-order internal logic. We view this as a contribution to the ongoing discussion whether formal systems for research mathematics require complex, computer-orientated type systems or whether approaches closer to traditional mathematical practices are possible. The formalization also explores the limits of the current $\mathbb{N}$ aproche system and avenues for further development.

Keywords and phrases:

formal mathematics, formalization, perfectoid rings, controlled natural language, Naproche

Copyright and License:

2012 ACM Subject Classification:

Theory of computation

\rightarrow

Logic and verification

Supplementary Material:

Software: https://github.com/naproche/FLib/blob/master/PerfectoidRings/perfectoidrings.ftl.tex
archived at

swh:1:dir:5b4c9f03b21265535b810b2b5b46fa05945bc44e

Text: https://github.com/naproche/FLib/blob/master/PerfectoidRings/perfectoidrings.ftl.pdf
archived at

swh:1:dir:5b4c9f03b21265535b810b2b5b46fa05945bc44e

Funding:

The formalization was completed during the Trimester Prospects of Formal Mathematics at the Hausdorff Research Institut for Mathematics, funded by the Deutsche Forschungsgemeinschaft under Germany’s Excellence Strategy – EXC-2047/1 – 390685813.

DOI:

10.4230/LIPIcs.ITP.2025.6

Event:

16th International Conference on Interactive Theorem Proving (ITP 2025)

Editors:

Yannick Forster and Chantal Keller

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

The $\mathbb{N}$ aproche proof assistant (Natural Proof Checking, [11]) accepts natural language input texts and checks them for logical correctness. The system goes back to the System for Automated Deduction (SAD) [12, 6], which was discontinued in 2008. $\mathbb{N}$ aproche inherited the controlled natural language ForTheL (Formula Theory Language, [13]) as its input language which is designed to closely approximate natural mathematical language. $\mathbb{N}$ aproche translates input documents into first-order statements that are assembled into prover tasks and exported to provers like E [17] or Vampire [7]. Strong automated theorem proving allows wider proof steps, ideally comparable to natural reasoning steps in informal proofs. We use a LaTeX format so that formalizations can be typeset as usual mathematical documents.

Motivated by the celebrated Lean formalization of perfectoid spaces by K. Buzzard, J. Commelin, and P. Massot [9] we took on the challenge of formalizing perfectoid rings, which are main components of perfectoid spaces. The formalization in $\mathbb{N}$ aproche, or rather its LaTeX printout, is a readable 45 page document resembling a published mathematical paper, including explanatory text on mathematical and formalization details and literature references. The document is self-contained and provides all necessary preliminaries about natural numbers, prime numbers, groups, rings, and topological structures. It then follows parts of the Lean formalization up to perfectoid rings, which amounts to about 20% of the formalization of perfectoid spaces.

The natural language definition of perfectoid rings by P. Scholze in [16] reads:

Definition 3.1 A Tate ring $R$ is perfectoid if $R$ is complete, uniform, i.e. $R^{o}\subset R$ is bounded, and there exists a pseudo-uniformizer $\varpi\in R$ such that $\varpi^{p}|p$ in $R^{o}$ and the Frobenius map

$\Phi:R^{o}/\varpi\rightarrow R^{o}/\varpi^{p}:x\mapsto x^{p}$

is an isomorphism.

In the Lean formalization [8] this becomes

structure perfectoid_ring (R : Type) [Huber_ring R]
extends Tate_ring R : Prop :=
(complete : is_complete_hausdorff R)
(uniform : is_uniform R)
(ramified : $\exists\varpi$ : pseudo_uniformizer R, $\varpi$ ˆp | p in R°)
(Frobenius : surjective (Frob R°/p))

In contrast, the (LaTeX-rendering of the) final definition in the $\mathbb{N}$ aproche formalization comes close to the original.

Definition 293.

$R$ is perfectoid iff $R$ is complete and uniform and there exists a pseudouniformizer $\varpi$ of $R$ such that $\varpi^{p,R}|p^{[R]}$ in $R^{o}$ within $R$ and

$\Phi^{R}:R^{o}/\varpi\cong R^{o}/\varpi^{p,R}.$

Our formalization indicates the feasibility of natural formal mathematics in advanced modern mathematics, although a lot of further work will be required for wider applications. We list some conclusions from our experiment:

$\blacksquare$

The language ForTheL is sufficient to formalize rich mathematical structures like several kinds of rings with strong interactions between their algebraic and topological structures. In contrast, previous formalizations in $\mathbb{N}$ aproche were restricted to more basic mathematical objects like numbers, sets, and functions.
$\blacksquare$

$\mathbb{N}$ aproche suggests that the typing common in ordinary mathematical language can be modeled by types defined in first-order logic. Types are relations with a distinguished variable, all terms have to belong to some type, typed quantifiers are modeled by type guards. This approach allows the immediate deployment of strong first-order automated theorem provers.
$\blacksquare$

In mathematics, generous simplifications are used to avoid tedious details and to emphasize the deeper mathematical argument. There is, e.g., a tendency to identify structures with their underlying sets; this approach, though problematic in principle, is used within our self-contained formalization. If one wants to export our results into a more comprehensive mathematical universe, one would have to apply appropriate “theory morphisms” analogous to those in [20].
$\blacksquare$

The existing Lean formalization [8] was an essential guide for our text. We followed the build-up in the Lean formalization, and rewrote definitions and theorems directly into ForTheL. For the proofs, however, we wrote declarative proofs in ForTheL instead of Lean’s tactic proofs.
$\blacksquare$

The formalization has been a major stress test for the $\mathbb{N}$ aproche system. A continuation of the work towards perfectoid spaces appears possible in principle, but it would require to first strengthen the $\mathbb{N}$ aproche system to alleviate problems described in the final section.

The recent activities on autoformalizations with LLMs strongly suggest that the use of natural mathematical language in formal mathematics will expand. Direct LLM translations from free mathematical text into a prover input language are possible but hard to trace and “explain”. A system like $\mathbb{N}$ aproche could provide an interesting middle layer. One could translate free mathematical text into ForTheL by LLMs, and proof-check the translation in a system like $\mathbb{N}$ aproche. This would allow to check the informal / formal translation within natural English instead of inspecting formal logic statements. We are planning an LLM driven formalization and informalization project between free mathematical texts and $\mathbb{N}$ aproche, based on some preliminary ChatGPT experiments. The present formalization shows that ForTheL is suffiently expressive for certain kinds of advanced mathematics.

2 Running the formalization in Isabelle 2024

$\mathbb{N}$ aproche is a Haskell program which is being developed on Github [4]. It can be downloaded, compiled and run from the command line, using stack. In 2018, $\mathbb{N}$ aproche was integrated into the Isabelle Prover IDE [2, 19]. Formalizations can now be developed in the comfortable jEdit editor with continuous proof checking by E, Vampire and other ATPs delivered with Isabelle.

Our formalization has been developed in Isabelle 2024¹¹1Unfortunately, $\mathbb{N}$ aproche and the present formalization could not be included in Isabelle 2025 since fatal errors concerning naming of folders in some operating system were discovered too late. $\mathbb{N}$ aproche will be back in Isabelle 2026. which can be downloaded and installed from [1]. The formalization perfectoidrings.ftl.tex and also a readable pdfTeX output perfectoidrings.ftl.pdf can be downloaded from the $\mathbb{N}$ aproche homepage [5]. After opening perfectoidrings.ftl.tex in Isabelle/jEdit, the $\mathbb{N}$ aproche component of Isabelle 2024 will automatically start to check the file and give feedback via an output buffer and by underlining of checked text. Continuous $\mathbb{N}$ aproche proof checking will be active whilst editing the file.

Writing in a controlled natural language can be challenging. A controlled natural language will offer a certain number of alternative phrasing options, but it may be difficult to find out which of the many natural language alternatives are indeed accepted. Some speakers may interpret phrases differently from the “official” reading which can be inspected by mouse hovering in the Isabelle editor. Yet more user support would be welcome.

Note that $\mathbb{N}$ aproche proof checking is a major computational effort since each statement may generate several proof tasks that are sent to the ATP. Moreover, the proof search of ATPs under $\mathbb{N}$ aproche is quite unpredictable and may depend on accidental properties of the input text. The specific choice of intermediate steps in declarative proofs, sometimes found only after time-consuming experimentation, is crucial for prover success and also for the naturality of a proof text.

Checking the present formalization requires nearly half an hour, depending on available hardware. Sometimes it is necessary to increase the space and time bounds for the external ATP: In a formalization text the command

[memorylimit 6000]

increases the memory available to the ATP prover process to 6 megabytes for the further checking. Similarly, time bounds for ATP proving are increased to 10 seconds by

[timelimit 10]

The default timeout is

[timelimit 3]

One can switch between provers by the commands

[prover eprover]

and

[prover vampire]

with E being the default.

Note that formal parts of a $\mathbb{N}$ aproche formalization need to be enclosed in \begin{forthel} … \end{forthel} environments in the LaTeX source and are typeset against a grey background.

3 The mathematical structure of the $\mathbb{N}$ aproche formalization

The theory of perfectoid spaces [15] has transformed arithmetic algebraic geometry and was celebrated by a Fields Medal awarded to its creator P. Scholze in 2018 [14]. The formalization of perfectoid spaces in Lean was a milestone in bringing formal mathematics closer towards leading mathematical research. That formalization, however, is not readable as a normal mathematical text since Lean resembles and indeed is a computer language [3]. In contrast, the $\mathbb{N}$ aproche formalization of perfectoid rings, when compiled by LaTeX, looks and reads like a mathematical paper and can be understood by mathematicians without further computer support.

This section gives a mathematical overview of the $\mathbb{N}$ aproche formalization. Perfectoid rings are topological rings satisfying strong interactions between their algebraic and topological structure. To make the formalization logically self-contained we start from basic notions and properties as far as needed for our context. In principle, the formalization is a tower of definitions together with auxiliary lemmas for their justifications.

The first part of the formalization, called “Preliminaries”, introduces sets and functions. These are axiomatized in standard ways with axioms from the set theories of Zermelo-Fraenkel and Kelley-Morse. The set $\mathbb{N}$ of natural numbers is introduced by Peano-like axioms, including a general induction principle. We prove facts about divisibility, prime numbers and binomial coefficients which will later be relevant for Frobenius maps. Finiteness is handled by bijections with proper initial sections of $\mathbb{N}$ .

The general algebraic development proceeds from groups to topological rings where we make several simplifying assumptions: all groups considered will be commutative and written additively; all rings will be commutative containing a one.

In rings and subrings we define congruences modulo some element. We can canonically map the natural numbers into rings and also form powers of ring elements as iterated multiplications. Powers of sums of two ring elements satisfy standard binomial properties. If $p$ is a prime number the powers $(x+y)^{p}$ simplify to $x^{p}+y^{p}$ modulo $p$ . We conclude the preliminaries by defining topological spaces, topological groups and topological rings.

In the second part of the formalization, called “Perfectoid Rings”, various notions of boundedness are key.

Definition 240 (L 42).

Assume that $B$ is a subset of the topological ring $R$ . $B$ is bounded in $R$ iff for all neighborhoods $U$ of $0^{R}$ in $R$ there exists a neighborhood $V$ of $0^{R}$ in $R$ such that $v\cdot^{R}b\in U$ where $v\in V$ and $b\in B$ .

The development partially follows notes by Wedhorn [18] and the Lean formalization [8]. The above definition is taken directly from the pdf printout of our formalization; the identifier L 42 refers to line 42 in the file lean-perfectoid-spaces/src/power_bounded.lean in [9]. In the section on “Boundedness” we will discuss some differences between the formalizations in Lean and $\mathbb{N}$ aproche and what it means to follow the Lean formalization.

We continue with the notions of Huber rings and Tate rings. In a Huber ring, the ring topology is defined via the algebraic operations. A Tate ring is a Huber ring that has a topologically nilpotent element which is multiplicatively invertible.

Before the final definitions we consider the Frobenius map $x\mapsto x^{p}$ on $R$ and show that under some conditions it maps the subring $R^{o}$ of powerbounded elements into itself. If $\varpi$ is a pseudouniformizer of $R$ whose $p$ -th power $\varpi^{p}$ divides the prime number $p$ then the Frobenius map induces a homomorphism from the quotient $R^{o}/\varpi$ to the quotient $R^{o}/\varpi^{p}$ .

Finally, $R$ is defined to be perfectoid if the induced map is actually an isomorphism:

\Phi:R^{o}/\varpi\cong R^{o}/\varpi^{p}

For simplicity we express this crucial property via congruences mod $\varpi$ and mod $\varpi^{p}$ .

In classical algebra a field $F$ is perfect, if $F$ has characteristic $0$ , or $F$ has prime characteristic $p$ and the Frobenius map $x\mapsto x^{p}$ is an isomorphism. Such fields have very good “Galois properties”. The isomorphism property of perfectoid rings can also be used as a valuable symmetry in the theory. But our formalization ends here and does not extend beyond the very definition of perfectoid rings.

4 The language of mathematics and ForTheL

The natural language used in mathematical publications can be analysed by standard linguistic methods. A natural language sentence is built from phrases like noun phrases or verb phrases according to grammatical rules of the form:

sentence $\rightarrow$ nounPhrase verbPhrase

This phrase structure rule means that the concatenation of a noun phrase and a verb phrase yields a sentence. A complex phrase structure grammar may contain further rules for generating sentences. Phrase structure rules can be seen as rules for generating, but also for analysing and parsing phrases. In some systems the above rule can be directly used in constructing a sentence parser from a noun phrase parser and a verb phrase parse.

Symbolic mathematical material can also be seen as built from phrases. The compound term $f(x^{2})$ is obtained by inserting the “term phrase” $x^{2}$ into the …-slot of the “function phrase” $f($ … $)$ .

Let us indicate how a linguistic analysis could be carried out on the natural language example cited above:

Definition 3.1 A Tate ring $R$ is perfectoid if $R$ is complete, uniform, i.e. $R^{o}\subset R$ is bounded, and there exists a pseudo-uniformizer $\varpi\in R$ such that $\varpi^{p}|p$ in $R^{o}$ and the Frobenius map

$\Phi:R^{o}/\varpi\rightarrow R^{o}/\varpi^{p}:x\mapsto x^{p}$

is an isomorphism.

The sentence “A Tate ring …” is a definition of a predicate and consists of a “head” “A Tate ring $R$ is perfectoid”, the keyword “if”, and a mathematical statement “ $R$ is …”. With phrase categories predicateDef, predicateHead, and statement the sentence is analysed by the rule

predicateDef $\rightarrow$ predicateHead if statement

One can refine the analysis and break down the statement “ $R$ is …”: there is an “and”-chain of substatements, like the “simple statement” that “ $R$ is complete”, or a quantifier statement with a “head”: “there exists a pseudo-uniformizer”. This suggests the rule

statement $\rightarrow$ headStatement $|$ chainStatement

The symbol $|$ indicates a grammatical alternative. A statement can be (either) a headStatement or a chainStatement.
Continuing like this, we are led to a small grammar of the kind:

predicateDef $\rightarrow$ predicateHead if statement

statement $\rightarrow$ headStatement $|$ chainStatement

chainStatement $\rightarrow$ andChain $|$ …

andChain $\rightarrow$ primaryStatement $\{$ and primaryStatement $\}$

primaryStatement $\rightarrow$ simpleStatement $|$ thereIsStatement $|$ …

thereIsStatement $\rightarrow$ there (exists $|$ exist) notions

etc.
Dots … stand for further alternatives, brackets $\{\dots\}$ mean zero or more repetitions, and words in teletype font are terminal tokens that one is searching for in the input text. The alternative (exists $|$ exist) allows to choose the correct english form: “there exists $x$ ” versus “there exist $x, y, z$ ”.

This kind of linguistic analysis has been pursued for several decades in the Evidence Algorithm (EA) project [12]. It has led to a large phrase structure grammar that covers a significant part of mathematical language, including symbolic terms and formulas. ForTheL is the controlled language defined or accepted by this grammar [13]. The grammar of ForTheL extends beyond single sentences and also covers theorem and proof environments as well as whole mathematical texts.

ForTheL follows principles familiar from natural language:

$\blacksquare$

the language is built around nouns and noun phrases: “number”, “natural number”, “positive natural number”;
$\blacksquare$

noun phrases can be modified and connected by adjectives, verb phrases, “such that” sentences, relative sentences, …: “nonzero natural number”, “ $x$ divides $y$ ”;
$\blacksquare$

definite noun phrases correspond to constants and functions: “the number $3$ ”, “the square root of $9$ ”, “the square root of $x$ ”;
$\blacksquare$

indefinite noun phrases correspond to (dependent) types: “a natural number”, “a divisor of $x$ ”.

In ForTheL those “soft types” are called notions. Notions can be freely defined by ForTheL formulas. Each quantifier ranges over some notion since in ForTheL every variable needs to be explicitly or implicitely typed.

In $\mathbb{N}$ aproche, ForTheL has been extended to a simple LaTeX format so that formalizations can be typeset immediately. The formal content of a ForTheL formalization is contained in forthel environments. Outside those environments, arbitrary LaTeX content is accepted as “literary comment”.

ForTheL does not insist on correct English, although one can and should write grammatical English by providing and choosing appropriate forms. The command
[synonym successor/-s] identifies the words “successor” and “successors” already in the tokenization of the input. This allows to write grammatically correct texts even if singulars and plurals are not relevant for the logical checking. We import more singular/plural pairs from a library file.

[readtex arx/lang/vocabulary.ftl.tex]

Similarly a macros file provides linguistic alternatives for some phrases:

[readtex arx/lang/macros.ftl.tex]

This file contains linguistic identifications of the form:

  Let a member of $X$ stand for an element of $X$.
  Let $x$ belongs to $X$ stand for $x$ is an element of $X$.
  Let $X$ contains $x$ stand for $x$ is an element of $X$.
  etc.

The command “Let XXX stand for YYY” identifies the phrases “YYY” with “XXX”, also respecting variables. Here Let and stand for are ForTheL keywords.

5 Interpreting basic notions in first-order logic

The notion (or type) of natural numbers is introduced by Signature commands. Together with an induction principle to be stated later in the formalization, the natural numbers can be understood as the inductive type generated by $0$ and the $+1$ operation. We describe methods by which such higher-order concepts can be dealt with in first-order logic.

Signature 28.

A natural number is a mathematical object.

Let $n, m, k, l, i, j$ denote natural numbers.

Definition 29.

$\mathbb{N}$ is the collection of natural numbers.

Axiom 30 (Axiom of Infinity).

$\mathbb{N}$ is a set.

Signature 31.

$0$ is a natural number.

Let $x$ is nonzero stand for $x\neq 0$ .

Signature 32.

$1$ is a nonzero natural number.

Signature 33.

$m+n$ is a natural number.

Axiom 34.

If $n$ is a nonzero natural number then $n=m+1$ for some natural number $m$ .

In $\mathbb{N}$ aproche, ForTheL is interpreted in first-order logic where notions become type-guards. The first-order approach has several advantages: there is a widespread view and experience in mathematics that complex notions can be reduced to set theory and axiomatized in first-order logic. Moreover, there are strong automated theorem provers for first-order logic that can directly work on the first-order translations.

The translation of the signature statement

Signature 28.

A natural number is a mathematical object.

can be viewed by mouse-hovering over the statement:

[Translation] ...
forall v0 ((HeadTerm :: aNaturalNumber(v0)) implies aObject(v0))

Internally, notions like “natural number” and “object” are rendered as unary relation symbols aNaturalNumber and aObject. The translation says that the extension of the first notion is contained in the second.

This suggests a “first-order picture” of standard foundations. “Class”, “object”, and “set” are inbuilt notions equipped with the axiom that something is a set iff it is a class and an object at the same time. The signature command introducing natural numbers states that the extension of the natural numbers is contained within the objects, but we do not require that natural numbers are sets. Intuitively we rather want them to be objects of their own kind, and so we have drawn them disjoint from the sets. Inside the natural numbers we find constants $0$ and $1$ , and the operation $+$ operates on the natural numbers. The definition of $\mathbb{N}$ is translated into

forall v0 ((HeadTerm :: v0 = \mathbb{N}) iff (aClass(v0) and
forall v1 (aElementOf(v1,v0) iff (aNaturalNumber(v1) and aObject(v1)))))

This registers $\mathbb{N}$ as a class. The set-theoretic Axiom of Infinity strengthens this to $\mathbb{N}$ being a set.

6 Induction in first-order logic

Induction is usually conceived as a higher-order principle. In $\mathbb{N}$ aproche, instances of induction have to be invoked explicitly and are then schematically reduced to first-order logic. For this purpose, ForTheL provides a distinguished binary symbol $\prec$ which is understood as “inductively smaller”. Indeed we can denote the relation by the phrase “inductively smaller” due to a “stand for” command:

Let $m$ is inductively smaller than $n$ stand for $m\prec n$ .

Consider the following example taken directly from the $\mathbb{N}$ aproche formalization.

Let $a, b$ denote natural numbers.

Lemma 29.

For all nonzero natural numbers $n, a, b$ if $n|a*b$ and $n$ and $a$ are coprime then $n$ divides $b$ .

Proof by induction on $a*b$ .

… ∎

Theorem 30 (Euclids Lemma).

Let $p$ be a prime number and $p|m*n$ . Then $p|m$ or $p|n$ .

Lemma 29 is proved by induction which is signaled by giving the proof method as a parameter to the proof environment: \begin{proof}[by induction on $a * b$].

The “by induction” method reduces the task of proving a universal statement “for all $n,a,b\ P(n,a,b)$ ” to proving a related and easier to prove “inheritance property”:

“for all $n, a, b$ : if $P(n^{\prime},a^{\prime},b^{\prime})$ holds for all $n^{\prime},a^{\prime},b^{\prime}$ with $a^{\prime}*b^{\prime}$ inductively smaller than $a*b$ then $P(n,a,b)$ holds.”

Initially, $\prec$ is an unspecified relation. To postulate induction over some concrete relation requires to embed that relation into $\prec$ . Thus induction along the standard order of natural numbers is made available by an axiom which embeds $<$ into $\prec$ :

Axiom 59.

If $m<n$ then $m$ is inductively smaller than $n$ .

Then the inductive proof of Lemma 29 can proceed by reducing the divisibility property at $a, b$ to some $a^{\prime},b^{\prime}$ where $a^{\prime}*b^{\prime}<a*b$ . Lemma 29 immediately implies Euclid’s important lemma on division by prime numbers in $\mathbb{N}$ .

7 A simple approach to structures

It is common to define structures as an underlying set with further structural elements that “belong” to that set: “a group is a set with …”. For simplicity one often “identifies” a structure with its underlying set. This naive approach works alright as long as one does not consider two different structures of the same kind on the same underlying set. In general one would need stronger structure mechanisms as provided in other proof systems.

In our formalization, additive groups, subgroups and rings are introduced as follows:

Signature 112.

A group is a set.

Let $G$ denote a group.

Signature 113.

$0^{G}$ is an element of $G$ .

Signature 114.

Let $x,y\in G$ . $x+^{G}y$ is an element of $G$ .

Signature 115.

Let $x\in G$ . $-^{G}x$ is an element of $G$ .

Axiom 116 (Associativity).

$(x+^{G}y)+^{G}z=x+^{G}(y+^{G}z)$ for all $x,y,z\in G$ .

Axiom 117.

$x+^{G}0^{G}=x$ for all $x\in G$ .

Axiom 118.

$x+^{G}(-^{G}x)=0^{G}$ for all $x\in G$ .

Axiom 119.

$x+^{G}y=y+^{G}x$ for all $x,y\in G$ .

Definition 123.

A subgroup of $G$ is a subset $H$ of $G$ such that $0^{G}\in H$ and for all $x,y\in H$

$x+^{G}y\in H$ and $-^{G}x\in H$ .

Signature 130.

A ring is an additive group.

Let $R$ denote a ring.

Signature 131.

$1^{R}$ is an element of $R$ such that $1^{R}\neq 0^{R}$ .

Signature 132.

Let $x,y\in R$ . $x\cdot^{R}y$ is an element of $R$ .

Axiom 133.

$(x\cdot^{R}y)\cdot^{R}z=x\cdot^{R}(y\cdot^{R}z)$ for all $x,y,z\in R$ .

Axiom 134.

$x\cdot^{R}1^{R}=x$ for all $x\in R$ .

Axiom 135 (Commutativity).

$x\cdot^{R}y=y\cdot^{R}x$ for all $x,y\in R$ .

Axiom 136 (Distributivity).

$(x+^{R}y)\cdot^{R}z=(x\cdot^{R}z)+^{R}(y\cdot^{R}z)$ for all $x,y,z\in R$ .

8 Hiding obvious parameters

Since $\mathbb{N}$ aproche does not provide automatic type elaboration the obvious group parameter $G$ has to be carried along in the LaTeX formalization. In the LaTeX output, however, we can hide it by some TeX hack:

\catcode‘^\active
\def^#1{\ifthenelse{\equal{#1}{G}} {\unskip} {\sp{#1}}}

The first command makes the symbol ˆ active as a possible macro name. The second line defines the macro ˆ{...}: if the argument is $G$ , it is discarded. Otherwise we treat it as a superscript. Since ˆ is used as a macro symbol, we generate the superscript instead by the LaTeX command \sp{ }. The definition of the macro requires to \usepackage{ifthen}. The group axioms then print out more concisely:

Axiom 116 (Associativity).

$(x+y)+z=x+(y+z)$ for all $x,y,z\in G$ .

Axiom 117.

$x+0=x$ for all $x\in G$ .

Axiom 118.

$x+(-x)=0$ for all $x\in G$ .

Axiom 119.

$x+y=y+x$ for all $x,y\in G$ .

9 Topological spaces, groups and rings

Topological spaces are defined using our naive approach to structures. The space $T$ is identified with its underlying set, and there is a predicate (or atomic formula) “ $X$ is open in $T$ ” for subsets of $T$ . We can express usual topological axioms and prove that the empty set $\emptyset$ is open in $T$ .

Signature 207.

A topological space is a set.

Let $T$ denote a topological space.

Signature 208.

Assume that $T$ is a topological space and $X$ is a subset of $T$ . $X$ is open in $T$ is an atom.

Definition 209.

An open subset of $T$ is a subset of $T$ that is open in $T$ .

Axiom 210.

$T$ is open in $T$ .

Axiom 211.

Let $X$ be an open subsets of $T$ and $Y$ be an open subset of $T$ . Then $X\cap Y$ is open in $T$ .

Axiom 212.

Let $Z$ be a set such that every element of $Z$ is an open subset of $T$ . Then $\bigcup Z$ is open in $T$ .

Lemma 213.

$\emptyset$ is open in $T$ .

Proof . $\emptyset=\bigcup\emptyset$ . ∎

A topological group has a continuous addition operation and a topological ring moreover has a continuous multiplication. Note that in formalizing mathematics one often has to choose between different approaches which from a mathematical standpoint are considered to be essentially the same. One can define a general notion of continuous function between topological spaces, introduce addition and multiplication as functions of type $T\times T\rightarrow T$ and then postulate that they are continuous functions. Instead we treat the $+$ and $\cdot$ -operations as first-order function symbols and express continuity specifically for those. In the $\mathbb{N}$ aproche context this is not just an arbitrary choice, but it can improve the efficiency of proof search, since the ring operations are immediately available to the first-order ATP as function symbols of the first-order theory.

Let $R$ denote a ring that is a topological space.

We first define the continuity of ring multiplication locally and then introduce the global notion.

Definition 237.

Let $x,x^{\prime}\in R$ . $R$ is multiplicatively continuous at $x$ and $x^{\prime}$ iff for every neighborhood $U$ of $x\cdot^{R}x^{\prime}$ in $R$ there are subsets $V,V^{\prime}$ of $R$ such that $V$ is a neighborhood of $x$ in $R$ and $V^{\prime}$ is a neighborhood of $x^{\prime}$ in $R$ and for all $v\in V$ for all $v^{\prime}\in V^{\prime}$ $v\cdot^{R}v^{\prime}\in U$ .

Definition 238.

$R$ is multiplicatively continuous iff $R$ is multiplicatively continuous at $x$ and $x^{\prime}$ for all $x,x^{\prime}\in R$ .

Definition 239 (Topological rings).

A topological ring is a ring $R$ such that $R$ is a topological group and $R$ is multiplicatively continuous.

10 Boundedness

The Boundedness chapter of the formalization corresponds to the file power_bounded.lean in the Lean formalization of perfectoid spaces [8]. It contains the theory of topologically nilpotent, bounded, and power-bounded elements and subsets of topological rings $R$ .

Definition 240 (L 42).

Assume that $B$ is a subset of $R$ . $B$ is bounded in $R$ iff for all neighborhoods $U$ of $0$ in $R$ there exists a neighborhood $V$ of $0$ in $R$ such that $v\cdot b\in U$ where $v\in V$ and $b\in B$ .

Definition 251 (L 179).

Let $r$ be an element of $R$ . $r$ is powerbounded in $R$ iff $\{{r}^{n,R}\mid n\in\mathbb{N}\}$ is bounded in $R$ .

The proof of the following lemma uses that a given element $a$ of $R$ is powerbounded in $R$ . For some neighborhood $U$ of $0$ in $R$ one needs to “take” another neighborhood $V$ of $0$ such that $V\cdot\{{a}^{n,R}\mid n\in\mathbb{N}\}\subseteq U$ . The latter property is expressed as $v\cdot b{{}^{n}}\in U$ where $v\in V$ and $n$ is a natural number. The “take” command is justified by proving the existence of such a $V$ , which requires the ATP to dig through several quantifiers and definitions. On the computer used for the formalization this required raising the timelimit to 10 seconds. The next “take” even needed [timelimit 15]. Afterwards the limit is reset to its default of 3 seconds. On other machines, other timeouts may be required.

Lemma 256 (L 248).

Let $a, b$ be elements of $R$ that are powerbounded in $R$ . Then $a\cdot b$ is powerbounded in $R$ .

Proof . Let $U$ be a neighborhood of $0$ in $R$ . [timelimit 10] Take a neighborhood $V$ of $0$ in $R$ such that $v\cdot b{{}^{n}}\in U$ where $v\in V$ and $n$ is a natural number. [timelimit 15] Take a neighborhood $W$ of $0$ in $R$ such that $w\cdot a{{}^{n}}\in V$ where $w\in W$ and $n$ is a natural number. [timelimit 3]

(1) $w\cdot(a\cdot b){{}^{n}}\in U$ where $w\in W$ and $n$ is a natural number.

Proof. Let $w\in W$ and $n$ be a natural number. $w\cdot a{{}^{n}}\in V$ .

$w\cdot(a\cdot b){{}^{n}}=w\cdot(a{{}^{n}}\cdot b{{}^{n}})=(w\cdot a{{}^{n}})% \cdot b{{}^{n}}\in U$ .

qed. ∎

Lemma 256 follows the corresponding lemma in Lean on line 248 of
lean-perfectoid-spaces/src/power_bounded.lean in [9].

lemma mul (a b : R)
  (ha : is_power_bounded a) (hb : is_power_bounded b) :
  is_power_bounded (a * b) :=
begin
  rw singleton at ha hb,
  refine subset _ (monoid.closure (union ha hb)),
  rw [set.singleton_subset_iff, monoid.mem_closure_union_iff],
  refine a,_,b,_,rfl; exact monoid.subset_closure (set.mem_singleton _)
end

Note that the statement of Lemma 256 can be seen as a direct translation of the Lean lemma. The proof, however, follows traditional mathematical lines and is quite different from the cryptic Lean code.

The formalization goes through several variants of boundedness and their properties. As intuitively expected finite sets are bounded. This is proved by induction on the size of the finite set. Under certain conditions, the set $R{{}^{o}}$ of powerbounded elements in a ring $R$ is a subring. $R$ is called uniform, if $R{{}^{o}}$ is bounded itself. A ring element is topologically nilpotent if its set of powers “converges to zero”:

Definition 270 (L 30).

Let $r$ be an element of $R$ . $r$ is topologically nilpotent in $R$ iff for all neighborhoods $U$ of $0$ in $R$ there exists a natural number $N$ such that $r{{}^{n}}\in U$ for all natural numbers $n$ such that $n>N$ .

A topologically nilpotent ring element is also powerbounded.

11 Huber and Tate rings

The formalization uses the original definition of f-adic rings by R. Huber [10]:

Definition 276.

A Huber ring is a topological ring $R$ such that for some subset $U$ of $R$ and some finite subset $T$ of $U$ $\{{U}^{n,R}\mid n\in\mathbb{N}\}$ is a fundamental system of neighborhoods of $R$ and $T\ast U=U\ast U\subseteq U$ .

Lemma 277.

Let $R$ be a Huber ring. Then $R{{}^{o}}$ is a subring of $R$ .

Definition 278.

A pseudouniformizer of $R$ is a unit in $R$ that is topologically nilpotent in $R$ .

Definition 281.

A Tate ring is a Huber ring that has a pseudouniformizer.

12 Frobenius maps and perfectoid rings

Normally we would use (set-theoretic) quotients by principal ideals in $R{{}^{o}}$ for the definition of perfectoid rings, and then Frobenius-style maps on those quotient rings. Instead we define a global map $\Phi$ , $x\mapsto x{{}^{p}}$ and consider whether it induces maps between such quotients. This can be expressed by calculations modulo the generators of the ideals. We introduce the Frobenius Map as a function symbol of the logic, dependent on the ring $R$ and a fixed prime number $p$ .

Let $R$ denote a Huber ring.

Signature 282.

$p$ is a prime number.

Definition 283.

Let $x\in R$ . $\Phi(x)=x{{}^{p}}$ .

Lemma 284.

Let $x\in R{{}^{o}}$ . Then $\Phi(x)\in R{{}^{o}}$ .

The function $\Phi$ is called a Frobenius map. It maps the subring of power-bounded elements into itself.

A perfectoid ring requires the Frobenius map to be an isomorphism between certain quotients of $R{{}^{o}}$ . To simplify matters, we introduce the notation $\Phi:S/a\cong T/b$ with LaTeX source

$\Phiˆ{R} : S / a \cong T / b$

Although the notation suggests that $S$ and $T$ are subrings, that one forms quotients by the elements $a$ and $b$ , and that $\Phi$ (or some induced map) is an isomorphism between the quotients, it is simply defined as a property of congruences in the parameters $S, a, T, b$ .

Definition 291.

Let $S,T\subseteq R$ . Let $a\in S$ and $b\in T$ . $\Phi:S/a\cong T/b$ iff (for every $x,y\in S$ if $\Phi(x)\equiv_{T}\Phi(y)$ mod $b$ within $R$ then $x\equiv_{S}y$ mod $a$ within $R$ ) and (for every $z\in T$ there exists $w\in S$ such that $z\equiv_{T}\Phi(w)$ mod $b$ within $R$ ).

Finally we are getting to perfectoid rings in $\mathbb{N}$ aproche:

Let $R$ denote a Tate ring.

Lemma 292.

Let $R$ be complete and $\varpi$ be a pseudouniformizer of $R$ . Then $\varpi,\varpi{{}^{p}}$ do not divide $1$ in $R{{}^{o}}$ within $R$ .

In this case the quotients $R{{}^{o}}/\varpi$ and $R{{}^{o}}/\varpi{{}^{p}}$ are well-defined rings, and one can define:

Definition 293.

$R$ is perfectoid iff $R$ is complete and uniform and there exists a pseudouniformizer $\varpi$ of $R$ such that $\varpi{{}^{p}}|p{{}^{[R]}}$ in $R{{}^{o}}$ within $R$ and

\Phi:R{{}^{o}}/\varpi\cong R{{}^{o}}/\varpi{{}^{p}}.

13 Strengthening the $\mathbb{N}$ aproche system

The formal definition of perfectoid rings in a readable and proof-checked mathematical text has been a veritable tour de force. Further work on $\mathbb{N}$ aproche should aim at reducing the user’s technical effort and free resources for the mathematical work. It should also reduce the computational demands of proving. Let us mention some problems and realistically attainable important aspects.

$\blacksquare$

In the current $\mathbb{N}$ aproche system there is a marked difference between some elegant axiomatic arguments and the inefficiency of symbolic calculations. In our formalization, the derivation of the standard binomial formula in rings

Theorem 201.

Let $x,y\in R$ . For all natural numbers $n$ we have

$(x+y){{}^{n}}={\sum}_{i=0}{{}^{{n}}}{n\choose i}x{{}^{n-i}}y{{}^{i}}.$

provides an especially awkward example within our formalization. All symbolic transfor¸mations are proved from axioms, instead of being justified by some ring-theoretic calculus, and they require several minutes of prover time. It will be necessary to at least leverage the term-rewriting potential of the equational prover E.
$\blacksquare$

Often the ATP proof search takes unnatural and inefficient routes from the standpoint of a mathematical expert. Therefore one needs better control over the deployed ATPs instead of using them just as black boxes.
$\blacksquare$

Working on a formalization requires a lot of re-checking of already checked text. The rudimentary caching mechanisms of $\mathbb{N}$ aproche should be improved so that the superposition proofs found by E or Vampire would be stored directly in some auxiliary file which should be consulted before sending out new prover tasks to the ATP. This is similar to using sledgehammer in Isabelle/HOL and storing sledgehammer results directly with the formalization.
$\blacksquare$

After several years of experience with exemplary formalizations basic notions and properties should be made available in a checked standard library with efficient import mechanisms. Most of the preliminary material in the present formalization should ideally be put into such a library.
$\blacksquare$

The $\mathbb{N}$ aproche approach should be integrated into other proof systems. This would require clear interfaces for exchanging parse trees or prover tasks and answers. To achieve this, further modularization of the $\mathbb{N}$ aproche software is necessary.
$\blacksquare$

Most processes within the $\mathbb{N}$ aproche system are translations between various languages, from (controlled) natural language to the prover input language TPTP. These days one must obviously examine whether Large Language Models can do such translations just as well or even better.

References

[1] The Isabelle 2024 homepage. https://isabelle.in.tum.de/website-Isabelle2024/.
[2] The Isabelle homepage. https://isabelle.in.tum.de/.
[3] The Lean homepage. https://lean-lang.org/.
[4] The $\mathbb{N}$ aproche Github repository. https://github.com/naproche/naproche.
[5] The $\mathbb{N}$ aproche homepage. https://naproche.github.io.
[6] The SAD homepage. http://nevidal.org/sad.en.html.
[7] The Vampire homepage. https://vprover.github.io/.
[8] Kevin Buzzard, Johan Commelin, and Patrick Massot. The Lean formalization of perfectoid spaces. https://leanprover-community.github.io/lean-perfectoid-spaces/.
[9] Kevin Buzzard, Johan Commelin, and Patrick Massot. Formalising perfectoid spaces. In Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs, POPL ’20, pages 299–312. ACM, January 2020. doi:10.1145/3372885.3373830.
[10] Roland Huber. Continuous Valuations. Mathematische Zeitschrift, (212):455–477, 1993.
[11] Adrian De Lon, Peter Koepke, Anton Lorenzen, Adrian Marti, Marcel Schütz, and Makarius Wenzel. The Isabelle/Naproche Natural Language Proof Assistant. In CADE 28, LNCS 12699, pages 614–624, 2021. doi:10.1007/978-3-030-79876-5_36.
[12] Alexander Lyaletski and Konstantin Verchinine. Evidence Algorithm and System for Automated Deduction: A Retrospective View. In Intelligent Computer Mathematics, LNCS 6167, pages 411–426, 2010. doi:10.1007/978-3-642-14128-7_35.
[13] Andrei Paskevich. The syntax and semantics of the ForTheL language. http://nevidal.org/download/forthel.pdf, 2007.
[14] Michael Rapoport. The work of Peter Scholze - Laudation for Scholze’s Fields medal 2018, 2022. arXiv:1909.07222.
[15] Peter Scholze. Perfectoid Spaces. Publications Mathématiques de l’IHÉS, (116):245–313, 2012.
[16] Peter Scholze. Etale cohomology of diamonds, 2022. arXiv:1709.07343.
[17] Stephan Schulz. The E Theorem Prover. http://wwwlehre.dhbw-stuttgart.de/˜sschulz/E/E.html.
[18] Torsten Wedhorn. Etale cohomology of diamonds, 2019. arXiv:1910.05934.
[19] Makarius Wenzel. Interaction with Formal Mathematical Documents in Isabelle/PIDE. In Intelligent Computer Mathematics, CICM 2019, LNCS 11617, pages 1–15, 2019. doi:10.1007/978-3-030-23250-4_1.
[20] William M. Farmer. Theory Morphisms in Church’s Type Theory with Quotation and Evaluation. In H.Geuvers, M. England, O. Hasan, F. Rabe, and O. Teschke, editors, Intelligent Computer Mathematics. CICM 2017, LNCS 10383, pages 147–162, 2017. doi:10.1007/978-3-319-62075-6_11.

[bib.bib1] [1] The Isabelle 2024 homepage. https://isabelle.in.tum.de/website-Isabelle2024/.

[bib.bib2] [2] The Isabelle homepage. https://isabelle.in.tum.de/.

[bib.bib3] [3] The Lean homepage. https://lean-lang.org/.

[bib.bib4] [4] The $\mathbb{N}$ aproche Github repository. https://github.com/naproche/naproche.

[bib.bib5] [5] The $\mathbb{N}$ aproche homepage. https://naproche.github.io.

[bib.bib6] [6] The SAD homepage. http://nevidal.org/sad.en.html.

[bib.bib7] [7] The Vampire homepage. https://vprover.github.io/.

[bib.bib8] [8] Kevin Buzzard, Johan Commelin, and Patrick Massot. The Lean formalization of perfectoid spaces. https://leanprover-community.github.io/lean-perfectoid-spaces/.

[bib.bib9] [9] Kevin Buzzard, Johan Commelin, and Patrick Massot. Formalising perfectoid spaces. In Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs, POPL ’20, pages 299–312. ACM, January 2020. doi:10.1145/3372885.3373830.

[bib.bib10] [10] Roland Huber. Continuous Valuations. Mathematische Zeitschrift, (212):455–477, 1993.

[bib.bib11] [11] Adrian De Lon, Peter Koepke, Anton Lorenzen, Adrian Marti, Marcel Schütz, and Makarius Wenzel. The Isabelle/Naproche Natural Language Proof Assistant. In CADE 28, LNCS 12699, pages 614–624, 2021. doi:10.1007/978-3-030-79876-5_36.

[bib.bib12] [12] Alexander Lyaletski and Konstantin Verchinine. Evidence Algorithm and System for Automated Deduction: A Retrospective View. In Intelligent Computer Mathematics, LNCS 6167, pages 411–426, 2010. doi:10.1007/978-3-642-14128-7_35.

[bib.bib13] [13] Andrei Paskevich. The syntax and semantics of the ForTheL language. http://nevidal.org/download/forthel.pdf, 2007.

[bib.bib14] [14] Michael Rapoport. The work of Peter Scholze - Laudation for Scholze’s Fields medal 2018, 2022. arXiv:1909.07222.

[bib.bib15] [15] Peter Scholze. Perfectoid Spaces. Publications Mathématiques de l’IHÉS, (116):245–313, 2012.

[bib.bib16] [16] Peter Scholze. Etale cohomology of diamonds, 2022. arXiv:1709.07343.

[bib.bib17] [17] Stephan Schulz. The E Theorem Prover. http://wwwlehre.dhbw-stuttgart.de/˜sschulz/E/E.html.

[bib.bib18] [18] Torsten Wedhorn. Etale cohomology of diamonds, 2019. arXiv:1910.05934.

[bib.bib19] [19] Makarius Wenzel. Interaction with Formal Mathematical Documents in Isabelle/PIDE. In Intelligent Computer Mathematics, CICM 2019, LNCS 11617, pages 1–15, 2019. doi:10.1007/978-3-030-23250-4_1.

[bib.bib20] [20] William M. Farmer. Theory Morphisms in Church’s Type Theory with Quotation and Evaluation. In H.Geuvers, M. England, O. Hasan, F. Rabe, and O. Teschke, editors, Intelligent Computer Mathematics. CICM 2017, LNCS 10383, pages 147–162, 2017. doi:10.1007/978-3-319-62075-6_11.

A Natural Language Formalization of Perfectoid Rings in ℕaproche

Abstract

Keywords and phrases:

Copyright and License:

2012 ACM Subject Classification:

Supplementary Material:

Funding:

DOI:

Event:

Editors:

Series and Publisher:

1 Introduction

Definition 293.

2 Running the formalization in Isabelle 2024

3 The mathematical structure of the ℕaproche formalization

Definition 240 (L 42).

4 The language of mathematics and ForTheL

5 Interpreting basic notions in first-order logic

Signature 28.

Definition 29.

Axiom 30 (Axiom of Infinity).

Signature 31.

Signature 32.

Signature 33.

Axiom 34.

Signature 28.

6 Induction in first-order logic

Lemma 29.

Theorem 30 (Euclids Lemma).

Axiom 59.

7 A simple approach to structures

Signature 112.

Signature 113.

Signature 114.

Signature 115.

Axiom 116 (Associativity).

Axiom 117.

Axiom 118.

Axiom 119.

Definition 123.

Signature 130.

Signature 131.

Signature 132.

Axiom 133.

Axiom 134.

Axiom 135 (Commutativity).

Axiom 136 (Distributivity).

8 Hiding obvious parameters

Axiom 116 (Associativity).

Axiom 117.

Axiom 118.

Axiom 119.

9 Topological spaces, groups and rings

Signature 207.

Signature 208.

Definition 209.

Axiom 210.

Axiom 211.

Axiom 212.

Lemma 213.

Definition 237.

Definition 238.

Definition 239 (Topological rings).

10 Boundedness

Definition 240 (L 42).

Definition 251 (L 179).

Lemma 256 (L 248).

Definition 270 (L 30).

11 Huber and Tate rings

Definition 276.

Lemma 277.

Definition 278.

Definition 281.

12 Frobenius maps and perfectoid rings

Signature 282.

Definition 283.

Lemma 284.

Definition 291.

Lemma 292.

Definition 293.

A Natural Language Formalization of Perfectoid Rings in $\mathbb{N}$ aproche

3 The mathematical structure of the $\mathbb{N}$ aproche formalization

13 Strengthening the $\mathbb{N}$ aproche system