Combining Generalization Algorithms in Regular Collapse-Free Theories

Ayala-Rincón, Mauricio; Cerna, David M.; Kutsia, Temur; Ringeissen, Christophe

doi:10.4230/LIPIcs.FSCD.2025.7

Combining Generalization Algorithms
in Regular Collapse-Free Theories

Mauricio Ayala-Rincón

Exact Sciences Institute, Universidade de Brasília, Brazil David M. Cerna

Institute of Computer Science, Czech Academy of Sciences, Prague, Czechia Temur Kutsia

Research Institute for Symbolic Computation, Johannes Kepler University, Linz, Austria Christophe Ringeissen

Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France

Abstract

We look at the generalization problem modulo some equational theories. This problem is dual to the unification problem: given two input terms, we want to find a common term whose respective two instances are equivalent to the original terms modulo the theory.

There exist algorithms for finding generalizations over various equational theories. We focus on modular construction of equational generalization algorithms for the union of signature-disjoint theories. Specifically, we consider the class of regular and collapse-free theories, showing how to combine existing generalization algorithms to produce specific solutions in these cases.

Additionally, we identify a class of theories that admit a generalization algorithm based on the application of axioms to resolve the problem. To define this class, we rely on the notion of syntactic theories, a concept originally introduced to develop unification procedures similar to the one known for syntactic unification. We demonstrate that syntactic theories are also helpful in developing generalization procedures similar to those used for syntactic generalization.

Keywords and phrases:

Generalization, Anti-unification, Equational theories, Combination

Funding:

Mauricio Ayala-Rincón: Brazilian National Council for Scientific and Technological Development (CNPq), Grant Universal 409003/21-2 and RG 313290/21-0.

David M. Cerna: Czech Science Foundation Grant 22-06414L; Cost Action CA20111 EuroProofNet.

Temur Kutsia: Austrian Science Fund (FWF), Project P 35530; Cost Action CA20111 EuroProofNet.

Copyright and License:

2012 ACM Subject Classification:

Theory of computation

\rightarrow

Equational logic and rewriting ; Computing methodologies

\rightarrow

Symbolic and algebraic manipulation

DOI:

10.4230/LIPIcs.FSCD.2025.7

Event:

10th International Conference on Formal Structures for Computation and Deduction (FSCD 2025)

Editors:

Maribel Fernández

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

The problem of generalization of two terms $s$ and $t$ asks to find a term $r$ such that $s$ and $t$ are substitution instances of $r$ . The generalizations of interest are those that maximally retain similarities between the given terms while abstracting their differences uniformly. In other words, they should be the least general among generalizations of the given terms. The generalization problem is also called the problem of anti-unification since it can be seen as a dual to unification: while unification focuses on making two expressions more specific by finding their common instance via unifiers, anti-unification abstracts them to a more general form.

The first generalization algorithms were developed in the 1970s [38, 41], motivated by the usefulness of this technique for inductive reasoning. Nowadays, application areas of anti-unification are quite diverse, including, e.g., automated program repair [8, 19, 44], software code or specification synthesis [21, 40, 49], code similarity detection and change analysis [48, 9, 33], learning-related tasks [18, 36, 39], natural language processing [3, 24, 45], indexing/compression [12, 37, 25], just to name a few. Generalization computation techniques have been investigated for various structures (see, e.g., the recent survey [16] for an overview), among them, for first-order equational theories, which is also the subject of this paper.

In equational anti-unification, syntactic equality is replaced by equality modulo the given equational theory. As a consequence, the given terms do not necessarily have a single least general generalization. Instead, problems are characterized by their minimal complete set of generalizations, which might be a singleton, a finite set, an infinite set, or might not exist at all due to a clash between the minimality and completeness requirements. Several authors considered (first-order) equational anti-unification, studying the problem in associative, commutative, associative-commutative, unital, idempotent, absorptive theories, semirings, theories that lead to regular congruence classes, etc. [1, 2, 4, 5, 11, 13, 14, 15, 31, 47].

A more general question is related to anti-unification in a combined equational theory: Is it possible to derive a generalization algorithm for a union of equational theories from the existing generalization algorithms for the component theories? This is called the combination problem for generalization algorithms and is very important for applications. To the best of our knowledge, our work is the first attempt to address the modular construction of generalization algorithms.

On the other hand, the combination problem has been studied quite intensively for unification; see, e.g., [7, 20, 22, 26, 28, 30, 42, 43, 46]. These modularity results impose certain restrictions on the component theories to guarantee the existence and good properties of the combined algorithm, the main restriction being that the theories are signature-disjoint. Considering regular collapse-free theories is also a classical restriction. In the class of regular collapse-free theories, the left-hand side and the right-hand side of any equational axiom are non-variable terms with exactly the same variables. A combination method is known for unification algorithms in any union of disjoint regular collapse-free theories [46], and another one exists for matching algorithms in any union of disjoint regular theories [35] where regular theories may have collapse axioms between a non-variable left-hand side and a variable right-hand side. It is important to notice that when we relax the restriction on equational theories, we may need to impose a stronger restriction on the kind of algorithms available in the component theories. Hence, the combination method introduced in [7] for the unification problem works in any union of disjoint theories. Still, as a counterpart, it assumes the existence of general unification procedures capable of dealing with free function symbols.

Our contributions

We consider the combination problem for generalization algorithms in the union of equational theories $E_{1}$ and $E_{2}$ over the signatures $\Sigma_{1}$ and $\Sigma_{2}$ and a set of free constants $C$ , such that:

$\blacksquare$

$\Sigma_{1}\cap\Sigma_{2}=\emptyset$ , i.e., the theories are signature-disjoint (and $E_{i}\cap C=\emptyset$ holds for $i=1,2$ , since $C$ consists of free constants);
$\blacksquare$

both $E_{1}$ and $E_{2}$ are regular collapse-free;
$\blacksquare$
for each $E_{i}$ , $i=1,2$ , there exists a complete generalization algorithm $\mathfrak{G}_{i}$ which
- –
  
  handles ground terms built over $\Sigma_{i}\cup C$ , and
- –
  
  for any given ground terms $t$ and $u$ returns a triple $(r,\phi,\psi)$ , where $r$ is a generalization of $(t,u)$ with respective solved substitutions $(\phi,\psi)$ , meaning that application of $(\phi,\psi)$ to $r$ does not lead to further (more specific) generalizations.

For such equational theories $E_{1}$ and $E_{2}$ , a combination algorithm for generalization problems in $E_{1}\cup E_{2}$ is designed, which relies on $E_{i}$ -generalization algorithms $\mathfrak{G}_{i}$ for $i=1,2$ , and its completeness is shown. This is our main result, which we further refine in two ways, as explained below.

1.

The class of finite theories (i.e., theories where every equivalence class is finite) satisfies the above-mentioned conditions, and we show that combined generalization problems in this class can be solved in a modular way. Furthermore, we provide a rule-based generalization algorithm for the subclass of finite syntactic theories. The notion of syntactic theories was originally introduced to develop unification procedures similar to the one known for syntactic unification (cf. [27, 29, 34]). We demonstrate that syntactic theories are also helpful in developing a rule-based generalization algorithm that extends syntactic generalization. This extension is obtained by introducing a new mutation rule where the resolvent axioms of the theory are applied to simplify the generalization problems.
2.

For theories that admit rule-based generalization algorithms working on configurations of a special form, we design a combination algorithm that uses the rules of the component algorithms (instead of using them as black-boxes), thus obtaining a combined rule-based algorithm with rules of the same form as the component algorithms. We call it the white-box combination method. This leads to a modular result: a combined algorithm obtained in such a way can be used as a component algorithm for further white-box combinations. This white-box combination method applies to any instance of the rule-based generalization algorithm introduced for finite syntactic theories.

Organization

After this introduction, Section 2 presents the required background. Then, Section 3 presents some results connecting equational generalization to free generalization, showing that equational generalization in finite theories is reducible to free generalization, and so in finite theories equational generalization is finitary. In Section 4, we discuss the combination of generalization for regular collapse-free theories. Then, Section 5 explores the class of finite syntactic theories by showing a rule-based generalization algorithm. Section 6 considers the problem of combining rule-based generalization algorithms. Finally, Section 7 concludes and briefly discusses possible future work.

2 Preliminaries

2.1 Terms and Substitutions

We consider a first-order alphabet consisting of a signature $\Sigma$ (set of fixed arity function symbols) and a set of variables $\mathcal{V}$ . The set of terms $\mathcal{T}(\Sigma,\mathcal{V})$ over $\Sigma$ and $\mathcal{V}$ is defined in the standard way: $t\in\mathcal{T}(\Sigma,\mathcal{V})$ iff $t$ is defined by the grammar $t:=x\mid f(t_{1},\ldots,t_{n})$ , where $x\in\mathcal{V}$ and $f\in\Sigma$ is an $n$ -ary symbol with $n\geq 0$ .

We denote arbitrary function symbols by $f, g, h$ , constants by $a, b, c$ , variables by $x, y, z, v$ , and terms by $s, t, r, u$ . The notions of term depth, term size, and a position in a term are defined in the standard way, see, e.g., [6]. By $t|_{p}$ , we denote the subterm of $t$ at position $p$ , and by $t[s]_{p}$ , a term obtained from $t$ by replacing the subterm at position $p$ with the term $s$ . For any position $p$ in a term $t$ (including the root position $\epsilon$ ), $t(p)$ is the symbol at position $p$ . The set of all variables in $t$ is denoted by $\mathit{Var}(t)$ . A term is called linear if no variable occurs in it more than once, and ground if $\mathit{Var}(t)=\emptyset$ .

A sequence of terms $t_{1},\dots,t_{n}$ may be written $\bar{t}$ when $n$ is clear from the context.

Given any signature $\Sigma$ and any finite set of constants $C$ such that $\Sigma\cap C=\emptyset$ , the signature $\Sigma\cup C$ is denoted by ${\Sigma}^{C}$ .

A substitution is a mapping from $\mathcal{V}$ to $\mathcal{T}(\Sigma,\mathcal{V})$ , which is the identity almost everywhere. We use the Greek letters $\sigma,\vartheta,\varphi$ to denote substitutions, except for the identity substitution, which is written as $\mathit{Id}$ . We represent substitutions using the usual set notation, e.g., $\sigma=\{x_{1}\mapsto t_{1},\dots,x_{n}\mapsto t_{n}\}$ where $\{x_{1},\dots,x_{n}\}$ is the domain of $\sigma$ , denoted by $\mathit{Dom}(\sigma)$ . Application of a substitution $\sigma$ to a term $t$ , denoted by $t\sigma$ , is defined as $x\sigma=\sigma(x)$ and $f(t_{1},\ldots,t_{n})\sigma=f(t_{1}\sigma,\ldots,t_{n}\sigma)$ . Substitution composition is defined as a composition of mappings. It is associative but not commutative, with $\mathit{Id}$ playing the role of the unit element. We write $\sigma\vartheta$ for the composition of $\sigma$ with $\vartheta$ .

2.2 Equational Theories

Let $E$ be a set of term pairs, i.e., elements of $\mathcal{T}(\Sigma,\mathcal{V})\times\mathcal{T}(\Sigma,\mathcal{V})$ . An equational $\Sigma$ -theory generated by $E$ , denoted by $=_{E}$ , is the least congruence relation on $\mathcal{T}(\Sigma,\mathcal{V})$ that is closed under substitution application and contains $E$ . We call $E$ a presentation of $=_{E}$ , and the elements of $E$ are called the axioms of $=_{E}$ , written as $l=r$ . If $t=_{E}u$ , we say that $u$ is equal modulo $E$ to $t$ . When $E=\emptyset$ , every term is equal only to itself, and the theory is called the empty or free theory.

It is common to slightly abuse the terminology in the literature by calling both $E$ and $=_{E}$ an equational theory. For a given $E$ , we denote by $\mathsf{sig}(E)$ the set of all function symbols occurring in $E$ .

A sequence of $E$ -equalities $t_{1}=_{E}u_{1},\dots,t_{n}=_{E}u_{n}$ may be written $\bar{t}=_{E}\bar{u}$ when $n$ is clear from the context.

An axiom $l=r$ is regular if $\mathit{Var}(l)=\mathit{Var}(r)$ . An axiom $l=r$ is collapse-free if $l$ and $r$ are non-variable terms. An axiom $l=r$ is shallow if variables can only occur at a position at depth at most $1$ in both $l$ and $r$ . An equational theory is regular (resp., collapse-free/shallow) if all its axioms are regular (resp., collapse-free/shallow). An equational theory $E$ is finite if, for each term $t$ , there are only finitely many terms $u$ such that $t=_{E}u$ . An equational theory $E$ is subterm collapse-free if there are no terms $t, u$ such that $t=_{E}u$ and $u$ is a strict subterm of $t$ . A subterm collapse-free theory is necessarily collapse-free, and a finite theory is necessarily regular and subterm collapse-free.

A theory $E$ is syntactic if it has a finite resolvent presentation $R P$ , defined as a finite set of axioms $R P$ such that each equality $t=_{E}u$ has an equational proof $t\leftrightarrow_{RP}^{*}u$ with at most one equational step $\leftrightarrow_{RP}$ applied at the root position. Note that any axiom in $R P$ can be applied in both directions: from the left to the right and from the right to the left. Any shallow theory is syntactic [17], and any collapse-free theory with finitary unification is syntactic [29].

One can easily check that $A=\{x*(y*z)=(x*y)*z\}$ (Associativity), $C=\{x*y=y*x\}$ (Commutativity), and $AC=A\cup C$ (Associativity-Commutativity) are regular and collapse-free. Moreover, $A$ , $C$ and $A C$ are finite and syntactic [34, 29], but only $C$ is shallow. Since $A$ , $C$ , and $A C$ are finite theories, they are also subterm collapse-free.

2.3 Combination of Equational Theories

The rest of the paper assumes that $E_{i}$ is a regular collapse-free $\Sigma_{i}$ -theory for $i=1,2$ , where $\Sigma_{1}$ and $\Sigma_{2}$ are disjoint signatures. For the sake of brevity, the combined signature $\Sigma_{1}\cup\Sigma_{2}$ is abbreviated to $\Sigma_{1,2}$ , and the combined theory $E_{1}\cup E_{2}$ is also denoted by $E_{1,2}$ .

A $\Sigma$ -rooted term $t$ is a term whose root symbol is in $\Sigma$ . Given any ${\Sigma}^{C}_{1,2}$ -term $t$ where $t$ is $\Sigma_{i}$ -rooted, an alien subterm of $t$ is any subterm $s$ of $t$ which is not $\Sigma_{i}$ -rooted and such that any proper superterm of $s$ in $t$ is $\Sigma_{i}$ -rooted. The set of alien subterms of $t$ is denoted by $Alien(t)$ . Given any term $t$ , its height of theory is formally defined as follows: $ht(t)=1+\max\{ht(t^{\prime})\leavevmode\nobreak\ |\leavevmode\nobreak\ t^{% \prime}\in Alien(t)\}$ . Given a finite set $T$ of ground ${\Sigma}^{C}_{1,2}$ -terms, we consider $Aliens=\bigcup_{t\in T}\mathit{Alien}(t)$ and a mapping $\pi:Aliens\rightarrow FC$ such that $F C$ is a set of fresh free constants ( $FC\cap C=\emptyset$ ) and for any $a,a^{\prime}\in Aliens$ , $\pi(a)=\pi(a^{\prime})$ iff $a=_{E_{1}\cup E_{2}}a^{\prime}$ . Conversely, we define $\pi^{-1}:FC\rightarrow Aliens$ as a mapping such that for any $a\in Aliens$ , $\pi^{-1}(\pi(a))=_{E_{1}\cup E_{2}}a$ . The $i$ -abstraction of any term $t\in T$ is denoted by $t^{\pi_{i}}$ and is inductively defined as follows:

$\blacksquare$

$c^{\pi_{i}}=c$ if $c\in C$ ,
$\blacksquare$

$(f(t_{1},\dots,t_{n}))^{\pi_{i}}=f(t_{1}^{\pi_{i}},\dots,t_{n}^{\pi_{i}})$ if $f\in\Sigma_{i}$ ,
$\blacksquare$

$(f(t_{1},\dots,t_{n}))^{\pi_{i}}=\pi(f(t_{1},\dots,t_{n}))$ if $f\in\Sigma_{j}$ for $j\neq i$ .

Lemma 1 (Correspondence for Equality [46]).

Let $E_{1}$ and $E_{2}$ be two signature-disjoint regular collapse-free theories. For $i=1,2$ and any ${\Sigma}^{C}_{i}$ -rooted terms $t$ and $u$ , $t=_{E_{1}\cup E_{2}}u$ iff $t^{\pi_{i}}=_{E_{i}}u^{\pi_{i}}$ .

Lemma 1 holds when the component theories $E_{1}$ and $E_{2}$ are regular collapse-free. In the general case where the component theories are arbitrary, we need to assume that $t$ and $u$ are in layer-reduced form [23] to have the same correspondence with their $i$ -abstractions.

2.4 Generalization Problems

Given an equational theory $E$ , a term $t$ is more general than $u$ modulo $E$ , or is an $E$ -generalization of $u$ , if there exists a substitution $\sigma$ such that $t\sigma=_{E}u$ . In such a case, we write $t\preceq_{E}u$ and also say the term $u$ is less general than $t$ modulo $E$ or that $u$ is an $E$ -instance of $t$ . The relation $\preceq_{E}$ is a quasi-order and generates the equivalence relation, denoted by $\simeq_{E}$ . The strict part of $\preceq_{E}$ is denoted by $\prec_{E}$ .

An equational generalization problem is formulated as follows:

Given:: an equational theory $E$ and two terms $t$ and $u$ ;
Find:: a term $r$ such that $r\preceq_{E}t$ and $r\preceq_{E}u$ ( $r$ is said to be an $E$ -generalization of $t$ and $u$ ).

Given an equational theory $E$ and two terms $t$ and $u$ , a set of terms $\mathcal{G}$ is called a complete set of $E$ -generalizations of $t$ and $u$ if it satisfies the following properties:

1.

Soundness: every element of $\mathcal{G}$ is an $E$ -generalization of $t$ and $u$ ,
2.

Completeness: for every $E$ -generalization $q$ of $t$ and $u$ there exists $r\in\mathcal{G}$ such that $q\preceq_{E}r$ .

The set $\mathcal{G}$ is called a minimal complete set of $E$ -generalizations of $t$ and $u$ , denoted $\mathit{mcsg}_{E}(t,u)$ if it, in addition, satisfies the following:

3.

Minimality: if there exist $r,q\in\mathcal{G}$ such that $r\preceq_{E}q$ , then $r=q$ .

The existence and cardinality of $\mathit{mcsg}_{E}$ define what is called the generalization type of an equational theory $E$ as follows: $E$ has the generalization type 0 (or is nullary) if there are two terms $t$ and $u$ such that $\mathit{mcsg}_{E}(t,u)$ does not exist. Otherwise, the generalization type of $E$ is

$\blacksquare$

unitary, if $\mathit{mcsg}_{E}(t,u)$ is a singleton for all $t$ and $u$ (in this case, the sole element of $\mathit{mcsg}_{E}(t,u)$ is called a least general $E$ -generalization of $t$ and $u$ and is denoted by $\mathit{lgg}_{E}(t,u)$ ),
$\blacksquare$

finitary, if $\mathit{mcsg}_{E}(t,u)$ is finite for all $t$ and $u$ , and there exists at least one pair of terms for which this set is not a singleton,
$\blacksquare$

infinitary, if there exist $t$ and $u$ for which $\mathit{mcsg}_{E}(t,u)$ is infinite.

In the rest of the paper, we consider generalization algorithms defined as terminating inference systems transforming anti-unification triples (AUTs, for short) written $x:t\triangleq u$ where $x$ is a variable, and $t$ and $u$ are the two ground terms to generalize. More precisely, we are focusing on rule-based systems working on configurations $\mathcal{C}$ of the form $A;S;\sigma$ , where $A$ and $S$ are sets of AUTs, called the active set and store of $\mathcal{C}$ , respectively, and $\sigma$ is a substitution, called the generalization component of $\mathcal{C}$ . From now on, $\mathcal{C}$ is said to be a PSS-configuration; it consists of a problem together with a store and a substitution. Given a set of AUTs $P=\bigcup_{i=1}^{n}\{x_{i}:t_{i}\triangleq u_{i}\}$ , we associate two substitutions: $\{x_{1}\mapsto t_{1},\allowbreak\dots,x_{n}\mapsto t_{n}\}$ is the left substitution of $P$ and $\{x_{1}\mapsto u_{1},\dots,x_{n}\mapsto u_{n}\}$ is the right substitution of $P$ . A set of AUTs may be written using the disjoint union symbol $\uplus$ to isolate a specific AUT from that set.

3 Equational Generalization vs Free Generalization

The following two theorems characterize equational generalizations with the help of free least general generalizations.

Theorem 2 (Free $\mathit{lgg}$ ’s versus $E$ -Generalizations).

Given an equational theory $E$ and two terms $t$ and $u$ , if a term $r$ is an $E$ -generalization of $t$ and $u$ , then there exist terms $t^{\prime}$ , $u^{\prime}$ , and $r^{\prime}$ such that $t^{\prime}=_{E}t$ , $u^{\prime}=_{E}u$ , $r^{\prime}$ is a free least general generalization of $t^{\prime}$ and $u^{\prime}$ , and $r\preceq_{\emptyset}r^{\prime}$ .

Proof.

From $r$ being an $E$ -generalization of $t$ and $u$ we get the existence of two substitutions $\sigma$ and $\vartheta$ such that $r\sigma=_{E}t$ and $r\vartheta=_{E}u$ . Take $t^{\prime}=r\sigma$ and $u^{\prime}=r\vartheta$ . Then we have $t^{\prime}=_{E}t$ and $u^{\prime}=_{E}u$ . Moreover, $r$ is a free generalization of $t^{\prime}$ and $u^{\prime}$ . Then there exists a free least general generalization $r^{\prime}$ of $t^{\prime}$ and $u^{\prime}$ , and we get $r\preceq_{\emptyset}r^{\prime}$ . $\hfill\blacktriangleleft$

Theorem 3 (Completeness and Finiteness through Free $\mathit{lgg}$ ’s).

Let $E$ be an equational theory, $t$ and $u$ be two terms, and $\mathcal{G}$ be the set of terms $\mathcal{G}_{E}(t,u)=\{\mathit{lgg}_{\emptyset}(t^{\prime},u^{\prime})\mid t^{% \prime}\in[t]_{E},\,u^{\prime}\in[u]_{E}\}$ . Then

1.

$\mathcal{G}_{E}(t,u)$ is a complete set of $E$ -generalizations of $t$ and $u$ .
2.

$\mathcal{G}_{E}(t,u)$ is not necessarily minimal.
3.

If $E$ is a finite theory, then $\mathcal{G}_{E}(t,u)$ is finite modulo variable renaming. Consequently, every finite theory $E$ has the $E$ -generalization type at most finitary.
4.

For any $E$ , if $[t]_{E}$ is finite, then $\mathcal{G}_{E}(t,u)$ (as well as $\mathcal{G}_{E}(u,t)$ ) is also finite modulo variable renaming.

Proof.

1.

Every element $r^{\prime}\in\mathcal{G}$ is an $E$ -generalization of $t$ and $u$ , because there exist substitutions $\sigma$ and $\vartheta$ such that $r^{\prime}\sigma=t^{\prime}=_{E}t$ and $r^{\prime}\vartheta=u^{\prime}=_{E}u$ . Thus, together with Theorem 2, we may conclude that for every $E$ -generalization $r$ of $t$ and $u$ there exists an $E$ -generalization $r^{\prime}$ of $t$ and $u$ such that $r^{\prime}\in\mathcal{G}$ and $r\preceq_{\emptyset}r^{\prime}$ , which implies the completeness of $\mathcal{G}$ .
2.

Let $E$ be the equational theory that asserts the commutativity of $f$ , and take $t=f(a,b)$ and $u=f(a,b)$ . Then $\mathcal{G}_{E}(t,u)=\{f(a,b),f(x,y)\}$ , because $f(a,b)=\mathit{lgg}_{\emptyset}(f(a,b),f(a,b))$ and $f(x,y)=\mathit{lgg}_{\emptyset}(f(a,b),f(b,a))$ . Obviously, $f(x,y)$ is more general than $f(a,b)$ and, hence, $\mathcal{G}_{E}(t,u)$ is not minimal.
3.

If $E$ is a finite theory, then $[t]_{E}$ and $[u]_{E}$ are finite sets. Free least general generalizations are unique (modulo variable renaming). Hence, $\mathcal{G}_{E}(t,u)$ is finite modulo renaming. Since $E$ -matching is finitary in any finite theory $E$ , $\mathcal{G}_{E}(t,u)$ can always be minimized, and we get that $E$ -generalization type in finite theories is at most finitary.
4.

First, note that the depth of a free generalization of two terms never exceeds the minimum of their depths, and such a generalization does not contain any function symbol that does not appear in both of them. Therefore, for a fixed $t^{\prime}\in[t]_{E}$ , the depth of $\mathit{lgg}_{\emptyset}(t^{\prime},u^{\prime})$ is bounded by the depth of $t^{\prime}$ , and the set of function symbols that potentially may appear in $\mathit{lgg}_{\emptyset}(t^{\prime},u^{\prime})$ is restricted to the set of function symbols of $t^{\prime}$ . Hence, only finitely many terms (over a fixed-arity alphabet like ours) can satisfy these conditions. Hence, whatever $u^{\prime}$ we take, the candidates for $\mathit{lgg}(t^{\prime},u^{\prime})$ may come only from a fixed finite (modulo variable renaming) set that depends on $t^{\prime}$ only: the elements of this set are terms whose free instance is $t^{\prime}$ . Therefore, for the given $t^{\prime}$ , the set of all terms $\{\mathit{lgg}_{\emptyset}(t^{\prime},u^{\prime})\mid u^{\prime}\in\mathcal{T}% (\mathcal{F},\mathcal{V})\}$ is finite. Since by assumption $[t]_{E}$ is finite, we get that $\mathcal{G}_{E}(t,u)$ is a finite union of finite sets, which implies that it is finite.

$\hfill\blacktriangleleft$

Example 4.

This example illustrates an infinite $\mathcal{G}_{E}$ for $E$ that is non-finite subterm-collapse free (the simplest class of non-finite theories according to [10]). Let $E=\{f(a,g(x))=f(a,x),f(b,g(x))=f(b,x)\}$ , $t=f(a,c)$ , and $u=f(b,d)$ . Then

\displaystyle\mathcal{G}_{E}(t,u)=\{f(x,y),f(x,g(y)),f(x,g(g(y)),\ldots\}.

In fact, this set contains an infinite chain

f(x,y)\prec_{E}f(x,g(y))\prec_{E}f(x,g(g(y))\prec_{E}\cdots.

4 Combined Generalization in Regular Collapse-Free Theories

In this section, we consider regular collapse-free theories $E$ . When $E$ is regular, the $E$ -equivalence class of any ground term only contains ground terms. Thus, any solution of an $E$ -matching problem necessarily maps a variable occurring in the problem to a ground term. As a consequence, the following property holds when $E$ is assumed to be regular: given any generalization $r$ of a pair of ground terms $(t,u)$ , any substitutions $\phi,\psi$ such that $r\phi=_{E}t$ and $r\psi=_{E}u$ satisfy that $x\phi$ and $x\psi$ are ground for any variable $x\in\mathit{Var}(r)$ . This assumption is useful in the following definition, where the considered substitutions are necessarily ground.

Definition 5 (Generalizations with Solved Instantiations).

Let $E$ be a regular $\Sigma$ -theory. Given any pair of ground ${\Sigma}^{C}$ -terms $(t,u)$ , an $E$ -generalization of $(t,u)$ with respective instantiations $(\phi,\psi)$ is a ${\Sigma}^{C}$ -term $r$ together with a pair of substitutions $(\phi,\psi)$ such that $\mathit{Var}(r)=\mathit{Dom}(\phi)=\mathit{Dom}(\psi)$ , $r\phi=_{E}t$ and $r\psi=_{E}u$ . Given any pair of ground ${\Sigma}^{C}$ -terms $(t,u)$ , a triple representing an $E$ -generalization of $(t,u)$ with solved instantiations (an $E$ -GSI triple of $(t,u)$ , for short) is any triple $(r,\phi,\psi)$ such that

$\blacksquare$

$r$ is an $E$ -generalization of $(t,u)$ with respective instantiations $(\phi,\psi)$ ,
$\blacksquare$

for any $x\in\mathit{Var}(r)$ , there is no non-variable $E$ -generalization of $(x\phi,x\psi)$ ,
$\blacksquare$

for any $x,y\in\mathit{Var}(r)$ , $x\phi=_{E}y\phi$ and $x\psi=_{E}y\psi$ implies $x=y$ ¹¹1This third condition may be dropped when considering linear generalizations..

A set $S T$ of $E$ -GSI triples of $(t,u)$ is said to be complete if $\bigcup_{(r,\phi,\psi)\in ST}\{r\}$ is a complete set of $E$ -generalizations of $(t,u)$ . We say that $E$ admits an algorithm computing a complete set of $E$ -generalizations with solved instantiations (GSI algorithm, for short) if there exists a computable function returning, for each input pair of ground ${\Sigma}^{C}$ -terms $(t,u)$ , a complete set of $E$ -GSI triples of $(t,u)$ denoted by $GSI_{E}(t,u)$ .

A GSI algorithm can be constructed via the repeated application of a generalization algorithm together with a matching algorithm. In the case of a finite theory, the generalization problem is finitary, and the matching problem as well. Consequently, this iteration can be implemented, and it is necessarily terminating in the case of a finite theory since the size of any generalization is necessarily bounded. Thus, we have that:

Lemma 6.

Any finite theory admits a GSI algorithm.

$\blacktriangleright$ Remark 7.

In addition to the sketch of the proof given in the paragraph preceding Lemma 6, notice that another proof can be obtained using Theorem 3 and the existence of a GSI algorithm for the empty theory.

Example 8.

In Definition 5, the notion of GSI triple is introduced for any regular theory $E$ . In particular, $E$ can be the non-finite regular theory from Example 4. Continuing with that example where $t=f(a,c)$ , $u=f(b,d)$ , and $\mathcal{G}_{E}(t,u)=\{f(x,y),f(x,g(y)),f(x,g(g(y)),\ldots\}$ , one can check that

GSI_{E}(t,u)=\{(r,\phi,\psi)\mid r\in\mathcal{G}_{E}(t,u),\phi=\{x\mapsto a,y% \mapsto c\},\psi=\{x\mapsto b,y\mapsto d\}\}.

Section 5 introduces another class of theories admitting a GSI algorithm. This class includes non-finite theories.

In general, for the theories we want to combine, we could imagine deriving GSI algorithms from existing rule-based generalization algorithms transforming PSS-configurations (see Section 6). Then, the resulting GSI algorithms can be combined directly as black-boxes, provided the theories are regular collapse-free.

Theorem 9 (Modular Property for GSI Algorithm).

The class of regular collapse-free theories admitting a GSI algorithm and with decidable equality is closed by disjoint union.

Proof.

The rule-based procedure given in Figure 1 can be shown terminating thanks to a complexity measure defined as follows: to each configuration $(A;S;\sigma)$ , we associate a pair $(mht(A),|S|)$ where $mht(A)$ is the multiset of heights of theory $\bigcup_{(x^{\prime}:t^{\prime}\triangleq u^{\prime})\in A}\{ht(t^{\prime}),ht% (u^{\prime})\}$ , and $|S|$ is the cardinality of $S$ . These pairs are ordered lexicographically, the first component being ordered by the multiset extension of the (Noetherian) ordering on positive integers, while the second component is ordered by the (Noetherian) ordering on positive integers. One can easily verify that $E_{i}$ -Gen and $E_{1,2}$ -Sol strictly decrease the first component of the pair, and $E_{1,2}$ -Mer does not increase the first component but strictly decreases the second one.

The normal forms with respect to the rule-based procedure are the configurations of the form $(\emptyset;S;\sigma)$ upon which $E_{1,2}$ -Mer does not apply. Indeed, any configuration $(A;S;\sigma)$ with $A\neq\emptyset$ is necessarily reducible by either $E_{i}$ -Gen or $E_{1,2}$ -Sol.

For any derivation starting with the input configuration $(\{x:t\triangleq u\},\emptyset,\mathit{Id})$ and leading to a final configuration $(\emptyset;S;\sigma)$ , we can associate a tuple $(x\sigma,\phi_{S},\psi_{S})$ where

$\blacksquare$

$\phi_{S}=\{y\mapsto t^{\prime}\leavevmode\nobreak\ |\leavevmode\nobreak\ y:t^{% \prime}\triangleq u^{\prime}\in S,y\in\mathit{Var}(x\sigma)\}$ ,
$\blacksquare$

$\psi_{S}=\{y\mapsto u^{\prime}\leavevmode\nobreak\ |\leavevmode\nobreak\ y:t^{% \prime}\triangleq u^{\prime}\in S,y\in\mathit{Var}(x\sigma)\}$ .

Then, the set of all such tuples defines a $GSI_{E_{1}\cup E_{2}}(t,u)$ as stated in Definition 5. This is a consequence of the following facts:

$\blacksquare$

${E_{i}}$ -Gen is sound and complete by Lemma 10 and Corollary 11 (see below),
$\blacksquare$

$E_{1,2}$ -Sol is sound and complete since $E_{1}$ and $E_{2}$ are collapse-free.

$\hfill\blacktriangleleft$

$E_{i}$ -Gen: Generalization in the $E_{i}$ -theory
$\{x:t\triangleq u\}\uplus A;S;\sigma\Longrightarrow(P\setminus S_{i})\cup A;S_% {i}\cup S;\sigma\{x\mapsto(r\pi^{-1})\},$

where

$\blacksquare$

$t(\epsilon)\in{\Sigma}^{C}_{i},u(\epsilon)\in{\Sigma}^{C}_{i}$ ,
$\blacksquare$

$(r,\phi,\psi)\in GSI_{E_{i}}(t^{\pi_{i}},u^{\pi_{i}})$ ,
$\blacksquare$

$P=\{y:(y\phi)\pi^{-1}\triangleq(y\psi)\pi^{-1}\leavevmode\nobreak\ |% \leavevmode\nobreak\ y\in\mathit{Var}(r)\}$ ,
$\blacksquare$

$S_{i}=\{y:t^{\prime}\triangleq u^{\prime}\leavevmode\nobreak\ |\leavevmode% \nobreak\ (y:t^{\prime}\triangleq u^{\prime})\in P,t^{\prime}(\epsilon)\in{% \Sigma}^{C}_{i},u^{\prime}(\epsilon)\in{\Sigma}^{C}_{i}\}$ .

$E_{1,2}$ -Sol: Solving in the combined theory
$\{x:t\triangleq u\}\uplus A;S;\sigma\Longrightarrow A;S\cup\{x:t\triangleq u\}% ;\sigma,$ if $t(\epsilon)\in\Sigma_{i}$ and $u(\epsilon)\in\Sigma_{3-i}$ .

$E_{1,2}$ -Mer: Merging in the combined theory
$\emptyset;\{x:t\triangleq u,x^{\prime}:t^{\prime}\triangleq u^{\prime}\}\uplus S% ;\sigma\Longrightarrow\emptyset;S\cup\{x:t\triangleq u\};\sigma\{x^{\prime}% \mapsto x\},$ if $t=_{E_{1}\cup E_{2}}t^{\prime}$ and $u=_{E_{1}\cup E_{2}}u^{\prime}$ .

Figure 1: Combined Generalization Procedure for Regular Collapse-Free Theories.

We now show the completeness of the algorithm presented in Figure 1 by considering any generalization of the given terms over the combined theory $E_{1}\cup E_{2}$ and showing that we can construct from it an $E_{i}$ -generalization, where $i$ depends on the root symbol, that uses the same variable instantiations as the combined generalization. In essence, our combination algorithm builds on the completeness of the algorithms for the individual theories and is thus complete as well.

The following technical lemma and corollary are used in the proof of Theorem 9.

Lemma 10 (Correspondence for Generalization).

Let $E_{1}$ and $E_{2}$ be two signature-disjoint regular collapse-free theories over respectively the signatures $\Sigma_{1}$ and $\Sigma_{2}$ , $t$ and $u$ any ${\Sigma}^{C}_{i}$ -rooted terms for some $i=1,2$ , and $r$ any $E_{1}\cup E_{2}$ -generalization $r$ of $(t,u)$ with respective instantiations $(\phi,\psi)$ . Consider a bijective mapping $\Pi_{V}:Alien(r)\rightarrow V$ where $V$ is a set of fresh variables and the following terms and substitutions:

$\blacksquare$

$r_{i}$ is the $i$ -pure term obtained from $r$ by replacing each $r^{\prime}\in Alien(r)$ by $\Pi_{V}(r^{\prime})$ .
$\blacksquare$
$\phi_{i}$ and $\psi_{i}$ are two substitutions such that $\mathit{Dom}(\phi_{i})=\mathit{Dom}(\psi_{i})=\mathit{Var}(r_{i})$ and defined as follows:
- –
  
  for any $x\in\mathit{Var}(r_{i})\setminus\mathit{Var}(r)$ , $x\phi_{i}=((\Pi_{V}^{-1}(x))\phi)^{\pi_{i}}$ and $x\psi_{i}=((\Pi_{V}^{-1}(x))\psi)^{\pi_{i}}$ ,
- –
  
  for any $x\in\mathit{Var}(r)$ , $x\phi_{i}=(x\phi)^{\pi_{i}}$ .

Then, $r_{i}$ is an $E_{i}$ -generalization of $(t^{\pi_{i}},u^{\pi_{i}})$ with respective instantiations $(\phi_{i},\psi_{i})$ , and for any $x\in\mathit{Var}(r_{i})\setminus\mathit{Var}(r)$ , $\Pi_{V}^{-1}(x)$ is an $E_{1}\cup E_{2}$ -generalization of $(x\phi_{i})\pi^{-1},(x\psi_{i})\pi^{-1})$ with respective instantiations $(\phi,\psi)$ .

Proof.

By assumption, $r\phi=_{E_{1}\cup E_{2}}t$ and $r\psi=_{E_{1}\cup E_{2}}u$ . By Lemma 1, $(r\phi)^{\pi_{i}}=_{E_{1}\cup E_{2}}t^{\pi_{i}}$ and $(r\psi)^{\pi_{i}}=_{E_{1}\cup E_{2}}u^{\pi_{i}}$ . Then, by definition of $r_{i}$ , $\phi_{i}$ and $\psi_{i}$ , we have $r_{i}\phi_{i}=(r\phi)^{\pi_{i}}$ and $r_{i}\psi_{i}=(r\psi)^{\pi_{i}}$ . Consequently, $r_{i}\phi_{i}=_{E_{1}\cup E_{2}}t^{\pi_{i}}$ and $r_{i}\psi_{i}=_{E_{1}\cup E_{2}}u^{\pi_{i}}$ .

For each $x\in\mathit{Var}(r_{i})\backslash\mathit{Var}(r)$ , by definition of $\phi_{i}$ and $\psi_{i}$ , we have that $(\Pi_{V}^{-1}(x)\phi)^{\pi_{i}}=x\phi_{i}$ and $(\Pi_{V}^{-1}(x)\psi)^{\pi_{i}}=x\psi_{i}$ . Applying $\pi^{-1}$ on these identities, we get $(\Pi_{V}^{-1}(x)\phi)^{\pi_{i}}\pi^{-1}=(x\phi_{i})\pi^{-1}$ and $(\Pi_{V}^{-1}(x)\psi)^{\pi_{i}}\pi^{-1}=(x\psi_{i})\pi^{-1}$ . By definition of $\pi^{-1}$ , $s=_{E_{1}\cup E_{2}}s^{\pi_{i}}\pi^{-1}$ for any term $s$ . Consequently, $(\Pi_{V}^{-1}(x))\phi=_{E_{1}\cup E_{2}}(x\phi_{i})\pi^{-1}$ and $(\Pi_{V}^{-1}(x))\psi=_{E_{1}\cup E_{2}}(x\psi_{i})\pi^{-1}$ . $\hfill\blacktriangleleft$

Corollary 11.

Let $E_{1}$ and $E_{2}$ be two signature-disjoint regular collapse-free theories over respectively the signatures $\Sigma_{1}$ and $\Sigma_{2}$ , and any ${\Sigma}^{C}_{i}$ -rooted terms $t, u$ where $i=1,2$ . Then, there is no non-variable $E_{1}\cup E_{2}$ -generalization of $(t,u)$ iff there is no non-variable $E_{i}$ -generalization of $(t^{\pi_{i}},u^{\pi_{i}})$ .

5 Rule-Based Generalization in Finite Syntactic Theories

In this section, we explore a class of equational theories admitting a rule-based generalization algorithm similar to the one known for the empty theory. For the class of finite syntactic theories, we construct a rule-based algorithm fully parametrized by the axioms of a resolvent presentation of a finite syntactic theory. A similar rule-based algorithm can be found in [27].

Example 12.

The class of finite syntactic theories includes $A$ , $C$ , $A C$ , finite shallow theories [17] such as $f(x)=g(x)$ , and permutative theories closed by paramodulation [32]. Notice that $\{f(a)=a\}$ and $\{f(f(x))=f(x)\}$ are syntactic but not finite theories.

Lemma 13.

For any finite syntactic theory $E$ with a resolvent presentation $R P$ , the inference system given in Figure 4 provides an $E$ -generalization algorithm, using the axioms of $R P$ to

$\blacksquare$

check $E$ -equality (Figure 2),
$\blacksquare$

solve $E$ -matching problems (Figure 3),
$\blacksquare$

control the construction of $E$ -generalizations.

Proof.

The rule-based procedure in Figure 4 terminates since $E$ is assumed to be finite. Indeed, the size of the generalizations associated with configurations is necessarily bounded, and so there are only finitely many applications of Mut. The rule Sol is sound and complete: when an AUT cannot be transformed by Mut or Dec, we cannot have a generalization rooted by a function symbol; a variable is the only possible way to get a generalization for this AUT. The soundness and completeness of Mut and Dec follow because $R P$ is a resolvent presentation of $E$ .

For any input PSS-configuration $(\{x:t\triangleq u\};\emptyset;\mathit{Id})$ where $t, u$ are ground ${\Sigma}^{C}$ -terms, the procedure terminates by computing finitely many PSS-configurations of the form $(\emptyset;S;\sigma)$ and the union of all the terms $x\sigma$ corresponds to a complete set of $E$ -generalizations of $t\triangleq u$ , since all the rules are sound and complete. $\hfill\blacktriangleleft$

$\blacktriangleright$ Remark 14.

For the inference system given in Figure 2, an input problem $\{s=t\}$ is such that $s=_{E}t$ iff all the normal forms reached by this inference system are the empty set.

For the inference system in Figure 3, the set of solutions of an input $E$ -matching problem $\{s\leq t\}$ consists of all the normal forms reached by this inference system corresponding to solved forms $\bigcup_{i=1}^{m}\{x_{i}\leq t_{i}\}$ where $x_{1},\dots,x_{m}$ are pairwise distinct variables.

Mut₌:

$\{f(\bar{t})=g(\bar{u})\}\uplus\Gamma\Longrightarrow\{\bar{t}=\bar{l}\theta\}\cup\Gamma$ where $f(\bar{l})=g(\bar{r})$ is a fresh renaming of an axiom in $R P$ , and $\theta$ is a substitution such that $\bar{r}\theta=_{RP}\bar{u}$ .

Dec₌:

$\{f(\bar{t})=f(\bar{u})\}\uplus\Gamma\Longrightarrow\{\bar{t}=\bar{u}\}\cup\Gamma$ where $f\in{\Sigma}^{C}$ .

Figure 2: Procedure for Checking Equality in a finite theory with a resolvent presentation

R P

.

Mut_≤: Mutation rule for matching
$\{f(\bar{t})\leq^{?}g(\bar{u})\}\uplus\Gamma\Longrightarrow\{\bar{t}\leq^{?}% \bar{l}\theta\}\cup\Gamma$ where $f(\bar{l})=g(\bar{r})$ is a fresh renaming of an axiom in $R P$ , and $\theta$ is a substitution such that $\bar{r}\theta=_{RP}\bar{u}$ .

Dec_≤: Decomposition rule for matching
$\{f(\bar{t})\leq^{?}f(\bar{u})\}\uplus\Gamma\Longrightarrow\{\bar{t}\leq^{?}% \bar{u}\}\cup\Gamma$ where $f\in{\Sigma}^{C}$ .

Mer_≤: Merging rule for matching
$\{x\leq^{?}t,x^{\prime}\leq^{?}t^{\prime}\}\uplus\Gamma\Longrightarrow\{x\leq^% {?}t\}\cup\Gamma$ if $t=_{RP}t^{\prime}$ .

Figure 3: Matching Procedure in a finite theory with a resolvent presentation

R P

.

Mut_LR: Left-Right Mutation rule for generalization
$\{x:f(\bar{t})\triangleq g(\bar{u})\}\uplus A;S;\sigma\Longrightarrow\{y:y% \theta\triangleq y\theta^{\prime}\leavevmode\nobreak\ |\leavevmode\nobreak\ y% \in\bar{y}\}\cup A;S;\sigma\{x\mapsto h(\bar{y})\}$ where $h(\bar{l})=f(\bar{r})\in RP$ , $h(\bar{l}^{\prime})=g(\bar{r}^{\prime})\in RP$ , $h(\bar{y})\theta=_{RP}f(\bar{t})$ and $h(\bar{y})\theta^{\prime}=_{RP}g(\bar{u})$ .

Mut_R: Right Mutation rule for generalization
$\{x:f(\bar{t})\triangleq g(\bar{u})\}\uplus A;S;\sigma\Longrightarrow\{y:y% \theta\triangleq y\theta^{\prime}\leavevmode\nobreak\ |\leavevmode\nobreak\ y% \in\bar{y}\}\cup A;S;\sigma\{x\mapsto f(\bar{y})\}$ where $f(\bar{l}^{\prime})=g(\bar{r}^{\prime})\in RP$ , $\theta=\{\bar{y}\mapsto\bar{t}\}$ and $f(\bar{y})\theta^{\prime}=_{RP}g(\bar{u})$ .

Mut_L: Left Mutation rule for generalization
$\{x:f(\bar{t})\triangleq g(\bar{u})\}\uplus A;S;\sigma\Longrightarrow\{y:y% \theta\triangleq y\theta^{\prime}\leavevmode\nobreak\ |\leavevmode\nobreak\ y% \in\bar{y}\}\cup A;S;\sigma\{x\mapsto g(\bar{y})\}$ where $g(\bar{r}^{\prime})=f(\bar{l}^{\prime})\in RP$ , $g(\bar{y})\theta=_{RP}f(\bar{t})$ and $\theta^{\prime}=\{\bar{y}\mapsto\bar{u}\}$ .

Dec: Decomposition rule for generalization
$\{x:f(\bar{t})\triangleq f(\bar{u})\}\uplus A;S;\sigma\Longrightarrow\{\bar{y}% :\bar{t}=\bar{u}\}\cup A;S;\sigma\{x\mapsto f(\bar{y})\}$ where $f\in{\Sigma}^{C}$ and $\bar{y}$ are fresh variables.

Sol: Solving rule for generalization
$\{x:t\triangleq u\}\uplus A;S;\sigma\Longrightarrow A;S\cup\{x:t\triangleq u\};\sigma$ if no rule in $\{\mathbf{Mut}_{LR},\mathbf{Mut}_{R},\mathbf{Mut}_{L},\mathbf{Dec}\}$ applies.

Mer: Merging rule for generalization
$\emptyset;\{x:t\triangleq u,x^{\prime}:t^{\prime}\triangleq u^{\prime}\}\uplus S% ;\sigma\Longrightarrow\emptyset;S\cup\{x:t\triangleq u\};\sigma\{x^{\prime}% \mapsto x\}$ if $t=_{RP}t^{\prime}$ and $u=_{RP}u^{\prime}$ .

Figure 4: Generalization Procedure in a finite theory with a resolvent presentation

R P

.

The class of finite syntactic theories is known to be closed by disjoint union [34]. Actually, for a union of two disjoint finite theories with respective resolvent presentations $RP_{1}$ and $RP_{2}$ , we have that $RP_{1}\cup RP_{2}$ is a resolvent presentation corresponding to a finite theory. Consequently, the inference system in Figure 4 can be directly applied to any union of disjoint finite syntactic theories for which the resolvent presentations are known. However, finding a resolvent presentation may be a difficult task, especially if $E$ -unification is not known to be finitary. In the case we already have a GSI algorithm for a finite theory $E$ , we can combine it directly using the approach introduced in Section 4; we don’t have to find a resolvent presentation (if there exists one for $E$ ).

$\blacktriangleright$ Remark 15.

The empty theory $\emptyset$ is a finite syntactic theory, and its resolvent presentation is empty. In the case $RP=\emptyset$ , one can notice that the inference system from Figure 4 corresponds to the classical rule-based generalization algorithm known for the empty theory.

6 White-Box Combination of Rule-Based Generalization Algorithms

This section focuses on the modular construction of generalization algorithms transforming PSS-configurations. Let us now define precisely this kind of rule-based generalization algorithm.

Definition 16.

Let $E$ be a regular theory. An $E$ -generalization algorithm transforming PSS-configurations is an inference system $\mathfrak{G}$ such that for any initial configuration $(\{x:t\triangleq u\},\emptyset,\mathit{Id})$ , it derives a finite set $F C$ of final configurations of the form $(\emptyset;S;\sigma)$ for which $\{x\sigma\leavevmode\nobreak\ |\leavevmode\nobreak\ (\emptyset;S;\sigma)\in FC\}$ is a complete set of $E$ -generalizations of $(t,u)$ . Furthermore, for any configuration $(A;S;\sigma)$ derived from the initial configuration $(\{x:t\triangleq u\};\emptyset;\mathit{Id})$ , the following (invariant) properties must hold:

$\blacksquare$

$x\sigma$ is an $E$ -generalization of $(t,u)$ with respective instantiations $(\phi,\psi)$ where $\phi$ (resp., $\psi$ ) is the left (resp., right) substitution of $A\cup S$ ,
$\blacksquare$

for any $x^{\prime}:t^{\prime}\triangleq u^{\prime}$ in $S$ , there is no non-variable $E$ -generalization of $(t^{\prime},u^{\prime})$ .

$\blacktriangleright$ Remark 17.

An $E$ -generalization algorithm transforming PSS-configurations leads to a GSI-algorithm. Consider an initial configuration $(\{x:t\triangleq u\},\emptyset,\mathit{Id})$ and any triple $(r\sigma,\phi_{S},\psi_{S})$ such that $(\emptyset,S,\sigma)\in FC$ where $F C$ is the set of final configurations given in Definition 16 and $\phi_{S}$ (resp., $\psi_{S}$ ) denotes the left (resp., right) substitution of $S$ . Then, $\{(x\sigma,\phi_{S},\psi_{S})\leavevmode\nobreak\ |\leavevmode\nobreak\ (% \emptyset,S,\sigma)\in FC\}$ is actually a $GSI_{E}(t,u)$ .

The inference procedure in Figure 4 corresponds to a generalization algorithm transforming PSS-configurations, thus leading to a GSI algorithm. Conversely, note that a GSI algorithm can be turned into a generalization algorithm transforming PSS-configurations where all derivations are of length $1$ .

Corollary 18.

A theory admits a generalization algorithm transforming PSS-configurations iff it admits a GSI algorithm.

Generalization algorithms transforming PSS-configurations correspond to GSI algorithms, and so they can be combined using the method given in Figure 1, provided that the respective theories $E_{1}$ and $E_{2}$ are regular collapse-free. Since this method is given as an inference system manipulating PSS-configurations, it can be reworded as an $E_{1}\cup E_{2}$ -generalization algorithm transforming PSS-configurations. In this new presentation, we do not directly use the triples from $GSI_{E_{i}}$ -sets, but we just apply one step of the inference system corresponding to an $E_{i}$ -generalization algorithm transforming PSS-configurations, see Figure 5. This is sufficient to move forward with a solution.

Given two generalization algorithms $\mathfrak{G}_{1}$ and $\mathfrak{G}_{2}$ transforming PSS-configurations, let $\mathfrak{G}_{1}\oplus\mathfrak{G}_{2}$ be the inference system given by the set of rules $\{\text{{\boldmath$E_{i}$-Step}, {\boldmath$E_{1,2}$-Sol}, {\boldmath$E_{1,2}$% -Mer} }\}$ where $E_{i}$ -Step is introduced in Figure 5 while $E_{1,2}$ -Sol and $E_{1,2}$ -Mer are from Figure 1.

$E_{i}$ -Step:

$\{x:t\triangleq u\}\uplus A;S;\sigma\Longrightarrow(A_{i}\pi^{-1})\cup A;(S_{i% }\pi^{-1})\cup S;\sigma(\sigma_{i}\pi^{-1})$ where $t(\epsilon)\in{\Sigma}^{C}_{i},u(\epsilon)\in{\Sigma}^{C}_{i}$ and $(\{x:t^{\pi_{i}}\triangleq u^{\pi_{i}}\};\emptyset;\mathit{Id})\longrightarrow% _{\mathfrak{G}_{i}}(A_{i};S_{i};\sigma_{i})$ .

Figure 5: Rule calling a generalization algorithm

\mathfrak{G}_{i}

transforming PSS-configurations.

Theorem 19 (Modular Construction for Rule-Based Generalization).

Let $E_{1}$ and $E_{2}$ be two signature-disjoint regular collapse-free theories. If $\mathfrak{G}_{i}$ is an $E_{i}$ -generalization algorithm transforming PSS-configurations for $i=1,2$ , then $\mathfrak{G}_{1}\oplus\mathfrak{G}_{2}$ is an $E_{1}\cup E_{2}$ -generalization algorithm transforming PSS-configurations.

Proof.

The inference system $\mathfrak{G}_{1}\oplus\mathfrak{G}_{2}$ is terminating since $\mathfrak{G}_{1}$ and $\mathfrak{G}_{2}$ are terminating, and there are finitely many alternations of calls to $\mathfrak{G}_{1}$ and to $\mathfrak{G}_{2}$ (see Lemma 10). Actually, $\mathfrak{G}_{1}\oplus\mathfrak{G}_{2}$ corresponds to an $E_{1}\cup E_{2}$ -generalization algorithm transforming PSS-configurations since the ${E_{i}}$ -Step rule from Figure 5 is sound and complete by Lemma 10 and Corollary 11. $\hfill\blacktriangleleft$

$\blacktriangleright$ Remark 20.

In both rules $E_{i}$ -Step and $E_{i}$ -Gen, the constant abstraction $(\cdot)^{\pi_{i}}$ and the term concretization $(\cdot)\pi^{-1}$ should be considered as transparent operations. With $(\cdot)^{\pi_{i}}$ , aliens are viewed as free constants, with the restriction that two $E_{1}\cup E_{2}$ -equal aliens must be viewed as the same free constant. Conversely, with $(\cdot)\pi^{-1}$ , these free constants are viewed back as terms in the combined theory $E_{1}\cup E_{2}$ . By slightly abusing notation, if we omit to apply these transformations, then the rules $E_{i}$ -Step and $E_{i}$ -Gen become easy to express.

Note that for free, associative, and commutative theories, all four algorithms for the combined theories $(\emptyset)(A)$ , $(\emptyset)(C)$ , $(A)(C)$ , $(\emptyset)(A)(C)$ described in [2] can be obtained as a combination of generalization algorithms transforming PSS-configurations.

Example 21.

Consider $E_{1}=\{f(x)=g(x)\}$ and $E_{2}=AC(+)$ . For $E_{1}$ , one can show the existence of an $E_{1}$ -generalization algorithm transforming PSS-configurations, say $\mathfrak{G}_{1}$ , given by the set of rules $\{\text{$E_{i}$-{Mut}},\ \textbf{Dec},\ \textbf{Sol},\ \textbf{Mer}\}$ , where $\textbf{Dec},\textbf{Sol},\textbf{Mer}$ can be found in Figure 4, and the mutation rules $E_{1}$ -Mut is defined as follows:

$E_{1}$ -Mut:

$\{x:f(t)\triangleq g(u)\}\uplus A;S;\sigma\Longrightarrow\{y:t\triangleq u\}% \cup A;S;\sigma\{x\mapsto f(y)\}$

For $E_{2}$ , we call $\mathfrak{G}_{2}$ the $E_{2}$ -generalization algorithm transforming PSS-configurations given in Figure 4, with the following resolvent presentation of $AC(+)$ shown in [29]:

RP=\left\{\begin{array}[]{l}x_{1}+(x_{2}+x_{3})=(x_{1}+x_{2})+x_{3}\\ x_{1}+x_{2}=x_{2}+x_{1}\\ x_{1}+(x_{2}+x_{3})=(x_{1}+x_{3})+x_{2}\\ (x_{1}+x_{2})+x_{3}=(x_{1}+x_{3})+x_{2}\\ x_{1}+(x_{2}+x_{3})=x_{3}+(x_{1}+x_{2})\\ (x_{1}+x_{3})+(x_{2}+x_{4})=(x_{1}+x_{4})+(x_{2}+x_{3})\end{array}\right\}

Let us now detail how the combined generalization algorithm $\mathfrak{G}_{1}\oplus\mathfrak{G}_{2}$ is applied to solve

x:f(a+(f(a)+b))\triangleq g((a+c)+f(a)).

We focus on two particular derivations leading to final PSS-configurations. The derived PSS-configurations are listed in the table below, where the rightmost column shows the configuration number, while the first column indicates how the configuration in that row is obtained (which generalization algorithm is applied to which configuration):

	$(\{x:f(a+(f(a)+b))\triangleq g((a+c)+f(a))\};\emptyset;\mathit{Id})$	(0)
$\mathfrak{G}_{1}(0)$	$(\{y_{1}:a+(f(a)+b)\triangleq(a+c)+f(a)\};\emptyset;\sigma_{0}\{x\mapsto f(y_{% 1})\})$	(1)
$\mathfrak{G}_{2}(1)$	$(\{y_{2}:a\triangleq a+c,y_{3}:f(a)+b\triangleq f(a)\};\emptyset;\sigma_{1}\{y% _{1}\mapsto y_{2}+y_{3})\})$	(2)
$\mathfrak{G}_{2}(2)$	$(\{y_{3}:f(a)+b\triangleq f(a)\};\{y_{2}:a\triangleq a+c\};\sigma_{2})$	(3)
$\mathfrak{G}_{2}(3)$	$(\emptyset;\{y_{2}:a\triangleq a+c,y_{3}:f(a)+b\triangleq f(a)\};\sigma_{3})$	(4)
$\mathfrak{G}_{2}(1)$	$(\{y_{4}:f(a)\triangleq f(a),y_{5}:b+a\triangleq a+c\};\emptyset;\sigma_{1}\{y% _{1}\mapsto y_{4}+y_{5})\})$	(5)
$\mathfrak{G}_{1}(5)$	$(\{y_{6}:a\triangleq a,y_{5}:b+a\triangleq a+c\};\emptyset;\sigma_{5}\{y_{4}% \mapsto f(y_{6})\})$	(6)
$\mathfrak{G}_{1}(6)$	$(\{y_{5}:b+a\triangleq a+c\};\emptyset;\sigma_{6}\{y_{6}\mapsto a\})$	(7)
$\mathfrak{G}_{2}(7)$	$(\{y_{7}:a\triangleq a,y_{8}:b\triangleq c\};\emptyset;\sigma_{9}\{y_{5}% \mapsto y_{7}+y_{8}\})$	(8)
$\mathfrak{G}_{1}(8)$	$(\{y_{8}:b\triangleq c\};\emptyset;\sigma_{8}\{y_{7}\mapsto a\})$	(9)
$\mathfrak{G}_{1}(9)$	$(\emptyset;\{y_{8}:b\triangleq c\};\sigma_{9})$	(10)

To get (5), $\textbf{Mut}_{LR}$ is applied with the axioms $x_{1}+(x_{2}+x_{3})=x_{3}+(x_{1}+x_{2})$ and $x_{1}+x_{2}=x_{2}+x_{1}$ in $R P$ . To get (8), $\textbf{Mut}_{L}$ is applied with the axiom $x_{1}+x_{2}=x_{2}+x_{1}$ in $R P$ . Notice that $\mathfrak{G}_{2}$ could be applied instead of $\mathfrak{G}_{1}$ on (6), (8), (9) since both $\mathfrak{G}_{1}$ and $\mathfrak{G}_{2}$ can handle any AUT between two constants in $C$ . Conversely, $\mathfrak{G}_{1}$ could be applied instead of $\mathfrak{G}_{2}$ on (3) since both $\mathfrak{G}_{1}$ and $\mathfrak{G}_{2}$ include a Sol rule. Eventually, (4) and (10) are two final configurations:

$\blacksquare$

$(\emptyset;\{y_{2}:a\triangleq a+c,y_{3}:f(a)+b\triangleq f(a)\};\{x\mapsto f(% y_{2}+y_{3}),\dots\}$ ,
$\blacksquare$

$(\emptyset;\{y_{8}:b\triangleq c\};\{x\mapsto f(f(a)+(a+y_{8})),\dots\})$ .

The resulting generalizations are:

$\blacksquare$

$f(y_{2}+y_{3})$ , with respective instantiations $(\{y_{2}\mapsto a,y_{3}\mapsto f(a)+b\},\ \{y_{2}\mapsto a+c,\allowbreak y_{3}% \mapsto f(a)\})$ ,
$\blacksquare$

$f(f(a)+(a+y_{8}))$ , with respective instantiations $(\{y_{8}\mapsto b\},\ \{y_{8}\mapsto c\})$ .

The combined generalization algorithm $\mathfrak{G}_{1}\oplus\mathfrak{G}_{2}$ derives additional final configurations. Still, one can check that the corresponding generalizations are always more general than $f(f(a)+(a+y_{8}))$ , just like $f(y_{2}+y_{3})$ is more general than $f(f(a)+(a+y_{8}))$ .

7 Conclusion

We have introduced a combination method for the generalization problem in any union of disjoint regular collapse-free theories. Our framework requires regularity to ensure that any matching problem only has ground solutions. With this property, we have only to consider the generalization of ground terms involving possibly free constants. Collapse-freeness is also very useful for solving easily the generalization of two terms rooted in distinct theories. In future work, we plan to investigate the possibility of relaxing this regular and collapse-free assumption.

The study of non-disjoint unions of theories would also be a natural continuation. In particular, we plan to consider the case of shared constructors, relying on the combination results available for the matching problem in unions of theories sharing only constructors [23]. When the constructors do not appear in the root of either side of the axioms, it might be possible to solve the combined generalization problem similarly to the disjoint case. More generally, allowing only shared constants at the root of the axioms could also be helpful to solve the problem.

For the class of regular collapse-free theories, we have investigated the disjoint combination of rule-based generalization algorithms transforming PSS-configurations. For the subclass of finite syntactic theories, we have shown how the axioms of a resolvent presentation can be used to construct a generalization algorithm transforming PSS-configurations. More broadly, we are interested in identifying examples of non-finite (yet regular collapse-free) theories that admit a generalization algorithm transforming PSS-configurations, to which our combination approach would be applicable. Finally, we conjecture that this combination approach can be naturally extended to generalization procedures that are not necessarily terminating.

References

[1] María Alpuente, Demis Ballis, Angel Cuenca-Ortega, Santiago Escobar, and José Meseguer. ACUOS ${}^{\mbox{2}}$ : A High-Performance System for Modular ACU Generalization with Subtyping and Inheritance. In Francesco Calimeri, Nicola Leone, and Marco Manna, editors, Logics in Artificial Intelligence - 16th European Conference, JELIA 2019, Rende, Italy, May 7-11, 2019, Proceedings, volume 11468 of Lecture Notes in Computer Science, pages 171–181. Springer, 2019. doi:10.1007/978-3-030-19570-0_11.
[2] María Alpuente, Santiago Escobar, Javier Espert, and José Meseguer. A modular order-sorted equational generalization algorithm. Inf. Comput., 235:98–136, 2014. doi:10.1016/J.IC.2014.01.006.
[3] Nino Amiridze and Temur Kutsia. Anti-unification and natural language processing. EasyChair Preprint no. 203, EasyChair, 2018. doi:10.29007/fkrh.
[4] Mauricio Ayala-Rincón, David M. Cerna, Andres Felipe Gonzalez Barragan, and Temur Kutsia. Equational anti-unification over absorption theories. In Christoph Benzmüller, Marijn J. H. Heule, and Renate A. Schmidt, editors, Automated Reasoning - 12th International Joint Conference, IJCAR 2024, Nancy, France, July 3-6, 2024, Proceedings, Part II, volume 14740 of Lecture Notes in Computer Science, pages 317–337. Springer, 2024. doi:10.1007/978-3-031-63501-4_17.
[5] Franz Baader. Unification, weak unification, upper bound, lower bound, and generalization problems. In Ronald V. Book, editor, Rewriting Techniques and Applications, 4th International Conference, RTA-91, Como, Italy, April 10-12, 1991, Proceedings, volume 488 of Lecture Notes in Computer Science, pages 86–97. Springer, 1991. doi:10.1007/3-540-53904-2_88.
[6] Franz Baader and Tobias Nipkow. Term rewriting and all that. Cambridge University Press, 1998. doi:10.1017/CBO9781139172752.
[7] Franz Baader and Klaus U. Schulz. Unification in the union of disjoint equational theories: Combining decision procedures. J. Symb. Comput., 21(2):211–243, 1996. doi:10.1006/JSCO.1996.0009.
[8] Johannes Bader, Andrew Scott, Michael Pradel, and Satish Chandra. Getafix: learning to fix bugs automatically. Proc. ACM Program. Lang., 3(OOPSLA):159:1–159:27, 2019. doi:10.1145/3360585.
[9] Adam D. Barwell, Christopher Brown, and Kevin Hammond. Finding parallel functional pearls: Automatic parallel recursion scheme detection in Haskell functions via anti-unification. Future Gener. Comput. Syst., 79:669–686, 2018. doi:10.1016/j.future.2017.07.024.
[10] Hans-Jürgen Bürckert, Alexander Herold, and Manfred Schmidt-Schauß. On equational theories, unification, and (un)decidability. J. Symb. Comput., 8(1/2):3–49, 1989. doi:10.1016/S0747-7171(89)80021-5.
[11] Jochen Burghardt. E-generalization using grammars. Artif. Intell., 165(1):1–35, 2005. doi:10.1016/J.ARTINT.2005.01.008.
[12] David Cao, Rose Kunkel, Chandrakana Nandi, Max Willsey, Zachary Tatlock, and Nadia Polikarpova. babble: Learning Better Abstractions with E-Graphs and Anti-unification. Proceedings of the ACM on Programming Languages, 7(POPL):396–424, 2023. doi:10.1145/3571207.
[13] David M. Cerna. Anti-unification and the theory of semirings. Theor. Comput. Sci., 848:133–139, 2020. doi:10.1016/J.TCS.2020.10.020.
[14] David M. Cerna and Temur Kutsia. Idempotent anti-unification. ACM Trans. Comput. Log., 21(2):10:1–10:32, 2020. doi:10.1145/3359060.
[15] David M. Cerna and Temur Kutsia. Unital anti-unification: Type and algorithms. In Zena M. Ariola, editor, 5th International Conference on Formal Structures for Computation and Deduction, FSCD 2020, June 29-July 6, 2020, Paris, France (Virtual Conference), volume 167 of LIPIcs, pages 26:1–26:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.FSCD.2020.26.
[16] David M. Cerna and Temur Kutsia. Anti-unification and generalization: A survey. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th August 2023, Macao, SAR, China, pages 6563–6573. ijcai.org, 2023. doi:10.24963/IJCAI.2023/736.
[17] Hubert Comon, Marianne Haberstrau, and Jean-Pierre Jouannaud. Syntacticness, cycle-syntacticness, and shallow theories. Inf. Comput., 111(1):154–191, 1994. doi:10.1006/inco.1994.1043.
[18] Andrew Cropper, Sebastijan Dumancic, Richard Evans, and Stephen H. Muggleton. Inductive logic programming at 30. Mach. Learn., 111(1):147–172, 2022. doi:10.1007/S10994-021-06089-1.
[19] Reudismam Rolim de Sousa, Gustavo Soares, Rohit Gheyi, Titus Barik, and Loris D’Antoni. Learning quick fixes from code repositories. In Cristiano D. Vasconcellos, Karina Girardi Roggia, Vanessa Collere, and Paulo Bousfield, editors, 35th Brazilian Symposium on Software Engineering, SBES 2021, Joinville, Santa Catarina, Brazil, 27 September 2021 - 1 October 2021, pages 74–83. ACM, 2021. doi:10.1145/3474624.3474650.
[20] Eric Domenjoud, Francis Klay, and Christophe Ringeissen. Combination techniques for non-disjoint equational theories. In Alan Bundy, editor, Automated Deduction - CADE-12, 12th International Conference on Automated Deduction, Nancy, France, June 26 - July 1, 1994, Proceedings, volume 814 of Lecture Notes in Computer Science, pages 267–281. Springer, 1994. doi:10.1007/3-540-58156-1_19.
[21] Rui Dong, Zhicheng Huang, Ian Iong Lam, Yan Chen, and Xinyu Wang. WebRobot: web robotic process automation using interactive programming-by-demonstration. In Ranjit Jhala and Isil Dillig, editors, PLDI ’22: 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, San Diego, CA, USA, June 13 - 17, 2022, pages 152–167. ACM, 2022. doi:10.1145/3519939.3523711.
[22] Serdar Erbatur, Andrew M. Marshall, and Christophe Ringeissen. Non-disjoint combined unification and closure by equational paramodulation. In Boris Konev and Giles Reger, editors, Frontiers of Combining Systems - 13th International Symposium, FroCoS 2021, Birmingham, UK, September 8-10, 2021, Proceedings, volume 12941 of Lecture Notes in Computer Science, pages 25–42. Springer, 2021. doi:10.1007/978-3-030-86205-3_2.
[23] Serdar Erbatur, Andrew M. Marshall, and Christophe Ringeissen. Combined Hierarchical Matching: the Regular Case. In Amy P. Felty, editor, 7th International Conference on Formal Structures for Computation and Deduction, FSCD 2022, August 2-5, 2022, Haifa, Israel, volume 228 of LIPIcs, pages 6:1–6:22. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.FSCD.2022.6.
[24] Boris Galitsky. Developing Enterprise Chatbots - Learning Linguistic Structures. Springer, 2019. doi:10.1007/978-3-030-04299-8.
[25] Peter Graf. Substitution tree indexing. In Jieh Hsiang, editor, Rewriting Techniques and Applications, 6th International Conference, RTA-95, Kaiserslautern, Germany, April 5-7, 1995, Proceedings, volume 914 of Lecture Notes in Computer Science, pages 117–131. Springer, 1995. doi:10.1007/3-540-59200-8_52.
[26] Alexander Herold. Combination of unification algorithms in equational theories. PhD thesis, Kaiserslautern University of Technology, Germany, 1987. URL: https://d-nb.info/880327979.
[27] Jean-Pierre Jouannaud. Syntactic theories. In Branislav Rovan, editor, Mathematical Foundations of Computer Science 1990, MFCS’90, Banská Bystrica, Czechoslovakia, August 27-31, 1990, Proceedings, volume 452 of Lecture Notes in Computer Science, pages 15–25. Springer, 1990. doi:10.1007/BFB0029593.
[28] Claude Kirchner. Méthodes et outils de conception systématique d’algorithmes d’unification dans les théories équationnelles. Thèses d’état, Université de Nancy I, 1985.
[29] Claude Kirchner and Francis Klay. Syntactic theories and unification. In Proceedings of the Fifth Annual Symposium on Logic in Computer Science (LICS ’90), Philadelphia, Pennsylvania, USA, June 4-7, 1990, pages 270–277. IEEE Computer Society, 1990. doi:10.1109/LICS.1990.113753.
[30] Hélène Kirchner and Christophe Ringeissen. Combining symbolic constraint solvers on algebraic domains. J. Symb. Comput., 18(2):113–155, 1994. doi:10.1006/JSCO.1994.1040.
[31] Boris Konev and Temur Kutsia. Anti-unification of concepts in description logic EL. In Chitta Baral, James P. Delgrande, and Frank Wolter, editors, Principles of Knowledge Representation and Reasoning: Proceedings of the Fifteenth International Conference, KR 2016, Cape Town, South Africa, April 25-29, 2016, pages 227–236. AAAI Press, 2016. URL: http://www.aaai.org/ocs/index.php/KR/KR16/paper/view/12880.
[32] Christopher Lynch and Barbara Morawska. Basic syntactic mutation. In Andrei Voronkov, editor, Automated Deduction - CADE-18, 18th International Conference on Automated Deduction, Copenhagen, Denmark, July 27-30, 2002, Proceedings, volume 2392 of Lecture Notes in Computer Science, pages 471–485. Springer, 2002. doi:10.1007/3-540-45620-1_37.
[33] Sonu Mehta, Ranjita Bhagwan, Rahul Kumar, Chetan Bansal, Chandra Shekhar Maddila, B. Ashok, Sumit Asthana, Christian Bird, and Aditya Kumar. Rex: Preventing bugs and misconfiguration in large services using correlated change analysis. In Ranjita Bhagwan and George Porter, editors, 17th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2020, Santa Clara, CA, USA, February 25-27, 2020, pages 435–448. USENIX Association, 2020. URL: https://www.usenix.org/conference/nsdi20/presentation/mehta.
[34] Tobias Nipkow. Proof transformations for equational theories. In Proceedings of the Fifth Annual Symposium on Logic in Computer Science (LICS ’90), Philadelphia, Pennsylvania, USA, June 4-7, 1990, pages 278–288. IEEE Computer Society, 1990. doi:10.1109/LICS.1990.113754.
[35] Tobias Nipkow. Combining matching algorithms: The regular case. J. Symb. Comput., 12(6):633–654, 1991. doi:10.1016/S0747-7171(08)80145-9.
[36] Julian Parsert and Elizabeth Polgreen. Reinforcement learning and data-generation for syntax-guided synthesis. In Michael J. Wooldridge, Jennifer G. Dy, and Sriraam Natarajan, editors, Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2014, February 20-27, 2024, Vancouver, Canada, pages 10670–10678. AAAI Press, 2024. doi:10.1609/AAAI.V38I9.28938.
[37] Brigitte Pientka. Higher-order term indexing using substitution trees. ACM Trans. Comput. Log., 11(1):6:1–6:40, 2009. doi:10.1145/1614431.1614437.
[38] Gordon D. Plotkin. A note on inductive generalization. Machine Intel., 5(1):153–163, 1970.
[39] Stanislaw J. Purgal, David M. Cerna, and Cezary Kaliszyk. Learning higher-order logic programs from failures. In Luc De Raedt, editor, Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23-29 July 2022, pages 2726–2733. ijcai.org, 2022. doi:10.24963/IJCAI.2022/378.
[40] Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. Programming by Example Using Least General Generalizations. In Carla E. Brodley and Peter Stone, editors, Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27 -31, 2014, Québec City, Québec, Canada, pages 283–290. AAAI Press, 2014. doi:10.1609/AAAI.V28I1.8744.
[41] John C. Reynolds. Transformational systems and the algebraic structure of atomic formulas. Machine Intel., 5(1):135–151, 1970.
[42] Manfred Schmidt-Schauß. Unification in a combination of arbitrary disjoint equational theories. J. Symb. Comput., 8(1/2):51–99, 1989. doi:10.1016/S0747-7171(89)80022-7.
[43] Erik Tidén. Unification in combinations of collapse-free theories with disjoint sets of function symbols. In Jörg H. Siekmann, editor, 8th International Conference on Automated Deduction, Oxford, England, July 27 - August 1, 1986, Proceedings, volume 230 of Lecture Notes in Computer Science, pages 431–449. Springer, 1986. doi:10.1007/3-540-16780-3_110.
[44] Emily Rowan Winter, Vesna Nowack, David Bowes, Steve Counsell, Tracy Hall, Sæmundur Óskar Haraldsson, John R. Woodward, Serkan Kirbas, Etienne Windels, Olayori McBello, Abdurahman Atakishiyev, Kevin Kells, and Matthew W. Pagano. Towards developer-centered automatic program repair: findings from Bloomberg. In Abhik Roychoudhury, Cristian Cadar, and Miryung Kim, editors, Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE, Singapore, November 14-18, 2022, pages 1578–1588. ACM, 2022. doi:10.1145/3540250.3558953.
[45] Kaiyu Yang and Jia Deng. Learning symbolic rules for reasoning in quasi-natural language. Trans. Mach. Learn. Res., 2023, 2023. URL: https://openreview.net/forum?id=gwRwHUZUgz.
[46] Katherine A. Yelick. Unification in combinations of collapse-free regular theories. J. Symb. Comput., 3(1/2):153–181, 1987. doi:10.1016/S0747-7171(87)80025-1.
[47] Gonzague Yernaux and Wim Vanhoof. Anti-unification of unordered goals. In Florin Manea and Alex Simpson, editors, 30th EACSL Annual Conference on Computer Science Logic, CSL 2022, February 14-19, 2022, Göttingen, Germany (Virtual Conference), volume 216 of LIPIcs, pages 37:1–37:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.CSL.2022.37.
[48] Gonzague Yernaux and Wim Vanhoof. On detecting semantic clones in constraint logic programs. In 16th IEEE International Workshop on Software Clones, IWSC 2022, Limassol, Cyprus, October 2, 2022, pages 32–38. IEEE, 2022. doi:10.1109/IWSC55060.2022.00014.
[49] Dongjun Youn, Wonho Shin, Jaehyun Lee, Sukyoung Ryu, Joachim Breitner, Philippa Gardner, Sam Lindley, Matija Pretnar, Xiaojia Rao, Conrad Watt, and Andreas Rossberg. Bringing the WebAssembly Standard up to Speed with SpecTec. Proc. ACM Program. Lang., 8(PLDI):1559–1584, 2024. doi:10.1145/3656440.

[bib.bib1] [1] María Alpuente, Demis Ballis, Angel Cuenca-Ortega, Santiago Escobar, and José Meseguer. ACUOS ${}^{\mbox{2}}$ : A High-Performance System for Modular ACU Generalization with Subtyping and Inheritance. In Francesco Calimeri, Nicola Leone, and Marco Manna, editors, Logics in Artificial Intelligence - 16th European Conference, JELIA 2019, Rende, Italy, May 7-11, 2019, Proceedings, volume 11468 of Lecture Notes in Computer Science, pages 171–181. Springer, 2019. doi:10.1007/978-3-030-19570-0_11.

[bib.bib2] [2] María Alpuente, Santiago Escobar, Javier Espert, and José Meseguer. A modular order-sorted equational generalization algorithm. Inf. Comput., 235:98–136, 2014. doi:10.1016/J.IC.2014.01.006.

[bib.bib3] [3] Nino Amiridze and Temur Kutsia. Anti-unification and natural language processing. EasyChair Preprint no. 203, EasyChair, 2018. doi:10.29007/fkrh.

[bib.bib4] [4] Mauricio Ayala-Rincón, David M. Cerna, Andres Felipe Gonzalez Barragan, and Temur Kutsia. Equational anti-unification over absorption theories. In Christoph Benzmüller, Marijn J. H. Heule, and Renate A. Schmidt, editors, Automated Reasoning - 12th International Joint Conference, IJCAR 2024, Nancy, France, July 3-6, 2024, Proceedings, Part II, volume 14740 of Lecture Notes in Computer Science, pages 317–337. Springer, 2024. doi:10.1007/978-3-031-63501-4_17.

[bib.bib5] [5] Franz Baader. Unification, weak unification, upper bound, lower bound, and generalization problems. In Ronald V. Book, editor, Rewriting Techniques and Applications, 4th International Conference, RTA-91, Como, Italy, April 10-12, 1991, Proceedings, volume 488 of Lecture Notes in Computer Science, pages 86–97. Springer, 1991. doi:10.1007/3-540-53904-2_88.

[bib.bib6] [6] Franz Baader and Tobias Nipkow. Term rewriting and all that. Cambridge University Press, 1998. doi:10.1017/CBO9781139172752.

[bib.bib7] [7] Franz Baader and Klaus U. Schulz. Unification in the union of disjoint equational theories: Combining decision procedures. J. Symb. Comput., 21(2):211–243, 1996. doi:10.1006/JSCO.1996.0009.

[bib.bib8] [8] Johannes Bader, Andrew Scott, Michael Pradel, and Satish Chandra. Getafix: learning to fix bugs automatically. Proc. ACM Program. Lang., 3(OOPSLA):159:1–159:27, 2019. doi:10.1145/3360585.

[bib.bib9] [9] Adam D. Barwell, Christopher Brown, and Kevin Hammond. Finding parallel functional pearls: Automatic parallel recursion scheme detection in Haskell functions via anti-unification. Future Gener. Comput. Syst., 79:669–686, 2018. doi:10.1016/j.future.2017.07.024.

[bib.bib10] [10] Hans-Jürgen Bürckert, Alexander Herold, and Manfred Schmidt-Schauß. On equational theories, unification, and (un)decidability. J. Symb. Comput., 8(1/2):3–49, 1989. doi:10.1016/S0747-7171(89)80021-5.

[bib.bib11] [11] Jochen Burghardt. E-generalization using grammars. Artif. Intell., 165(1):1–35, 2005. doi:10.1016/J.ARTINT.2005.01.008.

[bib.bib12] [12] David Cao, Rose Kunkel, Chandrakana Nandi, Max Willsey, Zachary Tatlock, and Nadia Polikarpova. babble: Learning Better Abstractions with E-Graphs and Anti-unification. Proceedings of the ACM on Programming Languages, 7(POPL):396–424, 2023. doi:10.1145/3571207.

[bib.bib13] [13] David M. Cerna. Anti-unification and the theory of semirings. Theor. Comput. Sci., 848:133–139, 2020. doi:10.1016/J.TCS.2020.10.020.

[bib.bib14] [14] David M. Cerna and Temur Kutsia. Idempotent anti-unification. ACM Trans. Comput. Log., 21(2):10:1–10:32, 2020. doi:10.1145/3359060.

[bib.bib15] [15] David M. Cerna and Temur Kutsia. Unital anti-unification: Type and algorithms. In Zena M. Ariola, editor, 5th International Conference on Formal Structures for Computation and Deduction, FSCD 2020, June 29-July 6, 2020, Paris, France (Virtual Conference), volume 167 of LIPIcs, pages 26:1–26:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.FSCD.2020.26.

[bib.bib16] [16] David M. Cerna and Temur Kutsia. Anti-unification and generalization: A survey. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th August 2023, Macao, SAR, China, pages 6563–6573. ijcai.org, 2023. doi:10.24963/IJCAI.2023/736.

[bib.bib17] [17] Hubert Comon, Marianne Haberstrau, and Jean-Pierre Jouannaud. Syntacticness, cycle-syntacticness, and shallow theories. Inf. Comput., 111(1):154–191, 1994. doi:10.1006/inco.1994.1043.

[bib.bib18] [18] Andrew Cropper, Sebastijan Dumancic, Richard Evans, and Stephen H. Muggleton. Inductive logic programming at 30. Mach. Learn., 111(1):147–172, 2022. doi:10.1007/S10994-021-06089-1.

[bib.bib19] [19] Reudismam Rolim de Sousa, Gustavo Soares, Rohit Gheyi, Titus Barik, and Loris D’Antoni. Learning quick fixes from code repositories. In Cristiano D. Vasconcellos, Karina Girardi Roggia, Vanessa Collere, and Paulo Bousfield, editors, 35th Brazilian Symposium on Software Engineering, SBES 2021, Joinville, Santa Catarina, Brazil, 27 September 2021 - 1 October 2021, pages 74–83. ACM, 2021. doi:10.1145/3474624.3474650.

[bib.bib20] [20] Eric Domenjoud, Francis Klay, and Christophe Ringeissen. Combination techniques for non-disjoint equational theories. In Alan Bundy, editor, Automated Deduction - CADE-12, 12th International Conference on Automated Deduction, Nancy, France, June 26 - July 1, 1994, Proceedings, volume 814 of Lecture Notes in Computer Science, pages 267–281. Springer, 1994. doi:10.1007/3-540-58156-1_19.

[bib.bib21] [21] Rui Dong, Zhicheng Huang, Ian Iong Lam, Yan Chen, and Xinyu Wang. WebRobot: web robotic process automation using interactive programming-by-demonstration. In Ranjit Jhala and Isil Dillig, editors, PLDI ’22: 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, San Diego, CA, USA, June 13 - 17, 2022, pages 152–167. ACM, 2022. doi:10.1145/3519939.3523711.

[bib.bib22] [22] Serdar Erbatur, Andrew M. Marshall, and Christophe Ringeissen. Non-disjoint combined unification and closure by equational paramodulation. In Boris Konev and Giles Reger, editors, Frontiers of Combining Systems - 13th International Symposium, FroCoS 2021, Birmingham, UK, September 8-10, 2021, Proceedings, volume 12941 of Lecture Notes in Computer Science, pages 25–42. Springer, 2021. doi:10.1007/978-3-030-86205-3_2.

[bib.bib23] [23] Serdar Erbatur, Andrew M. Marshall, and Christophe Ringeissen. Combined Hierarchical Matching: the Regular Case. In Amy P. Felty, editor, 7th International Conference on Formal Structures for Computation and Deduction, FSCD 2022, August 2-5, 2022, Haifa, Israel, volume 228 of LIPIcs, pages 6:1–6:22. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.FSCD.2022.6.

[bib.bib24] [24] Boris Galitsky. Developing Enterprise Chatbots - Learning Linguistic Structures. Springer, 2019. doi:10.1007/978-3-030-04299-8.

[bib.bib25] [25] Peter Graf. Substitution tree indexing. In Jieh Hsiang, editor, Rewriting Techniques and Applications, 6th International Conference, RTA-95, Kaiserslautern, Germany, April 5-7, 1995, Proceedings, volume 914 of Lecture Notes in Computer Science, pages 117–131. Springer, 1995. doi:10.1007/3-540-59200-8_52.

[bib.bib26] [26] Alexander Herold. Combination of unification algorithms in equational theories. PhD thesis, Kaiserslautern University of Technology, Germany, 1987. URL: https://d-nb.info/880327979.

[bib.bib27] [27] Jean-Pierre Jouannaud. Syntactic theories. In Branislav Rovan, editor, Mathematical Foundations of Computer Science 1990, MFCS’90, Banská Bystrica, Czechoslovakia, August 27-31, 1990, Proceedings, volume 452 of Lecture Notes in Computer Science, pages 15–25. Springer, 1990. doi:10.1007/BFB0029593.

[bib.bib28] [28] Claude Kirchner. Méthodes et outils de conception systématique d’algorithmes d’unification dans les théories équationnelles. Thèses d’état, Université de Nancy I, 1985.

[bib.bib29] [29] Claude Kirchner and Francis Klay. Syntactic theories and unification. In Proceedings of the Fifth Annual Symposium on Logic in Computer Science (LICS ’90), Philadelphia, Pennsylvania, USA, June 4-7, 1990, pages 270–277. IEEE Computer Society, 1990. doi:10.1109/LICS.1990.113753.

[bib.bib30] [30] Hélène Kirchner and Christophe Ringeissen. Combining symbolic constraint solvers on algebraic domains. J. Symb. Comput., 18(2):113–155, 1994. doi:10.1006/JSCO.1994.1040.

[bib.bib31] [31] Boris Konev and Temur Kutsia. Anti-unification of concepts in description logic EL. In Chitta Baral, James P. Delgrande, and Frank Wolter, editors, Principles of Knowledge Representation and Reasoning: Proceedings of the Fifteenth International Conference, KR 2016, Cape Town, South Africa, April 25-29, 2016, pages 227–236. AAAI Press, 2016. URL: http://www.aaai.org/ocs/index.php/KR/KR16/paper/view/12880.

[bib.bib32] [32] Christopher Lynch and Barbara Morawska. Basic syntactic mutation. In Andrei Voronkov, editor, Automated Deduction - CADE-18, 18th International Conference on Automated Deduction, Copenhagen, Denmark, July 27-30, 2002, Proceedings, volume 2392 of Lecture Notes in Computer Science, pages 471–485. Springer, 2002. doi:10.1007/3-540-45620-1_37.

[bib.bib33] [33] Sonu Mehta, Ranjita Bhagwan, Rahul Kumar, Chetan Bansal, Chandra Shekhar Maddila, B. Ashok, Sumit Asthana, Christian Bird, and Aditya Kumar. Rex: Preventing bugs and misconfiguration in large services using correlated change analysis. In Ranjita Bhagwan and George Porter, editors, 17th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2020, Santa Clara, CA, USA, February 25-27, 2020, pages 435–448. USENIX Association, 2020. URL: https://www.usenix.org/conference/nsdi20/presentation/mehta.

[bib.bib34] [34] Tobias Nipkow. Proof transformations for equational theories. In Proceedings of the Fifth Annual Symposium on Logic in Computer Science (LICS ’90), Philadelphia, Pennsylvania, USA, June 4-7, 1990, pages 278–288. IEEE Computer Society, 1990. doi:10.1109/LICS.1990.113754.

[bib.bib35] [35] Tobias Nipkow. Combining matching algorithms: The regular case. J. Symb. Comput., 12(6):633–654, 1991. doi:10.1016/S0747-7171(08)80145-9.

[bib.bib36] [36] Julian Parsert and Elizabeth Polgreen. Reinforcement learning and data-generation for syntax-guided synthesis. In Michael J. Wooldridge, Jennifer G. Dy, and Sriraam Natarajan, editors, Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2014, February 20-27, 2024, Vancouver, Canada, pages 10670–10678. AAAI Press, 2024. doi:10.1609/AAAI.V38I9.28938.

[bib.bib37] [37] Brigitte Pientka. Higher-order term indexing using substitution trees. ACM Trans. Comput. Log., 11(1):6:1–6:40, 2009. doi:10.1145/1614431.1614437.

[bib.bib38] [38] Gordon D. Plotkin. A note on inductive generalization. Machine Intel., 5(1):153–163, 1970.

[bib.bib39] [39] Stanislaw J. Purgal, David M. Cerna, and Cezary Kaliszyk. Learning higher-order logic programs from failures. In Luc De Raedt, editor, Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23-29 July 2022, pages 2726–2733. ijcai.org, 2022. doi:10.24963/IJCAI.2022/378.

[bib.bib40] [40] Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. Programming by Example Using Least General Generalizations. In Carla E. Brodley and Peter Stone, editors, Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27 -31, 2014, Québec City, Québec, Canada, pages 283–290. AAAI Press, 2014. doi:10.1609/AAAI.V28I1.8744.

[bib.bib41] [41] John C. Reynolds. Transformational systems and the algebraic structure of atomic formulas. Machine Intel., 5(1):135–151, 1970.

[bib.bib42] [42] Manfred Schmidt-Schauß. Unification in a combination of arbitrary disjoint equational theories. J. Symb. Comput., 8(1/2):51–99, 1989. doi:10.1016/S0747-7171(89)80022-7.

[bib.bib43] [43] Erik Tidén. Unification in combinations of collapse-free theories with disjoint sets of function symbols. In Jörg H. Siekmann, editor, 8th International Conference on Automated Deduction, Oxford, England, July 27 - August 1, 1986, Proceedings, volume 230 of Lecture Notes in Computer Science, pages 431–449. Springer, 1986. doi:10.1007/3-540-16780-3_110.

[bib.bib44] [44] Emily Rowan Winter, Vesna Nowack, David Bowes, Steve Counsell, Tracy Hall, Sæmundur Óskar Haraldsson, John R. Woodward, Serkan Kirbas, Etienne Windels, Olayori McBello, Abdurahman Atakishiyev, Kevin Kells, and Matthew W. Pagano. Towards developer-centered automatic program repair: findings from Bloomberg. In Abhik Roychoudhury, Cristian Cadar, and Miryung Kim, editors, Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE, Singapore, November 14-18, 2022, pages 1578–1588. ACM, 2022. doi:10.1145/3540250.3558953.

[bib.bib45] [45] Kaiyu Yang and Jia Deng. Learning symbolic rules for reasoning in quasi-natural language. Trans. Mach. Learn. Res., 2023, 2023. URL: https://openreview.net/forum?id=gwRwHUZUgz.

[bib.bib46] [46] Katherine A. Yelick. Unification in combinations of collapse-free regular theories. J. Symb. Comput., 3(1/2):153–181, 1987. doi:10.1016/S0747-7171(87)80025-1.

[bib.bib47] [47] Gonzague Yernaux and Wim Vanhoof. Anti-unification of unordered goals. In Florin Manea and Alex Simpson, editors, 30th EACSL Annual Conference on Computer Science Logic, CSL 2022, February 14-19, 2022, Göttingen, Germany (Virtual Conference), volume 216 of LIPIcs, pages 37:1–37:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.CSL.2022.37.

[bib.bib48] [48] Gonzague Yernaux and Wim Vanhoof. On detecting semantic clones in constraint logic programs. In 16th IEEE International Workshop on Software Clones, IWSC 2022, Limassol, Cyprus, October 2, 2022, pages 32–38. IEEE, 2022. doi:10.1109/IWSC55060.2022.00014.

[bib.bib49] [49] Dongjun Youn, Wonho Shin, Jaehyun Lee, Sukyoung Ryu, Joachim Breitner, Philippa Gardner, Sam Lindley, Matija Pretnar, Xiaojia Rao, Conrad Watt, and Andreas Rossberg. Bringing the WebAssembly Standard up to Speed with SpecTec. Proc. ACM Program. Lang., 8(PLDI):1559–1584, 2024. doi:10.1145/3656440.

Combining Generalization Algorithms in Regular Collapse-Free Theories

Abstract

Keywords and phrases:

Funding:

Copyright and License:

2012 ACM Subject Classification:

DOI:

Event:

Editors:

Series and Publisher:

1 Introduction

Our contributions

Organization

2 Preliminaries

2.1 Terms and Substitutions

2.2 Equational Theories

2.3 Combination of Equational Theories

Lemma 1 (Correspondence for Equality [46]).

2.4 Generalization Problems

3 Equational Generalization vs Free Generalization

Theorem 2 (Free 𝑙𝑔𝑔’s versus E-Generalizations).

Proof.

Theorem 3 (Completeness and Finiteness through Free 𝑙𝑔𝑔’s).

Proof.

Example 4.

4 Combined Generalization in Regular Collapse-Free Theories

Definition 5 (Generalizations with Solved Instantiations).

Lemma 6.

▶ Remark 7.

Example 8.

Theorem 9 (Modular Property for GSI Algorithm).

Proof.

Lemma 10 (Correspondence for Generalization).

Proof.

Corollary 11.

5 Rule-Based Generalization in Finite Syntactic Theories

Example 12.

Lemma 13.

Proof.

▶ Remark 14.

▶ Remark 15.

6 White-Box Combination of Rule-Based Generalization Algorithms

Definition 16.

▶ Remark 17.

Corollary 18.

Theorem 19 (Modular Construction for Rule-Based Generalization).

Proof.

▶ Remark 20.

Example 21.

7 Conclusion

References

Combining Generalization Algorithms
in Regular Collapse-Free Theories

Theorem 2 (Free $\mathit{lgg}$ ’s versus $E$ -Generalizations).

Theorem 3 (Completeness and Finiteness through Free $\mathit{lgg}$ ’s).

$\blacktriangleright$ Remark 7.

$\blacktriangleright$ Remark 14.

$\blacktriangleright$ Remark 15.

$\blacktriangleright$ Remark 17.

$\blacktriangleright$ Remark 20.