First-Order Intuitionistic Linear Logic and Hypergraph Languages
Abstract
The Lambek calculus is a substructural logic known to be closely related to formal language theory: on the one hand, it is used for generating formal languages by means of categorial grammars and, on the other hand, it has formal language semantics, with respect to which it is sound and complete. This paper studies a similar relation between first-order intuitionistic linear logic ILL1, along with its multiplicative fragment MILL1, on the one hand and hypergraph grammar theory on the other.
In the first part, we introduce a novel concept of hypergraph first-order logic categorial grammar, which is a generalisation of string MILL1 grammars introduced in Richard Moot’s works. We prove that hypergraph ILL1 grammars generate all recursively enumerable hypergraph languages and that hypergraph MILL1 grammars are as powerful as linear-time hypergraph transformation systems. In addition, we show that the class of languages generated by string MILL1 grammars is closed under intersection and that it includes a non-semilinear language as well as an NP-complete one. This shows how much more powerful string MILL1 grammars are as compared to Lambek categorial grammars.
In the second part, we develop hypergraph language models for MILL1. In such models, formulae of the logic are interpreted as hypergraph languages and multiplicative conjunction is interpreted using parallel composition, which is one of the operations of HR-algebras introduced by Courcelle. We prove completeness of the universal-implicative fragment of MILL1 with respect to these models and thus present a new kind of semantics for a fragment of first-order linear logic.
Keywords and phrases: linear logic, categorial grammar, MILL1 grammar, first-order logic, hypergraph language, graph transformation, language semantics, HR-algebra

Category: Track B: Automata, Logic, Semantics, and Theory of Programming

2012 ACM Subject Classification: Theory of computation → Linear logic; Theory of computation → Grammars and context-free languages; Theory of computation → Rewrite systems

Acknowledgements: I thank Richard Moot and Sergei Slavnov for their attention to my work and for the productive discussions. I am also grateful to the anonymous reviewers for valuable remarks.

Funding: This work was supported by the Russian Science Foundation under grant no. 23-11-00104, https://rscf.ru/en/project/23-11-00104/.

Editors: Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis
1 Introduction
There is a strong connection between substructural logics, especially non-commutative ones, and the theory of formal languages and grammars [5, 23]. This connection is two-way. On the one hand, a logic can be used as a derivational mechanism for generating formal languages, which is the essence of categorial grammars. One prominent example is Lambek categorial grammars based on the Lambek calculus L [18]; formulae of the latter are built using multiplicative conjunction "•" and two directed implications "\", "/". In a Lambek categorial grammar, one assigns a finite number of formulae of L to each symbol of an alphabet and chooses a distinguished formula S; then, the grammar accepts a string a_1…a_n if the sequent A_1, …, A_n ⊢ S is derivable in L where, for each i, A_i is one of the formulae assigned to a_i. A famous result by Pentus [25] says that Lambek categorial grammars generate exactly the context-free languages (without the empty word, to be precise).
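As a quick illustration (a standard textbook example, not specific to this paper): assigning np to "John" and np\s to "sleeps", the string "John sleeps" is accepted with the distinguished formula s because the following sequent is derivable:

```latex
\[
\frac{np \vdash np \qquad s \vdash s}
     {np,\; np \backslash s \vdash s}\ (\backslash L)
\]
```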
On the other hand, algebras of formal languages can serve as models for substructural logics. For example, one can define language semantics for the Lambek calculus as follows: a language model is a function w mapping formulas of L to formal languages such that w(A • B) = w(A)·w(B), w(A \ B) = {u : w(A)·{u} ⊆ w(B)}, and w(B / A) = {u : {u}·w(A) ⊆ w(B)}; a sequent A_1, …, A_n ⊢ B is interpreted as the inclusion w(A_1)·…·w(A_n) ⊆ w(B). Another famous result by Pentus [26] is that L is sound and complete w.r.t. language semantics; strong completeness for the fragment of L without "•" had been proved earlier by Buszkowski in [4] using canonical models.
Numerous variants and extensions of the Lambek calculus have been studied, including its nonassociative version [23], its commutative version [37] (i.e. multiplicative intuitionistic linear logic), the multimodal Lambek calculus [19], the displacement calculus [24] etc. These logics have many common properties, which motivates the search for a unifying logic. One such "umbrella" logic is first-order multiplicative intuitionistic linear logic MILL1 [20, 22], which is the multiplicative fragment of first-order intuitionistic linear logic ILL1. The Lambek calculus can be embedded in MILL1 [22]: in the translation, each atom receives a pair of variables marking the string positions it spans, and these variables "fix" the order of formulae. Although MILL1 is a first-order generalisation of L, derivability problems in these logics have the same complexity, namely, they are NP-complete. One can define MILL1 categorial grammars in the same manner as Lambek categorial grammars [20, 34]. The former generalise the latter, hence generate all context-free languages (see the definition of the latter in [14]). Moot proved in [20] that MILL1 grammars generate all multiple context-free languages, hence some non-context-free ones.
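For concreteness, the embedding can be sketched as follows (this is the standard position-variable translation in the style of [20, 22]; the notation ‖A‖^{x,z} for "A spanning positions x to z" is ours):

```latex
\[
\begin{aligned}
\|p\|^{x,z}              &= p(x,z)\\
\|A \bullet B\|^{x,z}    &= \exists y\,\bigl(\|A\|^{x,y} \otimes \|B\|^{y,z}\bigr)\\
\|A \backslash B\|^{y,z} &= \forall x\,\bigl(\|A\|^{x,y} \multimap \|B\|^{x,z}\bigr)\\
\|B / A\|^{x,y}          &= \forall z\,\bigl(\|A\|^{y,z} \multimap \|B\|^{x,z}\bigr)
\end{aligned}
\]
```

A Lambek sequent $A_1, \ldots, A_n \vdash B$ then corresponds to the MILL1 sequent $\|A_1\|^{x_0,x_1}, \ldots, \|A_n\|^{x_{n-1},x_n} \vdash \|B\|^{x_0,x_n}$.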
It turns out that the interplay between propositional substructural logics and formal grammars can be elevated fruitfully to that between first-order substructural logics and hypergraph grammars. This is the subject of this article. Hypergraph grammar approaches generate sets of hypergraphs; usually they are designed as generalisations of grammar formalisms for strings. For example, hyperedge replacement grammar [8] is a formalism that extends context-free grammar. A rule of a hyperedge replacement grammar allows one to replace a hyperedge in a hypergraph by another hypergraph; see an example below.
Note that, in this approach, hyperedges but not nodes are labeled. Some of the nodes, called external, are distinguished in a hypergraph; in the above example, these are the specially marked nodes. External nodes are needed to specify how hyperedge replacement is done. A more general approach, which corresponds to type-0 grammars in the Chomsky hierarchy, is hypergraph transformation systems (the term is taken from [17]), which allow one to replace a subhypergraph of a hypergraph by another hypergraph.
Naturally, a hypergraph can be represented by a linear logic formula. Namely, one can interpret hyperedges as predicates, nodes as variables, and external nodes as free variables. For example, the hypergraph in the box from Figure 1 can be converted into a formula with one atom per hyperedge. This idea underlies the concept of hypergraph first-order categorial grammars, which we introduce in Section 3. Roughly speaking, given a first-order logic such as ILL1, a hypergraph categorial grammar takes a hypergraph, assigns a formula of the logic to each of its hyperedges and nodes, converts the resulting hypergraph into a sequent and checks whether the sequent is derivable. Using first-order linear logic for generating hypergraph languages is a novel idea which has not yet been explored in the literature. In Section 3, we study the expressive power of the grammars thus defined and prove the following.
-
1.
Hypergraph ILL1 grammars are equivalent to hypergraph transformation systems and thus they generate all recursively enumerable hypergraph languages (Theorem 23). This result relates hypergraph ILL1 grammars to the well-studied approach in the field of hypergraph grammars based on the double pushout graph transformation procedure.
-
2.
Hypergraph MILL1 grammars are at least as powerful as linear-time hypergraph transformation systems (Theorem 25). The latter are hypergraph transformation systems where the length of a derivation is bounded by a linear function of the size of the resulting hypergraph. The linear-time bound has been studied for many grammar formalisms [2, 11, 29, 35], but, to the best of our knowledge, this is the first time it is used for graph grammars.
The proofs partially use techniques from [29], where languages generated by grammars over the commutative Lambek calculus are studied. Compared to [29], the proofs in this paper are more technically involved because of complications arising when working with quantifiers and variables in the first-order setting.
In Section 4, using the methods developed for hypergraph grammars, we establish the following properties of the class of languages generated by string MILL1 grammars.
-
1.
Languages generated by string MILL1 grammars are closed under intersection (Theorem 26). Consequently, non-semilinear languages can be generated by string MILL1 grammars.
-
2.
String MILL1 grammars generate an NP-complete language (Theorem 30). Note that the Lambek calculus is NP-complete [27] but Lambek categorial grammars generate only context-free languages, which are in P. The logic MILL1 is NP-complete as well, but string MILL1 grammars are able to generate non-polynomial languages (assuming P ≠ NP), hence they are much more powerful.
The question whether string MILL1 grammars generate only multiple context-free languages was left open by Richard Moot in [20], and I considered Theorem 30 to be the first one answering it. Recently, however, Sergei Slavnov pointed out an alternative answer to this question, using previously known techniques. Namely, in [21], it is shown that hybrid type-logical grammars can be translated into MILL1 grammars; hybrid type-logical grammars generalise abstract categorial grammars, and it is proved in [32] that the latter generate an NP-complete language. Our proof relies on a different technique, namely, on reducing linear-time hypergraph transformation systems, which are a rule-based approach unlike hybrid type-logical grammars. In general, finding a natural rule-based formalism equivalent to MILL1 grammars is an interesting question to study, and Theorem 30 is a step towards the answer.¹

¹ We believe that linear-time hypergraph transformation systems are essentially equivalent to MILL1 grammars in terms of generative power because, as we conjecture, it is possible to simulate axioms and rules by hypergraph transformation rules and hence to convert hypergraph MILL1 grammars into linear-time transformation systems. However, some subtleties related to the definition of hypergraph transformation rules arise when one tries to do so; we discuss them after the proof of Theorem 25.
The second part of the paper (Section 5), not related to the first one, is devoted to developing hypergraph language semantics for MILL1, thus establishing the converse direction of the connection between first-order linear logic and hypergraph languages. Linear logic is considered as a logic for reasoning about resources [10], and language models for the Lambek calculus are one formalisation of this view, with resources being words; this agrees with the linguistic applications of L. In this paper, we shall show that hypergraphs can be treated as "first-order resources." In a hypergraph language model (Definition 33), formulas are interpreted by sets of hypergraphs, and the tensor connective is interpreted using the parallel composition operation. The latter is a "gluing" of hypergraphs; it is well studied in hypergraph grammar theory, being one of the operations in the language of HR-algebras [7]. Hypergraph language models are a particular case of intuitionistic phase semantics (see the definition in [15]) with the trivial closure operator.
Our main result concerning hypergraph language models is soundness of MILL1 and completeness of its universal-implicative fragment w.r.t. them (Theorem 35). The proof is inspired by Buszkowski's [4] but it is more technically involved, again because of the first-order setting. The importance of this result lies in the fact that hypergraph language models are, to the best of our knowledge, one of the few examples of a specific semantics for a fragment of first-order intuitionistic linear logic which, moreover, is grounded in hypergraph language theory.
2 Preliminaries
In Section 2.1, we introduce notions from the field of graph grammars, and, in Section 2.2, we introduce first-order intuitionistic linear logic.
2.1 Hypergraphs & Hypergraph Transformation Systems
There are many paradigms in the field of graph grammars, including node replacement grammars, hyperedge replacement grammars, algebraic approaches (double pushout, single pushout) with a more categorical flavour, definability in monadic second-order logic etc. We shall work with the definition of a hypergraph from the field of hyperedge replacement grammars [8, 9, 12] because it fits first-order linear logic better than the alternatives. In the hypergraphs we shall deal with, only hyperedges are labeled while nodes play an auxiliary role. Some of the nodes are marked as external; informally, they play the role of gluing points. Throughout the paper, we shall explore a natural correspondence between hypergraphs defined thusly and first-order linear logic formulae. In contrast, there would be no such correspondence if we stuck to the definition of a hypergraph where nodes are labeled and edges are not.
We fix a countable set of selectors. In the grammar-logic correspondence we shall develop, selectors will guide variable substitution.
Definition 1.
A typed alphabet is a set A along with a function type assigning to each a ∈ A a finite set type(a) of selectors.
Definition 2.
Let A be a finite typed alphabet of hyperedge labels. A hypergraph H over A is a tuple H = (V, E, lab, att, ext) where V is a finite set of nodes; E is a finite set of hyperedges; lab : E → A is the labeling function; att, the attachment function, assigns to each e ∈ E a function att(e) : type(lab(e)) → V; and ext is a function with a finite domain of selectors and with values in V. Elements of the image of ext are called external nodes. The set of hypergraphs over A is denoted by H(A). Let type(H) denote the domain of ext.
In drawings of hypergraphs, nodes are depicted as black circles and hyperedges are depicted as labeled rectangles. When depicting a hypergraph H, we draw a line labeled s from e to v if att(e)(s) = v. External nodes are represented by selectors in round brackets: if ext(s) = v, then we mark v as (s).
There is a standard issue with distinguishing between concrete and abstract hypergraphs, i.e. between hypergraphs and their isomorphism classes. When one considers a hypergraph language L, it is reasonable to assume that it consists of abstract hypergraphs (or, equivalently, that it is closed under isomorphism); however, when we write H ∈ L, we assume that H is a concrete hypergraph. Following tradition, we often do not distinguish between abstract and concrete hypergraphs to avoid excessive bureaucracy.
Let us fix two selectors, which we denote s and t. If type(lab(e)) = {s, t}, then the hyperedge e is called an edge and it is depicted by an arrow going from att(e)(s) to att(e)(t).
Definition 3.
A string graph induced by a string w = a_1…a_n is defined as follows: V = {v_0, …, v_n}; E = {e_1, …, e_n}; lab(e_i) = a_i, att(e_i)(s) = v_{i−1}, att(e_i)(t) = v_i (for i = 1, …, n); ext(s) = v_0, ext(t) = v_n.
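As a minimal executable sketch of this definition (the dictionary-based representation and the selector names "s", "t" are our own notational choice):

```python
def string_graph(word):
    """String graph of w = a_1...a_n: nodes v0..vn, one edge per letter a_i
    attached to v_(i-1) (selector "s") and v_i (selector "t"); the first and
    last nodes are marked external."""
    n = len(word)
    nodes = [f"v{i}" for i in range(n + 1)]
    # each hyperedge is a pair (label, {selector: attached node})
    edges = [(word[i], {"s": f"v{i}", "t": f"v{i+1}"}) for i in range(n)]
    return {"nodes": nodes, "edges": edges, "ext": {"s": "v0", "t": f"v{n}"}}

g = string_graph("ab")
```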
Definition 4.
Given a label a, the handle of a is the hypergraph consisting of a single hyperedge labeled a together with one node for each selector in type(a); both the attachment function of the hyperedge and the external function map each selector to its node.
Definition 5.
If H_1 and H_2 are hypergraphs with disjoint sets of nodes and hyperedges and with disjoint types, then their disjoint union is the hypergraph where V = V_1 ∪ V_2, E = E_1 ∪ E_2, lab and att agree with those of H_i on E_i for i = 1, 2, and ext is the union of the functions ext_1 and ext_2.
Now, let us define the hyperedge replacement operation.
Definition 6.
Let H be a hypergraph and let R be a binary relation on V. Let ≈ be the smallest equivalence relation on V containing R. Then H/R is the following hypergraph: its nodes are the equivalence classes V/≈; its hyperedges, labeling function and type are those of H; att(e)(s) = [att_H(e)(s)]_≈; ext(s) = [ext_H(s)]_≈.
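The quotient construction can be sketched in code (a dictionary-based representation of our own; union-find computes the smallest equivalence relation containing the given pairs):

```python
def quotient(H, pairs):
    """H factored by the smallest equivalence relation on its nodes that
    contains the given pairs; each class is represented by a chosen node."""
    parent = {v: v for v in H["nodes"]}
    def find(v):                      # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for a, b in pairs:
        parent[find(a)] = find(b)
    cls = {v: find(v) for v in H["nodes"]}
    return {"nodes": sorted(set(cls.values())),
            "edges": [(lb, {s: cls[a[s]] for s in a}) for lb, a in H["edges"]],
            "ext": {s: cls[v] for s, v in H["ext"].items()}}

H = {"nodes": ["u", "v", "w"],
     "edges": [("E", {"s": "u", "t": "v"})],
     "ext": {"s": "w"}}
Q = quotient(H, [("v", "w")])   # identify v with w
```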
Definition 7.
Let H and K be two hypergraphs over A; let e ∈ E_H be a hyperedge such that type(lab_H(e)) = type(K). Then the replacement of e by K in H (the result being denoted by H[e/K]) is defined as follows:
-
1.
Remove e from H and add a disjoint copy of K. Formally, let H′ be the hypergraph such that V′ = V_H ∪ V_K, E′ = (E_H ∖ {e}) ∪ E_K, lab′ is the restriction of lab_H ∪ lab_K to E′, att′ is the restriction of att_H ∪ att_K to E′, and ext′ = ext_H.
-
2.
Glue the nodes that are incident to e in H with the external nodes of K. Namely, let H[e/K] = H′/R where R = {(att_H(e)(s), ext_K(s)) : s ∈ type(K)}.
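The two steps above can be sketched as follows (the dictionary representation and helper names are ours; we assume the selectors of the replaced hyperedge match the type of K):

```python
def replace(H, e_index, K):
    """Replace hyperedge number e_index of H by a disjoint copy of K:
    remove the hyperedge, add K with freshly renamed nodes, then glue the
    copy of K's external node for each selector onto the node of H to which
    the removed hyperedge was attached via that selector."""
    _label, att = H["edges"][e_index]         # att: selector -> node of H
    copy = {v: f"k_{v}" for v in K["nodes"]}  # fresh names for K's nodes
    glue = {copy[K["ext"][sel]]: att[sel] for sel in K["ext"]}
    img = lambda v: glue.get(copy[v], copy[v])  # image of a K-node in the result
    new_nodes = [copy[v] for v in K["nodes"] if copy[v] not in glue]
    new_edges = [(lb, {sel: img(a[sel]) for sel in a}) for lb, a in K["edges"]]
    return {"nodes": H["nodes"] + new_nodes,
            "edges": [e for i, e in enumerate(H["edges"]) if i != e_index] + new_edges,
            "ext": H["ext"]}

# replacing the A-labeled edge of a two-edge string graph by the string graph of "ab"
H = {"nodes": ["u0", "u1", "u2"],
     "edges": [("A", {"s": "u0", "t": "u1"}), ("c", {"s": "u1", "t": "u2"})],
     "ext": {"s": "u0", "t": "u2"}}
K = {"nodes": ["v0", "v1", "v2"],
     "edges": [("a", {"s": "v0", "t": "v1"}), ("b", {"s": "v1", "t": "v2"})],
     "ext": {"s": "v0", "t": "v2"}}
R = replace(H, 0, K)   # the result is the string graph of "abc" up to renaming
```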
Using hyperedge replacement, we define hypergraph transformation systems, a formalism which will be used to describe the expressive power of hypergraph categorial grammars. It enables one to replace a subhypergraph of a hypergraph with another hypergraph.
Definition 8.
A hypergraph transformation rule (ht-rule) is of the form L → R where L and R are hypergraphs of the same type such that ext_L and ext_R are injective.
We say that H is transformed into H′ via the rule r = (L → R), and denote this by H ⇒_r H′ (also by H ⇒_P H′ for a set P of rules containing r), if H = H_0[e/L] and H′ = H_0[e/R] for some hypergraph H_0 and hyperedge e of H_0 such that the attachment att(e) is injective.
A hypergraph transformation system (ht-system) is a tuple where the nonterminal and terminal alphabets are typed alphabets, P is a finite set of hypergraph transformation rules and S is a start hypergraph such that ext_S is injective.
The language generated by the ht-system consists of the hypergraphs over the terminal alphabet derivable from S via rules of P.
The definition of a hypergraph transformation rule, although given in a slightly unconventional way through hyperedge replacement, coincides with the standard notion of a graph transformation rule with injective morphisms in the double pushout approach, taken in the corresponding category of hypergraphs; compare it with [17]. Also, our definition is essentially the same as that from [36] (with the only difference being that the cited paper deals with graphs rather than hypergraphs). In that paper, the following proposition is proved.
Proposition 9.
Hypergraph transformation systems generate all recursively enumerable hypergraph languages.
This is a rather expected result, analogous to string rewriting systems generating all recursively enumerable string languages. To prove Proposition 9, it suffices to show how to convert a string representation of a hypergraph into the hypergraph itself by means of ht-rules.
2.2 First-Order Intuitionistic Linear Logic
We assume the reader's familiarity with the basic principles and issues of first-order logic. Let us fix a countable set of variables and a countable set of predicate symbols with arities. Atomic formulae are of the form p(x_1, …, x_n) where p is a predicate symbol of arity n and x_1, …, x_n are variables. Following [16, 20, 22], we do not allow function symbols; note that complex terms would not fit the "variables are nodes, predicates are hyperedges" paradigm. We also do not allow constants because they can easily be simulated by variables.
Formulae of intuitionistic linear logic ILL1 are built from atomic formulae and propositional constants using the multiplicative connectives ⊗ and ⊸, the additive ones, and the exponential one, along with the quantifiers ∀ and ∃. The multiplicative fragment of ILL1, denoted by MILL1, does not have constants and uses only ⊗ and ⊸ (along with the quantifiers). A sequent is a structure of the form Γ ⊢ A where Γ is a multiset of formulae and A is a formula.
Note that we shall sometimes describe multisets using set-builder-like notation; multiplicities are then counted over the indices, i.e. an element occurs in such a multiset as many times as there are index values producing it.
The only axiom is A ⊢ A. The rules of MILL1 are presented below.
Here t is any variable while y is a variable which is not free in the conclusion; A[x := t] denotes the result of replacing all free occurrences of x in A by t. More generally, if σ is a function defining a correct substitution, then Aσ denotes the result of substituting σ(x) for x for each x ∈ FV(A) (FV(A) is the set of free variables of A). Rules for the additive and exponential connectives of ILL1 can be found e.g. in [33, Appendix E].
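For the reader's convenience, the rules of MILL1 in a standard sequent-calculus presentation (we follow the usual formulation of first-order MILL; side conditions are as stated above, with y not free in the conclusion of (∀R) and (∃L)):

```latex
\[
\frac{\Gamma, A, B \vdash C}{\Gamma, A \otimes B \vdash C}\ (\otimes L)
\qquad
\frac{\Gamma \vdash A \quad \Delta \vdash B}{\Gamma, \Delta \vdash A \otimes B}\ (\otimes R)
\qquad
\frac{\Gamma \vdash A \quad \Delta, B \vdash C}{\Gamma, \Delta, A \multimap B \vdash C}\ (\multimap L)
\qquad
\frac{\Gamma, A \vdash B}{\Gamma \vdash A \multimap B}\ (\multimap R)
\]
\[
\frac{\Gamma, A[x := t] \vdash C}{\Gamma, \forall x\, A \vdash C}\ (\forall L)
\qquad
\frac{\Gamma \vdash A[x := y]}{\Gamma \vdash \forall x\, A}\ (\forall R)
\qquad
\frac{\Gamma, A[x := y] \vdash C}{\Gamma, \exists x\, A \vdash C}\ (\exists L)
\qquad
\frac{\Gamma \vdash A[x := t]}{\Gamma \vdash \exists x\, A}\ (\exists R)
\]
```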
The cut rule is admissible in MILL1:
Using it, one can prove that the rules ⊗L, ⊸R, ∀R, and ∃L are invertible, i.e. that, if the conclusion of any of these rules is provable in MILL1, then so is its premise.
3 Hypergraph First-Order Categorial Grammars
The idea of extending the Lambek calculus and categorial grammars to hypergraphs was explored recently in [28], where the hypergraph Lambek calculus was introduced. This is a propositional logic whose formulae are built using two operators analogous to the linear logic connectives ⊸ and ⊗. Formulae of this calculus can be used as labels on hyperedges, so a formula may contain a hypergraph whose hyperedges are labeled by other formulae. Based on the hypergraph Lambek calculus, hypergraph Lambek grammars were defined and their properties were investigated. Although the definition of the hypergraph Lambek calculus is justified in [28], the syntax of this logic is somewhat cumbersome. We are going to show that one can use any first-order logic, such as ILL1, as the underlying logic for hypergraph categorial grammars. The definitions we propose below are much simpler than those from [28]; besides, they enable one to rely on the well-studied apparatus of linear logic.
Let us start with the definition of a string categorial grammar over a first-order logic. This notion appears in [20, 34] for MILL1, but we would like to start with a more general exposition. Let a first-order sequent calculus of interest be fixed, and let Fm denote the set of its formulas. Let s and t be two fixed variables (note that earlier we used them as selectors).
Definition 10.
A string grammar is a tuple consisting of a finite alphabet Σ, a distinguished formula S with FV(S) ⊆ {s, t}, and a finite binary relation ▷ ⊆ Σ × Fm such that a ▷ A implies FV(A) ⊆ {s, t}. The language generated by the grammar is defined as follows: a_1…a_n belongs to it if and only if there are formulas A_1, …, A_n such that a_i ▷ A_i for i = 1, …, n and such that the sequent A_1[s := x_0, t := x_1], …, A_n[s := x_{n−1}, t := x_n] ⊢ S[s := x_0, t := x_n] is derivable, where x_0, …, x_n are distinct variables.
Example 11.
Let , let , and let consist of the pairs , . This grammar accepts the string , because the sequent
is derivable in .
We see that, in Definition 10, the noncommutative structure of a string is simulated by variables. Informally, one could imagine a string graph with the nodes x_0, …, x_n such that, for each i, there is an edge labeled by a_i connecting x_{i−1} to x_i. Based on this observation, let us introduce the central notion of hypergraph first-order categorial grammars. From now on, we consider nodes, selectors and logical variables as objects of the same kind. Besides, if A is a typed alphabet, then we treat each label a ∈ A as a predicate symbol.
Definition 12.
Let us fix a variable and a symbol . A hypergraph grammar is a quadruple where is a -typed alphabet; is a finite binary relation such that implies and implies ; finally, is a formula of such that .
Definition 13.
The language is defined as follows: if and only if and there are functions , such that
-
1.
for , for ;
-
2.
the sequent is derivable in .
Example 14.
Let be a hypergraph grammar with (, ), , , and with consisting of the pairs
-
; ;
-
;
-
; .
Consider the hypergraph with (nodes are enumerated left to right, top to bottom). It belongs to , because the sequent
is derivable in . In this sequent, the first formula corresponds to the -labeled hyperedge, the second one corresponds to the -labeled hyperedge, the third formula corresponds to and the remaining formulae correspond to .
The string graph also belongs to , because the sequent
is derivable in (the nodes of from left to right are ).
Finally, the hypergraph without hyperedges and with nodes is also accepted by , because the following sequent is derivable in :
Let us comment on Definitions 12 and 13. The relation assigns formulae of the logic to hyperedge labels. Besides, it assigns to the distinguished "node symbol" formulae with the fixed free variable (or without free variables). Since, in the theory of hyperedge replacement, it is traditional to consider hypergraphs where only hyperedges are labeled, one might ask why we assign formulas not only to hyperedges but to nodes as well. The answer is that we need to have some control over nodes. If we removed all the parts concerning nodes from Definitions 12 and 13, then hypergraph grammars would completely ignore isolated nodes; this is a minor yet annoying issue. Moreover, each hypergraph language generated by a hypergraph grammar (for, say, MILL1) would be closed under node identification.
Example 15.
Assume that a string graph is accepted by a hypergraph grammar in which nodes do not participate in the grammar formalism. This would mean that there are formulae with free variables assigned to its edges such that the corresponding sequent is derivable. However, derivability is preserved when distinct variables are identified, hence the hypergraph obtained by identifying nodes is also accepted by the grammar. Consequently, hypergraph grammars without node-formula assignment are not able to generate a language consisting only of string graphs, which is quite undesirable.
Another remark concerning Definition 12 is why we need to fix a type in the grammar. This makes the language generated by a grammar consistent in terms of external nodes: all hypergraphs in the language have the same type. Note that ht-systems and hyperedge replacement grammars [8] are consistent in this sense.
Given a hypergraph grammar, one can consider only the string graphs generated by it and thus associate a string language with the grammar.
Definition 16.
The string language generated by a hypergraph grammar is the set .
One expects that, normally, string languages generated by hypergraph grammars should be the same as languages generated by string grammars. For example, for MILL1, the following proposition holds.
Proposition 17.
If a language is generated by a string grammar, then there is a hypergraph grammar such that . Conversely, if is a hypergraph grammar, then is generated by some string grammar.
This proposition is almost trivial; the only minor technicality is that hypergraph grammars assign formulae to nodes while string grammars are unable to do so. This technicality is why we need to exclude the empty word in the second part of the proposition. See the proof of this proposition in [30, Appendix A.1].
3.1 Hypergraph Transformation Rules as MILL1 Formulae
Our main goal now is to describe the relation between hypergraph first-order linear logic grammars and hypergraph transformation systems. We start by showing how a ht-rule is encoded by a formula. Let us fix a unary predicate N; informally, N(x) is understood as "x is a node". Given a hypergraph, let us treat its hyperedge labels as predicate symbols. Namely, for each label we assume that the selectors of its type are enumerated in a fixed order; the atom associated with a hyperedge applies the predicate given by its label to the attached nodes, listed in this order. The arity of a label a is thus the cardinality of type(a).
Definition 18.
The diagram of a hypergraph is the multiset
Let where is the list of nodes in .
Example 19.
Let be a hypergraph such that . Then
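As a self-contained illustration (the dictionary representation, the fixed ordering of selectors, and the inclusion of one atom per node via the node predicate, rendered here as N, are our assumptions), the multiset of atoms read off a hypergraph can be computed as follows:

```python
from collections import Counter

def diagram(H, node_pred="N"):
    """Multiset of atomic formulas of a hypergraph: one atom
    lab(e)(att(e)(s_1), ..., att(e)(s_k)) per hyperedge (selectors taken in
    a fixed order) plus, as we assume here, one atom N(v) per node."""
    atoms = Counter()
    for label, att in H["edges"]:
        atoms[f"{label}({', '.join(att[sel] for sel in sorted(att))})"] += 1
    for v in H["nodes"]:
        atoms[f"{node_pred}({v})"] += 1
    return atoms

H = {"nodes": ["x", "y"],
     "edges": [("E", {"s": "x", "t": "y"}), ("E", {"s": "y", "t": "y"})],
     "ext": {}}
D = diagram(H)
```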
Definition 20.
Given a ht-rule , let
where
-
is the list of nodes in ;
-
is a substitution function defined on as follows: .
The formula is closed. Note that is well defined because is injective.
Example 21.
Consider the ht-rule
Let be the nodes of the left-hand side hypergraph (from left to right) and let be the nodes of the right-hand side hypergraph. Then
The main lemma about is presented below.
Lemma 22.
Let be multisets of ht-rules and let be two hypergraphs with injective . The sequent
(1)
is derivable in MILL1 if and only if there exists a derivation of a hypergraph isomorphic to from which uses each rule from exactly once and may use rules from any number of times.
The proof of this lemma, given in [30, Appendix A.2], is quite technical; its idea is to perform proof search using focusing [1]. Using Lemma 22, we can prove the following theorem.
Theorem 23.
Hypergraph ILL1 grammars generate the class of all recursively enumerable hypergraph languages.
Proof.
Clearly, all languages generated by hypergraph ILL1 grammars are recursively enumerable. To prove the converse, according to Proposition 9, it suffices to show that any ht-system can be converted into a hypergraph ILL1 grammar generating the same language. Define the hypergraph grammar such that
-
1.
where for ;
-
2.
for and ;
-
3.
.
A hypergraph is accepted by if and only if the following sequent is derivable in :
Note that the antecedent of this sequent is the diagram of and that the succedent equals . By invertibility of the rules and , the above sequent is equiderivable with the following one:
By Lemma 22, derivability of the latter sequent is equivalent to the fact that the hypergraph is derivable from the start hypergraph using the rules of the ht-system, i.e. that it belongs to the language of the ht-system. Thus, the two languages coincide.
This result justifies the soundness of Definition 12: if hypergraph grammars based on ILL1, which includes the powerful exponential modality, were not Turing-complete, this would indicate that we do not have enough control in the grammar formalism. For example, if we did not include the node-formula assignment in Definition 12, then Theorem 23 would be false.
3.2 Hypergraph MILL1 Grammars
We proceed to investigating the expressive power of hypergraph MILL1 grammars. They are clearly less expressive than ht-systems because they generate only languages in NP. Nevertheless, it turns out that they are as powerful as ht-systems in which the length of a derivation of a hypergraph is bounded by a linear function of the size of the hypergraph.
Definition 24.
The size of a hypergraph is the number of nodes and hyperedges in it. A linear-time hypergraph transformation system is a ht-system for which there is a constant c (a time constant) such that each hypergraph H of its language has a derivation with at most c·|H| steps.
Adding a linear-time bound to formal grammars has a long history. Linear-time type-0 Chomsky grammars were studied in [2, 11]; linear-time one-tape Turing machines were studied in [35]; linear-time branching vector addition systems were studied in [29] in the context of commutative Lambek grammars. However, to the best of our knowledge, linear-time graph grammars have not appeared in the literature. Linear-time ht-systems may be considered as the hypergraph counterpart of the linear-time type-0 grammars studied in [2, 11], so they are quite a natural formalism. The main result concerning them is presented below.
Theorem 25.
Each linear-time ht-system can be converted into an equivalent hypergraph MILL1 grammar.
Proof of Theorem 25.
Let be a linear-time ht-system with the time constant . Define the hypergraph grammar such that
-
1.
where for ;
-
2.
for , , , and for ;
-
3.
for , and for .
A hypergraph is accepted by if and only if, for each (and for each ), there exist at most rules from , which we denote by (by resp.), such that the sequent
is derivable in . By invertibility of , this is equivalent to the fact that there exists a multiset of cardinality at most such that all its elements are from and such that the following sequent is derivable in :
By Lemma 22, this is equivalent to the fact that there is a derivation of a hypergraph isomorphic to from that uses each rule from exactly once. Existence of satisfying these properties is equivalent to the fact that there is a derivation of from of length at most , which is equivalent to . Thus, .

The question arises whether the converse holds as well. We are not going to address it in this article because of space-time limitations and also because we are mainly interested in lower bounds for the class of hypergraph MILL1 grammars. Still, we claim that it is possible to convert each hypergraph MILL1 grammar into a linear-time ht-system with non-injective rules, i.e. with rules whose external functions are allowed to be non-injective. This could be done by a straightforward (yet full of tiring technical details) encoding of inference rules by hypergraph transformations. We leave the proof for future work.
Speaking of upper bounds, we noted that all languages generated by hypergraph MILL1 grammars are in NP; besides, as we shall show in Theorem 30, there is an NP-complete language of string graphs generated by a hypergraph MILL1 grammar, so the NP upper bound is tight.
Let us further explore the properties of the class of languages generated by hypergraph MILL1 grammars. This class turns out to be closed under intersection.
Theorem 26.
Languages generated by hypergraph MILL1 grammars are closed under intersection.
Proof.
Let be two hypergraph grammars; we aim to construct a hypergraph grammar generating . Let us assume without loss of generality that ; also, let us assume that predicate symbols used by and are disjoint. Let us denote the set of subformulas of formulas occurring in by (). Note that, if , then (hypergraphs in these languages would have different types); thus, we can assume that (let us denote this set by ).
Define the grammar where and is the smallest relation such that holds whenever and for .
Lemma 27 (splitting lemma).
-
1.
If the sequent is derivable in such that and for some , then all are also from .
-
2.
The sequent such that and are from and , are from is derivable in if and only if the sequents and are derivable in .
Both statements are proved jointly by a straightforward induction on the length of a derivation.
A hypergraph is accepted by if and only if, for , there exist functions and such that for , for , and the following sequent is derivable in :
By invertibility of and the splitting lemma, this is equivalent to derivability of the sequents for , hence is equivalent to the fact that belongs to .

An analogous technique cannot be used for Lambek categorial grammars because the Lambek calculus is non-commutative, and the splitting lemma does not hold for it. In fact, since Lambek categorial grammars generate context-free languages [25], which are not closed under intersection, a similar result does not hold for Lambek categorial grammars.
4 Expressive Power of String MILL1 Grammars
Now, let us apply the results and techniques developed for hypergraph MILL1 grammars to describe the class of languages generated by string MILL1 grammars. First, Proposition 17 and Theorem 25 imply the following lower bound.
Corollary 28.
If is a linear-time ht-system, then is generated by a string MILL1 grammar.
Next, the following result can be proved in the same way as Theorem 26.
Theorem 29.
Languages generated by string MILL1 grammars are closed under intersection.
(Note that we cannot directly infer this theorem from Proposition 17 and Theorem 26 because of the empty string issue.) Since languages generated by string MILL1 grammars contain all context-free languages, they also contain their finite intersections, in particular, the language
It is simple to prove that languages generated by string MILL1 grammars are also closed under letter-to-letter homomorphisms, so the language can be generated by a string MILL1 grammar as well. Thus, string MILL1 grammars generate languages with non-semilinear Parikh images.
Note that, for Lambek categorial grammars, there is an imbalance between expressive power and algorithmic complexity. Namely, on the one hand, Lambek categorial grammars generate exactly the context-free languages, all of which are polynomially parsable; on the other hand, parsing in the Lambek calculus is an NP-complete problem [27]. This is not the case for string MILL1 grammars, as the following theorem shows.
Theorem 30.
String MILL1 grammars generate an NP-complete language.
Proof.
In view of Corollary 28, it suffices to present a linear-time ht-system generating an NP-complete language of string graphs. Let be nonterminal symbols such that , and ; let be terminal symbols with type . Let the start hypergraph be (“” is introduced in Definition 5). The rules are presented below.
1. ;
2. , for ;
3. ;
4. , for ;
5. ;
6. ;
7. .
This ht-system is linear-time because each production except rule 5 (which is applied at most once in any derivation) increases the number of terminal symbols. The ht-system generates hypergraphs of the form
(2)
such that are nonempty strings over the alphabet and such that coincides as a multiset with for some . Consequently, one can reduce the exact cover by 3-sets problem to this language: if is a set and is a collection of 3-element subsets of , then checking whether is the disjoint union of some sets from is equivalent to checking whether the hypergraph (2) is generated by the ht-system.
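To make the reduction target concrete: exact cover by 3-sets (X3C) asks whether a set can be written as a disjoint union of some of the given 3-element subsets. Below is a minimal brute-force decider for this problem; the function name is illustrative, and since X3C is NP-complete, the search is necessarily exponential.

```python
from itertools import combinations

def exact_cover_3sets(universe, triples):
    """Return True iff `universe` is the disjoint union of some of the
    given 3-element subsets (exact cover by 3-sets, X3C)."""
    universe = frozenset(universe)
    if len(universe) % 3 != 0:
        return False
    k = len(universe) // 3
    # Brute force: try every choice of k triples (exponential in general).
    for choice in combinations(triples, k):
        covered = set()
        for t in choice:
            if covered & set(t):       # overlap: not a disjoint union
                break
            covered |= set(t)
        else:
            if covered == universe:
                return True
    return False
```

For instance, {1,…,6} is exactly covered by {1,2,3} and {4,5,6}, but not by any selection from {1,2,3} and {3,4,5}, which overlap.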
Let us present another proof of Theorem 30 that relies on a result by Book [3]. Consider a string rewriting system where is a finite set of rules of the form ( are arbitrary strings) and is the start string. Recall that whenever (and no other pairs of strings are in the relation ); recall also that . Let us convert into the ht-system where the symbols from have the type and . (Let us assume that, in each rule , and are nonempty.) It is straightforward to check that . Finally, [3, Theorem 1] says that there is a linear-time string rewriting system that generates an NP-complete language. Thus, if one constructs from it and then applies Corollary 28, one obtains a string MILL1 grammar generating an NP-complete language.
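The one-step rewriting relation of such a string rewriting system can be sketched as follows; `successors` and `reachable` are illustrative names, not notation from the paper or from [3].

```python
def successors(s, rules):
    """All strings obtainable from s by one application of a rule (u, w),
    i.e. by rewriting one occurrence of u inside s to w."""
    result = set()
    for u, w in rules:
        start = 0
        while (i := s.find(u, start)) != -1:
            result.add(s[:i] + w + s[i + len(u):])
            start = i + 1   # keep scanning for further occurrences of u
    return result

def reachable(start, rules, max_steps):
    """Strings derivable from `start` in at most max_steps rewriting steps."""
    frontier, seen = {start}, {start}
    for _ in range(max_steps):
        frontier = {t for s in frontier for t in successors(s, rules)} - seen
        seen |= frontier
    return seen
```

With the rules S -> aSb and S -> ab, two steps from S already yield the string aabb, matching the usual context-free derivation of a^n b^n.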
One of the reviewers pointed out that string MILL1 grammars generating non-semilinear languages is too powerful for linguistic applications, where it is widely assumed that natural languages are semilinear. The reviewer asked whether restricting to the fragment in which each (free or bound) variable occurs in each formula at most twice would make the languages generated by the corresponding class of grammars semilinear. I am afraid that, even with this restriction, string MILL1 grammars generate some non-semilinear and NP-complete languages. Let us consider the above construction involving string rewriting systems. If, for example, is a string rewriting rule, then
In a formula of the form , each variable occurs at most four times. However, we claim that one could remove the predicates everywhere, which would result in each variable occurring in each formula at most twice. We also claim that Lemma 22 would remain true without predicates in formulas in the case where we consider only string graphs. Thus, we can use Book's result from [3] to generate an NP-complete and a non-semilinear language by a grammar over the restricted fragment of . We do not provide formal proofs but leave them for future work.
5 Hypergraph Language Semantics
Now, let us proceed to model-theoretic investigations into MILL1. Our objective is to generalise language models for the Lambek calculus to MILL1 and to devise hypergraph language models. This will enable one to regard MILL1 as a logic for reasoning about hypergraph resources. The most important question is how the composition of such resources should be defined: given two hypergraphs, how should one understand their multiplicative product?
Language semantics for the Lambek calculus is, algebraically speaking, a mapping of formulae into subsets of a free semigroup of words which interprets the product of two formulae as the elementwise product of their interpretations. What is the hypergraph counterpart of free semigroups? In the field of hyperedge replacement, there are algebras of hypergraphs called HR-algebras [6, 7, 31]. They include the parallel composition operation and source-manipulating operations. Parallel composition is a way of gluing hypergraphs, defined as follows.
Definition 31.
Let and be hypergraphs. Let be the smallest equivalence relation on such that for . Then, parallel composition is the hypergraph such that ; , ; , for ; for .
Informally, is obtained by taking the disjoint union of and and fusing with for . This operation is illustrated in Figure 3. Note that parallel composition is associative and commutative.
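Under the assumption that a hypergraph is represented by its node set, its labelled hyperedges, and a selector-indexed family of external nodes, parallel composition can be sketched as a disjoint union followed by a union-find fusion of the external nodes with a common selector. All names below are illustrative, not the paper's notation.

```python
class Hypergraph:
    """Toy hypergraph: a node set, hyperedges as (label, attachment tuple)
    pairs, and external nodes indexed by selector."""
    def __init__(self, nodes, edges, ext):
        self.nodes = set(nodes)
        self.edges = list(edges)
        self.ext = dict(ext)

def parallel_composition(g, h):
    """Disjoint union of g and h in which external nodes carrying the
    same selector are fused."""
    parent = {}
    def find(x):                      # union-find with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # Tag nodes with their graph of origin to make the union disjoint,
    # then fuse the external nodes that share a selector.
    for sel in g.ext.keys() & h.ext.keys():
        parent[find(('g', g.ext[sel]))] = find(('h', h.ext[sel]))
    tagged = [('g', g), ('h', h)]
    nodes = {find((t, v)) for t, G in tagged for v in G.nodes}
    edges = [(lab, tuple(find((t, v)) for v in att))
             for t, G in tagged for lab, att in G.edges]
    ext = {sel: find((t, v)) for t, G in tagged for sel, v in G.ext.items()}
    return Hypergraph(nodes, edges, ext)
```

Composing two single-edge hypergraphs whose attachment nodes are all external yields a hypergraph with two parallel hyperedges on the same (fused) nodes, which is the picture behind the name "parallel composition".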
Other operations used in HR-algebras allow one to reassign selectors to external nodes, to make an external node non-external, or to fuse some external nodes. We shall use an operation that unifies all three manipulations; we call it substitution.
Definition 32.
Given a hypergraph and a partial function , let be the smallest equivalence relation such that whenever . Then is the hypergraph such that ; ; ; , for ; .
One can compare the substitution operation with the operations of source renaming and source fusion from [31] and verify that the former is interdefinable with the latter (in the presence of parallel composition). Hence, one can use and as the basic operations of HR-algebras. Conveniently, exactly these operations can also be used to define hypergraph language models, which we do now. Let be the empty hypergraph. From now on, (i.e. selectors and variables are the same objects).
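Substitution along a partial selector map can be sketched in the same toy representation (a dict from selectors to external nodes). This is an illustrative reading of Definition 32, not the paper's exact formalisation: external nodes whose selectors are mapped to the same selector are fused, and selectors outside the domain of the map are dropped, making their nodes internal.

```python
from collections import namedtuple

# Toy representation: ext maps selectors to external nodes.
Hypergraph = namedtuple('Hypergraph', 'nodes edges ext')

def substitute(g, f):
    """Apply a partial selector map f (a dict on selectors) to g."""
    parent = {}
    def find(x):                      # union-find with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # Group external nodes by their image under f and fuse each group.
    groups = {}
    for sel, v in g.ext.items():
        if sel in f:
            groups.setdefault(f[sel], []).append(v)
    for vs in groups.values():
        for v in vs[1:]:
            parent[find(vs[0])] = find(v)
    nodes = {find(v) for v in g.nodes}
    edges = [(lab, tuple(find(v) for v in att)) for lab, att in g.edges]
    ext = {sel: find(vs[0]) for sel, vs in groups.items()}
    return Hypergraph(nodes, edges, ext)
```

In one call this covers all three manipulations mentioned above: a renaming (f injective), making a node non-external (a selector outside dom(f)), and a fusion (two selectors with the same image).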
Definition 33.
A hypergraph language model is a pair where is a -typed alphabet and is a function mapping formulas of MILL1 to sets of abstract hypergraphs over which satisfies the following conditions:
1. for any total function ;
2. ;
3. ;
4. ;
5. .
A sequent is true in this model if (for , if ).
This semantics can be viewed as an instance of intuitionistic phase semantics [15] with the commutative monoid and with the trivial closure operator .
The first property in Definition 33 relates substitution as a logical operation to substitution as a hypergraph transformation; without it, the two notions would be unrelated, which is undesirable. Besides, this property is used in the soundness proof. Note that, if is a bijection, then (apply property 1 twice). Quantifiers are interpreted as additive conjunction and disjunction, which reflects their behaviour correctly.
Lemma 34.
1. if and only if .
2. If , then for any substitution .
Proof.
1. Trivially follows from the definition of a model.
2.
The main result is soundness of MILL1 and completeness of its universal-implicative fragment w.r.t. hypergraph language models.
Theorem 35.
1. MILL1 is sound w.r.t. hypergraph language models.
2. The universal-implicative fragment of MILL1 is complete w.r.t. hypergraph language models.
Proof.
Soundness can be checked straightforwardly by showing that the conclusion of each rule with true premises is also true. The only nontrivial cases are the rules and . Assume that where does not occur freely in is true in a model . Without loss of generality, let us assume that is empty (otherwise, we can move it to the succedent using Lemma 34). Under this assumption, we are given that . By Lemma 34, for each , . This proves that . The case is dealt with similarly.
Completeness is proved by constructing a canonical model. Before doing so, let us introduce a new notion used in the construction.
Definition 36.
Assume that a countable sequence of variables $x_1, x_2, \ldots$ is fixed. Given a formula , let us read it from left to right and replace the $i$-th occurrence of a free variable from the left by $x_i$. We denote the resulting formula by .
For example, . We shall use formulas of the form as labels of hyperedges in the canonical model with . So, . The idea is that, in the canonical model, variables are represented by nodes, while hyperedge labels do not carry information about variables ( are, informally, placeholders for variables).
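At the token level, the renaming of Definition 36 can be sketched as follows, assuming the fixed variable sequence is x1, x2, … and that the formula has already been tokenised; bound variables are handled simply by excluding them from `free_vars`. The function name is illustrative.

```python
def normalise(tokens, free_vars):
    """Replace the i-th occurrence (counting from the left) of any free
    variable by the placeholder 'x<i>' -- a token-level sketch of the
    renaming in Definition 36."""
    out, i = [], 0
    for t in tokens:
        if t in free_vars:
            i += 1
            out.append('x%d' % i)
        else:
            out.append(t)
    return out
```

For instance, the token sequence of p(y, z, y) becomes p(x1, x2, x3): every occurrence gets its own placeholder, so the resulting label carries no information about which occurrences shared a variable, exactly as intended for hyperedge labels in the canonical model.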
Given a sequent where () and given two finite sets of variables such that , let us define the hypergraph as follows: ; ; for where is a function such that ; ; for (and undefined otherwise).
Let . The proof that is a hypergraph language model is quite technical; it can be found in [30, Appendix A.4]. Now, assume that is true in this model. Then, , i.e. , and is derivable in . By invertibility of , the sequent is derivable.
6 Conclusion
As we have shown, first-order intuitionistic linear logic does have strong connections to hypergraph grammar theory, namely, to hypergraph transformation systems and to HR-algebras. The notion of hypergraph first-order categorial grammars naturally and simply extends the concept of Lambek categorial grammars to hypergraphs. Developing hypergraph MILL1 grammars and relating them to hypergraph transformation systems gave us useful insights into the expressive power of string MILL1 grammars. In turn, the notion of a hypergraph language model revealed a previously unknown connection of first-order intuitionistic linear logic to the apparatus of HR-algebras, the latter having been studied mainly in the context of monadic second-order definability.
Two questions remain open for future work. The first is whether the converse of Theorem 25 holds; more generally, it is desirable to characterise hypergraph MILL1 grammars precisely in terms of hypergraph transformation systems. The second is whether MILL1 is complete w.r.t. hypergraph language models.
References
- [1] Jean-Marc Andreoli. Logic programming with focusing proofs in linear logic. Journal of Logic and Computation, 2(3):297–347, 1992. doi:10.1093/logcom/2.3.297.
- [2] Ronald V. Book. Time-bounded grammars and their languages. Journal of Computer and System Sciences, 5(4):397–429, 1971. doi:10.1016/S0022-0000(71)80025-9.
- [3] Ronald V. Book. On the complexity of formal grammars. Acta Informatica, 9:171–181, 1978. doi:10.1007/BF00289076.
- [4] Wojciech Buszkowski. Compatibility of a categorial grammar with an associated category system. Mathematical Logic Quarterly, 28(14-18):229–238, 1982. doi:10.1002/malq.19820281407.
- [5] Wojciech Buszkowski. Type Logics in Grammar, pages 337–382. Springer Netherlands, Dordrecht, 2003. doi:10.1007/978-94-017-3598-8_12.
- [6] Bruno Courcelle. The monadic second-order logic of graphs. I. Recognizable sets of finite graphs. Information and Computation, 85(1):12–75, 1990. doi:10.1016/0890-5401(90)90043-H.
- [7] Bruno Courcelle and Joost Engelfriet. Graph Structure and Monadic Second-Order Logic: A Language-Theoretic Approach. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2012.
- [8] Frank Drewes, Hans-Jörg Kreowski, and Annegret Habel. Hyperedge replacement graph grammars. In Grzegorz Rozenberg, editor, Handbook of Graph Grammars and Computing by Graph Transformations, Volume 1: Foundations, pages 95–162. World Scientific, 1997. doi:10.1142/9789812384720_0002.
- [9] Joost Engelfriet. Context-free graph grammars. In Grzegorz Rozenberg and Arto Salomaa, editors, Handbook of Formal Languages, Volume 3: Beyond Words, pages 125–213. Springer, 1997. doi:10.1007/978-3-642-59126-6_3.
- [10] Jean-Yves Girard. Linear logic: A survey. In Friedrich L. Bauer, Wilfried Brauer, and Helmut Schwichtenberg, editors, Logic and Algebra of Specification, pages 63–112, Berlin, Heidelberg, 1993. Springer Berlin Heidelberg. doi:10.1007/978-3-642-58041-3_3.
- [11] Aleksei Gladkii. On complexity of inference in phase-structure grammars. Algebra i Logika. Sem. (in Russian), 3(5-6):29–44, 1964.
- [12] Annegret Habel. Hyperedge Replacement: Grammars and Languages, volume 643 of Lecture Notes in Computer Science. Springer, 1992. doi:10.1007/BFB0013875.
- [13] Annegret Habel, Jürgen Müller, and Detlef Plump. Double-pushout graph transformation revisited. Mathematical Structures in Computer Science, 11(5):637–688, 2001. doi:10.1017/S0960129501003425.
- [14] Laura Kallmeyer. Parsing Beyond Context-Free Grammars. Springer Berlin Heidelberg, 2010. doi:10.1007/978-3-642-14846-0.
- [15] Max I. Kanovich, Mitsuhiro Okada, and Kazushige Terui. Intuitionistic phase semantics is almost classical. Mathematical. Structures in Comp. Sci., 16(1):67–86, 2006. doi:10.1017/S0960129505005062.
- [16] Yuichi Komori. Predicate logics without the structure rules. Studia Logica, 45(4):393–404, 1986. doi:10.1007/bf00370272.
- [17] Barbara König, Dennis Nolte, Julia Padberg, and Arend Rensink. A tutorial on graph transformation. In Reiko Heckel and Gabriele Taentzer, editors, Graph Transformation, Specifications, and Nets - In Memory of Hartmut Ehrig, volume 10800 of Lecture Notes in Computer Science, pages 83–104. Springer, 2018. doi:10.1007/978-3-319-75396-6_5.
- [18] Joachim Lambek. The mathematics of sentence structure. The American Mathematical Monthly, 65(3):154–170, 1958. doi:10.1080/00029890.1958.11989160.
- [19] Michael Moortgat. Multimodal linguistic inference. Journal of Logic, Language and Information, 5(3–4):349–385, 1996. doi:10.1007/bf00159344.
- [20] Richard Moot. Extended Lambek calculi and first-order linear logic. In Claudia Casadio, Bob Coecke, Michael Moortgat, and Philip J. Scott, editors, Categories and Types in Logic, Language, and Physics - Essays Dedicated to Jim Lambek on the Occasion of His 90th Birthday, volume 8222 of Lecture Notes in Computer Science, pages 297–330. Springer, 2014. doi:10.1007/978-3-642-54789-8_17.
- [21] Richard Moot. Hybrid type-logical grammars, first-order linear logic and the descriptive inadequacy of lambda grammars, 2014. arXiv:1405.6678.
- [22] Richard Moot and Mario Piazza. Linguistic applications of first order intuitionistic linear logic. Journal of Logic, Language and Information, 10(2):211–232, 2001. doi:10.1023/a:1008399708659.
- [23] Richard Moot and Christian Retoré. The Logic of Categorial Grammars - A Deductive Account of Natural Language Syntax and Semantics, volume 6850 of Lecture Notes in Computer Science. Springer, 2012. doi:10.1007/978-3-642-31555-8.
- [24] Glyn Morrill, Oriol Valentín, and Mario Fadda. The displacement calculus. Journal of Logic, Language and Information, 20(1):1–48, 2010. doi:10.1007/s10849-010-9129-2.
- [25] Mati Pentus. Lambek grammars are context free. In Proceedings Eighth Annual IEEE Symposium on Logic in Computer Science, pages 429–433, 1993. doi:10.1109/LICS.1993.287565.
- [26] Mati Pentus. Models for the Lambek calculus. Ann. Pure Appl. Log., 75(1-2):179–213, 1995. doi:10.1016/0168-0072(94)00063-9.
- [27] Mati Pentus. Lambek calculus is NP-complete. Theoretical Computer Science, 357(1):186–201, 2006. Clifford Lectures and the Mathematical Foundations of Programming Semantics. doi:10.1016/j.tcs.2006.03.018.
- [28] Tikhon Pshenitsyn. Hypergraph Lambek grammars. Journal of Logical and Algebraic Methods in Programming, 129:100798, 2022. doi:10.1016/j.jlamp.2022.100798.
- [29] Tikhon Pshenitsyn. Commutative Lambek grammars. Journal of Logic, Language and Information, 32(5):887–936, 2023. doi:10.1007/s10849-023-09407-z.
- [30] Tikhon Pshenitsyn. First-order intuitionistic linear logic and hypergraph languages, 2025. doi:10.48550/arXiv.2502.05816.
- [31] Grzegorz Rozenberg. Handbook of Graph Grammars and Computing by Graph Transformation. World Scientific, 1997. doi:10.1142/3303.
- [32] Sylvain Salvati. A note on the complexity of abstract categorial grammars. In Christian Ebert, Gerhard Jäger, and Jens Michaelis, editors, The Mathematics of Language, pages 266–271, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg.
- [33] Harold Schellinx. Some syntactical observations on linear logic. Journal of Logic and Computation, 1(4):537–559, September 1991. doi:10.1093/logcom/1.4.537.
- [34] Sergey Slavnov. Making first order linear logic a generating grammar. Logical Methods in Computer Science, Volume 19, Issue 4, 2023. doi:10.46298/lmcs-19(4:11)2023.
- [35] Kohtaro Tadaki, Tomoyuki Yamakami, and Jack C.H. Lin. Theory of one-tape linear-time Turing machines. Theoretical Computer Science, 411(1):22–43, 2010. doi:10.1016/j.tcs.2009.08.031.
- [36] Tadahiro Uesu. A system of graph grammars which generates all recursively enumerable sets of labelled graphs. Tsukuba Journal of Mathematics, 2:11–26, 1978. doi:10.21099/tkbjm/1496158502.
- [37] Johan van Benthem. Language in Action: Categories, Lambdas and Dynamic Logic. MIT Press, 1995.