Abstract 1 Introduction 2 Basic on Strings and Fibonacci Cubes 3 Tilde-distance and Tilde-hypercube 4 Tilde-hypercube Avoiding a Word and Tilde-isometric Words 5 Characterization of Tilde-isometric Words 7 Conclusions References

Generalized Fibonacci Cubes Based on Swap and Mismatch Distance

Marcella Anselmo ORCID Dipartimento di Informatica, Università di Salerno, Italy Giuseppa Castiglione ORCID Dipartimento di Matematica e Informatica, Università di Palermo, Italy Manuela Flores ORCID Dipartimento di Bioscienze e Territorio, Università del Molise, Campobasso, Italy Dora Giammarresi ORCID Dipartimento di Matematica, Università Roma “Tor Vergata”, Italy Maria Madonia ORCID Dipartimento di Matematica e Informatica, Università di Catania, Italy Sabrina Mantaci ORCID Dipartimento di Matematica e Informatica, Università di Palermo, Italy
Abstract

The hypercube of dimension n is the graph with 2n vertices associated to all binary words of length n and edges connecting pairs of vertices with Hamming distance equal to 1. Here, an edit distance based on swaps and mismatches is considered and referred to as tilde-distance. Accordingly, the tilde-hypercube is defined, with edges linking words having tilde-distance equal to 1. The focus is on the subgraphs of the tilde-hypercube obtained by removing all vertices having a given word as factor. If the word is 11, then the subgraph is called tilde-Fibonacci cube; in the case of a generic word, it is called generalized tilde-Fibonacci cube. The paper surveys recent results on the definition and characterization of those words that define generalized tilde-Fibonacci cubes that are isometric subgraphs of the tilde-hypercube. Finally, a special attention is given to the study of the tilde-Fibonacci cubes.

Keywords and phrases:
Swap and mismatch distance, Isometric words, Hypercube
Category:
Research
Copyright and License:
[Uncaptioned image] © Marcella Anselmo, Giuseppa Castiglione, Manuela Flores, Dora Giammarresi, Maria Madonia, and Sabrina Mantaci; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Mathematics of computing Combinatorics on words
; Theory of computation Design and analysis of algorithms ; Theory of computation Formal languages and automata theory
Funding:
Partially supported by INdAM-GNCS Project 2025 - CUP E53C24001950001, FARB Project ORSA240597 of University of Salerno, TEAMS Project and PNRR MUR Project PE0000013-FAIR University of Catania, FFR fund University of Palermo, MUR Excellence Department Project MatMod@TOV, CUP E83C23000330006, awarded to the Department of Mathematics, University of Rome Tor Vergata, “ACoMPA – Algorithmic and Combinatorial Methods for Pangenome Analysis” (CUP B73C24001050001) funded by the NextGeneration EU programme PNRR ECS00000017 Tuscany Health Ecosystem (Spoke 6).
Editors:
Alessio Conte, Andrea Marino, Giovanna Rosone, and Jeffrey Scott Vitter

1 Introduction

The n-dimensional hypercube, Qn, is the well-known graph whose vertices are in correspondence with the 2n words of length n over the binary alphabet {0,1} and two vertices are connected by an edge if the corresponding words differ in one position, that is, if their Hamming distance is 1. Hence, the distance (number of edges in a minimal path) between two vertices in the graph is the Hamming distance of the corresponding words. The notion of hypercube has been extensively investigated because it is used to design interconnection networks (cf. [13, 17]) and also finds applications in theoretical chemistry (cf. [27] and [20, 24] for surveys). However, the critical limitation of the hypercube lies in its exponential growth in size, specifically in the number of vertices. Exploring some of its isometric subgraphs can improve efficiency. A subgraph of the n-dimensional hypercube is isometric to Qn if the distance between any pair of vertices in such subgraphs is the same as the distance in the complete hypercube. With this aim, in 1993, Hsu introduced the Fibonacci cubes [21], obtained from Qn by selecting only those vertices that do not contain 11 as a factor. They have a Fibonacci number of vertices making them useful in applications to reduce the complexity and to limit resources. They have many remarkable properties, also related to Fibonacci numbers (cf. [15]).

In 2012, Generalized Fibonacci cubes have been defined by means of a binary word f; the graph Qn(f) is a subgraphs of Qn whose vertices do not contain f as a factor, i.e. the vertices are f-free binary words [22]. Then, the property of Qn(f) being an isometric subgraph of Qn is related to some combinatorial properties of the avoided word f that in such cases is called isometric. More formally, a binary word f is isometric (or Ham-isometric) when, for any n1, Qn(f) is an isometric subgraph of Qn, and non-isometric, otherwise [25]. The definition can also be done without mentioning graphs and only in terms of the Hamming distance. A word f is Ham-isometric, if for any pair of f-free words u and v of the same length, u can be transformed into v by a minimal sequence of symbol replacements that each time produce an f-free word, as well. Binary Ham-isometric words have been characterized in [23, 25, 35, 38, 39] and research on the topic remains very active [11, 36, 37, 12, 6, 7, 16, 8, 9, 4, 10].

Over the years, many variations of the hypercube have been introduced to enhance certain features. For example, folded hypercubes (cf. [18]) and enhanced hypercubes (cf. [32]) have been defined by adding some edges to the original structure, providing several advantages in terms of topological properties. Motivated by the same reasons, we introduced a type of generalized hypercube that employs a different distance, enabling the definition of isometric subgraphs. The inspiration and motivation came from many applications of computational biology, where many processes involve complex transformations and it is natural to consider not only replacement operations but also swap operations that exchange two adjacent different symbols in a word. The edit distance based on swaps and replacements is worth considering [1, 19, 29] instead of the Hamming distance. Actually, this distance was defined in the 70s by Wagner and Fischer [34, 33] who proved that it can be efficiently computed.

In [3] this distance is referred to as tilde-distance, since the symbol somehow evokes the swap operation. In [2], the tilde-distance is taken as the base to define the tilde-hypercube, Q~n; it has again all the n-binary strings as vertices, but two vertices are adjacent if the tilde-distance is equal to 1. This implies that Q~n has more edges than Qn; in particular, since a swap corresponds to two replacements of consecutive characters, some vertices at distance 2 in Qn, become adjacent in Q~n.

This paper surveys most of the recent results on tilde-isometric words and generalized tilde-Fibonacci cubes. In particular, we collect the main definitions to give a recursive construction of tilde-hypercubes and enumerate their edges and vertices. Then, the subgraphs Q~n(f) of the tilde-hypercubes are considered. They are obtained by selecting the vertices corresponding to f-free words, for a given word f. It is also reported the characterization of words f such that Q~n(f) is an isometric subgraph of Q~n as proved in [5]. We also exhibit an infinite family of tilde-isometric words that are not Hamming-isometric and vice versa. Finally, the last part of the paper focuses on the word f=11, that is both a Hamming- and a tilde-isometric word; the subgraph Q~n(11) is referred to as the tilde-Fibonacci cube. We describe its recursive construction and present new simple results on diameter and radius that prove that the tilde-Fibonacci cubes are self-centered graphs.

2 Basic on Strings and Fibonacci Cubes

In this paper we focus only on the binary alphabet Σ={0,1}. A word (or string) w over Σ of length |w|=n, is w=a1a2an, where a1,a2,,an are symbols in Σ. The set of all words over Σ is denoted Σ. Finally, ϵ denotes the empty word and Σ+=Σ{ϵ}. For any word w=a1a2an, the reverse of w is the word wrev=anan1a1. If xΣ, x¯ denotes the opposite of x, i.e x¯=1 if x=0 and vice versa. Then we define the complement of w the word w¯=a¯1a¯2a¯n.

Let w[i] denote the symbol of w in position i, i.e. w[i]=ai. Then, w[i..j]=aiaj, for 1ijn, denotes a factor u of w of length ji+1 placed from the i-th to the jth position of w. We say that I=[i..j] is the interval where the factor u occurs in w. The prefix (resp. suffix) of w of length l, with 1ln1 is prel(w)=w[1..l] (resp. sufl(w)=w[nl+1..n]). When prel(w)=sufl(w)=u then u is here referred to as an overlap of w of length l; in other frameworks, it is also called the border, or bifix of w (cf. [30]). A word w avoids a word f if w does not contain f as a factor: we also say that w is f-free.

An edit operation is a function O:ΣΣ that transforms a word into another one. Let OP be a set of edit operations. The edit distance of words u,vΣ, with respect to the set OP, is the minimum number of edit operations in OP needed to transform u into v.

In this paper, we consider the edit distance that uses only swap and replacement operations. Note that these operations preserve the length of the word.

Definition 1.

Let w=a1a2an be a word over Σ.
The replacement operation (or replacement, for short) on w at position i, with i=1,,n, is defined by

Ri(a1a2ai1𝒂𝒊ai+1an)=a1a2ai1𝒂¯𝒊ai+1an.

The swap operation (or swap, for short) on w at position i, with i=1,,n1 and aiai+1, is defined by

Si(a1a2ai1𝒂𝒊𝒂𝒊+𝟏ai+2an)=a1a2ai1𝒂𝒊+𝟏𝒂𝒊ai+2an.

Note that one swap corresponds to two replacements of consecutive symbols.

The Hamming distance distH(u,v) of equal-length words u,vΣ is defined as the minimum number of replacements needed to obtain v from u.


Let G=(V(G),E(G)) be a graph, V(G) be the set of its nodes and E(G) be the set of its edges. The distance of u,vV(G), distG(u,v), is the length of the shortest path that connects u and v in G. A subgraph S=(V(S),E(S)) of a (connected) graph G is an isometric subgraph if for any u,vV(S), distS(u,v)=distG(u,v).

The n-hypercube, or binary n-cube, Qn, is a graph with 2n vertices, labeled with the words of length n and edges connecting two vertices u and v in Qn when their labels differ exactly in 1 position, i.e. when distH(u,v)=1. Therefore, distQn(u,v)=distH(u,v).

Denote by fn the n-th Fibonacci number, defined by f1=1,f2=1 and fi=fi1+fi2, for i3. The Fibonacci cube (cf. [22]) Fn of order n is the subgraph of Qn whose vertices are binary words of length n avoiding the factor 11. It is well known that Fn is an isometric subgraph of Qn (cf. [24]).

One of the main properties of Qn and Fn is their recursive structure that have been extensively studied (cf. [21, 28, 26, 24]).

The following results are well-known, but are hereby stated for future reference.

Proposition 2.

Let Qn be the hypercube of order n. Then

  • |V(Qn)|=2n,

  • |E(Qn)|=n2n1.

Proposition 3.

Let Fn be the Fibonacci cube. Then:

  • |V(Fn)|=fn+2,

  • |E(Fn)|=2(n+1)fn+nfn+15.

In other terms, if the number of vertices N=2n of the hypercube is taken as main parameter, the number of edges of a hypercube with N vertices is (NlogN)/2.

The sequence |E(Fn)| is Sequence A001629 in [31]. Hence, the number of edges of a Fibonacci cube with N vertices, N=fn+2, is O(NlogN), asymptotically equal to the number of edges of a hypercube with the same number of vertices.

In order to generalize the notion of Fibonacci cube, one can consider the subgraphs of the n-hypercube whose nodes are f-free, for some word fΣ, denoted by Qn(f), called generalized Fibonacci cubes. Of course, if n<|f|, then Qn(f)=Qn; so it makes sense to consider n|f|. Unfortunately, not all the words f make Qn(f) an isometric subgraph of Qn. The words having this property have been widely studied (cf. [23, 25, 35, 38, 39]). These words are in fact called isometric words, because they reflect on this isometry property on the corresponding graphs.

Under the combinatorial point of view, isometric words are defined as follows. A word fΣ is Ham-isometric (or simply isometric) if and only if Qn(f) is an isometric subgraph of Qn. A word that is not Ham-isometric is said to be Ham-non-isometric. In terms of transformations, isometric words can be characterized by the following:

Proposition 4.

A word f is isometric iff for any pair of f-free words u and v, there exists a sequence of distH(u,v) replacements that transforms u into v where all the intermediate words are also f-free.

A word w has a 2-error overlap if there exists l|w| such that prel(w) and sufl(w) have Hamming distance 2. Then, the following characterization of Ham-non-isometric words is proved (cf. [35]).

Proposition 5 ([35]).

A word f is Ham-non-isometric if and only if f has a 2-error overlap.

3 Tilde-distance and Tilde-hypercube

In this section, the edit distance based on swap and replacement operations is considered. In [3] it is called tilde-distance and denoted by dist . According to this definition one can define the n-tilde-hypercube. We start with the definition of tilde-distance.

Definition 6.

Let u,vΣ be words of equal length. The tilde-distance dist(u,v) between u and v is the minimum number of replacements and swaps needed to transform u into v.

Note that for all u,vΣ, dist(u,v)distH(u,v), since a swap is a shortcut for two adjacent replacements.

Example 7.

The words u=1011 and v=0110 have tilde-distance dist(u,v)=2. In fact, v can be obtained from u with a swap S1 of the first and the second bits, and a replacement R4 in the fourth position. Note that, in order to compute the Hamming distance, three replacements are needed in positions 1,2 and 4, therefore distH(u,v)=3.

In order to describe the sequence of the operations that are used to compute the tilde-distance of two words, we need the following definition of a tilde-transformation:

Definition 8.

Let u,vΣ be words of equal length and dist(u,v)=d. A tilde-transformation τ from u to v is a sequence of d+1 words (w0,w1,,wd) such that w0=u, wd=v, and for any k=0,1,,d1, dist(wk,wk+1)=1. Further, τ is f-free if for any i=0,1,,d, word wi is f-free.

A tilde-transformation (w0,w1,,wd) from u to v with dist(u,v)=d is associated to a sequence of d operations (Oi1,Oi2,,Oid) such that, for any k=1,,d, Oik{Rik,Sik} and wk=Oik(wk1); it can be graphically represented as follows:

u=w0Oi1w1Oi2Oihwh=v.

With a little abuse of notation, in the sequel we will refer to a tilde-transformation both as a sequence of words and as a sequence of operations. Furthermore, as a consequence of the definition, in a tilde-transformation, the positions i1, i2, , id are all distinct.
In the following we give some examples that show some special features of tilde-transformations, which never occur when transformations using only replacements are considered. This highlights, on the one hand, the richness of new situations that arise from introducing the swap operation, and on the other hand, it anticipates the need for new and more sophisticated techniques to handle these increasingly complex scenarios. First of all, we point out that when only replacements are used, the different transformations from u to v use the same set of operations, possibly applied in a different order. This is not the case for the different tilde-transformations between two words. The following two examples show two singular cases of different tilde-transformations which use different sets of operations.

Example 9.

Let u=100, v=001. In this case σ1=(S1,S2) and σ2=(R1,R2) are two tilde-transformations from u to v on different sets of operations. In particular, observe that σ1 flips twice the bit in the second position, whereas in σ2 the second bit is not involved.

σ1:100S1010S2001σ2:100R1000R3001
Example 10.

Consider u=010 and v=101 and the tilde-transformations ρ1=(S1,R3) and ρ2=(S2,R1) from u to v. Here, different sets of operations are used and, differently from Example 9 in both transformations each symbol is changed just once:

ρ1:010S1100R3101ρ2:010S2001R1101

The variety of situations described above translates into a higher degree of difficulty of the tilde-transformations (compared to the Hamming transformations) when handling some property like, for instance, isometricity.

 Remark 11.

Let u,vΣm and τ be a tilde-transformation from u to v. Referring to Example 9, if τ contains two swaps, Si and Si+1, at consecutive positions i and i+1 of u, then such swaps can be substituted by two replacements, namely Ri and Ri+2, still obtaining a tilde-transformation from u to v of length equal to their distance. Hence, in particular, if u=00u and v=10v, among all sequences of swaps and replacements of minimal length that transform u into v there is one starting with the replacement R1.

 Remark 12.

Let u,vΣm and τ be a tilde-transformation from u to v. Referring to Example 10. If τ contains a swap Si+1 and a replacement Ri then they can be substituted by Si and Ri+2. As a consequence, if u=01u and v=10v among all sequences of swaps and replacements of minimal length that transform u into v there is one starting with the replacement R1.

Based on the definition of tilde-distance, a natural extension of the concept of n-hypercube is given in [2] as follows:

Definition 13.

The n-tilde-hypercube Q~n, or tilde-hypercube of order n, is a graph with 2n vertices, labeled with binary words of length n. Two vertices in Q~n, are adjacent whenever the tilde-distance of their labels is 1.

Figure 1: Tilde-hypercubes with n=1,2,3,4 (the colored edges are those added to the traditional hypercube).

Figure 1 shows the tilde-hypercubes of order 1,2,3,4.

 Remark 14.

Qn is a proper subgraph of Q~n. In fact, for u,vΣ, distH(u,v)=1 implies dist(u,v)=1. Note that Qn is not an isometric subgraph of Q~n. Indeed, for any n2, there exists a pair of words (un,vn) of length n such that dist(un,vn)=1 and distH(un,vn)1. For instance, for any 0k,hn2, h+k=n2, consider the words un=0h010k and vn=0h100k; then dist(un,vv)=1 and distH(un,vn)=2, therefore (un,vn) is an edge in Q~n but not in Qn.

The following lemma is the main tool for exhibiting a recursive definition of the tilde-hypercube, in analogy with the classical hypercube.

Lemma 15.

For any u,vΣn1, dist(u0,v0)=dist(u,v)=dist(u1,v1) and dist(u0,u1)=1. Moreover, for any uΣn2, dist(u01,u10)=1.

Proposition 16 ([2]).

The n-tilde-hypercubes Q~n, with n1, can be recursively defined.

Proof.

If n=1, Q~1 has just two vertices 0 and 1 connected by an edge.

Suppose all the tilde-hypercubes of dimension smaller than n have been defined. The hypercube Q~n is recursively constructed as follows. Start with a copy of Q~n1 where every vertex u is replaced by u0, and denote this copy by Q~n10 and a second copy where every vertex u is replaced by u1, and denoted by Q~n11.

By Lemma 15, if (u,v)E(Q~n1), then (u0,v0),(u1,v1)E(Q~n1), this means that Q~n10 and Q~n11 are subgraphs of Q~n. Moreover, for any uΣn1, there is an edge (u0,u1) in Q~n with u0V(Q~n10) and u1V(Q~n11). Finally, for any vΣn2 (v10,v01)E(Q~n) with v10V(Q~n10) and v01V(Q~n11). In Fig. 1, these latter edges added in the fourth step of recursion, are depicted with orange edges. For any other pair of words u,v{0,1}n, dist(u,v)>1, then (u,v) is not an edge of Q~n.

Corollary 17.

Let Q~n be the tilde-hypercube of order n. Then, |E(Q~1)|=1 and, for any n2 |E(Q~n)|=2|E(Q~n1)|+2n1+2n2.

By solving the above recurrence, we find the exact formula |E(Q~n)|=(3n1)2n2 (Sequence A053220 in [31]). Let EQ~(N) be the number of edges of the tilde-hypercube with N vertices, N=2n. Then,

EQ~(N)=3N(logN1)4. (1)

4 Tilde-hypercube Avoiding a Word and Tilde-isometric Words

In analogy with the n-hypercube avoiding a word based on the Hamming distance, referred to as generalized Fibonacci cube in [22], in this section, we consider the n-tilde-hypercube avoiding a word, based on the tilde-distance, here named generalized tilde-Fibonacci cubes.

In [2] the following definition is given:

Definition 18.

The n-tilde-hypercube avoiding a word f, or the tilde-hypercube of order n avoiding a word f, denoted Q~n(f), is the subgraph of Q~n obtained by removing those vertices which contain f as a factor.

We are interested in those words f such that Q~n(f) is an isometric subgraph of Q~n, i.e. the distance of two vertices of Q~n(f) is equal to the tilde-distance of the corresponding labels.

Definition 19.

A word fΣ is tilde-isometric if and only if for all n|f|, Q~n(f) is an isometric subgraph of Q~n.

The following proposition characterizes isometric words re-stating the definition of isometric word under a combinatorial point of view.

Proposition 20.

Let fΣ be a word of length n with n1. The word f is tilde-isometric if and only if for any pair of f-free words u and v of equal length mn, there exists a tilde-transformation from u to v that is f-free. It is tilde-non-isometric if it is not tilde-isometric.

In order to show that a word is tilde-non-isometric it is sufficient to exhibit a pair (u,v) of f-free words such that all the tilde-transformations from u to v are not f-free. Such pair of words is called a pair of tilde-witnesses. More challenging is to prove that a word is tilde-isometric.

Example 21.

The word f=1010 is tilde-non-isometric. In fact, let u=11000 and v=10110; then (u,v) is a pair of tilde-witnesses for f. In fact u and v are f-free; moreover there are only two possible tilde-transformations from u to v, namely 11000S210100R410110 and 11000R411010S210110, and in both cases 1010 appears as factor after the first step. On the other side, observe that f is Ham-isometric by Proposition 5.

The following straightforward property of tilde-isometric binary words is very helpful to simplify proofs and computations.

 Remark 22.

A word f is tilde-isometric iff f¯ is tilde-isometric iff frev is tilde-isometric.

In view of Remark 22 , we will focus on words starting with 1.

5 Characterization of Tilde-isometric Words

The characterization of Ham-isometric words given in [38] and here reported as Proposition 5, uses the notion of 2-error overlap. In this section we introduce the corresponding definition that refers to the tilde-distance. Tilde-error overlaps will have a main role in the characterization of tilde-isometric words but the presence of swap operations will force us to handle them with care.

Definition 23.

Let fΣn. Then, f has a q-tilde-error overlap of length and shift r=n, with 1n1 and 0q, if dist(pre(f),suf(f))=q.

Example 24.

The word f=1101110101101 has a 2-tilde-error overlap of length 6 and shift 7. Indeed, pre6(f)=110111, suf6(f)=101101 and dist(110111,101101)=2.

For our proofs, given a word f with a q-tilde-error overlap of length , we will study the tilde-transformations τ from pre(f) to suf(f) with q operations. For this reason, it is useful to refer to the alignment of the two strings pre(f) to suf(f). Furthermore, for our purpose, it is relevant to consider also the bits adjacent to a tilde-error overlap of a word. For this reason, we introduce the following notation.

Let f be a word in {0,1} and $ be a symbol different from 0,1, here used as delimiter of a word, that by definition “matches” any symbol of the word. Consider f with its delimiters $f$. A q-tilde-error overlap of length is denoted by ($xbay$) where xb, ay are a prefix and a suffix, respectively, of f, a,bΣ, x,yΣ with |x|=|y|=, and dist($xb,ay$)=dist(x,y)=q. This notation makes evident the fact that in f the prefix x is followed by b and the suffix y is preceded by a. Moreover, a q-tilde-error overlap ($xbay$) is sometimes factorized into blocks to highlight the significant parts. For example, the 2-tilde-error overlap in Example 24 is denoted by ($101)(10110110)(101$) because dist(110111,101101)=2=dist(1011,0110).


Figure 2: A word f and its 2-tilde-error overlap of shift r and length =nr, with tilde-transformation (Oi,Oj)=(Ri,Rj) (left), and (Oi,Oj)=(Si,Rj) (right).

In the sequel, we will be interested in the specific case of 1-tilde-error overlap where the single error in the alignment is a swap and in all the cases of 2-tilde-error overlaps. Consider a 2-tilde-error overlap of f of shift r, length =nr, and let (Oi,Oj), 1i<j, be a tilde-transformation from pre(f) to suf(f). Observe that the positions in pre(f) modified by Oi are either i or both i and i+1, following that Oi is a replacement or a swap. Hence, the number of the positions modified by Oi and Oj may be 2, 3 or 4. Fig. 2 shows a word f with its 2-tilde-error overlap of shift r and length =nr. With our notation, the 2-tilde-error overlap is ($w21w1w3baw20w0w3$) in the figure on the left and ($w210w1w3baw201w0w3$) in the figure on the right, where b is the first letter of w4 and a the last letter of w1. A tilde-transformation from pre(f) to suf(f) is given by (Oi,Oj)=(Ri,Rj) in the first case and by (Oi,Oj)=(Si,Rj) in the second case. We say that a 2-tilde-error overlap has non-adjacent errors when there is at least one character interleaving the positions modified by Oi and those modified by Oj.

The 2-tilde-error overlap ($xax)(100001)(yby$) is also considered as having non-adjacent errors because it admits the tilde-transformation (Oi,Oj)=(Ri,Ri+2), despite it has also the other tilde-transformation (Oi,Oj)=(Si,Si+1).

In all the other cases, we say that the 2-tilde-error overlap has adjacent errors.

Let us state the characterization of tilde-isometric words in terms of special configurations in their overlap proved in [5].

Theorem 25.

A word fΣn is tilde-non-isometric if and only if one of the following cases occurs (up to complement, reverse and inversion of rows):

  1. (C0)

    f has a 1-tilde-error overlap ($xax)(0110)(yby$) with x,yΣ, a,bΣ;

  2. (C1)

    f has a 2-tilde-error overlap with non-adjacent errors, different from ($xax)(000101)(yby$) with x,yΣ+, a,bΣ;

  3. (C2)

    f has a 2-tilde-error overlap ($xax)(01011010)(yby$) or ($xax)(01101001)(yby$) with x,yΣ, a,bΣ;

  4. (C3)

    f has a 2-tilde-error overlap ($xax)(010101)(yby$) with x,yΣ, a,bΣ;

  5. (C4)

    f has a 2-tilde-error overlap ($xax)(011100)(0$) with xΣ, aΣ;

  6. (C5)

    f has a 2-tilde-error overlap ($0)(0011)(1$).

Thanks to Theorem 25, we can classify any word as isometric or non-isometric. In the following, several examples are given. The following are two examples of words with a 2-tilde-error overlap with adjacent errors; the first one is tilde-isometric, the second one is tilde-non-isometric.

Example 26.

The word f=010110000 is tilde-isometric; indeed its unique 2-tilde-error overlap has shift 5 and length 4, ($010)(101000)(1$). Note this is the case of non-adjacent errors but of the type prohibited by the condition in (C1).

Example 27.

The word f=1011000 is tilde-non-isometric; indeed it has the 2-tilde-error overlap, of shift 4 and length 3, ($1)(101000)(1$), that satisfies (C1). Note that the pair (u,v)=(10110011000,10101001000) is a pair of tilde-witnesses for f.

The following example shows an infinite family of words that are tilde-isometric and Ham-non-isometric.

Example 28.

All the words f=1n0m (and their complement f=0n1m) for n,m>2 are Ham-non isometric, by Proposition 5, and tilde-isometric. In fact, for n,m>2, f=1n0m has only two 2-tilde-error overlaps and none of them fall into a case in the statement of Theorem 25. The first one is the tilde-error overlap with shift 2, ($1n211n2)(1100)(0n200n2$), the other one has shift n+m2, and it is ($0)(1100)(1$).

Instead, if n=m=2, f=1100 is both tilde-non-isometric and Ham-non-isometric. In fact, f has only one 2-tilde-error overlap ($1)(1100)(0$), corresponding to case (C5) of Theorem 25. Moreover, one can verify that (u,v)=(110100,101010) is a pair of tilde-witnesses for f.

The next corollary shows that there exist Ham-isometric words that are not tilde-isometric. This fact, together with Example 28, proves that the families of Ham-isometric and tilde-isometric words are incomparable.

Corollary 29.

The word f=1010 is Ham isometric and tilde non-isometric.

Proof.

The word 1010 has no 2-error overlap, therefore is isometric. Instead, it has a 2-tilde-error overlap (1$)(010101)($0) corresponding to case (C3) of Theorem 25.

6 Tilde-Fibonacci Cubes

Theorem 25 allows also to give light to isometric subgraphs of Q~n avoiding a word. In fact they are those avoiding a tilde-isometric word.

In analogy with the definition of Fibonacci cubes introduced by Hsu [21], we give the following:

Definition 30.

Let n1 and Q~n be the tilde-hypercube of dimension n. The n-tilde-Fibonacci cube, or tilde-Fibonacci cube of dimension n is F~n=Q~n(11).

Figure 3: Tilde-Fibonacci cubes with n=1,2,3,4 (the colored edges are those added to the traditional hypercube).

Figure 3 shows the tilde-Fibonacci cube of order 1,2,3,4.

From Theorem 25 and Proposition 19, the following corollary derives.

Corollary 31.

The tilde-Fibonacci cube F~n=Q~n(11) is an isometric subgraph of Q~n.

Further, about other hypercubes avoiding words of length 2, we have the following:

 Remark 32.

For each n1, Q~n(10) and Qn(10) coincide. In fact, V(Q~n(10))={0h1k|h,k0,h+k=n}=V(Qn(10)) and E(Q~n(10))={(0i1j,0i11j+1)|  1in, 0jn1,i+j=n}=E(Qn(10)).

By complement, also Q~n(01) and Qn(01) coincide (see Remark 22).

Note also that Q~n(01) is obtained from Q~n(10) by complementing all the bits in the vertices labels, i.e. they are isomorphic.

The tilde Fibonacci cube admits also a recursive construction that allows one to enumerate its edges.

By Proposition 2, |V(F~n)|=|V(Fn)|=fn+2. Among these vertices, fn+1 end with a 0 and fn end with a 1. Figure 3 shows the tilde-Fibonacci cube of order 4.

 Remark 33.

Let uV(Fn1), xΣ. If u ends with 1, then uxV(F~n) iff x=0. If u ends with 0 then uxV(F~n), for any x{0,1}.

Proposition 34.

The n-tilde-Fibonacci cubes F~n, with n1, can be defined recursively.

Proof.

If n=1, F~1 has two vertices 0 and 1 connected by an edge. If n=2, F~2 has three vertices 00, 01 and 10 and E(F~2)={(00,10),(00,01),(01,10)}.

Let n3 and suppose that F~i are defined for all i<n. Then, F~n can be constructed from a copy of F~n1 where each vertex u is replaced by u0, denoted by F~n10, and a copy of F~n2, where each vertex v is replaced by v01, and denoted by F~n201. If there is an edge (u,v) in F~n1, then there is en edge (u0,v0) F~n, i.e. F~n10 is a subgraph of F~n. For similar reasons F~n201 is a subgraph of F~n. Further, for any u0V(F~n1), then there is an edge in F~n connecting v00 in F~n10 and u01 in F~n201 and for any u1V(F~n1) there is an edge linking u10 in F~n10 and u01 in F~n201 (see the orange edges in Fig. 3). By Remark 33 and Lemma 15 no further edges exist in F~n.

Corollary 35.

Let F~n be the n-tilde-Fibonacci cube. Then, |E(F~1)|=1,|E(F~2)|=3 and, for any n2

|E(F~n)|=|E(F~n1)|+|E(F~n2)|+fn+1.

Hence, we can give the following exact formula:

|E(F~n)|=(n+1)fn+3+(n2)fn+15,

(Sequence A023610 in [31] for |E(F~n+1|). Since the number of vertices of F~n is fn+2, from the previous formula it follows that the tilde-Fibonacci cube has O(NlogN) edges, where N is the number of vertices, as for the tilde-hypercube (see Equation (1)). Let us conclude the section with some structural properties of tilde-Fibonacci cubes such as the diameter and the radius.
The eccentricity of a vertex v of a connected graph G is defined as

e(v)=maxuV(G)dG(u,v),

where dG(u,v) is the length of a shortest path from u to v in G. The diameter and the radius of G are respectively defined by

d(G)=maxvV(G)e(v)andr(G)=minvV(G)e(v).

In [21] it is proven that d(Fn)=n and that the maximal distance involves the words (10)n/2 and (01)n/2 for even n, and (01)n/20 and (10)n/21 for odd n. Note that, since the swap operation adds new edges, it shortens the distances between vertices. Moreover, the distances can even be halved because a swap replaces two replacement operations. More precisely, we have the following proposition.

Proposition 36.

Let F~n be the n-tilde-Fibonacci cube. Then, for any n1, d(Fn~)=r(Fn~)=n/2.

Proof.

First, we prove that for any n1 and u,vV(F~n), dist(u,v)n/2 by induction on n.
The case where n=1 is trivial. If n=2, then for any u,v11, we have dist(u,v)=1. Let now u,v be 11-free words of same length n, with n>2. Then, u=xyu and v=ztv, with x,y,z,t{0,1} and u,v{0,1}. Recall that dist(u,v) is the minimal number of swaps and replacements to transform u into v. Since xy can be transformed into zt with at most a single swap or replacement, a possible way to transform u in v is to transform xy into zt and then u into v. Therefore, dist(u,v)1+dist(u,v)1+(n2)/2n/2.

Moreover, for each 11-free word u of length n, with n1, there exists a 11-free word v of length n such that dist(u,v)=n/2. In fact, for each word u of length n, the word v is obtained by replacing in u, from left to right, the blocks 00 with 10, the blocks 10 with 00, the blocks 01 with 10 and finally, for odd n, the last bit with its complement. Trivially, if u is 11-free then v is 11-free, as well. We prove that dist(u,v)=n/2 by induction on n. The case n=1 is trivial. If n=2 then dist(01,10)=dist(10,00)=dist(00,10)=1. If n>2 then either u=00u (v=10v) or u=01u (v=10v). In the first case, by Remark 11, one has dist(u,v)=1+(n2)/2=n/2. In the second case, by Remark 12, one has dist(u,v)=1+(n2)/2=n/2. In any case, dist(u,v)=n/2. Finally, since F~n is an isometric subgraph of Q~n then dF~n(u,v)=dist(u,v) and r(Fn~)=d(Fn~)=n/2.

The previous proposition proves that Fn~ is a self-centered graph, that is, a graph in which radius and diameter coincide (cf. [14] for a survey).

7 Conclusions

The paper surveys some recent results on isometric words with respect to the edit distance that allows swap and replacement operations, here referred to as tilde-distance. Moreover, the tilde-hypercube and the tilde-Fibonacci cube are presented as a generalization of the corresponding classical notions, with the tilde-distance in place of the Hamming distance.

Compared with the setting of Hamming distance, all the problems appear to be more complicated since, when using swaps, the order of performing the operations does matter. Nevertheless, isometric words and generalized Fibonacci cubes based on this tilde-distance open up new scenarios and present interesting new situations that surely deserve further investigation as it can serve as base for string and graph algorithmic developments.

References

  • [1] Amihood Amir, Estrella Eisenberg, and Ely Porat. Swap and mismatch edit distance. Algorithmica, 45(1):109–120, 2006. doi:10.1007/978-3-540-30140-0_4.
  • [2] Marcella Anselmo, Giuseppa Castiglione, Manuela Flores, Dora Giammarresi, Maria Madonia, and Sabrina Mantaci. Hypercubes and isometric words based on swap and mismatch distance. In Descriptional Complexity of Formal Systems. DCFS23, volume 13918 of Lect. Notes Comput. Sci., pages 21–35. Springer Nature, 2023. doi:10.1007/978-3-031-34326-1_2.
  • [3] Marcella Anselmo, Giuseppa Castiglione, Manuela Flores, Dora Giammarresi, Maria Madonia, and Sabrina Mantaci. Isometric words based on swap and mismatch distance. In Developments in Language Theory. DLT23, volume 13911 of Lect. Notes Comput. Sci., pages 23–35. Springer Nature, 2023. doi:10.1007/978-3-031-33264-7_3.
  • [4] Marcella Anselmo, Giuseppa Castiglione, Manuela Flores, Dora Giammarresi, Maria Madonia, and Sabrina Mantaci. Isometric sets of words and generalizations of the Fibonacci cubes. In Computability in Europe. CIE24, volume LNCS 14773 of Lect. Notes Comput. Sci., pages 1–14. Springer, 2024. doi:10.1007/978-3-031-64309-5_35.
  • [5] Marcella Anselmo, Giuseppa Castiglione, Manuela Flores, Dora Giammarresi, Maria Madonia, and Sabrina Mantaci. Characterization of isometric words based on swap and mismatch distance. International Journal of Foundations of Computer Science, 36(03):221–245, 2025. doi:10.1142/S0129054125430051.
  • [6] Marcella Anselmo, Manuela Flores, and Maria Madonia. Fun slot machines and transformations of words avoiding factors. In Fun with Algorithms, volume 226 of LIPIcs, pages 4:1–4:15, 2022. doi:10.4230/LIPIcs.FUN.2022.4.
  • [7] Marcella Anselmo, Manuela Flores, and Maria Madonia. On k-ary n-cubes and isometric words. Theor. Comput. Sci., 938:50–64, 2022. doi:10.1016/j.tcs.2022.10.007.
  • [8] Marcella Anselmo, Manuela Flores, and Maria Madonia. Density of Ham- and Lee- non-isometric k-ary words. In ICTCS’23 Italian Conference on Theor. Comput. Sci., volume 3587 of CEUR Workshop Proceedings, pages 116–128, 2023. URL: https://ceur-ws.org/Vol-3587/3914.pdf.
  • [9] Marcella Anselmo, Manuela Flores, and Maria Madonia. Computing the index of non-isometric k-ary words with Hamming and Lee distance. Computability, IOS press, 13(3-4):199–222, 2024. doi:10.3233/COM-230441.
  • [10] Marcella Anselmo, Manuela Flores, and Maria Madonia. Density of k-ary words with 0, 1, 2 - error overlaps. Theor. Comput. Sci., 1025:114958, 2025. doi:10.1016/j.tcs.2024.114958.
  • [11] Jernej Azarija, Sandi Klavžar, Jaehun Lee, Jay Pantone, and Yoomi Rho. On isomorphism classes of generalized Fibonacci cubes. Eur. J. Comb., 51:372–379, 2016. doi:10.1016/j.ejc.2015.05.011.
  • [12] Marie-Pierre Béal and Maxime Crochemore. Checking whether a word is Hamming-isometric in linear time. Theor. Comput. Sci., 933:55–59, 2022. doi:10.1016/j.tcs.2022.08.032.
  • [13] Laxmi N. Bhuyan and Dharma P. Agrawal. Generalized hypercube and hyperbus structures for a computer network. IEEE Transactions on Computers, C-33(4):323–333, 1984. doi:10.1109/TC.1984.1676437.
  • [14] Fred Buckley. Self-centered graphs. Ann. New York Acad. Sci, 576:71–78, 1989. doi:10.1111/j.1749-6632.1989.tb16384.x.
  • [15] Sergio Cabello, David Eppstein, and Sandi Klavzar. The Fibonacci dimension of a graph. Electron. J. Comb., 18(1), 2011. doi:10.37236/542.
  • [16] Giuseppa Castiglione, Manuela Flores, and Dora Giammarresi. Isometric words and edit distance: Main notions and new variations. In Cellular Automata and Discrete Complex Systems. AUTOMATA 2023, volume 14152 of Lect. Notes Comput. Sci., pages 6–13. Springer, 2023. doi:10.1007/978-3-031-42250-8_1.
  • [17] W.J. Dally. Performance analysis of k-ary n-cube interconnection networks. IEEE Transactions on Computers, 39(6):775–785, 1990. doi:10.1109/12.53599.
  • [18] Ahmed El-Amawy and Shaharam Latifi. Properties and performance of folded hypercubes. IEEE Transactions on Parallel and Distributed Systems, 2(1):31–42, 1991. doi:10.1109/71.80187.
  • [19] Simone Faro and Arianna Pavone. An efficient skip-search approach to swap matching. Comput. J., 61(9):1351–1360, 2018. doi:10.1093/comjnl/bxx123.
  • [20] Frank Harary, John P. Hayes, and Horng J. Wu. A survey of the theory of hypercube graphs. Comput. Math. Appl., 15(4):277–289, 1988. doi:10.1016/0898-1221(88)90213-1.
  • [21] Wen-Jing Hsu. Fibonacci cubes-a new interconnection topology. IEEE Transactions on Parallel and Distributed Systems, 4(1):3–12, 1993. doi:10.1109/71.205649.
  • [22] Aleksandar Ilić, Sandi Klavžar, and Yoomi Rho. Generalized Fibonacci cubes. Discrete Math., 312(1):2–11, 2012. doi:10.1016/j.disc.2011.02.015.
  • [23] Aleksandar Ilić, Sandi Klavžar, and Yoomi Rho. The index of a binary word. Theor. Comput. Sci., 452:100–106, 2012. doi:10.1016/j.tcs.2012.05.025.
  • [24] Sandi Klavžar. Structure of Fibonacci cubes: A survey. J. Comb. Optim., 25(4):505–522, 2013. doi:10.1007/s10878-011-9433-z.
  • [25] Sandi Klavžar and Sergey V. Shpectorov. Asymptotic number of isometric generalized Fibonacci cubes. Eur. J. Comb., 33(2):220–226, 2012. doi:10.1016/j.ejc.2011.10.001.
  • [26] Sandi Klavžar. On median nature and enumerative properties of Fibonacci-like cubes. Discrete Mathematics, 299(1):145–153, 2005. doi:10.1016/j.disc.2004.02.023.
  • [27] Sandi Klavžar and Petra Žigert. Fibonacci cubes are the resonance graphs of fibonaccenes. The Fibonacci Quarterly, 43(3):269–276, 2005. doi:10.1080/00150517.2005.12428368.
  • [28] Emanuele Munarini and Norma Zagaglia Salvi. Structural and enumerative properties of the Fibonacci cubes. Discrete Math., 255(1-3):317–324, 2002. doi:10.1016/S0012-365X(01)00407-1.
  • [29] Gonzalo Navarro. A guided tour to approximate string matching. ACM Comput. Surv., 33(1):31–88, 2001. doi:10.1145/375360.375365.
  • [30] P. Tolstrup Nielsen. A note on bifix-free sequences (corresp.). IEEE Transactions on Information Theory, 19(5):704–706, 1973. doi:10.1109/TIT.1973.1055065.
  • [31] N.J.A. Sloane. On-line encyclopedia of integer sequences. URL: http://oeis.org/.
  • [32] Nian-Feng Tzeng and Sizheng Wei. Enhanced hypercubes. IEEE Trans. Computers, 40(3):284–294, 1991. doi:10.1109/12.76405.
  • [33] Robert A. Wagner. On the complexity of the extended string-to-string correction problem. In Symposium on Theory of Computing, pages 218–223, 1975. doi:10.1145/800116.803771.
  • [34] Robert A. Wagner and Michael J. Fischer. The string-to-string correction problem. Journal of the ACM, 21(1):168–173, 1974. doi:10.1145/321796.321811.
  • [35] Jianxin Wei. The structures of bad words. Eur. J. Comb., 59:204–214, 2017. doi:10.1016/j.ejc.2016.05.003.
  • [36] Jianxin Wei. Proof of a conjecture on 2-isometric words. Theor. Comput. Sci., 855:68–73, 2021. doi:10.1016/j.tcs.2020.11.026.
  • [37] Jianxin Wei, Yujun Yang, and Guangfu Wang. Circular embeddability of isometric words. Discret. Math., 343(10):112024, 2020. doi:10.1016/j.disc.2020.112024.
  • [38] Jianxin Wei, Yujun Yang, and Xuena Zhu. A characterization of non-isometric binary words. Eur. J. Comb., 78:121–133, 2019. doi:10.1016/j.ejc.2019.02.001.
  • [39] Jianxin Wei and Heping Zhang. Proofs of two conjectures on generalized Fibonacci cubes. Eur. J. Comb., 51:419–432, 2016. doi:10.1016/j.ejc.2015.07.018.