Small Space Encoding and Recognition of -Palindromic Prefixes
Abstract
Palindromes are non-empty strings that read the same forward and backward. We study the problem of recognizing so-called -palindromic strings, which can be represented as the concatenation of exactly palindromes. [Rubinchik and Shur, MFCS 2020] showed that the problem is solvable in linear space and time. We present a read-only algorithm that recognizes all -palindromic prefixes of a string of length in time and space. As a corollary, we also obtain a read-only algorithm for computing the palindromic length of , i.e., the smallest such that is -palindromic, in time and space.
Keywords and phrases:
palindromic length, read-only algorithms, palindromesCopyright and License:
2012 ACM Subject Classification:
Theory of computation Streaming, sublinear and near linear time algorithmsFunding:
Gabriel Bathie, Jonas Ellert, and Tatiana Starikovskaya: Partially funded by grant ANR-20-CE48-0001 from the French National Research Agency.Editors:
Ho-Lin Chen, Wing-Kai Hon, and Meng-Tsung TsaiSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
A palindrome is a non-empty string that equals its reversed copy, i.e., a string that reads the same both forward and backward. Throughout, we denote the language of palindromes by PAL. Natural variants of PAL include the language of even-length palindromes and the language of palindromes of length greater than one . Recognising (often referred to as “palstar”), , and is a classical problem of formal language theory, introduced by Knuth, Morris, and Pratt [27].111∗ is a Kleene star. In this problem, one is given an input string and must decide whether it is in the language.
Languages , , and are context-free, and Valiant’s parser from 1975 recognizes them in time, where is the length of the input and is the matrix multiplication exponent. Only in 2018, Abboud, Backurs, and Vassilevska Williams showed that Valiant’s parser is optimal if the current clique algorithms are optimal [1], meaning that for general context-free languages, there is little hope of achieving a faster recognition algorithm. The origins of the study of derivatives of the PAL languages are in fact in line with the result of [1]: At one time, it was popularly believed that cannot be recognised in linear time, and it was considered as a candidate for a “hard” context-free language (see [27, Section 6]). However, [27] refuted this hypothesis by showing an -time recognition algorithm for . Manacher [32] found another way to recognize in linear time, and Galil [16] derived a real-time recognition algorithm (see also Slisenko [35]). Later, Galil and Seiferas [17] showed a linear-time recognition algorithm for .
Recognition of appeared to be a much more intricate problem. Galil and Seiferas [17] succeeded to design linear-time recognition algorithms for the cases , but the general question remained open for almost 40 years. Only in 2015, Kosolobov, Rubinchik, and Shur [29] showed an -time recognition algorithm for for all , which was finally improved to optimal time by Rubinchik and Shur in 2020 [34]. A related question is that of computing the palindromic length of a string , which is defined to be the smallest integer such that . The first -time algorithms for computing the palindromic length were presented in [12, 25, 33]. Finally, Borozdin et al. [10] showed an optimal -time algorithm for this problem.
Our contributions.
In this work, we turn our attention to the space complexity of recognising and computing the palindromic length. We start by presenting a characterization of prefixes of a given string that belong to . For , we refer to these prefixes as prefix-palindromes, and otherwise as -palindromic prefixes. A crucial component of the linear time algorithm by Borozdin et al. [10] is the following property: the prefix-palindromes of a length- string can be expressed as sets of form , where is an integer. In order to encode -palindromic prefixes, we introduce a notion of affine prefix sets of order . Intuitively, such a set consists of prefixes of the form with , where are integers. That is, rather than a single repeating substring , we allow multiple different substrings of different lengths. An affine prefix set of order can then be encoded in space. By carefully analyzing the rich structure of periodic substrings induced by -palindromic prefixes, we show that they can be expressed by a small number of affine prefix sets.
Theorem 1.1.
Let be constant, a string, and . The set of prefixes of that belong to is the union of affine prefix sets of order .
The remainder of the paper focuses on the main ideas behind Theorem 1.1. In the full version [5], apart from explaining all details, we show a lower bound for the size of the representation (Theorem 1.2), and provide read-only algorithms for computing affine prefix sets and palindromic length (Theorems 1.3 and 1.4). The lower bound shows that our representation is within a factor of being asymptotically optimal for constant . It is obtained by constructing a large family of strings uniquely identifiable by their palindromic prefixes.
Theorem 1.2.
Let be a string and let . Encoding the lengths of the prefixes of that belong to , for each , requires bits of space.
The read-only algorithms directly implement the techniques used to show Theorem 1.1.
Theorem 1.3.
Let be constant. Given a string and , there is a read-only algorithm that returns a compressed representation of all prefixes of that belong to , for each , in time and space.
Theorem 1.4.
Given a string , there is a read-only algorithm that computes the palindromic length of in time and space.
In particular, for , the algorithm uses time and space, and for , it uses time and sublinear space. In the regime of small palindromic length, this is an improvement over all previously-known algorithms [10, 34], which require space. It remains an intriguing open problem to achieve both linear time and sublinear space. On the other hand, Theorem 1.2 does not imply a lower bound for the algorithms of Theorems 1.3 and 1.4 because they have access to the input. This said, proving an space lower bound for such algorithms appears out of reach with current techniques. The only lower bound method for read-only string processing the authors are aware of relies on deterministic branching programs [9], and shows that any sublinear-space algorithm for longest common substring requires slightly superlinear time [28].
Related work.
The problem of computing palindromes in small space has received significant attention: the longest palindromic substring [8, 24], and all approximate prefix-palindromes [3, 6] have been studied. More broadly, small-space recognition of formal languages has been explored for regular [19, 20, 21, 22, 11], Dyck [26, 30, 31], visibly pushdown [2, 14, 18, 7], context-free [23], and / languages [4].
2 Preliminaries
Series, strings, and substrings.
For , we write to denote . A series is denoted by . The empty series is denoted by . We use the dot-product to denote the concatenation of two series, e.g., for one can represent . We may omit the subscript and superscript for series of length one, e.g., .
A string of length is a sequence of symbols from a set , which we call the alphabet. The input string is also called the text. We denote the set of all length- strings by , and we set as well as . The empty string is denoted by . For , the -th symbol in is denoted by . The substring is the empty string if , and the string otherwise. We may call a substring a fragment of to emphasize that we mean the specific occurrence of that starts at a position . For example, in the string , the substrings and are identical, but and are distinct fragments. A string is a prefix of if there is such that , in which case we may simply write . Similarly, is a suffix of if there is such that , in which case we may simply write . We extend this notion to the empty suffix and the empty prefix . A substring (hence also a suffix or prefix) of is proper if it is shorter than . When introducing a string , we may simply say that is a string rather than saying that is a string of length . The concatenation of two strings and is the string , denoted by either or simply . For non-negative integer , we write to denote the length- string obtained by concatenating copies of . We extend this idea to non-negative rational exponents , for which we write to denote . We only use this notation if .
Palindromes and periodicities.
For a string , we write to denote its reverse, i.e., . We then say that is a palindrome if and only if is non-empty and . The set of all palindromes is denoted by PAL. For a positive integer , the set contains all the strings that can be written as the concatenation of exactly palindromes. We refer to such strings as -palindromic. If , and a string is a one-palindromic prefix of another string, we also refer to it as prefix-palindrome.
We define the forward cyclic rotation . More generally, a cyclic rotation with shift is obtained by iterating (if is positive) or the inverse operation (if is negative) exactly times. A non-empty string is primitive if it is distinct from its non-trivial rotations, i.e., if holds only when divides . If a string can be represented as for some primitive string and integer , then is called the primitive root of .
A string has period if , or equivalently if . The string is a border of . If has period , then we say that is -periodic. If has period , then it can be written as , where . We may alternatively use a rational exponent and write . Below, we provide some simple auxiliary lemmas regarding periodic strings and palindromes (with proofs in the full version).
Fact 2.1 (Periodicity Lemma [13]).
If and are distinct periods of a string of length at least , then is a period of the string.
Corollary 2.2 (Folklore).
For a primitive string , the minimal period of is .
Lemma 2.3.
Assume that a palindrome has a -periodic prefix of length . If , then is -periodic.
Model of computation.
We assume the word-RAM model of computation [15], using words of size when processing an input string of length . The presented algorithms are deterministic and read-only, i.e., they cannot write to the memory occupied by the input. Space complexities are stated in number of words, ignoring the space occupied by the input.
3 Combinatorial Properties of Affine Prefix Sets
In this section, we study the combinatorial structure of -palindromic prefixes of . We start with the definition of affine sets, which we will use as a scaffolding for our analysis.
Definition 3.1 (Affine sets).
A set of strings is affine if there exist , a string , primitive strings , and positive integers and such that
The tuple is a representation of , and is the order of the representation. The order of is the minimal order achieved by any of its representations. We call the components of a representation, and (resp., ) the exponent lower (resp., upper) bounds.
A representation generates (the strings of) the corresponding affine string set. If generates and generates , then their concatenation is defined as , where is a primitive string and is a positive integer such that (i.e., is the primitive root of ). The concatenation generates . (If , then the concatenation is .)
In what follows, we consider affine prefix sets, i.e., affine sets that contain only prefixes of the given input string . We will show that a small number of affine prefix sets suffices to represent the -palindromic prefixes of . The case where , i.e., the structure of prefix-palindromes, is well-understood: there are groups of such palindromes, where each group can be expressed as an arithmetic progression and a corresponding periodic prefix of (see [10, Lemma 5]). Below, we restate this result in the framework of affine prefix sets.
Lemma 3.2.
The prefix-palindromes of a string can be partitioned into affine sets of order at most . Each set of order has representation for some , and integers and .
3.1 Reducing affine prefix sets
A single affine set may have multiple equivalent representations. For example, the affine set is represented by and , . Arguably, the latter representation is preferable, as it has a lower order and can thus be encoded more efficiently. Hence we propose a way of potentially decreasing the order of a representation by reducing it.
Definition 3.3 (Irreducible representation).
A representation of an affine string set is irreducible if and only if and .
From now on, we say that with is fixed if , and flexible otherwise. If there is some such that , then we say that there is an inversion between and . Thus, a representation is irreducible if and only if all components are flexible and have unit lower bounds, and there are no inversions. As per this definition, is the only irreducible representation of from the previous example. (See also Figure 1(a).)
Properties of flexible components.
Now we show how to make an arbitrary representation irreducible, possibly decreasing (but never increasing) its order. The reduction exploits the structure of periodic substrings induced by flexible components.
Lemma 3.4.
Let be a representation of an affine prefix set, and consider any such that is flexible. Then is a period of every string that satisfies and .
Proof.
Let and . By the definition of an affine prefix set, is a prefix of the underlying string . Since is flexible, it holds , and thus is also a prefix of . If both and are prefixes of , then is a prefix of . Hence and have periods . Consequently, , for all , also has period .
If two adjacent components and are flexible, then the lemma allows us to obtain the following lower bound on the length of .
Lemma 3.5.
Let be a representation of an affine prefix set, and let . If both and are flexible, then either or
Proof.
For flexible and , let , , and . Let . By Lemma 3.4, both and are periods of , and is a period of . Since is a period of , it is also a period of . Hence if and only if . For the sake of contradiction, assume that the lemma does not hold, i.e., and . We make two observations.
First, is of length , and it has distinct periods and . The periodicity lemma (˜2.1) implies that is a period of , and thus also a period of its prefix . If , then , which contradicts the primitivity of . Second, is of length . Since is a period of , we know that is a prefix of . Hence is also a period of . If , then , which contradicts the primitivity of .
We have shown that . This is only possible if , which contradicts the assumption that . Therefore, the lemma holds.
Lemma 3.6.
Let be an irreducible representation of an affine prefix set of a string of length . Then it holds and .
Proof.
Let of cardinality be the set of exponent configurations admitted by the representation (where because the representation is irreducible). Then . In order to show , it suffices to show that no two elements in generate the same string.
For the sake of contradiction, assume that there are distinct sequences that generate the same string . Let be the minimal index such that , and assume w.l.o.g. that . Then has the prefix , and we can factorize the corresponding suffix in two different ways as . However, the two factorizations have different lengths , where the second inequality is due to Lemma 3.5. Because of this contradiction, there cannot be distinct sequences that define the same string.
Finally, it holds for any irreducible representation, and thus . Since trivially , it follows or equivalently .
Transforming representations.
Now we use the properties of flexible components to transform an arbitrary representation into an irreducible one. We use the following operations.
Lemma 3.7.
Let be a representation of an affine prefix set .
-
1.
If there is such that is flexible and is fixed, then let . has representation
-
2.
If there is such that both and are flexible and , then and has representation
-
3.
If there is such that , then has representation
-
4.
If is fixed, then has representation .
Proof.
Statements (3) and (4) are trivial. For (2), if and both and are flexible, then Lemma 3.5 implies . Hence the statement follows.
Finally, we show that statement (1) holds. Assume that is flexible and is fixed. Then Lemma 3.4 implies that is a period of , and thus with and . (Either or might be zero, but this is irrelevant for the proof.) Let and . Any rotation of a primitive string is primitive, and hence is primitive. For any exponent , it holds . Hence the stated transformation does not change the represented affine prefix set.
The leftmost (i.e., lowest index) fixed component of a representation can either be removed with truncate (if ), or it can be moved further to the left with (if ). By repeatedly applying truncate and switch, we obtain the following lemma.
Lemma 3.8.
Let be a representation of an affine prefix set, and let with be the indices of the flexible components. Then the affine prefix set has a representation such that is a rotation of for every . Both and all the are functions of , , and , i.e., they are independent of .
After applying Lemma 3.8, we repeatedly apply merge to remove all inversions. Then, we apply split until all flexible components have exponent lower bound 1. This may result in new fixed components, which we remove with Lemma 3.8, resulting in the following lemma.
Lemma 3.9.
An affine prefix set represented by has an irreducible representation of order , where .
3.2 Strongly affine representations
To analyze the intricate structure of repetitive fragments induced by affine prefix sets, it is helpful to assume that periodicity extends slightly beyond the region in question. To this end, we define a strongly affine representation of an affine prefix set of , in which the exponent upper bound of each (flexible) component can be increased by five and still yield an affine prefix set of . A supplementary drawing is provided in Figure 1(b).
Definition 3.10 (Strongly affine representations).
A representation of an affine prefix set of a string is strongly affine if and only if its periodic expansion is also the representation of an affine prefix set of , where , , is defined as follows: if and otherwise.
Definition 3.11 (Canonical representation).
A representation of an affine prefix set is canonical if and only if it is both strongly affine and irreducible.
It can be readily verified that, if is strongly affine, then also , , , and are strongly affine (for any , assuming that the respective operation is indeed applicable). We obtained Lemma 3.9 by applying a sequence of these operations, and hence we have the following immediate corollary.
Corollary 3.12.
An affine prefix set with strongly affine representation has a canonical representation of order , where .
Whether a representation of an affine prefix set of is strongly affine does not only depend on , it also depends on what looks like beyond the end of the longest prefix represented by . Therefore, one cannot hope to transform an arbitrary representation into an equivalent strongly affine representation. However, by “removing” the last five copies of each component and treating them separately, we show that we can cover an affine prefix set of order with at most canonical representations.
Lemma 3.13.
An affine prefix set of order can be partitioned into at most affine prefix sets, each of which has a canonical representation of order at most .
Proof.
Let be a representation of an affine prefix set. We produce a set of representations , where , we define . It is easy to see that the affine sets generated by representations in form a partition of the affine set generated by . By design, for any representation in , and for any component , we know that is either fixed, or it has exponent lower bound and exponent upper bound . Hence the instances in are strongly affine, and it follows from Corollary 3.12 that each of them has an equivalent canonical representation of order at most . Finally, it holds and thus .
By applying the technique from the proof above to the prefix-palindromes, i.e., to each of the representations of order produced by Lemma 3.2, we obtain the following result.
Corollary 3.14.
The set of prefix-palindromes of a string can be partitioned into affine sets of order at most . Each set of order has canonical representation for some , and integers and .
Corollary 3.15.
Let be a canonical representation of an affine prefix set. Then it holds .
Proof.
If is canonical, then clearly is irreducible. Thus, the statement follows from Lemma 3.5 applied to .
Lemma 3.16.
Let be a canonical representation of an affine prefix set, and let . Then is an irreducible representation of an affine prefix set of the string , and, if , also of the string .
Proof.
Consider any . Due to the strong affinity, represents an affine prefix set. Let be a sequence of exponents with . By Lemma 3.4, the string has period . Due to Lemma 3.5, it holds . Hence we have shown that is a prefix of , and is a representation of an affine prefix set of . Since is irreducible, it is easy to see that also is irreducible.
If , then Lemma 3.5 invoked with implies , and indeed only generates strings of length less than .
Corollary 3.17.
Let be a canonical representation of an affine prefix set. Then is a canonical representation of an affine prefix set of the string .
Proof.
By Lemma 3.16, is an irreducible representation of an affine prefix set of . Hence is a canonical representation for .
3.3 Reversing the structure of affine prefix sets
We first show that a periodic fragment of induced by an affine prefix set can be covered by a combination of a forward and a “backward” affine prefix set (see Figure 2):
Lemma 3.18.
Let be an irreducible representation of an affine prefix set, let , and for let be the length- suffix of . For any sequence with , it holds
Proof.
If , then . Inductively assume that the lemma holds for representations of order . Now we show that it holds for representations of order . If is an irreducible representation of an affine prefix set, then clearly is an irreducible representation of another affine prefix set. This representation is of order , and hence the inductive assumption implies
If , then there is nothing left to do. Hence assume . Since is an irreducible representation, Lemma 3.4 implies that and therefore also is a period of . Hence has a border of length , where , and it holds
Finally, as mentioned before, of length has period . Hence , which concludes the proof.
We now build on this characterization to convert irreducible representations of affine prefix sets of into irreducible representations of affine prefix sets of .
Corollary 3.19.
Let be a canonical representation of an affine prefix set, let , and for let be the length- suffix of . Then represents an affine prefix set of .
Proof.
Consider any sequence of exponents admitted by the representation, i.e., . By Lemma 3.16, is an irreducible representation of an affine prefix set of , which implies . For this representation, Lemma 3.18 implies that is a suffix of . Thus, its reversal is a prefix of , which is a prefix of .
4 Appending a Palindrome to an Affine Prefix Set
In this section, we show how to extend an affine prefix set with a palindrome. This amounts to computing the union of affine prefix sets where each new prefix is formed by concatenating a prefix from with a palindrome. We distinguish two cases, depending on whether the appended palindrome lies within a periodic fragment of . In either case, we may temporarily overextend , producing sets that are not affine prefix sets. We then restore validity by truncating the sets using the lemma below. For a set of strings , denote .
Lemma 4.1.
Let be a representation of an affine prefix set . For , we can express as a union of at most affine prefix sets , each with a representation of order at most .
Proof Sketch.
The proof is by induction. For the base case , it is enough to reduce the upper bound on the exponent of . For , the proof proceeds by finding a minimal exponent such that . If , then . Otherwise, we assume w.l.o.g. that is irreducible (see Lemma 3.9). It follows that we do not need to consider prefixes generated with an exponent larger than for . We partition the remaining prefixes into two sets, one of which can be further broken down using the inductive hypothesis, reducing the problem to problems of smaller sizes until the result follows.
4.1 Appending a long palindrome
Assume that the affine prefix set to be extended is given in a canonical representation . We first focus on appending long palindromes of length at least , and then we show that the shorter palindromes can be handled recursively. Note that, for a canonical representation, has a prefix . At the same time, the longest prefix in the affine set is of length less than by Corollary 3.15. This leads us to a case distinction based on the center of the palindrome to be appended. If the center is before position , then we can show that the entire palindrome is within the -periodic prefix of . Otherwise, the left half of the palindrome contains position , and we can use this position as an anchor point for the extension.
4.1.1 Appending a long palindrome within a run of
We now focus on the case where the long palindrome to be appended is entirely within the -periodic prefix of . We proceed in two steps. First (in Theorem 4.2), we show how to append a palindrome under the assumption that the entire string has the form for some integer . Secondly (Corollary 4.3), we truncate the result of the first step so that it corresponds to , where is the largest value such that is a prefix of .
Theorem 4.2.
Let be a canonical representation of an affine prefix set . Let , and for let be the length- suffix of . If for some , then
| (1) |
represents an affine prefix set of , for any positive integer . Furthermore:
-
1.
If , then there is a string and a palindrome such that .
-
2.
For and , if and is a prefix of , then .
Proof Sketch.
The keystone of the proof is Corollary 3.19 which implies that a string , where , is a prefix of
Using this fact, we first establish that a string generated by the canonical representation in Equation 1 is a prefix of . Next, we show that can be decomposed as , where and is a palindrome. It follows that is a substring of a power of , and the condition ensures is a palindrome. Finally, we conclude by considering any string and a sufficiently long palindrome such that is a prefix of . Due to , there is some sequence of exponents such that . From this and the fact that is a palindrome, we show that fits the structure required for membership in , completing the proof.
By combining Theorem 4.2 and Lemma 4.1, we obtain:
Corollary 4.3.
Let be a canonical representation of an affine prefix set . Let be the largest possibly fractional exponent such that is a prefix of , and define . There are affine prefix sets , , each of order , such that for we have and for every , there is a string and a palindrome such that .
4.1.2 Appending a long palindrome outside a run of
Theorem 4.4.
Let be a canonical representation of an affine prefix set and . For , let be the length- suffix of . For any string , represents an affine prefix set of the string , where .
Proof Sketch.
Let . We can split the output representation into a concatenation
| (2) |
We first apply Corollary 3.19 to deduce that Equation 2 represents an affine prefix set of the string . Secondly, we show that every element in contributes exactly one element to , and hence . It thus suffices to show that any string generated by Equation 2 is in . It then readily follows that Equation 2 generates exactly . To do so, we consider any string generated by Equation 2. Such a string must be of the form , where for some exponents . By our previous observations, is a prefix of , and thus there is a unique string such that and . It remains to be shown that , which then implies . For this purpose, we carefully analyze the length of and show that a prefix of length indeed belongs to , concluding the proof.
For a fragment of , denote its center by .
Corollary 4.5.
Let be a canonical representation of an affine prefix set , and consider the set of strings There are affine prefix sets , , each of order , such that both of the following properties hold for :
-
1.
.
-
2.
For every , there is a string and a palindrome such that .
Proof Sketch.
Consider any , where , , , . Due to , Corollary 3.15 implies . Let , where and . We claim that . Indeed, the starting position of is less than the starting position of , and the centers of and coincide with . We call the core palindrome of . Note that every core palindrome is a prefix of (which is independent of ). Therefore, by Corollary 3.14, the set of core palindromes can be represented as the union of affine prefix sets. Let be any of these sets. We now describe how to compute the part of that contains strings of the form , where , , and the core palindrome of is some . The procedure depends on the representation of , which, by Corollary 3.14, is covered by one of the following cases. Let .
- Case 1:
-
is given in strongly affine representation , where is primitive and . In this case, we consider one fixed core palindrome in and apply Theorem 4.4 and Lemma 4.1 to obtain an affine prefix set generated by it. We then show that the sets generated by other core palindromes have a similar representation, which allows to union them and to obtain the final affine prefix set.
- Case 2:
-
is given in representation of order , i.e., it contains a single core palindrome . We proceed exactly like in Case 1, but with a single palindrome.
- Case 3:
-
has strongly affine representation , where is primitive and . For , let . We show that, if contains some with , then the entire can be written as for some . Hence we can simply apply Corollary 4.3 and obtain that the affine prefix set generated by is the union of at most affine prefix sets of order at most .
- Case 4:
-
has strongly affine representation , where is primitive and . We show that this case is impossible due to primitivity of .
We call the created affine prefix sets . There are core palindrome sets, each handled by a single case. Each case creates representations of order . Hence, there are affine prefix sets in total, each of order Furthermore, the four cases cover all possibilities, and hence . The second property holds by construction of the sets .
4.1.3 Appending all long palindromes and recursion
Lemma 4.6.
Let be a canonical representation of an affine prefix set and . There are affine prefix sets , , each of order , such that and for each string , there is a string and such that .
Proof.
We consider the sets from Corollaries 4.3 and 4.5, defined by
where is the largest (possibly fractional) exponent such that is a prefix of . Due to Corollaries 4.3 and 4.5, we can express (a superset of) as the union of affine prefix sets, each of order , where every string in each of the prefix sets is the concatenation of a string from and a palindrome. It remains to be shown that . For the sake of contradiction, assume that there is some string , where , and . Due to , is not a prefix of and hence it must be longer than . Let . We show a lower bound on . Since the given representation is strongly affine, it holds . It is also irreducible, and hence Corollary 3.15 implies . Therefore, it holds . Note that does not have period , but its length- prefix, which is a suffix of , does. Hence, by Lemma 2.3, it follows that is of length over , and therefore
This implies , which contradicts the initial assumption.
We have shown that appending palindromes of length at least results in affine prefix sets of order . For appending shorter palindromes, we exploit properties of strongly affine prefix sets that allow us to apply the previously described approach recursively.
Lemma 4.7.
Let be a canonical representation of an affine prefix set and a set of strings. Then is a union of affine prefix sets, each of order .
Theorem 1.1. [Restated, see original statement.]
Let be constant, a string, and . The set of prefixes of that belong to is the union of affine prefix sets of order .
Proof.
We start with the empty affine prefix set representing . We proceed in levels . The union of the affine prefix sets of level is exactly the set of all -palindromic prefixes of . For each affine prefix set of the current level , we first apply Lemmas 3.9 and 3.13 to obtain at most canonical representations of order . Then, for each of the representations, we append a palindrome using Lemma 4.7, resulting in at most affine prefix sets of order at most , which we move to level . Here, is a positive constant that depends on the precise complexity analysis of Lemma 4.7. Hence, after processing level , the total number of affine prefix sets is bounded by . For all sufficiently large (depending only on and ), the bound simplifies to . (And for the remaining values of , we have .)
Remark 4.8.
Lemmas 3.9 and 3.13 only work with the lengths of the components in the representations of affine prefix sets and the exponents bounds. Since all affine prefix sets are of small order, it is not difficult to see that these lemmas can be implemented efficiently. To implement Lemma 4.7, we make use of two procedures: The first one computes the longest prefix of a string of form , where is a representation of one of the sets, and the second computes a representation of the set of prefix-palindromes of a string. In the read-only model, both procedures can be implemented in time and space. Bounding the number of calls by the number of the generated affine prefix sets, we finally obtain Theorem 1.3. The algorithm of Theorem 1.3 can then be used to test if the palindromic length of is at most by checking whether is a -palindromic prefix, however, to achieve even better complexity, we run two copies of the algorithm, one from the left and the other from the right, and then combine their results to obtain Theorem 1.4.
References
- [1] Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. If the current clique algorithms are optimal, so is Valiant’s parser. SIAM J. Comput., 47(6):2527–2555, 2018. doi:10.1137/16M1061771.
- [2] Rajeev Alur and P. Madhusudan. Visibly pushdown languages. In STOC, pages 202–211, 2004. doi:10.1145/1007352.1007390.
- [3] Amihood Amir and Benny Porat. Approximate on-line palindrome recognition, and applications. In CPM, pages 21–29, 2014. doi:10.1007/978-3-319-07566-2_3.
- [4] Ajesh Babu, Nutan Limaye, Jaikumar Radhakrishnan, and Girish Varma. Streaming algorithms for language recognition problems. Theor. Comput. Sci., 494:13–23, 2013. doi:10.1016/J.TCS.2012.12.028.
- [5] Gabriel Bathie, Jonas Ellert, and Tatiana Starikovskaya. Small space encoding and recognition of k-palindromic prefixes. CoRR, abs/2410.03309, 2024. doi:10.48550/arXiv.2410.03309.
- [6] Gabriel Bathie, Tomasz Kociumaka, and Tatiana Starikovskaya. Small-space algorithms for the online language distance problem for palindromes and squares. In ISAAC, pages 10:1–10:17, 2023. doi:10.4230/LIPICS.ISAAC.2023.10.
- [7] Gabriel Bathie and Tatiana Starikovskaya. Property testing of regular languages with applications to streaming property testing of visibly pushdown languages. In ICALP, pages 119:1–119:17, 2021. doi:10.4230/LIPICS.ICALP.2021.119.
- [8] Petra Berenbrink, Funda Ergün, Frederik Mallmann-Trenn, and Erfan Sadeqi Azer. Palindrome recognition in the streaming model. In STACS, pages 149–161, 2014. doi:10.4230/LIPICS.STACS.2014.149.
- [9] Allan Borodin and Stephen A. Cook. A time-space tradeoff for sorting on a general sequential model of computation. In STOC, pages 294–301, 1980. doi:10.1145/800141.804677.
- [10] Kirill Borozdin, Dmitry Kosolobov, Mikhail Rubinchik, and Arseny M. Shur. Palindromic length in linear time. In CPM, pages 23:1–23:12, 2017. doi:10.4230/LIPIcs.CPM.2017.23.
- [11] Bartlomiej Dudek, Pawel Gawrychowski, Garance Gourdel, and Tatiana Starikovskaya. Streaming regular expression membership and pattern matching. In SODA, pages 670–694, 2022. doi:10.1137/1.9781611977073.30.
- [12] Gabriele Fici, Travis Gagie, Juha Kärkkäinen, and Dominik Kempa. A subquadratic algorithm for minimum palindromic factorization. J. Discrete Algorithms, 28:41–48, 2014. doi:10.1016/J.JDA.2014.08.001.
- [13] Nathan Fine and Herbert Wilf. Uniqueness theorems for periodic functions. Proceedings of the American Mathematical Society, 16(1):109–114, 1965. doi:10.2307/2034009.
- [14] Nathanaël François, Frédéric Magniez, Michel de Rougemont, and Olivier Serre. Streaming property testing of visibly pushdown languages. In ESA, pages 43:1–43:17, 2016. doi:10.4230/LIPICS.ESA.2016.43.
- [15] Michael L. Fredman and Dan E. Willard. BLASTING through the information theoretic barrier with FUSION TREES. In STOC, pages 1–7, 1990. doi:10.1145/100216.100217.
- [16] Zvi Galil. On converting on-line algorithms into real-time and on real-time algorithms for string-matching and palindrome recognition. SIGACT News, 7(4):26–30, 1975. doi:10.1145/990502.990505.
- [17] Zvi Galil and Joel I. Seiferas. A linear-time on-line recognition algorithm for "palstar". J. ACM, 25(1):102–111, 1978. doi:10.1145/322047.322056.
- [18] Moses Ganardi. Visibly pushdown languages over sliding windows. In STACS, pages 29:1–29:17, 2019. doi:10.4230/LIPICS.STACS.2019.29.
- [19] Moses Ganardi, Danny Hucke, Daniel König, Markus Lohrey, and Konstantinos Mamouras. Automata theory on sliding windows. In STACS, pages 31:1–31:14, 2018. doi:10.4230/LIPICS.STACS.2018.31.
- [20] Moses Ganardi, Danny Hucke, and Markus Lohrey. Querying regular languages over sliding windows. In FSTTCS, pages 18:1–18:14, 2016. doi:10.4230/LIPICS.FSTTCS.2016.18.
- [21] Moses Ganardi, Danny Hucke, and Markus Lohrey. Randomized sliding window algorithms for regular languages. In ICALP, pages 127:1–127:13, 2018. doi:10.4230/LIPICS.ICALP.2018.127.
- [22] Moses Ganardi, Danny Hucke, Markus Lohrey, and Tatiana Starikovskaya. Sliding window property testing for regular languages. In ISAAC, pages 6:1–6:13, 2019. doi:10.4230/LIPICS.ISAAC.2019.6.
- [23] Moses Ganardi, Artur Jez, and Markus Lohrey. Sliding windows over context-free languages. In MFCS, pages 15:1–15:15, 2018. doi:10.4230/LIPICS.MFCS.2018.15.
- [24] Pawel Gawrychowski, Oleg Merkurev, Arseny M. Shur, and Przemyslaw Uznanski. Tight tradeoffs for real-time approximation of longest palindromes in streams. Algorithmica, 81(9):3630–3654, 2019. doi:10.1007/S00453-019-00591-8.
- [25] Tomohiro I, Shiho Sugimoto, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing palindromic factorizations and palindromic covers on-line. In CPM, pages 150–161, 2014. doi:10.1007/978-3-319-07566-2_16.
- [26] Rahul Jain and Ashwin Nayak. The space complexity of recognizing well-parenthesized expressions in the streaming model: The index function revisited. IEEE Trans. Inf. Theory, 60(10):6646–6668, 2014. doi:10.1109/TIT.2014.2339859.
- [27] Donald E. Knuth, James H. Morris Jr., and Vaughan R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6(2):323–350, 1977. doi:10.1137/0206024.
- [28] Tomasz Kociumaka, Tatiana Starikovskaya, and Hjalte Wedel Vildhøj. Sublinear space algorithms for the longest common substring problem. In ESA, pages 605–617, 2014. doi:10.1007/978-3-662-44777-2_50.
- [29] Dmitry Kosolobov, Mikhail Rubinchik, and Arseny M. Shur. Palk is linear recognizable online. In SOFSEM, pages 289–301, 2015. doi:10.1007/978-3-662-46078-8_24.
- [30] Andreas Krebs, Nutan Limaye, and Srikanth Srinivasan. Streaming algorithms for recognizing nearly well-parenthesized expressions. In MFCS, pages 412–423, 2011. doi:10.1007/978-3-642-22993-0_38.
- [31] Frédéric Magniez, Claire Mathieu, and Ashwin Nayak. Recognizing well-parenthesized expressions in the streaming model. SIAM J. Comput., 43(6):1880–1905, 2014. doi:10.1137/130926122.
- [32] Glenn K. Manacher. A new linear-time “on-line” algorithm for finding the smallest initial palindrome of a string. J. ACM, 22(3):346–351, 1975. doi:10.1145/321892.321896.
- [33] Mikhail Rubinchik and Arseny M. Shur. EERTREE: an efficient data structure for processing palindromes in strings. Eur. J. Comb., 68:249–265, 2018. doi:10.1016/J.EJC.2017.07.021.
- [34] Mikhail Rubinchik and Arseny M. Shur. Palindromic k-factorization in pure linear time. In MFCS, pages 81:1–81:14, 2020. doi:10.4230/LIPICS.MFCS.2020.81.
- [35] Anatol O. Slisenko. A simplified proof of the real-time recognizability of palindromes on Turing machines. J. Sov. Math., 15:68–77, 1981. doi:10.1007/BF01404109.
