Extending the Burrows–Wheeler Transform for Cartesian Tree Matching and Constructing It
Abstract
Cartesian tree matching is a form of generalized pattern matching where a substring of the text matches the pattern if they share the same Cartesian tree. This form of matching finds application in time series of stock prices and can be of interest for melody matching between musical scores. For the indexing problem, the state-of-the-art data structure is a Burrows–Wheeler transform based solution due to [Kim and Cho, CPM'21], which uses nearly succinct space and can count the number of substrings that Cartesian tree match with a pattern in time linear in the pattern length. The authors address the construction of their data structure with a straightforward solution that, however, requires pointer-based data structures, resulting in $\mathcal{O}(n \lg n)$ bits of space, where $n$ is the text length [Kim and Cho, CPM'21, Section A.4]. We address this bottleneck by a construction that requires $\mathcal{O}(n \lg \sigma)$ bits of space, where $\sigma$ is the alphabet size. Additionally, we can extend this index for indexing multiple circular texts in the spirit of the extended Burrows–Wheeler transform without sacrificing the time and space complexities. We present this index in a dynamic variant that pays a logarithmic slowdown and needs a number of bits linear in the length of the input texts for the extra functionality of incrementally adding texts. Our extended setting is of interest for finding repetitive motifs common in the aforementioned applications, independently of offsets and scaling.
Keywords and phrases: Cartesian tree matching, extended Burrows–Wheeler transform, construction algorithm, generalized pattern matching
Funding: Dominik Köppl: JSPS KAKENHI Grant Numbers JP23H04378 and JP25K21150.
2012 ACM Subject Classification: Theory of computation
Acknowledgements: We sincerely thank the anonymous reviewers for their constructive comments.
Editors: Paola Bonizzoni and Veli Mäkinen
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
String matching is ubiquitous in computer science, and its variations are custom-made to solve a wide variety of problems. We here focus on a special kind of variation called substring consistent equivalence relation (SCER) [21]. Two strings $X$ and $Y$ are said to SCER-match if they have the same length and all substrings of equal length starting at the same text position SCER-match, i.e., $X[i..j]$ SCER-matches $Y[i..j]$ for all $1 \le i \le j \le |X|$. A specific instance of SCER-matching is order-preserving matching [19, 16], which has been studied for the analysis of numerical time series. The aim of order-preserving matching is to match two strings if the relative order of the symbols in both strings is the same. Order-preserving matching can therefore find matches independently of value offsets and scaling.
Since order-preserving matching takes the global order of the symbols in a string into account, it may be too strict in applications that primarily consider the local ordering of ranges partitioned by the peaks. In fact, time series of stock prices are such a case: a common pattern called the head-and-shoulder [8] involves one absolute peak (head) and neighboring local peaks (shoulders), and each of these neighboring peaks can be treated individually to match similar head-and-shoulder patterns with slightly changed local peaks. For such reasons, Park et al. [26] proposed Cartesian tree matching. Cartesian tree matching relaxes the notion of order-preserving matching in the sense that an order-preserving match is always a Cartesian tree match, but not necessarily the other way around. For instance, two strings of digits can both fit the head-and-shoulder pattern and Cartesian tree match while not matching order-preservingly. To decide whether two strings match, Cartesian tree matching compares their Cartesian trees. The Cartesian tree [28] is a binary tree built upon an array of numbers. If all numbers are distinct, it is the min-heap whose in-order traversal retrieves the original array. Since its inception, Cartesian tree matching and variations thereof have attracted interest from researchers, whose studies include multiple pattern matching [26, 27], approximate matching [1, 18], substring matching [4], subsequence matching [24], indeterminate matching [10], cover matching [15], and palindromic matching [9].
In addition to stock prices, applications of generalized pattern matching can also be found in melody matching between musical scores. Pattern matching on music scores based on features such as differences, directions, or magnitudes has been studied [12]. Moreover, the detection of repetitions in a piece of music has also been considered important [7]. The question therefore is whether we can find repetitive substrings that Cartesian tree match with a repetition of a set of melodic motifs (i.e., input texts). For this to be efficient, we want to index these motifs.
An index for Cartesian tree matching of a text string $T$ is a data structure built upon $T$ that can report the number of substrings of $T$ that Cartesian tree match a given pattern. Given that $T$ has $n$ symbols drawn from an integer alphabet of size $\sigma$, Park et al. [26] and Nishimoto et al. [23] proposed indexes for Cartesian tree matching of $T$ that both occupy $\mathcal{O}(n \lg n)$ bits of space and require additional working space during construction. Park et al. [26, Section 5.1] and Nishimoto et al. [23, Section 5.1] state their respective time bounds for answering count queries in terms of the pattern length $m$ and the number of occurrences of the pattern. Adapting Ferragina and Manzini's FM-index for exact string matching [5], Kim and Cho [17] proposed an index that occupies $\mathcal{O}(n)$ bits of space and answers a count query in $\mathcal{O}(m)$ time for a pattern of length $m$. For construction, they proposed a straightforward solution that takes as input the Cartesian suffix tree of Park et al. [26], which, however, requires $\mathcal{O}(n \lg n)$ additional bits of space.
We address two goals in this paper. The first is a construction algorithm for Kim and Cho's index that takes $\mathcal{O}(n \lg \sigma)$ bits of working space, which is compact for constant alphabet sizes. While we consider an index supporting access to a character of the length-$n$ text as compact if its space is $\mathcal{O}(n \lg \sigma)$ bits, we here restrict queries to Cartesian-tree pattern matching, and a Cartesian tree storing $n$ nodes can be represented in $\mathcal{O}(n)$ bits, cf. [26, Section 3.5]. Therefore, a compact index for Cartesian-tree pattern matching takes $\mathcal{O}(n)$ bits, a bound we reach for constant alphabets. Second, while all aforementioned indexes can partially address the problem by indexing multiple texts, it is hard to detect whether the pattern is a repetition of one of the input texts, which is of interest in the case of indexing melodic motifs that can repeat. Here, our goal is an index that can find such matches, even if they start at different offsets of the same repetition. Concretely, our aim is to index multiple texts for Cartesian-tree pattern matching. The search space for a pattern is the set of texts, where each text is considered as an infinite concatenation with itself.
In this paper, we propose an extension of the index of Kim and Cho [17] for Cartesian tree matching with techniques of the extended Burrows–Wheeler transform [20], and call the resulting data structure the cBWT index. The cBWT index is an extension in the sense that it supports indexing multiple input texts for Cartesian tree matching circularly. We show that we can compute the cBWT index in $\mathcal{O}(n \lg \sigma)$ bits of space, where $n$ is the total length of all texts to index. Our construction allows us to build the cBWT index incrementally, i.e., we support the incremental addition of a new string to the set of texts we index. We can also compute the original index of Kim and Cho within the same complexities. Our ideas stem from construction algorithms of indexes for parameterized matching by Hashimoto et al. [11] and Iseri et al. [14], which we recently extended to multiple circular texts [25]. During construction, the cBWT index already supports the backward search and count queries for pattern strings, with query times depending on the pattern length $m$.
2 Preliminaries
Let $\lg$ denote the logarithm to base two. We assume a random access model with word size $\Omega(\lg n)$, where $n$ denotes the input size. An interval of integers $\{i, i+1, \ldots, j\}$ is denoted by $[i..j]$, which is empty if $i > j$.
Strings.
Let $\Sigma$ denote an alphabet. We call the elements of $\Sigma$ symbols, a sequence of symbols from $\Sigma$ a string over $\Sigma$, and denote the set of strings over $\Sigma$ by $\Sigma^*$. Let $u, v, w, x \in \Sigma^*$. The concatenation of $u$ and $v$ is denoted by $u \cdot v$ or $uv$. We write $w^k$ if we concatenate $k$ instances of $w$ for a non-negative integer $k$, and $w^\infty$ for the string obtained by infinitely concatenating $w$. We call $w$ primitive if $w = u^k$ implies $u = w$ and $k = 1$. It is known that for every $w$ there exist a unique primitive string $u$ and a unique integer $k$ such that $w = u^k$, denoted by $\mathsf{root}(w)$ and $\mathsf{exp}(w)$, respectively. If $w = uvx$, then $u$ is called a prefix, $x$ a suffix, and each of $u$, $v$, and $x$ a substring of $w$. The length $|w|$ of $w$ is the number of symbols in $w$, $\mathsf{lcp}(u, v)$ reports the length of the longest common prefix of $u$ and $v$, and $\varepsilon$ denotes the unique string of length zero. For $i, j \in [1..|w|]$, $w[i]$ denotes the $i$-th symbol in $w$, and $w[i..j] = w[i] w[i+1] \cdots w[j]$, which is $\varepsilon$ if $i > j$. We call an integer $p \in [1..|w|]$ satisfying $w[i] = w[i+p]$ for every $i \in [1..|w|-p]$ a period of $w$. Let $\mathsf{rot}^0(w) = w$ and $\mathsf{rot}^{i+1}(w) = \mathsf{rot}^i(w)[2..|w|] \cdot \mathsf{rot}^i(w)[1]$ for each non-negative integer $i$, i.e., $\mathsf{rot}^i(w)$ denotes the $i$-th left rotation of $w$. The left rotations $\mathsf{rot}^i(w)$ for $i \in [0..|w|-1]$ are the conjugates of $w$. We write $u \leq v$ if and only if (a) $u$ is a prefix of $v$, or (b) there is an index $i \leq \min(|u|, |v|)$ with $u[1..i-1] = v[1..i-1]$ and $u[i] < v[i]$.
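To make the notions of rotations, conjugates, periods, and the primitive root and exponent concrete, here is a small illustrative sketch for strings represented as Python lists; the function names are ours and not the paper's notation.

```python
# Rotations, conjugates, periods and the primitive root/exponent of a string,
# here represented as a Python list of symbols.

def rotations(w):
    """All left rotations of w; together they form the conjugates of w."""
    return [w[i:] + w[:i] for i in range(len(w))]

def is_period(w, p):
    """p is a period of w if w[i] == w[i + p] wherever both positions exist."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))

def root_and_exponent(w):
    """Return (root(w), exp(w)): the unique primitive u and k with w == u * k."""
    for p in range(1, len(w) + 1):
        if len(w) % p == 0 and is_period(w, p):   # smallest period dividing |w|
            return w[:p], len(w) // p

print(rotations([1, 2, 3]))             # [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
print(root_and_exponent([1, 2, 1, 2]))  # ([1, 2], 2)
```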
Let $Y$ be a (dynamic) string, $i \in [1..|Y|]$, $j$ a positive integer, and $c$ a symbol. Then $\mathsf{insert}(Y, i, c)$ inserts the symbol $c$ at position $i$ of $Y$, $\mathsf{delete}(Y, i)$ deletes the $i$-th entry of $Y$, $\mathsf{rank}_c(Y, i)$ returns the number of occurrences of $c$ in $Y[1..i]$, $\mathsf{access}(Y, i)$ returns $Y[i]$, $\mathsf{select}_c(Y, j)$ returns the index of the $j$-th occurrence of $c$ in $Y$ if it exists, $\mathsf{RNV}$ returns the smallest value larger than a given value within a queried range of $Y$ if it exists, and $\mathsf{MI}$ returns a maximal interval all of whose entries satisfy a given condition. See Figure 1 for examples. We represent strings by the following dynamic data structure, which supports the aforementioned operations and queries.
Lemma 2.1 ([14, Lemma 4]).
A dynamic string of length over with can be stored in a data structure occupying bits of space, supporting insertion, deletion and the queries access, rank, rnkcnt, select, RNV and MI in time.
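To pin down the semantics of the basic operations, the following naive reference implementation stores the string as a plain Python list with linear-time queries; it makes no attempt at the space or query-time bounds of Lemma 2.1, and the specialised queries rnkcnt, RNV, and MI are omitted since their exact parameters are given in the cited work.

```python
# A naive stand-in for a dynamic string: 1-based positions as in the paper,
# linear-time operations, no compression.

class NaiveDynamicString:
    def __init__(self, symbols=()):
        self.s = list(symbols)

    def access(self, i):
        """Return the i-th symbol."""
        return self.s[i - 1]

    def insert(self, i, c):
        """Insert c so that it becomes the i-th symbol."""
        self.s.insert(i - 1, c)

    def delete(self, i):
        """Remove the i-th symbol."""
        del self.s[i - 1]

    def rank(self, c, i):
        """Number of occurrences of c among the first i symbols."""
        return self.s[:i].count(c)

    def select(self, c, j):
        """Position of the j-th occurrence of c (assumed to exist)."""
        return [k + 1 for k, x in enumerate(self.s) if x == c][j - 1]

# rank and select are inverse to each other on the occurrences of a symbol:
d = NaiveDynamicString([2, 0, 1, 0, 2])
d.insert(3, 0)                      # string becomes 2 0 0 1 0 2
assert d.rank(0, d.select(0, 3)) == 3 and d.access(3) == 0
```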
Alphabet.
Throughout, we will work with the integer alphabet $\Sigma = [1..\sigma]$, where $\sigma \leq n$, and a special symbol $\$$ stipulated to be smaller than any symbol from $\Sigma$. The special symbol is motivated by the construction algorithm and serves as a delimiter when a string should not be considered circular, as in the index of Kim and Cho [17]. Let $\Sigma_\$ = \Sigma \cup \{\$\}$.
Cartesian Tree Matching.
The Cartesian tree $\mathsf{CT}(w)$ of a string $w$ is a binary tree defined as follows. If $w = \varepsilon$, then $\mathsf{CT}(w)$ is the empty tree. If $w \neq \varepsilon$, let $q$ denote the position of the smallest symbol in $w$, where ties are broken with respect to text position. Then $\mathsf{CT}(w)$ has $w[q]$ as its root, $\mathsf{CT}(w[1..q-1])$ as its left subtree and $\mathsf{CT}(w[q+1..|w|])$ as its right subtree. We say that two strings $u$ and $v$ Cartesian tree match (ct-match) if and only if $\mathsf{CT}(u) = \mathsf{CT}(v)$, and write $u \approx v$. For instance, Figure 2 shows a substring of the text that ct-matches the pattern and one that does not.
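The following is a minimal recursive sketch of this definition for strings given as Python lists of integers (quadratic in the worst case and purely illustrative; the function names are ours): it builds the tree shape and decides a ct-match by comparing shapes.

```python
# Build the shape of the Cartesian tree and compare shapes to decide a ct-match.

def cartesian_tree(w):
    """Tree shape as nested (left, right) tuples; None is the empty tree."""
    if not w:
        return None
    q = min(range(len(w)), key=lambda i: w[i])        # leftmost minimum becomes the root
    return (cartesian_tree(w[:q]), cartesian_tree(w[q + 1:]))

def ct_match(u, v):
    """u and v ct-match iff their Cartesian trees coincide."""
    return len(u) == len(v) and cartesian_tree(u) == cartesian_tree(v)

assert ct_match([1, 3, 2], [2, 5, 4])      # same tree shape, so they ct-match
assert not ct_match([1, 3, 2], [1, 2, 3])  # different tree shapes
```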
Parent Distance Encoding.
Park et al. [26] use an encoding scheme for representing Cartesian trees that reduces the computation of a ct-match to checking whether the encoded strings exactly match. We here give a variant thereof which takes the special symbol into account. Let $\infty$ denote a symbol larger than any integer. The parent distance encoding $\mathsf{PD}_w$ of $w$ is a string of length $|w|$ over $[1..|w|-1] \cup \{\infty\}$ such that
$$\mathsf{PD}_w[i] = \begin{cases} i - \max\{ j \in [1..i-1] : w[j] \leq w[i] \} & \text{if such a position } j \text{ exists}, \\ \infty & \text{otherwise}, \end{cases}$$
for each $i \in [1..|w|]$. We have $\mathsf{PD}_w[i] \in [1..i-1] \cup \{\infty\}$ for each $i$ and $\mathsf{PD}_w[1] = \infty$. As a side note, the parent distance encoding has been leveraged for devising compact $\mathcal{O}(n)$-bit data structures answering range minimum queries, such as the 2d-Min-Heap by Fischer [6] and the LRM-trees by Sadakane and Navarro [22], which semantically coincide.
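The parent distance encoding can be computed online in a single left-to-right scan with a stack. The sketch below does so for plain integer sequences (0-indexed, so only the distances are reported); it maps positions without a smaller-or-equal predecessor to math.inf, mirroring the role of the symbol ∞ above, whereas Park et al. [26] originally assign 0 to such positions.

```python
import math

def parent_distance(w):
    """Parent-distance values of w: the distance to the nearest previous position
    holding a symbol <= the current one, and math.inf if no such position exists."""
    pd, stack = [], []                     # stack: positions with non-decreasing symbols
    for i, c in enumerate(w):
        while stack and w[stack[-1]] > c:
            stack.pop()                    # a strictly larger symbol can never be the answer again
        pd.append(i - stack[-1] if stack else math.inf)
        stack.append(i)
    return pd

print(parent_distance([3, 1, 4, 1, 5, 9, 2, 6]))  # [inf, inf, 1, 2, 1, 1, 3, 1]
```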
Lemma 2.2 ([26, Theorem 1]).
Let $u, v \in \Sigma_\$^*$. Then $u \approx v$ if and only if $\mathsf{PD}_u = \mathsf{PD}_v$.
Problem Statement.
We are interested in a solution to the following problem.
Problem (Count).
Given a set of strings $\mathcal{W}$ and a pattern $P$, count the conjugates of the strings in $\mathcal{W}$ whose infinite iteration has a prefix ct-matching $P$.
Throughout, let $n$ denote the total length of the strings in $\mathcal{W}$. Our running example consists of three strings over a small integer alphabet and is used in the examples and figures of the following sections. In particular, a pattern can ct-match a conjugate of a text shorter than the pattern itself, since conjugates are iterated infinitely.
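Before turning to the index, the following brute-force sketch pins down Problem Count for texts given as Python lists of integers: every conjugate of every text is tested directly by comparing Cartesian trees. The helper names and example inputs are ours, and the texts here contain no special symbol.

```python
def cartesian_tree(w):
    """Tree shape of the Cartesian tree of w (leftmost minimum becomes the root)."""
    if not w:
        return None
    q = min(range(len(w)), key=lambda i: w[i])
    return (cartesian_tree(w[:q]), cartesian_tree(w[q + 1:]))

def count_ct_conjugates(texts, pattern):
    """Number of conjugates whose infinite iteration has a prefix ct-matching the pattern."""
    target, hits = cartesian_tree(pattern), 0
    for w in texts:
        for r in range(len(w)):                       # r-th left rotation of w
            conj = w[r:] + w[:r]
            reps = -(-len(pattern) // len(conj))      # enough copies to cover |pattern| symbols
            prefix = (conj * reps)[:len(pattern)]
            hits += cartesian_tree(prefix) == target
    return hits

print(count_ct_conjugates([[2, 1, 3], [1, 2]], [3, 1, 4, 2]))
```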
3 -bit Index
Let , for each , and for each , where , i.e., we identify each conjugate of each text with its start position inside the concatenation , such that we give them ranks from to . See Figure 3 for an example. In what follows, we put these ranks in a specific order by a permutation of such that the permuted ranks of all conjugates with prefixes of their infinite concatenation ct-matching a pattern form an interval .
3.1 Conjugate Array
We express the permutation of the ranks of all conjugates by the so-called conjugate array, which we will subsequently define. To achieve this, we extend the ideas of Mantaci et al. [20] to Cartesian tree matching, and introduce a preorder on . For notational convenience, we define the rotational parent distance encoding of by . For any , let if and only if there exists some natural number such that or holds.
Lemma 3.1.
The relation defines a total preorder on , i.e., the relation is binary, reflexive, transitive and connected.
We call this preorder the -preorder. We write if and only if , and if and only if . For instance, and . Note that the latter example violates antisymmetry, i.e., the relation is not a total order. In the following, we present a convenient but not necessarily optimal way to compute the -preorder of two given strings.
Lemma 3.2 (Weak Periodicity Lemma).
Let and be two periods of a string . If , then is also a period of .
Lemma 3.3 ([13, Lemma 5]).
Let and . Then if and only if .
Proof.
Without loss of generality, . Let , and .
-
Assume . Then, on the one hand,
Since the rotational parent distance encoding is commutative with left rotations, is a period of . Consequently, both and are periods of . Since , we can apply Lemma 3.2 and find that is a period of . As divides , can be formed by repeating an integral number of times, which implies , i.e., . On the other hand,
Since the rotational parent distance encoding is commutative with left rotations, is a period of . As , is a period of in addition to . Then and Lemma 3.2 imply that is a period of . As divides , can be repeated an integral number of times to form , which implies . Consequently, , and therefore . In particular, , i.e., .
-
Let . Assume with minimal such that . If , then , which contradicts . Hence, and without loss of generality, assume . Then , , and consequently , a contradiction to . Thus, .
Corollary 3.4.
Let and . Then if and only if .
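The following brute-force sketch decides the preorder between two strings under the assumption, suggested by Lemma 3.3 and Corollary 3.4, that it amounts to a lexicographic comparison of the parent-distance encodings of the two infinite iterations; positions without a smaller-or-equal predecessor are mapped to math.inf as before. The prefix length 2(|u|+|v|) is a generous bound under which the comparison provably stabilises, not necessarily the bound stated in the lemma.

```python
import math

def pd_prefix_of_power(w, length):
    """First `length` parent-distance values of the infinite iteration of w."""
    s = w * (-(-length // len(w)))            # enough copies of w
    out, stack = [], []
    for i in range(length):
        while stack and s[stack[-1]] > s[i]:
            stack.pop()
        out.append(i - stack[-1] if stack else math.inf)
        stack.append(i)
    return out

def preorder_leq(u, v):
    """u precedes-or-equals v in the preorder (lexicographic on the encodings)."""
    L = 2 * (len(u) + len(v))                 # long enough for the comparison to stabilise
    return pd_prefix_of_power(u, L) <= pd_prefix_of_power(v, L)

# Two different strings can be equivalent, so antisymmetry fails (cf. the remark above):
assert preorder_leq([1, 2], [3, 4]) and preorder_leq([3, 4], [1, 2])
```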
Similarly to Boucher et al. [2], we define the conjugate array of as the string of length over such that if and only if
i.e., is the number of all conjugates smaller than according to -preorder, where we break ties first with respect to text index, and then with respect to text position. By resolving all ties this way, we ensure that is well-defined. Since is a permutation, its inverse, which we denote by , is also well-defined. See Figure 3 for our running example’s conjugate array.
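A naive sketch of the sorted order underlying the conjugate array: enumerate all conjugates of all texts, sort them by the preorder (approximated as in the previous sketch) with ties broken first by text index and then by text position as stipulated above, and read off the resulting permutation; the conjugate array and its inverse are its two directions. All names are ours, not the paper's notation.

```python
import math

def pd_prefix_of_power(w, length):
    """First `length` parent-distance values of the infinite iteration of w."""
    s = w * (-(-length // len(w)))
    out, stack = [], []
    for i in range(length):
        while stack and s[stack[-1]] > s[i]:
            stack.pop()
        out.append(i - stack[-1] if stack else math.inf)
        stack.append(i)
    return out

def sorted_conjugates(texts):
    """Conjugates, identified by (text index, start position), in sorted order."""
    n = sum(len(w) for w in texts)
    entries = []
    for d, w in enumerate(texts):
        for p in range(len(w)):
            conj = w[p:] + w[:p]
            key = pd_prefix_of_power(conj, 4 * n)     # generous comparison length
            entries.append((key, d, p))
    entries.sort(key=lambda e: (e[0], e[1], e[2]))    # preorder, then text index, then position
    return [(d, p) for _, d, p in entries]

print(sorted_conjugates([[2, 1, 3], [1, 2]]))
```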
We define the conjugate range of a pattern of length in as a maximal interval such that for every . Leveraging Lemma 2.2, we find that the conjugate range is well-defined.
Corollary 3.5.
Let , , , and . Then if and only if .
We have that the conjugate range of the empty string covers all conjugates, because the empty string is a prefix of every conjugate. Indexes related to the FM-index usually compute the conjugate range of a pattern by an algorithmic technique called the backward search. The backward search for a pattern $P$ of length $m$ processes $P$ backwards in an online manner, refining the conjugate range of $P[i+1..m]$ to the conjugate range of $P[i..m]$ at the $i$-th step. Finally, the length of the conjugate range of $P$ is the solution to Count by Corollary 3.5. Below we define the necessary tools to allow for an efficient backward search.
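Schematically, the backward search is just a driver loop around a per-step range refinement; the refinement rule for the cBWT index is developed in the rest of this section and made precise in Lemma 3.13. In the sketch below, refine_step is a hypothetical callback standing in for that rule and is not part of the index.

```python
# Generic backward-search driver: process the pattern (or its encoding) from right
# to left and shrink a candidate interval at each step.

def backward_search(length, initial_range, refine_step):
    lo, hi = initial_range                    # range of the empty pattern: all conjugates
    for i in range(length, 0, -1):            # positions m, m-1, ..., 1
        lo, hi = refine_step(lo, hi, i)       # placeholder for the rule of Lemma 3.13
        if lo > hi:                           # empty interval: nothing matches
            break
    return lo, hi                             # interval length solves Count (Corollary 3.5)

# Toy usage with a refinement that never shrinks the interval:
print(backward_search(3, (1, 10), lambda lo, hi, i: (lo, hi)))   # (1, 10)
```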
3.2 LF-mapping
We want to define a map that allows us to cycle backwards through each of the texts to be indexed by mapping to the position of a conjugate in the conjugate array from the position of its left rotation. However, if we want to represent this mapping space-efficiently, we have to be careful since we used tie-breaks within texts when we sorted the conjugates by a preorder. Our idea is to relax the requirements and define a map that allows us to cycle backwards through a text that is -equal to the original to dodge any issues arising from our tie-breaks. For that, we want to cycle backwards in text order inside the roots of each -encoded text. Whenever we want to move backwards at the starting position of a root, we jump to its end position. We express this backwards movement with the following permutation. For every with , let
For our running example, . Here, we observe that we have two cycles and corresponding to and , respectively, which are -equal to the original text . Now we are ready to express the LF-mapping in terms of the function . The LF-mapping of is a string of length over such that for each . Since the LF-mapping is a permutation, it has an inverse , which we call the FL-mapping of and denote by . See Figure 5 for an example. The LF-mapping is at the core of the backward search. However, storing the LF-mapping and the FL-mapping in their plain form requires two integer arrays of length $n$ with entries of $\lceil\lg n\rceil$ bits each, which motivates the following encoding scheme. Let the rotational Cartesian tree signature encoding of denote a string of length over such that
for each , i.e., if , then reports the number of positions such that and . See Figure 4 for an example. Note that for each satisfying and .
Lemma 3.6.
Let . Then , where if , and otherwise.
Taking advantage of both encodings, we investigate how the -preorder of two strings changes if we rotate them. For convenience, we borrow the following notation from the literature [14, 11]. Let for every , and for each . For example, and .
Lemma 3.7 ([17, Lemma 3]).
Let such that . Then if and only if (a) or (b) and , where .
Proof.
We show the statement for since it is non-trivial. Let and . Then by Corollary 3.4, , and .
- ()
-
Assume . Let . Then and , i.e. . Consequently, , which implies by Corollary 3.4. The statement follows by contraposition.
- ()
-
We exhaust all possible cases for .
- Case 1.
-
Assume . Let . Then and , i.e. . Consequently, , which implies by Corollary 3.4.
- Case 2.
- Case 3.
-
Assume . Then and . Moreover, . Hence, by Corollary 3.4.
Our representation of the LF- and FL-mapping consists of two strings and , which are defined as follows. First, is the string of length over such that for each .
Corollary 3.8.
Let , the accumulated length of all texts in , , and . If , then .
Second, is the string of length over such that for each . By what follows, and suffice to compute both LF- and FL-mapping of .
Corollary 3.9.
Let , the accumulated length of all texts in , and . Then and .
In Figure 5 we present and of our running example.
Remark 3.10.
Let , and . If we substitute the occurrence of in and for , then the modified strings constitute the integer-based representation of Kim and Cho’s index [17, Section 4].
3.3 Backward Search
The LF-mapping can be leveraged for the backward search by the following result, which is due to Lemma 2.2 and Corollary 3.5.
Lemma 3.11 ([17, Lemma 6]).
Let , , , , and . For , if and only if (a) and or (b) and .
At this point it is straightforward to apply the techniques developed by Kim and Cho [17, Sections 5 and 6] to define a static index of that occupies bits of space and that solves Count in time, where is the pattern length. However, for brevity and in view of a space-efficient construction of our proposed index and its extension, we will represent and by the dynamic data structure of Lemma 2.1, and introduce an auxiliary string for the backward search. Let denote a string of length over such that and for each .
Lemma 3.12.
Let , , , and . Then .
Lemma 3.13.
Let , , , , , , and . Then Algorithm 1 correctly computes .
Proof.
Let . We show how Algorithm 1 computes and to obtain .
- Case 1.
- Case 2.
-
Assume . This case is handled from Line 6 through Line 12. By Lemma 3.11, if and only if for each . Thus, we correctly compute in Line 7, and return an empty interval in Line 13 due to Line 2 if . Assume . We compute the lowest value in greater than in Line 8 and determine the largest such that in Line 9. Let denote the interval computed in Line 10. Then for each by Lemma 3.12. We compute . The following statements hold due to Lemma 3.7.
-
If satisfies , then since .
-
If satisfies , then since and .
-
If satisfies , then since we have .
We apply these results to compute in Line 11, and then infer in Line 12.
-
The next result is obtained from the computation of Demaine and colleagues’ [3] Cartesian tree signature encoding of the given string.
Lemma 3.14.
Given of length , we can process in time such that we can subsequently compute and in time, for every .
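For intuition, the sketch below computes the Cartesian tree signature of Demaine et al. [3] for a plain integer sequence: scanning left to right while keeping the rightmost path of the Cartesian tree on a stack, the i-th value is the number of elements popped when the i-th symbol arrives, so the encoding is produced online in linear total time. The sketch only illustrates this underlying encoding, not the exact preprocessing of the lemma.

```python
def ct_signature(w):
    """Pop counts of the incremental Cartesian tree construction of w."""
    sig, stack = [], []                  # stack: symbols on the rightmost path
    for c in w:
        pops = 0
        while stack and stack[-1] > c:   # strictly larger symbols leave the rightmost path
            stack.pop()
            pops += 1
        stack.append(c)
        sig.append(pops)
    return sig

# Strings that ct-match have the same signature:
assert ct_signature([1, 3, 2]) == ct_signature([2, 5, 4]) == [0, 0, 1]
```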
Theorem 3.15.
Let and . There exists a data structure occupying bits of space that solves Count in time, where is the pattern length.
Proof.
We represent , and by the data structure of Lemma 2.1, which leads to the claimed space complexity. We preprocess a pattern of length with Lemma 3.14, and compute from for each in descending order leveraging Lemma 3.13. Since each conjugate range update takes queries, the claimed complexity for solving Count follows from Lemma 2.1. We call the data structure of Theorem 3.15 the cBWT index of . The cBWT index of the running example is presented in Figure 5.
4 Construction in $\mathcal{O}(n \lg \sigma)$ Bits of Space
We will show how to construct the cBWT index of a single text, and how an existing cBWT index can be extended to also index an additional text. We then leverage these two results to iteratively construct the cBWT index of any set of texts, adding one new text per iteration step.
4.1 Single Text cBWT Index
Before we can tackle the construction of the cBWT index of a single text, we need another technical result, which follows from an examination of the definitions and Lemma 3.7.
Lemma 4.1.
Let , , and . Then
We will first show how to construct the cBWT index of a text from in which occurs exactly once, and then show how to construct the cBWT index of an arbitrary text from (without the requirement on ).
Lemma 4.2.
Let , , , and . Algorithm 2 correctly computes and the cBWT index of for each .
Proof.
For each , let maximal such that and for each , and maximal such that for each , i.e., the left and right boundary of the former are computed in Line 6 and Line 5, respectively, and the latter in Line 2. The following statements are due to the location of the single in each conjugate of and Lemma 3.7.
-
If , then .
-
If , then if and only if .
-
If , then .
-
If , then if and only if .
-
If and , then if and only if .
We apply these results from Line 2 through Line 6 to compute the number of conjugates of smaller than according to -preorder. Since for each by assumption on and ,
for each with . Thus, is also the number of conjugates of smaller than according to -preorder. Subsequently, Algorithm 2 updates the cBWT index of from Line 8 through Line 17. By assumption on and , and . From Line 8 through Line 14 we leverage Lemma 3.12 and Lemma 4.1 to compute and, if , , and update . By assumption on , for each . Consequently, and . Moreover, if we set , then and . It remains to compute and , which are and , respectively. The update of both and is done from Line 15 through Line 17.
Corollary 4.3.
Let , , , and . The cBWT index of can be constructed in bits of space and time.
Proof.
We show the case where . We represent , and by the data structure of Lemma 2.1. Initially, . We preprocess with Lemma 3.14 in time to have access to for each in time. For each in descending order, we apply Algorithm 2 to compute and extend the cBWT index of to that of . Since the extension of the cBWT index of to that of takes queries for each , the construction of the cBWT index of takes a total of queries by Lemma 3.6. The claimed complexities then follow by Lemma 2.1.
Remark 4.4.
Due to Remark 3.10, the previous statement yields the integer-based representation of Kim and Cho’s index [17, Section 4] by substituting with in both and . Thus, we can construct their compact index [17, Section 5] directly [17, Section A.4] while retaining the complexities as stated in Corollary 4.3.
Corollary 4.5.
Let and . Then the cBWT index of can be constructed in bits of space and time.
Proof.
Let . We construct the cBWT index of within the claimed complexities by Corollary 4.3. During construction, we build a bit string of length such that if and only if . By choice of , for each , and . The following statements hold for each by choice of and Lemma 3.3.
-
.
-
.
-
.
4.2 Extending an Existing cBWT Index
Let be a non-empty set of strings and denote the accumulated length of all strings in . In this subsection, we assume we have the cBWT index of at hand, and want to extend it by another text of length to index . To achieve this, we compute the integers , and for each , whose definitions follow. Let ,
for each . We can apply the techniques exhibited in Lemma 4.2 to compute the helper values.
Lemma 4.6.
Let , , and with . Then Algorithm 3 correctly computes , and .
Proof.
Algorithm 3 takes , , , , and the cBWT index of as input. If , then we return , and in Line 2 since . Thus, assume . For each , let maximal such that and for each . The following statements are due to Lemma 3.7 and .
-
If , then if and only if and .
-
If , then if and only if .
-
If and , then if and only if .
We apply these results from Line 3 through Line 10 to compute . Leveraging Lemma 4.1 and (the proof of) Lemma 3.12, we then compute and from Line 11 through Line 16 and from Line 17 through Line 23, respectively.
Lemma 4.7.
Let , , , and . If , then , and can be computed from , , and in time with the cBWT index of .
Proof.
We apply Algorithm 3. Correctness follows from Lemma 4.6, and the time complexity is due to the loop of Algorithm 3 in Line 7 and Lemma 2.1.
For the iterative construction, we augment the cBWT index by a dynamic bit string with the invariant that it is zeroed before and after each extension with a new string. We use this bit string to temporarily mark modified parts of the cBWT index such that we retain the functionality of the index even during construction. The length of the bit string is the number of characters indexed by the cBWT index; for clarity, we call the index together with this bit string the augmented cBWT index in the next lemma.
Lemma 4.8.
Let , , , , , and . The augmented cBWT index of can be extended to index in bits of space and time.
Proof.
We compute the cBWT index of within the claimed complexities by Corollary 4.5. Then we compute , and for each .
- Case 1.
- Case 2.
-
Assume for each and . Let . By assumption and Corollary 3.4, , , and for each . We leverage Lemma 3.14 to preprocess within the claimed complexities. Since , and by construction, we can now use Lemma 4.7 to compute , and for each in descending order. Hence, we obtain , and for all in descending order within the claimed complexities by Lemma 3.6.
Leveraging the backward search on the cBWT index of , we store the -values and -values in lex-order, which takes bits of space. To store the -values in lex-order and stay within the claimed time and space complexity, we utilize , which we assume to be represented by the data structure of Lemma 2.1. For each in descending order, we call , and discard once has been computed. Then . The following statements for each are due to construction.
-
If , then and .
-
If , then and .
-
.
-
If , and , then .
-
If , and , then .
-
If , and , then .
-
If , and , then .
Consequently, queries and operations are needed to update the cBWT index of to index . Finally, we zero . The claimed complexities then follow from Lemma 2.1.
We are finally able to state the main result of this section.
Theorem 4.9.
Let and . The cBWT index of can be constructed in bits of space and time.
Proof.
Let for each , and let denote a zeroed bit string of length for each . First, we construct and the cBWT index of within the claimed complexities by Corollary 4.5, where each string is represented by the data structure of Lemma 2.1. Subsequently, we iteratively extend the cBWT index of augmented by to the cBWT index of augmented by for each in ascending order leveraging Lemma 4.8. Then the cBWT index of is constructed in bits of space and time.
References
- [1] Bastien Auvray, Julien David, Richard Groult, and Thierry Lecroq. Approximate cartesian tree matching: An approach using swaps. In Proc. SPIRE, volume 14240 of LNCS, pages 49–61, 2023. doi:10.1007/978-3-031-43980-3_5.
- [2] Christina Boucher, Davide Cenzato, Zsuzsanna Lipták, Massimiliano Rossi, and Marinella Sciortino. r-indexing the eBWT. In Proc. SPIRE, volume 12944 of LNCS, pages 3–12, 2021. doi:10.1007/978-3-030-86692-1_1.
- [3] Erik D. Demaine, Gad M. Landau, and Oren Weimann. On Cartesian trees and range minimum queries. Algorithmica, 68(3):610–625, 2014. doi:10.1007/s00453-012-9683-x.
- [4] Simone Faro, Thierry Lecroq, Kunsoo Park, and Stefano Scafiti. On the longest common Cartesian substring problem. The Computer Journal, 66(4):907–923, 2022. doi:10.1093/COMJNL/BXAB204.
- [5] Paolo Ferragina and Giovanni Manzini. Opportunistic data structures with applications. In Proc. FOCS, pages 390–398, 2000. doi:10.1109/SFCS.2000.892127.
- [6] Johannes Fischer. Optimal succinctness for range minimum queries. In Proc. LATIN, volume 6034 of LNCS, pages 158–169, 2010. doi:10.1007/978-3-642-12200-2_16.
- [7] Peter Foster, Anssi Klapuri, and Simon Dixon. A method for identifying repetition structure in musical audio based on time series prediction. In Proc. EUSIPCO, pages 1299–1303. IEEE, 2012. URL: https://ieeexplore.ieee.org/document/6334323/.
- [8] Tak-Chung Fu, Korris Fu-Lai Chung, Robert Wing Pong Luk, and Chak-man Ng. Stock time series pattern matching: Template-based vs. rule-based approaches. Eng. Appl. Artif. Intell., 20(3):347–364, 2007. doi:10.1016/J.ENGAPPAI.2006.07.003.
- [9] Mitsuru Funakoshi, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing maximal palindromes in non-standard matching models. In Proc. IWOCA, volume 14764 of LNCS, pages 165–179, 2024. doi:10.1007/978-3-031-63021-7_13.
- [10] Paweł Gawrychowski, Samah Ghazawi, and Gad M. Landau. On indeterminate strings matching. In Proc. CPM, volume 161 of LIPIcs, pages 14:1–14:14, 2020. doi:10.4230/LIPICS.CPM.2020.14.
- [11] Daiki Hashimoto, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, and Ayumi Shinohara. Computing the parameterized Burrows–Wheeler transform online. In Proc. SPIRE, volume 13617 of LNCS, pages 70–85, 2022. doi:10.1007/978-3-031-20643-6_6.
- [12] Yuzuru Hiraga. Structural recognition of music by pattern matching. In Proc. ICMC. Michigan Publishing, 1997. URL: https://hdl.handle.net/2027/spo.bbp2372.1997.113.
- [13] Wing-Kai Hon, Tsung-Han Ku, Chen-Hua Lu, Rahul Shah, and Sharma V. Thankachan. Efficient algorithm for circular Burrows–Wheeler transform. In Proc. CPM, volume 7354 of LNCS, pages 257–268, 2012. doi:10.1007/978-3-642-31265-6_21.
- [14] Kento Iseri, Tomohiro I, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, and Ayumi Shinohara. Breaking a barrier in constructing compact indexes for parameterized pattern matching. In Proc. ICALP, volume 297 of LIPIcs, pages 89:1–89:19, 2024. doi:10.4230/LIPICS.ICALP.2024.89.
- [15] Natsumi Kikuchi, Diptarama Hendrian, Ryo Yoshinaka, and Ayumi Shinohara. Computing covers under substring consistent equivalence relations. In Proc. SPIRE, volume 12303 of LNCS, pages 131–146, 2020. doi:10.1007/978-3-030-59212-7_10.
- [16] Jinil Kim, Peter Eades, Rudolf Fleischer, Seok-Hee Hong, Costas S. Iliopoulos, Kunsoo Park, Simon J. Puglisi, and Takeshi Tokuyama. Order-preserving matching. Theor. Comput. Sci., 525:68–79, 2014. doi:10.1016/J.TCS.2013.10.006.
- [17] Sung-Hwan Kim and Hwan-Gue Cho. A compact index for Cartesian tree matching. In Proc. CPM, volume 191 of LIPIcs, pages 18:1–18:19, 2021. doi:10.4230/LIPIcs.CPM.2021.18.
- [18] Sungmin Kim and Yo-Sub Han. Approximate Cartesian tree pattern matching. In Proc. DLT, volume 14791 of LNCS, pages 189–202, 2024. doi:10.1007/978-3-031-66159-4_14.
- [19] Marcin Kubica, Tomasz Kulczyński, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. A linear time algorithm for consecutive permutation pattern matching. Inf. Process. Lett., 113(12):430–433, 2013. doi:10.1016/J.IPL.2013.03.015.
- [20] Sabrina Mantaci, Antonio Restivo, Giovanna Rosone, and Marinella Sciortino. An extension of the Burrows–Wheeler transform. Theor. Comput. Sci., 387(3):298–312, 2007. doi:10.1016/j.tcs.2007.07.014.
- [21] Yoshiaki Matsuoka, Takahiro Aoki, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Generalized pattern matching and periodicity under substring consistent equivalence relations. Theor. Comput. Sci., 656:225–233, 2016. doi:10.1016/j.tcs.2016.02.017.
- [22] Gonzalo Navarro and Kunihiko Sadakane. Fully functional static and dynamic succinct trees. ACM Trans. Algorithms, 10(3):16:1–16:39, 2014. doi:10.1145/2601073.
- [23] Akio Nishimoto, Noriki Fujisato, Yuto Nakashima, and Shunsuke Inenaga. Position heaps for Cartesian-tree matching on strings and tries. In Proc. SPIRE, volume 12944 of LNCS, pages 241–254, 2021. doi:10.1007/978-3-030-86692-1_20.
- [24] Tsubasa Oizumi, Takeshi Kai, Takuya Mieno, Shunsuke Inenaga, and Hiroki Arimura. Cartesian tree subsequence matching. In Proc. CPM, volume 223 of LIPIcs, pages 14:1–14:18, 2022. doi:10.4230/LIPICS.CPM.2022.14.
- [25] Eric M. Osterkamp and Dominik Köppl. Extending the parameterized Burrows–Wheeler transform. In Proc. DCC, pages 143–152, 2024. doi:10.1109/DCC58796.2024.00022.
- [26] Sung Gwan Park, Magsarjav Bataa, Amihood Amir, Gad M. Landau, and Kunsoo Park. Finding patterns and periods in Cartesian tree matching. Theor. Comput. Sci., 845:181–197, 2020. doi:10.1016/J.TCS.2020.09.014.
- [27] Siwoo Song, Geonmo Gu, Cheol Ryu, Simone Faro, Thierry Lecroq, and Kunsoo Park. Fast algorithms for single and multiple pattern Cartesian tree matching. Theor. Comput. Sci., 849:47–63, 2021. doi:10.1016/J.TCS.2020.10.009.
- [28] Jean Vuillemin. A unifying look at data structures. Commun. ACM, 23(4):229–239, 1980. doi:10.1145/358841.358852.
