Encodings for Range Minimum Queries over Bounded Alphabets
Abstract
Range minimum queries (RMQs) are fundamental operations with widespread applications in database management, text indexing and computational biology. While many space-efficient data structures have been designed for RMQs on arrays with arbitrary elements, no results have been developed for the case when the alphabet size is small, which is the case in many practical scenarios where RMQ structures are used. In this paper, we investigate the encoding complexity of RMQs on arrays over a bounded alphabet. We consider both one-dimensional (1D) and two-dimensional (2D) arrays. For the 1D case, we present a near-optimal space encoding; for constant-sized alphabets, it also supports the queries in constant time. For the 2D case, we systematically analyze the 1-sided, 2-sided, 3-sided and 4-sided queries, deriving lower bounds on the encoding space together with matching upper bounds that support efficient queries in most cases. Our results demonstrate that, even with the bounded-alphabet restriction, the space requirements remain close to those for the general alphabet case.
Keywords and phrases: Range minimum queries, Encoding data structures, Cartesian trees
Funding: Seungbum Jo: This work was supported by research fund of Chungnam National University.
2012 ACM Subject Classification: Theory of computation → Data structures design and analysis
Editors: Paola Bonizzoni and Veli Mäkinen

1 Introduction
Efficiently processing range queries on arrays has been one of the central problems in computer science with a wide range of applications in areas such as database management, text indexing, computational biology etc. In this paper, we focus on a few different variants of range minimum queries (RMQ) on one- and two-dimensional arrays. Given a query range within an array (that we are allowed to preprocess) a range minimum query returns the position of a smallest element within the query range.
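The query semantics just described, including the leftmost tie-breaking rule used throughout this paper, can be captured by the following naive linear scan. This is only a baseline sketch for reference (the function name is ours), not one of the data structures discussed later.

```python
def range_min_query(a, i, j):
    """Return the position of a smallest element in a[i..j] (inclusive).

    Ties are broken by returning the leftmost minimum, matching the
    tie-breaking rule used throughout the paper.
    """
    best = i
    for k in range(i + 1, j + 1):
        if a[k] < a[best]:  # strict '<' keeps the leftmost minimum on ties
            best = k
    return best
```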
Over the past decades, there has been a significant amount of research on designing space-efficient data structures that support fast queries for the RMQ problem. Almost all of these results focus on arrays with arbitrary elements (and a few on binary arrays), whereas in many real-world scenarios one often encounters input arrays that are not arbitrary but drawn from bounded or small alphabets (not necessarily binary), genome sequences being one example. Our aim is to investigate the effect of the alphabet size on the size of the RMQ encoding.
In the one-dimensional (1D) case, many classical data structures [15, 22, 1] have established that RMQ can be answered in constant time using linear space. The space usage was further improved to match the information-theoretic lower bound (up to lower-order terms) [9]. Extending RMQ from one to two dimensions introduces new complexities. Unlike the 1D case, where the Cartesian tree provides a natural and optimal representation for encoding RMQ answers, the two-dimensional (2D) case lacks a straightforward analogue [6].
In this paper we study encoding data structures for RMQ on 1D and 2D arrays. An encoding data structure for a given set of queries is a structure that stores sufficient information to answer those queries without directly accessing the original input data. On the other hand, an indexing data structure refers to a structure that stores the original input while maintaining small auxiliary structures to support queries efficiently. Our primary focus is on establishing upper and lower bounds for RMQ encodings when the input array is drawn from a bounded-sized alphabet. Many data structures, such as text indexing structures [10, 17, 16, 7] and rank/select structures on sequences [14, 19], can be made compact and fast when the input is drawn from a bounded-sized alphabet rather than an arbitrary one. We explore whether one can exploit the bounded-alphabet input to improve the complexity of 1D and 2D RMQ encodings.
1.1 Previous results on RMQ encodings
1D-RMQ encodings.
For the 1D case, any encoding for RMQ on an array of size n requires at least 2n − Θ(log n) bits due to its bijective relationship with the Cartesian tree of the input [23]. The best-known result is the (2n + o(n))-bit data structure by Fischer and Heun [9] that supports O(1) query time. When the input is highly compressible, one can design data structures whose space usage is parameterized by the compressibility of the Cartesian tree of the input [11, 18] while still supporting queries efficiently. Fischer considered the number of distinct Cartesian trees over bounded alphabets in his Ph.D. thesis (see Section 3.9.1 in [8]) and computed, for each alphabet size σ, a constant c_σ such that storing a Cartesian tree constructed from an input of size n over an alphabet of size σ requires at least c_σ · n bits.
2D-RMQ encodings.
When the input is an m × n 2D array with m ≤ n, Demaine et al. [6] showed that there is no Cartesian tree-like structure for the 2D case. Specifically, no structure exists that fully encodes the answers to all queries and can be constructed in linear time. They further showed that any encoding data structure for answering RMQ requires Ω(mn log m) bits. Since the input array can be trivially encoded (with respect to RMQs) using O(mn log n) bits, this implies that any encoding data structure uses asymptotically the same space as indexing data structures [24, 3] when log m = Θ(log n). Thus, most of the subsequent results focused on the case where m is much smaller than n.
Brodal et al. [3] proposed an -bit encoding with query time. Moreover, they proved that any encoding for answering RMQ requires at least bits. Brodal et al. [2] proposed an asymptotically optimal -bit encoding for answering RMQ, although the queries are not supported efficiently using this encoding. For , the problem of designing a -bit data structure for a 2D array that answers queries in sublinear time remains an open problem.
For an m × n 2D array, Golin et al. [13] considered the following four types of queries:
1. 1-sided: [1..m] × [1..j] for j ∈ [n]
2. 2-sided: [1..i] × [1..j] for i ∈ [m], j ∈ [n]
3. 3-sided: [i1..i2] × [1..j] for i1 ≤ i2 ∈ [m], j ∈ [n]
4. 4-sided: [i1..i2] × [j1..j2] for i1 ≤ i2 ∈ [m], j1 ≤ j2 ∈ [n] (any rectangular range).
Golin et al. [13] provided expected upper bounds and matching lower bounds for the encoding space required to answer RMQ for these four query types assuming the elements of the input array are drawn uniformly at random.
1.2 Our results
Table 1: Summary of upper and lower bounds for RMQ encodings on 2D arrays (results marked * are expected bounds from [13]).

| Query type | Space (in bits) | Query time | Reference |
| --- | --- | --- | --- |
| **Upper bounds** | | | |
| 1-sided | * | | [13]* |
| | | | Theorem 5 |
| 2-sided | * | | [13]* |
| | | | Theorem 7 |
| | | | Theorem 10 |
| 3-sided | * | | [13]* |
| | | | Theorem 7 |
| | | | Theorem 13 |
| 4-sided | * | | [13]* |
| | | | [4] |
| | | | [2] |
| | | | Theorem 16 |
| **Lower bounds** | | | |
| 1-sided | * | | [13]* |
| | | | Theorem 6 |
| 2-sided | * | | [13]* |
| | | | Theorem 7 |
| | | | Theorem 11 |
| 3-sided | * | | [13]* |
| | | | Theorem 14 |
| 4-sided | * | | [13]* |
| | | | [4] |
| | | | Theorem 15 |
In this paper, we study encoding data structures for RMQ on 1D and 2D arrays over a bounded-sized alphabet. All the encoding results assume a Θ(log n)-bit word RAM model, where n is the input size. For a 1D array of length n over an alphabet of size σ, we derive a lower bound on the space of any RMQ encoding (Theorem 3). This shows that even for a moderately large alphabet, the lower bound is close to the 2n-bit lower bound for the general alphabet. Moreover, we show that for any constant-sized alphabet, one can achieve optimal space usage (up to lower-order terms) while supporting the queries in constant time (Theorem 4).
For a 2D array with , we first show (Theorem 5) that bits are necessary and sufficient to answer 1-sided RMQ (in constant time). We then generalize this bound to bits when the alphabet size is (Theorem 6). For 2-sided and 3-sided RMQ, we show that bits are necessary and sufficient for answering queries in constant time (Theorem 7 and Corollary 12). On the other hand, when the input 2D array is over an alphabet of size , we show that any 2-sided RMQ encoding requires at least bits (Theorem 11). We also show that one can match this lower bound for some ranges of parameters, by describing a data structure that takes bits which can answer 2-sided RMQ in time (Theorem 10). For 3-sided RMQ, we show that there exists a data structure of size bits that answers queries in time (Theorem 13). We also show that the space bound is asymptotically optimal for some range of parameters by showing that any 3-sided RMQ encoding requires at least bits (Theorem 14). Finally, for the 4-sided RMQ, we first show that bits are necessary and sufficient (Theorem 15), and design an encoding that supports 4-sided RMQ in constant time when is (Theorem 16). See Table 1 for a summary of the results on 2D arrays.
The remainder of this paper is organized as follows. In Section 2, we revisit the RMQ problem in the 1D setting over bounded-sized alphabets, establishing tight upper and lower bounds. Section 3 explores the 2D case, where we systematically consider various types of query ranges (namely, 1-, 2-, 3- and 4-sided queries) and derive upper and lower bounds on the size of the encodings, in all cases achieving asymptotically optimal bounds.
We use the following notation throughout the rest of this paper. Given an input 1D (resp. 2D) array and a 1D range (resp. a 2D rectangular range) on the array, the RMQ returns the position of the smallest element within the given range. In case of a tie, a tie-breaking rule is applied, always returning the leftmost (resp. top-leftmost) position. Also, for a 2D array, (i, j) denotes the position at the i-th row and j-th column, and [n] denotes the set {1, 2, …, n}.
2 RMQ on 1D array over bounded-sized alphabets
In this section, we consider a data structure for answering RMQ on a 1D array A of size n where the array elements are from an alphabet of size σ. We begin with the case σ = 2. In this case, RMQ can be answered in O(1) time using an (n + o(n))-bit data structure [20] that supports rank and select queries on A in O(1) time. Here, a rank query returns the number of occurrences of a symbol in a prefix of A, while a select query returns the position of the j-th occurrence of that symbol in A. Furthermore, the following lemma shows that this data structure is optimal for answering RMQ, up to additive lower-order terms.
Lemma 1.
At least bits are necessary to answer RMQ on a 1D binary array of size .
Proof.
Consider an arbitrary binary array of length with . Then for any , if and only if , since RMQ returns the leftmost position in case of a tie. Consequently, can be fully reconstructed using only RMQ, which proves the lemma.
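The rank/select-based query for the binary case described at the start of this section can be sketched as follows. This is a plain-Python illustration with explicit prefix-count and occurrence tables standing in for the succinct o(n)-bit directories of [20]; the class name and representation are ours, not from the paper.

```python
class BinaryRMQ:
    """RMQ on a binary array via rank/select.

    rank_0(i)   = number of 0's in a[0..i-1] (prefix of length i);
    select_0(k) = position of the k-th 0 (1-based k).
    A succinct implementation would store o(n)-bit directories
    instead of these precomputed lists.
    """

    def __init__(self, a):
        self.zeros = [p for p, x in enumerate(a) if x == 0]  # select_0 table
        self.pref0 = [0]
        for x in a:
            self.pref0.append(self.pref0[-1] + (x == 0))     # rank_0 table

    def rmq(self, i, j):
        # If a[i..j] contains a 0, the leftmost 0 is the answer.
        zeros_before = self.pref0[i]
        if self.pref0[j + 1] - zeros_before > 0:
            return self.zeros[zeros_before]  # select_0(rank_0(i) + 1)
        return i                             # all-ones range: leftmost position
```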
For an arbitrary 1D array of size , it is known that bits are necessary and sufficient (up to lower-order additive terms) to answer RMQ [9]. The lower bound follows from the fact that if two arrays yield different RMQ results, their corresponding Cartesian trees [23] must have distinct shapes. The Cartesian tree of (denoted ) is an unlabeled binary tree defined as follows: (i) The root of corresponds to . (ii) The left and right subtrees of are the Cartesian trees of and , respectively, if they exist.
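The Cartesian tree defined above can be built in linear time with the standard rightmost-spine stack algorithm; the sketch below (function name ours) resolves ties so that the leftmost minimum becomes the root, matching the tie-breaking rule.

```python
def cartesian_tree(a):
    """Build the Cartesian tree of a, returning (root, left, right) arrays.

    Each node's value is <= the values in its subtree; equal elements go
    to the right subtree, so the root holds the leftmost minimum.
    Runs in O(n): every index is pushed and popped at most once.
    """
    n = len(a)
    left = [-1] * n
    right = [-1] * n
    stack = []  # indices on the rightmost spine, values non-decreasing
    for k in range(n):
        last = -1
        while stack and a[stack[-1]] > a[k]:  # strict '>': equal values stay
            last = stack.pop()
        if last != -1:
            left[k] = last        # popped subtree becomes the left child of k
        if stack:
            right[stack[-1]] = k  # hang k off the remaining spine
        stack.append(k)
    return stack[0], left, right
```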
Let us define the left height of a binary tree as the maximum number of left edges on any root-to-leaf path in the binary tree. From the definition of a Cartesian tree and the tie-breaking rule for RMQ, one can observe that the element corresponding to any node in the Cartesian tree is strictly less than the element corresponding to its left child. From this we derive the following observation.
Proposition 2.
Given a 1D array over an alphabet of size , the left height of is at most .
Genitrini et al. [12] count the number of labeled relaxed binary trees with internal nodes, where the left height is bounded by . Roughly speaking, a relaxed binary tree of size is a directed acyclic graph consisting of a binary tree with n internal nodes, one leaf, and pointers. It is constructed from a binary tree of size , where the first leaf in a reverse post-order traversal is kept and all other leaves are replaced by pointers. These links may point to any node that has already been visited by the reverse post-order traversal. See Figure 1 for an example. A Cartesian tree with nodes can be viewed as a relaxed binary tree with internal nodes by adding dummy left (resp. right) leaves for nodes that have no left (resp. right) children.
Furthermore, since a Cartesian tree is an unlabeled tree, its relaxed binary tree representation contains a single leaf node. Therefore, for a 1D array of size n over an alphabet of size σ, the number of distinct Cartesian trees is at least the number of distinct relaxed binary trees with n internal nodes and bounded left height, since exactly one node is added as a rightmost leaf in its relaxed binary tree. Therefore, from [12], we directly obtain the following theorem.
Theorem 3 ([12]).
Given a 1D array of size over an alphabet of size , at least bits are necessary to answer RMQ, where .
For example, already for small alphabet sizes, the number of bits per element required to answer RMQ is close to that of the general case (see Table 1 in [12] for the precise constants; Fischer [8] obtained the same result for one of these cases). This implies that even for constant-sized alphabets, the space lower bound for answering RMQ remains very close to that of the general case. From an upper bound perspective, when σ = O(1), we can obtain a data structure of optimal size (up to lower-order terms) that answers RMQ in O(1) time, as summarized in the following theorem.
Theorem 4.
Given a 1D array of size over an alphabet of size , there exists a data structure of size bits that can answer RMQ in time.
Proof.
We use a slightly modified version of the data structure from Davoodi et al. [5]. Roughly speaking, their approach decomposes into micro-trees of size at most . Each micro-tree is stored as an index into a precomputed table of bits, which contains the information of all possible Cartesian trees of size at most . One can show that these pointers into the precomputed table use bits. Additionally, the data structure includes -bit auxiliary structures to support the queries in time.
We modify the data structure such that the precomputed table stores only Cartesian trees of size at most with left height at most . As a result, one can show that the total space required to store the pointers for all micro-trees is reduced to bits, while all other auxiliary structures remain unchanged.
3 RMQ on 2D array over bounded-sized alphabets
In this section, we consider a data structure for answering RMQ on a 2D array of size over an alphabet of size with .
3.1 1-sided queries
Recall that for 1-sided RMQ, the query range is restricted to [1..m] × [1..j] for j ∈ [n]. We begin by considering the upper and lower bounds of the data structure for the general case.
Theorem 5.
For an array, (1) at least bits are necessary to answer 1-sided RMQ, and (2) there exists an -bit data structure that answers 1-sided RMQ in time.
Proof.
(1) Consider a set containing all possible arrays where each array has exactly one entry with value in the -th column, while all other entries are set to . The total number of such arrays is . Then, any array can be reconstructed using 1-sided RMQ on , since the position of the value in column is given by . Thus, at least bits are necessary to answer 1-sided RMQ.
(2) For all indices such that or , we store the row positions of using at most bits (in this case, is in the -th column). Additionally, we maintain a bit string of size to mark all such indices , along with an auxiliary structure of bits that supports rank and select queries on it in time [20]. Thus, the total size of the data structure is at most bits.
To answer in time, we proceed as follows: (i) identify the rightmost such that using rank and select queries on , and (ii) return , where is the stored row position (of ) corresponding to the -th column.
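The 1-sided scheme from the proof above can be sketched as follows, with a plain sorted list of marked columns standing in for the bitstring with rank/select support (class name and representation ours): we record only the columns where the prefix minimum changes, together with the row of the minimum in that column.

```python
import bisect

class OneSidedRMQ:
    """Sketch of the 1-sided RMQ encoding from the proof of Theorem 5.

    cols[k] is the k-th column where the running minimum of the whole
    prefix strictly decreases; rows[k] is the row of that minimum.
    """

    def __init__(self, a):
        m, n = len(a), len(a[0])
        self.cols, self.rows = [], []
        best = None
        for j in range(n):
            # topmost minimum of column j (ties -> smallest row index)
            r = min(range(m), key=lambda i: (a[i][j], i))
            if best is None or a[r][j] < best:  # strict '<' keeps leftmost column
                best = a[r][j]
                self.cols.append(j)
                self.rows.append(r)

    def rmq(self, j):
        """Position of the minimum of a[0..m-1][0..j] (0-indexed)."""
        k = bisect.bisect_right(self.cols, j) - 1  # rightmost marked col <= j
        return (self.rows[k], self.cols[k])
```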
When the array is defined over an alphabet of size , there exist at most indices where . Therefore, the data structure of Theorem 5 takes at most bits [20], which is smaller than bits when . Furthermore, the following theorem shows that this is optimal when , up to additive lower-order terms (if , the data structure of Theorem 5 gives a data structure with optimal space usage).
Theorem 6.
For an array defined over an alphabet of size , at least bits are necessary to answer 1-sided RMQ.
Proof.
We use a similar argument as in the proof of (1) in Theorem 5. Let be distinct column positions. Define as the set of all possible arrays where: (i) For each , exactly one entry in the -th column has the value . (ii) All other entries are set to . Since can be chosen arbitrarily within the given range, the total size of is . Furthermore, any array can be reconstructed using 1-sided RMQ on . Specifically, for each , the position of the value is given by , where is the -th rightmost column such that the column position of is .
3.2 2-sided queries
In this section, we consider the case where the query range is restricted to with and . As with 1-sided queries, we first give the upper and lower bounds of the data structure for the general case.
Theorem 7.
For an array, (1) at least bits are necessary to answer 2-sided RMQ, and (2) there exists an -bit data structure that answers 2-sided RMQ in time.
Proof.
(1) Let be the set of all possible 1D arrays of size such that for each : (i) , and (ii) for , is either or . The size of is . Furthermore, any array can be reconstructed using for all , based on the tie-breaking rule for RMQ.
Next, let be the set of all possible arrays where the -th row is an array from . The size of is . Moreover, any array can be reconstructed using 2-sided RMQ on , since the -th row of can be determined from for all (note that always lies in the -th row by construction, since any element in the first rows is larger than every element in the -th row). Thus, at least bits are necessary to answer 2-sided RMQ.
(2) Given an input array , define distinct 1D arrays of size where for each and , .
We then construct data structures to answer RMQ in time on each of these arrays (we do not store the arrays themselves), using a total of bits [9]. Additionally, we maintain data structures for answering RMQ on each column of in time, using another bits.
To answer the 2-sided RMQ query, we proceed as follows: (i) compute in time by on , and (ii) compute in time by on the -th column of .
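The two-step query from the proof can be sketched as follows. Here the auxiliary arrays of column prefix minima are stored explicitly and the 1D RMQs are done by naive scans, standing in for the succinct 1D RMQ encodings of [9]; we assume ties resolve to the leftmost column and then the topmost row, and all names are ours.

```python
class TwoSidedRMQ:
    """Sketch of the 2-sided structure from the proof of Theorem 7 (2).

    b[i][j] holds the minimum of column j restricted to rows 0..i.
    A 1D RMQ on row b[i] gives the answer column; a second RMQ on
    that column of the input (rows 0..i) gives the answer row.
    """

    def __init__(self, a):
        self.a = a
        m, n = len(a), len(a[0])
        self.b = [[0] * n for _ in range(m)]
        for j in range(n):
            cur = a[0][j]
            self.b[0][j] = cur
            for i in range(1, m):
                cur = min(cur, a[i][j])
                self.b[i][j] = cur

    def rmq(self, i, j):
        """Position of the minimum of a[0..i][0..j] (0-indexed)."""
        row = self.b[i]
        jj = min(range(j + 1), key=lambda c: row[c])              # answer column
        ii = min(range(i + 1), key=lambda r: (self.a[r][jj], r))  # answer row
        return (ii, jj)
```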
Next, consider the case where the input array is defined over an alphabet of size . For each position , we refer to the range as the 2-sided region defined by . Let be the set of all positions in that serve as answers to some 2-sided RMQ on . For any two positions and of , we say that dominates if and only if and . Because of the tie-breaking rule, it follows that for any two distinct positions and in , if dominates , then . For example, in the 2D array in Figure 2(a), two positions and are in , and since the position dominates the position , is greater than . We partition into a set of staircases such that for any , is the set of all positions with . From the definition, we can directly observe the following:
Proposition 8.
Consider staircases within an array defined over an alphabet of size . Then the following holds for any :
(a) No position in is dominated by any other position within the same staircase.
(b) The -th bottom-most position in coincides with the -th leftmost position in .
(c) The union of 2-sided regions defined by the positions in contains no value smaller than .
For each , we define a lattice path from to that satisfies the following conditions: (i) it moves only north or east and passes through all positions in as its turning points in clockwise order, and (ii) for any position in the array, there exists a position in dominated by if and only if lies on or below . Moreover, can be efficiently encoded using the following lemma:
Lemma 9 ([3]).
can be encoded as a bitstring of size containing ones, where is if and only if the -th move of is north. Using select queries on this bitstring, one can determine whether a position lies on, below, or above .
We can maintain the bitstrings of Lemma 9 using bits [20] that support rank and select queries in time. Given any position and , we can determine whether the value at is at most in time by Lemma 9 and Proposition 8. Consequently, the 2-sided RMQ can be answered in time by performing a binary search to find the smallest where lies on or below . After finding such , we report as the answer, where and denote the number of zeros and ones in up to the -th , respectively (i.e., is the position in dominated by ). Both and can be found in time using rank and select queries on . We summarize the result in the following theorem.
Theorem 10.
For an array defined over an alphabet of size , there exists a data structure using bits that answers 2-sided RMQ in time.
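A minimal sketch of the existence test and binary search behind Theorem 10, with explicit staircase frontiers in place of the lattice-path bitstrings (names and representation ours): the frontier for a value v records, for each column prefix, the smallest row index reaching a value at most v, which is exactly the monotone staircase used above.

```python
class BoundedAlphabetTwoSided:
    """Binary search over values for 2-sided RMQ (value only).

    frontier[v][c] = smallest row r such that a[r][c'] <= v for some
    column c' <= c (a monotone staircase).  The 2-sided range
    [0..i] x [0..j] contains a value <= v iff frontier[v][j] <= i,
    so a binary search over v finds the range minimum value in
    O(log sigma) existence tests.
    """

    def __init__(self, a, sigma):
        m, n = len(a), len(a[0])
        self.frontier = []
        for v in range(sigma):
            f, best = [], m  # m acts as "no such row" sentinel
            for c in range(n):
                top = next((r for r in range(m) if a[r][c] <= v), m)
                best = min(best, top)
                f.append(best)
            self.frontier.append(f)

    def range_min_value(self, i, j):
        """Smallest value in a[0..i][0..j] (0-indexed)."""
        lo, hi = 0, len(self.frontier) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if self.frontier[mid][j] <= i:
                hi = mid          # a value <= mid exists in the range
            else:
                lo = mid + 1
        return lo
```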
Note that for any , the data structure of Theorem 10 uses asymptotically less space than the data structure of Theorem 7. Finally, we consider the space lower bound for answering 2-sided RMQ on a 2D array with a bounded alphabet. The following theorem implies that when for any constant , the data structure in Theorem 10 achieves asymptotically optimal space usage.
Theorem 11.
For an array defined over an alphabet of size , at least bits are necessary to answer 2-sided RMQ.
Proof.
For an array and , we define parallelogram regions from the leftmost part of the array, where each region is determined by the four positions: , , , and . For each region , we construct a staircase , which consists of the positions in whose values are . All other positions in the array are assigned the value . Let be the set of all possible such arrays.
Now, consider two distinct arrays and in where the staircases differ. Denote them as and , respectively. Without loss of generality, let and be the leftmost positions in and , respectively, with . In this case, cannot dominate any positions in , implying that the answers to the query on and are different. Consequently, at least bits are necessary to answer the 2-sided RMQ on an array.
To derive a lower bound on the size of , observe that for each region , we can define a lattice path that does not cross the region boundaries. This path starts at the bottom-left corner of , moves only north or east, and ends at the top-right corner of . Since each distinct lattice path corresponds to a distinct staircase, defined by the positions of its turning points (in a clockwise order), we obtain at least possible lattice paths for each region [21]. See Figure 2(b) for an example. This implies that is at least .
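The path count used in the proof above is the standard count of monotone lattice paths: a path taking w east steps and h north steps is determined by choosing the positions of its east steps among all w + h steps. A one-line sketch (function name ours):

```python
from math import comb

def monotone_paths(w, h):
    """Number of north/east lattice paths from (0, 0) to (w, h):
    choose which of the w + h steps are the w east steps."""
    return comb(w + h, w)
```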
3.3 3-sided queries
In this section, we consider the case where the query range is restricted to with and . Since a 2-sided query is a special case of a 3-sided query with , Theorem 7 implies that at least bits are necessary to answer 3-sided queries. Furthermore, the following corollary shows that this space lower bound is also asymptotically tight for 3-sided queries.
Corollary 12.
For an array , there exists an -bit data structure that answers 3-sided RMQ in time.
Proof.
We construct the same data structure as in the proof of Theorem 7 (2) using bits. To answer the 3-sided RMQ query in time, we first compute by on , and then compute by on the -th column of .

Next, consider the case where the input array is defined over an alphabet of size . For each , define a 1D array of size such that , where is the smallest row index such that (i) and (ii) all preceding values in the column, , are greater than . If the -th column does not contain the value , set , and if the -th column contains the value but does not satisfy condition (ii), set . We maintain the arrays along with RMQ data structures that allow queries to be answered in time, using a total of bits [9].
Given a 3-sided range on and a value , let be the result of on . Then the value exists in the range if and only if . Thus, we can determine whether exists in the range in time and compute the 3-sided RMQ in time by performing a binary search to find the smallest that exists in the range. We summarize this result in the following theorem.
Theorem 13.
For an array defined over an alphabet of size , there exists a data structure of size bits that supports 3-sided RMQ in time.
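The binary search over values behind Theorem 13 can be sketched as follows. The per-column occurrence lists here stand in for the per-value 1D arrays and RMQ structures of the actual proof (all names ours), so the existence test is slower than in the theorem but logically plays the same role.

```python
import bisect

class ThreeSidedRMQ:
    """Binary search over values for 3-sided RMQ (value only).

    occ[v][c] is the sorted list of rows where a value <= v appears
    in column c.  The 3-sided range [i1..i2] x [0..j] contains a
    value <= v iff some column c <= j has such a row inside [i1, i2];
    binary searching over v yields the range minimum value.
    """

    def __init__(self, a, sigma):
        m, n = len(a), len(a[0])
        self.occ = [[[r for r in range(m) if a[r][c] <= v]
                     for c in range(n)] for v in range(sigma)]

    def _exists(self, v, i1, i2, j):
        for c in range(j + 1):
            k = bisect.bisect_left(self.occ[v][c], i1)  # first row >= i1
            if k < len(self.occ[v][c]) and self.occ[v][c][k] <= i2:
                return True
        return False

    def range_min_value(self, i1, i2, j):
        """Smallest value in a[i1..i2][0..j] (0-indexed)."""
        lo, hi = 0, len(self.occ) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if self._exists(mid, i1, i2, j):
                hi = mid
            else:
                lo = mid + 1
        return lo
```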
Compared to the general case, the above data structure requires less space when , and when , it answers 3-sided RMQ in time.
In the general case, the optimal space data structures for answering 2-sided and 3-sided RMQ require asymptotically the same space by Theorem 7 and Corollary 12. However, when the alphabet size is bounded by , the data structure from Theorem 10 uses asymptotically less space than the data structure of Theorem 13 when . Finally, we consider the space lower bound for answering 3-sided RMQ on a 2D array with a bounded alphabet. The following theorem implies that when , the data structure in Theorem 13 achieves asymptotically optimal space usage.
Theorem 14.
For an array defined over an alphabet of size , at least bits are necessary to answer 3-sided RMQ.
Proof.
Let be the set of all possible 1D arrays of size such that for each : (i) for , there is exactly one occurrence of at position , (ii) , and (iii) all other positions have the value . Then the size of is . Next, let be the set of all possible arrays where each column is chosen from . The size of is . Moreover, any array can be reconstructed using 3-sided RMQ on , since the -th column of can be reconstructed from for all ; specifically, for any , the value appears at , where is the index of the -th row from the bottom that satisfies . This proves the theorem.
3.4 4-sided queries
In this section, we consider the case where the query range is an arbitrary rectangular region. When the alphabet size is unbounded, there exists a data structure using bits that answers RMQ in time, while at least bits are necessary for answering RMQ [3]. A -bit encoding is also known [2], although it does not support the queries efficiently. Here, we focus on the case where the input array is defined over a bounded alphabet of size . First, we give a lower bound on the space required, as stated in the following theorem.
Theorem 15.
For an array defined over an alphabet of size , at least bits are necessary to answer 4-sided RMQ, assuming .
Proof.
Our proof is analogous to the -bit lower bound proof of Brodal et al. [3]. First, divide an input array into blocks of size , and further divide each block into four sub-blocks of size . Next, assign values to each block using the following procedure:
- Assign the odd values from to the upper-left sub-block in increasing order, following a row-major order.
- Assign the even values from to the anti-diagonals in the bottom-right sub-block, such that the values in each anti-diagonal are larger than those in the anti-diagonals to the right.
- Assign to all the remaining positions in the block.
Let be the set of all arrays where values are assigned according to the above procedure. Since for any array , each anti-diagonal within any sub-block of forms a permutation, and there are blocks in , the total size of is at least . Thus, is . Now, consider two distinct arrays and in where . From the assignment procedure, the position lies on an anti-diagonal in the bottom-right sub-block of some block (note that for every position in the other three sub-blocks of every block). There exists another position in the upper-left sub-block of the same block such that (see Figure 3(a) for an example). Thus, on returns , while the same query on returns . This implies that at least different arrays have distinct RMQ answers.
Theorem 15 implies that if , the -bit indexing data structure (i.e., a data structure that explicitly stores the input array) of Brodal et al. [3], which answers RMQ in time, uses asymptotically optimal space for the bounded-alphabet case. This means that in most scenarios, restricting the alphabet size provides little advantage in terms of space efficiency compared to the general case. However, when , the index term in the above data structure of [3] can dominate the space for storing the input. In the following theorem, we show that this can be improved to bits while still supporting the same query time. (In fact, the result of [3] provides a trade-off between index size and query time: if only -bit space is used for the index, the query time increases to .)
Theorem 16.
For an array defined over an alphabet of size , there exists a data structure of size bits that supports 4-sided RMQ in time.
Proof.
Let . Then we partition into blocks of size and construct the following structures:
- To efficiently handle queries whose range falls entirely within a single block, we store an index data structure from Brodal et al. [3] for each block as follows. We first store a precomputed table of size , which contains all possible index structures from [3] for blocks of size . Since each block can be represented as an index into this precomputed table using bits, the total space required for storing the indices of all blocks is bits. As we maintain these index structures, we can also access any position in the array in time.
- Next, we define an auxiliary array , where each entry is given by , where is a column position of on . We maintain the index data structure of [3] on , using bits. This allows us to access any position in in time and answer RMQ on in time when both the leftmost and rightmost columns of the query range align with block boundaries. Note that the exact column position of the answer can be determined using the RMQ structure on individual blocks. Similarly, we define an auxiliary array and maintain the same structure using bits so that we can answer RMQ on in time when both the topmost and bottommost rows of the query range align with block boundaries.
- Finally, let be an array, where each entry is given by , where is on . We maintain the index data structure of [3] on using bits. This allows us to access any position in in time and answer RMQ on in time when all boundaries of the query range align with block boundaries.
Any rectangular query range on can be partitioned into the following regions: (i) at most four rectangular regions fully contained within a single block, (ii) at most two rectangular regions where the leftmost and rightmost columns align with block boundaries, (iii) at most two rectangular regions where the topmost and bottommost rows align with block boundaries, and (iv) at most one rectangular region where all four boundaries align with block boundaries. Using the structures described above, we can determine the position and value of the minimum element in each partitioned region in time. See Figure 3(b) for an example. The final result is obtained in time by returning the position of the minimum element among them.
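The decomposition just described can be sketched as follows, returning the minimum value for brevity (the actual structure returns positions). The naive strip scans stand in for the per-block and row/column structures of [3], the coarse block-minima array stands in for the structure on block minima, and all names are ours.

```python
class BlockRMQ:
    """Sketch of the block decomposition in the proof of Theorem 16.

    The array is cut into t x t blocks.  A query is split into a
    block-aligned core (answered from the coarse block-minima array)
    plus boundary strips (answered by scanning here).
    """

    def __init__(self, a, t):
        self.a, self.t = a, t
        m, n = len(a), len(a[0])
        # bmin[bi][bj] = minimum of block (bi, bj)
        self.bmin = [[min(min(row[c:c + t]) for row in a[r:r + t])
                      for c in range(0, n, t)] for r in range(0, m, t)]

    def _scan(self, i1, i2, j1, j2):
        return min(self.a[r][c] for r in range(i1, i2 + 1)
                   for c in range(j1, j2 + 1))

    def rmq_value(self, i1, i2, j1, j2):
        t = self.t
        # full blocks strictly inside the query (ceil / floor divisions)
        bi1, bi2 = -(-i1 // t), (i2 + 1) // t - 1
        bj1, bj2 = -(-j1 // t), (j2 + 1) // t - 1
        best = min((self.bmin[bi][bj]
                    for bi in range(bi1, bi2 + 1)
                    for bj in range(bj1, bj2 + 1)), default=None)
        strips = []
        if bi1 > bi2 or bj1 > bj2:        # no aligned core at all
            strips.append((i1, i2, j1, j2))
        else:                             # top, bottom, left, right strips
            if i1 < bi1 * t:
                strips.append((i1, bi1 * t - 1, j1, j2))
            if i2 >= (bi2 + 1) * t:
                strips.append(((bi2 + 1) * t, i2, j1, j2))
            if j1 < bj1 * t:
                strips.append((bi1 * t, (bi2 + 1) * t - 1, j1, bj1 * t - 1))
            if j2 >= (bj2 + 1) * t:
                strips.append((bi1 * t, (bi2 + 1) * t - 1, (bj2 + 1) * t, j2))
        for s in strips:
            v = self._scan(*s)
            best = v if best is None else min(best, v)
        return best
```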
4 Conclusions
In this paper, we studied encoding data structures for RMQ on 1D and 2D arrays when the alphabet size is bounded. Most of our data structures use asymptotically optimal space across a wide range of alphabet sizes. We also compared our results to the general case with no restrictions on the alphabet size. For 1D arrays, the space usage differs only slightly, even for a constant-sized alphabet. For 2D arrays, our data structures asymptotically require less space than in the general case over a wide range of alphabet sizes, from both the upper and lower bound perspectives. For example, for a 2D array, we showed that the space usage can be improved for all types of queries when the alphabet size is sufficiently small.
It would be interesting to explore other parameters, such as compressed matrix size, that could further improve space usage for encoding data structures for RMQ.