
Optimal Static Fully Indexable Dictionaries

Jingxun Liang, Carnegie Mellon University, Pittsburgh, PA, USA
Renfei Zhou, Carnegie Mellon University, Pittsburgh, PA, USA
Abstract

Fully indexable dictionaries (FIDs) store sets of integer keys while supporting rank/select queries. They serve as basic building blocks in many succinct data structures. Despite the great importance of FIDs, no known FID is succinct with efficient query time when the universe size U is a large polynomial in the number of keys n, which is the conventional parameter regime for dictionary problems. In this paper, we design an FID that uses log (U choose n) + n/(log U/t)^{Ω(t)} bits of space and answers rank/select queries in O(t + log log n) time in the worst case, for any parameter 1 ≤ t ≤ log n/log log n, provided U = n^{1+Θ(1)}. This time-space trade-off matches known lower bounds for FIDs [40, 41, 46] when t ≤ log^{0.99} n.

Our techniques also lead to efficient succinct data structures for the fundamental problem of maintaining n integers, each of ℓ = Θ(log n) bits, and supporting partial-sum queries, with a trade-off between O(t) query time and nℓ + n/(log n/t)^{Ω(t)} bits of space. Prior to this work, no known data structure for the partial-sum problem achieves constant query time with nℓ + o(n) bits of space usage.

Keywords and phrases:
data structures, dictionaries, space efficiency
Category:
Track A: Algorithms, Complexity and Games
Funding:
Renfei Zhou: Partially supported by the MongoDB PhD Fellowship.
Copyright and License:
© Jingxun Liang and Renfei Zhou; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Data structures design and analysis
Related Version:
Full Version: https://arxiv.org/abs/2504.19350
Acknowledgements:
The authors thank William Kuszmaul and Huacheng Yu for helpful suggestions on paper writing, and anonymous reviewers for pointing out important related works.
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

1 Introduction

A fully indexable dictionary (a.k.a. rank/select dictionary; FID for short) is a fundamental data structure, which stores a set S of n keys from a universe [U], supporting Rank and Select queries:

  • Rank(x): return the number of keys in S that are smaller than or equal to x.

  • Select(i): return the i-th smallest key in S.
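As a concrete point of reference, the two queries can be sketched over a plain sorted array. This baseline is not succinct (it stores the keys explicitly in O(n log U) bits); the class name is illustrative, not from the paper:

```python
from bisect import bisect_right

class NaiveFID:
    """Plain sorted-array baseline for Rank/Select; explicit and non-succinct."""
    def __init__(self, keys):
        self.keys = sorted(keys)

    def rank(self, x):
        # Number of keys in S that are smaller than or equal to x.
        return bisect_right(self.keys, x)

    def select(self, i):
        # The i-th smallest key (1-indexed, as in the paper).
        return self.keys[i - 1]

fid = NaiveFID({3, 14, 15, 92, 65})
assert fid.rank(15) == 3
assert fid.select(2) == 14
```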

This paper focuses on static FIDs, where the key set is given at the beginning and does not change over time. There is also the dynamic case, allowing updates to the key set via insertions and deletions, which is not considered in this paper.

FIDs are powerful data structures with numerous applications. First, if we choose to think of S as an indicator vector in {0,1}^U, then the problem becomes storing a bit string containing n ones and U − n zeros, while supporting prefix-sum queries and queries for the position of the i-th one. This so-called rank/select problem serves as a subroutine in many space-efficient data structures. Second, given Rank and Select, one can also support predecessor search queries (i.e., find the largest element in S that does not exceed x) by calling Select(Rank(x)) – this, too, serves as a common data-structural subroutine. The result is that FIDs have many applications in data structures for strings [10, 14, 24, 26, 36, 37, 21, 13], trees [12, 18, 44, 35], parentheses sequences [17, 35], multisets [44], permutations [34], variations of dictionaries [8, 7], etc.
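The predecessor-search reduction mentioned above can be sketched as follows; this is a minimal illustration on a sorted list, with helper names that are ours, not the paper's:

```python
from bisect import bisect_right

def rank(keys, x):      # keys: a sorted list; number of keys <= x
    return bisect_right(keys, x)

def select(keys, i):    # i-th smallest key (1-indexed)
    return keys[i - 1]

def predecessor(keys, x):
    """Largest key <= x, via the reduction Predecessor(x) = Select(Rank(x))."""
    r = rank(keys, x)
    return select(keys, r) if r >= 1 else None

keys = [3, 14, 15, 65, 92]
assert predecessor(keys, 64) == 15
assert predecessor(keys, 2) is None   # no key <= 2
```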

The most fundamental question regarding FIDs is to determine their best possible time-space trade-off. If an FID storing n keys from the universe [U] uses log (U choose n) + R bits of space, we say it incurs R bits of redundancy, where the first term log (U choose n) is referred to as the information-theoretic optimum. Conventionally, an FID is said to be compact if the redundancy R = O(log (U choose n)), and succinct if R = o(log (U choose n)). The time-space trade-off of an FID is the relationship between the redundancy R and the worst-case query time under a RAM with word size Θ(log U).

Towards the optimal trade-off

There is vast literature on space-efficient FIDs. The earliest works [28, 33, 9] on FIDs focused on the setting where one stores S's indicator vector using U bits in the plain format, while using up to R additional bits to store auxiliary information. Compared to the information-theoretic optimum log (U choose n), their approaches are succinct only if U = (2 ± o(1))·n. In 2002, Raman, Raman, and Satti [44] constructed an FID with redundancy O(U log log U / log U) and constant query time. Later, Pǎtraşcu [39] reduced the redundancy to U/(log U/t)^{Ω(t)} + O(U^{3/4}) bits with worst-case query time O(t).

Some applications of FIDs demand storing dense sets where the universe size U = O(n); in this case, Pǎtraşcu's FID is already succinct and achieves an ideal trade-off between O(t) query time and n/(log n/t)^{Ω(t)} bits of redundancy, which is provably optimal when t ≤ log^{0.99} n [41, 46]. However, when U = n^{1+Θ(1)}, which is the conventional parameter regime for dictionary problems, the redundancy of Pǎtraşcu's FID is polynomially larger than the information-theoretic optimum log (U choose n) = Θ(n log n), even for large running times such as t = log^{0.99} n, and thus it is neither succinct nor compact. Gupta et al. [25] further improved the redundancy to O(n log log n) bits, thus achieving succinctness, at the cost of slower queries of O(log² log n) time. It is currently not known how to construct succinct FIDs with query time o(log² log n).

On the lower-bound side, Pǎtraşcu and Thorup [40] showed an Ω(log log n) time lower bound for predecessor queries, assuming that U = n^{1+Θ(1)} and that the data structure is compact. Their lower bound also applies to FIDs by a reduction. Another lower bound, proven by Pǎtraşcu and Viola [41, 46], shows that if an FID takes O(t) worst-case time per query, it must incur n/(log n)^{O(t)} bits of redundancy. The best-known lower bound for FIDs is a simple combination of these two independent lower bounds: when the redundancy equals n/(log n)^{Ω(t)}, the worst-case query time must be at least Ω(max{log log n, t}), provided U = n^{1+Θ(1)}. There remains a huge gap between the lower and upper bounds.

In summary, it has remained one of the basic open questions in the field whether one can hope to construct an FID that is both succinct and that supports queries in the optimal time of O(loglogn). Moreover, although there are well-established lower bounds for the time-space trade-off of a static FID, it remains open on the upper-bound side to obtain tight bounds at any point along the curve. These are the problems that we seek to resolve in the current paper.

This paper: Tight upper bounds for FIDs

In this paper, we construct a succinct FID as shown in the following theorem.

Theorem 1.

For any parameters n, U, t with U = n^{1+Θ(1)} and t ≤ log n/log log n, there is a static fully indexable dictionary with query time O(t + log log n) and redundancy R = n/(log n/t)^{Ω(t)}, in the word RAM model with word size w = Θ(log n).

Thus, it is possible to achieve succinctness while offering an optimal time bound of t = O(log log n) – in this case, the redundancy of our construction is n/(log n)^{Ω(log log n)} ≤ n/polylog n bits. More generally, the time-space trade-off offered by the above theorem is provably optimal for all t ≤ log^{0.99} n, as, in every parameter regime, it matches one of the two known lower bounds. Somewhat surprisingly, this means that the maximum of the two completely independent lower bounds forms a single tight lower bound for FIDs.

High-level technical approach

To understand the high-level approach taken by our data structure, let us consider the lower bound in [40], which points out a distribution of hard instances such that any data structure with near-linear space O(n · polylog n) needs to spend Ω(log log n) time per query on these inputs.

A critical insight in the current paper is that, although it is difficult to improve the query time for these hard instances, they are well-structured so that they only occupy a small fraction of all possible inputs. Therefore, in principle, the space we actually need to store these hard instances is small.

Moreover, if we restrict only to these hard instances, then we can afford to use a data structure that is space-efficient compared to the optimal bound for FIDs but that is space-inefficient compared to the information-theoretic optimum for hard instances. So we can handle hard instances by creating data structures that are morally space inefficient (i.e., not optimal for these hard instances), but that are nonetheless space efficient in the context of the overall FID problem.

On the flip side, when an input is not hard, we can think of it morally as being like a “random” input. The entropy of random inputs is large, so we cannot waste much space. Fortunately, random instances can benefit from the same high-level techniques that have already been developed in past work [39], allowing for fast queries with good space efficiency.

Of course, “hard instances” and “random instances” are really just two extremes in a large design space of inputs. What is remarkable is that, nonetheless, it is possible to combine the high-level approaches described above in order to construct a single data structure that achieves the optimal time-space trade-off for all inputs.

Other implications of our techniques

As mentioned earlier, FIDs can naturally support predecessor search queries, so our result also improves the predecessor data structures:

Corollary 2.

For any parameters n, U with U = n^{1+Θ(1)}, there is a data structure storing a set S of n keys from a universe [U] that supports predecessor searches in S, with the same time-space trade-off as in Theorem 1.

Prior to this result, it remained open whether one could achieve O(loglogn) query time while also offering o(n) bits of redundancy [42].

By adjusting our data structure for FIDs, we show the following time-space trade-off for so-called select dictionaries, which are dictionaries that support Select but not Rank:

Theorem 3.

For any parameters n, U, t with U = n^{1+Θ(1)} and t ≤ log n/log log n, there is a static dictionary that answers Select queries within O(t) time and has redundancy R = n/(log n/t)^{Ω(t)}, in the word RAM model with word size w = Θ(log n).

Here, again, the bounds that we achieve are provably optimal, as they match a lower bound previously proven by [41]. This represents a significant improvement over the previous state of the art [44], which was unable to achieve space bounds any better than O(n/log n).

Another application of our techniques is to the basic question of storing n Θ(logn)-bit integers and answering partial-sum queries.

Theorem 4.

For any parameters n, ℓ, t with ℓ = Θ(log n) and t ≤ log n/log log n, there is a data structure storing a sequence of n ℓ-bit integers a_1, …, a_n, which uses nℓ + n/(log n/t)^{Ω(t)} bits of space and can answer the partial-sum queries Σ_{j=1}^{i} a_j for any given i within O(t) time.

For this partial-sum problem, the previous state of the art with O(1) query time incurred O(n) bits of redundancy [43]. Whether or not this could be reduced to o(n) remained an open question. Our bound reduces it all the way to n/polylog n bits, for a polylogarithm of our choice.
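To make the time-space tension concrete, here is a minimal two-level partial-sum sketch (not the paper's construction): prefix sums are precomputed once per block, and a query scans at most one block, so block size s trades O(s) scan time against roughly (n/s)·log(total sum) extra bits. The class name and block size are illustrative.

```python
class BlockedPartialSum:
    """Two-level partial-sum sketch: precomputed block prefix sums + in-block scan."""
    def __init__(self, a, s=4):
        self.a, self.s = list(a), s
        # block[q] = sum of the first q*s entries
        self.block = [0]
        for i in range(s, len(self.a) + 1, s):
            self.block.append(self.block[-1] + sum(self.a[i - s:i]))

    def partial_sum(self, i):
        """Sum of a[0..i-1]: one table lookup plus a scan of < s entries."""
        q, r = divmod(i, self.s)
        return self.block[q] + sum(self.a[q * self.s : q * self.s + r])

ps = BlockedPartialSum([5, 1, 4, 1, 5, 9, 2, 6], s=4)
assert ps.partial_sum(6) == 25   # 5+1+4+1+5+9
```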

1.1 Related works

Grossi et al. [23] studied the FID problem for polynomial universe sizes with O(1) query time. In this setting, achieving a compact space usage of O(n log n) bits is impossible due to the lower bound established by [40]. They showed a trade-off between O(ε^{−1}) query time and O(n^{1+ε}) bits of space.

In the special case where the universe size U = 2n, Yu [47] presented a data structure that supports Rank queries in O(t) time, but not Select queries. This data structure incurs only n/(log n)^{Ω(t)} + n^{1−Ω(1)} bits of redundancy, outperforming Pǎtraşcu's data structure when t = (log n)^{1−o(1)} is large, and matching the lower bound of [40] for all t.

A closely related setting is the dynamic FID problem, where, in addition to answering Rank and Select queries, the data structure also needs to support fast insertions and deletions. When the universe size U is linear in the number of keys n, Li et al. [30] achieved O(n/2^{log^{0.199} n}) bits of redundancy with the optimal operational time of O(log n/log log n). For polynomial universe sizes U = n^{1+Θ(1)}, it remains open whether a succinct data structure with O(log n/log log n) time per operation can be constructed.

Predecessor data structures are also closely related to FIDs and have been extensively studied across various parameter regimes and settings due to their importance. For a comprehensive overview, see the survey by Navarro and Rojas-Ledesma [38].

Another variant of dictionary is the unordered dictionary problem, where the data structure only needs to answer membership queries – whether a given key x is in the current key set. A series of works has focused on both static and dynamic unordered dictionaries [48, 27, 45, 31, 6, 32, 5], leading to a static unordered dictionary with constant query time and n^ε bits of redundancy [27], as well as tight upper and lower bounds for the time-space trade-off of dynamic unordered dictionaries for polynomial universe sizes [31, 6, 32].

Raman, Raman, and Satti [44] studied the indexable dictionary (ID) problem, which is similar to FIDs, but Rank(x) only returns the rank of x when x is present in the current key set; otherwise, it returns "not exist." They constructed IDs with O(n/log n) bits of redundancy and O(1) query time. Note that the ID setting is easier than FIDs, allowing data structures to achieve both succinctness and constant-time queries.

A further relaxation of ID is monotone minimal perfect hashing (MMPH), where the data structure only needs to support Rank queries for elements in the key set – if x is not in the key set, Rank(x) may return an arbitrary answer. (Select queries are not required.) The key point of this relaxation is that MMPH data structures use asymptotically less space: Belazzougui, Boldi, Pagh, and Vigna [3] constructed a data structure that uses only O(n log log log U) bits, while encoding the key set itself requires log (U choose n) = O(n log(U/n)) bits. This bound is shown to be optimal by Assadi, Farach-Colton, and Kuszmaul [1] (see also [29]).

There is also a line of research on rank/select problems over arbitrary alphabets [21, 19, 15, 20, 22, 2, 4]. Given a sequence in [σ]^n, the select query asks for the position of the k-th occurrence of a given symbol s ∈ [σ]; rank queries are defined similarly. These problems generalize FIDs, which correspond to the special case σ = 2. For arbitrary σ, the optimal time/redundancy trade-off is still not fully understood [4].

2 Preliminaries

Augmented B-trees

We will use the augmented B-trees (aB-trees for short) from [39] as a subroutine.

Let B and m be parameters such that m is a power of B, and let A[1..m] be an array of elements over the alphabet Σ. An aB-tree of branching factor B and size m is a full B-ary tree over m leaves, which correspond to the entries of the array A[1..m]. Additionally:

  • Each node of the aB-tree is augmented by a label from the label alphabet Φ. The label of a leaf node is determined by the corresponding entry A[i] in the array, and the label of an internal node is determined by the label sequence of its children. Formally, there is a transition function 𝒜 determining the label ϕ_u of each internal node u: ϕ_u = 𝒜(ϕ⃗), where ϕ⃗ ≜ (ϕ_1, ϕ_2, …, ϕ_B) is the label sequence of u's B children.

  • There is a recursive query algorithm, which starts by examining the label of the root, and then recursively traverses down a path from the root to some leaf of the tree. At each step, the algorithm examines the label of the current node and the labels of its children to determine which child to recurse on. After reaching a leaf, the algorithm outputs the answer to the query based on all the examined labels. Furthermore, this algorithm is restricted to spending only O(1) time at each examined node, ensuring that the query time remains at most O(log_B m).

Beyond [39], we also consider incomplete aB-trees. Let m be any integer, not necessarily a power of B, and let t be an integer such that B^t ≥ m. An incomplete aB-tree with branching factor B and size m is derived from a (full) aB-tree over B^t leaves with the same branching factor B, by retaining only the first m leaves, removing the other B^t − m leaves, and (repeatedly) deleting all internal nodes left without any child. In such an incomplete aB-tree, the transition function 𝒜 of labels still follows the form ϕ_u = 𝒜(ϕ⃗), but the input label sequence ϕ⃗ = (ϕ_1, …, ϕ_ℓ) may have length ℓ < B, as u might possess fewer than B children.
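The following toy sketch illustrates the aB-tree mechanics under two simplifying assumptions: labels are subtree sums, and each node is handled by scanning its at most B children directly (the real construction handles each node in O(1) via lookup tables). All names are ours, and the sketch assumes at least two leaves.

```python
# Toy aB-tree with branching factor B over an array of non-negative integers.
# Labels are subtree sums; a query walks a single root-to-leaf path.
B = 4

def build(a):
    """Return the list of levels, leaves first; each node stores its subtree sum."""
    levels = [list(a)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sum(prev[i:i + B]) for i in range(0, len(prev), B)])
    return levels

def select_prefix(levels, v):
    """Largest count i with a[0] + ... + a[i-1] <= v, via one root-to-leaf walk."""
    node, acc = 0, 0
    for lvl in reversed(levels[:-1]):   # from just below the root down to the leaves
        lo = node * B
        for child in range(lo, min(lo + B, len(lvl))):
            if acc + lvl[child] > v:    # this child's subtree overshoots: descend into it
                break
            acc += lvl[child]           # otherwise consume the whole subtree
        else:
            child = min(lo + B, len(lvl))
        node = child
    return node

a = [2, 0, 0, 3, 1, 0, 4, 0]
lv = build(a)
assert select_prefix(lv, 5) == 4   # 2+0+0+3 = 5 <= 5, but 2+0+0+3+1 = 6 > 5
```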

Let 𝒩(m, ϕ) be the number of instances of A[1..m] such that the aB-tree of branching factor B over it has root label ϕ. According to Theorem 8 of [39], when m is a power of B, we can compress an aB-tree of size m and root label ϕ to within log 𝒩(m, ϕ) + 2 bits. Their proof directly works for incomplete aB-trees (m ≤ B^t) as well.

Lemma 5 (Natural generalization of [39, Theorem 8]).

Let m, B, t be parameters with m ≤ B^t and B = O(w/log(m + |Φ|)). Suppose there is an (incomplete) aB-tree with branching factor B, size m, and root label ϕ; then we can compress this aB-tree to within log 𝒩(m, ϕ) + 2 bits with query time O(t), in the word RAM model with word size w. The data structure uses a lookup table of O(B^{2t}|Σ| + B^{3t}|Φ|^{2B}) words, which depends only on B and t.

Notice that the lookup table depends only on B and t, but not on m. The original construction in [39], when applied to incomplete aB-trees, uses a lookup table of O(B(|Σ| + |Φ|^{B+1} + B|Φ|^B)) words, which depends on B, t, and m. To avoid the dependency on m, we simply concatenate B^t such lookup tables for all values of m, totaling at most O(B^{t+1}(|Σ| + |Φ|^{B+1} + B|Φ|^B)) ≤ O(B^{2t}|Σ| + B^{3t}|Φ|^{2B}) words, and let aB-trees of different sizes m use different parts of the concatenated lookup table. Later, when we apply this lemma in our data structure, we will let multiple instances with the same B and t (but different m) share a single lookup table.

Predecessor data structures

We will use the following extension of predecessor data structure as a subroutine.

Problem 1 (Predecessor with associated values).

Storing a set S ⊆ [U] of n keys (we use the notation [n] to denote the set {0, 1, …, n−1} for any non-negative integer n, and [a, b] to denote the set of integers {a, a+1, …, b} when there is no ambiguity), where each key is associated with a value in [V], supporting predecessor and successor queries:

  • Predecessor(x): Return the largest element y ∈ S such that y ≤ x, and the associated value of y.

  • Successor(x): Return the smallest element y ∈ S such that y > x, and the associated value of y.
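A sorted-array stand-in for this interface (illustrative names; the actual construction behind Lemma 6 below combines a predecessor structure with perfect hashing) might look like:

```python
from bisect import bisect_right

class PredSucc:
    """Predecessor/successor with associated values; sorted arrays as a stand-in."""
    def __init__(self, pairs):            # pairs: iterable of (key, value)
        items = sorted(pairs)
        self.keys = [k for k, _ in items]
        self.vals = [v for _, v in items]

    def predecessor(self, x):             # largest y <= x, with its value
        j = bisect_right(self.keys, x) - 1
        return (self.keys[j], self.vals[j]) if j >= 0 else None

    def successor(self, x):               # smallest y > x, with its value
        j = bisect_right(self.keys, x)
        return (self.keys[j], self.vals[j]) if j < len(self.keys) else None

ds = PredSucc([(10, 'a'), (20, 'b'), (30, 'c')])
assert ds.predecessor(25) == (20, 'b')
assert ds.successor(25) == (30, 'c')
```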

By studies on dictionaries [16] and predecessor data structures [40], there is a compact construction for predecessor data structures with associated values:

Lemma 6.

For any U ≥ n, there is a predecessor data structure with associated values using O(n log U + n log V) bits, with query time O(log log (U/n)) in the worst case.

Proof Sketch.

We maintain the following data structures simultaneously:

  • A predecessor data structure for the set S ⊆ [U] by [40], with space O(n log U) and query time O(log log (U/n)).

  • A successor data structure similar to the predecessor data structure.

  • A “perfect hashing” structure [16, 11] storing the key set S with the associated values, with a space complexity of O(n(log U + log V)) bits and a worst-case query time of O(1).

Throughout this paper, we use the terminology “predecessor data structure” for short to refer to the data structure defined by Lemma 6.

3 Basic data structure for FIDs

In this section, we construct a basic data structure for FIDs with slightly worse time and space guarantees than required by Theorem 1. It will serve as a subroutine in the final data structure in Section 4: the final data structure is based on the algorithmic framework of this section, and by replacing one of its subroutines with the result of this section (Theorem 7) in a non-recursive way, it achieves the ideal time-space trade-off. Formally, we will prove the following theorem:

Theorem 7 (Weak version of Theorem 1).

For any parameters U, n, t with U = n^{1+Ω(1)}, t ≤ log U/log log U, and any constant ε > 0, there is a static fully indexable dictionary with query time O(t log log U) and redundancy R = max{n/(log U/t)^{Ω(t)}, O(log U)}, in the word RAM model with word size w = Θ(log U). The data structure uses a lookup table of O(U^{10ε}) words that depends only on U and t, which can be shared among multiple instances of fully indexable dictionaries.

Although in most applications of FIDs we care about polynomially-sized universes U = n^{1+Θ(1)}, here we also consider the parameter regime where n = U^{o(1)} is significantly smaller than U. The reason is that, later in Section 4, we will use Theorem 7 to maintain short subsequences of keys.

In the remainder of this section, we construct this basic data structure to prove Theorem 7. Table 1 lists the main parameters and notations we will introduce in this section.

Table 1: Table of Notations.
Notation Explanation
t A parameter indicating that our algorithm's time constraint is O(t log log U).
R R ≜ max{n/(log U/t)^{Ω(t)}, O(log U)} is the desired redundancy of our FID.
B The branching factor of aB-trees in the mid parts. B log B = (ε log U)/t. When t ≤ log^{0.99} U, we have B = log^{Θ(1)} U.
B^t The number of keys in each block.
h h ≜ t log B. Each mid part consists of 2h bits.
b b ≜ log(U/n) − h. Each low part consists of b bits.
x_i^{(low)}, δ_i^{(mid)}, δ_i^{(high)} The values of the low, mid, and high parts. See Figure 1.
δ_i The value of the mid and high parts together. It equals δ_i^{(mid)} + 2^{2h}·δ_i^{(high)}.
δ̄_i δ̄_i ≜ Σ_{j=1}^{i} δ_j is the prefix sum of (δ_i).
Δ Δ = δ̄_{B^t} is the sum of all δ_i within a block.

Partitioning into blocks

Let S = {x_1, x_2, …, x_n} be the set of keys we need to store, where x_1 < x_2 < ⋯ < x_n. The first step of the construction is to divide the sequence (x_1, x_2, …, x_n) into small blocks, and to store an inter-block data structure that reduces the entire FID problem to small-scale FID problems within each block.

Let B be a parameter with B log B = (ε log U)/t, and break the whole sequence into blocks of size B^t. As the sequence (x_1, x_2, …, x_n) is monotonically increasing, the partition into blocks can be viewed as partitioning the possible range of the keys [U] into ⌈n/B^t⌉ disjoint intervals, where the k-th block corresponds to the interval (x_{(k−1)B^t}, x_{kB^t}] (we let x_0 = 0 for convenience). When n is not divisible by B^t, especially when n < B^t, the size of the last block is smaller than B^t. Fortunately, the same construction below works for any block size m ≤ B^t, and we only illustrate our construction for block size B^t.

Our inter-block data structure consists of the following two parts:

  • The sequence of endpoints of the intervals, i.e., (x_{B^t}, x_{2B^t}, …, x_n).

  • A predecessor data structure for the sequence of endpoints (x_{B^t}, x_{2B^t}, …, x_n), where each entry x_{kB^t} is associated with the value k.

Given the above auxiliary information, we view each block k as an FID with B^t keys from a universe of size x_{kB^t} − x_{(k−1)B^t}. When we perform a rank query Rank(x), the second part above helps us locate the block k containing the queried element x (i.e., x ∈ (x_{(k−1)B^t}, x_{kB^t}]), transforming the original query into a Rank query in the k-th block, within O(log log U) time (see Lemma 6). When we perform a select query Select(i), it becomes a Select query within block k = ⌈i/B^t⌉. The remaining question is how to answer Rank/Select queries within each block.
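The inter-block routing just described can be sketched as follows; plain sorted lists stand in for the endpoint sequence and for the predecessor structure of Lemma 6, and the function names are ours:

```python
from bisect import bisect_left

def block_endpoints(keys, block_size):
    """Endpoints x_{B^t}, x_{2B^t}, ..., x_n of the block intervals (keys sorted)."""
    n_blocks = -(-len(keys) // block_size)            # ceil(n / B^t)
    return [keys[min(k * block_size, len(keys)) - 1] for k in range(1, n_blocks + 1)]

def route_rank(endpoints, x):
    """1-based block index k whose interval (x_{(k-1)B^t}, x_{kB^t}] contains x."""
    return bisect_left(endpoints, x) + 1

def route_select(block_size, i):
    """Select(i) lands in block ceil(i / B^t)."""
    return -(-i // block_size)

keys = [5, 9, 12, 20, 31, 44, 50, 61, 70]
eps = block_endpoints(keys, 3)      # [12, 44, 70]
assert route_rank(eps, 25) == 2     # 25 lies in (12, 44] -> block 2
assert route_select(3, 7) == 3      # the 7th key lives in block 3
```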

Storing difference sequences within blocks

Throughout the remainder of this section, we mainly focus on the FID problem within a single block k. Letting s ≜ (k−1)B^t, the intra-block FID problem requires us to maintain the sequence of keys (x_{s+1}, …, x_{s+B^t}) in the universe (x_s, x_{s+B^t}]. We cut the binary representation of each key x_{s+i} into two parts, as shown in Figure 1(a):

  • Letting h be a parameter to be determined and b ≜ log(U/n) − h, we define the low part as the b least significant bits in the binary representation of each key x_{s+i}, denoted by integers x_i^{(low)} ∈ [2^b]. We will directly store these integers in the data structure.

  • The (log U) − b remaining (more significant) bits are referred to as the mid-high part, denoted by integers x_i^{(mid-high)}. For these integers, we aim to store the difference sequence, i.e., the differences δ_i ≜ x_i^{(mid-high)} − x_{i−1}^{(mid-high)} between adjacent pairs of elements.

Clearly, we have

x_{s+i} = x_i^{(low)} + x_i^{(mid-high)}·2^b  (0 ≤ i ≤ B^t).

For short, we let δ̄_i denote the partial sum Σ_{j=1}^{i} δ_j, so that x_i^{(mid-high)} = δ̄_i + x_0^{(mid-high)}. We further define Δ ≜ δ̄_{B^t} as the sum of the difference sequence (δ_1, …, δ_{B^t}):

Δ ≜ Σ_{i=1}^{B^t} δ_i = x_{B^t}^{(mid-high)} − x_0^{(mid-high)} = ⌊x_{s+B^t}/2^b⌋ − ⌊x_s/2^b⌋.

We denote by Δ_k the value of Δ in the k-th block. Clearly,

Σ_{k=1}^{⌈n/B^t⌉} Δ_k = ⌊x_n/2^b⌋ − ⌊x_0/2^b⌋ ≤ U/2^b = n·2^h. (1)

With these notations, the FID problem within a single block can be restated as follows:

Problem 2 (FID within a block).

Let Δ be a parameter stored elsewhere. We need to store two sequences (x_1^{(low)}, x_2^{(low)}, …, x_{B^t}^{(low)}) and (δ_1, δ_2, …, δ_{B^t}), such that x_i^{(low)} ∈ [2^b] and Σ_{i=1}^{B^t} δ_i = Δ, with the property that (x_i^{(low)} + δ̄_i·2^b)_{i=1}^{B^t} forms a strictly increasing sequence, supporting the following queries:

  • PartialSum(i): Return δ̄_i·2^b + x_i^{(low)}. It corresponds to the Select queries in the original FID problem.

  • Rank(x): Return the largest i such that δ̄_i·2^b + x_i^{(low)} ≤ x.

As introduced above, once we have a solution to Problem 2 with O(T) query time for any T, with the help of the inter-block data structure, we may answer the Rank/Select queries on the original key sequence in O(T + log log U) time.

(a) Separating low parts from mid-high parts.
(b) Separating mid parts and high parts.
Figure 1: Partitioning keys into three parts. We (a) divide the binary representations of the keys into low parts and mid-high parts, and further (b) take the difference sequence (δ_i)_{1≤i≤B^t} of the mid-high parts and divide each δ_i into a mid part and a high part. There are b ≜ log(U/n) − h bits in the low part, 2h bits in the mid part, and (log n) − h bits in the high part.

The three-part partition

For some technical reasons, we further divide each δi (i.e., the difference sequence of the mid-high part) into two smaller parts as follows. We will use different approaches to organize these parts in Section 3.1.

  • Let δ_i^{(mid)} be the integer formed by the 2h least significant bits of δ_i, which we call the mid part.

  • Let δ_i^{(high)} be the integer formed by the remaining (log U) − b − 2h = (log n) − h bits, which we call the high part.

Formally,

δ_i = δ_i^{(mid)} + δ_i^{(high)}·2^{2h}  (i ∈ [1, B^t]).

By now, we have cut the sequence of keys (x_{s+1}, …, x_{s+B^t}) in a single block into three parts: the high part (δ_1^{(high)}, …, δ_{B^t}^{(high)}), the mid part (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}), and the low part (x_1^{(low)}, …, x_{B^t}^{(low)}), as shown in Figure 1(b). Subsection 3.1 below designs (almost) separate data structures to store these three parts.
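The three-part split is plain bit manipulation; the following sketch performs it for one block and inverts it. The bit-widths are illustrative stand-ins for the paper's parameters (b ≜ log(U/n) − h low bits and 2h mid bits), and all function names are ours.

```python
# Illustrative widths: b low bits per key, 2h mid bits per difference (here 2h = 6).
b, two_h = 4, 6

def split_block(block_keys):
    """Cut a sorted block into low parts, mid parts, and high parts."""
    lows = [x & ((1 << b) - 1) for x in block_keys]             # x_i^(low)
    mid_high = [x >> b for x in block_keys]                     # x_i^(mid-high)
    deltas = [mid_high[i] - mid_high[i - 1] for i in range(1, len(mid_high))]
    mids = [d & ((1 << two_h) - 1) for d in deltas]             # delta_i^(mid)
    highs = [d >> two_h for d in deltas]                        # delta_i^(high)
    return lows, mids, highs

def reassemble(x0, lows, mids, highs):
    """Invert the split: x_{s+i} = (x_0^(mid-high) + prefix sum of deltas) * 2^b + low."""
    keys, mh = [], x0 >> b
    for i in range(1, len(lows)):
        mh += mids[i - 1] + (highs[i - 1] << two_h)             # recover delta_i
        keys.append((mh << b) + lows[i])
    return keys

block = [0x123, 0x456, 0x1789, 0x1abc]
lows, mids, highs = split_block(block)
assert reassemble(block[0], lows, mids, highs) == block[1:]
```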

3.1 Data structure within each block

In this subsection, we design the intra-block data structure, consisting of three parts, with time complexity O(t log log U). At a very high level, the three parts of the data structure interact as follows when performing PartialSum and Rank queries:

  • During the query PartialSum(i), we compute δ̄_i^{(high)}, δ̄_i^{(mid)}, and x_i^{(low)} from the high part, mid part, and low part of the data structure separately (we define δ̄_i^{(high)} and δ̄_i^{(mid)} similarly to δ̄_i: δ̄_i^{(high)} ≜ Σ_{j=1}^{i} δ_j^{(high)}, and likewise for δ̄_i^{(mid)}), and then combine them into the output.

  • During the Rank(x) query, we sequentially examine the high part, mid part, and low part of the data structure. At each step, we narrow down the search interval for the desired index i, gradually approaching its exact value in the low part.

Below, we will provide the detailed construction of the three parts separately.

High part

The high-part sequence (δ_1^{(high)}, …, δ_{B^t}^{(high)}) is sparse and thus easy to store. As

Σ_{i=1}^{B^t} δ_i^{(high)} = Σ_{i=1}^{B^t} ⌊δ_i/2^{2h}⌋ ≤ (1/2^{2h})·Σ_{i=1}^{B^t} δ_i = Δ/2^{2h},

there are at most Δ/2^{2h} non-zero entries in the high part. By choosing a large parameter h, we can make the space usage of the high part so small that we can regard the entire high part as redundant information.

Let I ⊆ [1, B^t] be the set of indices of all non-zero entries in the high part, so that |I| ≤ Δ/2^{2h}. The high part of the data structure consists of the following two components:

  • A predecessor data structure for the set I, where each i ∈ I is associated with δ̄_i^{(high)}.

  • A predecessor data structure for the set {δ̄_i : i ∈ I}, where each δ̄_i is associated with i. (Note that all the δ̄_i are distinct for i ∈ I, as δ_i^{(high)} ≠ 0 for all i ∈ I; hence {δ̄_i : i ∈ I} is a valid set.)

According to Lemma 6, there is a compact implementation of these predecessor data structures, with O(|I|·w) = O(wΔ/2^{2h}) bits of space and query time O(log log U).
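A small stand-in for the first component shows how prefix sums of the sparse high parts are recovered: only the non-zero positions are kept, each with the prefix sum up to it, and a sorted Python list replaces the predecessor structure of Lemma 6 (names are illustrative).

```python
from bisect import bisect_right

class HighPart:
    """Sparse storage of the high-part sequence: keep only indices i with
    delta_i^(high) != 0, each paired with the prefix sum up to i."""
    def __init__(self, highs):                  # highs[i-1] = delta_i^(high)
        self.idx, self.pref = [], []
        run = 0
        for i, d in enumerate(highs, 1):
            run += d
            if d != 0:
                self.idx.append(i)
                self.pref.append(run)

    def prefix(self, i):
        """bar-delta_i^(high): the prefix sum equals the one stored at the
        last non-zero position i' <= i (or 0 if there is none)."""
        j = bisect_right(self.idx, i) - 1
        return self.pref[j] if j >= 0 else 0

hp = HighPart([0, 0, 3, 0, 0, 2, 0])
assert hp.prefix(5) == 3
assert hp.prefix(7) == 5
```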

Mid part

We store the mid-part sequence (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}) in an aB-tree with branching factor B and size B^t, using Lemma 5. (When the block size m is smaller than B^t, we use an incomplete aB-tree; this is why we need Lemma 5 to support incomplete aB-trees.) The details of the aB-tree are as follows.

  • Each leaf of the aB-tree contains a single mid-part entry δ_i^{(mid)} ∈ [2^{2h}] ≜ Σ.

  • The label of each node equals the sum of all the leaf entries δ_i^{(mid)} in its subtree. As δ_i^{(mid)} < 2^{2h} for each i ∈ [1, B^t], we may take Φ = [B^t·2^{2h}].

  • The premise of Lemma 5, B = O(w/log(B^t + |Φ|)), can be satisfied by setting h ≜ t log B. Recalling that B is a parameter with B log B = (ε log U)/t, we have B = (ε log U)/(t log B) = O(log U/log(B^t·2^{2h})) = O(w/log(B^t + |Φ|)).

  • The size of the lookup table is

    O(B^{2t}|Σ| + B^{3t}|Φ|^{2B}) = O(B^{3t}|Φ|^{2B}) ≤ O(B^{3t}·(B^{3t})^{2B}) = O(B^{6tB+3t}) ≤ O(B^{10tB}) = O(U^{10ε})

    words, which will be shared between blocks.

  • The aB-tree can support three types of queries within O(t) time:

    1. Given i, query δ̄_i^{(mid)}.

    2. Given v, query the largest index i₂ with δ̄_{i₂}^{(mid)} ≤ v.

    3. Given a value v, query the maximal interval [i₁, i₂] of the sequence (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}) with respect to v, defined as follows.

    Definition 8.

    The maximal interval of a (non-negative) sequence (a_1, …, a_{B^t}) with respect to a value v is the interval [i₁, i₂] formed by all indices i whose prefix sums ā_i ≜ a_1 + ⋯ + a_i equal v. Within the maximal interval, we have a_{i₁+1} = a_{i₁+2} = ⋯ = a_{i₂} = 0, while a_{i₁}, a_{i₂+1} > 0, and ā_{i₁} = v.

We now compute the space usage of the mid part. To use Lemma 5, we need to store the root label ϕ before storing the aB-tree, as Lemma 5 assumes free access to the root label. After that, the aB-tree occupies log 𝒩(B^t, ϕ) + 2 bits, where 𝒩(B^t, ϕ) is the number of sequences (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}) with root label ϕ. Recalling that ϕ is the sum of this sequence and

ϕ = Σ_{i=1}^{B^t} δ_i^{(mid)} ≤ Σ_{i=1}^{B^t} δ_i = Δ,

the number of possible sequences (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}) is at most (Δ+B^t choose B^t−1). Therefore, the number of bits used by the mid part (per block) is at most

O(log Δ) + log 𝒩(B^t, ϕ) + 2 ≤ log (Δ+B^t choose B^t−1) + O(w).

Low part

In this basic data structure, the low part is stored directly in an array. Specifically, we store the sequence (x_1^{(low)}, …, x_{B^t}^{(low)}) one by one as B^t integers of b bits each.

Note that the low-part sequence has the following locally increasing property which we will use in our query algorithms:

  • Restricted to a maximal interval [i₁, i₂] of (δ_1, …, δ_{B^t}), the subsequence (x_{i₁}^{(low)}, …, x_{i₂}^{(low)}) is strictly increasing. This is because, by the condition of Problem 2, the sequence (x_{i₁}^{(low)} + 2^b·δ̄_{i₁}, …, x_{i₂}^{(low)} + 2^b·δ̄_{i₂}) is strictly increasing, whereas δ̄_{i₁} = ⋯ = δ̄_{i₂} by the definition of maximal interval.

Query algorithms

Now we can formally state our algorithms for the PartialSum and Rank queries in this intra-block data structure.

For the PartialSum(i) query, we query the high part, mid part, and low part of the data structure separately to get δ̄_i^{(high)}, δ̄_i^{(mid)}, and x_i^{(low)}:

  • To get δ̄_i^{(high)}, we query i on the first predecessor data structure of the high part, getting the largest index i′ ∈ I such that i′ ≤ i, together with its associated value δ̄_{i′}^{(high)}. As i′ is the last index up to i with a nonzero δ_{i′}^{(high)}, we have δ̄_i^{(high)} = δ̄_{i′}^{(high)}, as desired. This takes O(log log U) time.

  • To get δ̄_i^{(mid)}, we directly query it from the aB-tree of the mid part, which takes O(t) time.

  • To get xi(low), we directly read it out from the integer array of the low part, which takes O(1) time.

Finally, the algorithm returns $\delta_i\cdot 2^b+x_i^{(\mathrm{low})}=\delta_i^{(\mathrm{high})}\cdot 2^{2h+b}+\delta_i^{(\mathrm{mid})}\cdot 2^b+x_i^{(\mathrm{low})}$. The total time cost is $O(t+\log\log U)$.
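The recombination in this last step can be sketched as follows (our toy code; it assumes, as in the construction, that the low part has $b$ bits and the mid part $2h$ bits, with the high part above them):

```python
def split_key(x, b, h):
    """Split x into (high, mid, low): the b lowest bits, the next 2h bits,
    and the remaining high bits."""
    high = x >> (2 * h + b)
    mid = (x >> b) & ((1 << (2 * h)) - 1)
    low = x & ((1 << b) - 1)
    return high, mid, low

def reconstruct_key(delta_high, delta_mid, x_low, b, h):
    """Combine the three query results:
    x_i = delta_high * 2^(2h+b) + delta_mid * 2^b + x_low."""
    return (delta_high << (2 * h + b)) | (delta_mid << b) | x_low
```

Splitting and recombining are inverse operations, so the three separate lookups indeed recover the original key.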

For the Rank(x) query, we will sequentially query the high part, mid part, and low part of the data structure to narrow down the search interval of the desired index i:

  • We first query the high part to locate $i$ within a maximal interval of $(\delta_1^{(\mathrm{high})},\ldots,\delta_{B^t}^{(\mathrm{high})})$. This is achieved by querying the second predecessor data structure of the high part, from which we get two adjacent indices $i_1,i_2\in I$ with $\delta_{i_1}\le\lfloor x/2^b\rfloor<\delta_{i_2}$, which locates the index $i$ within $[i_1,i_2)$. Moreover, as $i_1,i_2$ are adjacent indices in $I$, the interval $[i_1,i_2)$ forms a maximal interval of the sequence $(\delta_1^{(\mathrm{high})},\ldots,\delta_{B^t}^{(\mathrm{high})})$ with respect to the value $v_{\mathrm{high}}\triangleq\delta_{i_1}^{(\mathrm{high})}$. This takes $O(\log\log U)$ time.

  • Recall that the mid-part step needs to find the largest $i'\in[i_1,i_2)$ such that $\delta_{i'}$ does not exceed the threshold $\lfloor x/2^b\rfloor$. As all such $i'$ share the same value $\delta_{i'}^{(\mathrm{high})}$, this is equivalent to finding the largest $i'\in[i_1,i_2)$ such that $\delta_{i'}^{(\mathrm{mid})}$ does not exceed $\lfloor x/2^b\rfloor-v_{\mathrm{high}}\cdot 2^{2h}$. This can be done with one query to the aB-tree in the mid part.

    • If the $i'$ we found satisfies the strict inequality $\delta_{i'}<\lfloor x/2^b\rfloor$, we conclude the query with $i=i'$, because no matter what the low parts are, we already know $\delta_{i'}\cdot 2^b+x_{i'}^{(\mathrm{low})}\le(\delta_{i'}+1)\cdot 2^b-1$ is strictly smaller than the queried key $x$, and $\delta_{i'+1}\cdot 2^b+x_{i'+1}^{(\mathrm{low})}\ge\delta_{i'+1}\cdot 2^b$ is larger than $x$.

    • If $\delta_{i'}$ is equal to the threshold $\lfloor x/2^b\rfloor$, we find the maximal interval $[i_1',i_2']\subseteq[i_1,i_2)$ of $(\delta_1,\ldots,\delta_{B^t})$ with $\delta_{i_1'}=\cdots=\delta_{i_2'}=\lfloor x/2^b\rfloor$ by querying the aB-tree again. In this case, we can locate the answer $i$ to the query within the interval $[i_1'-1,i_2']$, but the concrete answer depends on the information in the low part.

    In both cases, the mid-part step takes O(t) time.

  • Assume the previous step encounters the second case, where an interval $[i_1',i_2']$ is known to have the same $\delta_i$ for all $i$, and the answer $i$ to the query is guaranteed to reside in $[i_1'-1,i_2']$. Therefore, we need to find the largest $i\in[i_1',i_2']$ such that $x_i^{(\mathrm{low})}\le(x\bmod 2^b)$, and that will be the answer to the query. (If $x_{i_1'}^{(\mathrm{low})}$ is already larger than $(x\bmod 2^b)$, the answer to the query should be $i_1'-1$.) Since $(x_{i_1'}^{(\mathrm{low})},\ldots,x_{i_2'}^{(\mathrm{low})})$ is strictly increasing, we can use binary search to find $i$ within $O(\log L)$ time, where $L\triangleq i_2'-i_1'+1$; as $L\le B^t$, the running time of this step is bounded by $O(\log B^t)=O(t\log\log U)$.⁶ (⁶ Recall that $B\log B=\varepsilon\log U/t$, so $B\le\log U$, hence $t\log B\le t\log\log U$.)

Hence, the total time cost of the Rank query is at most O(tloglogU).
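The binary search in the low-part step can be sketched as follows (0-indexed toy code, ours, not the paper's):

```python
import bisect

def rank_in_low_part(low, i1, i2, v_low):
    """Largest index i in [i1, i2] (0-indexed, inclusive) with
    low[i] <= v_low, or i1 - 1 if even low[i1] exceeds v_low.
    The subsequence low[i1..i2] is strictly increasing, so binary
    search takes O(log L) comparisons where L = i2 - i1 + 1."""
    # bisect_right finds the insertion point after all entries <= v_low
    j = bisect.bisect_right(low, v_low, i1, i2 + 1)
    return j - 1  # j == i1 means no entry <= v_low, so we return i1 - 1
```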

Space usage of the block

Recalling that in our construction, the high part, mid part, and low part of the data structure use $O(w\Delta/2^{2h})$, $\log\binom{\Delta+B^t}{B^t-1}+O(w)$, and $bB^t$ bits, respectively, the total space used by the intra-block data structure is

$bB^t+\log\binom{\Delta+B^t}{B^t-1}+O\!\left(w+w\Delta/2^{2h}\right).$ (2)

3.2 Performance of the data structure

In this subsection, we will combine all parts of our construction to get the basic data structure, and check its time complexity, lookup table size, and redundancy to complete the proof of Theorem 7.

Time complexity

Recall that our query algorithm will first query the inter-block data structure using O(loglogU) time to locate a block to query, and then query one intra-block data structure within O(tloglogU) time to get the answer. The time complexity of the whole data structure is O(tloglogU) per operation.

Lookup table size

The only lookup table used by our data structure occurs in the mid part within each block, which is used for aB-trees due to Lemma 5. As computed before, the size of this lookup table is $O(U^{10\varepsilon})$ words, which is shared among all blocks, as desired.

Redundancy

The redundancy of the whole FID consists of the following two parts:

  • The entire inter-block data structure is regarded as redundant. To store the endpoint sequence, we need to store $n/B^t+1$ keys from $[U]$, which occupies $O(wn/B^t+w)$ bits. To store the predecessor data structure of the endpoint sequence, we also need at most $O(wn/B^t+w)$ bits according to Lemma 6. Recalling that $B\log B=\varepsilon\log U/t$, we have $B\ge\sqrt{B\log B}=\sqrt{\varepsilon\log U/t}$, and hence

    $O(wn/B^t)\le O\!\left(\frac{n\log U}{(\varepsilon\log U/t)^{t/2}}\right)\le O\!\left(\frac{n\log U}{(\log U/t)^{t/4}}\right)\le O\!\left(\frac{n}{(\log U/t)^{t/8}}\right)$
    $\le\max\left\{\frac{n}{(\log U/t)^{\Omega(t)}},\,O(\log U)\right\}=R;$ (3)

    thus this part is covered by the desired amount of redundancy.

  • The redundancy caused by intra-block data structures, which we calculate below.

Recall that (2) upper bounds the space usage of each intra-block data structure. Taking a summation of (2) over all blocks, we get the total space consumption of the intra-block data structures:

$\sum_{k=1}^{n/B^t}\left(bB^t+\log\binom{\Delta_k+B^t}{B^t-1}+O\!\left(w+\frac{w\Delta_k}{2^{2h}}\right)\right)$
$\quad=nb+\sum_{k=1}^{n/B^t}\log\binom{\Delta_k+B^t}{B^t-1}+O\!\left(w+\frac{nw}{B^t}+\frac{w}{2^{2h}}\sum_{k=1}^{n/B^t}\Delta_k\right)$
$\quad\le nb+\sum_{k=1}^{n/B^t}\log\binom{\Delta_k+B^t}{B^t-1}+O\!\left(w+\frac{nw}{B^t}+\frac{nw}{2^h}\right)$
$\quad\le nb+\sum_{k=1}^{n/B^t}\log\binom{\Delta_k+B^t}{B^t-1}+O(R),$ (4)

where the first inequality is due to (1); the second inequality is because $h=t\log B$, thus $O(nw/2^h)=O(nw/B^t)\le O(R)$ according to (3). We further have

$\sum_{k=1}^{n/B^t}\log\binom{\Delta_k+B^t}{B^t-1}=\log\left(\prod_{k=1}^{n/B^t}\binom{\Delta_k+B^t}{B^t-1}\right)\le\log\binom{\sum_{k=1}^{n/B^t}\Delta_k+n}{n-n/B^t}\le\log\binom{2^hn+n}{n}.$ (5)

Comparing this quantity with the information-theoretic lower bound $\log\binom{U}{n}$ of FID, we get

$\log\binom{U}{n}-\log\binom{2^hn+n}{n}=\log\left(\frac{U}{2^hn+n}\cdot\frac{U-1}{2^hn+n-1}\cdots\frac{U-n+1}{2^hn+1}\right)$
$\quad\ge n\log\frac{U}{2^hn+n}=n\left(\log\frac{U}{2^hn}-\log\frac{2^h+1}{2^h}\right)\ge nb-\frac{n}{2^h\ln 2},$ (6)

where the last inequality is because $b\le\log\frac{U}{n}-h$ and $\log(1+2^{-h})\le 2^{-h}/\ln 2$. By plugging (5) and (6) into (4), the total space usage of all intra-block data structures is at most

$nb+\log\binom{2^hn+n}{n}+O(R)\le\log\binom{U}{n}+O\!\left(R+\frac{n}{2^h}\right)=\log\binom{U}{n}+O(R),$

where $O(n/2^h)=O(n/B^t)$ is covered by $R$ according to (3).

In conclusion, the total redundancy of our data structure for FIDs is bounded by $R$. Since we have constructed an FID with time complexity $O(t\log\log U)$, lookup table size $O(U^{10\varepsilon})$, and redundancy $R=\max\{n/(\log U/t)^{\Omega(t)},O(\log U)\}$, Theorem 7 follows.

4 Advanced data structure for FIDs

In this section, we will prove Theorem 1 by improving the basic data structure.

Proof.

We start by reviewing the bottleneck of Theorem 7. In the basic data structure, the time bottleneck is that Rank queries take $O(t\log\log U)$ time in the low-part step. Specifically, suppose we are required to answer the query Rank(x). The inter-block data structure, together with the high and mid parts of the intra-block data structure, uses $O(t+\log\log U)$ time to locate the desired index $i$ within a maximal interval $[i_1,i_2]$ of $(\delta_1,\ldots,\delta_{B^t})$. Then, in the case where the low-part step is required, we need to further determine the maximum index $i\in[i_1,i_2]$ such that $x_i^{(\mathrm{low})}\le v_{\mathrm{low}}\triangleq(x\bmod 2^b)$. This can be formulated as a Rank query on the increasing sequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$:

  • Rank($v_{\mathrm{low}}$): Return the largest index $i\in[i_1,i_2]$ such that $x_i^{(\mathrm{low})}\le v_{\mathrm{low}}$.

In the basic data structure, we directly store the sequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$ as a sorted array using $bL$ bits and support Rank queries by binary search within $O(\log L)$ time, where $L\triangleq i_2-i_1+1$ is the length of the subsequence. However, in the worst case, $L$ can be as large as the block size $B^t$, which makes the time of the binary search $O(\log B^t)=O(t\log\log U)$. To address this bottleneck, we change the storage scheme of the low part below.

New construction for the low part

The key observation is that storing the subsequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$ to support Rank and PartialSum queries can be viewed as managing a smaller FID. Hence, we have the following unified data structure to store it:

  • If $L\le L_{\mathrm{thrd}}\triangleq\log U$, then we still store the subsequence as a sorted array within $bL$ bits.

  • Otherwise, we will use the basic data structure in Theorem 7 (as a subroutine) to store the subsequence, with parameter $t'=O(1)$ and a constant $\varepsilon'$ to be determined.⁸ (⁸ Theorem 7 requires that the universe size $2^b$ of the basic data structure is at least a large polynomial in the number of keys $L\le B^t$. This condition holds because, on one hand, our choice of $B$ (i.e., $B\log B=\varepsilon\log U/t$) implies $L\le B^t=U^{o(1)}$; on the other hand, $2^b=\frac{U/n}{2^h}=\frac{U/n}{B^t}=\mathrm{poly}(U)$.) As $\{x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})}\}\subseteq[2^b]$ is a set of $L$ elements, the basic data structure will have query time $O(t'\log\log U)=O(\log\log U)$, and will use

    $\log\binom{2^b}{L}+\frac{L}{(\log 2^b)^{\Omega(1)}}+O(\log 2^b)\;\le\;L\log\frac{e\cdot 2^b}{L}+O(L+b)$
    $\quad=Lb-L\log L+O(L+b)\;\le\;Lb$

    bits of space, where the last inequality uses the fact $L>L_{\mathrm{thrd}}=\log U$ to ensure that $L\log L$ is asymptotically larger than both $O(L)$ and $O(b)=O\!\left(\log\frac{U/n}{2^h}\right)\le O(\log U)$. Hence, in this case, the basic data structure storing the subsequence also fits in $bL$ bits. We pad 0's to the end of the encoding of the basic data structure until it occupies exactly $bL$ bits, ensuring memory alignment.

Clearly, this unified data structure allows us to answer Rank and Select queries on the subsequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$ within $O(\log\log U)$ time:

  • If $L\le L_{\mathrm{thrd}}$, then the subsequence is stored as a sorted array. For Rank queries, we can use binary search to get the desired index, which takes $O(\log L)\le O(\log L_{\mathrm{thrd}})=O(\log\log U)$ time; for Select queries, we can directly read out the desired $x_i^{(\mathrm{low})}$ in $O(1)$ time.

  • Otherwise, the subsequence is stored using the basic data structure in Theorem 7, which can support Rank and Select queries within O(loglogU) time.

Further, we can store the entire low part using the above unified data structure for subsequences. Specifically, we first divide the interval $[1,B^t]$ into all maximal intervals of $(\delta_1,\ldots,\delta_{B^t})$. For each maximal interval $[i_1,i_2]$ of length $L$, we store the corresponding subsequence of the low part $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$ using the unified data structure in $bL$ bits. Finally, we concatenate all these unified data structures from the leftmost maximal interval to the rightmost one, obtaining a string of $bB^t$ bits, which serves as the encoding of the entire low part.
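A toy Python sketch (ours, not the paper's code) of this layout. The key invariant is that every interval occupies exactly $b$ bits per key regardless of which representation it uses, so the bit offset of an interval's encoding is computable from its left endpoint alone, with no extra pointers; the recursive basic-data-structure branch is mocked here by the same array encoding.

```python
L_THRD = 8  # stand-in for the threshold L_thrd = log U

def encode_interval(subseq, b):
    """Encode one maximal interval's low-part subsequence in exactly
    b * len(subseq) bits. Short intervals (len <= L_THRD) are a plain
    sorted array; long ones would use the basic FID of Theorem 7 padded
    with 0's to the same length (mocked here by the array encoding)."""
    bits = 0
    for i, v in enumerate(subseq):
        assert 0 <= v < (1 << b)
        bits |= v << (i * b)
    return bits

def encode_low_part(intervals, b):
    """Concatenate per-interval encodings from leftmost to rightmost."""
    bits, offset = 0, 0
    for subseq in intervals:
        bits |= encode_interval(subseq, b) << offset
        offset += b * len(subseq)
    return bits

def extract(bits, i1, length, b):
    """Read back the b*length-bit encoding of the interval starting at
    key index i1 (1-indexed): it begins at bit (i1 - 1) * b."""
    return (bits >> ((i1 - 1) * b)) & ((1 << (b * length)) - 1)
```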

Finally, to obtain the advanced data structure, we start with the basic data structure and replace the encoding of the low part with the new construction above (the concatenation of unified data structures). As our new construction for the low part occupies the same space ($bB^t$ bits) as before, the redundancy of this data structure is still $R=\max\{n/(\log U/t)^{\Omega(t)},O(\log U)\}=n/(\log U/t)^{\Omega(t)}$. What makes the advanced data structure special is that the "basic data structure" appears twice here, once as the entire framework and once as the subroutines for maximal intervals in the low part. By embedding small instances of basic data structures within a larger framework of the same basic data structure in a non-recursive manner, we can improve the query time of FIDs without introducing any additional redundancy, as described below.

Query algorithms

The query algorithms for our advanced data structure are similar to those of the basic data structure in Section 3: We first use the inter-block information to transform the original queries on the entire FID into Rank/PartialSum queries within each block. Then, a three-stage process involving the high, mid, and low parts will obtain the answer to the query. The only difference is that, before we access anything in the low part, we need to first compute the maximal interval of (δ1,,δBt) that we plan to access; by comparing the length of the interval with the threshold Lthrd, we determine if the unified data structure for that maximal interval is stored as a sorted array or a basic data structure. The algorithms to answer Rank/PartialSum queries within each block are explained below.

  • For the query Rank(x), using the high and mid parts of the data structure, we can either answer the query directly or locate the desired index $i$ within a maximal interval $[i_1,i_2]$ of $(\delta_1,\ldots,\delta_{B^t})$. In the latter case, we read the encoding of the unified data structure for $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$, which resides in the $((i_1-1)b+1)$-th through the $(i_2b)$-th bits of the encoding of the entire low part. Recall that comparing $i_2-i_1+1$ with $L_{\mathrm{thrd}}$ tells us whether the unified data structure is stored as a sorted array or as a basic data structure for FIDs. After that, we perform the query Rank($v_{\mathrm{low}}$) with $v_{\mathrm{low}}=(x\bmod 2^b)$ on the unified data structure to obtain the desired index $i$ within $O(\log\log U)$ time.

  • For the query PartialSum(i), we first get $\delta_i$ from the mid and high parts. Then, instead of directly reading out $x_i^{(\mathrm{low})}$ from the low part as before, we also need to know the maximal interval $[i_1,i_2]$ of $(\delta_1,\ldots,\delta_{B^t})$ containing $i$ to help us access the low part. We use an algorithm similar to the Rank query to compute this maximal interval $[i_1,i_2]$ with respect to $\delta_i$, and to extract the encoding of the subsequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$. By querying Select($i-i_1+1$) on the unified data structure of this subsequence, we get $x_i^{(\mathrm{low})}$ in $O(\log\log U)$ time.

Both types of queries take O(t+loglogU) time, because the aB-trees in the mid part take O(t) time, while other steps (including the high and low parts and the inter-block data structure) take O(loglogU) time per query. This meets the requirement in Theorem 1.

Space of the lookup table

Finally, we check that the size of the lookup table introduced by Theorem 7 is also dominated by $R$. Recall that the lookup table consists of $O(U^{10\varepsilon})$ words. As $U=n^{1+\Theta(1)}$, we can assume there is a constant $\alpha>1$ such that $U\le n^{\alpha}$. Then, we can set $\varepsilon=1/(20\alpha)$, which means that the number of bits in the lookup table is $O(U^{1/(2\alpha)}\log U)=O(n^{1/2}\log n)$, which is significantly smaller than $R=n/(\log U/t)^{\Omega(t)}$, as desired.

In summary, we get a data structure for static FID with query time $O(t+\log\log U)$ and redundancy $R=n/(\log U/t)^{\Omega(t)}$, which concludes the proof of Theorem 1.

5 Select and partial sum

In this section, we prove Theorems 3 and 4 by adjusting the basic data structure introduced in Section 3. We will rely on the predecessor data structure from [40] when the set to store is relatively dense:

Lemma 9 (Similar to Lemma 6, see [40]).

For $U\le n\log^t n$, there is a predecessor data structure with associated values that uses $O(n\log U+n\log V)$ bits of space and answers queries in $O(\log t)$ time.

We follow the same notation as in Section 3 in the following proofs.

5.1 Select dictionaries

Recall Theorem 3.

Proof.

Recall that we have divided the binary representations of keys into the high, mid, and low parts, and in the high part, for each block of $B^t$ keys, we stored a predecessor data structure (Lemma 6) to compute $\delta_i^{(\mathrm{high})}$, which takes $O(\log\log n)$ time to answer each predecessor query. This is the only step exceeding $O(t)$ time in the process of answering Select queries.

Instead of storing predecessor data structures for each block separately, here we store one large predecessor data structure for all $n$ keys with nonzero $\delta_i^{(\mathrm{high})}$'s. It achieves the same functionality of computing $\delta_i^{(\mathrm{high})}$. The number of elements with nonzero $\delta_i^{(\mathrm{high})}$'s is bounded by $n/2^h=n/B^t$, thus the predecessor data structure stores $n/B^t\le n/\log^t n$ elements from the range $[1,n]$. (If the number of nonzero $\delta_i^{(\mathrm{high})}$'s is smaller than $n/B^t$, we add dummy elements until there are $n/B^t$ elements.) According to Lemma 9, the predecessor data structure takes $O(\log t)$ time to answer each query, and uses $O(n\log n/B^t)=n/(\log n/t)^{\Omega(t)}$ bits of space, which fits in our desired redundancy. Other parts of the data structure remain the same as in Section 3.

When we perform a Select query, the above predecessor data structure in the high part computes the prefix sum of the high part of the difference sequence, which takes $O(\log t)$ time per query; the aB-trees in the mid part take $O(t)$ time to return the prefix sum of the mid part (within each block); finally, the low part reads $x_i^{(\mathrm{low})}$ directly to obtain the low bits of the target key. The entire process takes $O(t)$ time.

5.2 Partial sum on integer sequences

Recall Theorem 4.

Proof.

Let $x_i\triangleq\sum_{j=1}^{i}a_j$ be the partial-sum sequence of the input. The partial-sum problem is equivalent to storing a (multi-)set of keys $x_1\le x_2\le\cdots\le x_n$ supporting Select queries, i.e., a select dictionary. The only distinction is that the difference $x_i-x_{i-1}$ between any two adjacent keys is bounded by $2^{\ell}-1$ in this problem. The data structure we design for partial sum is similar to that of the select dictionaries, except that we adjust the parameters and change the number of bits in the high, mid, and low parts:

  • There is no high part.

  • Let $B$ be a parameter such that $B\log B=\frac{\varepsilon\log n}{t}$ for a small constant $\varepsilon$, and let $h\triangleq t\log B$. We call the $(\ell-h)$ least significant bits of each $x_i$ the low part, and store these bits directly using an array.

  • The remaining bits $\lfloor x_i/2^{\ell-h}\rfloor$ are called the mid part. In their difference sequence $\delta_1,\ldots,\delta_n$, where $\delta_i\triangleq\lfloor x_i/2^{\ell-h}\rfloor-\lfloor x_{i-1}/2^{\ell-h}\rfloor$, each entry $\delta_i$ equals either $\lfloor a_i/2^{\ell-h}\rfloor$ (i.e., the $h$ most significant bits of the input entry $a_i$) or $\lfloor a_i/2^{\ell-h}\rfloor+1$, and thus lies in $[0,2^h]$. Same as in Section 3, we divide $\delta_1,\ldots,\delta_n$ into blocks of size $B^t$ and use aB-trees to store them, supporting prefix-sum queries on $\delta_1,\ldots,\delta_n$.
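The construction above can be sketched end to end in Python (our toy code; the aB-tree, which returns prefix sums of $\delta$ in $O(t)$ time, is replaced by a plain sum for illustration):

```python
import itertools

def build(a, l, h):
    """Split the partial sums of a into a low part (the l-h least
    significant bits, stored plainly) and a mid difference sequence
    delta, whose entries fit in [0, 2^h]."""
    x = list(itertools.accumulate(a))                # partial sums x_1..x_n
    low = [xi & ((1 << (l - h)) - 1) for xi in x]    # l-h LSBs per prefix sum
    mid = [xi >> (l - h) for xi in x]                # remaining high bits
    delta = [mid[0]] + [mid[i] - mid[i - 1] for i in range(1, len(mid))]
    assert all(0 <= d <= (1 << h) for d in delta)    # entries lie in [0, 2^h]
    return low, delta

def partial_sum(low, delta, i, l, h):
    """PartialSum(i) = (prefix sum of delta up to i) * 2^(l-h) + low[i].
    In the real structure the prefix sum comes from an aB-tree in O(t)."""
    return (sum(delta[:i + 1]) << (l - h)) + low[i]
```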

Recall that $\mathcal{N}(B^t,\phi)$ represents the number of instances of an aB-tree with size $B^t$ and root label $\phi$ (i.e., the sum of entries in the aB-tree equals $\phi$), which is bounded by $(2^h+1)^{B^t}$. The space usage of the mid part is thus

$\log\mathcal{N}(B^t,\phi)+2+O(\log n)\;\le\;B^t\log(2^h+1)+O(\log n)\;\le\;B^th+O\!\left(B^t/2^h+\log n\right)$
$\quad=B^th+O(\log n)$

bits per block, where $\log\mathcal{N}(B^t,\phi)+2$ is the space usage of the aB-tree, and $O(\log n)$ is the space to store the root label $\phi$ of the aB-tree. Taking a summation of the space usage over all blocks, and including the $n(\ell-h)$ bits taken by the array in the low part, the inter-block information, and the lookup tables, we know the total space occupied by the data structure is at most

$n(\ell-h)+\frac{n}{B^t}\left(B^th+O(\log n)\right)+n^{0.1}\;\le\;n(\ell-h)+nh+O\!\left(\frac{n\log n}{B^t}\right)\;\le\;n\ell+\frac{n}{(\log n/t)^{\Omega(t)}}$

bits, as desired. Similar to Theorem 3, each query takes O(t) time.

References

  • [1] Sepehr Assadi, Martín Farach-Colton, and William Kuszmaul. Tight bounds for monotone minimal perfect hashing. ACM Transactions on Algorithms, page 3677608, August 2024. See also SODA 2023. doi:10.1145/3677608.
  • [2] Jérémy Barbay, Francisco Claude, Travis Gagie, Gonzalo Navarro, and Yakov Nekrich. Efficient fully-compressed sequence representations. Algorithmica, 69(1):232–268, May 2014. doi:10.1007/s00453-012-9726-3.
  • [3] Djamal Belazzougui, Paolo Boldi, Rasmus Pagh, and Sebastiano Vigna. Monotone minimal perfect hashing: Searching a sorted table with O(1) accesses. In Proc. 20th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 785–794, 2009. doi:10.1137/1.9781611973068.86.
  • [4] Djamal Belazzougui and Gonzalo Navarro. Optimal lower and upper bounds for representing sequences. ACM Transactions on Algorithms, 11(4):1–21, June 2015. doi:10.1145/2629339.
  • [5] Michael A. Bender, Martín Farach-Colton, John Kuszmaul, and William Kuszmaul. Modern hashing made simple. In Proc. 7th Symposium on Simplicity in Algorithms (SOSA), pages 363–373, 2024. doi:10.1137/1.9781611977936.33.
  • [6] Michael A. Bender, Martín Farach-Colton, John Kuszmaul, William Kuszmaul, and Mingmou Liu. On the optimal time/space tradeoff for hash tables. In Proc. 54th ACM SIGACT Symposium on Theory of Computing (STOC), pages 1284–1297, 2022. doi:10.1145/3519935.3519969.
  • [7] Daniel K. Blandford and Guy E. Blelloch. Compact dictionaries for variable-length keys and data with applications. ACM Transactions on Algorithms, 4(2):17:1–17:25, May 2008. doi:10.1145/1361192.1361194.
  • [8] H. Buhrman, P. B. Miltersen, J. Radhakrishnan, and S. Venkatesh. Are bitvectors optimal? SIAM Journal on Computing, 31(6):1723–1744, January 2002. doi:10.1137/S0097539702405292.
  • [9] David R. Clark. Compact PAT trees. PhD thesis, University of Waterloo, 1996.
  • [10] David R. Clark and J. Ian Munro. Efficient suffix trees on secondary storage. In Proc. 7th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 383–391, 1996.
  • [11] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, fourth edition. MIT Press, April 2022.
  • [12] P. Ferragina, F. Luccio, G. Manzini, and S. Muthukrishnan. Structuring labeled trees for optimal succinctness, and beyond. In Proc. 46th IEEE Symposium on Foundations of Computer Science (FOCS), pages 184–193, 2005. doi:10.1109/SFCS.2005.69.
  • [13] Paolo Ferragina, Roberto Grossi, Ankur Gupta, Rahul Shah, and Jeffrey Scott Vitter. On searching compressed string collections cache-obliviously. In Proc. 27th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), pages 181–190, 2008. doi:10.1145/1376916.1376943.
  • [14] Paolo Ferragina and Giovanni Manzini. Indexing compressed text. Journal of the ACM, 52(4):552–581, July 2005. doi:10.1145/1082036.1082039.
  • [15] Paolo Ferragina, Giovanni Manzini, Veli Mäkinen, and Gonzalo Navarro. Compressed representations of sequences and full-text indexes. ACM Transactions on Algorithms, 3(2):20–es, May 2007. doi:10.1145/1240233.1240243.
  • [16] Michael L. Fredman, János Komlós, and Endre Szemerédi. Storing a sparse table with O(1) worst case access time. Journal of the ACM, 31(3):538–544, June 1984. doi:10.1145/828.1884.
  • [17] Richard F. Geary, Naila Rahman, Rajeev Raman, and Venkatesh Raman. A simple optimal representation for balanced parentheses. Theoretical Computer Science, 368(3):231–246, December 2006. doi:10.1016/j.tcs.2006.09.014.
  • [18] Richard F. Geary, Rajeev Raman, and Venkatesh Raman. Succinct ordinal trees with level-ancestor queries. ACM Transactions on Algorithms, 2(4):510–534, October 2006. doi:10.1145/1198513.1198516.
  • [19] Alexander Golynski, J. Ian Munro, and Srinivasa Rao Satti. Rank/select operations on large alphabets: a tool for text indexing. In Proc. 17th ACM-SIAM Symposium on Discrete Algorithm (SODA), pages 368–373, 2006. doi:10.1145/1109557.1109599.
  • [20] Alexander Golynski, Rajeev Raman, and Srinivasa Rao Satti. On the redundancy of succinct data structures. In Proc. 11th Scandinavian Workshop on Algorithm Theory (SWAT), pages 148–159, 2008. doi:10.1007/978-3-540-69903-3_15.
  • [21] Roberto Grossi, Ankur Gupta, and Jeffrey Scott Vitter. High-order entropy-compressed text indexes. In Proc. 14th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 841–850, 2003. URL: http://dl.acm.org/citation.cfm?id=644108.644250.
  • [22] Roberto Grossi, Alessio Orlandi, and Rajeev Raman. Optimal trade-offs for succinct string indexes. In Proc. 37th International Colloquium Conference on Automata, Languages and Programming (ICALP), pages 678–689, 2010. doi:10.1007/978-3-642-14165-2_57.
  • [23] Roberto Grossi, Alessio Orlandi, Rajeev Raman, and Srinivasa Rao Satti. More haste, less waste: lowering the redundancy in fully indexable dictionaries. In Proc. 26th International Symposium on Theoretical Aspects of Computer Science (STACS), pages 517–528, 2009. doi:10.4230/LIPICS.STACS.2009.1847.
  • [24] Roberto Grossi and Jeffrey Scott Vitter. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM Journal on Computing, 35(2):378–407, January 2005. doi:10.1137/S0097539702402354.
  • [25] Ankur Gupta, Wing-Kai Hon, Rahul Shah, and Jeffrey Scott Vitter. Compressed data structures: Dictionaries and data-aware measures. Theoretical Computer Science, 387(3):313–331, 2007. doi:10.1016/j.tcs.2007.07.042.
  • [26] Wing-Kai Hon, Kunihiko Sadakane, and Wing-Kin Sung. Breaking a time-and-space barrier in constructing full-text indices. SIAM Journal on Computing, 38(6):2162–2178, January 2009. doi:10.1137/070685373.
  • [27] Yang Hu, Jingxun Liang, Huacheng Yu, Junkai Zhang, and Renfei Zhou. Optimal static dictionary with worst-case constant query time. In Proc. 57th ACM Symposium on Theory of Computing (STOC), 2025.
  • [28] Guy Joseph Jacobson. Succinct static data structures. PhD thesis, Carnegie Mellon University, 1988.
  • [29] Dmitry Kosolobov. Simplified tight bounds for monotone minimal perfect hashing. In Proc. 35th Symposium on Combinatorial Pattern Matching (CPM), pages 19:1–19:13, 2024. doi:10.4230/LIPICS.CPM.2024.19.
  • [30] Tianxiao Li, Jingxun Liang, Huacheng Yu, and Renfei Zhou. Dynamic “succincter”. In Proc. 64th IEEE Symposium on Foundations of Computer Science (FOCS), pages 1715–1733, 2023. doi:10.1109/FOCS57990.2023.00104.
  • [31] Tianxiao Li, Jingxun Liang, Huacheng Yu, and Renfei Zhou. Tight cell-probe lower bounds for dynamic succinct dictionaries. In Proc. 64th IEEE Symposium on Foundations of Computer Science (FOCS), pages 1842–1862, 2023. doi:10.1109/FOCS57990.2023.00112.
  • [32] Tianxiao Li, Jingxun Liang, Huacheng Yu, and Renfei Zhou. Dynamic dictionary with subconstant wasted bits per key. In Proc. 35th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 171–207, 2024. doi:10.1137/1.9781611977912.9.
  • [33] J. Ian Munro. Tables. In Proc. 16th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), pages 37–42, 1996. doi:10.1007/3-540-62034-6_35.
  • [34] J. Ian Munro, Rajeev Raman, Venkatesh Raman, and Srinivasa Rao Satti. Succinct representations of permutations and functions. Theoretical Computer Science, 438:74–88, June 2012. doi:10.1016/j.tcs.2012.03.005.
  • [35] J. Ian Munro and Venkatesh Raman. Succinct representation of balanced parentheses and static trees. SIAM Journal on Computing, 31(3):762–776, 2001. doi:10.1137/S0097539799364092.
  • [36] J. Ian Munro, Venkatesh Raman, and S. Srinivasa Rao. Space efficient suffix trees. Journal of Algorithms, 39(2):205–222, 2001. doi:10.1006/jagm.2000.1151.
  • [37] Gonzalo Navarro and Veli Mäkinen. Compressed full-text indexes. ACM Computing Surveys, 39(1):2–es, April 2007. doi:10.1145/1216370.1216372.
  • [38] Gonzalo Navarro and Javiel Rojas-Ledesma. Predecessor search. ACM Computing Surveys, 53(5):1–35, September 2021. doi:10.1145/3409371.
  • [39] Mihai Pǎtraşcu. Succincter. In Proc. 49th IEEE Symposium on Foundations of Computer Science (FOCS), pages 305–313, 2008.
  • [40] Mihai Pǎtraşcu and Mikkel Thorup. Time-space trade-offs for predecessor search. In Proc. 38th ACM Symposium on Theory of Computing (STOC), pages 232–240, 2006. doi:10.1145/1132516.1132551.
  • [41] Mihai Pǎtraşcu and Emanuele Viola. Cell-probe lower bounds for succinct partial sums. In Proc. 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 117–122, 2010.
  • [42] Giulio Ermanno Pibiri and Rossano Venturini. Dynamic Elias-Fano representation. In Proc. 28th Annual Symposium on Combinatorial Pattern Matching (CPM), pages 30:1–30:14, 2017. doi:10.4230/LIPIcs.CPM.2017.30.
  • [43] Rajeev Raman, Venkatesh Raman, and Srinivasa Rao Satti. Succinct dynamic data structures. In Proc. 7th International Workshop on Algorithms and Data Structures (WADS), pages 426–437, 2001.
  • [44] Rajeev Raman, Venkatesh Raman, and Srinivasa Rao Satti. Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Transactions on Algorithms, 3(4):43, November 2007. See also SODA 2002. doi:10.1145/1290672.1290680.
  • [45] Rajeev Raman and Srinivasa Rao Satti. Succinct dynamic dictionaries and trees. In Proc. 30th International Colloquium on Automata, Languages and Programming (ICALP), pages 357–368, 2003.
  • [46] Emanuele Viola. New sampling lower bounds via the separator. In Proc. 38th Computational Complexity Conference (CCC), pages 26:1–26:23, 2023. doi:10.4230/LIPIcs.CCC.2023.26.
  • [47] Huacheng Yu. Optimal succinct rank data structure via approximate nonnegative tensor decomposition. In Proc. 51st ACM SIGACT Symposium on Theory of Computing (STOC), pages 955–966, 2019. doi:10.1145/3313276.3316352.
  • [48] Huacheng Yu. Nearly optimal static Las Vegas succinct dictionary. In Proc. 52nd ACM SIGACT Symposium on Theory of Computing (STOC), pages 1389–1401, 2020. doi:10.1145/3357713.3384274.