
Optimal Static Fully Indexable Dictionaries

Jingxun Liang, Carnegie Mellon University, Pittsburgh, PA, USA
Renfei Zhou, Carnegie Mellon University, Pittsburgh, PA, USA
Abstract

Fully indexable dictionaries (FIDs) store sets of integer keys while supporting rank/select queries. They serve as basic building blocks in many succinct data structures. Despite the great importance of FIDs, no known FID is succinct with efficient query time when the universe size U is a large polynomial in the number of keys n, which is the conventional parameter regime for dictionary problems. In this paper, we design an FID that uses log (U choose n) + n/(log U/t)^{Ω(t)} bits of space and answers rank/select queries in O(t + log log n) time in the worst case, for any parameter 1 ≤ t ≤ log n/log log n, provided U = n^{1+Θ(1)}. This time-space trade-off matches known lower bounds for FIDs [40, 41, 46] when t ≤ log^{0.99} n.

Our techniques also lead to efficient succinct data structures for the fundamental problem of maintaining n integers, each of ℓ = Θ(log n) bits, and supporting partial-sum queries, with a trade-off between O(t) query time and nℓ + n/(log n/t)^{Ω(t)} bits of space. Prior to this work, no known data structure for the partial-sum problem achieves constant query time with nℓ + o(n) bits of space usage.

Keywords and phrases:
data structures, dictionaries, space efficiency
Category:
Track A: Algorithms, Complexity and Games
Funding:
Renfei Zhou: Partially supported by the MongoDB PhD Fellowship.
Copyright and License:
© Jingxun Liang and Renfei Zhou; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Data structures design and analysis
Related Version:
Full Version: https://arxiv.org/abs/2504.19350
Acknowledgements:
The authors thank William Kuszmaul and Huacheng Yu for helpful suggestions on paper writing, and anonymous reviewers for pointing out important related works.
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

1 Introduction

A fully indexable dictionary (a.k.a. rank/select dictionary; FID for short) is a fundamental data structure, which stores a set S of n keys from a universe [U], supporting Rank and Select queries:

  • Rank(x): return the number of keys in S that are smaller than or equal to x.

  • Select(i): return the i-th smallest key in S.
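As a concrete point of reference, the two queries can be sketched over a plain sorted array. This baseline is not succinct (it stores the keys explicitly in O(n log U) bits); the class name is illustrative, not from the paper:

```python
from bisect import bisect_right

class NaiveFID:
    """Plain sorted-array baseline for Rank/Select; explicit and non-succinct."""
    def __init__(self, keys):
        self.keys = sorted(keys)

    def rank(self, x):
        # Number of keys in S that are smaller than or equal to x.
        return bisect_right(self.keys, x)

    def select(self, i):
        # The i-th smallest key (1-indexed, as in the paper).
        return self.keys[i - 1]

fid = NaiveFID({3, 14, 15, 92, 65})
assert fid.rank(15) == 3
assert fid.select(2) == 14
```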

This paper focuses on static FIDs, where the key set is given at the beginning and does not change over time. There is also the dynamic case, allowing updates to the key set via insertions and deletions, which is not considered in this paper.

FIDs are powerful data structures with numerous applications. First, if we choose to think of S as an indicator vector in {0,1}^U, then the problem becomes storing a bit string containing n ones and U − n zeros, while supporting prefix-sum queries and queries for the position of the i-th one. This so-called rank/select problem serves as a subroutine in many space-efficient data structures. Second, given Rank and Select, one can also support predecessor search queries (i.e., find the largest element in S that does not exceed x) by calling Select(Rank(x)) – this, too, serves as a common data-structural subroutine. The result is that FIDs have many applications in data structures for strings [10, 14, 24, 26, 36, 37, 21, 13], trees [12, 18, 44, 35], parentheses sequences [17, 35], multisets [44], permutations [34], variations of dictionaries [8, 7], etc.
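The predecessor-search reduction mentioned above can be sketched as follows; this is a minimal illustration on a sorted list, with helper names that are ours, not the paper's:

```python
from bisect import bisect_right

def rank(keys, x):      # keys: a sorted list; number of keys <= x
    return bisect_right(keys, x)

def select(keys, i):    # i-th smallest key (1-indexed)
    return keys[i - 1]

def predecessor(keys, x):
    """Largest key <= x, via the reduction Predecessor(x) = Select(Rank(x))."""
    r = rank(keys, x)
    return select(keys, r) if r >= 1 else None

keys = [3, 14, 15, 65, 92]
assert predecessor(keys, 64) == 15
assert predecessor(keys, 2) is None   # no key <= 2
```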

The most fundamental question regarding FIDs is to determine their best possible time-space trade-off. If an FID storing n keys from the universe [U] uses log (U choose n) + R bits of space, we say it incurs R bits of redundancy, where the first term log (U choose n) is referred to as the information-theoretic optimum. Conventionally, an FID is said to be compact if the redundancy R = O(log (U choose n)), and succinct if R = o(log (U choose n)). The time-space trade-off of an FID is the relationship between the redundancy R and the worst-case query time under a RAM with word size Θ(log U).

Towards the optimal trade-off

There is vast literature on space-efficient FIDs. The earliest works [28, 33, 9] on FIDs focused on the setting where one stores S's indicator vector using U bits in the plain format, while using up to R additional bits to store auxiliary information. Compared to the information-theoretic optimum log (U choose n), their approaches are succinct only if U = (2 ± o(1))·n. In 2002, Raman, Raman, and Satti [44] constructed an FID with redundancy O(U log log U / log U) and constant query time. Later, Pǎtraşcu [39] reduced the redundancy to U/(log U/t)^{Ω(t)} + O(U^{3/4}) bits with worst-case query time O(t).

Some applications of FIDs demand storing dense sets where the universe size U = O(n); in this case, Pǎtraşcu's FID is already succinct and achieves an ideal trade-off between O(t) query time and n/(log n/t)^{Ω(t)} bits of redundancy, which is provably optimal when t ≤ log^{0.99} n [41, 46]. However, when U = n^{1+Θ(1)}, which is the conventional parameter regime for dictionary problems, the redundancy of Pǎtraşcu's FID is polynomially larger than the information-theoretic optimum log (U choose n) = Θ(n log n), even for large running times such as t = log^{0.99} n, and thus it is neither succinct nor compact. Gupta et al. [25] further improved the redundancy to O(n log log n) bits, thus achieving succinctness, at the cost of slower queries of O(log² log n) time. It is currently not known how to construct succinct FIDs with query time o(log² log n).

On the lower-bound side, Pǎtraşcu and Thorup [40] showed an Ω(log log n) time lower bound for predecessor queries, assuming that U = n^{1+Θ(1)} and that the data structure is compact. Their lower bound also applies to FIDs by a reduction. Another lower bound, proven by Pǎtraşcu and Viola [41, 46], shows that if an FID takes O(t) worst-case time per query, it must incur n/(log n)^{O(t)} bits of redundancy. The best-known lower bound for FIDs is a simple combination of these two independent lower bounds: when the redundancy equals n/(log n)^{Ω(t)}, the worst-case query time must be at least Ω(max{log log n, t}), provided U = n^{1+Θ(1)}. There remains a huge gap between the lower and upper bounds.

In summary, it has remained one of the basic open questions in the field whether one can hope to construct an FID that is both succinct and that supports queries in the optimal time of O(loglogn). Moreover, although there are well-established lower bounds for the time-space trade-off of a static FID, it remains open on the upper-bound side to obtain tight bounds at any point along the curve. These are the problems that we seek to resolve in the current paper.

This paper: Tight upper bounds for FIDs

In this paper, we construct a succinct FID as shown in the following theorem.

Theorem 1.

For any parameters n, U, t with U = n^{1+Θ(1)} and t ≤ log n/log log n, there is a static fully indexable dictionary with query time O(t + log log n) and redundancy R = n/(log n/t)^{Ω(t)}, in the word RAM model with word size w = Θ(log n).

Thus, it is possible to achieve succinctness while offering an optimal time bound of t = O(log log n) – in this case, the redundancy of our construction is n/(log n)^{Ω(log log n)} ≤ n/polylog n bits. More generally, the time-space trade-off offered by the above theorem is provably optimal for all t ≤ log^{0.99} n, as, in every parameter regime, it matches one of the two known lower bounds. Somewhat surprisingly, this means that the maximum of the two completely independent lower bounds forms a single tight lower bound for FIDs.

High-level technical approach

To understand the high-level approach taken by our data structure, let us consider the lower bound in [40], which points out a distribution of hard instances such that any data structure with near-linear space O(n · polylog n) needs to spend Ω(log log n) time per query on these inputs.

A critical insight in the current paper is that, although it is difficult to improve the query time for these hard instances, they are well-structured so that they only occupy a small fraction of all possible inputs. Therefore, in principle, the space we actually need to store these hard instances is small.

Moreover, if we restrict only to these hard instances, then we can afford to use a data structure that is space-efficient compared to the optimal bound for FIDs but that is space-inefficient compared to the information-theoretic optimum for hard instances. So we can handle hard instances by creating data structures that are morally space inefficient (i.e., not optimal for these hard instances), but that are nonetheless space efficient in the context of the overall FID problem.

On the flip side, when an input is not hard, we can think of it morally as being like a “random” input. The entropy of random inputs is large, so we cannot waste much space. Fortunately, random instances can benefit from the same high-level techniques that have already been developed in past work [39], allowing for fast queries with good space efficiency.

Of course, “hard instances” and “random instances” are really just two extremes in a large design space of inputs. What is remarkable is that, nonetheless, it is possible to combine the high-level approaches described above in order to construct a single data structure that achieves the optimal time-space trade-off for all inputs.

Other implications of our techniques

As mentioned earlier, FIDs can naturally support predecessor search queries, so our result also improves the predecessor data structures:

Corollary 2.

For any parameters n, U with U = n^{1+Θ(1)}, there is a data structure storing a set S of n keys from a universe [U] that supports predecessor searches in S, with the same time-space trade-off as in Theorem 1.

Prior to this result, it remained open whether one could achieve O(loglogn) query time while also offering o(n) bits of redundancy [42].

By adjusting our data structure for FIDs, we show the following time-space trade-off for so-called select dictionaries, which are dictionaries that support Select but not Rank:

Theorem 3.

For any parameters n, U, t with U = n^{1+Θ(1)} and t ≤ log n/log log n, there is a static dictionary that answers Select queries within O(t) time and has redundancy R = n/(log n/t)^{Ω(t)}, in the word RAM model with word size w = Θ(log n).

Here, again, the bounds that we achieve are provably optimal, as they match a lower bound previously proven by [41]. This represents a significant improvement over the previous state of the art [44], which was unable to achieve space bounds any better than O(n/log n).

Another application of our techniques is to the basic question of storing n Θ(logn)-bit integers and answering partial-sum queries.

Theorem 4.

For any parameters n, ℓ, t with ℓ = Θ(log n) and t ≤ log n/log log n, there is a data structure storing a sequence of n ℓ-bit integers a_1, …, a_n, which uses nℓ + n/(log n/t)^{Ω(t)} bits of space and can answer the partial-sum queries Σ_{j=1}^{i} a_j for any given i within O(t) time.

For this partial-sum problem, the previous state of the art with O(1) query time incurred O(n) bits of redundancy [43]. Whether or not this could be reduced to o(n) remained an open question. Our bound reduces it all the way to n/polylog n bits, for a polylogarithm of our choice.
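To make the time-space tension concrete, here is a minimal two-level partial-sum sketch (not the paper's construction): prefix sums are precomputed once per block, and a query scans at most one block, so block size s trades O(s) scan time against roughly (n/s)·log(total sum) extra bits. The class name and block size are illustrative.

```python
class BlockedPartialSum:
    """Two-level partial-sum sketch: precomputed block prefix sums + in-block scan."""
    def __init__(self, a, s=4):
        self.a, self.s = list(a), s
        # block[q] = sum of the first q*s entries
        self.block = [0]
        for i in range(s, len(self.a) + 1, s):
            self.block.append(self.block[-1] + sum(self.a[i - s:i]))

    def partial_sum(self, i):
        """Sum of a[0..i-1]: one table lookup plus a scan of < s entries."""
        q, r = divmod(i, self.s)
        return self.block[q] + sum(self.a[q * self.s : q * self.s + r])

ps = BlockedPartialSum([5, 1, 4, 1, 5, 9, 2, 6], s=4)
assert ps.partial_sum(6) == 25   # 5+1+4+1+5+9
```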

1.1 Related works

Grossi et al. [23] studied the FID problem for polynomial universe sizes with O(1) query time. In this setting, achieving a compact space usage of O(n log n) bits is impossible due to the lower bound established by [40]. They showed a trade-off between O(ε^{−1}) query time and O(n^{1+ε}) bits of space.

In the special case where the universe size U = 2n, Yu [47] presented a data structure that supports Rank queries in O(t) time, but not Select queries. This data structure incurs only n/(log n)^{Ω(t)} + n^{1−Ω(1)} bits of redundancy, outperforming Pǎtraşcu's data structure when t = (log n)^{1−o(1)} is large, and matching the lower bound of [40] for all t.

A closely related setting is the dynamic FID problem, where, in addition to answering Rank and Select queries, the data structure also needs to support fast insertions and deletions. When the universe size U is linear in the number of keys n, Li et al. [30] achieved O(n/2^{log^{0.199} n}) bits of redundancy with the optimal operational time of O(log n/log log n). For polynomial universe sizes U = n^{1+Θ(1)}, it remains open whether a succinct data structure with O(log n/log log n) time per operation can be constructed.

Predecessor data structures are also closely related to FIDs and have been extensively studied across various parameter regimes and settings due to their importance. For a comprehensive overview, see the survey by Navarro and Rojas-Ledesma [38].

Another variant of dictionary is the unordered dictionary problem, where the data structure only needs to answer membership queries – whether a given key x is in the current key set. A series of works has focused on both static and dynamic unordered dictionaries [48, 27, 45, 31, 6, 32, 5], leading to a static unordered dictionary with constant query time and n^ε bits of redundancy [27], as well as tight upper and lower bounds for the time-space trade-off of dynamic unordered dictionaries for polynomial universe sizes [31, 6, 32].

Raman, Raman, and Satti [44] studied the indexable dictionary (ID) problem, which is similar to FIDs, but Rank(x) only returns the rank of x when x is present in the current key set; otherwise, it returns "not exist." They constructed IDs with O(n/log n) bits of redundancy and O(1) query time. Note that the ID setting is easier than FIDs, allowing data structures to achieve both succinctness and constant-time queries.

A further relaxation of ID is monotone minimal perfect hashing (MMPH), where the data structure only needs to support Rank queries for elements in the key set – if x is not in the key set, Rank(x) may return an arbitrary answer. (Select queries are not required.) The key point of this relaxation is that MMPH data structures use asymptotically less space: Belazzougui, Boldi, Pagh, and Vigna [3] constructed a data structure that uses only O(n log log log U) bits, while encoding the key set itself requires log (U choose n) = O(n log(U/n)) bits. This bound is shown to be optimal by Assadi, Farach-Colton, and Kuszmaul [1] (see also [29]).

There is also a line of research on rank/select problems over arbitrary alphabets [21, 19, 15, 20, 22, 2, 4]. Given a sequence in [σ]^n, the select query asks for the position of the k-th occurrence of a given symbol s ∈ [σ]; rank queries are defined similarly. These problems generalize FIDs, which correspond to the special case σ = 2. For arbitrary σ, the optimal time/redundancy trade-off is still not fully understood [4].

2 Preliminaries

Augmented B-trees

We will use the augmented B-trees (aB-trees for short) from [39] as a subroutine.

Let B and m be parameters such that m is a power of B, and let A[1..m] be an array of elements over the alphabet Σ. An aB-tree of branching factor B and size m is a full B-ary tree over m leaves, which correspond to the entries of the array A[1..m]. Additionally:

  • Each node of the aB-tree is augmented by a label from the label alphabet Φ. The label of a leaf node is determined by the corresponding entry A[i] in the array, and the label of an internal node is determined by the label sequence of its children. Formally, there is a transition function 𝒜 determining the label ϕ_u of each internal node u: ϕ_u = 𝒜(ϕ⃗), where ϕ⃗ ≜ (ϕ_1, ϕ_2, …, ϕ_B) is the label sequence of u's B children.

  • There is a recursive query algorithm, which starts by examining the label of the root, and then recursively traverses down a path from the root to some leaf of the tree. At each step, the algorithm examines the label of the current node and the labels of its children to determine which child to recurse on. After reaching a leaf, the algorithm outputs the answer to the query based on all the examined labels. Furthermore, this algorithm is restricted to spending only O(1) time at each examined node, ensuring that the query time remains at most O(log_B m).

Beyond [39], we also consider incomplete aB-trees. Let m be any integer, not necessarily a power of B, and let t be an integer such that B^t ≥ m. An incomplete aB-tree with branching factor B and size m is derived from a (full) aB-tree over B^t leaves with the same branching factor B, by retaining only the first m leaves, removing the other B^t − m leaves, and (repeatedly) deleting all internal nodes left without any child. In such an incomplete aB-tree, the transition function 𝒜 of labels still follows the form ϕ_u = 𝒜(ϕ⃗), but the input label sequence ϕ⃗ = (ϕ_1, …, ϕ_ℓ) may have length ℓ < B, as u might possess fewer than B children.
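The following toy sketch illustrates the aB-tree mechanics under two simplifying assumptions: labels are subtree sums, and each node is handled by scanning its at most B children directly (the real construction handles each node in O(1) via lookup tables). All names are ours, and the sketch assumes at least two leaves.

```python
# Toy aB-tree with branching factor B over an array of non-negative integers.
# Labels are subtree sums; a query walks a single root-to-leaf path.
B = 4

def build(a):
    """Return the list of levels, leaves first; each node stores its subtree sum."""
    levels = [list(a)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sum(prev[i:i + B]) for i in range(0, len(prev), B)])
    return levels

def select_prefix(levels, v):
    """Largest count i with a[0] + ... + a[i-1] <= v, via one root-to-leaf walk."""
    node, acc = 0, 0
    for lvl in reversed(levels[:-1]):   # from just below the root down to the leaves
        lo = node * B
        for child in range(lo, min(lo + B, len(lvl))):
            if acc + lvl[child] > v:    # this child's subtree overshoots: descend into it
                break
            acc += lvl[child]           # otherwise consume the whole subtree
        else:
            child = min(lo + B, len(lvl))
        node = child
    return node

a = [2, 0, 0, 3, 1, 0, 4, 0]
lv = build(a)
assert select_prefix(lv, 5) == 4   # 2+0+0+3 = 5 <= 5, but 2+0+0+3+1 = 6 > 5
```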

Let 𝒩(m, ϕ) be the number of instances of A[1..m] such that the aB-tree of branching factor B over it has root label ϕ. According to Theorem 8 of [39], when m is a power of B, we can compress an aB-tree of size m and root label ϕ to within log 𝒩(m, ϕ) + 2 bits. Their proof directly works for incomplete aB-trees (m ≤ B^t) as well.

Lemma 5 (Natural generalization of [39, Theorem 8]).

Let m, B, t be parameters with m ≤ B^t and B = O(w/log(m + |Φ|)). Suppose there is an (incomplete) aB-tree with branching factor B, size m, and root label ϕ; then we can compress this aB-tree to within log 𝒩(m, ϕ) + 2 bits with query time O(t), in the word RAM model with word size w. The data structure uses a lookup table of O(B^{2t}|Σ| + B^{3t}|Φ|^{2B}) words, which depends only on B and t.

Notice that the lookup table depends only on B and t, but not on m. The original construction in [39], when applied to incomplete aB-trees, uses a lookup table of O(B(|Σ| + |Φ|^{B+1} + B|Φ|^B)) words, which depends on B, t, and m. To avoid the dependency on m, we simply concatenate B^t such lookup tables for all values of m, totaling at most O(B^{t+1}(|Σ| + |Φ|^{B+1} + B|Φ|^B)) ≤ O(B^{2t}|Σ| + B^{3t}|Φ|^{2B}) words, and let aB-trees of different sizes m use different parts of the concatenated lookup table. Later, when we apply this lemma in our data structure, we will let multiple instances with the same B and t (but different m) share a single lookup table.

Predecessor data structures

We will use the following extension of predecessor data structure as a subroutine.

Problem 1 (Predecessor with associated values).

Storing a set S ⊆ [U] of n keys (we use the notation [n] to denote the set {0, 1, …, n−1} for any non-negative integer n, and [a, b] to denote the set of integers {a, a+1, …, b} when there is no ambiguity), where each key is associated with a value in [V], supporting predecessor and successor queries:

  • Predecessor(x): Return the largest element y ∈ S such that y ≤ x, and the associated value of y.

  • Successor(x): Return the smallest element y ∈ S such that y > x, and the associated value of y.
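A sorted-array stand-in for this interface (illustrative names; the actual construction behind Lemma 6 below combines a predecessor structure with perfect hashing) might look like:

```python
from bisect import bisect_right

class PredSucc:
    """Predecessor/successor with associated values; sorted arrays as a stand-in."""
    def __init__(self, pairs):            # pairs: iterable of (key, value)
        items = sorted(pairs)
        self.keys = [k for k, _ in items]
        self.vals = [v for _, v in items]

    def predecessor(self, x):             # largest y <= x, with its value
        j = bisect_right(self.keys, x) - 1
        return (self.keys[j], self.vals[j]) if j >= 0 else None

    def successor(self, x):               # smallest y > x, with its value
        j = bisect_right(self.keys, x)
        return (self.keys[j], self.vals[j]) if j < len(self.keys) else None

ds = PredSucc([(10, 'a'), (20, 'b'), (30, 'c')])
assert ds.predecessor(25) == (20, 'b')
assert ds.successor(25) == (30, 'c')
```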

By studies on dictionaries [16] and predecessor data structures [40], there is a compact construction for predecessor data structures with associated values:

Lemma 6.

For any U ≥ n, there is a predecessor data structure with associated values using O(n log U + n log V) bits, with query time O(log log (U/n)) in the worst case.

Proof Sketch.

We maintain the following data structures simultaneously:

  • A predecessor data structure for the set S ⊆ [U] by [40], with space O(n log U) and query time O(log log (U/n)).

  • A successor data structure similar to the predecessor data structure.

  • A “perfect hashing” structure [16, 11] storing the key set S with the associated values, with a space complexity of O(n(log U + log V)) bits and a worst-case query time of O(1).

Throughout this paper, we use the terminology “predecessor data structure” for short to refer to the data structure defined by Lemma 6.

3 Basic data structure for FIDs

In this section, we construct a basic data structure for FIDs with slightly worse time and space guarantees than required by Theorem 1. It will serve as a subroutine in the final data structure in Section 4: the final data structure is based on the algorithmic framework of this section, and by replacing one of its subroutines with the result of this section (Theorem 7) in a non-recursive way, it achieves the ideal time-space trade-off. Formally, we will prove the following theorem:

Theorem 7 (Weak version of Theorem 1).

For any parameters U, n, t with U = n^{1+Ω(1)}, t ≤ log U/log log U, and any constant ε > 0, there is a static fully indexable dictionary with query time O(t log log U) and redundancy R = max{n/(log U/t)^{Ω(t)}, O(log U)}, in the word RAM model with word size w = Θ(log U). The data structure uses a lookup table of O(U^{10ε}) words that depends only on U and t, which can be shared among multiple instances of fully indexable dictionaries.

Although in most applications of FIDs we care about polynomially-sized universes U = n^{1+Θ(1)}, here we also consider the parameter regime where n = U^{o(1)} is significantly smaller than U. The reason is that, later in Section 4, we will use Theorem 7 to maintain short subsequences of keys.

In the remainder of this section, we construct this basic data structure to prove Theorem 7. Table 1 lists the main parameters and notations we will introduce in this section.

Table 1: Table of Notations.
Notation Explanation
t A parameter indicating that our algorithm's time constraint is O(t log log U).
R R ≜ max{n/(log U/t)^{Ω(t)}, O(log U)} is the desired redundancy of our FID.
B The branching factor of aB-trees in the mid parts. B log B = (ε log U)/t. When t ≤ log^{0.99} U, we have B = log^{Θ(1)} U.
B^t The number of keys in each block.
h h ≜ t log B. Each mid part consists of 2h bits.
b b ≜ log(U/n) − h. Each low part consists of b bits.
x_i^{(low)}, δ_i^{(mid)}, δ_i^{(high)} The values of the low, mid, and high parts. See Figure 1.
δ_i The value of the mid and high parts together. It equals δ_i^{(mid)} + 2^{2h}·δ_i^{(high)}.
δ̄_i δ̄_i ≜ Σ_{j=1}^{i} δ_j is the prefix sum of (δ_i).
Δ Δ = δ̄_{B^t} is the sum of all δ_i within a block.

Partitioning into blocks

Let S = {x_1, x_2, …, x_n} be the set of keys we need to store, where x_1 < x_2 < ⋯ < x_n. The first step of the construction is to divide the sequence (x_1, x_2, …, x_n) into small blocks, and to store an inter-block data structure that reduces the entire FID problem to small-scale FID problems within each block.

Let B be a parameter with B log B = (ε log U)/t, and break the whole sequence into blocks of size B^t. As the sequence (x_1, x_2, …, x_n) is monotonically increasing, the partition into blocks can be viewed as partitioning the possible range of the keys [U] into ⌈n/B^t⌉ disjoint intervals, where the k-th block corresponds to the interval (x_{(k−1)B^t}, x_{kB^t}] (we let x_0 = 0 for convenience). When n is not divisible by B^t, especially when n < B^t, the size of the last block is smaller than B^t. Fortunately, the same construction below works for any block size m ≤ B^t, and we only illustrate our construction for block size B^t.

Our inter-block data structure consists of the following two parts:

  • The sequence of endpoints of the intervals, i.e., (x_{B^t}, x_{2B^t}, …, x_n).

  • A predecessor data structure for the sequence of endpoints (x_{B^t}, x_{2B^t}, …, x_n), where each entry x_{kB^t} is associated with the value k.

Given the above auxiliary information, we view each block k as an FID with B^t keys from a universe of size x_{kB^t} − x_{(k−1)B^t}. When we perform a rank query Rank(x), the second part above helps us locate the block k containing the queried element x (i.e., x ∈ (x_{(k−1)B^t}, x_{kB^t}]), transforming the original query into a Rank query in the k-th block, within O(log log U) time (see Lemma 6). When we perform a select query Select(i), it becomes a Select query within block k = ⌈i/B^t⌉. The remaining question is how to answer Rank/Select queries within each block.
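The inter-block routing just described can be sketched as follows; plain sorted lists stand in for the endpoint sequence and for the predecessor structure of Lemma 6, and the function names are ours:

```python
from bisect import bisect_left

def block_endpoints(keys, block_size):
    """Endpoints x_{B^t}, x_{2B^t}, ..., x_n of the block intervals (keys sorted)."""
    n_blocks = -(-len(keys) // block_size)            # ceil(n / B^t)
    return [keys[min(k * block_size, len(keys)) - 1] for k in range(1, n_blocks + 1)]

def route_rank(endpoints, x):
    """1-based block index k whose interval (x_{(k-1)B^t}, x_{kB^t}] contains x."""
    return bisect_left(endpoints, x) + 1

def route_select(block_size, i):
    """Select(i) lands in block ceil(i / B^t)."""
    return -(-i // block_size)

keys = [5, 9, 12, 20, 31, 44, 50, 61, 70]
eps = block_endpoints(keys, 3)      # [12, 44, 70]
assert route_rank(eps, 25) == 2     # 25 lies in (12, 44] -> block 2
assert route_select(3, 7) == 3      # the 7th key lives in block 3
```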

Storing difference sequences within blocks

Throughout the remainder of this section, we mainly focus on the FID problem within a single block k. Letting s ≜ (k−1)B^t, the intra-block FID problem requires us to maintain the sequence of keys (x_{s+1}, …, x_{s+B^t}) in the universe (x_s, x_{s+B^t}]. We cut the binary representation of each key x_{s+i} into two parts, as shown in Figure 1(a):

  • Letting h be a parameter to be determined and b ≜ log(U/n) − h, we define the low part as the b least significant bits in the binary representation of each key x_{s+i}, denoted by integers x_i^{(low)} ∈ [2^b]. We will directly store these integers in the data structure.

  • The (log U) − b remaining (more significant) bits are referred to as the mid-high part, denoted by integers x_i^{(mid-high)}. For these integers, we aim to store the difference sequence, i.e., the differences δ_i ≜ x_i^{(mid-high)} − x_{i−1}^{(mid-high)} between adjacent pairs of elements.

Clearly, we have

x_{s+i} = x_i^{(low)} + x_i^{(mid-high)}·2^b  (0 ≤ i ≤ B^t).

For short, we let δ̄_i denote the partial sum Σ_{j=1}^{i} δ_j, so that x_i^{(mid-high)} = δ̄_i + x_0^{(mid-high)}. We further define Δ ≜ δ̄_{B^t} as the sum of the difference sequence (δ_1, …, δ_{B^t}):

Δ ≜ Σ_{i=1}^{B^t} δ_i = x_{B^t}^{(mid-high)} − x_0^{(mid-high)} = ⌊x_{s+B^t}/2^b⌋ − ⌊x_s/2^b⌋.

We denote by Δ_k the value of Δ in the k-th block. Clearly,

Σ_{k=1}^{⌈n/B^t⌉} Δ_k = ⌊x_n/2^b⌋ − ⌊x_0/2^b⌋ ≤ U/2^b = n·2^h. (1)

With these notations, the FID problem within a single block can be restated as follows:

Problem 2 (FID within a block).

Let Δ be a parameter stored elsewhere. We need to store two sequences (x_1^{(low)}, x_2^{(low)}, …, x_{B^t}^{(low)}) and (δ_1, δ_2, …, δ_{B^t}), such that x_i^{(low)} ∈ [2^b] and Σ_{i=1}^{B^t} δ_i = Δ, with the property that (x_i^{(low)} + δ̄_i·2^b)_{i=1}^{B^t} forms a strictly increasing sequence, supporting the following queries:

  • PartialSum(i): Return δ̄_i·2^b + x_i^{(low)}. It corresponds to the Select queries in the original FID problem.

  • Rank(x): Return the largest i such that δ̄_i·2^b + x_i^{(low)} ≤ x.

As introduced above, once we have a solution to Problem 2 with O(T) query time for any T, with the help of the inter-block data structure, we may answer the Rank/Select queries on the original key sequence in O(T + log log U) time.

(a) Separating low parts from mid-high parts.
(b) Separating mid parts and high parts.
Figure 1: Partitioning keys into three parts. We (a) divide the binary representations of the keys into low parts and mid-high parts, and further (b) take the difference sequence (δ_i)_{1≤i≤B^t} of the mid-high parts and divide each δ_i into a mid part and a high part. There are b ≜ log(U/n) − h bits in the low part, 2h bits in the mid part, and (log n) − h bits in the high part.

The three-part partition

For some technical reasons, we further divide each δi (i.e., the difference sequence of the mid-high part) into two smaller parts as follows. We will use different approaches to organize these parts in Section 3.1.

  • Let δ_i^{(mid)} be the integer formed by the 2h least significant bits of δ_i, which we call the mid part.

  • Let δ_i^{(high)} be the integer formed by the remaining (log U) − b − 2h = (log n) − h bits, which we call the high part.

Formally,

δ_i = δ_i^{(mid)} + δ_i^{(high)}·2^{2h}  (i ∈ [1, B^t]).

By now, we have cut the sequence of keys (x_{s+1}, …, x_{s+B^t}) in a single block into three parts: the high part (δ_1^{(high)}, …, δ_{B^t}^{(high)}), the mid part (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}), and the low part (x_1^{(low)}, …, x_{B^t}^{(low)}), as shown in Figure 1(b). Subsection 3.1 below designs (almost) separate data structures to store these three parts.
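The three-part split is plain bit manipulation; the following sketch performs it for one block and inverts it. The bit-widths are illustrative stand-ins for the paper's parameters (b ≜ log(U/n) − h low bits and 2h mid bits), and all function names are ours.

```python
# Illustrative widths: b low bits per key, 2h mid bits per difference (here 2h = 6).
b, two_h = 4, 6

def split_block(block_keys):
    """Cut a sorted block into low parts, mid parts, and high parts."""
    lows = [x & ((1 << b) - 1) for x in block_keys]             # x_i^(low)
    mid_high = [x >> b for x in block_keys]                     # x_i^(mid-high)
    deltas = [mid_high[i] - mid_high[i - 1] for i in range(1, len(mid_high))]
    mids = [d & ((1 << two_h) - 1) for d in deltas]             # delta_i^(mid)
    highs = [d >> two_h for d in deltas]                        # delta_i^(high)
    return lows, mids, highs

def reassemble(x0, lows, mids, highs):
    """Invert the split: x_{s+i} = (x_0^(mid-high) + prefix sum of deltas) * 2^b + low."""
    keys, mh = [], x0 >> b
    for i in range(1, len(lows)):
        mh += mids[i - 1] + (highs[i - 1] << two_h)             # recover delta_i
        keys.append((mh << b) + lows[i])
    return keys

block = [0x123, 0x456, 0x1789, 0x1abc]
lows, mids, highs = split_block(block)
assert reassemble(block[0], lows, mids, highs) == block[1:]
```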

3.1 Data structure within each block

In this subsection, we design the intra-block data structure, consisting of three parts, with time complexity O(t log log U). At a very high level, the three parts of the data structure interact as follows when performing PartialSum and Rank queries:

  • During the query PartialSum(i), we compute δ̄_i^{(high)}, δ̄_i^{(mid)}, and x_i^{(low)} from the high part, mid part, and low part of the data structure separately (we define δ̄_i^{(high)} and δ̄_i^{(mid)} similarly to δ̄_i: δ̄_i^{(high)} ≜ Σ_{j=1}^{i} δ_j^{(high)}, and likewise for δ̄_i^{(mid)}), and then combine them into the output.

  • During the Rank(x) query, we sequentially examine the high part, mid part, and low part of the data structure. At each step, we narrow down the search interval for the desired index i, gradually approaching its exact value in the low part.

Below, we will provide the detailed construction of the three parts separately.

High part

The high-part sequence (δ_1^{(high)}, …, δ_{B^t}^{(high)}) is sparse and thus easy to store. As

Σ_{i=1}^{B^t} δ_i^{(high)} = Σ_{i=1}^{B^t} ⌊δ_i/2^{2h}⌋ ≤ (1/2^{2h})·Σ_{i=1}^{B^t} δ_i = Δ/2^{2h},

there are at most Δ/2^{2h} non-zero entries in the high part. By choosing a large parameter h, we can make the space usage of the high part so small that we can regard the entire high part as redundant information.

Let I ⊆ [1, B^t] be the set of indices of all non-zero entries in the high part, so that |I| ≤ Δ/2^{2h}. The high part of the data structure consists of the following two components:

  • A predecessor data structure for the set I, where each i ∈ I is associated with δ̄_i^{(high)}.

  • A predecessor data structure for the set {δ̄_i : i ∈ I}, where each δ̄_i is associated with i. (Note that all the δ̄_i are distinct for i ∈ I, as δ_i^{(high)} ≠ 0 for all i ∈ I; hence {δ̄_i : i ∈ I} is a valid set.)

According to Lemma 6, there is a compact implementation of these predecessor data structures, with O(|I|·w) = O(wΔ/2^{2h}) bits of space and query time O(log log U).
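A small stand-in for the first component shows how prefix sums of the sparse high parts are recovered: only the non-zero positions are kept, each with the prefix sum up to it, and a sorted Python list replaces the predecessor structure of Lemma 6 (names are illustrative).

```python
from bisect import bisect_right

class HighPart:
    """Sparse storage of the high-part sequence: keep only indices i with
    delta_i^(high) != 0, each paired with the prefix sum up to i."""
    def __init__(self, highs):                  # highs[i-1] = delta_i^(high)
        self.idx, self.pref = [], []
        run = 0
        for i, d in enumerate(highs, 1):
            run += d
            if d != 0:
                self.idx.append(i)
                self.pref.append(run)

    def prefix(self, i):
        """bar-delta_i^(high): the prefix sum equals the one stored at the
        last non-zero position i' <= i (or 0 if there is none)."""
        j = bisect_right(self.idx, i) - 1
        return self.pref[j] if j >= 0 else 0

hp = HighPart([0, 0, 3, 0, 0, 2, 0])
assert hp.prefix(5) == 3
assert hp.prefix(7) == 5
```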

Mid part

We store the mid-part sequence (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}) in an aB-tree with branching factor B and size B^t, using Lemma 5. (When the block size m is smaller than B^t, we use an incomplete aB-tree; this is why we need Lemma 5 to support incomplete aB-trees.) The details of the aB-tree are as follows.

  • Each leaf of the aB-tree contains a single mid-part entry δ_i^{(mid)} ∈ [2^{2h}] ≜ Σ.

  • The label of each node equals the sum of all the leaf entries δ_i^{(mid)} in its subtree. As δ_i^{(mid)} < 2^{2h} for each i ∈ [1, B^t], we may take Φ = [B^t·2^{2h}].

  • The premise of Lemma 5, B = O(w/log(B^t + |Φ|)), can be satisfied by setting h ≜ t log B. Recalling that B is a parameter with B log B = (ε log U)/t, we have B = (ε log U)/(t log B) = O(log U/log(B^t·2^{2h})) = O(w/log(B^t + |Φ|)).

  • The size of the lookup table is

    O(B^{2t}|Σ| + B^{3t}|Φ|^{2B}) = O(B^{3t}|Φ|^{2B}) ≤ O(B^{3t}·(B^{3t})^{2B}) = O(B^{6tB+3t}) ≤ O(B^{10tB}) = O(U^{10ε})

    words, which will be shared between blocks.

  • The aB-tree can support three types of queries within O(t) time:

    1. Given i, query δ̄_i^{(mid)}.

    2. Given v, query the largest index i₂ with δ̄_{i₂}^{(mid)} ≤ v.

    3. Given a value v, query the maximal interval [i₁, i₂] of the sequence (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}) with respect to v, defined as follows.

    Definition 8.

    The maximal interval of a (non-negative) sequence (a_1, …, a_{B^t}) with respect to a value v is the interval [i₁, i₂] formed by all indices i whose prefix sums ā_i ≜ a_1 + ⋯ + a_i equal v. Within the maximal interval, we have a_{i₁+1} = a_{i₁+2} = ⋯ = a_{i₂} = 0, while a_{i₁}, a_{i₂+1} > 0, and ā_{i₁} = v.

We now compute the space usage of the mid part. To use Lemma 5, we need to store the root label ϕ before storing the aB-tree, as Lemma 5 assumes free access to the root label. After that, the aB-tree occupies log 𝒩(B^t, ϕ) + 2 bits, where 𝒩(B^t, ϕ) is the number of sequences (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}) with root label ϕ. Recalling that ϕ is the sum of this sequence and

ϕ = Σ_{i=1}^{B^t} δ_i^{(mid)} ≤ Σ_{i=1}^{B^t} δ_i = Δ,

the number of possible sequences (δ_1^{(mid)}, …, δ_{B^t}^{(mid)}) is at most (Δ+B^t choose B^t−1). Therefore, the number of bits used by the mid part (per block) is at most

O(log Δ) + log 𝒩(B^t, ϕ) + 2 ≤ log (Δ+B^t choose B^t−1) + O(w).

Low part

In this basic data structure, the low part is stored directly in an array. Specifically, we store the sequence (x_1^{(low)}, …, x_{B^t}^{(low)}) one by one as B^t integers of b bits each.

Note that the low-part sequence has the following locally increasing property which we will use in our query algorithms:

  • Restricted to a maximal interval [i₁, i₂] of (δ_1, …, δ_{B^t}), the subsequence (x_{i₁}^{(low)}, …, x_{i₂}^{(low)}) is strictly increasing. This is because, by the condition of Problem 2, the sequence (x_{i₁}^{(low)} + 2^b·δ̄_{i₁}, …, x_{i₂}^{(low)} + 2^b·δ̄_{i₂}) is strictly increasing, whereas δ̄_{i₁} = ⋯ = δ̄_{i₂} by the definition of maximal interval.

Query algorithms

Now we can formally state our algorithms for the PartialSum and Rank queries in this intra-block data structure.

For the PartialSum(i) query, we query the high part, mid part, and low part of the data structure separately to get δ̄_i^{(high)}, δ̄_i^{(mid)}, and x_i^{(low)}:

  • To get δ̄_i^{(high)}, we query i on the first predecessor data structure of the high part, getting the largest index i′ ∈ I such that i′ ≤ i, together with its associated value δ̄_{i′}^{(high)}. As i′ is the last index up to i with a nonzero δ_{i′}^{(high)}, we have δ̄_i^{(high)} = δ̄_{i′}^{(high)}, as desired. This takes O(log log U) time.

  • To get δ̄_i^{(mid)}, we directly query it from the aB-tree of the mid part, which takes O(t) time.

  • To get xi(low), we directly read it out from the integer array of the low part, which takes O(1) time.

Finally, the algorithm returns $\delta_i\cdot 2^b+x_i^{(\mathrm{low})}=\delta_i^{(\mathrm{high})}\cdot 2^{2h+b}+\delta_i^{(\mathrm{mid})}\cdot 2^b+x_i^{(\mathrm{low})}$. The total time cost is $O(t+\log\log U)$.
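The recombination in this last step can be sketched as follows (our toy code; it assumes, as in the construction, that the low part has $b$ bits and the mid part $2h$ bits, with the high part above them):

```python
def split_key(x, b, h):
    """Split x into (high, mid, low): the b lowest bits, the next 2h bits,
    and the remaining high bits."""
    high = x >> (2 * h + b)
    mid = (x >> b) & ((1 << (2 * h)) - 1)
    low = x & ((1 << b) - 1)
    return high, mid, low

def reconstruct_key(delta_high, delta_mid, x_low, b, h):
    """Combine the three query results:
    x_i = delta_high * 2^(2h+b) + delta_mid * 2^b + x_low."""
    return (delta_high << (2 * h + b)) | (delta_mid << b) | x_low
```

Splitting and recombining are inverse operations, so the three separate lookups indeed recover the original key.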

For the Rank(x) query, we will sequentially query the high part, mid part, and low part of the data structure to narrow down the search interval of the desired index i:

  • We first query the high part to locate $i$ within a maximal interval of $(\delta_1^{(\mathrm{high})},\ldots,\delta_{B^t}^{(\mathrm{high})})$. This is achieved by querying the second predecessor data structure of the high part, from which we get two adjacent indices $i_1,i_2\in I$ with $\delta_{i_1}\le\lfloor x/2^b\rfloor<\delta_{i_2}$, which locates the index $i$ within $[i_1,i_2)$. Moreover, as $i_1,i_2$ are adjacent indices in $I$, the interval $[i_1,i_2)$ forms a maximal interval of the sequence $(\delta_1^{(\mathrm{high})},\ldots,\delta_{B^t}^{(\mathrm{high})})$ with respect to the value $v_{\mathrm{high}}\triangleq\delta_{i_1}^{(\mathrm{high})}$. This takes $O(\log\log U)$ time.

  • Recall that the mid-part step needs to find the largest $i'\in[i_1,i_2)$ such that $\delta_{i'}$ does not exceed the threshold $\lfloor x/2^b\rfloor$. As all such $i'$ share the same value $\delta_{i'}^{(\mathrm{high})}$, this is equivalent to finding the largest $i'\in[i_1,i_2)$ such that $\delta_{i'}^{(\mathrm{mid})}$ does not exceed $\lfloor x/2^b\rfloor-v_{\mathrm{high}}\cdot 2^{2h}$. This can be done with one query to the aB-tree in the mid part.

    • If the $i'$ we found satisfies the strict inequality $\delta_{i'}<\lfloor x/2^b\rfloor$, we conclude the query with $i=i'$, because no matter what the low parts are, we already know $\delta_{i'}\cdot 2^b+x_{i'}^{(\mathrm{low})}\le(\delta_{i'}+1)\cdot 2^b-1$ is strictly smaller than the queried key $x$, and $\delta_{i'+1}\cdot 2^b+x_{i'+1}^{(\mathrm{low})}\ge\delta_{i'+1}\cdot 2^b$ is larger than $x$.

    • If $\delta_{i'}$ is equal to the threshold $\lfloor x/2^b\rfloor$, we find the maximal interval $[i_1',i_2']\subseteq[i_1,i_2)$ of $(\delta_1,\ldots,\delta_{B^t})$ with $\delta_{i_1'}=\cdots=\delta_{i_2'}=\lfloor x/2^b\rfloor$ by querying the aB-tree again. In this case, we can locate the answer $i$ to the query within the interval $[i_1'-1,i_2']$, but the concrete answer depends on the information in the low part.

    In both cases, the mid-part step takes O(t) time.

  • Assume the previous step encounters the second case, where an interval $[i_1',i_2']$ is known to have the same $\delta_i$ for all $i$, and the answer $i$ to the query is guaranteed to reside in $[i_1'-1,i_2']$. Therefore, we need to find the largest $i\in[i_1',i_2']$ such that $x_i^{(\mathrm{low})}\le(x\bmod 2^b)$, and that will be the answer to the query. (If $x_{i_1'}^{(\mathrm{low})}$ is already larger than $(x\bmod 2^b)$, the answer to the query should be $i_1'-1$.) Since $(x_{i_1'}^{(\mathrm{low})},\ldots,x_{i_2'}^{(\mathrm{low})})$ is strictly increasing, we can use binary search to find $i$ within $O(\log L)$ time, where $L\triangleq i_2'-i_1'+1$; as $L\le B^t$, the running time of this step is bounded by $O(\log B^t)=O(t\log\log U)$.⁶ (⁶ Recall that $B\log B=\varepsilon\log U/t$, so $B\le\log U$, hence $t\log B\le t\log\log U$.)

Hence, the total time cost of the Rank query is at most O(tloglogU).
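The binary search in the low-part step can be sketched as follows (0-indexed toy code, ours, not the paper's):

```python
import bisect

def rank_in_low_part(low, i1, i2, v_low):
    """Largest index i in [i1, i2] (0-indexed, inclusive) with
    low[i] <= v_low, or i1 - 1 if even low[i1] exceeds v_low.
    The subsequence low[i1..i2] is strictly increasing, so binary
    search takes O(log L) comparisons where L = i2 - i1 + 1."""
    # bisect_right finds the insertion point after all entries <= v_low
    j = bisect.bisect_right(low, v_low, i1, i2 + 1)
    return j - 1  # j == i1 means no entry <= v_low, so we return i1 - 1
```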

Space usage of the block

Recalling that in our construction, the high part, mid part, and low part of the data structure use $O(w\Delta/2^{2h})$, $\log\binom{\Delta+B^t}{B^t-1}+O(w)$, and $bB^t$ bits, respectively, the total space used by the intra-block data structure is

$bB^t+\log\binom{\Delta+B^t}{B^t-1}+O\!\left(w+w\Delta/2^{2h}\right).$ (2)

3.2 Performance of the data structure

In this subsection, we will combine all parts of our construction to get the basic data structure, and check its time complexity, lookup table size, and redundancy to complete the proof of Theorem 7.

Time complexity

Recall that our query algorithm will first query the inter-block data structure using O(loglogU) time to locate a block to query, and then query one intra-block data structure within O(tloglogU) time to get the answer. The time complexity of the whole data structure is O(tloglogU) per operation.

Lookup table size

The only lookup table used by our data structure occurs in the mid part within each block, which is used for aB-trees due to Lemma 5. As computed before, the size of this lookup table is $O(U^{10\varepsilon})$ words, which is shared among all blocks, as desired.

Redundancy

The redundancy of the whole FID consists of the following two parts:

  • The entire inter-block data structure is regarded as redundant. To store the endpoint sequence, we need to store $n/B^t+1$ keys from $[U]$, which occupies $O(wn/B^t+w)$ bits. To store the predecessor data structure of the endpoint sequence, we also need at most $O(wn/B^t+w)$ bits according to Lemma 6. Recalling that $B\log B=\varepsilon\log U/t$, we have $B\ge\sqrt{B\log B}=\sqrt{\varepsilon\log U/t}$, and hence

    $O(wn/B^t)\le O\!\left(\frac{n\log U}{(\varepsilon\log U/t)^{t/2}}\right)\le O\!\left(\frac{n\log U}{(\log U/t)^{t/4}}\right)\le O\!\left(\frac{n}{(\log U/t)^{t/8}}\right)$
    $\le\max\left\{\frac{n}{(\log U/t)^{\Omega(t)}},\,O(\log U)\right\}=R;$ (3)

    thus this part is covered by the desired amount of redundancy.

  • The redundancy caused by intra-block data structures, which we calculate below.

Recall that (2) upper bounds the space usage of each intra-block data structure. Taking a summation of (2) over all blocks, we get the total space consumption of the intra-block data structures:

$\sum_{k=1}^{n/B^t}\left(bB^t+\log\binom{\Delta_k+B^t}{B^t-1}+O\!\left(w+\frac{w\Delta_k}{2^{2h}}\right)\right)$
$\quad=nb+\sum_{k=1}^{n/B^t}\log\binom{\Delta_k+B^t}{B^t-1}+O\!\left(w+\frac{nw}{B^t}+\frac{w}{2^{2h}}\sum_{k=1}^{n/B^t}\Delta_k\right)$
$\quad\le nb+\sum_{k=1}^{n/B^t}\log\binom{\Delta_k+B^t}{B^t-1}+O\!\left(w+\frac{nw}{B^t}+\frac{nw}{2^h}\right)$
$\quad\le nb+\sum_{k=1}^{n/B^t}\log\binom{\Delta_k+B^t}{B^t-1}+O(R),$ (4)

where the first inequality is due to (1); the second inequality is because $h=t\log B$, thus $O(nw/2^h)=O(nw/B^t)\le O(R)$ according to (3). We further have

$\sum_{k=1}^{n/B^t}\log\binom{\Delta_k+B^t}{B^t-1}=\log\left(\prod_{k=1}^{n/B^t}\binom{\Delta_k+B^t}{B^t-1}\right)\le\log\binom{\sum_{k=1}^{n/B^t}\Delta_k+n}{n-n/B^t}\le\log\binom{2^hn+n}{n}.$ (5)

Comparing this quantity with the information-theoretic lower bound $\log\binom{U}{n}$ of FID, we get

$\log\binom{U}{n}-\log\binom{2^hn+n}{n}=\log\left(\frac{U}{2^hn+n}\cdot\frac{U-1}{2^hn+n-1}\cdots\frac{U-n+1}{2^hn+1}\right)$
$\quad\ge n\log\frac{U}{2^hn+n}=n\left(\log\frac{U}{2^hn}-\log\frac{2^h+1}{2^h}\right)\ge nb-\frac{n}{2^h\ln 2},$ (6)

where the last inequality is because $b\le\log\frac{U}{n}-h$ and $\log(1+2^{-h})\le 2^{-h}/\ln 2$. By plugging (5) and (6) into (4), the total space usage of all intra-block data structures is at most

$nb+\log\binom{2^hn+n}{n}+O(R)\le\log\binom{U}{n}+O\!\left(R+\frac{n}{2^h}\right)=\log\binom{U}{n}+O(R),$

where $O(n/2^h)=O(n/B^t)$ is covered by $R$ according to (3).

In conclusion, the total redundancy of our data structure for FIDs is bounded by $R$. Since we have constructed an FID with time complexity $O(t\log\log U)$, lookup table size $O(U^{10\varepsilon})$, and redundancy $R=\max\{n/(\log U/t)^{\Omega(t)},O(\log U)\}$, Theorem 7 follows.

4 Advanced data structure for FIDs

In this section, we will prove Theorem 1 by improving the basic data structure.

Proof.

We start by reviewing the bottleneck of Theorem 7. In the basic data structure, the time bottleneck is that Rank queries take $O(t\log\log U)$ time in the low-part step. Specifically, suppose we are required to answer the query Rank(x). The inter-block data structure, together with the high and mid parts of the intra-block data structure, uses $O(t+\log\log U)$ time to locate the desired index $i$ within a maximal interval $[i_1,i_2]$ of $(\delta_1,\ldots,\delta_{B^t})$. Then, in the case where the low-part step is required, we need to further determine the maximum index $i\in[i_1,i_2]$ such that $x_i^{(\mathrm{low})}\le v_{\mathrm{low}}\triangleq(x\bmod 2^b)$. This can be formulated as a Rank query on the increasing sequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$:

  • Rank($v_{\mathrm{low}}$): Return the largest index $i\in[i_1,i_2]$ such that $x_i^{(\mathrm{low})}\le v_{\mathrm{low}}$.

In the basic data structure, we directly store the sequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$ as a sorted array using $bL$ bits and support Rank queries by binary search within $O(\log L)$ time, where $L\triangleq i_2-i_1+1$ is the length of the subsequence. However, in the worst case, $L$ can be as large as the block size $B^t$, which makes the time of the binary search $O(\log B^t)=O(t\log\log U)$. To address this bottleneck, we change the storage scheme of the low part below.

New construction for the low part

The key observation is that storing the subsequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$ to support Rank and PartialSum queries can be viewed as managing a smaller FID. Hence, we have the following unified data structure to store it:

  • If $L\le L_{\mathrm{thrd}}\triangleq\log U$, then we still store the subsequence as a sorted array within $bL$ bits.

  • Otherwise, we will use the basic data structure in Theorem 7 (as a subroutine) to store the subsequence, with parameter $t'=O(1)$ and a constant $\varepsilon'$ to be determined.⁸ (⁸ Theorem 7 requires that the universe size $2^b$ of the basic data structure is at least a large polynomial in the number of keys $L\le B^t$. This condition holds because, on one hand, our choice of $B$ (i.e., $B\log B=\varepsilon\log U/t$) implies $L\le B^t=U^{o(1)}$; on the other hand, $2^b=\frac{U/n}{2^h}=\frac{U/n}{B^t}=\mathrm{poly}(U)$.) As $\{x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})}\}\subseteq[2^b]$ is a set of $L$ elements, the basic data structure will have query time $O(t'\log\log U)=O(\log\log U)$, and will use

    $\log\binom{2^b}{L}+\frac{L}{(\log 2^b)^{\Omega(1)}}+O(\log 2^b)\;\le\;L\log\frac{e\cdot 2^b}{L}+O(L+b)$
    $\quad=Lb-L\log L+O(L+b)\;\le\;Lb$

    bits of space, where the last inequality uses the fact $L>L_{\mathrm{thrd}}=\log U$ to ensure that $L\log L$ is asymptotically larger than both $O(L)$ and $O(b)=O\!\left(\log\frac{U/n}{2^h}\right)\le O(\log U)$. Hence, in this case, the basic data structure storing the subsequence also fits in $bL$ bits. We pad 0's to the end of the encoding of the basic data structure until it occupies exactly $bL$ bits, ensuring memory alignment.

Clearly, this unified data structure allows us to answer Rank and Select queries on the subsequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$ within $O(\log\log U)$ time:

  • If $L\le L_{\mathrm{thrd}}$, then the subsequence is stored as a sorted array. For Rank queries, we can use binary search to get the desired index, which takes $O(\log L)\le O(\log L_{\mathrm{thrd}})=O(\log\log U)$ time; for Select queries, we can directly read out the desired $x_i^{(\mathrm{low})}$ in $O(1)$ time.

  • Otherwise, the subsequence is stored using the basic data structure in Theorem 7, which can support Rank and Select queries within O(loglogU) time.

Further, we can store the entire low part using the above unified data structure for subsequences. Specifically, we first divide the interval $[1,B^t]$ into all maximal intervals of $(\delta_1,\ldots,\delta_{B^t})$. For each maximal interval $[i_1,i_2]$ of length $L$, we store the corresponding subsequence of the low part $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$ using the unified data structure in $bL$ bits. Finally, we concatenate all these unified data structures from the leftmost maximal interval to the rightmost one, obtaining a string of $bB^t$ bits, which serves as the encoding of the entire low part.
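A toy Python sketch (ours, not the paper's code) of this layout. The key invariant is that every interval occupies exactly $b$ bits per key regardless of which representation it uses, so the bit offset of an interval's encoding is computable from its left endpoint alone, with no extra pointers; the recursive basic-data-structure branch is mocked here by the same array encoding.

```python
L_THRD = 8  # stand-in for the threshold L_thrd = log U

def encode_interval(subseq, b):
    """Encode one maximal interval's low-part subsequence in exactly
    b * len(subseq) bits. Short intervals (len <= L_THRD) are a plain
    sorted array; long ones would use the basic FID of Theorem 7 padded
    with 0's to the same length (mocked here by the array encoding)."""
    bits = 0
    for i, v in enumerate(subseq):
        assert 0 <= v < (1 << b)
        bits |= v << (i * b)
    return bits

def encode_low_part(intervals, b):
    """Concatenate per-interval encodings from leftmost to rightmost."""
    bits, offset = 0, 0
    for subseq in intervals:
        bits |= encode_interval(subseq, b) << offset
        offset += b * len(subseq)
    return bits

def extract(bits, i1, length, b):
    """Read back the b*length-bit encoding of the interval starting at
    key index i1 (1-indexed): it begins at bit (i1 - 1) * b."""
    return (bits >> ((i1 - 1) * b)) & ((1 << (b * length)) - 1)
```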

Finally, to obtain the advanced data structure, we start with the basic data structure and replace the encoding of the low part with the new construction above (the concatenation of unified data structures). As our new construction for the low part occupies the same space ($bB^t$ bits) as before, the redundancy of this data structure is still $R=\max\{n/(\log U/t)^{\Omega(t)},O(\log U)\}=n/(\log U/t)^{\Omega(t)}$. What makes the advanced data structure special is that the "basic data structure" appears twice here, once as the entire framework and once as the subroutines for maximal intervals in the low part. By embedding small instances of basic data structures within a larger framework of the same basic data structure in a non-recursive manner, we can improve the query time of FIDs without introducing any additional redundancy, as described below.

Query algorithms

The query algorithms for our advanced data structure are similar to those of the basic data structure in Section 3: We first use the inter-block information to transform the original queries on the entire FID into Rank/PartialSum queries within each block. Then, a three-stage process involving the high, mid, and low parts will obtain the answer to the query. The only difference is that, before we access anything in the low part, we need to first compute the maximal interval of (δ1,,δBt) that we plan to access; by comparing the length of the interval with the threshold Lthrd, we determine if the unified data structure for that maximal interval is stored as a sorted array or a basic data structure. The algorithms to answer Rank/PartialSum queries within each block are explained below.

  • For the query Rank(x), using the high and mid parts of the data structure, we can either answer the query directly or locate the desired index $i$ within a maximal interval $[i_1,i_2]$ of $(\delta_1,\ldots,\delta_{B^t})$. In the latter case, we read the encoding of the unified data structure for $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$, which resides in the $((i_1-1)b+1)$-th through the $(i_2b)$-th bits of the encoding of the entire low part. Recall that comparing $i_2-i_1+1$ with $L_{\mathrm{thrd}}$ tells us whether the unified data structure is stored as a sorted array or as a basic data structure for FIDs. After that, we perform the query Rank($v_{\mathrm{low}}$) with $v_{\mathrm{low}}=(x\bmod 2^b)$ on the unified data structure to obtain the desired index $i$ within $O(\log\log U)$ time.

  • For the query PartialSum(i), we first get $\delta_i$ from the mid and high parts. Then, instead of directly reading out $x_i^{(\mathrm{low})}$ from the low part as before, we also need to know the maximal interval $[i_1,i_2]$ of $(\delta_1,\ldots,\delta_{B^t})$ containing $i$ to help us access the low part. We use an algorithm similar to the Rank query to compute this maximal interval $[i_1,i_2]$ with respect to $\delta_i$, and to extract the encoding of the subsequence $(x_{i_1}^{(\mathrm{low})},\ldots,x_{i_2}^{(\mathrm{low})})$. By querying Select($i-i_1+1$) on the unified data structure of this subsequence, we get $x_i^{(\mathrm{low})}$ in $O(\log\log U)$ time.

Both types of queries take O(t+loglogU) time, because the aB-trees in the mid part take O(t) time, while other steps (including the high and low parts and the inter-block data structure) take O(loglogU) time per query. This meets the requirement in Theorem 1.

Space of the lookup table

Finally, we check that the size of the lookup table introduced by Theorem 7 is also dominated by $R$. Recall that the lookup table consists of $O(U^{10\varepsilon})$ words. As $U=n^{1+\Theta(1)}$, we can assume there is a constant $\alpha>1$ such that $U\le n^{\alpha}$. Then, we can set $\varepsilon=1/(20\alpha)$, which means that the number of bits in the lookup table is $O(U^{1/(2\alpha)}\log U)=O(n^{1/2}\log n)$, which is significantly smaller than $R=n/(\log U/t)^{\Omega(t)}$, as desired.

In summary, we get a data structure for static FID with query time $O(t+\log\log U)$ and redundancy $R=n/(\log U/t)^{\Omega(t)}$, which concludes the proof of Theorem 1.

5 Select and partial sum

In this section, we prove Theorems 3 and 4 by adjusting the basic data structure introduced in Section 3. We will rely on the predecessor data structure from [40] when the set to store is relatively dense:

Lemma 9 (Similar to Lemma 6, see [40]).

For $U\le n\log^t n$, there is a predecessor data structure with associated values that uses $O(n\log U+n\log V)$ bits of space and answers queries in $O(\log t)$ time.

We follow the same notation as in Section 3 in the following proofs.

5.1 Select dictionaries

Recall Theorem 3.

Proof.

Recall that we have divided the binary representations of keys into the high, mid, and low parts, and in the high part, for each block of $B^t$ keys, we stored a predecessor data structure (Lemma 6) to compute $\delta_i^{(\mathrm{high})}$, which takes $O(\log\log n)$ time to answer each predecessor query. This is the only step exceeding $O(t)$ time in the process of answering Select queries.

Instead of storing predecessor data structures for each block separately, here we store one large predecessor data structure for all $n$ keys with nonzero $\delta_i^{(\mathrm{high})}$'s. It achieves the same functionality of computing $\delta_i^{(\mathrm{high})}$. The number of elements with nonzero $\delta_i^{(\mathrm{high})}$'s is bounded by $n/2^h=n/B^t$, thus the predecessor data structure stores $n/B^t\le n/\log^t n$ elements from the range $[1,n]$. (If the number of nonzero $\delta_i^{(\mathrm{high})}$'s is smaller than $n/B^t$, we add dummy elements until there are $n/B^t$ elements.) According to Lemma 9, the predecessor data structure takes $O(\log t)$ time to answer each query, and uses $O(n\log n/B^t)=n/(\log n/t)^{\Omega(t)}$ bits of space, which fits in our desired redundancy. Other parts of the data structure remain the same as in Section 3.

When we perform a Select query, the above predecessor data structure in the high part computes the prefix sum of the high part of the difference sequence, which takes $O(\log t)$ time per query; the aB-trees in the mid part take $O(t)$ time to return the prefix sum of the mid part (within each block); finally, the low part reads $x_i^{(\mathrm{low})}$ directly to obtain the low bits of the target key. The entire process takes $O(t)$ time.

5.2 Partial sum on integer sequences

Recall Theorem 4.

Proof.

Let $x_i\triangleq\sum_{j=1}^{i}a_j$ be the partial-sum sequence of the input. The partial-sum problem is equivalent to storing a (multi-)set of keys $x_1\le x_2\le\cdots\le x_n$ supporting Select queries, i.e., a select dictionary. The only distinction is that the difference $x_i-x_{i-1}$ between any two adjacent keys is bounded by $2^{\ell}-1$ in this problem. The data structure we design for partial sum is similar to that of the select dictionaries, except that we adjust the parameters and change the number of bits in the high, mid, and low parts:

  • There is no high part.

  • Let $B$ be a parameter such that $B\log B=\frac{\varepsilon\log n}{t}$ for a small constant $\varepsilon$, and let $h\triangleq t\log B$. We call the $(\ell-h)$ least significant bits of each $x_i$ the low part, and store these bits directly using an array.

  • The remaining bits $\lfloor x_i/2^{\ell-h}\rfloor$ are called the mid part. In their difference sequence $\delta_1,\ldots,\delta_n$, where $\delta_i\triangleq\lfloor x_i/2^{\ell-h}\rfloor-\lfloor x_{i-1}/2^{\ell-h}\rfloor$, each entry $\delta_i$ equals either $\lfloor a_i/2^{\ell-h}\rfloor$ (i.e., the $h$ most significant bits of the input entry $a_i$) or $\lfloor a_i/2^{\ell-h}\rfloor+1$, and thus lies in $[0,2^h]$. Same as in Section 3, we divide $\delta_1,\ldots,\delta_n$ into blocks of size $B^t$ and use aB-trees to store them, supporting prefix-sum queries on $\delta_1,\ldots,\delta_n$.
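The construction above can be sketched end to end in Python (our toy code; the aB-tree, which returns prefix sums of $\delta$ in $O(t)$ time, is replaced by a plain sum for illustration):

```python
import itertools

def build(a, l, h):
    """Split the partial sums of a into a low part (the l-h least
    significant bits, stored plainly) and a mid difference sequence
    delta, whose entries fit in [0, 2^h]."""
    x = list(itertools.accumulate(a))                # partial sums x_1..x_n
    low = [xi & ((1 << (l - h)) - 1) for xi in x]    # l-h LSBs per prefix sum
    mid = [xi >> (l - h) for xi in x]                # remaining high bits
    delta = [mid[0]] + [mid[i] - mid[i - 1] for i in range(1, len(mid))]
    assert all(0 <= d <= (1 << h) for d in delta)    # entries lie in [0, 2^h]
    return low, delta

def partial_sum(low, delta, i, l, h):
    """PartialSum(i) = (prefix sum of delta up to i) * 2^(l-h) + low[i].
    In the real structure the prefix sum comes from an aB-tree in O(t)."""
    return (sum(delta[:i + 1]) << (l - h)) + low[i]
```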

Recall that $\mathcal{N}(B^t,\phi)$ represents the number of instances of an aB-tree with size $B^t$ and root label $\phi$ (i.e., the sum of entries in the aB-tree equals $\phi$), which is bounded by $(2^h+1)^{B^t}$. The space usage of the mid part is thus

$\log\mathcal{N}(B^t,\phi)+2+O(\log n)\;\le\;B^t\log(2^h+1)+O(\log n)\;\le\;B^th+O\!\left(B^t/2^h+\log n\right)$
$\quad=B^th+O(\log n)$

bits per block, where $\log\mathcal{N}(B^t,\phi)+2$ is the space usage of the aB-tree, and $O(\log n)$ is the space to store the root label $\phi$ of the aB-tree. Taking a summation of the space usage over all blocks, and including the $n(\ell-h)$ bits taken by the array in the low part, the inter-block information, and the lookup tables, we know the total space occupied by the data structure is at most

$n(\ell-h)+\frac{n}{B^t}\left(B^th+O(\log n)\right)+n^{0.1}\;\le\;n(\ell-h)+nh+O\!\left(\frac{n\log n}{B^t}\right)\;\le\;n\ell+\frac{n}{(\log n/t)^{\Omega(t)}}$

bits, as desired. Similar to Theorem 3, each query takes O(t) time.

References

  • [1] Sepehr Assadi, Martín Farach-Colton, and William Kuszmaul. Tight bounds for monotone minimal perfect hashing. ACM Transactions on Algorithms, page 3677608, August 2024. See also SODA 2023. doi:10.1145/3677608.
  • [2] Jérémy Barbay, Francisco Claude, Travis Gagie, Gonzalo Navarro, and Yakov Nekrich. Efficient fully-compressed sequence representations. Algorithmica, 69(1):232–268, May 2014. doi:10.1007/s00453-012-9726-3.
  • [3] Djamal Belazzougui, Paolo Boldi, Rasmus Pagh, and Sebastiano Vigna. Monotone minimal perfect hashing: Searching a sorted table with O(1) accesses. In Proc. 20th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 785–794, 2009. doi:10.1137/1.9781611973068.86.
  • [4] Djamal Belazzougui and Gonzalo Navarro. Optimal lower and upper bounds for representing sequences. ACM Transactions on Algorithms, 11(4):1–21, June 2015. doi:10.1145/2629339.
  • [5] Michael A. Bender, Martín Farach-Colton, John Kuszmaul, and William Kuszmaul. Modern hashing made simple. In Proc. 7th Symposium on Simplicity in Algorithms (SOSA), pages 363–373, 2024. doi:10.1137/1.9781611977936.33.
  • [6] Michael A. Bender, Martín Farach-Colton, John Kuszmaul, William Kuszmaul, and Mingmou Liu. On the optimal time/space tradeoff for hash tables. In Proc. 54th ACM SIGACT Symposium on Theory of Computing (STOC), pages 1284–1297, 2022. doi:10.1145/3519935.3519969.
  • [7] Daniel K. Blandford and Guy E. Blelloch. Compact dictionaries for variable-length keys and data with applications. ACM Transactions on Algorithms, 4(2):17:1–17:25, May 2008. doi:10.1145/1361192.1361194.
  • [8] H. Buhrman, P. B. Miltersen, J. Radhakrishnan, and S. Venkatesh. Are bitvectors optimal? SIAM Journal on Computing, 31(6):1723–1744, January 2002. doi:10.1137/S0097539702405292.
  • [9] David R. Clark. Compact PAT trees. PhD thesis, University of Waterloo, 1996.
  • [10] David R. Clark and J. Ian Munro. Efficient suffix trees on secondary storage. In Proc. 7th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 383–391, 1996.
  • [11] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, fourth edition. MIT Press, April 2022.
  • [12] P. Ferragina, F. Luccio, G. Manzini, and S. Muthukrishnan. Structuring labeled trees for optimal succinctness, and beyond. In Proc. 46th IEEE Symposium on Foundations of Computer Science (FOCS), pages 184–193, 2005. doi:10.1109/SFCS.2005.69.
  • [13] Paolo Ferragina, Roberto Grossi, Ankur Gupta, Rahul Shah, and Jeffrey Scott Vitter. On searching compressed string collections cache-obliviously. In Proc. 27th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), pages 181–190, 2008. doi:10.1145/1376916.1376943.
  • [14] Paolo Ferragina and Giovanni Manzini. Indexing compressed text. Journal of the ACM, 52(4):552–581, July 2005. doi:10.1145/1082036.1082039.
  • [15] Paolo Ferragina, Giovanni Manzini, Veli Mäkinen, and Gonzalo Navarro. Compressed representations of sequences and full-text indexes. ACM Transactions on Algorithms, 3(2):20–es, May 2007. doi:10.1145/1240233.1240243.
  • [16] Michael L. Fredman, János Komlós, and Endre Szemerédi. Storing a sparse table with O(1) worst case access time. Journal of the ACM, 31(3):538–544, June 1984. doi:10.1145/828.1884.
  • [17] Richard F. Geary, Naila Rahman, Rajeev Raman, and Venkatesh Raman. A simple optimal representation for balanced parentheses. Theoretical Computer Science, 368(3):231–246, December 2006. doi:10.1016/j.tcs.2006.09.014.
  • [18] Richard F. Geary, Rajeev Raman, and Venkatesh Raman. Succinct ordinal trees with level-ancestor queries. ACM Transactions on Algorithms, 2(4):510–534, October 2006. doi:10.1145/1198513.1198516.
  • [19] Alexander Golynski, J. Ian Munro, and Srinivasa Rao Satti. Rank/select operations on large alphabets: a tool for text indexing. In Proc. 17th ACM-SIAM Symposium on Discrete Algorithm (SODA), pages 368–373, 2006. doi:10.1145/1109557.1109599.
  • [20] Alexander Golynski, Rajeev Raman, and Srinivasa Rao Satti. On the redundancy of succinct data structures. In Proc. 11th Scandinavian Workshop on Algorithm Theory (SWAT), pages 148–159, 2008. doi:10.1007/978-3-540-69903-3_15.
  • [21] Roberto Grossi, Ankur Gupta, and Jeffrey Scott Vitter. High-order entropy-compressed text indexes. In Proc. 14th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 841–850, 2003. URL: http://dl.acm.org/citation.cfm?id=644108.644250.
  • [22] Roberto Grossi, Alessio Orlandi, and Rajeev Raman. Optimal trade-offs for succinct string indexes. In Proc. 37th International Colloquium Conference on Automata, Languages and Programming (ICALP), pages 678–689, 2010. doi:10.1007/978-3-642-14165-2_57.
  • [23] Roberto Grossi, Alessio Orlandi, Rajeev Raman, and Srinivasa Rao Satti. More haste, less waste: lowering the redundancy in fully indexable dictionaries. In Proc. 26th International Symposium on Theoretical Aspects of Computer Science (STACS), pages 517–528, 2009. doi:10.4230/LIPICS.STACS.2009.1847.
  • [24] Roberto Grossi and Jeffrey Scott Vitter. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM Journal on Computing, 35(2):378–407, January 2005. doi:10.1137/S0097539702402354.
  • [25] Ankur Gupta, Wing-Kai Hon, Rahul Shah, and Jeffrey Scott Vitter. Compressed data structures: Dictionaries and data-aware measures. Theoretical Computer Science, 387(3):313–331, 2007. doi:10.1016/j.tcs.2007.07.042.
  • [26] Wing-Kai Hon, Kunihiko Sadakane, and Wing-Kin Sung. Breaking a time-and-space barrier in constructing full-text indices. SIAM Journal on Computing, 38(6):2162–2178, January 2009. doi:10.1137/070685373.
  • [27] Yang Hu, Jingxun Liang, Huacheng Yu, Junkai Zhang, and Renfei Zhou. Optimal static dictionary with worst-case constant query time. In Proc. 57th ACM Symposium on Theory of Computing (STOC), 2025.
  • [28] Guy Joseph Jacobson. Succinct static data structures. PhD thesis, Carnegie Mellon University, 1988.
  • [29] Dmitry Kosolobov. Simplified tight bounds for monotone minimal perfect hashing. In Proc. 35th Symposium on Combinatorial Pattern Matching (CPM), pages 19:1–19:13, 2024. doi:10.4230/LIPICS.CPM.2024.19.
  • [30] Tianxiao Li, Jingxun Liang, Huacheng Yu, and Renfei Zhou. Dynamic “succincter”. In Proc. 64th IEEE Symposium on Foundations of Computer Science (FOCS), pages 1715–1733, 2023. doi:10.1109/FOCS57990.2023.00104.
  • [31] Tianxiao Li, Jingxun Liang, Huacheng Yu, and Renfei Zhou. Tight cell-probe lower bounds for dynamic succinct dictionaries. In Proc. 64th IEEE Symposium on Foundations of Computer Science (FOCS), pages 1842–1862, 2023. doi:10.1109/FOCS57990.2023.00112.
  • [32] Tianxiao Li, Jingxun Liang, Huacheng Yu, and Renfei Zhou. Dynamic dictionary with subconstant wasted bits per key. In Proc. 35th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 171–207, 2024. doi:10.1137/1.9781611977912.9.
  • [33] J. Ian Munro. Tables. In Proc. 16th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), pages 37–42, 1996. doi:10.1007/3-540-62034-6_35.
  • [34] J. Ian Munro, Rajeev Raman, Venkatesh Raman, and Srinivasa Rao Satti. Succinct representations of permutations and functions. Theoretical Computer Science, 438:74–88, June 2012. doi:10.1016/j.tcs.2012.03.005.
  • [35] J. Ian Munro and Venkatesh Raman. Succinct representation of balanced parentheses and static trees. SIAM Journal on Computing, 31(3):762–776, 2001. doi:10.1137/S0097539799364092.
  • [36] J. Ian Munro, Venkatesh Raman, and S. Srinivasa Rao. Space efficient suffix trees. Journal of Algorithms, 39(2):205–222, 2001. doi:10.1006/jagm.2000.1151.
  • [37] Gonzalo Navarro and Veli Mäkinen. Compressed full-text indexes. ACM Computing Surveys, 39(1):2–es, April 2007. doi:10.1145/1216370.1216372.
  • [38] Gonzalo Navarro and Javiel Rojas-Ledesma. Predecessor search. ACM Computing Surveys, 53(5):1–35, September 2021. doi:10.1145/3409371.
  • [39] Mihai Pǎtraşcu. Succincter. In Proc. 49th IEEE Symposium on Foundations of Computer Science (FOCS), pages 305–313, 2008.
  • [40] Mihai Pǎtraşcu and Mikkel Thorup. Time-space trade-offs for predecessor search. In Proc. 38th ACM Symposium on Theory of Computing (STOC), pages 232–240, 2006. doi:10.1145/1132516.1132551.
  • [41] Mihai Pǎtraşcu and Emanuele Viola. Cell-probe lower bounds for succinct partial sums. In Proc. 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 117–122, 2010.
  • [42] Giulio Ermanno Pibiri and Rossano Venturini. Dynamic Elias-Fano representation. In Proc. 28th Annual Symposium on Combinatorial Pattern Matching (CPM), pages 30:1–30:14, 2017. doi:10.4230/LIPIcs.CPM.2017.30.
  • [43] Rajeev Raman, Venkatesh Raman, and Srinivasa Rao Satti. Succinct dynamic data structures. In Proc. 7th International Workshop on Algorithms and Data Structures (WADS), pages 426–437, 2001.
  • [44] Rajeev Raman, Venkatesh Raman, and Srinivasa Rao Satti. Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Transactions on Algorithms, 3(4):43, November 2007. See also SODA 2002. doi:10.1145/1290672.1290680.
  • [45] Rajeev Raman and Srinivasa Rao Satti. Succinct dynamic dictionaries and trees. In Proc. 30th International Colloquium on Automata, Languages and Programming (ICALP), pages 357–368, 2003.
  • [46] Emanuele Viola. New sampling lower bounds via the separator. In Proc. 38th Computational Complexity Conference (CCC), pages 26:1–26:23, 2023. doi:10.4230/LIPIcs.CCC.2023.26.
  • [47] Huacheng Yu. Optimal succinct rank data structure via approximate nonnegative tensor decomposition. In Proc. 51st ACM SIGACT Symposium on Theory of Computing (STOC), pages 955–966, 2019. doi:10.1145/3313276.3316352.
  • [48] Huacheng Yu. Nearly optimal static Las Vegas succinct dictionary. In Proc. 52nd ACM SIGACT Symposium on Theory of Computing (STOC), pages 1389–1401, 2020. doi:10.1145/3357713.3384274.