
B-Treaps Revised: Write Efficient Randomized Block Search Trees with High Load

Roodabeh Safavi, Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria
Martin P. Seybold, Faculty of Computer Science, Theory and Applications of Algorithms, University of Vienna, Austria
Abstract

Uniquely represented (UR) data structures represent each logical state with a unique storage state. We study the problem of maintaining a dynamic set of n keys from a totally ordered universe in this context. UR structures are also called “strongly history independent” structures in the literature.

We introduce a two-layer data structure called the (α, ε)-Randomized Block Search Tree (RBST) that is uniquely represented and suitable for external memory (EM). Though RBSTs naturally generalize the well-known binary Treaps, several new ideas are needed to analyze the expected search, update, and storage efficiency in terms of block-reads, block-writes, and blocks stored. We prove that searches have 𝒪(ε^{-1} + log_α n) block-reads, that dynamic updates perform 𝒪(ε^{-1} + log_α(n)/α) block-writes and 𝒪(ε^{-2} + (1 + ε^{-1} + log(n)/α) log_α n) block-reads, and that (α, ε)-RBSTs have an asymptotic load-factor of at least 1 − ε, for every ε ∈ (0, 1/2].

Thus (α,ε)-RBSTs improve on the known, uniquely represented B-Treap [Golovin; ICALP’09]. Compared with non-UR structures, the RBST is also, to the best of our knowledge, the first external memory structure that is storage-efficient and has a non-amortized, write-efficient update bound.

Keywords and phrases:
Unique Representation, Randomization, Top-Down Analysis, Block Search Tree, Write-Efficiency, Storage-Efficiency
Copyright and License:
© Roodabeh Safavi and Martin P. Seybold; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Design and analysis of algorithms
Related Version:
Full Version: https://arxiv.org/abs/2303.04722
Funding:
This work was supported under the Australian Research Council Discovery Projects funding scheme (project number DP180102870). This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 101019564) “The Design of Modern Fully Dynamic Data Structures (MoDynStruct)” and from the Austrian Science Fund (FWF) project Z 422-N and project “Fast Algorithms for a Reactive Network Layer (ReactNet)” P 33775-N, with additional funding from the netidee SCIENCE Stiftung, 2020–2024.
Editors:
Pat Morin and Eunjin Oh

1 Introduction

Organizing a set of keys such that insertions, deletions, and searches are supported is one of the most basic problems in computer science, and it drives the development and analysis of structures with various guarantees and performance trade-offs. Binary search trees, for example, have several known height balancing methods (e.g. AVL and Red-Black trees) and weight balancing methods (e.g. BB[α] trees), some providing 𝒪(1) writes for each update [17].

Block Search Trees are generalizations of binary search trees that store in every tree node up to α keys and α+1 child pointers, where the array size α of the blocks is a (typically) fixed parameter. Since such layouts enforce a certain data locality, block structures are central for read- and write-efficient access in machine models with a memory hierarchy.

External Memory and deterministic B-Tree variants.

In the classic External Memory (EM) model, the n data items of a problem instance must be transferred in blocks of a fixed size B between a block-addressable external memory and an internal Main Memory (MM), which can store up to M data items and perform computations. Typically, n ≫ M > B and the MM can only store a small number of blocks throughout processing, say M/B = 𝒪(1). Though Vitter’s Parallel Disk Model can formalize more general settings [30, Section 2.1], we are interested in the most basic case with one, merely block-addressable, EM device and one computing device with MM. The cost of a computation is typically measured in the number of block-reads from EM (Is), the number of block-writes to EM (Os), or the total number of IOs performed on EM. For basic algorithmic tasks like range search, block search trees with α = Θ(B) support algorithms with asymptotically optimal total IOs (e.g. [30, Section 10]). Classic B-Trees [4] guarantee that every block, besides the root, has a load-factor of at least 1/2 and that all leaves are at equal depth, using a deterministic balance criterion. The B*-Tree variant [23, Section 6.2.4] guarantees a load-factor of at least 2/3, based on a (deterministic) overflow strategy that shares keys among the two neighboring nodes on equal levels. Yao’s Fringe Analysis [32] showed that inserting keys under a random order in (deterministically balanced) B-Trees yields an expected asymptotic load-factor of ln(2) ≈ 69%, and the expected asymptotic load-factor is 2·ln(3/2) ≈ 81% for B*-Trees. For general workloads however, maintaining higher load-factors (with even wider overflow strategies) further increases the update write-bounds in block search trees [3, 24]. The B+-Tree variant for key-value pairs, which stores the values separately from the keys to increase the load-factor and thus the fan-outs, is heavily used in practical file systems, see e.g. [14].

The IO-optimized¹ Bε-Tree [13, 7] provides smooth trade-offs between 𝒪(log_{1+α^ε}(n)/α^{1−ε}) amortized IOs per update and 𝒪(log_{1+α^ε} n) block-reads (Is) for searches. E.g. tuning parameter ε = 1 retains the bounds of B-Trees and ε = 1/2 provides improved bounds. In the fully update-IO-optimized case (ε = 0) however, searches have merely an 𝒪(log n) bound for the number of block-reads. The main design idea to achieve this is to augment the non-leaf nodes of a B-Tree with additional “message buffers” so that a key insertion can be stored close to the root. Messages then only need propagation, further down the tree, once a message buffer is full (cf. [13, Section 3.4] and [7]). Clearly, the state of either of those (deterministic) structures depends heavily on the actual sequence in which the keys were inserted.

¹ Bε-trees are usually called “write-optimized”, though they aim to minimize IOs and not writes (Os).
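For intuition only (not from the paper), the stated Bε-tree trade-off can be evaluated numerically. The function names below are ours, and the constants hidden in the 𝒪-bounds are dropped:

```python
import math

def beps_search_reads(n, alpha, eps):
    """Block-reads per search: O(log_{1+alpha^eps} n)."""
    return math.log(n) / math.log(1 + alpha ** eps)

def beps_update_ios(n, alpha, eps):
    """Amortized IOs per update: O(log_{1+alpha^eps}(n) / alpha^(1-eps))."""
    return beps_search_reads(n, alpha, eps) / alpha ** (1 - eps)

n, alpha = 10**9, 1024
# eps = 1 recovers B-Tree-like bounds; eps = 1/2 pays roughly a factor-2
# increase in search reads for a large reduction in amortized update IOs.
for eps in (1.0, 0.5):
    print(eps, beps_search_reads(n, alpha, eps), beps_update_ios(n, alpha, eps))
```

Plugging in, e.g., n = 10⁹ and α = 1024 shows updates becoming cheaper as ε decreases while search reads grow only mildly.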

Uniquely Represented Data Structures.

For security or privacy reasons, some applications, see e.g. [19], require data structures that do not reveal any information about historical states to an observer of the memory representation, i.e. evidence of keys that were deleted or evidence of the keys’ insertion sequence. Data structures are called uniquely represented (UR) if they have this strong history independence property. For example, sorted arrays are UR, but the aforementioned search trees are not UR due to deterministic rebalancing. Early results [28, 29, 1] show lower and upper bounds on comparison-based search in UR data structures on the pointer machine. Using UR hash tables [9, 26] in the RAM model, pointer structures can also be mapped uniquely into memory. (See [9, Thm. 4.1], [10, Sec. 2].)

Randomized Search Trees are defined by inserting keys, one at a time, in order of a random permutation into an initially empty binary search tree. The well-known Treap [27, 31] maintains the tree shape of this permutation order and supports efficient insertions, deletions, and searches (see also [25, Chapter 1.3]). Searches read with high probability 𝒪(log n) tree nodes, any insertion (or deletion) performs expected 𝒪(1) rotations (writes), and the expected subtree weight beneath the updated key is 𝒪(log n). Besides UR, these remarkably strong bounds are central in the design of simple MM algorithms for dynamic problems. The constant-writes bound of Treaps, and of some Red-Black Tree variants, is particularly valuable for persistent data structures [16, 22]; that is, 𝒪(1) amortized writes are not sufficient to design fully-persistent data structures with linear space. The logarithmic weight bound, for example, is particularly valuable for obtaining an 𝒪(1) query time for order-queries in dynamic sets with 𝒪(log n) expected update time [12, Section 5]. This is a central sub-problem in designing fully-persistent data structures, see [18] for example. Further important examples include asymmetric read/write cost settings [8, 5] and parallel computation [11], or simply storage media that can only endure few write-cycles.

Golovin [20] introduced the B-Treap as the first UR data structure for Byte-addressable External Memory (BEM). The B-Treap originates from a certain block storage layout of an associated (binary) Treap, i.e. the block tree is obtained, bottom-up, by iteratively placing subtrees of a certain size in one block node². The author shows bounds for B-Treaps under certain parameter conditions: If α is at least polylogarithmic, specifically α = Ω((ln n)^{1/(1−δ)}) for some δ ∈ (0, 1), then updates have expected 𝒪((1/δ) log_α n) block-writes and range-queries have expected 𝒪((1/δ) log_α n + k/α) block-reads, where k is the output size (see Theorem 1 in [20]). The paper also discusses experimental data, from a non-dynamic implementation, that shows an expected load-factor close to 1/3 [20, Sec. 6]. Though this storage layout approach allows leveraging known bounds of Treaps to analyze B-Treaps, the block size conditions are limiting for applications of B-Treaps in algorithms. Since the update algorithm is rather involved, Golovin [21] proposed a simpler, Skip-List based EM approach for a UR solution.

² E.g. byte-addressable memory is required for intra-block child pointers, and each key maintains a size field that stores its subtree size (in a binary Treap), resulting in logarithmic writes (cf. Os in Table 1).

The known Skip-List based approaches [21] and [6] avoid the restrictions on α; however, both require an involved form of history-independent hash-table for strings of key values (see [21, Section 2.2.2]). Moreover, the B-Skip-List [21] provides only a logarithmic 𝒪(log n) depth bound (see [6, Lemma 15]), and the EM Skip-List in [6] offers only a weaker notion of History Independence than UR and has logarithmic block-writes (cf. [6, proof of Lem. 19]).

Table 1: Bounds of bottom-up B-Treaps and the proposed top-down RBSTs (in # blocks).

B-Treap [20], model BEM (requires α = Ω((ln n)^{1/(1−δ)}) for some δ ∈ (0, 1)):
  Size: 𝒪(n/α)
  Update (Os): 𝒪((1/δ) log_α n)
  Search (Is): 𝒪((1/δ) log_α n)

(α, ε)-RBST, model EM (any α ≥ 2):
  Size: (1 + 𝒪(ε)) n/α
  Update (Os): 𝒪(ε^{-1} + log_α(n)/α)
  Update (Is): 𝒪(ε^{-2} + (1 + ε^{-1} + log(n)/α) log_α n)
  Search (Is): 𝒪(ε^{-1} + log_α n)

Contribution and Paper Outline.

Our work provides the first UR EM search tree that is generally applicable for all block sizes α ≥ 2, whilst providing bounds that improve on the only known UR EM search tree solution, the B-Treap [20].

We introduce a new design technique, and provide a self-contained analysis, that leads to a two-layer randomized data structure called the (α, ε)-Randomized Block Search Tree (RBST), where α is the block size and ε ∈ (0, 1/2] a tuning parameter for the space-efficiency of our secondary layer (see Section 2). Without the secondary structures, the RBSTs are precisely the distribution of trees that are obtained by inserting the keys, one at a time, under a random order into an initially empty block search tree (e.g. α = 1 yields the Treap distribution). RBSTs are UR and the algorithms for searching are simple. In Section 2.1, we give a partial-rebuild algorithm for dynamic updates that occupies 𝒪(1) blocks of temporary MM storage and performs a number of writes (Os) to EM that is proportional to the structural change in the RBST. Its number of reads (Is) is bounded within a small factor of the writes. We prove three performance bounds for (α, ε)-RBSTs in terms of block-reads, block-writes, and blocks stored.

Section 3 shows that searching reads expected 𝒪(ε^{-1} + log_α n) blocks, and that the RBST depth is with high probability 𝒪(ε^{-1} + √(ε^{-1} ln n) + log_α n) for all α ≥ 2. If α = polylog(n), the RBST depth is with high probability 𝒪(ε^{-1} + log_α n). Section 4 shows that (α, ε)-RBSTs have an expected asymptotic load-factor of at least 1 − ε for every ε ∈ (0, 1/2]. Combining both analysis techniques, we show in Section 5 that dynamic updates perform expected 𝒪(ε^{-1} + log_α(n)/α) block-writes and expected 𝒪(ε^{-2} + (1 + ε^{-1} + log(n)/α) log_α n) block-reads.

See Table 1 for an overview of the write bounds (Os) and read bounds (Is) of the known bottom-up B-Treap and our proposed top-down RBST. To compare the update write-bounds (Os), note that RBSTs have constant 𝒪(ε^{-1}) writes for all α that are admissible in B-Treaps, whereas B-Treaps write at least logarithmically many, Ω(log_α n), blocks. This is a significant asymptotic improvement. To compare the update read-bounds (Is), note that the factor (1 + log(n)/α) = 𝒪(1) for all α = Ω((ln n)^{1/(1−δ)}) with δ ∈ (0, 1) that are admissible in B-Treaps, whereas the factor 1/δ → ∞ for α close to ln(n). That is, the RBST bounds improve on all B-Treap bounds. Thus, RBSTs are simple UR search trees, for all block sizes α ≥ 2, that simultaneously provide improved search, storage utilization, and write-efficiency, compared to the bottom-up B-Treap [20].

Comparing also with non-UR structures, (α, ε)-RBSTs are, to the best of our knowledge, the first EM structure that provides storage-efficiency and write-efficiency for every update. The decisive novelty of our tree structure is that all block-nodes of the primary layer use the maximum fan-out and that the block-nodes of the secondary layer use a weight-proportional fan-out, i.e. proportional to the number of keys that are stored in the subtree. The design of our secondary layer structures thus has nothing in common with the “message-buffers” in Bε-trees. Our result is incomparable to IO-optimized Bε-trees: Though RBSTs are UR and provide better bounds for space, searching (Is), and writes (Os), the block-reads (Is) for updates are Ω(log_α n) and thus worse than the IO-bound of the (non-UR) Bε-Tree.

2 Randomized Block Search Trees

Unlike the well-known B-Trees that store between α/2 and α keys in every block and have all leaves at equal depth, our definition of (α, ε)-RBSTs does not strictly enforce a minimum load for every single node. Like binary Treaps, the shape of an RBST T_X(π), over a set X of n keys, is defined by an incremental process that inserts the keys x ∈ X in a random order π into an initially empty block search tree structure, i.e. in ascending order of the priority values π(x). The actual algorithms for dynamic updates are discussed in Section 2.1.

This section defines the basic tree structure that supports both reporting queries (i.e. range search) and order-statistic queries (i.e. range counting and searching for the k-th smallest value). Section 2.1 then shows how to avoid explicitly maintaining the subtree weights of nodes close to the root, which yields improved update write-bounds for the case that order-statistic queries are not required. Next, we define the structure of RBSTs and state their invariants.

Let the array size α ≥ 1 and the buffer threshold β ≥ 0 be fixed integer parameters. Every RBST block node stores a fixed-size array for α keys, α + 1 child pointers, and one parent pointer. We use the key x of minimum priority π(x), from those keys contained in the node’s array, as the UR label of the block node. Every child pointer is organized as a pair, storing the pointer to the child block and the total number of keys s ≥ 1 contained in that subtree³. All internal blocks of an RBST are full, i.e. they store exactly α keys. Any block with subtree weight s has one of two possible states, either non-buffering (s ≥ α + β) or buffering (s < α + β). Though (α, ε)-RBSTs will use β = Θ(α²/ε), we first give the definition of the primary tree without the use of buffers (β = 0), i.e. every internal block remains non-buffering.

³ E.g. subtree weights are immediately known to the parent, without reading the respective child blocks.
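A minimal sketch of this node layout, with hypothetical field names (the paper prescribes only the logical contents of a block node, not a concrete encoding):

```python
from dataclasses import dataclass, field

@dataclass
class ChildRef:
    block: "Block"       # pointer to the child block
    weight: int          # total number of keys s >= 1 in that subtree

@dataclass
class Block:
    alpha: int                                     # fixed array size
    keys: list = field(default_factory=list)       # at most alpha keys, sorted
    children: list = field(default_factory=list)   # at most alpha+1 ChildRefs
    parent: "Block | None" = None

    def subtree_weight(self):
        # weight is available from the stored child weights, without
        # reading any child block from external memory
        return len(self.keys) + sum(c.weight for c in self.children)

    def is_buffering(self, beta):
        # buffering iff the subtree holds fewer than alpha + beta keys
        return self.subtree_weight() < self.alpha + beta
```

The augmented child references are what later allow fan-outs and buffering states to be decided without extra block-reads.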

For β=0, the tree shape is defined by the following incremental process that inserts the keys in ascending order of their priority π(). Starting at the root block, the insertion of key x first locates the leaf block of the tree using the search tree property. If the leaf is non-full, x is emplaced in the sorted key array of the node. Otherwise, a new child block is allocated (according to the search tree property) and x is stored therein. This concludes the definition of the tree structure for β=0.
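The incremental process above can be sketched as follows (a plain in-memory model with dict-based nodes; all names are ours, and a node's children are keyed by child-slot index):

```python
def build_rbst(keys_by_priority, alpha):
    """Insert keys in ascending priority order into an initially empty
    block search tree with array size alpha (beta = 0)."""
    root = None
    for x in keys_by_priority:
        if root is None:
            root = {"keys": [x], "children": {}}
            continue
        v = root
        while True:
            if len(v["keys"]) < alpha:        # non-full leaf: emplace x
                v["keys"].append(x)
                v["keys"].sort()
                break
            # full block: find the child slot by the search tree property
            i = sum(1 for k in v["keys"] if k < x)
            if i not in v["children"]:        # allocate a new child block
                v["children"][i] = {"keys": [x], "children": {}}
                break
            v = v["children"][i]              # descend
    return root
```

Note that α = 1 reproduces the usual Treap insertion of keys in ascending priority order.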

Figure 1: Cartesian representations of two RBSTs, showing the i-th smallest key x_i ∈ X with priority value π(x_i) as the point (i, π(x_i)) in the plane (n = 25 and α = 3). (Cf. [31] for Cartesian Trees.)
Active separators of a block v are shown with dotted vertical lines and the horizontal lines are p̄(v). The left tree has β = 0 and the right tree has buffering subtrees. Note that all blocks, except the last one, on a root-to-leaf path are full.
Invariant 1: Any internal block stores those α keys that have the smallest priority values in its subtree⁴.

⁴ For example, α = 1 and β = 0 yields the well-known Treap, and α > 1 yields search trees with fan-out α + 1.

See Figure 1 (left) for an example on 25 keys with β=0 that consists of 12 blocks. It demonstrates that leaves may merely store one key in their block. We address this storage inefficiency issue next.

The “high-level idea” for β > 0 is to delay generating new leaves until the subtree contains a sufficiently large number of keys. To this end, we use secondary structures, which we call buffers⁵, that replace the storage layout of all those subtrees that contain fewer than α + β keys.

⁵ Though we use the word “buffer” as it describes this purpose best, our definition has nothing to do with the “message buffers” in [13, 2].

Invariant 2: All remaining block nodes of the primary structure are full, and their children are either a primary block or a buffer.

There are several possible ways for a UR organization of buffers, for example using a list of at most ⌈(α+β)/α⌉ = 𝒪(α/ε) blocks that are sorted by key values. We propose the following structure that leads to stronger bounds.

Secondary UR Trees: Buffers with weight proportional fan-out.

Our buffers are search trees that consist of nodes similar to the primary tree. They also store the keys with the α smallest priority values, from the subtree, in their block. However, the fan-out of internal nodes varies based on the number of keys n ≤ α + β in the subtree. Define δ(n) = min{α + 1, max{1, ⌈(n − α)/ρ⌉}} as the fan-out bound for a subtree of weight n < α + β =: α + (α + 1)ρ. We defer the calibration of the parameter ρ = Θ(α/ε) to the start of the proof of Theorem 9. The subtree weight of buffering nodes is maintained explicitly, with the aforementioned in-pointers. To obtain UR, the buffer layout is again defined recursively, solely based on the set of keys in the buffer and the random permutation π.
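Under one plausible reading of the fan-out rule (the rounding is our assumption; the source formula is garbled), δ(n) grows from 1 to α + 1 roughly proportionally to the number of keys below the block, n − α:

```python
import math

def fanout(n, alpha, rho):
    """Fan-out bound delta(n) for a buffering subtree of weight
    n < alpha + (alpha + 1) * rho. Assumed reading:
    delta(n) = min(alpha + 1, max(1, ceil((n - alpha) / rho)))."""
    return min(alpha + 1, max(1, math.ceil((n - alpha) / rho)))
```

A consistency check of this reading: exactly at the threshold n = α + β = α + (α + 1)ρ, the bound reaches the full primary fan-out α + 1.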

For fan-out bound δ = 1, the subtree root has at most one child, which yields a list of blocks. The keys inside each block are stored in ascending key order, and the blocks in the list have ascending priority values. That is, the first block contains those keys with the α smallest priorities, the second block contains, from the remaining n − α keys, those with the α smallest priorities, and so forth.

For fan-out bound δ ∈ [2, α + 1], we also store the keys with the α smallest priorities in the root. We call those keys with the δ − 1 smallest priority values the active separators of this block. The remaining n − α keys in the subtree are partitioned into δ sets of key values, using the active separators only. Each of these key sets is organized recursively, in a buffer, and the resulting trees are referenced as the children of the active separators.
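The recursive buffer layout, covering both the δ = 1 list case and the δ ≥ 2 partitioning case, can be sketched in memory as follows. All names are ours, the rounding in the fan-out rule is our assumption, and empty sections are simply omitted:

```python
import math

def build_buffer(items, alpha, rho):
    """Recursively lay out a buffer over (key, priority) pairs.
    The root keeps the alpha smallest priorities; the delta-1 smallest of
    those act as active separators partitioning the remaining keys."""
    if not items:
        return None
    n = len(items)
    delta = min(alpha + 1, max(1, math.ceil((n - alpha) / rho)))  # fan-out bound
    by_prio = sorted(items, key=lambda kp: kp[1])
    node = {"keys": sorted(k for k, _ in by_prio[:alpha]), "children": []}
    rest = by_prio[alpha:]
    if delta == 1:
        # list case: the next block again takes the alpha smallest priorities
        child = build_buffer(rest, alpha, rho)
        if child:
            node["children"].append(child)
    else:
        seps = sorted(k for k, _ in by_prio[: delta - 1])  # active separators
        bounds = [float("-inf")] + seps + [float("inf")]
        for lo, hi in zip(bounds, bounds[1:]):
            child = build_buffer([(k, p) for k, p in rest if lo < k < hi],
                                 alpha, rho)
            if child:
                node["children"].append(child)
    return node
```

Since the layout depends only on the key set and the priority order, rebuilding the same set always yields the same buffer, which is the UR property.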

This concludes the definition of our tree structure.

See the right part of Figure 1 for an example of the (α, ε)-RBST on the same permutation (y-coordinates) and 25 keys (x-coordinates), which consists of 11 blocks. (See [31] for Cartesian Tree representations.) The active separators are shown in black, non-active separators in gray, and the block nodes as rectangles.

Some remarks on the definition of primary and secondary RBST nodes are in order. The buffer is UR since the storage layout only depends on the priority order π of the keys, the number of keys in the buffer, and the order of the key values in it. Note that summing the stored weight values of the child pointers yields s_1 + … + s_δ + α = n, the weight of the subtree, without additional block reads. Moreover, whenever an update in the secondary structure brings the buffer’s root above the threshold, i.e. s_1 + … + s_{α+1} ≥ β, its root contains those keys with the α smallest priorities from the set and all its keys are active separators, i.e. the buffer root immediately provides both invariants of blocks from the primary tree.

Note that this definition of RBSTs not only naturally contains the randomized search trees of block size α, but also allows a smooth trade-off towards a simple list of blocks (respectively ε = ∞ and ε = 0). Though leaves are non-empty, it is possible that the tree shape defined by the random permutation π yields leaves with load as low as 1/α in RBSTs.

As stated at the beginning of this section, the definition above does not yield an efficient algorithm to actually maintain RBSTs. The remainder of this section specifies our algorithms for searching, dynamic insertion, and dynamic deletion in RBSTs. Our analysis is presented in Sections 3, 4, and 5.

Successor and Range Search in (α, ε)-RBSTs.

As with ordinary B-Trees, we determine, by binary search on the array, the successor of the search value q, i.e. the smallest active separator key in the block that has a value of at least q, and follow the respective child pointer. If the current search node is buffering, then we additionally check the set of non-active separators in the block (i.e. all keys) during the search to find the actual successor of q in X. The search descends until the list, i.e. block nodes with fan-out 1, in the respective RBST buffer is reached. Since the lists are sorted by priority, we check all keys, by scanning the 𝒪(1/ε) blocks of the list, to determine the successor of q. This point-search extends to range-queries in the ordinary way, by considering the two search paths of the range boundaries [q, q′].

To summarize, successor and range search occupies 𝒪(1) blocks in main memory, does not write blocks, and the number of block-reads is 𝒪(D), where D is the depth of the RBST.
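A sketch of the point-search over a simplified node layout (one sorted key array plus one child slot per gap; the buffering-specific steps, checking non-active separators and scanning the final list, are only indicated in the comments):

```python
import bisect

def successor(node, q):
    """Return the smallest key >= q in the subtree, or None.
    In a full (alpha, eps)-RBST one would additionally check the
    non-active separators of buffering nodes and scan the O(1/eps)
    blocks of the terminal list."""
    best = None
    while node is not None:
        keys = node["keys"]
        i = bisect.bisect_left(keys, q)   # binary search inside the block
        if i < len(keys):
            best = keys[i] if best is None else min(best, keys[i])
        node = node["children"].get(i)    # follow the child pointer for slot i
    return best
```

The search occupies one block at a time and performs no writes, matching the summary above.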

2.1 Insertion and Deletion via Partial-Rebuilds

As with Treaps, the tree shape of (α, ε)-RBSTs is solely determined by the permutation π of the keys. For X ∪ {x} = X′, any update method that transforms T_X(π) into T_X′(π), and vice-versa, is a suitable algorithm for insertions and deletions. Unlike Treaps, which use rotations to always perform insertions and deletions at the leaf level, rotations in block search trees seem challenging to realize and analyze. We propose an update algorithm that rebuilds subtrees in a top-down fashion, which has vague similarities to rebalancing Scapegoat Trees.

The main difficulty for RBST update algorithms is to achieve IO-bounds for EM-access that are (near) proportional to the structural change in the trees, while having an 𝒪(1) worst-case bound on the temporary blocks needed in main memory. The partial-rebuild algorithm we introduce in this section has bounds on the number of block-writes (Os) in terms of the structural change of the update; the expected Os and Is are analyzed in Section 5.

Observation 1.

Let π be a permutation on the key universe U and let X′ = X ∪ {x} be the keys in the tree after, and X before, the insertion of x. Let m be the number of blocks in T_X(π) that are different from the blocks in T_X′(π), and m′ the number of blocks in T_X′(π) that are different from the blocks in T_X(π). Let D and D′ be the depths of the subtrees in T_X(π) and T_X′(π) that contain those blocks.

The rebuild algorithm in this section performs 𝒪(m + m′) block-writes to EM, reads 𝒪(m′ + D′·m) blocks, and uses 𝒪(1) blocks of temporary MM storage during the update.

Our rebuild algorithm aims at minimizing the number of block-writes (Os) in an update, using a greedy top-down approach that builds the subtree of T_X′ in auxiliary EM storage while reading the keys of T_X from UR EM. On completion, we delete the m obsolete blocks from UR memory and write the m′ result blocks back to UR EM to obtain the final tree T_X′. This way, our update algorithm only requires 𝒪(1) blocks of temporary storage in MM. The basic idea is as follows. Given a designated separator-interval (ℓ(u), r(u)) when assembling block u for writing to auxiliary EM, we determine those keys with the α smallest priority values by searching in T_X, place them in the array of u, determine the fan-out δ_u and the active separators of block node u, and recursively construct the remaining subtrees for the keys within each section between active separators of u. Eventually, we free the obsolete blocks of T_X from UR memory and move the resulting subtree from the auxiliary to the UR memory.

Next we discuss all possible rebuild cases when inserting a new key x into the tree; their realization is discussed thereafter. For a block node v, let X(v) denote the set of keys stored in the subtree of v, B(v) ⊆ X(v) the keys stored in the array of v itself, p̄(v) = max{π(y) : y ∈ B(v)}, and p(v) = min{π(y) : y ∈ B(v)}⁶.

⁶ For example, the key y with π(y) = p(v) is the UR label of the block v. See also Figure 1 for p̄(v).

Consider the search path for x in T_X. Starting at the root, we stop at the first node v that i) must store x in its array (i.e. p̄(parent(v)) < π(x) ≤ p̄(v)) or ii) requires rebuilding (i.e. must increase its fan-out). To check for case ii), we use the augmented child references of the buffering node v to determine the number of keys in the resulting subtree of v. Thus, both fan-outs, δ_v pre-insertion and δ′_v post-insertion, are known from the stored subtree weights, without additional block reads.

If neither i) nor ii) occurs, the search halts in a buffer list (δ′_v = δ_v = 1) and we simply insert the key and rewrite all subsequent blocks of the list in one linear pass. Note that, if the list is omitted (as there are no keys in the active separator interval), the insertion must simply allocate a new block that solely stores x to generate the output RBST. It remains to specify the insertion algorithm for cases i) and ii). See Figure 2 for an overview of all cases.

Figure 2: Rebuild cases (blue) for insertion of key x (red) in block v of an RBST.

In ii), v is a buffer node that must increase its fan-out. It suffices to build two trees from one subtree, each containing the top-α keys in their root. If the new key x is contained in this interval, we pass x as an additional “carry” parameter to the procedure. If x is not contained in this interval, the insertion continues on the respective child of v that must contain x. (See Figure 2, bottom.) Note that at most 3 subtrees of v are built in auxiliary EM, regardless of α.

In i), v is an internal node that stores α keys and there are four cases, depending on whether x is an active separator in v and whether the fan-out stays or increases (see Figure 2). In all cases, it suffices to rebuild at most three subtrees: two trees for the two separator-intervals that are incident to x, and one tree that contains the two subtrees of the separator-intervals incident to the key y that is displaced from block B(v) by the insertion of x, i.e. π(y) = p̄(v).

To complete the insertion algorithm, it remains to specify our top-down greedy rebuild procedure for a given interval between two active separators. First, we isolate the task of finding the relevant keys, needed for writing the next block to auxiliary EM, within the following range-search function top(k, v, (ℓ, r)). Given a subtree root v in T_X, a positive integer k ≤ α, and a query range (ℓ, r), we compute those keys from Y = X(v) ∩ (ℓ, r) that have the k smallest priority values, together with their respective new subtree weights w_0 + … + w_k = |Y| − k, based on a naïve search that visits (in ascending key order) all descendants of v that overlap with the query range.

For our k = α and k = α − 1 rebuilds, we run the respective top-query on the designated interval range and check if x is stored in the output block or beneath it. Then we determine the fan-out from the range-count results {w_i}, which determines the active separators in the output block. We allocate one block in auxiliary EM for each non-empty child-reference and set their parent pointers. Finally, we recursively rebuild the subtrees for the intervals of the active sections, passing them the reference to the subtree root in T_X that holds all necessary keys for rebuilding the respective subtree. Note that, for this form of recursion, using parent pointers allows implementing the rebuild procedure such that a constant number of temporary blocks of MM storage suffices. After the subtrees are built in the auxiliary EM, we delete the m old blocks from UR memory and move the m′ new blocks to UR memory to obtain the tree T_X′.
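A simplified, in-memory sketch of the top routine (the real procedure streams blocks from EM in ascending key order and reports the per-section weights w_i; here we return only the total remainder count, and all names are ours):

```python
def top(k, node, lo, hi):
    """Among the keys in node's subtree that lie in the open range (lo, hi),
    return the k of smallest priority, plus the count of remaining keys.
    Simplified sketch: visits every descendant that could overlap."""
    found = []   # (priority, key) pairs inside the range

    def visit(v):
        if v is None:
            return
        for key, prio in v["items"]:          # items stored in this block
            if lo < key < hi:
                found.append((prio, key))
        for child in v["children"]:
            visit(child)

    visit(node)
    found.sort()                              # ascending priority
    topk = [key for _, key in found[:k]]
    return topk, len(found) - len(topk)       # keys for the new block + rest
```

The returned top-k keys fill the array of the block being assembled, and the remainder count stands in for the weights that determine the fan-out of the new block.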

Note that deleting has the same cases for rebuilding sections of keys between active separators. Moreover, if a deletion leads to a buffering block (δ′_v < α + 1), then all its children are buffering and thus store their subtree weight together with the reference to it. Thus, determining the fan-out for deletions requires no additional block reads.

Clearly, the number of block-writes is 𝒪(m + m′), and the total number of block-reads of all top-calls is 𝒪(D′·m), by a charging argument: any one block u of the m blocks in the old tree is only read by a top call if the key range (ℓ(u), r(u)) of u intersects the query range (ℓ, r) of the top call, i.e. (ℓ, r) ∩ (ℓ(u), r(u)) ≠ ∅. Consequently, (ℓ, r) contains the smallest key, the largest key, or all keys of B(u). The number of such reads that is charged to a block u (from the m blocks) is bounded by D′, since those blocks from the output tree that issue the calls to top have nested intervals (ℓ_1, r_1) ⊇ (ℓ_2, r_2) ⊇ ⋯ that are contained in each other, by the search tree property (of the m′ result blocks).

Our bounds for Observation 1 in terms of writes (Os) and reads (Is) will follow from our analysis of the depth and size of RBSTs in the next sections.

3 Bounds for Depth

Since successful tree searches terminate earlier, it suffices to analyze the query cost for keys q ∉ X, i.e. unsuccessful searches. In this section, we bound the expected block-reads when searching for some fixed q in the block trees T_X, where the randomness is over the permutations π ∈ Perm(n), i.e. the set of bijections π : X → {1, …, n}.

Consider the sequence v_1, v_2, … of blocks of an RBST on the search path from the root to some fixed q. Since the subtree weight is a (strictly) decreasing function along the path, the fan-out δ of the nodes is a non-increasing function that eventually assumes the value 1. From the definition of δ, there are (in the worst case) at most 𝒪(1/ε) blocks on the search path with δ = 1. Thus, it suffices to bound the expected number of internal blocks with δ ≥ 2, which all have the search tree property [ℓ(v_i), r(v_i)] ⊇ (ℓ(v_{i+1}), r(v_{i+1})) ∋ q.

Lemma 2.

Let X be a set of n keys, q ∉ X, and α ≥ 2 an integer. The expected number of primary tree nodes (δ = α + 1) in an (α, ε)-RBST on X that contain q in their interval is 𝒪(log_α n). The expected number of secondary tree nodes (δ < α + 1) that have q in their interval is 𝒪(1/ε).

In particular, the expected number of blocks on the search path of q is 𝒪(ε^{-1} + log_α n).

Further, the depth of an (α, ε)-RBST on X is with high probability 𝒪(ε^{-1} + log n).

The block structure of RBSTs does not allow a straightforward extension of the Treap analysis based on “Ancestor-Events” [27], or of the “Backward-Analysis” of point-location cost in geometric search structures [15, Chapter 6.4], to obtain bounds on block-reads (Is) that are stronger than the high-probability 𝒪(log n) depth bound of ordinary Treaps. Our proof technique is vaguely inspired by a top-down analysis technique for randomized algorithms for static problems that considers the “average conflict list” size. However, several new ideas are needed to analyze a dynamic setting where the location q is user-specified and not randomly selected by the algorithm. Our proof uses a partition into layers whose sizes increase exponentially with factor α, and a bidirectional counting process of the keys in a layer of that partition.

Proof.

Recall that there are in the worst case 𝒪(1/ε) nodes with fan-out δ≤1 on the search path of q. Consider the active separators {x1,…,xd} of the RBST blocks that contain q in their search-interval. This set consists of two sequences {xi+} and {xi−} of strictly decreasing and increasing key values, respectively. Since their priority values are strictly increasing, i.e. π(x1)<π(x2)<⋯<π(xd), we have from the known bounds of ordinary Treaps that d=𝒪(logn) with high probability [27, Lem. 4.8]. Thus, the RBST depth is with high probability less than 𝒪(ε1+logn).
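The first step above — that the active separators form a priority-increasing sequence whose expected length is H(n)=Θ(ln n) — can be checked exhaustively for small n. The sketch below is our own illustration (function names and the choice n=5 are ours, not the paper's); it places q above all keys, so the active separators are exactly the prefix minima of the priorities read in descending key order:

```python
from fractions import Fraction
from itertools import permutations

def active_separators(prio):
    # Count the prefix minima of prio read in descending key order;
    # for a query q larger than every key, these are exactly the
    # separators x with pi(x) < pi(y) for all keys y between x and q.
    count, best = 0, None
    for p in reversed(prio):
        if best is None or p < best:
            count, best = count + 1, p
    return count

n = 5
total = sum(active_separators(pi) for pi in permutations(range(n)))
avg = Fraction(total, 120)  # 5! = 120 equally likely permutations
harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
assert avg == harmonic  # exact: E[d] = H(5) = 137/60
```

The exact match with the harmonic number mirrors the classical Treap record-counting argument cited from [27].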

To show our sharper search bound, let random variable Dq be the number of primary tree nodes v that contain q in their interval [ℓ(v),r(v)]. Partition {1,…,n} with intervals of the form [αi,αi+1) for indices 0≤i≤logαn. Each permutation π induces an assignment of every key x∈X to a unique layer index, i.e. the i with π(x)∈[αi,αi+1). For an internal node v, let p(v)=min{π(x):x is stored in v} and p̄(v) be the maximum over this set. Note that p(v)+α−1 ≤ p̄(v), since every internal node stores exactly α keys. Thus, Dq counts nodes v that either have both p(v),p̄(v)∈[αi,αi+1) for some i, or have that p̄(v) lies in a layer with larger index than p(v). We bound the expected number of nodes of the first kind, since there are in the worst case at most logαn of the second kind. Defining for every layer index i a random variable Vi that counts the tree nodes in that layer

Vi(π)=|{v∈TX(π) : ℓ(v)<q<r(v) and p(v),p̄(v)∈[αi,αi+1)}|,

we have that Dqlogαn+V, where V=i0Vi is the total number of blocks of the first kind. The remainder of the proof shows bounds for Vi.

Let Ki={x∈X:π(x)<αi+1} for each index i, Ki−={x∈Ki:x<q}, and Ki+=Ki∖Ki−. Thus, Ki⊆Ki+1 and |Ki|=αi+1−1. Define Yi− to be the number of consecutive keys from Ki− that are less than q but not contained in Ki−1. Analogously, random variable Yi+ counts those larger than q, and Yi:=Yi−+Yi+ is the number of consecutive keys of Ki whose range contains q but that do not contain elements from Ki−1. Since all separators of primary nodes are active and internal nodes store exactly α keys, we have Vi(π) ≤ Yi(π)/α for every π∈Perm(n).

Next we bound 𝔼[Yi] based on a sequence of binary events: Starting at q, we consider the elements x∈Ki− in descending key-order and count a “success” if x∈Ki∖Ki−1, until one “failure” (x∈Ki−1) is observed. If Ki− has no more elements, the experiment continues on Ki+ in ascending key-order. Defining Zi as the number of successes on termination, we have Yi(π) ≤ Zi(π). The probability to obtain a failure, after observing j successes, is |Ki−1|/(|Ki|−j), which is at least pi:=|Ki−1|/|Ki| for all j≥0 and i>0.

Hence Pr[Zi≥j] ≤ Pr[Z′i≥j+1], where random variable Z′i∼NB(1,pi) is the number of trials to obtain one failure in a sequence of independent Bernoulli trials with failure probability pi. Since 𝔼[Z′i]=1/pi, we have 𝔼[Yi] ≤ 𝔼[Zi] < 𝔼[Z′i] = 1/pi = (αi+1−1)/(αi−1) < α/(1−1/αi).
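The mean of the dominating distribution, 𝔼[Z′i]=1/pi, can be confirmed numerically. This is our own standalone check with an illustrative failure probability; the series is truncated where the tail is negligible:

```python
def expected_trials(p, max_j=2000):
    # E[Z'] for Z' ~ NB(1, p): Pr[Z' = j] = (1 - p)^(j - 1) * p is the
    # probability that the first failure occurs at trial j.
    return sum(j * (1 - p) ** (j - 1) * p for j in range(1, max_j))

p = 0.25
est = expected_trials(p)
assert abs(est - 1 / p) < 1e-9  # E[Z'] = 1/p = 4
```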

We thus have 𝔼[Vi] ≤ 𝔼[Yi]/α < (2/α)·α/(1−1/αi) ≤ 4 for all α≥2, which shows the lemma's statement for the primary tree nodes.

Since there are in the worst case 𝒪(1/ε) nodes with fan-out δ≤1 on a search path, it remains to bound the expected number of secondary nodes with a fan-out in [2,α] on a search path. If there are any such nodes to count, consider the top-most buffering node v and let n′:=|X(v)| be its subtree weight. For any fixed integer n′, random variables regarding the subtree only depend on the relative order of the n′ keys, which is uniform over Perm(n′). Since the expected number of keys in each of the δ(n′) sections is (n′−α)/δ(n′), the expected number of keys in the section of q has an upper bound of the form (n′−α)/δ(n′)=𝒪(α/ε). Since this bound holds for every n′, it holds unconditionally. The lemma's 𝒪(1/ε) expectation bound follows, since all secondary nodes (with δ≥2) on a search path store exactly α keys; thus the expected number of such nodes is 𝒪(1/ε).

Next, we show our depth bound that improves on the simpler 𝒪(ε1+lnn) bound above. We first give a tail-estimate for the number of primary nodes and then an estimate for the number of secondary nodes on a search-path to the fixed query point q.

Lemma 3.

Let α2. The random variable Dq is with high probability 𝒪(logαn).

We will tighten the above analysis by considering the relative priority orders of the Yi keys that are counted in layer i, where i ≤ r=𝒪(logαn). Indeed, even if random variable Yi exceeds its mean Θ(α) by a large factor, the blocks of a search path to q can only contain all Yi keys if their relative priority-order also has a very specific, deep block structure. This observation allows us to obtain a much stronger tail bound.

Proof.

Let Ŷi be the number of keys from Ki∖Ki−1 that are stored in the block of a node on the search path to q, i.e. Ŷi ≤ Yi. We will show for the random variables Ŷi that Pr[Ŷi=cα]<(e2/c)cα to obtain a tail bound that is strong enough for a union bound over all n+1 possible queries q∉X. Recall from Stirling's formula that 1 < n!/(√(2πn)·(n/e)n) < e1/(12n) for all n≥1.

To observe cα keys from layer i stored in c blocks of a search path, the α keys stored in the last block must all have priorities larger than the first (c−1)α keys, the α keys in the second to last block must all have priorities larger than the first (c−2)α keys, and so forth. Further, each of the α! relative orders of any one such batch of α keys does not affect them being stored in the block of a search path. Thus the probability that cα keys have relative priorities to form c blocks of a search path is at most

(α!)c(cα)!<(e112α2πα(α/e)α)c(cα/e)cα=ec12α(2πα)c2(α/ecα/e)cα=ec12α(2παcα)c<ec12α(ec)αc,

where the last inequality is due to 2πα<eα and eαcα(e/c)ααeα1 being true for all α2. Thus Pr[Y^i=cα]<(e2/c)cα.

For c ≥ e3, Pr[Ŷi=cα] < (1/eα)c =: zc, and we have

Pr[Y^icα]jc0k<αPr[Y^i=jα+k]αjcPr[Y^i=jα]αjczj=αzc1z<2α/eαc.

From the layer partition, we have that the outcome of Yi is independent of Yi+1, i.e. bounds for Yi are with respect to |Ki| and regardless of the actual set Ki, and all relative priority-orders of the keys in Ki∖Ki−1 are equally likely. Since across all r layers the probability bound for the sum is maximized when all r factors have equal thresholds, we have

Pr[Y^1++Y^rcαr]i=1rPr[Y^icα]<(2αeαc)r2αn1+1/log2αncα/lnαn3ncα/lnα<n3c,

where the third-to-last inequality uses that logα(n+1) ≤ r ≤ logα(n)+1, the second-to-last inequality uses that α≥2 and 2α≤n, and the last inequality uses that α/lnα>1.

Lemma 4.

The depth of a buffering subtree is with high probability 𝒪(ε1+ε1logn). If α=polylog(n), then the depth of a buffering subtree is with high probability 𝒪(ε1+logαn).

We believe that Lemma 4 cannot be improved substantially without redesigning the secondary structures, update algorithms, and proofs. Specifically, parameter constellations with small δ(n)=ω(1) provide a major obstacle in obtaining better bounds.

Proof.

For the number of secondary tree nodes, which have δv∈[2,α] and store exactly α keys, we observe that having d such nodes on a search path requires that the topmost node has one child with weight N1 ≥ (d−1)α+ρ, the second topmost node must have one child with weight N2 ≥ (d−2)α+ρ, and so forth, where the RBST parameter ρ=Θ(α/ε). Thus, at least d/4 nodes must have weight Ni ≥ (d/2)α+ρ, and consequently a fan-out value of at least δ((d/2)α+ρ)=Ω(εd). Using our Pr[δ(Ni)≥k] ≤ e2−k bound of Corollary 17 on the fan-out of a child, we obtain that the probability of having d such nodes on a search path is at most the product of exp(2−δ(Ni)) over these d/4 nodes, which is e−Ω(εd²), i.e. a high-probability bound for all d=ω(1).

Thus, the number of secondary nodes (δv≥2) is at most d=O(logαn) with probability at least 1−nΩ(εln(n)/(lnα)2), and d=O(ε1lnn) with probability at least 1−nΩ(1).

4 Bounds for Size using a Top-Down Analysis

Our analysis will frequently use the following characterization of the partitions of a set X of n keys that are induced by the α elements of the smallest priority values from the set.

Observation 5.

There is a bijective mapping between the partitions of n keys induced by the first α keys from π∈Perm(n) and the solutions to the equation X1+X2+⋯+Xα+1=n−α, where the variables Xi are non-negative integers. This bijection implies that all solutions of the equation occur with the same probability.

In other words, Xi is the number of keys in the i-th section beneath the root of TX, where 1 ≤ i ≤ α+1. Thus, Xi can be considered as the number of consecutive keys from X that lie between the (i−1)-th and the i-th key stored in the root (with X1 counting the keys before the first, and Xα+1 those after the last root key). For example, in an RBST on the keys {1,…,10} and block size α=3, a root block consisting of the key values 4, 7, 8 is characterized by the assignment X1=3, X2=2, X3=0, X4=2. Next we analyze the effect of our secondary structures, since the size of RBSTs without buffer (β=0) can be dramatically large. E.g. the expected number of blocks 𝔼[S] for subtrees of size n=2α is

𝔼[S] =1+Pr[X1>0]++Pr[Xα+1>0]
=1+(α+1)(1(n1α1)/(nα))=1+(α+1)(1α/n)=1+(α+1)/2.

In this example, buffering stores the whole subtree using only one additional block.
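Observation 5's bijection can be verified exhaustively for small parameters. The helper below is our own sketch (names and test values are ours); it maps a root block to its section sizes and recovers the example above, composition (3,2,0,2) for the root block {4,7,8}:

```python
from itertools import combinations
from math import comb

def sections(block, n):
    # Section sizes (X_1, ..., X_{alpha+1}) induced by a sorted root
    # block of separator positions in {1, ..., n}.
    b = (0,) + tuple(block) + (n + 1,)
    return tuple(b[i + 1] - b[i] - 1 for i in range(len(b) - 1))

n, alpha = 10, 3
assert sections((4, 7, 8), n) == (3, 2, 0, 2)  # the example in the text

comps = {sections(c, n) for c in combinations(range(1, n + 1), alpha)}
# All C(n, alpha) root blocks yield pairwise distinct solutions of
# X_1 + ... + X_{alpha+1} = n - alpha, i.e. the mapping is a bijection.
assert len(comps) == comb(n, alpha)
assert all(sum(x) == n - alpha for x in comps)
```

In particular, uniformity of the root block (the α smallest priorities form a uniform α-subset) transfers to uniformity over the compositions, as the observation states.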

4.1 Buffering for UR trees with load-factor 1−ε

Our top-down analysis of the expected size, and thus load-factors, of (α,ε)-RBSTs frequently uses the following counting and index exchange arguments.

Lemma 6 (Exchange).

We have Pr[Xi<c]=Pr[Xj<c] for all i,j{1,,α+1}.

Proof.

Due to Observation 5, each solution to X1+⋯+Xα+1=n−α occurs with the same probability. Hence, we can calculate Pr[Xi≥c] by counting the number of solutions where Xi≥c and dividing it by the total number of solutions.

Thus, Pr[Xi<c]=1−Pr[Xi≥c]=1−(ncα)/(nα)=1−Pr[Xj≥c]=Pr[Xj<c].

Next, we show our size bound for an (α,ε)-RBST with n keys, where the priorities are from a uniform distribution over the permutations in Perm(n). Our analysis crucially relies on the basic fact that restricting Perm(n) to an arbitrary key subset of cardinality n′<n yields the uniform distribution over Perm(n′). Random variable Sn denotes the space, i.e. the number of blocks used by the RBST on a set of n keys, Fn denotes the number of full blocks, and En:=Sn−Fn the number of non-full blocks. Next we show a probability bound for the event that a given section, say the k-th, contains a number of keys Xk in a certain range.

Lemma 7.

For any 1kα+1 and i<j, we have x=ijPr[Xk=x]=(niα)(n1jα)(nα).

Proof.

We have Pr[Xk≥i]=(niα)/(nα) and Pr[Xk≥j+1]=(n(j+1)α)/(nα).

These basic facts allow us to compute the following expressions.

Lemma 8.

We have x=1mPr[Xk=x]x=((nα+1)(nmα+1)m(n1mα))/(nα).

Proof.

By elementary computation, we have

x=1mPr[Xk=x]x =x=1mPr[Xk=x]+x=2mPr[Xk=x]++x=mmPr[Xk=x] (1)
=(n1α)(n1mα)(nα)++(nmα)(n1mα)(nα)
=i=1m(niα)(nα)m(n1mα)(nα).

Note that an RBST on n′=n+1 keys with block size α′=α+1 has Pr[X′1=i]=(niα)/(n+1α+1). Thus, we have from Lemma 7 that

i=1mPr[X1=i]=i=1m(niα)(n+1α+1)=((n+1)1α+1)((n+1)1mα+1)(n+1α+1) (2)
i=1m(niα)=(nα+1)(nmα+1)

and the lemma follows by using (2) for the last summation in (1). We are now ready to state our size bound.
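As a sanity check, the closed form of Lemma 8 can be compared against direct summation for small parameters. This is our own cross-check (test triples are arbitrary), using Pr[Xk=x]=C(n−1−x,α−1)/C(n,α), which follows from Observation 5 by counting:

```python
from math import comb

def lhs(n, a, m):
    # direct summation: sum_{x=1}^{m} x * Pr[X_k = x]
    return sum(x * comb(n - 1 - x, a - 1) for x in range(1, m + 1)) / comb(n, a)

def rhs(n, a, m):
    # closed form of Lemma 8:
    # (C(n, a+1) - C(n-m, a+1) - m * C(n-1-m, a)) / C(n, a)
    return (comb(n, a + 1) - comb(n - m, a + 1)
            - m * comb(n - 1 - m, a)) / comb(n, a)

for n, a, m in [(10, 3, 4), (12, 4, 5), (9, 2, 3)]:
    assert abs(lhs(n, a, m) - rhs(n, a, m)) < 1e-12
```

For instance, n=10, α=3, m=4 gives 155/120 on both sides.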

Theorem 9 (Size).

The expected number of non-full blocks in an n-key (α,ε)-RBST satisfies 𝔼[En] ≤ max{εn/α, 1}.

Our proof is by induction on n, where the base cases are due to the number of non-full blocks that are occupied by the secondary buffer structures (Appendix A).

Proof.

Observation 18 states that En ≤ 1 for all n with δ(n)=1. Moreover, Theorem 21 shows that 𝔼[En] ≤ 27δ(n) for n ≤ β+α=(α+1)ρ+α. (Recall that β=(α+1)ρ.) Next, we set ρ sufficiently large for the following proof. For simplicity, we will use t and γ instead of ρ+α and (α+1)ρ+α, respectively. We can conclude from Theorem 21 that, for t<n≤γ,

𝔼[En] 27nαρ=27nα108α/ε27(nα108α/ε+1)272nα108α/ε=nα2α/ε.

The last inequality holds because n>ρ+α=108α/ε+α and thus (n−α)/(108α/ε)>1.

For n>γ, we prove the theorem by induction. For each index i{1,,α+1}, define Yi as the number of non-full blocks in the i-th section beneath the root (of an RBST with n keys). Note that Yi only depends on Xi, i.e. the number of keys in the i-th section, and their relative priorities. We thus have

𝔼[Yi] =xi=1nαPr[Xi=xi]𝔼[Exi]
xi=1tPr[Xi=xi]1+xi=t+1γPr[Xi=xi]xi2α/ε+xi=γ+1nαPr[Xi=xi]xiα/ε
=(xi=1tPr[Xi=xi]+xi=t+1γPr[Xi=xi])xi=t+1γPr[Xi=xi]
+(xi=t+1γPr[Xi=xi]xi2α/εxi=t+1γPr[Xi=xi]xiα/ε)2xi=1tPr[Xi=xi]xi2α/ε
+(xi=1tPr[Xi=xi]xiα/ε+xi=t+1γPr[Xi=xi]xiα/ε+xi=γ+1nαPr[Xi=xi]xiα/ε)
=(xi=1γPr[Xi=xi]xi=t+1γPr[Xi=xi]xi2α/εxi=1tPr[Xi=xi]xi2α/ε)
+xi=1nαPr[Xi=xi]xiα/εxi=1tPr[Xi=xi]xi2α/εxi=t+1γPr[Xi=xi]
xi=1γPr[Xi=xi](1xi2α/ε)+𝔼[Xi]α/ε

Using Lemma 6, we have that the summation term has equal value for each section i∈{1,…,α+1}. Since, for n≥α, the root is full and not counted in En, we have

𝔼[En]=𝔼[Y1++Yα+1] (1+α)x1=1γPr[X1=x1](1x12α/ε)+nαα/ε. (3)

To prove the inequality, it suffices to show that the right-hand side of (3) is at most εn/α. This is true if and only if (1+α)x1=1γPr[X1=x1](1εx12α)ε. Since ε>0 and α+1>0, it suffices to show that the value of the sum is not positive. This holds if and only if

x1=1γPr[X1=x1]ε2αx1=1γPr[X1=x1]x1. (4)

Using Lemma 7 on the left-hand and Lemma 8 on the right-hand side, it remains to show

2αε(n1α)2αε(n1γα) (nα+1)(nγα+1)γ(n1γα). (5)

By elementary computation, this inequality holds for all n>γ. (See Appendix B.)

Since each full block contains α distinct keys, i.e. Fn ≤ n/α, we have shown that 𝔼[Sn] ≤ (1+ε)n/α. Thus, for the load-factor, i.e. the relative utilization of the allocated blocks, we obtain:

Corollary 10 (Load-Factor).

Let ε∈(0,1/2] and n≥α+β. The expected number of blocks in an n-key (α,ε)-RBST is at most (1+ε)n/α, i.e. the expected load-factor is at least 1−ε.

Proof.

The load-factor is L=n/(α·Sn). Since ϕ(x)=1/x is a convex function, Jensen's inequality gives 1/𝔼[X] ≤ 𝔼[1/X]. Using Thm. 9 yields 𝔼[L] ≥ (n/α)·1/((1+ε)n/α) = 1/(1+ε) ≥ 1−ε.
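The Jensen step is the only inequality used here; a minimal numeric illustration on a two-point distribution (our own example, with arbitrarily chosen values):

```python
from fractions import Fraction

# phi(x) = 1/x is convex, so 1/E[X] <= E[1/X] by Jensen's inequality.
dist = [(Fraction(1, 2), Fraction(2)), (Fraction(1, 2), Fraction(6))]
EX = sum(p * v for p, v in dist)     # E[X]   = 4
E_inv = sum(p / v for p, v in dist)  # E[1/X] = 1/4 + 1/12 = 1/3
assert 1 / EX <= E_inv               # 1/4 <= 1/3
```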

Combining Lemma 2 and Theorem 9, we obtain the following bound for range searches.

Corollary 11 (Range-Search).

Let ε(0,1/2]. The expected number of block-reads for range reporting in (α,ε)-RBSTs is 𝒪(ε1+k/α+logαn), where k is the number of results.

5 Bounds for Dynamic Updates via Partial-Rebuilding

The weight-proportional fan-out in the design of our secondary UR tree structure has the advantage that it avoids additional restructuring work whenever a subtree must change its state between buffering and non-buffering due to updates (cf. Section 2.1). This, together with the space bound from the last section and the observation that the partial-rebuild algorithm for updating RBSTs rebuilds at most three subtrees beneath the affected node, regardless of its fan-out, allows us to show our bound on the expected structural change.

Using the counting technique from Lemma 2, it is easy to show that 𝔼|X(v)| is at most 2ni=1nαni=𝒪(αlnn). The proof of the theorem will use the following, stronger bound.

Lemma 12 (Subtree Weight).

Let x be a key and random variable |X(v)| denote the subtree weight of the node v that stores x∈B(v). Then ∑N=α+βn N·Pr[|X(v)|=N]=𝒪(αlogαn).

Proof.

Let random variable T⊆X be the largest set of consecutive keys such that x∈T and π(x)=minx′∈T π(x′). Note that T⊆X(v)⊆X, and T=X(v) iff π(x)=p(v). Recall from binary Treaps that 𝔼|T|=Θ(lnn) for any x∈X, i.e. H(n) ≤ 𝔼|T| ≤ 2H(n/2), where H(m)=1/1+⋯+1/m. We will show an upper bound of 𝒪(α/logα)·𝔼|T| by arguing for the case |X(v)|=N. Indeed, if 𝔼[|T| | |X(v)|=N]=N·Ω(log(α)/α) holds for given N≥α+β, then

N=α+βnPr[|X(v)|=N]N= N=α+βnPr[|X(v)|=N]𝔼[|T|||X(v)|=N]/Ω(log(α)/α)
=𝒪(αlogα) N=α+βnPr[|X(v)|=N]𝔼[|T|||X(v)|=N]𝒪(αlogα)𝔼[T].

Observe that x partitions the N keys in X(v) into N− keys smaller than x and N+ keys larger than x, where N−+1+N+=N. Though the priority orders of a set X(v) of N keys are restricted to those that have x∈B(v), we have that all relative orders of the keys in B(v) remain equally likely. Consider the case that B(v)={xi−,…,x1−,x,x1+,…,xα−1−i+} contains exactly i many separators smaller than x for some given integer i∈[0,α), i.e. there are α−1−i separators larger than x. Let the random variables Xi−,…,X0−⊆X(v) denote the key sets of the respective sections smaller than x and X0+,X1+,… for the sections larger than x. That is, we have ∑j(1+|Xj−|)=N−+1 and ∑j(1+|Xj+|)=N++1 regardless of the priority orders. We consider the relative π-orders of those i+1 separators {xi−,…,x1−}∪{x} in B(v) and use the fact (regarding ancestors in binary Treaps) that the entire keys of the j-th section are in Xj−⊆T if π(x)=min{π(x),π(xj−),…,π(x1−)}, which has probability 1/(j+1). Now, using that 𝔼[|Xj−| | N,N−,i]=𝔼[|X0−| | N,N−,i]=(N−−i)/(i+1) ≥ 0 for all j, we have

𝔼[|T||N,N,i]1=𝔼[T|N,N,i]+𝔼[T+|N,N,i]
=j=0i1i+1(1+𝔼[|Xj||N,N,i])+j=0αi1αi(1+𝔼[|Xj+||N,N,i])
=N+1i+1j=0i1j+1+N++1αij=0αi11j+1=(N+1)H(i+1)i+1+(N++1)H(αi)αi.

Since H(i+1)/(i+1) ≥ H(α)/α, we have that 𝔼[|T| | N,i]=N·Ω(lnα/α), and thus 𝔼[|T| | N]=N·Ω(lnα/α).

Theorem 13.

The expected number of writes (Os) of an update in an n-key (α,ε)-RBST is 𝒪(ε1+logα(n)/α). Updates have 𝒪(ε2+(1+ε1+lognα)logαn) expected reads (Is).

Recall that our algorithm reads blocks only from one search path and during the top-queries. To bound 𝔼[m+m′] and 𝔼[Dm] from Obs. 1, we first analyze the writes (Os). This extends to the reads (Is), using a high-probability bound for depth (e.g. Lemma 2, or Lemmas 3 and 4), since both random variables are in the worst case 𝒪(n).

Proof.

We show our bound on the expected total structural change for deleting a key x from the RBST. (Inserting the key would count the same number of blocks.) Let X and X′=X∖{x} be the sets of keys before and after the deletion of x, respectively, where |X|=n. Let v be the subtree root that is subject to the partial-rebuild algorithm from Section 2.1. From Lemma 2, determining the search path of v takes expected 𝒪(ε1+logαn) reads (Is). Either v is a primary node with δv=α+1, or v is a buffering node with δv<α+1.

For δv≤1, we have in the worst case at most 𝒪(1/ε) blocks in the buffer's list.

For a primary node with δv=α+1, v contains x in its block B(v). From Lemma 12, we have that Nα+βnNPr[|X(v)|=N]=𝒪(αlogαn). Now, for any given integer N of keys in the subtree, we have from Theorem 9 that the expected number of blocks is 𝒪(N/α) and, from linearity of expectation, that three of the α+1 subtrees contain at most 𝒪(N/α2) blocks. Thus, the contribution towards the expected number of block-writes is at most Nα+βPr[|X(v)|=N]N𝒪(1/α2)=𝒪(logα(n)/α). Moreover, using the high-probability bound of Lemma 2, we have that the depth of the subtree of v is less than 𝒪(ε1+lnn) for any given integer N=|X(v)|n. Thus, the contribution towards the total number of block-reads, issued by the top-queries of the update operation (cf. Observation 1), is at most Nα+βPr[|X(v)|=N]N(ε1+lnn)𝒪(1/α2)=𝒪(ε1logα(n)/α+lnnαlogαn).

For a buffering node with δv≥2, v has |X(v)|=Θ(αδv/ε). By Theorem 21, the subtree rooted at v has expected 𝒪(δv/ε) blocks, and writing three of its δv subtrees has 𝔼[m]=𝒪(1/ε) expected block-writes. For the block-reads, consider the number of keys in some active section Xi of v. If Xi=Θ(cα/ε) for some c≥2, then the worst-case depth of this child is 𝒪(c/ε) and, by Lemma 14, the probability for this event is at most 𝒪(e−c). Thus, ∑c=2δv Pr[δ(Xi)=c]·𝒪(c/ε)·𝔼[m|δ(Xi)=c] = ∑c=2δv c2e−c·𝒪(1/ε2) = 𝒪(ε−2)·∑c>1 c2/ec shows that the expected block-reads are 𝒪(ε−2) for either of the three affected subtrees of v.

References

  • [1] Arne Andersson and Thomas Ottmann. New Tight Bounds on Uniquely Represented Dictionaries. SIAM J. Comput., 24(5):1091–1101, 1995. doi:10.1137/S0097539792241102.
  • [2] Lars Arge. The buffer tree: A technique for designing batched external data structures. Algorithmica, 37(1):1–24, 2003. doi:10.1007/s00453-003-1021-x.
  • [3] Ricardo A. Baeza-Yates. The expected behaviour of b+-trees. Acta Informatica, 26(5):439–471, 1989. doi:10.1007/BF00289146.
  • [4] Rudolf Bayer and Edward M. McCreight. Organization and maintenance of large ordered indices. Acta Informatica, 1:173–189, 1972. doi:10.1007/BF00288683.
  • [5] Naama Ben-David, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, Charles McGuffey, and Julian Shun. Parallel algorithms for asymmetric read-write costs. In Proc. Symposium on Parallelism in Algorithms and Architectures (SPAA’16), pages 145–156, 2016. doi:10.1145/2935764.2935767.
  • [6] Michael A. Bender, Jonathan W. Berry, Rob Johnson, Thomas M. Kroeger, Samuel McCauley, Cynthia A. Phillips, Bertrand Simon, Shikha Singh, and David Zage. Anti-persistence on persistent storage: History-independent sparse tables and dictionaries. In Proc. 35th Symposium on Principles of Database Systems (PODS’16), pages 289–302. ACM, 2016. doi:10.1145/2902251.2902276.
  • [7] Michael A. Bender, Martin Farach-Colton, Jeremy T. Fineman, Yonatan R. Fogel, Bradley C. Kuszmaul, and Jelani Nelson. Cache-oblivious streaming b-trees. In Proc. Symposium on Parallelism in Algorithms and Architectures (SPAA’07), pages 81–92. ACM, 2007. doi:10.1145/1248377.1248393.
  • [8] Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, and Julian Shun. Sorting with asymmetric read and write costs. In Proc. Symposium on Parallelism in Algorithms and Architectures (SPAA’15), pages 1–12, 2015. doi:10.1145/2755573.2755604.
  • [9] Guy E. Blelloch and Daniel Golovin. Strongly History-Independent Hashing with Applications. In Proc. 48th Symposium on Foundations of Computer Science (FOCS’07), 2007. doi:10.1109/FOCS.2007.36.
  • [10] Guy E. Blelloch, Daniel Golovin, and Virginia Vassilevska. Uniquely Represented Data Structures for Computational Geometry. In Joachim Gudmundsson, editor, Proc. 11th Scandinavian Workshop on Algorithm Theory (SWAT’08), pages 17–28. Springer, 2008. doi:10.1007/978-3-540-69903-3_4.
  • [11] Guy E. Blelloch and Margaret Reid-Miller. Fast set operations using treaps. In Proc. 10th Symposium on Parallel Algorithms and Architectures (SPAA’98), pages 16–26, 1998. doi:10.1145/277651.277660.
  • [12] Milutin Brankovic, Nikola Grujic, André van Renssen, and Martin P. Seybold. A simple dynamization of trapezoidal point location in planar subdivisions. In Proc. 47th International Colloquium on Automata, Languages, and Programming (ICALP’20), pages 18:1–18:18, 2020. doi:10.4230/LIPIcs.ICALP.2020.18.
  • [13] Gerth Stølting Brodal and Rolf Fagerberg. Lower bounds for external memory dictionaries. In Proc. 14th Symposium on Discrete Algorithms (SODA’03), pages 546–554, 2003. URL: https://dl.acm.org/doi/10.5555/644108.644201.
  • [14] Douglas Comer. The ubiquitous b-tree. ACM Comput. Surv., 11(2):121–137, 1979. doi:10.1145/356770.356776.
  • [15] Mark de Berg, Otfried Cheong, Marc J. van Kreveld, and Mark H. Overmars. Computational Geometry: Algorithms and Applications, 3rd Edition. Springer, 2008. doi:10.1007/978-3-540-77974-2.
  • [16] James R. Driscoll, Neil Sarnak, Daniel Dominic Sleator, and Robert Endre Tarjan. Making data structures persistent. J. Comput. Syst. Sci., 38(1):86–124, 1989. doi:10.1016/0022-0000(89)90034-2.
  • [17] Amr Elmasry, Mostafa Kahla, Fady Ahdy, and Mahmoud Hashem. Red-black trees with constant update time. Acta Informatica, 56(5):391–404, 2019. doi:10.1007/s00236-019-00335-9.
  • [18] Eric Demaine et al. Lecture notes on persistent data structures. https://web.archive.org/web/20230601205149/http://courses.csail.mit.edu/6.851/spring21/scribe/lec1.pdf.
  • [19] Daniel Golovin. Uniquely represented data structures with applications to privacy. PhD thesis, Carnegie Mellon University, 2008.
  • [20] Daniel Golovin. B-Treaps: A Uniquely Represented Alternative to B-Trees. In Proc. 36th International Colloquium on Automata, Languages, and Programming (ICALP’09), pages 487–499, 2009. doi:10.1007/978-3-642-02927-1_41.
  • [21] Daniel Golovin. The b-skip-list: A simpler uniquely represented alternative to b-trees. CoRR, abs/1005.0662, 2010. arXiv:1005.0662.
  • [22] Dana Jansens. Persistent binary search trees. https://web.archive.org/web/20230526175144/https://cglab.ca/˜dana/pbst/, 2014. Accessed: 2023-05-26.
  • [23] Donald E. Knuth. The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley, 1973.
  • [24] Klaus Küspert. Storage Utilization in B*-Trees with a Generalized Overflow Technique. Acta Informatica, 19:35–55, 1983. doi:10.1007/BF00263927.
  • [25] Ketan Mulmuley. Computational Geometry: An Introduction Through Randomized Algorithms. Prentice Hall, 1994.
  • [26] Moni Naor and Vanessa Teague. Anti-presistence: history independent data structures. In Proc. 33rd Symposium on Theory of Computing (STOC’01), pages 492–501, 2001. doi:10.1145/380752.380844.
  • [27] Raimund Seidel and Cecilia R. Aragon. Randomized Search Trees. Algorithmica, 16(4/5):464–497, 1996. doi:10.1007/BF01940876.
  • [28] Lawrence Snyder. On uniquely represented data strauctures. In Proc. 18th Symposium on Foundations of Computer Science (SFCS’77), pages 142–146, 1977. doi:10.1109/SFCS.1977.22.
  • [29] Rajamani Sundar and Robert Endre Tarjan. Unique Binary-Search-Tree Representations and Equality Testing of Sets and Sequences. SIAM J. Comput., 23(1):24–44, 1994. doi:10.1137/S0097539790189733.
  • [30] Jeffrey Scott Vitter. External memory algorithms and data structures. ACM Comput. Surv., 33(2):209–271, 2001. doi:10.1145/384192.384193.
  • [31] Jean Vuillemin. A unifying look at data structures. Commun. ACM, 23(4):229–239, 1980. doi:10.1145/358841.358852.
  • [32] Andrew Chi-Chih Yao. On Random 2-3 Trees. Acta Informatica, 9:159–170, 1978. doi:10.1007/BF00289075.

Appendix A Induction Base Case: Expected Size of Buffering Trees

Lemma 14.

Let (X1,…,Xm) be uniformly distributed over the non-negative integral solutions of X1+X2+⋯+Xm=n. For integers n≥1 and t≤n, we have the tail-bounds

(1tn+1)m1Pr[Xit](1tn+m1)m1.

Proof.

Pr[Xit]=(nt+m1m1)(n+m1m1) =nt+m1n+m1nt+m2n+m2nt+m(m1)n+m(m1)
=(1tn+m1)(1tn+m2)(1tn+m(m1))

Since t/(n+m−1) ≤ t/(n+m−i) ≤ t/(n+1) for all 1≤i≤m−1, the lemma holds.
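For uniform compositions, the exact tail probability and the two bounds of Lemma 14 can be cross-checked numerically. This is our own sketch (the parameter triples are arbitrary test values):

```python
from math import comb

def tail(n, m, t):
    # Pr[X_i >= t] for (X_1, ..., X_m) uniform over the non-negative
    # solutions of X_1 + ... + X_m = n (stars and bars):
    # C(n - t + m - 1, m - 1) / C(n + m - 1, m - 1)
    return comb(n - t + m - 1, m - 1) / comb(n + m - 1, m - 1)

for n, m, t in [(20, 4, 5), (30, 6, 7), (15, 3, 10)]:
    p = tail(n, m, t)
    assert (1 - t / (n + 1)) ** (m - 1) <= p <= (1 - t / (n + m - 1)) ** (m - 1)
```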

Lemma 15.

For 0<c≤a≤b, we have (a−c)/(b−c) ≤ a/b.

Proof.

(a−c)/(b−c) ≤ a/b ⟺ b(a−c) ≤ a(b−c) ⟺ ab−bc ≤ ab−ac ⟺ bc ≥ ac ⟺ b ≥ a.

For a subtree of size n ≤ β+α=:(α+1)ρ+α, the fan-out parameter δ(n)=min{α+1, max{1, ⌈(n−α)/ρ⌉}} is equal to max{1, ⌈(n−α)/ρ⌉}. Therefore, for n with 2 ≤ δ(n) ≤ α+1, we have (δ(n)−1)ρ+α < n ≤ δ(n)ρ+α. Moreover, for k∈{2,3,…,δ(n)}, we have n>(k−1)ρ+α if and only if δ(n)≥k.
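The threshold characterization can be checked directly from the definition of δ. This is our own sketch; we read ⌈(n−α)/ρ⌉ for the flattened expression, and α=4, ρ=10 are illustrative values:

```python
from math import ceil

def delta(n, alpha, rho):
    # fan-out parameter: delta(n) = min(alpha + 1, max(1, ceil((n - alpha) / rho)))
    return min(alpha + 1, max(1, ceil((n - alpha) / rho)))

alpha, rho = 4, 10
for n in range(alpha, (alpha + 2) * rho + alpha):
    for k in range(2, alpha + 2):
        # n > (k - 1) * rho + alpha  iff  delta(n) >= k
        assert (n > (k - 1) * rho + alpha) == (delta(n, alpha, rho) >= k)
```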

Lemma 16.

For a subtree of size nβ+α, where δ(n)α+1, and all k{2,3,,δ(n)}, we have

Pr[δ(Xi)k]=Pr[Xi>(k1)ρ+α](1k1δ(n))δ(n)1,

where Xi is the number of keys in the i-th section beneath the root.

Proof.

Pr[Xi>(k1)ρ+α] =Pr[Xi(k1)ρ+α+1]

Set the variables m, t, and n of Lemma 14 to δ(n), (k−1)ρ+α+1, and n−α, respectively. We have

Pr[Xi(k1)ρ+α+1](1(k1)ρ+α+1nα+δ(n)1)δ(n)1.

Since n ≤ δ(n)ρ+α, we have

Pr[Xi>(k1)ρ+α](1(k1)ρ+α+1δ(n)ρ+αα+δ(n)1)δ(n)1

Using Lemma 15, we subtract δ(n)−1 from the numerator and denominator of ((k−1)ρ+α+1)/(δ(n)ρ+α−α+δ(n)−1). Thus

Pr[Xi>(k1)ρ+α](1(k1)ρ+α+1(δ(n)1)δ(n)ρ)δ(n)1.

Since δ(n) is at most α+1, we have that α+1−(δ(n)−1) is positive and conclude

Pr[Xi>(k1)ρ+α](1(k1)ρδ(n)ρ)δ(n)1=(1k1δ(n))δ(n)1,

as stated.

Combining Lemmas 16 and 15, we obtain the following corollary.

Corollary 17.

For 2kδ(n), we have

Pr[δ(Xi)k] =Pr[Xi>(k1)ρ+α](1k11δ(n)1)δ(n)1e(k2).
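The chain of estimates behind Corollary 17 can be verified numerically over a range of parameters (our own check; d plays the role of δ(n)):

```python
from math import exp

# (1 - (k-1)/d)^(d-1) <= (1 - (k-2)/(d-1))^(d-1) <= e^{-(k-2)} for 2 <= k <= d
for d in range(2, 200):
    for k in range(2, d + 1):
        a = (1 - (k - 1) / d) ** (d - 1)
        b = (1 - (k - 2) / (d - 1)) ** (d - 1)
        assert a <= b + 1e-12               # Lemma 15 step
        assert b <= exp(-(k - 2)) + 1e-12   # (1 - x)^m <= e^{-xm}
```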
Observation 18.

For a subtree with δ(n)=1, the data structure is a simple list and has at most one non-full block.

Lemma 19.

For a subtree with δ(n)=2, the expected number of non-full blocks satisfies 𝔼[En] ≤ 5.

Proof.

We will design an algorithm calculating an upper bound on the number of non-full blocks. The root has α keys, and the remaining keys are randomly split into two sections with X1 and X2 keys, i.e. X1+X2=n−α. For i∈{1,2}, if δ(Xi)=1, there is at most one non-full block, and the algorithm does not need to proceed in this section anymore. If δ(Xi)=2, to observe non-full blocks, the Xi−α keys should be partitioned again. It is impossible for both X1 and X2 to have δ(Xi)=2, since this would imply n=α+X1+X2>α+2(ρ+α)=2ρ+3α, a contradiction. So at each iteration, either the algorithm stops or continues with one of the sections. In the last step, there are two sections each with at most one non-full block. Thus, the expected number of iterations plus 1 is an upper bound on 𝔼[En].

Clearly, 𝔼[En] is an increasing function of n, so losing the α keys of the root at each iteration results in fewer non-full blocks. To obtain a weaker upper bound, we assume that at each step, the key with the highest priority splits the n−1 remaining keys, instead of n−α keys, into two new sections. These keys include the α−1 other keys of the root as well. It suffices to show that the expected number of iterations is at most 4.

At iteration k, the k keys with the highest priorities split the n keys into k sections with Y1(k),Y2(k),…,Yk(k) keys, where Y1(k)+Y2(k)+⋯+Yk(k)=n−k. Round k+1 occurs if any of the Yi(k) exceeds ρ+α. Since the k top keys are chosen uniformly at random, all solutions to the equation Y1(k)+Y2(k)+⋯+Yk(k)=n−k have equal probabilities. Next, we find an upper bound on Pr[Yi(k)>ρ+α] for i∈{1,…,k}. Set the parameters m, t, and n of Lemma 14 to k, ρ+α, and n−k, respectively. Thus

Pr[Yi(k)>ρ+α] (1ρ+α+1nk+k1)k1(1ρ+α1n1)k1.

Since n is at most 2ρ+α, we have

Pr[Yi(k)>ρ+α] (1ρ+α12ρ+α1)k1(1ρ2ρ)k1=2(k1).

The last inequality is an application of Lemma 15. Using the union bound, we have

Pr[∃i: Yi(k)>ρ+α] ≤ k/2k−1.

This is indeed the probability that round k+1 occurs, conditioned on round k having occurred. Define Z to be the number of iterations and Zk to be the indicator random variable of whether the k-th iteration happened. Then

Pr[Zk=1]=Pr[Zk1=1]Pr[Zk=1|Zk1=1]+Pr[Zk1=0]Pr[Zk=1|Zk1=0]
=Pr[Zk1=1]Pr[Zk=1|Zk1=1]k/2k1

It follows that

𝔼[Z]=k=1𝔼[Zk]=k=1Pr[Zk=1]k=1k/2k1=2(j=1k=j2k)=2(j=12j112)=22111/21/2=4.
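The series in the last step can be evaluated numerically (our own truncated check; the tail beyond k=200 is astronomically small):

```python
from fractions import Fraction

# sum_{k >= 1} k / 2^(k - 1) = 4; partial sums approach 4 from below
s = sum(Fraction(k, 2 ** (k - 1)) for k in range(1, 200))
assert 0 < 4 - s < Fraction(1, 10 ** 30)
```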

Lemma 20.

For a subtree with δ(n)=3, the expected number of non-full blocks satisfies 𝔼[En] ≤ 10.

Proof.

The n−α keys beneath the root are separated into three sections with X1, X2, and X3 keys. As in Lemma 19, we can prove that it is impossible to have δ(Xi)≥2 for every i∈{1,2,3}, or δ(Xi)=δ(Xj)=3 for some i≠j∈{1,2,3}. By the symmetry of the random variables Xi, we can conclude

1= 3Pr[δ(X1)=3δ(X2)=1δ(X3)=1]
+3Pr[δ(X1)=2δ(X2)=2δ(X3)=1]
+3Pr[δ(X1)=2δ(X2)=1δ(X3)=1]
+Pr[δ(X1)=1δ(X2)=1δ(X3)=1].

Next, we perform an induction on 2ρ+α<n≤3ρ+α. Using Observation 18, Lemma 19, and the induction hypothesis, we get

𝔼[En] 3Pr[δ(X1)=3δ(X2)=1δ(X3)=1](10+1+1)
+3Pr[δ(X1)=2δ(X2)=2δ(X3)=1](5+5+1)
+3Pr[δ(X1)=2δ(X2)=1δ(X3)=1](5+1+1)
+Pr[δ(X1)=1δ(X2)=1δ(X3)=1](1+1+1)
< 3Pr[δ(X1)=3δ(X2)=1δ(X3)=1](127)
+3Pr[δ(X1)=2δ(X2)=2δ(X3)=1](117)
+7(3Pr[δ(X1)=3δ(X2)=1δ(X3)=1]
+3Pr[δ(X1)=2δ(X2)=2δ(X3)=1]
+3Pr[δ(X1)=2δ(X2)=1δ(X3)=1]
+Pr[δ(X1)=1δ(X2)=1δ(X3)=1])
= 3Pr[δ(X1)=3δ(X2)=1δ(X3)=1]5
+3Pr[δ(X1)=2δ(X2)=2δ(X3)=1]4+7
7+3Pr[δ(X1)3]5+3Pr[δ(X1)2δ(X2)2]4.

Lemma 16 implies that Pr[δ(X1)≥3] ≤ 1/9. (Set δ(n)=3 and k=3.) To complete the proof, we show that Pr[δ(X1)≥2 ∧ δ(X2)≥2] ≤ 1/9. Consequently, we get the desired inequality 𝔼[En] ≤ 7+3·5/9+3·4/9 = 10.

The two top-priority keys split the keys into three parts with Y1, Y2, and Y3 keys, where Y1+Y2+Y3=n−2. The i-th part includes all Xi keys beneath the root and some extra keys from the root, so for each i∈{1,2,3}, we have Xi ≤ Yi.

Pr[δ(X1)2δ(X2)2] =Pr[X1ρ+α+1X2ρ+α+1]
Pr[Y1ρ+α+1Y2ρ+α+1]=(n2(ρ+α+1)2)/(n2)

The same technique as in Lemma 14 yields (n2(ρ+α+1)2)/(n2) ≤ (1−2(ρ+α+1)/n)2.

Combining the previous inequalities with n ≤ 3ρ+α implies

Pr[δ(X1)2δ(X2)2] (12(ρ+α+1)3ρ+α)2.

Subtract α from the numerator and denominator. By Lemma 15, we have:

Pr[δ(X1)2δ(X2)2] (12ρ+α+23ρ)2(12ρ3ρ)2=19,

as required.

Theorem 21.

For a subtree of size n ≤ β+α, we have that the expected number of non-full blocks satisfies 𝔼[En] ≤ 27δ(n).

Proof.

We will prove the statement by induction on n. Observation 18, Lemma 19, and Lemma 20 provide the bases of the induction. The problem is divided into subproblems by using random variables X1,,Xδ(n).

𝔼[En]=𝔼[EX1]+𝔼[EX2]++𝔼[EXδ(n)]=δ(n)𝔼[EX1]

Note that Xi ≤ n−α, so the induction assumption is applicable to it. For n ≥ 4ρ+α, we will use the induction assumption and the bases to obtain

𝔼[EXi] Pr[δ(Xi)=1]1+Pr[δ(Xi)=2]5+Pr[δ(Xi)=3]10+k=4δ(n)Pr[δ(Xi)=k](27k)
<Pr[δ(Xi)=1]10+Pr[δ(Xi)=2]10+Pr[δ(Xi)=3]10+k=4δ(n)Pr[δ(Xi)=k](27k)
=10Pr[1δ(Xi)3]+27k=4δ(n)Pr[δ(Xi)=k]k10+27k=4δ(n)Pr[δ(Xi)=k]k

We can rewrite ∑k=4δ(n) Pr[δ(Xi)=k]·k as 3·Pr[δ(Xi)≥4] + ∑k=4δ(n) Pr[δ(Xi)≥k]. Then, by using Corollary 17, we get

𝔼[EXi] 10+27(3Pr[δ(Xi)4]+k=4δ(n)Pr[δ(Xi)k])10+27(3e2+k=4e(k2))
10+27(3e2+k=2ek)=10+27(3e2+e21e1)10+0.6202×27<27,

as stated.
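The final numeric step can be double-checked in one line (our own verification of the constant):

```python
from math import exp

# 3e^{-2} + e^{-2} / (1 - e^{-1}) ≈ 0.6202, so 10 + 27 * (...) < 27
inner = 3 * exp(-2) + exp(-2) / (1 - exp(-1))
assert abs(inner - 0.6202) < 1e-3
assert 10 + 27 * inner < 27
```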

Appendix B Additional Proofs

Lemma 22.

Consider a subtree with n>γ keys, where γ is equal to β+α=(α+1)ρ+α. We have:

2αε(n1αi)2αε(n1γαi) (nα+1i)(nγα+1i)γ(n1γαi).

Proof.

The proof is by induction on n. For n=γ+1 the statement is

2αε(γαi) (γ+1α+1i) (6)
2αεγ(γ1)(γα+i+1)(αi)! (γ+1)γ(γα+i+1)(αi+1)(αi)!
2αε(αi+1) γ+1,

This is true because γ=(α+1)ρ+α=108(α+1)α/ε+α>2(α+1)α/ε.

Assume the lemma holds for each m with γ<m≤n. We will prove the lemma for n+1, i.e. we will show:

2αε(nαi)2αε(n+11γαi) (n+1α+1i)(n+1γα+1i)γ(n+11γαi)

holds for all i. The inequality is true for i=α−1, since

2αεn2αε(nγ) n(n+1)2(nγ)(nγ+1)2γ(nγ)
2αεγ 2γnγ2+γ22γn2γ22
2αεγ γ2+γ2.

Hence, it suffices to observe that

2αεγγ2+γ24αεγ+1=(α+1)ρ+α+1=(α+1)108α/ε+α+1

which is true.

For i<α−1, we will use Pascal's identity (mk)=(m−1k)+(m−1k−1) together with the induction assumption. We have

2αε(nαi)2αε(nγαi)
=2αε(n1α(i+1))+2αε(n1αi)2αε(n1γα(i+1))2αε(n1γαi)
=[2αε(n1α(i+1))2αε(n1γα(i+1))]+[2αε(n1αi)2αε(n1γαi)]
(nα+1(i+1))(nγα+1(i+1))γ(n1γα(i+1))
+(nα+1i)(nγα+1i)γ(n1γαi)
=(n+1α+1i)(n+1γα+1i)γ(nγαi).