A Simple Integer Successor-Delete Data Structure
Abstract
We consider a simple decremental data structure for maintaining a set $S$ of integers, that supports initializing the set to $S = \{0, 1, \ldots, n\}$, followed by $d$ deletions and $m$ successor queries in arbitrary order, in total time $O(n + d + m \log_{2+m/n} n)$. The data structure consists of a single array of $n + 1$ integers. A straightforward modification allows the data structure to also support predecessor and range queries, with a total output size $z$, in total time $O(n + z + \hat{m} \log_{2+\hat{m}/n} n)$, where $\hat{m}$ is the total number of operations performed. The data structure is essentially a special case of the classic union-find data structure with path compression but with unweighted linking (i.e., without linking by rank or size), which is known to achieve logarithmic amortized time bounds (Tarjan and van Leeuwen, 1984). In this paper we study the efficiency of this simple data structure and compare it to other, theoretically superior, data structures.
Keywords and phrases: Successor queries, deletions, interval union-find, union-find
2012 ACM Subject Classification: Theory of computation → Data structures design and analysis
Funding: Supported by Independent Research Fund Denmark, grant 9131-00113B.
Editors: Petra Mutzel and Nicola Prezza
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
We consider a simple decremental data structure for maintaining a set of integers $S \subseteq \{0, 1, \ldots, n\}$, under the operations $\mathrm{Delete}(i)$ and $\mathrm{Succ}(i)$, where initially $S = \{0, 1, \ldots, n\}$. The delete operation removes the value $i$ from $S$ (or leaves the set unchanged if $i \notin S$), and the successor query returns the smallest value in $S$ larger than or equal to $i$, i.e., $\mathrm{Succ}(i) = \min\{j \in S \mid j \geq i\}$. Figure 3 shows the full code for the operations and an example of the data structure. The data structure consists of a single array $\pi$ of $n + 1$ integers. We view $\pi[i]$ as a pointer from index $i$ to index $\pi[i]$ in $\pi$: if $i \in S$ then $\pi[i] = i$; otherwise $i < \pi[i] \leq \mathrm{Succ}(i)$. Figure 3 shows a minimal sequence of operations resulting in non-nested pointers. A simple modification of the data structure also supports the predecessor query $\mathrm{Pred}(i)$ that returns the largest value in $S$ smaller than or equal to $i$, i.e., $\mathrm{Pred}(i) = \max\{j \in S \mid j \leq i\}$. The only modification is that for $i \in S$ we let $\pi[i]$ store the predecessor of $i$ in $S$ instead of $i$. See Figure 3.
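To make the description concrete, here is a minimal C sketch of the three operations (the array name `pi` and the recursive formulation are our own choices; the paper's actual code is in Figure 3):

```c
#include <stdlib.h>

static int *pi; /* pi[i] == i iff i is in S; otherwise i < pi[i] <= succ(i) */

void init(int n) {              /* S = {0, 1, ..., n} */
    pi = malloc((n + 1) * sizeof(int));
    for (int i = 0; i <= n; i++)
        pi[i] = i;              /* every value is in S, i.e., a self-loop */
}

void delete(int i) {            /* remove i from S (assumes i < n) */
    pi[i] = i + 1;              /* point towards the successor */
}

int succ(int i) {               /* smallest j in S with j >= i */
    if (pi[i] != i)
        pi[i] = succ(pi[i]);    /* recursive path compression */
    return pi[i];
}
```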
The structure in Figure 3 is likely folklore in practice (but we failed to find a reference in the literature). The data structure is interesting due to its simplicity, low space usage (an array of integers), and low constants in the running time, but there exist data structures in the literature with theoretically superior complexity. See Table 1.
1.1 Related work
To support successor queries on a static set, the canonical solution is to store the values in sorted order in an array and perform successor queries using binary search in $O(\log n)$ time. In the static case of integers from $\{0, 1, \ldots, n\}$, the successors can be stored explicitly in an array of size $n + 1$, and successor queries can be looked up in constant time. If the set is dynamic, the classic solution is to store the values in a balanced binary search tree, like an AVL-tree [1] or a red-black tree [12], supporting insertions, deletions, and successor and predecessor queries in $O(\log n)$ time. Binary search trees are inherently node oriented, where each of the $|S|$ nodes stores at least a value and two pointers to its two children, and possibly some bits of balance information (splay trees [20] do not store any balance information in the nodes, but the time bounds are amortized). Implicit binary search trees [7, 8, 15] reduce the space usage to a single array of size $|S|$ storing a permutation of the values of $S$ (all additional information is encoded in the permutation of the values). In this paper we even allow the data structures to use space $\Theta(n)$, where $n$ can be much larger than $|S|$ after many deletions. Another binary tree solution is an augmented static binary tree with $n + 1$ leaves left-to-right representing $0, \ldots, n$, and where each node stores the maximum non-deleted value in its subtree. Such a tree can be stored in an array of size $O(n)$ (where the children of node $i$ are nodes $2i$ and $2i + 1$, like in the binary heaps by Williams [28]) and supports updates and successor queries in $O(\log n)$ time; a possible implementation is sketched below.
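A possible C sketch of such an augmented static binary tree (the layout is our own assumption: $N$ a power of two with $N > n$, leaves at indices $N, \ldots, 2N - 1$, deleted leaves holding $-1$):

```c
static int N;      /* number of leaves, a power of two with N > n */
static int *tree;  /* size 2N; tree[v] = max non-deleted value in v's subtree, or -1 */

void tree_delete(int x) {                 /* remove x from S */
    int v = N + x;
    tree[v] = -1;
    for (v /= 2; v >= 1; v /= 2)          /* recompute maxima on the root path */
        tree[v] = tree[2*v] > tree[2*v+1] ? tree[2*v] : tree[2*v+1];
}

int tree_succ(int x) {                    /* smallest value >= x in S, or -1 */
    int v = N + x;
    if (tree[v] == x) return x;           /* x itself is not deleted */
    while (v > 1) {                       /* climb until a right sibling has a candidate */
        if (v % 2 == 0 && tree[v + 1] >= x) { v++; break; }
        v /= 2;
    }
    if (v == 1) return -1;                /* no successor exists */
    while (v < N)                         /* descend to the leftmost candidate leaf */
        v = (tree[2*v] >= x) ? 2*v : 2*v + 1;
    return tree[v];
}
```

Insertions (un-deletions) are symmetric to deletions: set the leaf back to its value and recompute the maxima on the path to the root.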
Table 1: Overview of data structures supporting successor queries, deletions, and insertions on $\{0, 1, \ldots, n\}$ ("am." marks amortized bounds, "–" marks unsupported operations).

| Reference | Succ | Delete | Insert |
|---|---|---|---|
| Balanced binary search tree [1, 12] | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ |
| Augmented static binary tree | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ |
| van Emde Boas trees [25, 26] | $O(\log \log n)$ | $O(\log \log n)$ | $O(\log \log n)$ |
| Succinct rank-select [14] | $O(\log n / \log \log n)$ | $O(\log n / \log \log n)$ | $O(\log n / \log \log n)$ |
| Succinct successor [19] | | | |
| Union-find [21] | $O(\alpha(n))$ am. | $O(\alpha(n))$ am. | – |
| Union-find, weighted linking only [6] | $O(\log n)$ | $O(\log n)$ | – |
| Union-find, path compression only [23] | $O(\log_{2+m/n} n)$ am. | $O(\log_{2+m/n} n)$ am. | – |
| Weighted quick-find (McIlroy and Morris) | $O(1)$ | $O(\log n)$ am. | – |
| Interval union-find [9] | $O(1)$ | $O(1)$ am. | – |
| This paper | $O(\log_{2+m/n} n)$ am. | $O(1)$ | – |
Exploiting that the values are integers, van Emde Boas, Kaas and Zijlstra [25, 26] presented a dynamic data structure supporting insertions and deletions, and predecessor and successor queries, in $O(\log \log n)$ time and $O(n)$ space. Lower bound trade-offs between query time and space usage (in the cell-probe model) were studied by Beame and Fich [3] and Pătraşcu and Thorup [16]. Succinct rank-select data structures, i.e., data structures using a number of bits close to the information-theoretic lower bound, can also be used to answer successor queries, see, e.g., the recent dynamic rank-select structure by Li, Liang, Yu, and Zhou [14]. Note that dynamic rank-select data structures have a lower bound of $\Omega(\log n / \log \log n)$ time, see Pătraşcu and Thorup [17] – a higher lower bound than for predecessor and successor queries. Pibiri and Venturini [19] presented a data structure supporting insertions and deletions, and successor and predecessor queries, using a number of bits of space close to the information-theoretic lower bound. Dinklage, Fischer and Herlez [5] did a comprehensive experimental study of dynamic integer predecessor data structures supporting both insertions and deletions.
In this paper we only consider the decremental version of the problem, where values can only be deleted and never inserted again. We allow the deletion of an already deleted value. This problem is known in the literature as the interval union-find problem: consecutive deleted integers form intervals with their (non-deleted) successor. Deleting an integer unions the two intervals of deleted integers to its left and right. Italiano and Raman [13, Chapter 5.3] give a survey of the results for the interval union-find problem and list some applications. By exploiting the power of the RAM model, a data structure by Gabow and Tarjan [9] for the interval union-find problem supports successor queries in constant time and deletions in amortized constant time (see [22] for an introduction to amortized time analysis): Partition the integers into microsets of $\Theta(w)$ consecutive values, where $w$ is the word size. For each microset store the values in $S$ as a bitvector in a single word, and handle queries on a microset either by tabulation or by dedicated RAM instructions for finding the least or most significant set bit in a word (https://en.wikipedia.org/wiki/Find_first_set). A separate interval union-find data structure is maintained for the macroset, which contains an index for each microset that is non-empty, i.e., still contains a value from $S$. The macroset structure consists of an array with one entry per microset, storing a pointer to the nearest non-empty successor microset structure. The pointers are updated using the "relabel the smaller half" method when a microset becomes empty. While deletions from and queries to microsets and queries to the macroset take constant time, deletions from the macroset take amortized logarithmic time, but only happen when all values in a microset have been deleted, i.e., for $w = \Omega(\log n)$ the overall time for Delete becomes amortized constant. The macroset data structure is known in the literature as the weighted quick-find data structure, and is attributed to McIlroy and Morris by Aho, Hopcroft and Ullman [2].
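For illustration, a successor query inside a microset takes only a couple of word operations; the following C sketch (our own, using the least-significant-set-bit instruction mentioned above) shows the idea:

```c
#include <stdint.h>

/* A microset: bit j of word m is set iff the value base+j is in S. */

int micro_succ(uint64_t m, int j) {         /* smallest member >= j, or -1 */
    uint64_t candidates = m & (~0ULL << j); /* keep bits j, j+1, ..., 63 */
    return candidates ? __builtin_ctzll(candidates) : -1;
}

void micro_delete(uint64_t *m, int j) {     /* remove value base+j */
    *m &= ~(1ULL << j);                     /* clear bit j */
}
```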
The interval union-find problem is a special case of the union-find problem, where an initial collection of singleton sets $\{0\}, \{1\}, \ldots, \{n\}$ is maintained under $\mathrm{Union}(u, v)$ operations, that union the two sets containing the values $u$ and $v$, and $\mathrm{Find}(v)$ queries, that return a representative value for the set containing $v$. Galler and Fischer [10] introduced the rooted tree representation for the union-find problem, where each set is represented by a rooted tree, the root is the representative of the set, and each node stores a pointer to its parent node. A $\mathrm{Find}(v)$ operation traverses the path from node $v$ to the root, and compresses the path so that all visited nodes are shortcut to have the root as their parent; a $\mathrm{Union}(u, v)$ operation performs $\mathrm{Find}(u)$ and $\mathrm{Find}(v)$ and links the two roots, making the smaller tree (with respect to size or "rank") a child of the root of the larger tree. Path compression and weighted linking can be combined or applied separately. Only applying linking by size gives $O(\log n)$ time Union and Find operations (Fischer [6]), whereas naive linking with path compression gives amortized $O(\log_{2+m/n} n)$ time Union and Find operations (Tarjan and van Leeuwen [23, Theorem 4]). Tarjan [21] showed that combining path compression with weighted linking implies that Find operations take amortized $O(\alpha(m, n))$ time, where $\alpha$ is a very slowly growing inverse-Ackermann-like function. Tarjan and van Leeuwen [23] analyzed various variations of path compression for the union-find problem. They proved that path halving achieves asymptotically the same time bounds as path compression for Find operations. Path halving was introduced by van Leeuwen and van der Weide [24, 27]. Union-find structures replacing weighted linking by randomized linking were considered by Goel, Khanna, Larkin and Tarjan [11], avoiding the space usage for weights or ranks. See Italiano and Raman [13, Chapter 4.2] for a survey of the results for the union-find problem. An experimental evaluation of 55 union-find data structures was done by Patwary, Blair and Manne [18] in the context of computing the connected components of various graphs, where a variant of Rem's union-find data structure (described by Dijkstra [4, Chapter 23]) turned out to achieve the best performance.
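For reference, a textbook C sketch of the rooted tree representation with path compression and linking by size (not code from the paper) looks as follows:

```c
static int *parent, *size; /* parent[v] == v iff v is a root; size[] valid at roots */

int find(int v) {
    if (parent[v] != v)
        parent[v] = find(parent[v]);   /* path compression: shortcut to the root */
    return parent[v];
}

void unite(int u, int v) {             /* union of the sets containing u and v */
    u = find(u);
    v = find(v);
    if (u == v) return;
    if (size[u] < size[v]) { int t = u; u = v; v = t; }
    parent[v] = u;                     /* weighted linking: smaller tree below larger */
    size[u] += size[v];
}
```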
Any union-find structure can be combined with the microset-macroset idea to solve the interval union-find problem. For the union-find structure using linking by rank and path compression, we obtain an interval union-find structure supporting deletions and successor queries in amortized constant time: the deletions cause at most $O(n/w)$ unions in the macro-structure, creating $O(n/w)$ links. Since a link changing parent during a path compression is linked to a higher ranked node, a link can be changed at most $O(\log n)$ times, and the total number of link updates during path compressions is $O((n/w) \log n) = O(n)$ for $w = \Omega(\log n)$. The resulting structure uses $O(n)$ bits of space.
The data structure we consider in Figure 3 is essentially the same as the union-find data structure by Rem, both consisting of an array of self-loops and forward pointers, except that we only consider the restricted case of the interval union-find problem, allowing us to simplify the operations. We apply path compression without weighted linking, where the representative of a set is the successor of all values in the set, and where $\mathrm{Delete}(i)$ is a restricted $\mathrm{Union}(i, i+1)$, that links $i$ below $i + 1$ without performing Find operations, i.e., without performing a path compression and where $i + 1$ is not necessarily a root.
1.2 Results
Heavily inspired by the analysis of Tarjan and van Leeuwen [23], we obtain the following results. For the successor-delete data structure in Figure 3 we show the following.
Theorem 1.
A sequence of $\mathrm{Init}(n)$ followed by $d$ Delete and $m$ Succ operations in arbitrary order takes $O(n + d + m \log_{2+m/n} n)$ time.
Note that for a sublinear number of queries, $m \leq n$, the time is $O(n + d + m \log n)$, e.g., for $m = O(n / \log n)$ the time is $O(n + d)$. For a superlinear number of queries, where $m = n^{1+\varepsilon}$ for a constant $\varepsilon > 0$, the time is $O(d + m)$.
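The two regimes follow from rewriting the base of the logarithm; a short calculation (our own, under the bound of Theorem 1):

```latex
\log_{2+m/n} n = \frac{\ln n}{\ln(2 + m/n)}
% For m \le n the base is between 2 and 3, so the total time is O(n + d + m \log n);
% with m = O(n / \log n) this is O(n + d).
% For m = n^{1+\varepsilon} we have \ln(2 + m/n) \ge \varepsilon \ln n, so
% m \log_{2+m/n} n \le m / \varepsilon = O(m) for constant \varepsilon > 0.
```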
Our second result is that the data structure also supports the range reporting query $\mathrm{RangeReport}(a, b)$ in Figure 4, that reports $S \cap \{a, a+1, \ldots, b\}$ in sorted order. Provided we never delete a value already deleted, or we use the checked Delete operation in Figure 4, that only modifies $\pi[i]$ if $i$ has not yet been deleted, we obtain the following result.
Theorem 2.
A sequence of $\mathrm{Init}(n)$, followed by $d$ Delete, $m$ Succ, and $r$ RangeReport operations in arbitrary order, where the total output size of the range reporting queries is $z$, takes $O(n + d + z + \hat{m} \log_{2+\hat{m}/n} n)$ time, where $\hat{m} = m + r$.
Finally, a slight modification of our data structure also supports predecessor queries, see Figure 3 in Section 4, yielding the following result.
Theorem 3.
A sequence of $\mathrm{Init}(n)$, followed by $d$ Delete, $m$ Succ, $p$ Pred, and $r$ RangeReport operations in arbitrary order, where the total output size of the range reporting queries is $z$, takes $O(n + z + \hat{m} \log_{2+\hat{m}/n} n)$ time, where $\hat{m} = d + m + p + r$.
It should be noted that maintaining two copies of the successor-delete data structure of Theorem 2, one for each direction, answering successor and predecessor queries, respectively, actually achieves the bound of Theorem 3. The advantage of the data structure in Section 4 is that the space usage is only a single array of integers, and deletions only need to update a single array instead of two arrays.
In Section 6 we present our experimental results, comparing our data structure to the weighted quick-find data structure and to the weighted union-find data structure with two-pass path compression, with and without using microsets.
1.3 Preliminaries
Throughout this paper we consider maintaining a set $S \subseteq \{0, 1, \ldots, n\}$. We let $\mathrm{succ}(v)$ and $\mathrm{pred}(v)$ denote the successor and predecessor of $v$ in $S$, respectively. We assume that the values $0$ and $n$ are never deleted, such that $\mathrm{succ}(v)$ and $\mathrm{pred}(v)$ are always defined, for $0 \leq v \leq n$. (In our implementations we actually consider the integers $\{-1, 0, 1, \ldots, n, n+1\}$, where $-1$ and $n + 1$ are never deleted, such that intuitively $n + 1$ acts as $+\infty$ and $-1$ as $-\infty$.)
2 Successor-delete
Our basic successor-delete data structure is shown in Figure 3. It stores a set $S \subseteq \{0, 1, \ldots, n\}$, where initially $S = \{0, 1, \ldots, n\}$. It consists of a single array $\pi$ of $n + 1$ integers. For $0 \leq i \leq n$, it is an invariant that $\pi[i] = i$ if $i \in S$, and $i < \pi[i] \leq \mathrm{succ}(i)$ if $i \notin S$. We view $\pi[i]$ as a pointer from index $i$ to index $\pi[i]$ in $\pi$, such that all elements with the same successor $s \in S$ form a rooted tree with $s$ as the root. See the pointers below the array in Figure 3, where 12 is the root of its tree.
To compute $\mathrm{Succ}(i)$, we can repeatedly let $i \leftarrow \pi[i]$ until $\pi[i] = i$. In the worst case this takes $\Theta(n)$ time. To achieve a better amortized performance, we apply path compression such that each accessed index is made to point directly to $\mathrm{Succ}(i)$. Figure 3 shows two implementations of Succ with two different path compressions: recursively, or through two iterative passes, where the first pass follows the path from $i$ to find the root $r$, and a second traversal of the path sets all pointers to point to $r$. Applying the recursive compression or the two-pass path compression results in the same data structure, but the two-pass path compression does not require a recursion stack. Slightly less aggressive than path compression is path halving, where every second node on the path is shortcut to its grandparent, i.e., $\pi[i] \leftarrow \pi[\pi[i]]$. Path halving has the benefit that it can be implemented in a single traversal of the path from $i$ to $\mathrm{Succ}(i)$, see Figure 4. The analysis in Section 5 only considers path compression.
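The two iterative variants can be sketched in C as follows, using the same array `pi` as in the sketch above (our own transcription of the ideas; the paper's code is in Figures 3 and 4):

```c
int succ_two_pass(int i) {         /* two-pass path compression */
    int root = i;
    while (pi[root] != root)       /* pass 1: find the root */
        root = pi[root];
    while (pi[i] != root) {        /* pass 2: redirect all pointers to the root */
        int next = pi[i];
        pi[i] = root;
        i = next;
    }
    return root;
}

int succ_halving(int i) {          /* path halving, a single traversal */
    while (pi[i] != i) {
        pi[i] = pi[pi[i]];         /* shortcut every visited node to its grandparent */
        i = pi[i];
    }
    return i;
}
```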
A $\mathrm{Delete}(i)$ operation could set $\pi[i]$ to $\mathrm{Succ}(i + 1)$. This would require a call to Succ to update $\pi[i]$, which would be the natural approach if we implemented the successor data structure using a union-find data structure. We apply the simpler idea of just setting $\pi[i] = i + 1$, avoiding the call to Succ during Delete. If $i \notin S$ when performing $\mathrm{Delete}(i)$, the code in Figure 3 naively resets $\pi[i]$ to $i + 1$, to avoid the check whether $i \in S$. If one added this check, see Delete in Figure 4, we would only update $\pi[i]$ if $\pi[i] = i$. This would ensure that over time $\pi[i]$ only increases. Note that either way of implementing Delete can cause pointers not to be nested, as shown in Figure 3.
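The two deletion variants then differ in a single test (sketch, same assumptions as above):

```c
void delete_unchecked(int i) { /* always redirect; may decrease pi[i] if i was   */
    pi[i] = i + 1;             /* already deleted, undoing earlier compression   */
}

void delete_checked(int i) {   /* only redirect if i is still in S,              */
    if (pi[i] == i)            /* so pi[i] never decreases over time             */
        pi[i] = i + 1;
}
```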
3 Range reporting
Another operation one could consider supporting is the $\mathrm{RangeReport}(a, b)$ operation that reports the values in $S \cap \{a, a+1, \ldots, b\}$ in sorted order, where $a \leq b$. If all elements in $S$ were linked in increasing order, RangeReport could trivially be supported by performing $\mathrm{Succ}(a)$ to find the first element to report (provided it is smaller than or equal to $b$), and then traversing and reporting the elements of the linked list until the first element larger than $b$ is encountered. The time would be the time for a Succ query plus time linear in the output. The data structure in Figure 3 can support range reporting queries without such additional links: the first element to report can be found by performing $v_1 = \mathrm{Succ}(a)$, the next element can be found by performing $v_2 = \mathrm{Succ}(v_1 + 1)$, and so on, until a value larger than $b$ is returned. See the code in Figure 4. Even though this approach applies multiple Succ queries, we show in Section 5 that the amortized time for RangeReport is still the time for a single Succ query plus time linear in the output, provided we are slightly careful about deletions.
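In C, the range reporting loop is a few lines (sketch; `report` is a hypothetical consumer of the output, and the never-deleted sentinel value $n$ keeps $\mathrm{Succ}(v + 1)$ inside the array for $b < n$):

```c
void report(int v);               /* hypothetical callback consuming the output */

void range_report(int a, int b) { /* report S ∩ {a, ..., b} in sorted order; b < n */
    for (int v = succ(a); v <= b; v = succ(v + 1))
        report(v);
}
```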
4 Successor-predecessor-delete structure
If both successor and predecessor queries should be supported, one could maintain two separate data structures, using one array for each query direction. A deletion would then require performing a deletion in both data structures. A more space efficient way is to use the data structure in Figure 3, but where we for each $i \in S$ store the predecessor of $i$ in $\pi[i]$, except for the minimum value in $S$. It follows that all values in $S$ are linked in decreasing order in $\pi$. For $i \notin S$, we define and maintain $\pi[i]$ exactly as before. Figure 3 shows the resulting data structure, where Succ is implemented using two-pass path compression. Since a $\mathrm{Delete}(i)$ for $i \in S$ needs to remove $i$ from the linked list of values in $S$, we need to compute $s = \mathrm{Succ}(i + 1)$, to update $\pi[s]$. We can shortcut $\pi[i]$ to point directly to $s$, now that it is anyway computed. To support $\mathrm{Pred}(i)$, observe that $\mathrm{Pred}(i)$ is $i$ when $i \in S$, i.e., when $\pi[i] \leq i$. Otherwise, $\mathrm{Pred}(i) = \pi[\mathrm{Succ}(i)]$ for $i \notin S$. Since we have a decreasing linked list of all elements in $S$, $\mathrm{RangeReport}(a, b)$ can be implemented by first finding the largest value to report by performing $\mathrm{Pred}(b)$, and then traversing and reporting the values on the linked list in decreasing order until a value smaller than $a$ is encountered. The drawback of this solution is that for each Delete operation a successor query is required to update the linked list. Elements are reported in decreasing order, which can be changed to increasing order by reversing the output (or mirroring the whole data structure), if this is desired.
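A C sketch of the combined structure (our own conventions: for $i \in S$ we store the predecessor in `pi[i]`, so $i \in S$ iff `pi[i] <= i`, with `pi[0] == 0` since $0$ is never deleted):

```c
int succ2(int i) {               /* two-pass compression; roots now satisfy pi[r] <= r */
    int root = i;
    while (pi[root] > root)      /* pass 1: find the successor */
        root = pi[root];
    while (i != root) {          /* pass 2: compress, never touching pi[root] */
        int next = pi[i];
        pi[i] = root;
        i = next;
    }
    return root;
}

int pred(int i) {                /* largest value in S smaller than or equal to i */
    if (pi[i] <= i) return i;    /* i is itself in S */
    return pi[succ2(i)];         /* the predecessor link stored at the successor */
}

void delete2(int i) {            /* assumes i in S and 0 < i < n */
    int s = succ2(i + 1);        /* successor of i after the deletion */
    pi[s] = pi[i];               /* unlink i from the decreasing linked list */
    pi[i] = s;                   /* forward pointer, shortcut directly to s */
}
```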
5 Analysis
In this section we analyze the theoretical performance of the proposed data structures, i.e., we give the proofs of Theorems 1–3. The analysis follows the same lines of reasoning as used by Tarjan and van Leeuwen for the analysis of union-find data structures using naive linking [23].
5.1 Proof of Theorem 1
Consider the operations in Figure 3, and assume we perform $\mathrm{Init}(n)$ followed by $d$ Delete and $m$ Succ operations in an arbitrary order. The Init operation together with the Delete operations clearly take $O(n + d)$ time. What remains is to analyze the time spent on the Succ queries.
We assume that we never delete a value that is already deleted (or we use the Delete operation in Figure 4). We later show that this assumption is not necessary. Under this assumption, $\pi[i] = i$ until $i$ is deleted, where $\pi[i]$ is set to $i + 1$. Subsequently, $\pi[i]$ is only changed by path compressions increasing $\pi[i]$, i.e., over time $\pi[i]$ is non-decreasing.
For a successor query $\mathrm{Succ}(i)$, we denote the index $\mathrm{Succ}(i)$ the root of the query. Let $R$ be the set of all root indices for the Succ queries during the sequence of operations. Clearly $|R| \leq \min(m, n + 1)$. Consider the indegree of an index $j$ over time, i.e., the number of indices pointing to $j$. While $j - 1$ is not deleted, the indegree of $j$ is zero. When $j - 1$ is deleted, $\pi[j-1]$ is set to point to $j$, and $j$ gets indegree one. If $j$ is never the root of a query, $j$ will never get another in-pointer, since path compressions only change pointers to point to the root of a query, and when $\pi[j-1]$ is increased (because of a path compression), $j$ gets indegree zero and the indegree of $j$ remains zero for the remainder of the sequence of operations.
A query $\mathrm{Succ}(i)$ accesses a sequence of indices $i = i_0 < i_1 < \cdots < i_k$ in $\pi$, where $i_{j+1} = \pi[i_j]$, for $0 \leq j < k$, and $i_k = \mathrm{Succ}(i)$. For the query we count separately the accesses to three indices: the index $i_0$ where the query starts, and the last two indices $i_{k-1}$ and $i_k$, where the pointers do not change. For the intermediate nodes $i_1, \ldots, i_{k-2}$, we know they have indegree at least one before the query (since they are on the query path) and the in-pointer is updated (since we use path compression). If $i_j \notin R$ ($1 \leq j \leq k - 2$), then by the above discussion $i_j$ had indegree one before the query (a pointer from $i_j - 1$) and indegree zero after the query, and no further query can have $i_j$ as an intermediate index (but we could visit $i_j$ because of a query $\mathrm{Succ}(i_j)$, where it would play the role of "$i_0$"). It follows that the total number of accesses by all Succ queries to intermediate nodes not in $R$ is at most $n$. It follows that the total number of nodes accessed by all queries is at most $3m + n$ plus the number of accesses to intermediate nodes in $R$.
To bound the remaining number of accesses to intermediate nodes in $R$ we introduce the notion of rank (like in [23]). Let $b = 2 + m/n$. For an index $i \notin S$ with $r(i) \geq 1$, we define the rank of $i$ to be $\mathrm{rank}(i) = \lfloor \log_b r(i) \rfloor$, where $r(i) = |\{j \in R \mid i < j < \pi[i]\}|$, i.e., the rank of $i$ is the base $b$ logarithm of the number of indices strictly between $i$ and $\pi[i]$ in the array (excluding $i$ and $\pi[i]$) that eventually become a root during the sequence of operations. If $i$ has rank $k$, then $b^k \leq r(i) < b^{k+1}$. Ranks are upper bounded by $\log_b n$. We refine each rank into levels, where an index of rank $k$ has level $\ell$, $1 \leq \ell < b$, if $\ell \cdot b^k \leq r(i) < (\ell + 1) \cdot b^k$. Consider a path compression that shortcuts an index $i$ over another index $j$ on the path, i.e., $j = \pi[i]$ before the shortcut is made and the new value of $\pi[i]$ is at least the old value of $\pi[j]$. If $\mathrm{rank}(j) > \mathrm{rank}(i)$ then the rank of $i$ increases by at least one (to rank at least $\mathrm{rank}(j)$), but if $\mathrm{rank}(j) = \mathrm{rank}(i)$, then the rank of $i$ does not necessarily increase, but the level of index $i$ increases by at least one. It follows that for each Succ query and each rank at most one accessed index of that rank in $R$ changes neither rank nor level (the rightmost such index accessed). Over the sequence of operations, an index in $R$ can increase rank at most $\log_b n$ times and for each of the ranks it can increase level at most $b$ times. It follows that the total number of accesses to nodes in $R$ for all Succ queries is at most
$$m \, (1 + \log_b n) + |R| \, (1 + b) \log_b n .$$
The total time for the sequence of operations becomes $O(n + d + m(1 + \log_b n) + |R|(1 + b)\log_b n)$. Using $b = 2 + m/n$ and $|R| \leq \min(m, n + 1)$, the total time for $\mathrm{Init}(n)$, the $d$ Delete operations, and the $m$ Succ queries is $O(n + d + m \log_{2+m/n} n)$, as stated in Theorem 1.
To avoid the assumption that we never delete a value that is already deleted, observe that performing $\mathrm{Delete}(i)$ when $i \notin S$ resets $\pi[i]$ to $i + 1$. If $\pi[i] > i + 1$, then this resets $r(i)$, i.e., the rank of $i$ is reset to zero and the level of $i$ to one. But notice that the first path compression accessing $i$ will increase $\pi[i]$ to at least its value before the deletion, i.e., the lost (rank, level) potential is regained by the first access to $i$. So, in the analysis above we only need to account for one additional access to $i$ for each $\mathrm{Delete}(i)$ operation. If $\pi[i] = i + 1$, then we above used that $i + 1$ would never get indegree one again, if $i + 1 \notin R$, and argued that we only would visit $i + 1$ once as an intermediate node. Setting $\pi[i] = i + 1$ will cause $i + 1$ once again to gain indegree one, allowing one additional access to $i + 1$ as an intermediate node. We charge these two additional accesses to the deletion. The remaining analysis is unchanged, and Theorem 1 holds without the assumption that we never delete a value that is already deleted.
5.2 Proof of Theorem 2
In Figure 4 we show how to implement $\mathrm{RangeReport}(a, b)$ using repeated calls to Succ. If $z$ is the size of the total output of the $r$ RangeReport operations, these in total perform at most $z + r$ calls to Succ. We can replace $m$ in Theorem 1 with $m + z + r$ to obtain a bound on the total time for a sequence of operations also containing RangeReport queries. This bound is not linear in the output size, though.
To give a better bound, we denote a subset of the links outer links. If $v$ and $w$ are two consecutive elements in $S$, i.e., $w = \mathrm{succ}(v + 1)$, then the links on the path from $v + 1$ to $w$ are the outer links. Observe that RangeReport, after its first Succ query has been performed, only accesses outer links, which are shortcut, such that the outer path between $v$ and $w$ afterwards consists of a pointer directly from $v + 1$ to $w$, i.e., all traversed links between $v$ and $w$ become non-outer links except for one. I.e., accessing the outer links can be charged to the output size, plus the number of outer links becoming non-outer links, plus the initial query.
An outer link from $i$ is created when $i$ is deleted. Provided that we never reset $\pi[i]$ to $i + 1$ except at the initial deletion of $i$, a link that has become non-outer will never become outer again. It follows that we can charge each Delete the creation of one outer link, and the total time becomes the bound given by Theorem 1, plus $O(z)$ for the output size $z$, after replacing $m$ by $m + r$ in the bound to cover the initial Succ query performed by each RangeReport. This gives the bound stated in Theorem 2.
5.3 Proof of Theorem 3
The construction in Figure 3 for the combined structure is essentially the same as the structure of Section 2 with respect to the forward pointers for deleted indices. To maintain the linked predecessor chain, a $\mathrm{Delete}(i)$ operation for $i \in S$ needs to perform a $\mathrm{Succ}(i + 1)$ operation to update the predecessor link stored at $\mathrm{succ}(i + 1)$, before setting $\pi[i] = \mathrm{succ}(i + 1)$. The work is $O(1)$ in addition to the cost of the call to Succ.
Similar to the argument for the range reporting queries in Theorem 2, the links followed by the $\mathrm{Succ}(i + 1)$ call during a deletion are outer links, and we can apply the same accounting as in Theorem 2: all traversed outer links except one become non-outer links and can be charged for the traversal. The link from $i$ to $\mathrm{succ}(i + 1)$ becomes a new outer link.
Each of the Pred and RangeReport queries essentially performs one Succ query, plus constant work and work linear in the output, respectively. The total work becomes the cost of $\hat{m} = d + m + p + r$ Succ queries, plus the output size $z$, plus the cost of the Init and Delete operations. The bound stated in Theorem 3 follows as in the proof of Theorem 2.
6 Experimental evaluation
Variants of the proposed successor-delete data structure and some data structures from the literature were implemented in C, storing the arrays using 64-bit integers. The experiments were run on an HP EliteBook 640 G10 with an Intel Core i7-1365U processor, running Windows 11 (23H2), using GCC 14.2.0 (MSYS2), and compiled with optimization level -O3. The source code is available on GitHub (https://github.com/gsbrodal/sea25).
Each data point includes the time for $\mathrm{Init}(n)$ and all Delete and Succ operations performed. The time is the average over at least five runs, but over sufficiently many runs to get a total running time of at least one second. Each data point is the minimum observed when repeating this evaluation three times. We tested the correctness of all evaluated algorithms by checking that they returned the same answers to all tested successor queries (this test is not included in the measured running times).
We implemented five versions of the proposed successor-delete data structure:
1) with no path compression,
2) using recursive path compression,
3) a two-pass path compression,
4) a two-pass path compression and checked deletions, and
5) path halving.
For 1) – 3) and 5), deletions are implemented as in Figure 3, where $\pi[i]$ is always set to $i + 1$ when $i$ is deleted, whereas 4) only updates $\pi[i]$ if $i$ is still in the set, i.e., $\pi[i] = i$ (Figure 4). Furthermore, we implemented
6) the weighted quick-find data structure and
7) the union-find data structure with weighted linking and two-pass path compression. Whereas 1) – 5) only store the single array $\pi$ of integers, both 6) and 7) store an array with three integers per node (parent pointer, weight, and the maximum value in the set, i.e., the successor of all elements in the set).
Finally, we applied the microset-macroset idea from [9] to
8) the weighted quick-find structure,
9) the union-find structure with path compression and linking by weight, and
10) the successor-delete data structure with two-pass path compression.
For 8) – 10) a microset consists of 64 bits stored in a single long long, and we use the GCC builtin function __builtin_ctzll to compute the successor inside a microset (https://gcc.gnu.org/onlinedocs/gcc/Bit-Operation-Builtins.html).
That path compression is crucial to our data structure should be obvious: if we delete all elements without performing any path compression, all indices form one long chain of length $n$, and a subsequent $\mathrm{Succ}(0)$ takes linear time. Figure 5(left) shows that the total time is quadratic for $\mathrm{Init}(n)$ followed by deleting all elements and then performing $\mathrm{Succ}(0)$ $n$ times (since the time divided by $n$ is linear with slope 1 in a log-log plot). Figure 5(right) shows the same data without the variant with no path compression. All remaining algorithms have a theoretical running time of $O(n)$ for this sequence of operations, which is confirmed by all measured running times divided by $n$ being about constant. For large $n$, the recursive path compression failed because of a stack overflow on Windows, and we only show the results for the recursive path compression variant for the values of $n$ where it succeeded.
As noted after the statement of Theorem 1, the worst-case behavior of our algorithm per successor query is when the number of successor queries is about $n$. A sequence of operations forcing the successor-delete data structure with path compression to do $\Theta(n \log n)$ work is to delete the elements in increasing order, and after each deletion pick a worst-case query, i.e., a query $\mathrm{Succ}(i)$ with the longest path from $i$ to $\mathrm{Succ}(i)$. The upper bound follows by Theorem 1, whereas the matching lower bound for this sequence follows from a result of Fischer [6, Theorem 1], who considered this sequence to prove an $\Omega(n \log n)$ lower bound for the union-find problem with unweighted linking and path compression. Whereas our successor-delete data structure has a total running time of $\Theta(n \log n)$ for this input, both the weighted quick-find and the union-find data structures only need $O(n)$ time, since all unions link the growing set of deleted elements with a set of size one, i.e., both the union-find and the weighted quick-find data structures will be a star containing all the deleted elements. Figure 6(left) shows the running time for this sequence of operations. The results show that the running times of the successor-delete data structures are in the same ballpark as (but asymptotically slower than) the theoretically more efficient data structures.
Figure 6(right) shows results for $n$ times deleting a random integer (possibly an already deleted element), after each deletion performing a query $\mathrm{Succ}(i)$ for some $i$ with the longest path from $i$. The successor-delete data structures benefit on this input from the deletions being sparse for a long time, so that there are many small rooted trees with short leaf-to-root paths. Concerning the variant where deletions check whether an element is already deleted, it appears from Figure 6(right) that for random deletions on large inputs this check generates a non-negligible slowdown ("successor, 2-pass, checked" vs. "successor, 2-pass"). One could speculate that this is due to branch mispredictions and/or cache misses.
From Figure 6 it seems that combining the successor-delete data structure with the microset-macroset idea ("successor, 2-pass, microset") brings its performance to the same level as the union-find and weighted quick-find data structures combined with microsets ("union find, microset" and "quick find, microset", respectively) on the two types of operation sequences we consider.
Figures 7 and 8 consider the same setup as in Figure 6, except that the number of queries is varied from $1$ to $n$ and the queries are performed uniformly interleaved with the deletions. For incremental deletion sequences (Figure 7), it appears that the gap in performance between the successor-delete data structure and the theoretically more efficient data structures is maximized when there is one query per deletion. For random deletion sequences (Figure 8), the gap between the performance of deletions with and without checking whether an element has already been deleted appears consistent, independent of the number of queries.
7 Conclusion
We have considered a simple data structure for supporting successor queries in a set of integers in the range $\{0, \ldots, n\}$, and an implementation shows that the data structure has low constants in practice, although there exist data structures with lower asymptotic worst-case bounds. The experiments show that the running time of the proposed algorithm is in the same ballpark as weighted quick-find and union-find with path compression and linking by weight. The benefit of the proposed algorithm is that it only needs to store a single array of integers, as opposed to three arrays for the two other structures. The code for the proposed data structure is very short, but this is also the case for the union-find and weighted quick-find data structures, so all these data structures have low constants in their running time. An open problem is to analyze the theoretical performance of the data structure when using path halving.
References
- [1] Georgy M. Adelson-Velsky and Evgenii M. Landis. An algorithm for the organization of information. Proceedings of the USSR Academy of Sciences (in Russian), 146:263–266, 1962.
- [2] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.
- [3] Paul Beame and Faith E. Fich. Optimal bounds for the predecessor problem and related problems. Journal of Computer and System Sciences, 65(1):38–72, 2002. doi:10.1006/JCSS.2002.1822.
- [4] Edsger W. Dijkstra. A Discipline of Programming. Prentice-Hall, 1976.
- [5] Patrick Dinklage, Johannes Fischer, and Alexander Herlez. Engineering predecessor data structures for dynamic integer sets. In David Coudert and Emanuele Natale, editors, 19th International Symposium on Experimental Algorithms, SEA 2021, June 7-9, 2021, Nice, France, volume 190 of LIPIcs, pages 7:1–7:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.SEA.2021.7.
- [6] Michael J. Fischer. Efficiency of equivalence algorithms. In Raymond E. Miller and James W. Thatcher, editors, Proceedings of a symposium on the Complexity of Computer Computations, held March 20-22, 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, The IBM Research Symposia Series, pages 153–167. Plenum Press, New York, 1972. doi:10.1007/978-1-4684-2001-2_14.
- [7] Gianni Franceschini and Roberto Grossi. Optimal worst-case operations for implicit cache-oblivious search trees. In Frank K. H. A. Dehne, Jörg-Rüdiger Sack, and Michiel H. M. Smid, editors, Algorithms and Data Structures, 8th International Workshop, WADS 2003, Ottawa, Ontario, Canada, July 30 - August 1, 2003, Proceedings, volume 2748 of Lecture Notes in Computer Science, pages 114–126. Springer, 2003. doi:10.1007/978-3-540-45078-8_11.
- [8] Gianni Franceschini and Roberto Grossi. Optimal implicit dictionaries over unbounded universes. Theory of Computing Systems, 39(2):321–345, 2006. doi:10.1007/S00224-005-1167-9.
- [9] Harold N. Gabow and Robert Endre Tarjan. A linear-time algorithm for a special case of disjoint set union. Journal of Computer and System Sciences, 30(2):209–221, 1985. doi:10.1016/0022-0000(85)90014-5.
- [10] Bernard A. Galler and Michael J. Fischer. An improved equivalence algorithm. Communications of the ACM, 7(5):301–303, 1964. doi:10.1145/364099.364331.
- [11] Ashish Goel, Sanjeev Khanna, Daniel H. Larkin, and Robert Endre Tarjan. Disjoint set union with randomized linking. In Chandra Chekuri, editor, Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014, pages 1005–1017. SIAM, 2014. doi:10.1137/1.9781611973402.75.
- [12] Leonidas J. Guibas and Robert Sedgewick. A dichromatic framework for balanced trees. In 19th Annual Symposium on Foundations of Computer Science, Ann Arbor, Michigan, USA, 16-18 October 1978, pages 8–21. IEEE Computer Society, 1978. doi:10.1109/SFCS.1978.3.
- [13] Giuseppe F. Italiano and Rajeev Raman. Topics in data structures. In Mikhail J. Atallah, editor, Algorithms and Theory of Computation Handbook, Chapman & Hall/CRC Applied Algorithms and Data Structures series. CRC Press, 1999. doi:10.1201/9781420049503-C6.
- [14] Tianxiao Li, Jingxun Liang, Huacheng Yu, and Renfei Zhou. Dynamic “Succincter”. In 64th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2023, Santa Cruz, CA, USA, November 6-9, 2023, pages 1715–1733. IEEE, 2023. doi:10.1109/FOCS57990.2023.00104.
- [15] J. Ian Munro. An implicit data structure supporting insertion, deletion, and search in $O(\log^2 n)$ time. Journal of Computer and System Sciences, 33(1):66–74, 1986. doi:10.1016/0022-0000(86)90043-7.
- [16] Mihai Pătraşcu and Mikkel Thorup. Time-space trade-offs for predecessor search. In Jon M. Kleinberg, editor, Proceedings of the 38th Annual ACM Symposium on Theory of Computing, Seattle, WA, USA, May 21-23, 2006, pages 232–240. ACM, 2006. doi:10.1145/1132516.1132551.
- [17] Mihai Pătraşcu and Mikkel Thorup. Dynamic integer sets with optimal rank, select, and predecessor search. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 166–175. IEEE Computer Society, 2014. doi:10.1109/FOCS.2014.26.
- [18] Md. Mostofa Ali Patwary, Jean R. S. Blair, and Fredrik Manne. Experiments on union-find algorithms for the disjoint-set data structure. In Paola Festa, editor, Experimental Algorithms, 9th International Symposium, SEA 2010, Ischia Island, Naples, Italy, May 20-22, 2010. Proceedings, volume 6049 of Lecture Notes in Computer Science, pages 411–423. Springer, 2010. doi:10.1007/978-3-642-13193-6_35.
- [19] Giulio Ermanno Pibiri and Rossano Venturini. Dynamic elias-fano representation. In Juha Kärkkäinen, Jakub Radoszewski, and Wojciech Rytter, editors, 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, July 4-6, 2017, Warsaw, Poland, volume 78 of LIPIcs, pages 30:1–30:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPICS.CPM.2017.30.
- [20] Daniel Dominic Sleator and Robert Endre Tarjan. Self-adjusting binary search trees. Journal of the ACM, 32(3):652–686, 1985. doi:10.1145/3828.3835.
- [21] Robert Endre Tarjan. Efficiency of a good but not linear set union algorithm. Journal of the ACM, 22(2):215–225, 1975. doi:10.1145/321879.321884.
- [22] Robert Endre Tarjan. Amortized computational complexity. SIAM Journal on Algebraic Discrete Methods, 6(2):306–318, 1985. doi:10.1137/0606031.
- [23] Robert Endre Tarjan and Jan van Leeuwen. Worst-case analysis of set union algorithms. Journal of the ACM, 31(2):245–281, 1984. doi:10.1145/62.2160.
- [24] Theodorus Petrus van der Weide. Datastructures: An Axiomatic Approach and the Use of Binomial Trees in Developing and Analyzing Algorithms. PhD thesis, Mathematisch Centrum, Amsterdam, 1980. URL: https://www.cs.ru.nl/Th.P.vanderWeide/docs/1980-Weide-AxAppr.pdf.
- [25] Peter van Emde Boas. Preserving order in a forest in less than logarithmic time and linear space. Information Processing Letters, 6(3):80–82, 1977. doi:10.1016/0020-0190(77)90031-X.
- [26] Peter van Emde Boas, R. Kaas, and E. Zijlstra. Design and implementation of an efficient priority queue. Mathematical Systems Theory, 10:99–127, 1977. doi:10.1007/BF01683268.
- [27] Jan van Leeuwen and Theo van der Weide. Alternative path compression techniques. Technical Report RUU-CS-77-3, Department of Computer Science, University of Utrecht, The Netherlands, 1977. URL: https://dspace.library.uu.nl/handle/1874/15885.
- [28] J. W. J. Williams. Algorithm 232: Heapsort. Communications of the ACM, 7(6):347–348, 1964.
