Efficient Top-Down Updates in AVL Trees
Abstract
Since AVL trees were invented in 1962, two major open questions about rebalancing operations, which found positive answers in other balanced binary search trees, were left open: can these operations be performed top-down (with a fixed look-ahead), and can they use an amortised constant number of write operations per update? We propose an algorithm that solves both questions positively.
Keywords and phrases:
AVL trees, data structures, amortised complexity
2012 ACM Subject Classification:
Theory of computation → Sorting and searching
Editors:
Anne Benoit, Haim Kaplan, Sebastian Wild, and Grzegorz Herman
Series and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Balanced search trees are among the most fundamental and basic data structures in computer science. Their formidable story started with the invention of AVL trees by Adel’son-Vel’skii and Landis [1] in 1962, who proposed a data structure and a maintenance algorithm thanks to which the insertion, deletion and search of an element in a linearly ordered set of size n were feasible in time O(log n).
AVL trees are binary search trees in which each node maintains some form of balance: the heights of its children differ from each other by at most 1. Satisfying this property at each node ensures that the tree is of height O(log n). The number of comparisons required to search for an element in a binary search tree being linear in the height of the tree, it is thus logarithmic in n.
However, for instance, inserting a node into the sub-tree rooted at the left child of a given node may increase the height of that sub-tree, damaging the balance of the node itself. Thus, the challenge of maintenance algorithms consists in efficiently modifying the parenthood relations inside the tree in order to ensure that each node remains balanced.
Unfortunately, in spite of their excellent worst-case height, AVL trees suffer from updating algorithms that are significantly less efficient than those of other balanced binary search tree structures such as weight-balanced trees [9], red-black trees [4], half-balanced trees [11], splay trees [12] or, more recently, weak AVL trees [5]. Indeed, as illustrated in Table 1, they have two shortcomings, from which almost all other algorithms are exempt:
-
starting from an empty tree, performing n updates in a row may trigger Θ(n log n) rotations [2] or, more generally, Θ(n log n) write operations, which may also consist in modifying the balance of a node without changing the tree structure;
-
updates following an insertion or deletion must be performed in a bottom-up fashion; the inventors of weak AVL trees even said, in [5, p. 23], that “Top-down insertion or deletion with fixed look-ahead in an AVL tree is problematic. (We do not know of an algorithm; we think there is none.)”
| Tree family | Amortised write cost/update | Top-down update algorithm |
|---|---|---|
| AVL (state of the art) | Θ(log n) [2] | no |
| AVL (this article) | O(1) | yes |
| Weight-balanced | O(1) [3] | yes [7, 8] |
| Red-black | O(1) [4] | yes [4] |
| Half-balanced | O(1) [11] | for insertions [10] |
| Splay | O(log n) [12] | yes [12] |
| Weak AVL | O(1) [5] | yes [5] |
Requiring only O(1) amortised write operations per update instead of Θ(log n) would clearly improve the efficiency of AVL trees. As mentioned in [4, 5], being able to use top-down algorithms is also important, because requiring a bottom-up pass is costly: either it forces each node to maintain a link toward its parent, or it must be performed via recursive calls or storing visited nodes in a stack. This second approach is even more expensive in distributed settings, because each process should lock the entire branch it is visiting, preventing any update from starting before the previous update has ended.
A way of circumventing these deficiencies of AVL trees is to use weak AVL trees instead [5]. By having less demanding balance criteria, this variant of AVL trees offers update algorithms that require O(1) rotations in the worst case, and O(1) amortised write operations (which may include updating information on the balance of nodes) per update. Furthermore, these algorithms can be made top-down, with the following exception, which is common to all binary search tree data structures: instead of deleting an internal node, we must delete a node whose key immediately precedes or follows the key of that internal node, an approach called Hibbard deletion [6], and then replace the key of the internal node by that of the deleted node. In AVL trees, the deleted node must be a leaf or the parent of a leaf, which reduces the problem of node deletion to that of leaf deletion, up to keeping a pointer on an internal node we may wish to delete.
In this article, we prove that top-down insertion and deletion with fixed look-ahead in an AVL tree are feasible; having a look-ahead of k means that the algorithm, when it reads or writes information at a node, is forbidden to later read or write information at nodes that do not descend from the kth ancestor of that node. At the same time, this can also be done in a write-efficient manner, thus bridging a gap of theoretical importance: it was somewhat unsettling to observe that this seminal balanced tree structure was also the one for which no efficient algorithms had been found, thus leading specialists in this field to conjecture that such algorithms did not exist.
We proceed step-by-step: in Section 2, we briefly present textbook algorithms for updating AVL trees; in Section 3, we propose a bottom-up algorithm that requires O(1) amortised write operations per update; finally, in Section 4, we propose a top-down algorithm that requires O(1) amortised write operations per update. Note that, unlike algorithms for weak AVL trees or some other structures (e.g., red-black trees), a given update may require these algorithms to perform ω(1) rotations: our efficiency criterion is based on the amortised number of write operations (and, in particular, rotations) only, not on the maximal number of such operations that a single update may require.
Both algorithms have flavours similar to those used when designing weak AVL trees [5]: they rely on making sure that “transient” rebalancing operations decrease some integer-valued potential and, for the top-down algorithm, on being able to start and stop long enough yet finite sequences of such transient operations. This illustrates the usefulness of the approach used to design these algorithms, and also suggests that additional results similar to [5, Theorem 7.4] should also be valid here. The main difficulties, however, reside first in designing a suitable notion of potential, and then in managing to start and stop these sequences of transient operations; this task is easy when dealing with insertions only, and quite challenging when taking deletions into account.
2 State-of-the-art
2.1 Preliminaries
A binary search tree is an ordered rooted tree in which each tree node has one key, one left child and one right child, which can be tree nodes or null nodes; null nodes represent the empty tree and have no key nor children, and tree nodes whose two children are null nodes are called leaves. By definition, the descendants of a tree node consist of the node itself, its two children, and all their descendants (if they are tree nodes); these descendants form a sub-tree rooted at that node, and the node is one of their ancestors.
Node keys are pairwise distinct, and must belong to a linearly ordered set. The key of every tree node descending from the left child of a given node is smaller than the key of that node, and the key of every tree node descending from its right child is larger. Moreover, the height of a node is the length (i.e., the number of parent-child edges) of the longest downward path starting from that node and ending in a leaf. By convention, each null node has height -1.
A binary search tree is an AVL tree when, at each tree node, the height difference between the two children belongs to the set {-1, 0, 1}. This ensures that an AVL tree with n nodes is of height O(log n).
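As a quick illustration of these definitions (our own sketch, not taken from the paper), the following code checks the balance condition, using the convention that null nodes have height -1.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(node):
    # Null nodes (represented by None) have height -1 by convention.
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

def is_avl(node):
    # The heights of the two children must differ by at most 1, at every node.
    if node is None:
        return True
    return (abs(height(node.left) - height(node.right)) <= 1
            and is_avl(node.left) and is_avl(node.right))

# A complete tree on 7 keys is balanced; a right path on 3 keys is not.
balanced = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
path = Node(1, None, Node(2, None, Node(3)))
print(is_avl(balanced), height(balanced))  # True 2
print(is_avl(path))                        # False
```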
Here, we follow the presentation of [5] and assume that AVL tree structures are implemented by assigning a rank to each tree node. This rank is our current estimation of the node's height, which may be slightly invalid if, say, some leaf was inserted way below the node, and we did not yet have the time to update information about its height. Hence, we should control the rank differences between each node and its children. Focusing on these differences, we say that a node is an i-child if its rank is smaller than the rank of its parent by i, and that it is an (i,j)-node if its children are i- and j-children.
In practice, these ranks are rarely stored directly in AVL implementations: instead, each node contains either (i) the difference between the ranks of its two children, or (ii) the difference between the rank of its parent and its own rank. In what follows, we will disregard these implementation details, and pretend that node ranks are stored in the clear; adapting our algorithms to fit the frameworks (i) or (ii) is easy.
Since binary search trees represent ordered sets, the three main operations they must support are searching, inserting, and deleting an element. Searching is easy: when searching for a key in a non-empty tree, we compare that key with the key of the root; then, we keep searching in the left sub-tree if our key is smaller, or in the right sub-tree if it is larger, or we declare that the key was found if both keys coincide; whereas, when searching in an empty tree, we must declare that the key is nowhere to be found.
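The search procedure just described can be sketched as follows; the Node layout is an illustrative assumption.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def search(node, key):
    # Descend: left if the key is smaller, right if it is larger,
    # stop when the key is found or a null node (None) is reached.
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            node = node.right
        else:
            return node
    return None

root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(search(root, 5) is not None)  # True
print(search(root, 8) is None)      # True
```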
Inserting or deleting an element is substantially more difficult, because it requires altering the structure of the tree: this may change its height and even damage the balance of some tree nodes. As a result, the balance relation between heights might fail at one node, or one rank difference might no longer belong to the set {1, 2}. The purpose of bottom-up rebalancing algorithms is precisely to eliminate this anomaly, either directly, or by “propagating it upward”, i.e., making it concern a strict ancestor of that node, with a larger rank.
2.2 Tree cuts and cut rebalancing
A cut of a tree is a list of nodes of that tree that are pairwise incomparable for the ancestorship relation (i.e., the sub-trees they root are pairwise disjoint), ordered from left to right, and which is maximal for inclusion.
When the cut nodes have ranks r1, …, rm and the root of the tree has rank r, the profile of the associated cut is defined as the integer tuple (r - r1, …, r - rm). Accordingly, we say that a node has a given profile when the sub-tree it roots contains a tree cut of that profile. When the order of the integers is irrelevant, we may also say that the node has profile {d1, …, dm}, just writing a multiset of integers: this just means that there exists a permutation of the multiset such that the node has the corresponding profile. In particular, an (i,j)-node is simply a node with profile (i, j).
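Assuming, as the surrounding text suggests, that the profile of a cut lists the rank differences between the tree root and the cut nodes, it can be computed as follows (an illustrative helper, not taken from the paper).

```python
def profile(root_rank, cut_ranks):
    # The profile of a cut: rank differences between the tree root
    # and the cut nodes, listed from left to right.
    return tuple(root_rank - r for r in cut_ranks)

# A (1,2)-node of rank 5: its two children form a cut of profile (1, 2).
print(profile(5, [4, 3]))     # (1, 2)
print(profile(5, [4, 3, 3]))  # (1, 2, 2)
```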
Below, we may modify the structure of a tree, thus generalising so-called rotations: starting from a cut of the tree, consider the strict ancestors of the cut nodes, listed from left to right. We assign a new pair of children to each of these ancestors, and decide that the new root should be one of them, as long as the graph we obtain is a binary search tree rooted at that node; these ancestors are said to be affected by the cut rebalancing. While doing so, (i) we must never modify the structure or rank of any of the sub-trees rooted at the cut nodes, and (ii) the ranks of affected nodes are reassigned to make sure that each of them is a (1,1)- or a (1,2)-node in the resulting tree. Such an operation is called a cut rebalancing, and is easy to describe diagrammatically, like in Figure 1 below; cut nodes and their ancestors will typically be denoted with capital letters, from left to right.
2.3 Textbook bottom-up algorithm
Inserting a key may create a zero-edge, i.e., an edge between a parent and its child whose ranks are equal; in that case, this common rank is called the rank of the zero-edge. A zero-edge is created when the key is inserted below a leaf, because some child of that leaf, which used to be a null node, becomes a leaf itself.
Similarly, deleting a key may create a (1,3)- or a (2,2)-node, which we call a four-node; this happens if a (1,2)-node has a leaf child containing the key we wish to delete, because that leaf will be transformed into a null node.
Zero-edges are dealt with as follows, and as illustrated in Figure 1. Up to using a left-right symmetry, and without loss of generality, we assume that the lower end of the zero-edge is a left child; the ranks of both of its ends are then equal to the rank of the zero-edge:
-
1.
If the sibling of the lower end is a 2-child and the lower end is a (1,1)-node (row 1, left), the cut has profile (1, 1, 2). Let A, C and E denote the cut nodes from left to right, and let B and D denote their ancestors, B being the lower end of the zero-edge and D its parent. We rebalance by letting B become the parent of A and D, and D become the parent of C and E, thus obtaining a tree with root B. This cut rebalancing is a simple rotation.
-
2.
If the sibling of the lower end is a 2-child and the lower end B is a (1,2)-node whose 1-child is A (row 1, right), we perform the same rotation; however, the resulting ranks differ.
-
3.
If the sibling of the lower end is a 2-child and the lower end is a (2,1)-node (row 2, left), the cut is to be rebalanced as follows. Let A, C, E and G denote the cut nodes from left to right, and let B, D and F denote their ancestors, B being the lower end of the zero-edge, D its right child and F its parent. We rebalance by giving these seven nodes the shape of a complete binary tree: B becomes the parent of A and C, F becomes the parent of E and G, and D becomes the parent of B and F, as well as the root of the resulting tree. This is a double rotation.
-
4.
If the sibling of the lower end is a 1-child (row 2, right), we rebalance the tree by not changing its structure, but simply increasing the rank of its root. This is a promotion.
Doing so, whenever we face a zero-edge of rank k, we will either eliminate it at once, or replace it by a new zero-edge of rank k + 1, placed just above where the previous one used to be. Consequently, if inserting a key creates a zero-edge, we will simply propagate that zero-edge upward until it disappears.
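Assuming the rank conventions above (ranks stored in the nodes, null nodes of rank -1), the case selection can be sketched as follows; the Node class is our own, and the mapping to the rows of Figure 1 is indicative only.

```python
class Node:
    def __init__(self, rank, left=None, right=None):
        self.rank, self.left, self.right = rank, left, right

def rdiff(parent, child):
    # Rank difference of a child; null nodes (None) have rank -1.
    return parent.rank - (child.rank if child else -1)

def zero_edge_repair(d):
    # d is the upper end of a zero-edge: one of its children has the
    # same rank as d. Decide which textbook repair applies.
    b = d.left if rdiff(d, d.left) == 0 else d.right   # lower end
    sibling = d.right if b is d.left else d.left
    if rdiff(d, sibling) == 1:
        return 'promotion'        # the sibling is a 1-child
    outer = b.left if b is d.left else b.right          # b's outer child
    if rdiff(b, outer) == 1:
        return 'simple rotation'  # the outer child of b is a 1-child
    return 'double rotation'      # only b's inner child is a 1-child

print(zero_edge_repair(Node(1, Node(1, Node(0), None), Node(0))))  # promotion
```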
Likewise, four-nodes are dealt with as follows, and as illustrated in Figure 2. Up to using a left-right symmetry, we assume that, whenever the four-node is a (1,3)-node, its 1-child is its left child:
-
5.
If the four-node is a (1,3)-node whose 1-child is a (1,1)-node (row 1, left), we perform a simple rotation.
-
6.
If the four-node is a (1,3)-node whose 1-child is a (1,2)-node, the 1-child of the latter being its left child (row 1, right), we perform the same rotation; the resulting ranks differ.
-
7.
If the four-node is a (1,3)-node whose 1-child is a (2,1)-node (row 2, left), we perform a double rotation.
-
8.
If the four-node is a (2,2)-node (row 2, right), we decrease the rank of its root; this is a demotion.
Cut rebalancing operations 5 to 7 are the same as operations 1 to 3 for eliminating zero-edges, except that the ranks of the rebalanced nodes and their descendants were all shifted down by one. In all eight cases, these operations result either in eliminating the four-node or in making its (former) parent a new four-node.
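Mirroring the sketch given for zero-edges, the four-node case selection can be caricatured as follows (our own sketch; the correspondence with cases 5 to 8 is indicative only).

```python
class Node:
    def __init__(self, rank, left=None, right=None):
        self.rank, self.left, self.right = rank, left, right

def rdiff(parent, child):
    # Null nodes (None) have rank -1.
    return parent.rank - (child.rank if child else -1)

def four_node_repair(d):
    # d is a four-node: a (2,2)-node or a (1,3)-node created by a deletion.
    dl, dr = rdiff(d, d.left), rdiff(d, d.right)
    if dl == dr == 2:
        return 'demotion'         # (2,2)-node: just decrease the rank
    b = d.left if dl == 1 else d.right           # the 1-child of d
    outer = b.left if b is d.left else b.right   # b's child on the outside
    if rdiff(b, outer) == 1:
        return 'simple rotation'  # b is (1,1), or its outer child is a 1-child
    return 'double rotation'      # only b's inner child is a 1-child

# A (2,2)-node of rank 2: both children lie two ranks below.
print(four_node_repair(Node(2, Node(0), Node(0))))  # demotion
```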
3 Efficient bottom-up updates
3.1 Overview and complexity analysis
The textbook algorithm presented in Section 2.3 aims at eliminating or propagating upward zero-edges and four-nodes. Thus, each insertion or deletion update consists in (i) finding the location at which we want to create or delete a tree leaf, (ii) observing that we may have created a zero-edge or a four-node, (iii) moving this anomaly upward, until (iv) we eliminate it once and for all. Step (i) requires no write operation on the tree structure, and steps (ii) and (iv) occur once per update. However, step (iii) may require arbitrarily many write operations per update.
Hence, cut rebalancing operations of cases 1 to 8 are called terminal if they result in eliminating the anomaly (thus being part of step (iv)), and transient if they just propagate it upward (thus being part of step (iii)): transient operations are operations 1 and 4 when the root of the rebalanced tree used to be a 1-child, and operations 6 to 8 when the four-node had a parent that used to be a (1,2)-node. We shall prove that updating n times an initially empty AVL tree triggers only O(n) transient operations.
With the textbook algorithm, the statement we wish to prove is invalid, as was shown in [2]. Consequently, we will tweak the operations performed in cases 1 to 8, aiming at making them terminal whenever possible.
The complexity proof of the resulting algorithm is based on potential techniques. In our case, we define the potential of a tree as the number of (1,2)-nodes, plus twice the number of full nodes it contains, where full nodes are defined as follows.
In the absence of zero-edges, a node is called full when it is a (1,1)-node, its two children are (1,1)-nodes, and at least one of its grand-children is a (1,1)-node. In other words, a node is full when it has profile (3, 3, 2, 2, 2) or, symmetrically, (2, 2, 2, 3, 3). When zero-edges are (temporarily) allowed, a node may also be considered full if it is a (0,1)-node whose 0-child is a (1,2)-node; or if it is a (1,1)-node whose children are (0,1)- and/or (1,1)-nodes, one of its grand-children being a 0-child or a null, a (0,1)- or a (1,1)-node.
As we will prove:
-
1.
each update or write operation modifies the neighbourhood (parenthood relations and/or rank) of a constant number of nodes, thus increasing the tree potential by at most a constant;
-
2.
each transient operation aimed at eliminating a zero-edge destroys one full node and creates one (1,2)-node, thus decreasing the tree potential;
-
3.
each transient operation triggered by a deletion decreases the number of (1,2)-nodes, and does not increase the number of full nodes, thus also decreasing the tree potential.
Statement 1 will be immediate, and thus we will focus on proving statement 2 in Section 3.3 (this will be Lemma 3), and statement 3 in Section 3.4 (this will be Lemma 4). Once all three statements are proved, we will finally be able to derive the following result.
Theorem 1.
Starting from an empty tree, performing n insertion or deletion updates requires only O(n) write operations in total, i.e., O(1) amortised write operations per update.
Proof.
Starting from an empty tree, let n insertion or deletion updates be performed. Let us keep track of the variations of the tree potential while these updates are performed. There exists a constant c such that each update requires:
-
1.
inserting or deleting a leaf, an operation that increases the potential by at most c;
-
2.
eliminating at most four zero-edges of rank 0, each of which increases the potential by at most c;
-
3.
performing a certain number of transient operations, each of which decreases the potential by at least one;
-
4.
performing at most one terminal operation, which increases the potential by at most c.
Hence, if t transient operations were performed in total, the potential increased by at most 6cn - t. It started with the value 0, and ended with a non-negative value, which shows that t ≤ 6cn. Therefore, in total, these n updates required performing no more than O(n) write operations.
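The accounting in this proof boils down to simple arithmetic, which the following sketch simulates; the constant C and the random number of transient operations per update are illustrative assumptions.

```python
import random

# Every update adds at most C to the potential, every transient operation
# removes at least 1, and the potential never goes negative; hence the
# transient operations number at most C * n in total.
random.seed(0)
C, n = 6, 1000
potential, transient_total = 0, 0
for _ in range(n):
    potential += C                    # leaf update, zero-edges, terminal op
    t = random.randint(0, potential)  # some transient operations occur...
    potential -= t                    # ...each decreasing the potential by 1
    transient_total += t
assert potential >= 0
assert transient_total <= C * n
print(transient_total, '<=', C * n)
```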
3.2 Cut promotions and demotions
Some cut profiles, such as the ones that characterise full nodes, allow rebalancing trees to either increase or decrease their rank. More precisely, if we are lucky enough, a tree with rank r and such a profile will always admit a cut, whose profile does not need to be that one, that can be rebalanced to obtain a tree of rank r + ε, where ε is an integer of our choice. This cut rebalancing operation is called a cut promotion when ε > 0, and a cut demotion when ε < 0.
Some profiles ensure that a cut promotion or demotion can be performed, as outlined by the following result.
Lemma 2.
Trees with a full root can always be subject to a cut promotion; the root of the resulting tree is a (1,2)-node. By contrast, trees with a suitable profile can always be subject to a cut demotion; the root of the resulting tree is a (1,1)-node.
Proof.
The recipe given in Figure 3 (row 1) shows how to apply a cut promotion to a tree whose root is full. Hence, let us focus on applying a cut demotion on a tree that admits a cut with profile .
Replacing the three nodes with rank by their children gives an -node cut, whose nodes all have rank or , at least five of these nodes having rank . These nodes and their ancestors can be reshaped to form a -node complete binary tree of height , whose nodes at height must have rank or . Moreover, among the four leaves , at least one has rank , which proves that the root of has a left child of rank ; similarly, its right child has rank , and is a -node of rank . This is illustrated in Figure 3 (row 2).
In general, when cut-rebalancing a -node cut transforms a sub-tree into a tree with root and rank , or , the tree potential increases by no more than . Indeed, cut ancestors were affected, possibly becoming -nodes or full nodes; the parent of , whose child is now , might also become a -node or a full node; and the grand-parent and great-grand-parent of might become full nodes. Hence, cut promotions increase the tree potential by or less, and cut demotions increase this potential by or less.
3.3 Insertion
Let us modify the textbook insertion algorithm to make some transient operations, which occur in cases 1 and 4 above, terminal. In particular, all transient operations in our variant are already transient in the textbook algorithm.
Case 1 is easy to take care of. Indeed, let us say that a node is weak when it is either a leaf or a (1,2)-node whose 1-child is also weak. An immediate induction proves that, when using the textbook algorithm, the lower end of a zero-edge is always weak. This proves, in particular, the well-known fact that case 1 is actually spurious, since it features a zero-edge whose lower end is a (1,1)-node.
It remains to take care of case 4, which we subdivide into two sub-cases. First, we promote the root like in the textbook algorithm, thus obtaining a tree whose rank increased by one. Then,
-
4.
-
a)
if the lower end of the zero-edge is a (1,2)-node, we do nothing more: we just performed a node promotion;
-
b)
if the lower end of the zero-edge is a (1,1)-node, let us recall that it is weak, which constrains its profile as well as those of its sibling and parent; a cut demotion then transforms the tree into a tree of its original rank again: we just performed a terminal operation.
Lemma 3.
Each transient operation aimed at eliminating a zero-edge destroys one full node and creates one (1,2)-node, thus decreasing the tree potential.
Proof.
We just saw that the only transient case is case 4a, on the condition that the promoted node was a 1-child.
If the parent of is full after was promoted, it must now be a -node whose other child is a -child. Hence, before promoting , its parent was already a -node whose children were a -and a -node, its child having two children and that were a -child and a -node. This means that was already full.
Similarly, if the grand-parent of is full after was promoted, it must now be a -node whose children are a -node (the node ) and a -node (the sibling of ), one of its grand-children or , say , being a -child or a -node. Before promoting , the node was already a -node, and either , in which case was already a -node, or , in which case was already a -node. This means that was already full.
Finally, promoting the node, in case it used to be a 1-child, does not make any other of its ancestors a full node. Hence, our transient operation transformed a full node into a (1,2)-node, and did not create any other full node or (1,2)-node: as a result, it decreased the tree potential by one.
3.4 Deletion
Similarly, we wish to transform as many as possible of the transient operations, which occur in cases 6 to 8 of the textbook algorithm, into terminal operations.
The rotations and demotion performed in these cases transform a tree rooted at a four-node into a tree whose rank is one less and whose root is a (1,1)-node. If this root is full, performing a cut promotion on the tree results in another tree of the original rank; similarly, if some child of the root is full, performing a cut promotion on the sub-tree it roots and increasing the rank of the root by one also transforms the tree into another tree of the original rank; and if both children of the root are full, the root itself is also full, a case we already took care of. In all cases, this two-step transformation is a terminal operation.
If, however, neither the root nor its children are full, the rotation or demotion we performed in case 6, 7 or 8 did not create any full node. Indeed, replacing the original tree by a tree of smaller rank did not make any of its ancestors a full node. The only other nodes whose rank or induced sub-tree was changed are the new root and its children, which we precisely assumed are not full either. In that case, our operation is transient if the parent of the rebalanced tree was a (1,2)-node, and became a four-node.
Lemma 4.
Each transient operation triggered by a deletion decreases the number of (1,2)-nodes, and does not increase the number of full nodes, thus also decreasing the tree potential.
Proof.
We just proved that transient operations are those where the parent was a (1,2)-node and neither the new root nor its children became full. Hence, we did not create any full node, and it suffices to check, on a case-by-case basis, that the number of (1,2)-nodes decreased.
4 Top-down updates
4.1 Overview and complexity analysis
Our top-down algorithms for inserting a key into a tall enough AVL tree, and for deleting a key from it, follow the same general ideas, which were already successfully applied to weak AVL trees [5]. We will discover, in a top-down manner, the branch below which the key should be inserted or deleted: the first node of that branch is the root of the tree, and each subsequent node is a child of the previous one. Using a look-ahead, and after discovering the first few nodes on that branch, we must already decide what to do.
If we are lucky enough, we find a node at small depth along the branch that is deemed safe: propagating a zero-edge (during insertions) or a four-node (during deletions) that we would have planted along our branch far enough below will stop when meeting that node. When such a node exists among the top few nodes of the branch, we say that the tree was friendly; indeed, we should then simply focus on updating the sub-tree rooted at the safe node, whose rank is smaller than that of the tree itself, and which will be transformed into a new tree of the same rank.
This fortunate case suggests focusing only on updating trees whose root is safe. Hence, if the tree has an unsafe root, a preliminary step will consist in transforming it into a tree whose root is safe, which we will then update; all subsequent (recursively called) updates will be performed on trees whose root is safe.
If, however, the tree has a safe root but is unfriendly, we will choose a deep enough node on the branch, and transform the sub-tree it roots into a tree whose root is safe, while promoting it (during insertions) or demoting it (during deletions). This transformation will create a zero-edge or a four-node that will be propagated upward and whose propagation will stop when meeting the safe root; afterward, it just remains to update that sub-tree.
Once implemented, these ideas lead to a top-down updating algorithm, whose amortised complexity we study now.
Theorem 5.
Starting from an empty tree, performing n insertion or deletion updates with the top-down algorithm requires only O(n) write operations in total.
Proof.
Like for Theorem 1, we start from an empty tree, perform n insertion or deletion updates, and keep track of the variations of the tree potential while these updates are performed. There is a large enough constant c such that each update requires:
-
1.
optionally, making the tree root safe, which increases the tree potential by at most c;
-
2.
identifying some safe nodes; this requires no write operation and does not change the tree potential;
-
3.
operating on trees of small height, i.e., whose height is smaller than a constant; this requires performing a constant number of write operations, and increases the tree potential by at most c;
-
4.
performing a certain number of batches, each of which consists in (i) cut-demoting a node to make it safe and make its ancestor a four-node; (ii) optionally, performing a cut rebalancing for ensuring that some node will stop propagating four-nodes; (iii) transient operations; and (iv) one “terminal” operation where the four-node stops being propagated.
In each batch, steps (i), (ii) and (iv) increase the tree potential by no more than a constant, and each transient operation decreases the tree potential. Thus, each batch decreases the tree potential by , i.e., by no more than , where is a constant; it also requires performing write operations. In particular, if , and if batches of operations were performed in total, the potential increased by at most ; of course, making sure that means that we will need to determine a suitable constant .
The tree potential started with the value 0, and ended with a non-negative value, which shows that only O(n) batches were performed. Therefore, in total, these n updates required performing no more than O(n) write operations.
Hence, it remains to:
-
1.
define our notions of safe nodes, both for insertions and for deletions;
-
2.
prove that trees rooted at an unsafe node can be cut promoted to make their root safe, during insertions;
-
3.
prove that trees rooted at an unsafe node can be cut demoted to make their root safe, during deletions;
-
4.
compute the constants hidden in the notations above, in order to effectively compute a constant for which batches decrease the tree potential.
Below, we will prove that choosing during insertions, and during deletions is enough for our purposes: each batch will decrease the tree potential by at least , and other operations will increase the tree potential by per update. These values of are grossly sub-optimal: for example, choosing during insertions, and during deletions would be enough; computing these better values, although not intrinsically difficult, involves lengthy case-by-case studies that we chose to avoid here.
4.2 Insertion
Below, we say that a node is insertion-safe (or simply safe, when the context is clear) when (i) and is a -node, or (ii) and is not full. We showed in Section 3.2 that each tree rooted at a full node, i.e., each tree of rank whose root is unsafe, can be subject to a cut promotion, resulting in a tree of rank whose root is a -node, thus being safe.
This makes following the recipe of Section 4.1 easy once the desired integer is chosen. First, if , just use the bottom-up algorithm of Section 3.3, which may result in a tree of rank only in case was rooted at an unsafe node. Otherwise, , and:
-
1.
if is rooted at an unsafe node, cut-promote it, thus obtaining a tree of rank whose root is safe;
-
2.
if is friendly, let be the least positive integer for which is safe: we simply focus on inserting the key in ;
-
3.
if is unfriendly, the first nodes on our branch are all unsafe; thus, they are -nodes, and the branch contains at least nodes, with the guarantee that , so that all nodes are full; hence we cut-promote and promote all nodes , thereby creating a zero-edge between and , which we eliminate by using the cut rebalancing prescribed in Section 3.3.
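The window mechanism of cases 2 and 3 can be caricatured as follows; this sketch only tracks safety bits along the branch, and the names and the restart policy are simplifying assumptions of ours, not the paper's exact bookkeeping.

```python
def plan_descent(safe, k):
    # safe[i] says whether the i-th node of the search branch is safe.
    plan, i = [], 0
    while i < len(safe):
        window = safe[i:i + k]          # the nodes the look-ahead reveals
        if True in window:
            j = window.index(True)      # friendly: a safe node is in sight
            plan.append(('recurse', i + j))
            i += j + 1                  # restart below the safe node
        else:                           # unfriendly: k unsafe (full) nodes
            plan.append(('promote', i, i + len(window) - 1))
            i += len(window)            # promote the run, restart below it
    return plan

print(plan_descent([False, False, False, True, False], 3))
```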
Lemma 6.
Step 3 consists in a batch operation that decreases the tree potential by or more.
Proof.
It suffices to decompose step 3 in atomic operations:
-
i)
a cut promotion, which increases the tree potential by or less;
-
ii)
replacing each of the full nodes by a -node, which decreases the tree potential by ;
-
iii)
eliminating a zero-edge through a terminal cut rebalancing performed on a cut with to nodes, and optionally followed by a cut demotion; this increases the tree potential by or less.
4.3 Deletion
We will now characterise deletion-safe (or simply safe, when the context is clear) nodes: a node should be safe when they can stop propagating four-nodes. There will be three families of safe nodes: (i) surely safe nodes, which stop propagating four-nodes even when using the textbook deletion algorithm, (ii) barely safe nodes, which are some of the nodes that can be transformed to create surely safe nodes in their close vicinity, and (iii) simply safe nodes, which stop propagating four-nodes when using the algorithm of Section 3.4.
Surely safe nodes are the nodes of rank with profile , or rank and profile . Indeed, if some child of is to be replaced by a node of smaller rank, either (i) had profile , in which case its new profile is simply , or (ii) had profiles and , in which case was the -child of (because the -child of had profile ), and we fall in case 5 of the bottom-up algorithm.
Detecting other kinds of safe nodes is more challenging; in particular, whether a node is safe may depend on which descendants of are to be replaced by nodes of smaller rank. More precisely, let be a node that belongs to the branch below which the key will be deleted, and let .
Lemma 7.
Each such tree can be subject to a cut rebalancing, called a cut saving, that yields a tree of the same rank, or of rank one less, in which the node has a surely safe ancestor. This cut saving increases the tree potential by no more than a constant.
Proof.
Let be the root of . If is surely safe already, there is nothing to do. Hence, we assume that has profiles and . Replacing its descendants of rank by their children gives a cut with profile , or .
In the first case, the children of must have profiles and , and once again, there is nothing to do. In the third case, cut-demoting transforms it into a tree of rank whose root is a -node.
It remains to treat the second case. Let be the node of rank in the cut , and let , , and be the cut ancestors. By symmetry, we assume that coincides with , or ; if , that descends from , or ; and if and descends from , there are no more -nodes among and than among and . Then, we distinguish ten cases; the tree will be of rank in cases 1 to 8, and in cases 9 and 10:
1. If (i) or and (iii) descends from , or , we rebalance as shown in Figure 4 (tree ); this makes a surely safe ancestor of .
2. If (i) , (ii) is a -node and (iii) descends from or , we rebalance as shown in Figure 4 (tree ); this makes a surely safe ancestor of .
3. If (i) , (ii) is a -node (whereas is not), and (iii) descends from or , we rebalance as shown in Figure 4 (tree ); this makes a surely safe ancestor of .
4. If (i) , (ii) is a -node, and (iii) descends from or , we rebalance as shown in Figure 4 (tree ); this makes a surely safe ancestor of .
5. If (i) , (ii) is a -node (whereas is not), and (iii) descends from or , we rebalance as shown in Figure 4 (tree ); this makes a surely safe ancestor of .
6. If (i) and (iii) descends from or , we rebalance as shown in Figure 4 (tree ); this makes a surely safe ancestor of .
7. If (i) , (ii) is a -node, and (iii) descends from , we rebalance as shown in Figure 4 (tree ); this makes a surely safe ancestor of .
8. If (i) , (ii) is a -node (whereas is not), and (iii) descends from , we rebalance as shown in Figure 4 (tree ); this makes a surely safe ancestor of .
9. If (i) or , (ii) all of , and coincide with or are -nodes, and (iii) descends from or , we first rebalance as shown in Figure 4 (tree ). The sub-tree has profile and we cut-demote it, before demoting the node itself; this yields a tree of rank -1 whose root is a -node.
10. If (i) , (ii) all of , , and are -nodes, and (iii) descends from , we first rebalance as shown in Figure 4 (tree ). The sub-tree has profile and we cut-demote it, before demoting the node itself; this yields a tree of rank -1 whose root is a -node.
All cases consist in performing a cut rebalancing operation on a cut with at most nodes, optionally followed by a cut demotion. Hence, the saving operation increases the tree potential by no more than .
We can now present two additional notions of safety. First, a node is said to be barely safe when the cut saving transforms into a tree of rank . Second, as already mentioned, a node is said to be simply safe if it stops propagating four-nodes. Fortunately, deciding whether a node is simply safe is easy, as outlined by the following result.
Lemma 8.
Being simply safe depends on local criteria only.
Proof.
Provided that the node is not surely safe, and if the node has just been replaced by a node of smaller rank, the textbook algorithm must now replace by a tree whose root is a -node with rank . Furthermore, as can be seen in Figure 2, at least one of its children and will also be a -node. Hence, we say that is -strong when the child is a -node.
Being - or -strong depends on local criteria only. Indeed, the node is -strong when:
- coincides with , and has profile , , or ; or
- coincides with , and has profile , or .
Hence, being -strong is a purely local criterion and, symmetrically, so is being -strong.
Then, knowing whether is -strong and/or -strong allows checking whether itself is safe, based on other local criteria. For instance, when coincides with , the node is safe when it has profile or , or in the following cases:
6. and , and
   a) has profile , or , or
   b) has profile and is -strong;
7. and , and
   a) has profile , , , , , , or , or
   b) has profile or and is -strong;
8. and , and
   a) has profile , or
   b) has profile and is -strong.
These cases were listed by simply enumerating all conditions that could lead to creating a full node while cut-rebalancing a tree in cases 6 to 8. Hence, some of them are redundant; for instance, in case 7a, if has profile , it must also have had profile , which was sufficient for flagging it as safe. Symmetric characterisations exist when coincides with the left child of the node .
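Since the concrete profile sets enumerated in the lemma were lost from this copy, the following sketch only illustrates the *shape* of the test with hypothetical placeholder sets: the point is that it reads one node, its children, and their stored profiles, so it runs in constant time on purely local data.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    profile: tuple                    # rank differences to the children
    left: Optional["Node"] = None
    right: Optional["Node"] = None

STRONG = {(1, 1), (1, 2)}             # hypothetical "strong" profiles
SAFE_ALONE = {(1, 1)}                 # hypothetical: safe unconditionally
SAFE_IF_CHILD_STRONG = {(1, 2)}       # hypothetical: needs a strong child

def is_strong(child: Optional[Node]) -> bool:
    """Local test: inspects only the child's stored profile."""
    return child is not None and child.profile in STRONG

def is_simply_safe(node: Node) -> bool:
    """Local safety test in the spirit of Lemma 8: the decision uses
    the node's own profile plus at most one child's strength."""
    if node.profile in SAFE_ALONE:
        return True
    return node.profile in SAFE_IF_CHILD_STRONG and is_strong(node.right)
```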
Safe nodes and the above saving procedure allow us to implement the recipe succinctly described in Section 4.1. If is of rank and its root is barely safe, first use the saving procedure to let a surely safe node appear on the branch below which will be deleted, and then use the bottom-up algorithm of Section 3.4; this will result in a tree of rank . Otherwise, if is of rank and its root is not barely safe, directly use the bottom-up algorithm of Section 3.4. It may result in a tree of rank only when was rooted at an unsafe node.
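The dispatch just described can be sketched as follows; `barely_safe`, `cut_saving` and `bottom_up_delete` are hypothetical stand-ins for the procedures of Sections 3.4 and 4, not the paper's code.

```python
def delete(tree, key, barely_safe, cut_saving, bottom_up_delete):
    """Run the saving procedure first whenever the root is barely safe,
    so that the subsequent bottom-up pass cannot lower the tree's rank."""
    if barely_safe(tree):
        tree = cut_saving(tree)          # plants a surely safe ancestor
    return bottom_up_delete(tree, key)   # rank may drop only if the
                                         # root was unsafe from the start
```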
We now focus on trees of rank :
1. If is rooted at an unsafe node, the saving procedure transforms it into a tree of rank whose root is safe.
2. If is friendly, let be the least positive integer for which is safe: we simply focus on deleting the key in .
3. If is unfriendly, we first consider two sub-cases: if the root of is surely or simply safe, we set ; otherwise, applying the saving algorithm to results, without altering the sub-tree , in a tree of rank in which has a surely safe ancestor, the least of which we call . Then, observing that , we apply the saving procedure to the sub-tree , which makes a four-node. We propagate once this four-node by using the textbook deletion algorithm, i.e., we do not cut-promote full-nodes that might result from using the textbook algorithm; then, we keep propagating our four-node using the algorithm from Section 3.4, a propagation that will stop only when reaching the node , at which point the four-node is eliminated.
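The propagation phase of step 3 can be sketched as follows, with an illustrative node representation and a caller-supplied predicate standing in for the surely-safe test: the four-node is pushed upward along the ancestors until a surely safe ancestor absorbs it.

```python
def propagate_four_node(ancestors, surely_safe):
    """`ancestors` lists the nodes from the four-node's parent up to the
    root; returns how many propagation steps were performed before the
    four-node was eliminated at a surely safe ancestor."""
    steps = 0
    for node in ancestors:
        steps += 1
        if surely_safe(node):  # the four-node is eliminated here
            break
    return steps
```

Each iteration is one of the constant-size rebalancing steps counted in Lemma 9, which is why the number of propagations governs the potential decrease.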
Lemma 9.
Step 3 consists in a batch operation that decreases the tree potential by or more.
Proof.
As for Lemma 4, it suffices to decompose step 3 into atomic operations:
- i)
- ii)
- iii) propagating once the four-node by using the textbook deletion algorithm on : this consists in rebalancing a cut with at most nodes, thus increasing the tree potential by no more than ;
- iv) propagating the four-node by using the algorithm of Section 3.4 on nodes , and stopping just before one meets the surely safe ancestor of ; there are at least such propagations, hence the tree potential decreases by at least ;
- v) eliminating the four-node by using the algorithm of Section 3.4 on ; this consists in rebalancing a cut with at most nodes, then possibly performing a cut promotion, and even a node promotion afterward; these three phases, two of which are optional, increase the tree potential by at most , , and , respectively.
References
- [1] Georgii Adel’son-Vel’skii and Evgenii Landis. An algorithm for organization of information. In Doklady Akademii Nauk, volume 146(2), pages 263–266. Russian Academy of Sciences, 1962.
- [2] Mahdi Amani, Kevin Lai, and Robert Tarjan. Amortized rotation cost in AVL trees. Information Processing Letters, 116(5):327–330, 2016. doi:10.1016/J.IPL.2015.12.009.
- [3] Norbert Blum and Kurt Mehlhorn. On the average number of rebalancing operations in weight-balanced trees. Theoretical Computer Science, 11:303–320, 1980. doi:10.1016/0304-3975(80)90018-3.
- [4] Leonidas Guibas and Robert Sedgewick. A dichromatic framework for balanced trees. In 19th Annual Symposium on Foundations of Computer Science (FOCS), pages 8–21. IEEE Computer Society, 1978. doi:10.1109/SFCS.1978.3.
- [5] Bernhard Haeupler, Siddhartha Sen, and Robert E. Tarjan. Rank-balanced trees. ACM Transactions on Algorithms (TALG), 11(4):1–26, 2015. doi:10.1145/2689412.
- [6] Thomas Hibbard. Some combinatorial properties of certain trees with applications to searching and sorting. Journal of the ACM, 9(1):13–28, 1962. doi:10.1145/321105.321108.
- [7] Vincent Jugé. Grand-children weight-balanced binary search trees. In 19th International Symposium on Algorithms and Data Structures (WADS), volume 349 of LIPIcs, pages 14:1–14:20, 2025. doi:10.4230/LIPICS.WADS.2025.14.
- [8] Tony Lai and Derick Wood. A top-down updating algorithm for weight-balanced trees. International Journal of Foundations of Computer Science, 4(4):309–324, 1993. doi:10.1142/S0129054193000201.
- [9] Jürg Nievergelt and Edward Reingold. Binary search trees of bounded balance. In 4th Annual ACM Symposium on Theory of Computing (STOC), pages 137–142. ACM, 1972. doi:10.1145/800152.804906.
- [10] Hendrik Olivié. A study of balanced binary trees and balanced one-two trees. Universitaire Instelling Antwerpen (Belgium), 1980.
- [11] Hendrik Olivié. A new class of balanced search trees: half-balanced binary search trees. RAIRO. Informatique théorique, 16(1):51–71, 1982. doi:10.1051/ita/1982160100511.
- [12] Daniel Sleator and Robert Tarjan. Self-adjusting binary search trees. Journal of the ACM, 32(3):652–686, 1985. doi:10.1145/3828.3835.
