Enumeration of Ordered Trees with Leaf Restrictions
Abstract
An -ary tree for a constant is a rooted tree in which each node has at most children. A node having no children is called a leaf. For a given rooted tree and a node , the number of edges from the root to is called the depth of . We call a vector of nonnegative integers an (-ary) distribution if there is an -ary tree such that the number of leaves at each depth in is . Although not every vector of nonnegative integers is a distribution, a distribution can be associated with many -ary trees. In this paper, we present an algorithm to enumerate all -ary trees for a given distribution. Our algorithm reports the first tree in time, and then each subsequent -ary tree in time by representing each tree as the difference from the previous one. The algorithm can be restricted to computing all trees that are full, i.e., trees whose nodes have exactly or no children.
Keywords and phrases:
binary trees, ordered trees, rooted trees, enumeration algorithm, constant-time delayCategory:
ResearchFunding:
Yasuaki Kobayashi: JSPS KAKENHI Grant Numbers JP23K28034, JP24H00686, and JP24H00697.Copyright and License:
2012 ACM Subject Classification:
Mathematics of computing Enumeration ; Mathematics of computing Trees ; Theory of computationEditors:
Alessio Conte, Andrea Marino, Giovanna Rosone, and Jeffrey Scott VitterSeries and Publisher:
Open Access Series in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
It is our pleasure to dedicate this work to Roberto Grossi, whose broad enthusiasm for algorithms and data structure has significantly shaped the landscape of enumeration algorithms [9]. Over the past decades, his contributions made significant impact on structural and algorithmic aspects of trees, strings, and graphs, inspiring both theoretical advances and practical applications.
In this spirit, we turn our attention to the enumeration of ordered trees, a topic rooted in combinatorics with strong connections to data structures, formal languages, and algorithmic analysis. Ordered trees play a fundamental role in various domains, hierarchical data representations, and the analysis of recursive algorithms are major examples. Enumerating ordered trees uncovers rich combinatorial patterns, with which we hope to offer a tiny tribute to Roberto’s enduring influence, exploring a topic in line with his passion for enumerating combinatorial structures of mathematical beauty.
We here tackle a problem whose restricted version on binary trees has been posed in 2023 as an open problem presented at IWOCA 2023: The question is to efficiently enumerate all rooted full binary trees whose number of leaves at each depth is given as an input. This problem spawned from the research of prefix-free codes in coding theory.
Here, we tackle this enumeration problem in even more general settings: binary (not necessarily full) trees and -ary trees of a fixed depth for a constant . For any of these tree types, our algorithm outputs the first solution in time. Subsequently, it outputs the difference between two successive solutions in time with constant delay, where is the number of leaves and is the maximum number of leaves at any depth. We here define delay as the time between the end of the previous output and the start of the subsequent output.
2 Related Work
With respect to the enumeration of rooted ordered trees, we are aware of an algorithm with constant delay to output all trees [3], or with restrictions on a specific diameter [12] or on a fixed number of internal nodes and leaves [17]. Given the number of internal nodes, Zaks [18] gave an enumeration algorithm for all -ary trees having internal nodes. This algorithm reports the trees in a lexicographical order based on a bit encoding of the trees. Recently, an algorithm [5] enumerating AVL trees has been proposed. In addition, closed formulas have been studied for counting various properties in rooted ordered trees of unbounded degree [6, 7].
For the enumeration of full binary trees with leaves, we propose a -bit encoding of a full binary tree. We use this encoding in the same way as Ruskey and Proskurowski [13] for the enumeration of binary trees by using Gray codes for enumerating bit encodings. Another bit encoding variant has been proposed by Baba et al. [1], who however used a depth-first encoding based on DFUDS [2], while our encoding is level-wise in the spirit of LOUDS [10].
3 Preliminaries
Given a fixed integer , an -ary tree is an ordered rooted tree whose nodes each have at most children, which are ordered. In this paper, we only consider trees that have at least two nodes. Except for the root, the parent of each node is uniquely determined. There is a left-to-right ordering of the child nodes with respect to the parent. A node is called a leaf if it has no child, otherwise it is called an internal node. For a given rooted tree and a node , the number of edges from the root to is called the depth of , which we denote by . In particular, the depth of the root is zero. The height of the tree is the depth of the leaf with the largest depth. A full -ary tree is an -ary tree whose nodes are either leaves or internal nodes with children. A special case is , for which we name a -ary tree a binary tree.
In what follows, we start with the enumeration of full binary trees, which is the easiest case. Then we extend our techniques to arbitrary binary trees, and finally consider -ary trees.
4 Properties of Binary Trees
Our input is a so-called binary (tree leaf) distribution, a term coined by Buro [4, Lemma 2.3]. It is defined as follows. We call an integer vector a binary distribution if there is a binary tree with depth such that has leaves with depth for every . We say that the binary (tree leaf) distribution of is – there can be many trees that have the same binary distribution. A binary distribution is also called a full-binary distribution if there is a full binary tree with the same property. Buro [4, Thm. 2.4] showed that the full-binary distribution of a full binary tree with maximum path length is unique. Here we study the connection of arbitrary binary distributions and their corresponding trees. In detail, the first problem we want to tackle in this paper is as follows.
Problem 1.
Given a (full-)binary distribution , enumerate all (full) binary trees whose (full-)binary distribution is .
We first characterize the necessary conditions for an integer vector to be a (full) binary distribution. For that, we make use of Kraft’s inequality [11].
Lemma 2 (Kraft’s inequality).
For a binary tree with leaves , it holds that .
Let be the weight of a vector of nonnegative integers. Then is a necessary condition according to Kraft’s inequality. We show that it is also sufficient.
Lemma 3.
A vector of nonnegative integers is a binary distribution if and only if .
Proof by induction on the depth ..
For , the binary tree can have only one or two leaves, i.e., or , and in any case holds. For the induction hypothesis, assume that our claim holds for a depth . Given a vector of nonnegative integers with , we show that there exists a binary tree whose binary distribution is . For that, we define a vector and distinguish the following two cases.
-
If is even, we set , and thus . By the induction hypothesis, there is a binary tree whose binary distribution is . If we expand leaves with depth in to internal nodes with two leaves, we obtain a binary tree whose binary distribution in .
-
If is odd, we rewrite the inequality as . Observing that the right summation only consists of even terms except for , the left-hand side is even while the right-hand side is odd. This allows us to increment the right-hand side by one while still retaining validity, i.e., as . We therefore can apply the even case analysis for the vector .
Lemma 4.
A vector with nonnegative integers and is the full-binary distribution of a full binary tree if and only if .
Proof.
We prove both directions by induction, for which we use as the starting point. For depth , on the one hand, there is only one full binary tree with the root having two leaves as children. On the other hand, there is only one vector with , and therefore the claim holds. For the induction hypothesis, let us assume that the claim holds for .
Direction .
Consider a full binary tree whose full-binary distribution is . We can group the leaves into pairs, each pair of leaves sharing the same parent at depth . We therefore can contract these pairs with their parents to a contracted tree having leaves at depth . Since is a full binary tree and , according to the induction hypothesis, the full-binary distribution of satisfies . This yields because
Direction .
Given a vector with nonnegative integers such that , i.e., . Multiplying this equation on both sides by gives , and hence must be even since all other terms are even. Therefore, is integer and is a vector with because
By the induction hypothesis, is a full-binary distribution of a full binary tree . We now expand leaves of at depth to internal nodes, each having two leaf children. This gives us a full binary tree whose full-binary distribution is .
Let for .
Corollary 5.
Given a full-binary tree distribution , then is the number of nodes with depth , which is even, for every .
Proof.
Following the proof of Lemma 4, we iteratively merge the leaves on depth with their parent node for from to . At each step, we obtain a full binary tree such that the number of leaves with the highest depth is always even.
Theorem 6.
The number of full binary trees having a full-binary distribution with and is
Proof.
We show the claim by induction on . The base case is , for which we have seen that is uniquely defined. For the induction hypothesis, let us assume that the claim holds for .
Consider a full-binary distribution with and . Then is even for all according to Corollary 5. In particular, the vector defined by for and is an integer vector with . With the induction hypothesis for we obtain
Each of these trees has leaves with depth . Fix one of them, which we modify to obtain a tree whose full-binary distribution is . For that, we select of those leaves, for which we have possibilities. By modifying each of the trees whose full-binary distribution is in all possible ways, we obtain
By construction, all modifications are distinct by using only expansions. If two expansions are equal, they must have originated from the same tree we expanded.
Example 7.
For , there are two full-binary distributions and . For , , and there is only the perfect full binary tree having as full-binary distribution. For , , there are two full binary trees, as shown in Figure 1.
5 Enumeration of Full Binary Trees
In what follows, we present an enumeration algorithm for Problem 1 in the case of full binary trees. Our main idea is to process the trees in a linear order, which we define by a binary encoding of the trees.
For that, we recall the result of Corollary 5 that gives the number of nodes with depth , which is common to all full binary trees having the full-binary distribution . In short, we define the vector with for . Then is the number of all nodes. In particular, we can use an -length bit vector to specify which nodes on depth are internal nodes or leaves. The list of bit vectors on all depths is sufficient to represent a full binary tree. Then . For , this means that we can represent each full binary tree whose full-binary distribution is in bits. We use this bit representation to enumerate these trees. For that, we build on the algorithm of [14].
Lemma 8 ([14]).
There is an algorithm that enumerates all subsets with integers of the integer range in ascending lexicographic order with constant delay.
The connection to our problem is that a bit vector of length with “1”s exactly represents such a subset. Since we can impose the natural lexicographic order on the -length bit vectors, the missing step is to find, given a -length bit vector representing a full binary tree, the lexicographic succeeding bit vector that represents a tree we want to output.
To facilitate our starting, we assume that for every . Suppose that we have a full-binary tree whose distribution is . We deduce from a bit vector representing . For explanation, we assume that the least significant bits are on the left. We start with a bit vector of length to represent the order of the internal nodes and leaves on each depth. To do this, we partition by depths such that is a bit vector of length with if and only if the -th node with depth is a leaf. In particular, has zeros. As an example, the values of for the trees depicted in Figure 1 are 001111 (left), 1011 (middle), and 0111 (right). (Recall that does not encode the root node, for which we assume it is always an internal node.) As a practical optimization, we can omit the highest depth since it always contains leaves, i.e., contains only zeros. Since we can reconstruct from uniquely, there is a one-to-one mapping from full binary trees with at least two nodes and a subset of bit strings defined by the above tree encoding. Consequently, each full binary tree has exactly one specific bit vector representation.
In what follows, we represent each full binary tree having the distribution with the bit vector , which is stored as a node in a so-called enumeration tree . The root of represents the full binary tree obtained by grouping all internal nodes to the left side, which we expressed by the lexicographic least bit vector, obtained by shifting all “1”s in every of to the left. An -node is augmented by a bit vector representing a full binary tree and an integer, which we call the change level. The change level of an -node states the depth at which the bit vector of differs from its parent; as an exception, we stipulate that the change level of the root node is 0. The change level is an invariant we impose on in the sense that, when traversing from a node to of its children , the tree represented by and by can and must differ only at the change level of . A consequence is that -nodes with change level are leaves.
We organize the children of an -node as follows. Fix one -node with bit vector and change level . Then the children of correspond to all possible rearrangements of the bit patterns in one of the vectors . We start with the first children of that enumerate all possible bit patterns of (resulting by permutations). So they have change level with a bit vector that differs from ’s bit vector only at depth . (In particular, these nodes are leaves by the definition of the change level.) Next, we continue with the group of children for and change level , and so on, up until and change level . In particular, we arrange each group of children with the same change level in lexicographic order of their corresponding bit vector . This order of the children and the order of their groups allows us to enumerate all full binary trees by a pre-order traversal of in lexicographic order. (In particular, the lexicographic order imposes the invariant that we never enumerate a full binary tree twice.) This structure of warrants the following – see Figure 2 for an example and Figure 3 for a more elaborated example involving a non-binary choice at one level:
-
Any subtree rooted at a node of is an enumeration tree itself. That is because an -node with change level representing a tree is the root node of an enumeration tree that considers only the suffix of for the enumeration. In detail, for an -node with change level it holds that for every by the construction – recall that a node with change level only makes a change on compared to its parent node, and reading the change levels upwards to the root gives a monotone decreasing sequence of change levels (so there is no way that any deeper level has changed with respect to the initial bit vector of the root.) In particular, an -node is the lexicographically least among all its descendants.
-
Since deeper recursion levels only touch shorter suffixes of the bit vectors, the rightmost leaf of an -node is lexicographically smaller than ’s right sibling (if it exists). Therefore, all -nodes represent different vectors, and the sum of all -nodes is the size of the output we want to compute.
If we now run the algorithm of Lemma 8 on nested loops we can traverse the constructed enumeration tree in a pre-order traversal and output each node upon visiting it. The maximum delay happens when moving to the next value in the lowest loops. This delay is , but amortized because to reach a depth of we have already output nodes.
It is left to argue that we can transform the sequence of outputted -nodes into the full binary trees we want to output. For that, we initially build the full binary tree represented by the root of from scratch, taking time. Subsequently, for each -node visit, it suffices to change the order of the nodes of the previously computed full binary tree at a specific depth, costing time.
Lemma 9.
Given a full-binary distribution with for all , we can enumerate all full binary trees whose full-binary distribution is in lexicographic order with constant amortized delay or worst-case delay. The precomputation takes time.
We can generalize this approach to full-binary distributions having zero values. The obstacle is that we cannot guarantee constant delay since we might traverse long paths with zero weights. As a remedy, we want to skip each depth with . To do that, let us denote by all depths with leaves. Then it suffices to restrict the values of a change level of a node to values of . By construction, the root node has change level , and a node with change level has children whose augmented value is the successor of in .
Lemma 10.
Given a full-binary distribution with , we can enumerate all full binary trees whose full-binary distribution is in lexicographic order with constant amortized delay or worst-case delay. The precomputation takes time.
To obtain worst-case constant delay time, we can drop the requirement to output in lexicographic order by using a folklore technique, such as the alternating output technique [15]. The idea is to output nodes on odd depths when visiting them the first time, but postpone the output of nodes on even depths until we backtrack and move outside their respective subtrees.
Theorem 11.
Given a full-binary distribution , the corresponding full binary tree can be output in time for the first tree and then output all trees in the form of differences with a size of with a delay of in time, where and .
6 Enumeration of Binary Trees
In what follows, we show how to generalize the enumeration of full binary trees to general binary trees, where internal nodes are allowed to have one or two children.
From Lemmas 3 and 4, the binary distribution of a non-full binary tree also satisfies . We make the connection to full binary trees by adding dummy leaves to all internal nodes that have only one child. To this end, we denote the number of dummy leaves added at depth by , and construct a vector . The sum gives a vector that is the binary distribution of a full binary tree because . In what follows, we call the complement vector of , and the completion of . For such , a nonnegative integer vector that satisfies is called the subcomplement vector of .
Lemma 12.
For a complement vector of we have that .
Proof.
We can only attach a dummy leaf at depth to internal nodes that have exactly one child on depth . By the pigeonhole principle, there are at most many such children on depth .
Lemma 13.
If the binary distribution satisfies , then there exists a complement vector for .
However, the complement vector is not unique, in general. For example, is not a full-binary distribution since , while and are both complement vectors of . See Figure 4 for an example.
Also, even if satisfies , it does not necessarily mean that is the complement vector of . For example, has as a complement vector, but is not a complement vector: A tree having the binary distribution cannot have six dummy leaves on depth 4. However, a necessary condition is given below.
Lemma 14.
A nonnegative integer vector is a subcomplement vector of a complement vector of a binary distribution if and only if , , and .
Proof.
Let .
Direction .
In order for to be a subcomplement vector of , a binary tree with leaves, including dummy leaves, must exist. Therefore, it follows from Corollary 5 that is even, so . Also, since dummy leaves can only be attached to internal nodes that have exactly one child (in this case, a leaf), by Lemma 12. Furthermore, in order for to be a binary tree (with dummy leaves), must hold.
Direction .
We show that there exists a complement vector for such that .
For depth , must be either or . In the case of , represents a full binary tree, so the complement vector of is , and thus . In the case of , the only complement vector is , and hence is the unique subcomplement vector of satisfying . Thus, in both cases the subcomplement vectors are uniquely defined by the right-hand side conditions.
In what follows, we assume that . Given a binary distribution of a binary tree , by assumption and are even numbers. Hence, we can apply the contraction trick like in previous proofs. For that, consider the vector . The vector is well-defined: Since and , is a positive integer. Since
by Lemmas 3 and 13, is a binary distribution of a binary tree . The normal leaves (i.e., excluding dummy leaves) of depth in are in total, of which
-
of them are former leaves of unaffected by the contraction,
-
internal nodes of are changed into leaves on depth by contracting them with their two child leaves (where one may be a dummy leaf).
The resulting binary tree therefore has leaves on depth in total. The contraction of to by removing depth does not affect the number of leaves at depth or less in . By definition, there is a complement vector of that can turn into a full binary tree. Finally, we turn leaves of on depth into internal nodes, each having one child. This restores and induces the complement vector of .
For every binary tree, there is only one way to add dummy leaves to internal nodes to make it a full binary tree. Hence, the complement vector for each binary tree, whose completion is the binary distribution of the full binary tree, is uniquely determined. Conversely, given a full binary tree and a configuration classifying leaves in dummy leaves and non-dummy leaves, we can obtain the original binary tree by removing dummy leaves according to the configuration to obtain the original binary tree. Based on this observation, we first find the completion, i.e., the corresponding full binary tree, for , and then enumerate the binary trees by considering all ways of adding dummy leaves while enumerating the corresponding full binary completions. That is, by enumerating all possible complement vectors for , and then considering the process of enumerating the different ways of adding dummy leaves while considering the corresponding full binary completion, we can enumerate all binary trees. In summarizing the above, the enumeration algorithm is designed in a three-layer structure:
-
1.
enumeration of complement vectors (Section 6.1),
-
2.
enumeration of the way of adding dummy leaves to each complement vector (Section 6.2), and
-
3.
enumeration of binary trees for each way of adding dummy leaves (Section 6.3).
6.1 Enumeration of Complement Vectors
We enumerate the complement vectors with the insights of Lemmas 13 and 14. In detail, Lemma 14 gives us a strategy to enumerate the complement vectors by enumerating subcomplement vectors. We show the algorithmic steps in Algorithm 1, which is a recursive algorithm that determines the values of in descending order in a depth-first manner. We initially call EnumCompVector with the arguments , for the weight of the binary distribution , the -dimensional vector with entries, the depth and the number of conceptional leaves on depth . The last argument needs a definition: This algorithm recursively contracts leaves and dummy leaves at depth to the so-called conceptional leaves on depth . If we have dummy leaves and ordinary leaves on depth , then the number of conceptional leaves at depth is , and we consider from then on the binary distribution for a tree of depth . Now, in the function EnumCompVector, we enumerate all satisfying the conditions of Lemma 14 at Line 5. We proceed with each such at Line 6 with a subcomplement vector . At Line 6, we check if we have obtained a completion with at that point. If is a complement vector of , we output at Line 7 and return. If we can still increase a value in to add dummy leaves (Line 10), we recursively call the procedure for the preceding depth . By Lemma 14, if we recurse, we either continue recursing or turn a subcomplement vector into a complement vector. We can write each recursive call in the form of a tree, where each node is a call of EnumCompVector: the root is the initial call, and each leaf is a call that returns a complement vector. See Figure 5 for an example. Finally, since we can evaluate the checks in the pseudocode in time, the delay between the output of two complement vectors is time.
6.2 Enumeration of Dummy Leaves
Given the complement vector for , there can be multiple ways to add the dummy leaves specified by to a binary tree whose binary distribution is . Our task is to enumerate all these ways. For that, we follow the strategy of Section 5, where we enumerate the leaf/internal node configurations at depth for full binary trees. However, unlike normal leaves, the sibling of a dummy leaf must be a normal leaf or an internal node. To facilitate checking this condition, we enumerate the internal node one level above the dummy leaf itself, rather than at the configuration of the dummy leaf. We call such an internal node the dummy parent of a dummy leaf. For each dummy parent, it suffices to enumerate the two possibilities of having its dummy leaf as its left or right child.
First, as in Section 5, suppose that we have a full-binary distribution . From the proof of Corollary 5, the number of nodes at depth is .
Suppose is the smallest depth at which a dummy leaf appears for the first time. At depth , there are dummy leaves. By definition, there are dummy parents at depth . We specify these dummy parents by a bit vector of length with “1”s at positions.
In doing so, at depth , we have already determined the dummy parents of the dummy leaves at depth , regardless of whether they have a dummy leaf as their left or right child, so the number of nodes that we can freely choose at depth is . We again specify the dummy parents at depth and recurse. Therefore, we can express our selection of dummy parents by a -length bit vector with “1”s in positions. By recursing on deeper levels, we can apply this technique on all depths, generalizing from to any . In what follows, we want to express the configuration of dummy parents on each level by a bit vector, for which we want to bound the size of .
Lemma 15.
Each is at most .
Proof.
From Lemma 14, . So there is an such that the induction hypothesis holds for every . Then we obtain
where we used Lemma 12 for .
Therefore, we can specify dummy parents with an -length bit vector at each depth. We denote the sequence of bit vectors for specifying the dummy parent configuration by .
Furthermore, we express whether a dummy leaf at depth is the left or right child of its parent by a bit vector of length (for each configuration of the dummy parents at depth , there are ways for the dummy leaves at depth to be attached). In total, we can express the dummy leaf configuration by a -length bit vector. Since by the above analysis, this bit vector has length .
An enumeration of bit vectors are Gray codes [8], which have delay because two subsequently output bit vectors differ by one bit. Here we use Gray codes to enumerate the dummy leaf configuration for each dummy parent configuration with constant delay.
To conclude, we can enumerate the configuration of dummy leaves in the following manner:
-
Enumerate the configuration of the dummy parents with constant time delay, and output this configuration in .
-
Generate a Gray code of bits for each configuration of dummy parents, and enumerate the configuration of dummy leaves.
Figure 6 shows an example of a dummy parent enumeration tree and a Gray code for a binary distribution and its complement vector .
6.3 Enumeration of the Leaf Configuration
Based on the previous discussions, at this point, we have determined the configuration of dummy parents with dummy leaves and dummy parents of nodes at each depth of the binary tree. Here, we assign roles to the remaining nodes as leaves and internal nodes having no dummy leaves as children. As before, we can specify the roles in the form of a bit vector of length with “1”s. We call such a vector sequence a leaf configuration vector sequence and denote it by . Fixing one configuration of dummy leaves, we enumerate combinations in the same way as in Section 5 to obtain all binary trees having that configuration of dummy leaves. We proceed in this order here despite that once is determined, the number of dummy parents and leaves at each depth is fixed, so we can determine the leaf configuration vector independently of the dummy parent configuration. Figure 7 gives an example of a leaf configuration enumeration tree for a binary distribution and its complement vector .
6.4 Overview of the Algorithm
We combine the above to enumerate binary trees using enumeration trees and Gray codes. For the sake of explanation, we enumerate the configuration of dummy leaves not directly after enumerating the configuration of dummy parents – we can take care of the dummy leaves at any time after we have determined the dummy parents. This also allows us to postpone the computation of the configuration of dummy leaves to the end.
We visualize our pipeline that ends at the generation of Gray codes following pointers from three nested enumeration tree traversals in Figure 8. There, we start with a top-down traversal of the leftmost tree . Each leaf of determines a complement vector . We perform the actual traversal by executing EnumCompVector. Fixing , we move to the neighboring trees and to enumerate all dummy parent configurations and all leaf configurations and , respectively. First, we enter the enumeration tree . Each time we visit a new node in , we jump into and traverse this tree. For each node we visit, we have determined . Given , for each Gray code we enumerate, we finally output . From the discussion in Section 5, we can determine and output the pair with constant delay. Since we can also determine with constant delay, we can output for a fixed with constant delay. However, the delay in outputting itself is , but this time bound is dominated by the time bound to output the first for . This leads to the following complexities.
Theorem 16.
Given a binary distribution , the corresponding binary tree can be output in time for the first tree and then output all trees in the form of differences with a size of with a delay of .
While the complexities stated in Theorem 16 are in terms of , we create a full-binary tree whose binary distribution is for each completion vector . Nevertheless, for all , and therefore the number of nodes of this full-binary tree is asymptotically equal to the number of leaves of the tree we output.
Finally, an example of the binary tree corresponding our running example is given in Figure 9.
7 Extension to -ary trees
In this section, we extend the results of the previous section to general -ary trees for a constant . Like for binary trees (), we first define the notion of an -ary distribution.
7.1 Properties of full -ary trees and their enumeration
We call a vector of integers a (full) -ary distribution if there is a (full) -ary tree that has leaves on depth for every . For a nonnegative integer vector , we define the function , which is a generalization of with . The problem we want to tackle in this section can be stated as follows.
Problem 17.
Given a (full) -ary distribution , enumerate all (full) -ary trees whose -ary distribution is .
Like in the case of binary trees, the following holds.
Lemma 18 (Kraft’s inequality).
Let . A nonnegative integer vector that satisfies is a -ary distribution if and only if .
Corollary 19.
Let be a nonnegative integer vector satisfying and is a full -ary distribution. Then, is a multiple of , and represents the number of nodes at depth of the -ary tree having as its -ary distribution.
Theorem 20.
Let . Given a full -ary distribution satisfying , the number of all full -ary trees that have as their -ary distribution is
From these properties, the enumeration of full binary trees can be extended to the enumeration of full -ary trees. For that, we observe that the approach for enumerating full binary trees analyzed in Section 5 is independent of the degree of the tree, since the representation only considers the configuration of internal nodes and leaves per depth. Therefore, we can generalize this approach in a straightforward manner by replacing and with and , respectively, and updating the values of . We obtain the following result based on an extension of Theorem 11:
Theorem 21.
Given a full -ary distribution , the corresponding full -ary tree can be output in time for the first tree and then output all trees in the form of differences with a size of with a delay of in time, where and .
7.2 Enumeration of -ary trees
We can carry out the enumeration of -ary trees in the same way as binary trees, for which we had three levels of enumeration: for the complement vectors, the dummy leaves, and the actual leaves. Here, we proceed in the same way.
7.2.1 Enumeration of full -ary tree complement vectors
Suppose we have an -ary distribution with . We fix one -ary tree corresponding to and add dummy leaves to all internal nodes with fewer than children to make it a full -ary tree. In this case, let the number of dummy leaves added at depth be , and consider the vector . By construction, we obtain that . We call the vector constructed in this way the complement vector of , and we call the full -ary tree complement of . For such , a nonnegative integer vector is called a subcomplement vector of if . Similarly to binary trees in Lemma 13, the following holds.
Lemma 22.
Given an -ary distribution with , there exists a complement vector of .
Generalizing Lemma 14, the following is true.
Lemma 23.
A nonnegative integer vector is a subcomplement vector of the complement vector of an -ary distribution if and only if , , and .
Using these lemmas, we can generalize Algorithm 1 to Algorithm 2 to enumerate the complement vector of an -ary distribution.
7.2.2 Enumeration of the configuration of dummy leaves
Suppose that we have determined the complement vector of . The next step is to determine the configuration of dummy leaves, which is (like for binary trees) not unique, in general. However, in the case of -ary trees, among all siblings of a dummy leaf there must be at least one normal node, i.e., an internal node or a leaf that is not dummy. Therefore, we cannot infer from a dummy parent its number of dummy leaves, and thus specifying the configuration of dummy nodes with a Gray code like for binary trees seems not possible. Instead, we enumerate the number of dummy leaves that each dummy parent has from 0 to , and then enumerate the configuration of dummy leaves corresponding to that number. This strategy is slightly more complicated than in the case of binary trees because of this additional intermediate step, but the calculation time is asymptotically absorbed by the time spent for determining one complement vector. Having computed the complement vector, the following steps are analogous to the case of binary trees, for which we obtain constant time delay per output.
We enumerate all possible ways to place the dummy leaves at each depth by choosing a combination of the configuration of the dummy leaves at depth . For that, we first specify how many dummy leaves each parent has at depth . According to Corollary 19, in the complemented tree corresponding to the binary distribution with the added dummy leaves, there are nodes at depth (i.e., there are parents at depth ). We assign each parent a rank from in order from left to right, and we say that is the number of dummy leaves that the -th parent has. Since each parent must contain at least one normal node, , and the number of dummy leaves at depth is , so satisfies
| (1) |
We use a known result for enumerating all distinct integer tuples with satisfying Equation 1 with constant delay [16]. For each such integer tuple, we enumerate the configuration of dummy leaves corresponding to each . After that, all -ary trees can be enumerated with constant time delay as in the case of binary trees.
8 Conclusion
We have addressed the problem to efficiently enumerate trees whose numbers of leaves on each depth are given by a query vector. Specifically, for any leaf distribution of (full) binary or (full) -ary trees, we obtain the following time complexities. First, we can output the first tree in time. We can output every subsequent tree, after constant delay, in time by encoding this tree as a difference to a previous one using words. Here, is the total number of nodes and is the maximal number of nodes at any depth. These results hold for both full binary trees and full -ary trees with , and extend to general binary trees with analogous enumeration guarantees.
As future work, we want to determine the exact number of full binary trees and full -ary trees with distribution by a closed-form product formulas involving binomial coefficients. We further strive to give practical implementations of our proposed enumeration algorithms.
References
- [1] Masahiro Baba, Hirotaka Ono, Kunihiko Sadakane, and Masafumi Yamashita. A succinct representation of a full binary tree. Technical Report 0, Record of Joint Conference of Electrical and Electronics Engineers in Kyushu, 2009. doi:10.11527/jceeek.2009.0.59.0.
- [2] David Benoit, Erik D. Demaine, J. Ian Munro, Rajeev Raman, Venkatesh Raman, and S. Srinivasa Rao. Representing trees of higher degree. Algorithmica, 43(4):275–292, 2005. doi:10.1007/s00453-004-1146-6.
- [3] Terry Beyer and Sandra Mitchell Hedetniemi. Constant time generation of rooted trees. SIAM J. Comput., 9(4):706–712, 1980. doi:10.1137/0209055.
- [4] Michael Buro. On the maximum length of Huffman codes. Inf. Process. Lett., 45(5):219–223, 1993. doi:10.1016/0020-0190(93)90207-P.
- [5] Jeremy Chizewer, Stephen Melczer, J. Ian Munro, and Ava Pun. Enumeration and succinct encoding of AVL trees. In Proc. AofA, volume 302 of LIPIcs, pages 2:1–2:12, 2024. doi:10.4230/LIPICS.AOFA.2024.2.
- [6] Nachum Dershowitz and Shmuel Zaks. Enumerations of ordered trees. Discret. Math., 31(1):9–28, 1980. doi:10.1016/0012-365X(80)90168-5.
- [7] Sen-Peng Eu, Seunghyun Seo, and Heesung Shin. Enumerations of vertices among all rooted ordered trees with levels and degrees. Discret. Math., 340(9):2123–2129, 2017. doi:10.1016/J.DISC.2017.04.007.
- [8] Frank Gray. Pulse code communication. US Patent 2632058, 1953.
- [9] Roberto Grossi. Enumeration of paths, cycles, and spanning trees. In Encyclopedia of Algorithms, pages 640–645. Springer, 2016. doi:10.1007/978-1-4939-2864-4_728.
- [10] Guy Jacobson. Space-efficient static trees and graphs. In Proc. FOCS, pages 549–554, 1989. doi:10.1109/SFCS.1989.63533.
- [11] Leon Gordon Kraft. A device for quantizing, grouping, and coding amplitude-modulated pulses. Master’s thesis, Massachusetts Institute of Technology, 1949.
- [12] Shin-Ichi Nakano and Takeaki Uno. Constant time generation of trees with specified diameter. In Proc. WG, volume 3353 of LNCS, pages 33–45, 2004. doi:10.1007/978-3-540-30559-0_3.
- [13] Frank Ruskey and Andrzej Proskurowski. Generating binary trees by transpositions. J. Algorithms, 11(1):68–84, 1990. doi:10.1016/0196-6774(90)90030-I.
- [14] Ivan Stojmenovic. A simple systolic algorithm for generating combinations in lexicographic order. Computers & Mathematics with Applications, 24(4):61–64, 1992.
- [15] Takeaki Uno. Two general methods to reduce delay and change of enumeration algorithms. Technical Report NII-2003-004E, NII Technical Report, 2003. doi:10.20736/0000000385.
- [16] Timothy R. Walsh. Loop-free sequencing of bounded integer compositions. Journal of Combinatorial Mathematics and Combinatorial Computing, 33:323–345, 2000.
- [17] Katsuhisa Yamanaka, Yota Otachi, and Shin-Ichi Nakano. Efficient enumeration of ordered trees with leaves. Theor. Comput. Sci., 442:22–27, 2012. doi:10.1016/j.tcs.2011.01.017.
- [18] Shmuel Zaks. Lexicographic generation of ordered trees. Theor. Comput. Sci., 10:63–82, 1980. doi:10.1016/0304-3975(80)90073-0.
Appendix
| symbol | meaning |
|---|---|
| distribution | |
| maximal depth and dimension of | |
| number of leaves | |
| maximum value in | |
| complement vector | |
| subcomplement vector | |
| weight function | |
| for |
