Space-Efficient Depth-First Search via Augmented Succinct Graph Encodings
Abstract
We call a graph G separable if a balanced separator of size O(n^c) with c < 1 can be computed for G. Many real-world graphs are separable, such as graphs of bounded genus, graphs of constant treewidth, and graphs excluding a fixed minor. In particular, the well-known planar graphs are separable. We present a succinct encoding of separable graphs G such that, after the encoding is computed, any number of depth-first searches (DFS) can be performed from any given start vertex, each in o(n) time and o(n) bits in the word RAM model. After the execution of a DFS, the succinct encoding of G is augmented such that the DFS tree is encoded inside the encoding while maintaining succinctness. Afterward, the encoding provides common DFS-related queries in constant time. These include queries that output the lowest common ancestor of two given vertices in the DFS tree or the lowpoint of a given vertex in the DFS tree. Furthermore, for planar graphs, we show that the succinct encoding can be computed in O(n) bits and expected linear time, and a compact variant can be constructed in O(n) time and O(n) bits. For other separable graph classes, the runtime and space usage depend on the specific algorithms used to find balanced separators in graphs of the respective class.
Keywords and phrases:
Depth-First Search, Succinct, Space Efficient, Separable Graphs, Planar Graphs, Table Lookup, r-Division
2012 ACM Subject Classification:
Theory of computation → Data structures design and analysis; Theory of computation → Graph algorithms analysis
Editors:
Ho-Lin Chen, Wing-Kai Hon, and Meng-Tsung Tsai
Series and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Depth-first search (DFS) in graphs forms the backbone of algorithms for a number of applications like finding vertex or edge cuts. A depth-first search implicitly computes a tree (or forest) over the vertices of the graph, called a DFS tree (or forest), which has crucial structural properties that are commonly exploited by applications. A DFS that runs in time linear in the number of vertices and edges utilizes two folklore data structures: a stack to keep track of the current path into the graph and an adjacency list to efficiently iterate over the neighbors of a given vertex. Standard implementations of this approach use O(n log n) bits of space for the stack and O(log n) bits for each vertex identifier, where n refers to the number of vertices of a given graph. Lowering the space requirements for depth-first search to O(n) bits while still maintaining a (nearly) linear runtime was the aim of a series of works during the last years: one of the first algorithms is due to Asano et al. [3], who reduced the space to O(n) bits, but increased the runtime to O(m log n), where m refers to the number of edges of a given graph. After several improvements [4, 7, 9] to this result, Hagerup [13, Theorem 5.4] presented the current state-of-the-art algorithm, which uses O(n) bits with a near-linear runtime.
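As a concrete illustration of the folklore approach described above, the following Python sketch runs a DFS with an explicit stack of (vertex, neighbor-iterator) pairs over adjacency lists; the adjacency-dict input format is our own assumption for illustration, not part of the encodings discussed in this paper.

```python
def dfs_preorder(adj, start):
    """Return the vertices in DFS pre-order using an explicit stack.

    adj maps each vertex to its adjacency list; the stack keeps the
    current path into the graph, one paused neighbor-iterator per level.
    """
    visited = {start}
    order = [start]
    stack = [(start, iter(adj[start]))]
    while stack:
        v, it = stack[-1]
        for w in it:
            if w not in visited:
                visited.add(w)
                order.append(w)
                stack.append((w, iter(adj[w])))
                break
        else:
            stack.pop()  # all neighbors of v explored: backtrack
    return order
```

Each vertex identifier on the stack costs Θ(log n) bits, which is exactly the per-vertex overhead that the works cited above aim to remove.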
In many circumstances, we do not need to be able to handle input graphs of any kind. Instead, the input graphs have a special structure that we can utilize to devise algorithms that are more efficient than the ones handling the general case. Typical examples are planar graphs, which can be drawn in the plane without edge crossings, or generalizations of them like graphs of bounded genus or graphs excluding a fixed minor. What these graphs have in common is that they are sparse, meaning m = O(n). As there exists a DFS that uses O(n + m) time and O(n + m) bits on general graphs, as stated by Hagerup [13, Theorem 4.1] and Banerjee et al. [4, Lemma 2], we can execute a DFS in O(n) time and O(n) bits on sparse graphs. If we shift our attention to o(n)-bit DFS algorithms, we can observe some strong indications from complexity theory that no sublinear-space polynomial-time algorithm can exist for a special variant of DFS, called lexicographical DFS [3, 9]. A concrete sublinear-space algorithm for this problem restricted to planar graphs has an impractically large polynomial runtime [18], due to its reliance on the logspace reachability result of Reingold [27].
As mentioned above, computing a DFS is often just the beginning or part of an algorithm solving more involved graph problems like computing biconnected and strongly connected components. Typically, these applications of DFS store certain meta information that is computed while traversing the graph. Examples are pre-order numberings that number the vertices in the order of their exploration [28], and lowpoints that help to identify vertex and edge cuts [17]. If we want to extend existing approaches for computing DFS that are efficient in both space and time to this more general setting, we cannot easily store the meta information of a vertex using any standard encoding. Again, this would use O(log n) bits for each vertex and, hence, create a total memory footprint of O(n log n) bits. While there exist space-efficient techniques for, e.g., computing pre-order numbers or lowpoints [7, 13, 20], they require significantly more resources than the space-efficient DFS variants; e.g., the algorithm of Chakraborty et al. [7] for computing lowpoints uses more time and space than the underlying DFS.
In this paper we present both (1) a data structure that is able to keep meta information we compute for the vertices of a graph space-efficiently and (2) an approach for computing DFS traversals space-efficiently, as well as extensions to some common applications. Our results are applicable to classes of separable graphs, which have balanced separators of size O(n^c) for some constant c < 1. This covers, in particular, planar graphs, graphs of bounded genus, and graphs excluding a fixed minor.
Data Structure: Nested Divisions and Augmentations.
An encoding of an element from a universe U is called succinct (compact) if it uses H + o(H) (O(H)) bits, where H = log |U| is the information-theoretic lower bound. In this paper we present a succinct encoding for separable graphs (precisely defined in Section 2.1), such as planar graphs, that provides the following functionalities. After the encoding is initialized, a DFS can be executed (from any given start vertex) in o(n) time and o(n) bits, and afterward information about the executed DFS can be queried in constant time. If needed, a new DFS can be computed at any given time. Such information includes the pre-order number and lowpoint of a vertex, and the lowest common ancestor of two vertices. Furthermore, for planar graphs, we show that the succinct encoding can be computed in expected linear time with O(n) bits used during the construction step, and a compact variant can be constructed in O(n) time and O(n) bits.
The data structure we present, called a succinct nested division, extends a succinct encoding of separable graphs by Blelloch and Farzan [6]. The encoding of Blelloch and Farzan is built on dividing the input graph into smaller “mini pieces”. Mini pieces are, in turn, divided into even smaller “micro pieces” that are small enough that relevant structural information about them can be pre-computed and kept in a lookup table. The key property of this approach is that each piece contains only a few “boundary vertices”, which are vertices contained in multiple pieces. This suggests algorithms that, like the data structure itself, switch between these three levels. In fact, different algorithmic approaches are needed at different levels to maintain our target space and time bounds.
We augment the encoding of Blelloch and Farzan with additional data structures and show that this can be used to design more complex queries than the standard graph-access queries provided by them. Intuitively, the boundary vertices act like relays, since any interaction between two non-boundary vertices in different pieces must necessarily pass through them. Consequently, long-range dependencies (e.g., long paths) are efficiently mediated by the sparse set of boundary vertices. Hence, pieces can often be considered in isolation with few interactions between them, and all interactions between vertices of different pieces are communicated via boundary vertices. To capture this idea we intuitively call a property P strongly local if, for all non-boundary vertices v, P(v) can be evaluated using only information stored with the boundary vertices together with a small amount of information that is directly encoded in the piece containing v; typically, for us, there exists a boundary vertex u of the piece containing v such that P(v) can be derived from the information stored with u.
To circumvent the trivial lower bound of Ω(n log n) on the number of bits required for encoding members of a family of labeled graphs with n vertices, we work with so-called unlabeled graphs. This means that we are able to choose our own vertex labels for the encoding so that we can construct a succinct encoding for separable graphs that uses O(n) bits [6]. Such a setting is very common and, e.g., used by Blelloch and Farzan’s original encoding [6] as well as other works [1, 2, 8, 11, 19, 22, 25].
Algorithms: Depth-First Search and Applications.
Our succinct encoding for separable graphs is constructed such that a DFS can be performed directly on the encoding from any given start vertex. Afterward, various queries regarding the DFS traversal are available. In particular, we directly provide the queries that, e.g., Hopcroft and Tarjan’s biconnected-component algorithm [17] or Schmidt’s algorithm for chain decompositions [28] require. We effectively provide an interface that allows us to implement typical standard algorithms without the need to design specialized techniques. The following theorem summarizes our main results. Note that the runtime and number of bits are sublinear once the encoding is computed. The operations presented in it are only a set of example queries that fit the previously outlined intuition of strongly local, chosen such that the previously mentioned standard algorithms can be executed directly. Without loss of generality, we assume that all graphs we work with are connected. If a given graph is not connected, one can apply all our results to each connected component separately.
Theorem 1.
Let G be a separable graph. There exists a succinct encoding of G that provides neighborhood iteration, adjacency and degree queries in constant time per element output. Additionally, it provides an operation that executes a DFS in o(n) time and o(n) bits from a given start vertex such that at all times the encoding remains succinct. After the execution of a DFS the following queries are available in constant time for the last computed DFS tree.
- children(v): iteration over all children of v.
- parent(v): output the parent of v.
- preorder(v)/postorder(v): return the pre-/post-order number of vertex v.
- descendants(v): return the number of descendants of v.
- depth(v): return the depth of v.
- lowpoint(v): return the lowpoint of v.
- lca(u, v): return the lowest common ancestor of vertices u and v.
Replacing succinct with compact in Theorem 1, and additionally restricting G to be a planar graph, the encoding can be constructed with O(n) bits and in O(n) time. The change from succinct to compact is due to our wish to avoid the use of a certain succinct dictionary data structure of Raman et al. [26] relying on hash functions that can only be computed in expected linear time. When we replace the dictionary with a simpler variant of Baumann and Hagerup [5], we obtain a compact encoding that can be computed much more easily. When we are fine with expected linear time, we can use the dictionary of Raman et al., and thus can construct the succinct encoding in O(n) bits and expected O(n) time. Subsection 3.3 presents the details regarding the construction steps. For other graph classes the construction time and space depend on the runtime of the respective separator algorithms used as subroutines to divide the input graph.
Corollary 2.
Let G be a planar graph. There exists a compact (succinct) encoding of G that can be constructed in O(n) time (expected O(n) time) and O(n) bits and that provides the same functionalities as the encoding of Theorem 1.
Related Work on Graph Divisions.
Recursive divisions have been used as the basis for algorithms and data structures for decades, commonly based on the so-called r-division for planar graphs, defined precisely in the next section. These divisions build on a recursive application of the linear-time separator algorithm of Lipton and Tarjan [24], and the improvement of Goodrich [12], who showed that the entire recursive application can be executed in linear time – a standard approach would use O(n log n) time. Recursive divisions have been successfully applied to algorithmic problems such as maximum flow, minimum cut, and all-pairs shortest paths [10]. Applications in the field of data structures include decremental data structures for connectivity [15, 16], where decremental refers to modifications of a graph that remove vertices or edges. None of the mentioned results have a focus on space-efficiency, and internally they use other techniques compared to our approach.
Structure of the Paper.
2 Nested Divisions
The present section contains an overview of what we view as a nested division, beginning with the graph-theoretic properties it provides us with (Subsection 2.1), and continuing with how Blelloch and Farzan used nested divisions as a data structure to encode separable graphs (Subsection 2.2). Finally, we extend the section with our ideas for abstract augmentations of nested divisions that allow complex queries (Subsection 2.3). We use the notation [k] for any natural number k to refer to the set {1, ..., k}.
2.1 Graph-Theoretic Properties of Nested Divisions
We describe a slightly generalized variant of the well-known concept of r-divisions sketched in the introduction. We start with a small set of definitions. A balanced separator of a graph G is a set of vertices S such that each connected component of G − S contains at most a constant fraction of the vertices of G. The exact constant does not matter to us. We call a class of graphs that is closed under taking vertex-induced subgraphs c-separable if there exists a constant c with 0 ≤ c < 1 such that each graph of the class has a balanced separator of size O(n^c). We simply say that a graph or class of graphs is separable if the size of the balanced separator is of no particular concern to us.
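To make the definition concrete, here is a small Python sketch that checks whether a vertex set S is a balanced separator of a graph given as an adjacency dict; the balance constant 2/3 is an illustrative choice, since the exact constant does not matter here.

```python
def is_balanced_separator(adj, S, alpha=2/3):
    """Check that every connected component of G - S has <= alpha*n vertices."""
    n = len(adj)
    seen = set(S)
    for s in adj:
        if s in seen:
            continue
        size, queue = 0, [s]  # flood-fill one component of G - S
        seen.add(s)
        while queue:
            v = queue.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        if size > alpha * n:
            return False
    return True
```

For a path on five vertices, the middle vertex alone is a balanced separator, while the empty set is not.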
Definition 3 (Relaxed Division).
Let G be a graph belonging to a c-separable class of graphs for some c < 1. An (α, r)-relaxed division is a decomposition of G into a collection of subgraphs, called pieces, satisfying:
1. Each piece is a subgraph of G with at most r vertices, and there are O(αn/r) pieces in total.
2. Every edge of G is assigned to exactly one piece (containing both endpoints of the edge).
3. Each piece has O(r^c) boundary vertices, which are vertices shared between different pieces.
We call α the relaxation and r the piece size. Edges between two boundary vertices are called boundary edges, edges between two non-boundary vertices non-boundary edges, and all other edges transitional edges.
See Figure 1 for a sketch of such a division. The reason we introduce the relaxation α is our desire to achieve space-efficiency when algorithmically constructing a division. There exists an algorithm that computes a relaxed division via recursive separator searches using only O(n) bits and, for planar graphs, O(n) time [21]. If the construction step of our encoding need not be space-efficient, one can of course use a standard non-space-efficient algorithm [12, 23] and thereby obtain a smaller relaxation factor; for our application a larger relaxation factor allows for a more (space-)efficient computation of our final encoding, but when using the encoding it makes no difference. Common applications of divisions use them in a recursive fashion, and so do we. In particular, we construct for each piece P of some (α1, r1)-relaxed division a (α2, r2)-relaxed division with r2 < r1. For this, we define a nested division for a separable graph as follows.
Definition 4 (Nested Division).
Let G be a graph belonging to a c-separable class of graphs for some c < 1. An (α1, r1, α2, r2)-nested division of G is an (α1, r1)-relaxed division D of G into pieces such that for each piece P of D we have an (α2, r2)-relaxed division D_P of P. We call the pieces of D mini pieces and the pieces of some D_P micro pieces.
If we have a nested division for a graph G, each vertex can be viewed as being part of three levels:
- graph: Each vertex is part of G.
- mini: Each vertex of G is assigned to one or more mini pieces.
- micro: Each vertex of a mini piece is assigned to one or more micro pieces.
Given a nested division for a graph G we can categorize all vertices as follows. We call a vertex mini (micro) boundary if it is a boundary vertex in a mini (micro) piece, and mini (micro) non-boundary if it is not a boundary vertex in any mini (micro) piece. Edges are categorized similarly: an edge is called a mini (micro) boundary edge if both endpoints are mini (micro) boundary vertices, a mini (micro) non-boundary edge if both endpoints are mini (micro) non-boundary vertices, and a mini (micro) transitional edge otherwise. We routinely drop the specifier mini or micro if we make statements that apply to both levels. We are interested in bounding the total number of occurrences of mini (micro) boundary vertices among all mini (micro) pieces, referred to as mini (micro) duplicates. Let D be an (α, r)-relaxed division constructed for a c-separable graph G. As each piece has O(r^c) boundary vertices, the total number of duplicates is O(αn/r^(1−c)). Thus, for a nested division we have O(α1 n/r1^(1−c)) (duplicates of) mini boundary vertices, and O(α1 α2 n/r2^(1−c)) (duplicates of) micro boundary vertices. We refer for a more detailed description of the reasoning behind these bounds to Blelloch and Farzan [6]. We summarize it in the following lemma.
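The categorization above can be sketched in Python for a single level of a division; the input format (pieces as vertex sets, each edge assigned to one piece) is our own illustration.

```python
from collections import Counter

def categorize(pieces, edges_of_piece):
    """Classify vertices as boundary/non-boundary and edges by their endpoints."""
    count = Counter(v for piece in pieces for v in piece)
    boundary = {v for v, c in count.items() if c > 1}  # shared between pieces
    kind = {}
    for edges in edges_of_piece:
        for (u, v) in edges:
            b = (u in boundary) + (v in boundary)
            kind[(u, v)] = ("non-boundary", "transitional", "boundary")[b]
    return boundary, kind
```

Counting each vertex once per piece that contains it also yields the number of duplicates directly.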
Lemma 5 ([6]).
Let D be an (α1, r1, α2, r2)-nested division constructed for a c-separable graph G. Then there are O(α1 n/r1^(1−c)) (duplicates of) mini boundary vertices and O(α1 α2 n/r2^(1−c)) (duplicates of) micro boundary vertices.
For all of our use cases of nested divisions we have α1, α2 = O(1), a micro piece size of r2 = Θ(log n), and a polylogarithmic mini piece size r1, with the exact polylogarithm chosen such that we have o(n/log n) (duplicates of) mini boundary vertices and o(n/log log n) (duplicates of) micro boundary vertices. For the remainder of our paper, when we refer to a nested division we refer to an (α1, r1, α2, r2)-nested division with these parameters.
2.2 Nested Divisions as Data Structures
As we now describe concrete data structures, we fix our model of computation to the word RAM model with word size Θ(log n). For the rest of the section, let G be a separable graph and assume that a nested division is given for G. We begin by outlining the most basic functionality we require from a nested division when used as a data structure to encode a graph. Blelloch and Farzan presented a succinct encoding for unlabeled separable graphs that effectively uses a nested division at its core, and the framework we present in this subsection is based on their work [6], defined slightly more generally to allow for our later augmentations. While they do not use a relaxation factor for the initial relaxed division of the nested division (they use their own variation of a standard recursively constructed r-division), their framework works with relaxed divisions without any modifications. Their general idea is to succinctly encode the nested division, and to encode each piece at the micro level as an index into a lookup table. Boundary vertices are encoded via additional data structures that use standard non-space-efficient techniques, e.g., arrays and lists. Combined with a succinct bidirectional mapping that maps a vertex to its different occurrences at each level (graph, mini and micro), they realize neighborhood, adjacency and degree queries.
To describe the functionality of these bidirectional mappings, we introduce a labeling of pieces and vertices. Each mini piece P_i is uniquely identified by an id i. Analogously, the relaxed division constructed for P_i is identified as D_i. Each micro piece is identified by a tuple (i, j), indicating that it is the piece with id j in D_i. Each vertex has a graph label assigned from [n], a mini label assigned from [r1] in each mini piece it is contained in, and a micro label assigned from [r2] in each micro piece it is contained in. When we talk about a vertex v we always assume that v is identified by its graph label. The mappings that are provided are as follows: given a vertex v, output the tuples (i, v') such that v' is the mini label of v in the piece P_i. Analogously, for a given tuple (i, v') where i refers to a mini piece and v' is the mini label of some vertex of P_i, output the tuples (i, j, v'') where v'' is the micro label of the vertex in the micro piece (i, j). Note that each of these queries can output more than one element if the vertex is a boundary vertex at the respective level. For less verbose writing we implicitly assume that we always have access to all mappings outlined in this paragraph and only make distinctions between the different types of labels if necessary. We refer to the set of all outlined mappings as translation mappings.
Lemma 6 (Succinct Nested Division [6]).
Let G be a graph of a separable class of graphs. There exists a succinct data structure of G that represents a nested division of G such that level mapping queries and level graph queries are provided in constant time.
2.3 Augmenting Nested Divisions
In the following we want to augment the nested division of Lemma 6 with various capabilities. For this we analyze the runtime and memory budget we have when our goal is to run algorithms in o(n) time and with o(n) bits, followed by outlining the necessary techniques. As stated, we construct our nested divisions such that there are o(n/log n) mini boundary vertices and o(n/log log n) micro boundary vertices. This means that we can afford a memory budget of O(log n) bits for each mini boundary vertex (and, similarly, for each mini piece), and a budget of O(log log n) bits for each micro boundary vertex (and each micro piece), while staying within o(n) bits in total. These budgets suffice for our applications, as we require only O(log n) bits per mini boundary vertex (and mini piece), and O(log log n) bits per micro boundary vertex (and micro piece). We have the same budget for the runtime as we do for memory, but our applications only require (amortized) constant time per boundary vertex.
For micro pieces we use a table-lookup technique to obtain constant-time solutions to subproblems. Micro pieces are so tiny that any query we require can be described by a constant number of words in the word RAM model. We assume that we have a lookup table available that lists all graphs of the separable graph class we work with that have at most r2 vertices. The table lists all graphs modulo isomorphism, i.e., no two entries are isomorphic. This is not enough for us, as we require additional data for the boundary vertices. Instead, we require the graphs of the lookup table to be partially augmented, meaning that for at most O(r2^c) vertices of each graph of the lookup table we store a bit string of some fixed length (called a partial vertex augmentation), and an additional bit string (called a partial graph augmentation).
Using a lookup table that lists all partially augmented graphs with at most r2 vertices, we can realize what we refer to as swapping indices. Each micro piece is encoded as an index into the lookup table referencing some graph together with its partial augmentations. When we want to update values stored for (the micro boundary vertices of) the micro piece, we concretely do this by changing the index we store. Let P be some micro piece encoded as an index i into the lookup table. To update an augmented value in the micro piece, e.g., to color a boundary vertex, we replace the index i with a different index i'. See Figure 2 for an example. We have to take care while constructing the table that changes of the partial augmentation (i.e., the values stored for boundary vertices) do not change the internal labels of the graph, i.e., the names of the vertices stay the same independent of the changes we make to the data stored for boundary vertices. For the rest of our paper we always assume that we have access to such a lookup table.
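The following toy Python sketch mimics the swapping-indices idea: a table enumerates every (graph, augmentation) pair, a micro piece is stored as a table index, and an update to an augmented value is performed by moving to the index of the entry with the same graph but a different augmentation. The table contents (two placeholder graph names, a three-color vertex augmentation) are purely illustrative.

```python
from itertools import product

# One table entry per (graph, augmentation); entries sharing a graph keep
# identical internal vertex labels, only the augmentation differs.
GRAPHS = ["K2", "P3"]                # placeholders for encoded micro graphs
COLORS = ["white", "gray", "black"]  # a toy partial vertex augmentation
TABLE = [(g, c) for g, c in product(GRAPHS, COLORS)]

def recolor(index, new_color):
    """Update the augmentation of a micro piece by swapping its table index."""
    graph, _ = TABLE[index]
    return TABLE.index((graph, new_color))

piece = TABLE.index(("P3", "white"))  # a micro piece stored as an index
piece = recolor(piece, "gray")        # the stored index is swapped out
```

In the real encoding the table index is computed in constant time rather than by a linear search; the search here only keeps the sketch short.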
3 Depth-First Search in Sublinear Time and Space
We present our techniques for executing a DFS directly on the succinct encoding. First, we show how the DFS is computed such that a so-called palm tree is encoded inside the nested division (Subsection 3.1), followed by a description of how to provide queries for so-called strongly local values (Subsection 3.2). As examples of strongly local values we describe the technical realization of queries regarding the meta information of a DFS, e.g., pre-order numbers, lowpoints, lowest common ancestors, and more. We end the section with a description of how our main results can be realized.
For this section, let G be a separable graph given as a succinct nested division (Lemma 6). We begin with the description of how to execute a DFS that computes a palm tree P, a directed version of G in which the edges are directed according to the DFS traversal and, by their direction, partitioned into two sets, tree edges and back edges. Effectively, a palm tree is a DFS tree with additional back edges. Each vertex v is assigned a pre-order number pre(v) indicating at which point in time it was traversed for the first time. Analogously, a post-order number post(v) indicates at which point in time the DFS backtracked past the vertex (i.e., all children of v have been fully explored). All edges of the palm tree are directed such that, for a tree edge (v, w) of the DFS tree, it holds that pre(v) < pre(w), and for all other edges (v, w) (the back edges), it holds that pre(w) < pre(v). In the next subsection we compute some palm tree P directly on G and afterward provide constant-time queries regarding P such as iteration over all children of a vertex v. If we are only interested in the directed tree edges of P, we refer to it as the DFS tree.
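A recursive Python sketch of a palm tree on a small undirected graph illustrates these invariants: the DFS assigns pre-/post-order numbers and directs every edge, yielding tree edges (toward larger pre-order numbers) and back edges (toward smaller ones). The adjacency-dict input format is again our own illustration; simple graphs are assumed.

```python
def palm_tree(adj, root):
    """Direct the edges of a simple undirected graph according to a DFS."""
    pre, post, parent = {}, {}, {root: None}
    tree, back = [], []
    counter = [0, 0]

    def visit(v):
        pre[v] = counter[0]
        counter[0] += 1
        for w in adj[v]:
            if w not in pre:
                parent[w] = v
                tree.append((v, w))   # tree edge: pre(v) < pre(w)
                visit(w)
            elif w != parent[v] and pre[w] < pre[v]:
                back.append((v, w))   # back edge to an ancestor: pre(w) < pre(v)
        post[v] = counter[1]
        counter[1] += 1

    visit(root)
    return pre, post, tree, back
```

On a triangle rooted at vertex 0, the DFS yields two tree edges and one back edge returning to the root.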
3.1 Iterator based DFS to compute a Palm Tree
In addition to the standard stack-based approach for implementing a DFS, there is also an iterator-based approach that stores for each vertex an iterator over its adjacency list. We implement the second variant. Let P be the palm tree constructed by some DFS traversal. See Figure 3 for an example of the following description. We say the DFS enters a piece Q if there exist tree edges (u, v) and (v, w) with only the edge (v, w) part of the piece Q, or if the edge (v, w) is part of Q, (v, w) is a tree edge, and v is the root of P. We then call v an entry vertex of Q. Furthermore, we say the DFS exits Q if there exist tree edges (u, v) and (v, w) with only (u, v) part of Q and (v, w) not part of Q. We then call v an exit vertex of Q. We say the DFS starts at a piece Q if there exists a tree edge (v, w) with v the root of P, w the first vertex visited after v, and (v, w) part of Q. Note that an entry vertex of a piece Q is by definition a boundary vertex of Q, or the root of P. For us there is no difference between the two cases of entry vertex (i.e., boundary vertex or root). We refer to an exit vertex together with its matching entry vertex as an entry-exit pair. For easier description we refer to an entry vertex that has no matching exit vertex as part of an entry-exit pair where the entry is mapped to some special symbol, called the null exit. The following observation directly follows.
Observation 7.
Let k be the number of (micro/mini) boundary vertices contained in a (micro/mini) piece Q and let P be any palm tree. Then there are O(k) entry-exit pairs associated with Q.
Let Q be some micro piece stored as an index into the lookup table. We define a partial DFS state of Q as a coloring of the vertices of Q with the colors unvisited (white), currently visited (gray) and finished (black), and a constant number of (possibly empty) ordered sets of entry-exit pairs, ordered by, e.g., the pre-/post-order numbers of their respective entry vertices, with ties broken arbitrarily. By the capabilities of the table lookup outlined in the previous section, an index of the lookup table is able to encode this information. The idea is that, when the DFS enters a micro piece Q, represented by some index i of the lookup table encoding Q together with its partial DFS state, we obtain in constant time an index i' of the lookup table encoding Q together with the next partial DFS state. We then swap out the index i with the index i'. Note that this query depends on: the piece Q, the coloring of the vertices of Q, the ordered sets of entry-exit pairs and the vertex we used to just enter the piece (or the vertex we just backtracked to). This information is enough to obtain a new DFS state that correctly advances the DFS through Q while considering the current partial DFS state. Since such an advancement results in the DFS traversing up to (or backtracking to) the next micro boundary vertex, we obtain a new entry-exit pair in constant time. There are multiple possible viable partial DFS states that one could obtain by this operation, but we store one fixed state per entry of the lookup table, which is all that we require.
See Figure 3 again for a visualization of how the DFS progresses through a piece. The change from one state of the figure to the next is done in constant time for micro pieces.
A final note concerns marking vertices as gray or black in micro pieces. When we visit a micro boundary vertex v, we must color v in all micro pieces that contain v (under its respective micro label). The translation mappings allow us to iterate over all micro pieces that contain v in constant time per element output. For each such element output we must swap out the respective index of the micro piece via the lookup table. This is done exactly twice (white to gray, and gray to black) per duplicate of a micro boundary vertex, and thus the time is linear in the total number of micro boundary vertices.
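This coloring step can be sketched in a few lines of Python, assuming a hypothetical mapping `occurrences` that lists, for each vertex, the pieces and local labels under which it occurs; coloring then touches every duplicate exactly once.

```python
def color_everywhere(piece_colors, occurrences, v, color):
    """Set the color of vertex v in every micro piece that contains it."""
    for piece_id, local_label in occurrences[v]:
        # in the real encoding this would swap the piece's lookup-table index
        piece_colors[piece_id][local_label] = color
```

The total work over the whole DFS is proportional to the number of duplicates, since each duplicate is recolored a constant number of times.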
We now describe the data structures required for implementing the DFS. We begin with a description of how to implement an iterator over the neighborhood that can be paused and resumed. Note that while the nested division provides neighborhood iteration (Lemma 6), it is not guaranteed that this iteration can be paused and continued at a later point in time without starting the process from the beginning. First, we describe this process for micro boundary vertices. For each such vertex v store an array A_v containing the indices of each micro piece that contains v (under its respective micro label). An iterator for v now consists of an index into A_v together with an index into the adjacency array of v in the referenced micro piece. Note that the arrays can contain multiple entries for each micro boundary vertex. By Lemma 5 this is asymptotically no problem, as the number of entries in all arrays depends linearly on the number of (duplicates of) micro boundary vertices. The arrays can be built in sublinear time and space by iterating over all micro pieces, and inside each micro piece iterating over all micro boundary vertices. We store the analogous data structure for all mini boundary vertices.
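The pause-and-resume iterator can be sketched as an explicit two-level cursor: instead of an implicit stack frame, the iterator stores which piece and which position inside that piece's adjacency array it last visited, so iteration stops and continues exactly where it left off. The class and field names here are hypothetical illustrations.

```python
class PausableNeighborIterator:
    """Iterate over the neighbors of a boundary vertex spread over pieces.

    adj_per_piece lists, for each piece containing the vertex, the
    adjacency array of the vertex in that piece; (p, i) is the saved state.
    """
    def __init__(self, adj_per_piece):
        self.adj_per_piece = adj_per_piece
        self.p, self.i = 0, 0  # current piece, position inside its array

    def next(self):
        while self.p < len(self.adj_per_piece):
            if self.i < len(self.adj_per_piece[self.p]):
                w = self.adj_per_piece[self.p][self.i]
                self.i += 1
                return w
            self.p, self.i = self.p + 1, 0  # advance to the next piece
        return None                         # iteration finished
```

The saved state (p, i) is exactly the small per-vertex information the arrays above keep for boundary vertices.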
We now must store the state of each iterator during the DFS. Construct an array storing for each mini boundary vertex v the state of the iterator over v's neighborhood together with v's parent in the DFS tree and the color visited/unvisited. This uses O(log n) bits per mini boundary vertex. For each mini piece we construct arrays storing the respective information of micro boundary vertices. As each vertex of a mini piece (identified via its mini label) has a degree of at most r1 and its parent in the DFS tree can be identified by some mini label, this information can be stored with O(log r1) = O(log log n) bits per micro boundary vertex. All other vertices are handled by the lookup table. All other data structures use o(n) bits.
The DFS naturally computes the palm tree, but it remains to show how to store it such that afterward we can execute common queries such as iteration over all children (i.e., incident tree edges), or iteration over all back edges that start/end at a given vertex. We describe this for iteration over all children of a vertex; the other queries are realized in the exact same fashion. For micro non-boundary vertices these queries can be provided directly by the table lookup, or by the recursive structure. We thus focus on the boundary vertices. For each mini boundary vertex v we maintain a list L_v containing the ids of all mini pieces in which v has a child. Iteration over the children of v can then be expressed as an iteration over L_v, and for each id in L_v we output all children of v in the respective mini piece. Concretely, this is done by outputting all children of the mini label of v in that piece as their respective mini labels, and then translating them to their graph labels. For micro boundary vertices, analogous structures are built. The space analysis is analogous to the space analysis of the previous paragraph, i.e., it uses o(n) bits in total. We summarize the results in the next lemma.
Lemma 8.
Let be a separable graph given as a nested division . In time and bits we can execute a DFS on and store the resulting palm tree such that the following queries are available for any vertex in constant time (per element output).
-
Iteration over the children of in , ordered by .
-
Iteration over all back edges starting/ending at in .
-
Output the parent .
3.2 Meta Information of a Depth-First Search
During the computation of a DFS it is common to store various values that relate to the traversal, such as the pre-order number of a vertex , or more complex values such as the lowpoint of a vertex , defined as the lowest-numbered (by pre-order numbering) vertex reachable via a path starting at that consists of zero or more tree edges and at most one back edge. We can augment with this information as follows. We first describe the computation of the pre-order number ; post-order can be implemented in an analogous way. Assume that we have constructed the palm tree of Lemma 8.
Pre-order numbering.
We begin with an observation regarding the pre-order number of some non-boundary vertex assigned to some piece (the following description applies to both micro and mini pieces), and denote with the number of vertices of . Note that any DFS must enter via an entry vertex , which is either the root or a boundary vertex. Once the DFS leaves via an exit vertex, it can discover new vertices in only if it enters via another entry vertex or backtracks over the exit vertex. From this we observe two scenarios: vertex is visited either (1) within the next steps after the DFS enters via an entry vertex , or (2) within the next steps after the DFS backtracks over some exit vertex . For (1) we have for some entry vertex , and for (2) we have for some exit vertex . We have thus shown that the pre-order number of a non-boundary vertex can be computed as a small local offset plus the pre- or post-order number of some entry or exit vertex , called the reference boundary vertex. It is crucial that the local offset is bounded by the number of vertices in .
For a mini boundary vertex , we store and explicitly in an array with bits each. For a mini non-boundary vertex with mini label in a piece , we store the local offset together with its reference boundary vertex (stored as its mini label ). Storing these values explicitly for all vertices would use bits in total, and as such we again employ a recursive strategy: we apply the strategy above only to mini non-boundary vertices that are also micro boundary vertices, storing the local offset and reference boundary explicitly. For non-boundary vertices of a micro piece we use the table lookup to compute both the local offset and the reference boundary on-the-fly. This general strategy can be used for any strongly local value, possibly in combination with additional structures required for more involved queries. Simple values such as the post-order, the depth of a vertex, or the number of descendants can be handled exactly as the pre-order. For two of the more involved values, lowpoint and lowest-common ancestor, we describe the details next.
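The decomposition of a pre-order number into a small local offset plus the number of a reference boundary vertex can be sketched as follows. The dictionaries are illustrative stand-ins: the paper stores the boundary values in arrays and recovers the offsets of micro non-boundary vertices via table lookup, not via explicit per-vertex storage.

```python
def preorder(v, pre, post, offset, reference, ref_is_entry):
    """Reconstruct the pre-order number of a non-boundary vertex v.

    pre, post: explicitly stored pre-/post-order numbers of boundary vertices.
    offset[v]: local offset, bounded by the size of v's piece.
    reference[v]: v's reference boundary vertex b.
    ref_is_entry[v]: True for scenario (1), pre(b) + offset;
                     False for scenario (2), post(b) + offset.
    """
    b = reference[v]
    base = pre[b] if ref_is_entry[v] else post[b]
    return base + offset[v]
```

For instance, a vertex with offset 3 relative to an entry vertex whose pre-order number is 10 gets pre-order number 13.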
Lowpoints.
Recall that of a vertex is defined as the lowest pre-order number such that is reachable via a path that traverses the DFS subtree rooted at together with at most one back edge of the palm tree. The lowpoint is a crucial value for typical DFS applications such as identifying biconnected components. We show that is strongly local. Assume that is a non-boundary vertex in some piece . Denote with the number of vertices in . First note that any back edge must have in . Thus, either is , depends on a back edge that lies within , or it depends on the lowpoint of a boundary vertex that is a descendant of . Thus, we require the lowpoint of a boundary vertex of , together with a local offset . See Figure 4a.
The standard algorithm due to Hopcroft and Tarjan computes the lowpoints during a DFS traversal by updating them whenever the DFS backtracks to a vertex and whenever a back edge is discovered. For non-boundary vertices inside a micro piece, this update can be done via the table lookup using the augmentations provided by the lookup table for each boundary vertex of a micro piece . Thus, whenever the DFS enters or leaves a micro piece , we can update the current lowpoint values of each non-boundary vertex of . As with the pre-order numbers, we store the required values for the boundary vertices explicitly in an array. The runtime of updating all values across the DFS is .
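The update rule can be sketched with the textbook recursive formulation: a vertex's lowpoint is lowered whenever a back edge is found and whenever the DFS backtracks from a child. This is the plain Hopcroft–Tarjan computation on an adjacency list, not the table-lookup variant of the paper.

```python
def lowpoints(adj, root):
    """Compute pre-order numbers and lowpoints (Hopcroft-Tarjan style)."""
    n = len(adj)
    pre = [-1] * n       # pre-order numbers; -1 marks unvisited
    low = [0] * n        # lowpoint values
    parent = [-1] * n
    counter = 0

    def visit(v):
        nonlocal counter
        pre[v] = low[v] = counter
        counter += 1
        for w in adj[v]:
            if pre[w] == -1:                    # tree edge
                parent[w] = v
                visit(w)
                low[v] = min(low[v], low[w])    # update on backtrack
            elif w != parent[v]:                # back edge
                low[v] = min(low[v], pre[w])    # update on back-edge discovery

    visit(root)
    return pre, low
```

On a triangle 0-1-2 with a pendant vertex 3 attached to 2, the vertices 0, 1, 2 all get lowpoint 0 (they lie on a cycle through the root), while the pendant vertex keeps its own pre-order number.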
Lowest Common Ancestor.
The operation returns the root of the smallest subtree of a tree such that and are contained in , i.e., it is the deepest vertex in the tree such that and are descendants of , where deepest means having maximal distance from the root of . We show that LCA queries for two vertices can be reduced to strongly local values, one for each of the vertices. Let be two non-boundary vertices of a piece . If the LCA of lies within , then the value is clearly local. If the LCA does not lie within , there must be a path from to and to such that each of these paths contains a boundary vertex and , respectively. As there can be multiple such boundary vertices, let them be the first ones encountered by traversing the tree edges in reverse direction, i.e., the nearest such vertices. Then the LCA of and is equal to the LCA of and . Thus, for each pair of non-boundary vertices we only require knowledge of . The same is true if and are non-boundary vertices in different pieces. Again, and are the nearest ancestors that are boundary vertices in their respective pieces. Note that the choice of and is fixed for and , respectively, independent of the concrete query. See Figure 4b. To handle the LCA between mini boundary vertices, and, for each mini piece, between micro boundary vertices, we use a known data structure to evaluate the LCA queries, such as the data structure of Harel and Tarjan [14]. For a tree with vertices, this data structure is initialized in time and uses bits such that LCA queries are answered in constant time. We cannot input our entire DFS tree to this data structure, and as such construct special smaller trees that capture the ancestry only between mini boundary vertices, and between micro boundary vertices for each mini piece. The time to construct the necessary structures is , and we use bits.
We describe the concrete construction of the required smaller trees below. First observe the following. Let be some tree with vertex set and a set of important vertices such that, for pairs of vertices of , we want to know their LCAs. First, all leaves in are irrelevant (they cannot be the LCA of a pair of vertices in ) and we can remove them recursively. Next consider vertices of of degree . These vertices are also irrelevant: we can replace such a vertex and its two incident edges by a single edge . The resulting tree contains the vertices of and all vertices that are LCAs of pairs of vertices of . Since we recursively removed all vertices of with degree , the tree obtained has vertices.
Now consider the DFS tree and the set of mini boundary vertices. Taking the set of mini boundary vertices as the important vertices, we see that we can shrink such that it only contains vertices, without losing any information regarding the LCAs between boundary vertices. We proceed analogously for all micro boundary vertices inside a mini piece. The shrunken tree for a micro piece can be built via table lookup, where we can support queries for every piece and every arbitrary set of important vertices of the piece. Thus, we can obtain in time a (set of) shrunken LCA tree(s) for piece , which consists of the shrunken whole DFS tree with respect to the boundary vertices of the piece as important vertices. To build the shrunken tree for all mini boundary vertices, we iterate over all mini pieces and obtain shrunken with respect to the mini boundary vertices as important vertices, via table lookup. We can thus build all required shrunken trees in time using bits.
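The leaf and degree-2 pruning described above can be sketched directly on a parent-array representation of the tree: unimportant leaves are dropped, unimportant vertices with a single surviving child are contracted away, and branching vertices survive as LCA candidates. This is an illustrative sketch; the paper obtains the shrunken trees via table lookup per piece.

```python
def shrink_tree(parent, important):
    """Shrink a tree (parent array, root marked by parent -1) so that it
    keeps exactly the important vertices and the LCAs of pairs of them.

    Returns a parent map over the surviving vertices (-1 for the new root).
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    for v, p in enumerate(parent):
        if p != -1:
            children[p].append(v)

    def build(v):
        # Returns the shrunken subtree rooted below v, or None if empty.
        kept = [r for r in (build(c) for c in children[v]) if r is not None]
        if v in important:
            return (v, kept)   # important vertices always survive
        if len(kept) >= 2:
            return (v, kept)   # branching vertex: a potential LCA
        if len(kept) == 1:
            return kept[0]     # degree-2 vertex: contract its two edges
        return None            # unimportant leaf: remove

    result = {}

    def flatten(node, par):
        v, kids = node
        result[v] = par
        for k in kids:
            flatten(k, v)

    top = build(parent.index(-1))
    if top is not None:
        flatten(top, -1)
    return result
```

For example, in the tree with parent array `[-1, 0, 1, 1, 0]` and important vertices `{2, 3}`, vertex 4 is pruned, the root 0 is contracted, and vertex 1 survives as the LCA of 2 and 3.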
Lemma 9.
Let be a separable graph given as a nested division such that encodes the palm tree of some DFS in . After time preprocessing and using bits we can provide the following queries.
-
iteration over all children of .
-
: output the parent of .
-
/: return the pre-/post-order number of vertex .
-
: return the number of descendants of .
-
: return the depth of .
-
: return the lowpoint of .
-
: return the lowest-common ancestor of vertices and .
3.3 Constructing Nested Divisions Efficiently
For planar graphs we can construct the encoding of Lemma 6 efficiently, both in time and space. Kammer and Meintrup [21] presented a space-efficient algorithm for computing -relaxed divisions of minor-closed graph classes that roughly works as follows. The input graph is replaced with a minor that contains vertices such that each vertex of represents vertices of . Then, any algorithm can be used for computing an -relaxed division (a standard -division) of , which can then be turned into a -relaxed division of . As is executed on a graph with vertices, we achieve a speedup by a factor of , and a space reduction by the same factor. Translation between the graph and uses linear time and bits. When recursively constructing divisions of mini pieces we can use standard algorithms, as the graphs are small enough. For planar graphs, linear-time -division algorithms [12, 23] exist, and thus the entire computation of the recursive division runs in linear time. For other graph classes the runtime depends on how efficiently separators can be found.
The final non-trivial part of constructing our encoding is the translation mappings between the different levels. The mapping is constructed using the powerful fully-indexable dictionary (FID) of Raman et al. [26] that, for a given universe , uses bits in total when managing a set with , which fits Blelloch and Farzan’s use case. Effectively, this FID manages a compressed bit vector of length in which exactly the bits at the indices in are set to . Due to the use of Raman et al.’s FID, their construction takes polynomial time even for the simple case of encoding planar graphs and without regard to space-efficiency during the construction step. While Raman et al. mention that their data structure can be constructed in expected linear time, a deterministic linear-time construction is not known; this is due to a certain hash function that is required. When succinctness is not required, much simpler dictionaries suffice for the translation mappings. In this case, we effectively follow the same techniques outlined by Blelloch and Farzan, but use the indexable dictionary of Baumann and Hagerup, which can be constructed in time and uses bits for managing any set :
Lemma 10 ([5]).
Let be a bit string. In time and bits a data structure can be constructed such that afterward the following queries can be executed in constant time for .
Using the FID of Lemma 10 we can construct a compact variant of the encoding of Lemma 6 in time and bits. If one is fine with expected linear time, the FID of Raman et al. [26] can be used instead and the encoding remains succinct. The functionalities of the different variants are the same.
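A translation mapping of this kind boils down to rank/select queries on a bit vector marking which elements of the universe belong to the managed set. The following minimal sketch answers both queries with plain prefix sums; it is deliberately not compressed (it uses far more space than either dictionary discussed above) and only illustrates the query interface.

```python
class SimpleRankSelect:
    """Plain rank/select over a bit string; simple but not succinct.

    rank(i): number of 1-bits among bits[0..i-1].
    select(k): position of the (k+1)-th 1-bit (k counted from 0).
    """

    def __init__(self, bits):
        self.rank_prefix = [0]
        for b in bits:
            self.rank_prefix.append(self.rank_prefix[-1] + b)
        self.ones = [i for i, b in enumerate(bits) if b]

    def rank(self, i):
        return self.rank_prefix[i]    # prefix sum, constant time

    def select(self, k):
        return self.ones[k]           # direct lookup, constant time
```

For the bit vector `[0, 1, 1, 0, 1]`, `rank(3)` returns 2 and `select(2)` returns 4. Rank translates a global label to a local one; select translates back.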
Corollary 11 (Compact Nested Division).
For planar graphs, a compact (succinct) variant of the encoding of Lemma 6 can be constructed with bits and time (expected time).
Combining Corollary 11 with Theorem 1 we are able to show Corollary 2. We give a more general version for arbitrary separable graphs next. In the following we denote with the runtime of an algorithm for graphs with vertices of some separable graph class that recursively computes a balanced separator until the graphs are small enough (of poly-logarithmic size), and with the number of bits required by such an algorithm.
Corollary 12.
Let be a graph of a separable graph class . There exists a compact (succinct) encoding of that can be constructed in time (expected time) and bits that provides the same functionalities as the encoding of Theorem 1.
References
- [1] Hüseyin Acan, Sankardeep Chakraborty, Seungbum Jo, and Srinivasa Rao Satti. Succinct data structures for families of interval graphs. In 16th International Symposium on Algorithms and Data Structures (WADS 2019), pages 1–13. Springer, 2019. doi:10.1007/978-3-030-24766-9_1.
- [2] Luca Castelli Aleardi, Olivier Devillers, and Gilles Schaeffer. Optimal succinct representations of planar maps. In Proceedings of the Twenty-Second Annual Symposium on Computational Geometry (SCG 2006), pages 309–318. ACM, 2006. doi:10.1145/1137856.1137902.
- [3] Tetsuo Asano, Taisuke Izumi, Masashi Kiyomi, Matsuo Konagaya, Hirotaka Ono, Yota Otachi, Pascal Schweitzer, Jun Tarui, and Ryuhei Uehara. Depth-first search using bits. In Hee-Kap Ahn and Chan-Su Shin, editors, 25th International Symposium on Algorithms and Computation (ISAAC 2014), pages 553–564. Springer, 2014. doi:10.1007/978-3-319-13075-0_44.
- [4] Niranka Banerjee, Sankardeep Chakraborty, Venkatesh Raman, and Srinivasa Rao Satti. Space efficient linear time algorithms for BFS, DFS and applications. Theory Comput. Syst., 62(8):1736–1762, 2018. doi:10.1007/S00224-017-9841-2.
- [5] Tim Baumann and Torben Hagerup. Rank-select indices without tears. In 16th International Symposium on Algorithms and Data Structures (WADS 2019), pages 85–98. Springer, 2019. doi:10.1007/978-3-030-24766-9_7.
- [6] Guy E. Blelloch and Arash Farzan. Succinct representations of separable graphs. In 21st Annual Symposium on Combinatorial Pattern Matching (CPM 2010), pages 138–150. Springer, 2010. doi:10.1007/978-3-642-13509-5_13.
- [7] Sankardeep Chakraborty, Venkatesh Raman, and Srinivasa Rao Satti. Biconnectivity, Chain Decomposition and st-Numbering Using O(n) Bits. In 27th International Symposium on Algorithms and Computation (ISAAC 2016), volume 64 of LIPIcs, pages 22:1–22:13. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ISAAC.2016.22.
- [8] Yi-Ting Chiang, Ching-Chi Lin, and Hsueh-I Lu. Orderly spanning trees with applications to graph encoding and graph drawing. In 12th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2001), pages 506–515. SIAM, 2001. URL: http://dl.acm.org/citation.cfm?id=365411.365518.
- [9] Amr Elmasry, Torben Hagerup, and Frank Kammer. Space-efficient Basic Graph Algorithms. In 32nd International Symposium on Theoretical Aspects of Computer Science (STACS 2015), volume 30 of LIPIcs, pages 288–301. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2015. doi:10.4230/LIPIcs.STACS.2015.288.
- [10] Greg N. Frederickson. Fast algorithms for shortest paths in planar graphs, with applications. SIAM Journal on Computing, 16(6):1004–1022, 1987. doi:10.1137/0216064.
- [11] Richard F. Geary, Rajeev Raman, and Venkatesh Raman. Succinct ordinal trees with level-ancestor queries. ACM Trans. Algorithms, 2(4):510–534, 2006. doi:10.1145/1198513.1198516.
- [12] M.T. Goodrich. Planar separators and parallel polygon triangulation. Journal of Computer and System Sciences, 51(3):374–389, 1995. doi:10.1006/jcss.1995.1076.
- [13] Torben Hagerup. Space-efficient DFS and applications to connectivity problems: Simpler, leaner, faster. Algorithmica, 82(4):1033–1056, 2020. doi:10.1007/S00453-019-00629-X.
- [14] Dov Harel and Robert Endre Tarjan. Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing, 13(2):338–355, 1984. doi:10.1137/0213024.
- [15] Jacob Holm, Giuseppe F. Italiano, Adam Karczmarz, Jakub Lacki, and Eva Rotenberg. Decremental SPQR-trees for Planar Graphs. In 26th Annual European Symposium on Algorithms (ESA 2018), volume 112 of LIPIcs, pages 46:1–46:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ESA.2018.46.
- [16] Jacob Holm and Eva Rotenberg. Good -divisions imply optimal amortized decremental biconnectivity. Theory Comput. Syst., 68(4):1014–1048, 2024. doi:10.1007/S00224-024-10181-Z.
- [17] John Hopcroft and Robert Tarjan. Algorithm 447: efficient algorithms for graph manipulation. Commun. ACM, 16(6):372–378, June 1973. doi:10.1145/362248.362272.
- [18] Taisuke Izumi and Yota Otachi. Sublinear-Space Lexicographic Depth-First Search for Bounded Treewidth Graphs and Planar Graphs. In Artur Czumaj, Anuj Dawar, and Emanuela Merelli, editors, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020), volume 168 of LIPIcs, pages 67:1–67:17, Dagstuhl, Germany, 2020. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ICALP.2020.67.
- [19] G. Jacobson. Space-efficient static trees and graphs. In 30th Annual Symposium on Foundations of Computer Science (FOCS 1989), pages 549–554, 1989. doi:10.1109/SFCS.1989.63533.
- [20] Frank Kammer, Dieter Kratsch, and Moritz Laudahn. Space-efficient biconnected components and recognition of outerplanar graphs. Algorithmica, 81(3):1180–1204, 2019. doi:10.1007/S00453-018-0464-Z.
- [21] Frank Kammer and Johannes Meintrup. Space-Efficient Graph Coarsening with Applications to Succinct Planar Encodings. In 33rd International Symposium on Algorithms and Computation (ISAAC 2022), volume 248 of LIPIcs, pages 62:1–62:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.ISAAC.2022.62.
- [22] Kenneth Keeler and Jeffery Westbrook. Short encodings of planar graphs and maps. Discrete Appl. Math., 58(3):239–252, 1995. doi:10.1016/0166-218X(93)E0150-W.
- [23] Philip N. Klein, Shay Mozes, and Christian Sommer. Structured recursive separator decompositions for planar graphs in linear time. In 45th Annual ACM Symposium on Theory of Computing (STOC 2013), pages 505–514. Association for Computing Machinery, 2013. doi:10.1145/2488608.2488672.
- [24] Richard J. Lipton and Robert Endre Tarjan. A separator theorem for planar graphs. SIAM Journal on Applied Mathematics, 36(2):177–189, 1979. doi:10.1137/0136016.
- [25] J. Ian Munro, Patrick K. Nicholson, Louisa Seelbach Benkner, and Sebastian Wild. Hypersuccinct Trees - New Universal Tree Source Codes for Optimal Compressed Tree Data Structures and Range Minima. In 29th Annual European Symposium on Algorithms (ESA 2021), volume 204 of LIPIcs, pages 70:1–70:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPIcs.ESA.2021.70.
- [26] Rajeev Raman, Venkatesh Raman, and Srinivasa Rao Satti. Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Trans. Algorithms, 3(4):43–es, 2007. doi:10.1145/1290672.1290680.
- [27] Omer Reingold. Undirected connectivity in log-space. J. ACM, 55(4), September 2008. doi:10.1145/1391289.1391291.
- [28] Jens M. Schmidt. A simple test on 2-vertex- and 2-edge-connectivity. Information Processing Letters, 113(7):241–244, 2013. doi:10.1016/j.ipl.2013.01.016.
