Edge-Minimum Walk of Modular Length in Polynomial Time
Abstract
We study the problem of finding, in a directed graph, an -walk of length which is edge-minimum, i.e., uses the smallest number of distinct edges. Despite the vast literature on paths and cycles with modularity constraints, to the best of our knowledge we are the first to study this problem. Our main result is a polynomial-time algorithm that solves this task when and are constants.
We also show how our proof technique gives an algorithm to solve a generalization of the well-known Directed Steiner Network problem, in which connections between endpoint pairs are required to satisfy modularity constraints on their length. Our algorithm is polynomial when the number of endpoint pairs and the modularity constraints on the pairs are constants.
In this version of the article, proofs and examples are omitted because of space constraints. Detailed proofs are available in the full version [3].
Keywords and phrases:
Directed Steiner Network, ModularityFunding:
Antoine Amarilli: Amarilli was partially supported by the ANR project EQUUS ANR-19-CE48-0019, by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 431183758, and by the ANR project ANR-18-CE23-0003-02 (“CQFD”).Copyright and License:
![[Uncaptioned image]](x1.png)
2012 ACM Subject Classification:
Theory of computation Parameterized complexity and exact algorithms ; Theory of computation Shortest pathsAcknowledgements:
We are grateful to the reviewers of the conference version for their helpful feedback. We also would like to thank the Simons Institute Fall 2023 programs “Logic and Algorithms in Database Theory and AI” and “Data Structures and Optimization for Fast Algorithms” for the initiation of this work. Last, we are grateful to Xiao Hu and Mikaël Monet for early discussions.Editors:
Raghu MekaSeries and Publisher:

1 Introduction
We begin with a simple question: Given an -vertex, -edge directed graph and terminals ,, can we efficiently find an odd-length -walk that is “edge-minimum”, i.e., has the minimum number of distinct edges? This question may appear similar to classical problems from the vast literature on paths and cycles with parity constraints. For instance, one may think of the classical “shortest odd -path” problem, which is well-known to be NP-hard (this is via a simple reduction from the 2-disjoint paths problem of [19], and can be proved in the same way as [37, Proposition 2.1]). However, this hardness result only applies to simple paths, whereas the edge-minimum odd -walk may not be a simple path. One may also think of the “shortest odd -walk” problem (i.e., minimizing the length), which is well-known to be in polynomial time111Make two copies of the vertex set, and for every edge in the original graph, add the edges and . Then find the shortest walk from to . This also trivially generalizes to modularities which are polynomial in .. However, the edge-minimum odd -walk may be longer than the shortest odd -walk while using fewer distinct edges. See Figure 1 for an example of a graph in which the solutions to all three of these problems is different.
The focus of this paper is the Edge-Minimum Walk of Modular Length problem, which is the above problem generalized to an arbitrary modularity and remainder . Let us define it formally:
Edge-Minimum Walk of Modular Length (EWM):
Input: An unweighted directed222One could also ask
this question on an undirected graph. However, in this case, one unnatural
aspect of the problem is that one can always traverse the same edge back and
forth over and over to add any multiple of 2 to the path length.
We use this observation to show in the full version [3] that EWM on undirected graphs reduces to EWM on directed
graphs. graph , terminals ,, and non-negative integers .
Output: An -walk of length which is edge-minimum, i.e., uses the minimum number of distinct edges (or if no such walk exists).
We stress that the modularity constraint does not apply to the number of distinct edges, but only to the length of the walk. We are most interested in the regime where , and hence , are constants; our algorithms will work without this assumption, but they achieve worse complexities.
Despite the vast literature on paths and cycles of given modularities [31, 26, 37, 36, 39, 27, 4, 38, 10, 41, 34, 32, 29, 40, 24, 1, 35, 6, 25, 14, 16, 7, 23] (see [2] for a survey), to the best of our knowledge we are the first to study the EWM problem.
EWM can also be viewed as a problem in the field of network design. Network design problems ask questions of the form “find an edge-minimum subgraph with a certain property”. Famous network design problems include, for instance, Minimum Spanning Tree, Traveling Salesperson, and -Shortest Path. We can give an equivalent rephrasing of EWM in this way: find an edge-minimum subgraph that contains an -walk of length . The network design problem most related to EWM is the Directed Steiner Network problem (DSN) [17, 18, 8, 20, 15, 9, 33, 21, 13]. In DSN, the input is a directed graph and a set of terminal pairs (,),…,(,); the goal is to find an edge-minimum subgraph that contains a path for all terminal pairs. When is arbitrary, DSN generalizes Directed Steiner Tree and is therefore NP-hard. When is constant, Feldman and Ruhl [17] showed that DSN can be solved in polynomial time. In fact, we show in Section 8 that, in the special case where the set of terminal pairs is strongly connected, the DSN problem with constant can be directly expressed as a special case of EWM with constant . As a consequence, our main result will imply, as a byproduct, the known result that this special case of DSN is in polynomial time for constant [17] (albeit with a larger exponent). We also study a problem generalizing both EWM and DSN and give an algorithm that subsumes the tractability of both problems, as we will explain later in the introduction.
EWM is also related to problems from database theory, in particular the evaluation of regular path queries (RPQs) on graph databases. More precisely, a graph database is a graph whose edges are labeled with symbols from a fixed alphabet, and an RPQ is a regular expression over the alphabet. The results of the RPQ are vertex pairs such that there is a walk from to (or sometimes a simple path or a trail, depending on the variant) whose edge labels from a word that matches (see e.g. [12, 11, 5, 28]). In particular, RPQs of the form express walks of length for constant and . The EWM problem then corresponds to the smallest witness problem [30, 22] for such queries, which is the problem of finding a sub-database of minimum size that satisfies the query. However, to the best of our knowledge there is no prior work on the smallest witness problem for RPQs enforcing modularity conditions, or indeed for RPQs in general: our work can be seen as a first step towards addressing this problem.
Our results.
What is the complexity of EWM for constant ? It is unclear a priori; EWM is related to both polynomial-time solvable problems such as Shortest Modular-Length -Walk and Directed Steiner Network, as well as NP-hard problems such as Shortest Modular-Length -Path. Our main result is that EWM is in polynomial time for constant :
Theorem 1.1.
There is an algorithm solving EWM in time: .
Specifically, the exponent of in Theorem 1.1 has reasonable constants: , though we did not focus on optimizing them.
Generalizations.
A natural generalization of EWM is to consider weighted graphs. EWM admits two natural notions of weights: (1) each edge has a length, and the length of a walk is the sum of the lengths of the edges, and (2) each edge (or vertex) has a cost and the edge-minimum walk is measured in terms of the sum of the costs of the edges (or vertices) that it traverses. We show in Section 7 that our algorithm extends to accommodate costs, and that it can also accommodate lengths provided that their values are polynomial. In the case where lengths and the value are super-polynomial, we show that the problem is NP-complete by reduction from Subset Sum.
Inspired by Directed Steiner Network, we also extend our results in Section 8 from one walk to multiple walks. The input is a directed graph , and endpoint pairs along with modularity constraints: , and the goal is to find an edge-minimum subgraph that contains a walk from to of length , for all . Generalizing our main result, we show (Theorem 8.4) that this problem, which subsumes the DSN and EWM problems, also admits a polynomial-time algorithm for constant values and a constant number of endpoint pairs. Our approach to show the result focuses on the case of strongly connected solutions (in line with the connection to DSN mentioned earlier), before extending the proof to arbitrary solutions by re-using lemmas from [18].
2 Technical Overview
Our work is related to the Directed Steiner Network (DSN) problem studied by Feldman and Ruhl [17], where we must find an edge-minimum subgraph of the input directed graph which satisfies connectivity requirements. Their work shows that the DSN problem is tractable when the number of endpoint pairs in the connectivity requirements is constant. Following their work, Feldmann and Marx [18] have investigated the parameterized complexity of the DSN problem. One key idea of their approach is to show bounds on the cutwidth of solutions to the DSN problem. Informally, cutwidth is a parameter that limits how many edges are “cut” when arranging vertices in a well-chosen order . Bounding the cutwidth of a graph ensures that we can go over vertices in the order of and that, at any cut along , all edges except a constant number will be entirely to one side of the cut. (See Section 3 for the formal definition of cutwidth.)
Feldmann and Marx show that when the number of endpoint pairs is bounded by a constant , then DSN instances always have a solution of constant undirected cutwidth (depending only on ). They further show that bounded-cutwidth solutions can be computed in polynomial time by dynamic programming. We follow the same framework and show that solutions to EWM also have bounded cutwidth. Then we can find the solution by an approach similar to the dynamic programming algorithm of [18], or to the token game of [17].
Thus, our main goal is to solve the purely structural problem of bounding the cutwidth of edge-minimum -walks of length in arbitrary directed graphs. We only focus on this goal in the rest of the technical overview. To provide intuition, we will begin by considering the case of , (i.e. odd walks), and then outline the obstacles that arise when generalizing to arbitrary .
Odd walks.
Let denote an edge-minimum odd-length -walk, and let be the subgraph of consisting of the union of edges in the walk . We first observe that, without loss of generality, does not contain any even cycles: If contained an even cycle, we could simply delete it, and would still be of odd length, and still be edge-minimum. (To clarify, we are not referring to even cycles in , but rather even cycles in the walk itself. After deleting a cycle from , edges from can still appear in if they are traversed at another point of the walk .)
Now suppose contains two odd cycles in succession. Again, we could simply delete both cycles, and would still be of odd length, and still be edge-minimum. So, we can suppose without loss of generality that has either no cycles (and trivially has small cutwidth), or consists of a simple path , followed by an odd cycle , followed by a simple path . We assume this latter case in the rest of the discussion, and we define by looking at first time that the walk re-visits a previously visited vertex: this ensures that, by construction, is vertex-disjoint from . However, the graph does not necessarily have small cutwidth, because could intersect and in intricate ways forming an unbounded number of nested cycles. It is a priori conceivable that, e.g., a grid-like structure with high cutwidth could emerge. However, we can show that this does not happen.
First, we can observe that once leaves , without loss of generality it does not return. This is because if leaves at vertex and then returns at vertex , we can delete the subpath of from to , and instead traverse only the edges of to get from to . This way, we can still ensure that is of odd length since every traversal of changes the parity of , and we can traverse for “free” without adding any extra edges.
Finally, we consider how and can interact. We can observe that without loss of generality cannot hit a vertex on , then leave , then return to a later vertex on . In this case we could delete the subpath of from to , and instead traverse to get from to . Again, we can still ensure that is of odd length since we can always traverse for free to change the parity of . We stress that this argument only holds when re-enters at a later vertex, and in fact, can re-enter at an earlier vertex an unbounded number of times.
With all of these observations in mind, we conclude that if has a cycle, then must have the very specific structure depicted in Figure 2. Then, it is visually clear that any vertical cut only has a constant number of crossing edges, and thus has constant cutwidth.
This concludes the outline for the case of odd walks. The same argument holds for even walks, and more generally an argument for a particular choice of constants is easily extendable by reduction to other constant values with the same , simply by adding a new source connected to the original source by a path of constant length mod . The challenging part is extending to arbitrary . Our approach is to define a segment decomposition of the walk and prove that each successive segment allows us to expand the reachable set of remainder values, so that the number of segments is bounded as a function of (Lemma 4.3). We then show that the cutwidth of a walk can be bounded linearly in its number of segments (Proposition 5.1): we do this by defining a suitable ordering of the vertices following the chunks of the walk, which refine the segments in the decomposition. These two results imply that solutions to EWM have bounded cutwidth and hence that they can be found in polynomial time.
3 Preliminaries
In this section, we present preliminary notions used throughout the proof of our main result (Theorem 1.1).
Basic definitions.
All graphs in the paper are finite and directed. Graphs are not necessarily simple; they may contain self-loops, but they do not contain multi-edges. Given a directed graph , a walk in is a sequence of edges such that the target vertex of is equal to the source vertex of for each . Note that an edge of may occur at several positions in : considering a position for which , we write to refer to that occurrence of at the -th position of . However, when convenient we abuse notation and identify with itself; e.g., we talk about the source and target vertices of to mean those of . For we call a first-visited edge occurrence if it is the first occurrence of some edge , i.e., if there is no such that . Otherwise, is a revisited edge. Of course, each edge that occurs in is first-visited at exactly one position.
The source vertex of is that of , and its target vertex is that of : we then call an -walk. The length of is . The set of vertices used by , denoted , is simply the set of vertices that occur in , i.e., occur in some edge of ; in particular the source and target vertices of are in . The set of edges used by is . The subgraph spanned by is . Note that is not the subgraph induced by , as it only contains the edges that actually occur on the walk. A subwalk of is a contiguous subsequence of . We denote subwalks of with the slicing convention, i.e., and . Note that the right endpoint is included so that, e.g., is empty precisely when , and is empty precisely when . When and are walks and when the target vertex of is the source vertex of , we write to mean the walk obtained by concatenating and : it admits and as subwalks, and of course .
For convenience, throughout the paper we denote by the graph spanned by the subwalk (up to included); by convention is always the graph with the single vertex and no edges.
Strongly connected components.
A strongly connected component (SCC) of a graph is a maximal set of vertices of such that, for any two distinct vertices , there is a directed path from to in . As usual, belonging to the same SCC is an equivalence relation, so that SCCs form a partition of the vertices of . We say that an SCC of is non-trivial if the subgraph of induced by contains at least one edge. This is the case if and only if contains more than one vertex, or contains a single vertex on which there is a self-loop.
EWM and solutions.
We study the Edge-Minimum Walk of Modular Length problem (EWM), as defined in the introduction: given a directed graph , terminals ,, and non-negative integers , we wish to compute a subset of such that has an -walk of and is edge-minimum (i.e., the number of edges of is minimum). This phrasing is somewhat different from the one given in the introduction, in which we wanted to compute an edge-minimum walk. However, we can easily compute an edge-minimum walk from by a product construction, in time .
In the sequel, we only consider the computation of edge-minimum subsets of edges (rather than walks). When fixing inputs , , , , and , we talk about a candidate solution to mean a subset of such that has an -walk of length but is not necessarily edge-minimum, and of an optimal solution to mean a candidate solution which is additionally edge-minimum.
Cutwidth.
The cutwidth is a structural parameter of graphs. We show that there always exists an optimal solution to EWM with bounded cutwidth, which suffices to ensure that such solutions can be found with an efficient algorithm. We first recall the definition of cutwidth for undirected graphs. Let be an undirected graph. An ordering of is a total order over the vertex set of . A cut of that respects the ordering is a partition such that for all . We say that an edge of crosses the cut if it has one endpoint in and one in , i.e., and are both nonempty. Note that self-loops never cross cuts. The cutwidth of a cut that respects is the number of edges that cross this cut, and the cutwidth of is the maximum cutwidth of a cut that respects . The cutwidth of is then the minimum cutwidth of an ordering of .
We now define cutwidth for directed graphs. Let be a directed graph. Its underlying undirected graph is where : note that self-loops are not reflected in . We then define the cutwidth of to be that of . We underscore that our definition of cutwidth for directed graphs is always the undirected cutwidth: we do not consider the directed cutwidth in this paper. We then define the cutwidth of a walk of as the cutwidth of the directed graph .
4 Segment Decomposition of a Walk
Our proof of our main theorem (Theorem 1.1) consists of three steps, spanning this section and the next two sections. First, in this section we introduce the notion of the segment decomposition of a walk, and we show that any walk can be transformed into a walk that satisfies the same modularity conditions (i.e., has the same remainder modulo ), uses a subset of the edges (i.e., ), and has a segment decomposition with segments. Second, in the next section, we will show how the cutwidth of the graph spanned by can be bounded linearly in this number of segments, implying that optimal solutions to EWM have bounded cutwidth. Third, in Section 6, we show that this cutwidth bound makes it possible to solve EWM in polynomial time.
Segment decomposition.
Let be a walk. Its segment decomposition is a sequence of walks such that ; the number is the number of segments of . The segment decomposition is defined by processing the walk from left to right as follows. Initially, the segment decomposition is the empty sequence, and the entire walk remains to be decomposed. At any point of the decomposition, we have already computed the first segments for some number of segments, and can write . If is non-empty, we compute the next segment as a prefix of .
Informally, we read and terminate the segment whenever we have a first-visited edge followed (not necessarily immediately) by an edge where can be reached by a path from using only edges of up to excluded. The intuition of segments is that they capture a moment where the walk revisits a vertex from a vertex via a path that starts with an edge which had not been previously visited in , even while there was already a path from to in the graph spanned by the walk before was visited. Formally, we will look for the segment end inside the suffix of , according to the following definition:
Definition 4.1 (Segment end, Segment detour).
Having fixed , let be the total length of the already-computed segments. For , we say that segment ends at if is minimum such that the following hold:
-
There is such that is first-visited in ; we write as ;
-
Writing the edge as , then there is a path from to in the graph which contains the edges of the walk up to excluded. Note that, in particular, if then this requirement is always satisfied using the empty path from to .
We stress that the path from to required by the second item of the definition is a path in the graph spanned by a certain prefix of ; it may be the case that the path does not occur as a subwalk of . As for the subwalk from to in , which is a subwalk of , we call it the segment detour of , denoted . Note that, having chosen the minimal , there could be multiple valid choices for , and we pick one arbitrarily.
Given the definition of a segment end above, the next segment of the decomposition after is simply . This means that we terminate the -th segment right after the edge , and continue the decomposition if does not yet cover the entire walk . If there is no segment end at any , then we take all the rest of the walk to be the last segment of the decomposition and we finish. Note that for this reason, unlike other segments, the last segment of the decomposition may not feature a segment detour.
Let us formalize which paths are known to exist at the moment when a segment ends:
Lemma 4.2.
Let be a walk and let be the -th segment of the walk. Let be the ending vertex of , and assume that , i.e., the segment finishes before the end of the walk. Let in be the first edge of the segment detour . There is:
-
the segment detour from to in
-
a path from to in .
-
a path from to in .
Notice that we may have , and then and may be empty; but is never empty because it contains at least .
In the sequel for each we denote by and a pair of paths obtained by applying Lemma 4.2 (if there are multiple valid choices for and , we choose arbitrarily).
Achievable differences.
For a walk and segment , we denote by the value and call this the333Note that there could be several choices for the difference achievable by depending on the arbitrary choices made earlier, e.g., depending on the first edge for the segment detour, and on the choice of paths and ; nevertheless, we fix one single value according to the choices made. difference achievable by . We often drop the subscript when clear from context. Intuitively, we know that we can modify the walk to replace the subwalk from to by the existing path that goes from to , and this change subtracts from the length of the walk modulo ; we will make this formal in the sequel. It should also be intuitively clear that achievable differences can be assumed to be non-zero in an edge-minimum walk: intuitively, an achievable difference of 0 means that we can replace the detour by the existing path that has the same endpoints, without changing the length of the walk modulo . We know that this change does not harm the edge-minimality of because consists of already visited edges. What is more, having too many segments with the same achievable difference is useless, because we can intuitively replace by and compensate the effect of this change by doing similar substitutions in earlier segments. Then, thanks to Lagrange’s theorem on the additive group , the number of segments can in fact be bounded by . We formalize this in the following lemma, which is the main result of this section:
Lemma 4.3 (Segment Decomposition Lemma).
For every -walk with length , there is another -walk of length using only edges from whose number of segments is at most .
The rest of this section is devoted to proving the Segment Decomposition Lemma; we will then use it in the next section to bound the cutwidth of some edge-minimum solution.
Recall that a preorder over a set is a binary relation that is both reflexive and transitive (this is a weaker requirement than non-strict partial orders which are additionally required to be antisymmetric). To prove the Segment Decomposition Lemma (Lemma 4.3), we define a preorder over the edge-minimum walks that are using a specific set of edges. Intuitively, the preorder favors walks which do their first visit of edges as late as possible relative to the end of the walk. More precisely, the preorder criterion asks us to minimize the number of remaining steps of the walk at the moment where the last first-visited edge is visited; then break ties by minimizing the number of remaining edges at the moment where the second-to-last first-visited edge is visited; and so on. This is designed to ensure that replacing a detour by improves the walk according to the preorder.
Let us fix from now on the graph , the source and target , and the integers and . For any subset of edges , we define an -walk to mean an -walk of length in which uses precisely the edges of . Let us then define the order:
Definition 4.4.
Let be a subset of edges and let . Let be an -walk, and let be the indexes of the first-visited edges in ascending order, i.e., are the first-visited edges of and . The first-visited timestamp of is then the -tuple .
We define as follows a preorder relationship called the timestamp preorder on -walks. Let and be two -walks, and let and be their first-visited timestamps, respectively. Then we have if we have in lexicographic ordering.
In other words, the timestamp preorder compares two walks by the number of remaining edges to traverse when visiting the last first-visited edge, then the second-to-last, and so on. Note that the timestamp preorder is not antisymmetric because two different walks on the same set of edges may happen to have the same first-visited timestamp (i.e., they visit first-visited edges at the same positions from the end, even though the identity of these edges and the revisits may be different). We then define:
Definition 4.5.
If is an -walk, we say is timestamp-minimum if, for every -walk , we have .
Note that, whenever there is an -walk, then there is a timestamp-minimum -walk, but it is not necessarily unique because the timestamp preorder is not antisymmetric. We are now ready to conclude the section by proving the Segment Decomposition Lemma (Lemma 4.3):
Proof sketch.
We show that a timestamp-minimal walk has at most segments. We consider the successive subgroups of generated by the achievable differences of the successive segments, and show that these must be a strictly increasing subsequence, so that the bound follows by Lagrange’s theorem. To do this, we show as an intermediate claim that walks can be modified to augment their length by any combination of achievable differences of previous segments, while still using the same edges. Thanks to this claim, assuming by contradiction that there is a segment whose achievable difference is already achievable using the preceding segments, then we rewrite the segment to replace its detour by the path from Lemma 4.2, and use the claim to modify the walk and fix its length. We then show that this modification yields a walk which is smaller in the timestamp preorder, contradicting the minimality of . See the full version [3] for the full proof.
5 Number of Segments and Cutwidth Bounds
In this section, we continue our proof of Theorem 1.1 by showing that, for any walk , the number of segments in the decomposition of the previous section gives a bound on the cutwidth of up to a constant factor. This result is completely independent from the definition of the EWM problem. Formally, we show:
Proposition 5.1 (Segment Cutwidth Bound).
For any walk , letting be the number of segments in its segment decomposition, the cutwidth of is at most .
We remark that there is no converse of this result: a walk formed of a succession of cycles connected by single edges will have an unbounded number of segments but has cutwidth 2. Combining Proposition 5.1 with Lemma 4.3 establishes the following:
Corollary 5.2.
For every -walk with length , there is another -walk of length using only edges from such that the cutwidth of is at most .
This implies the following bound on the cutwidth of optimal solutions to the EWM problem, on which our algorithm relies:
Corollary 5.3.
For any graph , terminals ,, and non-negative integers , every optimal solution to the EWM problem on , , , , and has cutwidth at most .
To prove Proposition 5.1, we need several intermediate steps. First, we define the notion of chunk of a walk, which is a contiguous sequence of first-visited edges whose intermediate vertices are also first-visited. Second, we define the ordering on the vertices of along which the cutwidth bound will be shown: this order is defined using the notion of chunks, intuitively because all vertices of a chunk will be ordered relative to the already-visited vertices that are the endpoints of the chunk (also distinguishing special cases like cycle chunks and tadpole chunks). Third, we show that the cutwidth along the order is bounded by , by showing for each segment that the number of times its first-visited edges can cross the cut is at most . Throughout this section, we fix an arbitrary walk in a graph , and call and its source and target vertices.
Chunks.
We define first-visited vertices similarly to first-visited edges: for every vertex occurring in , we say is first-visited at if is the first edge in which occurs. Note that a vertex which is first-visited at always occurs as the target vertex of , except in the specific case of the first edge where both the source and the target vertex are first-visited at . Of course, every vertex of is first-visited at exactly one position. Further, if a vertex is the target vertex of an edge and is first-visited at , then both and (if it exists) must be first-visited edges.
We can now define chunks:
Definition 5.4.
A chunk of is a maximal subwalk of where all edges are first-visited, and where, for every (if any), the target vertex of is first-visited at .
In other words, a chunk is a maximal sequence of one or more consecutive first-visited edges such that the first and last vertex are not first-visited (unless they are extremities of the whole walk) but all intermediate vertices are first-visited. A chunk may consist of a single first-visited edge between two already-visited vertices, in which case there are no intermediate vertices.
The first and last vertices of chunk are the source vertex of , and the target vertex of , respectively. Note that two successive chunks in the walk need not be separated by revisited edges and can simply be separated by a revisited vertex: for instance, for the length- walk , all edges are first-visited, but there are two chunks of length 1 which are separated by the revisit of the vertex .
It will be useful to distinguish two special kinds of chunks. First, tadpole chunks, which loop back on an intermediate vertex of the chunk:
Definition 5.5.
A chunk is a tadpole if it consists of at least two edges and if, letting be the successive vertices that it visits, with its first vertex and its last vertex, we have for some .
In other words, a tadpole is a chunk that consists of a path (containing at least one edge), followed by a cycle (possibly a self-loop).
Second, cycle chunks, which loop back on the vertex from which they started:
Definition 5.6.
A chunk is a cycle if its first vertex and last vertex are identical.
Note that these two cases are mutually exclusive. A chunk which is neither a tadpole nor a cycle, and thus is a simple path, is a normal chunk.
The notion of chunk must be distinguished from the notion of segments used in the segment decomposition of the previous section, but we can notice the following connections between the two notions:
-
Segment ends never happen within a chunk: indeed, the intermediate vertices reached within a chunk are first-visited by definition, so there is no way to reach them except by the edge that precedes them, and thus no earlier path can reach in the walk.
-
At the end of a tadpole chunk or cycle chunk, the current segment always ends. Indeed, the last edge of the chunk reaches a vertex which already has an outgoing edge in the chunk: this edge is a first-visited edge , and the empty path from to itself allows us to finish the segment at the end of the chunk, i.e., just after .
In summary, chunks are contiguous subsequences of the walk that never straddle segment boundaries, and the end of a tadpole chunk or cycle chunk always triggers the end of a segment. Also note that, by definition, chunks form a partition of the first-visited edges. This implies that, except possibly for the last segment, every segment must contain at least one chunk, because they contain at least one first-visited edge.
Defining the ordering.
Having defined the notion of chunks of the walk , we now define the ordering along which we will show that the cutwidth is bounded. This definition only depends on chunks; it does not depend on the segment decomposition. We see the total order as a sequence of vertices. Initially, the order is the empty order on the single vertex . Then, for every chunk successively, we consider the (possibly empty) sequence of the intermediate vertices of . Recall that the first vertex of is already in the domain of : either it is (for the first chunk), or it is a vertex which is already visited. Then there are four cases:
-
Tadpole: If is a tadpole, then we insert all its intermediate vertices at the end of the current ordering , in the order in which they were first-visited.
-
Cycle: Otherwise, if is a cycle, then, letting be its first and last vertex, we know that already occurs in , and we insert all its intermediate vertices right after the vertex , in the order in which they were visited.
-
Normal: We consider two subcases:
-
–
First, suppose that the last vertex of is first-visited at . This is only possible with , so that is the last chunk. Then we do the same as in the tadpole case: insert the intermediate vertices and at the end of the current ordering , in the order in which they were first-visited.
-
–
Otherwise, is a chunk whose last vertex is first-visited at with so already occurs in , and as we explained the first vertex of the chunk also already occurs in . Further, we have because the chunk is not a cycle. Then, we insert the intermediate vertices so that the vertices along the chunk are ordered in a monotone fashion in the ordering. In other words, let be the successive intermediate vertices of the chunk, so that its edges are . If then we insert in order between and , and if we insert in order between and . The order between the newly inserted elements and the existing elements between and is arbitrary, for instance we can arbitrarily say that we insert the new vertices just after the smallest of and in .
-
–
Note that, in all cases above, the (intermediate) vertices of a chunk are always ordered as a monotone sequence.
SCCs of the segment decomposition.
We have defined, from our walk , the order along which we will bound the cutwidth. To show the cutwidth bound, we establish two results describing the SCCs of the graph generated by the walk and its close relationship with our ordering :
Lemma 5.7.
At each step , the successive SCCs of are linearly ordered by the reachability relationship (i.e., every vertex in is reachable from every vertex in iff ), and the target vertex of the last edge is in the last SCC of .
Lemma 5.8.
For every prefix of the walk such that a chunk of ends at , the ordering induced by is consistent with the topological order of the SCCs of .
Bounding the cutwidth from the number of segments.
We can now turn back to the segment decomposition introduced in the previous section for the walk , and show how the number of segments of can be used as a bound on the cutwidth of the graph spanned by , following the order that we defined. For this, we will consider an arbitrary cut of following , and count how many edges of cross the cut. As each edge is first-visited once, and first-visited in exactly one segment, it suffices to bound how many edges each segment contributes to the cut, and to count only the first-visited edges of each segment. We want to show:
Lemma 5.9.
For each segment of the walk , the number of first-visited edges in that cross the cut is at most 3.
This immediately implies the Segment Cutwidth Bound (Proposition 5.1) stated at the beginning of the section. This bound of edges crossing the cut can be proved by a case analysis summarized in the following lemmas:
Lemma 5.10.
Let be a segment of and assume that the last chunk of is a cycle . Then crosses the cut at most twice. Further, if it does cross the cut twice then it must be the case that the first and last vertex of is in and that contains some intermediate vertex in .
Lemma 5.11.
Let be a segment of and assume that the last chunk of is a tadpole . Then crosses the cut at most twice. Further, if it crosses the cut twice then it has an intermediate vertex in and an intermediate vertex in .
Lemma 5.12.
Let be a normal chunk of , then it crosses the cut at most once.
Lemma 5.13.
Let be a normal chunk of which crosses the cut forwards, i.e., the starting vertex of the chunk is in and its ending vertex is in . Then the segment containing ends at .
All that remains is to bound the contribution to the cut of the normal chunks that cross the cut backwards. To this end, let us distinguish the last chunk of a segment which we call the final chunk, and the remaining chunks in that segment which we call non-final chunks. Non-final chunks must be normal (because the segment ends right after cycle chunks or tadpole chunks), and they cannot cross the cut forward by the previous lemma.
We are ready to conclude the proof of Lemma 5.9:
Proof sketch.
We consider a segment and consider the SCC decomposition of the graph of the walk when the segment starts, using Lemma 5.7. By Lemma 5.8 the segment begins in the rightmost SCC according to the order. Now the intuition is the following: if all chunks but the final one stay on the same side of the cut, then only the final chunk can cross the cut and the preceding lemmas (Lemma 5.10, Lemma 5.11 and Lemma 5.12) show that the bound is satisfied.
Otherwise, there is a first non-final chunk which ends on some SCC that contains nodes to the left of the cut and this chunk thus potentially crosses the cut (backwards, by Lemma 5.13). If the next chunk is final, the bound is satisfied, so we may assume the next chunk is not final. This next chunk may also cross the cut, but in any case it has for target an SCC which is further left than , a property we exploit to show that the walk cannot return to the right of the cut without terminating the segment. This implies that the cut is crossed at most 4 times in total: once for each of the two non-final chunks considered, and potentially twice for the final chunk. We lower the bound to 3 by showing that in fact the final chunk can only cross the cut once if it is preceded by two chunks that already crossed the cut. See the full version [3] for the full proof.
6 Computing an Edge-Minimum Walk of Bounded Cutwidth
Up to now, we have shown Corollary 5.3: optimal solutions to the EWM problem have cutwidth at most . In this section, we conclude the proof of Theorem 1.1 by showing that optimal solutions of bounded cutwidth can be efficiently found.
At a high level, our algorithm proceeds by searching for a shortest path in a graph of configurations. The approach is similar to the approaches in [17] and [18]. The intuitive principle of our algorithm is the following. As we follow a path in the graph of configurations and move from one configuration to another, we choose which edges of the original graph to keep in the solution. To be more precise, the path will start on the empty configuration which denotes that no edges have been selected, and will iteratively add vertices and some of their incident edges to the solution.
To make sure that the selected edges do form a solution, we could try to record in the configurations the exact set of edges kept in the solution so far. However, the space of configurations would then be exponential. To avoid this, a configuration does not really store the entire set of chosen edges: instead it concisely represents the possible lengths modulo of walks between a small subset of vertices of , whose size is bounded as a function of the cutwidth.
The distance in the configuration graph from the initial configuration to any configuration will reflect the smallest cardinality of a set of edges that achieve the set of walks witnessed by . The algorithm therefore looks for a shortest path in the configuration graph from the initial configuration to any configuration which witnesses the -walk of length required by the EWM problem.
Formally, fix the input graph , the source and the target , and the modularity requirement . Let be a domain size bound, which is related to the cutwidth, but will be precisely defined later. Let us define the notion of configurations formally:
Definition 6.1 (Configuration).
A -configuration is a subset of at most vertices of , called the domain of the configuration, together with a function mapping each ordered pair for to a subset of . Having fixed the vertex set of the input graph , we denote by the set of all possible -configurations, and omit the subscript when clear from context.
To compute transitions in the configuration graph, our algorithm will need to compute the closure of a configuration . The point of the closure is to ensure that, whenever we can achieve a walk from to via intermediate vertices of and using remainders given by , then that walk can be witnessed directly by a value in . Formally:
Definition 6.2 (Closure).
Given a configuration , an internal walk of is a sequence of vertices of together with a choice of remainder for each step. Formally, it is a sequence and a choice of remainders such that we have for each . The total length of the internal walk is the sum of the remainders modulo , namely, .
The closure of is the configuration where, for each , for each , we have iff there is an internal walk of total length from to in . Note that we always have because for each we can take the single-edge internal walk with remainder .
We can easily compute the closure of a configuration in polynomial time in and , for instance using a product construction. Specifically, create a graph on vertices , then add the following edges: for each and each , for each create an edge from to . Then compute the transitive closure and define to be the set of such that has a path to .
We now define the graph over configurations, which we call and which also depends on the directed graph given as input to EWM. We omit the subscripts when clear from context.
Definition 6.3.
The graph of configurations is a weighted and labeled directed graph: each edge carries an integer cost and is labeled by a subset of edges of the original graph . The vertex set of is the set of all -configurations. To define the edges, let us choose any configuration and define the outgoing edges of . These edges are of two kinds:
-
Forget: from with nonempty, for each , letting be the new domain, we have an edge leaving which has cost 0, is labeled with the empty set of edges, and leads to where is the restriction of to
-
Introduce: from with , for each , let be the new domain, and let be the set of edges of which are of the form or with ; we also add to the self-loop edge on if it exists. Then for each , we have an edge leaving which has cost , is labeled by , and leads to the closure of the configuration with intuitively defined from by adding the edges of , formally:
-
–
For all , we initialize .
-
–
For each , we set .
-
–
For each edge , we set . Note that by definition of this case is disjoint from the previous one.
-
–
Note that we may have several Introduce edges with the same source and target configurations but labeled with different sets of edges and with different costs; in this case all of these edges exist in as defined above, although we could equivalently have decided to keep only one of these edges among those with minimum cost.
We have now defined the graph of configurations. We will look for shortest paths in this graph from the initial configuration to a final configuration, namely:
Definition 6.4.
The initial configuration is simply the configuration with empty domain.
A configuration is final if it contains and and features a walk with the requisite remainder, i.e., the configuration is final if and .
We can now define our algorithm to solve the EWM problem, which we call the EWM algorithm. We first set the value , following the cutwidth bound, to be: , i.e., three more than the cutwidth bound that follows from Corollary 5.3. (Intuitively, we add 2 to make sure that the sources and can always be part of the domain of configurations, and we add 1 extra to make sure that we can always perform Introduce steps before Forget steps.) The EWM algorithm then builds explicitly the graph and computes a shortest path in from the initial configuration to a final configuration. Once such a shortest path is found444 If there is no path in from an initial configuration to a final configuration, then we return to indicate that there is no subgraph of with an -walk of length . Note that this case can be excluded from the outset, simply by checking whether contains an -walk of length : this can be done, e.g., with the product construction., then the algorithm returns the subgraph of formed of the edges of obtained as the union of all edge labels in .
Proving that the EWM algorithm correctly solves the EWM problem is relatively straightforward: we show that whenever the algorithm return a subgraph, this subgraph is indeed a candidate solution. Conversely, we show that for every optimal solution there is a path from in the configuration graph from some initial configuration to some final configuration, such that the union of labels along the path is without duplicates (hence the path has cost ), which guarantees that the algorithm returns the optimal solution. Those properties can be proved by establishing the following invariant:
Claim 6.5.
Let be a path in from the initial configuration . Let , , be the sequence of subgraphs of defined in the following way: we have , and for each we set where is the edge set that labels the edge of used to go from to in . Then the following is true: for any , writing , for any , for any , we have iff there is a -walk in whose length modulo is .
Note that, in the definition of the graphs , we may add the same edge of multiple times because it occurs in the label of several edges of ; this can happen even if is a simple path in . As it turns out, this never happens when is a shortest path (because we are then, in effect, paying twice for the same edge); but the invariant of Claim 6.5 applies to paths even if they do traverse edges labeled with non-disjoint edge sets.
Altogether, we have explained why the algorithm correctly solves the EWM problem, and the algorithm can be shown to have complexity . Instantiated with our choice of , we get a final complexity of for the algorithm. See the full version ([3]) for details. This concludes the proof of our main result (Theorem 1.1).
7 Extension to Weighted Graphs
Having shown our main result (Theorem 1.1), in this section we start exploring variants and extensions of the EWM problem. We first study two extensions of EWM in this section. First, we study the addition of costs on edges, meaning that we look for a walk where the total cost is minimum (instead of the number of edges). Second, we study the addition of integer lengths on edges.
In this section and the next, the problems that we define and study are always posed on a directed graph and ask about the existence of a subgraph of satisfying certain properties and which is optimal according to some criterion (usually, being edge-minimum). We use the same terminology as before: a candidate solution is a subgraph which satisfies the properties but is not necessarily edge-minimum, and an optimal solution is a candidate solution which is additionally edge-minimum.
Costs on edges and vertices.
We first study the extension of the EWM problem with costs on edges. Specifically, the EWM problem with costs on edges takes as input a directed graph , a cost function giving to each edge a cost , a pair of a source and target , and a modularity requirement for integers and . The output to the EWM problem with costs on edges is an -walk of length such that the cost is minimum. We assume that all costs given by are nonnegative. Indeed, in the presence of negative weights, we lose the correspondence explained in Section 3: computing a minimum-weight subgraph that contains a walk reduces to the case of positive weights (all negative-weight edges will always be included in the optimal solution subgraph); but computing a minimum-weight walk is NP-hard even without modularity constraints:
Proposition 7.1.
The following problem is NP-complete: given a directed graph , a cost function , and source and target , compute a minimum-weight subset of edges such that there is an -walk using precisely the edges in .
Our tractability result for EWM (Theorem 1.1) can be easily extended to solve the EWM problem with costs on edges. Specifically, the Segment Decomposition Lemma (Lemma 4.3) shows that there must be an optimal solution to EWM with costs on edges that obeys the bound on the number of segments, because the function on subset of edges is monotone. Then the Segment Cutwidth Bound (Proposition 5.1) applies as is, and the algorithm to find an optimal solution of bounded cutwidth from the previous section can be easily extended: simply modify the definition of the graph of configurations (Definition 6.3) so that the cost of an edge labeled with a set of edges of is no longer the cardinality of but the total cost . Other than that, the algorithm proceeds in the same way, and computes a shortest path which now reflects the subgraph of satisfying the requirements which has minimum cost instead of minimum cardinality.
We also mention that the EWM problem can be defined to have costs (unit costs or not) on vertices instead of edges. In this case, the cost of a candidate solution is the number of different vertices traversed by the walk (or more generally their total cost). However, this problem is interreducible to the EWM problem with costs on edges:
Lemma 7.2.
The EWM problem (with unit costs on edges) reduces to the EWM problem with unit costs on vertices, and the EWM problem with costs on edges reduces to the EWM problem with costs on vertices. Conversely, the EWM problem with unit costs on vertices reduces to the EWM problem with unit costs on edges, and the EWM problem with costs on vertices reduces to the EWM problem with costs on edges.
Lengths on edges.
Having studied the use of costs on edges to change the optimization criterion, we turn to a different problem variant where we annotate edges with integer lengths. Specifically, the EWM problem with lengths takes as input a directed graph , a length function , and a pair of a source and target and a modularity requirement for integers and . The answer to the EWM problem with lengths is an -walk that traverses a minimum number of distinct edges and whose total length is , i.e., is . (Note that the length of an edge is summed as many times as the edge is traversed.)
The impact of allowing lengths is different depending on the regime. In the case where and are constant, or given in unary, then we can rewrite the graph in polynomial time to eliminate edge lengths. Specifically, letting be the number of edges in the graph, we replace each edge of length by a path on edges (where is the number of edges in the graph). This ensures that a walk in the new graph has the same modularity as the corresponding walk in the original graph, and the cost of a candidate solution in the rewritten graph is , where is the number of original edges traversed and . This ensures that an optimal solution in the original graph indeed minimizes the number of edges taken from the original graph.
One different regime is when and are written in binary and do not necessarily have polynomial value. In this case, the EWM problem with lengths written in binary can be shown to be NP-hard by an easy reduction from Subset Sum. In fact, just the problem of deciding the existence of a walk of length is NP-hard:
Proposition 7.3.
The problem of deciding whether an -walk of length exists, where and the lengths of are written in binary, is NP-hard.
We do not know whether an analogous hardness result holds for the original EWM problem (without edge lengths) in the setting where and can be written in binary and do not necessarily have polynomial value. Indeed, in the proof above, we crucially use the edge lengths to concisely write the numbers given in the Subset Sum instance; writing them in unary as paths of the corresponding length will not give a polynomial-time reduction and indeed the Subset Sum problem can be solved in pseudo-polynomial time.
We last address the question of showing an upper bound on the complexity of EWM with lengths, by showing an NP upper bound. Of course the bound is phrased on the decision version of EWM with lengths, i.e., given an instance of EWM with lengths and a threshold , we wish to decide whether there exists a candidate solution having at most different edges. We have:
Proposition 7.4.
The following problem is in NP: given a directed graph , a function that assigns lengths (written in unary) to each edge, integers , and (written in binary), source and target , decide if there is a subset of at most edges that contains an -walk of length according to .
8 Connections to DSN and SCSS
We conclude the paper by exploring how EWM relates to the Directed Steiner Network (DSN) problem mentioned in the introduction, and more specifically the Strongly Connected Steiner Subgraph (SCSS) problem. We first define the SCSS problem and show that it is subsumed by EWM, in the sense that a polynomial algorithm for EWM (with fixed and ) gives a polynomial algorithm for SCSS with a constant number of terminals. Then, we introduce a problem generalizing both SCSS and DSN on the one hand, and EWM in the other: we study how to find the smallest subgraph satisfying connectivity requirements on specified endpoint pairs with specified modularities. We show that we can generalize our results to this problem, and provide a polynomial algorithm for the setting when the number of endpoint pairs and the modularity requirements are constants.
Reducing SCSS to EWM.
Let us recall the definition of the SCSS problem [17, 18] before explaining how it can be reduced to EWM. In the SCSS problem, the input simply consists of a directed graph with an input set of terminals, and we want to find the smallest subgraph that is strongly connected and contains edges incident to each terminal in . In other words, SCSS is a special case of DSN where the connectivity requirements on the vertices of require that they are strongly connected. The SCSS problem is NP-hard in general (by an easy reduction from the directed Steiner tree problem) but it can be solved in polynomial time provided that the size of is bounded by a constant, as shown in [17]. Thus, for , we write -SCSS to refer to the SCSS problem where the set of terminals is required to contain at most vertices. Following our previous terminology in the paper, when we talk of candidate solutions we mean a subgraph satisfying the requirements of a problem (e.g., for -SCSS, a strongly connected subgraph containing the requisite terminals), and by optimal solution we mean a candidate solution which is also edge-minimum.
Let us show that the SCSS problem can be reduced to EWM, which implies that Theorem 1.1 gives an alternative proof of the results of [17] (with worse bounds). More precisely, we show:
Lemma 8.1.
For any constant , we can compute constants and in such that there is a linear-time reduction from -SCSS to EWM with the constant values and .
The previous reduction shows that our EWM problem is intuitively at least as complicated as SCSS, given that the tractability of EWM for constant implies that of SCSS for constant . (This being said, we do not claim that the polynomial algorithm given by our results is as efficient as that of earlier works [17, 18].) Alternatively, instead of this black-box reduction, we can also adapt our techniques to recapture the tractability of SCSS by showing a bound on the cutwidth of optimal solutions, using the segment decomposition. We exemplify below how this is done, and will revisit this afterwards when extending EWM to support multiple paths. Recall the notion of timestamp-minimum walks (Definition 4.5), and let us show the following result; note that it applies to edge-minimal solutions (not just optimal solutions).
Lemma 8.2.
Let and be an SCSS instance. Every edge-minimal solution to the instance has cutwidth at most .
What is more, for any terminal , considering all -walks that start and conclude in and visit all terminals from , then any timestamp-minimum -walk among those -walks has at most segments.
We remark that the cutwidth bound of given by this result is slightly better than the bound of shown in [18].
Defining Directed Steiner Network with Modularity constraints.
The reduction from SCSS to EWM leads to the natural question of whether there could be a problem which subsumes SCSS and EWM, or even DSN and EWM; and a polynomial-time algorithm that subsumes the tractability of all these problems. We now introduce such a problem, dubbed Directed Steiner Network with Modularity (DSNM), and show a polynomial-time algorithm to solve it.
Definition 8.3.
For , the -Directed Steiner Network with Modularity problem (-DSNM) takes as input a graph and a -requirement specification consisting of tuples for such that and and . Our goal is to compute an optimal solution, i.e., an edge-minimum subgraph of such that contains a walk from to of length for every .
Our last result in this paper is to show that the -DSNM problem is in PTIME when and the modulo values are constants. This result subsumes Theorem 1.1 and the tractability of -DSN for constant shown in [17, 18].
Theorem 8.4.
We can compute in time an optimal solution to -DSNM, where denotes the least common multiple (LCM) of the in the input -requirement specification.
Note that our exponents are given up to constant factors, so we do not claim to recover the same exponents as earlier works on -DSN without modularity constraints (i.e., for ).
The overall strategy to prove Theorem 8.4 is to follow the methodology used for EWM, adjusting the constructions presented so far in the paper. Deviating somewhat from EWM, but following prior work about DSN [17, 18], in our study of DSNM we will focus on the SCCs of an optimal solution and study them separately, instead of studying the entire solution graph. Our main objective is to bound the cutwidth of SCCs in an optimal solution as a function of and of the number of endpoint pairs. We can then easily deduce like in [18] that the cutwidth of optimal solutions is bounded as a function of and . To compute the solutions of bounded cutwidth, we then modify slightly the algorithm of Section 6.
To study the SCCs of optimal solutions, it will be useful to introduce an analogue of the SCSS problem featuring modularities. Specifically, we call -SCSSM the problem that takes the same input as -DSNM and returns a smallest strongly connected subgraph of such that contains a walk from to of length for every . Note that, unlike the SCSS problem which was simply defined by a set of terminals, in -SCSSM we specify a set of tuples connecting pairs of vertices, because we need to specify which remainder must be achieved by which endpoint pair. The difference with -DSNM is only that the solution graph is required to be strongly connected. Our study of -SCSSM is motivated by the fact that the SCCs of optimal solutions to -DSNM are optimal solutions to instances of -SCSSM, as was already known in the setting without modularities [17]. Namely:
Claim 8.5.
Let be a directed graph, and be a -requirement specification. In any optimal solution to -DSNM on for , for any SCC of , there exists a -requirement specification such that (as a set of edges) is an optimal solution to -SCSSM on for .
We next bound the cutwidth of optimal solutions to the -SCSSM problem specifically. Then we will deduce a cutwidth bound on optimal solutions to -DSNM using a result of [18]. Finally, we explain how to design an algorithm to compute these solutions.
Bounding the cutwidth of solutions to -SCSSM
Let us start by showing that solutions to -SCSSM have a small segment number and hence a small cutwidth:
Lemma 8.6.
Let be a graph and be a -requirement specification, denote by the respective modularities of its tuples, and denote by the least common multiple (LCM) of . Then every optimal solution of the -SCSSM problem on and can be covered with a walk of at most segments and therefore has cutwidth at most .
Note that this claim only works for the -SCSSM problem, not the -DSNM problem, because it assumes the solution can be covered by a single walk which is not true in general for -DSNM. Let us show the result:
Proof sketch.
The proof of this result can be decomposed into three steps. In step 1, we define the notion of a legal covering walk, which is a walk that covers the edges of a solution in a prescribed order: first visit all terminals to witness that they are strongly connected in a subwalk (similarly to Lemma 8.2), then visit all paths with the prescribed modularities in a subwalk . In step 2, we carefully impose a variant of timestamp-minimality on legal covering walks towards bounding their number of segments. In step 3, we use Lemma 8.2 (on the first part of the legal covering walk) and an argument adapted from Lemma 4.3 (for the second part) to show that the number of segments is bounded. The point of splitting the walk in two is that visits enough edges to allow us to move to arbitrary endpoints and ensure strong connectedness: this allows us to replay arbitrary detours in any subwalk no matter where they occur, intuitively “pooling” the achievable differences between all the subwalks. See the full version [3] for the full proof.
From -SCSSM to -DSNM.
We have shown that optimal solutions to the -SCSSM problem can be assumed to have bounded cutwidth. We now wish to show that the same is true for optimal solutions to the -DSNM problem. This is simple, using Lemma 8.6 above along with two lemmas from [18]:
Lemma 8.7.
Fix a graph and a -requirement specification , and let be the LCM of the in . Let be an optimal solution to -DSNM on and . Then has cutwidth at most .
Computing optimal solutions.
We have shown in Lemma 8.7 that optimal solutions to the -DSNM problem have cutwidth . The only remaining thing to prove Theorem 8.4 is to adapt the dynamic algorithm from Section 6 to compute solutions to that problem instead of EWM. We explain in the full version [3] how this is done.
9 Future work
We now outline possible directions for future work in light of our results:
Improving our bounds.
One natural direction for further research would be to improve the complexity bounds that we show. Our algorithm for EWM runs in time . Can the factor of in the exponent be improved, e.g., by getting an algorithm in time ? Or, on the other hand, is the problem NP-hard? (We have only shown that the variant of EWM with edge lengths is NP-hard, in Proposition 7.1.)
Intermediate problems between shortest walk and edge-minimum walk.
It could be interesting to search for walks that are minimized according to some criterion that is “intermediate” between shortest walk and edge-minimum walk, e.g., if the cost of each edge is expressed as a function of how many times it is traversed by the walk. Such a general framework would capture the two problem variants that we contrast in the introduction: shortest walks are the case where we are charged when traversing an edge times, and edge-minimum walks are the case where we are charged when traversing an edge times and charged when we do not traverse it.
Edge-minimum walks satisfying other constraints.
Last, one other problem of interest would be to investigate the complexity of finding edge-minimum subgraphs guaranteeing the existence of -walks satisfying other properties. One very natural example is the following: Given a number , and a directed graph with specified vertices , find an edge-minimum -walk of length exactly . (This is the same as EWM except the length is exactly , instead of .) The trivial algorithm for this problem is in time , but can we do better? To our knowledge, this problem has not been studied.
Another example is looking for edge-minimum walks that achieve constraints expressed by a finite semigroup. For instance, assume that each edge of the graph is labeled by a semigroup element, and that we want a walk whose evaluation in the semigroup achieves a specific target element of the semigroup, where evaluating the walk means multiplying the labels of its edges in the order that they are traversed. This problem generalizes the EWM problem (with lengths on edges), which uses the semigroup . An alternative way to phrase this problem is in the language of regular path queries (RPQs) mentioned in the introduction: fixing a regular language on an alphabet , and given a directed graph with terminals and and with edges labeled by letters of , find an edge-minimum subgraph with an -walk which evaluates to a word that belongs to . For which fixed regular languages can this problem be solved in polynomial time in ?
References
- [1] Noga Alon and Michael Krivelevich. Divisible subdivisions. J. Graph Theory, 98(4), 2021. doi:10.1002/jgt.22716.
- [2] Antoine Amarilli. Survey of results on the ModPath and ModCycle problems, 2024. doi:10.48550/arXiv.2409.00770.
- [3] Antoine Amarilli, Benoît Groz, and Nicole Wein. Edge-minimum walk of modular length in polynomial time, 2024. arXiv:2412.01614.
- [4] Esther M. Arkin, Christos H. Papadimitriou, and Mihalis Yannakakis. Modularity of cycles and paths in graphs. J. ACM, 38(2), 1991. doi:10.1145/103516.103517.
- [5] Guillaume Bagan, Angela Bonifati, and Benoît Groz. A trichotomy for regular simple path queries on graphs. J. Comput. Syst. Sci., 108, 2020. doi:10.1016/J.JCSS.2019.08.006.
- [6] Andreas Björklund, Thore Husfeldt, and Petteri Kaski. The shortest even cycle problem is tractable. In STOC, 2022. doi:10.1145/3519935.3520030.
- [7] Archit Chauhan, Samir Datta, Chetan Gupta, and Vimal Raj Sharma. The even-path problem in directed single-crossing-minor-free graphs, 2024. doi:10.48550/arXiv.2407.00237.
- [8] Chandra Chekuri, Guy Even, Anupam Gupta, and Danny Segev. Set connectivity problems in undirected graphs and the directed Steiner network problem. ACM Trans. Algorithms, 7(2), 2011. doi:10.1145/1921659.1921664.
- [9] Rajesh Chitnis, Andreas Emil Feldmann, and Pasin Manurangsi. Parameterized approximation algorithms for bidirected Steiner network problems. ACM Trans. Algorithms, 17(2), 2021. doi:10.1145/3447584.
- [10] Fan R. K. Chung, Wayne Goddard, and Daniel J. Kleitman. Even cycles in directed graphs. SIAM J. Discret. Math., 7(3), 1994. doi:10.1137/S0895480192225433.
- [11] Mariano P. Consens and Alberto O. Mendelzon. Graphlog: a visual formalism for real life recursion. In PODS, 1990. doi:10.1145/298514.298591.
- [12] Isabel F. Cruz, Alberto O. Mendelzon, and Peter T. Wood. A graphical query language supporting recursion. In SIGMOD, 1987. doi:10.1145/38713.38749.
- [13] Gianlorenzo D’Angelo and Esmaeil Delfaraz. Approximation algorithms for node-weighted directed Steiner problems. In IWOCA, 2024. doi:10.1007/978-3-031-63021-7_21.
- [14] Shagnik Das, Nemanja Draganić, and Raphael Steiner. Tight bounds for divisible subdivisions. J. Combin. Theory Ser. B, 165, 2024. doi:10.1016/j.jctb.2023.10.011.
- [15] Irit Dinur and Pasin Manurangsi. ETH-hardness of approximating 2-CSPs and directed Steiner network. In ITCS, 2018. doi:10.4230/LIPICS.ITCS.2018.36.
- [16] Ajit A Diwan. Cycles of weight divisible by , 2024. doi:10.48550/arXiv.2407.01198.
- [17] Jon Feldman and Matthias Ruhl. The directed Steiner network problem is tractable for a constant number of terminals. SIAM J. Comput., 36(2), 2006. doi:10.1137/S0097539704441241.
- [18] Andreas Emil Feldmann and Dániel Marx. The complexity landscape of fixed-parameter directed Steiner network problems. ACM Trans. Comput. Theory, 15(3–4), 2023. doi:10.1145/3580376.
- [19] Steven Fortune, John E. Hopcroft, and James Wyllie. The directed subgraph homeomorphism problem. Theor. Comput. Sci., 10, 1980. doi:10.1016/0304-3975(80)90009-2.
- [20] Esther Galby, Sándor Kisfaludi-Bak, Dániel Marx, and Roohani Sharma. Subexponential parameterized directed Steiner network problems on planar graphs: A complete classification. In ICALP, 2024. doi:10.4230/LIPICS.ICALP.2024.67.
- [21] Jiong Guo, Rolf Niedermeier, and Ondrej Suchý. Parameterized complexity of arc-weighted directed Steiner problems. SIAM J. Discret. Math., 25(2), 2011. doi:10.1137/100794560.
- [22] Xiao Hu and Stavros Sintos. Finding smallest witnesses for conjunctive queries. In ICDT, 2024. doi:10.4230/LIPICS.ICDT.2024.24.
- [23] Alpár Jüttner, Csaba Király, Lydia Mirabel Mendoza-Cadena, Gyula Pap, Ildikó Schlotter, and Yutaro Yamaguchi. Shortest odd paths in undirected graphs with conservative weight functions. Discrete Appl. Math., 357, 2024. doi:10.1016/j.dam.2024.05.044.
- [24] Naonori Kakimura and Ken-ichi Kawarabayashi. Packing cycles through prescribed vertices under modularity constraints. Adv. in Appl. Math., 49(2), 2012. doi:10.1016/j.aam.2012.03.002.
- [25] Ken-ichi Kawarabayashi, Stephan Kreutzer, O-joung Kwon, and Qiqin Xie. A half-integral Erdos-Posa theorem for directed odd cycles. In SODA, 2023. doi:10.1137/1.9781611977554.ch118.
- [26] Andrea S. LaPaugh and Christos H. Papadimitriou. The even-path problem for graphs and digraphs. Networks, 14(4), 1984. doi:10.1002/NET.3230140403.
- [27] Anna Lubiw. A note on odd/even cycles. Discret. Appl. Math., 22(1), 1988. doi:10.1016/0166-218X(88)90125-4.
- [28] Wim Martens, Matthias Niewerth, and Tina Popp. A trichotomy for regular trail queries. Log. Methods Comput. Sci., 19(4), 2023. doi:10.46298/LMCS-19(4:20)2023.
- [29] William McCuaig. Pólya’s permanent problem. Electron. J. Comb., 11(1), 2004. doi:10.37236/1832.
- [30] Zhengjie Miao, Sudeepa Roy, and Jun Yang. Explaining wrong queries using small examples. In SIGMOD, 2019. doi:10.1145/3299869.3319866.
- [31] Burkhard Monien. The complexity of determining a shortest cycle of even length. Computing, 31(4), 1983. doi:10.1007/BF02251238.
- [32] Zhivko Prodanov Nedev. Finding an even simple path in a directed planar graph. SIAM J. Comput., 29(2), 1999. doi:10.1137/S0097539797330343.
- [33] Mehdy Roayaei and Mohammadreza Razzazi. Parameterized complexity of directed Steiner network with respect to shared vertices and arcs. Int. J. Found. Comput. Sci., 29(7), 2018. doi:10.1142/S0129054118500302.
- [34] Neil Robertson, P. D. Seymour, and Robin Thomas. Permanents, Pfaffian orientations, and even directed circuits. Ann. of Math. (2), 150(3), 1999. doi:10.2307/121059.
- [35] Ildikó Schlotter and András Sebő. Odd paths, cycles and -joins: Connections and algorithms, 2022. arXiv:2211.12862.
- [36] Paul Seymour and Carsten Thomassen. Characterization of even directed graphs. J. Combin. Theory Ser. B, 42(1), 1987. doi:10.1016/0095-8956(87)90061-X.
- [37] Carsten Thomassen. Even cycles in directed graphs. Eur. J. Comb., 6(1), 1985. doi:10.1016/S0195-6698(85)80025-1.
- [38] Carsten Thomassen. The even cycle problem for planar digraphs. J. Algorithms, 15(1), 1993. doi:10.1006/jagm.1993.1030.
- [39] Vijay V. Vazirani and Mihalis Yannakakis. Pfaffian orientations, 0-1 permanents, and even cycles in directed graphs. Discret. Appl. Math., 25(1-2), 1989. doi:10.1016/0166-218X(89)90053-X.
- [40] Paul Wollan. Packing cycles with modularity constraints. Combinatorica, 31(1), 2011. doi:10.1007/s00493-011-2551-5.
- [41] Raphael Yuster and Uri Zwick. Finding even cycles even faster. SIAM J. Discrete Math., 10(2), 1997. doi:10.1137/S0895480194274133.