
Parameterized Algorithms for Computing Pareto Sets

Joshua Marc Könen, Institute of Computer Science, University of Bonn, Germany; Heiko Röglin, Institute of Computer Science, University of Bonn, Germany; Tarek Stuck, Institute of Computer Science, University of Bonn, Germany
Abstract

The problem of computing the set of Pareto-optimal solutions has been studied for a variety of multiobjective optimization problems. For many such problems, algorithms are known that compute the Pareto set in (weak) output-polynomial time. These algorithms are often based on dynamic programming and by weak output-polynomial time, we mean that the running time depends polynomially on the size of the Pareto set but also on the sizes of the Pareto sets of the subproblems that occur in the dynamic program. For some problems, like the multiobjective minimum spanning tree problem, such algorithms are not known to exist and for other problems, like multiobjective versions of many NP-hard problems, such algorithms cannot exist, unless 𝒫=𝒩𝒫.

Dynamic programming over tree decompositions is a common technique in parameterized algorithms. In this paper, we study whether this technique can also be applied to compute Pareto sets of multiobjective optimization problems. We first derive an algorithm to compute the Pareto set for the multicriteria s-t cut problem and show how this result can be applied to a polygon aggregation problem arising in cartography that has recently been introduced by Rottmann et al. (GIScience 2021). We also show how to apply these techniques to compute the Pareto set of the multiobjective minimum spanning tree problem and of the multiobjective TSP. The running time of our algorithms is 𝒪(f(w) · poly(n, p_max)), where f is some function in the treewidth w, n is the input size, and p_max is an upper bound on the size of the Pareto sets of the subproblems that occur in the dynamic program. Finally, we present an experimental evaluation of computing Pareto sets on real-world instances of polygon aggregation problems. For this purpose we devised a task-specific data structure that allows for efficient storage and modification of large sets of Pareto-optimal solutions. Throughout the implementation process, we incorporated several improvement strategies and heuristics that significantly reduced both runtime and memory usage, enabling us to solve instances with treewidth up to 22 within a reasonable amount of time. Moreover, we conducted a preprocessing study to compare different tree decompositions in terms of their estimated overall runtime.

Keywords and phrases:
parameterized algorithms, treewidth, multicriteria optimization problems, multicriteria MST, multicriteria TSP, polygon aggregation
Copyright and License:
© Joshua Marc Könen, Heiko Röglin, and Tarek Stuck; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Design and analysis of algorithms; Theory of computation → Parameterized complexity and exact algorithms; Theory of computation → Graph algorithms analysis
Related Version:
Full Version: https://arxiv.org/abs/2509.06124
Funding:
This work has been funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 390685813; 459420781 and by the Lamarr Institute for Machine Learning and Artificial Intelligence lamarr-institute.org.
Editors:
Anne Benoit, Haim Kaplan, Sebastian Wild, and Grzegorz Herman

1 Introduction

Multiobjective optimization problems arise naturally in many contexts. When booking a train ticket, one might be interested in minimizing the travel time, the price, and the number of changes. Similarly, when planning new infrastructure, companies often have to find a compromise between reliability and costs, to name just two illustrative examples. For multiobjective optimization problems there is usually not a single solution that is optimal for all criteria simultaneously, and hence, one has to find a trade-off between the different criteria. Since it is not a priori clear which trade-off is desired, one often studies the set of Pareto-optimal solutions (also known as Pareto set), where a solution is Pareto-optimal if it is not dominated by any other solution, i.e., there does not exist another solution that is better in at least one criterion and at least as good in all other criteria. The reasoning behind this is that a dominated solution cannot be a reasonable compromise between the different criteria, and it therefore makes sense to restrict one’s attention to only the Pareto-optimal solutions.

A lot of research in multiobjective optimization deals with algorithms for computing the Pareto set and with studying the size of this set for various optimization problems. If this set is not too large then it can be presented to a (human) decision maker who can select a solution. The Pareto set also has the important property that any monotone function that combines the different objectives into a single objective is optimized by a Pareto-optimal solution. This means, in order to optimize such a function, one could first generate the Pareto set and then pick the best solution from this set.

This approach is only reasonable if there are not too many Pareto-optimal solutions. While in the worst case the Pareto set can be of exponential size for almost all multiobjective optimization problems, it has been observed that the Pareto set is often small in practice (see, e.g., [16, 23]). It has also been shown in the probabilistic model of smoothed analysis that for many problems the expected size of the Pareto set is polynomial if the input is subject to a small amount of random noise [10, 5]. This motivates the development of algorithms for computing the Pareto set that have polynomial running time with respect to the output size.

In the literature such output-sensitive algorithms have been developed for different multiobjective optimization problems (e.g., for the multiobjective shortest path problem [12, 17, 29], multiobjective flow problems [15, 24], and the knapsack problem when viewed as a bicriteria optimization problem [25]). There is, however, a small caveat. Since these algorithms are usually based on dynamic programming, they are not output-polynomial in the strict sense but they solve certain subproblems of the given instance and their running times depend not only polynomially on the size of the Pareto set of the entire instance but also polynomially on the sizes of the Pareto sets of the subproblems. While in theory it could be the case that some of the subproblems have Pareto sets of exponential size while the Pareto set of the entire instance is small (see, e.g., [9]), this behavior is neither observed in experiments nor does it occur in probabilistic input models. Hence, algorithms that are output-polynomial in the weak sense that their running time depends polynomially on the sizes of the Pareto sets of all subproblems that occur in the dynamic program are useful and state-of-the-art for many multiobjective optimization problems. For some problems such output-sensitive algorithms are not known to exist. In particular, for the multiobjective spanning tree problem no algorithm is known that computes the Pareto set in output-polynomial time (not even in the weak sense).

Dynamic programming is a common technique in parameterized algorithms. For many graph problems, dynamic programming on tree decompositions is applied to obtain fixed-parameter tractable (FPT) algorithms with respect to the treewidth of the input graph. In this paper, we explore for the first time the potential of dynamic programming on tree decompositions for computing Pareto sets. First, we design an algorithm for the multicriteria s-t cut problem and apply it to a problem from cartography that has recently been introduced by Rottmann et al. [28] and that was the original motivation for our research.

Rottmann et al. study maps with building footprints and present a new model of how less detailed maps can be derived from a given map. Their method is based on viewing the problem as a bicriteria optimization problem. It is assumed that the plane containing the building footprints is triangulated, and the task is to select a set of triangles that glue together some of the buildings, minimizing on the one hand the total area and on the other hand the total perimeter. Rottmann et al. present an algorithm that computes the set of extreme nondominated solutions (i.e., the set of solutions that optimize a linear combination of the objectives), which is a subset of the Pareto set. They ask if it is also possible to compute the entire Pareto set in some output-efficient way.

We show how a treewidth-based algorithm can be used to compute the Pareto set for the s-t cut problem. The running time of this algorithm is FPT in the treewidth and output-polynomial in the weak sense, i.e., it is of the form 𝒪(f(w) · poly(n, p_max)), where f is some function in the treewidth w, n denotes the input size, and p_max denotes an upper bound on the size of the Pareto sets of the subproblems that occur in the dynamic program.

Since the cartography problem is a special case of the multiobjective s-t cut problem, our results directly translate to it: the whole Pareto set can be computed in FPT time, output-polynomially in the weak sense. We experimentally analyze the computation of Pareto sets for this cartography problem on real-world datasets by designing a specialized data structure, developing heuristics, and integrating additional implementation-specific techniques tailored to the task.

1.1 Our Results

We first consider the multiobjective s-t cut problem and its special case, the triangle aggregation problem [28]: Given a set P of polygons and a set T of triangles, the goal is to find all Pareto-optimal subsets T′ ⊆ T minimizing both the total area and the total perimeter of P ∪ T′. This problem arises in cartography, where the goal is to compute a less detailed map from a given map. Here the polygons are building footprints and the space between these footprints is triangulated. By including triangles, one can glue together these building footprints, leading to a less detailed version of the map. We show that our algorithm for the multiobjective s-t cut problem yields an algorithm that computes the Pareto set of this problem in time 𝒪(n · w · 2^w · p_max^2 · log(p_max)), where w is the treewidth of the triangle adjacency graph.

In the full version of the paper, we show that this method for computing Pareto sets can also be applied to other multiobjective optimization problems, such as the multiobjective minimum spanning tree problem (MST) and the multiobjective traveling salesman problem (TSP). For both, we present algorithms with running time 𝒪(n · w^𝒪(w) · p_max^2 · log^{d−2}(w^𝒪(w) · p_max^2)), where w is the treewidth of the input graph, n is the number of vertices in the graph, p_max is the size of the largest Pareto set computed in one of the subproblems of the dynamic program, and d ≥ 2 is the (constant) number of different objectives. This shows that the Pareto set can be computed efficiently for graphs with small treewidth if the Pareto sets of the subproblems are not too large, which is often the case for realistic inputs.

Finally, we experimentally evaluate our algorithm on real-world datasets. We introduce new heuristic pruning techniques, runtime estimates for tree decompositions and their root, memory optimization, multi-threading, and other improvements for reducing runtime and memory usage.

1.2 Related Work

Output-polynomial time algorithms (at least in the weak sense) are known for many problems. Examples include the multiobjective shortest path problem [12, 17, 29], multiobjective flow problems [15, 24], and the knapsack problem when viewed as a bicriteria optimization problem [25]. For the multiobjective spanning tree problem such algorithms are not known and there are only results on the set of extreme nondominated solutions. This is a subset of the Pareto set and it contains all solutions that optimize some linear combination of the different objectives. For the multiobjective spanning tree problem it is known that there are only polynomially many extreme nondominated solutions [14], and efficient algorithms for enumerating them exist [2].

Rottmann et al. [28] introduce the triangle aggregation problem with the objective of map simplification by grouping multiple building footprints together. They show that, similar to the multiobjective minimum spanning tree problem, there exist only polynomially many nondominated extreme solutions, and they present an algorithm to compute this set in polynomial time. They also provide an extensive experimental study of their algorithms on real-world datasets, showing that the model nicely captures the intention of aggregating building footprints to obtain less detailed maps. They pose as an open question whether it is also possible to compute the set of Pareto-optimal solutions in an output-efficient way.

Motivated by the observation that for many problems the Pareto set is often small in applications, the number of Pareto-optimal solutions has been studied in the probabilistic framework of smoothed analysis in a sequence of articles [27, 22, 10, 5]. It has been shown for a large class of linear integer optimization problems (which contains in particular the multiobjective MST problem and the multiobjective TSP) that the expected number of Pareto-optimal solutions is polynomially bounded if the coefficients (in our case, the edge weights) are independently perturbed by random noise. Formally, the coefficients of d−1 out of the d objectives are assumed to be independent random variables with adversarially chosen distributions with bounded density. It is shown that not only the expected number of Pareto-optimal solutions is polynomially bounded in this setting but also all constant moments, so in particular the expected squared number of Pareto-optimal solutions is also polynomially bounded. Combined with this, our results imply that the Pareto sets for the multiobjective MST problem, the multiobjective TSP, and the multiobjective s-t cut problem can be computed in expected FPT running time 𝒪(f(w) · poly(n)) in the model of smoothed analysis.

While dynamic programming over tree decompositions is a well-established technique, using it effectively in practice comes with more difficulties than simply minimizing the width of the decomposition. In experiments the runtime of an algorithm can vary largely for different tree decompositions with the same width.

Bodlaender and Fomin [8] introduced the concept of the f-cost of a tree decomposition, which sums up a cost function f applied to the bag sizes across all nodes. This framework formalizes the observation that not all nodes contribute equally to runtime or memory, and that practical efficiency can benefit from structurally favorable decompositions.

The f-cost approach was later extended and evaluated experimentally by Abseher et al. [1] using machine learning to select a tree decomposition based on different features such as bag sizes, branching factors, and node depths. With their method they could often select a decomposition that performs well in practice. Kangas et al. [18] used the same model for the problem of counting linear extensions, confirming the value of such estimators, and suggesting the average join-node depth as a good single feature for estimating runtime.

Beyond selection heuristics, tree decompositions have also been studied with respect to memory efficiency. Charwat et al. [11] explored compressed representations of intermediate states using decision diagrams, and Betzler et al. [6] proposed anchor-based compression techniques to reduce hard drive memory usage.

In our case, we not only solve an optimization problem, but compute the entire set of Pareto-optimal solutions. This makes memory usage more sensitive to the structure of join operations and limits the applicability of pointer-based or diagram-compressed representations. To address this, we develop a custom estimation strategy for predicting both runtime and memory usage, based on predicting the number of Pareto-optimal solutions at each node and the runtime impact of join operations. We then select a decomposition based on a combination of both runtime and memory usage. Details of this procedure are provided in the full version of the paper.

2 Preliminaries

Let [i] := {1, …, i}. We consider an arbitrary optimization problem with a constant number d of objectives to be minimized. Each (feasible) solution s is mapped to a cost vector f(s) = (f(s)_1, …, f(s)_d) ∈ ℝ^d, where f is our objective function. A solution s_1 is said to dominate another solution s_2 if f(s_1)_i ≤ f(s_2)_i for all i ∈ [d] and there exists some j ∈ [d] such that f(s_1)_j < f(s_2)_j. We write s_1 ≺ s_2 to denote that s_1 dominates s_2. The set of all non-dominated solutions is called the Pareto set. These definitions extend naturally to optimization problems where a subset of objectives is to be maximized instead of minimized. In the following we assume that all objectives are to be minimized, but all arguments and results can be adapted analogously when some (or all) objectives are to be maximized.

Let 𝒫_1 and 𝒫_2 be two sets of Pareto-optimal solutions. We define their combined Pareto set as 𝒫_1 + 𝒫_2 := {p ∈ 𝒫_1 ∪ 𝒫_2 : p is Pareto-optimal in 𝒫_1 ∪ 𝒫_2}, their union as S_{𝒫_1,𝒫_2} := {p_1 ∪ p_2 : p_1 ∈ 𝒫_1, p_2 ∈ 𝒫_2}, which contains all pairwise unions of solutions from 𝒫_1 and 𝒫_2, and the Pareto set of S_{𝒫_1,𝒫_2} as 𝒫_1 ⊗ 𝒫_2 := {s ∈ S_{𝒫_1,𝒫_2} : s is Pareto-optimal in S_{𝒫_1,𝒫_2}}.
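To illustrate these operations for d = 2, the following Java sketch (an illustration of ours, not code from our implementation) identifies solutions with their cost vectors and assumes that the cost of a combined solution is the componentwise sum of the costs of its parts; it computes both 𝒫_1 + 𝒫_2 and 𝒫_1 ⊗ 𝒫_2.

import java.util.*;

// Minimal illustration for d = 2: a solution is identified with its cost vector.
final class Cost {
    final double a, b;
    Cost(double a, double b) { this.a = a; this.b = b; }
    // true iff this dominates other (<= in both objectives, < in at least one)
    boolean dominates(Cost o) {
        return a <= o.a && b <= o.b && (a < o.a || b < o.b);
    }
    public String toString() { return "(" + a + ", " + b + ")"; }
}

final class ParetoOps {
    // Remove all dominated cost vectors (sort by first objective, sweep over the second).
    static List<Cost> filterDominated(List<Cost> in) {
        List<Cost> sorted = new ArrayList<>(in);
        sorted.sort(Comparator.comparingDouble((Cost c) -> c.a).thenComparingDouble(c -> c.b));
        List<Cost> result = new ArrayList<>();
        double bestB = Double.POSITIVE_INFINITY;
        for (Cost c : sorted) {
            if (c.b < bestB) { result.add(c); bestB = c.b; }
        }
        return result;
    }

    // P1 + P2: Pareto set of the union of both input sets.
    static List<Cost> combinedParetoSet(List<Cost> p1, List<Cost> p2) {
        List<Cost> union = new ArrayList<>(p1);
        union.addAll(p2);
        return filterDominated(union);
    }

    // P1 (x) P2: Pareto set of all pairwise combinations; cost vectors are assumed to add up.
    static List<Cost> combine(List<Cost> p1, List<Cost> p2) {
        List<Cost> pairs = new ArrayList<>();
        for (Cost x : p1)
            for (Cost y : p2)
                pairs.add(new Cost(x.a + y.a, x.b + y.b));
        return filterDominated(pairs);
    }

    public static void main(String[] args) {
        List<Cost> p1 = List.of(new Cost(1, 5), new Cost(3, 2));
        List<Cost> p2 = List.of(new Cost(2, 4), new Cost(4, 1));
        System.out.println(combinedParetoSet(p1, p2)); // keeps only non-dominated vectors
        System.out.println(combine(p1, p2));
    }
}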

2.1 Treewidth

Our algorithms rely on dynamic programming over a tree decomposition. The concepts of tree decompositions and treewidth, introduced by Robertson and Seymour [26], provide a measure of how similar a graph is to a tree. Many 𝒩𝒫-hard problems become tractable on graphs with small treewidth. Computing the treewidth is 𝒩𝒫-hard [3], but it is FPT with respect to the treewidth [7], and a tree decomposition of width at most 2·w(G) can be computed in linear time using approximation algorithms [21]. Hence, we assume that our algorithm starts from such an approximate decomposition if no optimal one is available. In the following, we briefly introduce the notation used for tree decompositions.

Definition 1 (Tree decomposition).

A tree decomposition of a graph G = (V, E) is a pair 𝒯 = (𝕋, {X_t}_{t ∈ V(𝕋)}), where 𝕋 is a tree with node set {t_1, …, t_m} and each node t ∈ V(𝕋) is associated with a bag X_t ⊆ V, such that the following conditions hold:

  • ⋃_{t ∈ V(𝕋)} X_t = V(G), i.e., every vertex of G appears in at least one bag,

  • ∀{u,v} ∈ E(G) ∃t ∈ V(𝕋): u ∈ X_t ∧ v ∈ X_t, i.e., for every edge {u,v} ∈ E(G) the vertices u and v must appear together in at least one bag,

  • T_u = {t ∈ V(𝕋) : u ∈ X_t} is a connected subtree of 𝕋 for every u ∈ V(G).

The width of a tree decomposition is w(𝒯) := max_{t ∈ V(𝕋)} |X_t| − 1, and the treewidth w(G) is the smallest possible width of a tree decomposition of G. We denote by V_t the union of all vertices that were introduced at t or at some descendant of t. In the following, we write w for the width of the tree decomposition used by our algorithm, which may be larger than w(G) if an approximation is used.

We assume we have a nice tree decomposition, which is a binary tree decomposition rooted at some node r ∈ V(𝕋) with X_r = ∅ and with four specific node types:

  • Leaf node: Let t be a node with no child nodes. Then t is a leaf node iff X_t = ∅.

  • Introduce node: Let t_parent, t_child be two nodes in V(𝕋), where t_parent has exactly one child t_child. Then t_parent is an introduce node iff X_{t_parent} = X_{t_child} ∪ {v} for some v ∈ V(G).

  • Forget node: Let t_parent, t_child be two nodes in V(𝕋), where t_parent has exactly one child t_child. Then t_parent is a forget node iff X_{t_parent} ∪ {v} = X_{t_child} for some v ∈ V(G).

  • Join node: Let t_parent, t_child1, t_child2 be three nodes in V(𝕋), where t_parent has exactly two children t_child1 and t_child2. Then t_parent is a join node iff X_{t_parent} = X_{t_child1} = X_{t_child2}.

From any tree decomposition of width w, a nice tree decomposition of the same width with 𝒪(|V(G)| · w) nodes can be computed in polynomial time [13].

The algorithm works as follows: Given an instance G = (V, E) and a cost function c: E → ℝ^d, it first computes a nice tree decomposition (𝕋, {X_t}_{t ∈ V(𝕋)}). In each leaf node the Pareto sets are initialized as empty sets. Then, for each node t, depending on its type, the algorithm recursively computes the set of Pareto-optimal solutions for each subproblem at this node. We distinguish between the functions computeLeafNode, computeIntroduceNode, and computeForgetNode, corresponding to their respective node types, and show correctness assuming that the Pareto sets for all child nodes are correctly computed.
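The following skeleton sketches this bottom-up traversal; the class and method names are illustrative placeholders of ours rather than the interface of our implementation, the tables are kept per node (rather than per instance (t,S)) for brevity, and only the leaf case is spelled out.

import java.util.*;

// Illustrative skeleton of the dynamic program over a nice tree decomposition.
// The bodies of the compute* methods depend on the concrete problem.
class NiceTreeDP {
    enum Type { LEAF, INTRODUCE, FORGET, JOIN }

    static class Node {
        Type type;
        Set<Integer> bag = new HashSet<>();   // X_t
        List<Node> children = new ArrayList<>();
        // D[t, S]: Pareto set (as cost vectors) for every selection S of the bag
        Map<Set<Integer>, List<double[]>> tables = new HashMap<>();
    }

    // Postorder traversal: children are processed before their parent.
    static void solve(Node t) {
        for (Node child : t.children) solve(child);
        switch (t.type) {
            case LEAF:      computeLeafNode(t); break;
            case INTRODUCE: computeIntroduceNode(t); break;
            case FORGET:    computeForgetNode(t); break;
            case JOIN:      computeJoinNode(t); break;
        }
    }

    static void computeLeafNode(Node t) {
        // the empty selection with the single empty solution of cost (0, 0)
        t.tables.put(Set.of(), new ArrayList<>(List.of(new double[]{0.0, 0.0})));
    }
    static void computeIntroduceNode(Node t) { /* copy child tables, shift costs */ }
    static void computeForgetNode(Node t)    { /* merge child tables, filter dominated */ }
    static void computeJoinNode(Node t)      { /* combine child tables pairwise, filter dominated */ }
}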

3 Multiobjective 𝒔-𝒕 cut problem

We consider the problem of computing all Pareto-optimal s-t cuts in a graph. Given a graph G = (V ∪ {s,t}, E) with n = |V| and a cost function c: E → ℝ^d, a cut corresponds to a subset S ⊆ V. The cost of a cut S is defined as the sum of the costs of all edges crossing from S ∪ {s} to (V ∖ S) ∪ {t}. Let δ(S) := {{u,v} ∈ E : u ∈ S ∪ {s}, v ∈ (V ∖ S) ∪ {t}} be the set of edges cut by S. The cost of S is given by ∑_{e ∈ δ(S)} c(e). We now wish to compute all s-t cuts that are Pareto-optimal. Without loss of generality, we assume {s,t} ∉ E, since such an edge only introduces a constant shift in cost to every Pareto-optimal solution.

We denote by δ(A,B) := {{u,v} ∈ E : u ∈ A, v ∈ B} for A, B ⊆ V(G) the set of edges between vertices in A and B that are being cut. We define c: 2^{V(G)} × 2^{V(G)} → ℝ^d with c(A,B) = ∑_{e ∈ δ(A,B)} c(e) for any subsets A, B ⊆ V(G). For simplicity, for a single vertex p ∈ V(G) and a subset A ⊆ V(G), we write c(p,A) instead of c({p},A).

Let G′ = G[V] be the subgraph induced by V. Given a nice tree decomposition (𝕋, {X_t}_{t ∈ V(𝕋)}) of G′, we compute for each node t ∈ V(𝕋) and each subset S ⊆ X_t the Pareto set 𝒫_t^S consisting of all Pareto-optimal solutions p ⊆ V_t with p ∩ X_t = S. We refer to such a subset S as a selection. Every such solution p has cost c(δ(p)) = c(p, V ∖ p) + c(p, t) + c(V ∖ p, s). We refer to such an instance by a pair (t,S).

Theorem 2.

For an arbitrary s-t cut instance (G = (V ∪ {s,t}, E), c) with cost function c: E → ℝ^d, we can compute the set of Pareto-optimal s-t cuts in time 𝒪(n · w · 2^w · p_max^2 · log^{max{d−2,1}}(p_max^2)) if a nice tree decomposition of the graph G′ = G[V] with width w is provided.

In the following we describe how to compute the Pareto set for each node type. For each instance (t,S), the set of Pareto-optimal solutions is stored in an entry D[t,S]. For simplicity, we assume that D[t,S] contains the actual corresponding Pareto set. Since the algorithm only depends on the cost vectors and the selection S, the actual implementation only stores the cost vectors and reconstructs each solution by additionally storing pointers to the solutions from which it originated. Since we assume that d is a constant, the encoding length of every Pareto-optimal solution is constant as well. Proofs of the following lemmas are provided in the full version of the paper.

computeLeafNode(t,S): For a leaf node t, we have only one solution, D[t, ∅] = {∅}.

computeIntroduceNode(t,S): Assume we have X_t = X_{t′} ∪ {v}. If v ∉ S, then D[t,S] = D[t′,S]. If v ∈ S, then v is only adjacent to vertices in X_{t′} and V ∖ V_t. Since the selection S is fixed, placing v on the same side of the cut as s only changes the cost of every solution by a fixed amount. Therefore we have D[t,S] = D[t′, S ∖ {v}], and their costs change by c(v, X_{t′} ∪ (V ∖ V_t) ∪ {t}) − 2·c(S ∖ {v}, v) − c(s, v).
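The next sketch illustrates the introduce case on the level of cost vectors; computing the shift vector delta from the edge costs described above is omitted here, and the method name is a placeholder of ours rather than the actual code.

import java.util.*;

// Sketch of the introduce case of the s-t cut DP (illustration only).
// If v is in S, the child's table for S \ {v} is copied and every cost vector
// is shifted by the same constant vector delta, which collects the edge costs
// described in the text. A uniform shift preserves which solutions are Pareto-optimal.
class IntroduceNode {
    static List<double[]> introduce(List<double[]> childTable, double[] delta) {
        List<double[]> result = new ArrayList<>(childTable.size());
        for (double[] cost : childTable) {
            double[] shifted = cost.clone();
            for (int i = 0; i < shifted.length; i++) shifted[i] += delta[i];
            result.add(shifted);
        }
        return result;
    }
}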

Lemma 3.

If node t introduces some vertex v and has a child t′ for which D[t′,S′] has been computed, then computeIntroduceNode(t,S) computes D[t,S] in time 𝒪(p_max) for any selection S.

computeForgetNode(t,S): Assume we have X_t = X_{t′} ∖ {v}. To obtain D[t,S], we compute the union of the sets D[t′,S] and D[t′, S ∪ {v}] and then remove all dominated solutions.
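For d = 2 this merge-and-filter step can be sketched as follows, assuming both input lists are kept sorted lexicographically; this is an illustration of ours rather than the code of our implementation.

import java.util.*;

// Sketch of the forget case for d = 2: D[t, S] is obtained from the child's tables
// for S and S ∪ {v} by merging both Pareto sets and discarding dominated cost
// vectors. With lexicographically sorted inputs a single merge-and-sweep pass suffices.
class ForgetNode {
    // inputs are sorted lexicographically by (first objective, second objective)
    static List<double[]> forget(List<double[]> withV, List<double[]> withoutV) {
        List<double[]> merged = new ArrayList<>(withV.size() + withoutV.size());
        int i = 0, j = 0;
        while (i < withV.size() || j < withoutV.size()) {
            double[] next;
            if (j >= withoutV.size() || (i < withV.size() && lexLess(withV.get(i), withoutV.get(j)))) {
                next = withV.get(i++);
            } else {
                next = withoutV.get(j++);
            }
            merged.add(next);
        }
        // sweep: keep a vector only if its second objective improves on everything seen so far
        List<double[]> pareto = new ArrayList<>();
        double best = Double.POSITIVE_INFINITY;
        for (double[] c : merged) {
            if (c[1] < best) { pareto.add(c); best = c[1]; }
        }
        return pareto;
    }

    private static boolean lexLess(double[] a, double[] b) {
        if (a[0] != b[0]) return a[0] < b[0];
        return a[1] < b[1];
    }
}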

Lemma 4.

If node t forgets some vertex v and has a child t′ for which all possible sets D[t′,S′] have been computed, then computeForgetNode(t,S) computes D[t,S] in time 𝒪(p_max · log^{d−2}(p_max)).

For the correctness of computeJoinNode(t,S), the following lemma will be useful:

Lemma 5.

For each Pareto-optimal solution p ⊆ V and any node t ∈ V(𝕋), the solution p ∩ V_t is Pareto-optimal in the instance (t, X_t ∩ p).

computeJoinNode(t,S): Assume we have X_t = X_{t_1} = X_{t_2}. Since X_t separates V_{t_1} ∖ X_t, V_{t_2} ∖ X_t, and V ∖ V_t, we can combine every solution from D[t_1,S] with every solution from D[t_2,S] and then remove all dominated ones.

Lemma 6.

If node t ∈ V(𝕋) is a join node with children t_1, t_2 ∈ V(𝕋) and the sets D[t_1,S] and D[t_2,S] have been correctly computed, then computeJoinNode(t,S) computes D[t,S] in time 𝒪(p_max^2 · log^{max{d−2,1}}(p_max^2)).

Proof of Theorem 2.

The correctness follows directly from the previous lemmas for the different node types and from the fact that D[r, ∅] is the Pareto set of the entire instance.

The runtime is dominated by the join nodes. Our nice tree decomposition contains 𝒪(n · w) nodes, and for each node t the number of subsets S ⊆ X_t is bounded by 2^{w+1}. Computing D[t,S] at a join node takes time 𝒪(p_max^2 · log^{max{d−2,1}}(p_max^2)). Hence, the total running time is bounded by 𝒪(n · w · 2^w · p_max^2 · log^{max{d−2,1}}(p_max^2)).

The technique used for computing the Pareto set of the s-t cut problem can be applied to other multiobjective optimization problems as well. To illustrate this, we show how to compute the Pareto set for the multiobjective MST problem and the multiobjective TSP. Proofs of the following theorems are provided in the full version of the paper.

Theorem 7.

For an arbitrary multiobjective spanning tree instance (G = (V,E), c) with d ≥ 2 objectives and cost function c: E → ℝ^d, we can compute the set of Pareto-optimal spanning trees in time 𝒪(n · (2w)^𝒪(w) · p_max^2 · log^{max{d−2,1}}((2w)^𝒪(w) · p_max^2)), if a nice tree decomposition of the graph G with width w is provided.

Theorem 8.

For an arbitrary multiobjective traveling salesman instance (G = (V,E), c) with d ≥ 2 objectives and cost function c: E → ℝ^d, we can compute the set of all Pareto-optimal tours in time 𝒪(n · (3w+3)^𝒪(w) · p_max^2 · log^{max{d−2,1}}((3w+3)^𝒪(w) · p_max^2)), if a nice tree decomposition of the graph G with width w is provided.

4 Experiments

In this section we present the experimental evaluation of our proposed algorithm for a special case of the multiobjective s-t cut problem: the bicriteria polygon aggregation problem, introduced by Rottmann et al. [28]. An instance (T,P) consists of a set P = {p_1, …, p_m} of polygons and a set T = {t_1, …, t_n} of triangles. For any polygon s ∈ T ∪ P, let A(s) > 0 denote its area and P(s) > 0 its perimeter. Given a set of triangles S ⊆ T, we define A(S) as the total area of S ∪ P and P(S) as the total perimeter of S ∪ P. The goal is to compute the Pareto set 𝒫 of subsets S ⊆ T minimizing both A(S) and P(S).

As with many multiobjective optimization problems, the set of Pareto-optimal solutions for the bicriteria polygon aggregation problem can be of exponential size. In fact, one can more generally show that the bicriteria polygon aggregation problem can be seen as a generalization of the bicriteria knapsack problem. The statement on the size of the Pareto set follows as a corollary. The proof of the following theorem is provided in the full version of the paper.

Theorem 9.

For every instance I of the bicriteria knapsack problem, there exists an instance I′ of the bicriteria triangle aggregation problem such that there is a one-to-one correspondence between the Pareto-optimal solutions of I and I′.

Our implementation follows the methodology outlined in Section 3, is written in Java, and was evaluated on the datasets used by Rottmann et al. [28], which consist of triangle aggregation problems derived from various cities and villages in Germany. The source code is available at https://github.com/Tarek-pub/Bicriteria_Aggregation.

We use the s-t cut construction from Rottmann et al. [28] to generate each instance. Additionally, we remove the vertices corresponding to polygons from the graph G, resulting in a more compact yet equivalent graph structure. To reduce complexity, weights (i.e., area and perimeter) are rounded to one decimal place (perimeter in decimals), and only unique-weight solutions are retained. Tree decompositions are computed using the JDrasil framework [4]. Several observations during development led to significant improvements in runtime and memory usage, which we integrated iteratively into the implementation and quantify through an analysis on the Ahrem dataset (Table 1). Experiments were run using 16 threads of an Intel Core i9-13900 with 100 GB RAM. Multiple tree decompositions were generated for exactly 1 hour using 16 threads.

Further implementation details are provided in the full version of the paper, including an extended description of optimizations and a complete list of datasets with corresponding runtimes and memory usage.

Table 1: Performance comparison of different algorithmic improvements on the dataset Ahrem.
*: These values are extrapolated based on sampled subproblems and scaled proportionally to estimate full runtimes (more details are provided in the full version of the paper).
**: For the versions before choosing a good root or tree decomposition, we chose the median rated root / tree decomposition in this table for a fair comparison.
Algorithm improvement | Runtime [h] | Storage usage [GiB]
Outsourcing & Pruning** | 13767.8* | 3707
Choosing a good root** | 8571.6* (−38%) | 3991 (+8%)
Choosing from multiple tree decompositions | 7129.6* (−17%) | 3647 (−9%)
Join node heuristic | 162.0 (−98%) | 3647 (±0%)
Join-forget nodes | 8.2 (−95%) | 1499 (−59%)
Introduce-join-forget nodes | 7.0 (−15%) | 122 (−92%)
Summary | −99.95% | −96.71%

As noted in Section 3, it suffices to store only the overall cost of a solution rather than the full set of triangles included. The actual solutions can be reconstructed later by maintaining, in addition to the cost, pointers to the solutions from which the current one originated.

We briefly describe the data structure used in our implementation and give a rough overview of the algorithmic adaptations. Each solution p is represented as a weight pair (A(p), P(p)) of area and perimeter.

4.1 Outsourcing and Pruning

Initially, the implementation was unable to solve even small datasets, such as Osterloh (treewidth 14), because RAM usage reached the available 100 GB. To solve this problem, we implemented an efficient outsourcing strategy that stores all currently unneeded solutions on a hard drive, keeping only the necessary solutions in RAM. Each solution is stored as a 16-byte entry in a dedicated origin-pointer file. Each DP table D[t,S] is stored in a separate surface-pointer file, containing solution IDs and their weights. These files are kept on disk and only loaded into RAM when needed. This data structure allowed solving Osterloh without exceeding RAM limits, but other datasets remained unsolvable due to hard drive usage growing up to 2 TB. To free up disk space, we remove obsolete surface-pointer files and, when necessary, run a slow in-place pruning step to clean the origin-pointer file. This involves recursively marking only reachable entries by following their pointers and compacting the file accordingly. Pruning is only triggered when space is critically low, based on conservative estimates of future storage use. These strategies prevented further storage issues, but many datasets were still unsolvable due to excessive runtime, particularly in join nodes. For example, the runtime required to solve the Ahrem dataset at this stage was estimated at approximately 13,768 hours, or about one and a half years of CPU time.
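The exact binary layout of these files is an implementation detail; the following sketch merely illustrates how fixed-size 16-byte records could be appended to and read from an origin-pointer file with java.nio, under the assumption that a record stores the IDs of the two solutions it originated from (the real layout may differ).

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch of a fixed-size origin-pointer file: every solution is a
// 16-byte record identified by its index in the file.
class OriginPointerFile implements AutoCloseable {
    private static final int RECORD_SIZE = 16;
    private final FileChannel channel;

    OriginPointerFile(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
    }

    // Appends a record and returns its ID (the record index in the file).
    long append(long originLeft, long originRight) throws IOException {
        long id = channel.size() / RECORD_SIZE;
        ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE);
        buf.putLong(originLeft).putLong(originRight).flip();
        channel.write(buf, id * RECORD_SIZE);
        return id;
    }

    // Reads the two origin IDs of a record; used when reconstructing a solution.
    long[] read(long id) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE);
        channel.read(buf, id * RECORD_SIZE);
        buf.flip();
        return new long[]{buf.getLong(), buf.getLong()};
    }

    @Override public void close() throws IOException { channel.close(); }
}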

4.2 Choosing an appropriate tree decomposition

When generating multiple tree decompositions for the same graph, we observed that both the decomposition structure and the choice of the root node can significantly affect performance. Similar to Abseher et al. [1], we therefore generate multiple tree decompositions for the same graph and select one that is expected to yield good performance. However, our approach differs in two important aspects: First, since we consider a concrete algorithm rather than a general class of dynamic programming approaches, we can simulate its execution to approximate actual performance. Second, we do not only estimate runtime but also account for memory consumption, selecting the decomposition that minimizes a weighted combination of both. To this end, each decomposition is traversed in postorder, and we estimate the number of solutions at each node to predict resource usage.

Earlier versions of our approach focused solely on minimizing runtime, which did not always lead to reduced memory usage (see Table 1). Only in the final version did we optimize both runtime and memory simultaneously. Incorporating this performance estimator into the algorithm led to a significant improvement in runtime. For the Ahrem dataset, the estimated runtime dropped to around 7,130 hours, a 48% decrease.
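The following sketch shows the kind of weighted scoring used for this selection; the estimator functions and the weight alpha are placeholders, while the actual estimators simulate the dynamic program and predict Pareto-set sizes per node as described above.

import java.util.*;
import java.util.function.ToDoubleFunction;

// Sketch: choose the tree decomposition minimizing a weighted combination of
// estimated runtime and estimated memory; alpha balances the two criteria.
class DecompositionSelector<T> {
    T selectBest(List<T> decompositions,
                 ToDoubleFunction<T> runtimeEstimate,
                 ToDoubleFunction<T> memoryEstimate,
                 double alpha) {
        T best = null;
        double bestScore = Double.POSITIVE_INFINITY;
        for (T d : decompositions) {
            double score = alpha * runtimeEstimate.applyAsDouble(d)
                         + (1 - alpha) * memoryEstimate.applyAsDouble(d);
            if (score < bestScore) { bestScore = score; best = d; }
        }
        return best;
    }
}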

In an experiment with 100 decompositions for the instance Osterloh, our estimator selected the decomposition that ranked first in both runtime and storage. The correlation between the predicted score and actual performance was 0.91 for runtime and 0.97 for storage. We obtained similarly strong results when estimating performance for different root node choices, again observing a high correlation between predicted and actual runtime and memory usage. These results suggest that our estimator reliably selects either the best tree decomposition or at least one that still performs well with regards to runtime and memory consumption.

4.3 Join node algorithm

Let t be a join node with children t_1 and t_2. For each subset S ⊆ X_t, we compute 𝒫_t^S = 𝒫_{t_1}^S ⊗ 𝒫_{t_2}^S by combining all solutions p_1 ∈ 𝒫_{t_1}^S with all solutions p_2 ∈ 𝒫_{t_2}^S and subsequently removing all dominated solutions. To do this efficiently, we employ a lexicographical ordering of the input lists and process the combinations using a min-heap. Assume |𝒫_{t_1}^S| ≤ |𝒫_{t_2}^S|. We initialize a min-heap with one heap node for each p_1 ∈ 𝒫_{t_1}^S, each pointing to the first entry in 𝒫_{t_2}^S. We then repeatedly extract the root of the min-heap, process the corresponding solution pair (p_1, p_2), and increment its pointer to reference the next solution p_2′ ∈ 𝒫_{t_2}^S that follows p_2 in lexicographical order. If such a solution p_2′ exists, we update the heap accordingly; otherwise, we remove the heap node from the data structure. This process continues until the heap is empty, ensuring that every valid solution pair has been considered. This min-heap mechanism ensures that all candidate pairs are enumerated in lexicographical order, which enables an efficient check for Pareto-optimality.
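The following Java sketch illustrates this enumeration for bicriteria cost vectors, assuming that the costs of combined solutions add up; it is a simplified illustration of ours, not the code of our implementation.

import java.util.*;

// Sketch of the join-node enumeration: all pairs from two lexicographically sorted
// lists are produced in lexicographic order of their combined cost using a min-heap
// with one entry per element of the smaller list.
class JoinEnumeration {
    static List<double[]> join(List<double[]> small, List<double[]> large) {
        // heap entry: {index into 'small', current index into 'large'}
        PriorityQueue<int[]> heap = new PriorityQueue<>((x, y) -> {
            double[] cx = combined(small.get(x[0]), large.get(x[1]));
            double[] cy = combined(small.get(y[0]), large.get(y[1]));
            if (cx[0] != cy[0]) return Double.compare(cx[0], cy[0]);
            return Double.compare(cx[1], cy[1]);
        });
        for (int i = 0; i < small.size(); i++) heap.add(new int[]{i, 0});

        List<double[]> pareto = new ArrayList<>();
        double best = Double.POSITIVE_INFINITY;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            double[] c = combined(small.get(top[0]), large.get(top[1]));
            // combinations arrive in lexicographic order, so a sweep over the
            // second objective keeps exactly the Pareto-optimal ones
            if (c[1] < best) { pareto.add(c); best = c[1]; }
            if (top[1] + 1 < large.size()) { top[1]++; heap.add(top); }
        }
        return pareto;
    }

    private static double[] combined(double[] a, double[] b) {
        return new double[]{a[0] + b[0], a[1] + b[1]};
    }

    public static void main(String[] args) {
        List<double[]> p1 = List.of(new double[]{1, 4}, new double[]{2, 1});
        List<double[]> p2 = List.of(new double[]{0, 3}, new double[]{3, 0});
        for (double[] c : join(p1, p2)) System.out.println(Arrays.toString(c));
    }
}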

4.4 Optimizing join node computations

As with many dynamic programming algorithms over a nice tree decomposition, join nodes are the main computational bottleneck. At each join node t, we need to compute 𝒫 = 𝒫_1 ⊗ 𝒫_2 for two Pareto sets 𝒫_1, 𝒫_2 exactly 2^{|X_t|} many times. Computing 𝒫_1 ⊗ 𝒫_2 for d ≥ 2 can be done in time 𝒪(p_max^2 · log^{max{d−2,1}}(p_max^2)). In the worst case, 𝒫_1 ⊗ 𝒫_2 can contain up to |𝒫_1| · |𝒫_2| solutions if every combination is Pareto-optimal. Considering every combination (p_1, p_2) ∈ 𝒫_1 × 𝒫_2 and filtering out all dominated ones is therefore very time-consuming. However, in practice we observed only a roughly linear growth in size relative to the larger input Pareto set. This discrepancy indicated that many combinations were computed that would ultimately be discarded. To circumvent this problem, we utilize a heuristic pruning strategy to efficiently exclude many non-Pareto-optimal combinations at once.

We construct a heuristic set ℋ for 𝒫 using a small subset of 𝒫_1 and 𝒫_2. Afterwards, we partition 𝒫_1, 𝒫_2, and ℋ into multiple sections and compute lower bounds for each section of 𝒫_1 and 𝒫_2 and upper bounds for the sections of ℋ. We then iterate over all combinations of lower bounds and compare them with the upper bounds. If a combination of lower bounds is entirely dominated by the upper bounds, we can conclude that no solution in these sections will be Pareto-optimal (PO) in 𝒫 and safely skip them in the min-heap. To construct ℋ, we apply the same optimization recursively until a base case is reached. Furthermore, we apply multi-threading at each join node t, enabling the parallel computation of multiple entries D[t,S] at the same time.
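A simplified version of this pruning idea is sketched below: both lists are split into sections of fixed size, and a pair of sections is skipped if the componentwise lower bound of all its combinations is already dominated by some solution of the heuristic set ℋ. The real procedure additionally works with per-section upper bounds of ℋ, which is omitted here; names and parameters are illustrative.

import java.util.*;

// Simplified sketch of the pruning idea (not the exact procedure from our code).
class JoinPruning {
    static boolean[][] skippableSections(List<double[]> p1, List<double[]> p2,
                                         List<double[]> heuristic, int sectionSize) {
        double[][] lb1 = sectionLowerBounds(p1, sectionSize);
        double[][] lb2 = sectionLowerBounds(p2, sectionSize);
        boolean[][] skip = new boolean[lb1.length][lb2.length];
        for (int i = 0; i < lb1.length; i++) {
            for (int j = 0; j < lb2.length; j++) {
                // best cost any combination from sections i and j could achieve
                double[] combinedLb = {lb1[i][0] + lb2[j][0], lb1[i][1] + lb2[j][1]};
                skip[i][j] = dominatedBySome(heuristic, combinedLb);
            }
        }
        return skip;
    }

    // componentwise minimum of every block of 'sectionSize' consecutive entries
    private static double[][] sectionLowerBounds(List<double[]> list, int sectionSize) {
        int sections = (list.size() + sectionSize - 1) / sectionSize;
        double[][] lb = new double[sections][];
        for (int s = 0; s < sections; s++) {
            double[] cur = {Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY};
            for (int k = s * sectionSize; k < Math.min((s + 1) * sectionSize, list.size()); k++) {
                cur[0] = Math.min(cur[0], list.get(k)[0]);
                cur[1] = Math.min(cur[1], list.get(k)[1]);
            }
            lb[s] = cur;
        }
        return lb;
    }

    private static boolean dominatedBySome(List<double[]> set, double[] x) {
        for (double[] h : set) {
            if (h[0] <= x[0] && h[1] <= x[1] && (h[0] < x[0] || h[1] < x[1])) return true;
        }
        return false;
    }
}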

The heuristic pruning process introduces a trade-off between its own runtime and the efficiency gains from skipping combinations. By selecting appropriate parameters, we achieved a favorable balance, reducing the runtime by nearly 98% and ultimately solving Ahrem in under seven days.

4.5 Join-forget nodes

To further improve runtime and reduce memory usage, we analyzed recurring patterns in the algorithm's execution that allowed for optimization. In many instances, a large fraction of the solutions computed at a join node was immediately discarded in subsequent forget nodes. To address this inefficiency, we introduce a new node type, called join-forget node, which merges a join node with subsequent forget nodes into a single operation. Formally, for a join node t in the tree decomposition, if its unique parent and all ancestors up to the next non-forget node are forget nodes, we replace t and all these forget nodes by a single join-forget node t′. This new node retains the original bag X_t and additionally stores the set F of forgotten vertices from the removed forget nodes.

We consider which lists would be merged in the upcoming forget nodes of the original join node while combining solutions in the min-heap. More specifically, for a join-forget node with forgotten vertices F, we only create a min-heap for each subset S ⊆ X_t ∖ F. Each such heap allows us to efficiently iterate over all combinations from the union ⋃_{S′ ∈ {S ∪ F′ : F′ ⊆ F}} {p_1 ∪ p_2 : p_1 ∈ 𝒫_{t_1}^{S′}, p_2 ∈ 𝒫_{t_2}^{S′}}.
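The candidate generation behind this formula can be sketched as follows; in the actual implementation the enumeration is interleaved with the min-heap, whereas this illustration of ours simply collects all candidates and leaves the domination filter to a later step.

import java.util.*;

// Sketch of candidate generation at a join-forget node: for a selection S of the
// retained bag X_t \ F, candidates come from all selections S' = S ∪ F' with F' ⊆ F,
// combining the children's Pareto sets for the same S'.
class JoinForgetNode {
    static List<double[]> candidates(Set<Integer> s, List<Integer> forgotten,
                                     Map<Set<Integer>, List<double[]>> child1,
                                     Map<Set<Integer>, List<double[]>> child2) {
        List<double[]> all = new ArrayList<>();
        int f = forgotten.size();
        for (int mask = 0; mask < (1 << f); mask++) {    // every subset F' of F
            Set<Integer> sPrime = new HashSet<>(s);
            for (int i = 0; i < f; i++) {
                if ((mask & (1 << i)) != 0) sPrime.add(forgotten.get(i));
            }
            List<double[]> a = child1.getOrDefault(sPrime, List.of());
            List<double[]> b = child2.getOrDefault(sPrime, List.of());
            for (double[] x : a)
                for (double[] y : b)
                    all.add(new double[]{x[0] + y[0], x[1] + y[1]});
        }
        return all; // dominated candidates are removed afterwards
    }
}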

This setup also enhances our join node heuristic. Instead of computing a heuristic set and upper bounds for each pair 𝒫_{t_1}^{S′}, 𝒫_{t_2}^{S′} separately, we reuse the join-forget structure. Specifically, for each subset S ⊆ X_t ∖ F, we compute the same heuristic set ℋ for all S′ ∈ {S ∪ F′ : F′ ⊆ F}. We do so by using subsets of 𝒫_{t_1}^{S′} and 𝒫_{t_2}^{S′} for all such S′. As before, we recursively apply this heuristic until a small base case is reached. A side effect of this strategy is that it significantly increases RAM usage, as each thread now requires not just three lists in memory (namely 𝒫_{t_1}^S, 𝒫_{t_2}^S, and 𝒫_{t_1}^S ⊗ 𝒫_{t_2}^S), but up to 2^{|F|+1} + 1 lists.

This strategy led to another significant boost in runtime for many of our datasets. For the Ahrem dataset, using join-forget nodes reduced the runtime by 95%, allowing us to solve the instance in roughly 8 hours. Moreover, storage usage also decreased by 59%. We emphasize that these improvements are not solely due to the use of join-forget nodes, but also due to adapting the tree decomposition performance estimator accordingly.

Table 2: Overview of some of the datasets we managed to solve. For all datasets see the full version of the paper.
Dataset | w | Time [h] | Storage [GiB] | #PO solutions | Percentage nonextreme | Graph #vertices | Graph #edges | |V(𝕋)|
Osterloh | 14 | 0.15 | 5.8 | 83055 | 99.43 | 1717 | 1887 | 7586
Ahrem | 17 | 3.1 | 122 | 219969 | 98.99 | 5280 | 5607 | 21778
Lottbek | 19 | 10.5 | 355 | 316567 | 99.48 | 5595 | 6101 | 24426
Erlenbach | 22 | 70.7 | 1754 | 303565 | 99.47 | 5644 | 6284 | 25682

4.6 Introduce-join-forget nodes

Together with the newly defined join-forget node, we searched for further patterns in the algorithm's computations to reduce the storage consumption even more. To this end, we introduce a second node type called introduce-join-forget node, which further generalizes the concept of join-forget nodes. After replacing applicable structures with join-forget nodes, we consider each join-forget node t with child nodes t_1, t_2, where at least one child (t_1, t_2, or both) is an introduce node. We remove all consecutive introduce nodes among the children and store their introduced vertices as additional information in t.

A major inefficiency of standard introduce nodes arises from the exponential growth in the number of possible subsets S ⊆ X_t. When a new vertex v is introduced, each existing solution p ∈ 𝒫_{t′}^S for S ⊆ X_t ∖ {v} is, with a constant weight adjustment, duplicated for both S and S ∪ {v}. This significantly increases both memory usage and I/O overhead in the outsourcing step, making it a significant bottleneck. To resolve this, we avoid explicit duplication of solutions in introduce-join-forget nodes. Let I ⊆ X_t be the set of vertices whose introduction was skipped on one side. If a set 𝒫^S is required for the min-heap computations and S ∩ I ≠ ∅, we instead load the solutions from the surface-pointer file of 𝒫^{S ∖ I} and add the constant weight adjustments according to S ∩ I. Additionally, we create an origin-pointer entry for such a solution only if it is identified as Pareto-optimal in the min-heap. This strategy also reduces the growth of the origin-pointer file.

Together with adapting the performance estimator, this final improvement reduced the memory consumption by an additional 92%, resulting in a total of only 122GB of storage required for the Ahrem dataset, while also achieving a further 15% decrease in runtime, bringing the total runtime down to 7 hours.

Figure 1: Three PO aggregations of the dataset Osterloh. Input polygons are filled dark gray, chosen triangles are filled light gray. Middle: nonextreme solution. Left/Right: Closest extreme solutions to the nonextreme solution.

5 Results

Using the efficient data structure and the algorithmic improvements described in Section 4, we were able to compute the full Pareto set for many datasets, including instances with treewidth of up to 22, within a reasonable amount of time. Table 2 shows a subset of the datasets we successfully solved. All of these computations were performed on a high-performance computing system, using 96 threads on two Intel Xeon “Sapphire Rapids” processors at 2.10 GHz and about two terabytes of storage.

For the polygon aggregation problem, it is known that all extreme solutions are hierarchically compatible [28]. In Figure 1 we illustrate this by showing two consecutive extreme solutions p_1, p_2 and one nonextreme solution p that lies between them (i.e., A(p_1) < A(p) < A(p_2) and P(p_1) > P(p) > P(p_2)). While the nonextreme solution p is also a viable aggregation, it exhibits a significantly different structure compared to the extreme solutions. Focusing solely on extreme solutions may lead to large gaps in the solution space, as seen in Figure 2.

Across the datasets we solved, extreme solutions accounted for less than one percent of all PO solutions on average. Computing the nonextreme solutions therefore provides a significantly richer range of solutions for the user to choose from, whereas the extreme solutions are constrained to be hierarchically compatible with each other.

We published a tool to interactively investigate all PO solutions for the dataset Osterloh. The tool is available at https://github.com/Tarek-pub/Bicriteria_Aggregation_plotting.

Figure 2: The weights of all PO solutions of Osterloh. The arrow indicates which nonextreme solution is shown in Figure 1.

6 Conclusions

We presented the first algorithms for computing Pareto sets using dynamic programming over tree decompositions and showed that this framework can naturally be applied to various multiobjective optimization problems. The main motivation for our work was the article of Rottmann et al. [28], who raised the question of whether it is possible to compute the Pareto set in output-polynomial time. We also conducted an experimental analysis of the polygon aggregation problem on real-world instances and developed several techniques to improve both runtime and memory usage.

We want to highlight that the theoretical running times of our algorithms are only worst-case bounds. In practice, because not every bag in a tree decomposition has the maximum possible size and the number of join nodes is often relatively small, the actual running time will usually be smaller. Additionally, it is a common phenomenon in multiobjective optimization that Pareto sets are not too large for real-world inputs. Hence, the dependency on the size of the largest Pareto set is not prohibitive in practice. For problems like the multiobjective minimum spanning tree problem, the multiobjective TSP, and many other problems, this is further supported by theoretical results in smoothed analysis, which show that Pareto sets are expected to be of polynomial size when instances are randomly perturbed [10].

References

  • [1] Michael Abseher, Frederico Dusberger, Nysret Musliu, and Stefan Woltran. Improving the efficiency of dynamic programming on tree decompositions via machine learning. In Qiang Yang and Michael J. Wooldridge, editors, Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, IJCAI’15, pages 275–282. AAAI Press, 2015. doi:10.1613/jair.5312.
  • [2] Pankaj K. Agarwal, David Eppstein, Leonidas J. Guibas, and Monika Rauch Henzinger. Parametric and kinetic minimum spanning trees. In 39th Annual Symposium on Foundations of Computer Science, FOCS ’98, November 8-11, 1998, Palo Alto, California, USA, pages 596–605. IEEE Computer Society, 1998. doi:10.1109/SFCS.1998.743510.
  • [3] Stefan Arnborg, Derek G. Corneil, and Andrzej Proskurowski. Complexity of finding embeddings in a k-tree. SIAM Journal on Algebraic Discrete Methods, 8(2):277–284, 1987. doi:10.1137/0608024.
  • [4] Max Bannach, Sebastian Berndt, and Thorsten Ehlers. Jdrasil: A modular library for computing tree decompositions. In Costas S. Iliopoulos, Solon P. Pissis, Simon J. Puglisi, and Rajeev Raman, editors, 16th International Symposium on Experimental Algorithms, SEA 2017, June 21-23, 2017, London, UK, volume 75 of LIPIcs, pages 28:1–28:21, Dagstuhl, Germany, 2017. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.SEA.2017.28.
  • [5] René Beier, Heiko Röglin, Clemens Rösner, and Berthold Vöcking. The smoothed number of pareto-optimal solutions in bicriteria integer optimization. Math. Program., 200(1):319–355, September 2023. doi:10.1007/s10107-022-01885-6.
  • [6] Nadja Betzler, Rolf Niedermeier, and Johannes Uhlmann. Tree decompositions of graphs: Saving memory in dynamic programming. Discret. Optim., 3(3):220–229, 2006. Graphs and Combinatorial Optimization. doi:10.1016/j.disopt.2006.05.008.
  • [7] Hans L. Bodlaender. A linear time algorithm for finding tree-decompositions of small treewidth. In S. Rao Kosaraju, David S. Johnson, and Alok Aggarwal, editors, Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, May 16-18, 1993, San Diego, CA, USA, STOC ’93, pages 226–234, New York, NY, USA, 1993. ACM. doi:10.1145/167088.167161.
  • [8] Hans L. Bodlaender and Fedor V. Fomin. Tree decompositions with small cost. Discret. Appl. Math., 145(2):143–154, 2005. Structural Decompositions, Width Parameters, and Graph Labelings. doi:10.1016/j.dam.2004.01.008.
  • [9] Fritz Bökler. Output-sensitive complexity of multiobjective combinatorial optimization with an application to the multiobjective shortest path problem. PhD thesis, Dortmund University, Germany, 2018. doi:10.17877/DE290R-19130.
  • [10] Tobias Brunsch and Heiko Röglin. Improved smoothed analysis of multiobjective optimization. J. ACM, 62(1):4:1–4:58, 2015. doi:10.1145/2699445.
  • [11] Günther Charwat and Stefan Woltran. Efficient problem solving on tree decompositions using binary decision diagrams. In Francesco Calimeri, Giovambattista Ianni, and Miroslaw Truszczynski, editors, Logic Programming and Nonmonotonic Reasoning - 13th International Conference, LPNMR 2015, Lexington, KY, USA, September 27-30, 2015. Proceedings, volume 9345 of Lecture Notes in Computer Science, pages 213–227, Cham, 2015. Springer. doi:10.1007/978-3-319-23264-5_19.
  • [12] H. William Corley and I. Douglas Moon. Shortest paths in networks with vector weights. Journal of Optimization Theory and Application, 46(1):79–86, 1985. doi:10.1007/BF00938761.
  • [13] Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 1st edition, 2015. doi:10.1007/978-3-319-21275-3.
  • [14] Tamal K. Dey. Improved bounds on planar k-sets and k-levels. In 38th Annual Symposium on Foundations of Computer Science, FOCS ’97, Miami Beach, Florida, USA, October 19-22, 1997, pages 156–161. IEEE Computer Society, 1997. doi:10.1109/SFCS.1997.646104.
  • [15] Matthias Ehrgott. Integer solutions of multicriteria network flow problems. Investigacao Operacional, 19:229–243, 1999. doi:10.1007/978-3-642-59179-2_2.
  • [16] Matthias Ehrgott. Multicriteria Optimization (2. ed.). Springer, 2005. doi:10.1007/3-540-27659-9.
  • [17] Pierre Hansen. Bicriterion path problems. In Multiple Criteria Decision Making: Theory and Applications, volume 177 of Lecture Notes in Economics and Mathematical Systems, pages 109–127, 1980. doi:10.1007/978-3-642-48782-8_9.
  • [18] Kustaa Kangas, Mikko Koivisto, and Sami Salonen. A faster tree-decomposition based algorithm for counting linear extensions. Algorithmica, 82(8):2156–2173, 2020. doi:10.1007/s00453-019-00633-1.
  • [19] Joshua Marc Könen, Heiko Röglin, and Tarek Stuck. Bicriteria Aggregation. Software, swhId: swh:1:dir:b6eacf189792239e9115cf288be27c6753b8df17 (visited on 2025-09-09). URL: https://github.com/Tarek-pub/Bicriteria_Aggregation, doi:10.4230/artifacts.24714.
  • [20] Joshua Marc Könen, Heiko Röglin, and Tarek Stuck. Bicriteria Aggregation Plotting. InteractiveResource (visited on 2025-09-09). URL: https://github.com/Tarek-pub/Bicriteria_Aggregation_plotting, doi:10.4230/artifacts.24713.
  • [21] Tuukka Korhonen. Single-exponential time 2-approximation algorithm for treewidth. CoRR, abs/2104.07463, 2021. doi:10.48550/arXiv.2104.07463.
  • [22] Ankur Moitra and Ryan O’Donnell. Pareto optimal solutions for smoothed analysts. SIAM J. Comput., 41(5):1266–1284, 2012. doi:10.1137/110851833.
  • [23] Matthias Müller-Hannemann and Karsten Weihe. On the cardinality of the pareto set in bicriteria shortest path problems. Ann. Oper. Res., 147(1):269–286, 2006. doi:10.1007/s10479-006-0072-1.
  • [24] Adli Mustafa and Mark Goh. Finding integer efficient solutions for bicriteria and tricriteria network flow problems using DINAS. Comput. Oper. Res., 25(2):139–157, 1998. doi:10.1016/S0305-0548(97)00027-0.
  • [25] George L. Nemhauser and Zev Ullmann. Discrete dynamic programming and capital allocation. Management Science, 15(9):494–505, 1969. doi:10.1287/mnsc.15.9.494.
  • [26] Neil Robertson and Paul D. Seymour. Graph minors. II. algorithmic aspects of tree-width. J. Algorithms, 7(3):309–322, 1986. doi:10.1016/0196-6774(86)90023-4.
  • [27] Heiko Röglin and Shang-Hua Teng. Smoothed analysis of multiobjective optimization. In 50th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2009, October 25-27, 2009, Atlanta, Georgia, USA, pages 681–690. IEEE Computer Society, 2009. doi:10.1109/FOCS.2009.21.
  • [28] Peter Rottmann, Anne Driemel, Herman J. Haverkort, Heiko Röglin, and Jan-Henrik Haunert. Bicriteria aggregation of polygons via graph cuts. In Krzysztof Janowicz and Judith Anne Verstegen, editors, 11th International Conference on Geographic Information Science, GIScience 2021, September 27-30, 2021, Poznań, Poland (Virtual Conference) - Part II, volume 208 of LIPIcs, pages 6:1–6:16, Dagstuhl, Germany, 2021. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.GIScience.2021.II.6.
  • [29] Anders J. V. Skriver and Kim Allan Andersen. A label correcting approach for solving bicriterion shortest-path problems. Comput. Oper. Res., 27(6):507–524, 2000. doi:10.1016/S0305-0548(99)00037-4.