Counting Locally Optimal Tours in the TSP

Manthey, Bodo; van Rhijn, Jesse

doi:10.4230/LIPIcs.MFCS.2025.73

Counting Locally Optimal Tours in the TSP

Bodo Manthey

Faculty of Electrical Engineering, Mathematics, and Computer Science, University of Twente, The Netherlands Jesse van Rhijn

Faculty of Electrical Engineering, Mathematics, and Computer Science, University of Twente, The Netherlands

Abstract

We show that the problem of counting 2-optimal tours in instances of the Travelling Salesperson Problem (TSP) on complete graphs is #P-complete. In addition, we show that the expected number of 2-optimal tours in random instances of the TSP on complete graphs is $O(1.2098^{n}\sqrt{n!})$ . Based on numerical experiments, we conjecture that the true bound is at most $O(\sqrt{n!})$ , which is approximately the square root of the total number of tours.

Keywords and phrases:

Travelling salesman problem, probabilistic analysis, local search, heuristics, 2-opt

Funding:

Jesse van Rhijn: Supported by NWO grant OCENW.KLEIN.176.

Copyright and License:

2012 ACM Subject Classification:

Theory of computation

\rightarrow

Randomness, geometry and discrete structures ; Theory of computation

\rightarrow

Approximation algorithms analysis ; Theory of computation

\rightarrow

Discrete optimization

Related Version:

Full Version: https://arxiv.org/abs/2410.18650

Acknowledgements:

We thank Alexander E. Black for helpful discussions that led to the geometric perspective outlined in Section 4.4.

DOI:

10.4230/LIPIcs.MFCS.2025.73

Event:

50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025)

Editors:

Paweł Gawrychowski, Filip Mazowiecki, and Michał Skrzypczak

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

The Travelling Salesperson Problem is among the best-studied problems in computer science. It can be stated compactly: given a weighted graph $G=(V,E)$ with edge weights $w:E\to$ , find the Hamiltonian cycle (tour) on $G$ with the smallest total weight. The TSP is a classic example of a hard optimization problem, being even among Karp’s original 21 NP-hard problems [23].

Owing to this hardness, practitioners often turn to approximate methods. One extremely successful method is local search [25]. This is a general optimization framework where one modifies an existing (sub-optimal) solution into a better solution.

The simplest local search heuristic for the TSP is 2-opt [2]. This heuristic takes as its input a tour $T$ , and finds two sets of two edges each, $\{e_{1},e_{2}\}\subseteq T$ and $\{f_{1},f_{2}\}\nsubseteq T$ , such that exchanging $\{e_{1},e_{2}\}$ for $\{f_{1},f_{2}\}$ yields again a tour $T^{\prime}$ , and the total weight of $T^{\prime}$ is strictly less than the total weight of $T$ . This procedure is repeated with the new tour, and stops once no such edges exist. The resulting tour is said to be locally optimal with respect to the 2-opt neighborhood.

A convenient way to view 2-opt (and other local search heuristics) is via the transition graph $\mathcal{T}$ . This directed graph contains a node for every tour of $G$ . An arc $(T_{1},T_{2})$ exists in $\mathcal{T}$ if and only if $T_{2}$ can be obtained from $T_{1}$ by a 2-opt step and $T_{2}$ has strictly lower cost than $T_{1}$ . The sinks of $\mathcal{T}$ are exactly the locally optimal tours of $G$ . A run of 2-opt can then be characterized by a directed path through $\mathcal{T}$ ending in a sink.

Much research has previously focused on understanding the running time of 2-opt [9, 16, 17, 26, 27] and its approximation ratio [6, 15, 16, 20, 24]. On the other hand, little is known about the structure of $\mathcal{T}$ . In this paper, we are concerned with counting the sinks of $\mathcal{T}$ , which is equivalent to counting 2-optimal tours in the instance represented by $\mathcal{T}$ .

There are practical reasons to study the transition graphs of local search heuristics. First, observe that the transition graph of 2-opt for the random TSP instances we consider is a type of random directed acyclic graph. A run of 2-opt can be viewed as a path through this random DAG, so that the the length of the longest path in $\mathcal{T}$ is an upper bound for the number of iterations 2-opt can perform. This upper bound is however rather crude: If we consider a run of 2-opt with a random initialization, then the probability that we start the run on a node of the longest path is likely small. If most paths are much shorter, then this can provide a better explanation for the practical running time of 2-opt than only studying the longest path.

Structural results on the transition graph may in addition have implications for the running time of metaheuristics. In particular, 2-opt is often used as the basis of simulated annealing, a physics-inspired metaheuristic [2, Chapter 8]. It has long been known that the structure of the transition graph strongly influences the running time of this algorithm. Structural parameters of this graph often enter convergence results and running time estimates [19, 21, 28]. A recent result by Chen et al. [10] especially illustrates this point. They showed that the Metropolis process (in essence, simulated annealing at a constant temperature) is unable to find even very large planted cliques in an Erdős-Rényi random graph. Their analysis hinges on several structural results on cliques in such random graphs.

In particular, the aforementioned results rely (roughly) on the following structural property of the transition graph: If $S$ is the set of desired cliques, then the set $\delta S$ of neighbors of $S$ is small. Since simulated annealing must pass through $\delta S$ to reach $S$ , it can be shown that the hitting time of $S$ is large in expectation.

The result by Chen et al., as well as the preceding result by Jerrum [21], deals with the purely discrete problem of finding cliques. However, simulated annealing is often applied to weighted problems [1], which yields significant challenges in understanding the transition graph. We believe that understanding more about the structure of transition graphs is key to proving rigorous results on simulated annealing for weighted problems.

To be explicit, for 2-opt, if $S$ is the set of 2-optimal tours, then trivially $|\delta S|\leq O(n^{2})\cdot|S|$ . Thus, if $|S|$ is sufficiently small (Theorem 3), then we easily obtain a lower bound for the hitting time of the 2-optimal tours. We do not perform this analysis here, since (i) it is not the main focus of this paper, and (ii) the resulting bound is still somewhat weak: a better bound likely needs a more thorough structural investigation of the transition graph.

Results

We start by showing that the problem of counting 2-optimal tours is #P-complete, even on complete weighted graphs. Recall that #P is the counting analogue to NP, asking not whether a solution exists but how many solutions exist. A formal definition of #P is provided in Section 2.

Our result is in fact slightly stronger. To state it in full, we need the notion of a path cover. A set of paths $\mathcal{P}$ in a graph $G$ is a path cover if every vertex of $G$ is contained in exactly one path of $\mathcal{P}$ . We then have the following.

Theorem 1.

Let $f_{\text{2-opt}}$ be a function that maps a complete weighted graph on the vertex set $V$ to the number of 2-optimal tours on this graph. Using $|V|$ calls to $f_{\text{2-opt}}$ , we can compute the number of path covers of size $\ell$ for each $1\leq\ell\leq|V|$ in polynomial time, using $f_{\text{2-opt}}$ as an oracle.

This result yields #P-hardness of #2Opt on complete graphs as a corollary. This counting problem asks the following question: Given a weighted graph $G$ , how many 2-optimal tours are there on $G$ ? Note that counting Hamiltonian cycles is trivial on complete graphs, whereas hardness of counting 2-optimal tours in the same setting is not immediately obvious.

Theorem 2.

#2Opt is #P-complete, even on complete graphs.

We note that the result remains true for metric TSP instances on complete graphs, which can be seen to hold by adding a sufficiently large number to every edge weight of the original instance.

While counting 2-optimal tours on complete graphs is thus likely intractable in general, we may still wonder about the average case. In the case where the edge weights are given by independent uniformly distributed random variables, we obtain the following upper bound on the number of 2-optimal tours.

Theorem 3.

Let $G$ be a complete graph on $n=2^{k}+1$ vertices, with edge weights drawn independently from $U[0,1]$ for each edge. Then the expected number of 2-optimal tours on $G$ is bounded from above by $O\mathopen{}\mathclose{{\left(1.2098^{n}\sqrt{n!}}}\right)$ .

In the process of proving Theorem 3 we obtain a link between the number of 2-optimal tours and the probability that all entries of a multivariate normal random vector are positive. This quantity is also known as the positive orthant probability. To estimate this probability we prove the following theorem.

Theorem 4.

Let $X$ be a multivariate normal vector with zero mean and covariance matrix $\Sigma$ . The positive orthant probability $\mathbb{P}(_{+}^{d})=\mathbb{P}(X\in_{+}^{d})$ satisfies

\displaystyle\mathbb{P}(_{+}^{d})\leq\frac{\exp\mathopen{}\mathclose{{\left(% \frac{1}{2}\sum_{i=1}^{d}\Sigma_{ii}^{-1}\mathopen{}\mathclose{{\left(\mathbb{% E}\mathopen{}\mathclose{{\left[X_{i}^{2}\,\Big{\lvert}\,_{+}^{d}}}\right]}}% \right)}}\right)}{2^{d-1}e^{d/2}}.

In particular, if the diagonal elements of $\Sigma^{-1}$ are each $\Sigma_{ii}^{-1}=1$ , then

\mathbb{P}(_{+}^{d})\leq 2^{-d+1}e^{-d/2}\exp\mathopen{}\mathclose{{\left(% \frac{1}{2}\mathbb{E}\mathopen{}\mathclose{{\left[\|X\|_{2}^{2}\,\Big{\lvert}% \,_{+}^{d}}}\right]}}\right).

Theorem 4 makes no assumptions on the covariance matrix $\Sigma$ , and thus holds for any set of zero-mean multivariate normal variables. Hence, it may be of independent interest in applications where bounds on positive orthant probabilities are necessary.

2 Preliminaries

2.1 Notation and Definitions

We start with some notational shorthand. Given a symbol $a$ , we write $a^{m}$ for the string consisting of $m$ copies of $a$ . Throughout, $\log(\cdot)$ denotes the logarithm to base 2. We denote the positive orthant of ^d by ${}_{+}^{d}=\{x\in^{d}\>\lvert\>x_{i}>0,\,i\in[d]\}$ . The negative orthant ${}_{-}^{d}$ is defined similarly.

Let $G=(V(G),E(G))$ be a simple graph. For $T$ a tour through $G$ , we call the edges of $T$ the tour-edges and the edges of $E(G)\setminus T$ the chord-edges. A 2-change on $G$ then removes two tour-edges from $T$ and adds two chord-edges to $T$ .

For a fixed tour, if two tour-edges are removed then there is only one choice of chord-edges that yields a new tour. Thus, we can characterize a 2-change fully by the tour-edges it removes. If a 2-change removes the tour-edges $e$ and $f$ we denote this 2-change by $S_{T}(e,f)$ , omitting the subscript $T$ when the tour is clear from the context. We say that two 2-changes on $T$ are chord-disjoint if they have no chord-edges in common.

Given a set $\mathcal{S}$ of $2$ -changes, we define $P(\mathcal{S})=\{\{e,f\}\mid S_{T}(e,f)\in\mathcal{S}\}$ as the set of pairs of tour-edges that participate in the 2-changes in $\mathcal{S}$ . For $e\in T$ , we define $k_{e}(\mathcal{S})=|\{p\in P(\mathcal{S})\mid e\in p\}|$ , the number of 2-changes in $\mathcal{S}$ in which $e$ participates. For each of these quantities, we may omit the argument $\mathcal{S}$ whenever the set meant is clear from context.

Counting Complexity

For our complexity results, we state the definitions of #P and #P-completeness taken verbatim from Arora and Barak [4].

Definition 5.

A function $f:\{0,1\}^{*}\to\mathbb{N}$ is in #P if there exists a polynomial $p:\mathbb{N}\to\mathbb{N}$ and a polynomial-time Turing machine $M$ such that for every $x\in\{0,1\}^{*}$ ,

f(x)=\mathopen{}\mathclose{{\left|\mathopen{}\mathclose{{\left\{y\in\{0,1\}^{p% (x)}\>\lvert\>M(x,y)=1}}\right\}}}\right|.

For completeness, we need to recall the complexity class FP, which is the functional analogue to P. This means that FP is the set of functions $f:\{0,1\}^{*}\to\{0,1\}^{*}$ computable by a polynomial-time deterministic Turing machine. Moreover, we denote by $\text{{FP}{}}^{f}$ the set of such functions where the Turing machine additionally has access to an oracle for $f$ .

Definition 6.

A function $f$ is #P-complete if it belongs to #P and every $g\in\text{{\#P}{}}$ is in $\text{{FP}{}}^{f}$ .

By this definition, if we can solve some #P-complete problem in polynomial time, then we can solve every problem in #P in polynomial time.

2.2 Multivariate Normal Distribution

We require some basic facts about multivariate normal distributions. Let $\mu\in^{d}$ and let $\Sigma\in^{d\times d}$ be symmetric and positive definite. The multivariate normal distribution $\mathcal{N}(\mu,\Sigma)$ is defined by the probability density function

\displaystyle f(x)=\frac{\exp\mathopen{}\mathclose{{\left(-\frac{1}{2}(x-\mu)^% {T}\Sigma^{-1}(x-\mu)}}\right)}{(2\pi)^{d/2}\sqrt{\det\Sigma}}

(1)

with support ^d.

Let $X\sim\mathcal{N}_{d}(\mu,\Sigma)$ . The positive orthant probability of $X$ is the probability that $X$ falls in the positive orthant ${}_{+}^{d}$ . When $\mu=0$ this quantity is also referred to simply as the orthant probability, without specifiying which orthant, as each orthant is related by flipping the sign of a set of coordinates.

3 Complexity of Counting 2-Optimal Tours

We now proceed to show hardness of counting 2-optimal tours on complete graphs. (For space reasons, we defer some proofs to the appendix.) We start by recalling the notion of a path cover. Given a simple graph $G=(V,E)$ , a path cover $\mathcal{P}$ of $G$ is a collection of vertex-disjoint paths such that every $v\in V$ belongs to exactly one path of $\mathcal{P}$ . The size of $\mathcal{P}$ is the number of paths in $\mathcal{P}$ . Note that a path cover of size 1 is equivalent to a Hamiltonian path, and that a path cover may contain paths that consist of a single vertex.

From an instance $G=(V,E)$ of #HamPath we construct a family of instances $G_{m}=(V\cup S,E\cup F)$ of #2Opt for $m\geq|V|+1$ . First, we add a new set $S$ of vertices to $G_{m}$ , where $|S|=m\geq|V|+1$ . To ensure that the reduction is polynomial-time computable, we also require $m\leq 2|V|$ ; the reason for this choice will be apparent shortly. The edges of $G_{m}$ are the edges of $G$ , which are assigned a weight of 0 plus the set of edges $F$ , which contains:

$\blacksquare$

all missing edges between the vertices of $V$ , with weight $M=4$ , which we call non-edges;
$\blacksquare$

all edges between the vertices of $S$ , with weight $N=2$ , which we call $S$ -edges;
$\blacksquare$

all edges between the vertices of $V$ and the vertices of $S$ , with weight $L=1$ , which we call $(V,S)$ -edges.

The precise values of $M$ , $N$ and $L$ are not important, only that they are related as $M\gg N=2L$ . See Figure 1 for a schematic depiction of the reduction.

By this construction, $G_{m}$ is a complete graph on $|V|+m$ vertices. We claim that by computing the number $f_{\text{2-opt}}(G_{m})$ of 2-optimal tours on $G_{m}$ for $m\in\{|V|+1,\ldots,2|V|\}$ we can compute the number of path covers of $G$ of size $1\leq\ell\leq|V|$ . To this end, we first characterize the 2-optimal tours on $G_{m}$ .

(a) The reduction we use to prove #P-hardness of #2Opt. The set

V

represents the vertices of the original graph, and

S

is the set of

m

vertices added in the reduction. Each depicted edge is labelled with its weight. Note that in

G_{m}

, the non-edges of

G

are added as well (represented here by the dashed edge).

(b) A 2-optimal tour through

G_{m}

consisting of two segments. Solid black curves represent paths through

V

, solid gray lines are paths through

S

, and dotted lines are

(V,S)

-edges.

Figure 1: Schematic depiction of the reduction we use to prove #P-hardness of #2Opt, and of a 2-optimal tour in the image instance.

Lemma 7.

A tour $T$ through $G_{m}$ is 2-optimal if and only if it contains no non-edges.

Proof.

To begin, we note that since $|S|>|V|$ , any tour must contain at least one $S$ -edge. Now suppose $T$ contains a non-edge $uv\notin E$ , and let $a b$ be an $S$ -edge in $T$ . Assume these vertices are traversed in the written order on $T$ . Then we replace $u v$ and $a b$ by $u a$ and $v b$ , a 2-change which decreases the tour length by $M+N-2L>0$ . Hence, $T$ is not 2-optimal, proving one direction.

For the other direction, suppose $T$ contains no non-edges. A 2-change that adds a non-edge to the tour increases the tour length by at least $M-2L\geq 0$ ; hence, any improving 2-change cannot add any non-edges to the tour. Any 2-changes on $T$ must then involve only edges of $G$ , $S$ -edges and $(V,S)$ -edges. We go case by case, considering the possible types of the edges removed by any 2-change on $T$ .

Two $S$ -edges.: Since the vertices involved are all vertices of $S$ the added edges are also $S$ -edges, and hence the 2-change does not change the length of the tour at all and is therefore not feasible.
Two edges of $G$ .: Since these edges have zero weight the tour length cannot decrease, and the 2-change is not feasible.
Two $(V,S)$ -edges.: There are two possibilities for the added edges. In one case we obtain one $S$ -edge and one edge of $G$ , for an improvement of $2L-N=0$ . In the other case we obtain again two $(V,S)$ -edges, again for no improvement.
One $S$ -edge $a b$ and one edge $u v$ of $G$ .: The added edges are $a u$ and $b v$ , both of which are $(V,S)$ -edges and thus of weight $L$ . The tour is thus shortened by $N-2L=0$ , and the 2-change yields no improvement.
One $(V,S)$ -edge $a u$ and one edge $v w$ of $G$ .: The added edges are $a v$ and $u w$ . The added edges are again a $(V,S)$ -edge and an edge of $G$ , yielding no improvement.
One $(V,S)$ -edge $a u$ and one $S$ -edge $b c$ .: The added edges consist of one $(V,S)$ -edge and one $S$ -edge, and there is no change in tour length.

These cases cover all possibilities, and hence a tour that contains no non-edges is 2-optimal, concluding the proof. $\hfill\blacktriangleleft$

Let $T$ be a 2-optimal tour through $G_{m}$ . If we remove from $T$ all edges incident to $S$ , then we obtain a path cover of $G$ . We call the size of the resulting path cover the number of segments of $T$ . On the other hand, from any path cover of $G$ , we can construct many 2-optimal tours by connecting the paths in the cover using vertices of $S$ .

More formally, we say that a path cover $\mathcal{P}$ corresponds to $T$ if we obtain $\mathcal{P}$ by removing the edges of $T$ incident to $S$ . Note that two distinct path covers of $G$ correspond to two disjoint sets of 2-optimal tours through $G_{m}$ , and every 2-optimal tour corresponds to exactly one path cover. The following lemma counts the number of 2-optimal tours that correspond to a single path cover of size $\ell$ .

Lemma 8.

A path cover of size $\ell$ corresponds to exactly $\frac{2^{\ell-1}\cdot m!\cdot(m-1)!}{(m-\ell)!}$ 2-optimal tours in $G_{m}$ with $\ell$ segments.

Proof.

By Lemma 7, the 2-optimal tours are exactly those tours which contain no non-edges. Given a path cover $\mathcal{P}$ of size $\ell$ , we can then construct a 2-optimal tour in $G_{m}$ as follows.

Any tour must visit all vertices in $S$ in some order. Hence, we first fix a tour $T_{S}$ through $S$ ; there are $\frac{1}{2}(m-1)!$ such choices. Next, we break $\ell$ edges of $T_{S}$ ; there are $\binom{m}{\ell}$ choices of which edges to break.

We must now insert the $\ell$ paths of $\mathcal{P}$ in place of these broken edges, reconnecting the endpoints of each path to the endpoints of the broken edge it replaces. Note that whenever we perform this operation, there are two possible ways to connect the path. Moreover, there are $\ell!$ ways to match an endpoint to a broken edge, for a total of $2^{\ell}\cdot\ell!$ ways to insert the paths.

Putting the pieces together, we thus have

\frac{(m-1)!}{2}\cdot\binom{m}{\ell}\cdot 2^{\ell}\cdot\ell!=\frac{2^{\ell-1}% \cdot m!\cdot(m-1)!}{(m-\ell)!}

ways to construct a 2-optimal tour through $G_{m}$ . $\hfill\blacktriangleleft$

Let $c(\ell,m)=\frac{2^{\ell-1}\cdot m!\cdot(m-1)!}{(m-\ell)!}$ , and let $C\in\mathbb{Z}^{|V|\times|V|}$ be a matrix with entries $C_{ij}=c(i,j+|V|)$ . Recall that $\ell$ runs from $1$ to $|V|$ and $m$ runs from $|V|+1$ to $2|V|$ .

Lemma 9.

The matrix $C$ defined above has full rank.

Proof.

To start, we write out the entries of $C$ explicitly. Letting $n=|V|$ ,

C_{ij}=\frac{2^{i-1}\cdot(n+j)!\cdot(n+j-1)!}{(n+j-i)!}.

Scaling any row or column of a matrix uniformly does not change its rank. Hence, we multiply column $j$ by $\frac{1}{(n+j)!\cdot(n+j-1)!}$ , and subsequently multiply row $i$ by $1/2^{i-1}$ . Finally, interchanging any two rows also does not change the rank of the matrix, and hence we also mirror the rows of the matrix. The resulting matrix $C^{\prime}$ has entries

C^{\prime}_{ij}=\frac{1}{(i+j-1)!}.

To show that this matrix has full rank, we first recall that a square matrix has full rank if and only if its determinant is nonzero. While we could in principle compute the determinant exactly, this is tedious and not necessary for our purposes. Hence, we use a concise argument from analysis to only show $\det C^{\prime}\neq 0$ [30], reproduced here for the sake of completeness. The proof uses the notion of the Wronskian $W(x)$ of a set of functions $f_{j}(x)$ , which is the determinant of the matrix with $(i,j)$ -th element $f_{j}^{(i)}(x)$ . If the functions are analytic, then $W(x)$ vanishes identically if and only if the functions are linearly dependent [5].

We observe that the determinant of $C^{\prime}$ is, up to a sign, the Wronskian of the functions $f_{j}(x)=\frac{x^{n+j-1}}{(n+j-1)!}$ evaluated at $x=1$ . Since these functions are polynomials of different degrees, they are linearly independent and hence from analyticity of the polynomials it follows that the Wronskian does not vanish identically. By factoring out all the powers of $x$ from the rows and columns, we see that the Wronskian is a power of $x$ multiplied by its value at $x=1$ , which is $\det C^{\prime}$ up to a sign. As the Wronskian does not vanish identically, we must have $\det C^{\prime}\neq 0$ . $\hfill\blacktriangleleft$

Now we proceed to our main algorithmic result, from which #P-completeness of #2Opt follows as a corollary.

Theorem 1. [Restated, see original statement.]

Let $f_{\text{2-opt}}$ be a function that maps a complete weighted graph on the vertex set $V$ to the number of 2-optimal tours on this graph. Using $|V|$ calls to $f_{\text{2-opt}}$ , we can compute the number of path covers of size $\ell$ for each $1\leq\ell\leq|V|$ in polynomial time, using $f_{\text{2-opt}}$ as an oracle.

Proof.

For a graph $G$ , let $G_{m}$ be the complete weighted graph resulting from the reduction described above. Let $a_{\ell}(G)$ be the number of path covers of size $\ell$ , and let $b_{\ell}(G_{m})$ be the number of 2-optimal tours on $G_{m}$ consisting of $\ell$ segments. From Lemma 8, we have

b_{\ell}(G_{m})=c(\ell,m)a_{\ell}(G).

We have $f_{\text{2-opt}}(G_{m})=\sum_{\ell=1}^{|V|}b_{\ell}(G_{m})=\sum_{\ell=1}^{|V|}% c(\ell,m)a_{\ell}(G)$ .

We aim to compute the numbers $a_{\ell}(G)$ for each $\ell\in[|V|]$ . Let $a=(a_{\ell}(G))_{\ell\in[|V|]}$ , and let $b=(f_{\text{2-opt}}(G_{m}))_{m=|V|+1}^{2|V|}$ . Then the above yields the matrix equation $b=C^{T}a$ .

By Lemma 9, the matrix $C$ has full rank, and is hence invertible. While $C$ is rather ill-conditioned, the elements of $C$ can be encoded using a number of bits polynomial in $|V|+|E|$ . Thus, after making $|V|$ calls to $f_{\text{2-opt}}$ to compute the vector $b$ , we can compute $a$ in polynomial time. The entries $a_{\ell}$ of $a$ are, by construction, exactly the number of path covers of size $\ell$ of $G$ . $\hfill\blacktriangleleft$

Theorem 2. [Restated, see original statement.]

#2Opt is #P-complete, even on complete graphs.

Proof.

Membership in #P is obvious, since the problem of verifying 2-optimality of a tour is in P. For hardness, we rely on the #P-hardness of #HamPath, which was shown by Valiant [29]. Note that the number of Hamiltonian paths through a graph $G$ is exactly the number of path covers of size $\ell=1$ . By Theorem 1, given oracle access to $f_{\text{2-opt}}$ , we can solve #HamPath – and thus every problem in #P – in polynomial time, and hence #2Opt is #P-complete when restricted to complete graphs. $\hfill\blacktriangleleft$

4 Counting 2-Optima in Random Instances

While counting 2-optimal tours is in general hard on complete graphs, we can still provide some results for special cases. (Again, missing proofs may be found in the appendix.) In this section, we restrict our attention to complete graphs on $n=2^{k}+1$ vertices for some integer $k$ . The weights of the edges of our graphs are drawn independently from the uniform distribution on $[0,1]$ .

The strategy we use is to bound the probability that a given tour is 2-optimal. Since we assume the input graph is complete, this probability is the same for all tours. It then suffices to multiply this probability by the total number of tours, which is $\frac{1}{2}(n-1)!$ .

To bound the probability of a tour $T$ being 2-optimal we find a large set $\mathcal{S}$ of mutually chord-disjoint 2-changes on $T$ . We then apply a general result that links the probability of $2$ -optimality of $T$ to how often each edge of $T$ is used in $\mathcal{S}$ (Lemma 10). Details of the construction of this set are given in Section 4.2.

4.1 Orthant Probabilities and 2-Optimality

Let $\mathcal{S}$ be a set of chord-disjoint 2-changes on a tour $T$ on the complete graph on $n$ vertices, and define for $x\in^{E}$

g_{\mathcal{S}}(x)=\exp\mathopen{}\mathclose{{\left(-\sum_{\{e,f\}\in P(% \mathcal{S})}\frac{x_{e}x_{f}}{\sqrt{k_{e}(\mathcal{S})k_{f}(\mathcal{S})}}}}% \right).

Now let $X=(X_{e})_{e\in T,k_{e}(\mathcal{S})>0}$ be a sequence of independent half-normal distributed random variables with unit variance. We define the function

\mathcal{G}(\mathcal{S})=\mathbb{E}[g_{\mathcal{S}}(X)].

Observe that $\mathcal{G}$ is solely a function of $\mathcal{S}$ .

Lemma 10.

Let $G$ be a complete graph on $n$ vertices, with each edge of $G$ independently assigned a weight drawn from $U[0,1]$ . Let $T$ be any tour through $G$ . Let $\mathcal{S}$ be a set of chord-disjoint 2-changes on $T$ . Then

\mathbb{P}(\text{$T$ is 2-optimal})\leq\mathcal{G}(\mathcal{S})\prod_{\begin{% subarray}{c}e\in T\\ k_{e}(\mathcal{S})>0\end{subarray}}\sqrt{\frac{\pi}{2k_{e}(\mathcal{S})}}.

Proof sketch.

We start by conditioning on the values of the weights on the edges of $T$ . Since $\mathcal{S}$ is chord-disjoint, the events $\{\Delta(S)\leq 0\}_{S\in\mathcal{S}}$ are independent subject to this conditioning. Since the event that $T$ is 2-optimal is equivalent to $\bigwedge_{S\in\mathcal{S}}\mathopen{}\mathclose{{\left(\Delta(S)\leq 0}}\right)$ , the probability of this event then separates into the product of the probabilities that each 2-change in $\mathcal{S}$ is non-improving. These probabilities are then straightforwardly computed. In the resulting expression, we recognize the function $\mathcal{G}$ defined above, as well as the joint density of a multivariate half-normal distribution, from which the lemma follows after normalizing this density. $\hfill\blacktriangleleft$

Bounding $\mathbb{P}(\text{$T$ is 2-optimal})$ now involves two steps. First, we must construct a set $\mathcal{S}$ of chord-disjoint 2-changes such that each edge of $T$ is used in many 2-changes in $\mathcal{S}$ . Second, given this set, we must bound $\mathcal{G}(\mathcal{S})$ .

We remark now that $\mathcal{G}(\mathcal{S})$ is trivially bounded from above by 1. However, this leaves a factor of about $(\pi/2)^{n/2}$ . Although this factor is small compared to $\prod_{e\in T,\,k_{e}>0}\frac{1}{\sqrt{k_{e}(\mathcal{S})}}$ for the set $\mathcal{S}$ we construct, leaving it in is somewhat unsatisfactory. We thus make an attempt to prove a stronger bound for $\mathcal{G}$ .

Computing $\mathcal{G}(\mathcal{S})$ directly is unfeasible. It helps to recast it in terms of a positive orthant probability.

Lemma 11.

For a set $\mathcal{S}$ of chord-disjoint 2-changes on a tour $T$ , let $k$ denote the number of tour-edges with $k_{e}(\mathcal{S})>0$ . Label these edges of $T$ arbitrarily from $1$ to $k$ . For any $I\subseteq[k]$ , the function $\mathcal{G}(\mathcal{S})$ is bounded from above by $2^{|I|}\cdot\sqrt{\det{\Sigma}}\cdot\mathbb{P}\mathopen{}\mathclose{{\left(% \bigwedge_{i\in I}Z_{i}>0}}\right)$ , where $(Z_{i})_{i\in I}$ is distributed according to a multivariate normal distribution with mean 0 and inverse covariance matrix $\Sigma^{-1}\in\mathbb{R}^{|I|\times|I|}$ with entries indexed by $I$ ,

\Sigma^{-1}_{ij}=\begin{cases}1,&\text{ if\leavevmode\nobreak\ $i=j$},\\ s_{ij},&\text{ otherwise},\end{cases}

where $s_{ij}\leq 1/\sqrt{k_{i}k_{j}}$ .

Proof.

Since $g_{\mathcal{S}}(x)\leq 1$ and $g_{\mathcal{S}}$ is decreasing in each variable, we can bound $g_{\mathcal{S}}$ from above uniformly by setting the coefficients of any subset of the variables to zero. Setting these to zero for all variables outside $I$ then leaves

\displaystyle\mathcal{G}(\mathcal{S})\leq\mathopen{}\mathclose{{\left(\frac{2}% {\pi}}}\right)^{|I|/2}\int_{{}_{+}^{|I|}}\exp\mathopen{}\mathclose{{\left(-% \frac{1}{2}\sum_{i\in I}x_{i}^{2}}}\right)\exp\mathopen{}\mathclose{{\left(-% \sum_{i\in P_{I}(\mathcal{S})}\frac{x_{i}x_{j}}{\sqrt{k_{i}k_{j}}}}}\right)% \prod_{i\in I}\mathop{}\!\mathrm{d}x_{i},

where $P_{I}(\mathcal{S})$ is the same as $P(\mathcal{S})$ , except we keep only pairs with both elements in $I$ .

Observe that $e^{-x_{i}x_{j}/\sqrt{k_{i}k_{j}}}\leq e^{-s_{ij}x_{i}x_{j}}$ , allowing us to replace the coefficients of these products. We now only need to insert the appropriate normalization factor for a multivariate normal distribution. Comparing the resulting expression with Equation 1 completes the proof. $\hfill\blacktriangleleft$

Still, computing the positive orthant probability directly is rather difficult. Explicit formulas are known for low-dimensional cases, as well as general recursive formulas [3, 11, 12], but none of these are particularly helpful in bounding $\mathcal{G}(\mathcal{S})$ . As we only need a nontrivial upper bound, some simplifications are possible. In Theorem 4 (restated below), we show that it suffices to bound the expected squared norm of a multivariate normal vector.

Theorem 4. [Restated, see original statement.]

Let $X$ be a multivariate normal vector with zero mean and covariance matrix $\Sigma$ . The positive orthant probability $\mathbb{P}(_{+}^{d})=\mathbb{P}(X\in_{+}^{d})$ satisfies

\displaystyle\mathbb{P}(_{+}^{d})\leq\frac{\exp\mathopen{}\mathclose{{\left(% \frac{1}{2}\sum_{i=1}^{d}\Sigma_{ii}^{-1}\mathopen{}\mathclose{{\left(\mathbb{% E}\mathopen{}\mathclose{{\left[X_{i}^{2}\,\Big{\lvert}\,_{+}^{d}}}\right]}}% \right)}}\right)}{2^{d-1}e^{d/2}}.

In particular, if the diagonal elements of $\Sigma^{-1}$ are each $\Sigma_{ii}^{-1}=1$ , then

\mathbb{P}(_{+}^{d})\leq 2^{-d+1}e^{-d/2}\exp\mathopen{}\mathclose{{\left(% \frac{1}{2}\mathbb{E}\mathopen{}\mathclose{{\left[\|X\|_{2}^{2}\,\Big{\lvert}% \,_{+}^{d}}}\right]}}\right).

4.2 Finding Chord-Disjoint 2-Changes

Figure 2: Colors of the edges in the tour

T

at stage

t=3

, for

n=2^{4}+1

. The dotted lines are drawn to show the boundaries of each segment

T_{i}

more clearly. The segments of the tour are numbered starting at the right and proceeding counterclockwise. The 2-changes we consider in the proof of Lemma 12 are then the 2-changes formed from the red edges in

T_{i}

(drawn black) and the blue edges in

T_{i+1}

(drawn gray) that appear in even positions along

T

, for

i

odd. Note that the last edge along

T

is drawn dashed to indicate that it is not used in the construction of

\mathcal{S}

.

Next, we construct a set $\mathcal{S}$ of 2-changes such that any two 2-changes in $\mathcal{S}$ are chord-disjoint, and most of the edges of $T$ participate in many 2-changes in $\mathcal{S}$ . The construction we provide here works for complete graphs with $n=2^{k}+1$ vertices for some integer $k$ .

The construction of $\mathcal{S}$ proceeds as follows. Recall that we ordered the edges of $T$ as $(e_{1},e_{2},\ldots,e_{n})$ . We define the following process on $T$ , occurring in stages. At stage $t$ we divide the tour into $2^{t}$ equal segments $\{T_{1},\ldots,T_{2^{t}}\}$ , starting at $e_{1}$ . In each stage, we color all the edges in each odd segment red and the edges in each even segment blue. The only exception to this rule is the last edge $e_{n}$ , which we color black; this edge is not used to form any 2-changes. See Figure 2 for an illustration at stage $t=3$ .

At each stage we consider the 2-changes formed by the red edges in each odd segment $T_{i}$ together with the even blue edges in its successor segment $T_{i+1}$ . We say that these are the 2-changes added in stage $t$ , and denote the set of these 2-changes by $\mathcal{S}_{t}$ . We continue this process for $\log(n-1)-1$ stages. Note that in the final stage, each segment contains two colored edges.

With some effort, one can show that $\mathcal{S}$ is chord-disjoint.

Lemma 12.

The 2-changes in $\mathcal{S}=\bigcup_{t=1}^{\log n-1}\mathcal{S}_{t}$ are chord-disjoint.

We now determine how often each edge of $T$ is used in $\mathcal{S}$ , which we need to apply Lemma 10.

Figure 3: Illustration of the construction used in the proof of Lemma 13. The path indicated in bold assigns a binary string 011 to each edge in

T_{5}

.

Lemma 13.

For $\mathcal{S}$ as constructed above, we have $\{k_{e}(\mathcal{S})\}_{e\in T}={\{0,1,2,\ldots,n-3\}}$ . Moreover, there are two edges with $k_{e}(\mathcal{S})=0$ and two edges with $k_{e}(\mathcal{S})=\frac{n-1}{2}$ .

Proof.

We maintain the same order on the edges of $T$ as used in the construction of $\mathcal{S}$ . Using this order, we call an edge even if it appears in an even position in this order, and odd otherwise. Note that the last edge $e_{n}$ is not considered in the construction of $\mathcal{S}$ , and so $k_{e_{n}}=0$ . For the remainder of the proof, we consider $T$ with $e_{n}$ removed.

For convenience, denote by $k_{e}^{t}$ the number of 2-changes that $e$ participates in at stage $t$ , so that $k_{e}=\sum_{t=1}^{\log n-1}k_{e}^{t}$ . We observe the following property of $\mathcal{S}$ .

Fact.

Consider stage $t$ in the construction of $\mathcal{S}$ . If an edge $e$ is red in stage $t$ , then $k_{e}^{t}=n/2^{t+1}$ . If $e$ is blue, then $k_{e}^{t}=n/2^{t}$ if $e$ is even, and $k_{e}^{t}=0$ otherwise.

For any edge of $T$ , we can count the number of 2-changes of $\mathcal{S}$ it participates in as follows. (See Figure 3 for an illustration.) Construct a rooted directed binary tree $H$ , where the nodes of $H$ are sub-paths of $T$ . The root of $H$ is $T$ itself, while the children of any node $P$ of $H$ are the first and the second halves of the edges in $P$ under the ordering of the edges of $T$ , labelled $P_{1}$ and $P_{2}$ respectively. Thus, the children of $T$ are $P_{1}=\{e_{1},\ldots,e_{{(n-1)}/2}\}$ and $P_{2}=\{e_{(n-1)/2+1},\ldots,e_{n-1}\}$ . We moreover label the arcs of $H$ . If $P_{1}$ and $P_{2}$ are the children of a node $P$ as described above, then the arc $a_{1}=(P,P_{1})$ gets a label $L(a_{1})=1$ , while $a_{2}=(P,P_{2})$ gets a label $L(a_{2})=0$ .

We construct $H$ in this way from the root, continuing until $H$ has depth $\log(n-1)-1$ . (We define the depth of a node $v$ in a rooted directed tree as the number of arcs in the directed path from the root to $v$ . The depth of the tree itself is the largest depth among all nodes of the tree.) Then the nodes at depth $t$ of $H$ are the parts into which $T$ is partitioned at stage $t$ . Note that the leaves of $H$ each contain two successive edges of $T$ .

Let $a=(P,Q)$ be an arc in $H$ with $L(a)=1$ . From the construction of $\mathcal{S}$ , it follows that any edge $e\in Q$ is colored red in the stage corresponding to the depth of $Q$ . Conversely, if $L(a)=0$ , then $Q$ is colored blue in this stage.

For each leaf $P$ of $H$ , we then consider the path $P(v)=Ta_{1}P_{1}a_{2}\ldots a_{\log(n-1)-1}Q$ from the root to this leaf. Following this path, we collect the labels of the arcs along this path into a string $x(Q)$ , so $x(Q)=L(a_{1})L(a_{2})\ldots L(a_{\log(n-1)-1})$ . For $e\in Q$ , we set $x_{e}=x(Q)$ .

Given $x_{e}$ for some edge $e\in T$ , we know that $e$ is red in stage $t$ if and only if $x_{e}(t)=1$ . We thus know exactly in which stages $e$ is colored red, and in which it is colored blue. Using this information together with the fact above, we can derive formulae for $k_{e}$ for any $e\in T$ . There are two distinct cases.

Case 1: $e$ odd.

Since $e$ is odd, it only participates in any 2-change when it is colored red. Thus, stage $t$ contributes $x_{e}(t)\cdot(n-1)/2^{t+1}$ 2-changes, and so we count

	$\displaystyle k_{e}$	$\displaystyle=(n-1)\sum_{t=1}^{\log(n-1)-1}x_{e}(t)\cdot\frac{1}{2^{t+1}}=\sum% _{t=1}^{\log(n-1)-1}x_{e}(t)\cdot 2^{\log(n-1)-1-t}$
		$\displaystyle=\sum_{j=0}^{\log(n-1)-2}x_{e}(\log(n-1)-1-j)\cdot 2^{j}.$

Note that for a given bit string $x_{e}$ , this is simply the decimal expansion of the binary number represented by $x_{e}$ .

Since every bit string of length $\log(n-1)-1$ is present for some odd edge (there are $2^{\log(n-1)-1}$ leaves in $H$ and each leaf corresponds to a distinct string), we find that $\{k_{e}\mid\text{odd }e\in T\}=\{0,1,\ldots,(n-1)/2-1\}$ .

Case 2: $e$ even.

An even edge contributes $(n-1)/2^{t+1}$ 2-changes at stage $t$ if it is red, and $(n-1)/2^{t}$ 2-changes if it is blue. Thus, the contribution at this stage is $(n-1)/2^{t}-x_{e}(t)\cdot(n-1)/2^{t+1}$ , and so we have

	$\displaystyle k_{e}$	$\displaystyle=(n-1)\mathopen{}\mathclose{{\left(\sum_{t=1}^{\log(n-1)-1}2^{-t}% -\sum_{t=2}^{\log(n-1)-1}x_{e}(t)\cdot\frac{1}{2^{t+1}}}}\right)$
		$\displaystyle=n-3-\frac{n-1}{2}\cdot\sum_{t=1}^{\log(n-1)-1}x_{e}(t)\cdot\frac% {1}{2^{t}}$
		$\displaystyle=n-3-\sum_{j=0}^{\log(n-1)-2}x_{e}(\log(n-1)-1-j)\cdot 2^{j}.$

As in the previous case, we recognize here the decimal expansion of $x_{e}$ in the second term. Thus, $\{k_{e}\mid\text{even }e\in T\}=\{(n-1)/2-1,(n-1)/2,\ldots,n-2,n-3\}$ .

$\hfill\blacktriangleleft$

4.3 Putting the Pieces Together

We return to the positive orthant probability by constructing an inverse covariance matrix $\Sigma^{-1}$ corresponding to $\mathcal{S}$ according to Lemma 11. In the following, we sort the edges of $T$ by decreasing value of $k_{e}(\mathcal{S})$ .

Observe that the first $d=(n-3)/2$ edges of $T$ in this order each form 2-changes with one another in $\mathcal{S}$ . Note that $d$ is an integer, as $n=2^{k}+1$ is odd. The entries of $\Sigma^{-1}$ corresponding to these 2-changes are $\Sigma_{ij}^{-1}=1/\sqrt{k_{i}k_{j}}\geq\frac{1}{n-3}$ . Thus, the inverse covariance matrix $\Sigma^{-1}$ constructed from $\mathcal{S}$ is upper-left triangular, except with ones on the diagonal. We can then write it in block form,

\Sigma^{-1}=\begin{pmatrix}\tilde{\Sigma}^{-1}&A\\ A^{T}&I\end{pmatrix},

where the diagonal entries of $\tilde{\Sigma}^{-1}\in\mathbb{R}^{d\times d}$ are each 1, and the off-diagonal entries each satisfy $\tilde{\Sigma}^{-1}_{ij}\geq\frac{1}{n-3}=\frac{1}{2d}$ . We consider the matrix $\hat{\Sigma}^{-1}$ , which is identical to $\tilde{\Sigma}^{-1}$ , but with the off-diagonal entries replaced by $\frac{1}{2d}$ .

Comparing $\hat{\Sigma}^{-1}$ to Lemma 11, we proceed to bound the positive orthant probability associated with $\hat{\Sigma}^{-1}$ . A trivial bound for this probability is $2^{-d}$ . Lemma 14 therefore represents a non-trivial, if modest, improvement.

Lemma 14.

Let $X$ be distributed according to $\mathcal{N}_{d}(0,\hat{\Sigma})$ , where $\hat{\Sigma}^{-1}\in^{d\times d}$ has unit diagonal entries and off-diagonal entries $\frac{1}{2d}$ . The positive orthant probability of $X$ is bounded from above by $O\mathopen{}\mathclose{{\left(2^{-d}e^{-\frac{2d}{9\pi}}}}\right)$ .

The following is now an immediate consequence of Lemmas 11 and 14, recalling that $d=(n-3)/2$ .

Lemma 15.

For $\mathcal{S}$ as constructed above, $\mathcal{G}(\mathcal{S})=O\mathopen{}\mathclose{{\left(e^{-\frac{n}{9\pi}}}}\right)$ .

Proof.

By Lemma 11, we need to multiply the bound from Lemma 14 by a factor $2^{d}\sqrt{\det{\hat{\Sigma}}}$ . It is easily seen from the proof of the latter lemma that this determinant is $O(1)$ . $\hfill\blacktriangleleft$

Finally, we combine all of the above for the last crucial lemma, from which our main result follows directly.

Lemma 16.

Let $G$ be a complete graph on $n=2^{k}+1\geq 5$ vertices, with edge weights drawn independently from $U[0,1]$ for each edge. Let $T$ be any tour through $G$ . The probability that $T$ is 2-optimal is bounded from above by

O\mathopen{}\mathclose{{\left(\frac{c^{n}}{\sqrt{(n-2)!}}}}\right),

where $c=\sqrt{\frac{\pi}{2}}\cdot e^{-\frac{1}{9\pi}}<1.2098$ .

Proof.

Let $\mathcal{S}$ be a set of 2-changes on $G$ constructed as described above, and let $k_{e}$ be the number of 2-changes in $\mathcal{S}$ in which $e\in T$ participates. From Lemma 13 we have

\displaystyle\prod_{\begin{subarray}{c}e\in T\\ k_{e}>0\end{subarray}}\frac{1}{k_{e}}=\frac{1}{\frac{n-1}{2}-1}\cdot\frac{1}{(% n-3)!}=O\mathopen{}\mathclose{{\left(\frac{1}{(n-2)!}}}\right).

Using this result in Lemma 10 together with Lemma 15 yields the claim. $\hfill\blacktriangleleft$

This leads to our last result (Theorem 3), using the fact that the number of tours on a complete graph is $\frac{1}{2}(n-1)!$ .

4.4 Numerical Experiment

It seems unlikely that the bound in Theorem 3 is tight. A simple numerical estimate of $\mathcal{G}(\mathcal{S})$ already shows that it could be improved to $O(1.0223^{n}\sqrt{n!})$ , but even this may be a coarse approximation; after all, in constructing $\mathcal{S}$ , we discard many 2-changes.

To estimate the number of 2-optimal tours numerically, one could take a naïve approach: simply fix a tour $T$ , generate edge weights from $U[0,1]$ , and check 2-optimality of $T$ with these weights. By repeating this experiment we can estimate $\mathbb{P}(\text{$T$ is 2-optimal})$ . However, since this probability is super-exponentially small (Lemma 16), this is rather inefficient.

We can do better by taking a different view of the problem. Fix again an arbitrary tour $T$ . We write the edge weights as a vector $w\in[0,1]^{E}$ . Given $T$ , we can determine all $n(n-3)/2$ possible 2-changes on $T$ . Local optimality of $T$ means that all these 2-changes yield a negative improvement. Thus, for a 2-change that removes edges $e_{1}$ and $e_{2}$ and adds $f_{1}$ and $f_{2}$ , this yields an inequality

\displaystyle w_{e_{1}}+w_{e_{2}}-w_{f_{1}}-w_{f_{2}}\leq 0.

(2)

Each possible 2-change on $T$ yields such a constraint. Together with the constraints $0\leq w_{e}\leq 1$ for each edge, this yields a convex polytope $P$ embedded in ^m. Since the edge weights are i.i.d uniform random variables,

\mathbb{P}(\text{$T$ is 2-optimal})=\mathrm{vol}(P),

the $|E|$ -dimensional volume of $P$ .

We remark now that our approach through Lemma 10 is equivalent to bounding the volume of a polytope that contains $P$ , by discarding some of the inequalities (Equation 2). It may be possible to use methods from convex geometry to compute better estimates of $\mathrm{vol}(P)$ ; we leave this discussion to Section 5.

Computing the volume of an arbitrary polytope is itself not an easy task: this problem is #P-hard in general, as shown by Dyer and Frieze [13]. In the same work however, they show that volume approximation can be done efficiently by randomized algorithms. We use the open-source software package Volesti [8, 14] to numerically approximate the volume of $P$ .

For a range of different $n$ , we compute the inequalities that define $P$ , and output them in a format readable for Volesti. We used the “Cooling Balls” algorithm of Volesti, with an error parameter of 0.001 (see the specification of Volesti [8]), and approximated $\mathrm{vol}(P)$ up to $n=40$ . The results of these computations are shown in Figure 4.

Figure 4: Estimated volume of the 2-opt polytope

P_{n}

, for different values of

n

, as computed by Volesti. For comparison, the function

n\mapsto 1/\sqrt{n!}

is also plotted.

5 Discussion

The result we present in Theorem 2 is to our knowledge the first hardness result for counting locally optimal solutions for a natural, weighted local search problem. We note that it is easy to show that counting the local optima is #P-hard for some local search problem. For instance, Johnson et al. [22] provided a local search problem whose locally optimal solutions are exactly the satisfying assignments of an instance of Boolean Satisfiability (plus one extra solution). However, their construction is rather artificial, whereas 2-opt is a heuristic often used in practice. Moreover, for unweighted problems, Goldberg et al. [18] previously showed some hardness results for counting maximal independent sets (which can be seen as locally optimal independent sets under the simple neighborhood of vertex addition).

Theorem 3 is to our knowledge the first non-trivial bound on the number of locally optimal solutions for a local search problem. It must be noted that we only proved Theorem 3 for specific values of $n$ , namely $n=2^{k}+1$ for $k\in\mathbb{N}$ . This restriction simplifies mainly the construction of $\mathcal{S}$ , our set of chord-disjoint 2-changes. We believe that the results can be extended at the cost of some complexity in the proofs. For the sake of simplicity, we chose not to do so.

The bound in Theorem 3 shows that in expectation, the number of 2-optimal tours in a random TSP instance is approximately the square root of the total number of tours. While still a rather large number, it is nonetheless a super-exponentially small fraction of the total number of tours. While our analysis is specialized to 2-opt, we believe the techniques can be straightforwardly extended to other heuristics, so long as the instance is sufficiently symmetric and the improvement steps can be described by a fixed number of random variables (e.g., four random edge weights for 2-opt). The extension would be more involved if the number of variables per 2-opt step is not fixed, or if different solutions differ in structure.

Limitations

A natural question concerns the tightness of the bound in Theorem 3. We have not succeeded in constructing a lower bound for the number of 2-optimal tours. Presumably such a construction would be rather involved, since there is little structure in this random instance model. Nonetheless, we can still argue that Theorem 3 can be improved significantly from two directions: bounding $\mathcal{G}(\mathcal{S})$ and constructing $\mathcal{S}$ .

Lemma 15 is far from optimal. A simple numerical experiment to estimate $\mathcal{G}(\mathcal{S})$ yields $\mathcal{G}(\mathcal{S})=O(1.226^{-n})$ . This results in a bound of $O(1.0223^{n}\sqrt{n!})$ in Theorem 3. We also note that, even though our set $\mathcal{S}$ achieves essentially the largest number of chord-disjoint 2-changes possible on any tour, it may not be an optimal choice: different choices may lead to better sequences of values for $k_{e}$ .

Using the trivial bound of $\mathcal{G}(\mathcal{S})\leq 1$ instead of Lemma 15 in Lemma 16, we would obtain in Theorem 3 a bound of approximately $O(1.2534^{n}\sqrt{n!})$ . The difference with our bound is thus less than a factor of $1.04^{n}$ , which is a rather minute improvement for the amount of trouble we went through to obtain it. We therefore regard the calculations leading to Lemma 15 more as a proof-of-concept that significant improvements are still possible to obtain rigorously, and as a demonstration of the effort required to achieve anything non-trivial.

Another possible direction for improving the bound in Theorem 3 lies in polyhedron volume estimation. In Section 4.4, we defined a polytope $P$ such that the volume of $P$ is the probability that a tour on $n$ vertices is 2-optimal. While computing the volume of a polytope is #P-hard in general [13], it may be possible to obtain an asymptotic formula for our specific case, as was done by Canfield and McKay for the well-studied Birkhoff polytope [7].

A conjecture

In light of these observations, and the estimate resulting from the numerical experiments, we conjecture the following.

Conjecture 17.

Let $G$ be a complete graph on $n$ vertices, with edge weights drawn independently from $U[0,1]$ for each edge. Then the expected number of 2-optimal tours on $G$ is bounded from above by $O(\sqrt{n!})$ .

The numerical data suggest that the expected number of tours may even be $c^{n}\sqrt{n!}$ for some $c<1$ (as opposed to $c>1$ as in Theorem 3), although we could not perform the numerical experiments for large enough $n$ to state this with confidence.

References

[1] Emile Aarts and Jan Korst. Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing. John Wiley & Sons, Inc., USA, 1989.
[2] Emile Aarts and Jan Karel Lenstra, editors. Local Search in Combinatorial Optimization. Princeton University Press, 2003. doi:10.2307/j.ctv346t9c.
[3] I. G. Abrahamson. Orthant Probabilities for the Quadrivariate Normal Distribution. The Annals of Mathematical Statistics, 35(4):1685–1703, December 1964. doi:10.1214/aoms/1177700391.
[4] Sanjeev Arora and Boaz Barak. Computational Complexity: A Modern Approach. Cambridge University Press, Cambridge, 2009. doi:10.1017/CBO9780511804090.
[5] Maxime Bôcher. Certain cases in which the vanishing of the Wronskian is a sufficient condition for linear dependence. Transactions of the American Mathematical Society, 2(2):139–149, 1901. doi:10.2307/1986214.
[6] Ulrich A. Brodowsky and Stefan Hougardy. The Approximation Ratio of the 2-Opt Heuristic for the Euclidean Traveling Salesman Problem. In DROPS-IDN/v2/Document/10.4230/LIPIcs.STACS.2021.18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPIcs.STACS.2021.18.
[7] E. Rodney Canfield and Brendan D. McKay. The asymptotic volume of the Birkhoff polytope. Online Journal of Analytic Combinatorics, 2009.
[8] Apostolos Chalkis and Vissarion Fisikopoulos. Volesti: Volume Approximation and Sampling for Convex Polytopes in R. The R Journal, 13(2):642–660, 2021.
[9] Barun Chandra, Howard Karloff, and Craig Tovey. New Results on the Old k-opt Algorithm for the Traveling Salesman Problem. SIAM Journal on Computing, 28(6):1998–2029, January 1999. doi:10.1137/S0097539793251244.
[10] Zongchen Chen, Elchanan Mossel, and Ilias Zadik. Almost-Linear Planted Cliques Elude the Metropolis Process. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), Proceedings, pages 4504–4539. Society for Industrial and Applied Mathematics, January 2023. doi:10.1137/1.9781611977554.ch171.
[11] M. C. Cheng. The Orthant Probabilities of Four Gaussian Variates. The Annals of Mathematical Statistics, 40(1):152–161, 1969. URL: https://www.jstor.org/stable/2239206.
[12] F. N. David. A note on the evaluation of the multivariate normal integral. Biometrika, 40(3-4):458–459, December 1953. doi:10.1093/biomet/40.3-4.458.
[13] M. E. Dyer and A. M. Frieze. On the Complexity of Computing the Volume of a Polyhedron. SIAM Journal on Computing, 17(5):967–974, October 1988. doi:10.1137/0217060.
[14] Ioannis Z. Emiris and Vissarion Fisikopoulos. Practical Polytope Volume Approximation. ACM Trans. Math. Softw., 44(4):38:1–38:21, June 2018. doi:10.1145/3194656.
[15] Christian Engels and Bodo Manthey. Average-case approximation ratio of the 2-opt algorithm for the TSP. Operations Research Letters, 37(2):83–84, March 2009. doi:10.1016/j.orl.2008.12.002.
[16] Matthias Englert, Heiko Röglin, and Berthold Vöcking. Worst Case and Probabilistic Analysis of the 2-Opt Algorithm for the TSP. Algorithmica, 68(1):190–264, January 2014. Corrected version: https://arxiv.org/abs/2302.06889. doi:10.1007/s00453-013-9801-4.
[17] Matthias Englert, Heiko Röglin, and Berthold Vöcking. Smoothed Analysis of the 2-Opt Algorithm for the General TSP. ACM Transactions on Algorithms, 13(1):10:1–10:15, September 2016. doi:10.1145/2972953.
[18] Leslie Ann Goldberg, Rob Gysel, and John Lapinskas. Approximately counting locally-optimal structures. Journal of Computer and System Sciences, 82(6):1144–1160, 2016. doi:10.1016/j.jcss.2016.04.001.
[19] Bruce Hajek. Cooling Schedules for Optimal Annealing. Mathematics of Operations Research, 13(2):311–329, 1988. doi:10.1287/MOOR.13.2.311.
[20] Stefan Hougardy, Fabian Zaiser, and Xianghui Zhong. The approximation ratio of the 2-Opt Heuristic for the metric Traveling Salesman Problem. Operations Research Letters, 48(4):401–404, July 2020. doi:10.1016/j.orl.2020.05.007.
[21] Mark Jerrum. Large Cliques Elude the Metropolis Process. Random Structures & Algorithms, 3(4):347–359, 1992. doi:10.1002/rsa.3240030402.
[22] David S. Johnson, Christos H. Papadimitriou, and Mihalis Yannakakis. How easy is local search? Journal of Computer and System Sciences, 37(1):79–100, August 1988. doi:10.1016/0022-0000(88)90046-3.
[23] Richard M. Karp. Reducibility among Combinatorial Problems. In Raymond E. Miller, James W. Thatcher, and Jean D. Bohlinger, editors, Complexity of Computer Computations: Proceedings of a Symposium on the Complexity of Computer Computations, The IBM Research Symposia Series, pages 85–103. Springer US, Boston, MA, 1972. doi:10.1007/978-1-4684-2001-2_9.
[24] Marvin Künnemann, Bodo Manthey, and Rianne Veenstra. Smoothed Analysis of the 2-Opt Heuristic for the TSP under Gaussian Noise, August 2023. Comment: Combination of an ISAAC 2013 paper by Bodo Manthey and Rianne Veenstra and an ICALP 2015 paper by Marvin Künnemann and Bodo Manthey. The results of the ISAAC 2013 paper have been improved. doi:10.48550/arXiv.2308.00306.
[25] S. Lin and B. W. Kernighan. An Effective Heuristic Algorithm for the Traveling-Salesman Problem. Operations Research, 21(2):498–516, April 1973. doi:10.1287/opre.21.2.498.
[26] Bodo Manthey and Jesse van Rhijn. Improved Smoothed Analysis of 2-Opt for the Euclidean TSP. In ISAAC 2023. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ISAAC.2023.52.
[27] Bodo Manthey and Rianne Veenstra. Smoothed Analysis of the 2-Opt Heuristic for the TSP: Polynomial Bounds for Gaussian Noise. In Leizhen Cai, Siu-Wing Cheng, and Tak-Wah Lam, editors, Algorithms and Computation, Lecture Notes in Computer Science, pages 579–589, Berlin, Heidelberg, 2013. Springer. Full, improved version: https://arxiv.org/abs/2308.00306. doi:10.1007/978-3-642-45030-3_54.
[28] Andreas Nolte and Rainer Schrader. A Note on the Finite Time Behavior of Simulated Annealing. Mathematics of Operations Research, 25(3):476–484, 2000. doi:10.1287/MOOR.25.3.476.12211.
[29] Leslie G. Valiant. The Complexity of Enumeration and Reliability Problems. SIAM Journal on Computing, 8(3):410–421, August 1979. doi:10.1137/0208032.
[30] Qiaochu Yuan. Answer to “Determinant of a matrix involving factorials”, November 2022.

[bib.bib1] [1] Emile Aarts and Jan Korst. Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing. John Wiley & Sons, Inc., USA, 1989.

[bib.bib2] [2] Emile Aarts and Jan Karel Lenstra, editors. Local Search in Combinatorial Optimization. Princeton University Press, 2003. doi:10.2307/j.ctv346t9c.

[bib.bib3] [3] I. G. Abrahamson. Orthant Probabilities for the Quadrivariate Normal Distribution. The Annals of Mathematical Statistics, 35(4):1685–1703, December 1964. doi:10.1214/aoms/1177700391.

[bib.bib4] [4] Sanjeev Arora and Boaz Barak. Computational Complexity: A Modern Approach. Cambridge University Press, Cambridge, 2009. doi:10.1017/CBO9780511804090.

[bib.bib5] [5] Maxime Bôcher. Certain cases in which the vanishing of the Wronskian is a sufficient condition for linear dependence. Transactions of the American Mathematical Society, 2(2):139–149, 1901. doi:10.2307/1986214.

[bib.bib6] [6] Ulrich A. Brodowsky and Stefan Hougardy. The Approximation Ratio of the 2-Opt Heuristic for the Euclidean Traveling Salesman Problem. In DROPS-IDN/v2/Document/10.4230/LIPIcs.STACS.2021.18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPIcs.STACS.2021.18.

[bib.bib7] [7] E. Rodney Canfield and Brendan D. McKay. The asymptotic volume of the Birkhoff polytope. Online Journal of Analytic Combinatorics, 2009.

[bib.bib8] [8] Apostolos Chalkis and Vissarion Fisikopoulos. Volesti: Volume Approximation and Sampling for Convex Polytopes in R. The R Journal, 13(2):642–660, 2021.

[bib.bib9] [9] Barun Chandra, Howard Karloff, and Craig Tovey. New Results on the Old k-opt Algorithm for the Traveling Salesman Problem. SIAM Journal on Computing, 28(6):1998–2029, January 1999. doi:10.1137/S0097539793251244.

[bib.bib10] [10] Zongchen Chen, Elchanan Mossel, and Ilias Zadik. Almost-Linear Planted Cliques Elude the Metropolis Process. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), Proceedings, pages 4504–4539. Society for Industrial and Applied Mathematics, January 2023. doi:10.1137/1.9781611977554.ch171.

[bib.bib11] [11] M. C. Cheng. The Orthant Probabilities of Four Gaussian Variates. The Annals of Mathematical Statistics, 40(1):152–161, 1969. URL: https://www.jstor.org/stable/2239206.

[bib.bib12] [12] F. N. David. A note on the evaluation of the multivariate normal integral. Biometrika, 40(3-4):458–459, December 1953. doi:10.1093/biomet/40.3-4.458.

[bib.bib13] [13] M. E. Dyer and A. M. Frieze. On the Complexity of Computing the Volume of a Polyhedron. SIAM Journal on Computing, 17(5):967–974, October 1988. doi:10.1137/0217060.

[bib.bib14] [14] Ioannis Z. Emiris and Vissarion Fisikopoulos. Practical Polytope Volume Approximation. ACM Trans. Math. Softw., 44(4):38:1–38:21, June 2018. doi:10.1145/3194656.

[bib.bib15] [15] Christian Engels and Bodo Manthey. Average-case approximation ratio of the 2-opt algorithm for the TSP. Operations Research Letters, 37(2):83–84, March 2009. doi:10.1016/j.orl.2008.12.002.

[bib.bib16] [16] Matthias Englert, Heiko Röglin, and Berthold Vöcking. Worst Case and Probabilistic Analysis of the 2-Opt Algorithm for the TSP. Algorithmica, 68(1):190–264, January 2014. Corrected version: https://arxiv.org/abs/2302.06889. doi:10.1007/s00453-013-9801-4.

[bib.bib17] [17] Matthias Englert, Heiko Röglin, and Berthold Vöcking. Smoothed Analysis of the 2-Opt Algorithm for the General TSP. ACM Transactions on Algorithms, 13(1):10:1–10:15, September 2016. doi:10.1145/2972953.

[bib.bib18] [18] Leslie Ann Goldberg, Rob Gysel, and John Lapinskas. Approximately counting locally-optimal structures. Journal of Computer and System Sciences, 82(6):1144–1160, 2016. doi:10.1016/j.jcss.2016.04.001.

[bib.bib19] [19] Bruce Hajek. Cooling Schedules for Optimal Annealing. Mathematics of Operations Research, 13(2):311–329, 1988. doi:10.1287/MOOR.13.2.311.

[bib.bib20] [20] Stefan Hougardy, Fabian Zaiser, and Xianghui Zhong. The approximation ratio of the 2-Opt Heuristic for the metric Traveling Salesman Problem. Operations Research Letters, 48(4):401–404, July 2020. doi:10.1016/j.orl.2020.05.007.

[bib.bib21] [21] Mark Jerrum. Large Cliques Elude the Metropolis Process. Random Structures & Algorithms, 3(4):347–359, 1992. doi:10.1002/rsa.3240030402.

[bib.bib22] [22] David S. Johnson, Christos H. Papadimitriou, and Mihalis Yannakakis. How easy is local search? Journal of Computer and System Sciences, 37(1):79–100, August 1988. doi:10.1016/0022-0000(88)90046-3.

[bib.bib23] [23] Richard M. Karp. Reducibility among Combinatorial Problems. In Raymond E. Miller, James W. Thatcher, and Jean D. Bohlinger, editors, Complexity of Computer Computations: Proceedings of a Symposium on the Complexity of Computer Computations, The IBM Research Symposia Series, pages 85–103. Springer US, Boston, MA, 1972. doi:10.1007/978-1-4684-2001-2_9.

[bib.bib24] [24] Marvin Künnemann, Bodo Manthey, and Rianne Veenstra. Smoothed Analysis of the 2-Opt Heuristic for the TSP under Gaussian Noise, August 2023. Comment: Combination of an ISAAC 2013 paper by Bodo Manthey and Rianne Veenstra and an ICALP 2015 paper by Marvin Künnemann and Bodo Manthey. The results of the ISAAC 2013 paper have been improved. doi:10.48550/arXiv.2308.00306.

[bib.bib25] [25] S. Lin and B. W. Kernighan. An Effective Heuristic Algorithm for the Traveling-Salesman Problem. Operations Research, 21(2):498–516, April 1973. doi:10.1287/opre.21.2.498.

[bib.bib26] [26] Bodo Manthey and Jesse van Rhijn. Improved Smoothed Analysis of 2-Opt for the Euclidean TSP. In ISAAC 2023. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ISAAC.2023.52.

[bib.bib27] [27] Bodo Manthey and Rianne Veenstra. Smoothed Analysis of the 2-Opt Heuristic for the TSP: Polynomial Bounds for Gaussian Noise. In Leizhen Cai, Siu-Wing Cheng, and Tak-Wah Lam, editors, Algorithms and Computation, Lecture Notes in Computer Science, pages 579–589, Berlin, Heidelberg, 2013. Springer. Full, improved version: https://arxiv.org/abs/2308.00306. doi:10.1007/978-3-642-45030-3_54.

[bib.bib28] [28] Andreas Nolte and Rainer Schrader. A Note on the Finite Time Behavior of Simulated Annealing. Mathematics of Operations Research, 25(3):476–484, 2000. doi:10.1287/MOOR.25.3.476.12211.

[bib.bib29] [29] Leslie G. Valiant. The Complexity of Enumeration and Reliability Problems. SIAM Journal on Computing, 8(3):410–421, August 1979. doi:10.1137/0208032.

[bib.bib30] [30] Qiaochu Yuan. Answer to “Determinant of a matrix involving factorials”, November 2022.