- ds.algorithms approximation-algorithms sorting tsp
- Updated Fri, 24 Jun 2022 21:17:08 GMT

The one-dimensional traveling salesperson path problem is, obviously, the same thing as sorting, and so can be solved exactly by comparisons in $O(n\log n)$ time, but it is formulated in such a way that approximation as well as exact solution makes sense. In a model of computation in which the inputs are real numbers, and rounding to integers is possible, it's easy to approximate to within a $1+O(n^{-c})$ factor, for any constant $c$, in time $O(n)$: find the min and max, round everything to a number within distance $(\max-\min)n^{-(c+1)}$ of its original value, and then use radix sort. But models with rounding have problematic complexity theory and this led me to wonder, what about weaker models of computation?

So, how accurately can the one-dimensional TSP be approximated, in a linear comparison tree model of computation (each comparison node tests the sign of a linear function of the input values), by an algorithm whose time complexity is $o(n\log n)$? The same rounding method allows any approximation ratio of the form $n^{1-o(1)}$ to be achieved (by using binary searches to do the rounding, and rounding much more coarsely to make it fast enough). But is it possible to achieve even an approximation ratio like $O(n^{1-\epsilon})$ for some $\epsilon>0$?

EDIT (UPDATE): *The lower bound in my answer below was proven (by a different proof) in "On the complexity of approximating Euclidean traveling salesman tours and minimum spanning trees", by Das et al; Algorithmica 19:447-460 (1997).*

is it possible to achieve even an approximation ratio like $O(n^{1-\epsilon})$ for some $\epsilon>0$ in $o(n\log n)$ time using a comparison-based algorithm?

No. Here's a lower bound.

**Claim.** *For any $\epsilon>0$, every comparison-based
$n^{1-\epsilon}$-approximation algorithm
requires $\Omega(\epsilon n\log n)$ comparisons in the worst case.*

By "comparison-based" I mean any algorithm that only queries the input with binary (True/False) queries.

Here's an attempt at a proof. Hopefully there are no mistakes. FWIW the lower bound seems likely to extend to randomized algorithms.

Fix any $n$ and any arbitrarily small but constant $\epsilon>0$.

Consider just the $n!$ "permutation" input instances $(x_1,x_2,\ldots,x_n)$ that are permutations of $[n]$. The optimum solution for any such instance has cost $n-1$.

Define the *cost* of a permutation $\pi$
to be $c(\pi) = \sum_i |\pi(i+1) - \pi(i)|$.
Model the algorithm as taking as input a permutation $\pi$,
outputting a permutation $\pi'$,
and paying cost $d(\pi,\pi') = c(\pi'\circ \pi)$.

Define $C$ to be the minimum number of comparisons for any comparison-based algorithm to achieve competitive ratio $n^{1-\epsilon}$ on these instances. Since opt is $n-1$, the algorithm must guarantee cost at most $n^{2-\epsilon}$.

We will show $C\ge \Omega(\epsilon n\log n)$.

Define $P$ to be, for any possible output $\pi'$, the fraction of possible inputs for which output $\pi'$ would achieve cost at most $n^{2-\epsilon}$. This fraction is independent of $\pi'$.

$P$ also equals the probability that, for a random permutation $\pi$, its cost $c(\pi)$ is at most $n^{2-\epsilon}$. (To see why, take $\pi'$ to be the identity permutation $I$. Then $P$ is the fraction of inputs for which $d(\pi,I)$ at most $n^{2-\epsilon}$, but $d(\pi,I) = c(\pi)$.)

**Lemma 1.** *$C \ge \log_2 1/P$.*

*Proof.* Fix any algorithm that always uses less than $\log_2 1/P$ comparisons.
The decision tree for the algorithm has depth less than $\log_2 1/P$,
so there are at less than $1/P$ leaves, and, for some output
permutation $\pi'$, the algorithm gives $\pi'$ as output
for more than a $P$ fraction of the inputs. By definition of $P$,
for at least one such input, the output $\pi'$ gives cost more than $n^{2-\epsilon}$.
QED

**Lemma 2.** *$P \le \exp(-\Omega(\epsilon n\log n))$.*

Before we give the proof of Lemma 2, note that the two lemmas together give the claim: $$C ~\ge~ \log_2 \frac{1}{P} ~=~ \log_2 \exp(\Omega(\epsilon n\log n)) ~=~ \Omega(\epsilon n\log n).$$

*Proof of Lemma 2.*
Let $\pi$ be a random permutation.
Recall that $P$ equals the probability that
its cost $c(\pi)$ is at most $n^{2-\epsilon}$.
Say that any pair $(i,i+1)$ is an *edge*
with cost $|\pi(i+1)-\pi(i)|$,
so $c(\pi)$ is the sum of the edge costs.

Suppose $c(\pi) \le n^{2-\epsilon}$.

Then, for any $q>0$, at most $n^{2-\epsilon}/q$ of the edges have cost $q$ or more.
Say that edges of cost less than $q$ are *cheap*.

Fix $q=n^{1-\epsilon/2}$. Substituting and simplifying, at most $n^{1-\epsilon/2}$ of the edges are not cheap.

Thus, *at least* $n - n^{1-\epsilon/2} \ge n/2$ of the edges are cheap.
Thus, there is a set $S$ containing $n/2$ cheap edges.

**Claim.** *For any given set $S$ of $n/2$ edges,
the probability that all edges in $S$ are cheap
is at most $\exp(-\Omega(\epsilon n \log n))$.*

Before we prove the claim, note that it implies the lemma as follows.
By the claim, and the naive union bound,
the probability that any there *exists* such a set $S$
is at most
$${n\choose n/2} \exp(-\Omega(\epsilon n \log n))
~\le~ 2^n \exp(-\Omega(\epsilon n \log n))$$
$$~\le~ \exp(O(n) -\Omega(\epsilon n \log n))
~\le~ \exp(-\Omega(\epsilon n \log n)).$$

*Proof of Claim.*
Choose $\pi$ by the following process.
Choose $\pi(1)$ uniformly from $[n]$,
then choose $\pi(2)$ uniformly from $[n] - \{\pi(1)\}$,
then choose $\pi(3)$ uniformly from $[n]-\{\pi(1),\pi(2)\}$, etc.

Consider any edge $(i,i+1)$ in $S$. Consider the time just after $\pi(i)$ has been chosen, when $\pi(i+1)$ is about to be chosen. Regardless of the first $i$ choices (for $\pi(j)$ for $j\le i$), there are at least $n-i$ choices for $\pi(i+1)$, and at most $2n^{1-\epsilon/2}$ of those choices will give the edge $(i,i+1)$ cost less than $n^{1-\epsilon/2}$ (making it cheap).

Thus, conditioned on the first $i$ choices, the probability that the edge is cheap is at most $\frac{2n^{1-\epsilon/2}}{n-i}$. Thus, the probability that all $n/2$ edges in $S$ are cheap is at most $$\prod_{(i,i+1)\in S} \frac{2n^{1-\epsilon/2}}{n-i}.$$ Since $|S|\ge n/2$, there are at least $n/4$ edges in $S$ with $n-i\ge n/4$. Thus, this product is at most $$\big(\frac{2n^{1-\epsilon/2}}{n/4}\big)^{n/4} ~\le~(8n^{-\epsilon/2})^{n/4} ~=~\exp(O(n)-\Omega(\epsilon n \log n)) ~=~\exp(-\Omega(\epsilon n \log n)).$$

QED

- +6 – p.s. I got a request to make this citable, so I put it on arvix.org here. — Mar 13, 2013 at 00:35