Abstract
In this paper we initiate a systematic study of exact algorithms for some of the well known clustering problems, namely kMEDIAN and kMEANS. In kMEDIAN, the input consists of a set X of n points belonging to a metric space, and the task is to select a subset C ⊆ X of k points as centers, such that the sum of the distances of every point to its nearest center is minimized. In kMEANS, the objective is to minimize the sum of squares of the distances instead. It is easy to design an algorithm running in time max_{k ≤ n} {n choose k} n^𝒪(1) = 𝒪^*(2ⁿ) (here, 𝒪^*(⋅) notation hides polynomial factors in n). In this paper we design first nontrivial exact algorithms for these problems. In particular, we obtain an 𝒪^*((1.89)ⁿ) time exact algorithm for kMEDIAN that works for any value of k. Our algorithm is quite general in that it does not use any properties of the underlying (metric) space  it does not even require the distances to satisfy the triangle inequality. In particular, the same algorithm also works for kMeans. We complement this result by showing that the running time of our algorithm is asymptotically optimal, up to the base of the exponent. That is, unless the Exponential Time Hypothesis fails, there is no algorithm for these problems running in time 2^o(n)⋅n^𝒪(1).
Finally, we consider the "facility location" or "supplier" versions of these clustering problems, where, in addition to the set X we are additionally given a set of m candidate centers (or facilities) F, and objective is to find a subset of k centers from F. The goal is still to minimize the kMedian/kMeans/kCenter objective. For these versions we give a 𝒪(2ⁿ (mn)^𝒪(1)) time algorithms using subset convolution. We complement this result by showing that, under the Set Cover Conjecture, the "supplier" versions of these problems do not admit an exact algorithm running in time 2^{(1ε) n} (mn)^𝒪(1).
BibTeX  Entry
@InProceedings{fomin_et_al:LIPIcs.IPEC.2022.13,
author = {Fomin, Fedor V. and Golovach, Petr A. and Inamdar, Tanmay and Purohit, Nidhi and Saurabh, Saket},
title = {{Exact Exponential Algorithms for Clustering Problems}},
booktitle = {17th International Symposium on Parameterized and Exact Computation (IPEC 2022)},
pages = {13:113:14},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959772600},
ISSN = {18688969},
year = {2022},
volume = {249},
editor = {Dell, Holger and Nederlof, Jesper},
publisher = {Schloss Dagstuhl  LeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2022/17369},
URN = {urn:nbn:de:0030drops173691},
doi = {10.4230/LIPIcs.IPEC.2022.13},
annote = {Keywords: clustering, kmedian, kmeans, exact algorithms}
}
Keywords: 

clustering, kmedian, kmeans, exact algorithms 
Collection: 

17th International Symposium on Parameterized and Exact Computation (IPEC 2022) 
Issue Date: 

2022 
Date of publication: 

14.12.2022 