, Kelin Luo, Dorian Reineccius, Heiko Röglin, Melanie Schmidt
Creative Commons Attribution 4.0 International license
The connected k-median problem is a constrained clustering problem that combines distance-based k-clustering with connectivity information. The input consists of a metric space and an unweighted, undirected connectivity graph G that is completely unrelated to the metric space. The goal is to compute k centers and corresponding clusters such that each cluster forms a connected subgraph of G and the k-median cost is minimized.
The problem has applications in very different fields such as geodesy (particularly districting), social network analysis (especially community detection), and bioinformatics. We study a version with overlapping clusters, where points can be part of multiple clusters, which is natural for the use case of community detection. This problem variant is Ω(log n)-hard to approximate, and our main result is an 𝒪(k² log n)-approximation algorithm for it. We complement this with an Ω(n^{1-ε})-hardness result for the case of disjoint clusters with general connectivity graphs, as well as an exact algorithm in this setting if the connectivity graph is a tree.
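To make the problem statement concrete, the following minimal sketch evaluates a candidate solution for the disjoint variant: it checks that every cluster induces a connected subgraph of the connectivity graph and, if so, sums the distances from each point to its cluster's center. It is an illustration only, not the paper's algorithm. The function names, the adjacency-dictionary representation of G, and the `dist` callback are assumptions made for this example, and the sketch does not check whether a center lies inside its own cluster.

```python
from collections import deque


def is_connected(cluster, adjacency):
    """Return True if `cluster` induces a connected subgraph.

    `adjacency` is a hypothetical dict mapping each vertex to its neighbours.
    """
    cluster = set(cluster)
    if not cluster:
        return False
    start = next(iter(cluster))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            # Only walk along edges that stay inside the cluster.
            if v in cluster and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == cluster


def connected_k_median_cost(clusters, adjacency, dist):
    """Return the k-median cost of `clusters`, or None if infeasible.

    `clusters` maps each center to the points assigned to it (disjoint case).
    `dist` is any metric on the points; it is independent of the graph.
    """
    total = 0.0
    for center, points in clusters.items():
        if not is_connected(points, adjacency):
            return None  # connectivity constraint violated
        total += sum(dist(center, p) for p in points)
    return total


# Toy instance: four points on a line, connected as a path 0-1-2-3.
positions = [0, 1, 2, 10]
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
metric = lambda a, b: abs(positions[a] - positions[b])

feasible = connected_k_median_cost({1: [0, 1, 2], 3: [3]}, path, metric)
infeasible = connected_k_median_cost({0: [0, 2], 1: [1, 3]}, path, metric)
```

In the toy instance, the first clustering respects the path structure, while the second splits the path into the clusters {0, 2} and {1, 3}, neither of which is connected in G, so it is rejected regardless of its distance cost.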
@InProceedings{eube_et_al:LIPIcs.ESA.2025.63,
author = {Eube, Jan and Luo, Kelin and Reineccius, Dorian and R\"{o}glin, Heiko and Schmidt, Melanie},
title = {{Connected k-Median with Disjoint and Non-Disjoint Clusters}},
booktitle = {33rd Annual European Symposium on Algorithms (ESA 2025)},
pages = {63:1--63:14},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-395-9},
ISSN = {1868-8969},
year = {2025},
volume = {351},
editor = {Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.63},
URN = {urn:nbn:de:0030-drops-245317},
doi = {10.4230/LIPIcs.ESA.2025.63},
annote = {Keywords: Clustering, Connectivity constraints, Approximation algorithms}
}