Document

APPROX

**Published in:** LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)

Given a metric space (F ∪ C, d), we consider star covers of C with balanced loads. A star is a pair (i, C_i) where i ∈ F and C_i ⊆ C, and the load of a star is ∑_{j ∈ C_i} d(i, j). In minimum load k-star cover problem (MLkSC), one tries to cover the set of clients C using k stars that minimize the maximum load of a star, and in minimum size star cover (MSSC) one aims to find the minimum number of stars of load at most T needed to cover C, where T is a given parameter.
We obtain new bicriteria approximations for the two problems using novel rounding algorithms for their standard LP relaxations. For MLkSC, we find a star cover with (1+O(ε))k stars and O(1/ε²)OPT_MLk load where OPT_MLk is the optimum load. For MSSC, we find a star cover with O(1/ε²) OPT_MS stars of load at most (2 + O(ε)) T where OPT_MS is the optimal number of stars for the problem. Previously, non-trivial bicriteria approximations were known only when F = C.

Buddhima Gamlath and Vadim Grinberg. Approximating Star Cover Problems. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 57:1-57:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{gamlath_et_al:LIPIcs.APPROX/RANDOM.2020.57, author = {Gamlath, Buddhima and Grinberg, Vadim}, title = {{Approximating Star Cover Problems}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)}, pages = {57:1--57:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-164-1}, ISSN = {1868-8969}, year = {2020}, volume = {176}, editor = {Byrka, Jaros{\l}aw and Meka, Raghu}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.57}, URN = {urn:nbn:de:0030-drops-126609}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2020.57}, annote = {Keywords: star cover, approximation algorithms, lp rounding} }

Document

**Published in:** LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)

We study k-means clustering in a semi-supervised setting. Given an oracle that returns whether two given points belong to the same cluster in a fixed optimal clustering, we investigate the following question: how many oracle queries are sufficient to efficiently recover a clustering that, with probability at least (1 - delta), simultaneously has a cost of at most (1 + epsilon) times the optimal cost and an accuracy of at least (1 - epsilon)?
We show how to achieve such a clustering on n points with O{((k^2 log n) * m{(Q, epsilon^4, delta / (k log n))})} oracle queries, when the k clusters can be learned with an epsilon' error and a failure probability delta' using m(Q, epsilon',delta') labeled samples in the supervised setting, where Q is the set of candidate cluster centers. We show that m(Q, epsilon', delta') is small both for k-means instances in Euclidean space and for those in finite metric spaces. We further show that, for the Euclidean k-means instances, we can avoid the dependency on n in the query complexity at the expense of an increased dependency on k: specifically, we give a slightly more involved algorithm that uses O{(k^4/(epsilon^2 delta) + (k^{9}/epsilon^4) log(1/delta) + k * m{({R}^r, epsilon^4/k, delta)})} oracle queries.
We also show that the number of queries needed for (1 - epsilon)-accuracy in Euclidean k-means must linearly depend on the dimension of the underlying Euclidean space, and for finite metric space k-means, we show that it must at least be logarithmic in the number of candidate centers. This shows that our query complexities capture the right dependencies on the respective parameters.

Buddhima Gamlath, Sangxia Huang, and Ola Svensson. Semi-Supervised Algorithms for Approximately Optimal and Accurate Clustering. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 57:1-57:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{gamlath_et_al:LIPIcs.ICALP.2018.57, author = {Gamlath, Buddhima and Huang, Sangxia and Svensson, Ola}, title = {{Semi-Supervised Algorithms for Approximately Optimal and Accurate Clustering}}, booktitle = {45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)}, pages = {57:1--57:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-076-7}, ISSN = {1868-8969}, year = {2018}, volume = {107}, editor = {Chatzigiannakis, Ioannis and Kaklamanis, Christos and Marx, D\'{a}niel and Sannella, Donald}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2018.57}, URN = {urn:nbn:de:0030-drops-90612}, doi = {10.4230/LIPIcs.ICALP.2018.57}, annote = {Keywords: Clustering, Semi-supervised Learning, Approximation Algorithms, k-Means, k-Median} }