Henzinger, Monika ;
Kale, Sagar
FullyDynamic Coresets
Abstract
With input sizes becoming massive, coresets  small yet representative summary of the input  are relevant more than ever. A weighted set C_w that is a subset of the input is an εcoreset if the cost of any feasible solution S with respect to C_w is within [1±ε] of the cost of S with respect to the original input. We give a very general technique to compute coresets in the fullydynamic setting where input points can be added or deleted. Given a static (i.e., not dynamic) εcoresetconstruction algorithm that runs in time t(n, ε, λ) and computes a coreset of size s(n, ε, λ), where n is the number of input points and 1λ is the success probability, we give a fullydynamic algorithm that computes an εcoreset with worstcase update time O((log n) ⋅ t(s(n, ε/log n, λ/n), ε/log n, λ/n)) (this bound is stated informally), where the success probability is 1λ. Our technique is a fullydynamic analog of the mergeandreduce technique, which is due to HarPeled and Mazumdar [HarPeled and Mazumdar, 2004] and is based on a technique of Bentley and Saxe [Jon Louis Bentley and James B. Saxe, 1980], that applies to the insertiononly setting where points can only be added. Although, our space usage is O(n), our technique works in the presence of an adaptive adversary, and we show that Ω(n) space is required when adversary is adaptive.
As a concrete implication of our technique, using the result of Braverman et al. [{Braverman} et al., 2016], we get fullydynamic εcoresetconstruction algorithms for kmedian and kmeans with worstcase update time O(ε^{2} k² log⁵ n log³ k) and coreset size O(ε^{2} k log n log² k) ignoring log log n and log(1/ε) factors and assuming that ε = Ω(1/poly(n)) and λ = Ω(1/poly(n)) (which are very weak assumptions made only to make these bounds easy to parse). This results in the first fullydynamic constantapproximation algorithms for kmedian and kmeans with update times O(poly(k, log n, ε^{1})). Specifically, the dependence on k is only quadratic, and the bounds are worstcase. The best previous bound for both problems was amortized O(nlog n) by CohenAddad et al. [CohenAddad et al., 2019] via randomized O(1)coresets in O(n) space.
We also show that under the OMv conjecture [Monika Henzinger et al., 2015], a fullydynamic (4  δ)approximation algorithm for kmeans must either have an amortized update time of Ω(k^{1γ}) or amortized query time of Ω(k^{2  γ}), where γ > 0 is a constant.
BibTeX  Entry
@InProceedings{henzinger_et_al:LIPIcs:2020:12923,
author = {Monika Henzinger and Sagar Kale},
title = {{FullyDynamic Coresets}},
booktitle = {28th Annual European Symposium on Algorithms (ESA 2020)},
pages = {57:157:21},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959771627},
ISSN = {18688969},
year = {2020},
volume = {173},
editor = {Fabrizio Grandoni and Grzegorz Herman and Peter Sanders},
publisher = {Schloss DagstuhlLeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2020/12923},
URN = {urn:nbn:de:0030drops129230},
doi = {10.4230/LIPIcs.ESA.2020.57},
annote = {Keywords: Clustering, Coresets, Dynamic Algorithms}
}
26.08.2020
Keywords: 

Clustering, Coresets, Dynamic Algorithms 
Seminar: 

28th Annual European Symposium on Algorithms (ESA 2020)

Issue date: 

2020 
Date of publication: 

26.08.2020 