LIPIcs.ESA.2020.57.pdf
- Filesize: 0.54 MB
- 21 pages
With input sizes becoming massive, coresets - small yet representative summary of the input - are relevant more than ever. A weighted set C_w that is a subset of the input is an ε-coreset if the cost of any feasible solution S with respect to C_w is within [1±ε] of the cost of S with respect to the original input. We give a very general technique to compute coresets in the fully-dynamic setting where input points can be added or deleted. Given a static (i.e., not dynamic) ε-coreset-construction algorithm that runs in time t(n, ε, λ) and computes a coreset of size s(n, ε, λ), where n is the number of input points and 1-λ is the success probability, we give a fully-dynamic algorithm that computes an ε-coreset with worst-case update time O((log n) ⋅ t(s(n, ε/log n, λ/n), ε/log n, λ/n)) (this bound is stated informally), where the success probability is 1-λ. Our technique is a fully-dynamic analog of the merge-and-reduce technique, which is due to Har-Peled and Mazumdar [Har-Peled and Mazumdar, 2004] and is based on a technique of Bentley and Saxe [Jon Louis Bentley and James B. Saxe, 1980], that applies to the insertion-only setting where points can only be added. Although, our space usage is O(n), our technique works in the presence of an adaptive adversary, and we show that Ω(n) space is required when adversary is adaptive. As a concrete implication of our technique, using the result of Braverman et al. [{Braverman} et al., 2016], we get fully-dynamic ε-coreset-construction algorithms for k-median and k-means with worst-case update time O(ε^{-2} k² log⁵ n log³ k) and coreset size O(ε^{-2} k log n log² k) ignoring log log n and log(1/ε) factors and assuming that ε = Ω(1/poly(n)) and λ = Ω(1/poly(n)) (which are very weak assumptions made only to make these bounds easy to parse). This results in the first fully-dynamic constant-approximation algorithms for k-median and k-means with update times O(poly(k, log n, ε^{-1})). Specifically, the dependence on k is only quadratic, and the bounds are worst-case. The best previous bound for both problems was amortized O(nlog n) by Cohen-Addad et al. [Cohen-Addad et al., 2019] via randomized O(1)-coresets in O(n) space. We also show that under the OMv conjecture [Monika Henzinger et al., 2015], a fully-dynamic (4 - δ)-approximation algorithm for k-means must either have an amortized update time of Ω(k^{1-γ}) or amortized query time of Ω(k^{2 - γ}), where γ > 0 is a constant.
Feedback for Dagstuhl Publishing