Given a point set P ⊂ ℝ^d, the kernel density estimate of P is defined as 𝒢̄_P(x) = (1/|P|) ∑_{p ∈ P} e^{-∥x-p∥²} for any x ∈ ℝ^d. We study how to construct a small subset Q of P such that the kernel density estimate of P is approximated by the kernel density estimate of Q. Such a subset Q is called a coreset. The main technique in this work is to construct a ±1 coloring of the point set P using discrepancy theory, leveraging Banaszczyk's Theorem. When d > 1 is a constant, our construction gives a coreset of size O(1/ε), as opposed to the best-known result of O((1/ε) √(log(1/ε))). It is the first result to break the √(log(1/ε)) barrier, even when d = 2.
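As an illustration of the definition above, the following is a minimal Python sketch that evaluates the Gaussian kernel density estimate and compares a point set against a subset of it. It is not the construction from the paper: the subset here is a uniform random sample, used only as a baseline, and the error is measured over a finite set of query points, so it only lower-bounds the worst-case error over all of ℝ^d. The names `kde`, `max_kde_error`, and all sample sizes are assumptions made for this example.

```python
import math
import random


def kde(points, x):
    """Gaussian kernel density estimate: (1/|points|) * sum_p exp(-||x - p||^2)."""
    total = 0.0
    for p in points:
        sq_dist = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
        total += math.exp(-sq_dist)
    return total / len(points)


def max_kde_error(P, Q, queries):
    """Largest |kde(P, x) - kde(Q, x)| over a finite list of query points."""
    return max(abs(kde(P, x) - kde(Q, x)) for x in queries)


random.seed(0)

# Hypothetical input: 500 points in the plane (d = 2).
P = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]

# Baseline subset only: uniform random sampling, NOT the discrepancy-based
# coloring described in the abstract.
Q = random.sample(P, 50)

# Finite set of query points standing in for "any x in R^d".
queries = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(200)]

err = max_kde_error(P, Q, queries)
print(f"max observed error over {len(queries)} queries: {err:.4f}")
```

A subset Q would count as an ε-coreset in the sense above if this error stayed at most ε for every query point, not just the sampled ones.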
@InProceedings{tai:LIPIcs.SoCG.2022.63,
  author =	{Tai, Wai Ming},
  title =	{{Optimal Coreset for Gaussian Kernel Density Estimation}},
  booktitle =	{38th International Symposium on Computational Geometry (SoCG 2022)},
  pages =	{63:1--63:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-227-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{224},
  editor =	{Goaoc, Xavier and Kerber, Michael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2022.63},
  URN =		{urn:nbn:de:0030-drops-160719},
  doi =		{10.4230/LIPIcs.SoCG.2022.63},
  annote =	{Keywords: Discrepancy Theory, Kernel Density Estimation, Coreset}
}