Optimal Coreset for Gaussian Kernel Density Estimation

Author: Wai Ming Tai


Author Details

Wai Ming Tai
  • University of Chicago, IL, USA

Cite As

Wai Ming Tai. Optimal Coreset for Gaussian Kernel Density Estimation. In 38th International Symposium on Computational Geometry (SoCG 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 224, pp. 63:1-63:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Given a point set P ⊂ ℝ^d, the kernel density estimate of P is defined as \overline{𝒢}_P(x) = (1/|P|) ∑_{p ∈ P} e^{-∥x-p∥²} for any x ∈ ℝ^d. We study how to construct a small subset Q of P such that the kernel density estimate of Q approximates the kernel density estimate of P. Such a subset Q is called a coreset. The main technique in this work is to construct a ±1 coloring of the point set P using discrepancy theory, leveraging Banaszczyk’s Theorem. When d > 1 is a constant, our construction gives a coreset of size O(1/ε), improving on the best-known result of O((1/ε) √(log(1/ε))). This is the first result to break the √log factor barrier, even for d = 2.
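
The definitions in the abstract can be made concrete with a small numerical sketch. The Python below is illustrative only and is not the paper's construction: the function names (gaussian_kde, max_kde_error) are hypothetical, and the ±1 coloring is a balanced random one standing in for the Banaszczyk-based coloring, so it gives no O(1/ε) guarantee. It only shows how a coloring yields a half-size subset Q and how the approximation error is measured on a finite sample of query points.

```python
import numpy as np


def gaussian_kde(points, queries):
    """Evaluate (1/|points|) * sum_p exp(-||x - p||^2) at every query x."""
    # Pairwise differences, shape (num_queries, num_points, d).
    diff = queries[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist).mean(axis=1)


def max_kde_error(P, Q, queries):
    """Largest |KDE_P(x) - KDE_Q(x)| over the given query points."""
    return float(np.max(np.abs(gaussian_kde(P, queries) - gaussian_kde(Q, queries))))


rng = np.random.default_rng(0)
n = 2000
P = rng.normal(size=(n, 2))                      # toy input point set in R^2
queries = rng.uniform(-3.0, 3.0, size=(500, 2))  # finite sample of query points

# Assumption: a balanced random +/-1 coloring as a placeholder.
# The paper instead obtains a low-discrepancy coloring via Banaszczyk's Theorem.
colors = rng.permutation(np.repeat([1, -1], n // 2))

# One halving step: keep exactly the points colored +1.
Q = P[colors == 1]

err = max_kde_error(P, Q, queries)
print(f"|P| = {len(P)}, |Q| = {len(Q)}, max error on sample queries = {err:.4f}")
```

Because the coloring is balanced, the error at a query x equals (1/|P|) |∑_{p ∈ P} χ(p) e^{-∥x-p∥²}|, where χ is the coloring. This is why a coloring with small discrepancy yields a good half-size subset. The maximum over sampled queries only lower-bounds the error over all x ∈ ℝ^d.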

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms

Keywords
  • Discrepancy Theory
  • Kernel Density Estimation
  • Coreset



