Near-Optimal Coresets of Kernel Density Estimates

Authors Jeff M. Phillips, Wai Ming Tai


Cite As

Jeff M. Phillips and Wai Ming Tai. Near-Optimal Coresets of Kernel Density Estimates. In 34th International Symposium on Computational Geometry (SoCG 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 99, pp. 66:1-66:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


We construct near-optimal coresets for kernel density estimates for points in R^d when the kernel is positive definite. Specifically, we show a polynomial-time construction for a coreset of size O(√(d log(1/ε))/ε), and we show a near-matching lower bound of size Ω(√d/ε). The upper bound is a polynomial-in-1/ε improvement when d ∈ [3, 1/ε²) (for all kernels except the Gaussian kernel, which had a previous upper bound of O((1/ε) log^d(1/ε))), and the lower bound is the first known lower bound to depend on d for this problem. Moreover, the upper bound's restriction that the kernel be positive definite is significant in that it applies to a wide variety of kernels, specifically those most important for machine learning. This includes kernels for information distances and the sinc kernel, which can be negative.
  • Coresets
  • Kernel Density Estimate
  • Discrepancy
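
To make the quantity bounded in the abstract concrete, the following is a minimal sketch of the coreset error: the largest gap between the kernel density estimate of a point set P and that of a subset Q. It is not the construction from the paper. The subset here is a plain uniform random sample, used only as a baseline, and the Gaussian kernel scaling, the function names, and the finite grid of query points are all assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, p, bandwidth=1.0):
    """Gaussian kernel exp(-||x - p||^2 / bandwidth^2), scaled so K(x, x) = 1 (an assumed normalization)."""
    sq_dist = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
    return math.exp(-sq_dist / bandwidth ** 2)


def kde(points, x, kernel=gaussian_kernel):
    """Kernel density estimate at x: the average kernel value over `points`."""
    return sum(kernel(x, p) for p in points) / len(points)


def max_kde_error(points, subset, queries, kernel=gaussian_kernel):
    """Largest |KDE_P(x) - KDE_Q(x)| over a finite set of query points.

    Checking finitely many queries only lower-bounds the true supremum over R^d.
    """
    return max(abs(kde(points, x, kernel) - kde(subset, x, kernel)) for x in queries)


# Hypothetical demo parameters: dimension, data size, subset size.
rng = random.Random(0)
d, n, m = 3, 500, 50
P = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
Q = rng.sample(P, m)  # uniform random subset; NOT the paper's discrepancy-based construction
queries = [[rng.uniform(-3, 3) for _ in range(d)] for _ in range(100)]
print("empirical max KDE error:", max_kde_error(P, Q, queries))
```

A subset Q is an ε-coreset in this sense when the error stays at most ε for every query point. The result in the abstract concerns how small Q can be made while meeting that guarantee, which this baseline does not attempt to achieve.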



