Finer-Grained Hardness of Kernel Density Estimation

Authors: Josh Alman, Yunfeng Guan




Author Details

Josh Alman
  • Department of Computer Science, Columbia University, New York, NY, USA
Yunfeng Guan
  • Department of Computer Science, Columbia University, New York, NY, USA

Acknowledgements

We would like to thank Amol Aggarwal for constructive discussions on Schur polynomials, and anonymous reviewers for helpful suggestions.

Cite As

Josh Alman and Yunfeng Guan. Finer-Grained Hardness of Kernel Density Estimation. In 39th Computational Complexity Conference (CCC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 300, pp. 35:1-35:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024). https://doi.org/10.4230/LIPIcs.CCC.2024.35

Abstract

In batch Kernel Density Estimation (KDE) for a kernel function f : ℝ^m × ℝ^m → ℝ, we are given as input 2n points x^{(1)}, …, x^{(n)}, y^{(1)}, …, y^{(n)} ∈ ℝ^m in dimension m, as well as a vector w ∈ ℝⁿ. These inputs implicitly define the n × n kernel matrix K given by K[i,j] = f(x^{(i)}, y^{(j)}). The goal is to compute a vector v ∈ ℝⁿ which approximates Kw, i.e., with ||Kw - v||_∞ < ε ||w||₁. For illustrative purposes, consider the Gaussian kernel f(x,y) := e^{-||x-y||₂²}. The classic approach to this problem is the famous Fast Multipole Method (FMM), which runs in time n ⋅ O(log^m(ε^{-1})) and is particularly effective in low dimensions because of its exponential dependence on m. Recently, as the higher-dimensional case m ≥ Ω(log n) has seen more applications in machine learning and statistics, new algorithms have focused on this setting: an algorithm using discrepancy theory, which runs in time O(n/ε), and an algorithm based on the polynomial method, which achieves inverse polynomial accuracy in almost linear time when the input points have bounded squared diameter B = o(log n).
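To make the problem statement concrete, the following minimal brute-force sketch (ours, not one of the fast algorithms above) builds the Gaussian kernel matrix, computes the exact product Kw in O(n²m) time, and checks the guarantee ||Kw - v||_∞ < ε ||w||₁. The function names and random data are illustrative only.

import numpy as np

def gaussian_kernel_matrix(X, Y):
    """Kernel matrix K with K[i, j] = exp(-||x^(i) - y^(j)||_2^2)."""
    # Pairwise squared distances via ||x - y||^2 = ||x||^2 - 2<x, y> + ||y||^2.
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        - 2.0 * (X @ Y.T)
        + np.sum(Y ** 2, axis=1)[None, :]
    )
    return np.exp(-sq_dists)

def satisfies_kde_guarantee(X, Y, w, v, eps):
    """Check the batch-KDE guarantee ||Kw - v||_inf < eps * ||w||_1."""
    Kw = gaussian_kernel_matrix(X, Y) @ w  # exact answer, O(n^2 * m) time
    return np.max(np.abs(Kw - v)) < eps * np.sum(np.abs(w))

# Tiny usage example on random data.
rng = np.random.default_rng(0)
n, m = 200, 16
X, Y = rng.normal(size=(n, m)), rng.normal(size=(n, m))
w = rng.normal(size=n)
v = gaussian_kernel_matrix(X, Y) @ w  # the exact Kw trivially satisfies the guarantee
assert satisfies_kde_guarantee(X, Y, w, v, eps=1e-6)

Any of the fast algorithms discussed above would be judged against the same check, only with v produced in subquadratic time.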
A recent line of work has proved fine-grained lower bounds, with the goal of showing that the "curse of dimensionality" arising in FMM is necessary assuming the Strong Exponential Time Hypothesis (SETH). Backurs et al. [NeurIPS 2017] first showed the hardness of a variety of Empirical Risk Minimization problems, including KDE for Gaussian-like kernels, in the setting of high dimension m = Ω(log n) and large scale B = Ω(log n). Alman et al. [FOCS 2020] later developed new reductions in roughly this same parameter regime, leading to lower bounds for more general kernels, but only for very small error ε < 2^{-log^{Ω(1)}(n)}.
In this paper, we refine the approach of Alman et al. to show new lower bounds in all parameter regimes, closing gaps between the known algorithms and lower bounds. For example:
- In the setting where m = C log n and B = o(log n), we prove Gaussian KDE requires n^{2-o(1)} time to achieve additive error ε < Ω(m/B)^{-m}, matching the performance of the polynomial method up to low-order terms.
- In the low-dimensional setting m = o(log n), we show that Gaussian KDE requires n^{2-o(1)} time to achieve ε such that log log(ε^{-1}) > Ω̃((log n)/m), matching the error bound achievable by FMM up to low-order terms. To our knowledge, no nontrivial lower bound was previously known in this regime.

Our approach also generalizes to any parameter regime and any kernel. For example, we achieve similar fine-grained hardness results for any kernel with slowly-decaying Taylor coefficients, such as the Cauchy kernel. Our new lower bounds make use of an intricate analysis of the "counting matrix", a special case of the kernel matrix focused on carefully-chosen evaluation points. As a key technical lemma, we give a novel approach to bounding the entries of its inverse by using Schur polynomials from algebraic combinatorics.
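For readers unfamiliar with Schur polynomials, the following minimal sketch (ours, not the paper's counting-matrix lemma) evaluates s_λ(x_1, …, x_k) at a point with distinct coordinates using the classical bialternant formula s_λ(x) = det(x_i^{λ_j + k - j}) / det(x_i^{k - j}), whose denominator is the Vandermonde determinant. The function name is illustrative.

import numpy as np

def schur_polynomial(lam, xs):
    """Evaluate s_lambda(x_1, ..., x_k) at a point with distinct coordinates
    via the bialternant formula det(x_i^(lam_j + k - j)) / det(x_i^(k - j))."""
    xs = np.asarray(xs, dtype=float)
    k = len(xs)
    lam = list(lam) + [0] * (k - len(lam))             # pad the partition with zeros
    num_exp = np.array([lam[j] + k - 1 - j for j in range(k)])
    den_exp = np.array([k - 1 - j for j in range(k)])  # Vandermonde exponents
    numerator = np.linalg.det(xs[:, None] ** num_exp[None, :])
    denominator = np.linalg.det(xs[:, None] ** den_exp[None, :])
    return numerator / denominator

# Sanity checks: s_(1)(x1, x2) = x1 + x2, and s_(2,1)(x1, x2) = x1*x2*(x1 + x2).
print(schur_polynomial([1], [2.0, 3.0]))     # ~ 5.0
print(schur_polynomial([2, 1], [2.0, 3.0]))  # ~ 30.0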

Subject Classification

ACM Subject Classification
  • Theory of computation → Problems, reductions and completeness
Keywords
  • Kernel Density Estimation
  • Fine-Grained Complexity
  • Schur Polynomials

References

  1. Amol Aggarwal and Josh Alman. Optimal-degree polynomial approximations for exponentials and Gaussian kernel density estimation. In 37th Computational Complexity Conference (CCC 2022), 2022.
  2. Josh Alman, Timothy Chu, Aaron Schild, and Zhao Song. Algorithms and hardness for linear algebra on geometric graphs. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), 2020. URL: https://doi.org/10.1109/FOCS46700.2020.00057.
  3. Josh Alman and Zhao Song. Fast attention requires bounded entries. In NeurIPS, 2023.
  4. Josh Alman and Ryan Williams. Probabilistic polynomials and Hamming nearest neighbors. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, October 2015. URL: https://doi.org/10.1109/focs.2015.18.
  5. Arturs Backurs, Moses Charikar, Piotr Indyk, and Paris Siminelakis. Efficient density evaluation for smooth kernels. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 615-626, 2018. URL: https://doi.org/10.1109/FOCS.2018.00065.
  6. Arturs Backurs, Piotr Indyk, and Ludwig Schmidt. On the fine-grained complexity of empirical risk minimization: Kernel methods and neural networks. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL: https://proceedings.neurips.cc/paper_files/paper/2017/file/635440afdfc39fe37995fed127d7df4f-Paper.pdf.
  7. Arturs Backurs, Piotr Indyk, and Tal Wagner. Space and time efficient kernel density estimation in high dimensions. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Red Hook, NY, USA, 2019. Curran Associates Inc.
  8. Moses Charikar, Michael Kapralov, Navid Nouri, and Paris Siminelakis. Kernel density estimation through density constrained near neighbor search. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 172-183, 2020. URL: https://doi.org/10.1109/FOCS46700.2020.00025.
  9. Moses Charikar, Michael Kapralov, and Erik Waingarten. A quasi-monte carlo data structure for smooth kernel evaluations. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 5118-5144, 2024. URL: https://doi.org/10.1137/1.9781611977912.184.
  10. Moses Charikar and Paris Siminelakis. Hashing-based-estimators for kernel density in high dimensions. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 1032-1043, 2017. URL: https://doi.org/10.1109/FOCS.2017.99.
  11. Moses Charikar and Paris Siminelakis. Multi-resolution hashing for fast pairwise summations. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 769-792, 2019. URL: https://doi.org/10.1109/FOCS.2019.00051.
  12. Lijie Chen. On the hardness of approximate and exact (bichromatic) maximum inner product. Theory of Computing, 16(4):1-50, 2020. URL: https://doi.org/10.4086/toc.2020.v016a004.
  13. Yen-Chi Chen. A tutorial on kernel density estimation and recent advances. Biostatistics & Epidemiology, 1(1):161-187, 2017.
  14. L Greengard and V Rokhlin. A fast algorithm for particle simulations. Journal of Computational Physics, 73(2):325-348, 1987. URL: https://doi.org/10.1016/0021-9991(87)90140-9.
  15. Leslie Greengard and John A. Strain. The fast gauss transform. SIAM J. Sci. Comput., 12:79-94, 1991. URL: https://api.semanticscholar.org/CorpusID:3145209.
  16. Sean Hallgren, Alexander Russell, and Amnon Ta-Shma. Normal subgroup reconstruction and quantum computation using group representations. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, STOC '00, pages 627-635, New York, NY, USA, 2000. Association for Computing Machinery. URL: https://doi.org/10.1145/335305.335392.
  17. Christian Ikenmeyer and Greta Panova. Rectangular kronecker coefficients and plethysms in geometric complexity theory. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 396-405, 2016. URL: https://doi.org/10.1109/FOCS.2016.50.
  18. Krikamol Muandet, Kenji Fukumizu, Bharath Sriperumbudur, Bernhard Schölkopf, et al. Kernel mean embedding of distributions: A review and beyond. Foundations and Trends® in Machine Learning, 10(1-2):1-141, 2017.
  19. Ryan O'Donnell and John Wright. Quantum spectrum testing. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC '15, pages 529-538, New York, NY, USA, 2015. Association for Computing Machinery. URL: https://doi.org/10.1145/2746539.2746582.
  20. Jeff M. Phillips and Wai Ming Tai. Near-optimal coresets of kernel density estimates. Discrete Comput. Geom., 63(4):867-887, June 2020. URL: https://doi.org/10.1007/s00454-019-00134-6.
  21. Aviad Rubinstein. Hardness of approximate nearest neighbor search. Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, June 2018. URL: https://doi.org/10.1145/3188745.3188916.
  22. Bernhard Schölkopf and Alexander J Smola. Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, 2002.
  23. Richard Stanley. Enumerative Combinatorics. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2nd edition, 2023.
  24. Paraskevas Syminelakis. Fast kernel evaluation in high dimensions: Importance sampling and near neighbor search. PhD thesis, Stanford University, 2019.
  25. Ryan Williams. On the difference between closest, furthest, and orthogonal pairs: Nearly-linear vs barely-subquadratic complexity. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 2018.