Finer-Grained Hardness of Kernel Density Estimation

Authors: Josh Alman, Yunfeng Guan




Author Details

Josh Alman
  • Department of Computer Science, Columbia University, New York, NY, USA
Yunfeng Guan
  • Department of Computer Science, Columbia University, New York, NY, USA

Acknowledgements

We would like to thank Amol Aggarwal for constructive discussions on Schur polynomials, and anonymous reviewers for helpful suggestions.

Cite As

Josh Alman and Yunfeng Guan. Finer-Grained Hardness of Kernel Density Estimation. In 39th Computational Complexity Conference (CCC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 300, pp. 35:1-35:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024). https://doi.org/10.4230/LIPIcs.CCC.2024.35

Abstract

In batch Kernel Density Estimation (KDE) for a kernel function f : ℝ^m × ℝ^m → ℝ, we are given as input 2n points x^{(1)}, …, x^{(n)}, y^{(1)}, …, y^{(n)} ∈ ℝ^m in dimension m, as well as a vector w ∈ ℝⁿ. These inputs implicitly define the n × n kernel matrix K given by K[i,j] = f(x^{(i)}, y^{(j)}). The goal is to compute a vector v ∈ ℝⁿ which approximates Kw, i.e., with ||Kw - v||_∞ < ε ||w||₁. For illustrative purposes, consider the Gaussian kernel f(x,y) := e^{-||x-y||₂²}. The classic approach to this problem is the famous Fast Multipole Method (FMM), which runs in time n ⋅ O(log^m(ε^{-1})) and is particularly effective in low dimensions because of its exponential dependence on m. Recently, as the higher-dimensional case m ≥ Ω(log n) has seen more applications in machine learning and statistics, new algorithms have focused on this setting: an algorithm using discrepancy theory, which runs in time O(n/ε), and an algorithm based on the polynomial method, which achieves inverse polynomial accuracy in almost linear time when the input points have bounded squared diameter B = o(log n).
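To make the problem statement concrete, the following minimal brute-force sketch (ours, not one of the fast algorithms above) builds the Gaussian kernel matrix, computes the exact product Kw in O(n²m) time, and checks the guarantee ||Kw - v||_∞ < ε ||w||₁. The function names and random data are illustrative only.

import numpy as np

def gaussian_kernel_matrix(X, Y):
    """Kernel matrix K with K[i, j] = exp(-||x^(i) - y^(j)||_2^2)."""
    # Pairwise squared distances via ||x - y||^2 = ||x||^2 - 2<x, y> + ||y||^2.
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        - 2.0 * (X @ Y.T)
        + np.sum(Y ** 2, axis=1)[None, :]
    )
    return np.exp(-sq_dists)

def satisfies_kde_guarantee(X, Y, w, v, eps):
    """Check the batch-KDE guarantee ||Kw - v||_inf < eps * ||w||_1."""
    Kw = gaussian_kernel_matrix(X, Y) @ w  # exact answer, O(n^2 * m) time
    return np.max(np.abs(Kw - v)) < eps * np.sum(np.abs(w))

# Tiny usage example on random data.
rng = np.random.default_rng(0)
n, m = 200, 16
X, Y = rng.normal(size=(n, m)), rng.normal(size=(n, m))
w = rng.normal(size=n)
v = gaussian_kernel_matrix(X, Y) @ w  # the exact Kw trivially satisfies the guarantee
assert satisfies_kde_guarantee(X, Y, w, v, eps=1e-6)

Any of the fast algorithms discussed above would be judged against the same check, only with v produced in subquadratic time.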
A recent line of work has proved fine-grained lower bounds, with the goal of showing that the "curse of dimensionality" arising in FMM is necessary assuming the Strong Exponential Time Hypothesis (SETH). Backurs et al. [NeurIPS 2017] first showed the hardness of a variety of Empirical Risk Minimization problems, including KDE for Gaussian-like kernels, in the setting of high dimension m = Ω(log n) and large scale B = Ω(log n). Alman et al. [FOCS 2020] later developed new reductions in roughly this same parameter regime, leading to lower bounds for more general kernels, but only for very small error ε < 2^{-log^{Ω(1)}(n)}.
In this paper, we refine the approach of Alman et al. to show new lower bounds in all parameter regimes, closing gaps between the known algorithms and lower bounds. For example:
- In the setting where m = C log n and B = o(log n), we prove Gaussian KDE requires n^{2-o(1)} time to achieve additive error ε < Ω(m/B)^{-m}, matching the performance of the polynomial method up to low-order terms.
- In the low-dimensional setting m = o(log n), we show that Gaussian KDE requires n^{2-o(1)} time to achieve ε such that log log(ε^{-1}) > Ω̃((log n)/m), matching the error bound achievable by FMM up to low-order terms. To our knowledge, no nontrivial lower bound was previously known in this regime.

Our approach also generalizes to any parameter regime and any kernel. For example, we achieve similar fine-grained hardness results for any kernel with slowly-decaying Taylor coefficients, such as the Cauchy kernel. Our new lower bounds make use of an intricate analysis of the "counting matrix", a special case of the kernel matrix focused on carefully-chosen evaluation points. As a key technical lemma, we give a novel approach to bounding the entries of its inverse by using Schur polynomials from algebraic combinatorics.
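For readers unfamiliar with Schur polynomials, the following minimal sketch (ours, not the paper's counting-matrix lemma) evaluates s_λ(x_1, …, x_k) at a point with distinct coordinates using the classical bialternant formula s_λ(x) = det(x_i^{λ_j + k - j}) / det(x_i^{k - j}), whose denominator is the Vandermonde determinant. The function name is illustrative.

import numpy as np

def schur_polynomial(lam, xs):
    """Evaluate s_lambda(x_1, ..., x_k) at a point with distinct coordinates
    via the bialternant formula det(x_i^(lam_j + k - j)) / det(x_i^(k - j))."""
    xs = np.asarray(xs, dtype=float)
    k = len(xs)
    lam = list(lam) + [0] * (k - len(lam))             # pad the partition with zeros
    num_exp = np.array([lam[j] + k - 1 - j for j in range(k)])
    den_exp = np.array([k - 1 - j for j in range(k)])  # Vandermonde exponents
    numerator = np.linalg.det(xs[:, None] ** num_exp[None, :])
    denominator = np.linalg.det(xs[:, None] ** den_exp[None, :])
    return numerator / denominator

# Sanity checks: s_(1)(x1, x2) = x1 + x2, and s_(2,1)(x1, x2) = x1*x2*(x1 + x2).
print(schur_polynomial([1], [2.0, 3.0]))     # ~ 5.0
print(schur_polynomial([2, 1], [2.0, 3.0]))  # ~ 30.0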

Subject Classification

ACM Subject Classification
  • Theory of computation → Problems, reductions and completeness
Keywords
  • Kernel Density Estimation
  • Fine-Grained Complexity
  • Schur Polynomials

References

  1. Amol Aggarwal and Josh Alman. Optimal-degree polynomial approximations for exponentials and Gaussian kernel density estimation. In 37th Computational Complexity Conference (CCC 2022), 2022.
  2. Josh Alman, Timothy Chu, Aaron Schild, and Zhao Song. Algorithms and hardness for linear algebra on geometric graphs. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), 2020. URL: https://doi.org/10.1109/FOCS46700.2020.00057.
  3. Josh Alman and Zhao Song. Fast attention requires bounded entries. In NeurIPS, 2023.
  4. Josh Alman and Ryan Williams. Probabilistic polynomials and Hamming nearest neighbors. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, October 2015. URL: https://doi.org/10.1109/focs.2015.18.
  5. Arturs Backurs, Moses Charikar, Piotr Indyk, and Paris Siminelakis. Efficient density evaluation for smooth kernels. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 615-626, 2018. URL: https://doi.org/10.1109/FOCS.2018.00065.
  6. Arturs Backurs, Piotr Indyk, and Ludwig Schmidt. On the fine-grained complexity of empirical risk minimization: Kernel methods and neural networks. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL: https://proceedings.neurips.cc/paper_files/paper/2017/file/635440afdfc39fe37995fed127d7df4f-Paper.pdf.
  7. Arturs Backurs, Piotr Indyk, and Tal Wagner. Space and time efficient kernel density estimation in high dimensions. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Red Hook, NY, USA, 2019. Curran Associates Inc.
  8. Moses Charikar, Michael Kapralov, Navid Nouri, and Paris Siminelakis. Kernel density estimation through density constrained near neighbor search. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 172-183, 2020. URL: https://doi.org/10.1109/FOCS46700.2020.00025.
  9. Moses Charikar, Michael Kapralov, and Erik Waingarten. A quasi-monte carlo data structure for smooth kernel evaluations. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 5118-5144, 2024. URL: https://doi.org/10.1137/1.9781611977912.184.
  10. Moses Charikar and Paris Siminelakis. Hashing-based-estimators for kernel density in high dimensions. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 1032-1043, 2017. URL: https://doi.org/10.1109/FOCS.2017.99.
  11. Moses Charikar and Paris Siminelakis. Multi-resolution hashing for fast pairwise summations. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 769-792, 2019. URL: https://doi.org/10.1109/FOCS.2019.00051.
  12. Lijie Chen. On the hardness of approximate and exact (bichromatic) maximum inner product. Theory of Computing, 16(4):1-50, 2020. URL: https://doi.org/10.4086/toc.2020.v016a004.
  13. Yen-Chi Chen. A tutorial on kernel density estimation and recent advances. Biostatistics & Epidemiology, 1(1):161-187, 2017.
  14. L Greengard and V Rokhlin. A fast algorithm for particle simulations. Journal of Computational Physics, 73(2):325-348, 1987. URL: https://doi.org/10.1016/0021-9991(87)90140-9.
  15. Leslie Greengard and John A. Strain. The fast gauss transform. SIAM J. Sci. Comput., 12:79-94, 1991. URL: https://api.semanticscholar.org/CorpusID:3145209.
  16. Sean Hallgren, Alexander Russell, and Amnon Ta-Shma. Normal subgroup reconstruction and quantum computation using group representations. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, STOC '00, pages 627-635, New York, NY, USA, 2000. Association for Computing Machinery. URL: https://doi.org/10.1145/335305.335392.
  17. Christian Ikenmeyer and Greta Panova. Rectangular kronecker coefficients and plethysms in geometric complexity theory. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 396-405, 2016. URL: https://doi.org/10.1109/FOCS.2016.50.
  18. Krikamol Muandet, Kenji Fukumizu, Bharath Sriperumbudur, Bernhard Schölkopf, et al. Kernel mean embedding of distributions: A review and beyond. Foundations and Trends® in Machine Learning, 10(1-2):1-141, 2017.
  19. Ryan O'Donnell and John Wright. Quantum spectrum testing. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC '15, pages 529-538, New York, NY, USA, 2015. Association for Computing Machinery. URL: https://doi.org/10.1145/2746539.2746582.
  20. Jeff M. Phillips and Wai Ming Tai. Near-optimal coresets of kernel density estimates. Discrete Comput. Geom., 63(4):867-887, June 2020. URL: https://doi.org/10.1007/s00454-019-00134-6.
  21. Aviad Rubinstein. Hardness of approximate nearest neighbor search. Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, June 2018. URL: https://doi.org/10.1145/3188745.3188916.
  22. Bernhard Schölkopf and Alexander J Smola. Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, 2002.
  23. Richard Stanley. Enumerative Combinatorics. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2nd edition, 2023.
  24. Paraskevas Syminelakis. Fast kernel evaluation in high dimensions: Importance sampling and near neighbor search. PhD thesis, Stanford University, 2019.
  25. Ryan Williams. On the difference between closest, furthest, and orthogonal pairs: Nearly-linear vs barely-subquadratic complexity. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 2018.