DROPS

Document

DOI: 10.4230/LIPIcs.ITCS.2021.53

No Quantum Speedup over Gradient Descent for Non-Smooth Convex Optimization

Authors: Ankit Garg, Robin Kothari, Praneeth Netrapalli, and Suhail Sherif

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)

Abstract

We study the first-order convex optimization problem, where we have black-box access to a (not necessarily smooth) function f:ℝⁿ → ℝ and its (sub)gradient. Our goal is to find an ε-approximate minimum of f starting from a point that is distance at most R from the true minimum. If f is G-Lipschitz, then the classic gradient descent algorithm solves this problem with O((GR/ε)²) queries. Importantly, the number of queries is independent of the dimension n and gradient descent is optimal in this regard: No deterministic or randomized algorithm can achieve better complexity that is still independent of the dimension n. In this paper we reprove the randomized lower bound of Ω((GR/ε)²) using a simpler argument than previous lower bounds. We then show that although the function family used in the lower bound is hard for randomized algorithms, it can be solved using O(GR/ε) quantum queries. We then show an improved lower bound against quantum algorithms using a different set of instances and establish our main result that in general even quantum algorithms need Ω((GR/ε)²) queries to solve the problem. Hence there is no quantum speedup over gradient descent for black-box first-order convex optimization without further assumptions on the function family.
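As a rough illustration of the O((GR/ε)²) upper bound quoted in the abstract, the following is a minimal sketch of the textbook fixed-step subgradient method with iterate averaging. It is not code from the paper: the function names, the test function f(x) = max_i |x_i|, and the starting point are assumptions chosen only for illustration.

```python
import math


def subgradient_descent(subgrad, x0, G, R, eps):
    """Fixed-step subgradient method; returns the averaged iterate and the query count."""
    T = math.ceil((G * R / eps) ** 2)   # number of subgradient queries
    eta = R / (G * math.sqrt(T))        # standard fixed step size
    x = list(x0)
    avg = [0.0] * len(x)
    for _ in range(T):
        avg = [a + xi / T for a, xi in zip(avg, x)]   # average of the queried points
        g = subgrad(x)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return avg, T


# Hypothetical test function: f(x) = max_i |x_i|, 1-Lipschitz in the Euclidean norm,
# minimized at the origin with value 0.
def f(x):
    return max(abs(xi) for xi in x)


def subgrad(x):
    i = max(range(len(x)), key=lambda j: abs(x[j]))
    g = [0.0] * len(x)
    if x[i] != 0:
        g[i] = math.copysign(1.0, x[i])   # a valid subgradient; 0 is valid at the origin
    return g


avg, T = subgradient_descent(subgrad, x0=[0.6, -0.8], G=1.0, R=1.0, eps=0.05)
print(T, f(avg))
```

The query count T depends only on G, R, and ε, not on the dimension of x0, which is the dimension-independence the abstract refers to.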

Cite as

Ankit Garg, Robin Kothari, Praneeth Netrapalli, and Suhail Sherif. No Quantum Speedup over Gradient Descent for Non-Smooth Convex Optimization. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 53:1-53:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{garg_et_al:LIPIcs.ITCS.2021.53,
  author =	{Garg, Ankit and Kothari, Robin and Netrapalli, Praneeth and Sherif, Suhail},
  title =	{{No Quantum Speedup over Gradient Descent for Non-Smooth Convex Optimization}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{53:1--53:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.53},
  URN =		{urn:nbn:de:0030-drops-135921},
  doi =		{10.4230/LIPIcs.ITCS.2021.53},
  annote =	{Keywords: Quantum algorithms, Gradient descent, Convex optimization}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2017.2

A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)

Authors: Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Venkata Krishna Pillutla, and Aaron Sidford

Published in: LIPIcs, Volume 93, 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017)

Abstract

This work provides a simplified proof of the statistical minimax optimality of (iterate averaged) stochastic gradient descent (SGD), for the special case of least squares. This result is obtained by analyzing SGD as a stochastic process and by sharply characterizing the stationary covariance matrix of this process. The finite rate optimality characterization captures the constant factors and addresses model mis-specification.
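For readers unfamiliar with the algorithm being analyzed, here is a minimal sketch of iterate-averaged SGD for least squares on synthetic data. The data model, step size, sample count, and all names below are assumptions for illustration and are not taken from the paper.

```python
import random


def averaged_sgd_least_squares(samples, dim, step):
    """One pass of constant-step SGD on squared loss; returns the average of the iterates."""
    w = [0.0] * dim
    w_bar = [0.0] * dim
    for t, (x, y) in enumerate(samples, start=1):
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        # Stochastic gradient of 0.5 * (w.x - y)^2 is residual * x.
        w = [wi - step * residual * xi for wi, xi in zip(w, x)]
        w_bar = [wb + (wi - wb) / t for wb, wi in zip(w_bar, w)]  # running mean
    return w_bar


rng = random.Random(0)
w_true = [2.0, -1.0]   # hypothetical ground-truth weights
samples = []
for _ in range(20000):
    x = [rng.uniform(-1.0, 1.0) for _ in w_true]
    y = sum(wi * xi for wi, xi in zip(w_true, x)) + rng.gauss(0.0, 0.1)
    samples.append((x, y))

w_bar = averaged_sgd_least_squares(samples, dim=2, step=0.25)
print(w_bar)
```

The individual iterates keep fluctuating under a constant step size; averaging them is what damps that fluctuation, and the stationary behaviour of the un-averaged process is the object the abstract describes characterizing.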

Cite as

Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Venkata Krishna Pillutla, and Aaron Sidford. A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares). In 37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 93, pp. 2:1-2:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{jain_et_al:LIPIcs.FSTTCS.2017.2,
  author =	{Jain, Prateek and Kakade, Sham M. and Kidambi, Rahul and Netrapalli, Praneeth and Pillutla, Venkata Krishna and Sidford, Aaron},
  title =	{{A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)}},
  booktitle =	{37th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2017)},
  pages =	{2:1--2:10},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-055-2},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{93},
  editor =	{Lokam, Satya and Ramanujam, R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2017.2},
  URN =		{urn:nbn:de:0030-drops-83941},
  doi =		{10.4230/LIPIcs.FSTTCS.2017.2},
  annote =	{Keywords: Stochastic Gradient Descent, Minimax Optimality, Least Squares Regression}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2018.8

Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness

Authors: Cameron Musco, Praneeth Netrapalli, Aaron Sidford, Shashanka Ubaru, and David P. Woodruff

Published in: LIPIcs, Volume 94, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018)

Abstract

Understanding the singular value spectrum of an n x n matrix A is a fundamental task in countless numerical computation and data analysis applications. In matrix multiplication time, it is possible to perform a full SVD of A and directly compute the singular values \sigma_1,...,\sigma_n. However, little is known about algorithms that break this runtime barrier. Using tools from stochastic trace estimation, polynomial approximation, and fast linear system solvers, we show how to efficiently isolate different ranges of A's spectrum and approximate the number of singular values in these ranges. We thus effectively compute an approximate histogram of the spectrum, which can stand in for the true singular values in many applications. We use our histogram primitive to give the first algorithms for approximating a wide class of symmetric matrix norms and spectral sums faster than the best known runtime for matrix multiplication. For example, we show how to obtain a (1 + \epsilon) approximation to the Schatten 1-norm (i.e. the nuclear or trace norm) in just \tilde O((nnz(A)n^{1/3} + n^2)\epsilon^{-3}) time for A with uniform row sparsity or \tilde O(n^{2.18} \epsilon^{-3}) time for dense matrices. The runtime scales smoothly for general Schatten-p norms, notably becoming \tilde O(p nnz(A) \epsilon^{-3}) for any real p >= 2. At the same time, we show that the complexity of spectrum approximation is inherently tied to fast matrix multiplication in the small \epsilon regime. We use fine-grained complexity to give conditional lower bounds for spectrum approximation, showing that achieving milder \epsilon dependencies in our algorithms would imply triangle detection algorithms for general graphs running in faster than state-of-the-art matrix multiplication time. This further implies, through a reduction of (Williams & Williams, 2010), that highly accurate spectrum approximation algorithms running in subcubic time can be used to give subcubic time matrix multiplication.
As an application of our bounds, we show that precisely computing all effective resistances in a graph in less than matrix multiplication time is likely difficult, barring a major algorithmic breakthrough.
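The abstract names stochastic trace estimation as one of its tools. As a hedged illustration of that general idea only (not the paper's algorithm), the sketch below uses Hutchinson's estimator with random sign vectors to estimate the sum of squared singular values of A, i.e. tr(A^T A), using only matrix-vector products. The function name and the example matrix are assumptions.

```python
import random


def hutchinson_sum_sq_singular_values(A, num_samples, rng):
    """Estimate tr(A^T A) = sum_i sigma_i^2 as the mean of ||A z||^2 over random sign vectors z."""
    n = len(A[0])
    total = 0.0
    for _ in range(num_samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        Az = [sum(a * zj for a, zj in zip(row, z)) for row in A]   # one matrix-vector product
        total += sum(v * v for v in Az)
    return total / num_samples


# Hypothetical example: a diagonal matrix with singular values 3 and 4.
A = [[3.0, 0.0], [0.0, 4.0]]
estimate = hutchinson_sum_sq_singular_values(A, num_samples=10, rng=random.Random(0))
print(estimate)
```

For a diagonal matrix every sample equals the true value, so the estimate here carries no sampling error; for general matrices the estimator is unbiased but noisy, and accuracy improves with more samples.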

Cite as

Cameron Musco, Praneeth Netrapalli, Aaron Sidford, Shashanka Ubaru, and David P. Woodruff. Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 8:1-8:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{musco_et_al:LIPIcs.ITCS.2018.8,
  author =	{Musco, Cameron and Netrapalli, Praneeth and Sidford, Aaron and Ubaru, Shashanka and Woodruff, David P.},
  title =	{{Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness}},
  booktitle =	{9th Innovations in Theoretical Computer Science Conference (ITCS 2018)},
  pages =	{8:1--8:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-060-6},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{94},
  editor =	{Karlin, Anna R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2018.8},
  URN =		{urn:nbn:de:0030-drops-83397},
  doi =		{10.4230/LIPIcs.ITCS.2018.8},
  annote =	{Keywords: spectrum approximation, matrix norm computation, fine-grained complexity, linear algebra}
}
