Relative Lipschitzness in Extragradient Methods and a Direct Recipe for Acceleration

Authors: Michael B. Cohen, Aaron Sidford, Kevin Tian

File

  • LIPIcs.ITCS.2021.62.pdf (0.55 MB, 18 pages)

Document Identifiers

  • DOI: 10.4230/LIPIcs.ITCS.2021.62

Author Details

Michael B. Cohen
  • Massachusetts Institute of Technology, Cambridge, MA, USA
Aaron Sidford
  • Stanford University, CA, USA
Kevin Tian
  • Stanford University, CA, USA

Acknowledgements

The existence of an extragradient algorithm in the primal-dual formulation of smooth minimization directly achieving accelerated rates is due to discussions with the first author, Michael B. Cohen. The second and third authors are indebted to him, and this work is dedicated in his memory. We also thank our collaborators in concurrent works, Yair Carmon and Yujia Jin, for many helpful and encouraging conversations throughout the duration of this project, Jelena Diakonikolas, Jonathan Kelner, and Jonah Sherman for helpful conversations, and anonymous reviewers for multiple helpful comments on earlier versions of the paper.

Cite As

Michael B. Cohen, Aaron Sidford, and Kevin Tian. Relative Lipschitzness in Extragradient Methods and a Direct Recipe for Acceleration. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 62:1-62:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ITCS.2021.62

Abstract

We show that standard extragradient methods (i.e., mirror prox [Arkadi Nemirovski, 2004] and dual extrapolation [Yurii Nesterov, 2007]) recover optimal accelerated rates for first-order minimization of smooth convex functions. To obtain this result we provide a fine-grained characterization of the convergence rates of extragradient methods for solving monotone variational inequalities in terms of a natural condition we call relative Lipschitzness. We further generalize this framework to handle local and randomized notions of relative Lipschitzness, and thereby recover rates for box-constrained 𝓁_∞ regression based on area convexity [Jonah Sherman, 2017] and complexity bounds achieved by accelerated (randomized) coordinate descent [Zeyuan Allen Zhu et al., 2016; Yurii Nesterov and Sebastian U. Stich, 2017] for smooth convex function minimization.
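
As a concrete illustration of the extragradient template discussed in the abstract, the sketch below shows one classical Euclidean extragradient step (the unconstrained 𝓁_2 special case of mirror prox / dual extrapolation) applied to a toy bilinear saddle-point problem. This is a minimal sketch, not the paper's algorithm: the step size, operator, test problem, and function names are illustrative assumptions.

    # Minimal sketch of one Euclidean extragradient (mirror prox) step for a
    # monotone operator g: extrapolate using g at the current point, then take
    # the actual step using g evaluated at the extrapolated point.
    import numpy as np

    def extragradient_step(z, g, eta):
        w = z - eta * g(z)       # extrapolation step
        return z - eta * g(w)    # correction step, evaluated at w

    # Toy monotone operator from the bilinear saddle-point problem
    # min_x max_y x^T A y, whose unique saddle point is (x, y) = (0, 0):
    # g(x, y) = (A y, -A^T x).
    rng = np.random.default_rng(0)
    A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))

    def g(z):
        x, y = z[:3], z[3:]
        return np.concatenate([A @ y, -A.T @ x])

    z = rng.standard_normal(6)
    for _ in range(1000):
        z = extragradient_step(z, g, eta=0.2)
    print(np.linalg.norm(z))  # shrinks toward 0 as the iterates approach the saddle point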

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Convex optimization
Keywords
  • Variational inequalities
  • minimax optimization
  • acceleration
  • 𝓁_∞ regression

References

  1. Jacob D. Abernethy, Kevin A. Lai, Kfir Y. Levy, and Jun-Kun Wang. Faster rates for convex-concave games. In Conference On Learning Theory, COLT 2018, Stockholm, Sweden, 6-9 July 2018, pages 1595-1625, 2018.
  2. Jacob D. Abernethy, Kevin A. Lai, and Andre Wibisono. Last-iterate convergence rates for min-max optimization. CoRR, abs/1906.02027, 2019. URL: http://arxiv.org/abs/1906.02027.
  3. Zeyuan Allen Zhu. Katyusha: the first direct acceleration of stochastic gradient methods. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 1200-1205, 2017.
  4. Zeyuan Allen Zhu and Elad Hazan. Optimal black-box reductions between optimization objectives. In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain, pages 1606-1614, 2016.
  5. Zeyuan Allen Zhu, Zheng Qu, Peter Richtárik, and Yang Yuan. Even faster accelerated coordinate descent using non-uniform sampling. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016, pages 1110-1119, 2016.
  6. Heinz H. Bauschke, Jérôme Bolte, and Marc Teboulle. A descent lemma beyond Lipschitz gradient continuity: first-order methods revisited and applications. Math. Oper. Res., 42(2):330-348, 2017.
  7. Amir Beck and Marc Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sciences, 2(1):183-202, 2009.
  8. Digvijay Boob, Saurabh Sawlani, and Di Wang. Faster width-dependent algorithm for mixed packing and covering LPs. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8-14 December 2019, Vancouver, BC, Canada, pages 15253-15262, 2019.
  9. Sébastien Bubeck. Convex optimization: Algorithms and complexity. Foundations and Trends in Machine Learning, 8(3-4):231-357, 2015.
  10. Yair Carmon, Yujia Jin, Aaron Sidford, and Kevin Tian. Variance reduction for matrix games. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8-14 December 2019, Vancouver, BC, Canada, pages 11377-11388, 2019.
  11. Yair Carmon, Yujia Jin, Aaron Sidford, and Kevin Tian. Coordinate methods for matrix games. In 61st Annual IEEE Symposium on Foundations of Computer Science, FOCS 2020, 2020.
  12. Tatjana Chavdarova, Gauthier Gidel, François Fleuret, and Simon Lacoste-Julien. Reducing noise in GAN training with variance reduced extragradient. CoRR, abs/1904.08598, 2019. URL: http://arxiv.org/abs/1904.08598.
  13. Jelena Diakonikolas and Lorenzo Orecchia. Accelerated extra-gradient descent: A novel accelerated first-order method. In 9th Innovations in Theoretical Computer Science Conference, ITCS 2018, January 11-14, 2018, Cambridge, MA, USA, pages 23:1-23:19, 2018.
  14. Radu-Alexandru Dragomir, Adrien Taylor, Alexandre d'Aspremont, and Jérôme Bolte. Optimal complexity and certification of Bregman first-order methods. CoRR, abs/1911.08510, 2019. URL: http://arxiv.org/abs/1911.08510.
  15. Filip Hanzely, Peter Richtárik, and Lin Xiao. Accelerated Bregman proximal gradient methods for relatively smooth convex optimization. CoRR, abs/1808.03045, 2018. URL: http://arxiv.org/abs/1808.03045.
  16. Yu-Guan Hsieh, Franck Iutzeler, Jérôme Malick, and Panayotis Mertikopoulos. On the convergence of single-call stochastic extra-gradient methods. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8-14 December 2019, Vancouver, BC, Canada, pages 6936-6946, 2019.
  17. Arun Jambulapati, Aaron Sidford, and Kevin Tian. A direct Õ(1/ε) iteration parallel algorithm for optimal transport. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8-14 December 2019, Vancouver, BC, Canada, pages 11355-11366, 2019.
  18. Rie Johnson and Tong Zhang. Accelerating stochastic gradient descent using predictive variance reduction. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 315-323, 2013.
  19. Anatoli Juditsky, Arkadi Nemirovski, and Claire Tauvel. Solving variational inequalities with stochastic mirror-prox algorithm. Stochastic Systems, 1(1):17-58, 2011.
  20. Yin Tat Lee and Aaron Sidford. Efficient accelerated coordinate descent methods and faster algorithms for solving linear systems. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 147-156, 2013.
  21. Tianyi Lin, Chi Jin, and Michael I. Jordan. Near-optimal algorithms for minimax optimization. CoRR, abs/2002.02417, 2020. URL: http://arxiv.org/abs/2002.02417.
  22. Haihao Lu. "Relative-continuity" for non-Lipschitz non-smooth convex optimization using stochastic (or deterministic) mirror descent. INFORMS Journal on Optimization, pages 288-303, 2019.
  23. Haihao Lu, Robert M. Freund, and Yurii E. Nesterov. Relatively smooth convex optimization by first-order methods, and applications. SIAM J. Optim., 28(1):333-354, 2018.
  24. Panayotis Mertikopoulos, Bruno Lecouat, Houssam Zenati, Chuan-Sheng Foo, Vijay Chandrasekhar, and Georgios Piliouras. Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019, 2019.
  25. A. Nemirovski and D. B. Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley, 1983.
  26. Arkadi Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229-251, 2004.
  27. Yurii Nesterov. A method for solving a convex programming problem with convergence rate O(1/k²). Doklady AN SSSR, 269:543-547, 1983.
  28. Yurii Nesterov. Dual extrapolation and its applications to solving variational inequalities and related problems. Math. Program., 109(2-3):319-344, 2007.
  29. Yurii Nesterov and Sebastian U. Stich. Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM Journal on Optimization, 27(1):110-123, 2017.
  30. Yuyuan Ouyang and Yangyang Xu. Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems. Mathematical Programming, 2019.
  31. Balamurugan Palaniappan and Francis R. Bach. Stochastic variance reduction methods for saddle-point problems. In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain, pages 1408-1416, 2016.
  32. Alexander Rakhlin and Karthik Sridharan. Optimization, learning, and games with predictable sequences. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 3066-3074, 2013.
  33. Mark W. Schmidt, Nicolas Le Roux, and Francis R. Bach. Minimizing finite sums with the stochastic average gradient. Math. Program., 162(1-2):83-112, 2017.
  34. Jonah Sherman. Area-convexity, 𝓁_∞ regularization, and undirected multicommodity flow. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 452-460, 2017.
  35. Aaron Sidford and Kevin Tian. Coordinate methods for accelerating 𝓁_∞ regression and faster approximate maximum flow. In 59th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2018, 7-9 October 2018, Paris, France, 2018.
  36. Fedor Stonyakin, Alexander Tyurin, Alexander Gasnikov, Pavel Dvurechensky, Artem Agafonov, Darina Dvinskikh, Dmitry Pasechnyuk, Sergei Artamonov, and Victorya Piskunova. Inexact relative smoothness and strong convexity for optimization and variational inequalities by inexact model. CoRR, abs/2001.09013, 2020. URL: http://arxiv.org/abs/2001.09013.
  37. Jun-Kun Wang and Jacob D. Abernethy. Acceleration through optimistic no-regret dynamics. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada, pages 3828-3838, 2018.
  38. Junyu Zhang, Mingyi Hong, and Shuzhong Zhang. On lower iteration complexity bounds for the saddle point problems. CoRR, abs/1912.07481, 2019. URL: http://arxiv.org/abs/1912.07481.